Details
- Type: New Feature
- Status: Closed
- Priority: P2
- Resolution: Done
- None
- None
- Low
- Jumbo: > 45 days
- Development Team: Folijet
- 117
- R1
- R1
Description
Team estimation - 90 days
UXPROD-3135 was split into UXPROD-3193 (stability and reliability) and UXPROD-3191 (performance); abreaux will close UXPROD-3135 once all issues have been moved from it to the new features.
Current situation or problem:
1. High CPU/memory consumption in modules
2. Duplicate records created upon import
3. SRS can fail when processing a message during import
4. On infrastructure issues (e.g., DB unavailable, a module being restarted, or a network failure) we send DI_ERROR instead of retrying
5. Status messages for the progress bar are not de-duplicated
Investigation required for:
6. Race condition on start (Kafka consumers start working before the DB is configured), or periodic DB shutdown after an SRS restart; jobs get stuck if they cannot update status in the DB (messages are ACKed even when we could not process them). A readiness-gate sketch follows this list.
7. Kafka consumers eventually stop reading messages, stalling job progress until the module is restarted
8. mod-data-import stores the input file in memory, limiting the size of uploaded files and risking OOM errors
9. Consumers get disconnected from the Kafka cluster
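For problem 6, one direction the investigation might take is gating Kafka consumer startup on a database health check, so consumers never poll (and ACK) messages before the DB is usable. A minimal plain-JDBC sketch; the connection details and back-off policy are illustrative assumptions, not the modules' actual configuration:

    import java.sql.Connection;
    import java.sql.DriverManager;

    public class DbReadinessGate {
        // Poll the DB until it answers a validity check, then return so the
        // caller can safely subscribe its Kafka consumers.
        public static void awaitDatabase(String jdbcUrl, String user, String pass)
                throws InterruptedException {
            while (true) {
                try (Connection c = DriverManager.getConnection(jdbcUrl, user, pass)) {
                    if (c.isValid(2)) {   // 2-second validity timeout
                        return;           // DB ready: consumers may start
                    }
                } catch (Exception e) {
                    // DB not reachable yet; fall through to back off
                }
                Thread.sleep(5_000);      // back off before the next attempt
            }
        }
    }

A module would call awaitDatabase(...) before consumer.subscribe(...), so the first poll() cannot race schema setup.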
In scope
Out of scope
Use case(s)
Proposed solution/stories
1. Significantly decrease the size of the payload (see the compression sketch after this list):
- Remove immutable parts; instead, fetch them on demand and cache them locally for reuse.
- Change the message handling mechanism (currently relies on pt1 - profile) (optional)
- Move archiving (compression) to the Kafka level instead of the module level
2. Make consumers idempotent: add a pass-through identifier to de-duplicate messages (see the de-duplication sketch after this list).
3. Generate "INSTANCE CREATED" from mod-inventory; consume it in SRS to update the HRID in the BIB record and in Inventory to continue processing.
4. Do not ACK messages in Kafka when the failure is an infrastructure error rather than a business-logic error. Split failed processing results into two categories (see the ack/retry sketch after this list):
- IO errors: do not ACK; retry until fixed
- Business-logic errors: send DI_ERROR and ACK the current message
Remove unnecessary topics (the "ready for post processing" and "HRID set" topics)
5. De-duplicate status messages per record while tracking progress
Problems 6, 7, 8, and 9 require investigation.
Possible solution for problem 8: split the file into chunks, store them in the database, and work from the database/temporary storage (see the chunking sketch below). Partially done (to be investigated).
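Story 1 can lean on Kafka's built-in producer-side compression instead of zipping payloads in the modules; KAFKAWRAP-10 (linked below) exposes the compression type as a configuration property. A minimal sketch using the standard kafka-clients API; the broker address and the choice of gzip are illustrative assumptions:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class CompressedProducerFactory {
        public static KafkaProducer<String, String> create() {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092"); // illustrative
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            // Kafka compresses whole batches transparently and consumers
            // decompress automatically, so no zip/unzip code in the modules.
            props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");
            return new KafkaProducer<>(props);
        }
    }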
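For stories 2 and 5, a sketch of consumer-side de-duplication keyed on a pass-through identifier read from a recordId Kafka header (MODINVOICE-314, linked below, introduces such a header for event correlation). The in-memory set is an illustrative stand-in; a real implementation would persist processed IDs so de-duplication survives restarts:

    import java.nio.charset.StandardCharsets;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.header.Header;

    public class DeduplicatingHandler {
        // Illustrative in-memory store; production code needs durable storage.
        private final Set<String> processedIds = ConcurrentHashMap.newKeySet();

        public void handle(ConsumerRecord<String, String> record) {
            Header idHeader = record.headers().lastHeader("recordId"); // assumed header name
            if (idHeader == null) {
                return; // no identifier: skip, or fail, depending on the contract
            }
            String id = new String(idHeader.value(), StandardCharsets.UTF_8);
            if (!processedIds.add(id)) {
                return; // duplicate delivery of an already-processed record
            }
            process(record.value()); // business logic runs at most once per id
        }

        private void process(String payload) { /* ... */ }
    }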
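For story 4, a sketch of the two-category commit discipline, assuming manual offset commits (enable.auto.commit=false). An IO failure leaves the offset uncommitted and rewinds so the same record is retried; a business-logic failure emits DI_ERROR and commits so the job keeps progressing. The sendDiError publisher is hypothetical:

    import java.io.IOException;
    import java.time.Duration;
    import java.util.Collections;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    public class ErrorAwareConsumerLoop {
        void run(KafkaConsumer<String, String> consumer) {
            while (true) {
                for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofSeconds(1))) {
                    TopicPartition tp = new TopicPartition(rec.topic(), rec.partition());
                    try {
                        process(rec);                       // business logic
                        commit(consumer, tp, rec.offset()); // success: ACK
                    } catch (IOException infra) {
                        consumer.seek(tp, rec.offset());    // infra error: no ACK, re-read
                        break;                              // back off, then retry same record
                    } catch (Exception business) {
                        sendDiError(rec, business);         // hypothetical DI_ERROR publisher
                        commit(consumer, tp, rec.offset()); // business error: ACK, move on
                    }
                }
            }
        }

        private void commit(KafkaConsumer<String, String> c, TopicPartition tp, long offset) {
            c.commitSync(Collections.singletonMap(tp, new OffsetAndMetadata(offset + 1)));
        }

        private void process(ConsumerRecord<String, String> rec) throws IOException { /* ... */ }

        private void sendDiError(ConsumerRecord<String, String> rec, Exception cause) { /* ... */ }
    }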
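For problem 8, the partially-done direction is to stream the upload in bounded chunks to database/temporary storage instead of buffering the whole file on the heap. The chunk size and the store callback are illustrative assumptions:

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Arrays;
    import java.util.function.BiConsumer;

    public class ChunkedUploader {
        private static final int CHUNK_SIZE = 1024 * 1024; // 1 MiB per chunk (illustrative)

        // Heap usage stays constant no matter how large the uploaded file is.
        public static void store(InputStream upload, BiConsumer<Integer, byte[]> chunkStore)
                throws IOException {
            byte[] buf = new byte[CHUNK_SIZE];
            int chunkNo = 0;
            int read;
            while ((read = upload.read(buf)) != -1) {
                chunkStore.accept(chunkNo++, Arrays.copyOf(buf, read)); // e.g. INSERT one row
            }
        }
    }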
Links to additional info:
Data Import Stabilization plan - Vladimir Shalaev - FOLIO Wiki
Questions
Issue Links
defines:
- UXPROD-47 Batch Importer (Bib/Acq) (Analysis Complete)
- UXPROD-3135 NFR: R3 2021 Kiwi Data Import Stabilization and Reliability work (Closed)
is continued by:
- UXPROD-3210 NFR: R1 2022 Lotus Data import Stability/Reliability work (Closed)
is defined by:
- KAFKAWRAP-10 Provide property to set compression type for producer configuration (Closed)
- KAFKAWRAP-11 Remove duplicate kafka headers (Closed)
- MODDATAIMP-390 Spike: Memory not released after import (Closed)
- MODDATAIMP-430 Data Import logs shows file delete ERROR and stack trace (Closed)
- MODDATAIMP-440 SPIKE: Data Import job is creating duplicate records with or without any background activity (Closed)
- MODDATAIMP-465 Fix memory leaks after import (Closed)
- MODDATAIMP-474 SPIKE: Review PTF job that created more Inventory records than were in the file & fix (Closed)
- MODDATAIMP-544 Test and merge PRs on reducing DI event payload (Closed)
- MODDATAIMP-548 Provide system properties to set chunk size for each marc record format (Closed)
- MODDATAIMP-558 Hosted envs performance became slow after changes for reducing di-payload (Closed)
- MODDICONV-200 GET data-import-profiles/actionProfiles returns 200 response (empty body) when oom (Closed)
- MODDICORE-198 Fix the effect of DI_ERROR messages when trying to duplicate records on the import job progress bar (Closed)
- MODINV-405 Remove zipping mechanism for data import event payloads and use cache for params (Closed)
- MODINV-417 SPIKE: LeaveGroup request failed with error: The coordinator is not aware of this member (Closed)
- MODINV-493 Update marc-holdings related event handlers according to payload reducing work (Closed)
- MODINV-494 Ensure mapping parameters for holdings and item related handlers before mapping (Closed)
- MODINV-553 Fix the effect of DI_ERROR messages when trying to duplicate records on the import job progress bar (Closed)
- MODINVOICE-251 Remove zipping mechanism for data import event payloads and use cache for params (Closed)
- MODINVOICE-314 Provide recordId header for events correlation in mod-invoice (Closed)
- MODINVSTOR-794 Memory Leaks: io.vertx.core.impl.DuplicatedContext (Closed)
- MODSOURCE-286 Remove zipping mechanism for data import event payloads and use cache for params (Closed)
- MODSOURCE-339 SPIKE: Crash Postgres DB on Rancher during handling DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING (Closed)
- MODSOURCE-390 Fix the effect of DI_ERROR messages when trying to duplicate records on the import job progress bar (Closed)
- MODSOURMAN-463 Create storage and API for MappingRules and MappingParams (Closed)
- MODSOURMAN-464 Store snapshots of MappingRules and MappingParams to the database (Closed)
- MODSOURMAN-465 Remove MappingRules, MappingParams, and JobProfileSnapshot from the event payload (Closed)
- MODSOURMAN-466 Remove zipping mechanism for data import event payloads (Closed)
- MODSOURMAN-481 Clean-up backend log and gracefully handle ERROR message (Closed)
- MODSOURMAN-521 Remove duplicate kafka headers (Closed)
- MODSOURMAN-522 Fix the effect of DI_ERROR messages when trying to duplicate records on the import job progress bar (Closed)
- MODSOURMAN-575 Add mechanism for detection and logging inability to create/connect Kafka consumers (Closed)
- MODSOURMAN-579 Data Import failing to decode record chunk (Kiwi release) (Closed)
mentioned in: