Details
-
New Feature
-
Status: Closed
-
P2
-
Resolution: Done
-
None
-
None
-
Lotus R1 2022
-
Out of scope
-
Low
-
Jumbo: > 45 days
-
Folijet
-
-
125
-
R1
-
R1
Description
BE estimate is Jumbo; rough estimate is 120 days
Current situation or problem:
- High CPU/memory consumption on modules
- Duplicates may be created upon import for holdings and items (instances were fixed)
- Confirm that SRS does fail when processing during import
# If we have an infrastructure issue (such as the DB not being available, a module being restarted, or a network failure), we send DI_ERROR instead of retrying
Investigation required for:
- Race condition on start (Kafka consumers start working before the DB is configured) or periodic DB shutdown after SRS restart. Jobs get stuck if they cannot update their status in the DB (messages are ACKed even if we could not process them)
- Kafka consumers eventually stop reading messages, halting job progress until the module is restarted
- mod-data-import stores the input file in memory, which limits the size of the uploaded file and may cause out-of-memory (OOM) errors
- Consumer gets disconnected from the Kafka cluster
Proposed solution/stories
- Make consumers idempotent. Add a pass-through identifier to de-duplicate messages.
- Generate "INSTANCE CREATED" from mod-inventory. Consume it in SRS to update the HRID in the BIB, and in INVENTORY to continue processing.
- Do not ACK messages in Kafka when the failure is an infrastructure error/exception rather than a business-logic error. Split failed processing results into 2 categories:
  - IO errors: do not ACK; retry until fixed
  - Business logic errors: send DI_ERROR and ACK the current message
- Remove unnecessary topics (* ready for post processing and hrid set)
- De-duplicate status messages per-record while tracking progress
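The idempotency and ACK rules in the list above could be sketched roughly as follows. This is a minimal in-memory Python illustration, not the actual FOLIO or Kafka wrapper code: `InfrastructureError`, `BusinessLogicError`, `Outcome`, `Deduplicator`, `handle_message`, and the `event_id` pass-through identifier are all hypothetical names, and a real implementation would presumably keep processed identifiers in persistent storage rather than in a set.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Set


class InfrastructureError(Exception):
    """Hypothetical: DB unavailable, network failure, module restarting."""


class BusinessLogicError(Exception):
    """Hypothetical: the record itself cannot be processed."""


class Outcome(Enum):
    ACK = "ack"                      # processed; commit the offset
    ACK_DUPLICATE = "ack_duplicate"  # seen before; skip and commit
    ACK_WITH_DI_ERROR = "di_error"   # business failure; emit DI_ERROR, commit
    RETRY = "retry"                  # infrastructure failure; do not commit


@dataclass
class Deduplicator:
    # Assumption: an in-memory set stands in for a persistent store.
    seen: Set[str] = field(default_factory=set)

    def already_processed(self, event_id: str) -> bool:
        return event_id in self.seen

    def mark_processed(self, event_id: str) -> None:
        self.seen.add(event_id)


def handle_message(event_id: str, payload: dict,
                   process: Callable[[dict], None],
                   dedup: Deduplicator) -> Outcome:
    """Decide what the consumer should do with one message."""
    if dedup.already_processed(event_id):
        return Outcome.ACK_DUPLICATE
    try:
        process(payload)
    except InfrastructureError:
        # Not marked as processed, so a redelivery runs the handler again.
        return Outcome.RETRY
    except BusinessLogicError:
        dedup.mark_processed(event_id)
        return Outcome.ACK_WITH_DI_ERROR
    dedup.mark_processed(event_id)
    return Outcome.ACK
```

The point of the split is that only `RETRY` leaves the message unacknowledged, so an outage delays a record instead of turning it into a DI_ERROR, while a redelivered message that was already handled is acknowledged without being processed twice.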
One possible solution: split the input into chunks, store them in the database, and work from the database/temp storage. Partially done (to be investigated)
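As a rough illustration of the chunking idea, the sketch below reads an upload stream in fixed-size pieces and spools each piece to temporary storage, so the whole file is never held in memory at once. It is written in Python for brevity and makes several assumptions: `iter_chunks`, `spool_to_temp_storage`, and the `chunk-NNNNNN.bin` naming are hypothetical, and the split is by byte count, whereas a real import would presumably need to split on record boundaries.

```python
from pathlib import Path
from typing import BinaryIO, Iterator, List


def iter_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield successive pieces of at most chunk_size bytes from the stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


def spool_to_temp_storage(stream: BinaryIO, chunk_size: int,
                          directory: Path) -> List[Path]:
    """Write each chunk to its own file; only one chunk is in memory at a time."""
    paths: List[Path] = []
    for index, chunk in enumerate(iter_chunks(stream, chunk_size)):
        # Hypothetical naming scheme; zero-padded so files sort in order.
        path = directory / f"chunk-{index:06d}.bin"
        path.write_bytes(chunk)
        paths.append(path)
    return paths
```

Downstream processing would then read the stored chunks one by one, which bounds memory use by the chunk size rather than by the size of the uploaded file.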
Links to additional info:
Update to wherever the plan is now stored
Data Import Stabilization plan - Vladimir Shalaev - FOLIO Wiki
Questions
Issue Links
- continues
  - UXPROD-3193 NFR: R3 2021 Kiwi Data import Stability/Reliability work (Closed)
- is continued by
  - UXPROD-3429 NFR: R2 2022 Morning Glory Data import Stability/Reliability/Performance work (Closed)
- is defined by
  - KAFKAWRAP-3 Implement error handler contract for KafkaConsumerWrapper (Closed)
  - KAFKAWRAP-7 SPIKE: Prevent losing Kafka messages due to infrastructure failure (Closed)
  - MODDATAIMP-473 SPIKE: Review PTF reports and create tickets for improvements (Closed)
  - MODDATAIMP-491 Improve logging to be able to trace the path of each record and file_chunks (Closed)
  - MODDATAIMP-495 SPIKE: Analysis of the possibilities of implementing idempotence through monitoring processed records with the participation of the persistence level (Closed)
  - MODDATAIMP-500 SPIKE: Design approach for assigning UUIDs for entities created by DI (Closed)
  - MODDATAIMP-566 SPIKE: Investigate unreachable (garbage) objects (Closed)
  - MODINV-408 Implement ProcessRecordErrorHandler for Kafka Consumers (Closed)
  - MODINV-460 SPIKE: Analyze possibilities to implement some constraints on the persistence level for blocking duplicate "holdings" and "items". (Closed)
  - MODINV-547 Idempotence for createInstanceEventHandler (Closed)
  - MODINV-548 SPIKE: Investigate approach for events deduplication during holdings and item creation (Closed)
  - MODINV-584 Improve logging to be able to trace the path of each record and file_chunks mod-inventory (Closed)
  - MODINV-588 Implement deduplication for Instances (Closed)
  - MODINV-589 Implement deduplication for Holdings (Closed)
  - MODINV-590 Implement deduplication for Items (Closed)
  - MODINV-591 Implement deduplication for Authorities (Closed)
  - MODINVOICE-252 Implement ProcessRecordErrorHandler for Kafka Consumers (Closed)
  - MODSOURCE-290 Implement ProcessRecordErrorHandler for Kafka Consumers (Closed)
  - MODSOURCE-402 Properly handle DB failures during events processing (Closed)
  - MODSOURMAN-474 Implement ProcessRecordErrorHandler for Kafka Consumers (Closed)
  - MODSOURMAN-598 Properly handle DB failures during events processing (Closed)
- relates to
  - UXPROD-3471 NFR: R2 2022 Morning Glory: Implement flow control for Data Import (Closed)