mod-inventory / MODINV-400

Errors in Subsequent Updates


Details

    • Folijet Sprint 114
    • 0.5
    • Folijet
    • Yes

    Description

      Overview:
      When performing multiple updates (one update after another) on the same set of records, the following behaviors were observed, in the order the jobs were launched:

      1. First update: job status Completed.
      2. Second update: job status Completed with Errors.
      3. Third update: job status Failed.
      4. New update or create jobs could not even start.
      5. Restarting mod-inventory did not appear to change anything.

      • Note: When attempting a CREATE job after things got stuck at step 4, only the DI_RAW_MARC_BIB_RECORDS_CHUNK_READ and DI_RAW_RECORDS_CHUNK_PARSED topics had new messages in them; the other DI topics had no new messages.

      Between the second and third updates, we saw the following errors in mod-inventory:

      11:10:16 [] [] [] [] ERROR UpdateItemEventHandler Error updating inventory Item: org.folio.processing.exceptions.MappingException: java.lang.NullPointerException

      11:10:17 [] [] [] [] ERROR KafkaConsumerWrapper Error while processing a record - id: 20 subscriptionPattern: SubscriptionDefinition(eventType=DI_INVENTORY_ITEM_MATCHED, subscriptionPattern=cap2\.Default\.\w{4,}\.DI_INVENTORY_ITEM_MATCHED)

      io.vertx.core.impl.NoStackTraceThrowable: Failed to process data import event payload
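The progression from "Completed" to "Completed with Errors" to "Failed" suggests per-record error isolation in the event handler: a mapping failure on one record should be counted against the job rather than killing the consumer. A minimal, hypothetical sketch of that pattern (names like `map_item` are illustrative, not from mod-inventory's actual code):

```python
# Hypothetical sketch: per-record error isolation in an event-handler loop.
# None of these names come from mod-inventory itself.

class MappingException(Exception):
    """Raised when a record cannot be mapped to an inventory Item."""

def map_item(record):
    # A missing field here is the kind of thing that surfaces as
    # "MappingException: java.lang.NullPointerException" in the logs.
    if record.get("fields") is None:
        raise MappingException("no 'fields' in record")
    return {"item": record["fields"]}

def process_chunk(records):
    errors = 0
    for record in records:
        try:
            map_item(record)
        except MappingException:
            errors += 1          # isolate the failure; keep consuming
    if errors == 0:
        return "Completed"
    if errors < len(records):
        return "Completed with errors"
    return "Failed"

print(process_chunk([{"fields": {"a": 1}}, {"fields": None}]))
# -> Completed with errors
```

If instead the exception escapes the per-record loop (as the `Failed to process data import event payload` line hints), the whole chunk, and eventually the job, fails.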

      In mod-srs's log there is this error message:

      org.jooq.exception.DataAccessException: SQL [null]; ERROR: insert or update on table "raw_records_lb" violates foreign key constraint "fk_raw_records_records"
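This error means a child row was written to `raw_records_lb` without (or after deletion of) its parent row in the referenced records table. The failure mode can be reproduced in miniature with SQLite; the simplified table and column names below are assumptions based only on the constraint name in the log, not mod-srs's real schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires opting in to FK enforcement
conn.execute("CREATE TABLE records_lb (id TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE raw_records_lb (
        id TEXT PRIMARY KEY,
        -- mirrors fk_raw_records_records from the mod-srs log
        FOREIGN KEY (id) REFERENCES records_lb (id)
    )
""")

try:
    # No parent row in records_lb yet -> same class of failure as the
    # DataAccessException reported by jOOQ in mod-srs.
    conn.execute("INSERT INTO raw_records_lb (id) VALUES ('rec-1')")
except sqlite3.IntegrityError as e:
    print("FK violation:", e)

# Inserting the parent first makes the same insert succeed.
conn.execute("INSERT INTO records_lb (id) VALUES ('rec-1')")
conn.execute("INSERT INTO raw_records_lb (id) VALUES ('rec-1')")
```

In other words, the ordering of the two inserts (or a concurrent delete of the parent) is enough to trigger exactly this constraint violation.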

      After restarting mod-inventory, the following errors were logged in mod-inventory's log:

      15:58:59 [] [] [] [] INFO SubscriptionState [Consumer clientId=kafka-cache-reader-events_cache, groupId=kafka-cache-e98dd93c4557] Resetting offset for partition events_cache-0 to offset 26661431.

      Exception in thread "main" java.util.concurrent.TimeoutException

      Sporadically, the following warnings and messages were logged as well:

      14:36:44 [] [] [] [] WARN ? Thread Thread[vert.x-worker-thread-14,5,main] has been blocked for 90230 ms, time limit is 60000 ms

      14:40:30 [] [] [] [] INFO ConsumerCoordinator [Consumer clientId=consumer-DI_SRS_MARC_BIB_RECORD_MATCHED.mod-inventory-16.3.1-22, groupId=DI_SRS_MARC_BIB_RECORD_MATCHED.mod-inventory-16.3.1] Setting offset for partition cap2.Default.fs09000000.DI_SRS_MARC_BIB_RECORD_MATCHED-0 to the committed offset FetchPosition{offset=2000, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[b-1.temp-data-import-test.kh4zs4.c11.kafka.us-east-1.amazonaws.com:9092 (id: 1 rack: use1-az4)], epoch=0}}
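The "blocked for 90230 ms, time limit is 60000 ms" warning means a single operation on a worker thread ran well past Vert.x's 60 s blocked-thread limit. A generic sketch (not Vert.x code) of the usual mitigation, bounding a blocking call with a timeout so the caller fails fast instead of pinning the worker indefinitely:

```python
import concurrent.futures
import time

def slow_cache_cleanup(duration):
    # Stand-in for whatever long-running work blocked the worker thread.
    time.sleep(duration)
    return "done"

def run_with_limit(fn, *args, limit):
    """Run fn in a pool thread; give up after `limit` seconds."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn, *args)
        try:
            return future.result(timeout=limit)
        except concurrent.futures.TimeoutError:
            return "timed out"

print(run_with_limit(slow_cache_cleanup, 0.01, limit=1.0))   # done
print(run_with_limit(slow_cache_cleanup, 0.3, limit=0.05))   # timed out
```

Note that a timeout only unblocks the caller; the underlying work still has to be made interruptible (or chunked) for the worker thread itself to be freed.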

      Could the process that cleans the cache be getting blocked, or be creating a blocker for downstream processing?
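If the cache-cleanup process does hold a lock for the whole sweep, one common mitigation is evicting in small batches, releasing the lock between batches so consumers can interleave. A purely hypothetical sketch of that idea (nothing here is from the actual kafka-cache implementation):

```python
import threading

# Toy cache: 10 entries; assume even-valued entries are "expired".
cache = {f"event-{i}": i for i in range(10)}
lock = threading.Lock()

def evict_expired(is_expired, batch_size=3):
    """Evict in batches, releasing the lock between batches so
    readers/writers are never blocked for the whole sweep."""
    evicted = 0
    while True:
        with lock:
            batch = [k for k in cache if is_expired(cache[k])][:batch_size]
            for key in batch:
                del cache[key]
                evicted += 1
        if len(batch) < batch_size:   # last (possibly partial) batch done
            return evicted

print(evict_expired(lambda v: v % 2 == 0))  # prints 5 (the even-valued entries)
```

With a single long-held lock instead, any consumer that needs the cache would stall for the full duration of the sweep, which matches the blocked-thread warnings above.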

      Steps to Reproduce:
      abreaux's video in the dataimport_folijet_ptf Slack channel has all the steps: https://folio-project.slack.com/archives/G01PFEDAF6H/p1619499196163700

      Expected Results:
      Multiple updates (via the data import mechanism) on the same dataset should work reliably, without causing errors or bogging down data import to the point of being unusable for everyone.

      Actual Results:
      See above description.

      Interested parties:
      abreaux OleksiiKuzminov


              People

                ruslan_lavrov Ruslan Lavrov
                mtraneis Martin Tran
