  1. mod-data-import
  2. MODDATAIMP-419

Import fails with "'idx_records_matched_id_gen', duplicate key value violates unique constraint" in SRS logs


    Details

    • Sprint:
      Folijet Sprint 117
    • Story Points:
      3
    • Development Team:
      Folijet
    • Release:
      R3 2021 Bug Fix
    • Affected Institution:
      Cornell

      Description

      This issue is observed in a Honeysuckle HF3 environment and in Iris, but may have been fixed by Hotfix 1.

      Additional info from one of Nick Cappadona's comments below:

      1. Data Import Run A: Create 10,300 SRS MARC Bib and Instances using new Data Import default job profile for Iris (cu_initial-create_10300recs.mrc)
      2. Retrieve Instance UUIDs via SRS MARC Query API:
        { "fieldsSearchExpression": "948.d ^= 'cu-batch'" }
      3. Export the full MARC for the 10,300 records using Data Export default job profile
      4. Create & associate DI profiles
      5. Process the exported MARC (clean up OCLC identifiers in 035$a)
        Note: When we originally ran this and encountered the error, we DID NOT strip the 999 ff
      6. Data Import Run B: Update the SRS MARC Bib records using job profile from #4 (1 file: 10,300 records)
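
The external script used in step 5 is not attached to this ticket. The sketch below shows one plausible form of the 035$a cleanup, under the assumption that "cleanup" means dropping an `ocm`/`ocn`/`on` prefix and leading zeros from the number after `(OCoLC)`. That assumption is inferred from the follow-up query in this description, which looks for values starting with `(OCoLC)oc` or `(OCoLC)0`. The function name `clean_oclc` is hypothetical.

```python
import re

# Assumption: cleanup = remove an ocm/ocn/on prefix and leading zeros.
# The ticket does not show the real script, so this pattern is illustrative only.
_OCLC = re.compile(r"^\(OCoLC\)(?:ocm|ocn|on)?0*(\d+)\s*$")


def clean_oclc(value: str) -> str:
    """Return a normalized 035$a value, or the input unchanged if it is not an OCLC number."""
    match = _OCLC.match(value.strip())
    if match is None:
        # Not an OCLC identifier (for example a "(DLC)" value): leave it alone.
        return value
    return "(OCoLC)" + match.group(1)
```

Values from other sources pass through untouched, so the function can be applied to every 035$a in a record without first filtering by prefix.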

      I decided to try running through the test on the Iris reference environment and could not complete all of the steps. Here are the results:

      1. DI Run A: Initial create of 10,300 SRS MARC Bib and Instances (hrid: 17): 12 m
      2. Retrieve Instance UUIDs via SRS MARC Query API: 629 ms { "fieldsSearchExpression": "948.d ^= 'cu-batch'" }
      3. Export the full MARC for the 10,300 records (hrid: 8): 3 m
      4. DI profiles ported via API and manually linked/related
      5. Process the exported MARC (clean up OCLC identifiers in 035$a AND strip 999 ff) with external script: 4 s
      6. DI Run B: Update the 10,300 SRS MARC Bib records: Stuck at 37% after ~10 m

      A follow-up query to the SRS MARC Query API reveals that 6,471 of the records remain in their original state, so we can conclude that 3,829 of the 10,300 were updated (37%):

      { "fieldsSearchExpression": "(035.a ^= '(OCoLC)oc' or 035.a ^= '(OCoLC)0') and 948.d ^= 'cu-batch'" }
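
The arithmetic behind the 37% figure can be checked with a short sketch. The variable names are illustrative, and the request body is simply the query shown above serialized as JSON; how it is sent to the SRS MARC Query API is not covered here.

```python
import json

# Request body for the follow-up query, copied from the description above.
query = {
    "fieldsSearchExpression": (
        "(035.a ^= '(OCoLC)oc' or 035.a ^= '(OCoLC)0') and 948.d ^= 'cu-batch'"
    )
}
body = json.dumps(query)

total_records = 10_300   # records created in DI Run A
still_original = 6_471   # records the follow-up query still matches

# Records that no longer match the query are taken to have been updated.
updated = total_records - still_original
percent_updated = round(100 * updated / total_records)

print(updated, percent_updated)  # prints "3829 37"
```

The result matches the progress bar, which stalled at 37%.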

      ======================

      When attempting to import a file, the import fails and the following message is observed in the mod-source-record-storage logs:
      "idx_records_matched_id_gen", duplicate key value violates unique constraint

      The new imports that fail are attempts to update records from a previous Data Import of a batch of approximately 14K records that also had issues (specifically, the earlier batch finished with the status "Completed with errors").

      Analysis is needed to understand the state of the related database table entries for the Data Import attempts that fail with this constraint error.
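
As a starting point for that analysis, the sketch below checks which keys would violate the constraint. It assumes the unique index covers the pair (matched_id, generation), as the name `idx_records_matched_id_gen` suggests; the ticket does not state the index definition. The function name and the sample rows are hypothetical, and the rows would have to come from a database export.

```python
from typing import Iterable, List, Tuple

# Assumption: the unique index is on (matched_id, generation).
Key = Tuple[str, int]


def colliding_keys(existing: Iterable[Key], incoming: Iterable[Key]) -> List[Key]:
    """Return incoming keys that duplicate an existing key or an earlier incoming key."""
    seen = set(existing)
    collisions = set()
    for key in incoming:
        if key in seen:
            collisions.add(key)
        seen.add(key)
    return sorted(collisions)


# Hypothetical data: an update that reuses generation 0 instead of incrementing it.
existing_rows = [("a", 0), ("b", 0)]
incoming_rows = [("a", 1), ("b", 0), ("c", 0), ("c", 0)]
print(colliding_keys(existing_rows, incoming_rows))  # prints [('b', 0), ('c', 0)]
```

A collision with an existing row would point to a generation that was not incremented, while a collision within the incoming rows would point to the same record being written twice in one run.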


          Attachments

          1. 419_issue.jpg
            108 kB
          2. cu_initial-create_10300recs.mrc
            10.75 MB
          3. image-2021-06-29-16-09-13-617.png
            164 kB
          4. image-2021-06-29-16-58-14-498.png
            155 kB
          5. job-execution-12971.json
            311 kB
          6. moddataimp419_action.json
            0.3 kB
          7. moddataimp419_job.json
            0.3 kB
          8. moddataimp419_mapping.json
            7 kB
          9. moddataimp419_match.json
            2 kB
          10. mod-srm-dcnl.csv
            114 kB
          11. mod-srs-dcnl.csv
            257 kB
          12. Screen Shot 2021-04-26 at 9.25.04 PM.png
            379 kB
          13. Screen Shot 2021-05-12 at 3.41.39 PM.png
            261 kB
          14. Screen Shot 2021-10-11 at 6.52.06 PM.png
            208 kB


                People

                Assignee:
                afedasiuk Aliaksandr Fedasiuk
                Reporter:
                cgodfrey Carole Godfrey