  1. mod-source-record-storage
  2. MODSOURCE-121

Improve performance for receiving records from SRS

Details

    • Folijet

    Description

      Five use cases have been identified for retrieving MARC source records.

      Use Case | Comment
      1. Retrieve single records based on their Instance ID | Feeds the View Source in the UI.
      2. Retrieve all records | Retrieve paged responses as quickly as possible for the OAI-PMH modules.
      3. Retrieve records for a period based on createdDate and updatedDate | Retrieve records filtered by their createdDate and updatedDate for certain date and time spans for OAI-PMH.
      4. Retrieve MARC records for data export | Identifying records for the export will be Inventory driven, so the underlying MARC records will be fetched from SRS based on the Instance Id (and later by Holdings Id as well).
      5. Retrieve MARC records based on custom criteria |

      High-level strategy to improve SRS performance.

      1. We have to replace the RMB PostgreSQL Client and CQL, because they do not provide fine-grained control over how SQL statements, and especially WHERE clauses, are generated.
      2. Fields in the JSON documents that are used heavily in search conditions must be extracted into separate table columns, and DB indexes must be created for those columns based on the use case analysis. As a consequence, DB tables must be managed by Liquibase instead of RMB and schema.json.
      3. We must define several data retrieval strategies, each serving one or more particular use cases.
        1. Retrieve all MARC source records in chunks. The row set must be ordered, and the most efficient ordering here is by Id. (Covers use case 2)
        2. Retrieve MARC source records for a period in chunks. The row set must be ordered, and the most efficient ordering here is by updatedDate. To keep the criteria simple, only updatedDate is considered; createdDate should not be used. (Covers use case 3)
        3. Retrieve MARC source records by Instance Ids. The InstanceId field must be indexed. (Covers use cases 1 and 4)
        4. Retrieve MARC records by custom criteria. (Covers use case 5)
      4. Data structures returned by SQL queries and by HTTP endpoints must be narrowed to contain only the data a consumer needs (see the SQL functions in the script).
      5. Because these changes are breaking, they must be implemented as soon as possible. Migration scripts must also be created (see the script for a baseline).
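
      The chunked and indexed retrieval strategies above can be illustrated with a minimal sketch. It is not the module's implementation: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL, and the table name `records`, its columns, the index names, and all function names are hypothetical. Using `id` as a tie-breaker when ordering by the update date is also an assumption, added so that rows sharing a timestamp are not skipped between chunks.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical table: instance_id and updated_date are promoted out of
    # the JSON document into real columns so that they can be indexed.
    conn.executescript("""
        CREATE TABLE records (
            id           TEXT PRIMARY KEY,
            instance_id  TEXT,
            updated_date TEXT,
            content      TEXT
        );
        CREATE INDEX idx_records_instance_id ON records (instance_id);
        CREATE INDEX idx_records_updated_date ON records (updated_date, id);
    """)


def fetch_chunk(conn, after_id, limit):
    # Strategy 1: all records in chunks, ordered by id. The next chunk starts
    # after the last id seen, so no OFFSET scan is needed.
    return conn.execute(
        "SELECT id, content FROM records WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, limit),
    ).fetchall()


def iter_all(conn, limit):
    # Walk the whole table chunk by chunk until an empty chunk is returned.
    last_id = ""
    while True:
        chunk = fetch_chunk(conn, last_id, limit)
        if not chunk:
            return
        yield chunk
        last_id = chunk[-1][0]


def fetch_updated_chunk(conn, start, end, after, limit):
    # Strategy 2: records updated in [start, end), ordered by updated_date
    # with id as a tie-breaker (an assumption of this sketch).
    after_date, after_id = after
    return conn.execute(
        "SELECT id, updated_date, content FROM records "
        "WHERE updated_date >= ? AND updated_date < ? "
        "AND (updated_date > ? OR (updated_date = ? AND id > ?)) "
        "ORDER BY updated_date, id LIMIT ?",
        (start, end, after_date, after_date, after_id, limit),
    ).fetchall()


def fetch_by_instance_ids(conn, instance_ids):
    # Strategy 3: lookup by instance ids, served by the instance_id index.
    if not instance_ids:
        return []
    placeholders = ",".join("?" for _ in instance_ids)
    return conn.execute(
        "SELECT instance_id, content FROM records "
        f"WHERE instance_id IN ({placeholders}) ORDER BY instance_id",
        list(instance_ids),
    ).fetchall()
```

      A caller would pass the last row of one chunk as the starting point of the next, for example `fetch_updated_chunk(conn, start, end, (last_date, last_id), limit)`. Each query selects only the columns a consumer needs, in line with the point about narrowing returned data.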

              People

                OleksiiKuzminov Oleksii Kuzminov