FOLIO / FOLIO-2050

SPIKE Design batch create / update API endpoint standard



    • Development Team: Core: Platform



      There have been frequent requests for batch APIs for various types of records in FOLIO.

      Some modules have started to implement batching, either in response to these requests or due to performance concerns.

      These implementations all differ; I think it could be valuable to try to decide on a common pattern for them.

      Framing Questions

      Size expectations or restrictions

      Jon Miller has suggested that we may want to use this to load typical batch sizes of 100 to 2,000 records.
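      Whatever limit is chosen, a client loading more records than the limit allows would need to split its input. A minimal sketch, assuming a hypothetical per-request cap of 1,000 records (the cap itself is an illustration, not a decided value):

```python
def chunk(records, max_batch_size=1000):
    """Split a record list into batches no larger than max_batch_size."""
    for start in range(0, len(records), max_batch_size):
        yield records[start:start + max_batch_size]

# e.g. 2,500 records become three batches of 1000, 1000 and 500
batches = list(chunk(list(range(2500))))
```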

      Synchronous or asynchronous response

      Should the server wait to respond until the batch processing has finished, or should it respond promptly (perhaps after some validation) with the ability to monitor the status of the operation?

      How does this affect the client?

      Is this decision affected by the batch size we allow, given that size is likely a primary component of latency?

      Should a response include a complete representation of all of the records created, only references to them, or no information at all (except failures, depending upon the question below)?
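      To make the asynchronous option concrete, here is a minimal in-memory sketch of a server that accepts a batch promptly and hands back a status reference for polling. The path shape, the status vocabulary, and the function names are assumptions for illustration only, not an agreed FOLIO standard:

```python
import uuid

# In-memory stand-in for the server side of an asynchronous batch API.
operations = {}

def submit_batch(records):
    """Accept the batch after cheap validation and return a status reference.

    A real implementation would hand the records to a worker and respond
    with something like HTTP 202 plus a Location header for polling.
    """
    operation_id = str(uuid.uuid4())
    operations[operation_id] = {"status": "in_progress", "total": len(records)}
    return {"statusUrl": f"/batch-status/{operation_id}"}

def get_status(operation_id):
    """Return the current state of a previously submitted batch."""
    return operations[operation_id]
```

      With this shape the client's latency question becomes one of polling frequency rather than connection hold time.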

      Complete or partial success / failure

      Should a batch only succeed if all records are valid, or should it be acceptable for some records to be invalid?

      What should happen if all of the records are valid but persistence of some of them fails (this is likely related to the transactions topic below)?
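      If partial success is allowed, the response needs per-record outcomes rather than a single pass/fail. A sketch of one possible result shape (the field names are hypothetical):

```python
def process_batch(records, validate):
    """Attempt every record, reporting per-record outcomes instead of
    failing the whole batch on the first invalid record."""
    results = {"succeeded": [], "failed": []}
    for index, record in enumerate(records):
        error = validate(record)
        if error:
            results["failed"].append({"index": index, "error": error})
        else:
            results["succeeded"].append(index)
    return results
```

      A client can then retry or correct only the failed indices rather than resubmitting the whole batch.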

      Database transactions (specific to storage modules)

      Should the records that are created be done so in a single transaction?

      How could this decision affect the handling of partial success or failure, if we decide we also want that?

      How does this affect resource usage? For example, a connection has to be used exclusively for each batch operation, which could lead to connection contention within the module.
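      The all-or-nothing option can be sketched with a single database transaction; SQLite stands in here for the module's real database, and the table is hypothetical. If any insert fails, nothing from the batch is persisted:

```python
import sqlite3

def create_batch_in_one_transaction(conn, titles):
    """Insert all records in a single transaction: if any insert fails,
    the whole batch is rolled back (all-or-nothing)."""
    try:
        with conn:  # commits on success, rolls back on exception
            for title in titles:
                conn.execute("INSERT INTO instance (title) VALUES (?)", (title,))
        return True
    except sqlite3.Error:
        return False
```

      Note that this rules out partial success within the transaction, and it holds the connection for the full duration of the batch, which is the contention concern raised above.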

      Streamed processing of records

      To constrain memory usage during batch operations, should the set of records be processed as a stream of single records (or small chunks)?

      How does this affect validation, any restrictions on batch size, or database transaction semantics?

      For example, if we wanted to validate all records prior to any persistence, we might need to be able to process the stream more than once.
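      A sketch of the single-pass approach, assuming (purely for illustration) a newline-delimited JSON payload. Validating and persisting in the same pass keeps memory bounded by one record; validating the whole batch before persisting anything would instead require buffering the records or reading the stream twice:

```python
import json

def stream_records(lines):
    """Parse one record per line so memory use is bounded by a single
    record, not the whole batch."""
    for line in lines:
        yield json.loads(line)

def persist_validated(stream, validate, persist):
    """Single pass: validate and persist each record as it arrives,
    collecting failures instead of buffering the whole batch."""
    failures = []
    for index, record in enumerate(stream):
        error = validate(record)
        if error:
            failures.append((index, error))
        else:
            persist(record)
    return failures
```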

      Processing Semantics

      • Optional ID
      • JSON schema validation
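      A sketch of those two semantics together: a server-generated UUID when the client omits the ID, and a required-field check standing in for full JSON schema validation (the field set and function name are hypothetical):

```python
import uuid

REQUIRED_FIELDS = {"title"}  # stand-in for a real JSON schema

def prepare_record(record):
    """Reject records missing required fields, and assign a
    server-generated UUID when the client omits 'id'."""
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    if "id" not in record:
        record = {**record, "id": str(uuid.uuid4())}
    return record
```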

      What requirements are we missing?

                sekjal Ian Walls
                marcjohnson Marc Johnson