UX Product / UXPROD-1752

Prevent update conflicts (via optimistic locking): platform support for detection


    Details

    • Template:
      UXPROD features
    • Back End Estimate:
      XXL < 30 days
    • Development Team:
      Core: Platform
    • Calculated Total Rank:
      130
    • Rank: Chalmers (Impl Aut 2019):
      R1
    • Rank: Chicago (MVP Sum 2020):
      R1
    • Rank: Cornell (Full Sum 2021):
      R1
    • Rank: Duke (Full Sum 2021):
      R1
    • Rank: 5Colleges (Full Jul 2021):
      R1
    • Rank: FLO (MVP Sum 2020):
      R1
    • Rank: GBV (MVP Sum 2020):
      R1
    • Rank: hbz (TBD):
      R1
    • Rank: Lehigh (MVP Summer 2020):
      R1
    • Rank: Leipzig (Full TBD):
      R1
    • Rank: Leipzig (ERM Aut 2019):
      R1
    • Rank: MO State (MVP June 2020):
      R1
    • Rank: TAMU (MVP Jan 2021):
      R1
    • Rank: U of AL (MVP Oct 2020):
      R1

      Description

      NOTE: Optimistic locking is the solution described in this feature. Libraries ranked this based on the idea of preventing update conflicts, not necessarily based on the specific solution of optimistic locking.

      Problem statement

      In FOLIO, most storage modules follow the "last writer wins" strategy for handling record updates. From the UI perspective, this can lead to a situation where a stale record (an older version of a given record) previously loaded into the UI overwrites a more recent version on the server. Relevant updates may thus be lost in the process, and the user is not made aware of what has happened.
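      The lost-update hazard described above can be sketched in a few lines of Python (illustrative only; FOLIO storage modules are Java and the record shape here is invented):

      ```python
      # Two clients read the same record, then both write back;
      # the second write silently discards the first ("last writer wins").
      store = {"item-1": {"status": "Available", "note": ""}}

      copy_a = dict(store["item-1"])  # User A loads the record
      copy_b = dict(store["item-1"])  # User B loads the same record

      copy_a["status"] = "Checked out"
      store["item-1"] = copy_a        # A saves

      copy_b["note"] = "damaged spine"
      store["item-1"] = copy_b        # B saves a stale copy: A's status change is lost

      print(store["item-1"])          # {'status': 'Available', 'note': 'damaged spine'}
      ```

      Neither client did anything wrong; the server simply has no way to tell that B's copy was stale.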

      Scope: The scope of this issue is to create platform support for optimistic locking which modules can make use of on a case-by-case basis (opt-in). The focus of this feature is on simple "detection" and "prevention" (identifying when a collision has occurred and preventing it). Additional tools and mechanisms for handling collisions when they occur (e.g. diffs, merges, etc.) are out of scope. There are 3 phases, two of which are in scope for this feature:

      1. Detect collisions but do not prevent them. Just log in the system log that a mid-air collision has occurred. There is no instant benefit from this behaviour — the platform remains susceptible to collisions. But, once detection is deployed, we can review the logs to evaluate how often collisions occur and which APIs are at risk.
      2. Prevent an update when a collision gets detected. This builds on detection and additionally prevents the update from taking place. This is a “breaking” change from the API point of view: clients (end-users or batch processes alike) will start seeing an error returned (409 Conflict) when their update collides with another update. The immediate benefit is that we “protect” the system from collisions, but we also create a fairly terrible user experience and probably “break” a lot of batch processes that right now happily update records because FOLIO is so forgiving. This will be implemented as an opt-in feature so functional apps can adopt it when ready.
      3. (Out of scope) Tools: build tools for handling the “409 Conflict” errors. It could be a simple “resubmit my changes anyway” button in the UI that lets the user force their changes (the risk of messing up is then with the user) and a way to “retry” for batch processes. It could also be something fancier where end-users can review the conflict and choose which changes to keep and which to drop, etc.
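      Phases 1 and 2 can be sketched as a single version check with a detect-only vs. enforce switch (a hypothetical sketch, not FOLIO code; the `enforce` flag and record shape are invented for illustration):

      ```python
      # Phase 1: log mid-air collisions. Phase 2: additionally reject the write.
      import logging

      logging.basicConfig(level=logging.WARNING)
      log = logging.getLogger("collisions")

      def update(store, key, new_record, client_version, enforce=False):
          current_version = store[key]["_version"]
          if client_version != current_version:
              log.warning("mid-air collision on %s (client=%s, server=%s)",
                          key, client_version, current_version)
              if enforce:
                  return 409  # phase 2: prevent the update
          new_record["_version"] = current_version + 1
          store[key] = new_record
          return 204

      store = {"item-1": {"_version": 7, "status": "Available"}}
      # A stale client holds version 6; in detect-only mode the write still succeeds:
      print(update(store, "item-1", {"status": "Checked out"}, 6))            # 204
      # In enforce mode the same stale write (server is now at 8) is rejected:
      print(update(store, "item-1", {"status": "Missing"}, 6, enforce=True))  # 409
      ```

      Running in detect-only mode first lets the logs show how often collisions actually occur before any client-facing behaviour changes.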

      Proposed solution

      Handling of updates in FOLIO should rely on more explicit semantics, both in the storage (backend) APIs and the way it is communicated to the user through the UI.

      From the storage and API perspective, optimistic locking is the proposed strategy to handle conflicts:

      • optimistic locking – each record state is marked with a "version number" (or a timestamp, hash, etc.) which is returned to the client along with the record. The client includes the version number in the update, and the server checks that the version hasn't changed before it writes the record back. If the record is dirty (the version doesn't match) the update is aborted. In practice, for a REST API (the typical FOLIO use case), this means using an ETag in combination with an If-Match conditional request and the 412 (Precondition Failed) and 409 (Conflict) error codes.
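      The conditional-request flow above can be sketched as follows (a hedged illustration using the HTTP status codes directly; `handle_put` and the record shape are invented, and a real FOLIO module would implement this in Java):

      ```python
      # A PUT carrying an If-Match header succeeds only if the client's ETag
      # still matches the server's; otherwise 412 (Precondition Failed).
      def handle_put(record, if_match, new_body):
          """Return (status, record) for a PUT with an If-Match header."""
          if if_match != record["etag"]:
              return 412, record  # record changed on the server; refuse the write
          updated = {"etag": str(int(record["etag"]) + 1), "body": new_body}
          return 204, updated     # write accepted; a new ETag is issued

      record = {"etag": "3", "body": {"title": "Hamlet"}}
      status, record = handle_put(record, "3", {"title": "Hamlet, 2nd ed."})
      print(status)  # 204
      status, record = handle_put(record, "3", {"title": "stale edit"})
      print(status)  # 412: the ETag is now "4", so the stale write is refused
      ```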

      In general, optimistic locking is used when the risk of collisions (updates to the same record) is low and when the lock granularity is high (i.e. the duration of any given update is short).

      Use cases (collected from the community – add others that seem likely)

      • Not frequent: 2 users editing the same record at the same time
        • User A and User B editing the same record at the same time (not frequent) – users, orders, instances, holdings, items, requests
        • User A editing an item and User B creating a request for that item
        • User A editing an item and User B putting that item on course reserve at the same time
        • User A editing an invoice and User B trying to approve the same invoice at the same time
        • User A editing an item and User B deleting the item before User A's edits are saved (see UIIN-730)
        • User A editing a request and User B cancelling the request before User A's edits are saved (see UIREQ-344)
        • When attempting to update holdings and their items concurrently, the holdings updates will every so often interfere with the item updates, effectively nullifying the latter (see MODINVSTOR-516). This particular item is being addressed via RMB-388.
        • User A and User B generating a new number using the number generator for call number or accession number (number generator runs separate queries for selecting and incrementing the number (GBV); not relevant if FOLIO combines select and increment into one query) (not a challenge in FOLIO because the functionality does not exist)
      • More frequent: 1 user and system trying to act on the same record, either individual records or batch
        • User A editing a user and system batch process is updating lots of users
        • User A editing an instance/holding/item and data import updating the same record (consider the DI redesign that is taking place now)
        • User A editing an item and checkout trying to update the item status
        • User A editing an item and bulk renewal trying to update the item
        • User A editing a budget and system applying a transaction to that budget at the same time
        • User A editing an instance/holdings/item after data import ran in Preview mode but before the data import changes were committed
        • User A editing a request while the request is being expired (request expiration date or hold shelf expiration date) - rare
      • Two automated processes acting on the same record
        • Checkout happening and updating status on an item record at the same time as import updating the item
        • Data import happening at 2 libraries within the same tenant, affecting the same record (e.g. 5 Colleges processing new cataloging records)

      User impact

      This approach will not prevent collisions, but it will notify the user when they happen and offer them a choice. Something like: "Sorry, AgentB has already updated the record and your working copy might not be up to date. Would you like to: (a) Update anyway (b) Reload."

      • OL means that in certain situations the update operation will fail, which needs to be communicated to the user. The UI should then allow the user to choose the next step, e.g. by refreshing the state of the record in the browser and re-applying the original changes.

      This situation can happen when multiple data imports are happening at the same time (or data import and a user acting on the same record at the same time) and can affect many records at the same time. Cleanup can then be very time-consuming and confusing.
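      The "refresh and re-apply" recovery described above could look roughly like this on the client side (an assumption, not a FOLIO API; `get`, `put`, and `save_with_retry` are invented names, and the conflict is simulated locally):

      ```python
      # On a 409, reload the record, re-apply the user's edits, and retry.
      store = {"item-1": {"_version": 1, "status": "Available", "note": ""}}

      def get(key):
          return dict(store[key])

      def put(key, record, version):
          if version != store[key]["_version"]:
              return 409                     # conflict: the server record has moved on
          record["_version"] = version + 1
          store[key] = record
          return 204

      def save_with_retry(get, put, key, changes, max_attempts=3):
          for _ in range(max_attempts):
              record = get(key)                   # refresh the server state
              candidate = {**record, **changes}   # re-apply the user's edits on top
              status = put(key, candidate, record["_version"])
              if status != 409:
                  return status                   # success (or a non-conflict error)
          return 409                              # give up; surface the conflict to the user

      calls = {"n": 0}
      def flaky_put(key, record, version):
          calls["n"] += 1
          if calls["n"] == 1:                # simulate a concurrent update that
              store[key]["_version"] += 1    # lands between our read and our write
          return put(key, record, version)

      print(save_with_retry(get, flaky_put, "item-1", {"note": "damaged"}))  # 204
      ```

      Blind re-apply is only safe when the user's edits do not overlap the concurrent change; anything smarter (diff/merge UIs) belongs to the out-of-scope phase 3.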

                People

                Assignee:
                Jakub Skoczen
                Reporter:
                Jakub Skoczen
                Votes:
                0
                Watchers:
                26
