FOLIO-704

Investigate a central compound object index

Details

    • CP: Roadmap backlog
    • Core: Platform

    Description

      The front-end needs a way to search for users based on related objects (e.g., get a list of all users with permissions X, Y and Z), so it makes sense to evaluate the benefits of maintaining an index of composite objects to facilitate this.

      For example, to execute the previously mentioned request, we first have to query the permissions module for all permission-association objects that contain permissions X, Y and Z. From this list of objects, we then have to compile the list of usernames they refer to. Next, we have to query the user module for all user records that match our list of usernames. Finally, depending on the return format requested by the client, we may have to join the user objects with the permissions objects to return the compound result.
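
      To make the cost of the "manual searching" approach concrete, here is a minimal sketch of the four steps above. The two lists are hypothetical in-memory stand-ins for the permissions and user modules, and the record shapes are assumptions, not the real module schemas.

```python
# Hypothetical stand-ins for the data held by the permissions and user modules.
PERMISSIONS_USERS = [
    {"username": "alice", "permissions": ["X", "Y", "Z"]},
    {"username": "bob", "permissions": ["X", "Y"]},
    {"username": "carol", "permissions": ["W", "X", "Y", "Z"]},
]
USERS = [
    {"id": "u1", "username": "alice"},
    {"id": "u2", "username": "bob"},
    {"id": "u3", "username": "carol"},
]


def find_users_with_permissions(required):
    """Return joined user/permission records for users holding all `required`."""
    required = set(required)
    # Step 1: "query" the permissions module for matching association objects.
    matches = [p for p in PERMISSIONS_USERS if required <= set(p["permissions"])]
    # Step 2: compile the usernames those objects refer to.
    usernames = {p["username"] for p in matches}
    # Step 3: "query" the user module for the matching user records.
    users = [u for u in USERS if u["username"] in usernames]
    # Step 4: join users and permissions into the compound result.
    perms_by_username = {p["username"]: p["permissions"] for p in matches}
    return [
        {"user": u, "permissions": perms_by_username[u["username"]]}
        for u in users
    ]
```

      In a real deployment, steps 1 and 3 would each be at least one HTTP round trip, and sorting or paginating on a field from the other module would have to happen after the join.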

      The logic gets more complicated when we add in features like sorting the result set based on a field in a module outside of the users module, or doing pagination. We also run into the issue of degraded performance for the "manual searching" approach when we increase the number of records that need to be searched.

      A possible solution would be to use something that's happy indexing huge amounts of data (e.g., Solr) and insert composite records into this index.

      For the schema, we could adopt a dot notation to indicate the object and subfield. For example, to search by user ID, you'd use the field "user.id". For a permission name, something like "permissionsUser.permission_name".
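
      As a rough illustration of the dot notation, the sketch below flattens per-module records into one composite document. The helper names (`flatten`, `build_composite`) and the sample record shapes are made up for this example.

```python
def flatten(obj_name, record):
    """Flatten a record into {"<obj_name>.<field>": value} pairs."""
    flat = {}
    for key, value in record.items():
        path = f"{obj_name}.{key}"
        if isinstance(value, dict):
            # Nested objects simply extend the dotted path.
            flat.update(flatten(path, value))
        else:
            flat[path] = value
    return flat


def build_composite(user, permissions_user):
    """Merge a user record and its permissions record into one flat document."""
    doc = {}
    doc.update(flatten("user", user))
    doc.update(flatten("permissionsUser", permissions_user))
    return doc
```

      Calling `build_composite({"id": "u1"}, {"permission_name": ["X", "Y"]})` yields a document with the keys `user.id` and `permissionsUser.permission_name`.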

      The good news is that this makes record retrieval stupidly easy. It's just a straightforward query, and translating between CQL and Solr queries is not hard. We'd be free to introduce as much complex logic as we wanted, as well as choose fields to sort on and suchlike.
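
      To give a feel for the translation, here is a deliberately tiny sketch that only handles clauses of the form field = "value" joined by and/or. It is an assumption about what a first cut could look like, not a real CQL parser: no parentheses, relations other than =, modifiers, or escaping of backslashes inside values.

```python
import re

# One clause: a dotted field name, "=", and a double-quoted value.
_CLAUSE = re.compile(r'\s*([\w.]+)\s*=\s*"([^"]*)"\s*')
# One boolean operator between clauses.
_BOOL = re.compile(r'\s*\b(and|or)\b\s*', re.IGNORECASE)


def cql_to_solr(cql):
    """Translate a very small CQL subset into Solr's field:"value" syntax."""
    pos, parts, expect_clause = 0, [], True
    while pos < len(cql):
        match = (_CLAUSE if expect_clause else _BOOL).match(cql, pos)
        if not match:
            raise ValueError(f"unsupported CQL near position {pos}")
        if expect_clause:
            field, value = match.groups()
            parts.append(f'{field}:"{value}"')
        else:
            parts.append(match.group(1).upper())
        pos = match.end()
        expect_clause = not expect_clause
    if expect_clause:
        # Empty input, or the query ended on a dangling operator.
        raise ValueError("query must end with a clause")
    return " ".join(parts)
```

      Anything outside this subset raises `ValueError` rather than being silently mistranslated.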

      The downside, of course, is building and maintaining the index. How can this be done in a reliable fashion with the least amount of burden on the maintainers of the individual modules that contribute to the composite record?

      For example, we could maintain a message queue and allow modules to push messages to it whenever a record is created, updated or deleted. These messages would then be consumed in order and used to update the composite record index.
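
      A minimal in-process sketch of that idea follows, using Python's standard `queue` module in place of a real message broker and a plain dict in place of the index. The message shape (`op`, `object`, `key`, `record`) is an assumption for illustration, and each create/update message is assumed to carry the module's full record.

```python
import queue


def apply_message(index, msg):
    """Apply one create/update/delete message to the composite index."""
    doc = index.setdefault(msg["key"], {})
    prefix = msg["object"] + "."
    # Drop whatever this object previously contributed to the document.
    for field in [f for f in doc if f.startswith(prefix)]:
        del doc[field]
    if msg["op"] in ("create", "update"):
        for name, value in msg["record"].items():
            doc[prefix + name] = value
    if not doc:
        # No module contributes to this document any more.
        del index[msg["key"]]


def drain(q, index):
    """Consume all queued messages in FIFO order."""
    while True:
        try:
            msg = q.get_nowait()
        except queue.Empty:
            break
        apply_message(index, msg)
        q.task_done()
```

      Because messages are applied strictly in arrival order, a later update from one module overwrites only that module's fields and leaves the other modules' fields in the composite document untouched.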

      It is worth noting that Solr supports partial document updates, which seems particularly relevant here, since no single module would send enough information to update an entire document. https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
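
      As a sketch of what one module's contribution could look like, the helper below builds a document in the JSON shape Solr's atomic updates use, where each changed field maps to a {"set": value} modifier. The helper name, the document ID, and the dotted field names are assumptions carried over from the schema idea above; nothing is actually sent to a Solr server here.

```python
import json


def partial_update(doc_id, obj_name, changes):
    """Build an atomic-update document touching only one object's fields."""
    doc = {"id": doc_id}
    for field, value in changes.items():
        # "set" replaces the field's value and leaves other fields alone.
        doc[f"{obj_name}.{field}"] = {"set": value}
    return doc


# Body that would be POSTed to the collection's update handler.
payload = json.dumps(
    [partial_update("alice", "permissionsUser", {"permission_name": ["X", "Y"]})]
)
```

      The permissions module would then only ever send `permissionsUser.*` fields, and the user module only `user.*` fields.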

      Another option would be to try to implement this with no changes to the storage modules, and have a way to externally monitor for changes and then write them back to the index. One way to do this might be some kind of low-level database trigger, though I worry that this might really violate the KISS principle. Another possibility could be a filter-level Okapi module that would listen for write requests to various modules (e.g. PUT, POST, DELETE) and then put a message on a queue, prompting some process to query the module for changes and write these back to the index.
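
      The filter idea could be sketched roughly as below. This does not use Okapi's actual filter interface; `change_notice` is a hypothetical function that receives the method, path and response status of a proxied request, and the "/module/record-id" path layout is an assumption.

```python
WRITE_METHODS = {"POST", "PUT", "DELETE"}


def change_notice(method, path, status):
    """Return a queue message for successful writes, or None otherwise."""
    method = method.upper()
    if method not in WRITE_METHODS or not 200 <= status < 300:
        # Reads and failed writes cannot have changed any record.
        return None
    segments = [s for s in path.split("/") if s]
    if not segments:
        return None
    return {
        "module": segments[0],
        # A POST to a collection has no record ID in the path.
        "record_id": segments[1] if len(segments) > 1 else None,
        "method": method,
    }
```

      The notice carries no record data on purpose: the consuming process is expected to query the module for the current state, as described above.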

      Since this seems to potentially overlap several different issues, I'd like to determine fairly quickly whether or not this is a road worth going down, or if we want to try to implement the "manual searching" solution for the short term, at least.

        Attachments

          1. screenshot-1.png (39 kB)
          2. screenshot-2.png (27 kB)
          3. screenshot-3.png (29 kB)

              People

                shale99 shale99
                kurt Kurt Nordstrom

                  Time Tracking

                    Estimated: Not Specified
                    Remaining: 0m
                    Logged: 1h
