Uploaded image for project: 'FOLIO'
  1. FOLIO
  2. FOLIO-1246

Implement Postgres Full Text Search functionality



    • Core: Platform


      In Durham we have discussed that the current approach to searching (which, essentially, is based on translating CQL queries to complex PSQL queries with LIKE/ILIKE/regex and creating BTREE and GIN indices on particular down-cased/unaccented columns, with support from RMB and CQL2PG) may not be flexible enough to provide the level of search functionality and performance expected. (Note: we do not have concrete requirements here, we are assuming that what is expected matches functionality in modern OPAC systems).

      One of the approaches discussed was to use the Postgres built-in "Full Text Search" capability, documented here:


      In short, the essential pieces of full text search in PG are two functions/data structures:

      • tsvector
      • tsquery

      tsvector takes a “document” which can be a single column, multiple columns from a single table or multiple tables, a view, or the entire JSONB document and returns a list (vector) of lexemes, which are normalised tokens (case/accent normalised, with stopwords filtered and roots extrated).

      tsquery does the same but with the user input, e.g a list of words to match. tsquery supports the same booleans as CQL so the translation is rather trivial — & for AND, || for OR, ! for NOT, <-> for PROX

      One can run both tsvector and tsquery at the query time, but this obviously does not scale so normally tsvector is used at the indexing time to build a GIN index on the selected data.
      tsvector/tsquery are the most fundamental data structures, PG includes more advanced functionalities expected in a search engine: ranking, highlighting/snippets, etc. It does not include specialized facet data structure – this needs to be implemented through other methods.

      Since PG 10, tsvector can take the whole JSONB contents in one go:

      This, however, drops the keys and only indexes the “contents” of the JSONB, so we would not be able to “scope” the searches to a given field. One proposed approach here is to use triggers to do advanced indexing:

      TestRail: Results


          Issue Links



                jakub Jakub Skoczen
                jakub Jakub Skoczen
                0 Vote for this issue
                11 Start watching this issue



                  TestRail: Runs

                    TestRail: Cases