Uploaded image for project: 'mod-finance-storage'
  1. mod-finance-storage
  2. MODFISTO-215

Remove uuid_generate_v4(), it fails in pgpool replication



    • ACQ Sprint 116
    • 2
    • Thunderjet


      uuid_generate_v4() creates a different UUID in each node of a replicated PostgreSQL environment.
      This has been reported for the similar function gen_random_uuid() with Pgpool-II replication in #sys-ops on Slack. From the Pgpool-II manual:

      There is no guarantee that any data provided using a context-dependent mechanism (e.g. random number, transaction ID, OID, SERIAL, sequence), will be replicated correctly on multiple backends.

      RMB documentation mentions it in the upgrading guide for RMB 25 and in connection with "generateId": https://github.com/folio-org/raml-module-builder/#the-post-tenant-api


      • Remove uuid_generate_v4() and "CREATE EXTENSION uuid-ossp" from
      • budget_encumbrances_rollover.sql and
      • pending_payment_cross_module.ftl

      For predefined records use a hard-coded UUID.
      For calculated records re-use the UUID of the primary source record.
      Or use the md5sum trick: If a string (for example a JSONB::text) is unique calculate md5sum(s)::uuid to get a unique UUID.



      It is not simple to replace uuid_generate_v4() in these scripts. They create new records based on existing ones, and they need new ids which have to be different from existing ones and unique.

      • Using a hard-coded id is not an option because many records are created in the same statement and ids needs to be different.
      • Reusing existing ids is not an option when new records are added to a table based on other records in the same table, otherwise we would have the same uuid for different records in the same table.
      • Using md5() only works if we find text that is garanteed to be unique. What if 2 rows in a new table are identical ?

      I looked at using row_number() to generate new ids, but this can be non-deterministic, so that is not an option in the context of pgpool.

      Creating new records only from Java would resolve the issue, but this could cause major performance issues in some cases.

      For something like

      INSERT INTO ...
      SELECT uuid_generate_v4(), ...
      FROM transaction tr

      we can use 


      where the string 'A' is a unique identifier for the script and '1' is a number for the statement in the script. tr.id should be unique for this statement, so the generated uuid should be unique (although it is not as good as a random number generated with uuid_generate_v4())

      It gets more complicated when the generated records are created from joins. For instance:

      INSERT INTO ...
      SELECT uuid_generate_v4(), ...
      FROM transaction tr
      LEFT JOIN fund fund ON fund.id = tr.fromFundId

      In general we can't simply use tr.id because there might be several rows returned with the same tr.id. tr.id||fund.id can't be used either because fund.id could be null. A general solution for this case is:


       This can be simplified to

      md5(concat('A2', tr.id, fund.id))::uuid

      because concat ignores null values.

      There is a problem with md5()::uuid: the generated uuids do not match the regular expression used to verify them everywhere:


      A solution is to use uuid_generate_v5 from uuid-ossp instead. As a bonus, it uses sha1 instead of md5. It does require a namespace, however. A nil uuid works for that. The previous expression becomes

      public.uuid_generate_v5(public.uuid_nil(), concat('A2', tr.id, fund.id))

      This means we have to keep the uuid-ossp extension.


      TestRail: Results


          Issue Links



                Andrei_Makaranka Andrei Makaranka
                julianladisch Julian Ladisch
                0 Vote for this issue
                3 Start watching this issue



                  TestRail: Runs

                    TestRail: Cases