  1. mod-source-record-storage
  2. MODSOURCE-339

SPIKE: Postgres DB crashes on Rancher while handling DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING

Details

    • 1
    • Folijet

    Description

      Overview:

      After starting the SRS module, the log contained 10 messages like the following:

      09:53:46.118 [vert.x-worker-thread-19] DEBUG KafkaConsumerWrapper [54905eqId] Consumer - id: 10 subscriptionPattern: SubscriptionDefinition(eventType=DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING, subscriptionPattern=folio\.Default\.\w{1,}\.DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING) a Record has been received. key: 99 currentLoad: 5 globalLoad: 5

      09:53:46.314 [vert.x-worker-thread-19] DEBUG taImportKafkaHandler [55101eqId] Data import event payload has been received with event type: DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING and correlationId: bbdb7cc6-5183-4ef3-9e60-9760f5995596
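
To make the subscription pattern in the first message concrete, here is a minimal Python sketch that checks topic names against it. The pattern is copied from the log line; the topic names are hypothetical (the tenant segment `diku` is only inferred from the schema name in the logs), and full-match semantics are an assumption about how the consumer applies the pattern.

```python
import re

# Pattern copied verbatim from the SubscriptionDefinition in the SRS log line.
SUBSCRIPTION_PATTERN = re.compile(
    r"folio\.Default\.\w{1,}\.DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING"
)


def topic_matches(topic: str) -> bool:
    """Return True when the whole topic name matches the subscription pattern."""
    # Assumption: the consumer requires the full topic name to match.
    return SUBSCRIPTION_PATTERN.fullmatch(topic) is not None


if __name__ == "__main__":
    # Hypothetical topic names, for illustration only.
    for topic in (
        "folio.Default.diku.DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING",
        "folio.Default.diku.DI_ERROR",
    ):
        print(topic, topic_matches(topic))
```

The `\w{1,}` segment requires at least one word character where the tenant id goes, so a single consumer covers the event for every tenant.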

      We found these messages in Kibana:

      Jul 13, 2021 9:54:14 AM io.vertx.sqlclient.impl.SocketConnectionBase

      WARNING: Backend notice: severity='WARNING', code='57P02', message='terminating connection because of crash of another server process', detail='The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.', hint='In a moment you should be able to reconnect to the database and repeat your command.', position='null', internalPosition='null', internalQuery='null', where='SQL statement "delete from diku_mod_source_record_storage.marc_indexers where marc_id = NEW.id"

      PL/pgSQL function insert_marc_indexers() line 4 at SQL statement', file='postgres.c', line='2663', routine='quickdie', schema='null', table='null', column='null', dataType='null', constraint='null'

      09:54:18.092 [vert.x-worker-thread-13] ERROR ocessingEventHandler [86879eqId] Failed to handle instance event {}

      io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: Host is unreachable: pg-folio.folijet.svc.cluster.local/10.100.205.81:5432

      Caused by: java.net.NoRouteToHostException: Host is unreachable

      Note: error code 57P02 is the PostgreSQL SQLSTATE "crash_shutdown".
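
As a rough aid for triaging similar notices, the sketch below pulls the SQLSTATE out of a notice formatted like the Kibana line above and maps it to its condition name. The function names are hypothetical, the lookup table covers only the Class 57 (Operator Intervention) codes, and the `code='...'` format is assumed from the single notice shown here.

```python
import re

# Class 57 (Operator Intervention) SQLSTATE codes and their condition names.
OPERATOR_INTERVENTION = {
    "57000": "operator_intervention",
    "57014": "query_canceled",
    "57P01": "admin_shutdown",
    "57P02": "crash_shutdown",
    "57P03": "cannot_connect_now",
    "57P04": "database_dropped",
}


def sqlstate_from_notice(notice):
    """Extract the five-character SQLSTATE from a notice containing code='XXXXX'."""
    match = re.search(r"code='([0-9A-Z]{5})'", notice)
    return match.group(1) if match else None


def condition_name(sqlstate):
    """Map a Class 57 SQLSTATE to its condition name, or 'unknown' otherwise."""
    return OPERATOR_INTERVENTION.get(sqlstate, "unknown")


if __name__ == "__main__":
    # Shortened copy of the notice from the Kibana excerpt.
    notice = (
        "Backend notice: severity='WARNING', code='57P02', "
        "message='terminating connection because of crash of another server process'"
    )
    code = sqlstate_from_notice(notice)
    print(code, condition_name(code))
```

A connection that receives 57P02 was terminated as a side effect of the crash; the backend that actually crashed has to be identified from the Postgres server log.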

       

      Postgres logs:

      2021-07-13 09:54:14.532 GMT [21099] CONTEXT: SQL statement "delete from diku_mod_source_record_storage.marc_indexers where marc_id = NEW.id"
      PL/pgSQL function insert_marc_indexers() line 4 at SQL statement
      2021-07-13 09:54:14.456 GMT [125] WARNING: terminating connection because of crash of another server process
      2021-07-13 09:54:14.456 GMT [125] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
      2021-07-13 09:54:14.456 GMT [79] HINT: In a moment you should be able to reconnect to the database and repeat your command.
      2021-07-13 09:54:14.632 GMT [21143] FATAL: the database system is shutting down
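
The excerpt above is not in chronological order (the 09:54:14.532 entry precedes the 09:54:14.456 entries). A minimal sketch for ordering such lines by timestamp follows; `parse_line` is a hypothetical helper that assumes the `date time GMT [pid] LEVEL: text` layout seen here, and continuation lines without a timestamp (such as the "PL/pgSQL function" line) would need to be attached to the preceding entry first.

```python
from datetime import datetime


def parse_line(line):
    """Split 'YYYY-MM-DD HH:MM:SS.mmm GMT [pid] LEVEL: text' into (timestamp, pid, rest)."""
    date, time, _tz, pid, rest = line.split(" ", 4)
    timestamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S.%f")
    return timestamp, int(pid.strip("[]")), rest


# Three lines copied from the Postgres log excerpt above.
LINES = [
    '2021-07-13 09:54:14.532 GMT [21099] CONTEXT: SQL statement "delete from '
    'diku_mod_source_record_storage.marc_indexers where marc_id = NEW.id"',
    "2021-07-13 09:54:14.456 GMT [125] WARNING: terminating connection because "
    "of crash of another server process",
    "2021-07-13 09:54:14.632 GMT [21143] FATAL: the database system is shutting down",
]

# sorted() is stable, so entries with equal timestamps keep their original order.
ORDERED = sorted(LINES, key=lambda line: parse_line(line)[0])

if __name__ == "__main__":
    for line in ORDERED:
        print(line)
```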

       

      The SRM logs show the same database failure:

      09:54:14.830 [vert.x-eventloop-thread-1] ERROR PostgresClient [291627484eqId] queryAndAnalyze: Connection reset by peer - SELECT

      java.io.IOException: Connection reset by peer

      09:54:24.232 [vert.x-worker-thread-10] ERROR PostgresClient [291636886eqId] Connection refused: pg-folio.folijet.svc.cluster.local/10.100.205.81:5432

      Steps to Reproduce:

      1. Restart SRS on Rancher
      2. Check logs in Kibana

      Expected Results:

             The import completes correctly.

      Actual Results:

             See attached logs.

       

        Attachments

          1. postgres.log
            24 kB
            Aliaksandr Fedasiuk
          2. srm.log
            28 kB
            Aliaksandr Fedasiuk
          3. srs.log
            17 kB
            Aliaksandr Fedasiuk

              People

                afedasiuk Aliaksandr Fedasiuk