  1. mod-source-record-storage
  2. MODSOURCE-339

SPIKE: Crash Postgres DB on Rancher during handling DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING


Details

    • 1
    • Folijet

    Description

      Overview:

      After starting the SRS module, the log showed 10 messages like the following:

      09:53:46.118 [vert.x-worker-thread-19] DEBUG KafkaConsumerWrapper [54905eqId] Consumer - id: 10 subscriptionPattern: SubscriptionDefinition(eventType=DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING, subscriptionPattern=folio\.Default\.\w{1,}\.DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING) a Record has been received. key: 99 currentLoad: 5 globalLoad: 5

      09:53:46.314 [vert.x-worker-thread-19] DEBUG taImportKafkaHandler [55101eqId] Data import event payload has been received with event type: DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING and correlationId: bbdb7cc6-5183-4ef3-9e60-9760f5995596
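
The `subscriptionPattern` in the first log line is a regular expression over topic names. The sketch below checks it against two topic names. Both names are hypothetical: the tenant segment `diku` is borrowed from the schema name in the Postgres error, and whole-name matching is an assumption about how the consumer applies the pattern.

```python
import re

# Pattern copied verbatim from the KafkaConsumerWrapper log line above.
SUBSCRIPTION_PATTERN = re.compile(
    r"folio\.Default\.\w{1,}\."
    r"DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING"
)

EVENT = "DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING"


def matches_subscription(topic: str) -> bool:
    """Return True when the whole topic name matches the pattern."""
    return SUBSCRIPTION_PATTERN.fullmatch(topic) is not None


# Hypothetical topic names, for illustration only.
print(matches_subscription(f"folio.Default.diku.{EVENT}"))  # tenant segment "diku"
print(matches_subscription(f"folio.Default..{EVENT}"))      # empty tenant segment
```

The `\w{1,}` segment requires at least one word character, so a topic with an empty tenant segment is not matched.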

      We found these messages in Kibana:

      Jul 13, 2021 9:54:14 AM io.vertx.sqlclient.impl.SocketConnectionBase

      WARNING: Backend notice: severity='WARNING', code='57P02', message='terminating connection because of crash of another server process', detail='The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.', hint='In a moment you should be able to reconnect to the database and repeat your command.', position='null', internalPosition='null', internalQuery='null', where='SQL statement "delete from diku_mod_source_record_storage.marc_indexers where marc_id = NEW.id"

      PL/pgSQL function insert_marc_indexers() line 4 at SQL statement', file='postgres.c', line='2663', routine='quickdie', schema='null', table='null', column='null', dataType='null', constraint='null'

      09:54:18.092 [vert.x-worker-thread-13] ERROR ocessingEventHandler [86879eqId] Failed to handle instance event {}

      io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: Host is unreachable: pg-folio.folijet.svc.cluster.local/10.100.205.81:5432

      Caused by: java.net.NoRouteToHostException: Host is unreachable

      Note: error code 57P02 is the PostgreSQL SQLSTATE for "crash_shutdown".
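
For triage, the SQLSTATE can be pulled out of a backend notice and mapped to its condition name. This is a minimal sketch: the helper names `sqlstate_of` and `condition_name` are hypothetical, and the table lists only a subset of PostgreSQL's Class 57 (Operator Intervention) codes.

```python
import re
from typing import Optional

# Subset of PostgreSQL Class 57 (Operator Intervention) SQLSTATE codes.
CLASS_57 = {
    "57000": "operator_intervention",
    "57014": "query_canceled",
    "57P01": "admin_shutdown",
    "57P02": "crash_shutdown",
    "57P03": "cannot_connect_now",
    "57P04": "database_dropped",
}

# Matches the code='XXXXX' field as it appears in the notice above.
CODE_FIELD = re.compile(r"code='([0-9A-Z]{5})'")


def sqlstate_of(notice: str) -> Optional[str]:
    """Extract the five-character SQLSTATE from a backend notice, if any."""
    match = CODE_FIELD.search(notice)
    return match.group(1) if match else None


def condition_name(code: Optional[str]) -> str:
    """Map a SQLSTATE to its condition name, or 'unknown' if not listed."""
    return CLASS_57.get(code or "", "unknown")


# Excerpt of the notice quoted in this ticket.
NOTICE = (
    "WARNING: Backend notice: severity='WARNING', code='57P02', "
    "message='terminating connection because of crash of another server process'"
)
print(sqlstate_of(NOTICE), condition_name(sqlstate_of(NOTICE)))
```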

       

      Postgres logs:

      2021-07-13 09:54:14.532 GMT [21099] CONTEXT: SQL statement "delete from diku_mod_source_record_storage.marc_indexers where marc_id = NEW.id"
      PL/pgSQL function insert_marc_indexers() line 4 at SQL statement
      2021-07-13 09:54:14.456 GMT [125] WARNING: terminating connection because of crash of another server process
      2021-07-13 09:54:14.456 GMT [125] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
      2021-07-13 09:54:14.456 GMT [79] HINT: In a moment you should be able to reconnect to the database and repeat your command.
      2021-07-13 09:54:14.632 GMT [21143] FATAL: the database system is shutting down
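
The Postgres lines above are not in timestamp order. The sketch below parses them and sorts them chronologically. The line layout (timestamp, `GMT`, PID in brackets, level, message) is inferred from this excerpt only, and `parse_entries` is a hypothetical helper, not part of any FOLIO tooling.

```python
import re
from datetime import datetime

# Layout inferred from the excerpt: "<timestamp> GMT [<pid>] <LEVEL>: <message>"
LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) GMT \[(\d+)\] (\w+): (.*)$"
)

SAMPLE = """\
2021-07-13 09:54:14.532 GMT [21099] CONTEXT: SQL statement "delete from diku_mod_source_record_storage.marc_indexers where marc_id = NEW.id"
PL/pgSQL function insert_marc_indexers() line 4 at SQL statement
2021-07-13 09:54:14.456 GMT [125] WARNING: terminating connection because of crash of another server process
2021-07-13 09:54:14.456 GMT [125] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2021-07-13 09:54:14.456 GMT [79] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2021-07-13 09:54:14.632 GMT [21143] FATAL: the database system is shutting down
"""


def parse_entries(text: str) -> list:
    """Parse log lines into dicts and return them sorted by timestamp."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LINE.match(line)
        if match:
            entries.append({
                "ts": datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f"),
                "pid": int(match.group(2)),
                "level": match.group(3),
                "msg": match.group(4),
            })
        elif entries:
            # A line without a timestamp continues the previous entry.
            entries[-1]["msg"] += "\n" + line
    # sorted() is stable, so entries with equal timestamps keep their order.
    return sorted(entries, key=lambda entry: entry["ts"])


for entry in parse_entries(SAMPLE):
    print(entry["ts"].time(), entry["pid"], entry["level"])
```

Sorted this way, the crash warning (09:54:14.456) precedes the CONTEXT line that names the `insert_marc_indexers()` trigger statement (09:54:14.532).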

       

      The SRM logs show the same database failure:

      09:54:14.830 [vert.x-eventloop-thread-1] ERROR PostgresClient [291627484eqId] queryAndAnalyze: Connection reset by peer - SELECT

      java.io.IOException: Connection reset by peer

      09:54:24.232 [vert.x-worker-thread-10] ERROR PostgresClient [291636886eqId] Connection refused: pg-folio.folijet.svc.cluster.local/10.100.205.81:5432
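
The Postgres hint quoted earlier says the client should be able to reconnect and repeat the command shortly. A generic client-side retry with exponential backoff might look like the sketch below. It is an illustration only: `retry_with_backoff` and its parameters are hypothetical, it is not how SRS or SRM currently behave, and it does not address the crash itself.

```python
import time


def retry_with_backoff(operation, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying on connection-level errors.

    The delay doubles after each failed attempt: base_delay, 2*base_delay, ...
    The last failure is re-raised so the caller still sees the error.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except OSError:
            # OSError covers ConnectionResetError and ConnectionRefusedError.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Usage with a stand-in operation that fails twice, then succeeds.
calls = {"count": 0}


def flaky_query():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionRefusedError("Connection refused")
    return "ok"


delays = []
result = retry_with_backoff(flaky_query, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.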

      Steps to Reproduce:

      1. Restart SRS on Rancher
      2. Check logs in Kibana

      Expected Results:

             The import completes correctly.

      Actual Results:

             See attached logs.

       

        Attachments

          1. postgres.log
            24 kB
          2. srm.log
            28 kB
          3. srs.log
            17 kB

              People

                afedasiuk Aliaksandr Fedasiuk