  1. mod-user-import
  2. MODUIMP-30

mod-user-import module crashes on loading 30k Users in performance environment

Details

    • Bug
    • Status: Closed (View Workflow)
    • P3
    • Resolution: Done
    • None
    • 3.6.4
    • None
    • Core: Platform
    • University of Alabama

    Description

      Overview:
      After investigation, the mod-user-import module fails with an out-of-memory (OOM) error because it cannot hold the complete payload for 30k+ users in the heap. Observations:

      Load 1k users: 24.31 seconds, CPU 43%, memory 80%
      Load 10k users: 3 minutes 13 seconds, CPU 46%, memory 81%
      Load 20k users: 6 minutes 40 seconds, CPU 68%, memory 85%
      Load 30k users: fails with 504 Gateway Time-out. The ECS task drains connections with OOM; memory reaches 103% just before it fails.
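
      The arithmetic behind these observations can be made explicit. The sketch below only recomputes throughput from the three successful runs reported above; it is not part of the module. It suggests that throughput stays at roughly 40 to 50 users per second while memory climbs with payload size, which is consistent with (but does not prove) the payload being held in the heap.

```python
# Observations copied from the report:
# (users loaded, elapsed seconds, CPU %, memory %).
observations = [
    (1_000, 24.31, 43, 80),
    (10_000, 3 * 60 + 13, 46, 81),
    (20_000, 6 * 60 + 40, 68, 85),
]


def throughput(users, seconds):
    """Users imported per second for one run."""
    return users / seconds


# One rounded rate per successful run, in the same order as the observations.
rates = [round(throughput(u, s), 1) for u, s, _, _ in observations]

for (users, seconds, cpu, mem), rate in zip(observations, rates):
    print(f"{users:>6} users: {rate:5.1f} users/s, CPU {cpu}%, memory {mem}%")
```

      The 30k run is left out because it did not complete, so no elapsed time is available for it.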

      Environment details:
      61 back-end modules deployed in 110 ECS services
      3 okapi ECS services
      8 m5.large EC2 instances
      2 db.r5.xlarge AWS RDS instance (1 reader, 1 writer)

      mod-user-import container memory setup:
      512 MB

      Steps to Reproduce:

      curl --request POST 'https://okapi-gcp1-us-east-1.int.aws.folio.org/user-import' \
      --header 'x-okapi-tenant: fs09000000' \
      --header 'x-okapi-token: eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJmb2xpbyIsInVzZXJfaWQiOiI5ZWI2NzMwMS02ZjZlLTQ2OGYtOWIxYS02MTM0ZGMzOWE2ODQiLCJpYXQiOjE2MDIyNjMzOTgsInRlbmFudCI6ImZzMDkwMDAwMDAifQ.Nw9qjCyUyaIFYkM-r-7mKcn9028CsS-aUX5t84Tz-2I' \
      --header 'Content-Type: application/json' \
      --data-raw '{
        "users": [
          {
            "username": "Simpson_Stout_0",
            "externalSystemId": 0,
            "barcode": 100000,
            "active": true
          },
          ... elided
        ],
        "totalRecords": 30000,
        "deactivateMissingUsers": false,
        "updateOnlyPresentFields": false,
        "sourceType": "PERF-118"
      }'
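
      Until the module can handle the full payload, one possible client-side workaround is to split the user list into smaller request bodies and send them one at a time. The sketch below only builds the bodies; it sends nothing. The helper name `build_batches`, the batch size, and the stand-in user records are hypothetical, and it is an assumption (not verified against the module) that setting `totalRecords` per batch and keeping `deactivateMissingUsers` false is safe for batched imports.

```python
import json


def build_batches(users, batch_size, source_type):
    """Split `users` into request bodies of at most `batch_size` users each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    payloads = []
    for start in range(0, len(users), batch_size):
        chunk = users[start:start + batch_size]
        payloads.append({
            "users": chunk,
            "totalRecords": len(chunk),
            # Assumption: with partial batches this must stay false, so users
            # absent from one batch are not treated as missing.
            "deactivateMissingUsers": False,
            "updateOnlyPresentFields": False,
            "sourceType": source_type,
        })
    return payloads


# Hypothetical stand-in records; the real user list is elided in the report.
users = [
    {"username": f"user_{i}", "externalSystemId": i,
     "barcode": 100000 + i, "active": True}
    for i in range(2500)
]

payloads = build_batches(users, batch_size=1000, source_type="PERF-118")
bodies = [json.dumps(p) for p in payloads]  # one JSON body per request
```

      Each entry in `bodies` could then be posted with the same headers as the curl command above, one request per batch. Whether the module stays within its memory limit at a given batch size would still need to be measured.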
      

      Log into some FOLIO environment as User X
      Log into the perf environment at https://goldenrod-cap1.int.aws.folio.org/ as the folio user, or POST to authn/login from Postman.

      Expected Results:
      All users should be successfully imported to the env

      Actual Results:

      <html>
      <head><title>502 Bad Gateway</title></head>
      <body bgcolor="white">
      <center><h1>502 Bad Gateway</h1></center>
      <hr><center>nginx/1.10.3</center>
      </body>
      </html>
      

      Additional Information:
      Log just before task restarts:

      08 Oct 2020 16:18:17:928 INFO  UserImportAPI [1318722eqId] User creation and update has finished for the current batch.
      08 Oct 2020 16:18:17:928 INFO  UserImportAPI [1318722eqId] Aggregating user import result.
      08 Oct 2020 16:18:17:931 INFO  LogUtil [1318725eqId] org.folio.rest.RestVerticle start  invoking postUserImport
      08 Oct 2020 16:18:17:941 ERROR UserImportAPI [1318735eqId] Failed to add permissions for user with externalSystemId: PERF-118_29988
      

      Attached are the service CPU utilization graph (mod-user-import-service-spike.png), the service memory utilization graph (mod-import-user-OOM.png), and the heap dump (mod-user-import-hep-java-utils.png) taken after the module crash.

      Interested parties:
      All consumers of POST /user-import API


      People

      Assignee: Unassigned
      Reporter: Varun Javalkar