mod-data-export / MDEXP-470

Export fails after /data-export/expire-jobs endpoint invocation

Details

    • Firebird Sprint 127
    • 3
    • Firebird
    • R3 2021 Bug Fix

    Description

      Overview:
      The /data-export/expire-jobs endpoint runs every 6 hours and expires jobs that have had no updates for more than 1 hour. When a large number of instances is exported, the process takes a couple of hours or more; however, according to the logs, the interval between two consecutive job updates is much less than 1 hour (for example, 2021-11-02T17:36:20.392+00:00 and 2021-11-02T17:36:56.902+00:00). Even so, if the scheduled expire-jobs call happens between two job updates, the job expires with status FAILED. If the expire-jobs endpoint is not invoked during the export, the export can succeed (see Additional Information).
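
      As a rough illustration of the arithmetic above, the following Python sketch computes the largest gap between consecutive job updates and compares it with the 1-hour threshold. The function name largest_gap and the constant EXPIRY_THRESHOLD are hypothetical; only the two timestamps and the 1-hour value come from this ticket.

```python
from datetime import datetime, timedelta

# Threshold stated in the ticket; the constant name is an assumption.
EXPIRY_THRESHOLD = timedelta(hours=1)


def largest_gap(update_times):
    """Return the largest interval between consecutive job updates."""
    parsed = sorted(datetime.fromisoformat(t) for t in update_times)
    return max((b - a for a, b in zip(parsed, parsed[1:])), default=timedelta(0))


# The two job update timestamps quoted in the Overview.
updates = [
    "2021-11-02T17:36:20.392+00:00",
    "2021-11-02T17:36:56.902+00:00",
]

gap = largest_gap(updates)
print(gap.total_seconds())     # gap between the two updates, in seconds
print(gap < EXPIRY_THRESHOLD)  # True when the job was updated recently enough
```

      For the quoted pair the gap is about 36.5 seconds, far below the threshold.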

      Possible reason:
      Compare the time when data-export/expire-jobs is invoked:

      with the lastUpdatedDate of the job while it is IN_PROGRESS:

      and with the same job after it fails:

      The job should not have expired: its last update while IN_PROGRESS was at 2021-11-02T17:37:05.370+00:00, and the expire-jobs endpoint was invoked at 2021-11-02T17:37:05,373. According to the code logic, at least 1 hour must pass before a job is considered expired, yet less than 1 second had passed.
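
      The expected expiry rule can be sketched as follows. This is a minimal Python illustration, not the module's actual code: is_expired and EXPIRY_THRESHOLD are hypothetical names, and the invocation time taken from the log (which carries no zone) is assumed to be UTC.

```python
from datetime import datetime, timedelta, timezone

# Threshold stated in the ticket; the constant name is an assumption.
EXPIRY_THRESHOLD = timedelta(hours=1)


def is_expired(last_updated, now, threshold=EXPIRY_THRESHOLD):
    """A job counts as expired only if at least `threshold` has passed."""
    return now - last_updated >= threshold


# lastUpdatedDate of the job while it was IN_PROGRESS (from the ticket).
last_updated = datetime.fromisoformat("2021-11-02T17:37:05.370+00:00")

# Log time of the expire-jobs invocation, 2021-11-02T17:37:05,373.
# Assumption: the log timestamp is in UTC.
invoked_at = datetime(2021, 11, 2, 17, 37, 5, 373000, tzinfo=timezone.utc)

elapsed = invoked_at - last_updated
print(elapsed)                               # time since the last update
print(is_expired(last_updated, invoked_at))  # whether the rule would expire the job
```

      Under these assumptions only 3 milliseconds had elapsed, so a check of this form would not mark the job as expired.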

      Update:
      The last line of log_1_1.txt reads: 2021-11-04T15:22:35.811Z ./docker-entrypoint.sh: line 64: 67 Killed (possible reason: out-of-memory error). log_1_2.txt starts right after that, so it seems the module was restarted.
      The same restart pattern appears in log_2_1.txt and log_2_2.txt; however, in that case there is no ./docker-entrypoint.sh: line 64: 67 Killed line before the module restarts.
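
      To check the attached logs for the same marker, a small Python sketch such as the one below could be used. The function name find_kill_lines and the regular expression are assumptions based only on the single log line quoted above; the second sample entry is a made-up placeholder, not real log output.

```python
import re

# Pattern modelled on the one line quoted in this ticket (assumption).
KILLED = re.compile(r"docker-entrypoint\.sh: line \d+:\s+\d+\s+Killed")


def find_kill_lines(lines):
    """Return the lines that report the process being killed."""
    return [line.rstrip("\n") for line in lines if KILLED.search(line)]


sample = [
    # Quoted from the ticket (last line of log_1_1.txt).
    "2021-11-04T15:22:35.811Z ./docker-entrypoint.sh: line 64: 67 Killed",
    # Hypothetical placeholder standing in for any other log line.
    "placeholder line without the marker",
]

for hit in find_kill_lines(sample):
    print(hit)
```

      The same function accepts an open file object, for example the result of open("log_2_1.txt", errors="replace"), since it only iterates over lines.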

      Evidence:
      Look at the time the job completed:
      Compare it with the time of the expire-jobs endpoint invocation:

      Steps to Reproduce:

      1. Run export with SearchInstanceCQLQuerySourceFolio.cql

      Expected Results:
      Export finishes with 'Completed with errors' status (in most cases), and the MARC file can be downloaded.

      Actual Results:
      Export finishes with 'Failed' status, and the MARC file cannot be downloaded.

      Additional Information:
      See . There are two exports with the same CQL file but different results.
      See also https://wiki.folio.org/pages/viewpage.action?pageId=73539315
      See also another bug (lastUpdatedDate after completedDate):

        Attachments

          1. completed_date_7412.PNG (40 kB)
          2. expire_jobs_endpoint.PNG (72 kB)
          3. expire_jobs_invocation.PNG (23 kB)
          4. failed_only_6_hours_expire.PNG (127 kB)
          5. failed_vs_completed_with_errors.PNG (136 kB)
          6. job_failed.PNG (60 kB)
          7. job_in_progress_before_expire.PNG (57 kB)
          8. log_1_1.txt (4.13 MB)
          9. log_1_2.txt (1.69 MB)
          10. log_2_1.txt (11.54 MB)
          11. log_2_2.txt (10.77 MB)
          12. no_out_of_memory.png (42 kB)
          13. screenshot-1.png (269 kB)
          14. screenshot-2.png (200 kB)
          15. SearchInstanceCQLQuerySourceFolio.cql (0.0 kB)
          16. updated_after_completed_bug.PNG (56 kB)

              People

                Oleksandr Bozhko