The /data-export/expire-jobs endpoint is executed every 6 hours to expire jobs that have had no updates for more than 1 hour. When there is a large number of instances to export, the process takes a couple of hours or more; however, according to the logs, the interval between two consecutive job updates is much less than 1 hour (for example, 2021-11-02T17:36:20.392+00:00 and 2021-11-02T17:36:56.902+00:00). Nevertheless, if the scheduled expire-jobs endpoint is invoked between two job updates, the job expires with status FAILED. If the scheduled expire-jobs endpoint is not invoked during the export, the export can complete successfully (see Additional Information).
If we look at the time when data-export/expire-jobs is invoked:
and then at the lastUpdatedDate of the job while it is IN_PROGRESS:
and then at this job after it fails:
then it is clear that this job should not have expired: the last update while the job was IN_PROGRESS is at 2021-11-02T17:37:05.370+00:00, and the expire-jobs endpoint is invoked at 2021-11-02T17:37:05,373. According to the logic of the code, at least 1 hour must pass without updates before a job is considered expired. However, less than 1 second had passed.
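The expected check can be illustrated with a minimal sketch. This is not the module's actual code: the function name is_expired and the constant EXPIRY_THRESHOLD are hypothetical, and the invocation timestamp from the log is assumed to be in UTC because the log line carries no time zone.

```python
from datetime import datetime, timedelta, timezone

# Assumed from the report: a job expires only after more than 1 hour without updates.
EXPIRY_THRESHOLD = timedelta(hours=1)


def is_expired(last_updated: datetime, now: datetime,
               threshold: timedelta = EXPIRY_THRESHOLD) -> bool:
    """Return True only if the job has gone longer than `threshold` without an update."""
    return now - last_updated > threshold


# lastUpdatedDate of the IN_PROGRESS job, taken from the report.
last_updated = datetime(2021, 11, 2, 17, 37, 5, 370000, tzinfo=timezone.utc)
# Time the expire-jobs endpoint was invoked (log timestamp, UTC assumed).
invoked_at = datetime(2021, 11, 2, 17, 37, 5, 373000, tzinfo=timezone.utc)

idle = invoked_at - last_updated
print(idle.total_seconds())                   # 0.003
print(is_expired(last_updated, invoked_at))   # False
```

With the timestamps from this report the job had been idle for 3 milliseconds, so a check of this shape would not expire it.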
The last line of log_1_1.txt contains the following: 2021-11-04T15:22:35.811Z ./docker-entrypoint.sh: line 64: 67 Killed (possible reason: out-of-memory error). Right after that, log_1_2.txt begins (it seems the module was restarted).
The same situation appears in log_2_1.txt and log_2_2.txt; however, in this case there is no ./docker-entrypoint.sh: line 64: 67 Killed message before the module restarts.
Steps to Reproduce:
- Run export with SearchInstanceCQLQuerySourceFolio.cql
Export finished with 'Completed with errors' (in most cases), and the MARC file can be downloaded.
Export finished with 'Failed' status, MARC file cannot be downloaded.
See . There are two exports with the same CQL file but with different results.
See also https://wiki.folio.org/pages/viewpage.action?pageId=73539315
See also another bug (lastUpdatedDate after completedDate):