Details
- Type: Bug
- Status: Closed
- Priority: P3
- Resolution: Done
- Affects Version/s: None
- Fix Version/s: None
- Labels: None
- Template:
- Sprint: Firebird Sprint 121, Firebird Sprint 122, Firebird Sprint 123
- Story Points: 5
- Development Team: Firebird
- Release: R3 2021
Description
Overview:
When harvesting a collection of 400k records, the harvest completes after only several thousand records have been harvested. Investigation of the mod-oai-pmh.instances table shows that not all of the records are streamed from inventory.
Steps to Reproduce:
Start initial harvest
Expected Results:
All records are harvested
Actual Results:
The harvest finishes after only a portion of the records has been harvested. The resumptionToken in the last response is <resumptionToken cursor="cursorvalue"></resumptionToken>. The next record ID and request ID are missing from the token.
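For anyone triaging this from the client side, the following is a minimal sketch of how a harvester could flag the symptom described above: an empty resumptionToken arriving before the expected number of records has been harvested. It is not taken from mod-oai-pmh. The function names, the sample response, and the cursor value 12000 are assumptions made for illustration; only the OAI-PMH namespace and the resumptionToken element come from the protocol itself.

```python
import xml.etree.ElementTree as ET

# Standard OAI-PMH 2.0 namespace.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical final response, shaped like the one in this ticket.
# The cursor value is made up for the example.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <resumptionToken cursor="12000"></resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def inspect_resumption_token(response_xml: str) -> dict:
    """Summarize the resumptionToken element of one OAI-PMH list response."""
    root = ET.fromstring(response_xml)
    token = root.find(".//oai:resumptionToken", OAI_NS)
    if token is None:
        return {"present": False, "empty": False,
                "cursor": None, "complete_list_size": None}
    text = (token.text or "").strip()
    return {
        "present": True,
        "empty": text == "",  # an empty token marks the end of the list
        "cursor": token.get("cursor"),
        "complete_list_size": token.get("completeListSize"),
    }


def harvest_looks_truncated(response_xml: str, harvested: int, expected: int) -> bool:
    """True when the list ended (empty token) before `expected` records arrived."""
    info = inspect_resumption_token(response_xml)
    return info["present"] and info["empty"] and harvested < expected


if __name__ == "__main__":
    print(inspect_resumption_token(SAMPLE))
    print(harvest_looks_truncated(SAMPLE, harvested=12000, expected=400000))
```

The expected record count has to come from outside the response (for example, a known collection size), because the token in this ticket carries no completeListSize attribute.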
Additional Information:
So far I haven't been able to recreate it in our oaipmh testing environment using the same data set, but the issue manifests itself on multiple production sites. Here is additional information from the harvesting:
The number of entries loaded into mod-oai-pmh.instances varies (it's been as low as 12K, but I've gotten 300K too), and I see the same pattern when watching the system in real time, namely:
- Entries start appearing quickly in mod-oai-pmh.instances
- As soon as they stop appearing, no more will
- I'm able to harvest as many entries as appear in the table
- Empty resumption token is returned
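The pattern in the list above (fast growth, then a permanent stop) can be pinned down from periodic row-count samples of the instances table. The sketch below is only an illustration: how the counts are collected is left out, and the function name, the sample format, and the three-sample threshold are all assumptions.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, int]  # (timestamp in seconds, row count)


def find_stall(samples: List[Sample], min_flat_samples: int = 3) -> Optional[Sample]:
    """Return the first sample at which the row count reached its final value.

    `samples` must be in chronological order. A stall is reported only when
    the last `min_flat_samples` samples all carry the same count; otherwise
    the table is assumed to be still growing and None is returned.
    """
    if len(samples) < min_flat_samples:
        return None
    tail = samples[-min_flat_samples:]
    last_count = tail[-1][1]
    if any(count != last_count for _, count in tail):
        return None
    # Walk back to the earliest sample that already had the final count.
    stall_index = len(samples) - 1
    while stall_index > 0 and samples[stall_index - 1][1] == last_count:
        stall_index -= 1
    return samples[stall_index]


if __name__ == "__main__":
    # Hypothetical counts sampled every 10 seconds during a harvest.
    observed = [(0, 0), (10, 5000), (20, 12000), (30, 12000), (40, 12000)]
    print(find_stall(observed))
```

A flat tail alone does not distinguish a stalled load from a finished one, so the final count still needs to be compared with the expected collection size.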
We cannot recreate this issue in our test environments.
Issue Links
- defines: UXPROD-3027 OAI-PMH maintenance - Kiwi (Closed)
- relates to: MODOAIPMH-353 Handle invalid instances that break down the downloading instances process. (Open)
- relates to: MODOAIPMH-300 Not all instances ids are saved for initial harvest (Closed)