Details
- Type: Task
- Status: Closed
- Priority: P3
- Resolution: Duplicate
- customfield_11100: 39227
- 8
- Folijet
Description
I'm able to export 8.7 million instances in 17 minutes, but it takes 3.5 hours to do the corresponding SRS records. The following is why:
2020-09-02 15:54:39.479 CDT [608878] diku_mod_source_record_storage@folio LOG: execute <unnamed>: select "records_lb"."id", "records_lb"."snapshot_id", "records_lb"."matched_id", "records_lb"."generation", "records_lb"."record_type", "records_lb"."instance_id", "records_lb"."state", "records_lb"."leader_record_status", "records_lb"."order", "records_lb"."suppress_discovery", "records_lb"."created_by_user_id", "records_lb"."created_date", "records_lb"."updated_by_user_id", "records_lb"."updated_date" from "records_lb" limit $1
2020-09-02 15:54:39.479 CDT [608878] diku_mod_source_record_storage@folio DETAIL: parameters: $1 = '5000'
2020-09-02 15:54:39.491 CDT [608878] diku_mod_source_record_storage@folio LOG: execute <unnamed>: select count(*) from "records_lb"
2020-09-02 15:54:45.150 CDT [608878] diku_mod_source_record_storage@folio LOG: execute <unnamed>: select "raw_records_lb"."id", "raw_records_lb"."content" from "raw_records_lb" where "raw_records_lb"."id" = cast($1 as uuid)
2020-09-02 15:54:45.150 CDT [608878] diku_mod_source_record_storage@folio DETAIL: parameters: $1 = '0096d136-d6f5-5410-ad50-58fc6d2ded73'
2020-09-02 15:54:45.153 CDT [608878] diku_mod_source_record_storage@folio LOG: execute <unnamed>: select "id", "content" from "marc_records_lb" where "id" = cast($1 as uuid)
2020-09-02 15:54:45.153 CDT [608878] diku_mod_source_record_storage@folio DETAIL: parameters: $1 = '0096d136-d6f5-5410-ad50-58fc6d2ded73'
2020-09-02 15:54:45.158 CDT [608878] diku_mod_source_record_storage@folio LOG: execute <unnamed>: select "error_records_lb"."id", "error_records_lb"."content", "error_records_lb"."description" from "error_records_lb" where "error_records_lb"."id" = cast($1 as uuid)
2020-09-02 15:54:45.158 CDT [608878] diku_mod_source_record_storage@folio DETAIL: parameters: $1 = '0096d136-d6f5-5410-ad50-58fc6d2ded73'
2020-09-02 15:54:45.158 CDT [608878] diku_mod_source_record_storage@folio LOG: execute <unnamed>: select "raw_records_lb"."id", "raw_records_lb"."content" from "raw_records_lb" where "raw_records_lb"."id" = cast($1 as uuid)
2020-09-02 15:54:45.158 CDT [608878] diku_mod_source_record_storage@folio DETAIL: parameters: $1 = '7afa10f6-7766-5192-bf20-37f784909b33'
2020-09-02 15:54:45.160 CDT [608878] diku_mod_source_record_storage@folio LOG: execute <unnamed>: select "id", "content" from "marc_records_lb" where "id" = cast($1 as uuid)
2020-09-02 15:54:45.160 CDT [608878] diku_mod_source_record_storage@folio DETAIL: parameters: $1 = '7afa10f6-7766-5192-bf20-37f784909b33'
2020-09-02 15:54:45.161 CDT [608878] diku_mod_source_record_storage@folio LOG: execute <unnamed>: select "error_records_lb"."id", "error_records_lb"."content", "error_records_lb"."description" from "error_records_lb" where "error_records_lb"."id" = cast($1 as uuid)
2020-09-02 15:54:45.161 CDT [608878] diku_mod_source_record_storage@folio DETAIL: parameters: $1 = '7afa10f6-7766-5192-bf20-37f784909b33'
2020-09-02 15:54:45.161 CDT [608878] diku_mod_source_record_storage@folio LOG: execute <unnamed>: select "raw_records_lb"."id", "raw_records_lb"."content" from "raw_records_lb" where "raw_records_lb"."id" = cast($1 as uuid)
2020-09-02 15:54:45.161 CDT [608878] diku_mod_source_record_storage@folio DETAIL: parameters: $1 = '7c09c30e-648e-5136-aad0-4c73bcb7155d'
It's doing N+1 queries: one query for the page of records, then a separate lookup per record against the raw records, error records, and MARC records tables, rather than a JOIN across those tables. Can you switch to a JOIN? I think you will find a dramatic increase in performance.
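To illustrate the difference, here is a minimal sketch of the two access patterns. The table and column names mirror the log above, but the schema, data, and helper functions are simplified assumptions for demonstration (using SQLite rather than the module's actual Postgres/ORM stack); the point is only that the N+1 pattern issues 1 + kN statements where the JOIN issues one.

```python
import sqlite3

# Assumed, simplified schema: a main table plus per-record lookup
# tables keyed by the same id, mirroring the tables in the log.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records_lb      (id TEXT PRIMARY KEY, record_type TEXT);
    CREATE TABLE raw_records_lb  (id TEXT PRIMARY KEY, content TEXT);
    CREATE TABLE marc_records_lb (id TEXT PRIMARY KEY, content TEXT);
""")
rows = [("0096d136", "MARC"), ("7afa10f6", "MARC")]
conn.executemany("INSERT INTO records_lb VALUES (?, ?)", rows)
conn.executemany("INSERT INTO raw_records_lb VALUES (?, ?)",
                 [(rid, "raw-" + rid) for rid, _ in rows])
conn.executemany("INSERT INTO marc_records_lb VALUES (?, ?)",
                 [(rid, "marc-" + rid) for rid, _ in rows])

def fetch_n_plus_1():
    """N+1 pattern: one page query, then two lookups per record."""
    out = []
    page = conn.execute(
        "SELECT id, record_type FROM records_lb ORDER BY id")
    for rid, rtype in page.fetchall():
        raw = conn.execute(
            "SELECT content FROM raw_records_lb WHERE id = ?", (rid,)).fetchone()
        marc = conn.execute(
            "SELECT content FROM marc_records_lb WHERE id = ?", (rid,)).fetchone()
        out.append((rid, rtype, raw[0], marc[0]))
    return out

def fetch_join():
    """Single JOIN: the same result set in one statement."""
    return list(conn.execute("""
        SELECT r.id, r.record_type, raw.content, m.content
        FROM records_lb r
        LEFT JOIN raw_records_lb  raw ON raw.id = r.id
        LEFT JOIN marc_records_lb m   ON m.id   = r.id
        ORDER BY r.id
    """))

# Both return identical rows, but fetch_n_plus_1 runs 1 + 2N queries
# (per-row round trips, as in the log) while fetch_join runs exactly one.
assert fetch_n_plus_1() == fetch_join()
```

With a LEFT JOIN, records that lack a raw/MARC/error counterpart still come back (with NULLs), matching the per-table lookups that simply find nothing.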
Issue Links
- defines
  - UXPROD-559 Migrate Bibliographic & Holdings data (Closed)
- is duplicated by
  - MODSOURCE-204 SRS revision: Refactor get records to perform optimized database join over n+1 lookups (Closed)