mod-patron-blocks's GET /automated-patron-blocks performs worse in Juniper compared to Iris. For example, in a standard 8-concurrent users test, the API is about 78% worse and with 20 users it is about 259% worse. The API seems to be contributing to the module's high CPU utilization percentage and also the database's CPU utilization percentage. The SQL statement: SELECT jsonb FROM fs09000000_mod_patron_blocks.item_checked_out_event WHERE (jsonb->>'userId') = [userId] seems to be a slow query, taking anywhere from 200ms to over 2 seconds to execute, depends on the number of concurrent users in the test. Additionally, the count_estimate() call wrapping this statement is also detrimental to the overall response time. The query seems to be doing sequential scanning.
WARN PostgresClient EXPLAIN ANALYZE SELECT jsonb FROM fs09000000_mod_patron_blocks.item_checked_out_event WHERE (jsonb->>'userId') = '1625cfe2-b3cf-4dec-ba23-e34a8bf02683' LIMIT 10000
Limit (cost=1000.00..39655.25 rows=2579 width=427) (actual time=2.633..1938.491 rows=57 loops=1)
-> Gather (cost=1000.00..39655.25 rows=2579 width=427) (actual time=2.632..1938.454 rows=57 loops=1)
Workers Planned: 2
Workers Launched: 0
-> Parallel Seq Scan on item_checked_out_event (cost=0.00..38397.35 rows=1075 width=427) (actual time=2.377..1938.090 rows=57 loops=1)
Filter: ((jsonb ->> 'userId'::text) = '1625cfe2-b3cf-4dec-ba23-e34a8bf02683'::text)
Rows Removed by Filter: 515789
Planning time: 0.046 ms
Execution time: 1950.555 ms
Steps to Reproduce:
Run the check-in/out test with 5 concurrent users or more to see the observed behaviors.
Response time to not vary so much from Iris, +/- 5%-20%.
Response time varies greatly and database seems to be struggling to execute the aforementioned query.
Please see Juniper performance test results "Check-in-check-out Test Report (Juniper)" for more details.
Analysis copied from that wiki page:
item_checked_out_event userId=="1d4adbf2-4a0c-4777-b495-f29d45bd1711" LIMIT 10000 already has a correct b-tree index (schema) since September 2020 (
MODPATBLK-48 = PR 37). The long execution time of 2154 ms is caused by the high limit of 10000 records to fetch.
To calculate the user summary the code fetches all events for a user, sorts them by time, and creates a summary of it: rebuild source code
If there were 5000 loans during the last 20 years for the given userId all 5000 item_checked_out_events are loaded and processed.
One possible way to improve the performance: For each user store the timestamp of the latest event that has been processed at the last user summary calculation. Next time when rebuilding the user summary only fetch new events (events after that timestamp). To make this fast create a combined index userId + metadata.createdDate and use a special SQL query jsonb->>'userId'=$1 AND (jsonb->'metadata'->>'createdDate')::timestamp > $2