How to optimize large psql query

I'm looking for suggestions/direction on how I can improve this large query.
When I EXPLAIN ANALYZE it, I see some weak spots, such as large row over-estimations and slow sequential scans on joins.
However, after checking some indexes and digging in, I'm still at a loss as to how I can improve it:
Query:
WITH my_activities AS (
WITH people_relations AS (
SELECT people.id AS people_relations_id, array_agg(DISTINCT type) AS person_relations, companies.id AS company_id, companies.name AS company_name, companies.platform_url AS company_platform_url FROM people
INNER JOIN relationships AS person_relation ON platform_user_id = 6 AND person_relation.person_id = people.id AND person_relation.type != 'Suppressee'
LEFT OUTER JOIN companies ON people.company_id = companies.id
GROUP BY people.id, people.title, companies.id, companies.name, companies.platform_url)
SELECT owner_person.id,
owner_person.full_name,
owner_person.first_name,
owner_person.last_name,
owner_person.title,
owner_person.headshot AS owner_headshot,
owner_person.public_identifier AS owner_public_identifier,
owner_relations.person_relations AS owner_relationships,
owner_relations.company_id AS owner_company_id,
owner_relations.company_name AS owner_company_name,
owner_relations.company_platform_url AS owner_company_platform_url,
recipient_relations.person_relations AS recipient_relationships,
activities.id AS activity_id,
activities.key AS activity_key,
recipient.id AS recipient_id,
recipient.full_name AS recipient_full_name,
recipient.title AS recipient_title,
recipient.headshot AS recipient_headshot,
recipient.public_identifier AS recipient_public_identifier,
recipient_relations.company_name AS recipient_company_name,
recipient_relations.company_platform_url AS recipient_company_platform_url,
recipient_person.type AS recipient_relation,
coalesce(t_posts.id, t_post_likes.id, t_post_comments.id) AS trackable_id,
trackable_type,
coalesce(t_posts.post_date, t_post_comments.created_time, t_post_likes_post.post_date, activities.occurred_at) AS trackable_date,
coalesce(t_posts.permalink, t_post_comments.permalink, t_post_likes_post.permalink) AS trackable_permalink,
coalesce(t_posts.content, t_post_comments_post.content, t_post_likes_post.content) AS trackable_content,
trackable_companies.name AS trackable_company_name,
trackable_companies.platform_url AS trackable_company_platform_url,
t_post_comments.comment as trackable_comment FROM people AS owner_person
INNER JOIN activities ON activities.owner_id = owner_person.id AND activities.owner_type = 'Person'
AND ((activities.key = 'job.changed' AND activities.occurred_at > '2022-01-31 15:09:54') OR
(activities.key != 'job.changed' AND activities.occurred_at > '2022-04-24 14:09:54'))
LEFT OUTER JOIN li_user_activities ON activities.id = li_user_activities.activity_id AND li_user_activities.platform_user_id = 6
AND li_user_activities.dismissed_at IS NULL
LEFT OUTER JOIN icp_ids ON owner_person.id = icp_ids.icp_id
LEFT OUTER JOIN companies as trackable_companies ON trackable_companies.id = activities.trackable_id AND activities.trackable_type = 'Company'
LEFT OUTER JOIN posts as t_posts ON activities.trackable_id = t_posts.id AND activities.trackable_type = 'Post'
LEFT OUTER JOIN post_likes as t_post_likes ON activities.trackable_id = t_post_likes.id AND activities.trackable_type = 'PostLike'
LEFT OUTER JOIN posts as t_post_likes_post ON t_post_likes.post_id = t_post_likes_post.id
LEFT OUTER JOIN post_comments as t_post_comments ON activities.trackable_id = t_post_comments.id AND activities.trackable_type = 'PostComment'
LEFT OUTER JOIN posts as t_post_comments_post ON t_post_comments.post_id = t_post_comments_post.id
LEFT OUTER JOIN people AS recipient ON recipient.id = activities.recipient_id
LEFT OUTER JOIN relationships AS recipient_person ON recipient_person.person_id = recipient.id
INNER JOIN people_relations AS owner_relations ON owner_relations.people_relations_id = owner_person.id
LEFT OUTER JOIN people_relations AS recipient_relations ON recipient_relations.people_relations_id = recipient.id
WHERE ((recipient.id IS NULL OR recipient.id != owner_person.id) ) AND (key != 'asdasd'))
SELECT owner_relationships AS owner_relationships,
json_agg(DISTINCT recipient_relationships) AS recipient_relationships,
id,
jsonb_build_object('id', id, 'first_name', first_name, 'last_name', last_name, 'full_name', full_name, 'title', title, 'headshot', owner_headshot, 'public_identifier', owner_public_identifier, 'profile_url', ('https://' || owner_public_identifier), 'company', jsonb_build_object( 'id', owner_company_id, 'name', owner_company_name, 'platform_url', owner_company_platform_url )) AS owner,
json_agg( DISTINCT jsonb_build_object('id', activity_id,
'key', activity_key,
'recipient', jsonb_build_object('id', recipient_id, 'full_name', recipient_full_name, 'title', recipient_title, 'headshot', recipient_headshot, 'public_identifier', recipient_public_identifier, 'profile_url', ('https://' || recipient_public_identifier), 'relation', recipient_relationships, 'company', jsonb_build_object('name', recipient_company_name, 'platform_url', recipient_company_platform_url)),
'trackable', jsonb_build_object('id', trackable_id, 'type', trackable_type, 'comment', trackable_comment, 'permalink', trackable_permalink, 'date', trackable_date, 'content', trackable_content, 'company_name', trackable_company_name, 'company_platform_url', trackable_company_platform_url)
)) AS data
FROM my_activities
GROUP BY id, first_name, last_name, full_name, title, owner_headshot, owner_public_identifier, owner_relationships, owner_company_id, owner_company_name, owner_company_platform_url
Explain (also seen here: https://explain.dalibo.com/plan/3pJg ):
GroupAggregate (cost=654190.74..655692.10 rows=21448 width=298) (actual time=3170.209..3267.033 rows=327 loops=1)
Group Key: my_activities.id, my_activities.first_name, my_activities.last_name, my_activities.full_name, my_activities.title, my_activities.owner_headshot, my_activities.owner_public_identifier, my_activities.owner_relationships, my_activities.owner_company_id, my_activities.owner_company_name, my_activities.owner_company_platform_url
-> Sort (cost=654190.74..654244.36 rows=21448 width=674) (actual time=3168.944..3219.547 rows=2733 loops=1)
Sort Key: my_activities.id, my_activities.first_name, my_activities.last_name, my_activities.full_name, my_activities.title, my_activities.owner_headshot, my_activities.owner_public_identifier, my_activities.owner_relationships, my_activities.owner_company_id, my_activities.owner_company_name, my_activities.owner_company_platform_url
Sort Method: external merge Disk: 3176kB
-> Subquery Scan on my_activities (cost=638222.87..646193.71 rows=21448 width=674) (actual time=3142.221..3210.966 rows=2733 loops=1)
-> Hash Right Join (cost=638222.87..645979.23 rows=21448 width=706) (actual time=3142.219..3210.753 rows=2733 loops=1)
Hash Cond: (recipient_relations.people_relations_id = recipient.id)
CTE people_relations
-> GroupAggregate (cost=142850.94..143623.66 rows=34343 width=152) (actual time=1556.908..1593.594 rows=33730 loops=1)
Group Key: people.id, companies.id
-> Sort (cost=142850.94..142936.80 rows=34343 width=129) (actual time=1556.875..1560.123 rows=33780 loops=1)
Sort Key: people.id, companies.id
Sort Method: external merge Disk: 3816kB
-> Gather (cost=1647.48..137915.08 rows=34343 width=129) (actual time=1405.433..1537.693 rows=33780 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Nested Loop Left Join (cost=647.48..133480.78 rows=14310 width=129) (actual time=570.743..710.682 rows=11260 loops=3)
-> Nested Loop (cost=647.05..104036.25 rows=14310 width=55) (actual time=570.719..655.804 rows=11260 loops=3)
-> Parallel Bitmap Heap Scan on relationships person_relation (cost=646.62..13074.28 rows=14310 width=13) (actual time=570.627..579.277 rows=11260 loops=3)
Recheck Cond: (platform_user_id = 6)
Filter: ((type)::text <> 'Suppressee'::text)
Rows Removed by Filter: 12
Heap Blocks: exact=1642
-> Bitmap Index Scan on index_relationships_on_platform_user_id_and_person_id (cost=0.00..638.03 rows=34347 width=0) (actual time=2.254..2.254 rows=33829 loops=1)
Index Cond: (platform_user_id = 6)
-> Index Scan using people_pkey on people (cost=0.43..6.36 rows=1 width=46) (actual time=0.006..0.006 rows=1 loops=33780)
Index Cond: (id = person_relation.person_id)
-> Index Scan using companies_pkey on companies (cost=0.43..2.06 rows=1 width=82) (actual time=0.005..0.005 rows=1 loops=33780)
Index Cond: (id = people.company_id)
-> CTE Scan on people_relations recipient_relations (cost=0.00..686.86 rows=34343 width=104) (actual time=0.018..4.247 rows=33730 loops=1)
-> Hash (cost=488466.12..488466.12 rows=21448 width=2209) (actual time=3142.015..3191.555 rows=2733 loops=1)
Buckets: 2048 Batches: 16 Memory Usage: 655kB
-> Merge Join (cost=487925.89..488466.12 rows=21448 width=2209) (actual time=3094.438..3187.748 rows=2733 loops=1)
Merge Cond: (owner_relations.people_relations_id = activities.owner_id)
-> Sort (cost=5272.71..5358.57 rows=34343 width=112) (actual time=1622.739..1626.249 rows=33730 loops=1)
Sort Key: owner_relations.people_relations_id
Sort Method: external merge Disk: 4128kB
-> CTE Scan on people_relations owner_relations (cost=0.00..686.86 rows=34343 width=112) (actual time=1556.912..1610.745 rows=33730 loops=1)
-> Materialize (cost=482653.17..482746.77 rows=18719 width=2113) (actual time=1471.676..1552.408 rows=69702 loops=1)
-> Sort (cost=482653.17..482699.97 rows=18719 width=2113) (actual time=1471.672..1543.930 rows=69702 loops=1)
Sort Key: owner_person.id
Sort Method: external merge Disk: 84608kB
-> Gather (cost=64235.86..464174.85 rows=18719 width=2113) (actual time=1305.158..1393.927 rows=81045 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Hash Left Join (cost=63235.86..461302.95 rows=7800 width=2113) (actual time=1289.165..1311.400 rows=27015 loops=3)
Hash Cond: (t_post_comments.post_id = t_post_comments_post.id)
-> Nested Loop Left Join (cost=51190.69..443455.30 rows=7800 width=1700) (actual time=443.623..511.046 rows=27015 loops=3)
Join Filter: ((activities.trackable_type)::text = 'PostComment'::text)
Rows Removed by Join Filter: 1756
-> Parallel Hash Left Join (cost=51190.26..395642.27 rows=7800 width=1408) (actual time=443.580..471.580 rows=27015 loops=3)
Hash Cond: (recipient.id = recipient_person.person_id)
-> Nested Loop Left Join (cost=26667.49..366532.83 rows=7800 width=1408) (actual time=214.602..348.548 rows=6432 loops=3)
Filter: ((recipient.id IS NULL) OR (recipient.id <> owner_person.id))
Rows Removed by Filter: 249
-> Nested Loop Left Join (cost=26667.06..310170.84 rows=7800 width=1333) (actual time=214.591..338.396 rows=6681 loops=3)
Join Filter: ((activities.trackable_type)::text = 'Company'::text)
Rows Removed by Join Filter: 894
-> Hash Left Join (cost=26666.63..257110.20 rows=7800 width=1259) (actual time=214.566..324.738 rows=6681 loops=3)
Hash Cond: (activities.id = li_user_activities.activity_id)
-> Nested Loop (cost=25401.21..255737.89 rows=7800 width=1259) (actual time=208.406..315.896 rows=6681 loops=3)
-> Parallel Hash Left Join (cost=25400.78..199473.40 rows=7800 width=1161) (actual time=208.367..216.663 rows=6681 loops=3)
Hash Cond: (t_post_likes.post_id = t_post_likes_post.id)
-> Nested Loop Left Join (cost=12700.61..182373.75 rows=7800 width=623) (actual time=143.176..167.675 rows=6681 loops=3)
Join Filter: ((activities.trackable_type)::text = 'PostLike'::text)
Rows Removed by Join Filter: 1095
-> Parallel Hash Left Join (cost=12700.17..131647.07 rows=7800 width=611) (actual time=143.146..156.428 rows=6681 loops=3)
Hash Cond: (activities.trackable_id = t_posts.id)
Join Filter: ((activities.trackable_type)::text = 'Post'::text)
Rows Removed by Join Filter: 1452
-> Parallel Seq Scan on activities (cost=0.00..115613.42 rows=7800 width=61) (actual time=0.376..80.040 rows=6681 loops=3)
Filter: (((key)::text <> 'asdasd'::text) AND ((owner_type)::text = 'Person'::text) AND ((((key)::text = 'job.changed'::text) AND (occurred_at > '2022-01-31 15:09:54'::timestamp without time zone)) OR (((key)::text <> 'job.changed'::text) AND (occurred_at > '2022-04-24 14:09:54'::timestamp without time zone))))
Rows Removed by Filter: 27551
-> Parallel Hash (cost=8996.19..8996.19 rows=44719 width=550) (actual time=57.638..57.639 rows=35776 loops=3)
Buckets: 8192 Batches: 16 Memory Usage: 4032kB
-> Parallel Seq Scan on posts t_posts (cost=0.00..8996.19 rows=44719 width=550) (actual time=0.032..14.451 rows=35776 loops=3)
-> Index Scan using post_likes_pkey on post_likes t_post_likes (cost=0.43..6.49 rows=1 width=12) (actual time=0.001..0.001 rows=1 loops=20042)
Index Cond: (id = activities.trackable_id)
-> Parallel Hash (cost=8996.19..8996.19 rows=44719 width=550) (actual time=35.322..35.322 rows=35776 loops=3)
Buckets: 8192 Batches: 16 Memory Usage: 4000kB
-> Parallel Seq Scan on posts t_post_likes_post (cost=0.00..8996.19 rows=44719 width=550) (actual time=0.022..10.427 rows=35776 loops=3)
-> Index Scan using people_pkey on people owner_person (cost=0.43..7.21 rows=1 width=98) (actual time=0.014..0.014 rows=1 loops=20042)
Index Cond: (id = activities.owner_id)
-> Hash (cost=951.58..951.58 rows=25107 width=4) (actual time=6.115..6.116 rows=25698 loops=3)
Buckets: 32768 Batches: 1 Memory Usage: 1160kB
-> Seq Scan on li_user_activities (cost=0.00..951.58 rows=25107 width=4) (actual time=0.011..3.578 rows=25698 loops=3)
Filter: ((dismissed_at IS NULL) AND (platform_user_id = 6))
Rows Removed by Filter: 15722
-> Index Scan using companies_pkey on companies trackable_companies (cost=0.43..6.79 rows=1 width=82) (actual time=0.002..0.002 rows=0 loops=20042)
Index Cond: (id = activities.trackable_id)
-> Index Scan using people_pkey on people recipient (cost=0.43..7.21 rows=1 width=83) (actual time=0.001..0.001 rows=1 loops=20042)
Index Cond: (id = activities.recipient_id)
-> Parallel Hash (cost=16874.67..16874.67 rows=466168 width=4) (actual time=79.735..79.736 rows=372930 loops=3)
Buckets: 131072 Batches: 16 Memory Usage: 3840kB
-> Parallel Seq Scan on relationships recipient_person (cost=0.00..16874.67 rows=466168 width=4) (actual time=0.021..35.805 rows=372930 loops=3)
-> Index Scan using post_comments_pkey on post_comments t_post_comments (cost=0.42..6.12 rows=1 width=300) (actual time=0.001..0.001 rows=0 loops=81045)
Index Cond: (id = activities.trackable_id)
-> Parallel Hash (cost=8996.19..8996.19 rows=44719 width=425) (actual time=726.076..726.076 rows=35776 loops=3)
Buckets: 16384 Batches: 16 Memory Usage: 3264kB
-> Parallel Seq Scan on posts t_post_comments_post (cost=0.00..8996.19 rows=44719 width=425) (actual time=479.054..488.703 rows=35776 loops=3)
Planning Time: 5.286 ms
JIT:
Functions: 304
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 22.990 ms, Inlining 260.865 ms, Optimization 1652.601 ms, Emission 1228.811 ms, Total 3165.267 ms
Execution Time: 3303.637 ms
UPDATE:
Here's the plan with jit=off:
https://explain.dalibo.com/plan/EXn

It looks like essentially all your time is going into doing just-in-time compilations. Turn off JIT (jit=off in the config file, or set jit=off; to do it in the session).
I thought turning JIT off would make the time drop a lot more than that, since the original plan attributed all but 3303.637 - 3165.267 = 138 ms to JIT. You should alternate a few times between JIT on and off to see if the times you originally reported are reproducible or might just be due to differences in caching effects.
Also, the times you report are 2-3 times longer than the times the plan itself reports. That is another thing you should check to see how reproducible it is. Maybe most of the time is spent formatting the data to send, or sending it over the network. (That seems unlikely with only 240 rows, but I don't know what else would explain it.)
The time spent is spread thinly throughout the plan now, so there is no one change that could be made to any of the nodes that would make a big difference to the overall time. And I don't see that the estimation errors are driving any plan choices where better estimates would lead to better choices.
Given the lack of a clear bottleneck, opportunities to speed up would probably be faster drives or more RAM for caching or increasing max_parallel_workers_per_gather so you can get more work done in parallel.
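Both of those knobs can be tried per session before touching postgresql.conf; the worker count below is only an example value to experiment with, not a recommendation:
SET jit = off;
SET max_parallel_workers_per_gather = 4;  -- the plan above only planned 2 workers
Then re-run EXPLAIN (ANALYZE) and compare the timings.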
Looking at the text of the query, I don't understand what its motivation is, so that limits my ability to make suggestions. But there are a lot of DISTINCTs there. Are some of the joins generating needless duplicate rows, which are then condensed back down with the DISTINCTs? If so, maybe using WHERE EXISTS (...) could improve things.
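As a generic sketch of that idea (not a drop-in replacement for the query above, since some of those joins also supply output columns): when a join is only there to filter, an existence test avoids multiplying rows that DISTINCT then has to collapse, for example:
SELECT p.id, p.full_name
FROM people p
WHERE EXISTS (
    SELECT 1
    FROM relationships r
    WHERE r.person_id = p.id
      AND r.platform_user_id = 6   -- filter condition lifted from the original join
);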

Related

How to improve the performance of queries when joining a lot of huge tables?

This is my SQL script, I have to join 7 tables
SELECT concat_ws('-', it.item_id, it.model_id) AS product_id,
concat_ws('-', aip.partner_item_id, aip.partner_model_id) AS product_reseller_id,
i.name as item_name,
im.name AS model_name,
p.partner_code,
sum(it.quantity) AS transfer_total,
sum(isb.remaining_item) as remaining_stock,
sum(isb.sold_item) as partner_sold
FROM transfer t
INNER JOIN partner p ON p.reseller_store_id = t.reseller_store_id
INNER JOIN item_transfer it ON t.id = it.transfer_id
INNER JOIN item i ON i.id = it.item_id
INNER JOIN item_model im ON it.model_id = im.id
INNER JOIN affiliate_item_mapping aip on it.item_id = aip.seller_item_id and it.model_id = aip.seller_model_id
and t.reseller_store_id = aip.reseller_store_id
LEFT JOIN inventory_summary_branch isb on isb.inventory_summary_id = concat_ws('-', aip.partner_item_id, aip.partner_model_id)
WHERE p.store_id = 9805
GROUP BY it.item_id, it.model_id, p.partner_code, i.id, im.id, aip.id, isb.inventory_summary_id
This is the result of SQL EXPLAIN:
GroupAggregate (cost=13861.57..13861.62 rows=1 width=885) (actual time=1890.392..1890.525 rows=15 loops=1)
Group Key: it.item_id, it.model_id, p.partner_code, i.id, im.id, aip.id, isb.inventory_summary_id
Buffers: shared hit=118610
-> Sort (cost=13861.57..13861.58 rows=1 width=765) (actual time=1890.310..1890.338 rows=21 loops=1)
Sort Key: it.item_id, it.model_id, p.partner_code, aip.id, isb.inventory_summary_id
Sort Method: quicksort Memory: 28kB
Buffers: shared hit=118610
-> Nested Loop (cost=1.27..13861.56 rows=1 width=765) (actual time=73.156..1890.057 rows=21 loops=1)
Buffers: shared hit=118610
-> Nested Loop (cost=0.85..13853.14 rows=1 width=753) (actual time=73.134..1889.495 rows=21 loops=1)
Buffers: shared hit=118526
-> Nested Loop (cost=0.43..13845.32 rows=1 width=609) (actual time=73.099..1888.733 rows=21 loops=1)
Join Filter: ((p.reseller_store_id = t.reseller_store_id) AND (it.transfer_id = t.id))
Rows Removed by Join Filter: 2142
Buffers: shared hit=118442
-> Nested Loop (cost=0.43..13840.24 rows=1 width=633) (actual time=72.793..1879.961 rows=21 loops=1)
Join Filter: ((aip.seller_item_id = it.item_id) AND (aip.seller_model_id = it.model_id))
Rows Removed by Join Filter: 6003
Buffers: shared hit=118379
-> Nested Loop Left Join (cost=0.43..13831.47 rows=1 width=601) (actual time=72.093..1861.415 rows=24 loops=1)
Buffers: shared hit=118307
-> Nested Loop (cost=0.00..11.44 rows=1 width=572) (actual time=0.042..0.696 rows=24 loops=1)
Join Filter: (p.reseller_store_id = aip.reseller_store_id)
Rows Removed by Join Filter: 150
Buffers: shared hit=7
-> Seq Scan on partner p (cost=0.00..10.38 rows=1 width=524) (actual time=0.026..0.039 rows=6 loops=1)
Filter: (store_id = 9805)
Buffers: shared hit=1
-> Seq Scan on affiliate_item_mapping aip (cost=0.00..1.03 rows=3 width=48) (actual time=0.006..0.043 rows=29 loops=6)
Buffers: shared hit=6
-> Index Scan using branch_id_inventory_summary_id_inventory_summary_branch on inventory_summary_branch isb (cost=0.43..13820.01 rows=1 width=29) (actual time=77.498..77.498 rows=0 loops=24)
Index Cond: ((inventory_summary_id)::text = concat_ws('-'::text, aip.partner_item_id, aip.partner_model_id))
Buffers: shared hit=118300
-> Seq Scan on item_transfer it (cost=0.00..5.31 rows=231 width=32) (actual time=0.024..0.391 rows=251 loops=24)
Buffers: shared hit=72
-> Seq Scan on transfer t (cost=0.00..3.83 rows=83 width=16) (actual time=0.011..0.256 rows=103 loops=21)
Buffers: shared hit=63
-> Index Scan using pk_item on item i (cost=0.42..7.81 rows=1 width=152) (actual time=0.022..0.023 rows=1 loops=21)
Index Cond: (id = it.item_id)
Buffers: shared hit=84
-> Index Scan using pk_item_model on item_model im (cost=0.43..8.41 rows=1 width=20) (actual time=0.016..0.018 rows=1 loops=21)
Index Cond: (id = it.model_id)
Buffers: shared hit=84
Planning time: 10.051 ms
Execution time: 1890.943 ms
Of course, this statement works fine, but it's slow. Is there a better way to write this code?
How can I improve the performance? Is a join or a sub-query better in this case? Anyone, please give me a hand.
Two things can help you:
do VACUUM ANALYZE for all the tables involved.
create an index on item_transfer (item_id, model_id)
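Spelled out as SQL (the index name is just a placeholder):
VACUUM ANALYZE item_transfer;
-- repeat for transfer, partner, item, item_model, affiliate_item_mapping, inventory_summary_branch
CREATE INDEX idx_item_transfer_item_model ON item_transfer (item_id, model_id);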
Essentially all of your time (77.498 * 24) is spent on the index scan of branch_id_inventory_summary_id_inventory_summary_branch.
About the only explanation I can see for this is that the index isn't suited to the query, and it is being full-index scanned (in lieu of full scanning the table), rather than being efficiently scanned. This probably means the index includes the column inventory_summary_id, but it is not the leading column. (It would be nice if EXPLAIN were to make this inefficient type of usage clearer than it currently does).
You would probably benefit from an index such as on inventory_summary_branch (inventory_summary_id) which has a better chance of being used efficiently.
I don't know why it wouldn't just do a hash join of that table. Maybe your work_mem is too low.
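Concretely, something along these lines (the index name is arbitrary, and the work_mem value is only an example to experiment with):
CREATE INDEX idx_isb_inventory_summary_id ON inventory_summary_branch (inventory_summary_id);
-- if the planner still refuses a hash join, try a larger per-session work_mem:
SET work_mem = '64MB';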
Inner joins will always be slower, especially with so many tables.
You could change from an inner join on the whole table to just the columns you need and see if that improves it at all:
From:
INNER JOIN partner p ON p.reseller_store_id = t.reseller_store_id
To:
inner join (select reseller_store_id, store_id, partner_code from partner) as p ON p.reseller_store_id = t.reseller_store_id
See if that speeds things up at all.
If not I would recommend indexes on the keys

Postgres: Slow query when using OR statement in a join query

We run a join query between 2 tables.
The query has an OR statement that compares one column from the left table and one column from the right table. The query performance is very low, and we fixed it by changing the OR to UNION.
Why is this happening? I'm looking for a detailed explanation or a reference to the documentation that might shed a light on the issue.
Query with OR statement:
db1=# explain analyze select count(*)
from conversations
join agents on conversations.agent_id=agents.id
where conversations.id=1 or agents.id = '123';
Query plan
----------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=11017.95..11017.96 rows=1 width=8) (actual time=54.088..54.088 rows=1 loops=1)
-> Gather (cost=11017.73..11017.94 rows=2 width=8) (actual time=53.945..57.181 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=10017.73..10017.74 rows=1 width=8) (actual time=48.303..48.303 rows=1 loops=3)
-> Hash Join (cost=219.26..10016.69 rows=415 width=0) (actual time=5.292..48.287 rows=130 loops=3)
Hash Cond: (conversations.agent_id = agents.id)
Join Filter: ((conversations.id = 1) OR ((agents.id)::text = '123'::text))
Rows Removed by Join Filter: 80035
-> Parallel Seq Scan on conversations (cost=0.00..9366.95 rows=163995 width=8) (actual time=0.017..14.972 rows=131196 loops=3)
-> Hash (cost=143.56..143.56 rows=6056 width=16) (actual time=2.686..2.686 rows=6057 loops=3)
Buckets: 8192 Batches: 1 Memory Usage: 353kB
-> Seq Scan on agents (cost=0.00..143.56 rows=6056 width=16) (actual time=0.011..1.305 rows=6057 loops=3)
Planning time: 0.710 ms
Execution time: 57.276 ms
(15 rows)
Changing the OR to UNION:
db1=# explain analyze select count(*) from (
select *
from conversations
join agents on conversations.agent_id=agents.id
where conversations.installation_id=1
union
select *
from conversations
join agents on conversations.agent_id=agents.id
where agents.source_id = '123') as subquery;
Query plan:
----------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1114.31..1114.32 rows=1 width=8) (actual time=8.038..8.038 rows=1 loops=1)
-> HashAggregate (cost=1091.90..1101.86 rows=996 width=1437) (actual time=7.783..8.009 rows=390 loops=1)
Group Key: conversations.id, conversations.created, conversations.modified, conversations.source_created, conversations.source_id, conversations.installation_id, brain_conversation.resolution_reason, conversations.solve_time, conversations.agent_id, conversations.submission_reason, conversations.is_marked_as_duplicate, conversations.num_back_and_forths, conversations.is_closed, conversations.is_solved, conversations.conversation_type, conversations.related_ticket_source_id, conversations.channel, brain_conversation.last_updated_from_platform, conversations.csat, agents.id, agents.created, agents.modified, agents.name, agents.source_id, organization_agent.installation_id, agents.settings
-> Append (cost=219.68..1027.16 rows=996 width=1437) (actual time=5.517..6.307 rows=390 loops=1)
-> Hash Join (cost=219.68..649.69 rows=931 width=224) (actual time=5.516..6.063 rows=390 loops=1)
Hash Cond: (conversations.agent_id = agents.id)
-> Index Scan using conversations_installation_id_b3ff5c00 on conversations (cost=0.42..427.98 rows=931 width=154) (actual time=0.039..0.344 rows=879 loops=1)
Index Cond: (installation_id = 1)
-> Hash (cost=143.56..143.56 rows=6056 width=70) (actual time=5.394..5.394 rows=6057 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 710kB
-> Seq Scan on agents (cost=0.00..143.56 rows=6056 width=70) (actual time=0.014..1.938 rows=6057 loops=1)
-> Nested Loop (cost=0.70..367.52 rows=65 width=224) (actual time=0.210..0.211 rows=0 loops=1)
-> Index Scan using agents_source_id_106c8103_like on agents agents_1 (cost=0.28..8.30 rows=1 width=70) (actual time=0.210..0.210 rows=0 loops=1)
Index Cond: ((source_id)::text = '123'::text)
-> Index Scan using conversations_agent_id_de76554b on conversations conversations_1 (cost=0.42..358.12 rows=110 width=154) (never executed)
Index Cond: (agent_id = agents_1.id)
Planning time: 2.024 ms
Execution time: 9.367 ms
(18 rows)
Yes. OR has a way of killing the performance of queries. For this query:
select count(*)
from conversations c join
agents a
on c.agent_id = a.id
where c.id = 1 or a.id = 123;
Note I removed the quotes around 123. It looks like a number so I assume it is. For this query, you want an index on conversations(agent_id).
Probably the most effective way to write the query is:
select count(*)
from ((select 1
from conversations c join
agents a
on c.agent_id = a.id
where c.id = 1
) union all
(select 1
from conversations c join
agents a
on c.agent_id = a.id
where a.id = 123 and c.id <> 1
)
) ac;
Note the use of union all rather than union. The additional where condition eliminates duplicates.
This can take advantage of the following indexes:
conversations(id, agent_id)
agents(id)
conversations(agent_id, id)
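As DDL, those would look roughly like the following (index names are illustrative; agents.id is presumably already covered by its primary key):
CREATE INDEX idx_conversations_id_agent ON conversations (id, agent_id);
CREATE INDEX idx_conversations_agent_id ON conversations (agent_id, id);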

Postgresql Query Slows Inexplicably with Addition of WHERE Constraint

I have the following PostgreSQL query, which contains a few subqueries. The
query runs almost instantly until I add the WHERE lb.type = 'Marketing'
constraint, which causes it to take about 3 minutes. I find it inexplicable that
the addition of such a simple constraint causes such an extreme slowdown, but
I'm guessing it must point to a fundamental flaw in my approach.
I'm hoping for assistance on a few fronts:
Is my use of subqueries to select the latest records from specific tables appropriate, or could it cause performance issues?
What should I be looking for in the execution plan when trying to diagnose issues?
How should I go about determining what indexes to create for complex queries such as these?
Why could the additional WHERE constraint be causing such a massive slowdown?
The table structure is as follows:
CREATE TABLE sales.leads
(
lead_id integer NOT NULL DEFAULT nextval('sales.leads_lead_id_seq'::regclass),
batch_id integer,
expired integer NOT NULL DEFAULT 0,
closed integer NOT NULL DEFAULT 0,
merged integer NOT NULL DEFAULT 0,
CONSTRAINT leads_pkey PRIMARY KEY (lead_id)
)
CREATE TABLE sales.lead_batches
(
batch_id integer NOT NULL DEFAULT nextval('sales.lead_batches_batch_id_seq'::regclass),
inserted_datetime timestamp without time zone,
type character varying(100) COLLATE pg_catalog."default",
uploaded smallint NOT NULL DEFAULT '0'::smallint,
CONSTRAINT lead_batches_pkey PRIMARY KEY (batch_id)
)
CREATE TABLE sales.lead_results
(
lead_result_id integer NOT NULL DEFAULT nextval('sales.lead_results_lead_result_id_seq'::regclass),
lead_id integer,
assigned_datetime timestamp without time zone NOT NULL,
user_id character varying(255) COLLATE pg_catalog."default" NOT NULL,
resulted_datetime timestamp without time zone,
result character varying(255) COLLATE pg_catalog."default",
CONSTRAINT lead_results_pkey PRIMARY KEY (lead_result_id)
)
CREATE TABLE sales.personal_details
(
lead_id integer,
title character varying(50) COLLATE pg_catalog."default",
first_name character varying(100) COLLATE pg_catalog."default",
surname character varying(255) COLLATE pg_catalog."default",
email_address character varying(100) COLLATE pg_catalog."default",
updated_date date NOT NULL
)
CREATE TABLE sales.users
(
user_id character varying(50) COLLATE pg_catalog."default" NOT NULL,
surname character varying(255) COLLATE pg_catalog."default",
name character varying(255) COLLATE pg_catalog."default"
)
Query:
SELECT l.*, pd.*, lr.resulted_datetime, u.name
FROM sales.leads l
INNER JOIN sales.lead_batches lb ON l.batch_id = lb.batch_id
LEFT JOIN (
SELECT pd_sub.*
FROM sales.personal_details pd_sub
INNER JOIN (
SELECT lead_id, MAX(updated_date) AS updated_date
FROM sales.personal_details
GROUP BY lead_id
) sub ON pd_sub.lead_id = sub.lead_id AND pd_sub.updated_date = sub.updated_date
) pd ON l.lead_id = pd.lead_id
LEFT JOIN (
SELECT lr_sub.*
FROM sales.lead_results lr_sub
INNER JOIN (
SELECT lead_id, MAX(resulted_datetime) AS resulted_datetime
FROM sales.lead_results
GROUP BY lead_id) sub
ON lr_sub.lead_id = sub.lead_id AND lr_sub.resulted_datetime = sub.resulted_datetime
) lr ON l.lead_id = lr.lead_id
LEFT JOIN sales.users u ON u.user_id = lr.user_id
WHERE lb.type = 'Marketing'
AND lb.uploaded = 1
AND l.merged = 0
Execution plan:
Nested Loop Left Join (cost=10485.51..17604.18 rows=34 width=158) (actual time=717.862..168709.593 rows=18001 loops=1)
Join Filter: (l.lead_id = pd_sub.lead_id)
Rows Removed by Join Filter: 687818215
-> Nested Loop Left Join (cost=6487.82..12478.42 rows=34 width=135) (actual time=658.141..64951.950 rows=18001 loops=1)
Join Filter: (l.lead_id = lr_sub.lead_id)
Rows Removed by Join Filter: 435482960
-> Hash Join (cost=131.01..1816.10 rows=34 width=60) (actual time=1.948..126.067 rows=17998 loops=1)
Hash Cond: (l.batch_id = lb.batch_id)
-> Seq Scan on leads l (cost=0.00..1273.62 rows=32597 width=44) (actual time=0.032..69.763 rows=32621 loops=1)
Filter: (merged = 0)
Rows Removed by Filter: 5595
-> Hash (cost=130.96..130.96 rows=4 width=20) (actual time=1.894..1.894 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on lead_batches lb (cost=0.00..130.96 rows=4 width=20) (actual time=1.078..1.884 rows=4 loops=1)
Filter: (((type)::text = 'Marketing'::text) AND (uploaded = 1))
Rows Removed by Filter: 3866
-> Materialize (cost=6356.81..10661.81 rows=1 width=79) (actual time=0.006..1.362 rows=24197 loops=17998)
-> Nested Loop Left Join (cost=6356.81..10661.81 rows=1 width=79) (actual time=96.246..633.701 rows=24197 loops=1)
Join Filter: ((u.user_id)::text = (lr_sub.user_id)::text)
Rows Removed by Join Filter: 1742184
-> Gather (cost=6356.81..10659.19 rows=1 width=72) (actual time=96.203..202.086 rows=24197 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=5356.81..9659.09 rows=1 width=72) (actual time=134.595..166.341 rows=8066 loops=3)
Hash Cond: ((lr_sub.lead_id = lead_results.lead_id) AND (lr_sub.resulted_datetime = (max(lead_results.resulted_datetime))))
-> Parallel Seq Scan on lead_results lr_sub (cost=0.00..3622.05 rows=44605 width=72) (actual time=0.033..17.333 rows=35684 loops=3)
-> Hash (cost=5110.36..5110.36 rows=16430 width=12) (actual time=134.260..134.260 rows=24194 loops=3)
Buckets: 32768 Batches: 1 Memory Usage: 1391kB
-> HashAggregate (cost=4781.76..4946.06 rows=16430 width=12) (actual time=122.823..129.022 rows=24204 loops=3)
Group Key: lead_results.lead_id
-> Seq Scan on lead_results (cost=0.00..4246.51 rows=107051 width=12) (actual time=0.020..71.768 rows=107051 loops=3)
-> Seq Scan on users u (cost=0.00..1.72 rows=72 width=23) (actual time=0.002..0.007 rows=73 loops=24197)
-> Materialize (cost=3997.68..5030.85 rows=187 width=31) (actual time=0.003..2.033 rows=38211 loops=18001)
-> Hash Join (cost=3997.68..5029.92 rows=187 width=31) (actual time=52.802..85.774 rows=38211 loops=1)
Hash Cond: ((personal_details.lead_id = pd_sub.lead_id) AND ((max(personal_details.updated_date)) = pd_sub.updated_date))
-> HashAggregate (cost=1811.38..2186.06 rows=37468 width=8) (actual time=23.330..35.345 rows=38212 loops=1)
Group Key: personal_details.lead_id
-> Seq Scan on personal_details (cost=0.00..1623.92 rows=37492 width=8) (actual time=0.014..4.636 rows=38232 loops=1)
-> Hash (cost=1623.92..1623.92 rows=37492 width=35) (actual time=29.058..29.058 rows=38211 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2809kB
-> Seq Scan on personal_details pd_sub (cost=0.00..1623.92 rows=37492 width=35) (actual time=0.026..17.231 rows=38232 loops=1)
Planning time: 1.966 ms
Execution time: 168731.769 ms
I have an index on lead_id on all tables, and an additional index on (type, uploaded) in lead_batches.
Thanks very much in advance for any assistance!
EDIT:
The execution plan without the additional WHERE constraint:
Hash Left Join (cost=15861.46..17780.37 rows=30972 width=158) (actual time=765.076..844.512 rows=32053 loops=1)
Hash Cond: (l.lead_id = pd_sub.lead_id)
-> Hash Left Join (cost=10829.21..12630.45 rows=30972 width=135) (actual time=667.460..724.297 rows=32053 loops=1)
Hash Cond: (l.lead_id = lr_sub.lead_id)
-> Hash Join (cost=167.39..1852.48 rows=30972 width=60) (actual time=2.579..36.683 rows=32050 loops=1)
Hash Cond: (l.batch_id = lb.batch_id)
-> Seq Scan on leads l (cost=0.00..1273.62 rows=32597 width=44) (actual time=0.034..22.166 rows=32623 loops=1)
Filter: (merged = 0)
Rows Removed by Filter: 5595
-> Hash (cost=121.40..121.40 rows=3679 width=20) (actual time=2.503..2.503 rows=3679 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 234kB
-> Seq Scan on lead_batches lb (cost=0.00..121.40 rows=3679 width=20) (actual time=0.011..1.809 rows=3679 loops=1)
Filter: (uploaded = 1)
Rows Removed by Filter: 193
-> Hash (cost=10661.81..10661.81 rows=1 width=79) (actual time=664.855..664.855 rows=24197 loops=1)
Buckets: 32768 (originally 1024) Batches: 1 (originally 1) Memory Usage: 2821kB
-> Nested Loop Left Join (cost=6356.81..10661.81 rows=1 width=79) (actual time=142.634..647.146 rows=24197 loops=1)
Join Filter: ((u.user_id)::text = (lr_sub.user_id)::text)
Rows Removed by Join Filter: 1742184
-> Gather (cost=6356.81..10659.19 rows=1 width=72) (actual time=142.590..241.913 rows=24197 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=5356.81..9659.09 rows=1 width=72) (actual time=141.250..171.403 rows=8066 loops=3)
Hash Cond: ((lr_sub.lead_id = lead_results.lead_id) AND (lr_sub.resulted_datetime = (max(lead_results.resulted_datetime))))
-> Parallel Seq Scan on lead_results lr_sub (cost=0.00..3622.05 rows=44605 width=72) (actual time=0.027..15.322 rows=35684 loops=3)
-> Hash (cost=5110.36..5110.36 rows=16430 width=12) (actual time=140.917..140.917 rows=24194 loops=3)
Buckets: 32768 Batches: 1 Memory Usage: 1391kB
-> HashAggregate (cost=4781.76..4946.06 rows=16430 width=12) (actual time=127.911..135.076 rows=24204 loops=3)
Group Key: lead_results.lead_id
-> Seq Scan on lead_results (cost=0.00..4246.51 rows=107051 width=12) (actual time=0.020..74.626 rows=107051 loops=3)
-> Seq Scan on users u (cost=0.00..1.72 rows=72 width=23) (actual time=0.002..0.006 rows=73 loops=24197)
-> Hash (cost=5029.92..5029.92 rows=187 width=31) (actual time=97.561..97.561 rows=38213 loops=1)
Buckets: 65536 (originally 1024) Batches: 1 (originally 1) Memory Usage: 2660kB
-> Hash Join (cost=3997.68..5029.92 rows=187 width=31) (actual time=52.712..85.099 rows=38213 loops=1)
Hash Cond: ((personal_details.lead_id = pd_sub.lead_id) AND ((max(personal_details.updated_date)) = pd_sub.updated_date))
-> HashAggregate (cost=1811.38..2186.06 rows=37468 width=8) (actual time=23.831..35.015 rows=38214 loops=1)
Group Key: personal_details.lead_id
-> Seq Scan on personal_details (cost=0.00..1623.92 rows=37492 width=8) (actual time=0.012..4.995 rows=38234 loops=1)
-> Hash (cost=1623.92..1623.92 rows=37492 width=35) (actual time=28.468..28.468 rows=38213 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2809kB
-> Seq Scan on personal_details pd_sub (cost=0.00..1623.92 rows=37492 width=35) (actual time=0.024..17.089 rows=38234 loops=1)
Planning time: 2.058 ms
Execution time: 849.460 ms
The execution plan with nested_loops disabled:
Hash Left Join (cost=13088.17..17390.71 rows=34 width=158) (actual time=277.646..343.924 rows=18001 loops=1)
Hash Cond: (l.lead_id = pd_sub.lead_id)
-> Hash Right Join (cost=8055.91..12358.31 rows=34 width=135) (actual time=181.614..238.365 rows=18001 loops=1)
Hash Cond: (lr_sub.lead_id = l.lead_id)
-> Hash Left Join (cost=6359.43..10661.82 rows=1 width=79) (actual time=156.498..201.533 rows=24197 loops=1)
Hash Cond: ((lr_sub.user_id)::text = (u.user_id)::text)
-> Gather (cost=6356.81..10659.19 rows=1 width=72) (actual time=156.415..190.934 rows=24197 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=5356.81..9659.09 rows=1 width=72) (actual time=143.387..178.653 rows=8066 loops=3)
Hash Cond: ((lr_sub.lead_id = lead_results.lead_id) AND (lr_sub.resulted_datetime = (max(lead_results.resulted_datetime))))
-> Parallel Seq Scan on lead_results lr_sub (cost=0.00..3622.05 rows=44605 width=72) (actual time=0.036..22.404 rows=35684 loops=3)
-> Hash (cost=5110.36..5110.36 rows=16430 width=12) (actual time=143.052..143.052 rows=24194 loops=3)
Buckets: 32768 Batches: 1 Memory Usage: 1391kB
-> HashAggregate (cost=4781.76..4946.06 rows=16430 width=12) (actual time=131.793..137.760 rows=24204 loops=3)
Group Key: lead_results.lead_id
-> Seq Scan on lead_results (cost=0.00..4246.51 rows=107051 width=12) (actual time=0.023..78.918 rows=107051 loops=3)
-> Hash (cost=1.72..1.72 rows=72 width=23) (actual time=0.061..0.061 rows=73 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on users u (cost=0.00..1.72 rows=72 width=23) (actual time=0.031..0.039 rows=73 loops=1)
-> Hash (cost=1696.05..1696.05 rows=34 width=60) (actual time=25.068..25.068 rows=17998 loops=1)
Buckets: 32768 (originally 1024) Batches: 1 (originally 1) Memory Usage: 2084kB
-> Hash Join (cost=10.96..1696.05 rows=34 width=60) (actual time=0.208..18.630 rows=17998 loops=1)
Hash Cond: (l.batch_id = lb.batch_id)
-> Seq Scan on leads l (cost=0.00..1273.62 rows=32597 width=44) (actual time=0.043..13.065 rows=32623 loops=1)
Filter: (merged = 0)
Rows Removed by Filter: 5595
-> Hash (cost=10.91..10.91 rows=4 width=20) (actual time=0.137..0.137 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Index Scan using lead_batches_type_idx on lead_batches lb (cost=0.28..10.91 rows=4 width=20) (actual time=0.091..0.129 rows=4 loops=1)
Index Cond: ((type)::text = 'Marketing'::text)
Filter: (uploaded = 1)
-> Hash (cost=5029.92..5029.92 rows=187 width=31) (actual time=96.005..96.005 rows=38213 loops=1)
Buckets: 65536 (originally 1024) Batches: 1 (originally 1) Memory Usage: 2660kB
-> Hash Join (cost=3997.68..5029.92 rows=187 width=31) (actual time=52.166..84.592 rows=38213 loops=1)
Hash Cond: ((personal_details.lead_id = pd_sub.lead_id) AND ((max(personal_details.updated_date)) = pd_sub.updated_date))
-> HashAggregate (cost=1811.38..2186.06 rows=37468 width=8) (actual time=23.785..34.403 rows=38214 loops=1)
Group Key: personal_details.lead_id
-> Seq Scan on personal_details (cost=0.00..1623.92 rows=37492 width=8) (actual time=0.013..4.680 rows=38234 loops=1)
-> Hash (cost=1623.92..1623.92 rows=37492 width=35) (actual time=27.960..27.960 rows=38213 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2809kB
-> Seq Scan on personal_details pd_sub (cost=0.00..1623.92 rows=37492 width=35) (actual time=0.019..15.350 rows=38234 loops=1)
Planning time: 2.469 ms
Execution time: 346.590 ms
You are basically missing some important indexes here.
For testing improvements I've set up the tables myself and tried to fill them with test data with a distribution similar to what the explain plans show.
My baseline performance was ~160 seconds: https://explain.depesz.com/s/WlKO
The first thing I did was to create indexes for the foreign key references (although not all of them will be necessary):
CREATE INDEX idx_personal_details_leads ON sales.personal_details (lead_id);
CREATE INDEX idx_leads_batches ON sales.leads (batch_id);
CREATE INDEX idx_lead_results_users ON sales.lead_results (user_id);
That brought us down to ~112 seconds: https://explain.depesz.com/s/aRcf
Now, most of the time is actually spent on the self-joins (table personal_details using the latest updated_date and table lead_results using the latest resulted_datetime). Based on this, I came up with the following two indexes:
CREATE INDEX idx_personal_details_updated ON sales.personal_details (lead_id, updated_date DESC);
CREATE INDEX idx_lead_results_resulted ON sales.lead_results (lead_id, resulted_datetime DESC);
...which then immediately brings us down to ~110 milliseconds: https://explain.depesz.com/s/dDfk
Debugging help
What helped me figure out which indexes were most effective: I first rewrote the query to eliminate the sub-selects and instead use a dedicated CTE for each of them:
WITH
leads_update_latest AS (
SELECT lead_id, MAX(updated_date) AS updated_date
FROM sales.personal_details
GROUP BY lead_id
),
pd AS (
SELECT pd_sub.*
FROM sales.personal_details pd_sub
INNER JOIN leads_update_latest sub ON (pd_sub.lead_id = sub.lead_id AND pd_sub.updated_date = sub.updated_date)
),
leads_result_latest AS (
SELECT lead_id, MAX(resulted_datetime) AS resulted_datetime
FROM sales.lead_results
GROUP BY lead_id
),
lr AS (
SELECT lr_sub.*
FROM sales.lead_results lr_sub
INNER JOIN leads_result_latest sub ON (lr_sub.lead_id = sub.lead_id AND lr_sub.resulted_datetime = sub.resulted_datetime)
),
leads AS (
SELECT l.*
FROM sales.leads l
INNER JOIN sales.lead_batches lb ON (l.batch_id = lb.batch_id)
WHERE lb.type = 'Marketing'
AND lb.uploaded = 1
AND l.merged = 0
)
SELECT l.*, pd.*, lr.resulted_datetime, u.name
FROM leads l
LEFT JOIN pd ON l.lead_id = pd.lead_id
LEFT JOIN lr ON l.lead_id = lr.lead_id
LEFT JOIN sales.users u ON u.user_id = lr.user_id
;
Surprisingly, just by rewriting the query into CTEs, the PostgreSQL planner was way faster and took only ~2.3 seconds without any of the indexes: https://explain.depesz.com/s/lqzq
...with optimization:
FK-indexes down to ~230 milliseconds: https://explain.depesz.com/s/a6wT
However, with the other combined indexes, the CTE version degraded:
combined reverse indexes up to ~270 milliseconds: https://explain.depesz.com/s/TNNm
However, while these combined indexes speed up the original query a lot, they also grow a lot faster than single-column indexes, and they are an additional write cost to account for with regard to DB scalability.
As a result, it might make sense to go with a CTE version that performs a bit slower but is fast enough, in order to omit two additional indexes that the DB would otherwise have to maintain.

Poor performing postgres sql

Here's my sql, followed by the explanation. I need to improve the performance. Any ideas?
PostgreSQL 9.3.12 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.1) 4.8.4, 64-bit
explain analyze
SELECT DISTINCT "apts"."id", practices.name AS alias_0
FROM "apts"
LEFT OUTER JOIN "patients" ON "patients"."id" = "apts"."patient_id"
LEFT OUTER JOIN "practices" ON "practices"."id" = "apts"."practice_id"
LEFT OUTER JOIN "eligibility_messages" ON "eligibility_messages"."apt_id" = "apts"."id"
WHERE (apts.eligibility_status_id != 1)
AND (eligibility_messages.current = 't')
AND (practices.id = '104')
ORDER BY practices.name desc
LIMIT 25 OFFSET 0
Limit (cost=881321.34..881321.41 rows=25 width=20) (actual time=2928.225..2928.227 rows=25 loops=1)
-> Sort (cost=881321.34..881391.94 rows=28240 width=20) (actual time=2928.223..2928.224 rows=25 loops=1)
Sort Key: practices.name
Sort Method: top-N heapsort Memory: 26kB
-> HashAggregate (cost=880242.03..880524.43 rows=28240 width=20) (actual time=2927.213..2927.319 rows=520 loops=1)
-> Nested Loop (cost=286614.55..880100.83 rows=28240 width=20) (actual time=206.180..2926.791 rows=520 loops=1)
-> Seq Scan on practices (cost=0.00..6.36 rows=1 width=20) (actual time=0.018..0.031 rows=1 loops=1)
Filter: (id = 104)
Rows Removed by Filter: 108
-> Hash Join (cost=286614.55..879812.07 rows=28240 width=8) (actual time=206.159..2926.643 rows=520 loops=1)
Hash Cond: (eligibility_messages.apt_id = apts.id)
-> Seq Scan on eligibility_messages (cost=0.00..561275.63 rows=2029532 width=4) (actual time=0.691..2766.867 rows=67559 loops=1)
Filter: current
Rows Removed by Filter: 3924633
-> Hash (cost=284614.02..284614.02 rows=115082 width=12) (actual time=121.957..121.957 rows=91660 loops=1)
Buckets: 16384 Batches: 2 Memory Usage: 1974kB
-> Bitmap Heap Scan on apts (cost=8296.88..284614.02 rows=115082 width=12) (actual time=19.927..91.038 rows=91660 loops=1)
Recheck Cond: (practice_id = 104)
Filter: (eligibility_status_id <> 1)
Rows Removed by Filter: 80169
-> Bitmap Index Scan on index_apts_on_practice_id (cost=0.00..8268.11 rows=177540 width=0) (actual time=16.856..16.856 rows=179506 loops=1)
Index Cond: (practice_id = 104)
Total runtime: 2928.361 ms
First, rewrite the query to a more manageable form:
SELECT DISTINCT a."id", pr.name AS alias_0
FROM "apts" a JOIN
"practices" pr
ON pr."id" = a."practice_id" JOIN
"eligibility_messages" em
ON em."apt_id" = a."id"
WHERE (a.eligibility_status_id <> 1) AND
(em.current = 't') AND
(a.practice_id = 104)
ORDER BY pr.name desc ;
Notes:
The WHERE clause turns the outer joins into inner joins anyway, so you might as well express them correctly.
I doubt pr.id is actually a string
The patients table isn't used, so I just removed it.
Perhaps you don't even need the select distinct any more.
Switched the condition in the where to apts rather than practices.
If this isn't fast enough, you want indexes, probably on apts(practice_id, eligibility_status_id, id), practices(id), and eligibility_messages(apt_id, current).
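In DDL form, roughly (index names are placeholders, and practices.id is presumably already its primary key):
CREATE INDEX idx_apts_practice_status_id ON apts (practice_id, eligibility_status_id, id);
CREATE INDEX idx_eligibility_messages_apt_current ON eligibility_messages (apt_id, current);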

Optimise the PG query

The query is used very often in the app and is too expensive.
What are the things I can do to optimise it and bring the total time to milliseconds (rather than hundreds of ms)?
NOTES:
removing DISTINCT improves things (down to ~460ms), but I need it to get rid of the cartesian product :( (yeah, show me a better way of avoiding it)
removing ORDER BY name improves things, but not significantly.
The query:
SELECT DISTINCT properties.*
FROM properties JOIN developments ON developments.id = properties.development_id
-- Development allocations
LEFT JOIN allocation_items AS dev_items ON dev_items.development_id = properties.development_id
LEFT JOIN allocations AS dev_allocs ON dev_items.allocation_id = dev_allocs.id
-- Group allocations
LEFT JOIN properties_property_groupings ppg ON ppg.property_id = properties.id
LEFT JOIN property_groupings pg ON pg.id = ppg.property_grouping_id
LEFT JOIN allocation_items prop_items ON prop_items.property_grouping_id = pg.id
LEFT JOIN allocations prop_allocs ON prop_allocs.id = prop_items.allocation_id
WHERE
(properties.status <> 'deleted') AND ((
properties.status <> 'inactive'
AND (
(dev_allocs.receiving_company_id = 175 OR prop_allocs.receiving_company_id = 175)
AND developments.status = 'active'
)
OR developments.company_id = 175
)
AND EXISTS (
SELECT 1 FROM development_participations dp
JOIN participations p ON p.id = dp.participation_id
WHERE dp.allowed
AND p.user_id = 387 AND p.company_id = 175
AND dp.development_id = properties.development_id
LIMIT 1
)
)
ORDER BY properties.name
EXPLAIN ANALYZE
Unique (cost=72336.86..72517.53 rows=1606 width=4336) (actual time=703.766..710.920 rows=219 loops=1)
-> Sort (cost=72336.86..72340.87 rows=1606 width=4336) (actual time=703.765..704.698 rows=5091 loops=1)
Sort Key: properties.name, properties.id, properties.status, properties.level, etc etc (all columns)
Sort Method: external sort Disk: 1000kB
-> Nested Loop Left Join (cost=0.00..69258.84 rows=1606 width=4336) (actual time=25.230..366.489 rows=5091 loops=1)
Filter: ((((properties.status)::text <> 'inactive'::text) AND ((dev_allocs.receiving_company_id = 175) OR (prop_allocs.receiving_company_id = 175)) AND ((developments.status)::text = 'active'::text)) OR (developments.company_id = 175))
-> Nested Loop Left Join (cost=0.00..57036.99 rows=41718 width=4355) (actual time=25.122..247.587 rows=99567 loops=1)
-> Nested Loop Left Join (cost=0.00..47616.39 rows=21766 width=4355) (actual time=25.111..163.827 rows=39774 loops=1)
-> Nested Loop Left Join (cost=0.00..41508.16 rows=21766 width=4355) (actual time=25.101..112.452 rows=39774 loops=1)
-> Nested Loop Left Join (cost=0.00..34725.22 rows=21766 width=4351) (actual time=25.087..68.104 rows=19887 loops=1)
-> Nested Loop Left Join (cost=0.00..28613.00 rows=21766 width=4351) (actual time=25.076..39.360 rows=19887 loops=1)
-> Nested Loop (cost=0.00..27478.54 rows=1147 width=4347) (actual time=25.059..29.966 rows=259 loops=1)
-> Index Scan using developments_pkey on developments (cost=0.00..25.17 rows=49 width=15) (actual time=0.048..0.127 rows=48 loops=1)
Filter: (((status)::text = 'active'::text) OR (company_id = 175))
-> Index Scan using index_properties_on_development_id on properties (cost=0.00..559.95 rows=26 width=4336) (actual time=0.534..0.618 rows=5 loops=48)
Index Cond: (development_id = developments.id)
Filter: (((status)::text <> 'deleted'::text) AND (SubPlan 1))
SubPlan 1
-> Limit (cost=0.00..10.00 rows=1 width=0) (actual time=0.011..0.011 rows=0 loops=2420)
-> Nested Loop (cost=0.00..10.00 rows=1 width=0) (actual time=0.011..0.011 rows=0 loops=2420)
Join Filter: (dp.participation_id = p.id)
-> Seq Scan on development_participations dp (cost=0.00..1.71 rows=1 width=4) (actual time=0.004..0.008 rows=1 loops=2420)
Filter: (allowed AND (development_id = properties.development_id))
-> Index Scan using index_participations_on_user_id on participations p (cost=0.00..8.27 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=3148)
Index Cond: (user_id = 387)
Filter: (company_id = 175)
-> Index Scan using index_allocation_items_on_development_id on allocation_items dev_items (cost=0.00..0.70 rows=23 width=8) (actual time=0.003..0.016 rows=77 loops=259)
Index Cond: (development_id = properties.development_id)
-> Index Scan using allocations_pkey on allocations dev_allocs (cost=0.00..0.27 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=19887)
Index Cond: (dev_items.allocation_id = id)
-> Index Scan using index_properties_property_groupings_on_property_id on properties_property_groupings ppg (cost=0.00..0.29 rows=2 width=8) (actual time=0.001..0.001 rows=2 loops=19887)
Index Cond: (property_id = properties.id)
-> Index Scan using property_groupings_pkey on property_groupings pg (cost=0.00..0.27 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=39774)
Index Cond: (id = ppg.property_grouping_id)
-> Index Scan using index_allocation_items_on_property_grouping_id on allocation_items prop_items (cost=0.00..0.36 rows=6 width=8) (actual time=0.001..0.001 rows=2 loops=39774)
Index Cond: (property_grouping_id = pg.id)
-> Index Scan using allocations_pkey on allocations prop_allocs (cost=0.00..0.27 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=99567)
Index Cond: (id = prop_items.allocation_id)
Total runtime: 716.692 ms
(39 rows)
Answering my own question.
This query has 2 big issues:
6 LEFT JOINs that produce a cartesian product (resulting in billions of records even on a small dataset).
DISTINCT that has to sort that billion-record dataset.
So I had to eliminate those.
The way I did it is by replacing JOINs with 2 subqueries (won't provide it here since it should be pretty obvious).
As a result, the actual time went from ~700-800ms down to ~45ms which is more or less acceptable.
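For illustration, a rough sketch of that approach (an assumption based on the original query, not the author's exact rewrite): the two allocation join chains become two EXISTS subqueries, so no duplicate property rows are produced and DISTINCT is no longer needed:
SELECT properties.*
FROM properties
JOIN developments ON developments.id = properties.development_id
WHERE properties.status <> 'deleted'
  AND (
    developments.company_id = 175
    OR (
      properties.status <> 'inactive'
      AND developments.status = 'active'
      AND (
        EXISTS (              -- development allocations
          SELECT 1
          FROM allocation_items dev_items
          JOIN allocations dev_allocs ON dev_allocs.id = dev_items.allocation_id
          WHERE dev_items.development_id = properties.development_id
            AND dev_allocs.receiving_company_id = 175)
        OR EXISTS (           -- group allocations
          SELECT 1
          FROM properties_property_groupings ppg
          JOIN property_groupings pg ON pg.id = ppg.property_grouping_id
          JOIN allocation_items prop_items ON prop_items.property_grouping_id = pg.id
          JOIN allocations prop_allocs ON prop_allocs.id = prop_items.allocation_id
          WHERE ppg.property_id = properties.id
            AND prop_allocs.receiving_company_id = 175)
      )
    )
  )
  AND EXISTS (
    SELECT 1
    FROM development_participations dp
    JOIN participations p ON p.id = dp.participation_id
    WHERE dp.allowed
      AND p.user_id = 387 AND p.company_id = 175
      AND dp.development_id = properties.development_id)
ORDER BY properties.name;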
Most of the time is spent in the disk sort; you should let it use RAM by changing work_mem:
SET work_mem TO '20MB';
Then check EXPLAIN ANALYZE again.