I have the following query
select * from activity_feed where user_id in (select following_id from user_follow where follower_id=:user_id)
union
select * from activity_feed where project_id in (select project_id from user_project_follow where user_id=:user_id)
order by id desc limit 30
It runs in approximately 14 ms according to Postico.
But when I run EXPLAIN ANALYZE on this query, the planning time is 0.5 ms and the execution time is around 800 ms (which is what I would actually expect). Is this because the query without EXPLAIN ANALYZE is returning cached results? I still get results in under 20 ms even when I use other values.
Which one is more indicative of the performance I'll get in production? I also realize that this is a rather inefficient query, but I can't seem to figure out an index that would make it more efficient. It's possible that I will have to avoid using UNION.
Edit: the execution plan
Limit (cost=1380.94..1380.96 rows=10 width=148) (actual time=771.111..771.405 rows=10 loops=1)
-> Sort (cost=1380.94..1385.64 rows=1881 width=148) (actual time=771.097..771.160 rows=10 loops=1)
Sort Key: activity_feed."timestamp" DESC
Sort Method: top-N heapsort Memory: 27kB
-> HashAggregate (cost=1321.48..1340.29 rows=1881 width=148) (actual time=714.888..743.273 rows=4462 loops=1)
Group Key: activity_feed.id, activity_feed."timestamp", activity_feed.user_id, activity_feed.verb, activity_feed.object_type, activity_feed.object_id, activity_feed.project_id, activity_feed.privacy_level, activity_feed.local_time, activity_feed.local_date
-> Append (cost=5.12..1274.46 rows=1881 width=148) (actual time=0.998..682.466 rows=4487 loops=1)
-> Hash Join (cost=5.12..610.43 rows=1350 width=70) (actual time=0.982..326.089 rows=3013 loops=1)
Hash Cond: (activity_feed.user_id = user_follow.following_id)
-> Seq Scan on activity_feed (cost=0.00..541.15 rows=24215 width=70) (actual time=0.016..150.535 rows=24215 loops=1)
-> Hash (cost=4.78..4.78 rows=28 width=8) (actual time=0.911..0.922 rows=29 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Index Only Scan using unique_user_follow_pair on user_follow (cost=0.29..4.78 rows=28 width=8) (actual time=0.022..0.334 rows=29 loops=1)
Index Cond: (follower_id = '17420532762804570'::bigint)
Heap Fetches: 0
-> Hash Join (cost=30.50..635.81 rows=531 width=70) (actual time=0.351..301.945 rows=1474 loops=1)
Hash Cond: (activity_feed_1.project_id = user_project_follow.project_id)
-> Seq Scan on activity_feed activity_feed_1 (cost=0.00..541.15 rows=24215 width=70) (actual time=0.027..143.896 rows=24215 loops=1)
-> Hash (cost=30.36..30.36 rows=11 width=8) (actual time=0.171..0.182 rows=11 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Index Only Scan using idx_user_project_follow_temp on user_project_follow (cost=0.28..30.36 rows=11 width=8) (actual time=0.020..0.102 rows=11 loops=1)
Index Cond: (user_id = '17420532762804570'::bigint)
Heap Fetches: 11
Planning Time: 0.571 ms
Execution Time: 771.774 ms
Thanks for the help in advance!
Very slow clock access like you show here (roughly 50-fold slower, about 770 ms versus 14 ms, with TIMING at its default of ON!) usually indicates either old hardware or an old kernel, in my experience. Not being able to trust EXPLAIN (ANALYZE) to give good data can be very frustrating if you are particular about performance, so you should consider upgrading your hardware or your OS.
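One way to check whether timing overhead explains the gap is to run the same statement with per-node timing switched off and compare the reported Execution Time with the ~771 ms in the plan above. A minimal sketch using the question's query, with the user id literal copied from the plan in place of :user_id:

```sql
-- TIMING OFF: row counts and the total Execution Time are still reported,
-- but the clock is no longer read for every row of every plan node.
EXPLAIN (ANALYZE, TIMING OFF)
select * from activity_feed where user_id in (select following_id from user_follow where follower_id = 17420532762804570)
union
select * from activity_feed where project_id in (select project_id from user_project_follow where user_id = 17420532762804570)
order by id desc limit 30;
```

If the total drops to something close to the 14 ms seen in Postico, the 771 ms figure was mostly instrumentation overhead rather than real query cost.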
I have been tasked with rewriting some low-performance SQL in our system, for which I have this query
select
"aggtable".id as t_id,
count(joined.packages)::integer as t_package_count,
sum(coalesce((joined.packages ->> 'weight'::text)::double precision, 0::double precision)) as t_total_weight
from
"aggtable"
join (
select
"unnested".myid, json_array_elements("jsontable".jsondata) as packages
from
(
select
distinct unnest("tounnest".arrayofid) as myid
from
"aggtable" "tounnest") "unnested"
join "jsontable" on
"jsontable".id = "unnested".myid) joined on
joined.myid = any("aggtable".arrayofid)
group by
"aggtable".id
The EXPLAIN ANALYSE result is
Sort Method: quicksort Memory: 611kB
-> Nested Loop (cost=30917.16..31333627.69 rows=27270 width=69) (actual time=4.028..2054.470 rows=3658 loops=1)
Join Filter: ((unnest(tounnest.arrayofid)) = ANY (aggtable.arrayofid))
Rows Removed by Join Filter: 9055436
-> ProjectSet (cost=30917.16..36645.61 rows=459000 width=48) (actual time=3.258..13.846 rows=3322 loops=1)
-> Hash Join (cost=30917.16..34316.18 rows=4590 width=55) (actual time=3.246..7.079 rows=1661 loops=1)
Hash Cond: ((unnest(tounnest.arrayofid)) = jsontable.id)
-> Unique (cost=30726.88..32090.38 rows=144700 width=16) (actual time=1.901..3.720 rows=1664 loops=1)
-> Sort (cost=30726.88..31408.63 rows=272700 width=16) (actual time=1.900..2.711 rows=1845 loops=1)
Sort Key: (unnest(tounnest.arrayofid))
Sort Method: quicksort Memory: 135kB
-> ProjectSet (cost=0.00..1444.22 rows=272700 width=16) (actual time=0.011..1.110 rows=1845 loops=1)
-> Seq Scan on aggtable tounnest (cost=0.00..60.27 rows=2727 width=30) (actual time=0.007..0.311 rows=2727 loops=1)
-> Hash (cost=132.90..132.90 rows=4590 width=55) (actual time=1.328..1.329 rows=4590 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 454kB
-> Seq Scan on jsontable (cost=0.00..132.90 rows=4590 width=55) (actual time=0.006..0.497 rows=4590 loops=1)
-> Materialize (cost=0.00..73.91 rows=2727 width=67) (actual time=0.000..0.189 rows=2727 loops=3322)
-> Seq Scan on aggtable (cost=0.00..60.27 rows=2727 width=67) (actual time=0.012..0.317 rows=2727 loops=1)
Planning Time: 0.160 ms
Execution Time: 2065.268 ms
I tried to rewrite this query from scratch to profile performance and to understand the original intention
select
count(joined.packages)::integer as t_package_count,
sum(coalesce((joined.packages ->> 'weight'::text)::double precision, 0::double precision)) as t_total_weight
from
(
select
joinid ,
json_array_elements(jsondata) as packages
from
( (
select
distinct unnest(at2.arrayofid) as joinid, at2.id as rootid
from
aggtable at2) unnested
join jsontable jt on
jt.id = unnested.joinid)) joined
group by joined.joinid
for which EXPLAIN ANALYSE returns
HashAggregate (cost=873570.28..873572.78 rows=200 width=28) (actual time=18.379..18.741 rows=1661 loops=1)
Group Key: (unnest(at2.arrayofid))
-> ProjectSet (cost=44903.16..191820.28 rows=27270000 width=48) (actual time=3.019..14.684 rows=3658 loops=1)
-> Hash Join (cost=44903.16..53425.03 rows=272700 width=55) (actual time=3.010..4.999 rows=1829 loops=1)
Hash Cond: ((unnest(at2.arrayofid)) = jt.id)
-> Unique (cost=44712.88..46758.13 rows=272700 width=53) (actual time=1.825..2.781 rows=1845 loops=1)
-> Sort (cost=44712.88..45394.63 rows=272700 width=53) (actual time=1.824..2.135 rows=1845 loops=1)
Sort Key: (unnest(at2.arrayofid)), at2.id
Sort Method: quicksort Memory: 308kB
-> ProjectSet (cost=0.00..1444.22 rows=272700 width=53) (actual time=0.009..1.164 rows=1845 loops=1)
-> Seq Scan on aggtable at2 (cost=0.00..60.27 rows=2727 width=67) (actual time=0.005..0.311 rows=2727 loops=1)
-> Hash (cost=132.90..132.90 rows=4590 width=55) (actual time=1.169..1.169 rows=4590 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 454kB
-> Seq Scan on jsontable jt (cost=0.00..132.90 rows=4590 width=55) (actual time=0.007..0.462 rows=4590 loops=1)
Planning Time: 0.144 ms
Execution Time: 18.889 ms
I see a huge difference in query performance (20 ms versus 2000 ms) as evaluated by Postgres. However, the real query performance is nowhere near that difference (the fast one takes about 500 ms and the slow one about 1 s).
My questions:
1/ Is it normal that EXPLAIN shows a drastic difference in performance that does not appear in real life?
2/ Is the second, optimized query correct? What did the first query do wrong?
I also supply the credentials to a sample database so that everyone can try the queries out
PW is
I have the query below running on a PostgreSQL and a SQL Server database (using TOP for SQL Server). Sorting on the "_change_sequence" value causes a high cost in my query. Is there any way to reduce the cost but keep the same results?
SELECT tablename,
CAST(primary_key_values AS VARCHAR),
CAST(min_sequence AS NUMERIC),
SELECT 'memdep' AS tablename,
CONCAT_WS(',',dependant,mem_num) AS primary_key_values,
'dependant,mem_num,' AS primary_key_fields,
_change_sequence AS min_sequence,
ROW_NUMBER() OVER(partition by dependant,mem_num order by _change_sequence) AS rn,
FROM mipbi_ods.memdep
WHERE mipbi_status = 'NEW'
) main
WHERE rn = 1
In essence, what I'm looking for is the records from "memdep" that have a "mipbi_status" of 'NEW', with the lowest "_change_sequence". I've tried using a MIN() function instead of ROW_NUMBER(); the speed is about the same and the cost is about 5 higher.
Is there a way to reduce the cost or run time of the query? I have around 400 million records in this table, if that helps.
Here is the query explained:
Limit (cost=3080.03..3080.53 rows=100 width=109) (actual time=17.633..17.648 rows=35 loops=1)
-> Unique (cost=3080.03..3089.04 rows=1793 width=109) (actual time=17.632..17.644 rows=35 loops=1)
-> Sort (cost=3080.03..3084.53 rows=1803 width=109) (actual time=17.631..17.634 rows=36 loops=1)
Sort Key: (concat_ws(','::text, memdet.mem_num))
Sort Method: quicksort Memory: 29kB
-> Bitmap Heap Scan on memdet (cost=54.39..2982.52 rows=1803 width=109) (actual time=16.853..17.542 rows=36 loops=1)
Recheck Cond: ((mipbi_status)::text = 'NEW'::text)
Heap Blocks: exact=8
-> Bitmap Index Scan on idx_mipbi_status_memdet (cost=0.00..53.94 rows=1803 width=0) (actual time=10.396..10.396 rows=38 loops=1)
Index Cond: ((mipbi_status)::text = 'NEW'::text)
Planning time: 0.201 ms
Execution time: 17.700 ms
I'm using a smaller table to show here; this isn't the 400 million record table, but the indexes and everything else will be the same.
Here is the query plan for the large table:
Limit (cost=47148422.27..47149122.27 rows=100 width=113) (actual time=2407976.293..2407977.112 rows=100 loops=1)
Output: main.tablename, ((main.primary_key_values)::character varying), main.primary_key_fields, main.min_sequence, main._changed_fieldlist, main._operation, main.min_sequence
Buffers: shared hit=6269554 read=12205028 dirtied=1893 written=4566983, temp read=443831 written=1016025
-> Subquery Scan on main (cost=47148422.27..52102269.25 rows=707692 width=113) (actual time=2407976.292..2407977.100 rows=100 loops=1)
Output: main.tablename, (main.primary_key_values)::character varying, main.primary_key_fields, main.min_sequence, main._changed_fieldlist, main._operation, main.min_sequence
Filter: (main.rn = 1)
Buffers: shared hit=6269554 read=12205028 dirtied=1893 written=4566983, temp read=443831 written=1016025
-> WindowAgg (cost=47148422.27..50333038.19 rows=141538485 width=143) (actual time=2407976.288..2407977.080 rows=100 loops=1)
Output: 'claim', concat_ws(','::text, claim.gen_claimnum), 'gen_claimnum,', claim._change_sequence, row_number() OVER (?), claim._changed_fieldlist, claim._operation, claim.gen_claimnum
Buffers: shared hit=6269554 read=12205028 dirtied=1893 written=4566983, temp read=443831 written=1016025
-> Sort (cost=47148422.27..47502268.49 rows=141538485 width=39) (actual time=2407976.236..2407976.905 rows=100 loops=1)
Output: claim._change_sequence, claim.gen_claimnum, claim._changed_fieldlist, claim._operation
Sort Key: claim.gen_claimnum, claim._change_sequence
Sort Method: external merge Disk: 4588144kB
Buffers: shared hit=6269554 read=12205028 dirtied=1893 written=4566983, temp read=443831 written=1016025
-> Seq Scan on mipbi_ods.claim (cost=0.00..20246114.01 rows=141538485 width=39) (actual time=0.028..843181.418 rows=88042077 loops=1)
Output: claim._change_sequence, claim.gen_claimnum, claim._changed_fieldlist, claim._operation
Filter: ((claim.mipbi_status)::text = 'NEW'::text)
Rows Removed by Filter: 356194
Buffers: shared hit=6269554 read=12205028 dirtied=1893 written=4566983
Planning time: 8.796 ms
Execution time: 2408702.464 ms
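The "lowest _change_sequence per key among the 'NEW' rows" requirement described above can also be written with PostgreSQL's DISTINCT ON. The following is only a sketch under assumptions: it uses the memdep columns named in the question, the index name memdep_new_first_change_idx is hypothetical, and whether it actually beats ROW_NUMBER() on the 400 million row table is untested. DISTINCT ON is PostgreSQL-specific, so the SQL Server side would still need the ROW_NUMBER() form.

```sql
-- Hypothetical partial index: only 'NEW' rows, ordered the same way the query reads them,
-- so the planner has the option of walking the index instead of sorting on disk.
CREATE INDEX memdep_new_first_change_idx
    ON mipbi_ods.memdep (dependant, mem_num, _change_sequence)
    WHERE mipbi_status = 'NEW';

-- DISTINCT ON keeps the first row of each (dependant, mem_num) group in ORDER BY order,
-- i.e. the row with the lowest _change_sequence.
SELECT DISTINCT ON (dependant, mem_num)
       'memdep'                           AS tablename,
       CONCAT_WS(',', dependant, mem_num) AS primary_key_values,
       'dependant,mem_num,'               AS primary_key_fields,
       _change_sequence                   AS min_sequence
FROM mipbi_ods.memdep
WHERE mipbi_status = 'NEW'
ORDER BY dependant, mem_num, _change_sequence;
```

Comparing EXPLAIN (ANALYZE, BUFFERS) output for both forms on the large table would show whether the external merge sort (about 4.5 GB on disk in the plan above) disappears.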
I've been optimizing some SQL queries against a production database clone. Here is an example query where I've created two indexes so that we can run index-only scans really fast using a hash join.
explain analyse
select activity.id from activity, notification
where notification.user_id = '9a51f675-e1e2-46e5-8bcd-6bc535c7e7cb'
and notification.received = false
and notification.invalid = false
and activity.id = notification.activity_id
and activity.space_id = 'e12b42ac-4e54-476f-a4f5-7d6bdb1e61e2'
order by activity.end_time desc
limit 21;
Limit (cost=985.58..985.58 rows=1 width=24) (actual time=0.017..0.017 rows=0 loops=1)
-> Sort (cost=985.58..985.58 rows=1 width=24) (actual time=0.016..0.016 rows=0 loops=1)
Sort Key: activity.end_time DESC
Sort Method: quicksort Memory: 25kB
-> Hash Join (cost=649.76..985.57 rows=1 width=24) (actual time=0.010..0.010 rows=0 loops=1)
Hash Cond: (notification.activity_id = activity.id)
-> Index Only Scan using unreceived_notifications_index on notification (cost=0.42..334.62 rows=127 width=16) (actual time=0.009..0.009 rows=0 loops=1)
Index Cond: (user_id = '9a51f675-e1e2-46e5-8bcd-6bc535c7e7cb'::uuid)
Heap Fetches: 0
-> Hash (cost=634.00..634.00 rows=1227 width=24) (never executed)
-> Index Only Scan using space_activity_index on activity (cost=0.56..634.00 rows=1227 width=24) (never executed)
Index Cond: (space_id = 'e12b42ac-4e54-476f-a4f5-7d6bdb1e61e2'::uuid)
Heap Fetches: 0
Planning time: 0.299 ms
Execution time: 0.046 ms
And here are the indexes.
create index unreceived_notifications_index on notification using btree (
user_id,
activity_id, -- index-only scan
id -- index-only scan
) where (
invalid = false
and received = false
);
create index space_activity_index on activity using btree (
space_id,
end_time desc,
id -- index-only scan
);
However, I'm noticing that these indexes are making our development database a LOT slower. Here's the same query against a user in our development database; you'll notice it's using a nested loop join this time, and the order of the loops is really inefficient.
explain analyse
select notification.id from notification, activity
where notification.user_id = '7c74a801-7cb5-4914-bbbe-2b18cd1ced76'
and notification.received = false
and notification.invalid = false
and activity.id = notification.activity_id
and activity.space_id = '415fc269-e68f-4da0-b3e3-b1273b741a7f'
order by activity.end_time desc
limit 20;
Limit (cost=0.69..272.04 rows=20 width=24) (actual time=277.255..277.255 rows=0 loops=1)
-> Nested Loop (cost=0.69..71487.55 rows=5269 width=24) (actual time=277.253..277.253 rows=0 loops=1)
-> Index Only Scan using space_activity_index on activity (cost=0.42..15600.36 rows=155594 width=24) (actual time=0.016..59.433 rows=155666 loops=1)
Index Cond: (space_id = '415fc269-e68f-4da0-b3e3-b1273b741a7f'::uuid)
Heap Fetches: 38361
-> Index Only Scan using unreceived_notifications_index on notification (cost=0.27..0.35 rows=1 width=32) (actual time=0.001..0.001 rows=0 loops=155666)
Index Cond: ((user_id = '7c74a801-7cb5-4914-bbbe-2b18cd1ced76'::uuid) AND (activity_id = activity.id))
Heap Fetches: 0
Planning time: 0.351 ms
Execution time: 277.286 ms
One thing to note here is that there are only 2 space_ids in our development database. I suspect this is causing Postgres to try to be clever, but it's actually making performance worse!
My question is:
Is there some way that I can force Postgres to run the hash join instead of the nested loop join?
Is there some way, in general, that I can make Postgres's query-planner more deterministic? Ideally, the query performance characteristics would be the exact same between these environments.
Edit: Note that when I leave out the space_id condition when querying my dev database, the result is faster.
explain analyse
select notification.id from notification, activity
where notification.user_id = '7c74a801-7cb5-4914-bbbe-2b18cd1ced76'
and notification.received = false
and notification.invalid = false
and activity.id = notification.activity_id
--and activity.space_id = '415fc269-e68f-4da0-b3e3-b1273b741a7f'
order by activity.end_time desc
limit 20;
Limit (cost=17628.13..17630.43 rows=20 width=24) (actual time=2.730..2.730 rows=0 loops=1)
-> Gather Merge (cost=17628.13..17996.01 rows=3199 width=24) (actual time=2.729..2.729 rows=0 loops=1)
Workers Planned: 1
Workers Launched: 1
-> Sort (cost=16628.12..16636.12 rows=3199 width=24) (actual time=0.126..0.126 rows=0 loops=2)
Sort Key: activity.end_time DESC
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=20.59..16441.88 rows=3199 width=24) (actual time=0.093..0.093 rows=0 loops=2)
-> Parallel Bitmap Heap Scan on notification (cost=20.17..2512.17 rows=3199 width=32) (actual time=0.092..0.092 rows=0 loops=2)
Recheck Cond: ((user_id = '7c74a801-7cb5-4914-bbbe-2b18cd1ced76'::uuid) AND (NOT invalid) AND (NOT received))
-> Bitmap Index Scan on unreceived_notifications_index (cost=0.00..18.82 rows=5439 width=0) (actual time=0.006..0.006 rows=0 loops=1)
Index Cond: (user_id = '7c74a801-7cb5-4914-bbbe-2b18cd1ced76'::uuid)
-> Index Scan using activity_pkey on activity (cost=0.42..4.35 rows=1 width=24) (never executed)
Index Cond: (id = notification.activity_id)
Planning time: 0.344 ms
Execution time: 3.433 ms
Edit: After reading about index hinting, I tried turning nested_loop off using set enable_nestloop=false; and the query is way faster!
Limit (cost=20617.76..20620.09 rows=20 width=24) (actual time=2.872..2.872 rows=0 loops=1)
-> Gather Merge (cost=20617.76..21130.20 rows=4392 width=24) (actual time=2.871..2.871 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=19617.74..19623.23 rows=2196 width=24) (actual time=0.086..0.086 rows=0 loops=3)
Sort Key: activity.end_time DESC
Sort Method: quicksort Memory: 25kB
-> Hash Join (cost=2609.20..19495.85 rows=2196 width=24) (actual time=0.062..0.062 rows=0 loops=3)
Hash Cond: (activity.id = notification.activity_id)
-> Parallel Seq Scan on activity (cost=0.00..14514.57 rows=64831 width=24) (actual time=0.006..0.006 rows=1 loops=3)
Filter: (space_id = '415fc269-e68f-4da0-b3e3-b1273b741a7f'::uuid)
-> Hash (cost=2541.19..2541.19 rows=5441 width=32) (actual time=0.007..0.007 rows=0 loops=3)
Buckets: 8192 Batches: 1 Memory Usage: 64kB
-> Bitmap Heap Scan on notification (cost=20.18..2541.19 rows=5441 width=32) (actual time=0.006..0.006 rows=0 loops=3)
Recheck Cond: ((user_id = '7c74a801-7cb5-4914-bbbe-2b18cd1ced76'::uuid) AND (NOT invalid) AND (NOT received))
-> Bitmap Index Scan on unreceived_notifications_index (cost=0.00..18.82 rows=5441 width=0) (actual time=0.004..0.004 rows=0 loops=3)
Index Cond: (user_id = '7c74a801-7cb5-4914-bbbe-2b18cd1ced76'::uuid)
Planning time: 0.375 ms
Execution time: 3.630 ms
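Since enable_nestloop is a session setting, leaving it off affects every later query on the same connection. A sketch of scoping it to one transaction with SET LOCAL, using the dev-database query from above:

```sql
BEGIN;
-- SET LOCAL reverts automatically at COMMIT or ROLLBACK,
-- so other queries on this connection keep the default planner behaviour.
SET LOCAL enable_nestloop = off;

select notification.id from notification, activity
where notification.user_id = '7c74a801-7cb5-4914-bbbe-2b18cd1ced76'
and notification.received = false
and notification.invalid = false
and activity.id = notification.activity_id
and activity.space_id = '415fc269-e68f-4da0-b3e3-b1273b741a7f'
order by activity.end_time desc
limit 20;
COMMIT;
```

This discourages nested loops rather than forbidding them: the planner can still pick one if it finds no other join strategy.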
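It depends on how specialized you want to get. PostgreSQL has no built-in plan guides or index hints; the closest core tools are the enable_* planner settings (such as the enable_nestloop you already tried), and the third-party pg_hint_plan extension adds hint comments. But query optimizers are strongly influenced by record counts in the choices they make, so a development database with very different data will get different plans. Maybe you add the extra indexes in the non-dev environment and move on?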
When I run EXPLAIN ANALYZE on a query, I normally get a cost ranging from some low value to some higher value. But when I try to force the use of an index on the table by switching enable_seqscan to false, the query cost jumps to insane values like:
Merge Join (cost=10064648609.460..10088218360.810 rows=564249 width=21) (actual time=341699.323..370702.969 rows=3875328 loops=1)
Merge Cond: ((foxtrot.two = ((five_hotel.two)::numeric)) AND (foxtrot.alpha_two07 = ((five_hotel.alpha_two07)::numeric)))
-> Merge Append (cost=10000000000.580..10023064799.260 rows=23522481 width=24) (actual time=0.049..19455.320 rows=23522755 loops=1)
Sort Key: foxtrot.two, foxtrot.alpha_two07
-> Sort (cost=10000000000.010..10000000000.010 rows=1 width=76) (actual time=0.005..0.005 rows=0 loops=1)
Sort Key: foxtrot.two, foxtrot.alpha_two07
Sort Method: quicksort Memory: 25kB
-> Seq Scan on foxtrot (cost=10000000000.000..10000000000.000 rows=1 width=76) (actual time=0.001..0.001 rows=0 loops=1)
Filter: (kilo_sierra_oscar = 'oscar'::date)
-> Index Scan using alpha_five on five_uniform (cost=0.560..22770768.220 rows=23522480 width=24) (actual time=0.043..17454.619 rows=23522755 loops=1)
Filter: (kilo_sierra_oscar = 'oscar'::date)
As you can see, I'm trying to retrieve the values via the index, so they don't need to be sorted once they're loaded.
It is a simple query:
select *
from foxtrot
where foxtrot.kilo_sierra_oscar = date'2015-01-01'
order by foxtrot.two, foxtrot.alpha_two07
Index scan: "Execution time: 19009.569 ms"
Sequential scan: "Execution time: 127062.802 ms"
Setting enable_seqscan to false improves the execution time of the query, but I would like the optimizer to work that out by itself.
Seq plan with buffers:
Sort (cost=4607555.110..4666361.310 rows=23522481 width=24) (actual time=101094.754..120740.190 rows=23522756 loops=1)
Sort Key: foxtrot.two, foxtrot.alpha07
Sort Method: external merge Disk: 805304kB
Buffers: shared hit=468690, temp read=100684 written=100684
-> Append (cost=0.000..762721.000 rows=23522481 width=24) (actual time=0.006..12018.725 rows=23522756 loops=1)
Buffers: shared hit=468690
-> Seq Scan on foxtrot (cost=0.000..0.000 rows=1 width=76) (actual time=0.001..0.001 rows=0 loops=1)
Filter: (kilo = 'oscar'::date)
-> Seq Scan on foxtrot (cost=0.000..762721.000 rows=23522480 width=24) (actual time=0.005..9503.851 rows=23522756 loops=1)
Filter: (kilo = 'oscar'::date)
Buffers: shared hit=468690
Index plan with buffers:
Merge Append (cost=10000000000.580..10023064799.260 rows=23522481 width=24) (actual time=0.046..19302.855 rows=23522756 loops=1)
Sort Key: foxtrot.two, foxtrot.alpha_two07
Buffers: shared hit=17855133
-> Sort (cost=10000000000.010..10000000000.010 rows=1 width=76) (actual time=0.009..0.009 rows=0 loops=1)
Sort Key: foxtrot.two, foxtrot.alpha_two07
Sort Method: quicksort Memory: 25kB
-> Seq Scan on foxtrot (cost=10000000000.000..10000000000.000 rows=1 width=76) (actual time=0.000..0.000 rows=0 loops=1)
Filter: (kilo = 'oscar'::date)
-> Index Scan using alpha_five on five (cost=0.560..22770768.220 rows=23522480 width=24) (actual time=0.036..17035.903 rows=23522756 loops=1)
Filter: (kilo = 'oscar'::date)
Buffers: shared hit=17855133
Why does the cost of the query jump so high? How can I avoid it?
The high cost is a direct consequence of set enable_seqscan=false.
The planner implements this "hint" by adding an arbitrary, very high penalty (10 000 000 000) to the cost of every sequential scan. Then it computes the different potential execution strategies with their associated costs.
If the best plan still has a super-high cost, it means that the planner found no strategy that avoids the sequential scan, even when trying at all costs.
In the plan shown in the question under "Index plan with buffers" this happens at the Seq Scan on foxtrot node.
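Subtracting that penalty from the totals in the question shows what the planner actually believes about the two strategies. The figures below are copied from the question's plans; the assumption is that exactly one penalty of 10 000 000 000 is included in the index plan's total, for the single Seq Scan node it could not avoid.

```sql
SELECT 10023064799.260 - 10000000000 AS index_plan_cost_without_penalty,  -- top-level Merge Append cost minus the penalty
       4666361.310                   AS seq_plan_cost;                     -- top-level Sort cost of the sequential plan
```

The first column comes out at about 23 million against about 4.7 million for the second. So even without the penalty, the planner estimates the index plan to be roughly five times more expensive, which is why it picks the sequential scan plus sort on its own, although the index plan ran faster in practice.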
This query takes ~4 seconds to complete:
SELECT DISTINCT "resources_resource"."id",
FROM "resources_resource"
INNER JOIN "resources_passageresource" ON ("resources_resource"."id" = "resources_passageresource"."resource_id")
WHERE "resources_passageresource"."start_ref" >= 66001001
ORDER BY "resources_resource"."ord" ASC, "resources_resource"."sort_name" ASC LIMIT 5
By popular request, EXPLAIN ANALYZE:
Limit (cost=1125.50..1125.68 rows=5 width=803) (actual time=4434.076..4434.557 rows=5 loops=1)
-> Unique (cost=1125.50..1136.91 rows=326 width=803) (actual time=4434.076..4434.557 rows=5 loops=1)
-> Sort (cost=1125.50..1126.32 rows=326 width=803) (actual time=4434.075..4434.075 rows=6 loops=1)
Sort Key: resources_resource.ord, resources_resource.sort_name, resources_resource.id, resources_resource.heading, resources_resource.name, resources_resource.old_name, resources_resource.clean_name, resources_resource.see_also_id, resources_resource.referenced_passages, resources_resource.resource_type, resources_resource.content, resources_resource.thumb, resources_resource.resource_origin
Sort Method: quicksort Memory: 424kB
-> Hash Join (cost=697.00..1111.89 rows=326 width=803) (actual time=3.453..41.429 rows=424 loops=1)
Hash Cond: (resources_passageresource.resource_id = resources_resource.id)
-> Bitmap Heap Scan on resources_passageresource (cost=10.78..190.19 rows=326 width=4) (actual time=0.107..0.401 rows=424 loops=1)
Recheck Cond: (start_ref >= 66001001)
-> Bitmap Index Scan on resources_passageresource_start_ref (cost=0.00..10.70 rows=326 width=0) (actual time=0.086..0.086 rows=424 loops=1)
Index Cond: (start_ref >= 66001001)
-> Hash (cost=431.32..431.32 rows=2232 width=803) (actual time=3.228..3.228 rows=2232 loops=1)
Buckets: 1024 Batches: 2 Memory Usage: 947kB
-> Seq Scan on resources_resource (cost=0.00..431.32 rows=2232 width=803) (actual time=0.002..1.621 rows=2232 loops=1)
Total runtime: 4435.460 ms
This is ORM-generated SQL. I can work in SQL, but I'm definitely not proficient, and the EXPLAIN output here is mystifying to me. What about this query is dragging me down?
UPDATE: @Ybakos identified that the ORDER BY was causing trouble. Removing the ORDER BY clause altogether helps a bit, but the query still takes 800 ms. Here's the EXPLAIN ANALYZE, sans ORDER BY:
HashAggregate (cost=1122.49..1125.75 rows=326 width=803) (actual time=787.519..787.559 rows=104 loops=1)
-> Hash Join (cost=697.00..1111.89 rows=326 width=803) (actual time=3.381..7.312 rows=424 loops=1)
Hash Cond: (resources_passageresource.resource_id = resources_resource.id)
-> Bitmap Heap Scan on resources_passageresource (cost=10.78..190.19 rows=326 width=4) (actual time=0.095..0.686 rows=424 loops=1)
Recheck Cond: (start_ref >= 66001001)
-> Bitmap Index Scan on resources_passageresource_start_ref (cost=0.00..10.70 rows=326 width=0) (actual time=0.079..0.079 rows=424 loops=1)
Index Cond: (start_ref >= 66001001)
-> Hash (cost=431.32..431.32 rows=2232 width=803) (actual time=3.173..3.173 rows=2232 loops=1)
Buckets: 1024 Batches: 2 Memory Usage: 947kB
-> Seq Scan on resources_resource (cost=0.00..431.32 rows=2232 width=803) (actual time=0.002..1.568 rows=2232 loops=1)
Total runtime: 787.678 ms
It seems to me, DISTINCT has to be used to remove duplicates produced by the join. So my question is, why produce the duplicates in the first place? I'm not entirely sure what this query's being ORM-generated must imply, but if rewriting it is an option, you could certainly rewrite it in such a way as to prevent duplicates from appearing. For instance, using IN:
SELECT "resources_resource"."id",
FROM "resources_resource"
WHERE "resources_resource"."id" IN (
SELECT "resources_passageresource"."resource_id"
FROM "resources_passageresource"
WHERE "resources_passageresource"."start_ref" >= 66001001
)
ORDER BY "resources_resource"."ord" ASC, "resources_resource"."sort_name" ASC LIMIT 5
or using EXISTS:
SELECT "resources_resource"."id",
FROM "resources_resource"
WHERE EXISTS (
SELECT *
FROM "resources_passageresource"
WHERE "resources_passageresource"."resource_id" = "resources_resource"."id"
AND "resources_passageresource"."start_ref" >= 66001001
)
ORDER BY "resources_resource"."ord" ASC, "resources_resource"."sort_name" ASC LIMIT 5
And, of course, if it's acceptable to rewrite the query completely, I would also remove the long table names in front of column names. Consider the following, for instance (the IN query rewritten):
SELECT "id",
FROM "resources_resource"
WHERE "resources_resource"."id" IN (
SELECT "resource_id"
FROM "resources_passageresource"
WHERE "start_ref" >= 66001001
)
ORDER BY "ord" ASC, "sort_name" ASC LIMIT 5
It's the combination of ORDER BY with LIMIT.
If you don't have an index on (ord, sort_name) then I bet this is the cause of the slow performance. Or perhaps an index on (start_ref, ord, sort_name) is necessary for this particular query. Lastly, due to that join, perhaps have the left/first table be the one upon which your ORDER BY criteria applies.
That seems like a long time in the JOIN. The default memory settings in postgresql.conf are too low for any modern computer. Have you remembered to bump them up?
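A quick way to test this without editing postgresql.conf is to raise work_mem for the current session only and re-run the plan. The value below is an illustrative assumption, not a recommendation; a suitable setting depends on available RAM and the number of concurrent connections.

```sql
SHOW work_mem;          -- current per-operation memory budget for sorts and hash tables
SET work_mem = '16MB';  -- illustrative value; affects only the current session
-- Re-run EXPLAIN ANALYZE on the query afterwards and check whether the Hash node
-- still reports "Batches: 2", which indicates the hash table did not fit in memory.
```

If the plan shape and timings do not change with more memory, the memory settings are probably not the bottleneck for this query.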