I have the following query:
SELECT
Sum(fact_individual_re.quality_hours) AS C0,
dim_gender.name AS C1,
dim_date.year AS C2
FROM fact_individual_re
INNER JOIN dim_date ON fact_individual_re.dim_date_id = dim_date.id
INNER JOIN dim_gender ON fact_individual_re.dim_gender_id = dim_gender.id
GROUP BY dim_date.year, dim_gender.name
ORDER BY dim_date.year ASC, dim_gender.name ASC, Sum(fact_individual_re.quality_hours) ASC
According to its EXPLAIN ANALYZE plan, the hash joins take most of the time. Is there any way to reduce the time spent in the hash joins?
Sort (cost=190370.50..190370.55 rows=20 width=18) (actual time=4005.152..4005.154 rows=20 loops=1)
Sort Key: dim_date.year, dim_gender.name, (sum(fact_individual_re.quality_hours))
Sort Method: quicksort Memory: 26kB
-> Finalize GroupAggregate (cost=190369.07..190370.07 rows=20 width=18) (actual time=4005.106..4005.135 rows=20 loops=1)
Group Key: dim_date.year, dim_gender.name
-> Sort (cost=190369.07..190369.27 rows=80 width=18) (actual time=4005.100..4005.103 rows=100 loops=1)
Sort Key: dim_date.year, dim_gender.name
Sort Method: quicksort Memory: 32kB
-> Gather (cost=190358.34..190366.54 rows=80 width=18) (actual time=4004.966..4005.020 rows=100 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial HashAggregate (cost=189358.34..189358.54 rows=20 width=18) (actual time=3885.254..3885.259 rows=20 loops=5)
Group Key: dim_date.year, dim_gender.name
-> Hash Join (cost=125.17..170608.34 rows=2500000 width=14) (actual time=2.279..2865.808 rows=2000000 loops=5)
Hash Cond: (fact_individual_re.dim_gender_id = dim_gender.id)
-> Hash Join (cost=124.13..150138.54 rows=2500000 width=12) (actual time=2.060..2115.234 rows=2000000 loops=5)
Hash Cond: (fact_individual_re.dim_date_id = dim_date.id)
-> Parallel Seq Scan on fact_individual_re (cost=0.00..118458.00 rows=2500000 width=12) (actual time=0.204..982.810 rows=2000000 loops=5)
-> Hash (cost=78.50..78.50 rows=3650 width=8) (actual time=1.824..1.824 rows=3650 loops=5)
Buckets: 4096 Batches: 1 Memory Usage: 175kB
-> Seq Scan on dim_date (cost=0.00..78.50 rows=3650 width=8) (actual time=0.143..1.030 rows=3650 loops=5)
-> Hash (cost=1.02..1.02 rows=2 width=10) (actual time=0.193..0.193 rows=2 loops=5)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on dim_gender (cost=0.00..1.02 rows=2 width=10) (actual time=0.181..0.182 rows=2 loops=5)
Planning time: 0.609 ms
Execution time: 4020.423 ms
(26 rows)
I am using PostgreSQL 10.
I'd recommend partially grouping the rows before the join:
select
sum(quality_hours_sum) AS C0,
dim_gender.name AS C1,
dim_date.year AS C2
from (
select
dim_date_id,
dim_gender_id,
sum(quality_hours) as quality_hours_sum
from fact_individual_re
group by dim_date_id, dim_gender_id
) as fact_individual_re_sum
join dim_date on dim_date_id = dim_date.id
join dim_gender on dim_gender_id = dim_gender.id
group by dim_date.year, dim_gender.name
order by dim_date.year, dim_gender.name, C0;
This way you will be joining at most 1460 rows (count(distinct dim_date_id) * count(distinct dim_gender_id)) instead of all 2M rows. It would still need to read and group all 2M rows, though; to avoid that you'd need something like a summary table maintained with a trigger.
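A rough sketch of the summary-table idea follows. It assumes integer key columns and a numeric quality_hours; the column types, the table name fact_individual_re_summary, and the function and trigger names are all hypothetical. Only INSERTs on the fact table are handled here, so UPDATE and DELETE would need similar triggers.

```sql
-- Hypothetical summary table: one row per (date, gender) combination.
-- Column types are assumptions; adjust them to the real fact table.
CREATE TABLE fact_individual_re_summary (
    dim_date_id       integer NOT NULL,
    dim_gender_id     integer NOT NULL,
    quality_hours_sum numeric NOT NULL,
    PRIMARY KEY (dim_date_id, dim_gender_id)
);

-- One-off initial load from the existing fact rows.
INSERT INTO fact_individual_re_summary
SELECT dim_date_id, dim_gender_id, COALESCE(sum(quality_hours), 0)
FROM fact_individual_re
GROUP BY dim_date_id, dim_gender_id;

-- Keep the summary current for newly inserted fact rows.
CREATE FUNCTION fact_individual_re_summary_add() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO fact_individual_re_summary AS s
        (dim_date_id, dim_gender_id, quality_hours_sum)
    VALUES (NEW.dim_date_id, NEW.dim_gender_id, COALESCE(NEW.quality_hours, 0))
    ON CONFLICT (dim_date_id, dim_gender_id)
    DO UPDATE SET quality_hours_sum = s.quality_hours_sum + EXCLUDED.quality_hours_sum;
    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$;

-- PostgreSQL 10 syntax (EXECUTE PROCEDURE).
CREATE TRIGGER fact_individual_re_summary_ins
AFTER INSERT ON fact_individual_re
FOR EACH ROW EXECUTE PROCEDURE fact_individual_re_summary_add();
```

The reporting query would then read fact_individual_re_summary instead of the fact table. A row-level trigger adds overhead to every insert, so whether this pays off depends on how the fact table is loaded.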
There is no predicate shown on the fact table, so we can assume that, before any filtering by the joins, 100% of that table is required.
Indexes exist on the lookup tables, but from what you say they are not covering indexes. Given that 100% of the fact table is scanned and the indexes are not covering, I would expect a hash join.
As an experiment, you could create a covering index (dim_date.id and dim_date.year in a single index) to see if the plan moves away from a hash join against dim_date.
Given the overall lack of predicates, though, a hash join is not necessarily the wrong plan unless a covering index is available.
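The covering-index experiment could look like the sketch below. The index name is hypothetical. PostgreSQL 10 has no INCLUDE clause (it arrived in version 11), so a plain two-column index is used instead.

```sql
-- Hypothetical index name; covers both the join key and the selected column.
CREATE INDEX dim_date_id_year_idx ON dim_date (id, year);

-- Refresh the visibility map and statistics so an index-only scan is possible.
VACUUM ANALYZE dim_date;
```

After that, re-run the query under EXPLAIN ANALYZE and compare the plans. With only 3650 rows in dim_date, the planner may well keep the hash join.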
The query retrieves videos ordered by the number of tags they share with a specific video.
The following query takes about 800 ms, even though indexes appear to be used.
If you remove COUNT, GROUP BY, and ORDER BY from the SQL query, it runs very fast (1-5 ms).
In such a case, improving the SQL query alone will not speed up the process and
SELECT "videos_video"."id",
COUNT("videos_video"."id") AS "n"
FROM "videos_video"
INNER JOIN "videos_video_tags" ON ("videos_video"."id" = "videos_video_tags"."video_id")
WHERE ("videos_video_tags"."tag_id" IN
(SELECT U0."id"
FROM "videos_tag" U0
INNER JOIN "videos_video_tags" U1 ON (U0."id" = U1."tag_id")
WHERE U1."video_id" = '748b1814-f311-48da-a1f5-6bf8fe229c7f'))
GROUP BY "videos_video"."id"
ORDER BY "n" DESC
LIMIT 20
Limit (cost=1040.69..1040.74 rows=20 width=24) (actual time=738.648..738.654 rows=20 loops=1)
-> Sort (cost=1040.69..1044.29 rows=1441 width=24) (actual time=738.646..738.650 rows=20 loops=1)
Sort Key: (count(videos_video.id)) DESC
Sort Method: top-N heapsort Memory: 27kB
-> HashAggregate (cost=987.93..1002.34 rows=1441 width=24) (actual time=671.006..714.322 rows=188818 loops=1)
Group Key: videos_video.id
Batches: 1 Memory Usage: 28689kB
-> Nested Loop (cost=35.20..980.73 rows=1441 width=16) (actual time=0.341..559.034 rows=240293 loops=1)
-> Nested Loop (cost=34.78..340.88 rows=1441 width=16) (actual time=0.278..92.806 rows=240293 loops=1)
-> HashAggregate (cost=34.35..34.41 rows=6 width=32) (actual time=0.188..0.200 rows=4 loops=1)
Group Key: u0.id
Batches: 1 Memory Usage: 24kB
-> Nested Loop (cost=0.71..34.33 rows=6 width=32) (actual time=0.161..0.185 rows=4 loops=1)
-> Index Only Scan using videos_video_tags_video_id_tag_id_f8d6ba70_uniq on videos_video_tags u1 (cost=0.43..4.53 rows=6 width=16) (actual time=0.039..0.040 rows=4 loops=1)
Index Cond: (video_id = '748b1814-f311-48da-a1f5-6bf8fe229c7f'::uuid)
Heap Fetches: 0
-> Index Only Scan using videos_tag_pkey on videos_tag u0 (cost=0.28..4.97 rows=1 width=16) (actual time=0.035..0.035 rows=1 loops=4)
Index Cond: (id = u1.tag_id)
Heap Fetches: 0
-> Index Scan using videos_video_tags_tag_id_2673cfc8 on videos_video_tags (cost=0.43..35.90 rows=1518 width=32) (actual time=0.029..16.728 rows=60073 loops=4)
Index Cond: (tag_id = u0.id)
-> Index Only Scan using videos_video_pkey on videos_video (cost=0.42..0.44 rows=1 width=16) (actual time=0.002..0.002 rows=1 loops=240293)
Index Cond: (id = videos_video_tags.video_id)
Heap Fetches: 46
Planning Time: 1.980 ms
Execution Time: 739.446 ms
(26 rows)
Time: 742.145 ms
---------- Results of the execution plan for the query as answered by Edouard. ----------
Nested Loop (cost=30043.90..30212.53 rows=20 width=746) (actual time=239.142..239.219 rows=20 loops=1)
-> Limit (cost=30043.48..30043.53 rows=20 width=24) (actual time=239.089..239.093 rows=20 loops=1)
-> Sort (cost=30043.48..30607.15 rows=225467 width=24) (actual time=239.087..239.090 rows=20 loops=1)
Sort Key: (count(*)) DESC
Sort Method: top-N heapsort Memory: 26kB
-> HashAggregate (cost=21789.21..24043.88 rows=225467 width=24) (actual time=185.710..219.211 rows=188818 loops=1)
Group Key: vt.video_id
Batches: 1 Memory Usage: 22545kB
-> Nested Loop (cost=20.62..20187.24 rows=320395 width=16) (actual time=4.975..106.839 rows=240293 loops=1)
-> Index Only Scan using videos_video_tags_video_id_tag_id_f8d6ba70_uniq on videos_video_tags vvt (cost=0.43..4.53 rows=6 width=16) (actual time=0.033..0.043 rows=4 loops=1)
Index Cond: (video_id = '748b1814-f311-48da-a1f5-6bf8fe229c7f'::uuid)
Heap Fetches: 0
-> Bitmap Heap Scan on videos_video_tags vt (cost=20.19..3348.60 rows=1518 width=32) (actual time=4.311..20.663 rows=60073 loops=4)
Recheck Cond: (tag_id = vvt.tag_id)
Heap Blocks: exact=34757
-> Bitmap Index Scan on videos_video_tags_tag_id_2673cfc8 (cost=0.00..19.81 rows=1518 width=0) (actual time=3.017..3.017 rows=60073 loops=4)
Index Cond: (tag_id = vvt.tag_id)
-> Index Scan using videos_video_pkey on videos_video v (cost=0.42..8.44 rows=1 width=738) (actual time=0.005..0.005 rows=1 loops=20)
Index Cond: (id = vt.video_id)
Planning Time: 0.854 ms
Execution Time: 241.392 ms
(21 rows)
Time: 242.909 ms
Here are some ideas to simplify the query. An EXPLAIN ANALYSE will then confirm the impact on query performance.
Starting from the subquery :
SELECT U0."id"
FROM "videos_tag" U0
INNER JOIN "videos_video_tags" U1 ON (U0."id" = U1."tag_id")
WHERE U1."video_id" = '748b1814-f311-48da-a1f5-6bf8fe229c7f'
According to the JOIN clause : U0."id" = U1."tag_id" so that SELECT U0."id" can be replaced by SELECT U1."tag_id".
In this case, the table "videos_tag" U0 is not used anymore in the subquery which can be simplified as :
SELECT U1."tag_id"
FROM "videos_video_tags" U1
WHERE U1."video_id" = '748b1814-f311-48da-a1f5-6bf8fe229c7f'
And the WHERE clause of the main query becomes :
WHERE "videos_video_tags"."tag_id" IN
( SELECT U1."tag_id"
FROM "videos_video_tags" U1
WHERE U1."video_id" = '748b1814-f311-48da-a1f5-6bf8fe229c7f'
)
which can be transformed as a self join on the table "videos_video_tags" to be added in the FROM clause of the main query :
FROM "videos_video" AS v
INNER JOIN "videos_video_tags" AS vt
ON v."id" = vt."video_id"
INNER JOIN "videos_video_tags" AS vvt
ON vvt."tag_id" = vt."tag_id"
WHERE vvt."video_id" = '748b1814-f311-48da-a1f5-6bf8fe229c7f'
Finally, because of the JOIN condition between the two tables, the GROUP BY "videos_video"."id" clause can be replaced by GROUP BY "videos_video_tags"."video_id". This new GROUP BY clause, together with the ORDER BY and LIMIT clauses, can then be applied in a subquery involving only the table "videos_video_tags", before joining with the table "videos_video":
SELECT v."id", w."n"
FROM "videos_video" AS v
INNER JOIN
( SELECT vt."video_id"
, count(*) AS "n"
FROM "videos_video_tags" AS vt
INNER JOIN "videos_video_tags" AS vvt
ON vvt."tag_id" = vt."tag_id"
WHERE vvt."video_id" = '748b1814-f311-48da-a1f5-6bf8fe229c7f'
GROUP BY vt."video_id"
ORDER BY "n" DESC
LIMIT 20
) AS w
ON v."id" = w."video_id"
I have a shipment order table which contains two arrays of JSON objects: the declared packages and the actual packages. What I want is the sum of the weights of all declared packages and of all actual packages.
The simpler SQL
explain analyse select
tbl.id,
sum((tbl.decl)::double precision) as total_gross_weight,
sum((tbl.act)::double precision) as total_actual_weight
from (
select
so.id,
json_array_elements(declared_packages)->> 'weight' as decl,
json_array_elements(actual_packages)->> 'weight' as act
from
"shipment-order" so) tbl
group by
tbl.id
order by total_gross_weight desc
Sort (cost=162705.01..163957.01 rows=500800 width=32) (actual time=2350.293..2350.850 rows=4564 loops=1)
Sort Key: (sum(((((json_array_elements(so.declared_packages)) ->> 'weight'::text)))::double precision)) DESC
Sort Method: quicksort Memory: 543kB
-> GroupAggregate (cost=88286.58..103310.58 rows=500800 width=32) (actual time=2085.907..2348.947 rows=4564 loops=1)
Group Key: so.id
-> Sort (cost=88286.58..89538.58 rows=500800 width=80) (actual time=2085.895..2209.717 rows=1117847 loops=1)
Sort Key: so.id
Sort Method: external merge Disk: 28520kB
-> Result (cost=0.00..13615.16 rows=500800 width=80) (actual time=0.063..1744.941 rows=1117847 loops=1)
-> ProjectSet (cost=0.00..3599.16 rows=500800 width=80) (actual time=0.060..856.075 rows=1117847 loops=1)
-> Seq Scan on "shipment-order" so (cost=0.00..1045.08 rows=5008 width=233) (actual time=0.023..6.551 rows=5249 loops=1)
Planning time: 0.379 ms
Execution time: 2359.042 ms
The more complex SQL goes through multiple stages of left joins and lateral (cross) joins against the original table:
explain analyse
select
so.id,
declared_packages_info.total_gross_weight,
actual_packages_info.total_actual_weight
from
("shipment-order" so
left join (
select
so_1.id,
sum((d_packages.value ->> 'weight'::text)::double precision) as total_gross_weight
from
"shipment-order" so_1,
lateral json_array_elements(so_1.declared_packages) d_packages(value)
group by
so_1.id) declared_packages_info on
so.id = declared_packages_info.id
left join (
select
so_1.id,
sum((a_packages.value ->> 'weight'::text)::double precision) as total_actual_weight
from
"shipment-order" so_1,
lateral json_array_elements(so_1.actual_packages) a_packages(value)
group by
so_1.id) actual_packages_info on
so.id = actual_packages_info.id)
order by
total_gross_weight desc
Performs better
Sort (cost=35509.14..35521.66 rows=5008 width=32) (actual time=1823.049..1823.375 rows=5249 loops=1)
Sort Key: declared_packages_info.total_gross_weight DESC
Sort Method: quicksort Memory: 575kB
-> Hash Left Join (cost=34967.97..35201.40 rows=5008 width=32) (actual time=1819.214..1822.000 rows=5249 loops=1)
Hash Cond: (so.id = actual_packages_info.id)
-> Hash Left Join (cost=17484.13..17704.40 rows=5008 width=24) (actual time=1805.038..1806.996 rows=5249 loops=1)
Hash Cond: (so.id = declared_packages_info.id)
-> Index Only Scan using "PK_bcd4a660acbe66f71749270d38a" on "shipment-order" so (cost=0.28..207.40 rows=5008 width=16) (actual time=0.032..0.695 rows=5249 loops=1)
Heap Fetches: 146
-> Hash (cost=17421.24..17421.24 rows=5008 width=24) (actual time=1804.955..1804.957 rows=4553 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 312kB
-> Subquery Scan on declared_packages_info (cost=17321.08..17421.24 rows=5008 width=24) (actual time=1802.980..1804.261 rows=4553 loops=1)
-> HashAggregate (cost=17321.08..17371.16 rows=5008 width=24) (actual time=1802.979..1803.839 rows=4553 loops=1)
Group Key: so_1.id
-> Nested Loop (cost=0.00..11061.08 rows=500800 width=48) (actual time=0.033..902.972 rows=1117587 loops=1)
-> Seq Scan on "shipment-order" so_1 (cost=0.00..1045.08 rows=5008 width=149) (actual time=0.009..4.149 rows=5249 loops=1)
-> Function Scan on json_array_elements d_packages (cost=0.00..1.00 rows=100 width=32) (actual time=0.121..0.145 rows=213 loops=5249)
-> Hash (cost=17421.24..17421.24 rows=5008 width=24) (actual time=14.158..14.160 rows=1362 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 138kB
-> Subquery Scan on actual_packages_info (cost=17321.08..17421.24 rows=5008 width=24) (actual time=13.550..13.904 rows=1362 loops=1)
-> HashAggregate (cost=17321.08..17371.16 rows=5008 width=24) (actual time=13.549..13.783 rows=1362 loops=1)
Group Key: so_1_1.id
-> Nested Loop (cost=0.00..11061.08 rows=500800 width=48) (actual time=0.036..9.922 rows=1837 loops=1)
-> Seq Scan on "shipment-order" so_1_1 (cost=0.00..1045.08 rows=5008 width=100) (actual time=0.008..4.161 rows=5249 loops=1)
-> Function Scan on json_array_elements a_packages (cost=0.00..1.00 rows=100 width=32) (actual time=0.001..0.001 rows=0 loops=5249)
Planning time: 0.210 ms
Execution time: 1824.286 ms
Should I go with the more complex query, or should I try to optimize the simpler one? I see that the simple query has a very long external merge sort...
There are two things you can do to make the simple query faster:
- don't use a large JSON column, but store weight in a regular table column
- increase work_mem until you get a much cheaper hash aggregate
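The work_mem suggestion can be tried per session, as sketched below. The value 128MB is only an assumption to experiment with, not a recommendation. The plan above shows the sort spilling about 28 MB to disk, and the in-memory representation is usually larger than the on-disk one.

```sql
-- Raise the per-operation memory limit for this session only.
SET work_mem = '128MB';

-- Re-run the simple query under EXPLAIN ANALYSE here and check whether
-- "Sort Method: external merge" has been replaced by an in-memory sort
-- or a HashAggregate.

-- Return to the server default afterwards.
RESET work_mem;
```

work_mem applies to each sort or hash operation separately, so a large global setting can multiply across concurrent queries. A session-level or role-level setting is usually the safer experiment.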
Assuming "id" is the primary or a unique key, you can probably get a little speed up with an even simpler query and a helper function. Process each row as a unit, rather than disaggregating, pooling, and re-aggregating.
create function sum_weigh(json) returns double precision language sql as $$
select sum((t->>'weight')::double precision) from json_array_elements($1) f(t)
$$ immutable parallel safe;
select id, sum_weigh(declared_packages), sum_weigh(actual_packages) from "shipment-order";
I am trying to get a row with highest popularity. Ordering by descending popularity is slowing down the query significantly.
Is there a better way to optimize this query ?
Postgresql - 9.5
```
explain analyse SELECT v.cosmo_id,
v.resource_id, k.gid, k.popularity,v.cropinfo_id
FROM rmsg.verifications V INNER JOIN rmip.resourceinfo R ON
(R.id=V.resource_id AND R.source_id=54) INNER JOIN rmpp.kgidinfo K ON
(K.cosmo_id=V.cosmo_id) WHERE V.status=1 AND
v.crop_Status=1 AND V.locked_time isnull ORDER BY k.popularity
desc, (v.cosmo_id,
v.resource_id, v.cropinfo_id) LIMIT 1;
```
```
Limit (cost=470399.99..470399.99 rows=1 width=31) (actual time=19655.552..19655.553 rows=1 loops=1)
-> Sort (cost=470399.99..470434.80 rows=13923 width=31) (actual time=19655.549..19655.549 rows=1 loops=1)
Sort Key: k.popularity DESC, (ROW(v.cosmo_id, v.resource_id, v.cropinfo_id))
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=19053.91..470330.37 rows=13923 width=31) (actual time=58.365..19627.405 rows=23006 loops=1)
-> Hash Join (cost=19053.48..459008.74 rows=13188 width=16) (actual time=58.275..19268.339 rows=19165 loops=1)
Hash Cond: (v.resource_id = r.id)
-> Seq Scan on verifications v (cost=0.00..409876.92 rows=7985725 width=16) (actual time=0.035..11097.163 rows=9908140 loops=1)
Filter: ((locked_time IS NULL) AND (status = 1) AND (crop_status = 1))
Rows Removed by Filter: 1126121
-> Hash (cost=18984.23..18984.23 rows=5540 width=4) (actual time=57.101..57.101 rows=5186 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 247kB
-> Bitmap Heap Scan on resourceinfo r (cost=175.37..18984.23 rows=5540 width=4) (actual time=2.827..51.318 rows=5186 loops=1)
Recheck Cond: (source_id = 54)
Heap Blocks: exact=5907
-> Bitmap Index Scan on resourceinfo_source_id_key (cost=0.00..173.98 rows=5540 width=0) (actual time=1.742..1.742 rows=6483 loops=1)
Index Cond: (source_id = 54)
-> Index Scan using kgidinfo_cosmo_id_idx on kgidinfo k (cost=0.43..0.85 rows=1 width=23) (actual time=0.013..0.014 rows=1 loops=19165)
Index Cond: (cosmo_id = v.cosmo_id)
Planning time: 1.083 ms
Execution time: 19655.638 ms
(21 rows)
```
This is your query, simplified by removing parentheses:
SELECT v.cosmo_id, v.resource_id, k.gid, k.popularity, v.cropinfo_id
FROM rmsg.verifications V INNER JOIN
rmip.resourceinfo R
ON R.id = V.resource_id AND R.source_id = 54 INNER JOIN
rmpp.kgidinfo K
ON K.cosmo_id = V.cosmo_id
WHERE V.status = 1 AND v.crop_Status = 1 AND
V.locked_time is null
ORDER BY k.popularity desc, v.cosmo_id, v.resource_id, v.cropinfo_id
For this query, I would think in terms of indexes on verifications(status, crop_status, locked_time, resource_id, cosmo_id, cropinfo_id), resourceinfo(id, source_id), and kgidinfo(cosmo_id). I don't see an easy way to remove the ORDER BY.
In looking at the query, I wonder if you might have a Cartesian product problem between the two tables.
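The suggested indexes could be created roughly as follows. The index names are hypothetical. The plan in the question already uses an index called kgidinfo_cosmo_id_idx, so the index on kgidinfo(cosmo_id) appears to exist and is not repeated.

```sql
-- Hypothetical index names; the column lists follow the suggestion above.
CREATE INDEX verifications_status_crop_idx
    ON rmsg.verifications (status, crop_status, locked_time,
                           resource_id, cosmo_id, cropinfo_id);

CREATE INDEX resourceinfo_id_source_idx
    ON rmip.resourceinfo (id, source_id);

-- Refresh statistics so the planner can cost the new indexes.
ANALYZE rmsg.verifications;
ANALYZE rmip.resourceinfo;
```

Since the filter on verifications keeps roughly 9.9M of about 11M rows in the plan shown, the planner may still prefer a sequential scan there. Comparing EXPLAIN ANALYSE output before and after is the only way to tell.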
I have the following two queries. Query 1 is fast since it uses indexes (a merge join over two index scans), while Query 2 uses a hash join and is slower.
Query 1 orders by a column of table 1, and Query 2 orders by a column of table 2.
Query 1
learning=# explain analyze
select *
from users left join
access_logs
on users.userid = access_logs.userid
order by users.userid
limit 10 offset 90;
Limit (cost=14.00..15.46 rows=10 width=104) (actual time=1.330..1.504 rows=10 loops=1)
-> Merge Left Join (cost=0.85..291532.97 rows=1995958 width=104) (actual time=0.037..1.482 rows=100 loops=1)
Merge Cond: (users.userid = access_logs.userid)
-> Index Scan using users_pkey on users (cost=0.43..151132.75 rows=1995958 width=76) (actual time=0.018..1.135 rows=100 loops=1)
-> Index Scan using access_logs_userid_idx on access_logs (cost=0.43..110471.45 rows=1995958 width=28) (actual time=0.012..0.198 rows=100 loops=1)
Planning time: 0.469 ms
Execution time: 1.569 ms
Query 2
learning=# explain analyze
select *
from users left join
access_logs
on users.userid = access_logs.userid
order by access_logs.userid
limit 10 offset 90;
Limit (cost=293584.20..293584.23 rows=10 width=104) (actual time=3821.432..3821.439 rows=10 loops=1)
-> Sort (cost=293583.98..298573.87 rows=1995958 width=104) (actual time=3821.391..3821.415 rows=100 loops=1)
Sort Key: access_logs.userid
Sort Method: top-N heapsort Memory: 51kB
-> Hash Left Join (cost=73231.06..217299.90 rows=1995958 width=104) (actual time=539.859..3168.754 rows=1995958 loops=1)
Hash Cond: (users.userid = access_logs.userid)
-> Seq Scan on users (cost=0.00..44814.58 rows=1995958 width=76) (actual time=0.009..443.260 rows=1995958 loops=1)
-> Hash (cost=34636.58..34636.58 rows=1995958 width=28) (actual time=539.112..539.112 rows=1995958 loops=1)
Buckets: 262144 Batches: 2 Memory Usage: 58532kB
-> Seq Scan on access_logs (cost=0.00..34636.58 rows=1995958 width=28) (actual time=0.006..170.061 rows=1995958 loops=1)
Planning time: 0.480 ms
Execution time: 3832.245 ms
The second query is slow because, as the plan shows, all 1,995,958 rows are joined first and only then sorted.
Why does the sort on the second table's column not use the index? Below is a plan with just the sort.
Query - explain analyze select * from access_logs order by userid limit 10 offset 90;
Limit (cost=5.41..5.96 rows=10 width=28) (actual time=0.199..0.218 rows=10 loops=1)
-> Index Scan using access_logs_userid_idx on access_logs (cost=0.43..110471.45 rows=1995958 width=28) (actual time=0.029..0.201 rows=100 loops=1)
Planning time: 0.120 ms
Execution time: 0.252 ms
Edit 1:
My goal is not to compare the two queries; in fact I want the result of Query 2. I only provided Query 1 for comparison.
The ORDER BY is not limited to the join column; the user can also order by another column of table 2. The plan is below.
learning=# explain analyze select * from users left join access_logs on users.userid=access_logs.userid order by access_logs.last_login limit 10;
Limit (cost=260431.83..260431.86 rows=10 width=104) (actual time=3846.625..3846.627 rows=10 loops=1)
-> Sort (cost=260431.83..265421.73 rows=1995958 width=104) (actual time=3846.623..3846.623 rows=10 loops=1)
Sort Key: access_logs.last_login
Sort Method: top-N heapsort Memory: 27kB
-> Hash Left Join (cost=73231.06..217299.90 rows=1995958 width=104) (actual time=567.104..3174.818 rows=1995958 loops=1)
Hash Cond: (users.userid = access_logs.userid)
-> Seq Scan on users (cost=0.00..44814.58 rows=1995958 width=76) (actual time=0.007..443.364 rows=1995958 loops=1)
-> Hash (cost=34636.58..34636.58 rows=1995958 width=28) (actual time=566.814..566.814 rows=1995958 loops=1)
Buckets: 262144 Batches: 2 Memory Usage: 58532kB
-> Seq Scan on access_logs (cost=0.00..34636.58 rows=1995958 width=28) (actual time=0.004..169.137 rows=1995958 loops=1)
Planning time: 0.490 ms
Execution time: 3857.171 ms
The sort in the second query cannot use the index because the index is not guaranteed to contain all the values being sorted. If some records in users are not matched in access_logs, the LEFT JOIN generates NULL values for access_logs.userid that are not actually present in access_logs and thus not covered by the index.
A workaround is to create a default initial record in access_logs for each user and use an INNER JOIN.
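A minimal sketch of that workaround is shown below. It assumes every column of access_logs other than userid is nullable or has a default, which the question does not show.

```sql
-- Give every user without a log entry one placeholder row.
-- Assumption: the remaining access_logs columns accept NULL or a default.
INSERT INTO access_logs (userid)
SELECT u.userid
FROM users u
WHERE NOT EXISTS (
    SELECT 1 FROM access_logs a WHERE a.userid = u.userid
);

-- With no unmatched users left, an inner join returns the same users
-- as the left join did and can be ordered directly from the index.
EXPLAIN ANALYZE
SELECT *
FROM users
JOIN access_logs ON users.userid = access_logs.userid
ORDER BY access_logs.userid
LIMIT 10 OFFSET 90;
```

The placeholder rows change what a NULL last_login means, so application code that reads access_logs may need to account for them.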
PGSQL 8.4.2, Linux
I make use of table inheritance
Each Table contains 3 million rows
Indexes on joining columns are set
Table statistics (analyze, vacuum analyze) are up-to-date
The only table used is "node", with various partitioned sub-tables
Recursive query (pg >= 8.4)
Now here is the explained query:
WITH RECURSIVE
rows AS
(
SELECT *
FROM (
SELECT r.id, r.set, r.parent, r.masterid
FROM d_storage.node_dataset r
WHERE masterid = 3533933
) q
UNION ALL
SELECT *
FROM (
SELECT c.id, c.set, c.parent, r.masterid
FROM rows r
JOIN a_storage.node c
ON c.parent = r.id
) q
)
SELECT r.masterid, r.id AS nodeid
FROM rows r
CTE Scan on rows r (cost=2742105.92..2862119.94 rows=6000701 width=16) (actual time=0.033..172111.204 rows=4 loops=1)
CTE rows
-> Recursive Union (cost=0.00..2742105.92 rows=6000701 width=28) (actual time=0.029..172111.183 rows=4 loops=1)
-> Index Scan using node_dataset_masterid on node_dataset r (cost=0.00..8.60 rows=1 width=28) (actual time=0.025..0.027 rows=1 loops=1)
Index Cond: (masterid = 3533933)
-> Hash Join (cost=0.33..262208.33 rows=600070 width=28) (actual time=40628.371..57370.361 rows=1 loops=3)
Hash Cond: (c.parent = r.id)
-> Append (cost=0.00..211202.04 rows=12001404 width=20) (actual time=0.011..46365.669 rows=12000004 loops=3)
-> Seq Scan on node c (cost=0.00..24.00 rows=1400 width=20) (actual time=0.002..0.002 rows=0 loops=3)
-> Seq Scan on node_dataset c (cost=0.00..55001.01 rows=3000001 width=20) (actual time=0.007..3426.593 rows=3000001 loops=3)
-> Seq Scan on node_stammdaten c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=0.008..9049.189 rows=3000001 loops=3)
-> Seq Scan on node_stammdaten_adresse c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=3.455..8381.725 rows=3000001 loops=3)
-> Seq Scan on node_testdaten c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=1.810..5259.178 rows=3000001 loops=3)
-> Hash (cost=0.20..0.20 rows=10 width=16) (actual time=0.010..0.010 rows=1 loops=3)
-> WorkTable Scan on rows r (cost=0.00..0.20 rows=10 width=16) (actual time=0.002..0.004 rows=1 loops=3)
Total runtime: 172111.371 ms
(16 rows)
So far so bad, the planner decides to choose hash joins (good) but no indexes (bad).
Now after doing the following:
SET enable_hashjoin TO false;
The explained query looks like that:
CTE Scan on rows r (cost=15198247.00..15318261.02 rows=6000701 width=16) (actual time=0.038..49.221 rows=4 loops=1)
CTE rows
-> Recursive Union (cost=0.00..15198247.00 rows=6000701 width=28) (actual time=0.032..49.201 rows=4 loops=1)
-> Index Scan using node_dataset_masterid on node_dataset r (cost=0.00..8.60 rows=1 width=28) (actual time=0.028..0.031 rows=1 loops=1)
Index Cond: (masterid = 3533933)
-> Nested Loop (cost=0.00..1507822.44 rows=600070 width=28) (actual time=10.384..16.382 rows=1 loops=3)
Join Filter: (r.id = c.parent)
-> WorkTable Scan on rows r (cost=0.00..0.20 rows=10 width=16) (actual time=0.001..0.003 rows=1 loops=3)
-> Append (cost=0.00..113264.67 rows=3001404 width=20) (actual time=8.546..12.268 rows=1 loops=4)
-> Seq Scan on node c (cost=0.00..24.00 rows=1400 width=20) (actual time=0.001..0.001 rows=0 loops=4)
-> Bitmap Heap Scan on node_dataset c (cost=58213.87..113214.88 rows=3000001 width=20) (actual time=1.906..1.906 rows=0 loops=4)
Recheck Cond: (c.parent = r.id)
-> Bitmap Index Scan on node_dataset_parent (cost=0.00..57463.87 rows=3000001 width=0) (actual time=1.903..1.903 rows=0 loops=4)
Index Cond: (c.parent = r.id)
-> Index Scan using node_stammdaten_parent on node_stammdaten c (cost=0.00..8.60 rows=1 width=20) (actual time=3.272..3.273 rows=0 loops=4)
Index Cond: (c.parent = r.id)
-> Index Scan using node_stammdaten_adresse_parent on node_stammdaten_adresse c (cost=0.00..8.60 rows=1 width=20) (actual time=4.333..4.333 rows=0 loops=4)
Index Cond: (c.parent = r.id)
-> Index Scan using node_testdaten_parent on node_testdaten c (cost=0.00..8.60 rows=1 width=20) (actual time=2.745..2.746 rows=0 loops=4)
Index Cond: (c.parent = r.id)
Total runtime: 49.349 ms
(21 rows)
-> incredibly faster, because indexes were used.
Note: the estimated cost of the second query is higher than that of the first.
So the main question is: why does the planner make the first decision instead of the second?
Also interesting:
SET enable_seqscan TO false;
I temporarily disabled seq scans. Then the planner used indexes and hash joins, and the query was still slow. So the problem seems to be the hash join.
Maybe someone can help in this confusing situation?
thx, R.
If your explain differs significantly from reality (the cost is lower but the time is higher) it is likely that your statistics are out of date, or are on a non-representative sample.
Try again with fresh statistics. If this does not help, increase the sample size and build stats again.
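Refreshing the statistics and then enlarging the sample could be done as sketched below. It uses d_storage.node_dataset and its parent column from the question; the target value 1000 is an arbitrary assumption. The other child tables would need the same treatment, but their schema names are not shown in the question.

```sql
-- Step 1: rebuild statistics with the current sample size.
ANALYZE d_storage.node_dataset;

-- Step 2: if the estimates are still far off, enlarge the sample
-- for the join column and analyze again.
ALTER TABLE d_storage.node_dataset ALTER COLUMN parent SET STATISTICS 1000;
ANALYZE d_storage.node_dataset;
```

Afterwards, compare the estimated row count of the join (600070 in the first plan) with the actual count. If the two are still far apart, the misestimate probably does not come from the sample size alone.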
Try this:
set seq_page_cost = '4.0';
set random_page_cost = '1.0';
Do this in your session to see if it makes a difference.
The hash join was expecting 600070 resulting rows, but only got 4 (in 3 loops, averaging 1 row per loop). If 600070 had been accurate, the hash join would presumably have been appropriate.