SQL query running very slow - postgres - sql

This query currently takes 4 minutes to run:
with name1 as (
    select col1 as a1, col2 as a2, sum(FEE) as a3
    from s1, date
    where return_date = datesk and year = 2000
    group by col1, col2
)
select c_id
from name1 ala1, ss, cc
where ala1.a3 > (
          select avg(a3) * 1.2
          from name1 ctr2
          where ala1.a2 = ctr2.a2
      )
  and s_sk = ala1.a2
  and s_state = 'TN'
  and ala1.a1 = c_sk
order by c_id
limit 100;
I have set work_mem = '1000MB' and enable_nestloop = off.
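That is, in the session:
SET work_mem = '1000MB';
SET enable_nestloop = off;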
EXPLAIN ANALYZE of this query is: http://explain.depesz.com/s/DUa
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=59141.02..59141.09 rows=28 width=17) (actual time=253707.928..253707.940 rows=100 loops=1)
CTE name1
-> HashAggregate (cost=11091.33..11108.70 rows=1390 width=14) (actual time=105.223..120.358 rows=50441 loops=1)
Group Key: s1.col1, s1.col2
-> Hash Join (cost=2322.69..11080.90 rows=1390 width=14) (actual time=10.390..79.897 rows=55820 loops=1)
Hash Cond: (s1.return_date = date.datesk)
-> Seq Scan on s1 (cost=0.00..7666.14 rows=287514 width=18) (actual time=0.005..33.801 rows=287514 loops=1)
-> Hash (cost=2318.11..2318.11 rows=366 width=4) (actual time=10.375..10.375 rows=366 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 13kB
-> Seq Scan on date (cost=0.00..2318.11 rows=366 width=4) (actual time=5.224..10.329 rows=366 loops=1)
Filter: (year = 2000)
Rows Removed by Filter: 72683
-> Sort (cost=48032.32..48032.39 rows=28 width=17) (actual time=253707.923..253707.930 rows=100 loops=1)
Sort Key: cc.c_id
Sort Method: top-N heapsort Memory: 32kB
-> Hash Join (cost=43552.37..48031.65 rows=28 width=17) (actual time=253634.511..253696.291 rows=18976 loops=1)
Hash Cond: (cc.c_sk = ala1.a1)
-> Seq Scan on cc (cost=0.00..3854.00 rows=100000 width=21) (actual time=0.009..18.527 rows=100000 loops=1)
-> Hash (cost=43552.02..43552.02 rows=28 width=4) (actual time=253634.420..253634.420 rows=18976 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 668kB
-> Hash Join (cost=1.30..43552.02 rows=28 width=4) (actual time=136.819..253624.375 rows=18982 loops=1)
Hash Cond: (ala1.a2 = ss.s_sk)
-> CTE Scan on name1 ala1 (cost=0.00..43548.70 rows=463 width=8) (actual time=136.756..253610.817 rows=18982 loops=1)
Filter: (a3 > (SubPlan 2))
Rows Removed by Filter: 31459
SubPlan 2
-> Aggregate (cost=31.29..31.31 rows=1 width=32) (actual time=5.025..5.025 rows=1 loops=50441)
-> CTE Scan on name1 ctr2 (cost=0.00..31.27 rows=7 width=32) (actual time=0.032..3.860 rows=8241 loops=50441)
Filter: (ala1.a2 = a2)
Rows Removed by Filter: 42200
-> Hash (cost=1.15..1.15 rows=12 width=4) (actual time=0.036..0.036 rows=12 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on ss (cost=0.00..1.15 rows=12 width=4) (actual time=0.025..0.033 rows=12 loops=1)
Filter: (s_state = 'TN'::bpchar)
Planning time: 0.316 ms
Execution time: 253708.351 ms
(36 rows)
With enable_nestloop = on, the
EXPLAIN ANALYZE result is: http://explain.depesz.com/s/NPo
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=54916.36..54916.43 rows=28 width=17) (actual time=257869.004..257869.015 rows=100 loops=1)
CTE name1
-> HashAggregate (cost=11091.33..11108.70 rows=1390 width=14) (actual time=92.354..104.103 rows=50441 loops=1)
Group Key: s1.col1, s1.col2
-> Hash Join (cost=2322.69..11080.90 rows=1390 width=14) (actual time=9.371..68.156 rows=55820 loops=1)
Hash Cond: (s1.return_date = date.datesk)
-> Seq Scan on s1 (cost=0.00..7666.14 rows=287514 width=18) (actual time=0.011..25.637 rows=287514 loops=1)
-> Hash (cost=2318.11..2318.11 rows=366 width=4) (actual time=9.343..9.343 rows=366 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 13kB
-> Seq Scan on date (cost=0.00..2318.11 rows=366 width=4) (actual time=4.796..9.288 rows=366 loops=1)
Filter: (year = 2000)
Rows Removed by Filter: 72683
-> Sort (cost=43807.66..43807.73 rows=28 width=17) (actual time=257868.994..257868.998 rows=100 loops=1)
Sort Key: cc.c_id
Sort Method: top-N heapsort Memory: 32kB
-> Nested Loop (cost=0.29..43806.98 rows=28 width=17) (actual time=120.358..257845.941 rows=18976 loops=1)
-> Nested Loop (cost=0.00..43633.22 rows=28 width=4) (actual time=120.331..257692.654 rows=18982 loops=1)
Join Filter: (ala1.a2 = ss.s_sk)
Rows Removed by Join Filter: 208802
-> CTE Scan on name1 ala1 (cost=0.00..43548.70 rows=463 width=8) (actual time=120.316..257652.636 rows=18982 loops=1)
Filter: (a3 > (SubPlan 2))
Rows Removed by Filter: 31459
SubPlan 2
-> Aggregate (cost=31.29..31.31 rows=1 width=32) (actual time=5.105..5.105 rows=1 loops=50441)
-> CTE Scan on name1 ctr2 (cost=0.00..31.27 rows=7 width=32) (actual time=0.032..3.952 rows=8241 loops=50441)
Filter: (ala1.a2 = a2)
Rows Removed by Filter: 42200
-> Materialize (cost=0.00..1.21 rows=12 width=4) (actual time=0.000..0.001 rows=12 loops=18982)
-> Seq Scan on ss (cost=0.00..1.15 rows=12 width=4) (actual time=0.007..0.012 rows=12 loops=1)
Filter: (s_state = 'TN'::bpchar)
-> Index Scan using cc_pkey on cc (cost=0.29..6.20 rows=1 width=21) (actual time=0.007..0.007 rows=1 loops=18982)
Index Cond: (c_sk = ala1.a1)
Planning time: 0.453 ms
Execution time: 257869.554 ms
(34 rows)
Many other queries run quickly with enable_nestloop=off, but for this query it makes no big difference. The raw data is not really big, so 4 minutes is far too long; I was expecting around 4-5 seconds.
Why is it taking so long?
I tried this in both Postgres 9.4 and 9.5; it is the same. Maybe I can create BRIN indexes, but I am not sure which columns to create them on.
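For reference, the syntax I would use is below (BRIN requires 9.5+; return_date is only a guess on my part, since BRIN mainly helps when the column's values correlate with the physical row order):
CREATE INDEX s1_return_date_brin ON s1 USING brin (return_date);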
Configuration setting:
effective_cache_size | 89GB
shared_buffers | 18GB
work_mem | 1000MB
maintenance_work_mem | 500MB
checkpoint_segments | 32
constraint_exclusion | on
checkpoint_completion_target | 0.5

As John Bollinger commented, your sub-query gets evaluated for each row of the main query. In your plan, the subplan takes about 5 ms per execution and runs 50,441 times (once per CTE row), which alone accounts for roughly 253 of the 254 seconds of runtime. But since you are averaging over a simple column, you can easily move the sub-query out to a CTE and calculate the averages once, which should speed up things tremendously:
with name1 as (
    select col1 as a1, col2 as a2, sum(FEE) as a3
    from s1, date
    where return_date = datesk and year = 2000
    group by col1, col2
), avg_a3_by_a2 as (
    select a2, avg(a3) * 1.2 as avg12
    from name1
    group by a2
)
select c_id
from name1, avg_a3_by_a2, ss, cc
where name1.a3 > avg_a3_by_a2.avg12
  and name1.a2 = avg_a3_by_a2.a2
  and s_sk = name1.a2
  and s_state = 'TN'
  and name1.a1 = c_sk
order by c_id
limit 100;
The new CTE calculates the average + 20% for every distinct value of a2.
Please also use explicit JOIN syntax instead of comma-separated FROM items, as it makes your code far more readable. And if you start using aliases in your query, use them consistently on all tables and columns. I could not correct either of these two issues for lack of information, but a sketch of the direction is shown below.
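As an illustration only (it assumes, based on your plans, that return_date and FEE live in s1, datesk and year in date, s_sk and s_state in ss, and c_sk and c_id in cc), the rewritten query with explicit joins could look like:
with name1 as (
    select s1.col1 as a1, s1.col2 as a2, sum(s1.fee) as a3
    from s1
    join date on s1.return_date = date.datesk
    where date.year = 2000
    group by s1.col1, s1.col2
), avg_a3_by_a2 as (
    select a2, avg(a3) * 1.2 as avg12
    from name1
    group by a2
)
select cc.c_id
from name1
join avg_a3_by_a2 on name1.a2 = avg_a3_by_a2.a2
join ss on ss.s_sk = name1.a2
join cc on cc.c_sk = name1.a1
where name1.a3 > avg_a3_by_a2.avg12
  and ss.s_state = 'TN'
order by cc.c_id
limit 100;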

Related

Need to improve count performance in PostgreSQL for this query

I have this query in PostgreSQL:
SELECT COUNT("contacts"."id")
FROM "contacts"
INNER JOIN "phone_numbers" ON "phone_numbers"."id" = "contacts"."phone_number_id"
INNER JOIN "companies" ON "companies"."id" = "contacts"."company_id"
WHERE (
    (
        (
            CAST("phone_numbers"."value" AS VARCHAR) ILIKE '%a%'
            OR CAST("contacts"."first_name" AS VARCHAR) ILIKE '%a%'
        )
        OR CAST("contacts"."last_name" AS VARCHAR) ILIKE '%a%'
    )
    OR CAST("companies"."name" AS VARCHAR) ILIKE '%a%'
)
When I run the query it takes 19 seconds. I need to improve the performance.
Note: I already have indexes on these columns.
EXPLAIN ANALYZE report
Finalize Aggregate (cost=209076.49..209076.54 rows=1 width=8) (actual time=6117.381..6646.477 rows=1 loops=1)
-> Gather (cost=209076.42..209076.48 rows=4 width=8) (actual time=6117.370..6646.473 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=209066.42..209066.47 rows=1 width=8) (actual time=5952.710..5952.723 rows=1 loops=5)
-> Hash Join (cost=137685.37..208438.42 rows=251200 width=8) (actual time=3007.055..5945.571 rows=39193 loops=5)
Hash Cond: (contacts.company_id = companies.id)
Join Filter: (((phone_numbers.value)::text ~~* '%as%'::text) OR ((contacts.first_name)::text ~~* '%as%'::text) OR ((contacts.last_name)::text ~~* '%as%'::text) OR ((companies.name)::text ~~* '%as%'::text))
Rows Removed by Join Filter: 763817
-> Parallel Hash Join (cost=137684.86..201964.34 rows=1003781 width=41) (actual time=3006.633..4596.987 rows=803010 loops=5)
Hash Cond: (contacts.phone_number_id = phone_numbers.id)
-> Parallel Seq Scan on contacts (cost=0.00..59316.85 rows=1003781 width=37) (actual time=11.032..681.124 rows=803010 loops=5)
-> Parallel Hash (cost=68914.22..68914.22 rows=1295458 width=20) (actual time=1632.770..1632.770 rows=803184 loops=5)
Buckets: 65536 Batches: 64 Memory Usage: 4032kB
-> Parallel Seq Scan on phone_numbers (cost=0.00..68914.22 rows=1295458 width=20) (actual time=10.780..1202.242 rows=803184 loops=5)
-> Hash (cost=0.30..0.30 rows=4 width=40) (actual time=0.258..0.258 rows=4 loops=5)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on companies (cost=0.00..0.30 rows=4 width=40) (actual time=0.247..0.248 rows=4 loops=5)
Planning Time: 1.895 ms
Execution Time: 6646.558 ms
Please help me with this performance issue.
I tried the function row_count_estimate(query text), and it does not give the exact count.
Solution tried:
I tried Robert's solution and got a 16-second runtime.
My Query is:
SELECT Count(id) AS id
FROM (
    SELECT contacts.id AS id
    FROM contacts
    WHERE (contacts.last_name ilike '%as%')
       OR (contacts.last_name ilike '%as%')
    UNION
    SELECT contacts.id AS id
    FROM contacts
    WHERE contacts.phone_number_id IN (
        SELECT phone_numbers.id AS phone_number_id
        FROM phone_numbers
        WHERE phone_numbers.value ilike '%as%')
    UNION
    SELECT contacts.id AS id
    FROM contacts
    WHERE contacts.company_id IN (
        SELECT companies.id AS company_id
        FROM companies
        WHERE companies.name ilike '%as%')
) AS ID
Report:
Aggregate (cost=395890.08..395890.13 rows=1 width=8) (actual time=5942.601..5942.667 rows=1 loops=1)
-> Unique (cost=332446.76..337963.57 rows=1103362 width=8) (actual time=5929.800..5939.658 rows=101989 loops=1)
-> Sort (cost=332446.76..335205.17 rows=1103362 width=8) (actual time=5929.799..5933.823 rows=101989 loops=1)
Sort Key: contacts.id
Sort Method: external merge Disk: 1808kB
-> Append (cost=10.00..220843.02 rows=1103362 width=8) (actual time=1.158..5900.926 rows=101989 loops=1)
-> Gather (cost=10.00..61935.48 rows=99179 width=8) (actual time=1.158..569.412 rows=101989 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Parallel Seq Scan on contacts (cost=0.00..61826.30 rows=24795 width=8) (actual time=0.446..477.276 rows=20398 loops=5)
Filter: ((last_name)::text ~~* '%as%'::text)
Rows Removed by Filter: 782612
-> Nested Loop (cost=0.84..359.91 rows=402 width=8) (actual time=5292.088..5292.089 rows=0 loops=1)
-> Index Scan using idx_phone_value on phone_numbers (cost=0.41..64.13 rows=402 width=8) (actual time=5292.087..5292.087 rows=0 loops=1)
Index Cond: ((value)::text ~~* '%as%'::text)
Rows Removed by Index Recheck: 4015921
-> Index Scan using index_contacts_on_phone_number_id on contacts contacts_1 (cost=0.43..0.69 rows=1 width=16) (never executed)
Index Cond: (phone_number_id = phone_numbers.id)
-> Gather (cost=10.36..75795.48 rows=1003781 width=8) (actual time=26.298..26.331 rows=0 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Hash Join (cost=0.36..74781.70 rows=250945 width=8) (actual time=3.758..3.758 rows=0 loops=5)
Hash Cond: (contacts_2.company_id = companies.id)
-> Parallel Seq Scan on contacts contacts_2 (cost=0.00..59316.85 rows=1003781 width=16) (actual time=0.128..0.128 rows=1 loops=5)
-> Hash (cost=0.31..0.31 rows=1 width=8) (actual time=0.726..0.727 rows=0 loops=5)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Seq Scan on companies (cost=0.00..0.31 rows=1 width=8) (actual time=0.726..0.726 rows=0 loops=5)
Filter: ((name)::text ~~* '%as%'::text)
Rows Removed by Filter: 4
Planning Time: 0.846 ms
Execution Time: 5948.330 ms
I tried the below also:
EXPLAIN ANALYZE
SELECT count(id) AS id
FROM (
    SELECT contacts.id AS id
    FROM contacts
    WHERE position('as' in LOWER(last_name)) > 0
    UNION
    SELECT contacts.id AS id
    FROM contacts
    WHERE EXISTS (
        SELECT 1
        FROM phone_numbers
        WHERE position('as' in LOWER(phone_numbers.value)) > 0
          AND contacts.phone_number_id = phone_numbers.id
    )
    UNION
    SELECT contacts.id AS id
    FROM contacts
    WHERE EXISTS (
        SELECT 1
        FROM companies
        WHERE position('as' in LOWER(companies.name)) > 0
          AND contacts.company_id = companies.id
    )
    UNION DISTINCT
    SELECT contacts.id AS id
    FROM contacts
    WHERE position('as' in LOWER(first_name)) > 0
) AS ID;
Report
Aggregate (cost=1609467.66..1609467.71 rows=1 width=8) (actual time=1039.249..1039.330 rows=1 loops=1)
-> Unique (cost=1320886.03..1345980.09 rows=5018811 width=8) (actual time=999.363..1030.500 rows=195963 loops=1)
-> Sort (cost=1320886.03..1333433.06 rows=5018811 width=8) (actual time=999.362..1013.818 rows=198421 loops=1)
Sort Key: contacts.id
Sort Method: external merge Disk: 3520kB
-> Gather (cost=10.00..754477.62 rows=5018811 width=8) (actual time=0.581..941.210 rows=198421 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Parallel Append (cost=0.00..749448.80 rows=5018811 width=8) (actual time=290.521..943.736 rows=39684 loops=5)
-> Parallel Hash Join (cost=101469.35..164569.24 rows=334587 width=8) (actual time=724.841..724.843 rows=0 loops=2)
Hash Cond: (contacts.phone_number_id = phone_numbers.id)
-> Parallel Seq Scan on contacts (cost=0.00..59315.91 rows=1003762 width=16) (never executed)
-> Parallel Hash (cost=78630.16..78630.16 rows=431819 width=8) (actual time=723.735..723.735 rows=0 loops=2)
Buckets: 131072 Batches: 32 Memory Usage: 0kB
-> Parallel Seq Scan on phone_numbers (cost=0.00..78630.16 rows=431819 width=8) (actual time=723.514..723.514 rows=0 loops=2)
Filter: ("position"(lower((value)::text), 'as'::text) > 0)
Rows Removed by Filter: 2007960
-> Hash Join (cost=0.38..74780.48 rows=250940 width=8) (actual time=0.888..0.888 rows=0 loops=1)
Hash Cond: (contacts_1.company_id = companies.id)
-> Parallel Seq Scan on contacts contacts_1 (cost=0.00..59315.91 rows=1003762 width=16) (actual time=0.009..0.009 rows=1 loops=1)
-> Hash (cost=0.33..0.33 rows=1 width=8) (actual time=0.564..0.564 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Seq Scan on companies (cost=0.00..0.33 rows=1 width=8) (actual time=0.563..0.563 rows=0 loops=1)
Filter: ("position"(lower((name)::text), 'as'::text) > 0)
Rows Removed by Filter: 4
-> Parallel Seq Scan on contacts contacts_2 (cost=0.00..66844.13 rows=334588 width=8) (actual time=0.119..315.032 rows=20398 loops=5)
Filter: ("position"(lower((last_name)::text), 'as'::text) > 0)
Rows Removed by Filter: 782612
-> Parallel Seq Scan on contacts contacts_3 (cost=0.00..66844.13 rows=334588 width=8) (actual time=0.510..558.791 rows=32144 loops=3)
Filter: ("position"(lower((first_name)::text), 'as'::text) > 0)
Rows Removed by Filter: 1306206
Planning Time: 2.115 ms
Execution Time: 1040.620 ms
It's hard to help you, because I don't have access to your data. Let me try...
The EXPLAIN ANALYZE report shows that:
Your query is not using indexes. The full scan on the phone_numbers table took 1.202 seconds, and 0.681 seconds on the contacts table.
"Rows Removed by Join Filter: 763817".
"Parallel Hash Join (cost=137684.86..201964.34 rows=1003781 width=41) (actual time=3006.633..4596.987 rows=803010 loops=5)". So this query joins ~800k rows and then filters out 763k of them.
Maybe you can reverse that. It should speed things up (but that needs to be checked).
For example, you can test this - rewrite your query in this direction:
SELECT COUNT(ID)
FROM (
    SELECT "contacts"."id"
    FROM "contacts"
    WHERE <filters on contacts here>
    UNION
    SELECT "contacts"."id"
    FROM "contacts"
    WHERE phone_number_id IN (SELECT "phone_numbers"."id"
                              FROM "phone_numbers"
                              WHERE <filters on phone_numbers here>)
    UNION
    SELECT "contacts"."id"
    FROM "contacts"
    WHERE company_id IN (SELECT "companies"."id"
                         FROM "companies"
                         WHERE <filters on companies here>)
) AS B
Two indexes: one on column contacts.phone_number_id and another on contacts.company_id might help.
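For example (the index names are just placeholders):
CREATE INDEX idx_contacts_phone_number_id ON contacts (phone_number_id);
CREATE INDEX idx_contacts_company_id ON contacts (company_id);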
EDIT:
Using the index on "phone_numbers"."id" with a nested loop took 5 seconds.
Try to avoid this.
Please check what this will do:
SELECT Count(id) AS id
FROM (
    SELECT contacts.id AS id
    FROM contacts
    WHERE (contacts.last_name ilike '%as%')
       OR (contacts.last_name ilike '%as%')
    UNION
    SELECT contacts.id AS id
    FROM contacts
    WHERE contacts.phone_number_id IN (
        SELECT to_number(to_char(phone_numbers.id)) /* just to disable the index scan on that column */ AS phone_number_id
        FROM phone_numbers
        WHERE phone_numbers.value ilike '%as%')
    UNION
    SELECT contacts.id AS id
    FROM contacts
    WHERE contacts.company_id IN (
        SELECT companies.id AS company_id
        FROM companies
        WHERE companies.name ilike '%as%')
) AS ID
Aggregate (cost=419095.35..419095.40 rows=1 width=8) (actual time=13235.986..13236.335 rows=1 loops=1)
-> Unique (cost=346875.23..353155.24 rows=1256002 width=8) (actual time=13211.350..13230.729 rows=195963 loops=1)
-> Sort (cost=346875.23..350015.24 rows=1256002 width=8) (actual time=13211.349..13219.607 rows=195963 loops=1)
Sort Key: contacts.id
Sort Method: external merge Disk: 3472kB
-> Append (cost=2249.63..218658.27 rows=1256002 width=8) (actual time=5927.019..13164.421 rows=195963 loops=1)
-> Gather (cost=2249.63..48279.58 rows=251838 width=8) (actual time=5927.019..6911.795 rows=195963 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Parallel Bitmap Heap Scan on contacts (cost=2239.63..48017.74 rows=62960 width=8) (actual time=5861.480..6865.957 rows=39193 loops=5)
Recheck Cond: (((first_name)::text ~~* '%as%'::text) OR ((last_name)::text ~~* '%as%'::text))
Rows Removed by Index Recheck: 763815
Heap Blocks: exact=10860 lossy=6075
-> BitmapOr (cost=2239.63..2239.63 rows=255705 width=0) (actual time=5917.966..5917.966 rows=0 loops=1)
-> Bitmap Index Scan on idx_trgm_contacts_first_name (cost=0.00..1291.57 rows=156527 width=0) (actual time=2972.404..2972.404 rows=4015039 loops=1)
Index Cond: ((first_name)::text ~~* '%as%'::text)
-> Bitmap Index Scan on idx_trgm_contacts_last_name (cost=0.00..822.14 rows=99177 width=0) (actual time=2945.560..2945.560 rows=4015038 loops=1)
Index Cond: ((last_name)::text ~~* '%as%'::text)
-> Nested Loop (cost=81.96..384.33 rows=402 width=8) (actual time=6213.028..6213.028 rows=0 loops=1)
-> Unique (cost=81.52..83.53 rows=402 width=8) (actual time=6213.027..6213.027 rows=0 loops=1)
-> Sort (cost=81.52..82.52 rows=402 width=8) (actual time=6213.027..6213.027 rows=0 loops=1)
Sort Key: ((NULLIF((phone_numbers.id)::text, ''::text))::integer)
Sort Method: quicksort Memory: 25kB
-> Index Scan using idx_trgm_phone_value on phone_numbers (cost=0.41..64.13 rows=402 width=8) (actual time=6213.006..6213.006 rows=0 loops=1)
Index Cond: ((value)::text ~~* '%as%'::text)
Rows Removed by Index Recheck: 4015921
-> Index Scan using index_contacts_on_phone_number_id on contacts contacts_1 (cost=0.44..0.70 rows=1 width=16) (never executed)
Index Cond: (phone_number_id = (NULLIF((phone_numbers.id)::text, ''::text))::integer)
-> Gather (cost=10.36..75794.22 rows=1003762 width=8) (actual time=25.691..25.709 rows=0 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Hash Join (cost=0.36..74780.46 rows=250940 width=8) (actual time=2.653..2.653 rows=0 loops=5)
Hash Cond: (contacts_2.company_id = companies.id)
-> Parallel Seq Scan on contacts contacts_2 (cost=0.00..59315.91 rows=1003762 width=16) (actual time=0.244..0.244 rows=1 loops=5)
-> Hash (cost=0.31..0.31 rows=1 width=8) (actual time=0.244..0.244 rows=0 loops=5)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Seq Scan on companies (cost=0.00..0.31 rows=1 width=8) (actual time=0.244..0.244 rows=0 loops=5)
Filter: ((name)::text ~~* '%as%'::text)
Rows Removed by Filter: 4
Planning Time: 1.458 ms
Execution Time: 13236.949 ms
I also tried the below:
SELECT Count(id) AS id
FROM (
    SELECT contacts.id AS id
    FROM contacts
    WHERE (substring(LOWER(contacts.first_name), position('as' in LOWER(first_name)), 2) = 'as')
       OR (substring(LOWER(contacts.last_name), position('as' in LOWER(last_name)), 2) = 'as')
    UNION
    SELECT contacts.id AS id
    FROM contacts
    WHERE contacts.phone_number_id IN (
        SELECT NULLIF(CAST(phone_numbers.id AS text), '')::int AS phone_number_id
        FROM phone_numbers
        WHERE (substring(LOWER(phone_numbers.value), position('as' in LOWER(phone_numbers.value)), 2) = 'as'))
    UNION
    SELECT contacts.id AS id
    FROM contacts
    WHERE contacts.company_id IN (
        SELECT companies.id AS company_id
        FROM companies
        WHERE (substring(LOWER(companies.name), position('as' in LOWER(companies.name)), 2) = 'as'))
) AS ID
Aggregate (cost=508646.88..508646.93 rows=1 width=8) (actual time=1455.892..1455.995 rows=1 loops=1)
-> Unique (cost=447473.09..452792.55 rows=1063892 width=8) (actual time=1431.464..1450.434 rows=195963 loops=1)
-> Sort (cost=447473.09..450132.82 rows=1063892 width=8) (actual time=1431.464..1439.267 rows=195963 loops=1)
Sort Key: contacts.id
Sort Method: external merge Disk: 3472kB
-> Append (cost=10.00..340141.41 rows=1063892 width=8) (actual time=0.391..1370.557 rows=195963 loops=1)
-> Gather (cost=10.00..84460.02 rows=40050 width=8) (actual time=0.391..983.457 rows=195963 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Parallel Seq Scan on contacts (cost=0.00..84409.97 rows=10012 width=8) (actual time=1.696..987.285 rows=39193 loops=5)
Filter: (("substring"(lower((first_name)::text), "position"(lower((first_name)::text), 'as'::text), 2) = 'as'::text) OR ("substring"(lower((last_name)::text), "position"(lower((last_name)::text), 'as'::text), 2) = 'as'::text))
Rows Removed by Filter: 763817
-> Nested Loop (cost=85188.17..100095.23 rows=20080 width=8) (actual time=364.076..364.125 rows=0 loops=1)
-> HashAggregate (cost=85187.73..86191.73 rows=20080 width=8) (actual time=364.074..364.123 rows=0 loops=1)
Group Key: (NULLIF((phone_numbers.id)::text, ''::text))::integer
Batches: 1 Memory Usage: 793kB
-> Gather (cost=10.00..85137.53 rows=20080 width=8) (actual time=363.976..364.025 rows=0 loops=1)
Workers Planned: 3
Workers Launched: 3
-> Parallel Seq Scan on phone_numbers (cost=0.00..85107.45 rows=6477 width=8) (actual time=357.030..357.031 rows=0 loops=4)
Filter: ("substring"(lower((value)::text), "position"(lower((value)::text), 'as'::text), 2) = 'as'::text)
Rows Removed by Filter: 1003980
-> Index Scan using index_contacts_on_phone_number_id on contacts contacts_1 (cost=0.44..0.64 rows=1 width=16) (never executed)
Index Cond: (phone_number_id = (NULLIF((phone_numbers.id)::text, ''::text))::integer)
-> Gather (cost=10.40..75794.26 rows=1003762 width=8) (actual time=6.889..6.910 rows=0 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Hash Join (cost=0.40..74780.50 rows=250940 width=8) (actual time=0.138..0.139 rows=0 loops=5)
Hash Cond: (contacts_2.company_id = companies.id)
-> Parallel Seq Scan on contacts contacts_2 (cost=0.00..59315.91 rows=1003762 width=16) (actual time=0.004..0.004 rows=1 loops=5)
-> Hash (cost=0.35..0.35 rows=1 width=8) (actual time=0.081..0.081 rows=0 loops=5)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Seq Scan on companies (cost=0.00..0.35 rows=1 width=8) (actual time=0.081..0.081 rows=0 loops=5)
Filter: ("substring"(lower((name)::text), "position"(lower((name)::text), 'as'::text), 2) = 'as'::text)
Rows Removed by Filter: 4
Planning Time: 0.927 ms
Execution Time: 1456.742 ms
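For reference, the trigram indexes that the plans above refer to (idx_trgm_contacts_first_name, idx_trgm_phone_value, ...) are typically created with the pg_trgm extension, along these lines:
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_trgm_contacts_first_name ON contacts USING gin (first_name gin_trgm_ops);
CREATE INDEX idx_trgm_phone_value ON phone_numbers USING gin (value gin_trgm_ops);
Note that a two-character pattern such as '%as%' cannot form a complete trigram, which is likely why the bitmap index scans above match ~4 million rows and rely on rechecks.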

When ORDER BY is performed on values aggregated by COUNT, it takes time to issue the query

This query tries to retrieve videos in order of the number of tags they share with a specific video.
The following query takes about 800 ms, even though the index appears to be used.
If I remove COUNT, GROUP BY, and ORDER BY from the SQL query, it runs super fast (1-5 ms).
In such a case, will improving the SQL query alone not speed up the process? Do I need to use a MATERIALIZED VIEW?
SELECT "videos_video"."id",
"videos_video"."title",
"videos_video"."thumbnail_url",
"videos_video"."preview_url",
"videos_video"."embed_url",
"videos_video"."duration",
"videos_video"."views",
"videos_video"."is_public",
"videos_video"."published_at",
"videos_video"."created_at",
"videos_video"."updated_at",
COUNT("videos_video"."id") AS "n"
FROM "videos_video"
INNER JOIN "videos_video_tags" ON ("videos_video"."id" = "videos_video_tags"."video_id")
WHERE ("videos_video_tags"."tag_id" IN
(SELECT U0."id"
FROM "videos_tag" U0
INNER JOIN "videos_video_tags" U1 ON (U0."id" = U1."tag_id")
WHERE U1."video_id" = '748b1814-f311-48da-a1f5-6bf8fe229c7f'))
GROUP BY "videos_video"."id"
ORDER BY "n" DESC
LIMIT 20;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1040.69..1040.74 rows=20 width=24) (actual time=738.648..738.654 rows=20 loops=1)
-> Sort (cost=1040.69..1044.29 rows=1441 width=24) (actual time=738.646..738.650 rows=20 loops=1)
Sort Key: (count(videos_video.id)) DESC
Sort Method: top-N heapsort Memory: 27kB
-> HashAggregate (cost=987.93..1002.34 rows=1441 width=24) (actual time=671.006..714.322 rows=188818 loops=1)
Group Key: videos_video.id
Batches: 1 Memory Usage: 28689kB
-> Nested Loop (cost=35.20..980.73 rows=1441 width=16) (actual time=0.341..559.034 rows=240293 loops=1)
-> Nested Loop (cost=34.78..340.88 rows=1441 width=16) (actual time=0.278..92.806 rows=240293 loops=1)
-> HashAggregate (cost=34.35..34.41 rows=6 width=32) (actual time=0.188..0.200 rows=4 loops=1)
Group Key: u0.id
Batches: 1 Memory Usage: 24kB
-> Nested Loop (cost=0.71..34.33 rows=6 width=32) (actual time=0.161..0.185 rows=4 loops=1)
-> Index Only Scan using videos_video_tags_video_id_tag_id_f8d6ba70_uniq on videos_video_tags u1 (cost=0.43..4.53 rows=6 width=16) (actual time=0.039..0.040 rows=4 loops=1)
Index Cond: (video_id = '748b1814-f311-48da-a1f5-6bf8fe229c7f'::uuid)
Heap Fetches: 0
-> Index Only Scan using videos_tag_pkey on videos_tag u0 (cost=0.28..4.97 rows=1 width=16) (actual time=0.035..0.035 rows=1 loops=4)
Index Cond: (id = u1.tag_id)
Heap Fetches: 0
-> Index Scan using videos_video_tags_tag_id_2673cfc8 on videos_video_tags (cost=0.43..35.90 rows=1518 width=32) (actual time=0.029..16.728 rows=60073 loops=4)
Index Cond: (tag_id = u0.id)
-> Index Only Scan using videos_video_pkey on videos_video (cost=0.42..0.44 rows=1 width=16) (actual time=0.002..0.002 rows=1 loops=240293)
Index Cond: (id = videos_video_tags.video_id)
Heap Fetches: 46
Planning Time: 1.980 ms
Execution Time: 739.446 ms
(26 rows)
Time: 742.145 ms
---------- Results of the execution plan for the query as answered by Edouard. ----------
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=30043.90..30212.53 rows=20 width=746) (actual time=239.142..239.219 rows=20 loops=1)
-> Limit (cost=30043.48..30043.53 rows=20 width=24) (actual time=239.089..239.093 rows=20 loops=1)
-> Sort (cost=30043.48..30607.15 rows=225467 width=24) (actual time=239.087..239.090 rows=20 loops=1)
Sort Key: (count(*)) DESC
Sort Method: top-N heapsort Memory: 26kB
-> HashAggregate (cost=21789.21..24043.88 rows=225467 width=24) (actual time=185.710..219.211 rows=188818 loops=1)
Group Key: vt.video_id
Batches: 1 Memory Usage: 22545kB
-> Nested Loop (cost=20.62..20187.24 rows=320395 width=16) (actual time=4.975..106.839 rows=240293 loops=1)
-> Index Only Scan using videos_video_tags_video_id_tag_id_f8d6ba70_uniq on videos_video_tags vvt (cost=0.43..4.53 rows=6 width=16) (actual time=0.033..0.043 rows=4 loops=1)
Index Cond: (video_id = '748b1814-f311-48da-a1f5-6bf8fe229c7f'::uuid)
Heap Fetches: 0
-> Bitmap Heap Scan on videos_video_tags vt (cost=20.19..3348.60 rows=1518 width=32) (actual time=4.311..20.663 rows=60073 loops=4)
Recheck Cond: (tag_id = vvt.tag_id)
Heap Blocks: exact=34757
-> Bitmap Index Scan on videos_video_tags_tag_id_2673cfc8 (cost=0.00..19.81 rows=1518 width=0) (actual time=3.017..3.017 rows=60073 loops=4)
Index Cond: (tag_id = vvt.tag_id)
-> Index Scan using videos_video_pkey on videos_video v (cost=0.42..8.44 rows=1 width=738) (actual time=0.005..0.005 rows=1 loops=20)
Index Cond: (id = vt.video_id)
Planning Time: 0.854 ms
Execution Time: 241.392 ms
(21 rows)
Time: 242.909 ms
Below are some ideas to simplify the query; an EXPLAIN ANALYZE will then confirm the potential impact on query performance.
Starting from the subquery:
SELECT U0."id"
FROM "videos_tag" U0
INNER JOIN "videos_video_tags" U1 ON (U0."id" = U1."tag_id")
WHERE U1."video_id" = '748b1814-f311-48da-a1f5-6bf8fe229c7f'
According to the JOIN clause, U0."id" = U1."tag_id", so SELECT U0."id" can be replaced by SELECT U1."tag_id".
In this case the table "videos_tag" U0 is no longer used in the subquery, which can be simplified to:
SELECT U1."tag_id"
FROM "videos_video_tags" U1
WHERE U1."video_id" = '748b1814-f311-48da-a1f5-6bf8fe229c7f'
And the WHERE clause of the main query becomes:
WHERE "videos_video_tags"."tag_id" IN
( SELECT U1."tag_id"
FROM "videos_video_tags" U1
WHERE U1."video_id" = '748b1814-f311-48da-a1f5-6bf8fe229c7f'
)
which can be transformed into a self-join on the table "videos_video_tags", added to the FROM clause of the main query:
FROM "videos_video" AS v
INNER JOIN "videos_video_tags" AS vt
ON v."id" = vt."video_id"
INNER JOIN "videos_video_tags" AS vvt
ON vvt."tag_id" = vt."tag_id"
WHERE vvt."video_id" = '748b1814-f311-48da-a1f5-6bf8fe229c7f'
Finally, according to the JOIN clause between the two tables, GROUP BY "videos_video"."id" can be replaced by GROUP BY "videos_video_tags"."video_id". This new GROUP BY clause, together with the ORDER BY and LIMIT clauses, can then be applied in a subquery that involves only the table "videos_video_tags", before joining with the table "videos_video":
SELECT v."id",
v."title",
v."thumbnail_url",
v."preview_url",
v."embed_url",
v."duration",
v."views",
v."is_public",
v."published_at",
v."created_at",
v."updated_at",
w."n"
FROM "videos_video" AS v
INNER JOIN
( SELECT vt."video_id"
, count(*) AS "n"
FROM "videos_video_tags" AS vt
INNER JOIN "videos_video_tags" AS vvt
ON vvt."tag_id" = vt."tag_id"
WHERE vvt."video_id" = '748b1814-f311-48da-a1f5-6bf8fe229c7f'
GROUP BY vt."video_id"
ORDER BY "n" DESC
LIMIT 20
) AS w
ON v."id" = w."video_id"

Postgresql Query Slows Inexplicably with Addition of WHERE Constraint

I have the following PostgreSQL query, which contains a few subqueries. The
query runs almost instantly until I add the WHERE lb.type = 'Marketing'
constraint, which causes it to take about 3 minutes. I find it inexplicable that
the addition of such a simple constraint causes such an extreme slowdown, but
I'm guessing it must point to a fundamental flaw in my approach.
I'm hoping for assistance on a few fronts:
Is my use of subqueries to select the latest records from specific tables appropriate, or could it cause performance issues?
What should I be looking for in the execution plan when trying to diagnose issues?
How should I go about determining what indexes to create for complex queries such as these?
Why could the additional WHERE constraint be causing such a massive slowdown?
The table structure is as follows:
CREATE TABLE sales.leads
(
lead_id integer NOT NULL DEFAULT nextval('sales.leads_lead_id_seq'::regclass),
batch_id integer,
expired integer NOT NULL DEFAULT 0,
closed integer NOT NULL DEFAULT 0,
merged integer NOT NULL DEFAULT 0,
CONSTRAINT leads_pkey PRIMARY KEY (lead_id)
)
CREATE TABLE sales.lead_batches
(
batch_id integer NOT NULL DEFAULT nextval('sales.lead_batches_batch_id_seq'::regclass),
inserted_datetime timestamp without time zone,
type character varying(100) COLLATE pg_catalog."default",
uploaded smallint NOT NULL DEFAULT '0'::smallint,
CONSTRAINT lead_batches_pkey PRIMARY KEY (batch_id)
)
CREATE TABLE sales.lead_results
(
lead_result_id integer NOT NULL DEFAULT nextval('sales.lead_results_lead_result_id_seq'::regclass),
lead_id integer,
assigned_datetime timestamp without time zone NOT NULL,
user_id character varying(255) COLLATE pg_catalog."default" NOT NULL,
resulted_datetime timestamp without time zone,
result character varying(255) COLLATE pg_catalog."default",
CONSTRAINT lead_results_pkey PRIMARY KEY (lead_result_id)
)
CREATE TABLE sales.personal_details
(
lead_id integer,
title character varying(50) COLLATE pg_catalog."default",
first_name character varying(100) COLLATE pg_catalog."default",
surname character varying(255) COLLATE pg_catalog."default",
email_address character varying(100) COLLATE pg_catalog."default",
updated_date date NOT NULL
)
CREATE TABLE sales.users
(
user_id character varying(50) COLLATE pg_catalog."default" NOT NULL,
surname character varying(255) COLLATE pg_catalog."default",
name character varying(255) COLLATE pg_catalog."default"
)
Query:
SELECT l.*, pd.*, lr.resulted_datetime, u.name
FROM sales.leads l
INNER JOIN sales.lead_batches lb ON l.batch_id = lb.batch_id
LEFT JOIN (
SELECT pd_sub.*
FROM sales.personal_details pd_sub
INNER JOIN (
SELECT lead_id, MAX(updated_date) AS updated_date
FROM sales.personal_details
GROUP BY lead_id
) sub ON pd_sub.lead_id = sub.lead_id AND pd_sub.updated_date = sub.updated_date
) pd ON l.lead_id = pd.lead_id
LEFT JOIN (
SELECT lr_sub.*
FROM sales.lead_results lr_sub
INNER JOIN (
SELECT lead_id, MAX(resulted_datetime) AS resulted_datetime
FROM sales.lead_results
GROUP BY lead_id) sub
ON lr_sub.lead_id = sub.lead_id AND lr_sub.resulted_datetime = sub.resulted_datetime
) lr ON l.lead_id = lr.lead_id
LEFT JOIN sales.users u ON u.user_id = lr.user_id
WHERE lb.type = 'Marketing'
AND lb.uploaded = 1
AND l.merged = 0
Execution plan:
Nested Loop Left Join (cost=10485.51..17604.18 rows=34 width=158) (actual time=717.862..168709.593 rows=18001 loops=1)
Join Filter: (l.lead_id = pd_sub.lead_id)
Rows Removed by Join Filter: 687818215
-> Nested Loop Left Join (cost=6487.82..12478.42 rows=34 width=135) (actual time=658.141..64951.950 rows=18001 loops=1)
Join Filter: (l.lead_id = lr_sub.lead_id)
Rows Removed by Join Filter: 435482960
-> Hash Join (cost=131.01..1816.10 rows=34 width=60) (actual time=1.948..126.067 rows=17998 loops=1)
Hash Cond: (l.batch_id = lb.batch_id)
-> Seq Scan on leads l (cost=0.00..1273.62 rows=32597 width=44) (actual time=0.032..69.763 rows=32621 loops=1)
Filter: (merged = 0)
Rows Removed by Filter: 5595
-> Hash (cost=130.96..130.96 rows=4 width=20) (actual time=1.894..1.894 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on lead_batches lb (cost=0.00..130.96 rows=4 width=20) (actual time=1.078..1.884 rows=4 loops=1)
Filter: (((type)::text = 'Marketing'::text) AND (uploaded = 1))
Rows Removed by Filter: 3866
-> Materialize (cost=6356.81..10661.81 rows=1 width=79) (actual time=0.006..1.362 rows=24197 loops=17998)
-> Nested Loop Left Join (cost=6356.81..10661.81 rows=1 width=79) (actual time=96.246..633.701 rows=24197 loops=1)
Join Filter: ((u.user_id)::text = (lr_sub.user_id)::text)
Rows Removed by Join Filter: 1742184
-> Gather (cost=6356.81..10659.19 rows=1 width=72) (actual time=96.203..202.086 rows=24197 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=5356.81..9659.09 rows=1 width=72) (actual time=134.595..166.341 rows=8066 loops=3)
Hash Cond: ((lr_sub.lead_id = lead_results.lead_id) AND (lr_sub.resulted_datetime = (max(lead_results.resulted_datetime))))
-> Parallel Seq Scan on lead_results lr_sub (cost=0.00..3622.05 rows=44605 width=72) (actual time=0.033..17.333 rows=35684 loops=3)
-> Hash (cost=5110.36..5110.36 rows=16430 width=12) (actual time=134.260..134.260 rows=24194 loops=3)
Buckets: 32768 Batches: 1 Memory Usage: 1391kB
-> HashAggregate (cost=4781.76..4946.06 rows=16430 width=12) (actual time=122.823..129.022 rows=24204 loops=3)
Group Key: lead_results.lead_id
-> Seq Scan on lead_results (cost=0.00..4246.51 rows=107051 width=12) (actual time=0.020..71.768 rows=107051 loops=3)
-> Seq Scan on users u (cost=0.00..1.72 rows=72 width=23) (actual time=0.002..0.007 rows=73 loops=24197)
-> Materialize (cost=3997.68..5030.85 rows=187 width=31) (actual time=0.003..2.033 rows=38211 loops=18001)
-> Hash Join (cost=3997.68..5029.92 rows=187 width=31) (actual time=52.802..85.774 rows=38211 loops=1)
Hash Cond: ((personal_details.lead_id = pd_sub.lead_id) AND ((max(personal_details.updated_date)) = pd_sub.updated_date))
-> HashAggregate (cost=1811.38..2186.06 rows=37468 width=8) (actual time=23.330..35.345 rows=38212 loops=1)
Group Key: personal_details.lead_id
-> Seq Scan on personal_details (cost=0.00..1623.92 rows=37492 width=8) (actual time=0.014..4.636 rows=38232 loops=1)
-> Hash (cost=1623.92..1623.92 rows=37492 width=35) (actual time=29.058..29.058 rows=38211 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2809kB
-> Seq Scan on personal_details pd_sub (cost=0.00..1623.92 rows=37492 width=35) (actual time=0.026..17.231 rows=38232 loops=1)
Planning time: 1.966 ms
Execution time: 168731.769 ms
I have an index on lead_id on all tables, and an additional index on (type, uploaded) in lead_batches.
Thanks very much in advance for any assistance!
EDIT:
The execution plan without the additional WHERE constraint:
Hash Left Join (cost=15861.46..17780.37 rows=30972 width=158) (actual time=765.076..844.512 rows=32053 loops=1)
Hash Cond: (l.lead_id = pd_sub.lead_id)
-> Hash Left Join (cost=10829.21..12630.45 rows=30972 width=135) (actual time=667.460..724.297 rows=32053 loops=1)
Hash Cond: (l.lead_id = lr_sub.lead_id)
-> Hash Join (cost=167.39..1852.48 rows=30972 width=60) (actual time=2.579..36.683 rows=32050 loops=1)
Hash Cond: (l.batch_id = lb.batch_id)
-> Seq Scan on leads l (cost=0.00..1273.62 rows=32597 width=44) (actual time=0.034..22.166 rows=32623 loops=1)
Filter: (merged = 0)
Rows Removed by Filter: 5595
-> Hash (cost=121.40..121.40 rows=3679 width=20) (actual time=2.503..2.503 rows=3679 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 234kB
-> Seq Scan on lead_batches lb (cost=0.00..121.40 rows=3679 width=20) (actual time=0.011..1.809 rows=3679 loops=1)
Filter: (uploaded = 1)
Rows Removed by Filter: 193
-> Hash (cost=10661.81..10661.81 rows=1 width=79) (actual time=664.855..664.855 rows=24197 loops=1)
Buckets: 32768 (originally 1024) Batches: 1 (originally 1) Memory Usage: 2821kB
-> Nested Loop Left Join (cost=6356.81..10661.81 rows=1 width=79) (actual time=142.634..647.146 rows=24197 loops=1)
Join Filter: ((u.user_id)::text = (lr_sub.user_id)::text)
Rows Removed by Join Filter: 1742184
-> Gather (cost=6356.81..10659.19 rows=1 width=72) (actual time=142.590..241.913 rows=24197 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=5356.81..9659.09 rows=1 width=72) (actual time=141.250..171.403 rows=8066 loops=3)
Hash Cond: ((lr_sub.lead_id = lead_results.lead_id) AND (lr_sub.resulted_datetime = (max(lead_results.resulted_datetime))))
-> Parallel Seq Scan on lead_results lr_sub (cost=0.00..3622.05 rows=44605 width=72) (actual time=0.027..15.322 rows=35684 loops=3)
-> Hash (cost=5110.36..5110.36 rows=16430 width=12) (actual time=140.917..140.917 rows=24194 loops=3)
Buckets: 32768 Batches: 1 Memory Usage: 1391kB
-> HashAggregate (cost=4781.76..4946.06 rows=16430 width=12) (actual time=127.911..135.076 rows=24204 loops=3)
Group Key: lead_results.lead_id
-> Seq Scan on lead_results (cost=0.00..4246.51 rows=107051 width=12) (actual time=0.020..74.626 rows=107051 loops=3)
-> Seq Scan on users u (cost=0.00..1.72 rows=72 width=23) (actual time=0.002..0.006 rows=73 loops=24197)
-> Hash (cost=5029.92..5029.92 rows=187 width=31) (actual time=97.561..97.561 rows=38213 loops=1)
Buckets: 65536 (originally 1024) Batches: 1 (originally 1) Memory Usage: 2660kB
-> Hash Join (cost=3997.68..5029.92 rows=187 width=31) (actual time=52.712..85.099 rows=38213 loops=1)
Hash Cond: ((personal_details.lead_id = pd_sub.lead_id) AND ((max(personal_details.updated_date)) = pd_sub.updated_date))
-> HashAggregate (cost=1811.38..2186.06 rows=37468 width=8) (actual time=23.831..35.015 rows=38214 loops=1)
Group Key: personal_details.lead_id
-> Seq Scan on personal_details (cost=0.00..1623.92 rows=37492 width=8) (actual time=0.012..4.995 rows=38234 loops=1)
-> Hash (cost=1623.92..1623.92 rows=37492 width=35) (actual time=28.468..28.468 rows=38213 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2809kB
-> Seq Scan on personal_details pd_sub (cost=0.00..1623.92 rows=37492 width=35) (actual time=0.024..17.089 rows=38234 loops=1)
Planning time: 2.058 ms
Execution time: 849.460 ms
The execution plan with nested_loops disabled:
Hash Left Join (cost=13088.17..17390.71 rows=34 width=158) (actual time=277.646..343.924 rows=18001 loops=1)
Hash Cond: (l.lead_id = pd_sub.lead_id)
-> Hash Right Join (cost=8055.91..12358.31 rows=34 width=135) (actual time=181.614..238.365 rows=18001 loops=1)
Hash Cond: (lr_sub.lead_id = l.lead_id)
-> Hash Left Join (cost=6359.43..10661.82 rows=1 width=79) (actual time=156.498..201.533 rows=24197 loops=1)
Hash Cond: ((lr_sub.user_id)::text = (u.user_id)::text)
-> Gather (cost=6356.81..10659.19 rows=1 width=72) (actual time=156.415..190.934 rows=24197 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=5356.81..9659.09 rows=1 width=72) (actual time=143.387..178.653 rows=8066 loops=3)
Hash Cond: ((lr_sub.lead_id = lead_results.lead_id) AND (lr_sub.resulted_datetime = (max(lead_results.resulted_datetime))))
-> Parallel Seq Scan on lead_results lr_sub (cost=0.00..3622.05 rows=44605 width=72) (actual time=0.036..22.404 rows=35684 loops=3)
-> Hash (cost=5110.36..5110.36 rows=16430 width=12) (actual time=143.052..143.052 rows=24194 loops=3)
Buckets: 32768 Batches: 1 Memory Usage: 1391kB
-> HashAggregate (cost=4781.76..4946.06 rows=16430 width=12) (actual time=131.793..137.760 rows=24204 loops=3)
Group Key: lead_results.lead_id
-> Seq Scan on lead_results (cost=0.00..4246.51 rows=107051 width=12) (actual time=0.023..78.918 rows=107051 loops=3)
-> Hash (cost=1.72..1.72 rows=72 width=23) (actual time=0.061..0.061 rows=73 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on users u (cost=0.00..1.72 rows=72 width=23) (actual time=0.031..0.039 rows=73 loops=1)
-> Hash (cost=1696.05..1696.05 rows=34 width=60) (actual time=25.068..25.068 rows=17998 loops=1)
Buckets: 32768 (originally 1024) Batches: 1 (originally 1) Memory Usage: 2084kB
-> Hash Join (cost=10.96..1696.05 rows=34 width=60) (actual time=0.208..18.630 rows=17998 loops=1)
Hash Cond: (l.batch_id = lb.batch_id)
-> Seq Scan on leads l (cost=0.00..1273.62 rows=32597 width=44) (actual time=0.043..13.065 rows=32623 loops=1)
Filter: (merged = 0)
Rows Removed by Filter: 5595
-> Hash (cost=10.91..10.91 rows=4 width=20) (actual time=0.137..0.137 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Index Scan using lead_batches_type_idx on lead_batches lb (cost=0.28..10.91 rows=4 width=20) (actual time=0.091..0.129 rows=4 loops=1)
Index Cond: ((type)::text = 'Marketing'::text)
Filter: (uploaded = 1)
-> Hash (cost=5029.92..5029.92 rows=187 width=31) (actual time=96.005..96.005 rows=38213 loops=1)
Buckets: 65536 (originally 1024) Batches: 1 (originally 1) Memory Usage: 2660kB
-> Hash Join (cost=3997.68..5029.92 rows=187 width=31) (actual time=52.166..84.592 rows=38213 loops=1)
Hash Cond: ((personal_details.lead_id = pd_sub.lead_id) AND ((max(personal_details.updated_date)) = pd_sub.updated_date))
-> HashAggregate (cost=1811.38..2186.06 rows=37468 width=8) (actual time=23.785..34.403 rows=38214 loops=1)
Group Key: personal_details.lead_id
-> Seq Scan on personal_details (cost=0.00..1623.92 rows=37492 width=8) (actual time=0.013..4.680 rows=38234 loops=1)
-> Hash (cost=1623.92..1623.92 rows=37492 width=35) (actual time=27.960..27.960 rows=38213 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2809kB
-> Seq Scan on personal_details pd_sub (cost=0.00..1623.92 rows=37492 width=35) (actual time=0.019..15.350 rows=38234 loops=1)
Planning time: 2.469 ms
Execution time: 346.590 ms
You are basically missing some important indexes here.
For testing improvements I've set up the tables myself and tried to fill them with test data whose distribution is similar to what the explain plans show.
My baseline performance was ~160 seconds: https://explain.depesz.com/s/WlKO
The first thing I did was create indexes for the foreign key references (although not all of them will be necessary):
CREATE INDEX idx_personal_details_leads ON sales.personal_details (lead_id);
CREATE INDEX idx_leads_batches ON sales.leads (batch_id);
CREATE INDEX idx_lead_results_users ON sales.lead_results (user_id);
That brought us down to ~112 seconds: https://explain.depesz.com/s/aRcf
Now, most of the time is actually spent on the self-joins (table personal_details using the latest updated_date and table lead_results using the latest resulted_datetime). Based on this, I came up with the following two indexes:
CREATE INDEX idx_personal_details_updated ON sales.personal_details (lead_id, updated_date DESC);
CREATE INDEX idx_lead_results_resulted ON sales.lead_results (lead_id, resulted_datetime DESC);
...which then immediately brings us down to ~110 milliseconds: https://explain.depesz.com/s/dDfk
Debugging help
What helped me figure out which indexes were most effective: I first rewrote the query to eliminate the sub-selects and instead used a dedicated CTE for each of them:
WITH
leads_update_latest AS (
SELECT lead_id, MAX(updated_date) AS updated_date
FROM sales.personal_details
GROUP BY lead_id
),
pd AS (
SELECT pd_sub.*
FROM sales.personal_details pd_sub
INNER JOIN leads_update_latest sub ON (pd_sub.lead_id = sub.lead_id AND pd_sub.updated_date = sub.updated_date)
),
leads_result_latest AS (
SELECT lead_id, MAX(resulted_datetime) AS resulted_datetime
FROM sales.lead_results
GROUP BY lead_id
),
lr AS (
SELECT lr_sub.*
FROM sales.lead_results lr_sub
INNER JOIN leads_result_latest sub ON (lr_sub.lead_id = sub.lead_id AND lr_sub.resulted_datetime = sub.resulted_datetime)
),
leads AS (
SELECT l.*
FROM sales.leads l
INNER JOIN sales.lead_batches lb ON (l.batch_id = lb.batch_id)
WHERE lb.type = 'Marketing'
AND lb.uploaded = 1
AND l.merged = 0
)
SELECT l.*, pd.*, lr.resulted_datetime, u.name
FROM leads l
LEFT JOIN pd ON l.lead_id = pd.lead_id
LEFT JOIN lr ON l.lead_id = lr.lead_id
LEFT JOIN sales.users u ON u.user_id = lr.user_id
;
Surprisingly, just by rewriting the query into CTEs, the PostgreSQL planner was way faster and took only ~2.3 seconds without any of the indexes: https://explain.depesz.com/s/lqzq (likely because CTEs act as optimization fences in PostgreSQL before version 12, so each one is planned and materialized exactly once).
...with optimization:
the FK indexes bring it down to ~230 milliseconds: https://explain.depesz.com/s/a6wT
However, with the other combined indexes, the CTE version degraded:
the combined reverse indexes bring it up to ~270 milliseconds: https://explain.depesz.com/s/TNNm
However, while these combined indexes speed up the original query a lot, they also grow much faster than single-column indexes, and they are an additional write cost to account for with regard to DB scalability.
As a result, it might make sense to go with the CTE version: it performs a bit slower, but fast enough, and lets you omit two additional indexes that the DB would otherwise have to maintain.

Different Explain on same Query

I have created an index on the events table (which has over 4 million records) on the column derived_tstamp:
CREATE INDEX derived_tstamp_date_index ON atomic.events ( date(derived_tstamp) );
When I run the query with two different values for domain_userid, I get different EXPLAIN results. Query 1 uses the index, but Query 2 does not. How can I make sure the index is used all the time for faster results?
Query 1:
EXPLAIN ANALYZE SELECT
SUM(duration) as "total_time_spent"
FROM (
SELECT
domain_sessionidx,
MIN(derived_tstamp) as "start_time",
MAX(derived_tstamp) as "finish_time",
MAX(derived_tstamp) - min(derived_tstamp) as "duration"
FROM "atomic".events
WHERE date(derived_tstamp) >= date('2017-07-01') AND date(derived_tstamp) <= date('2017-08-02') AND domain_userid = 'd01ee409-ebff-4f37-bc97-9bbda45a7225'
GROUP BY 1
) v;
Explain of query 1
Aggregate (cost=1834.00..1834.01 rows=1 width=16) (actual time=138.619..138.619 rows=1 loops=1)
-> GroupAggregate (cost=1830.83..1832.93 rows=85 width=34) (actual time=137.096..138.563 rows=186 loops=1)
Group Key: events.domain_sessionidx
-> Sort (cost=1830.83..1831.09 rows=104 width=10) (actual time=137.063..137.681 rows=2726 loops=1)
Sort Key: events.domain_sessionidx
Sort Method: quicksort Memory: 224kB
-> Bitmap Heap Scan on events (cost=1412.95..1827.35 rows=104 width=10) (actual time=108.764..136.053 rows=2726 loops=1)
Recheck Cond: ((date(derived_tstamp) >= '2017-07-01'::date) AND (date(derived_tstamp) <= '2017-08-02'::date) AND ((domain_userid)::text = 'd01ee409-ebff-4f37-bc97-9bbda45a7225'::text))
Rows Removed by Index Recheck: 19704
Heap Blocks: exact=466 lossy=3331
-> BitmapAnd (cost=1412.95..1412.95 rows=104 width=0) (actual time=108.474..108.474 rows=0 loops=1)
-> Bitmap Index Scan on derived_tstamp_date_index (cost=0.00..448.34 rows=21191 width=0) (actual time=94.371..94.371 rows=818461 loops=1)
Index Cond: ((date(derived_tstamp) >= '2017-07-01'::date) AND (date(derived_tstamp) <= '2017-08-02'::date))
-> Bitmap Index Scan on events_domain_userid_index (cost=0.00..964.31 rows=20767 width=0) (actual time=3.044..3.044 rows=16834 loops=1)
Index Cond: ((domain_userid)::text = 'd01ee409-ebff-4f37-bc97-9bbda45a7225'::text)
Planning time: 0.166 ms
Query 2:
EXPLAIN ANALYZE SELECT
SUM(duration) as "total_time_spent"
FROM (
SELECT
domain_sessionidx,
MIN(derived_tstamp) as "start_time",
MAX(derived_tstamp) as "finish_time",
MAX(derived_tstamp) - min(derived_tstamp) as "duration"
FROM "atomic".events
WHERE date(derived_tstamp) >= date('2017-07-01') AND date(derived_tstamp) <= date('2017-08-02') AND domain_userid = 'e4c94f3e-9841-4b65-9031-ca4aa03809e7'
GROUP BY 1
) v;
Explain of query 2:
Aggregate (cost=226.12..226.13 rows=1 width=16) (actual time=0.402..0.402 rows=1 loops=1)
-> GroupAggregate (cost=226.08..226.10 rows=1 width=34) (actual time=0.394..0.397 rows=2 loops=1)
Group Key: events.domain_sessionidx
-> Sort (cost=226.08..226.08 rows=1 width=10) (actual time=0.381..0.386 rows=13 loops=1)
Sort Key: events.domain_sessionidx
Sort Method: quicksort Memory: 25kB
-> Index Scan using events_domain_userid_index on events (cost=0.56..226.07 rows=1 width=10) (actual time=0.030..0.368 rows=13 loops=1)
Index Cond: ((domain_userid)::text = 'e4c94f3e-9841-4b65-9031-ca4aa03809e7'::text)
Filter: ((date(derived_tstamp) >= '2017-07-01'::date) AND (date(derived_tstamp) <= '2017-08-02'::date))
Rows Removed by Filter: 184
Planning time: 0.162 ms
Execution time: 0.440 ms
The index is not used in the second case because there are so few rows matching the condition domain_userid = 'e4c94f3e-9841-4b65-9031-ca4aa03809e7' (only 197: the 13 rows returned plus the 184 removed by the filter) that it is cheaper to filter those rows than to perform a bitmap index scan using your new index.
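You can see the difference in selectivity directly with a quick count over the two IDs from the question:
SELECT domain_userid, count(*)
FROM atomic.events
WHERE domain_userid IN ('d01ee409-ebff-4f37-bc97-9bbda45a7225',
                        'e4c94f3e-9841-4b65-9031-ca4aa03809e7')
GROUP BY domain_userid;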

Strange pgsql query performance

I have a relation like this:
R (EDGE INTEGER, DIHEDRAL INTEGER, FACE INTEGER, VALENCY INTEGER)
I tested twice, once with a 64-row table R and once with a 128-row table R, but the smaller one takes much more time than the larger one. The EXPLAIN output is below (it shows an error on explain.depesz.com). Could anyone help me check why? Thanks.
plan for 64 rows:
HashAggregate (cost=260.16..260.17 rows=1 width=12) (actual rows=64 loops=1)
-> Nested Loop (cost=89.44..260.15 rows=1 width=12) (actual rows=256 loops=1)
Join Filter: ((f1.face < f2.face) AND (e3.edge <> f1.edge) AND (e4.edge <> e3.edge) AND (f1.edge = f2.edge) AND (f1.face = e3.face))
Rows Removed by Join Filter: 142606080
-> Nested Loop (cost=41.91..167.59 rows=1 width=16) (actual rows=557056 loops=1)
-> Nested Loop (cost=41.91..125.71 rows=1 width=8) (actual rows=256 loops=1)
Join Filter: ((e5.edge <> f2.edge) AND (e5.edge <> e2.edge) AND (e2.face = e5.face))
Rows Removed by Join Filter: 1113856
-> Hash Join (cost=41.91..83.73 rows=1 width=16) (actual rows=512 loops=1)
Hash Cond: (f2.face = e2.face)
Join Filter: (e2.edge <> f2.edge)
Rows Removed by Join Filter: 256
-> Seq Scan on r f2 (cost=0.00..41.76 rows=12 width=8) (actual rows=384 loops=1)
Filter: (valency = 3)
Rows Removed by Filter: 1920
-> Hash (cost=41.76..41.76 rows=12 width=8) (actual rows=2176 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 85kB
-> Seq Scan on r e2 (cost=0.00..41.76 rows=12 width=8) (actual rows=2176 loops=1)
Filter: (dihedral = 2)
Rows Removed by Filter: 128
-> Seq Scan on r e5 (cost=0.00..41.76 rows=12 width=8) (actual rows=2176 loops=512)
Filter: (dihedral = 2)
Rows Removed by Filter: 128
-> Seq Scan on r e3 (cost=0.00..41.76 rows=12 width=8) (actual rows=2176 loops=256)
Filter: (dihedral = 2)
Rows Removed by Filter: 128
-> Hash Join (cost=47.53..92.32 rows=11 width=16) (actual rows=256 loops=557056)
Hash Cond: (e4.face = f1.face)
Join Filter: (e4.edge <> f1.edge)
Rows Removed by Join Filter: 128
-> Seq Scan on r e4 (cost=0.00..36.01 rows=2301 width=8) (actual rows=2304 loops=557056)
-> Hash (cost=47.52..47.52 rows=1 width=8) (actual rows=128 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 5kB
-> Seq Scan on r f1 (cost=0.00..47.52 rows=1 width=8) (actual rows=128 loops=1)
Filter: ((valency = 3) AND (dihedral = 1))
Rows Removed by Filter: 2176
Total runtime: 159268.541 ms
(37 rows)
plan for 128 rows
HashAggregate (cost=501.28..501.29 rows=1 width=12) (actual rows=128 loops=1)
-> Nested Loop (cost=171.98..501.27 rows=2 width=12) (actual rows=512 loops=1)
Join Filter: ((e3.edge <> f1.edge) AND (e4.edge <> e3.edge) AND (f1.face = e3.face))
Rows Removed by Join Filter: 2227712
-> Seq Scan on r e3 (cost=0.00..80.31 rows=22 width=8) (actual rows=4352 loops=1)
Filter: (dihedral = 2)
Rows Removed by Filter: 256
-> Materialize (cost=171.98..420.08 rows=2 width=20) (actual rows=512 loops=4352)
-> Nested Loop (cost=171.98..420.07 rows=2 width=20) (actual rows=512 loops=1)
Join Filter: ((f1.face < f2.face) AND (f1.edge = f2.edge))
Rows Removed by Join Filter: 261632
-> Nested Loop (cost=80.59..242.23 rows=1 width=8) (actual rows=512 loops=1)
Join Filter: ((e5.edge <> f2.edge) AND (e5.edge <> e2.edge) AND (e2.face = e5.face))
Rows Removed by Join Filter: 4455936
-> Seq Scan on r e5 (cost=0.00..80.31 rows=22 width=8) (actual rows=4352 loops=1)
Filter: (dihedral = 2)
Rows Removed by Filter: 256
-> Materialize (cost=80.59..161.05 rows=2 width=16) (actual rows=1024 loops=4352)
-> Hash Join (cost=80.59..161.04 rows=2 width=16) (actual rows=1024 loops=1)
Hash Cond: (f2.face = e2.face)
Join Filter: (e2.edge <> f2.edge)
Rows Removed by Join Filter: 512
-> Seq Scan on r f2 (cost=0.00..80.31 rows=22 width=8) (actual rows=768 loops=1)
Filter: (valency = 3)
Rows Removed by Filter: 3840
-> Hash (cost=80.31..80.31 rows=22 width=8) (actual rows=4352 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 170kB
-> Seq Scan on r e2 (cost=0.00..80.31 rows=22 width=8) (actual rows=4352 loops=1)
Filter: (dihedral = 2)
Rows Removed by Filter: 256
-> Hash Join (cost=91.39..177.51 rows=22 width=16) (actual rows=512 loops=512)
Hash Cond: (e4.face = f1.face)
Join Filter: (e4.edge <> f1.edge)
Rows Removed by Join Filter: 256
-> Seq Scan on r e4 (cost=0.00..69.25 rows=4425 width=8) (actual rows=4608 loops=512)
-> Hash (cost=91.38..91.38 rows=1 width=8) (actual rows=256 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Seq Scan on r f1 (cost=0.00..91.38 rows=1 width=8) (actual rows=256 loops=1)
Filter: ((valency = 3) AND (dihedral = 1))
Rows Removed by Filter: 4352
Total runtime: 1262.761 ms
(41 rows)
The query planner uses statistics on row counts, index sizes, etc. to estimate how to get the best performance out of a query. A bulk insertion of rows immediately followed by a query may not show the best performance, because these statistics may be out of date.
To make sure the planner makes informed choices, issue a call to ANALYZE before running your EXPLAIN query.
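For example (using the table name R from the question):
ANALYZE r;   -- refresh the planner statistics for table r
-- then re-run EXPLAIN ANALYZE on the query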
In your specific scenario, chances are the planner made a bad choice in the first case (the 64 rows) and a good one in the second case (the 128 rows).