I am trying to execute the same SQL but with different values for the where clause. One query is taking significantly longer time to process than the other. I have also observed that the execution plan for the two queries is different too,
Query1 and Execution Plan:
explain analyze
select t."postal_code"
from dev."postal_master" t
left join dev."premise_master" f
on t."primary_code" = f."primary_code"
and t."name" = f."name"
and t."final_code" = f."final_code"
where 1 = 1 and t."region" = 'US'
and t."name" = 'UBQ'
and t."accountModCode" = 'LTI'
and t."modularity_code" = 'PHA'
group by t."postal_code", t."modularity_code", t."region",
t."feature", t."granularity"
Group (cost=4.19..4.19 rows=1 width=38) (actual time=76411.456..76414.348 rows=11871 loops=1)
Group Key: t."postal_code", t."modularity_code", t."region", t."feature", t.granularity
-> Sort (cost=4.19..4.19 rows=1 width=38) (actual time=76411.452..76412.045 rows=11879 loops=1)
Sort Key: t."postal_code", t."feature", t.granularity
Sort Method: quicksort Memory: 2055kB
-> Nested Loop Left Join (cost=0.17..4.19 rows=1 width=38) (actual time=45.373..76362.219 rows=11879 loops=1)
Join Filter: (((t."name")::text = (f."name")::text) AND ((t."primary_code")::text = (f."primary_code")::text) AND ((t."final_code")::text = (f."final_code")::text))
Rows Removed by Join Filter: 150642887
-> Index Scan using idx_postal_code_source on postal_master t (cost=0.09..2.09 rows=1 width=72) (actual time=36.652..154.339 rows=11871 loops=1)
Index Cond: (("name")::text = 'UBQ'::text)
Filter: ((("region")::text = 'US'::text) AND (("accountModCode")::text = 'LTI'::text) AND (("modularity_code")::text = 'PHA'::text))
Rows Removed by Filter: 550164
-> Index Scan using idx_postal_master_source on premise_master f (cost=0.08..2.09 rows=1 width=35) (actual time=0.016..3.720 rows=12690 loops=11871)
Index Cond: (("name")::text = 'UBQ'::text)
Planning Time: 1.196 ms
Execution Time: 76415.004 ms
Query2 and Execution plan:
explain analyze
select t."postal_code"
from dev."postal_master" t
left join dev."premise_master" f
on t."primary_code" = f."primary_code"
and t."name" = f."name"
and t."final_code" = f."final_code"
where 1 = 1 and t."region" = 'DE'
and t."name" = 'EME'
and t."accountModCode" = 'QEW'
and t."modularity_code" = 'NFX'
group by t."postal_code", t."modularity_code", t."region",
t."feature", t."granularity"
Group (cost=50302.96..50426.04 rows=1330 width=38) (actual time=170.687..184.772 rows=8230 loops=1)
Group Key: t."postal_code", t."modularity_code", t."region", t."feature", t.granularity
-> Gather Merge (cost=50302.96..50423.27 rows=1108 width=38) (actual time=170.684..182.965 rows=8230 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Group (cost=49302.95..49304.62 rows=554 width=38) (actual time=164.446..165.613 rows=2743 loops=3)
Group Key: t."postal_code", t."modularity_code", t."region", t."feature", t.granularity
-> Sort (cost=49302.95..49303.23 rows=554 width=38) (actual time=164.444..164.645 rows=3432 loops=3)
Sort Key: t."postal_code", t."feature", t.granularity
Sort Method: quicksort Memory: 550kB
Worker 0: Sort Method: quicksort Memory: 318kB
Worker 1: Sort Method: quicksort Memory: 322kB
-> Nested Loop Left Join (cost=1036.17..49297.90 rows=554 width=38) (actual time=2.143..148.372 rows=3432 loops=3)
-> Parallel Bitmap Heap Scan on territory_postal_mapping t (cost=1018.37..38323.78 rows=554 width=72) (actual time=1.898..11.849 rows=2743 loops=3)
Recheck Cond: ((("accountModCode")::text = 'QEW'::text) AND (("region")::text = 'DE'::text) AND (("name")::text = 'EME'::text))
Filter: (("modularity_code")::text = 'NFX'::text)
Rows Removed by Filter: 5914
Heap Blocks: exact=2346
-> Bitmap Index Scan on territorypostal__source_region_mod (cost=0.00..1018.31 rows=48088 width=0) (actual time=4.783..4.783 rows=25973 loops=1)
Index Cond: ((("accountModCode")::text = 'QEW'::text) AND (("region")::text = 'DE'::text) AND (("name")::text = 'EME'::text))
-> Bitmap Heap Scan on premise_master f (cost=17.80..19.81 rows=1 width=35) (actual time=0.047..0.048 rows=1 loops=8230)
Recheck Cond: (((t."primary_code")::text = ("primary_code")::text) AND ((t."final_code")::text = ("final_code")::text))
Filter: ((("name")::text = 'EME'::text) AND ((t."name")::text = ("name")::text))
Heap Blocks: exact=1955
-> BitmapAnd (cost=17.80..17.80 rows=1 width=0) (actual time=0.046..0.046 rows=0 loops=8230)
-> Bitmap Index Scan on premise_master__accountprimarypostal (cost=0.00..1.95 rows=105 width=0) (actual time=0.008..0.008 rows=24 loops=8230)
Index Cond: ((t."primary_code")::text = ("primary_code")::text)
-> Bitmap Index Scan on premise_master__accountfinalterritorycode (cost=0.00..15.80 rows=1403 width=0) (actual time=0.065..0.065 rows=559 loops=4568)
Index Cond: ((t."final_code")::text = ("final_code")::text)
Planning Time: 1.198 ms
Execution Time: 185.197 ms
I am aware that there will be different number of rows depending on the where condition but is that the only reason for the different execution plan. Also, how can I improve the performance of the first query.
The estimates are totally wrong for the first query, so it is no surprise that PostgreSQL picks a bad plan. Try these measures one after the other and see if they help:
Collect statistics:
ANALYZE premise_master, postal_master;
Calculate more precise statistics:
ALTER TABLE premise_master ALTER name SET statistics 1000;
ALTER TABLE postal_master ALTER name SET statistics 1000;
ANALYZE premise_master, postal_master;
The estimates in the first query are off in such a bad way that I suspect that there is an exceptional problem, like an upgrade with pg_upgrade where you forgot to run ANALYZE afterwards, or you are wiping the database statistics with pg_stat_reset().
If that is not the case, and a simple ANALYZE of the tables did the trick, the cause of the problem must be that autoanalyze does not run often enough on these tables. You can tune autovacuum to do that more often with a statement like this:
ALTER TABLE premise_master SET (autovacuum_analyze_scale_factor = 0.01);
That would make PostgreSQL collect statistics whenever 1% of the table has changed.
The first line of each EXPLAIN ANALYZE output suggests that the planner only expected 1 row from the first query, while it expected 1130 from the second, so that's probably why it chose a less efficient query plan. That usually means table statistics aren't up to date, and when they were last run there weren't many rows that would've matched the first query (maybe the data was being loaded in alphabetical order?). In this case the fix is to execute an ANALYZE dev."postal_master"query to refresh the statistics.
You could also try removing the GROUP BY clause entirely (if your tooling allows). I could be misreading but it doesn't look like it's affecting the output much. If that results in unwanted duplicates you can use select distinct t.postal_code instead of the group by.
Related
I have two same queries but with different where condition values
explain analyse select survey_contact_id, relation_id, count(survey_contact_id), count(relation_id) from nomination where survey_id = 1565 and account_id = 225 and deleted_at is NULL group by survey_contact_id, relation_id;
explain analyse select survey_contact_id, relation_id, count(survey_contact_id), count(relation_id) from nomination where survey_id = 888 and account_id = 12 and deleted_at is NULL group by survey_contact_id, relation_id;
When I ran this two queries they both producing different result
for first query the result
GroupAggregate (cost=0.28..8.32 rows=1 width=24) (actual time=0.016..0.021 rows=4 loops=1)
Group Key: survey_contact_id, relation_id
-> Index Only Scan using test on nomination (cost=0.28..8.30 rows=1 width=8) (actual time=0.010..0.012 rows=5 loops=1)
Index Cond: ((account_id = 225) AND (survey_id = 1565))
Heap Fetches: 5
Planning time: 0.148 ms
Execution time: 0.058 ms
and for the 2nd one
GroupAggregate (cost=11.08..11.12 rows=2 width=24) (actual time=0.015..0.015 rows=0 loops=1)
Group Key: survey_contact_id, relation_id
-> Sort (cost=11.08..11.08 rows=2 width=8) (actual time=0.013..0.013 rows=0 loops=1)
Sort Key: survey_contact_id, relation_id
Sort Method: quicksort Memory: 25kB
-> Bitmap Heap Scan on nomination (cost=4.30..11.07 rows=2 width=8) (actual time=0.008..0.008 rows=0 loops=1)
Recheck Cond: ((account_id = 12) AND (survey_id = 888) AND (deleted_at IS NULL))
-> Bitmap Index Scan on test (cost=0.00..4.30 rows=2 width=0) (actual time=0.006..0.006 rows=0 loops=1)
Index Cond: ((account_id = 12) AND (survey_id = 888))
Planning time: 0.149 ms
Execution time: 0.052 ms
Can anyone explain Me why Postgres is making a BitMap Scan instead of Index Only scan ?
The short version is that Postgres has a cost-based approach, so it has estimated that the cost of doing so is less in the second case, based on the statistics it has.
In your case, the total cost (estimated) of each of these queries is 8.32 and 11.12 respectively. You may be able to see the cost of the index-only scan for the second query by running set enable_bitmapscan = off.
Note that, based on its statistics, Postgres estimated that the first query would return 1 row (actually 4), and that the second would return 2 rows (actually 0).
There are several ways to get better statistics, but if analyze (or autovacuum) hasn't been run on that table for a while, that is a common cause of bad estimates. Another tell-tale that vacuum may not have been run recently (at least on this table) is the Heap Fetches: 5 you can see in the first query plan.
I'm confused by the "when data set increases" part of your question, please do add more context on that front if relevant.
Finally, if you're not already planning a PostgreSQL upgrade, I highly recommend doing so soon. 9.6 is nearly out of support, and versions 10, 11, 12, and 13 each contained a host of performance focussed features.
EXPLAIN ANALYZE SELECT "alerts"."id",
"alerts"."created_at",
't1'::text AS src_table
FROM "alerts"
INNER JOIN "devices"
ON "devices"."id" = "alerts"."device_id"
INNER JOIN "sites"
ON "sites"."id" = "devices"."site_id"
WHERE "sites"."cloud_id" = 111
AND "alerts"."created_at" >= '2019-08-30'
ORDER BY "created_at" DESC limit 9;
Limit (cost=1.15..36021.60 rows=9 width=16) (actual time=30.505..29495.765 rows=9 loops=1)
-> Nested Loop (cost=1.15..232132.92 rows=58 width=16) (actual time=30.504..29495.755 rows=9 loops=1)
-> Nested Loop (cost=0.86..213766.42 rows=57231 width=24) (actual time=0.029..29086.323 rows=88858 loops=1)
-> Index Scan Backward using alerts_created_at_index on alerts (cost=0.43..85542.16 rows=57231 width=24) (actual time=0.014..88.137 rows=88858 loops=1)
Index Cond: (created_at >= '2019-08-30 00:00:00'::timestamp without time zone)
-> Index Scan using devices_pkey on devices (cost=0.43..2.23 rows=1 width=16) (actual time=0.016..0.325 rows=1 loops=88858)
Index Cond: (id = alerts.device_id)
-> Index Scan using sites_pkey on sites (cost=0.29..0.31 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=88858)
Index Cond: (id = devices.site_id)
Filter: (cloud_id = 7231)
Rows Removed by Filter: 1
Total runtime: 29495.816 ms
Now we change to LIMIT 10:
EXPLAIN ANALYZE SELECT "alerts"."id",
"alerts"."created_at",
't1'::text AS src_table
FROM "alerts"
INNER JOIN "devices"
ON "devices"."id" = "alerts"."device_id"
INNER JOIN "sites"
ON "sites"."id" = "devices"."site_id"
WHERE "sites"."cloud_id" = 111
AND "alerts"."created_at" >= '2019-08-30'
ORDER BY "created_at" DESC limit 10;
Limit (cost=39521.79..39521.81 rows=10 width=16) (actual time=1.557..1.559 rows=10 loops=1)
-> Sort (cost=39521.79..39521.93 rows=58 width=16) (actual time=1.555..1.555 rows=10 loops=1)
Sort Key: alerts.created_at
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=5.24..39520.53 rows=58 width=16) (actual time=0.150..1.543 rows=11 loops=1)
-> Nested Loop (cost=4.81..16030.12 rows=2212 width=8) (actual time=0.137..0.643 rows=31 loops=1)
-> Index Scan using sites_cloud_id_index on sites (cost=0.29..64.53 rows=31 width=8) (actual time=0.014..0.057 rows=23 loops=1)
Index Cond: (cloud_id = 7231)
-> Bitmap Heap Scan on devices (cost=4.52..512.32 rows=270 width=16) (actual time=0.020..0.025 rows=1 loops=23)
Recheck Cond: (site_id = sites.id)
-> Bitmap Index Scan on devices_site_id_index (cost=0.00..4.46 rows=270 width=0) (actual time=0.006..0.006 rows=9 loops=23)
Index Cond: (site_id = sites.id)
-> Index Scan using alerts_device_id_index on alerts (cost=0.43..10.59 rows=3 width=24) (actual time=0.024..0.028 rows=0 loops=31)
Index Cond: (device_id = devices.id)
Filter: (created_at >= '2019-08-30 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 12
Total runtime: 1.603 ms
alerts table has millions of records, other tables are counted in thousands.
I can already optimize the query by simply not using limit < 10. What I don't understand is why the LIMIT affects the performance. Perhaps there's a better way than hardcoding this magic number "10".
The number of result rows affects the PostgreSQL optimizer, because plans that return the first few rows quickly are not necessarily plans that return the whole result as fast as possible.
In your case, PostgreSQL thinks that for small values of LIMIT, it will be faster by scanning the alerts table in the order of the ORDER BY clause using an index and just join the other tables using a nested loop until it has found 9 rows.
The benefit of such a strategy is that it doesn't have to calculate the complete result of the join, then sort it and throw away all but the first few result rows.
The danger is that it takes longer than expected to find the 9 matching rows, and this is what hits you:
Index Scan Backward using alerts_created_at_index on alerts (cost=0.43..85542.16 rows=57231 width=24) (actual time=0.014..88.137 rows=88858 loops=1)
So PostgreSQL has to process 88858 rows and use a nested loop join (which is inefficient if it has to loop often) until it finds 9 result rows. This may be because it underestimates the selectivity of the conditions, or because the many matching rows all happen to have low created_at.
The number 10 just happens to be the cut-off point where PostgreSQL thinks it will no longer be more efficient to use that strategy, it is a value that will change as the data in the database change.
You can avoid using that plan altogether by using an ORDER BY clause that does not match the index:
ORDER BY (created_at + INTERVAL '0 days') DESC
We have a query that runs on several different child tables created hourly and inherited from a base table, say tab_name1 and tab_name2. The query was working fine, but suddenly it started to perform badly for all the Childs since a particular date.
This is the query which works fine till tab_name_20180621*, not sure what happened after that.
SELECT
*
FROM
tab_name_201806220300
WHERE
id = 201806220300
AND col1 IN (
SELECT
col1
FROM
tab_name2_201806220300
WHERE
uid = 5452
AND id = 201806220300
);
The analyze output shows something like this, there's huge difference in execution time.
#1
Nested Loop Semi Join (cost=0.00..84762.11 rows=1 width=937) (actual time=117.599..117.599 rows=0 loops=1)
Join Filter: (tab_name_201806210100.col1 = tab_name2_201806210100.col1)
-> Seq Scan on tab_name_201806210100 (cost=0.00..31603.56 rows=1 width=937) (actual time=117.596..117.596 rows=0 loops=1)
Filter: (log_id = '201806220100'::bigint)
Rows Removed by Filter: 434045
-> Materialize (cost=0.00..53136.74 rows=1454 width=41) (never executed)
-> Seq Scan on tab_name2_201806210100 (cost=0.00..53129.47 rows=1454 width=41) (never executed)
Filter: ((uid = 5452) AND (log_id = '201806210100'::bigint))
Planning time: 1.490 ms
Execution time: 117.723 ms
#2
Nested Loop Semi Join (cost=0.00..10299.31 rows=48 width=1476) (actual time=1082.255..47041.945 rows=671 loops=1)
Join Filter: (tab_name_201806220100.col1 = tab_name2_201806220100.col1)
Rows Removed by Join Filter: 252444174
-> Seq Scan on tab_name_201806220100 (cost=0.00..4023.69 rows=95 width=1476) (actual time=0.008..36.292 rows=64153 loops=1)
Filter: (log_id = '201806220100'::bigint)
-> Materialize (cost=0.00..6274.19 rows=1 width=32) (actual time=0.000..0.264 rows=3935 loops=64153)
-> Seq Scan on tab_name2_201806220100 (cost=0.00..6274.19 rows=1 width=32) (actual time=0.464..55.475 rows=3960 loops=1)
Filter: ((log_id = '201806220100'::bigint) AND (uid = 5452))
Rows Removed by Filter: 140592
Planning time: 1.024 ms
Execution time: 47042.234 ms
I don't know what to infer from this and how to proceed from here. Could you help me?
here's my Django ORM query:
Group.objects.filter(public = True)\
.annotate(num_members = Count('members', distinct = True))\
.annotate(num_images = Count('images', distinct = True))\
.order_by(sort)
Unfortunately this is taking over 30 seconds even with just a few dozen Groups. Removing the annotate statements makes the query significantly faster at only 3 ms...
My database backend is Postgres and here's the SQL and explain:
Executed SQL
SELECT ••• FROM "astrobin_apps_groups_group"
LEFT OUTER JOIN "astrobin_apps_groups_group_members" ON (
"astrobin_apps_groups_group"."id" = "astrobin_apps_groups_group_members"."group_id"
)
LEFT OUTER JOIN "astrobin_apps_groups_group_images" ON (
"astrobin_apps_groups_group"."id" = "astrobin_apps_groups_group_images"."group_id")
WHERE "astrobin_apps_groups_group"."public" = true
GROUP BY
"astrobin_apps_groups_group"."id",
"astrobin_apps_groups_group"."date_created",
"astrobin_apps_groups_group"."date_updated",
"astrobin_apps_groups_group"."creator_id",
"astrobin_apps_groups_group"."owner_id",
"astrobin_apps_groups_group"."name",
"astrobin_apps_groups_group"."description",
"astrobin_apps_groups_group"."category",
"astrobin_apps_groups_group"."public",
"astrobin_apps_groups_group"."moderated",
"astrobin_apps_groups_group"."autosubmission",
"astrobin_apps_groups_group"."forum_id"
ORDER BY "astrobin_apps_groups_group"."date_updated" ASC
Time
30455.9268951 ms
QUERY PLAN
GroupAggregate (cost=5910.49..8288.54 rows=216 width=242) (actual time=29255.329..30269.284 rows=27 loops=1)
-> Sort (cost=5910.49..6068.88 rows=63357 width=242) (actual time=29253.278..29788.601 rows=201888 loops=1)
Sort Key: astrobin_apps_groups_group.date_updated, astrobin_apps_groups_group.id, astrobin_apps_groups_group.date_created, astrobin_apps_groups_group.creator_id, astrobin_apps_groups_group.owner_id, astrobin_apps_groups_group.name, astrobin_apps_groups_group.description, astrobin_apps_groups_group.category, astrobin_apps_groups_group.public, astrobin_apps_groups_group.moderated, astrobin_apps_groups_group.autosubmission, astrobin_apps_groups_group.forum_id
Sort Method: external merge Disk: 70176kB
-> Hash Right Join (cost=15.69..857.39 rows=63357 width=242) (actual time=1.903..397.613 rows=201888 loops=1)
Hash Cond: (astrobin_apps_groups_group_images.group_id = astrobin_apps_groups_group.id)
-> Seq Scan on astrobin_apps_groups_group_images (cost=0.00..106.05 rows=6805 width=8) (actual time=0.024..12.510 rows=6837 loops=1)
-> Hash (cost=12.31..12.31 rows=270 width=238) (actual time=1.853..1.853 rows=323 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 85kB
-> Hash Right Join (cost=3.63..12.31 rows=270 width=238) (actual time=0.133..1.252 rows=323 loops=1)
Hash Cond: (astrobin_apps_groups_group_members.group_id = astrobin_apps_groups_group.id)
-> Seq Scan on astrobin_apps_groups_group_members (cost=0.00..4.90 rows=290 width=8) (actual time=0.004..0.348 rows=333 loops=1)
-> Hash (cost=3.29..3.29 rows=27 width=234) (actual time=0.103..0.103 rows=27 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 7kB
-> Seq Scan on astrobin_apps_groups_group (cost=0.00..3.29 rows=27 width=234) (actual time=0.004..0.049 rows=27 loops=1)
Filter: public
Total runtime: 30300.606 ms
It would be great if somebody could suggest a way to optimize this. I feel like I'm missing a really low hanging fruit.
Thanks!
What are the indexes present on astrobin_apps_groups_group and "astrobin_apps_groups_group_member, astrobin_apps_groups_group_image table?
Is there any aggregate functions like SUM, COUNT used in your select? If no, then you can remove all columns from GROUP BY
The plan shows most of the time is taken for sorting. If you create an index on date_updated filed with NULLS LAST with latest values first in index, then planner may use this index for sorting.
For sorting, disk is getting used which is most costly affair. This is because your data collected for sorting is not fitting in memory. Try increasing the WORK_MEM - set work_mem='10MB'; SELECT.....
I have Rails application with the ability to filter records by state_code. I noticed that when i pass 'CA' as search term i get my results almost instantly. If i will pass 'AZ' for example it will take more than a minute though.
I don't have any ideas why so?
Below is query explains from psql:
Fast one:
EXPLAIN ANALYZE SELECT
accounts.id
FROM "accounts"
LEFT OUTER JOIN "addresses"
ON "addresses"."addressable_id" = "accounts"."id"
AND "addresses"."address_type" = 'mailing'
AND "addresses"."addressable_type" = 'Account'
WHERE "accounts"."organization_id" = 16
AND (addresses.state_code IN ('CA'))
ORDER BY accounts.name DESC;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=4941.94..4941.94 rows=1 width=18) (actual time=74.810..74.969 rows=821 loops=1)
Sort Key: accounts.name
Sort Method: quicksort Memory: 75kB
-> Hash Join (cost=4.46..4941.93 rows=1 width=18) (actual time=70.044..73.148 rows=821 loops=1)
Hash Cond: (addresses.addressable_id = accounts.id)
-> Seq Scan on addresses (cost=0.00..4911.93 rows=6806 width=4) (actual time=0.027..65.547 rows=15244 loops=1)
Filter: (((address_type)::text = 'mailing'::text) AND ((addressable_type)::text = 'Account'::text) AND ((state_code)::text = 'CA'::text))
Rows Removed by Filter: 129688
-> Hash (cost=4.45..4.45 rows=1 width=18) (actual time=2.037..2.037 rows=1775 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 87kB
-> Index Scan using organization_id_index on accounts (cost=0.29..4.45 rows=1 width=18) (actual time=0.018..1.318 rows=1775 loops=1)
Index Cond: (organization_id = 16)
Planning time: 0.565 ms
Execution time: 75.224 ms
(14 rows)
Slow one:
EXPLAIN ANALYZE SELECT
accounts.id
FROM "accounts"
LEFT OUTER JOIN "addresses"
ON "addresses"."addressable_id" = "accounts"."id"
AND "addresses"."address_type" = 'mailing'
AND "addresses"."addressable_type" = 'Account'
WHERE "accounts"."organization_id" = 16
AND (addresses.state_code IN ('NV'))
ORDER BY accounts.name DESC;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=4917.27..4917.27 rows=1 width=18) (actual time=97091.270..97091.277 rows=25 loops=1)
Sort Key: accounts.name
Sort Method: quicksort Memory: 26kB
-> Nested Loop (cost=0.29..4917.26 rows=1 width=18) (actual time=844.250..97091.083 rows=25 loops=1)
Join Filter: (accounts.id = addresses.addressable_id)
Rows Removed by Join Filter: 915875
-> Index Scan using organization_id_index on accounts (cost=0.29..4.45 rows=1 width=18) (actual time=0.017..10.315 rows=1775 loops=1)
Index Cond: (organization_id = 16)
-> Seq Scan on addresses (cost=0.00..4911.93 rows=70 width=4) (actual time=0.110..54.521 rows=516 loops=1775)
Filter: (((address_type)::text = 'mailing'::text) AND ((addressable_type)::text = 'Account'::text) AND ((state_code)::text = 'NV'::text))
Rows Removed by Filter: 144416
Planning time: 0.308 ms
Execution time: 97091.325 ms
(13 rows)
Slow one result is 25 rows, fast one is 821 rows, which is even more confusing.
I solved it by using VACUUM ANALYZE command from psql command line.