PostgreSQL recheck performed even though there are no lossy blocks

I am running an EXPLAIN (BUFFERS, ANALYZE, VERBOSE)
and I am getting this partial result:
-> Bitmap Heap Scan on public.d (cost=109.92..8479.81 rows=5871 width=40) (actual time=1.334..29.942 rows=5306 loops=1)
Output: d.id, d.pd, d.iid, d.dtid, d.bid
Recheck Cond: ((d.sid = 100) AND (d.pd >= '2020-01-28 10:24:40.034+00'::timestamp with time zone) AND (d.pd <= '2020-04-28 10:24:40.034+00'::timestamp with time zone))
Heap Blocks: exact=2014
Buffers: shared hit=3 read=2035
-> Bitmap Index Scan on idx_d_didpd (cost=0.00..108.45 rows=5871 width=0) (actual time=1.018..1.018 rows=5306 loops=1)
Index Cond: ((d.sid = 100) AND (d.pd >= '2020-01-28 10:24:40.034+00'::timestamp with time zone) AND (d.pd <= '2020-04-28 10:24:40.034+00'::timestamp with time zone))
Buffers: shared read=24
What I am wondering about is that, in the whole result, the most "costly" part is the Bitmap Heap Scan (the other parts perform index scans and are pretty fast). But I've read that a recheck on a bitmap heap scan is performed only when there are lossy blocks, which I cannot see here.
Can anyone tell me why is this Heap Scan performed?

Note that "Recheck Cond" is also present with just an EXPLAIN without the ANALYZE. The value of this field does not depend on what actually happened during execution. It is telling you what condition will be used in a potential recheck, it does not tell you how often the recheck was actually performed (which in your case was probably zero).
Can anyone tell me why is this Heap Scan performed?
The Bitmap Heap Scan is not just there to do a recheck, its main purpose is to fetch the data you asked for.
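Both points can be checked with a small experiment. This is only a sketch: the query is reconstructed from the Output and Index Cond lines of the plan above, so the column list and filter are assumptions. Plain EXPLAIN already prints "Recheck Cond", and shrinking work_mem is one way to provoke a lossy bitmap so that real rechecks become visible (the planner may also pick a different plan under that setting).

```sql
-- Plain EXPLAIN: "Recheck Cond" appears although nothing has been executed yet.
EXPLAIN
SELECT id, pd, iid, dtid, bid
FROM   public.d
WHERE  sid = 100
AND    pd >= '2020-01-28 10:24:40.034+00'
AND    pd <= '2020-04-28 10:24:40.034+00';

-- For comparison only: with a very small work_mem the bitmap may no longer
-- fit in memory and degrade to page granularity. "Heap Blocks" would then
-- report lossy=N in addition to exact=N, and rechecks are actually performed.
SET work_mem = '64kB';

EXPLAIN (ANALYZE, BUFFERS)
SELECT id, pd, iid, dtid, bid
FROM   public.d
WHERE  sid = 100
AND    pd >= '2020-01-28 10:24:40.034+00'
AND    pd <= '2020-04-28 10:24:40.034+00';

RESET work_mem;
```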

Related

Optimizing SELECT count(*) on large table

Basic count on a large table on PostgreSQL 14 with 64GB Ram & 20 threads. Storage is an NVME disk.
Questions:
How do I improve the query for this select count query? What kind of optimizations should I look into on Postgres configuration?
The workers planned is 4 but launched 0, is that normal?
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM public.product;
Finalize Aggregate (cost=2691545.69..2691545.70 rows=1 width=8) (actual time=330901.439..330902.951 rows=1 loops=1)
Buffers: shared hit=1963080 read=1140455 dirtied=1908 written=111146
I/O Timings: read=36692.273 write=6548.923
-> Gather (cost=2691545.27..2691545.68 rows=4 width=8) (actual time=330901.342..330902.861 rows=1 loops=1)
Workers Planned: 4
Workers Launched: 0
Buffers: shared hit=1963080 read=1140455 dirtied=1908 written=111146
I/O Timings: read=36692.273 write=6548.923
-> Partial Aggregate (cost=2690545.27..2690545.28 rows=1 width=8) (actual time=330898.747..330898.757 rows=1 loops=1)
Buffers: shared hit=1963080 read=1140455 dirtied=1908 written=111146
I/O Timings: read=36692.273 write=6548.923
-> Parallel Index Only Scan using points on products (cost=0.57..2634234.99 rows=22524114 width=0) (actual time=0.361..222958.361 rows=90993600 loops=1)
Heap Fetches: 46261956
Buffers: shared hit=1963080 read=1140455 dirtied=1908 written=111146
I/O Timings: read=36692.273 write=6548.923
Planning:
Buffers: shared hit=39 read=8
I/O Timings: read=0.398
Planning Time: 2.561 ms
JIT:
Functions: 4
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.691 ms, Inlining 104.789 ms, Optimization 24.169 ms, Emission 22.457 ms, Total 152.107 ms
Execution Time: 330999.777 ms
The workers planned is 4 but launched 0, is that normal?
It can happen when too many concurrent transactions compete for a limited number of allowed parallel workers. The manual:
The number of background workers that the planner will consider using
is limited to at most max_parallel_workers_per_gather. The
total number of background workers that can exist at any one time is
limited by both max_worker_processes and
max_parallel_workers. Therefore, it is possible for a
parallel query to run with fewer workers than planned, or even with no
workers at all. The optimal plan may depend on the number of workers
that are available, so this can result in poor query performance. If
this occurrence is frequent, consider increasing
max_worker_processes and max_parallel_workers so that more workers
can be run simultaneously or alternatively reducing
max_parallel_workers_per_gather so that the planner requests fewer
workers.
You can also optimize overall performance to free up resources, or get better hardware (in addition to ramping up max_parallel_workers).
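A minimal sketch for checking the limits the manual mentions. The value 16 is purely illustrative (not a recommendation), and the backend_type column assumes PostgreSQL 10 or later.

```sql
-- Current limits relevant to parallel query
SHOW max_worker_processes;
SHOW max_parallel_workers;
SHOW max_parallel_workers_per_gather;

-- How many parallel workers are busy right now?
SELECT count(*) AS active_parallel_workers
FROM   pg_stat_activity
WHERE  backend_type = 'parallel worker';

-- Illustrative value only. max_parallel_workers takes effect after a reload,
-- but it is effectively capped by max_worker_processes, and changing
-- max_worker_processes requires a server restart.
ALTER SYSTEM SET max_parallel_workers = 16;
SELECT pg_reload_conf();
```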
What's also troubling:
Heap Fetches: 46261956
For 90993600 rows. That's way too many for comfort. An index-only scan is not supposed to do that many heap fetches.
Both of these symptoms would indicate massive concurrent write access (or long-running transactions hogging resources and keeping autovacuum from doing its job). Look into that, and/or tune per-table autovacuum settings for table product to be more aggressive, so that column statistics are more valid and the visibility map can keep up. See:
Aggressive Autovacuum on PostgreSQL
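A hedged sketch of what "more aggressive" per-table settings could look like. The numbers are assumptions to experiment with, not tuned values, and the table name follows the query in the question.

```sql
-- Per-table storage parameters override the global autovacuum settings.
ALTER TABLE public.product SET (
    autovacuum_vacuum_scale_factor  = 0.01,  -- vacuum after ~1% of rows changed
    autovacuum_analyze_scale_factor = 0.01,  -- analyze after ~1% of rows changed
    autovacuum_vacuum_cost_delay    = 0      -- do not throttle autovacuum here
);

-- One manual pass to bring the visibility map and statistics up to date now.
VACUUM (VERBOSE, ANALYZE) public.product;

-- Look for old open transactions that keep VACUUM from cleaning up.
SELECT pid, state, xact_start, now() - xact_start AS xact_age
FROM   pg_stat_activity
WHERE  xact_start IS NOT NULL
ORDER  BY xact_start
LIMIT  5;
```

After a successful VACUUM, re-running the count should show a much lower "Heap Fetches" figure, provided no heavy concurrent writes intervene.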
Also, with halfway valid table statistics, a (blazingly fast!) estimate might be good enough? See:
Fast way to discover the row count of a table in PostgreSQL
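One common form of such an estimate reads the planner's row count from the system catalog. It is only as accurate as the last VACUUM or ANALYZE on the table.

```sql
-- Estimated row count from the catalog: no table or index scan involved.
SELECT reltuples::bigint AS estimated_rows
FROM   pg_class
WHERE  oid = 'public.product'::regclass;
```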

Too many buffers hit+read during index scan

I have two tables, User and Info. I'm writing a simple query with an inner join and inserting the result into an unlogged table.
INSERT INTO Result (
iProfileId,email,format,content
)
SELECT
COALESCE(N1.iprofileId, 0),
Lower(N1.email),
W0.format,
W0.content
FROM
Info W0,
User N1
where
(N1.iprofileId = W0.iId);
The Info table has 30M rows and the User table has 158M rows. For some reason, this query is taking too long on one of my prod setups. At first glance it looks like it's reading/hitting too many buffers:
Insert on Result (cost=152813.60..15012246.06 rows=31198136 width=1080) (actual time=5126063.502..5126063.502 rows=0 loops=1)
Buffers: shared hit=128815094 read=6103564 dirtied=599445 written=2088037
I/O Timings: read=2563306.517 write=570919.940
-> Merge Join (cost=152813.60..15012246.06 rows=31198136 width=1080) (actual time=0.097..5060947.922 rows=31191937 loops=1)
Merge Cond: (w0.iid = n1.iprofileid)
Buffers: shared hit=96480126 read=5574864 dirtied=70745 written=2009998
I/O Timings: read=2563298.981 write=562810.833
-> Index Scan using user_idx on info w0 (cost=0.56..2984094.60 rows=31198136 width=35) (actual time=0.012..246299.026 rows=31191937 loops=1)
Buffers: shared hit=481667 read=2490602 written=364347
I/O Timings: read=178000.987 write=38663.457
-> Index Scan using profile_id on user n1 (cost=0.57..14938848.88 rows=158842848 width=32) (actual time=0.020..4718272.082 rows=115378606 loops=1)
Buffers: shared hit=95998459 read=3084262 dirtied=70745 written=1645651
I/O Timings: read=2385297.994 write=524147.376
Planning Time: 11.531 ms
Execution Time: 5126063.577 ms
When I ran this query on a different setup with similar tables and a similar number of records, the profile_id scan touched only 5M pages (ran in 3 min), whereas here it touched (read + hit) about 100M buffers (ran in 1.45 h). When I checked with VACUUM VERBOSE, this table has only 10M pages.
INFO: "User": found 64647 removable, 109184385 nonremovable row versions in 6876625 out of 10546400 pages
This is one of the good runs but we've seen this query taking up to 4-5 hrs as well. My test system which ran in under 3 mins also had iid distributed among profile_id range. But it had fewer columns and indexes as compared to the prod system. What could be the reason for this slowness?
The execution plan you are showing has a lot of dirtied and written pages. That indicates that the tables were freshly inserted, and your query was the first reader.
In PostgreSQL, the first reader of a new table row consults the commit log to see if that row is visible or not (did the transaction that created it commit?). It then sets flags in the row (the so-called hint bits) to save the next reader that trouble.
Setting the hint bits modifies the row, so the block is dirtied and has to be written to disk eventually. That writing is normally done by the checkpointer or the background writer, but they couldn't keep up, so the query had to clean out many dirty pages itself.
If you run the query a second time, it will be faster. For that reason, it is a good idea to VACUUM tables after bulk loading, which will also set the hint bits.
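As a sketch, run this right after the bulk load and before the big join. The table names are taken from the plan above and are assumptions about the real schema; "user" has to be double-quoted because it is a reserved word.

```sql
-- Sets hint bits, updates the visibility map and refreshes planner statistics.
VACUUM (ANALYZE) info;
VACUUM (ANALYZE) "user";
```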
However, a large query like that will always be slow. Things you can try to speed it up further are:
have lots of RAM and load the tables into shared buffers with pg_prewarm
crank up work_mem in the hope of getting a faster hash join
CLUSTER the tables using the indexes, so that heap fetches become more efficient
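The three suggestions could be tried roughly as follows. This is a sketch under assumptions: table and index names come from the plan above, the work_mem value is illustrative, and pg_prewarm only helps if shared_buffers is large enough to hold the relations.

```sql
-- 1. Load a table and its index into shared buffers.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('info');      -- returns the number of blocks loaded
SELECT pg_prewarm('user_idx');

-- 2. Give this session more memory, hoping the planner switches to a hash join.
SET work_mem = '1GB';           -- illustrative value, per sort/hash operation

-- 3. Physically reorder the table along the index used by the join.
--    CLUSTER rewrites the table and holds an ACCESS EXCLUSIVE lock meanwhile.
CLUSTER info USING user_idx;
ANALYZE info;
```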

Why does a pg query stop using an index after a while?

I have this query in Postgres 12.0:
SELECT "articles"."id"
FROM "articles"
WHERE ((jsonfields ->> 'etat') = '0' OR (jsonfields ->> 'etat') = '1' OR (jsonfields ->> 'etat') = '2')
ORDER BY ordre ASC;
At this time:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=1274.09..1274.97 rows=354 width=8) (actual time=13.000..13.608 rows=10435 loops=1)
Sort Key: ordre
Sort Method: quicksort Memory: 874kB
-> Bitmap Heap Scan on articles (cost=15.81..1259.10 rows=354 width=8) (actual time=1.957..10.807 rows=10435 loops=1)
Recheck Cond: (((jsonfields ->> 'etat'::text) = '1'::text) OR ((jsonfields ->> 'etat'::text) = '2'::text) OR ((jsonfields ->> 'etat'::text) = '0'::text))
Heap Blocks: exact=6839
-> BitmapOr (cost=15.81..15.81 rows=356 width=0) (actual time=1.171..1.171 rows=0 loops=1)
-> Bitmap Index Scan on myidx (cost=0.00..5.18 rows=119 width=0) (actual time=0.226..0.227 rows=2110 loops=1)
Index Cond: ((jsonfields ->> 'etat'::text) = '1'::text)
-> Bitmap Index Scan on myidx (cost=0.00..5.18 rows=119 width=0) (actual time=0.045..0.045 rows=259 loops=1)
Index Cond: ((jsonfields ->> 'etat'::text) = '2'::text)
-> Bitmap Index Scan on myidx (cost=0.00..5.18 rows=119 width=0) (actual time=0.899..0.899 rows=8066 loops=1)
Index Cond: ((jsonfields ->> 'etat'::text) = '0'::text)
Planning Time: 0.382 ms
Execution Time: 14.234 ms
(15 rows)
After a while:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=7044.04..7079.35 rows=14127 width=8) (actual time=613.445..614.679 rows=15442 loops=1)
Sort Key: ordre
Sort Method: quicksort Memory: 1108kB
-> Seq Scan on articles (cost=0.00..6070.25 rows=14127 width=8) (actual time=0.060..609.477 rows=15442 loops=1)
Filter: (((jsonfields ->> 'etat'::text) = '1'::text) OR ((jsonfields ->> 'etat'::text) = '2'::text) OR ((jsonfields ->> 'etat'::text) = '3'::text))
Rows Removed by Filter: 8288
Planning Time: 0.173 ms
Execution Time: 615.744 ms
(8 rows)
I need to re-create index:
DROP INDEX myidx;
CREATE INDEX myidx ON articles ( (jsonfields->>'etat') );
Why? How to fix this?
I tried decreasing the memory settings to discourage the seq scan. It doesn't work.
I tried to do select pg_stat_reset();. It doesn't work.
pg_stat_reset() does not reset table statistics. It only resets counters (like how often an index was used), it has no effects on query plans.
To update table statistics, use ANALYZE (or VACUUM ANALYZE, while being at it). autovacuum should take care of this automatically, normally.
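For instance, to refresh the statistics by hand and see when autovacuum last did so (a minimal sketch; n_mod_since_analyze assumes PostgreSQL 9.4 or later):

```sql
-- Refresh planner statistics for the table (including expression indexes on it).
ANALYZE articles;

-- When were statistics last gathered, manually or by autovacuum?
SELECT relname, last_analyze, last_autoanalyze, n_mod_since_analyze
FROM   pg_stat_user_tables
WHERE  relname = 'articles';
```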
Your first query finds rows=10435, your second query finds rows=15442. Postgres expects to find rows=354 (!) in the first, but rows=14127 in the second. It grossly underestimates the number of result rows in the first, which favours indexes. So your first query was only fast by accident.
Table statistics have changed, there may be table and index bloat. Most importantly, your cost settings are probably misleading. Consider a lower setting for random_page_cost (and possibly for cpu_index_tuple_cost and others).
Related:
Keep PostgreSQL from sometimes choosing a bad query plan
If recreating the index leads to a different query plan, the index may have been bloated. (A bloated index would also discourage Postgres from using it.) More aggressive autovacuum settings, generally or just for the table or even just the index may help.
Also, expression indexes introduce additional statistics (the essential one on jsonfields->>'etat' in your case). Dropping the index drops those, too. And the new expression index starts out with empty statistics which are filled with the next manual ANALYZE or by autovacuum. So, typically, you should run ANALYZE on the table after creating an expression index - except that in your case you currently only seem to get the fast query when based on misleading stats, so fix that first.
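To see whether the expression statistics exist yet, one can look at pg_stats, where statistics for an expression index are listed under the index name. A short sketch using the index name from the question:

```sql
-- Gather statistics, including those for the indexed expression.
ANALYZE articles;

-- Expression statistics appear with tablename = name of the index.
-- No rows here means the expression has not been analyzed yet.
SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM   pg_stats
WHERE  tablename = 'myidx';
```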
Maybe revisit your database design. Does that etat value really have to be nested in a JSON column? It might be a lot cheaper overall to have it as a separate column.
Be that as it may, the most expensive part of your first (fast) query plan is the Bitmap Heap Scan, where Postgres reads actual data pages to return id values. A shortcut with a "covering" index would be possible since Postgres 11:
CREATE INDEX myidx ON articles ((jsonfields->>'etat')) INCLUDE (ordre, id);
But this relies even more on autovacuum doing its job in a timely manner, as it requires the visibility map to be up to date.
Or, if your WHERE clause is constant (always filtering for (jsonfields ->> 'etat') = ANY ('{0,1,2}')), a partial index would reign supreme:
CREATE INDEX myidx ON articles (ordre, id)
WHERE (jsonfields ->> 'etat') = ANY ('{0,1,2}');
Immediately after you create the functional index, it doesn't have any statistics gathered on it, so PostgreSQL must make some generic assumptions. Once auto-analyze has had a chance to run, it has real stats to work with. Now it turns out the more accurate estimates actually lead to a worse plan, which is rather unfortunate.
The PostgreSQL planner generally assumes much of our data is not in cache. This assumption pushes it to favor seq scans over index scans when it will be returning a large number of rows (your second plan is returning 2/3 of the table!). The reason it makes this assumption is that it is safer. Assuming too little data is cached leads to merely bad plans, but assuming too much is cached leads to utterly catastrophic plans.
In general, the amount of data assumed to be cached is baked into the random_page_cost setting, so you can tweak that setting if you want to. (Baking it into that setting, rather than having a separate setting, was a poor design decision in my opinion, but it was made a very long time ago.)
You could set random_page_cost equal to seq_page_cost, to see if that solves the problem. But that is probably not a change you would want to make permanently, as it is likely to create more problems than it solves. Perhaps the correct setting is lower than the default but still higher than seq_page_cost. You should also do EXPLAIN (ANALYZE, BUFFERS), and set track_io_timing = on, to give you more information to use in evaluating this.
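Such an experiment can be confined to one session, roughly like this. The value 1.1 is an assumption for illustration (somewhere between seq_page_cost and the default of 4), and setting track_io_timing normally requires superuser rights.

```sql
SET track_io_timing = on;     -- adds I/O timings to EXPLAIN (ANALYZE, BUFFERS)
SET random_page_cost = 1.1;   -- session-local experiment, not a permanent change

EXPLAIN (ANALYZE, BUFFERS)
SELECT "articles"."id"
FROM   "articles"
WHERE  ((jsonfields ->> 'etat') = '0'
     OR (jsonfields ->> 'etat') = '1'
     OR (jsonfields ->> 'etat') = '2')
ORDER  BY ordre ASC;

RESET random_page_cost;
RESET track_io_timing;
```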
Another issue is that the bitmap heap scan never needs to consult the actual JSON data. It gets all the data it needs from the index. The seq scan needs to consult the JSON data, and how slow this is depends on things like what type it is (json or jsonb) and how much other stuff is in that JSON. PostgreSQL rather ridiculously thinks that parsing a JSON document takes about the same amount of time as comparing two integers does.
You can more or less fix this problem (for json type) by running the following statement:
update pg_proc set procost =100 where proname='json_object_field_text';
(This is imperfect, as the cost of this function gets charged to the recheck condition of the bitmap heap scan even though no recheck is done. But the recheck is charged for each tuple expected to be returned, not each tuple expected to be in the table, so this creates a distinction you can take advantage of).
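An alternative to updating pg_proc directly would be ALTER FUNCTION, sketched below. It needs superuser rights for a built-in function, the cost of 100 is as arbitrary as above, and a change to a system function will probably not survive a dump/restore or major upgrade.

```sql
-- Same effect as the UPDATE above, through the regular DDL interface.
ALTER FUNCTION pg_catalog.json_object_field_text(json, text) COST 100;

-- If the column is of type jsonb, the function behind ->> is a different one:
-- ALTER FUNCTION pg_catalog.jsonb_object_field_text(jsonb, text) COST 100;

-- Verify the current cost settings.
SELECT proname, procost
FROM   pg_proc
WHERE  proname IN ('json_object_field_text', 'jsonb_object_field_text');
```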

Postgres Materialize Node completing faster than its sub-node

I was analyzing a slow postgresql database query and I noticed something that seemed quite odd to me (I'm new at analysing queries in postgres). The actual time for the Materialize node both starts and finishes before its sub node.
-> Nested Loop (cost=300.28..698.21 rows=1 width=54) (actual time=180.547..11022.591 rows=166 loops=1)
Join Filter: (mytable1.category_id = mytable2.category_id)
-> Index Scan using mytable2_p_category_id on mytable2 (cost=0.00..3.48 rows=15 width=4) (actual time=0.012..0.037 rows=15 loops=1)
-> Materialize (cost=300.28..694.51 rows=1 width=54) (actual time=12.036..734.653 rows=166 loops=15)
-> Nested Loop (cost=300.28..694.50 rows=1 width=54) (actual time=180.520..11016.887 rows=166 loops=1)
Does anyone know when and where you might expect this to happen?
In case it's relevant our postgres server is running version 9.1
Thanks
As Denis pointed out in the comments (so I can't give him the tick of approval) it would seem that the actual time for the Materialize node is probably best read in terms of loops * actual time
So in this example that would be:
Start - (12.036 * 15) = 180.54
End - (734.653 * 15) = 11019.795
Browsing online I found other examples of looped Materialize statements being reported this way also.
So I guess the answer to the question of "when and where" is almost always when your Materialize node is being looped over.
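The arithmetic can be reproduced directly. The per-loop averages of the Materialize node multiplied by its loop count land close to the 180.547..11022.591 reported for the enclosing Nested Loop:

```sql
-- "actual time" is an average per loop, so multiply by loops=15.
SELECT 12.036  * 15 AS approx_start_ms,   -- about 180.54
       734.653 * 15 AS approx_end_ms;     -- about 11019.8
```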
So unless someone who knows better chimes in I'll just make this the answer for now.

PostgreSQL query not using index in production

I'm noticing something strange/weird:
The exact same query does not use the same query plan in development and production. In particular, the development version is using indexes which are omitted in production (in favor of a seq scan).
The only real difference is that the dataset in production is significantly larger: the index size is 1034 MB, vs 29 MB in development. Would PostgreSQL abstain from using indexes if they (or the table) are too big?
EDIT: EXPLAIN ANALYZE for both queries:
Development:
Limit (cost=41638.15..41638.20 rows=20 width=154) (actual time=159.576..159.581 rows=20 loops=1)
-> Sort (cost=41638.15..41675.10 rows=14779 width=154) (actual time=159.575..159.577 rows=20 loops=1)
Sort Key: (sum(scenario_ad_group_performances.clicks))
Sort Method: top-N heapsort Memory: 35kB
-> GroupAggregate (cost=0.00..41244.89 rows=14779 width=154) (actual time=0.040..151.535 rows=14197 loops=1)
-> Nested Loop Left Join (cost=0.00..31843.75 rows=93800 width=154) (actual time=0.022..82.509 rows=50059 loops=1)
-> Merge Left Join (cost=0.00..4203.46 rows=14779 width=118) (actual time=0.017..27.103 rows=14197 loops=1)
Merge Cond: (scenario_ad_groups.id = scenario_ad_group_vendor_instances.ad_group_id)
-> Index Scan using scenario_ad_groups_pkey on scenario_ad_groups (cost=0.00..2227.06 rows=14779 width=114) (actual time=0.009..12.085 rows=14197 loops=1)
Filter: (scenario_id = 22)
-> Index Scan using index_scenario_ad_group_vendor_instances_on_ad_group_id on scenario_ad_group_vendor_instances (cost=0.00..1737.02 rows=27447 width=8) (actual time=0.007..7.021 rows=16528 loops=1)
Filter: (ad_vendor_id = ANY ('{1,2,3}'::integer[]))
-> Index Scan using index_ad_group_performances_on_vendor_instance_id_and_date on scenario_ad_group_performances (cost=0.00..1.73 rows=11 width=44) (actual time=0.002..0.003 rows=3 loops=14197)
Index Cond: ((vendor_instance_id = scenario_ad_group_vendor_instances.id) AND (date >= '2012-02-01'::date) AND (date <= '2012-02-28'::date))
Total runtime: 159.710 ms
Production:
Limit (cost=822401.35..822401.40 rows=20 width=179) (actual time=21279.547..21279.591 rows=20 loops=1)
-> Sort (cost=822401.35..822488.42 rows=34828 width=179) (actual time=21279.543..21279.560 rows=20 loops=1)
Sort Key: (sum(scenario_ad_group_performances.clicks))
Sort Method: top-N heapsort Memory: 33kB
-> GroupAggregate (cost=775502.60..821474.59 rows=34828 width=179) (actual time=19126.783..21226.772 rows=34495 loops=1)
-> Sort (cost=775502.60..776739.48 rows=494751 width=179) (actual time=19125.902..19884.164 rows=675253 loops=1)
Sort Key: scenario_ad_groups.id
Sort Method: external merge Disk: 94200kB
-> Hash Right Join (cost=25743.86..596796.70 rows=494751 width=179) (actual time=1155.491..16720.460 rows=675253 loops=1)
Hash Cond: (scenario_ad_group_performances.vendor_instance_id = scenario_ad_group_vendor_instances.id)
-> Seq Scan on scenario_ad_group_performances (cost=0.00..476354.29 rows=4158678 width=44) (actual time=0.043..8949.640 rows=4307019 loops=1)
Filter: ((date >= '2012-02-01'::date) AND (date <= '2012-02-28'::date))
-> Hash (cost=24047.72..24047.72 rows=51371 width=143) (actual time=1123.896..1123.896 rows=34495 loops=1)
Buckets: 1024 Batches: 16 Memory Usage: 392kB
-> Hash Right Join (cost=6625.90..24047.72 rows=51371 width=143) (actual time=92.257..1070.786 rows=34495 loops=1)
Hash Cond: (scenario_ad_group_vendor_instances.ad_group_id = scenario_ad_groups.id)
-> Seq Scan on scenario_ad_group_vendor_instances (cost=0.00..11336.31 rows=317174 width=8) (actual time=0.020..451.496 rows=431770 loops=1)
Filter: (ad_vendor_id = ANY ('{1,2,3}'::integer[]))
-> Hash (cost=5475.55..5475.55 rows=34828 width=139) (actual time=88.311..88.311 rows=34495 loops=1)
Buckets: 1024 Batches: 8 Memory Usage: 726kB
-> Bitmap Heap Scan on scenario_ad_groups (cost=798.20..5475.55 rows=34828 width=139) (actual time=4.451..44.065 rows=34495 loops=1)
Recheck Cond: (scenario_id = 276)
-> Bitmap Index Scan on index_scenario_ad_groups_on_scenario_id (cost=0.00..789.49 rows=34828 width=0) (actual time=4.232..4.232 rows=37006 loops=1)
Index Cond: (scenario_id = 276)
Total runtime: 21306.697 ms
Disclaimer
I have used PostgreSQL very little. I'm answering based on my knowledge of SQL Server index usage and execution plans. I ask the PostgreSQL gods for mercy if I get something wrong.
Query Optimizers are Dynamic
You said your query plan has changed from your development to production environments. This is to be expected. Query optimizers are designed to generate the optimum execution plan based on the current data conditions. Under different conditions the optimizer may decide it is more efficient to use a table scan vs an index scan.
When would it be more efficient to use a table scan vs an index scan?
SELECT A, B
FROM someTable
WHERE A = 'SOME VALUE'
Let's say you have a non-clustered index on column A. In this case you are filtering on column A, which could potentially take advantage of the index. This would be efficient if the index is selective enough - basically, how many distinct values make up the index? The database keeps statistics on this selectivity info and uses these statistics when calculating costs for execution plans.
If you have a million rows in a table, but only 10 possible values for A, then your query would likely return about 100K rows. Because the index is non-clustered, and you are returning columns not included in the index, B, a lookup will need to be performed for each row returned. These look-ups are random-access lookups, which are much more expensive than the sequential reads used by a table scan. At a certain point it becomes more efficient for the database to just perform a table scan rather than an index scan.
This is just one scenario, there are many others. It's hard to know without knowing more about what your data is like, what your indexes look like and how you are trying to access the data.
To answer the original question:
Would PostgreSQL abstain from using indexes if they (or the table) are too big? No. It is more likely that in the way that you are accessing the data, it is less efficient for PostgreSQL to use the index vs using a table scan.
The PostgreSQL FAQ touches on this very subject (see: Why are my queries slow? Why don't they use my indexes?): https://wiki.postgresql.org/wiki/FAQ#Why_are_my_queries_slow.3F_Why_don.27t_they_use_my_indexes.3F
Postgres' query optimizer comes up with multiple scenarios (e.g. index vs seq-scan) and evaluates them using statistical information about your tables and the relative costs of disk/memory/index/table access set in configuration.
Did you use the EXPLAIN command to see why index use was omitted? Did you use EXPLAIN ANALYZE to find out if the decision was in error? Can we see the outputs, please?
edit:
As hard as analyzing two different singular queries on different systems is, I think I see a couple of things.
The production environment has a cost/actual ratio of around 20-100 cost units per millisecond. I'm not even a DBA, but this seems consistent. The development environment has 261 for the main query. Does this seem right? Would you expect the raw speed (memory/disk/CPU) of the dev environment to be 2-10x faster than production?
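For reference, this is how the figures for the top-level Limit nodes work out when the upper cost bound is divided by the actual end time in milliseconds (numbers copied from the two plans above):

```sql
-- Estimated cost units "processed" per millisecond of actual run time.
SELECT round(41638.20  / 159.581,   1) AS dev_cost_units_per_ms,   -- roughly 261
       round(822401.40 / 21279.591, 1) AS prod_cost_units_per_ms;  -- roughly 39
```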
Since the production environment has a much more complex query plan, it looks like it's doing its job. Undoubtedly, the dev environment's plan and many more have been considered, and deemed too costly. And the 20-100 variance isn't that much in my experience (but again, not a DBA) and shows that there isn't anything way off the mark. Still, you may want to run a VACUUM on the DB just in case.
I'm not experienced and patient enough to decode the full query, but could there be a denormalization/NOSQL-ization point for optimization?
The biggest bottleneck seems to be the external merge sort on disk at about 90 MB. If the production environment has enough memory, you may want to increase the relevant setting to do the sort in memory. That seems to be the work_mem parameter here, though you'll want to read through the rest of the memory settings.
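A session-level test might look like this. The value is an assumption: an in-memory sort generally needs more memory than the 94200kB the same data took on disk, so pick something comfortably above that.

```sql
SHOW work_mem;            -- current setting

SET work_mem = '256MB';   -- illustrative; applies per sort/hash operation

-- Re-run EXPLAIN ANALYZE for the production query here and check whether
-- "Sort Method: external merge  Disk: ..." changes to an in-memory method.

RESET work_mem;
```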
I'd also suggest having a look at the index usage statistics. Many options with partial and functional indices exist.
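One way to look at index usage is the pg_stat_user_indexes view. The sketch below is restricted to the three tables that appear in the plans above:

```sql
-- Indexes with idx_scan = 0 have never been used since the last stats reset.
SELECT relname, indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM   pg_stat_user_indexes
WHERE  relname IN ('scenario_ad_groups',
                   'scenario_ad_group_vendor_instances',
                   'scenario_ad_group_performances')
ORDER  BY relname, idx_scan;
```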
It seems to me that your dev data is much "simpler" than the production data. As an example:
Development:
-> Index Scan using index_scenario_ad_group_vendor_instances_on_ad_group_id on scenario_ad_group_vendor_instances
(cost=0.00..1737.02 rows=27447 width=8)
(actual time=0.007..7.021 rows=16528 loops=1)
Filter: (ad_vendor_id = ANY ('{1,2,3}'::integer[]))
Production:
-> Seq Scan on scenario_ad_group_vendor_instances
(cost=0.00..11336.31 rows=317174 width=8)
(actual time=0.020..451.496 rows=431770 loops=1)
Filter: (ad_vendor_id = ANY ('{1,2,3}'::integer[]))
This means that in dev 27447 matching rows were estimated upfront and 16528 rows were indeed found. That's the same ballpark and OK.
In production 317174 matching rows were estimated upfront and 431770 rows were found. Also OK.
But comparing dev to prod shows that the numbers differ by a factor of 10 or more. As other answers already indicate, doing 10 times more random seeks (due to index access) might indeed be worse than a plain table scan.
Hence the interesting question is: How many rows does this table contain both in dev and in prod? Is number_returned_rows / number_total_rows comparable between dev and prod?
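That ratio could be measured in both environments with something like the following sketch (the filter is copied from the plans above; the query itself scans the whole table once):

```sql
-- Share of rows matching the filter, to compare between dev and prod.
SELECT count(*) AS total_rows,
       sum(CASE WHEN ad_vendor_id = ANY ('{1,2,3}'::integer[])
                THEN 1 ELSE 0 END) AS matching_rows,
       round(100.0 * sum(CASE WHEN ad_vendor_id = ANY ('{1,2,3}'::integer[])
                              THEN 1 ELSE 0 END)
             / NULLIF(count(*), 0), 1) AS matching_pct
FROM   scenario_ad_group_vendor_instances;
```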
Edit Don't forget: I have picked one index access as an example. A quick glance shows that the other index accesses have the same symptoms.
Try
SET enable_seqscan TO 'off'
before EXPLAIN ANALYZE
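Spelled out for one session, this could look as follows. The SELECT is only a stand-in built from the date filter in the plans above, since the original query is not shown; substitute the real query. Note that enable_seqscan = off merely makes sequential scans look very expensive to the planner, so it is a diagnostic tool rather than a fix.

```sql
SET enable_seqscan = off;   -- session-local; discourages sequential scans

EXPLAIN ANALYZE
SELECT *
FROM   scenario_ad_group_performances
WHERE  date >= '2012-02-01'
AND    date <= '2012-02-28';

RESET enable_seqscan;
```

Comparing the actual time of this forced plan with the original one shows whether the planner's preference for the seq scan was justified.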