IN vs OR in the SQL WHERE clause

When dealing with big databases, which performs better: IN or OR in the SQL WHERE clause?
Is there any difference in the way they are executed?

I assume you want to know the performance difference between the following:
WHERE foo IN ('a', 'b', 'c')
WHERE foo = 'a' OR foo = 'b' OR foo = 'c'
According to the MySQL manual, if the values are constant then IN sorts the list and uses a binary search. I would imagine that OR evaluates them one by one in no particular order, so IN is faster in some circumstances.
The best way to know is to profile both on your database with your specific data to see which is faster.
I tried both on a MySQL table with 1,000,000 rows. When the column is indexed there is no discernible difference in performance; both are nearly instant. When the column is not indexed I got these results:
SELECT COUNT(*) FROM t_inner WHERE val IN (1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000);
1 row fetched in 0.0032 (1.2679 seconds)
SELECT COUNT(*) FROM t_inner WHERE val = 1000 OR val = 2000 OR val = 3000 OR val = 4000 OR val = 5000 OR val = 6000 OR val = 7000 OR val = 8000 OR val = 9000;
1 row fetched in 0.0026 (1.7385 seconds)
So in this case the method using OR is about 30% slower. Adding more terms makes the difference larger. Results may vary on other databases and on other data.
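For anyone who wants to reproduce a comparison like this, here is a rough sketch (MySQL 8.0+; the table name t_inner and column val come from the queries above, everything else is an assumption):

-- Build an unindexed test table with 1,000,000 rows
CREATE TABLE t_inner (val INT);
SET SESSION cte_max_recursion_depth = 1000000;
INSERT INTO t_inner (val)
WITH RECURSIVE seq (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 1000000
)
SELECT n FROM seq;

-- Time both forms against the same data
SELECT COUNT(*) FROM t_inner WHERE val IN (1000, 2000, 3000);
SELECT COUNT(*) FROM t_inner WHERE val = 1000 OR val = 2000 OR val = 3000;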

The best way to find out is to look at the Execution Plan.
I tried it with Oracle, and it was exactly the same.
CREATE TABLE performance_test AS ( SELECT * FROM dba_objects );
SELECT * FROM performance_test
WHERE object_name IN ('DBMS_STANDARD', 'DBMS_REGISTRY', 'DBMS_LOB' );
Even though the query uses IN, the Execution Plan says that it uses OR:
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8 | 1416 | 163 (2)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| PERFORMANCE_TEST | 8 | 1416 | 163 (2)| 00:00:02 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_NAME"='DBMS_LOB' OR "OBJECT_NAME"='DBMS_REGISTRY' OR
"OBJECT_NAME"='DBMS_STANDARD')

The OR operator needs a much more complex evaluation process than the IN construct, because it allows many kinds of conditions, not only equality as IN does.
Here is a list of what you can use with OR but that is not compatible with IN:
greater than, greater than or equal, less than, less than or equal, LIKE, and a few more such as Oracle's REGEXP_LIKE.
In addition, consider that the conditions joined by OR may not always compare the same value, as the example below shows.
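For example, here is a sketch (with hypothetical table and column names) of an OR that IN cannot express, because it mixes operators and columns:

SELECT *
FROM orders
WHERE amount >= 1000
   OR customer_name LIKE 'A%'
   OR shipped_at IS NULL;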
The IN operator is easier for the query optimizer to manage, because it is simply a construct that expresses an OR of multiple equality conditions on the same value. If you write the OR operator yourself, the optimizer may not recognize that you are always applying the = operator to the same value and, unless it performs a deeper and more expensive analysis, it may not conclude that all the involved conditions are equality comparisons on the same column, which precludes optimized search methods such as the binary search already mentioned.
[EDIT]
A given optimizer may not implement an optimized IN evaluation process today, but that does not mean it never will (for example, after a database version upgrade). If you use the OR operator, that optimized evaluation will not apply in your case.

I think Oracle is smart enough to convert the less efficient one (whichever that is) into the other. So the answer should rather depend on the readability of each, where I think IN clearly wins.

OR makes sense (from a readability point of view) when there are fewer values to be compared.
IN is useful especially when you have a dynamic source whose values you want to compare against.
Another alternative is to use a JOIN with a temporary table, as sketched below.
I don't think performance should be a problem, provided you have the necessary indexes.
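A minimal sketch of that temporary-table alternative, reusing the foo example from the top of the question (the table names t and lookup_values are hypothetical):

-- Put the comparison values into a temporary table and join against it
CREATE TEMPORARY TABLE lookup_values (foo VARCHAR(10) PRIMARY KEY);
INSERT INTO lookup_values VALUES ('a'), ('b'), ('c');

SELECT t.*
FROM t
JOIN lookup_values lv ON lv.foo = t.foo;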

I'll add info for PostgreSQL version 11.8 (released 2020-05-14).
IN may be significantly faster. As an example, here is a table with ~23M rows.
Query with OR:
explain analyse select sum(mnozstvi_rozdil)
from product_erecept
where okres_nazev = 'Brno-město' or okres_nazev = 'Pardubice';
-- execution plan
Finalize Aggregate (cost=725977.36..725977.37 rows=1 width=32) (actual time=4536.796..4540.748 rows=1 loops=1)
-> Gather (cost=725977.14..725977.35 rows=2 width=32) (actual time=4535.010..4540.732 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=724977.14..724977.15 rows=1 width=32) (actual time=4519.338..4519.339 rows=1 loops=3)
-> Parallel Bitmap Heap Scan on product_erecept (cost=15589.71..724264.41 rows=285089 width=4) (actual time=135.832..4410.525 rows=230706 loops=3)
Recheck Cond: (((okres_nazev)::text = 'Brno-město'::text) OR ((okres_nazev)::text = 'Pardubice'::text))
Rows Removed by Index Recheck: 3857398
Heap Blocks: exact=11840 lossy=142202
-> BitmapOr (cost=15589.71..15589.71 rows=689131 width=0) (actual time=140.985..140.986 rows=0 loops=1)
-> Bitmap Index Scan on product_erecept_x_okres_nazev (cost=0.00..8797.61 rows=397606 width=0) (actual time=99.371..99.371 rows=397949 loops=1)
Index Cond: ((okres_nazev)::text = 'Brno-město'::text)
-> Bitmap Index Scan on product_erecept_x_okres_nazev (cost=0.00..6450.00 rows=291525 width=0) (actual time=41.612..41.612 rows=294170 loops=1)
Index Cond: ((okres_nazev)::text = 'Pardubice'::text)
Planning Time: 0.162 ms
Execution Time: 4540.829 ms
Query with IN:
explain analyse select sum(mnozstvi_rozdil)
from product_erecept
where okres_nazev in ('Brno-město', 'Pardubice');
-- execution plan
Aggregate (cost=593199.90..593199.91 rows=1 width=32) (actual time=855.706..855.707 rows=1 loops=1)
-> Index Scan using product_erecept_x_okres_nazev on product_erecept (cost=0.56..591477.07 rows=689131 width=4) (actual time=1.326..645.597 rows=692119 loops=1)
Index Cond: ((okres_nazev)::text = ANY ('{Brno-město,Pardubice}'::text[]))
Planning Time: 0.136 ms
Execution Time: 855.743 ms
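As the Index Cond line shows, the IN list is planned as = ANY over an array, so writing the ANY form explicitly is equivalent (a sketch against the same table):

explain analyse select sum(mnozstvi_rozdil)
from product_erecept
where okres_nazev = any (array['Brno-město', 'Pardubice']);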

Even though you use the IN operator, MS SQL Server will automatically convert it to OR operators. You can see this if you analyze the execution plan. So with a long IN list it can be slightly better to write the ORs yourself; it will at least save a few nanoseconds of that conversion.
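A sketch of how to check this in SQL Server (the table and column names are hypothetical; SET SHOWPLAN_TEXT prints the estimated plan instead of executing the query):

SET SHOWPLAN_TEXT ON;
GO
SELECT * FROM some_table WHERE foo IN ('a', 'b', 'c');
-- The plan's predicate shows up as foo = 'a' OR foo = 'b' OR foo = 'c'
GO
SET SHOWPLAN_TEXT OFF;
GO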

I ran a SQL query with a large number of OR conditions (350). Postgres executed it in 437.80 ms.
The same query using IN:
23.18 ms

Related

How to optimize this "select count" SQL? (postgres array comparison)

There is a table with 10 million records, and it has a column whose type is array. It looks like this:
id | content | contained_special_ids
----------------------------------------
1 | abc | { 1, 2 }
2 | abd | { 1, 3 }
3 | abe | { 1, 4 }
4 | abf | { 3 }
5 | abg | { 2 }
6 | abh | { 3 }
I want to know how many records there are whose contained_special_ids includes 3, so my SQL is:
select count(*) from my_table where contained_special_ids #> array[3]
It works fine when the data is small, but it takes a long time (about 30+ seconds) when the table has 10 million records.
I have added an index to this column:
"index_my_table_on_contained_special_ids" gin (contained_special_ids)
So, how to optimize this select count query?
Thanks a lot!
UPDATE
Below is the EXPLAIN output:
Finalize Aggregate (cost=1049019.17..1049019.18 rows=1 width=8) (actual time=44343.230..44362.224 rows=1 loops=1)
Output: count(*)
-> Gather (cost=1049018.95..1049019.16 rows=2 width=8) (actual time=44340.332..44362.217 rows=3 loops=1)
Output: (PARTIAL count(*))
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=1048018.95..1048018.96 rows=1 width=8) (actual time=44337.615..44337.615 rows=1 loops=3)
Output: PARTIAL count(*)
Worker 0: actual time=44336.442..44336.442 rows=1 loops=1
Worker 1: actual time=44336.564..44336.564 rows=1 loops=1
-> Parallel Bitmap Heap Scan on public.my_table (cost=9116.31..1046912.22 rows=442694 width=0) (actual time=330.602..44304.221 rows=391431 loops=3)
Recheck Cond: (my_table.contained_special_ids #> '{12511}'::bigint[])
Rows Removed by Index Recheck: 501077
Heap Blocks: exact=67496 lossy=109789
Worker 0: actual time=329.547..44301.513 rows=409272 loops=1
Worker 1: actual time=329.794..44304.582 rows=378538 loops=1
-> Bitmap Index Scan on index_my_table_on_contained_special_ids (cost=0.00..8850.69 rows=1062465 width=0) (actual time=278.413..278.414 rows=1176563 loops=1)
Index Cond: (my_table.contained_special_ids #> '{12511}'::bigint[])
Planning Time: 1.041 ms
Execution Time: 44362.262 ms
Increase work_mem until the lossy blocks go away. Also, make sure the table is well vacuumed to support index-only bitmap scans, and that you are using a new enough version (which you should tell us) to support those. Finally, you can try increasing effective_io_concurrency.
Also, post plans as text, not images; and turn on track_io_timing.
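A sketch of those knobs in SQL form (the values are placeholders to experiment with, not recommendations):

-- More memory for the bitmap so it stays exact instead of going lossy
set work_mem = '256MB';

-- Keep the visibility map fresh so heap fetches can be skipped where possible
vacuum (analyze) my_table;

-- Allow more prefetching, and record I/O timings in EXPLAIN (ANALYZE, BUFFERS) output
set effective_io_concurrency = 200;
set track_io_timing = on;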
There is no way to optimize such a query, due to two factors:
The use of a non-atomic value, which violates the FIRST NORMAL FORM.
The fact that PostgreSQL is unable to perform aggregate computation quickly.
On the first problem (first normal form):
Every value in a table's columns must be atomic, and of course an array containing multiple values is not atomic.
No index on such a column can be truly efficient, because its type violates 1NF.
This can be mitigated by using a separate table instead of an array, as sketched below.
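A minimal sketch of that normalization (the child table my_table_contained_special_ids and its columns are hypothetical names; my_table and contained_special_ids come from the question):

-- One row per (record, special id) pair instead of an array column
create table my_table_contained_special_ids (
    my_table_id bigint not null references my_table (id),
    special_id  bigint not null
);
create index on my_table_contained_special_ids (special_id);

-- Count of records containing special id 3
-- (use count(distinct my_table_id) if a special id can repeat within one record)
select count(*) from my_table_contained_special_ids where special_id = 3;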
On the poor performance of PG's aggregates:
PostgreSQL uses an MVCC model that keeps phantom (dead) records and valid records in the same data pages, so to count the valid records it needs to read all the records one by one to distinguish which ones should be counted and which should not.
Most other DBMSs do not work the way PG does; Oracle and SQL Server, for example, do not keep phantom records inside the data pages, and some others store the exact count of valid rows in the page header.
As an example, read the tests I have done comparing COUNT and other aggregate functions between PG and SQL Server; some queries run 1500 times faster on SQL Server.

postgresql st_contains performance

SELECT
a.geom, 'tk' category,
ROUND(avg(tk), 1) tk
FROM
tb_grid_4326_100m a left outer join
(
SELECT
tk-273.15 tk, geom
FROM
tb_points
WHERE
hour = '23'
) b ON st_contains(a.geom, b.geom)
GROUP BY
a.geom
QUERY PLAN |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Finalize GroupAggregate (cost=54632324.85..54648025.25 rows=50698 width=184) (actual time=8522.042..8665.129 rows=50698 loops=1) |
Group Key: a.geom |
-> Gather Merge (cost=54632324.85..54646504.31 rows=101396 width=152) (actual time=8522.032..8598.567 rows=50698 loops=1) |
Workers Planned: 2 |
Workers Launched: 2 |
-> Partial GroupAggregate (cost=54631324.83..54633800.68 rows=50698 width=152) (actual time=8490.577..8512.725 rows=16899 loops=3) |
Group Key: a.geom |
-> Sort (cost=54631324.83..54631785.36 rows=184212 width=130) (actual time=8490.557..8495.249 rows=16996 loops=3) |
Sort Key: a.geom |
Sort Method: external merge Disk: 2296kB |
Worker 0: Sort Method: external merge Disk: 2304kB |
Worker 1: Sort Method: external merge Disk: 2296kB |
-> Nested Loop Left Join (cost=0.41..54602621.56 rows=184212 width=130) (actual time=1.729..8475.942 rows=16996 loops=3) |
-> Parallel Seq Scan on tb_grid_4326_100m a (cost=0.00..5866.24 rows=21124 width=120) (actual time=0.724..2.846 rows=16899 loops=3) |
-> Index Scan using sidx_tb_points on tb_points (cost=0.41..2584.48 rows=10 width=42) (actual time=0.351..0.501 rows=1 loops=50698)|
Index Cond: (((hour)::text = '23'::text) AND (geom # a.geom)) |
Filter: st_contains(a.geom, geom) |
Rows Removed by Filter: 0 |
Planning Time: 1.372 ms |
Execution Time: 8667.418 ms |
I want to join the 100m grid table with a 100,000-point table using the st_contains function.
The 100m grid table has 75,769 records, and the tb_points table has 2,434,536 records.
When a time condition is given, the tb_points table returns about 100,000 records.
(As a result, about 75,000 records JOIN about 100,000 records.)
(Index information)
100m grid table using gist(geom),
tb_points table using gist(hour, geom)
It took 30 seconds. How can I improve the performance?
It is hard to give a definitive answer, but here are several things you can try:
For a multicolumn gist index, it is often a good idea to put the most selectively used column first. In your case, that would have the index be on (geom, hour), not (hour, geom). On the other hand, it can also be better to put the faster column first, and testing for scalar equality should be much faster than testing for containment. You would have to do the test and see which factor is more important for you.
You could try for an index-only scan, which doesn't need to visit the table. That could save a lot of random IO. To do that you would need the index gist (hour, geom) INCLUDE (tk, geom). The geom column in a gist index is not considered to be "returnable", so it also needs to be put in the INCLUDE part in order to get the IOS.
Finally, you could partition the table tb_points on "hour". Then you wouldn't need to put "hour" into the gist index, as it is already fulfilled by the partitioning.
And these can be mixed and matched, so you could also swap the column order in the INCLUDE index, or you could try to get both partitioning and the INCLUDE index working together.
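A sketch of suggestions 1 and 3 in SQL (the reordered index assumes the btree_gist extension, which the existing gist(hour, geom) index already implies; the partitioned table names are hypothetical):

-- Suggestion 1: put the containment column first in the composite gist index
CREATE INDEX tb_points_geom_hour_gist ON tb_points USING gist (geom, hour);

-- Suggestion 3: partition by hour so the per-partition index only needs geom
CREATE TABLE tb_points_part (LIKE tb_points) PARTITION BY LIST (hour);
CREATE TABLE tb_points_h23 PARTITION OF tb_points_part FOR VALUES IN ('23');
CREATE INDEX ON tb_points_part USING gist (geom);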

How to optimize this simple yet slow query

I have a relatively simple table with several columns, two of which are expires_at (Date) and museum_id (BIGINT, foreign key). Both are indexed, and there is also a compound index on the pair. The table contains around 3 million rows.
Running query as simple as this takes around 90 seconds to complete:
SELECT *
FROM external_users
WHERE museum_id = 356
AND ((expires_at > '2022-02-16 07:35:39.818117') OR expires_at IS NULL)
Here is the explain analyze output:
Bitmap Heap Scan on external_users (cost=2595.76..148500.40 rows=59259 width=1255) (actual time=4901.257..90786.702 rows=94272 loops=1)
Recheck Cond: (((museum_id = 356) AND (expires_at > '2022-02-16'::date)) OR ((museum_id = 356) AND (expires_at IS NULL)))
Rows Removed by Index Recheck: 391889
Heap Blocks: exact=34133 lossy=33698
-> BitmapOr (cost=2595.76..2595.76 rows=63728 width=0) (actual time=4671.804..4671.806 rows=0 loops=1)
-> Bitmap Index Scan on index_external_users_on_museum_id_and_expires_at (cost=0.00..2187.79 rows=54336 width=0) (actual time=1229.564..1229.564 rows=33671 loops=1)
Index Cond: ((museum_id = 356) AND (expires_at > '2022-02-16'::date))
-> Bitmap Index Scan on index_external_users_on_museum_id_and_expires_at (cost=0.00..378.34 rows=9391 width=0) (actual time=3442.238..3442.238 rows=64337 loops=1)
Index Cond: ((museum_id = 356) AND (expires_at IS NULL))
Planning Time: 266.470 ms
Execution Time: 90838.777 ms
I can't really see anything helpful in the explain/analyze output, but that might be related to my lack of experience with it. My peer reviewer also didn't see anything interesting in there, which makes me think: is there anything I can do to help Postgres handle queries like that faster, or is it just the way it is for tables with over 3M records?
I will explain some rules and ways to optimize this query.
1 - When you use OR in the WHERE conditions, the database cannot use the indexes efficiently. Using UNION ALL instead is recommended. Example:
select *
from external_users
where museum_id = 356
and expires_at > '2022-02-16 07:35:39.818117'
union all
select *
from external_users
where museum_id = 356
and expires_at is null
2 - Your expires_at field may be a timestamp type, but date types are faster than timestamp types, because timestamps also store hours, minutes, and seconds. An index on a timestamp column is also larger than an index on a date column. If you need to store the full datetime, you can cast in the query. For the best performance you must then create a function-based index (in PostgreSQL this is called an expression index) rather than a standard index.
select *
from external_users
where museum_id = 356
and expires_at::date > '2022-02-16'
union all
select *
from external_users
where museum_id = 356
and expires_at is NULL
/*
We must cast the expires_at column to date when creating the index, because the query casts it the same way: the index expression has to match the expression used in the WHERE clause.
*/
create index external_users_expires_at_idx
ON external_users USING btree ((expires_at::date));
3 - If your WHERE conditions always use the same two or three fields together, it is recommended to create one combined index on those fields rather than separate indexes. In your query you probably always use the museum_id and expires_at fields. Sample index creation code:
create index external_users_full_index on external_users using btree (museum_id, (expires_at ::date));
The most important of all these points is the first rule: avoid the OR operator.

Improve PostgreSQL query performance

I'm running this query in our database:
select
(
select least(2147483647, sum(pb.nr_size))
from tb_pr_dc pd
inner join tb_pr_dc_bn pb on 1=1
and pb.id_pr_dc_bn = pd.id_pr_dc_bn
where 1=1
and pd.id_pr = pt.id_pr -- outer query column
)
from
(
select regexp_split_to_table('[list of 500 ids]', ',')::integer id_pr
) pt
;
Which outputs 500 rows having a single result column and takes around 1 min and 43 secs to run. The explain (analyze, verbose, buffers) outputs the following plan:
Subquery Scan on pt (cost=0.00..805828.19 rows=1000 width=8) (actual time=96.791..103205.872 rows=500 loops=1)
Output: (SubPlan 1)
Buffers: shared hit=373771 read=153484
-> Result (cost=0.00..22.52 rows=1000 width=4) (actual time=0.434..3.729 rows=500 loops=1)
Output: ((regexp_split_to_table('[list of 500 ids]', ',')::integer id_pr)
-> ProjectSet (cost=0.00..5.02 rows=1000 width=32) (actual time=0.429..2.288 rows=500 loops=1)
Output: (regexp_split_to_table('[list of 500 ids]', ',')::integer id_pr
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
SubPlan 1
-> Aggregate (cost=805.78..805.80 rows=1 width=8) (actual time=206.399..206.400 rows=1 loops=500)
Output: LEAST('2147483647'::bigint, sum((pb.nr_size)::integer))
Buffers: shared hit=373771 read=153484
-> Nested Loop (cost=0.87..805.58 rows=83 width=4) (actual time=1.468..206.247 rows=219 loops=500)
Output: pb.nr_size
Inner Unique: true
Buffers: shared hit=373771 read=153484
-> Index Scan using tb_pr_dc_in05 on db.tb_pr_dc pd (cost=0.43..104.02 rows=83 width=4) (actual time=0.233..49.289 rows=219 loops=500)
Output: pd.id_pr_dc, pd.ds_pr_dc, pd.id_pr, pd.id_user_in, pd.id_user_ex, pd.dt_in, pd.dt_ex, pd.ds_mt_ex, pd.in_at, pd.id_tp_pr_dc, pd.id_pr_xz (...)
Index Cond: ((pd.id_pr)::integer = pt.id_pr)
Buffers: shared hit=24859 read=64222
-> Index Scan using tb_pr_dc_bn_pk on db.tb_pr_dc_bn pb (cost=0.43..8.45 rows=1 width=8) (actual time=0.715..0.715 rows=1 loops=109468)
Output: pb.id_pr_dc_bn, pb.ds_ex, pb.ds_md_dc, pb.ds_m5_dc, pb.nm_aq, pb.id_user, pb.dt_in, pb.ob_pr_dc, pb.nr_size, pb.ds_sg, pb.ds_cr_ch, pb.id_user_ (...)
Index Cond: ((pb.id_pr_dc_bn)::integer = (pd.id_pr_dc_bn)::integer)
Buffers: shared hit=348912 read=89262
Planning Time: 1.151 ms
Execution Time: 103206.243 ms
The logic is: for each id_pr chosen (in the list of 500 ids) calculate the sum of the integer column pb.nr_size associated with them, returning the lesser value between this amount and the number 2,147,483,647. The result must contain 500 rows, one for each id, and we already know that they'll match at least one row in the subquery, so will not produce null values.
The index tb_pr_dc_in05 is a b-tree on id_pr only, which is of integer type. The index tb_pr_dc_bn_pk is a b-tree on the primary key id_pr_dc_bn only, which is of integer type also. Table tb_pr_dc has many rows for each id_pr. Actually, we have 209,217 unique id_prs in tb_pr_dc for a total of 13,910,855 rows. Table tb_pr_dc_bn has the same amount of rows.
As can be seen, we defined 500 ids to query tb_pr_dc, finding 109,468 rows (less than 1% of the table size) and then finding the same amount looking in tb_pr_dc_bn. Imo, the indexes look fine and the amount of rows to evaluate is minimal, so I can't understand why it's taking so much time to run this query. A lot of other queries reading a lot more of data on other tables and doing more calculations are running fine. The DBA just ran a reindex and vacuum analyze, but still it's running the same slow way. We are running PostgreSQL 11 on Linux. I'm running this query in a replica without concurrent access.
What could I be missing that could improve this query performance?
Thanks for your attention.
The time is spent jumping all over the table to find 109,468 randomly scattered rows, issuing random IO requests to do so. You can verify that by turning track_io_timing on and redoing the plans (probably just leave it turned on globally and by default; the overhead is low and the value it produces is high), but I'm sure enough that I don't need to see that output before reaching this conclusion. The other queries that are faster are probably accessing fewer disk pages because they access data that is more tightly packed, or is organized so that it can be read more sequentially. In fact, I would say your query is quite fast given how many pages it had to read.
You ask why so many columns are output in the internal nodes of the plan. The reason is that PostgreSQL often just passes around pointers to where the tuple lives in shared_buffers, and the tuple being pointed to has the columns that the table itself has. It could allocate memory in which to store a reformatted version of the tuple with the unnecessary columns stripped out, but that would generally be more work, not less. If there is a reason to copy and re-form the tuple anyway, it will remove the extraneous columns while it does so. But it won't do it without a reason.
One way to speed this up is to create indexes which will enable index-only scans. Those would be on tb_pr_dc (id_pr, id_pr_dc_bn) and on tb_pr_dc_bn (id_pr_dc_bn, nr_size).
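A minimal sketch of those two covering indexes (the index names are hypothetical; the columns are the ones listed above):

create index tb_pr_dc_id_pr_dc_bn_idx on tb_pr_dc (id_pr, id_pr_dc_bn);
create index tb_pr_dc_bn_nr_size_idx on tb_pr_dc_bn (id_pr_dc_bn, nr_size);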
If this isn't enough, there might be other ways to improve this too; but I can't think through them if I keep getting distracted by the long strings of unmemorable unpronounceable gibberish you have for table and column names.

Two column, bulk, random access retrieval from sparse table using PostgreSQL

I'm storing a relatively reasonable (~3 million) number of very small rows (the entire DB is ~300MB) in PostgreSQL. The data is organized thus:
Table "public.tr_rating"
Column | Type | Modifiers
-----------+--------------------------+---------------------------------------------------------------
user_id | bigint | not null
place_id | bigint | not null
rating | smallint | not null
rated_at | timestamp with time zone | not null default now()
rating_id | bigint | not null default nextval('tr_rating_rating_id_seq'::regclass)
Indexes:
"tr_rating_rating_id_key" UNIQUE, btree (rating_id)
"tr_rating_user_idx" btree (user_id, place_id)
Now, I would like to retrieve the ratings deposited over a set of places by your friends (a set of users)
The natural query I wrote is:
SELECT * FROM tr_rating WHERE user_id=ANY(?) AND place_id=ANY(?)
The size of the user_id array is ~500, while the place_id array is ~10,000
This turns into:
Bitmap Heap Scan on tr_rating (cost=2453743.43..2492013.53 rows=3627 width=34) (actual time=10174.044..10174.234 rows=1111 loops=1)
Buffers: shared hit=27922214
-> Bitmap Index Scan on tr_rating_user_idx (cost=0.00..2453742.53 rows=3627 width=0) (actual time=10174.031..10174.031 rows=1111 loops=1)
Index Cond: ((user_id = ANY (...) ))
Buffers: shared hit=27922214
Total runtime: 10279.290 ms
The first suspicious thing I see here is that it estimates that scanning the index for 500 users will take 2.5M disk seeks
Everything else here looks reasonable, except that it takes ten full seconds to do this! The index (via \di) looks like:
public | tr_rating_user_idx | index | tr_rating | 67 MB |
at 67 MB, I would expect it could tear through the index in a trivial amount of time, even if it has to do it sequentially. As the buffers accounting from the EXPLAIN ANALYZE shows, everything is already in memory (as all values other than shared_hit are zero and thus suppressed).
I have tried various combinations of REINDEX, VACUUM, ANALYZE, and CLUSTER with no measurable improvement.
Any thoughts as to what I am doing wrong here, or how I could debug further? I'm mystified; 67MB of data is a puny amount to spend so much time searching through...
For reference, the hardware is an 8-way recent Xeon with 8 15K 300GB drives in RAID-10. Should be enough :-)
EDIT
Per btilly's suggestion, I tried out temporary tables:
=> explain analyze select * from tr_rating NATURAL JOIN user_ids NATURAL JOIN place_ids;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=49133.46..49299.51 rows=3524 width=34) (actual time=13.801..15.676 rows=1111 loops=1)
Hash Cond: (place_ids.place_id = tr_rating.place_id)
-> Seq Scan on place_ids (cost=0.00..59.66 rows=4066 width=8) (actual time=0.009..0.619 rows=4251 loops=1)
-> Hash (cost=48208.02..48208.02 rows=74035 width=34) (actual time=13.767..13.767 rows=7486 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 527kB
-> Nested Loop (cost=0.00..48208.02 rows=74035 width=34) (actual time=0.047..11.055 rows=7486 loops=1)
-> Seq Scan on user_ids (cost=0.00..31.40 rows=2140 width=8) (actual time=0.006..0.399 rows=2189 loops=1)
-> Index Scan using tr_rating_user_idx on tr_rating (cost=0.00..22.07 rows=35 width=34) (actual time=0.002..0.003 rows=3 loops=2189)
Index Cond: (tr_rating.user_id = user_ids.user_id)
Total runtime: 15.931 ms
Why is the query plan so much better when faced with temporary tables, rather than arrays? The data is exactly the same, simply presented in a different way. Additionally, I've measured the time to create a temporary table at tens to hundreds of milliseconds, which is a pretty steep overhead to pay. Can I continue to use the array approach, yet allow Postgres to use the hash join which is so much faster, instead?
EDIT 2
By creating a hash index on user_id, the runtime reduces to 250ms. Adding another hash index to place_id reduces the runtime further to 50ms. This is still twice as slow as using temporary tables, but the overhead of making the table negates any gains I see. I still do not understand how doing O(500) lookups in a btree index can take ten seconds, but the hash index is unquestionably much faster.
It looks like it is taking each row in the index, then scanning through your user_id array, and, if it finds a match, scanning through your place_id array. That means that for 3 million rows it has to scan through 500 user_ids, and for each match it scans through 10,000 place_ids. Those matches are individually fast, but this is a poor algorithm that could potentially result in up to 30 billion operations.
You'd be better off creating two temporary tables, giving them indexes, and doing a join. If it does a hash join, then you'd potentially have 6 million hash lookups. (3 million for user_id and 3 million for place_id.)
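A minimal sketch of that approach, matching the temporary tables that already appear in the question's EDIT (the literal ids are placeholders):

-- Load the friend ids and place ids into indexed temporary tables
create temporary table user_ids (user_id bigint primary key);
create temporary table place_ids (place_id bigint primary key);
insert into user_ids values (101), (102);   -- ... roughly 500 ids in practice
insert into place_ids values (201), (202);  -- ... roughly 10,000 ids in practice
analyze user_ids;
analyze place_ids;

-- The planner can now pick a hash join instead of re-scanning arrays per row
select r.*
from tr_rating r
join user_ids u  on u.user_id  = r.user_id
join place_ids p on p.place_id = r.place_id;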