PostgreSQL st_contains performance

SELECT
a.geom, 'tk' category,
ROUND(avg(tk), 1) tk
FROM
tb_grid_4326_100m a left outer join
(
SELECT
tk-273.15 tk, geom
FROM
tb_points
WHERE
hour = '23'
) b ON st_contains(a.geom, b.geom)
GROUP BY
a.geom
QUERY PLAN |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Finalize GroupAggregate (cost=54632324.85..54648025.25 rows=50698 width=184) (actual time=8522.042..8665.129 rows=50698 loops=1) |
Group Key: a.geom |
-> Gather Merge (cost=54632324.85..54646504.31 rows=101396 width=152) (actual time=8522.032..8598.567 rows=50698 loops=1) |
Workers Planned: 2 |
Workers Launched: 2 |
-> Partial GroupAggregate (cost=54631324.83..54633800.68 rows=50698 width=152) (actual time=8490.577..8512.725 rows=16899 loops=3) |
Group Key: a.geom |
-> Sort (cost=54631324.83..54631785.36 rows=184212 width=130) (actual time=8490.557..8495.249 rows=16996 loops=3) |
Sort Key: a.geom |
Sort Method: external merge Disk: 2296kB |
Worker 0: Sort Method: external merge Disk: 2304kB |
Worker 1: Sort Method: external merge Disk: 2296kB |
-> Nested Loop Left Join (cost=0.41..54602621.56 rows=184212 width=130) (actual time=1.729..8475.942 rows=16996 loops=3) |
-> Parallel Seq Scan on tb_grid_4326_100m a (cost=0.00..5866.24 rows=21124 width=120) (actual time=0.724..2.846 rows=16899 loops=3) |
-> Index Scan using sidx_tb_points on tb_points (cost=0.41..2584.48 rows=10 width=42) (actual time=0.351..0.501 rows=1 loops=50698)|
Index Cond: (((hour)::text = '23'::text) AND (geom @ a.geom)) |
Filter: st_contains(a.geom, geom) |
Rows Removed by Filter: 0 |
Planning Time: 1.372 ms |
Execution Time: 8667.418 ms |
I want to join a 100m grid table with a ~100,000-row points table using the st_contains function.
The 100m grid table has 75,769 records, and the tb_points table has 2,434,536 records.
When a time condition is given, the tb_points table returns about 100,000 records.
(As a result, about 75,000 records JOIN about 100,000 records.)
(Index information)
100m grid table using gist(geom),
tb_points table using gist(hour, geom)
It took 30 seconds. How can I improve the performance?

It is hard to give a definitive answer, but here are several things you can try:
For a multicolumn GiST index, it is often a good idea to put the most selective column first. In your case, that would make the index (geom, hour), not (hour, geom). On the other hand, it can also be better to put the faster column first, and testing for scalar equality should be much faster than testing for containment. You would have to test and see which factor is more important for you.
You could try for an index-only scan, which doesn't need to visit the table; that could save a lot of random I/O. To do that you would need the index gist (hour, geom) INCLUDE (tk, geom). The geom column in a GiST index is not considered to be "returnable", so it also needs to be put in the INCLUDE part in order to get the IOS.
Finally, you could partition the table tb_points on "hour". Then you wouldn't need to put "hour" into the GiST index at all, since that condition is already satisfied by the partitioning.
And these can be mixed and matched, so you could also swap the column order in the INCLUDE index, or you could try to get both partitioning and the INCLUDE index working together.
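If it helps, here is a sketch of what those suggestions might look like in SQL (all index and table names below are made up; INCLUDE on GiST indexes requires PostgreSQL 12 or later, and a GiST index mixing a scalar column like hour with a geometry column relies on the btree_gist extension, which the existing gist(hour, geom) index suggests is already in place):
-- Option 1: swap the key column order of the multicolumn GiST index.
CREATE INDEX sidx_tb_points_geom_hour ON tb_points USING gist (geom, hour);
-- Option 2: aim for an index-only scan; geom is repeated in INCLUDE
-- because the geom key column of a GiST index is not "returnable".
CREATE INDEX sidx_tb_points_ios ON tb_points
    USING gist (hour, geom) INCLUDE (tk, geom);
-- Option 3: partition on "hour", so each partition only needs gist(geom).
-- (The existing table would have to be recreated and reloaded.)
CREATE TABLE tb_points_part (LIKE tb_points) PARTITION BY LIST (hour);
CREATE TABLE tb_points_h23 PARTITION OF tb_points_part FOR VALUES IN ('23');
CREATE INDEX ON tb_points_h23 USING gist (geom);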

Related

How to optimize this "select count" SQL? (postgres array comparison)

There is a table with 10 million records, and it has a column whose type is array. It looks like:
id | content | contained_special_ids
----------------------------------------
1 | abc | { 1, 2 }
2 | abd | { 1, 3 }
3 | abe | { 1, 4 }
4 | abf | { 3 }
5 | abg | { 2 }
6 | abh | { 3 }
and I want to know how many records there are whose contained_special_ids includes 3, so my SQL is:
select count(*) from my_table where contained_special_ids @> array[3]
It works fine when the data is small; however, it takes a long time (30+ seconds) when the table has 10 million records.
I have added index to this column:
"index_my_table_on_contained_special_ids" gin (contained_special_ids)
So, how to optimize this select count query?
Thanks a lot!
UPDATE
below is the explain:
Finalize Aggregate (cost=1049019.17..1049019.18 rows=1 width=8) (actual time=44343.230..44362.224 rows=1 loops=1)
Output: count(*)
-> Gather (cost=1049018.95..1049019.16 rows=2 width=8) (actual time=44340.332..44362.217 rows=3 loops=1)
Output: (PARTIAL count(*))
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=1048018.95..1048018.96 rows=1 width=8) (actual time=44337.615..44337.615 rows=1 loops=3)
Output: PARTIAL count(*)
Worker 0: actual time=44336.442..44336.442 rows=1 loops=1
Worker 1: actual time=44336.564..44336.564 rows=1 loops=1
-> Parallel Bitmap Heap Scan on public.my_table (cost=9116.31..1046912.22 rows=442694 width=0) (actual time=330.602..44304.221 rows=391431 loops=3)
Recheck Cond: (my_table.contained_special_ids @> '{12511}'::bigint[])
Rows Removed by Index Recheck: 501077
Heap Blocks: exact=67496 lossy=109789
Worker 0: actual time=329.547..44301.513 rows=409272 loops=1
Worker 1: actual time=329.794..44304.582 rows=378538 loops=1
-> Bitmap Index Scan on index_my_table_on_contained_special_ids (cost=0.00..8850.69 rows=1062465 width=0) (actual time=278.413..278.414 rows=1176563 loops=1)
Index Cond: (my_table.contained_special_ids @> '{12511}'::bigint[])
Planning Time: 1.041 ms
Execution Time: 44362.262 ms
Increase work_mem until the lossy blocks go away. Also, make sure the table is well vacuumed to support index-only bitmap scans, and that you are using a new enough version (which you should tell us) to support those. Finally, you can try increasing effective_io_concurrency.
Also, post plans as text, not images; and turn on track_io_timing.
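A minimal sketch of those suggestions (the values are only illustrative starting points, not tuned recommendations; track_io_timing can only be changed by a superuser):
-- Raise work_mem for this session until the "lossy" part of the
-- "Heap Blocks" line disappears from the bitmap heap scan.
SET work_mem = '256MB';
-- Keep the visibility map current so heap fetches can be skipped.
VACUUM (ANALYZE) my_table;
-- Bitmap heap scans can prefetch more aggressively with this raised.
SET effective_io_concurrency = 200;
-- Report I/O timings in EXPLAIN output, then re-check the plan.
SET track_io_timing = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM my_table WHERE contained_special_ids @> array[3];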
There is no way to optimize such a query, due to two factors:
The use of a non-atomic value, which violates FIRST NORMAL FORM.
The fact that PostgreSQL is unable to perform aggregate computations quickly.
On the first problem... 1st NORMAL FORM
Every value in a table's columns must be atomic... and of course an array containing multiple values is not atomic.
So no index will be truly efficient on such a column, because its type violates 1NF.
This can be mitigated by using a separate table instead of an array.
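A minimal sketch of that normalized design (all table, column, and constraint names here are hypothetical, and it assumes my_table.id is its primary key):
-- One row per (record, special id) pair instead of an array column.
CREATE TABLE my_table_special_ids (
    my_table_id bigint NOT NULL REFERENCES my_table (id),
    special_id  bigint NOT NULL,
    PRIMARY KEY (my_table_id, special_id)
);
-- Supports lookups by special_id.
CREATE INDEX ON my_table_special_ids (special_id);
-- The count then becomes a plain equality search.
SELECT count(*) FROM my_table_special_ids WHERE special_id = 3;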
On the poor performance of PG's aggregates
PG uses an MVCC model that mixes, within the same data pages, phantom (dead) row versions and valid rows, so to count the valid rows it needs to read all the rows one by one to distinguish which ones should be counted from those that must not be...
Most other DBMSs do not work like PG; Oracle and SQL Server, for example, do not keep phantom records inside the data pages, and some others store the exact count of valid rows in the page header...
As an example, read the tests I have done comparing COUNT and other aggregate functions between PG and SQL Server; some queries run 1500 times faster on SQL Server...

Using the same order as a subquery without the results getting sorted unnecessarily

I have a large table over which I want to compute some window functions by scanning over an index, and I want to stop scanning and produce the row as soon as one of a number of conditions involving these aggregates holds (so WHERE ... LIMIT 1 is out of the question, since I can't have window functions inside the WHERE).
Let me expand further on my concrete case:
Here's my events table:
=> \d events
Table "public.events"
Column | Type | Collation | Nullable | Default
------------+-------------------+-----------+----------+---------
block | character varying | | not null |
chainid | bigint | | not null |
height | bigint | | not null |
idx | bigint | | not null |
module | character varying | | not null |
modulehash | character varying | | not null |
name | character varying | | not null |
params | jsonb | | not null |
paramtext | character varying | | not null |
qualname | character varying | | not null |
requestkey | character varying | | not null |
Indexes:
"events_pkey" PRIMARY KEY, btree (block, idx, requestkey)
"events_height_chainid_idx" btree (height DESC, chainid, idx)
After much experimentation, I've arrived at a query that returns exactly the row I want and it also produces exactly the query plan that I'm envisioning:
=> EXPLAIN ANALYZE SELECT *
FROM (
SELECT *
, ROW_NUMBER() OVER (ORDER BY height DESC, block, requestkey, idx) as scan_num
, count(*) FILTER (WHERE qualname ILIKE '%transfer%') OVER
( ORDER BY height DESC, block, requestkey, idx
ROWS BETWEEN unbounded PRECEDING AND CURRENT ROW
) AS foundCnt
FROM events
ORDER BY height DESC, block, requestkey, idx
) as scanned_events
WHERE foundCnt = 3 OR scan_num = 100000
LIMIT 1
;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1065.81..1400.34 rows=1 width=397) (actual time=0.095..0.096 rows=1 loops=1)
-> Subquery Scan on scanned_events (cost=1065.81..165535223.46 rows=494824 width=397) (actual time=0.095..0.095 rows=1 loops=1)
Filter: ((scanned_events.foundcnt = 3) OR (scanned_events.scan_num = 100000))
Rows Removed by Filter: 2
-> WindowAgg (cost=1065.81..164791126.56 rows=49606460 width=397) (actual time=0.089..0.094 rows=3 loops=1)
-> WindowAgg (cost=1065.81..163550965.06 rows=49606460 width=389) (actual time=0.081..0.083 rows=4 loops=1)
-> Incremental Sort (cost=1065.81..162434819.71 rows=49606460 width=381) (actual time=0.076..0.076 rows=5 loops=1)
Sort Key: events.height DESC, events.block, events.requestkey, events.idx
Presorted Key: events.height
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 56kB Peak Memory: 56kB
-> Index Scan using events_height_chainid_idx on events (cost=0.56..158424783.98 rows=49606460 width=381) (actual time=0.015..0.035 rows=53 loops=1)
Planning Time: 0.112 ms
Execution Time: 0.128 ms
(13 rows)
Here's what this query is trying to achieve: scan through the events table counting the rows whose qualname contains 'transfer', and return the row as soon as you find the 3rd match OR you end up scanning 100,000 rows.
So, my high-level intention is to look for some condition (involving a moving aggregate) but I want to put an upper bound on how many rows I'm willing to fetch. But if I happen to find what I'm looking for quickly, I also don't want to go through the rest of those 100000 rows unnecessarily (similar to the query plan above, where it ends up scanning just 53 rows).
If you inspect the query plan, this query is doing exactly what I want, but it has a serious flaw: it is not guaranteed to produce the correct result. It merely happens to, because the correct result falls out of the most natural way to execute the query; the top-level SELECT has no ORDER BY clause, so Postgres could in theory execute it differently and return any row that happens to have foundCnt = 3.
In order to remedy this flaw, I've tried the following:
=> EXPLAIN ANALYZE SELECT *
FROM (
SELECT *
, ROW_NUMBER() OVER (ORDER BY height DESC, block, requestkey, idx) as scan_num
, count(*) FILTER (WHERE qualname ILIKE '%transfer%') OVER
( ORDER BY height DESC, block, requestkey, idx
ROWS BETWEEN unbounded PRECEDING AND CURRENT ROW
) AS foundCnt
FROM events
ORDER BY height DESC, block, requestkey, idx
) as scanned_events
WHERE foundCnt = 3 OR scan_num = 100000
ORDER BY height DESC, block, requestkey, idx
LIMIT 1
;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=16173553.41..16173571.35 rows=1 width=397) (actual time=86703.480..88314.937 rows=1 loops=1)
-> Subquery Scan on scanned_events (cost=16173553.41..25051383.19 rows=494821 width=397) (actual time=86435.692..88047.148 rows=1 loops=1)
Filter: ((scanned_events.foundcnt = 3) OR (scanned_events.scan_num = 100000))
Rows Removed by Filter: 2
-> WindowAgg (cost=16173553.41..24307291.63 rows=49606104 width=397) (actual time=86435.682..88047.143 rows=3 loops=1)
-> WindowAgg (cost=16173553.41..23067139.03 rows=49606104 width=389) (actual time=86435.662..88047.120 rows=4 loops=1)
-> Gather Merge (cost=16173553.41..21951001.69 rows=49606104 width=381) (actual time=86435.630..88047.085 rows=5 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=16172553.39..16224226.41 rows=20669210 width=381) (actual time=86147.622..86147.642 rows=106 loops=3)
Sort Key: events.height DESC, events.block, events.requestkey, events.idx
Sort Method: external merge Disk: 6535240kB
Worker 0: Sort Method: external merge Disk: 6503568kB
Worker 1: Sort Method: external merge Disk: 6506736kB
-> Parallel Seq Scan on events (cost=0.00..2852191.10 rows=20669210 width=381) (actual time=43.151..4135.334 rows=16430767 loops=3)
Planning Time: 0.353 ms
JIT:
Functions: 16
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 3.412 ms, Inlining 105.447 ms, Optimization 204.392 ms, Emission 87.327 ms, Total 400.578 ms
Execution Time: 89345.338 ms
(21 rows)
Now it ends up scanning the entire table, even though I've just explicitly specified what it was already doing. I've tried many variations on the latter query: moving the subquery to a CTE, ordering the outer SELECT by scan_num, ordering both SELECTs by scan_num, ordering only the outer SELECT by height DESC, block, requestkey, idx. I've honestly lost track of the variations I've tried, but as soon as there is an ORDER BY clause on the outer SELECT, Postgres ends up scanning the entire table.
So, my question is: is there any way to achieve what I want without relying on fragile semantics (like the query that does exactly what I want)? I.e., what would be the correct way to write a Postgres query that scans a bounded number of rows and returns as soon as a condition (involving window functions) is satisfied?
Addressing the comments
@nbk suggested adding an index on height DESC, block, requestkey, idx, i.e. the exact order we're looking for. Even though I wanted to avoid adding that index because I'm happy with the performance of my first query (so the index shouldn't be necessary), I still tried it, but it didn't change the query plan of the second query at all; that plan doesn't use any indexes anyway. It just made the first query slightly faster, as expected, since that one does use the index.

Avoiding external disk sort for aggregate query

We have a table that contains raw analytics (like Google Analytics and similar) numbers for views on our videos. It contains numbers like raw views, downloads, loads, etc. Each video is identified by a video_id.
Data is recorded per day, but because we extract a number of metrics, each day can contain multiple records for a specific video_id. Example:
date | video_id | country | source | downloads | etc...
----------------------------------------------------------------
2014-01-02 | 1 | us | facebook | 10 |
2014-01-02 | 1 | dk | facebook | 13 |
2014-01-02 | 1 | dk | admin | 20 |
I have a query where I need to get aggregate data for all videos that have new data beyond a certain date. To get the video IDs I do this query: SELECT video_id FROM table WHERE date >= '2014-01-01' GROUP BY video_id (alternatively I could do a DISTINCT(video_id) without a GROUP BY; performance is identical).
Once I have these IDs I need the total aggregate data (for all time). Combined, this turns into the following query:
SELECT
video_id,
SUM(downloads),
SUM(loads),
<more SUMs>
FROM
table
WHERE
video_id IN (SELECT video_id FROM table WHERE date >= '2014-01-01' GROUP BY video_id)
GROUP BY
video_id
There are around 10 columns we SUM (5-10 depending on the query). EXPLAIN ANALYZE gives the following:
GroupAggregate (cost=2370840.59..2475948.90 rows=42537 width=72) (actual time=153790.362..162668.962 rows=87661 loops=1)
-> Sort (cost=2370840.59..2378295.16 rows=2981826 width=72) (actual time=153790.329..155833.770 rows=3285001 loops=1)
Sort Key: table.video_id
Sort Method: external merge Disk: 263528kB
-> Hash Join (cost=57066.94..1683266.53 rows=2981826 width=72) (actual time=740.210..143814.921 rows=3285001 loops=1)
Hash Cond: (table.video_id = table.video_id)
-> Seq Scan on table (cost=0.00..1550549.52 rows=5963652 width=72) (actual time=1.768..47613.953 rows=5963652 loops=1)
-> Hash (cost=56924.17..56924.17 rows=11422 width=8) (actual time=734.881..734.881 rows=87661 loops=1)
Buckets: 2048 Batches: 4 (originally 1) Memory Usage: 1025kB
-> HashAggregate (cost=56695.73..56809.95 rows=11422 width=8) (actual time=693.769..715.665 rows=87661 loops=1)
-> Index Only Scan using table_recent_ids on table (cost=0.00..52692.41 rows=1601328 width=8) (actual time=1.279..314.249 rows=1614339 loops=1)
Index Cond: (date >= '2014-01-01'::date)
Heap Fetches: 0
Total runtime: 162693.367 ms
As you can see, it's using a (quite big) external disk merge sort and taking a long time. I am unsure why the sort is triggered in the first place, and I am looking for a way to avoid it, or at least minimize it. I know that increasing work_mem can alleviate external disk merges, but in this case the amount required seems excessive, and having work_mem above 500MB seems like a bad idea.
The table has two (relevant) indexes: One on video_id alone and another on (date, video_id).
EDIT: Updated query after running ANALYZE table.
Edited to match the revised query plan.
You are getting a sort because Postgres needs to sort the result rows to group them.
This query looks like it could really benefit from an index on table(video_id, date), or even just an index on table(video_id). Having such an index would likely avoid the need to sort.
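For instance (the index names are made up, and "table" below stands in for the real table name used in the question):
-- An index ordered by video_id can let the GROUP BY avoid the sort.
CREATE INDEX table_video_id_date_idx ON "table" (video_id, date);
-- or, more simply:
CREATE INDEX table_video_id_idx ON "table" (video_id);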
Edited (#2) to suggest
You could also consider testing an alternative query such as this:
SELECT
video_id,
MAX(date) as latest_date,
<SUMs>
FROM
table
GROUP BY
video_id
HAVING
MAX(date) >= '2014-01-01'
That avoids any join or subquery, and given an index on table(video_id [, other columns]) it can be hoped that the sort will be avoided as well. It will compute the sums over the whole base table before filtering out the groups you don't want, but that operation is O(n), whereas sorting is O(m log m). Thus, if the date criterion is not very selective then checking it after the fact may be an improvement.

Why would this query in Postgres result in a 15-day lock?

Everything in my database was running normally -- reads, writes, lots of activity.
Then I wanted to add a column to the foos table. The foos table became unavailable. I quit the code executing the query and looked at locks in the system. I found the below query had a lock for 15 days. After that was my table-changing query, and after that were a bunch more queries which involved the foos table.
What would cause this query to get stuck for 15 days? This is on PostgreSQL 9.1.3.
select generate_report, b.count from
(select count(1), date_trunc('hour',f.event_happened_at) from
foos as f, bars as b
where age(f.event_happened_at) <= interval '24 hour' and f.id=b.foo_id and b.thing_type='Dog' and b.thing_id=26631
group by date_trunc('hour',f.event_happened_at)) as e
right join generate_report(date_trunc('hour',now()) - interval '24 hour',now(),interval '1 hour')
on generate_report = b.date_trunc
order by generate_report;
update: info from pg_stat_activity
 backend_start                 |          xact_start           |          query_start          | waiting
-------------------------------+-------------------------------+-------------------------------+---------
 2012-11-19 18:38:40.029818+00 | 2012-11-19 18:38:40.145172+00 | 2012-11-19 18:38:40.145172+00 | f
update: output of explain:
Merge Left Join (cost=14135.74..14138.08 rows=1000 width=16)
Merge Cond: (generate_report.generate_report = (date_trunc('hour'::text, f.event_happened_at)))
-> Sort (cost=12.97..13.47 rows=1000 width=8)
Sort Key: generate_report.generate_report
-> Function Scan on generate_report (cost=0.00..3.00 rows=1000 width=8)
-> Sort (cost=14122.77..14122.81 rows=67 width=16)
Sort Key: (date_trunc('hour'::text, f.event_happened_at))
-> HashAggregate (cost=14121.93..14122.17 rows=67 width=8)
-> Hash Join (cost=3237.14..14121.86 rows=67 width=8)
Hash Cond: (b.foo_id = f.id)
-> Index Scan using index_bars_on_thing_type_and_thing_id_and_baz on bars b (cost=0.00..10859.88 rows=10937 width=4)
Index Cond: (((thing_type)::text = 'Dog'::text) AND (thing_id = 26631))
-> Hash (cost=3131.42..3131.42 rows=30207 width=12)
-> Seq Scan on foos f (cost=0.00..3131.42 rows=30207 width=12)
Filter: (age((('now'::text)::date)::timestamp without time zone, event_happened_at) <= '24:00:00'::interval)
Per the info from pg_stat_activity you posted, it looks like this query is still executing (waiting = f). This means that the lock just has not been released yet.
You may want to start taking a look at your query to see if there are problems with its structure or with the query plan it is generating. 15 days is definitely too long; most long-running queries should be treated as a problem once they exceed about 10 minutes.
For assistance with that, you will need to post your table DDL, some sample data, and some idea of how many rows are in each table. That would probably be best posed as a new question, but you can always edit this one.
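If it helps, here is a minimal sketch of how to look for long-running backends on 9.1, where pg_stat_activity exposes procpid, waiting, and current_query rather than the pid/state/query columns of later releases (the 10-minute threshold is just an example):
-- Backends whose transaction has been open longer than 10 minutes,
-- along with whether each one is currently waiting on a lock.
SELECT procpid,
       waiting,
       now() - xact_start AS xact_age,
       current_query
FROM pg_stat_activity
WHERE xact_start < now() - interval '10 minutes'
ORDER BY xact_start;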

Two column, bulk, random access retrieval from sparse table using PostgreSQL

I'm storing a relatively reasonable (~3 million) number of very small rows (the entire DB is ~300MB) in PostgreSQL. The data is organized thus:
Table "public.tr_rating"
Column | Type | Modifiers
-----------+--------------------------+---------------------------------------------------------------
user_id | bigint | not null
place_id | bigint | not null
rating | smallint | not null
rated_at | timestamp with time zone | not null default now()
rating_id | bigint | not null default nextval('tr_rating_rating_id_seq'::regclass)
Indexes:
"tr_rating_rating_id_key" UNIQUE, btree (rating_id)
"tr_rating_user_idx" btree (user_id, place_id)
Now, I would like to retrieve the ratings deposited over a set of places by your friends (a set of users)
The natural query I wrote is:
SELECT * FROM tr_rating WHERE user_id=ANY(?) AND place_id=ANY(?)
The size of the user_id array is ~500, while the place_id array is ~10,000
This turns into:
Bitmap Heap Scan on tr_rating (cost=2453743.43..2492013.53 rows=3627 width=34) (actual time=10174.044..10174.234 rows=1111 loops=1)
Buffers: shared hit=27922214
-> Bitmap Index Scan on tr_rating_user_idx (cost=0.00..2453742.53 rows=3627 width=0) (actual time=10174.031..10174.031 rows=1111 loops=1)
Index Cond: ((user_id = ANY (...) ))
Buffers: shared hit=27922214
Total runtime: 10279.290 ms
The first suspicious thing I see here is that it estimates that scanning the index for 500 users will take 2.5M disk seeks
Everything else here looks reasonable, except that it takes ten full seconds to do this! The index (via \di) looks like:
public | tr_rating_user_idx | index | tr_rating | 67 MB |
at 67 MB, I would expect it could tear through the index in a trivial amount of time, even if it has to do it sequentially. As the buffers accounting from the EXPLAIN ANALYZE shows, everything is already in memory (as all values other than shared_hit are zero and thus suppressed).
I have tried various combinations of REINDEX, VACUUM, ANALYZE, and CLUSTER with no measurable improvement.
Any thoughts as to what I am doing wrong here, or how I could debug further? I'm mystified; 67MB of data is a puny amount to spend so much time searching through...
For reference, the hardware is an 8-way recent Xeon with 8 15K 300GB drives in RAID-10. Should be enough :-)
EDIT
Per btilly's suggestion, I tried out temporary tables:
=> explain analyze select * from tr_rating NATURAL JOIN user_ids NATURAL JOIN place_ids;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=49133.46..49299.51 rows=3524 width=34) (actual time=13.801..15.676 rows=1111 loops=1)
Hash Cond: (place_ids.place_id = tr_rating.place_id)
-> Seq Scan on place_ids (cost=0.00..59.66 rows=4066 width=8) (actual time=0.009..0.619 rows=4251 loops=1)
-> Hash (cost=48208.02..48208.02 rows=74035 width=34) (actual time=13.767..13.767 rows=7486 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 527kB
-> Nested Loop (cost=0.00..48208.02 rows=74035 width=34) (actual time=0.047..11.055 rows=7486 loops=1)
-> Seq Scan on user_ids (cost=0.00..31.40 rows=2140 width=8) (actual time=0.006..0.399 rows=2189 loops=1)
-> Index Scan using tr_rating_user_idx on tr_rating (cost=0.00..22.07 rows=35 width=34) (actual time=0.002..0.003 rows=3 loops=2189)
Index Cond: (tr_rating.user_id = user_ids.user_id)
Total runtime: 15.931 ms
Why is the query plan so much better when faced with temporary tables rather than arrays? The data is exactly the same, simply presented in a different way. Additionally, I've measured the time to create a temporary table at tens to hundreds of milliseconds, which is a pretty steep overhead to pay. Can I continue to use the array approach, yet get Postgres to use the much faster hash join instead?
EDIT 2
By creating a hash index on user_id, the runtime reduces to 250ms. Adding another hash index to place_id reduces the runtime further to 50ms. This is still twice as slow as using temporary tables, but the overhead of making the table negates any gains I see. I still do not understand how doing O(500) lookups in a btree index can take ten seconds, but the hash index is unquestionably much faster.
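For reference, a sketch of the hash indexes described in EDIT 2 (the index names are hypothetical; note that hash indexes were not WAL-logged, and therefore not crash-safe, before PostgreSQL 10):
CREATE INDEX tr_rating_user_hash_idx  ON tr_rating USING hash (user_id);
CREATE INDEX tr_rating_place_hash_idx ON tr_rating USING hash (place_id);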
It looks like it is taking each row in the index, scanning through your user_id array, and then, if it finds a match, scanning through your place_id array. That means that for 3 million rows it has to scan through 100 user_ids, and for each match it scans through 10,000 place_ids. Each individual comparison is fast, but this is a poor algorithm that could potentially result in up to 30 billion operations.
You'd be better off creating two temporary tables, giving them indexes, and doing a join. If it does a hash join, then you'd potentially have 6 million hash lookups. (3 million for user_id and 3 million for place_id.)
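A minimal sketch of that approach, matching the NATURAL JOIN query from the question's first edit (how the temporary tables get populated is up to the application, e.g. COPY or multi-row INSERTs):
-- The PRIMARY KEY constraints give each temp table an index.
CREATE TEMP TABLE user_ids  (user_id  bigint PRIMARY KEY);
CREATE TEMP TABLE place_ids (place_id bigint PRIMARY KEY);
-- ... populate user_ids and place_ids here ...
-- Fresh statistics help the planner pick the hash join.
ANALYZE user_ids;
ANALYZE place_ids;
SELECT r.*
FROM tr_rating r
JOIN user_ids  USING (user_id)
JOIN place_ids USING (place_id);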