Performance of max() vs ORDER BY DESC + LIMIT 1 - sql

I was troubleshooting a few slow SQL queries today and don't quite understand the performance difference below:
When trying to extract the max(timestamp) from a data table based on some condition, using MAX() is slower than ORDER BY timestamp DESC LIMIT 1 if a matching row exists, but considerably faster if no matching row is found.
SELECT timestamp
FROM data JOIN sensors ON ( sensors.id = data.sensor_id )
WHERE sensors.station_id = 4
ORDER BY timestamp DESC
LIMIT 1;
(0 rows)
Time: 1314.544 ms
SELECT timestamp
FROM data JOIN sensors ON ( sensors.id = data.sensor_id )
WHERE sensors.station_id = 5
ORDER BY timestamp DESC
LIMIT 1;
(1 row)
Time: 10.890 ms
SELECT MAX(timestamp)
FROM data JOIN sensors ON ( sensors.id = data.sensor_id )
WHERE sensors.station_id = 4;
(0 rows)
Time: 0.869 ms
SELECT MAX(timestamp)
FROM data JOIN sensors ON ( sensors.id = data.sensor_id )
WHERE sensors.station_id = 5;
(1 row)
Time: 84.087 ms
There are indexes on (timestamp) and (sensor_id, timestamp), and I noticed that Postgres uses very different query plans and indexes for both cases:
QUERY PLAN (ORDER BY)
--------------------------------------------------------------------------------------------------------
Limit (cost=0.43..9.47 rows=1 width=8)
-> Nested Loop (cost=0.43..396254.63 rows=43823 width=8)
Join Filter: (data.sensor_id = sensors.id)
-> Index Scan using timestamp_ind on data (cost=0.43..254918.66 rows=4710976 width=12)
-> Materialize (cost=0.00..6.70 rows=2 width=4)
-> Seq Scan on sensors (cost=0.00..6.69 rows=2 width=4)
Filter: (station_id = 4)
(7 rows)
QUERY PLAN (MAX)
----------------------------------------------------------------------------------------------------------
Aggregate (cost=3680.59..3680.60 rows=1 width=8)
-> Nested Loop (cost=0.43..3571.03 rows=43823 width=8)
-> Seq Scan on sensors (cost=0.00..6.69 rows=2 width=4)
Filter: (station_id = 4)
-> Index Only Scan using sensor_ind_timestamp on data (cost=0.43..1389.59 rows=39258 width=12)
Index Cond: (sensor_id = sensors.id)
(6 rows)
So my two questions are:
Where does this performance difference come from? I've seen the accepted answer here MIN/MAX vs ORDER BY and LIMIT, but that doesn't quite seem to apply here. Any good resources would be appreciated.
Are there better ways to increase performance in all cases (matching row vs no matching row) than adding an EXISTS check?
EDIT to address the questions in the comments below. I kept the initial query plans above for future reference:
Table definitions:
Table "public.sensors"
Column | Type | Modifiers
----------------------+------------------------+-----------------------------------------------------------------
id | integer | not null default nextval('sensors_id_seq'::regclass)
station_id | integer | not null
....
Indexes:
"sensor_primary" PRIMARY KEY, btree (id)
"ind_station_id" btree (station_id, id)
"ind_station" btree (station_id)
Table "public.data"
Column | Type | Modifiers
-----------+--------------------------+------------------------------------------------------------------
id | integer | not null default nextval('data_id_seq'::regclass)
timestamp | timestamp with time zone | not null
sensor_id | integer | not null
avg | integer |
Indexes:
"timestamp_ind" btree ("timestamp" DESC)
"sensor_ind" btree (sensor_id)
"sensor_ind_timestamp" btree (sensor_id, "timestamp")
"sensor_ind_timestamp_desc" btree (sensor_id, "timestamp" DESC)
Note that I added ind_station_id on sensors just now after @Erwin's suggestion below. Timings haven't really changed drastically, still >1200ms in the ORDER BY DESC + LIMIT 1 case and ~0.9ms in the MAX case.
Query Plans:
QUERY PLAN (ORDER BY)
----------------------------------------------------------------------------------------------------------
Limit (cost=0.58..9.62 rows=1 width=8) (actual time=2161.054..2161.054 rows=0 loops=1)
Buffers: shared hit=3418066 read=47326
-> Nested Loop (cost=0.58..396382.45 rows=43823 width=8) (actual time=2161.053..2161.053 rows=0 loops=1)
Join Filter: (data.sensor_id = sensors.id)
Buffers: shared hit=3418066 read=47326
-> Index Scan using timestamp_ind on data (cost=0.43..255048.99 rows=4710976 width=12) (actual time=0.047..1410.715 rows=4710976 loops=1)
Buffers: shared hit=3418065 read=47326
-> Materialize (cost=0.14..4.19 rows=2 width=4) (actual time=0.000..0.000 rows=0 loops=4710976)
Buffers: shared hit=1
-> Index Only Scan using ind_station_id on sensors (cost=0.14..4.18 rows=2 width=4) (actual time=0.004..0.004 rows=0 loops=1)
Index Cond: (station_id = 4)
Heap Fetches: 0
Buffers: shared hit=1
Planning time: 0.478 ms
Execution time: 2161.090 ms
(15 rows)
QUERY PLAN (MAX)
----------------------------------------------------------------------------------------------------------
Aggregate (cost=3678.08..3678.09 rows=1 width=8) (actual time=0.009..0.009 rows=1 loops=1)
Buffers: shared hit=1
-> Nested Loop (cost=0.58..3568.52 rows=43823 width=8) (actual time=0.006..0.006 rows=0 loops=1)
Buffers: shared hit=1
-> Index Only Scan using ind_station_id on sensors (cost=0.14..4.18 rows=2 width=4) (actual time=0.005..0.005 rows=0 loops=1)
Index Cond: (station_id = 4)
Heap Fetches: 0
Buffers: shared hit=1
-> Index Only Scan using sensor_ind_timestamp on data (cost=0.43..1389.59 rows=39258 width=12) (never executed)
Index Cond: (sensor_id = sensors.id)
Heap Fetches: 0
Planning time: 0.435 ms
Execution time: 0.048 ms
(13 rows)
So just like in the earlier EXPLAIN output, the ORDER BY plan does an index scan using timestamp_ind on data, which is not done in the MAX case.
Postgres version:
Postgres from the Ubuntu repos: PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 5.2.1-21ubuntu2) 5.2.1 20151003, 64-bit
Note that there are NOT NULL constraints in place, so ORDER BY won't have to deal with NULL values.
Note also that I'm largely interested in where the difference comes from. While not ideal, I can retrieve data relatively quickly using EXISTS (<1ms) and then SELECT (~11ms).
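Roughly, that two-step workaround looks like this (a simplified sketch, not my exact production queries):
SELECT EXISTS (
    SELECT 1
    FROM data JOIN sensors ON ( sensors.id = data.sensor_id )
    WHERE sensors.station_id = 4
);
-- only if the EXISTS probe returns true:
SELECT timestamp
FROM data JOIN sensors ON ( sensors.id = data.sensor_id )
WHERE sensors.station_id = 4
ORDER BY timestamp DESC
LIMIT 1;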

There does not seem to be an index on sensors.station_id, which is important here.
There is an actual difference between max() and ORDER BY DESC + LIMIT 1, which many people seem to miss: NULL values sort first in descending sort order. So ORDER BY timestamp DESC LIMIT 1 returns a NULL value if one exists, while the aggregate function max() ignores NULL values and returns the latest non-null timestamp. ORDER BY timestamp DESC NULLS LAST LIMIT 1 would be equivalent.
For your case, since your column d.timestamp is defined NOT NULL (as your update revealed), there is no effective difference. An index with DESC NULLS LAST and the same clause in the ORDER BY for the LIMIT query should still serve you best. I suggest these indexes (my query below builds on the 2nd one):
sensors(station_id, id)
data(sensor_id, timestamp DESC NULLS LAST)
You can drop the other indexes sensor_ind_timestamp and sensor_ind_timestamp_desc unless they are in use otherwise (unlikely, but possible).
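In DDL form, the two suggested indexes might look like this (index names are placeholders):
CREATE INDEX sensors_station_id_id_idx ON sensors (station_id, id);
CREATE INDEX data_sensor_id_timestamp_idx ON data (sensor_id, "timestamp" DESC NULLS LAST);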
Much more importantly, there is another difficulty: The filter on the first table sensors returns few, but still (possibly) multiple rows. Postgres expects to find 2 rows (rows=2) in your added EXPLAIN output.
The perfect technique would be an index-skip-scan (a.k.a. loose index scan) for the second table data - which is not currently implemented (up to at least Postgres 15). There are various workarounds. See:
Optimize GROUP BY query to retrieve latest row per user
The best should be:
SELECT d.timestamp
FROM sensors s
CROSS JOIN LATERAL (
SELECT timestamp
FROM data
WHERE sensor_id = s.id
ORDER BY timestamp DESC NULLS LAST
LIMIT 1
) d
WHERE s.station_id = 4
ORDER BY d.timestamp DESC NULLS LAST
LIMIT 1;
The choice between max() and ORDER BY / LIMIT hardly matters in comparison. You might as well:
SELECT max(d.timestamp) AS timestamp
FROM sensors s
CROSS JOIN LATERAL (
SELECT timestamp
FROM data
WHERE sensor_id = s.id
ORDER BY timestamp DESC NULLS LAST
LIMIT 1
) d
WHERE s.station_id = 4;
Or:
SELECT max(d.timestamp) AS timestamp
FROM sensors s
CROSS JOIN LATERAL (
SELECT max(timestamp) AS timestamp
FROM data
WHERE sensor_id = s.id
) d
WHERE s.station_id = 4;
Or even with a correlated subquery, shortest of all:
SELECT max((SELECT max(timestamp) FROM data WHERE sensor_id = s.id)) AS timestamp
FROM sensors s
WHERE station_id = 4;
Note the double parentheses!
The additional advantage of LIMIT in a LATERAL join is that you can retrieve arbitrary columns of the selected row, not just the latest timestamp (one column).
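For illustration, a variation of the first query above that returns whole rows from data instead of just the timestamp (a sketch):
SELECT d.*
FROM sensors s
CROSS JOIN LATERAL (
   SELECT *
   FROM data
   WHERE sensor_id = s.id
   ORDER BY timestamp DESC NULLS LAST
   LIMIT 1
   ) d
WHERE s.station_id = 4
ORDER BY d.timestamp DESC NULLS LAST
LIMIT 1;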
Related:
Why do NULL values come first when ordering DESC in a PostgreSQL query?
What is the difference between a LATERAL JOIN and a subquery in PostgreSQL?
Select first row in each GROUP BY group?
Optimize groupwise maximum query

The ORDER BY query plan scans the index timestamp_ind, but an index that leads with timestamp does not help with a search for a particular sensor.
To resolve an equality condition (like sensors.id = data.sensor_id) the column has to come first in the index. Try adding an index that allows searching on sensor_id and, within a sensor, is sorted by timestamp:
create index sensor_timestamp_ind on data(sensor_id, timestamp);
Does adding that index speed up the query?


Using the same order as a subquery without the results getting sorted unnecessarily

I have a large table over which I want to compute some window functions by scanning over an index, and I want to stop scanning and produce the row as soon as one of a number of conditions involving these aggregates holds (so WHERE ... LIMIT 1 is out of the question, since I can't use window functions inside WHERE).
Let me expand further on my concrete case:
Here's my events table:
=> \d events
Table "public.events"
Column | Type | Collation | Nullable | Default
------------+-------------------+-----------+----------+---------
block | character varying | | not null |
chainid | bigint | | not null |
height | bigint | | not null |
idx | bigint | | not null |
module | character varying | | not null |
modulehash | character varying | | not null |
name | character varying | | not null |
params | jsonb | | not null |
paramtext | character varying | | not null |
qualname | character varying | | not null |
requestkey | character varying | | not null |
Indexes:
"events_pkey" PRIMARY KEY, btree (block, idx, requestkey)
"events_height_chainid_idx" btree (height DESC, chainid, idx)
After much experimentation, I've arrived at a query that returns exactly the row I want and it also produces exactly the query plan that I'm envisioning:
=> EXPLAIN ANALYZE SELECT *
FROM (
SELECT *
, ROW_NUMBER() OVER (ORDER BY height DESC, block, requestkey, idx) as scan_num
, count(*) FILTER (WHERE qualname ILIKE '%transfer%') OVER
( ORDER BY height DESC, block, requestkey, idx
ROWS BETWEEN unbounded PRECEDING AND CURRENT ROW
) AS foundCnt
FROM events
ORDER BY height DESC, block, requestkey, idx
) as scanned_events
WHERE foundCnt = 3 OR scan_num = 100000
LIMIT 1
;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1065.81..1400.34 rows=1 width=397) (actual time=0.095..0.096 rows=1 loops=1)
-> Subquery Scan on scanned_events (cost=1065.81..165535223.46 rows=494824 width=397) (actual time=0.095..0.095 rows=1 loops=1)
Filter: ((scanned_events.foundcnt = 3) OR (scanned_events.scan_num = 100000))
Rows Removed by Filter: 2
-> WindowAgg (cost=1065.81..164791126.56 rows=49606460 width=397) (actual time=0.089..0.094 rows=3 loops=1)
-> WindowAgg (cost=1065.81..163550965.06 rows=49606460 width=389) (actual time=0.081..0.083 rows=4 loops=1)
-> Incremental Sort (cost=1065.81..162434819.71 rows=49606460 width=381) (actual time=0.076..0.076 rows=5 loops=1)
Sort Key: events.height DESC, events.block, events.requestkey, events.idx
Presorted Key: events.height
Full-sort Groups: 1 Sort Method: quicksort Average Memory: 56kB Peak Memory: 56kB
-> Index Scan using events_height_chainid_idx on events (cost=0.56..158424783.98 rows=49606460 width=381) (actual time=0.015..0.035 rows=53 loops=1)
Planning Time: 0.112 ms
Execution Time: 0.128 ms
(13 rows)
Here's what this query is trying to achieve: scan through the events table counting the rows whose qualname contains 'transfer', and return the current row as soon as the 3rd match is found OR 100000 rows have been scanned.
So, my high-level intention is to look for some condition (involving a moving aggregate) but I want to put an upper bound on how many rows I'm willing to fetch. But if I happen to find what I'm looking for quickly, I also don't want to go through the rest of those 100000 rows unnecessarily (similar to the query plan above, where it ends up scanning just 53 rows).
If you inspect the query plan, this query is doing exactly what I want, but it has a serious flaw: it's not guaranteed to produce the correct result; it just happens to, because the correct result comes from the most natural way to execute the query. The top-level SELECT has no ORDER BY clause, so Postgres could in theory execute it differently and end up returning any one row that happens to satisfy foundCnt = 3.
In order to remedy this flaw, I've tried the following:
=> EXPLAIN ANALYZE SELECT *
FROM (
SELECT *
, ROW_NUMBER() OVER (ORDER BY height DESC, block, requestkey, idx) as scan_num
, count(*) FILTER (WHERE qualname ILIKE '%transfer%') OVER
( ORDER BY height DESC, block, requestkey, idx
ROWS BETWEEN unbounded PRECEDING AND CURRENT ROW
) AS foundCnt
FROM events
ORDER BY height DESC, block, requestkey, idx
) as scanned_events
WHERE foundCnt = 3 OR scan_num = 100000
ORDER BY height DESC, block, requestkey, idx
LIMIT 1
;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=16173553.41..16173571.35 rows=1 width=397) (actual time=86703.480..88314.937 rows=1 loops=1)
-> Subquery Scan on scanned_events (cost=16173553.41..25051383.19 rows=494821 width=397) (actual time=86435.692..88047.148 rows=1 loops=1)
Filter: ((scanned_events.foundcnt = 3) OR (scanned_events.scan_num = 100000))
Rows Removed by Filter: 2
-> WindowAgg (cost=16173553.41..24307291.63 rows=49606104 width=397) (actual time=86435.682..88047.143 rows=3 loops=1)
-> WindowAgg (cost=16173553.41..23067139.03 rows=49606104 width=389) (actual time=86435.662..88047.120 rows=4 loops=1)
-> Gather Merge (cost=16173553.41..21951001.69 rows=49606104 width=381) (actual time=86435.630..88047.085 rows=5 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=16172553.39..16224226.41 rows=20669210 width=381) (actual time=86147.622..86147.642 rows=106 loops=3)
Sort Key: events.height DESC, events.block, events.requestkey, events.idx
Sort Method: external merge Disk: 6535240kB
Worker 0: Sort Method: external merge Disk: 6503568kB
Worker 1: Sort Method: external merge Disk: 6506736kB
-> Parallel Seq Scan on events (cost=0.00..2852191.10 rows=20669210 width=381) (actual time=43.151..4135.334 rows=16430767 loops=3)
Planning Time: 0.353 ms
JIT:
Functions: 16
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 3.412 ms, Inlining 105.447 ms, Optimization 204.392 ms, Emission 87.327 ms, Total 400.578 ms
Execution Time: 89345.338 ms
(21 rows)
Now it ends up scanning the entire table, even though I've just explicitly specified what it was already doing. I've tried many variations on the latter query, such as moving the subquery into a CTE, ordering the outer SELECT by scan_num, ordering both SELECTs by scan_num, and ordering only the outer SELECT by height DESC, block, requestkey, idx. I honestly lost track of the variations I've already tried, but as soon as I have an ORDER BY clause on the outer SELECT, Postgres ends up scanning the entire table.
So, my question is: Is there any way to achieve what I want without relying on fragile semantics (like the query above that happens to do exactly what I want)? I.e., what would be the correct way to write a Postgres query that will scan a bounded number of rows and return as soon as a condition (involving window functions) is satisfied?
Addressing the comments
@nbk suggested trying to add an index on height DESC, block, requestkey, idx, i.e. the exact order we're looking for. Even though I want to avoid adding that index because I'm happy with the performance of my first query (so the index shouldn't be necessary), I still tried it, but it didn't change the query plan of the second query at all; it doesn't use any indexes anyway. It just made the first query slightly faster, as expected, since that one does use indexes.
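For reference, the index @nbk suggested would look roughly like this (the index name is a placeholder):
CREATE INDEX events_scan_order_idx ON events (height DESC, block, requestkey, idx);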

Query Optimization with WHERE condition and a single JOIN

I have 2 tables with one-to-many relationship.
Users-> 1 million (1)
Requests-> 10 millions (n)
What I'm trying to do is fetch the user along with the latest request made, and be able to filter the whole dataset based on the (last) request's columns.
The current query is fetching the correct results but it is painfully slow. ~7-9 seconds
SELECT *
FROM users AS u
INNER JOIN requests AS r
ON u.id = r.user_id
WHERE (r.created_at = u.last_request_date AND r.ignored = false)
ORDER BY u.last_request_date DESC
LIMIT 10 OFFSET 0
I have also tried moving the r.created_at comparison into a second ON condition instead of filtering in the WHERE clause, but without a difference in performance.
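Roughly, that variant looked like this (a sketch, not the exact query; for an INNER JOIN the two forms are semantically equivalent, which matches seeing no difference):
SELECT *
FROM users AS u
INNER JOIN requests AS r
ON u.id = r.user_id AND r.created_at = u.last_request_date
WHERE r.ignored = false
ORDER BY u.last_request_date DESC
LIMIT 10 OFFSET 0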
UPDATE:
Indexes:
Users: last_request_date
Requests: created_at, user_id(foreign)
Execution plan: https://explain.depesz.com/s/JsLr#source
Limit (cost=1000.88..21080.19 rows=10 width=139) (actual time=15966.670..15990.322 rows=10 loops=1)
Buffers: shared hit=3962420 read=152361
-> Gather Merge (cost=1000.88..757990.77 rows=377 width=139) (actual time=15966.653..15990.138 rows=10 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=3962420 read=152361
-> Nested Loop (cost=0.86..756947.24 rows=157 width=139) (actual time=9456.384..10622.180 rows=7 loops=3)
Buffers: shared hit=3962420 read=152361
-> Parallel Index Scan Backward using users_last_request_date on users "User" (cost=0.42..55742.72 rows=420832 width=75) (actual time=0.061..2443.484 rows=333340 loops=3)
Buffers: shared hit=5102 read=15849
-> Index Scan using requests_user_id on requests (cost=0.43..1.66 rows=1 width=64) (actual time=0.010..0.010 rows=0 loops=1000019)
Index Cond: (user_id = "User".id)
Filter: ((NOT ignored) AND ("User".last_request_date = created_at))
Rows Removed by Filter: 10
Buffers: shared hit=3957318 read=136512
Planning Time: 0.745 ms
Execution Time: 15990.489 ms
The biggest bottleneck in your execution plan is this part; the index on requests might need one more column, created_at, because there is a filter cost:
-> Index Scan using requests_user_id on requests (cost=0.43..1.66 rows=1 width=64) (actual time=0.010..0.010 rows=0 loops=1000019)
Index Cond: (user_id = "User".id)
Filter: ((NOT ignored) AND ("User".last_request_date = created_at))
Rows Removed by Filter: 10
Buffers: shared hit=3957318 read=136512
So you might try creating an index like the one below.
CREATE INDEX IX_requests ON requests (
user_id,
created_at
);
If rows with ignored = false are only a small fraction of the requests table, you can try a partial index, which might help reduce storage and improve index performance.
CREATE INDEX FIX_requests ON requests (
user_id,
created_at
)
WHERE ignored = false;
For the users table, I would use an index like the one below, because there is an ORDER BY on the last_request_date column and users joins requests by id:
CREATE INDEX IX_users ON users (
last_request_date,
id
);
NOTE
I would avoid using SELECT * because it costs extra I/O; in most cases you don't need every column from the table.
Try creating this BTREE index to handle the requests table lookup more efficiently.
CREATE INDEX id_ignored_date ON requests (user_id, ignored, created_at);
Your plan says
-> Index Scan using requests_user_id on requests
(cost=0.43..1.66 rows=1 width=64) (actual time=0.010..0.010 rows=0 loops=1000019)
Index Cond: (user_id = "User".id)
Filter: ((NOT ignored) AND ("User".last_request_date = created_at))
and this index will move the Filter conditions into the Index Cond, which should be faster.
Pro tip: @Kendle is right. Don't use SELECT * in production software, especially performance-sensitive software, unless you have a good reason. It makes your RDBMS server, network, and client program work harder for no good reason.
Edit: Read this about how to use multicolumn BTREE indexes effectively. https://www.postgresql.org/docs/current/indexes-multicolumn.html
As you only need the last 10 users, I would suggest fetching only the last 100 records from requests. This may avoid a million join comparisons; it's worth testing, as the query optimiser may already be doing this.
This number should be adjusted for your application. It may be that the last 10 records will always be 10 different users, or that we need to fetch more than 100 to be sure of having 10 users.
SELECT *
FROM users AS u
INNER JOIN (select * from requests
where ignored = false
order by created_at desc
limit 100) AS r
ON u.id = r.user_id
WHERE (r.created_at = u.last_request_date)
ORDER BY u.last_request_date DESC
LIMIT 10 OFFSET 0

postgresql - join between two large tables takes very long

I have two rather large tables and I need to do a date range join between them. Unfortunately the query takes over 12 hours. I am using PostgreSQL 10.5 running in Docker with at most 5 GB of RAM and up to 12 CPU cores available.
Basically, in the left table I have an Equipment ID and a list of date ranges (from = Timestamp, to = ValidUntil). I then want to join the right table, which has measurements (sensor data) for all of the equipment, so that I only get the sensor data that lies within one of the date ranges (from the left table). Query:
select
A.*,
B."Timestamp" as "PressureTimestamp",
B."PropertyValue" as "Pressure"
from A
inner join B
on B."EquipmentId" = A."EquipmentId"
and B."Timestamp" >= A."Timestamp"
and B."Timestamp" < A."ValidUntil"
This query unfortunately is only utilizing one core, which might be the reason it is running so slowly. Is there a way to rewrite the query so it can be parallelized?
Indexes:
create index if not exists A_eq_timestamp_validUntil on public.A using btree ("EquipmentId", "Timestamp", "ValidUntil");
create index if not exists B_eq_timestamp on public.B using btree ("EquipmentId", "Timestamp");
Tables:
-- contains 332,000 rows
CREATE TABLE A (
"EquipmentId" bigint,
"Timestamp" timestamp without time zone,
"ValidUntil" timestamp without time zone
)
WITH ( OIDS = FALSE )
-- contains 70,000,000 rows
CREATE TABLE B
(
"EquipmentId" bigint,
"Timestamp" timestamp without time zone,
"PropertyValue" double precision
)
WITH ( OIDS = FALSE )
Execution plan (explain ... output):
Nested Loop (cost=176853.59..59023908.95 rows=941684055 width=48)
-> Bitmap Heap Scan on v2_pressure p (cost=176853.16..805789.35 rows=9448335 width=24)
Recheck Cond: ("EquipmentId" = 2956235)
-> Bitmap Index Scan on v2_pressure_eq (cost=0.00..174491.08 rows=9448335 width=0)
Index Cond: ("EquipmentId" = 2956235)
-> Index Scan using v2_prs_eq_timestamp_validuntil on v2_prs prs (cost=0.42..5.16 rows=100 width=32)
Index Cond: (("EquipmentId" = 2956235) AND (p."Timestamp" >= "Timestamp") AND (p."Timestamp" < "ValidUntil"))
Update 1:
Fixed the indexes, according to comments, which improved performance a lot
Index correction is the first resort for fixing slowness, but it only helps to some extent. Given that your tables are big, I would recommend trying Postgres table partitioning, which has built-in support in Postgres.
But you need some filter/partition criteria. I don't see any WHERE clause in your query, so it's hard to suggest one; maybe you can try EquipmentId. This can also help in achieving parallelism.
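Purely as an illustration, range partitioning the measurements table by EquipmentId could look like the sketch below (table/partition names and bounds are placeholders; on PostgreSQL 10 each partition also needs its own index):
-- partitioned parent with the same columns as B
CREATE TABLE B_partitioned (
    "EquipmentId"   bigint NOT NULL,
    "Timestamp"     timestamp without time zone NOT NULL,
    "PropertyValue" double precision
) PARTITION BY RANGE ("EquipmentId");

-- partitions; pick bounds from the real ID distribution
CREATE TABLE B_part_1 PARTITION OF B_partitioned FOR VALUES FROM (0) TO (1000000);
CREATE TABLE B_part_2 PARTITION OF B_partitioned FOR VALUES FROM (1000000) TO (2000000);

-- per-partition indexes (required on PostgreSQL 10)
CREATE INDEX ON B_part_1 ("EquipmentId", "Timestamp");
CREATE INDEX ON B_part_2 ("EquipmentId", "Timestamp");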
-- \i tmp.sql
CREATE TABLE A
( equipmentid bigint NOT NULL
, ztimestamp timestamp without time zone NOT NULL
, validuntil timestamp without time zone NOT NULL
, PRIMARY KEY (equipmentid,ztimestamp)
, UNIQUE (equipmentid,validuntil) -- must be unique, since the intervals don't overlap
) ;
-- contains 70,000,000 rows
CREATE TABLE B
( equipmentid bigint NOT NULL
, ztimestamp timestamp without time zone NOT NULL
, propertyvalue double precision
, PRIMARY KEY (equipmentid,ztimestamp)
) ;
INSERT INTO B(equipmentid,ztimestamp,propertyvalue)
SELECT i,t, random()
FROM generate_series(1,1000) i
CROSS JOIN generate_series('2018-09-01','2018-09-30','1day'::interval) t
;
INSERT INTO A(equipmentid,ztimestamp,validuntil)
SELECT equipmentid,ztimestamp, ztimestamp+ '7 days'::interval
FROM B
WHERE date_part('dow', ztimestamp) =0
;
ANALYZE A;
ANALYZE B;
EXPLAIN
SELECT
A.*,
B.ztimestamp AS pressuretimestamp,
B.propertyvalue AS pressure
FROM A
INNER JOIN B
ON B.equipmentid = A.equipmentid
AND B.ztimestamp >= A.ztimestamp
AND B.ztimestamp < A.validuntil
WHERE A.equipmentid=333 -- I added this; the plan in the question also has a restriction on Id
;
And the resulting plan:
SET
ANALYZE
ANALYZE
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.34..21.26 rows=17 width=40)
-> Index Scan using a_equipmentid_validuntil_key on a (cost=0.17..4.34 rows=5 width=24)
Index Cond: (equipmentid = 333)
-> Index Scan using b_pkey on b (cost=0.17..3.37 rows=3 width=24)
Index Cond: ((equipmentid = 333) AND (ztimestamp >= a.ztimestamp) AND (ztimestamp < a.validuntil))
(5 rows)
That is with my current setting of random_page_cost=1.1;
After setting it to 4.0, I get the same plan as the OP:
SET
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=35.13..54561.69 rows=1416136 width=40) (actual time=1.391..1862.275 rows=225540 loops=1)
-> Bitmap Heap Scan on aa2 (cost=34.71..223.52 rows=1345 width=24) (actual time=1.173..5.223 rows=1345 loops=1)
Recheck Cond: (equipmentid = 5)
Heap Blocks: exact=9
-> Bitmap Index Scan on aa2_equipmentid_validuntil_key (cost=0.00..34.38 rows=1345 width=0) (actual time=1.047..1.048 rows=1345 loops=1)
Index Cond: (equipmentid = 5)
-> Index Scan using bb2_pkey on bb2 (cost=0.42..29.87 rows=1053 width=24) (actual time=0.109..0.757 rows=168 loops=1345)
Index Cond: ((equipmentid = 5) AND (ztimestamp >= aa2.ztimestamp) AND (ztimestamp < aa2.validuntil))
Planning Time: 3.167 ms
Execution Time: 2168.967 ms
(10 rows)
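As an aside, the planner setting mentioned above can be changed per session like this (1.1 is a common choice for SSD-backed storage; the default is 4.0):
SET random_page_cost = 1.1;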

Slow LEFT JOIN on CTE with time intervals

I am trying to debug a query in PostgreSQL that I've built to bucket market data in time buckets in arbitrary time intervals. Here is my table definition:
CREATE TABLE historical_ohlcv (
exchange_symbol TEXT NOT NULL,
symbol_id TEXT NOT NULL,
kafka_key TEXT NOT NULL,
open NUMERIC,
high NUMERIC,
low NUMERIC,
close NUMERIC,
volume NUMERIC,
time_open TIMESTAMP WITH TIME ZONE NOT NULL,
time_close TIMESTAMP WITH TIME ZONE,
CONSTRAINT historical_ohlcv_pkey
PRIMARY KEY (exchange_symbol, symbol_id, time_open)
);
CREATE INDEX symbol_id_idx
ON historical_ohlcv (symbol_id);
CREATE INDEX open_close_symbol_id
ON historical_ohlcv (time_open, time_close, exchange_symbol, symbol_id);
CREATE INDEX time_open_idx
ON historical_ohlcv (time_open);
CREATE INDEX time_close_idx
ON historical_ohlcv (time_close);
The table has ~25m rows currently. My query uses 1 hour as an example, but the interval could be 5 mins, 10 mins, 2 days, etc.
EXPLAIN ANALYZE WITH vals AS (
SELECT
NOW() - '5 months' :: INTERVAL AS frame_start,
NOW() AS frame_end,
INTERVAL '1 hour' AS t_interval
)
, grid AS (
SELECT
start_time,
lead(start_time, 1)
OVER (
ORDER BY start_time ) AS end_time
FROM (
SELECT
generate_series(frame_start, frame_end,
t_interval) AS start_time,
frame_end
FROM vals
) AS x
)
SELECT max(high)
FROM grid g
LEFT JOIN historical_ohlcv ohlcv ON ohlcv.time_open >= g.start_time
AND ohlcv.time_close < g.end_time
WHERE exchange_symbol = 'BINANCE'
AND symbol_id = 'ETHBTC'
GROUP BY start_time;
The WHERE clause could be any valid value in the table.
This technique was inspired by:
Best way to count records by arbitrary time intervals in Rails+Postgres.
The idea is to make a common table and left join your data with that to indicate which bucket stuff is in. This query is really slow! It's currently taking 15s. Based on the query planner, we have a really expensive nested loop:
QUERY PLAN
HashAggregate (cost=2758432.05..2758434.05 rows=200 width=40) (actual time=16023.713..16023.817 rows=542 loops=1)
Group Key: g.start_time
CTE vals
-> Result (cost=0.00..0.02 rows=1 width=32) (actual time=0.005..0.005 rows=1 loops=1)
CTE grid
-> WindowAgg (cost=64.86..82.36 rows=1000 width=16) (actual time=2.986..9.594 rows=3625 loops=1)
-> Sort (cost=64.86..67.36 rows=1000 width=8) (actual time=2.981..4.014 rows=3625 loops=1)
Sort Key: x.start_time
Sort Method: quicksort Memory: 266kB
-> Subquery Scan on x (cost=0.00..15.03 rows=1000 width=8) (actual time=0.014..1.991 rows=3625 loops=1)
-> ProjectSet (cost=0.00..5.03 rows=1000 width=16) (actual time=0.013..1.048 rows=3625 loops=1)
-> CTE Scan on vals (cost=0.00..0.02 rows=1 width=32) (actual time=0.008..0.009 rows=1 loops=1)
-> Nested Loop (cost=0.56..2694021.34 rows=12865667 width=14) (actual time=7051.730..16015.873 rows=31978 loops=1)
-> CTE Scan on grid g (cost=0.00..20.00 rows=1000 width=16) (actual time=2.988..11.635 rows=3625 loops=1)
-> Index Scan using historical_ohlcv_pkey on historical_ohlcv ohlcv (cost=0.56..2565.34 rows=12866 width=22) (actual time=3.712..4.413 rows=9 loops=3625)
Index Cond: ((exchange_symbol = 'BINANCE'::text) AND (symbol_id = 'ETHBTC'::text) AND (time_open >= g.start_time))
Filter: (time_close < g.end_time)
Rows Removed by Filter: 15502
Planning time: 0.568 ms
Execution time: 16023.979 ms
My guess is this line is doing a lot:
LEFT JOIN historical_ohlcv ohlcv ON ohlcv.time_open >= g.start_time
AND ohlcv.time_close < g.end_time
But I'm not sure how to accomplish this in another way.
P.S. apologies if this belongs to dba.SE. I read the FAQ and this seemed too basic for that site, so I posted here.
Edits as requested:
SELECT avg(pg_column_size(t)) FROM historical_ohlcv t TABLESAMPLE SYSTEM (0.1); returns 107.632
For exchange_symbol, there are 3 unique values, for symbol_id there are ~400
PostgreSQL version: PostgreSQL 10.3 (Ubuntu 10.3-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609, 64-bit.
The table will be growing about ~1m records a day, so not exactly read-only. All this stuff is done locally, and I will try to move to RDS or similar to help manage hardware issues.
Related: if I wanted to add other aggregates, specifically 'first in the bucket', 'last in the bucket', min, sum, would my indexing strategy change?
Correctness first: I suspect a bug in your query:
LEFT JOIN historical_ohlcv ohlcv ON ohlcv.time_open >= g.start_time
AND ohlcv.time_close < g.end_time
Unlike my referenced answer, you join on a time interval: (time_open, time_close]. The way you do it excludes rows in the table where the interval crosses bucket borders. Only intervals fully contained in a single bucket count. I don't think that's intended?
A simple fix would be to decide bucket membership based on time_open (or time_close) alone. If you want to keep working with both, you have to define exactly how to deal with intervals overlapping with multiple buckets.
Also, you are looking for max(high) per bucket, which is different in nature from count(*) in my referenced answer.
And your buckets are simple intervals per hour?
Then we can radically simplify. Working with just time_open:
SELECT date_trunc('hour', time_open) AS hour, max(high) AS max_high
FROM historical_ohlcv
WHERE exchange_symbol = 'BINANCE'
AND symbol_id = 'ETHBTC'
AND time_open >= now() - interval '5 months' -- frame_start
AND time_open < now() -- frame_end
GROUP BY 1
ORDER BY 1;
Related:
Resample on time series data
It's hard to talk about further performance optimization while basics are unclear. And we'd need more information.
Are WHERE conditions variable?
How many distinct values in exchange_symbol and symbol_id?
Avg. row size? What do you get for:
SELECT avg(pg_column_size(t)) FROM historical_ohlcv t TABLESAMPLE SYSTEM (0.1);
Is the table read-only?
Assuming you always filter on exchange_symbol and symbol_id, that the values are variable, and that your table is read-only or autovacuum can keep up with the write load (so we can hope for index-only scans), you would best have a multicolumn index on (exchange_symbol, symbol_id, time_open, high DESC) to support this query. Index columns in this order (a DDL sketch follows the related link below). Related:
Multicolumn index and performance
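In DDL form, that index might look like this (the index name is a placeholder):
CREATE INDEX historical_ohlcv_sym_time_high_idx
ON historical_ohlcv (exchange_symbol, symbol_id, time_open, high DESC);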
Depending on data distribution and other details a LEFT JOIN LATERAL solution might be another option. Related:
How to find an average of values for time intervals in postgres
Optimize GROUP BY query to retrieve latest record per user
Aside from all that, your EXPLAIN plan exhibits some very bad estimates:
https://explain.depesz.com/s/E5yI
Are you using a current version of Postgres? You may have to work on your server configuration - or at least set higher statistics targets on relevant columns and more aggressive autovacuum settings for the big table. Related:
Keep PostgreSQL from sometimes choosing a bad query plan
Aggressive Autovacuum on PostgreSQL
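As an illustration of the statistics-target suggestion above, something along these lines (the target of 1000 is just an example; the default is 100):
ALTER TABLE historical_ohlcv ALTER COLUMN exchange_symbol SET STATISTICS 1000;
ALTER TABLE historical_ohlcv ALTER COLUMN symbol_id SET STATISTICS 1000;
ANALYZE historical_ohlcv;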

How to optimize a MAX SQL query with GROUP BY DATE

I'm trying to optimize a query from a table with 3M rows.
The columns are value, datetime and point_id.
SELECT DATE(datetime), MAX(value) FROM historical_points WHERE point_id=1 GROUP BY DATE(datetime);
This query takes 2 seconds.
I tried indexing the point_id=1 but the results were not much better.
Is it possible to index the MAX query or is there a better way to do it? Maybe with an INNER JOIN?
EDIT:
This is the EXPLAIN ANALYZE of a similar query that tackles the case better. This one also has a performance problem.
EXPLAIN ANALYZE SELECT DATE(datetime), MAX(value), MIN(value) FROM buildings_hispoint WHERE point_id=64 AND datetime BETWEEN '2017-09-01 00:00:00' AND '2017-10-01 00:00:00' GROUP BY DATE(datetime);
GroupAggregate (cost=84766.65..92710.99 rows=336803 width=68) (actual time=1461.060..2701.145 rows=21 loops=1)
Group Key: (date(datetime))
-> Sort (cost=84766.65..85700.23 rows=373430 width=14) (actual time=1408.445..1547.929 rows=523621 loops=1)
Sort Key: (date(datetime))
Sort Method: external sort Disk: 11944kB
-> Bitmap Heap Scan on buildings_hispoint (cost=10476.02..43820.81 rows=373430 width=14) (actual time=148.970..731.154 rows=523621 loops=1)
Recheck Cond: (point_id = 64)
Filter: ((datetime >= '2017-09-01 00:00:00+02'::timestamp with time zone) AND (datetime <= '2017-10-01 00:00:00+02'::timestamp with time zone))
Rows Removed by Filter: 35712
Heap Blocks: exact=14422
-> Bitmap Index Scan on buildings_measurementdatapoint_ffb10c68 (cost=0.00..10382.67 rows=561898 width=0) (actual time=125.150..125.150 rows=559333 loops=1)
Index Cond: (point_id = 64)
Planning time: 0.284 ms
Execution time: 2704.566 ms
Without seeing EXPLAIN output it is difficult to say much. My guess is that you must include the DATE() call in the index definition:
CREATE INDEX historical_points_idx ON historical_points (DATE(datetime), point_id);
Also, if point_id has more distinct values than DATE(datetime) then you must reverse column order:
CREATE INDEX historical_points_idx ON historical_points (point_id, DATE(datetime));
Keep in mind that column cardinality is very important to the planner; columns with high selectivity should go first.
SELECT DISTINCT ON (DATE(datetime)) DATE(datetime), value
FROM historical_points WHERE point_id=1
ORDER BY DATE(datetime) DESC, value DESC;
Put a computed (expression) index on DATE(datetime), value. [I hope those aren't your real column names. Using reserved words like VALUE as a column name is a recipe for confusion.]
The SELECT DISTINCT ON will work like a GROUP BY. The ORDER BY replaces the MAX, and will be fast if indexed.
I owe this technique to @ErwinBrandstetter.