I have a SQLite table with an Id and an active period, and I am trying to get counts of the number of active rows over a sequence of times.
A vastly simplified version of this table is:
CREATE TABLE Data (
EntityId INTEGER NOT NULL,
Start INTEGER NOT NULL,
Finish INTEGER
);
With some example data
INSERT INTO Data VALUES
(1, 0, 2),
(1, 4, 6),
(1, 8, NULL),
(2, 5, 7),
(2, 9, NULL),
(3, 8, NULL);
And a desired output of something like:
Time  Count
0     1
1     1
2     0
3     0
4     1
5     2
6     1
7     0
8     2
9     3
For which I am querying with:
WITH RECURSIVE Generate_Time(Time) AS (
SELECT 0
UNION ALL
SELECT Time + 1 FROM Generate_Time
WHERE Time + 1 <= (SELECT MAX(Start) FROM Data)
)
SELECT Time, COUNT(EntityId)
FROM Data
JOIN Generate_Time ON Start <= Time AND (Finish > Time OR Finish IS NULL)
GROUP BY Time
There is also some data I need to categorise the counts by (some of it is on the original table, some comes via a join), but I am hitting a performance bottleneck on the order of seconds even on small amounts of data (~25,000 rows) without any of that.
I have added an index on the table covering Start/Finish:
CREATE INDEX Ix_Data ON Data (
Start,
Finish
);
and that helped somewhat, but I can't help feeling there's a more elegant and performant way of doing this. Using the CTE to iterate over a range doesn't seem like it will scale very well, but I can't think of another way to calculate what I need.
I've been looking at the query plan too, and I think the slow part is the GROUP BY: it can't use an index there because the rows come from the CTE, so SQLite builds a temporary B-tree:
3 0 0 MATERIALIZE 3
7 3 0 SETUP
8 7 0 SCAN CONSTANT ROW
21 3 0 RECURSIVE STEP
22 21 0 SCAN TABLE Generate_Time
27 21 0 SCALAR SUBQUERY 2
32 27 0 SEARCH TABLE Data USING COVERING INDEX Ix_Data
57 0 0 SCAN SUBQUERY 3
59 0 0 SEARCH TABLE Data USING INDEX Ix_Data (Start<?)
71 0 0 USE TEMP B-TREE FOR GROUP BY
Any suggestions of a way to speed this query up, or even a better way of storing this data to craft a tighter query would be most welcome!
To get to the desired output as per your question, the following can be done.
For better performance, one option is to make use of generate_series to generate the rows instead of the recursive CTE, limiting the number of rows to the maximum Start value available in Data.
WITH RECURSIVE Generate_Time(Time) AS (
SELECT 0
UNION ALL
SELECT Time + 1 FROM Generate_Time
WHERE Time + 1 <= (SELECT MAX(Start) FROM Data)
)
SELECT gt.Time
      ,COUNT(d.EntityId)
FROM Generate_Time gt
LEFT JOIN Data d
  ON gt.Time >= d.Start
 AND (d.Finish IS NULL OR gt.Time < d.Finish)
GROUP BY gt.Time
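For reference, here is a sketch of the generate_series variant mentioned above, assuming SQLite was built with the series extension (which provides the generate_series table-valued function and its value column):
SELECT gt.value AS Time
      ,COUNT(d.EntityId) AS "Count"
FROM generate_series(0, (SELECT MAX(Start) FROM Data)) AS gt
LEFT JOIN Data d
  ON gt.value >= d.Start
 AND (d.Finish IS NULL OR gt.value < d.Finish)
GROUP BY gt.value;
This avoids materialising the recursive CTE, but the join still produces one row per (time, active row) pair before grouping, so it does not shrink the intermediate result.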
This ended up being simply a case of the result set being too large. In my real data, the result set before grouping was ~19,000,000 records. I was able to do some partitioning on my client side, splitting the queries into smaller discrete chunks, which improved performance roughly 10x. That still wasn't quite as fast as I wanted, but it was acceptable for my use case.
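As an illustration only (the window parameters :chunk_start and :chunk_end are hypothetical, not from the original post), the client-side partitioning amounts to running the same query once per time window, which keeps each intermediate result set small:
-- Run once per window [:chunk_start, :chunk_end); the client binds the boundaries
WITH RECURSIVE Generate_Time(Time) AS (
    SELECT :chunk_start
    UNION ALL
    SELECT Time + 1 FROM Generate_Time
    WHERE Time + 1 < :chunk_end
)
SELECT Time, COUNT(EntityId)
FROM Generate_Time
LEFT JOIN Data ON Start <= Time AND (Finish > Time OR Finish IS NULL)
GROUP BY Time;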
This is pretty strange to me: Teradata (TD) overestimates the number of rows. The actual count is 42 million, but the estimate is 943 million.
The query is pretty easy:
select ID, sum(amount)
from v_tb -- view
where REPORT_DATE between Date '2017-11-01' and Date '2017-11-30' -- report_date has date format
group by 1
Plan:
1) First, we lock tb in view v_tb for access.
2) Next, we do an all-AMPs SUM step to aggregate from 1230 partitions
of tb in view v_tb with a
condition of ("(tb.REPORT_DATE >= DATE '2017-11-01') AND
(tb.REPORT_DATE <= DATE '2017-11-30')")
, grouping by field1 ( ID). Aggregate
Intermediate Results are computed locally, then placed in Spool 1.
The input table will not be cached in memory, but it is eligible
for synchronized scanning. The size of Spool 1 is estimated with
low confidence to be 943,975,437 rows (27,375,287,673 bytes). The
estimated time for this step is 1 minute and 26 seconds.
3) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 1 minute and 26 seconds.
Statistics are collected (per DBC.StatsV) on ID, report_date, and (ID, report_date), and they are all up to date. There are no NULL values.
The UniqueValueCount for ID, report_date, and (ID, report_date) is 36 million, 839, and 1,232 million respectively, which seems correct.
Why did TD overestimate the number of rows? Shouldn't the final estimate be based on the UniqueValueCount of ID alone, since that is the column I group by?
UPD1:
-- estimates 32 mln rows
select ID, sum(amount)
from v_tb -- view
where REPORT_DATE between Date '2017-11-01' and Date '2017-11-01' -- report_date has date format
group by 1
-- estimates 89 mln rows
select ID, sum(amount)
from v_tb -- view
where REPORT_DATE between Date '2017-11-01' and Date '2017-11-02' -- report_date has date format
group by 1
So the problem is with the WHERE predicate.
SampleSizePct is equal to 5.01. Does that mean the sample size is only 5%? (Yes, it does.)
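If the sampled statistics are part of the problem, one option is to recollect full statistics on the columns involved. A sketch, assuming the underlying table is tb as shown in the plan (check the exact syntax against your Teradata release):
COLLECT STATISTICS USING NO SAMPLE
    COLUMN (ID),
    COLUMN (REPORT_DATE),
    COLUMN (ID, REPORT_DATE)
ON tb;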
UPD2: The previous query was part of a bigger query, which looks like this:
select top 100000000
base.*
, case when CPE_MODEL_NEW.device_type in ('Smartphone', 'Phone', 'Tablet', 'USB modem') then CPE_MODEL_NEW.device_type
else 'other' end as device_type
, usg_mbou
, usg_arpu_content
, date '2017-11-30' as max_report_date
, macroregion_name
from (
select
a.SUBS_ID
, a.tac
, MSISDN
, BRANCH_ID
, max(bsegment) bsegment
, max((date '2017-11-30' - cast (activation_dttm as date))/30.4167) as LT_month
, Sum(REVENUE_COMMERCE) REVENUE_COMMERCE
, max(LAST_FLASH_DTTM) LAST_FLASH_DTTM
from PRD2_BDS_V2.SUBS_CLR_D a
where a.REPORT_DATE between Date '2017-11-01' and Date '2017-11-30'
group by 1,2,3,4 --, 8, 9
) base
left join CPE_MODEL_NEW on base.tac = CPE_MODEL_NEW.tac
left join
(
select SUBS_ID, sum(case when TRAFFIC_TYPE_ID = 4 /*DATA*/ then all_vol / (1024 * 1024) else 0 end) usg_mbou
,sum(case when COST_BAND_ID IN (3,46,49,56) then rated_amount else 0 end) usg_arpu_content
from PRD2_BDS_V2.SUBS_USG_D where SUBS_USG_D.REPORT_DATE between Date '2017-11-01' and Date '2017-11-30'
group by 1
) SUBS_USG_D
on SUBS_USG_D.SUBS_ID = base.SUBS_ID
LEFT JOIN PRD2_DIC_V.BRANCH AS BRANCH ON base.BRANCH_ID = BRANCH.BRANCH_ID
LEFT JOIN PRD2_DIC_V2.REGION AS REGION ON BRANCH.REGION_ID = REGION.REGION_ID
AND Date '2017-11-30' >= REGION.SDATE AND REGION.EDATE >= Date '2017-11-01'
LEFT JOIN PRD2_DIC_V2.MACROREGION AS MACROREGION ON REGION.MACROREGION_ID = MACROREGION.MACROREGION_ID
AND Date '2017-11-30' >= MACROREGION.SDATE AND Date '2017-11-01' <= MACROREGION.EDATE
The query fails with a spool space problem at almost the last step:
We do an All-AMPs STAT FUNCTION step from Spool 10 by way of an all-rows scan into Spool 29, which is redistributed by hash code to all AMPs. The result rows are put into Spool 9, which is redistributed by hash code to all AMPs..
There is no product join and no unwanted duplication to all AMPs, which is what often leads to spool problems. However, there is another problem: very high skew:
Snapshot CPU skew: 99.7%
Snapshot I/O skew: 99.7%
Spool usage at this point is only 30 GB, but it easily exceeds 300 GB at the beginning of query execution.
The tables themselves aren't skewed.
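For reference, per-table skew is usually checked from dbc.TableSizeV along these lines (a sketch; the exact skew formula is a matter of convention):
SELECT DatabaseName
      ,TableName
      ,SUM(CurrentPerm) AS TotalPermBytes
      ,100 * (1 - AVG(CurrentPerm) / NULLIFZERO(MAX(CurrentPerm))) AS SkewFactorPct
FROM dbc.TableSizeV
WHERE TableName IN ('SUBS_CLR_D', 'SUBS_USG_D')
GROUP BY 1, 2;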
Full explain:
1) First, we lock TELE2_UAT.CPE_MODEL_NEW for access, we lock
PRD2_DIC.REGION in view PRD2_DIC_V2.REGION for access, we lock
PRD2_DIC.MACROREGION in view PRD2_DIC_V2.MACROREGION for access,
we lock PRD2_DIC.BRANCH in view PRD2_DIC_V.BRANCH for access, we
lock PRD2_BDS.SUBS_CLR_D for access, and we lock
PRD2_BDS.SUBS_USG_D for access.
2) Next, we do an all-AMPs SUM step to aggregate from 1230 partitions
of PRD2_BDS.SUBS_CLR_D with a condition of (
"(PRD2_BDS.SUBS_CLR_D.REPORT_DATE >= DATE '2017-11-01') AND
(PRD2_BDS.SUBS_CLR_D.REPORT_DATE <= DATE '2017-11-30')"), and the
grouping identifier in field 1. Aggregate Intermediate Results
are computed locally,skipping sort when applicable, then placed in
Spool 4. The input table will not be cached in memory, but it is
eligible for synchronized scanning. The size of Spool 4 is
estimated with low confidence to be 1,496,102,647 rows (
285,755,605,577 bytes). The estimated time for this step is 1
minute and 55 seconds.
3) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by
way of an all-rows scan into Spool 2 (used to materialize
view, derived table, table function or table operator base)
(all_amps) (compressed columns allowed), which is built
locally on the AMPs with Field1 ("UniqueId"). The size of
Spool 2 is estimated with low confidence to be 1,496,102,647
rows (140,633,648,818 bytes). Spool AsgnList:
"Field_1" = "UniqueId",
"Field_2" = "SUBS_ID",
"Field_3" = "TAC",
"Field_4" = "MSISDN",
"Field_5" = "BRANCH_ID",
"Field_6" = "Field_6",
"Field_7" = "Field_7",
"Field_8" = "Field_8",
"Field_9" = "Field_9".
The estimated time for this step is 57.85 seconds.
2) We do an all-AMPs SUM step to aggregate from 1230 partitions
of PRD2_BDS.SUBS_USG_D with a condition of ("(NOT
(PRD2_BDS.SUBS_USG_D.SUBS_ID IS NULL )) AND
((PRD2_BDS.SUBS_USG_D.REPORT_DATE >= DATE '2017-11-01') AND
(PRD2_BDS.SUBS_USG_D.REPORT_DATE <= DATE '2017-11-30'))"),
and the grouping identifier in field 1. Aggregate
Intermediate Results are computed locally,skipping sort when
applicable, then placed in Spool 7. The input table will not
be cached in memory, but it is eligible for synchronized
scanning. The size of Spool 7 is estimated with low
confidence to be 943,975,437 rows (42,478,894,665 bytes).
The estimated time for this step is 1 minute and 29 seconds.
4) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 7 (Last Use) by
way of an all-rows scan into Spool 1 (used to materialize
view, derived table, table function or table operator
SUBS_USG_D) (all_amps) (compressed columns allowed), which is
built locally on the AMPs with Field1 ("UniqueId"). The size
of Spool 1 is estimated with low confidence to be 943,975,437
rows (42,478,894,665 bytes). Spool AsgnList:
"Field_1" = "UniqueId",
"Field_2" = "SUBS_ID",
"Field_3" = "Field_3",
"Field_4" = "Field_4".
The estimated time for this step is 16.75 seconds.
2) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by
way of an all-rows scan into Spool 11 (all_amps) (compressed
columns allowed), which is redistributed by hash code to all
AMPs to all AMPs with hash fields ("Spool_2.SUBS_ID"). Then
we do a SORT to order Spool 11 by row hash. The size of
Spool 11 is estimated with low confidence to be 1,496,102,647
rows (128,664,827,642 bytes). Spool AsgnList:
"SUBS_ID" = "Spool_2.SUBS_ID",
"TAC" = "TAC",
"MSISDN" = "MSISDN",
"BRANCH_ID" = "BRANCH_ID",
"BSEGMENT" = "BSEGMENT",
"LT_MONTH" = "LT_MONTH",
"REVENUE_COMMERCE" = "REVENUE_COMMERCE",
"LAST_FLASH_DTTM" = "LAST_FLASH_DTTM".
The estimated time for this step is 4 minutes and 8 seconds.
5) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by
way of an all-rows scan into Spool 12 (all_amps) (compressed
columns allowed), which is redistributed by hash code to all
AMPs to all AMPs with hash fields ("Spool_1.SUBS_ID"). Then
we do a SORT to order Spool 12 by row hash. The size of
Spool 12 is estimated with low confidence to be 943,975,437
rows (34,927,091,169 bytes). Spool AsgnList:
"SUBS_ID" = "Spool_1.SUBS_ID",
"USG_MBOU" = "USG_MBOU",
"USG_ARPU_CONTENT" = "USG_ARPU_CONTENT".
The estimated time for this step is 1 minute and 5 seconds.
2) We do an all-AMPs RETRIEVE step from PRD2_DIC.BRANCH in view
PRD2_DIC_V.BRANCH by way of an all-rows scan with a condition
of ("NOT (PRD2_DIC.BRANCH in view PRD2_DIC_V.BRANCH.BRANCH_ID
IS NULL)") into Spool 13 (all_amps) (compressed columns
allowed), which is redistributed by hash code to all AMPs to
all AMPs with hash fields ("PRD2_DIC.BRANCH.REGION_ID").
Then we do a SORT to order Spool 13 by row hash. The size of
Spool 13 is estimated with high confidence to be 107 rows (
1,712 bytes). Spool AsgnList:
"BRANCH_ID" = "BRANCH_ID",
"REGION_ID" = "PRD2_DIC.BRANCH.REGION_ID".
The estimated time for this step is 0.02 seconds.
6) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step (No Sum) from PRD2_DIC.REGION in
view PRD2_DIC_V2.REGION by way of a RowHash match scan with a
condition of ("(PRD2_DIC.REGION in view
PRD2_DIC_V2.REGION.EDATE >= DATE '2017-11-01') AND
(PRD2_DIC.REGION in view PRD2_DIC_V2.REGION.SDATE <= DATE
'2017-11-30')"), which is joined to Spool 13 (Last Use) by
way of a RowHash match scan. PRD2_DIC.REGION and Spool 13
are right outer joined using a merge join, with condition(s)
used for non-matching on right table ("NOT
(Spool_13.REGION_ID IS NULL)"), with a join condition of (
"Spool_13.REGION_ID = PRD2_DIC.REGION.ID"). The result goes
into Spool 14 (all_amps) (compressed columns allowed), which
is redistributed by hash code to all AMPs to all AMPs with
hash fields ("PRD2_DIC.REGION.MACROREGION_CODE"). Then we do
a SORT to order Spool 14 by row hash. The size of Spool 14
is estimated with low confidence to be 107 rows (2,461 bytes).
Spool AsgnList:
"MACROREGION_CODE" = "PRD2_DIC.REGION.MACROREGION_CODE",
"BRANCH_ID" = "{RightTable}.BRANCH_ID".
The estimated time for this step is 0.03 seconds.
2) We do an all-AMPs RETRIEVE step from TELE2_UAT.CPE_MODEL_NEW
by way of an all-rows scan with no residual conditions into
Spool 17 (all_amps) (compressed columns allowed), which is
duplicated on all AMPs with hash fields (
"TELE2_UAT.CPE_MODEL_NEW.TAC"). Then we do a SORT to order
Spool 17 by row hash. The size of Spool 17 is estimated with
high confidence to be 49,024,320 rows (2,696,337,600 bytes).
Spool AsgnList:
"TAC" = "TELE2_UAT.CPE_MODEL_NEW.TAC",
"DEVICE_TYPE" = "DEVICE_TYPE".
The estimated time for this step is 2.81 seconds.
3) We do an all-AMPs JOIN step (No Sum) from Spool 11 (Last Use)
by way of a RowHash match scan, which is joined to Spool 12
(Last Use) by way of a RowHash match scan. Spool 11 and
Spool 12 are left outer joined using a merge join, with
condition(s) used for non-matching on left table ("NOT
(Spool_11.SUBS_ID IS NULL)"), with a join condition of (
"Spool_12.SUBS_ID = Spool_11.SUBS_ID"). The result goes into
Spool 18 (all_amps) (compressed columns allowed), which is
built locally on the AMPs with hash fields ("Spool_11.TAC").
Then we do a SORT to order Spool 18 by row hash. The size of
Spool 18 is estimated with low confidence to be 1,496,102,648
rows (152,602,470,096 bytes). Spool AsgnList:
"BRANCH_ID" = "{LeftTable}.BRANCH_ID",
"TAC" = "Spool_11.TAC",
"SUBS_ID" = "{LeftTable}.SUBS_ID",
"MSISDN" = "{LeftTable}.MSISDN",
"BSEGMENT" = "{LeftTable}.BSEGMENT",
"LT_MONTH" = "{LeftTable}.LT_MONTH",
"REVENUE_COMMERCE" = "{LeftTable}.REVENUE_COMMERCE",
"LAST_FLASH_DTTM" = "{LeftTable}.LAST_FLASH_DTTM",
"USG_MBOU" = "{RightTable}.USG_MBOU",
"USG_ARPU_CONTENT" = "{RightTable}.USG_ARPU_CONTENT".
The estimated time for this step is 3 minutes and 45 seconds.
7) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step (No Sum) from
PRD2_DIC.MACROREGION in view PRD2_DIC_V2.MACROREGION by way
of a RowHash match scan with a condition of (
"(PRD2_DIC.MACROREGION in view PRD2_DIC_V2.MACROREGION.EDATE
>= DATE '2017-11-01') AND (PRD2_DIC.MACROREGION in view
PRD2_DIC_V2.MACROREGION.SDATE <= DATE '2017-11-30')"), which
is joined to Spool 14 (Last Use) by way of a RowHash match
scan. PRD2_DIC.MACROREGION and Spool 14 are right outer
joined using a merge join, with condition(s) used for
non-matching on right table ("NOT (Spool_14.MACROREGION_CODE
IS NULL)"), with a join condition of (
"Spool_14.MACROREGION_CODE = PRD2_DIC.MACROREGION.MR_CODE").
The result goes into Spool 19 (all_amps) (compressed columns
allowed), which is duplicated on all AMPs with hash fields (
"Spool_14.BRANCH_ID"). The size of Spool 19 is estimated
with low confidence to be 34,240 rows (1,712,000 bytes).
Spool AsgnList:
"BRANCH_ID" = "Spool_14.BRANCH_ID",
"MR_NAME" = "{LeftTable}.MR_NAME".
The estimated time for this step is 0.04 seconds.
2) We do an all-AMPs JOIN step (No Sum) from Spool 17 (Last Use)
by way of a RowHash match scan, which is joined to Spool 18
(Last Use) by way of a RowHash match scan. Spool 17 and
Spool 18 are right outer joined using a merge join, with
condition(s) used for non-matching on right table ("NOT
(Spool_18.TAC IS NULL)"), with a join condition of (
"Spool_18.TAC = Spool_17.TAC"). The result goes into Spool
22 (all_amps) (compressed columns allowed), which is built
locally on the AMPs with hash fields ("Spool_18.BRANCH_ID").
The size of Spool 22 is estimated with low confidence to be
1,496,102,648 rows (204,966,062,776 bytes). Spool AsgnList:
"BRANCH_ID" = "Spool_18.BRANCH_ID",
"SUBS_ID" = "{RightTable}.SUBS_ID",
"TAC" = "{RightTable}.TAC",
"MSISDN" = "{RightTable}.MSISDN",
"BSEGMENT" = "{RightTable}.BSEGMENT",
"LT_MONTH" = "{RightTable}.LT_MONTH",
"REVENUE_COMMERCE" = "{RightTable}.REVENUE_COMMERCE",
"LAST_FLASH_DTTM" = "{RightTable}.LAST_FLASH_DTTM",
"DEVICE_TYPE" = "{LeftTable}.DEVICE_TYPE",
"USG_MBOU" = "{RightTable}.USG_MBOU",
"USG_ARPU_CONTENT" = "{RightTable}.USG_ARPU_CONTENT".
The estimated time for this step is 1 minute and 23 seconds.
8) We do an all-AMPs JOIN step (No Sum) from Spool 19 (Last Use) by
way of an all-rows scan, which is joined to Spool 22 (Last Use) by
way of an all-rows scan. Spool 19 is used as the hash table and
Spool 22 is used as the probe table in a right outer joined using
a single partition classical hash join, with condition(s) used for
non-matching on right table ("NOT (Spool_22.BRANCH_ID IS NULL)"),
with a join condition of ("Spool_22.BRANCH_ID = Spool_19.BRANCH_ID").
The result goes into Spool 10 (all_amps) (compressed columns
allowed), which is built locally on the AMPs with Field1 ("28364").
The size of Spool 10 is estimated with low confidence to be
1,496,102,648 rows (260,321,860,752 bytes). Spool AsgnList:
"Field_1" = "28364",
"Spool_10.SUBS_ID" = "{ Copy }{RightTable}.SUBS_ID",
"Spool_10.TAC" = "{ Copy }{RightTable}.TAC",
"Spool_10.MSISDN" = "{ Copy }{RightTable}.MSISDN",
"Spool_10.BRANCH_ID" = "{ Copy }{RightTable}.BRANCH_ID",
"Spool_10.BSEGMENT" = "{ Copy }{RightTable}.BSEGMENT",
"Spool_10.LT_MONTH" = "{ Copy }{RightTable}.LT_MONTH",
"Spool_10.REVENUE_COMMERCE" = "{ Copy
}{RightTable}.REVENUE_COMMERCE",
"Spool_10.LAST_FLASH_DTTM" = "{ Copy }{RightTable}.LAST_FLASH_DTTM",
"Spool_10.DEVICE_TYPE" = "{ Copy }{RightTable}.DEVICE_TYPE",
"Spool_10.USG_MBOU" = "{ Copy }{RightTable}.USG_MBOU",
"Spool_10.USG_ARPU_CONTENT" = "{ Copy
}{RightTable}.USG_ARPU_CONTENT",
"Spool_10.MR_NAME" = "{ Copy }{LeftTable}.MR_NAME".
The estimated time for this step is 1 minute and 45 seconds.
9) We do an all-AMPs STAT FUNCTION step from Spool 10 by way of an
all-rows scan into Spool 29, which is redistributed by hash code
to all AMPs. The result rows are put into Spool 9 (group_amps),
which is built locally on the AMPs with Field1 ("Field_1"). This
step is used to retrieve the TOP 100000000 rows. Load
distribution optimization is used. If this step retrieves less
than 100000000 rows, then execute step 10. The size is estimated
with low confidence to be 100,000,000 rows (25,000,000,000 bytes).
10) We do an all-AMPs STAT FUNCTION step from Spool 10 (Last Use) by
way of an all-rows scan into Spool 29 (Last Use), which is
redistributed by hash code to all AMPs. The result rows are put
into Spool 9 (group_amps), which is built locally on the AMPs with
Field1 ("Field_1"). This step is used to retrieve the TOP
100000000 rows. The size is estimated with low confidence to be
100,000,000 rows (25,000,000,000 bytes).
11) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 9 are sent back to the user as the result of
statement 1.
What can I do here?
Most databases show wrong estimates, and this is OK as long as the relationship between those estimates is good enough to produce a decent execution plan.
Now, if you think the execution plan is wrong, then you should seriously care about those estimates. Did you update the table statistics recently?
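On Teradata, a low-effort way to check this is to ask the optimizer which statistics it would like to have and compare that list with what is actually collected. A sketch, using the query from the question:
DIAGNOSTIC HELPSTATS ON FOR SESSION;
EXPLAIN
SELECT ID, SUM(amount)
FROM v_tb
WHERE REPORT_DATE BETWEEN DATE '2017-11-01' AND DATE '2017-11-30'
GROUP BY 1;
-- With the diagnostic enabled, the EXPLAIN output ends with a list of recommended statistics.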
Otherwise, I wouldn't worry too much about it.
With example tables:
create table user_login (
user_id integer not null,
login_time numeric not null, -- seconds since epoch or similar
unique (user_id, login_time)
);
create table user_page_visited (
page_id integer not null,
page_visited_at numeric not null -- seconds since epoch or similar
);
with example data:
> user_login
  user_id  login_time
1       1         100
2       1         140
> user_page_visited
  page_id  page_visited_at
1       1              100
2       1              200
3       2              120
4       2              130
5       3              160
6       3              150
I wish to return all rows of user_page_visited that fall into a range based on user_login.login_time; for example, return all pages accessed within 20 seconds of an existing login_time:
> user_page_visited
  page_id  page_visited_at
1       1              100
3       2              120
5       3              160
6       3              150
How would I do this efficiently when both tables have lots of rows? For example, the following query does something similar (it returns duplicate rows when ranges overlap), but seems to be very slow:
select * from
user_login l cross join
user_page_visited v
where v.page_visited_at >= l.login_time
and v.page_visited_at <= l.login_time + 20;
First, use regular join syntax:
select *
from user_login l join
user_page_visited v
on v.page_visited_at >= l.login_time and
v.page_visited_at <= l.login_time + 20;
Next, be sure that you have indexes on the columns used for the join: user_login(login_time) and user_page_visited(page_visited_at).
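For example (the index names here are only illustrative):
create index user_login_login_time_idx on user_login (login_time);
create index user_page_visited_visited_at_idx on user_page_visited (page_visited_at);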
If these don't work, then you still have a couple of options. If the "20" is fixed, you can vary the type of index. There are also tricks if you are only looking for one match between, say, the login and the page visited.
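One way to read that last hint, as a sketch: if each page row is only needed once (which is what the desired output above shows), an EXISTS semi-join avoids the duplicates produced by overlapping login ranges:
select v.*
from user_page_visited v
where exists (
    select 1
    from user_login l
    where v.page_visited_at >= l.login_time
      and v.page_visited_at <= l.login_time + 20
);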
This solution is based on the comments on the answer from Gordon Linoff.
First, we retrieve the tuples that were accessed in the same time slice as a user connection, or in the following time slice, using the following query:
SELECT DISTINCT page_id, page_visited_at
FROM user_login
INNER JOIN user_page_visited ON login_time::INT / 20 = page_visited_at::INT / 20 OR login_time::INT / 20 = page_visited_at::INT / 20 - 1;
We now need indexes in order to get a good query plan:
CREATE INDEX i_user_login_login_time_20 ON user_login ((login_time::INT / 20));
CREATE INDEX i_user_page_visited_page_visited_at_20 ON user_page_visited ((page_visited_at::INT / 20));
CREATE INDEX i_user_page_visited_page_visited_at_20_minus_1 ON user_page_visited ((page_visited_at::INT / 20 - 1));
If you EXPLAIN the query with these indexes, you get a BitmapOr over two Bitmap Index Scan operations, with some low constant cost. On the other hand, without these indexes you get a sequential scan with a much higher cost (I tested with tables of ~100k tuples each).
However, this query returns too many rows. We need to filter it again to get the final result:
SELECT DISTINCT page_id, page_visited_at
FROM user_login
INNER JOIN user_page_visited ON login_time::INT / 20 = page_visited_at::INT / 20 OR login_time::INT / 20 = page_visited_at::INT / 20 - 1
WHERE page_visited_at BETWEEN login_time AND login_time + 20;
Using EXPLAIN on this query shows that PostgreSQL still uses the Bitmap Index Scans.
With ~100k rows in user_login and ~200k rows in user_page_visited the query needs ~1.4s to retrieve ~200k rows versus 3.5s without the slice prefilter.
(uname -a: Linux shepwork 4.4.26-gentoo #8 SMP Mon Nov 21 09:45:10 CET 2016 x86_64 AMD FX(tm)-6300 Six-Core Processor AuthenticAMD GNU/Linux)
Original query
delete B from
    TABLE_BASE B,
    TABLE_INC I
where B.ID = I.ID and B.NUM = I.NUM;
Performance stats for the above query
+-------------------+---------+-----------+
| Response Time | SumCPU | ImpactCPU |
+-------------------+---------+-----------+
| 00:05:29.190000 | 2852 | 319672 |
+-------------------+---------+-----------+
Optimized Query 1
DEL FROM TABLE_BASE WHERE (ID, NUM) IN
(SELECT ID, NUM FROM TABLE_INC);
Stats for the above query
+-----------------+--------+-----------+
| QryRespTime | SumCPU | ImpactCPU |
+-----------------+--------+-----------+
| 00:00:00.570000 | 15.42 | 49.92 |
+-----------------+--------+-----------+
Optimized Query 2
DELETE FROM TABLE_BASE B WHERE EXISTS
(SELECT * FROM TABLE_INC I WHERE B.ID = I.ID AND B.NUM = I.NUM);
Stats for the above query
+-----------------+--------+-----------+
| QryRespTime | SumCPU | ImpactCPU |
+-----------------+--------+-----------+
| 00:00:00.400000 | 11.96 | 44.93 |
+-----------------+--------+-----------+
My questions:
How/why do Optimized Queries 1 and 2 improve the performance so significantly?
What is the best practice for such DELETE queries?
Should I choose Query 1 or Query 2? Which one is ideal/better/more reliable? I feel Query 1 would be ideal because instead of SELECT * it uses SELECT ID, NUM, reducing the projection to only two columns, but Query 2 is showing better results.
QUERY 1 EXPLAIN PLAN
This query is optimized using type 2 profile T2_Linux64, profileid 21.
1) First, we lock TEMP_DB.TABLE_BASE for write on a
reserved RowHash to prevent global deadlock.
2) Next, we lock TEMP_DB_T.TABLE_INC for access, and we
lock TEMP_DB.TABLE_BASE for write.
3) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from
TEMP_DB.TABLE_BASE by way of an all-rows scan
with no residual conditions into Spool 2 (all_amps), which is
redistributed by the hash code of (
TEMP_DB.TABLE_BASE.NUM,
TEMP_DB.TABLE_BASE.ID) to all AMPs. Then
we do a SORT to order Spool 2 by row hash. The size of Spool
2 is estimated with low confidence to be 168,480 rows (
5,054,400 bytes). The estimated time for this step is 0.03
seconds.
2) We do an all-AMPs RETRIEVE step from
TEMP_DB_T.TABLE_INC by way of an all-rows scan
with no residual conditions into Spool 3 (all_amps), which is
redistributed by the hash code of (
TEMP_DB_T.TABLE_INC.NUM,
TEMP_DB_T.TABLE_INC.ID) to all AMPs. Then
we do a SORT to order Spool 3 by row hash and the sort key in
spool field1 eliminating duplicate rows. The size of Spool 3
is estimated with high confidence to be 5,640 rows (310,200
bytes). The estimated time for this step is 0.03 seconds.
4) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an
all-rows scan, which is joined to Spool 3 (Last Use) by way of an
all-rows scan. Spool 2 and Spool 3 are joined using an inclusion
merge join, with a join condition of ("(ID = ID) AND
(NUM = NUM)"). The result goes into Spool 1 (all_amps),
which is redistributed by the hash code of (
TEMP_DB.TABLE_BASE.ROWID) to all AMPs. Then we do
a SORT to order Spool 1 by row hash and the sort key in spool
field1 eliminating duplicate rows. The size of Spool 1 is
estimated with no confidence to be 168,480 rows (3,032,640 bytes).
The estimated time for this step is 1.32 seconds.
5) We do an all-AMPs MERGE DELETE to
TEMP_DB.TABLE_BASE from Spool 1 (Last Use) via the
row id. The size is estimated with no confidence to be 168,480
rows. The estimated time for this step is 42.95 seconds.
6) We spoil the parser's dictionary cache for the table.
7) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.
QUERY 2 EXPLAIN PLAN
This query is optimized using type 2 profile T2_Linux64, profileid 21.
1) First, we lock TEMP_DB.TABLE_BASE for write on a reserved RowHash to
prevent global deadlock.
2) Next, we lock TEMP_DB_T.TABLE_INC for access, and we
lock TEMP_DB.TABLE_BASE for write.
3) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from TEMP_DB.TABLE_BASE by way of
an all-rows scan with no residual conditions into Spool 2
(all_amps), which is redistributed by the hash code of (
TEMP_DB.TABLE_BASE.NUM, TEMP_DB.TABLE_BASE.ID) to all AMPs.
Then we do a SORT to order Spool 2 by row hash. The size of
Spool 2 is estimated with low confidence to be 168,480 rows (
5,054,400 bytes). The estimated time for this step is 0.03
seconds.
2) We do an all-AMPs RETRIEVE step from
TEMP_DB_T.TABLE_INC by way of an all-rows scan
with no residual conditions into Spool 3 (all_amps), which is
redistributed by the hash code of (
TEMP_DB_T.TABLE_INC.NUM,
TEMP_DB_T.TABLE_INC.ID) to all AMPs. Then
we do a SORT to order Spool 3 by row hash and the sort key in
spool field1 eliminating duplicate rows. The size of Spool 3
is estimated with high confidence to be 5,640 rows (310,200
bytes). The estimated time for this step is 0.03 seconds.
4) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an
all-rows scan, which is joined to Spool 3 (Last Use) by way of an
all-rows scan. Spool 2 and Spool 3 are joined using an inclusion
merge join, with a join condition of ("(NUM = NUM) AND
(ID = ID)"). The result goes into Spool 1 (all_amps), which
is redistributed by the hash code of (TEMP_DB.TABLE_BASE.ROWID) to all
AMPs. Then we do a SORT to order Spool 1 by row hash and the sort
key in spool field1 eliminating duplicate rows. The size of Spool
1 is estimated with no confidence to be 168,480 rows (3,032,640
bytes). The estimated time for this step is 1.32 seconds.
5) We do an all-AMPs MERGE DELETE to TEMP_DB.TABLE_BASE from Spool 1 (Last
Use) via the row id. The size is estimated with no confidence to
be 168,480 rows. The estimated time for this step is 42.95
seconds.
6) We spoil the parser's dictionary cache for the table.
7) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.
For TABLE_BASE
+----------------+----------+
| table_bytes | skewness |
+----------------+----------+
| 16842085888.00 | 22.78 |
+----------------+----------+
For TABLE_INC
+-------------+----------+
| table_bytes | skewness |
+-------------+----------+
| 5317120.00 | 44.52 |
+-------------+----------+
What's the relation between TABLE_BASE and TABLE_INC?
If it's one-to-many, the original join delete probably creates a huge spool first, while Optimized Queries 1 and 2 might apply DISTINCT before the join.
Regarding IN vs. EXISTS, there should be hardly any difference; did you check dbc.QryLogStepsV?
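As a sketch of that check, assuming DBQL step logging is enabled (column names may vary slightly by release):
SELECT s.QueryID
      ,s.StepLev1Num
      ,s.StepName
      ,s.CPUtime
      ,s.IOcount
FROM dbc.QryLogStepsV s
JOIN dbc.QryLogV q
  ON q.QueryID = s.QueryID
WHERE q.QueryText LIKE '%TABLE_BASE%'  -- rough filter; narrow it to the two DELETEs
ORDER BY s.QueryID, s.StepLev1Num;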
Edit:
If (ID, NUM) is the PI of the target table, rewriting it as a MERGE DELETE should provide the best performance:
MERGE INTO TABLE_BASE AS tgt
USING TABLE_INC AS src
   ON src.ID = tgt.ID
  AND src.NUM = tgt.NUM
WHEN MATCHED THEN
DELETE;
Note that I've modified table/field names etc. for readability. Some of the original names are quite confusing.
I have three different tables:
Retailer (Id+Code is a unique key)
- Id
- Code
- LastReturnDate
- ...
Delivery/DeliveryHistory (combination of Date+RetailerId is unique)
- Date
- RetailerId
- HasReturns
- ...
Delivery and DeliveryHistory are almost identical. Data is periodically moved to the history table, and there's no surefire way to know when this last happened. In general, the Delivery table is quite small -- usually less than 100,000 rows -- while the history table will typically have millions of rows.
My task is to update the LastReturnDate field for each retailer based on the current highest date value for which HasReturns is true in Delivery or DeliveryHistory.
Previously this has been solved with a view defined as follows:
SELECT Id, Code, MAX(Date) Date
FROM Delivery
WHERE HasReturns = 1
GROUP BY Id, Code
UNION
SELECT Id, Code, MAX(Date) Date
FROM DeliveryHistory
WHERE HasReturns = 1
GROUP BY Id, Code
And the following UPDATE statement:
UPDATE Retailer SET LastReturnDate = (
SELECT MAX(Date) FROM DeliveryView
WHERE Retailer.Id = DeliveryView.Id AND Retailer.Code = DeliveryView.Code)
WHERE Code = :Code AND EXISTS (
SELECT * FROM DeliveryView
WHERE Retailer.Id = DeliveryView.Id AND Retailer.Code = DeliveryView.Code
HAVING
MAX(Date) > LastReturnDate OR
(LastReturnDate IS NULL AND MAX(Date) IS NOT NULL))
The EXISTS-clause guards against updating fields where the current value is greater than the new one, but this is actually not a significant concern, because it's hard to see how that could ever happen during normal program execution. Note also how the AND Max(Date) IS NOT NULL part is in fact superfluous, since it's impossible for Date to be null in DeliveryView. But the EXISTS-clause appears to actually improve performance slightly.
However, the performance of the UPDATE has recently been horrendous. In a database where the Retailer table contains only 1000-2000 relevant entries, the UPDATE has been taking more than five minutes to run. Note that it does this even if I remove the entire EXISTS clause, i.e. with this very simple statement:
UPDATE Retailer SET LastReturnDate = (
SELECT MAX(Date) FROM DeliveryView
WHERE Retailer.Id = DeliveryView.Id AND Retailer.Code = DeliveryView.Code)
WHERE Code = :Code
I've therefore been looking into a better solution. My first idea was to create a temporary table, but after a while I tried to write it as a MERGE statement:
MERGE INTO Retailer
USING (SELECT Id, Code, MAX(Date) Date FROM DeliveryView GROUP BY Id, Code) DeliveryView
ON (Retailer.Id = DeliveryView.Id AND Retailer.Code = DeliveryView.Code)
WHEN MATCHED THEN
UPDATE SET LastReturnDate = Date WHERE Code = :Code
This seems to work, and it's more than an order of magnitude faster than the UPDATE.
I have three questions:
Can I be certain that this will have the same effect as the UPDATE in all cases (disregarding the edge case of LastReturnDate already being larger than MAX(Date))?
Why is it so much faster?
Is there some better solution?
Query plans
MERGE plan
Cost: 25,831, Bytes: 1,143,828
Plain language
Every row in the table SCHEMA.Delivery is read.
The rows were sorted in order to be grouped.
Every row in the table SCHEMA.DeliveryHistory is read.
The rows were sorted in order to be grouped.
Return all rows from steps 2, 4 - including duplicate rows.
The rows from step 5 were sorted to eliminate duplicate rows.
A view definition was processed, either from a stored view SCHEMA.DeliveryView or as defined by steps 6.
The rows were sorted in order to be grouped.
A view definition was processed, either from a stored view SCHEMA. or as defined by steps 8.
Every row in the table SCHEMA.Retailer is read.
The result sets from steps 9, 10 were joined (hash).
A view definition was processed, either from a stored view SCHEMA. or as defined by steps 11.
Rows were merged.
Rows were remotely merged.
Technical
Plan Cardinality Distribution
14 MERGE STATEMENT REMOTE ALL_ROWS
Cost: 25 831 Bytes: 1 143 828 3 738
13 MERGE SCHEMA.Retailer ORCL
12 VIEW SCHEMA.
11 HASH JOIN
Cost: 25 831 Bytes: 1 192 422 3 738
9 VIEW SCHEMA.
Cost: 25 803 Bytes: 194 350 7 475
8 SORT GROUP BY
Cost: 25 803 Bytes: 194 350 7 475
7 VIEW VIEW SCHEMA.DeliveryView ORCL
Cost: 25 802 Bytes: 194 350 7 475
6 SORT UNIQUE
Cost: 25 802 Bytes: 134 550 7 475
5 UNION-ALL
2 SORT GROUP BY
Cost: 97 Bytes: 25 362 1 409
1 TABLE ACCESS FULL TABLE SCHEMA.Delivery [Analyzed] ORCL
Cost: 94 Bytes: 210 654 11 703
4 SORT GROUP BY
Cost: 25 705 Bytes: 109 188 6 066
3 TABLE ACCESS FULL TABLE SCHEMA.DeliveryHistory [Analyzed] ORCL
Cost: 16 827 Bytes: 39 333 636 2 185 202
10 TABLE ACCESS FULL TABLE SCHEMA.Retailer [Analyzed] ORCL
Cost: 27 Bytes: 653 390 2 230
UPDATE plan
Cost: 101,492, Bytes: 272,060
Plain language
Every row in the table SCHEMA.Retailer is read.
One or more rows were retrieved using index SCHEMA.DeliveryHasReturns . The index was scanned in ascending order.
Rows from table SCHEMA.Delivery were accessed using rowid got from an index.
The rows were sorted in order to be grouped.
One or more rows were retrieved using index SCHEMA.DeliveryHistoryHasReturns . The index was scanned in ascending order.
Rows from table SCHEMA.DeliveryHistory were accessed using rowid got from an index.
The rows were sorted in order to be grouped.
Return all rows from steps 4, 7 - including duplicate rows.
The rows from step 8 were sorted to eliminate duplicate rows.
A view definition was processed, either from a stored view SCHEMA.DeliveryView or as defined by steps 9.
The rows were sorted in order to be grouped.
A view definition was processed, either from a stored view SCHEMA. or as defined by steps 11.
Rows were updated.
Rows were remotely updated.
Technical
Plan Cardinality Distribution
14 UPDATE STATEMENT REMOTE ALL_ROWS
Cost: 101 492 Bytes: 272 060 1 115
13 UPDATE SCHEMA.Retailer ORCL
1 TABLE ACCESS FULL TABLE SCHEMA.Retailer [Analyzed] ORCL
Cost: 27 Bytes: 272 060 1 115
12 VIEW SCHEMA.
Cost: 90 Bytes: 52 2
11 SORT GROUP BY
Cost: 90 Bytes: 52 2
10 VIEW VIEW SCHEMA.DeliveryView ORCL
Cost: 90 Bytes: 52 2
9 SORT UNIQUE
Cost: 90 Bytes: 36 2
8 UNION-ALL
4 SORT GROUP BY
Cost: 15 Bytes: 18 1
3 TABLE ACCESS BY INDEX ROWID TABLE SCHEMA.Delivery [Analyzed] ORCL
Cost: 14 Bytes: 108 6
2 INDEX RANGE SCAN INDEX SCHEMA.DeliveryHasReturns [Analyzed] ORCL
Cost: 2 12
7 SORT GROUP BY
Cost: 75 Bytes: 18 1
6 TABLE ACCESS BY INDEX ROWID TABLE SCHEMA.DeliveryHistory [Analyzed] ORCL
Cost: 74 Bytes: 4 590 255
5 INDEX RANGE SCAN INDEX SCHEMA.DeliveryHistoryHasReturns [Analyzed] ORCL
Cost: 6 509