Hi, I have an issue handling some data in SQL and returning values by the nearest date. I have two tables:
Table 1
ID   Content  Date
--------------------------------------------
123  X        2013-11-18
123  ZE       2013-11-29
233  YX       2013-12-30
233  XX       2013-12-28
444  Z        2014-02-24
Table 2
ID   Value  Validation Date
--------------------------------------------
123  0.54   2013-11-11
123  0.42   2013-11-18
123  0.32   2013-11-27
233  1.2    2013-12-04
233  1.1    2013-12-28
233  1.0    2013-12-29
444  4      2014-02-11
444  3      2014-02-15
444  2      2014-02-23
The output that I want is something like:
ID   Content  Date        Value  Validation Date
------------------------------------------------------------------------
123  X        2013-11-18  0.42   2013-11-18
123  ZE       2013-11-29  0.32   2013-11-27
233  YX       2013-12-30  1.0    2013-12-29
233  XX       2013-12-28  1.1    2013-12-28
444  Z        2014-02-24  2      2014-02-23
So I would like to return the value whose validation date is nearest to the date (the validation date must never be later than the date). Can you please help me? The ID is not unique in either table.
You can use the following query:
SELECT ID, Content, [Date], Value, [Validation Date]
FROM (
    SELECT t1.ID, Content, [Date], Value, [Validation Date],
           ROW_NUMBER() OVER (PARTITION BY t1.ID, Content
                              ORDER BY DATEDIFF(d, [Validation Date], [Date])) AS rn
    FROM Table1 AS t1
    INNER JOIN Table2 AS t2 ON t1.ID = t2.ID AND [Validation Date] <= [Date]
) t
WHERE t.rn = 1
ROW_NUMBER() is used to track the record with the smallest [Date] -[Validation Date] difference per (ID, Content) pair of values.
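To try the idea without a SQL Server instance, here is a minimal sketch that runs the same ranking query against an in-memory SQLite database from Python. A few things are assumptions and not part of the answer: the column is renamed to validation_date, dates are stored as ISO-8601 text, and because SQLite has no DATEDIFF the rows are ranked by validation_date DESC, which picks the same row once later validation dates are filtered out by the join. It needs SQLite 3.25 or newer for window functions.

```python
import sqlite3

# In-memory database holding the sample data from the question.
# Assumption: validation_date stands in for [Validation Date].
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, content TEXT, date TEXT);
    CREATE TABLE table2 (id INTEGER, value REAL, validation_date TEXT);
    INSERT INTO table1 VALUES
        (123, 'X',  '2013-11-18'), (123, 'ZE', '2013-11-29'),
        (233, 'YX', '2013-12-30'), (233, 'XX', '2013-12-28'),
        (444, 'Z',  '2014-02-24');
    INSERT INTO table2 VALUES
        (123, 0.54, '2013-11-11'), (123, 0.42, '2013-11-18'),
        (123, 0.32, '2013-11-27'), (233, 1.2,  '2013-12-04'),
        (233, 1.1,  '2013-12-28'), (233, 1.0,  '2013-12-29'),
        (444, 4,    '2014-02-11'), (444, 3,    '2014-02-15'),
        (444, 2,    '2014-02-23');
""")

# Rank the candidate table2 rows for every table1 row; the latest
# validation date that is not after the date gets rn = 1.
query = """
    SELECT id, content, date, value, validation_date
    FROM (
        SELECT t1.id, t1.content, t1.date, t2.value, t2.validation_date,
               ROW_NUMBER() OVER (
                   PARTITION BY t1.id, t1.content, t1.date
                   ORDER BY t2.validation_date DESC
               ) AS rn
        FROM table1 AS t1
        JOIN table2 AS t2
          ON t1.id = t2.id AND t2.validation_date <= t1.date
    ) AS t
    WHERE rn = 1
    ORDER BY id, date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Sorting ISO-8601 date strings as text gives chronological order, which is why the sketch stores the dates that way.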
Try this:
SELECT a.id,
       a.content,
       a.date,
       b.value,
       b.validationdate
FROM (select tt.id,
             tt.content,
             tt.date,
             row_number() over(partition by tt.id order by tt.date desc) rn
      from table1 tt) a
JOIN (select t.id,
             t.value,
             t.validationdate,
             row_number() over(partition by t.id order by t.validationdate desc) rn
      from table2 t) b
  on a.id = b.id and a.rn = b.rn
Note that this pairs rows by rank within each ID and never compares the dates, so it only gives the expected output when the two tables line up rank for rank, as they do in the sample data.
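Another way to do this is a correlated subquery. A scalar subquery can return only one column, so each output column needs its own subquery. Something like this:
SELECT a.id, a.content, a.date,
       (SELECT TOP 1 b.value
        FROM table2 b
        WHERE b.id = a.id AND b.validate <= a.date
        ORDER BY b.validate DESC) AS value,
       (SELECT TOP 1 b.validate
        FROM table2 b
        WHERE b.id = a.id AND b.validate <= a.date
        ORDER BY b.validate DESC) AS validate
FROM table1 a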
I think the best approach is to use outer apply:
select t1.id, t1.content, t1.date, t2.value, t2.validdate
from table1 t1 outer apply
     (SELECT TOP 1 t2.value, t2.validdate
      FROM table2 t2
      WHERE t2.id = t1.id AND t2.validdate <= t1.date
      ORDER BY t2.validdate DESC
     ) t2;
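OUTER APPLY and TOP are SQL Server syntax. As a rough equivalent for engines without them, the sketch below uses a LEFT JOIN whose ON clause contains a correlated MAX() subquery, run here on in-memory SQLite from Python. This is a swapped-in technique, not the answer's query. The row with id 999 is hypothetical and is only there to show that a row with no earlier validation date is kept with NULLs, as OUTER APPLY would do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, content TEXT, date TEXT);
    CREATE TABLE table2 (id INTEGER, value REAL, validdate TEXT);
    INSERT INTO table1 VALUES
        (123, 'X',  '2013-11-18'),
        (123, 'ZE', '2013-11-29'),
        (999, 'Q',  '2013-01-01');  -- hypothetical: no earlier validation
    INSERT INTO table2 VALUES
        (123, 0.54, '2013-11-11'),
        (123, 0.42, '2013-11-18'),
        (123, 0.32, '2013-11-27'),
        (999, 7.0,  '2013-06-01');  -- hypothetical: later than the date
""")

# For each table1 row, join only the table2 row whose validdate is the
# latest one not after the date. If no such row exists, MAX() yields
# NULL, the join condition fails, and LEFT JOIN keeps the row with NULLs.
rows = conn.execute("""
    SELECT t1.id, t1.content, t1.date, t2.value, t2.validdate
    FROM table1 AS t1
    LEFT JOIN table2 AS t2
      ON t2.id = t1.id
     AND t2.validdate = (
           SELECT MAX(b.validdate)
           FROM table2 AS b
           WHERE b.id = t1.id AND b.validdate <= t1.date
         )
    ORDER BY t1.id, t1.date
""").fetchall()
for row in rows:
    print(row)
```

Unlike TOP 1, this form returns more than one row per table1 row if table2 has several rows with the same id and the same latest validdate.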
Related
Say input:
Table T1
row_num_unimportant indicator
1 111
2 222
Table T2
row_num_unimportant indicator val_timestamp val_of_interest2
1 112 timestamp2 value1
2 113 timestamp1 value3
3 114 timestamp3 value2
4 223 timestamp4 value5
5 224 timestamp5 value4
I'd like to see the JOIN results
indicator min_timestamp val_of_interest2
111 timestamp1 value3
222 timestamp4 value5
The difficulty is getting val_of_interest2 to correspond to the min_timestamp.
Say in a naive JOIN:
SELECT
indicator,
MIN(val_timestamp) AS min_timestamp,
???? AS val_of_interest2
FROM (
SELECT
t1.indicator,
t2.val_timestamp,
t2.val_of_interest2
FROM
T1 t1
JOIN T2 t2
ON (t2.indicator >= t1.indicator)
)
GROUP BY
indicator
Basically, what do I put in the ???? part? (Or do I need a different query altogether?)
Thanks!
You would not use group by for this. One option is window functions:
SELECT indicator, val_timestamp, val_of_interest2
FROM (SELECT t1.indicator, t2.val_timestamp, t2.val_of_interest2,
ROW_NUMBER() OVER (PARTITION BY t1.indicator ORDER BY t2.val_timestamp) as seqnum
FROM T1 t1 JOIN
T2 t2
ON t2.indicator >= t1.indicator
) t
WHERE seqnum = 1;
I have a table that looks like this
ID Type Change_Date
1 t1 2015-10-08
1 t2 2016-01-03
1 t3 2016-03-07
2 t1 2017-12-13
2 t2 2018-02-01
It shows if a customer has changed account type and when. However, I'd like a query that can give me the following output
ID Type Change_Date
1 t1 2015-10
1 t1 2015-11
1 t1 2015-12
1 t2 2016-01
1 t2 2016-02
1 t3 2016-03
1 t3 2016-04
... ... ...
1 t3 2018-10
for each ID. The output shows what account type the customer had for each month until the current month. My problem is filling in the "empty" months. In some cases the interval between account changes can be more than a year.
I hope this makes sense.
Thanks in advance.
Based on Presto SQL (because your original question is about Presto SQL)
Update on 2018-11-01: use lead() to simplify the SQL
Prepare data
Table mytable is the same as yours
id type update_date
1 t1 2015-10-08
1 t2 2016-01-03
1 t3 2016-03-07
2 t1 2017-12-13
2 t2 2018-02-01
Table t_month is a dictionary table which has one row per month from 2015-01 to 2019-12. This kind of dictionary table is useful.
ym
2015-01
2015-02
2015-03
2015-04
2015-05
2015-06
2015-07
2015-08
2015-09
...
2019-12
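If you don't have such a month table yet, it can be generated. The answer does not say how t_month was built, so the following is only one possible sketch: it fills the table with a recursive CTE on in-memory SQLite from Python. SQLite stands in for Presto here, and date() and strftime() are SQLite functions, not Presto ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_month (ym TEXT PRIMARY KEY)")

# Walk from 2015-01-01 to 2019-12-01 one month at a time and store
# each step formatted as YYYY-MM.
conn.execute("""
    WITH RECURSIVE months(d) AS (
        SELECT '2015-01-01'
        UNION ALL
        SELECT date(d, '+1 month') FROM months WHERE d < '2019-12-01'
    )
    INSERT INTO t_month (ym)
    SELECT strftime('%Y-%m', d) FROM months
""")

months = [row[0] for row in conn.execute("SELECT ym FROM t_month ORDER BY ym")]
print(len(months), months[0], months[-1])
```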
Add lifespan for mytable
Normally, you should store your data together with its lifespan, so mytable would look like this:
id type start_date end_date
1 t1 2015-10-08 2016-01-03
1 t2 2016-01-03 2016-03-07
1 t3 2016-03-07 null
2 t1 2017-12-13 2018-02-01
2 t2 2018-02-01 null
But in this case you don't, so the next step is to 'create' one using the lead() window function.
select
    id,
    type,
    date_format(update_date, '%Y-%m') as start_month,
    lead(
        date_format(update_date, '%Y-%m'),
        1, -- next one
        date_format(current_date + interval '1' month, '%Y-%m') -- if null, return next month
    ) over (partition by id order by update_date) as end_month
from mytable
Output
id type start_month end_month
1 t1 2015-10 2016-01
1 t2 2016-01 2016-03
1 t3 2016-03 2018-11
2 t1 2017-12 2018-02
2 t2 2018-02 2018-11
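The lead() step can be reproduced outside Presto. The sketch below runs it on in-memory SQLite from Python, with strftime() standing in for Presto's date_format(). The fixed value of today is an assumption chosen so that the open-ended rows end in 2018-11 like the output above; a live query would use the current date instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER, type TEXT, update_date TEXT);
    INSERT INTO mytable VALUES
        (1, 't1', '2015-10-08'), (1, 't2', '2016-01-03'),
        (1, 't3', '2016-03-07'), (2, 't1', '2017-12-13'),
        (2, 't2', '2018-02-01');
""")

# Assumption: a fixed "today" keeps the result reproducible.
today = "2018-10-15"

# LEAD() looks one row ahead within each id; when there is no next
# row, the third argument (next month) is used as the end of the span.
rows = conn.execute("""
    SELECT id, type,
           strftime('%Y-%m', update_date) AS start_month,
           LEAD(strftime('%Y-%m', update_date), 1,
                strftime('%Y-%m', date(?, 'start of month', '+1 month')))
               OVER (PARTITION BY id ORDER BY update_date) AS end_month
    FROM mytable
    ORDER BY id, update_date
""", (today,)).fetchall()
for row in rows:
    print(row)
```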
Cross join id and month
It's simple
with id_month as (
select * from t_month
cross join (select distinct id from mytable)
)
select * from id_month
Output
ym id
2015-01 1
2015-02 1
2015-03 1
...
2019-12 1
2015-01 2
2015-02 2
2015-03 2
...
2019-12 2
Finally
Now you can use a subquery in the select clause:
select
id,
type,
ym
from (
select
t1.id,
t1.ym,
(select type from mytable2 where t1.id = id and t1.ym >= start_month and t1.ym < end_month) as type
from id_month t1
)
where type is not null
-- order by id, ym
Full sql
with mytable2 as (
select
id,
type,
date_format(update_date, '%Y-%m') as start_month,
lead(
date_format(update_date, '%Y-%m'),
1, -- next one
date_format(current_date+interval '1' month, '%Y-%m') -- if null return next month
) over(partition by id order by update_date) as end_month
from mytable
)
, id_month as (
select * from t_month
cross join (select distinct id from mytable)
)
select
id,
type,
ym
from (
select
t1.id,
t1.ym,
(select type from mytable2 where t1.id = id and t1.ym >= start_month and t1.ym < end_month) as type
from id_month t1
)
where type is not null
--order by id, ym
Output
id type ym
1 t1 2015-10
1 t1 2015-11
1 t1 2015-12
1 t2 2016-01
1 t2 2016-02
1 t3 2016-03
1 t3 2016-04
...
1 t3 2018-10
2 t1 2017-12
2 t1 2018-01
2 t2 2018-02
...
2 t2 2018-10
I have a table that contains:
itemid inventdimid datephysical transrefid
10001 123 2015-01-02 300002
10002 123 2015-01-03 3566
10001 123 2015-02-05 55555
10002 124 2015-02-01 4545
The result I want:
itemid inventdimid datephysical transrefid
10001 123 2015-02-05 55555
10002 123 2015-01-03 3566
10002 124 2015-02-01 4545
My query:
SELECT a.itemid,a.inventdimid,max(a.datephysical),a.transrefid
FROM a where dataareaid = 'ermi'
group by a.itemid,a.inventdimid
It fails with an error saying that transrefid is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Use the ANSI standard row_number() function:
select t.*
from (select t.*,
             row_number() over (partition by itemid, inventdimid
                                order by datephysical desc) as seqnum
      from a t
      where t.dataareaid = 'ermi'
     ) t
where seqnum = 1;
Find max(a.datephysical) for each itemid, inventdimid combination, select all rows from that date.
SELECT itemid, inventdimid, datephysical, transrefid
FROM a a1
where dataareaid = 'ermi'
and datephysical = (select max(datephysical)
from a a2
where a1.itemid = a2.itemid
and a1.inventdimid = a2.inventdimid
and a2.dataareaid = 'ermi')
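The row_number() query and the max() subquery agree on the sample data, but they differ when two rows in a group share the latest date. The sketch below shows this on in-memory SQLite from Python. The fifth row is hypothetical and added only to create such a tie, and the dataareaid values are assumed to be 'ermi' for every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (itemid INTEGER, inventdimid INTEGER, datephysical TEXT,
                    transrefid INTEGER, dataareaid TEXT);
    INSERT INTO a VALUES
        (10001, 123, '2015-01-02', 300002, 'ermi'),
        (10002, 123, '2015-01-03', 3566,   'ermi'),
        (10001, 123, '2015-02-05', 55555,  'ermi'),
        (10002, 124, '2015-02-01', 4545,   'ermi'),
        (10002, 124, '2015-02-01', 9999,   'ermi');  -- hypothetical tie
""")

# ROW_NUMBER() keeps exactly one row per group; which of the tied rows
# wins is not defined unless the ORDER BY gets a tie-breaker.
by_rownum = conn.execute("""
    SELECT itemid, inventdimid, datephysical, transrefid
    FROM (SELECT a.*,
                 ROW_NUMBER() OVER (PARTITION BY itemid, inventdimid
                                    ORDER BY datephysical DESC) AS seqnum
          FROM a
          WHERE dataareaid = 'ermi') AS t
    WHERE seqnum = 1
""").fetchall()

# The correlated MAX() filter keeps every row that carries the latest date.
by_max = conn.execute("""
    SELECT itemid, inventdimid, datephysical, transrefid
    FROM a AS a1
    WHERE dataareaid = 'ermi'
      AND datephysical = (SELECT MAX(datephysical)
                          FROM a AS a2
                          WHERE a1.itemid = a2.itemid
                            AND a1.inventdimid = a2.inventdimid
                            AND a2.dataareaid = 'ermi')
""").fetchall()

print(len(by_rownum), "rows with ROW_NUMBER()")
print(len(by_max), "rows with MAX() subquery")
```

Which behavior is right depends on whether tied rows should all be reported or collapsed to one.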
You have to compute the latest date per group in a derived table with your GROUP BY and then join the original table with it.
Try this:
SELECT T1.itemid, T1.inventdimid, T2.datephysical, T2.transrefid FROM
(SELECT itemid, inventdimid, MAX(datephysical) AS maxdate
 FROM TableName
 GROUP BY itemid, inventdimid) T1 JOIN
(SELECT itemid, inventdimid, datephysical, transrefid
 FROM TableName) T2 ON T1.itemid = T2.itemid AND T1.inventdimid = T2.inventdimid
                   AND T2.datephysical = T1.maxdate
I'm assuming you want the transrefid corresponding to the max(a.datephysical) shown? This can be done by turning the column into a subquery:
SELECT a.itemid, a.inventdimid, max(a.datephysical),
       (SELECT b.transrefid FROM MY_TABLE b where
        b.dataareaid = 'ermi' and b.itemid = a.itemid and b.inventdimid = a.inventdimid
        and b.datephysical = max(a.datephysical)) as transrefid
FROM MY_TABLE a where dataareaid = 'ermi'
group by a.itemid, a.inventdimid
Some databases may not support this syntax, though, and it will fail if more than one record has the same date.
I have the following table
LogCheque (LogChequeID, ChequeID, Date, HolderID)
Each row shows which cheque (ChequeID) was transferred to whom (HolderID) on which date.
I want to select the list of LogCheques with each cheque appearing only once, showing its last transfer.
example data
LogChequeID ChequeID Date HolderID
1 1012 2013-01-10 200
2 1526 2013-01-12 125
3 1012 2013-01-19 413
4 1526 2013-02-11 912
5 1526 2013-02-17 800
and my desired output would be
LogChequeID ChequeID Date HolderID
3 1012 2013-01-19 413
5 1526 2013-02-17 800
I have tried
select lch.ChequeID, lch.DateFa, lch.ChequeID
from LCheque lch
group by lch.ChequeID, lch.DateFa, lch.LChequeID
having lch.LChequeID = (select MAX(LChequeID) where ChequeID = lch.ChequeID)
but it returns every row.
Any help would be greatly helpful and appreciated with open arms :) Thanks in advance.
You can use CTE + ROW_NUMBER() ranking function
;WITH cte AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY ChequeID ORDER BY [Date] DESC) AS rn
FROM dbo.LCheque
)
SELECT *
FROM cte
WHERE rn = 1
OR option with EXISTS operator
SELECT *
FROM dbo.LCheque t
WHERE EXISTS(
SELECT 1
FROM dbo.LCheque t2
WHERE t.ChequeID = t2.ChequeID
HAVING MAX(t2.[Date]) = t.[Date]
)
OR option with APPLY() operator
SELECT *
FROM dbo.LCheque t CROSS APPLY (
SELECT 1
FROM dbo.LCheque t2
WHERE t.ChequeID = t2.ChequeID
HAVING MAX(t2.[Date]) = t.[Date]
) o (IsMatch)
select lch.ChequeID, lch.Date, lch.HolderID
from LCheque lch
join (select ChequeID, max(Date) as MaxDate
      from LCheque
      group by ChequeID) m
  on m.ChequeID = lch.ChequeID and m.MaxDate = lch.Date
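A CTE is much neater (and perhaps more efficient), but you almost had it. The subquery needs its own FROM clause, and the filter belongs in WHERE, not HAVING:
select lch.LChequeID, lch.ChequeID, lch.DateFa
from LCheque lch
where lch.LChequeID = (select MAX(l2.LChequeID) from LCheque l2 where l2.ChequeID = lch.ChequeID)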
Here's my sql server table
ID Date Value
___ ____ _____
3241 9/17/12 5
3241 9/16/12 100
3241 9/15/12 20
4355 9/16/12 12
4355 9/15/12 132
4355 9/14/12 4
1234 9/16/12 45
2236 9/15/12 128
2236 9/14/12 323
2002 9/17/12 45
This seems like it should be easy to do, but I don't know why I'm stuck. I'd like to select ONLY the max(date) and value at that max(date) for each id. I want to ignore all other dates that aren't the max(date) with respect to each id.
Here's what I'd like the table to look like:
ID Date Value
___ ____ _____
3241 9/17/12 5
4355 9/16/12 12
1234 9/16/12 45
2236 9/15/12 128
2002 9/17/12 45
I tried group by using max(date), but it didn't group anything. I'm not sure what I'm doing wrong. Thanks in advance for the help!
You can use the following:
select t1.id, t2.mxdate, t1.value
from yourtable t1
inner join
(
select max(date) mxdate, id
from yourtable
group by id
) t2
on t1.id = t2.id
and t1.date = t2.mxdate
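Here is a small runnable sketch of that join against the sample data, using in-memory SQLite from Python. One assumption: the dates are stored as ISO-8601 text (2012-09-17 in place of 9/17/12) so that MAX() compares them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE yourtable (id INTEGER, date TEXT, value INTEGER);
    INSERT INTO yourtable VALUES
        (3241, '2012-09-17', 5),   (3241, '2012-09-16', 100),
        (3241, '2012-09-15', 20),  (4355, '2012-09-16', 12),
        (4355, '2012-09-15', 132), (4355, '2012-09-14', 4),
        (1234, '2012-09-16', 45),  (2236, '2012-09-15', 128),
        (2236, '2012-09-14', 323), (2002, '2012-09-17', 45);
""")

# The derived table t2 holds the latest date per id; joining back on
# both id and date recovers the value stored on that date.
rows = conn.execute("""
    SELECT t1.id, t2.mxdate, t1.value
    FROM yourtable AS t1
    JOIN (SELECT id, MAX(date) AS mxdate
          FROM yourtable
          GROUP BY id) AS t2
      ON t1.id = t2.id AND t1.date = t2.mxdate
    ORDER BY t1.id
""").fetchall()
for row in rows:
    print(row)
```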
I have used this to avoid the join:
WITH table1
AS (SELECT
id,
Date,
Value,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY Date DESC) AS rn
FROM yourtable)
SELECT
*
FROM table1
WHERE rn = 1
This would give you what you need:
SELECT
m.ID,
m.Date,
m.Value
FROM
myTable m
JOIN (SELECT ID, max(Date) as Date FROM myTable GROUP BY ID) as a
ON m.ID = a.ID and m.Date = a.Date
You haven't specified your SQL implementation, but something like this should work:
Note that the OP didn't specifically ask to use max(), just to get the id and value at the max[imum] date.
T-SQL:
select top 1 ID, Date, Value from yourtable order by Date DESC;
Not T-SQL; the dialect has to support LIMIT (not tested):
select ID, Date, Value from yourtable order by Date DESC limit 1;
Both return only the single most recent row in the table, not one row per ID.