SQL Where In clause with multiple fields - sql

I have a table as below.
id date value
1 2011-10-01 xx
1 2011-10-02 xx
...
1000000 2011-10-01 xx
Then I have 1000 ids, each associated with a date. I would like to perform something like the following:
SELECT id, date, value
FROM the table
WHERE (id, date) IN ((id1, <= date1), (id2, <= date2), (id1000, <= date1000))
What's the best way to achieve above query?

You didn't specify your DBMS, so this is standard SQL.
You could do something like this:
with list_of_dates (id, dt) as (
values
(1, date '2016-01-01'),
(2, date '2016-01-02'),
(3, date '2016-01-03')
)
select t.*
from the_table t
join list_of_dates ld on t.id = ld.id and t.the_date <= ld.dt;
This assumes that you do not have duplicates in the list of dates.
Update - now that the DBMS has been disclosed.
For SQL Server you need to change that to:
with list_of_dates (id, dt) as (
select 1, cast('20160101' as datetime) union all
select 2, cast('20160102' as datetime) union all
select 3, cast('20160103' as datetime)
)
select t.*
from the_table t
join list_of_dates ld on t.id = ld.id and t.the_date <= ld.dt;
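A minimal sketch of the same join, run through Python's built-in sqlite3 module so it can be tried without a database server. SQLite stands in for the real DBMS here, and the sample rows, the `rows_up_to_cutoff` helper, and the TEXT date columns are assumptions made for illustration. The primary key on `list_of_dates.id` enforces the "no duplicates" assumption mentioned above.

```python
import sqlite3

def rows_up_to_cutoff(rows, cutoffs):
    """Return (id, date, value) rows whose date is <= the cutoff date for that id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE the_table (id INTEGER, the_date TEXT, value TEXT)")
    conn.executemany("INSERT INTO the_table VALUES (?, ?, ?)", rows)
    # The id/date pairs get their own table, playing the role of list_of_dates.
    # The primary key rejects a second cutoff for the same id.
    conn.execute("CREATE TEMP TABLE list_of_dates (id INTEGER PRIMARY KEY, dt TEXT)")
    conn.executemany("INSERT INTO list_of_dates VALUES (?, ?)", cutoffs)
    result = conn.execute(
        """
        SELECT t.id, t.the_date, t.value
        FROM the_table AS t
        JOIN list_of_dates AS ld
          ON t.id = ld.id AND t.the_date <= ld.dt
        ORDER BY t.id, t.the_date
        """
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data; ISO-8601 date strings compare correctly as text.
sample_rows = [
    (1, "2016-01-01", "a"),
    (1, "2016-01-02", "b"),
    (2, "2016-01-01", "c"),
    (2, "2016-01-05", "d"),
    (3, "2016-01-04", "e"),
]
sample_cutoffs = [(1, "2016-01-01"), (2, "2016-01-02"), (3, "2016-01-03")]

print(rows_up_to_cutoff(sample_rows, sample_cutoffs))
# [(1, '2016-01-01', 'a'), (2, '2016-01-01', 'c')]
```

With 1000 pairs, loading them through `executemany` (or a bulk insert in the real DBMS) is likely easier to maintain than a 1000-line literal in the query text.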

Since this information is known ahead of time, build a temp table from it and then join to it:
create table #test(id int, myDate date)
insert into #test(id,myDate) values
(1, '10/1/2016'),
(2, '10/2/2016'),
(3, '10/3/2016')
select a.id, a.date, a.value
from table as a
inner join
#test as b on a.id=b.id and a.date<=b.myDate

Related

SQL inner join with filtering

I have 2 tables as follows:
Table1:
ID Date
1 2022-01-01
2 2022-02-01
3 2022-02-05
Table2
ID Date Amount
1 2021-08-01 15
1 2022-02-10 15
2 2022-02-15 20
2 2021-01-01 15
2 2022-02-20 20
1 2022-03-01 15
I want to select the rows in Table2 such that only rows past the Date in Table1 are selected in Table2 and calculate a sum of amounts of each subset and max(date) in Table2 for each subset grouped by ID.
So the result would look like
ID Date Amount
1 2022-03-01 30
2 2022-02-20 40
SQL newbie here... I tried an inner join, but wasn't able to pass the date filter along.
Tried query:
with table1 as (select * from table1)
,table2 as (select * from table2)
select * from table1 a
inner join table2 b on (a.id=b.id)
Thanks!
Much like Paul, I would use a JOIN but I would put the clauses on the ON, so if you join to more tables, it's cleaner for the SQL optimizer to see what is the intent on a per table/join basis. I would also use aliases on tables and use the alias, so there is no room for confusion where the value is coming from, which again as a habit makes life easier when composing more complex SQL or cut'n'pasting into bigger blocks of code.
so with some CTE's for the data:
WITH table1(id, date) AS (
SELECT * FROM VALUES
(1, '2022-01-01'),
(2 , '2022-02-01'),
(3 , '2022-02-05')
), table2(id, date, amount) AS (
SELECT * FROM VALUES
(1, '2021-08-01'::date, 15),
(1, '2022-02-10'::date, 15),
(2, '2022-02-15'::date, 20),
(2, '2021-01-01'::date, 15),
(2, '2022-02-20'::date, 20),
(1, '2022-03-01'::date, 15)
)
The following SQL:
SELECT a.id,
max(b.date) as max_date,
sum(b.amount) as sum_amount
FROM table1 AS a
JOIN table2 AS b
ON a.id = b.id AND a.date <= b.date
GROUP BY 1
ORDER BY 1;
ID MAX_DATE SUM_AMOUNT
1 2022-03-01 30
2 2022-02-20 40
Here is how I would do this with Snowflake:
--create the tables and load data
--table1
CREATE TABLE TABLE1 (ID NUMBER, DATE DATE);
INSERT INTO TABLE1 VALUES (1, '2022-01-01');
INSERT INTO TABLE1 VALUES (2 , '2022-02-01');
INSERT INTO TABLE1 VALUES (3 , '2022-02-05');
--table 2
CREATE TABLE TABLE2 (ID NUMBER, DATE DATE, AMOUNT NUMBER);
INSERT INTO TABLE2 VALUES(1, '2021-08-01', 15);
INSERT INTO TABLE2 VALUES(1, '2022-02-10', 15);
INSERT INTO TABLE2 VALUES(2, '2022-02-15', 20);
INSERT INTO TABLE2 VALUES(2, '2021-01-01', 15);
INSERT INTO TABLE2 VALUES(2, '2022-02-20', 20);
INSERT INTO TABLE2 VALUES(1, '2022-03-01', 15);
Now obtain the data using a select
SELECT TABLE1.ID, MAX(TABLE2.DATE), SUM(AMOUNT)
FROM TABLE1, TABLE2
WHERE TABLE1.ID = TABLE2.ID
AND TABLE1.DATE < TABLE2.DATE
GROUP BY TABLE1.ID
Results
ID MAX(TABLE2.DATE) SUM(AMOUNT)
1 2022-03-01 30
2 2022-02-20 40
Not personally familiar with Snowflake but a standard SQL query that should work would be:
select id, Max(date) Date, Sum(Amount) Amount
from Table2 t2
where exists (
select * from Table1 t1
where t1.Id = t2.Id and t1.Date < t2.Date
)
group by Id;
Note that because you only require data from Table2, an exists is often preferable to an inner join: it cannot duplicate Table2 rows when Table1 has more than one matching row for an Id, and it usually performs at least as well as the join.
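To make the difference between the join and the exists form concrete, here is a small sketch using Python's sqlite3 module, with SQLite standing in for Snowflake (an assumption; TEXT columns hold the dates). Both queries give the expected result on the question's data. The extra Table1 row for id 1 is hypothetical, not part of the question, and shows how a duplicate id inflates the join's sum but leaves the exists result unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER, date TEXT);
    CREATE TABLE table2 (id INTEGER, date TEXT, amount INTEGER);
    INSERT INTO table1 VALUES (1, '2022-01-01'), (2, '2022-02-01'), (3, '2022-02-05');
    INSERT INTO table2 VALUES
        (1, '2021-08-01', 15), (1, '2022-02-10', 15), (2, '2022-02-15', 20),
        (2, '2021-01-01', 15), (2, '2022-02-20', 20), (1, '2022-03-01', 15);
    """
)

JOIN_SQL = """
    SELECT a.id, MAX(b.date), SUM(b.amount)
    FROM table1 AS a
    JOIN table2 AS b ON a.id = b.id AND a.date < b.date
    GROUP BY a.id
    ORDER BY a.id
"""
EXISTS_SQL = """
    SELECT t2.id, MAX(t2.date), SUM(t2.amount)
    FROM table2 AS t2
    WHERE EXISTS (SELECT 1 FROM table1 AS t1
                  WHERE t1.id = t2.id AND t1.date < t2.date)
    GROUP BY t2.id
    ORDER BY t2.id
"""

join_rows = conn.execute(JOIN_SQL).fetchall()
exists_rows = conn.execute(EXISTS_SQL).fetchall()
print(join_rows)    # [(1, '2022-03-01', 30), (2, '2022-02-20', 40)]
print(exists_rows)  # [(1, '2022-03-01', 30), (2, '2022-02-20', 40)]

# Hypothetical extra row: a second cutoff date for id 1.
conn.execute("INSERT INTO table1 VALUES (1, '2022-01-15')")
join_rows_dup = conn.execute(JOIN_SQL).fetchall()
exists_rows_dup = conn.execute(EXISTS_SQL).fetchall()
conn.close()

print(join_rows_dup[0])    # (1, '2022-03-01', 60): each Table2 row is counted twice
print(exists_rows_dup[0])  # (1, '2022-03-01', 30)
```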

Selecting minimal dates, or nulls in SQL

This is grossly oversimplified, but:
I have a table, something like the following:
CREATE TABLE Table1
([ID] int, [USER] varchar(5), [DATE] date)
;
INSERT INTO Table1
([ID], [USER], [DATE])
VALUES
(1, 'A', '2018-10-01'),
(2, 'A', '2018-09-01'),
(3, 'A', NULL),
(4, 'B', '2018-05-03'),
(5, 'B', '2017-04-01'),
(6, 'C', NULL)
;
And for each user, I wish to retrieve the whole row of details where the DATE variable is minimal.
SELECT T.USER FROM TABLE1 T
WHERE T.DATE = (SELECT MIN(DATE) FROM TABLE1 T1 WHERE T1.USER = T.USER)
Works great, however in the instance there is no row with a populated DATE field, there will be a row with a NULL, like the final row of my table above, which I also wish to select.
So my ideal output in this case is:
(2, 'A', '2018-09-01'),
(5, 'B', '2017-04-01'),
(6, 'C', NULL)
SQL fiddle: http://www.sqlfiddle.com/#!9/df42b5/6
I think something could be done using an EXCLUDE statement but it gets complex very quickly.
You may try with row_number()
select * from
(select *, row_number() over(partition by [user] order by [user],case when
[date] is null then 0 else 1 end desc,[date]) as rn
from Table1)x where rn=1
Use UNION and a correlated subquery with the MIN() function:
CREATE TABLE Table1 (ID int, usr varchar(50), DATE1 date)
;
INSERT INTO Table1 VALUES
(1, 'A', '2018-10-01'),
(2, 'A', '2018-09-01'),
(3, 'A', NULL),
(4, 'B', '2018-05-03'),
(5, 'B', '2017-04-01'),
(6, 'C', NULL)
;
select * from Table1 t where
DATE1= (select min(date1) from Table1 t1 where t1.usr=t.usr
) and date1 is not null
union
select * from Table1 t where date1 is null
and t.usr not in ( select usr from Table1 where date1 is not null)
DEMO
ID usr DATE1
2 A 01/09/2018 00:00:00
5 B 01/04/2017 00:00:00
6 C
You can use GROUP BY and JOIN to output the desired results.
select t.Id
, x.[User]
, x.[MinDate] as [Date]
from
(select [User]
, min([Date]) as MinDate
from table1
group by [User]) x
inner join table1 t on t.[User] = x.[User] and (t.[Date] = x.[MinDate] or x.[MinDate] is null)
You can use a Common Table Expression:
;WITH chronology AS (
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY [USER]
ORDER BY ISNULL([DATE], '2900-01-01') ASC
) Idx
FROM TABLE1
)
SELECT ID, [USER], [DATE]
FROM chronology
WHERE Idx=1;
Using a CTE in this solution simplifies the query, improving its readability, maintainability, and extensibility. Furthermore, I expect this approach to be optimal in terms of performance.
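The ROW_NUMBER() idea can be checked against the question's sample data with a short sqlite3 sketch. SQLite (3.25 or later, for window functions) is an assumption standing in for SQL Server, and the columns are renamed to `usr` and `dt` to avoid quoting. Instead of ISNULL with a far-future date, the sketch sorts on `dt IS NULL` first, which pushes NULL dates to the end of each partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER, usr TEXT, dt TEXT);
    INSERT INTO table1 VALUES
        (1, 'A', '2018-10-01'), (2, 'A', '2018-09-01'), (3, 'A', NULL),
        (4, 'B', '2018-05-03'), (5, 'B', '2017-04-01'), (6, 'C', NULL);
    """
)

earliest = conn.execute(
    """
    WITH chronology AS (
        SELECT id, usr, dt,
               ROW_NUMBER() OVER (
                   PARTITION BY usr
                   -- (dt IS NULL) is 0 for real dates and 1 for NULL,
                   -- so NULL dates are ranked after every real date
                   ORDER BY dt IS NULL, dt
               ) AS idx
        FROM table1
    )
    SELECT id, usr, dt
    FROM chronology
    WHERE idx = 1
    ORDER BY id
    """
).fetchall()
conn.close()

print(earliest)
# [(2, 'A', '2018-09-01'), (5, 'B', '2017-04-01'), (6, 'C', None)]
```

A user with only NULL dates still gets exactly one row, because ROW_NUMBER() always assigns 1 to some row in every partition.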

Find most recent record by date

This is my original data (anonymised):
id usage verified date
1 4000 Y 2015-03-20
2 5000 N 2015-06-20
3 6000 N 2015-07-20
4 7000 Y 2016-09-20
Original query:
SELECT
me.usage,
mes.verified,
mes.date
FROM
Table1 me,
Table2 mes,
Table3 m,
Table4 mp
WHERE
me.theFk=mes.id
AND mes.theFk=m.id
AND m.theFk=mp.id
How would I go about selecting the most recent verified and non-verified?
So I would be left with:
id usage verified date
1 6000 N 2015-07-20
2 7000 Y 2016-09-20
I am using Microsoft SQL Server 2012.
First, do not use implicit (comma-separated) joins. Explicit JOIN syntax has been part of standard SQL since SQL-92.
Second, embrace the power of the CTE, the in clause and row_number:
with CTE as
(
select
me.usage,
mes.verified,
mes.date,
row_number() over (partition by Verified order by Date desc) as CTEOrd
from Table1 me
inner join Table2 mes
on me.theFK = mes.id
where mes.theFK in
(
select m.id
from Table3 m
inner join Table4 mp
on mp.id = m.theFK
)
)
select CTE.*
from CTE
where CTEOrd = 1
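As a rough check of the row_number approach, the sketch below runs the same pattern on the question's four sample rows using Python's sqlite3 module. SQLite 3.25 or later is assumed in place of SQL Server 2012, and the single `readings` table with a `read_date` column is a hypothetical stand-in for the result of the four-table join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE readings (id INTEGER, usage INTEGER, verified TEXT, read_date TEXT);
    INSERT INTO readings VALUES
        (1, 4000, 'Y', '2015-03-20'), (2, 5000, 'N', '2015-06-20'),
        (3, 6000, 'N', '2015-07-20'), (4, 7000, 'Y', '2016-09-20');
    """
)

latest = conn.execute(
    """
    WITH ranked AS (
        SELECT usage, verified, read_date,
               -- number the rows of each verified group, newest first
               ROW_NUMBER() OVER (
                   PARTITION BY verified ORDER BY read_date DESC
               ) AS ord
        FROM readings
    )
    SELECT usage, verified, read_date
    FROM ranked
    WHERE ord = 1
    ORDER BY verified
    """
).fetchall()
conn.close()

print(latest)
# [(6000, 'N', '2015-07-20'), (7000, 'Y', '2016-09-20')]
```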
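You can select the TOP 1 ordered by date for verified=N, union'd with the TOP 1 ordered by date for verified=Y. Each TOP 1 query has to be wrapped in a derived table, because ORDER BY is not allowed directly on the individual branches of a UNION.
Or in pseudo SQL:
SELECT * FROM (
SELECT TOP 1 ...fields ...
FROM ...tables/joins...
WHERE Verified = 'N'
ORDER BY Date DESC
) AS latest_unverified
UNION ALL
SELECT * FROM (
SELECT TOP 1 ...fields ...
FROM ...tables/joins...
WHERE Verified = 'Y'
ORDER BY Date DESC
) AS latest_verified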
drop table #stack2
CREATE TABLE #stack2
([id] int, [usage] int, [verified] varchar(1), [date] datetime)
;
INSERT INTO #stack2
([id], [usage], [verified], [date])
VALUES
(1, 4000, 'Y', '2015-03-20 00:00:00'),
(2, 5000, 'N', '2015-06-20 00:00:00'),
(3, 6000, 'N', '2015-07-20 00:00:00'),
(4, 7000, 'Y', '2016-09-20 00:00:00')
;
;with cte as (select verified,max(date) d from #stack2 group by verified)
select row_number() over( order by s2.[verified]),s2.[usage], s2.[verified], s2.[date] from #stack2 s2 join cte c on c.verified=s2.verified and c.d=s2.date
I wrote the query above against the sample data shown.
For your scenario, the following will be useful:
WITH cte1
AS (SELECT me.usage,
mes.verified,
mes.date
FROM Table1 me,
Table2 mes,
Table3 m,
Table4 mp
WHERE me.theFk = mes.id
AND mes.theFk = m.id
AND m.theFk = mp.id),
cte
AS (SELECT verified,
Max(date) d
FROM cte1
GROUP BY verified)
SELECT Row_number()
OVER(
ORDER BY s2.[verified]),
s2.[usage],
s2.[verified],
s2.[date]
FROM cte1 s2
JOIN cte c
ON c.verified = s2.verified
AND c.d = s2.date
You can do it as below, without a join.
-- Mock data
CREATE TABLE #Tbl (id INT, usage INT, verified CHAR(1), date DATETIME)
INSERT INTO #Tbl
VALUES
(1, 4000 ,'Y', '2015-03-20'),
(2, 5000 ,'N', '2015-06-20'),
(3, 6000 ,'N', '2015-07-20'),
(4, 7000 ,'Y', '2016-09-20')
SELECT
A.id ,
A.usage ,
A.verified ,
A.MaxDate
FROM
(
SELECT
id ,
usage ,
verified ,
date,
MAX(date) OVER (PARTITION BY verified) MaxDate
FROM
#Tbl
) A
WHERE
A.date = A.MaxDate
Result:
id usage verified MaxDate
----------- ----------- -------- ----------
3 6000 N 2015-07-20
4 7000 Y 2016-09-20
CREATE TABLE #Table ( ID INT ,usage INT, verified VARCHAR(10), _date DATE)
INSERT INTO #Table ( ID , usage , verified , _date)
SELECT 1,4000 , 'Y','2015-03-20' UNION ALL
SELECT 2, 5000 , 'N' ,'2015-06-20' UNION ALL
SELECT 3, 6000 , 'N' ,'2015-07-20' UNION ALL
SELECT 4, 7000 , 'Y' ,'2016-09-20'
SELECT ROW_NUMBER() OVER(ORDER BY usage) ID,usage , A.verified , A._date
FROM #Table
JOIN
(
SELECT verified , MAX(_date) _date
FROM #Table
GROUP BY verified
) A ON #Table._date = A._date AND #Table.verified = A.verified

how to use SQL group to filter rows with maximum date value

I have the following table
CREATE TABLE Test
(`Id` int, `value` varchar(20), `adate` varchar(20))
;
INSERT INTO Test
(`Id`, `value`, `adate`)
VALUES
(1, 100, '2014-01-01'),
(1, 200, '2014-01-02'),
(1, 300, '2014-01-03'),
(2, 200, '2014-01-01'),
(2, 400, '2014-01-02'),
(2, 30 , '2014-01-04'),
(3, 800, '2014-01-01'),
(3, 300, '2014-01-02'),
(3, 60 , '2014-01-04')
;
I want to select, for each Id, only the row with the maximum date, i.e.:
Id ,value ,adate
1, 300,'2014-01-03'
2, 30 ,'2014-01-04'
3, 60 ,'2014-01-04'
how can I achieve this using group by? I have done as follows but it is not working.
Select Id,value,adate
from Test
group by Id,value,adate
having adate = MAX(adate)
Can someone help with the query?
Select the maximum dates for each id.
select id, max(adate) max_date
from test
group by id
Join on that to get the rest of the columns.
select t1.*
from test t1
inner join (select id, max(adate) max_date
from test
group by id) t2
on t1.id = t2.id and t1.adate = t2.max_date;
Please try:
select
*
from
Test a
where
a.adate=(select MAX(adate) from Test b where b.Id=a.Id)
If you are using a DBMS that has analytical functions you can use ROW_NUMBER:
SELECT Id, Value, ADate
FROM ( SELECT ID,
Value,
ADate,
ROW_NUMBER() OVER(PARTITION BY ID ORDER BY Adate DESC) AS RowNum
FROM Test
) AS T
WHERE RowNum = 1;
Otherwise, you will need to join to the aggregated max date by Id, which filters the results from Test to only those rows where the date matches the maximum date for that Id:
SELECT Test.Id, Test.Value, Test.ADate
FROM Test
INNER JOIN
( SELECT ID, MAX(ADate) AS ADate
FROM Test
GROUP BY ID
) AS MaxT
ON MaxT.ID = Test.ID
AND MaxT.ADate = Test.ADate;
I would try something like this
Select t1.Id, t1.value, t1.adate
from Test as t1
where t1.adate = (select max(t2.adate)
from Test as t2
where t2.id = t1.id)
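The sketch below, using Python's sqlite3 module, illustrates why the GROUP BY/HAVING attempt in the question filters nothing and how the join to the per-id maximum behaves on the same data. SQLite is an assumption standing in for the unspecified DBMS, and `value` is stored as an integer here for simplicity. Because the query groups by Id, value and adate together, every group holds a single row, so `adate = MAX(adate)` is always true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE test (id INTEGER, value INTEGER, adate TEXT);
    INSERT INTO test VALUES
        (1, 100, '2014-01-01'), (1, 200, '2014-01-02'), (1, 300, '2014-01-03'),
        (2, 200, '2014-01-01'), (2, 400, '2014-01-02'), (2, 30, '2014-01-04'),
        (3, 800, '2014-01-01'), (3, 300, '2014-01-02'), (3, 60, '2014-01-04');
    """
)

# The question's attempt: one group per row, so HAVING keeps all nine rows.
having_rows = conn.execute(
    """
    SELECT id, value, adate
    FROM test
    GROUP BY id, value, adate
    HAVING adate = MAX(adate)
    """
).fetchall()

# Aggregate the max date per id first, then join back for the other columns.
latest_rows = conn.execute(
    """
    SELECT t1.id, t1.value, t1.adate
    FROM test AS t1
    JOIN (SELECT id, MAX(adate) AS max_date FROM test GROUP BY id) AS t2
      ON t1.id = t2.id AND t1.adate = t2.max_date
    ORDER BY t1.id
    """
).fetchall()
conn.close()

print(len(having_rows))  # 9
print(latest_rows)
# [(1, 300, '2014-01-03'), (2, 30, '2014-01-04'), (3, 60, '2014-01-04')]
```

MAX on a text column only finds the latest date when the strings are in a sortable format such as YYYY-MM-DD, as they are in the question.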

SQL Server Distinct Question

I need to be able to select only the first row for each name that has the greatest value.
I have a table with the following:
id name value
0 JOHN 123
1 STEVE 125
2 JOHN 127
3 JOHN 126
So I am looking to return:
id name value
1 STEVE 125
2 JOHN 127
Any idea on the MSSQL Syntax on how to perform this operation?
While you specified SQL Server, you did not specify the version. If you are using SQL Server 2005 or later, you can do something like:
With RankedItems As
(
Select id, name, value
, Row_Number() Over ( Partition By name Order By value Desc, id Asc ) As ItemRank
From Table
)
Select id, name, value
From RankedItems
Where ItemRank = 1
try:
SELECT
MIN(id) as id,dt.name,dt.value
FROM (SELECT
name,MAX(value) as value
FROM YourTable
GROUP BY name
) dt
INNER JOIN YourTable t ON dt.name=t.name and dt.value=t.value
GROUP BY dt.name,dt.value
try it out:
CREATE TABLE #YourTable (id int, name varchar(10), value int)
INSERT #YourTable VALUES (0, 'JOHN', 123)
INSERT #YourTable VALUES (1, 'STEVE', 125)
INSERT #YourTable VALUES (2, 'JOHN', 127)
INSERT #YourTable VALUES (3, 'JOHN', 126)
--extra data not in the question, shows why you need the outer group by
INSERT #YourTable VALUES (4, 'JOHN', 127)
INSERT #YourTable VALUES (5, 'JOHN', 127)
INSERT #YourTable VALUES (6, 'JOHN', 127)
INSERT #YourTable VALUES (7, 'JOHN', 127)
SELECT
MIN(id) as id,dt.name,dt.value
FROM (SELECT
name,MAX(value) as value
FROM #YourTable
GROUP BY name
) dt
INNER JOIN #YourTable t ON dt.name=t.name and dt.value=t.value
GROUP BY dt.name,dt.value
ORDER BY id
output:
id name value
----------- ---------- -----------
1 STEVE 125
2 JOHN 127
(2 row(s) affected)
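The effect of the outer GROUP BY on tied rows can be reproduced with a short sqlite3 sketch. SQLite is an assumption standing in for SQL Server, and `your_table` and the `first_id` alias are hypothetical names. Without the outer grouping, the join returns one row per tied maximum; with MIN(id) it collapses to the first id for each name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE your_table (id INTEGER, name TEXT, value INTEGER);
    INSERT INTO your_table VALUES
        (0, 'JOHN', 123), (1, 'STEVE', 125), (2, 'JOHN', 127), (3, 'JOHN', 126),
        -- extra tied rows, as in the answer above
        (4, 'JOHN', 127), (5, 'JOHN', 127), (6, 'JOHN', 127), (7, 'JOHN', 127);
    """
)

# Join to the per-name maximum only: every tied row survives.
tied_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT name, MAX(value) AS value FROM your_table GROUP BY name) AS dt
    JOIN your_table AS t ON dt.name = t.name AND dt.value = t.value
    """
).fetchone()[0]

# Adding MIN(id) with an outer GROUP BY keeps one row per name.
first_rows = conn.execute(
    """
    SELECT MIN(t.id) AS first_id, dt.name, dt.value
    FROM (SELECT name, MAX(value) AS value FROM your_table GROUP BY name) AS dt
    JOIN your_table AS t ON dt.name = t.name AND dt.value = t.value
    GROUP BY dt.name, dt.value
    ORDER BY first_id
    """
).fetchall()
conn.close()

print(tied_count)  # 6
print(first_rows)  # [(1, 'STEVE', 125), (2, 'JOHN', 127)]
```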
You could do something like
SELECT id, name, value
FROM (SELECT id, name, value,
ROW_NUMBER() OVER (PARTITION BY name ORDER BY value DESC) AS r
FROM table) AS x
WHERE x.r = 1 ;
This will not work in SQL Server 2000 and earlier, because ROW_NUMBER() was introduced in SQL Server 2005, but it should perform well in SQL Server 2005 and 2008.
How about:
SELECT a.id, a.name, b.maxvalue
FROM mytbl a
INNER JOIN (SELECT id, max(value) as maxvalue
FROM mytbl
GROUP BY id) b ON b.id = a.id
The query above groups by id, so it returns every row. To get the maximum per name, group by name and join on both name and value:
SELECT a.id, a.name, a.value
FROM mytbl a
INNER JOIN (SELECT name, max(value) as maxvalue
FROM mytbl
GROUP BY name) b ON b.name = a.name and b.maxvalue = a.value