select distinct list of ids from table with earliest value in same table

I have the following table,
SDate Id Balance
2016-01-01 ABC 3
2016-01-01 DEF 7
2016-01-01 GHI 2
2016-02-01 ABC 6
2016-02-01 DEF 4
2016-02-01 GHI 8
2016-02-01 XYZ 12
I need to write a query that gives me a distinct list of Ids over a date range (in this example, SDate >= '2016-01-01' and SDate <= '2016-02-01') and also gives me the earliest balance for each Id. The result I would like to see from the table above is:
Id Balance
ABC 3
DEF 7
GHI 2
XYZ 12
Is this possible?
UPDATE
Sorry I should have specified that for each date the Id is unique.

You can do this with a derived table that first works out the minimum SDate value for each Id value. Using this you then join back to your original table to find the Balance for the row that matches those values:
declare @t table(SDate date,Id nvarchar(3),Balance int);
insert into @t values ('2016-01-01','ABC',3),('2016-01-01','DEF',7),('2016-01-01','GHI',2),('2016-02-01','ABC',6),('2016-02-01','DEF',4),('2016-02-01','GHI',8),('2016-02-01','XYZ',12);
declare @StartDate date = '20160101';
declare @EndDate date = '20160201';
with d as
(
select Id
,min(SDate) as MinSDate
from @t
where SDate between @StartDate and @EndDate
group by id
)
select d.Id
,t.Balance
from d
inner join @t t
on(d.Id = t.Id
and d.MinSDate = t.SDate
);
Output:
Id | Balance
----+--------
ABC | 3
DEF | 7
GHI | 2
XYZ | 12

This should be possible with a window function - all you have to do is
partition by id
assign a row number, and
select the top row for each id
Example:
select id,
balance
from (
select id,
balance,
row_number() over( partition by id order by SDate ) as row_num
from table1
where SDate between '2016-01-01' and '2016-02-01'
) as a
where row_num = 1
Note: the advantage of this method is it is a lot more flexible. Say you wanted the 2 oldest records, you could just change to where row_num <= 2.
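As a sketch of that variation, assuming the same placeholder table1 with the SDate, Id and Balance columns from the question, the two earliest rows per id could be returned like this. SDate is added to the output only so the two rows can be told apart.

```sql
-- Sketch only: table1 is the placeholder table name used in the answer above.
select id,
       SDate,
       balance
from (
    select id,
           SDate,
           balance,
           -- number each id's rows from earliest to latest SDate
           row_number() over (partition by id order by SDate) as row_num
    from table1
    where SDate between '2016-01-01' and '2016-02-01'
) as a
where row_num <= 2   -- keep up to two earliest rows per id
order by id, SDate;
```

With the sample data, ABC, DEF and GHI each return two rows, while XYZ returns one because it has only a single row in the range.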

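Using the analytic function row_number() should be the fastest (add the SDate range filter from the question to the inner query if you need it):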
select *
from (
select
t.*,
row_number() over (partition by Id order by SDate) rn
from your_table t
) t where rn = 1;

You can achieve this with a self join, which may not be the fastest or most elegant solution:
CREATE TABLE #SOPostSample
(
SDate DATE ,
Id NVARCHAR(5) ,
Balance INT
);
INSERT INTO #SOPostSample
( SDate, Id, Balance )
VALUES ( '2016-01-01', 'ABC', 3 ),
( '2016-01-01', 'DEF', 7 ),
( '2016-01-01', 'GHI', 2 ),
( '2016-02-01', 'ABC', 6 ),
( '2016-02-01', 'DEF', 4 ),
( '2016-02-01', 'GHI', 8 ),
( '2016-02-01', 'XYZ', 12 );
SELECT t1.Id ,
MIN(t2.Balance) Balance
FROM #SOPostSample t1
INNER JOIN #SOPostSample t2 ON t1.Id = t2.Id
GROUP BY t1.Id ,
t2.SDate
HAVING t2.SDate = MIN(t1.SDate);
DROP TABLE #SOPostSample;
Produces:
id Balance
============
ABC 3
DEF 7
GHI 2
XYZ 12
This works for the sample data, but please test with more data as I just wrote it quickly.

This should work. TOP 1 is included only for safety; it should not be needed if the combination of SDate and Id is unique.
SELECT o.Id ,
( SELECT TOP 1
Balance
FROM tbl
WHERE Id = o.Id
AND SDate = MIN(o.SDate)
) Balance
FROM tbl o
WHERE o.SDate BETWEEN '20160101' AND '20160201'
GROUP BY o.Id;

You can use sub-query
SELECT Id ,
( SELECT TOP 1
Balance
FROM [TableName] AS T1
WHERE T1.Id = [TableName].Id
ORDER BY SDate
) AS Balance
FROM [TableName]
GROUP BY Id;
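The query above looks at every row in the table. A possible variant that also honours the date range from the question is sketched below; the aliases o and i are introduced here for illustration, and [TableName] is still the placeholder name.

```sql
-- Sketch only: [TableName] is a placeholder; the date range is the one
-- given in the question.
SELECT  o.Id ,
        ( SELECT TOP 1
                 i.Balance
          FROM   [TableName] AS i
          WHERE  i.Id = o.Id
                 -- restrict the lookup to the same range as the outer query
                 AND i.SDate BETWEEN '20160101' AND '20160201'
          ORDER BY i.SDate          -- earliest row first
        ) AS Balance
FROM    [TableName] AS o
WHERE   o.SDate BETWEEN '20160101' AND '20160201'
GROUP BY o.Id;
```

The filter is repeated inside the subquery so that the earliest balance is taken from within the range, not from an older row outside it.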

Related

How to get rows from two tables on maximum value of particular field

I have two tables that each have a date_updated column.
TableA is like below
con_id date_updated type
--------------------------------------------
123 19/06/2018 2
123 15/06/2018 1
123 01/05/2018 3
101 06/04/2018 1
101 05/03/2018 2
And I have TableB that also has the same structure
con_id date_updated type
--------------------------------------------
123 15/05/2018 2
123 01/05/2018 1
101 07/06/2018 1
The result should contain the row with the most recent date for each con_id:
con_id date_updated type
--------------------------------------------
123 19/06/2018 2
101 07/06/2018 1
Here the date_updated column is of the datetime data type in SQL Server. I tried using GROUP BY and selecting the maximum date_updated, but then I cannot include the type column in the SELECT statement. When I add type to the GROUP BY, the result is incorrect because the rows are also grouped by type. How can I query this? Please help.

SELECT *
FROM
(SELECT *, ROW_NUMBER() OVER(Partition By con_id ORDER BY date_updated DESC) as seq
FROM
(SELECT * FROM TableA
UNION ALL
SELECT * FROM TableB) as tblMain) as tbl2
WHERE seq = 1

One method:
WITH A AS(
SELECT TOP 1 con_id,
date_updated,
type
FROM TableA
ORDER BY date_updated DESC),
B AS(
SELECT TOP 1 con_id,
date_updated,
type
FROM TableB
ORDER BY date_updated DESC),
U AS(
SELECT *
FROM A
UNION ALL
SELECT *
FROM B)
SELECT *
FROM U;
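The two CTEs at the top get the most recent row from each table, and the final statement unions them together. Note that this returns one row per table rather than one row per con_id, so it matches the expected output only when each table's latest row belongs to a different con_id, as in the sample data.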
For the benefit of the person who says this doesn't work:
USE Sandbox;
GO
CREATE TABLE tablea (con_id int, date_updated date, [type] tinyint);
CREATE TABLE tableb (con_id int, date_updated date, [type] tinyint);
GO
INSERT INTO tablea
VALUES
(123,'19/06/2018',2),
(123,'15/06/2018',1),
(123,'01/05/2018',3),
(101,'06/04/2018',1),
(101,'05/03/2018',2);
INSERT INTO tableb
VALUES
(123,'15/05/2018',2),
(123,'01/05/2018',1),
(101,'07/06/2018',1);
GO
WITH A AS(
SELECT TOP 1 con_id,
date_updated,
[type]
FROM TableA
ORDER BY date_updated DESC),
B AS(
SELECT TOP 1 con_id,
date_updated,
[type]
FROM TableB
ORDER BY date_updated DESC),
U AS(
SELECT *
FROM A
UNION ALL
SELECT *
FROM B)
SELECT *
FROM U;
GO
DROP TABLE tablea;
DROP TABLE tableb;
This returns the dataset:
con_id date_updated type
----------- ------------ ----
123 2018-06-19 2
101 2018-06-07 1
Which is identical to the OP's data:
con_id date_updated type
--------------------------------------------
123 19/06/2018 2
101 07/06/2018 1

Hope this helps:
WITH combined
AS(
select * FROM tableA
UNION
select * FROM tableB)
SELECT t1.con_id,
t1.date_updated,
t1.type
FROM (
SELECT con_id,
date_updated,
type,
row_number() OVER(partition BY con_id ORDER BY date_updated DESC) AS rownumber
FROM combined) t1
WHERE rownumber = 1;

Can be done using window functions:
declare @TableA table (con_id int, date_updated date, [type] int)
declare @TableB table (con_id int, date_updated date, [type] int)
insert into @TableA values
(123, '2018-06-19', 2)
, (123, '2018-06-15', 1)
, (123, '2018-05-01', 3)
, (101, '2018-04-06', 1)
, (101, '2018-03-05', 2)
insert into @TableB values
(123, '2018-05-15', 2)
, (123, '2018-05-01', 1)
, (101, '2018-06-07', 1)
select distinct con_id
, first_value(date_updated) over (partition by con_id order by con_id, date_updated desc) as date_updated
, first_value([type]) over (partition by con_id order by con_id, date_updated desc) as [type]
from
(Select * from @TableA UNION Select * from @TableB) x

How can I select distinct by one column?

I have a table with the columns below. If a COD is duplicated, I need to get the row whose VALUE column is not NULL; if it is not duplicated, the row may have a NULL VALUE, as in the example below.
I'm using SQL SERVER.
This is what I get:
COD ID VALUE
28 1 NULL
28 2 Supermarket
29 1 NULL
29 2 School
29 3 NULL
30 1 NULL
This is what I want:
COD ID VALUE
28 2 Supermarket
29 2 School
30 1 NULL
What I'm trying to do:
;with A as (
(select DISTINCT COD,ID,VALUE from CodId where ID = 2)
UNION
(select DISTINCT COD,ID,NULL from CodId where ID != 2)
)select * from A order by COD

You can try this.
DECLARE @T TABLE (COD INT, ID INT, VALUE VARCHAR(20))
INSERT INTO @T
VALUES(28, 1, NULL),
(28, 2 ,'Supermarket'),
(29, 1 ,NULL),
(29, 2 ,'School'),
(29, 3 ,NULL),
(30, 1 ,NULL)
;WITH CTE AS (
SELECT *, RN= ROW_NUMBER() OVER (PARTITION BY COD ORDER BY VALUE DESC) FROM @T
)
SELECT COD, ID ,VALUE FROM CTE
WHERE RN = 1
Result:
COD ID VALUE
----------- ----------- --------------------
28 2 Supermarket
29 2 School
30 1 NULL

Another option is to use the WITH TIES clause in concert with Row_Number()
Example
Select top 1 with ties *
from YourTable
Order By Row_Number() over (Partition By [COD] order by Value Desc)
Returns
COD ID VALUE
28 2 Supermarket
29 2 School
30 1 NULL

I would use GROUP BY and JOIN. If there is no non-NULL value for a COD, then it is resolved by the OR in the JOIN clause.
SELECT your_table.*
FROM your_table
JOIN (
SELECT COD, MAX(value) value
FROM your_table
GROUP BY COD
) gt ON your_table.COD = gt.COD and (your_table.value = gt.value OR gt.value IS NULL)

If a COD may have more than one non-NULL value, this will work:
drop table MyTable
CREATE TABLE MyTable
(
COD INT,
ID INT,
VALUE VARCHAR(20)
)
INSERT INTO MyTable
VALUES (28,1, NULL),
(28,2,'Supermarket'),
(28,3,'School'),
(29,1,NULL),
(29,2,'School'),
(29,3,NULL),
(30,1,NULL);
WITH Dups AS
(SELECT COD FROM MyTable GROUP BY COD HAVING count (*) > 1 )
SELECT MyTable.COD,MyTable.ID,MyTable.VALUE FROM MyTable
INNER JOIN dups ON MyTable.COD = Dups.COD
WHERE value IS NOT NULL
UNION
SELECT MyTable.COD,MyTable.ID,MyTable.VALUE FROM MyTable
LEFT JOIN dups ON MyTable.COD = Dups.COD
WHERE dups.cod IS NULL

How to do an inner join to get rid of dupes

How can I use an inner join to get rid of the dupes that I don't want?
The table I'm working on looks like this:
ID Edited_date Status
------------------------
1 1/1/2015 A
1 1/1/2016 B
1 2/1/2016 C
2 1/1/2017 D
2 3/1/2017 B
3 1/1/2016 C
3 4/1/2017 B
3 1/1/2014 D
However, I only want the status of each loan from the most recent edited_date
ID Edited_date Status
------------------------
1 2/1/2016 C
2 3/1/2017 B
3 4/1/2017 B

select * from [table] t1
inner join
(
select ID, max(Edited_date) maxDt
from [Table]
group by ID
) t2
on t1.ID = t2.ID
and t1.Edited_date = t2.maxDt;

For select only:
SELECT *
FROM
(
SELECT *, ROW_NUMBER()OVER(PARTITION BY ID ORDER BY Edited_date desc) as Indicator
FROM TABLE_NAME
) as ABC
WHERE ABC.Indicator = 1
For delete:
WITH ABC
AS
(
SELECT *, ROW_NUMBER()OVER(PARTITION BY ID ORDER BY Edited_date desc) as Indicator
FROM TABLE_NAME
)
DELETE FROM ABC
WHERE ABC.Indicator != 1

using row_number() partitioned by id to get the latest edited_date
select id, edited_date, status
from (
select *
, rn = row_number() over (partition by id order by edited_date desc)
from t
) as s
where rn = 1
top with ties version:
select top 1 with ties
id
, edited_date
, status
from t
order by row_number() over (partition by id order by edited_date desc)

Begin Transaction
Create table #temp (Id int, Edited_date date, Status char(2))
Insert into #temp
Values
('1','1/1/2015','A'),
('1','1/1/2016','B'),
('1','2/1/2016','C'),
('2','1/1/2017','D'),
('2','3/1/2017','B'),
('3','1/1/2016','C'),
('3','4/1/2017','B'),
('3','1/1/2014','D')
Create table #temp2 (Id int, Edited_date date, Status char(2))
Insert into #temp2
Values
('1','2/1/2016','C'),
('2','3/1/2017','B'),
('3','4/1/2017','B')
/** emphasis on the below **/
;with cte as (
Select max(Edited_date) as Edited_date, Status From #temp Group By Status
union all
Select max(Edited_date) as Edited_date, Status From #temp2 Group By Status
)
Select Status, max(Edited_date) as Recent_Edited_date From cte Group By Status
/** End of Emphasis **/
Drop table #temp
Drop table #temp2
Rollback
--- Result ---
Status| Recent_Edited_date
A| 2015-01-01
B| 2017-04-01
C| 2016-02-01
D| 2017-01-01

Compare getdate() with two different fields

I have 2 tables:
T1 T2
id Effdate E_id DOB
-------------- ------------
1 20161212 2 1950-02-16 00:12:24
2 20130124 5 1978-01-16 10:14:30
I want to compare getdate() < Maxdate(effdate, DOB)?
I am getting datetime conversion error.
for example : getdate() < MAXDATE( 20161212 , 1950-02-16 00:12:24)
expected result should be from table T1:
id Effdate
--------------
1 20161212

If the id values in both tables correspond (id = E_id), you can UNION them and GROUP BY id:
;WITH T1 AS (
SELECT 1 id,
CAST('20161212' as varchar(10)) Effdate
UNION ALL
SELECT 2,
'20130124'
), T2 AS (
SELECT 1 E_id,
CAST('1950-02-16 00:12:24' as datetime) DOB
UNION ALL
SELECT 2,
'1978-01-16 10:14:30'
)
SELECT id,
MAX(CAST(Effdate as datetime)) as MD
FROM (
SELECT *
FROM T1
UNION ALL
SELECT *
FROM T2
) t
GROUP BY id
HAVING MAX(CAST(Effdate as datetime)) >= GETDATE()
This will give you the expected result.
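Another way to approach the comparison is to convert Effdate explicitly and pick the later of the two dates per row. The sketch below rests on assumptions not stated in the question: that T1.Effdate holds yyyymmdd as an int or char(8), that T2.DOB is a datetime, and that the tables match on T1.id = T2.E_id. The alias c and the column name EffDt are introduced here for illustration only.

```sql
-- Sketch only: the column types and the join condition are assumptions.
SELECT  T1.id ,
        T1.Effdate
FROM    T1
        INNER JOIN T2 ON T2.E_id = T1.id
        -- style 112 parses the ISO yyyymmdd form; casting to char(8) first
        -- lets the same expression handle an int or a character column
        CROSS APPLY ( SELECT CONVERT(datetime, CAST(T1.Effdate AS char(8)), 112) AS EffDt ) AS c
WHERE   GETDATE() < CASE WHEN c.EffDt > T2.DOB THEN c.EffDt   -- later of the two dates
                         ELSE T2.DOB
                    END;
```

If Effdate is stored as an int, casting it straight to datetime treats the number as a day count rather than as yyyymmdd, which is one possible source of the conversion error. Rows where DOB is NULL are excluded by this sketch.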

Coalesce over Rows in MSSQL 2008,

I'm trying to determine the best approach here in MSSQL 2008.
Here is my sample data
TransDate Id Active
-------------------------
1/18 1pm 5 1
1/18 2pm 5 0
1/18 3pm 5 Null
1/18 4pm 5 1
1/18 5pm 5 0
1/18 6pm 5 Null
If grouped by Id and ordered by the TransDate, I want the last Non Null Value for the Active Column, and the MAX of TransDate
SELECT MAX(TransDate) AS TransDate,
Id,
--LASTNonNull(Active) AS Active
Here would be the results:
TransDate Id Active
---------------------
1/18 6pm 5 0
It would be like a Coalesce but over the rows, instead of two values/columns.
There would be many other columns that would also have this similar method applied, so I really don't want to make a separate join for each of the columns.
Any ideas?

I'd probably use a correlated sub query.
SELECT MAX(TransDate) AS TransDate,
Id,
(SELECT TOP (1) Active
FROM T t2
WHERE t2.Id = t1.Id
AND Active IS NOT NULL
ORDER BY TransDate DESC) AS Active
FROM T t1
GROUP BY Id
A way without
SELECT
Id,
MAX(TransDate) AS TransDate,
CAST(RIGHT(MAX(CONVERT(CHAR(23),TransDate,121) + CAST(Active AS CHAR(1))),1) AS BIT) AS Active,
/*You can probably figure out a more efficient thing to
compare than the above depending on your data. e.g.*/
CAST(MAX(DATEDIFF(SECOND,'19500101',TransDate) * CAST(10 AS BIGINT) + Active)%10 AS BIT) AS Active2
FROM T
GROUP BY Id
Or following the comments would cross apply work better for you?
WITH T (TransDate, Id, Active, SomeOtherColumn) AS
(
select GETDATE(), 5, 1, 'A' UNION ALL
select 1+GETDATE(), 5, 0, 'B' UNION ALL
select 2+GETDATE(), 5, null, 'C' UNION ALL
select 3+GETDATE(), 5, 1, 'D' UNION ALL
select 4+GETDATE(), 5, 0, 'E' UNION ALL
select 5+GETDATE(), 5, null,'F'
),
T1 AS
(
SELECT MAX(TransDate) AS TransDate,
Id
FROM T
GROUP BY Id
)
SELECT T1.TransDate,
Id,
CA.Active AS Active,
CA.SomeOtherColumn AS SomeOtherColumn
FROM T1
CROSS APPLY (SELECT TOP (1) Active, SomeOtherColumn
FROM T t2
WHERE t2.Id = T1.Id
AND Active IS NOT NULL
ORDER BY TransDate DESC) CA

This example should help, using analytical functions Max() OVER and Row_Number() OVER
create table tww( transdate datetime, id int, active bit)
insert tww select GETDATE(), 5, 1
insert tww select 1+GETDATE(), 5, 0
insert tww select 2+GETDATE(), 5, null
insert tww select 3+GETDATE(), 5, 1
insert tww select 4+GETDATE(), 5, 0
insert tww select 5+GETDATE(), 5, null
select maxDate as Transdate, id, Active
from (
select *,
max(transdate) over (partition by id) maxDate,
ROW_NUMBER() over (partition by id
order by case when active is not null then 0 else 1 end, transdate desc) rn
from tww
) x
where rn=1
Another option, quite expensive, would be doing it through XML. For educational purposes only
select
ID = n.c.value('@id', 'int'),
trandate = n.c.value('(data/transdate)[1]', 'datetime'),
active = n.c.value('(data/active)[1]', 'bit')
from
(select xml=convert(xml,
(select id [@id],
( select *
from tww t
where t.id=tww.id
order by transdate desc
for xml path('data'), type)
from tww
group by id
for xml path('node'), root('root'), elements)
)) x cross apply xml.nodes('root/node') n(c)
It works on the principle that the XML generated has each record as a child node of the ID. Null columns have been omitted, so the first column found using xpath (child/columnname) is the first non-null value similar to COALESCE.

You could use a subquery:
SELECT MAX(TransDate) AS TransDate
, Id
, (
SELECT TOP 1 t2.Active
FROM YourTable t2
WHERE t1.id = t2.id
and t2.Active is not null
ORDER BY
t2.TransDate desc
) AS Active
FROM YourTable t1
GROUP BY Id

I created a temp table named #temp to test my solution, and here is what I came up with:
transdate id active
1/1/2011 12:00:00 AM 5 1
1/2/2011 12:00:00 AM 5 0
1/3/2011 12:00:00 AM 5 null
1/4/2011 12:00:00 AM 5 1
1/5/2011 12:00:00 AM 5 0
1/6/2011 12:00:00 AM 5 null
1/1/2011 12:00:00 AM 6 2
1/2/2011 12:00:00 AM 6 3
1/3/2011 12:00:00 AM 6 null
1/4/2011 12:00:00 AM 6 2
1/5/2011 12:00:00 AM 6 null
This query...
select max(a.transdate) as transdate, a.id, (
select top (1) b.active
from #temp b
where b.active is not null
and b.id = a.id
order by b.transdate desc
) as active
from #temp a
group by a.id
Returns these results.
transdate id active
1/6/2011 12:00:00 AM 5 0
1/5/2011 12:00:00 AM 6 2

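Assuming a table named "test1", how about using ROW_NUMBER, OVER and PARTITION BY? Note that the transdate returned is that of the last row with a non-null active value, not MAX(TransDate).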
SELECT transdate, id, active FROM
(SELECT transdate, ROW_NUMBER() OVER(PARTITION BY id ORDER BY transdate desc) AS rownumber, id, active
FROM test1
WHERE active is not null) a
WHERE a.rownumber = 1
