Returning most recent row SQL Server - sql

I have this table
CREATE TABLE Test (
OrderID int,
Person varchar(10),
LastModified Date
);
INSERT INTO Test (OrderID, Person, LastModified)
VALUES (1, 'Sam', '2018-05-15'),
(1, 'Tim','2018-05-14'),
(1, 'Kim','2018-05-05'),
(1, 'Dave','2018-05-13'),
(1, 'James','2018-05-11'),
(1, 'Fred','2018-05-05');
select * result:
| OrderID | Person | LastModified |
|---------|--------|--------------|
| 1 | Sam | 2018-05-15 |
| 1 | Tim | 2018-05-14 |
| 1 | Kim | 2018-05-05 |
| 1 | Dave | 2018-05-13 |
| 1 | James | 2018-05-11 |
| 1 | Fred | 2018-05-05 |
I am looking to return the most recent modified row which is the first row with 'Sam'.
Now I know I can use MAX to return the most recent date, but how can I aggregate the Person column to return Sam?
Looking for a result set like
| OrderID | Person | LastModified |
|---------|--------|--------------|
| 1 | Sam | 2018-05-15 |
I ran this:
SELECT
OrderID,
max(Person) AS [Person],
max(LastModified) AS [LastModified]
FROM Test
GROUP BY
OrderID
but this returns:
| OrderID | Person | LastModified |
|---------|--------|--------------|
| 1 | Tim | 2018-05-15 |
Can someone advise me further, please? Thanks.
*** UPDATE
INSERT INTO Test (OrderID, Person, LastModified)
VALUES (1, 'Sam', '2018-05-15'),
(1, 'Tim','2018-05-14'),
(1, 'Kim','2018-05-05'),
(1, 'Dave','2018-05-13'),
(1, 'James','2018-05-11'),
(1, 'Fred','2018-05-05'),
(2, 'Dave','2018-05-13'),
(2, 'James','2018-05-11'),
(2, 'Fred','2018-05-05');
So i would be looking for this result to be:
| OrderID | Person | LastModified |
|---------|--------|--------------|
| 1 | Sam | 2018-05-15 |
| 2 | Dave | 2018-05-13 |

If you always want just one record (the latest modified one) per OrderID then this would do it:
SELECT
t2.OrderID
, t2.Person
, t2.LastModified
FROM (
SELECT
MAX( LastModified ) AS LastModified
, OrderID
FROM
Test
GROUP BY
OrderID
) t
INNER JOIN Test t2
ON t2.LastModified = t.LastModified
AND t2.OrderID = t.OrderID
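
As a side note on why the attempt in the question returned 'Tim': each MAX() in a GROUP BY query is evaluated on its own over the group, so MAX(Person) is just the greatest name in sort order, not the name on the row with the latest date. A minimal sketch on the question's data (the table variable @Test is only a stand-in I am assuming for illustration):

```sql
-- @Test is a hypothetical stand-in for the Test table from the question.
DECLARE @Test TABLE (OrderID int, Person varchar(10), LastModified date);

INSERT INTO @Test (OrderID, Person, LastModified)
VALUES (1, 'Sam',   '2018-05-15'),
       (1, 'Tim',   '2018-05-14'),
       (1, 'Kim',   '2018-05-05'),
       (1, 'Dave',  '2018-05-13'),
       (1, 'James', '2018-05-11'),
       (1, 'Fred',  '2018-05-05');

-- The two aggregates are computed separately, so their values can come
-- from different rows of the same group.
SELECT OrderID,
       MAX(Person)       AS MaxPerson,       -- greatest name in sort order
       MAX(LastModified) AS MaxLastModified  -- latest date
FROM @Test
GROUP BY OrderID;
```

That is why the result pairs 'Tim' with 2018-05-15. Joining back on the maximum date, as above, avoids this, but keep in mind it returns more than one row for an OrderID if two rows share the latest date.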

Expanding on your comment on xcvd's answer ("thanks very much, is there a way i can do this if there is more than one orderID e.g. multiple people and lastmodified for multiple orderID's?"), I assume what you want is this:
WITH CTE AS(
SELECT OrderId,
Person,
LastModified,
ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY LastModified DESC) AS RN
FROM YourTable)
SELECT OrderID,
Person,
LastModified
FROM CTE
WHERE RN = 1;

How about just using TOP (1) and ORDER BY?
SELECT TOP (1) t.*
FROM Test t
ORDER BY LastModified DESC;
If you want this for each orderid, then this is a handy method in SQL Server:
SELECT TOP (1) WITH TIES t.*
FROM Test t
ORDER BY ROW_NUMBER() OVER (PARTITION BY OrderId ORDER BY LastModified DESC);
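
To spell out why the WITH TIES form returns one row per OrderId: ROW_NUMBER() assigns 1 to the latest row in each partition, the ORDER BY sorts on that number, and WITH TIES keeps every row that ties with the first one on the ORDER BY value, which here means every row numbered 1. A small sketch on the updated sample data from the question (the table variable @Test is an assumption for illustration):

```sql
-- @Test is a hypothetical stand-in for the Test table from the question.
DECLARE @Test TABLE (OrderID int, Person varchar(10), LastModified date);

INSERT INTO @Test (OrderID, Person, LastModified)
VALUES (1, 'Sam',   '2018-05-15'),
       (1, 'Tim',   '2018-05-14'),
       (1, 'Kim',   '2018-05-05'),
       (1, 'Dave',  '2018-05-13'),
       (1, 'James', '2018-05-11'),
       (1, 'Fred',  '2018-05-05'),
       (2, 'Dave',  '2018-05-13'),
       (2, 'James', '2018-05-11'),
       (2, 'Fred',  '2018-05-05');

-- Every row with row number 1 ties for first place in the ORDER BY,
-- so WITH TIES returns the latest row of each OrderID.
SELECT TOP (1) WITH TIES t.*
FROM @Test t
ORDER BY ROW_NUMBER() OVER (PARTITION BY t.OrderID ORDER BY t.LastModified DESC);
```

This should return the Sam row for order 1 and the Dave row for order 2. If two rows of the same order share the latest date, ROW_NUMBER() picks one of them arbitrarily unless you add a tie-breaker to the inner ORDER BY.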

xcvd's answer is perfect for this; I would just like to add another solution, to show a method that can be used in more complex situations than this one. It uses a nested query (subquery) to find MAX(LastModified) across the whole table, regardless of any other field, and uses the result in the outer query's WHERE clause to find the rows that match. Note that, as written, it returns only the rows with the single latest date in the table, not the latest row per OrderID. Cheers.
SELECT OrderID
, Person
, LastModified
FROM Test
WHERE LastModified IN (SELECT MAX(LastModified)
FROM Test)

Here is one other method :
select t.*
from Test t
where LastModified = (select max(t1.LastModified) from Test t1 where t1.OrderID = t.OrderID);

Related

How can I create header records by taking values from one of several line items?

I have a set of sorted line items. They are sorted first by ID then by Date:
| ID | DESCRIPTION | Date |
| --- | ----------- |----------|
| 100 | Red |2019-01-01|
| 101 | White |2019-01-01|
| 101 | White_v2 |2019-02-01|
| 102 | Red_Trim |2019-01-15|
| 102 | White |2019-01-16|
| 102 | Blue |2019-01-20|
| 103 | Red_v3 |2019-01-14|
| 103 | Red_v3 |2019-03-14|
I need to insert rows in a SQL Server table, which represents a project header, so that the first row for each ID provides the Description and Date in the destination table. There should only be one row in the destination table for each ID.
For example, the source table above would result in this at the destination:
| ID | DESCRIPTION | Date |
| --- | ----------- |----------|
| 100 | Red |2019-01-01|
| 101 | White |2019-01-01|
| 102 | Red_Trim |2019-01-15|
| 103 | Red_v3 |2019-01-14|
How do I collapse the source so that I take only the first row for each ID from source?
I prefer to do this with a transformation in SSIS but can use SQL if necessary. Actually, solutions for both methods would be most helpful.
This question is distinct from "Trouble using ROW_NUMBER() OVER (PARTITION BY …)" in that this one seeks to identify an approach. The asker of that question has adopted one approach, of more than one available as identified by answers here. That question is about how to make that particular approach work.
You can use row_number() :
select t.*
from (select t.*, row_number() over (partition by id order by date) as seq
from table t
) t
where seq = 1;
A correlated subquery will help here:
SELECT *
FROM yourtable t1
WHERE [Date] = (SELECT min([Date]) FROM yourtable WHERE id = t1.id)
use first_value window function
select * from (select *,
first_value(DESCRIPTION) over(partition by id order by Date) as des,
row_number() over(partition by id order by Date) rn
from table
) a where a.rn =1
You can use the ROW_NUMBER() window function to do this. For example:
select *
from (
select
id, description, date,
row_number() over(partition by id order by date) as rn
from t
) x -- SQL Server requires an alias on a derived table
where rn = 1

How to select rows and nearby rows with specific conditions

I have a table (Trans) of values like
OrderID (unique) | CustID | OrderDate| TimeSinceLast|
------------------------------------------------------
123a | A01 | 20.06.18 | 20 |
123y | B05 | 20.06.18 | 31 |
113k | A01 | 18.05.18 | NULL | <------- need this
168x | C01 | 17.04.18 | 8 |
999y | B05 | 15.04.18 | NULL | <------- need this
188k | A01 | 15.04.18 | 123 |
678a | B05 | 16.03.18 | 45 |
What I need is to select the rows where TimeSinceLast is null, as well as a row preceding and following where TimeSinceLast is not null, grouped by custID
I'd need my final table to look like:
OrderID (unique) | CustID | OrderDate| TimeSinceLast|
------------------------------------------------------
123a | A01 | 20.06.18 | 20 |
113k | A01 | 18.05.18 | NULL |
188k | A01 | 15.04.18 | 123 |
123y | B05 | 20.06.18 | 31 |
999y | B05 | 15.04.18 | NULL |
678a | B05 | 16.03.18 | 45 |
The main problem is that TimeSinceLast is not reliable and, for whatever reason, does not correctly calculate the days since the last order, so I cannot use it in a query to find the preceding or following row.
I have tried to look for codes and found something like this on this forum
with dt as
(select distinct custID, OrderID,
max (case when timeSinceLast is null then OrderID end)
over(partition by custID order by OrderDate
rows between 1 preceding and 1 following) as NullID
from Trans)
select *
from dt
where request_id between NullID -1 and NullID+1
But it does not work well for my purposes. Also, it looks like the max function cannot work with missing values.
Many thanks
Use lead() and lag().
What I need is to select the rows where TimeSinceLast is null, as well as a row preceding and following where TimeSinceLast is not null.
First, the ordering is a little unclear. Your sample data and code do not match. The following assumes some combination of the date and orderid, but there may be other columns that better capture what you mean by "preceding" and "following".
This is a little tricky, because you don't want to always include the first and last rows -- unless necessary. So, look at two columns:
select t.*
from (select t.*,
lead(TimeSinceLast) over (partition by custid order by orderdate, orderid) as next_tsl,
lag(TimeSinceLast) over (partition by custid order by orderdate, orderid) as prev_tsl,
lead(orderid) over (partition by custid order by orderdate, orderid) as next_orderid,
lag(orderid) over (partition by custid order by orderdate, orderid) as prev_orderid
from t
) t
where TimeSinceLast is null or
(next_tsl is null and next_orderid is not null) or
(prev_tsl is null and prev_orderid is not null);
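
For anyone who wants to try this approach, here is a self-contained sketch on the question's sample data. It assumes SQL Server 2012 or later (LEAD and LAG are not available before that), uses a table variable @Trans that I made up as a stand-in for Trans, and writes the first predicate as IS NULL so that the flagged rows themselves are kept along with their neighbours:

```sql
-- @Trans is a hypothetical stand-in for the Trans table from the question.
DECLARE @Trans TABLE (OrderID char(4), CustID char(3), OrderDate date, TimeSinceLast int);

INSERT INTO @Trans (OrderID, CustID, OrderDate, TimeSinceLast)
VALUES ('123a', 'A01', '2018-06-20', 20),
       ('123y', 'B05', '2018-06-20', 31),
       ('113k', 'A01', '2018-05-18', NULL),
       ('168x', 'C01', '2018-04-17', 8),
       ('999y', 'B05', '2018-04-15', NULL),
       ('188k', 'A01', '2018-04-15', 123),
       ('678a', 'B05', '2018-03-16', 45);

SELECT t.OrderID, t.CustID, t.OrderDate, t.TimeSinceLast
FROM (SELECT tr.*,
             LEAD(tr.TimeSinceLast) OVER (PARTITION BY tr.CustID ORDER BY tr.OrderDate, tr.OrderID) AS next_tsl,
             LAG(tr.TimeSinceLast)  OVER (PARTITION BY tr.CustID ORDER BY tr.OrderDate, tr.OrderID) AS prev_tsl,
             LEAD(tr.OrderID)       OVER (PARTITION BY tr.CustID ORDER BY tr.OrderDate, tr.OrderID) AS next_orderid,
             LAG(tr.OrderID)        OVER (PARTITION BY tr.CustID ORDER BY tr.OrderDate, tr.OrderID) AS prev_orderid
      FROM @Trans tr
     ) t
WHERE t.TimeSinceLast IS NULL                                  -- the flagged row itself
   OR (t.next_tsl IS NULL AND t.next_orderid IS NOT NULL)      -- the row just before a flagged row
   OR (t.prev_tsl IS NULL AND t.prev_orderid IS NOT NULL)      -- the row just after a flagged row
ORDER BY t.CustID, t.OrderDate DESC;
```

The next_orderid and prev_orderid checks are what distinguish "the neighbour exists and has a NULL TimeSinceLast" from "there is no neighbour", since LEAD and LAG return NULL in both cases. On this data the C01 row is excluded because it has no neighbours at all.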
USE APPLY
DECLARE @TransTable TABLE (OrderID char(4), CustID char(3), OrderDate date, TimeSinceLast int)
INSERT @TransTable VALUES
('123a', 'A01', '06.20.2018', 20),
('123y', 'B05', '06.20.2018' ,31),
('113k', 'A01', '05.18.2018' ,NULL), ------- need this
('168x', 'C01', '04.17.2018' ,8),
('999y', 'B05', '04.15.2018' ,NULL), ------- need this
('188k', 'A01', '04.15.2018' ,123),
('678a', 'B05', '03.16.2018' ,45)
SELECT B.OrderID, B.CustID, B.OrderDate, B.TimeSinceLast
FROM @TransTable A
CROSS APPLY (
SELECT 0 AS rn, A.OrderID, A.CustID, A.OrderDate, A.TimeSinceLast
UNION ALL
SELECT TOP 2 ROW_NUMBER() OVER (PARTITION BY CASE WHEN T.OrderDate > A.OrderDate THEN 1 ELSE 0 END ORDER BY ABS(DATEDIFF(day, T.OrderDate, A.OrderDate))) rn,
T.OrderID, T.CustID, T.OrderDate, T.TimeSinceLast
FROM @TransTable T
WHERE T.CustID = A.CustID AND T.OrderID <> A.OrderID
ORDER BY rn
) B
WHERE A.TimeSinceLast IS NULL
ORDER BY B.CustID, B.OrderDate DESC

Get latest record for customers by Date

I want to get the latest phone number of customer by date. There are multiple entries for the same customer. But out of that I only want the record which has the maximum date.
Sample Data,
|cust_id | phone | hist_date
| A | 1234 | 2015-10-02
| A | 4567 | 2016-10-02
| A | 7896 | 2017-10-02
| B | 6456 | 2015-10-02
| B | 8621 | 2016-10-02
| B | 6382 | 2017-10-02
| A | 1393 | 2018-10-02
Desired result is
|cust_id | phone | hist_date
| A | 1393 | 2018-10-02
| B | 6382 | 2017-10-02
Please don't hard-code it with year. I need it to be dynamic so that every time only the latest date record will show. I know this can be achieved by Sub-query and CTE using ROW NUMBER. I tried but haven't got it right. Thanks a lot for the help.
use row_number() analytic function
select * from
(select *,row_number()over(partition by cust_id order by hist_date desc) rn
from logic
) t where t.rn=1
or you can use a correlated subquery
select t1.* from logic t1
where t1.hist_date=( select max(hist_date)
from logic t2 where t1.cust_id=t2.cust_id
)
use row_number() window function
select * from
(
select *, row_number() over(partition by cust_id order by hist_date desc) as rn
from logic
)A where rn=1
You can also try the following query.
create table temp(cust_id char(1), phone char(5), hist_date date)
insert into temp values
('A', '1234', '2015-10-02'),
('A', '4567', '2016-10-02'),
('A', '7896', '2017-10-02'),
('B', '6456', '2015-10-02'),
('B', '8621', '2016-10-02'),
('B', '6382', '2017-10-02'),
('A', '1393', '2018-10-02')
Now the actual query.
Select a.* from temp a
inner join (
Select cust_id, MAX(hist_date) as hist_date from temp
group by cust_id
)b on a.cust_id = b.cust_id and a.hist_date = b.hist_date

Selecting row with highest ID based on another column

In SQL Server 2008 R2, suppose I have a table layout like this...
+----------+---------+-------------+
| UniqueID | GroupID | Title |
+----------+---------+-------------+
| 1 | 1 | TEST 1 |
| 2 | 1 | TEST 2 |
| 3 | 3 | TEST 3 |
| 4 | 3 | TEST 4 |
| 5 | 5 | TEST 5 |
| 6 | 6 | TEST 6 |
| 7 | 6 | TEST 7 |
| 8 | 6 | TEST 8 |
+----------+---------+-------------+
Is it possible to select the row with the highest UniqueID number for each GroupID? According to the table above, if I ran the query, I would expect this...
+----------+---------+-------------+
| UniqueID | GroupID | Title |
+----------+---------+-------------+
| 2 | 1 | TEST 2 |
| 4 | 3 | TEST 4 |
| 5 | 5 | TEST 5 |
| 8 | 6 | TEST 8 |
+----------+---------+-------------+
Been chomping on this for a while, but can't seem to crack it.
Many thanks,
SELECT *
FROM (SELECT uniqueid, groupid, title,
Row_number()
OVER ( partition BY groupid ORDER BY uniqueid DESC) AS rn
FROM table) a
WHERE a.rn = 1
With SQL-Server as rdbms you can use a ranking function like ROW_NUMBER:
WITH CTE AS
(
SELECT UniqueID, GroupID, Title,
RN = ROW_NUMBER() OVER (PARTITION BY GroupID
ORDER BY UniqueID DESC)
FROM dbo.TableName
)
SELECT UniqueID, GroupID, Title
FROM CTE
WHERE RN = 1
This returns exactly one record for each GroupID, even if there are multiple rows with the highest UniqueID (which the column name suggests should not happen). If you want to return all of the tied rows instead, use DENSE_RANK in place of ROW_NUMBER.
Here you can see all functions and how they work: http://technet.microsoft.com/en-us/library/ms189798.aspx
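
A minimal sketch of that difference, under the assumption of some made-up data in which UniqueID 8 appears twice in group 6 (the table variable @T and the titles 'TEST 8a' and 'TEST 8b' are hypothetical, chosen only to force a tie):

```sql
-- Hypothetical data: UniqueID 8 is duplicated in group 6 to create a tie.
DECLARE @T TABLE (UniqueID int, GroupID int, Title varchar(20));

INSERT INTO @T (UniqueID, GroupID, Title)
VALUES (5, 5, 'TEST 5'),
       (6, 6, 'TEST 6'),
       (8, 6, 'TEST 8a'),
       (8, 6, 'TEST 8b');

WITH CTE AS
(
    SELECT UniqueID, GroupID, Title,
           RN = ROW_NUMBER() OVER (PARTITION BY GroupID ORDER BY UniqueID DESC),
           DR = DENSE_RANK() OVER (PARTITION BY GroupID ORDER BY UniqueID DESC)
    FROM @T
)
SELECT UniqueID, GroupID, Title, RN, DR
FROM CTE
WHERE DR = 1;  -- keeps both tied rows of group 6; WHERE RN = 1 would keep only one of them
```

DENSE_RANK() gives tied rows the same rank, so filtering on DR = 1 returns both 'TEST 8a' and 'TEST 8b' along with 'TEST 5', whereas ROW_NUMBER() numbers the tied rows 1 and 2 in an arbitrary order.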
Since you have not mentioned any RDBMS, this statement below will work on almost all RDBMS. The purpose of the subquery is to get the greatest uniqueID for every GROUPID. To be able to get the other columns, the result of the subquery is joined on the original table.
SELECT a.*
FROM tableName a
INNER JOIN
(
SELECT GroupID, MAX(uniqueID) uniqueID
FROM tableName
GROUP By GroupID
) b ON a.GroupID = b.GroupID
AND a.uniqueID = b.uniqueID
In the case that your RDBMS supports Analytic functions, you can use ROW_NUMBER()
SELECT uniqueid, groupid, title
FROM
(
SELECT uniqueid, groupid, title,
ROW_NUMBER() OVER (PARTITION BY groupid
ORDER BY uniqueid DESC) rn
FROM tableName
) x
WHERE x.rn = 1
TSQL Ranking Functions
ROW_NUMBER() generates a sequential number that you can filter on. In this case the numbering restarts for each groupid and follows uniqueid in descending order, so the greatest uniqueid in each group gets a value of 1 in rn.
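
To make the numbering visible, here is a small sketch that lists rn for every row instead of filtering on it. It uses the sample data from the question; the table variable @T is my own stand-in for the real table:

```sql
-- @T is a hypothetical stand-in holding the question's sample rows.
DECLARE @T TABLE (UniqueID int, GroupID int, Title varchar(20));

INSERT INTO @T (UniqueID, GroupID, Title)
VALUES (1, 1, 'TEST 1'), (2, 1, 'TEST 2'),
       (3, 3, 'TEST 3'), (4, 3, 'TEST 4'),
       (5, 5, 'TEST 5'),
       (6, 6, 'TEST 6'), (7, 6, 'TEST 7'), (8, 6, 'TEST 8');

-- No filter here: show the number each row receives within its group.
SELECT UniqueID, GroupID, Title,
       ROW_NUMBER() OVER (PARTITION BY GroupID
                          ORDER BY UniqueID DESC) AS rn
FROM @T
ORDER BY GroupID, rn;
-- In group 6: UniqueID 8 gets rn 1, UniqueID 7 gets rn 2, UniqueID 6 gets rn 3.
```

Wrapping this in a derived table and keeping only rn = 1, as in the query above, leaves the row with the highest UniqueID per group.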
SELECT *
FROM the_table tt
WHERE NOT EXISTS (
SELECT *
FROM the_table nx
WHERE nx.GroupID = tt.GroupID
AND nx.UniqueID > tt.UniqueID
)
;
This should work in any DBMS (no window functions or CTEs are needed) and is probably faster than a subquery with an aggregate.
Keeping it simple:
select * from test2
where UniqueID in (select max(UniqueID) from test2 group by GroupID)
Considering:
create table test2
(
UniqueID numeric,
GroupID numeric,
Title varchar(100)
)
insert into test2 values(1,1,'TEST 1')
insert into test2 values(2,1,'TEST 2')
insert into test2 values(3,3,'TEST 3')
insert into test2 values(4,3,'TEST 4')
insert into test2 values(5,5,'TEST 5')
insert into test2 values(6,6,'TEST 6')
insert into test2 values(7,6,'TEST 7')
insert into test2 values(8,6,'TEST 8')

SQL Order By and "Not-So-Much Group"

Let's say I have a table:
--------------------------------------
| ID | DATE | GROUP | RESULT |
--------------------------------------
| 1 | 01/06 | Group1 | 12345 |
| 2 | 01/05 | Group2 | 54321 |
| 3 | 01/04 | Group1 | 11111 |
--------------------------------------
I want to order the result by the most recent date at the top but group the "group" column together, but still have distinct entries. The result that I want would be:
1 | 01/06 | Group1 | 12345
3 | 01/04 | Group1 | 11111
2 | 01/05 | Group2 | 54321
What would be a query to get that result?
thank you!
EDIT:
I'm using MSSQL. I'll look into translating the oracle query into MS SQL and report my results.
EDIT
SQL Server 2000, so OVER/PARTITION is not supported =[
Thank you!
You should specify what RDBMS you are using. This answer is for Oracle, may not work in other systems.
SELECT * FROM table
ORDER BY MAX(date) OVER (PARTITION BY group) DESC, group, date DESC
declare @table table (
ID int not null,
[DATE] smalldatetime not null,
[GROUP] varchar(10) not null,
[RESULT] varchar(10) not null
)
insert @table values (1, '2009-01-06', 'Group1', '12345')
insert @table values (2, '2009-01-05', 'Group2', '12345')
insert @table values (3, '2009-01-04', 'Group1', '12345')
select t.*
from @table t
inner join (
select
max([date]) as [order-date],
[GROUP]
from @table orderer
group by
[GROUP]
) x
on t.[GROUP] = x.[GROUP]
order by
x.[order-date] desc,
t.[GROUP],
t.[DATE] desc
use an order by clause with two params:
...order by group, date desc
this assumes that your date column does hold dates and not varchars
SELECT table2.myID,
table2.mydate,
table2.mygroup,
table2.myresult
FROM (SELECT DISTINCT mygroup FROM testtable as table1) as grouptable
JOIN testtable as table2
ON grouptable.mygroup = table2.mygroup
ORDER BY grouptable.mygroup,table2.mydate
SORRY, could NOT bring myself to use columns that were reserved names, rename the columns to make it work :)
this is MUCH simpler than the accepted answer btw.