SQL Server get latest value by date - sql

I have a SQL Server table that has project_id int,update_date datetime ,update_text varchar(max)
The table has many updates per project_id. I need to fetch the latest by update_date for all project_id values.
Example:
project_id update_date update_text
1 2017/01/01 Happy new year.
2 2017/01/01 Nice update
2 2017/02/14 Happy Valentine's
3 2016/12/25 Merry Christmas
3 2017/01/01 A New year is a good thing
The query should get:
project_id update_date update_text
1 2017/01/01 Happy new year.
2 2017/02/14 Happy Valentine's
3 2017/01/01 A New year is a good thing

using top with ties with row_number()
select top 1 with ties
project_id, update_date, update_text
from projects
order by row_number() over (partition by project_id order by update_date desc)
rextester demo: http://rextester.com/MGUNU86353
returns:
+------------+-------------+----------------------------+
| project_id | update_date | update_text |
+------------+-------------+----------------------------+
| 1 | 2017-01-01 | Happy new year. |
| 2 | 2017-02-14 | Happy Valentine's |
| 3 | 2017-01-01 | A New year is a good thing |
+------------+-------------+----------------------------+

This query will give you the latest date for each project
Select Project_Id, Max (Update_Date) Max_Update_Date
From MyTable
Group By Project_Id
So join it back to the original table
Select Project_Id, Update_Date, Update_Text
From MyTable
Inner Join
(
Select Project_Id, Max (Update_Date) Max_Update_Date
From MyTable
Group By Project_Id
) MaxDates
On MyTable.Project_Id = MaxDates.Project_Id
And MyTable.Update_Date = MaxDates.Max_Update_Date

You can find the MAX(date) like:
SELECT * FROM [table]
INNER JOIN (SELECT project_id, date = MAX(update_date) FROM [table] GROUP BY project_id) AS a
ON [table].project_id = a.project_id AND update_date = date
or you can use ROW_NUMBER() like:
SELECT * FROM (
SELECT *, rownum = ROW_NUMBER() OVER (PARTITION BY project_id ORDER BY
update_date DESC) FROM [table]
) AS a WHERE rownum = 1

Note to top answer:
Whilst some may say it is the best answer it will often not be the most efficient query time. In my data the following example is an order of magnitude faster.
SELECT
project_id, update_date, update_text
FROM
projects P
WHERE
update_date = (SELECT MAX(update_date) FROM projects WHERE project_id = P.project_id)
(If you can have more than one update on a given date then would need max on update_text and group by as in this specific example you would not know which update_text value was valid.)

You can do it as:
with CTE as(
SELECT project_id, MAX(update_date) update_date
FROM YourTable
GROUP BY project_id
)
SELECT cte.project_id, cte.update_date , max(t.update_text) update_text
FROM YourTable T inner join CTE on T.project_id = CTE.project_id
group by cte.project_id, cte.update_date ;
Demo.

Related

First value in DATE minus 30 days SQL

I have bunch of data out of which I'm showing ID, max date and it's corresponding values (user id, type, ...). Then I need to take MAX date for each ID, substract 30 days and show first date and it's corresponding values within this date period.
Example:
ID Date Name
1 01.05.2018 AAA
1 21.04.2018 CCC
1 05.04.2018 BBB
1 28.03.2018 AAA
expected:
ID max_date max_name previous_date previous_name
1 01.05.2018 AAA 05.04.2018 BBB
I have working solution using subselects, but as I have quite huge WHERE part, refresh takes ages.
SUBSELECT looks like that:
(SELECT MIN(N.name)
FROM t1 N
WHERE N.ID = T.ID
AND (N.date < MAX(T.date) AND N.date >= (MAX(T.date)-30))
AND (...)) AS PreviousName
How'd you write the select?
I'm using TSQL
Thanks
I can do this with 2 CTEs to build up the dates and names.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE t1 (ID int, theDate date, theName varchar(10)) ;
INSERT INTO t1 (ID, theDate, theName)
VALUES
( 1,'2018-05-01','AAA' )
, ( 1,'2018-04-21','CCC' )
, ( 1,'2018-04-05','BBB' )
, ( 1,'2018-03-27','AAA' )
, ( 2,'2018-05-02','AAA' )
, ( 2,'2018-05-21','CCC' )
, ( 2,'2018-03-03','BBB' )
, ( 2,'2018-01-20','AAA' )
;
Main Query:
;WITH cte1 AS (
SELECT t1.ID, t1.theDate, t1.theName
, DATEADD(day,-30,t1.theDate) AS dMinus30
, ROW_NUMBER() OVER (PARTITION BY t1.ID ORDER BY t1.theDate DESC) AS rn
FROM t1
)
, cte2 AS (
SELECT c2.ID, c2.theDate, c2.theName
, ROW_NUMBER() OVER (PARTITION BY c2.ID ORDER BY c2.theDate) AS rn
, COUNT(*) OVER (PARTITION BY c2.ID) AS theCount
FROM cte1
INNER JOIN cte1 c2 ON cte1.ID = c2.ID
AND c2.theDate >= cte1.dMinus30
WHERE cte1.rn = 1
GROUP BY c2.ID, c2.theDate, c2.theName
)
SELECT cte1.ID, cte1.theDate AS max_date, cte1.theName AS max_name
, cte2.theDate AS previous_date, cte2.theName AS previous_name
, cte2.theCount
FROM cte1
INNER JOIN cte2 ON cte1.ID = cte2.ID
AND cte2.rn=1
WHERE cte1.rn = 1
Results:
| ID | max_date | max_name | previous_date | previous_name |
|----|------------|----------|---------------|---------------|
| 1 | 2018-05-01 | AAA | 2018-04-05 | BBB |
| 2 | 2018-05-21 | CCC | 2018-05-02 | AAA |
cte1 builds the list of max_date and max_name grouped by the ID and then using a ROW_NUMBER() window function to sort the groups by the dates to get the most recent date. cte2 joins back to this list to get all dates within the last 30 days of cte1's max date. Then it does essentially the same thing to get the last date. Then the outer query joins those two results together to get the columns needed while only selecting the most and least recent rows from each respectively.
I'm not sure how well it will scale with your data, but using the CTEs should optimize pretty well.
EDIT: For the additional requirement, I just added in another COUNT() window function to cte2.
I would do:
select id,
max(case when seqnum = 1 then date end) as max_date,
max(case when seqnum = 1 then name end) as max_name,
max(case when seqnum = 2 then date end) as prev_date,
max(case when seqnum = 2 then name end) as prev_name,
from (select e.*, row_number() over (partition by id order by date desc) as seqnum
from example e
) e
group by id;

Min Date from one column multiple rows

My apologies, I should have added every column and complete problem not just portion.
I have a table A which stores all invoices issued(id 1) payments received (id 4) from clients. Sometimes client pay in 2-3 installments. I want to find dateifference between invoice issued and last payment collected for the invoice. My data looks like this
**a.cltid**|**A.Invnum**|A.Cash|A.Date | a.type| a.status
70 |112 |-200 |2012-03-01|4 |P
70 |112 |-500 |2012-03-12|4 |P
90 |124 |-550 |2012-01-20|4 |P
70 |112 |700 |2012-02-20|1 |p
55 |101 |50 |2012-01-15|1 |d
90 |124 |550 |2012-01-15|1 |P
I am running
Select *, Datediff(dd,T.date,P.date)
from (select a.cltid, a.invnumber,a.cash, min(a.date)date
from table.A as A
where a.status<>'d' and a.type=1
group by a.cltid, a.invnumber,a.cash)T
join
Select *
from (select a.cltid, a.invnumber,a.cash, min(a.date)date
from table.A as A
where a.status<>'d' and a.type=4
group by a.cltid, a.invnumber,a.cash)P
on
T.invnumb=P.invnumber and T.cltid=P.cltid
How can I make it work? So it shows me
70|112|-500|2012-03-12|4|P 70|112|700|2012-02-20|1|p|22
90|124|-550|2012-01-20|4|P 90|124|550|2012-01-15|1|P|5
Edited***
You can use row_number to assign sequence number within each cltid in the order of decreasing date and then filter to get the first row for each cltid which will be the row with latest date for that cltid:
select *
from (
select A.*,
row_number() over (
partition by a.cltid order by a.date desc
) rn
from table.A as A
) t
where rn = 1;
It will return one row (with latest date) for each client. If you want to return all the rows which have latest date, use rank() instead.
Use a ranking function to get all the columns:
select a.*
from (select a.*,
row_number() over (partition by cltid order by date desc) as seqnum
from a
) a
where seqnum = 1;
Use aggregation if you only want the date. The issue with your query is that the group by clause has too many columns:
select a.cltid, max(a.date) as date
from table.A as A
group by a.cltid;
And the fact that min() returns the first date not the last date.
There are many ways to do this. Here are some of them:
test setup: http://rextester.com/VGUY60367
with common_table_expression as () using row_number()
with cte as (
select *
, rn = row_number() over (
partition by cltid, Invnum
order by [date] desc
)
from a
)
select cltid, Invnum, Cash, [date]
from cte
where rn = 1
cross apply version:
select distinct
a.cltid
, a.Invnum
, x.Cash
, x.[date]
from a
cross apply (
select top 1
cltid, Invnum
, [date]
, Cash
from a as i
where i.cltid =a.cltid
and i.Invnum=a.Invnum
order by i.[date] desc
) as x;
top with ties version:
select top 1 with ties
*
from a
order by
row_number() over (
partition by cltid, Invnum
order by [date] desc
)
all return:
+-------+--------+---------------------+------+
| cltid | Invnum | date | Cash |
+-------+--------+---------------------+------+
| 70 | 112 | 12.03.2012 00:00:00 | -500 |
| 90 | 124 | 20.01.2012 00:00:00 | -550 |
+-------+--------+---------------------+------+
You can achieve the desired o/p by this:
Select
a.cltid, a.invnumber,a.cash, max(a.date) [date]
from
YourTable a
group by
a.cltid, a.invnumber, a.cash, a.date

select top N records for each entity

I have a table like below -
ID | Reported Date | Device_ID
-------------------------------------------
1 | 2016-03-09 09:08:32.827 | 1
2 | 2016-03-08 09:08:32.827 | 1
3 | 2016-03-08 09:08:32.827 | 1
4 | 2016-03-10 09:08:32.827 | 2
5 | 2016-03-05 09:08:32.827 | 2
Now, i want a top 1 row based on date column for each device_ID
Expected Output
ID | Reported Date | Device_ID
-------------------------------------------
1 | 2016-03-09 09:08:32.827 | 1
4 | 2016-03-10 09:08:32.827 | 2
I am using SQL Server 2008 R2. i can go and write Stored Procedure to handle it but wanted do it with simple query.
****************EDIT**************************
Answer by 'Felix Pamittan' worked well but for 'N' just change it to
SELECT
Id, [Reported Date], Device_ID
FROM (
SELECT *,
Rn = ROW_NUMBER() OVER(PARTITION BY Device_ID ORDER BY [ReportedDate] DESC)
FROM tbl
)t
WHERE Rn >= N
He had mentioned this in comment thought to add it to questions so that no body miss it.
Use ROW_NUMBER:
SELECT
Id, [Reported Date], Device_ID
FROM (
SELECT *,
Rn = ROW_NUMBER() OVER(PARTITION BY Device_ID ORDER BY [ReportedDate] DESC)
FROM tbl
)t
WHERE Rn = 1
You can also try using CTE
With DeviceCTE AS
(SELECT *, ROW_NUMBER() OVER(PARTITION BY Device_ID ORDER BY [Reported Date] DESC) AS Num
FROM tblname)
SELECT Id, [Reported Date], Device_ID
From DeviceCTE
Where Num = 1
If you can't use an analytic function, e.g. because your application layer won't allow it, then you can try the following solution which uses a subquery to arrive at the answer:
SELECT t1.ID, t2.maxDate, t1.Device_ID
INNER JOIN
(
SELECT Device_ID, MAX([Reported Date]) AS maxDate
FROM yourTable
GROUP BY Device_ID
) t2
ON t1.Device_ID = t2.Device_ID AND t1.[Reported Date] = t2.maxDate
Select * from DEVICE_TABLE D
where [Reported Date] = (Select Max([Reported Date]) from DEVICE_TABLE where Device_ID = D.Device_ID)
should do the trick, assume that "top 1 row based on date column" means that you want to select the latest reported date of each Device_ID ?
As for your title, select top 5 rows of each Device_ID
Select * from DEVICE_TABLE D
where [Reported Date] in (Select top 5 [Reported Date] from DEVICE_TABLE D where Device_ID = D.Device_ID)
order by Device_ID, [Reported Date] desc
will give you the top 5 latest reports of each device id.
You may want to sort out the top 5 date if your data isn't in order...
Again with no analytic functions you can use CROSS APPLY :
DECLARE #tbl TABLE(Id INT,[Reported Date] DateTime , Device_ID INT)
INSERT INTO #tbl
VALUES
(1,'2016-03-09 09:08:32.827',1),
(2,'2016-03-08 09:08:32.827',1),
(3,'2016-03-08 09:08:32.827',1),
(4,'2016-03-10 09:08:32.827',2),
(5,'2016-03-05 09:08:32.827',2)
SELECT r.*
FROM ( SELECT DISTINCT Device_ID FROM #tbl ) d
CROSS APPLY ( SELECT TOP 1 *
FROM #tbl t
WHERE d.Device_ID = t.Device_ID ) r
Can be easily modified to support N records.
Credits go to wBob answering this question here

Comparing row values in oracle

I have Table1 with three columns:
Key | Date | Price
----------------------
1 | 26-May | 2
1 | 25-May | 2
1 | 24-May | 2
1 | 23 May | 3
1 | 22 May | 4
2 | 26-May | 2
2 | 25-May | 2
2 | 24-May | 2
2 | 23 May | 3
2 | 22 May | 4
I want to select the row where value 2 was last updated (24-May). The Date was sorted using RANK function.
I am not able to get the desired results. Any help will be appreciated.
SELECT *
FROM (SELECT key, DATE, price,
RANK() over (partition BY key order by DATE DESC) AS r2
FROM Table1 ORDER BY DATE DESC) temp;
Another way of looking at the problem is that you want to find the most recent record with a price different from the last price. Then you want the next record.
with lastprice as (
select t.*
from (select t.*
from table1 t
order by date desc
) t
where rownum = 1
)
select t.*
from (select t.*
from table1 t
where date > (select max(date)
from table1 t2
where t2.price <> (select price from lastprice)
)
order by date asc
) t
where rownum = 1;
This query looks complicated. But, it is structured so it can take advantage of indexes on table1(date). The subqueries are necessary in Oracle pre-12. In the most recent version, you can use fetch first 1 row only.
EDIT:
Another solution is to use lag() and find the most recent time when the value changed:
select t1.*
from (select t1.*
from (select t1.*,
lag(price) over (order by date) as prev_price
from table1 t1
) t1
where prev_price is null or prev_price <> price
order by date desc
) t1
where rownum = 1;
Under many circumstances, I would expect the first version to have better performance, because the only heavy work is done in the innermost subquery to get the max(date). This verson has to calculate the lag() as well as doing the order by. However, if performance is an issue, you should test on your data in your environment.
EDIT II:
My best guess is that you want this per key. Your original question says nothing about key, but:
select t1.*
from (select t1.*,
row_number() over (partition by key order by date desc) as seqnum
from (select t1.*,
lag(price) over (partition by key order by date) as prev_price
from table1 t1
) t1
where prev_price is null or prev_price <> price
order by date desc
) t1
where seqnum = 1;
You can try this:-
SELECT Date FROM Table1
WHERE Price = 2
AND PrimaryKey = (SELECT MAX(PrimaryKey) FROM Table1
WHERE Price = 2)
This is very similar to the second option by Gordon Linoff but introduces a second windowed function row_number() to locate the most recent row that changed the price. This will work for all or a range of keys.
select
*
from (
select
*
, row_number() over(partition by Key order by [date] DESC) rn
from (
select
*
, NVL(lag(Price) over(partition by Key order by [date] DESC),0) prevPrice
from table1
where Key IN (1,2,3,4,5) -- as an example
)
where Price <> prevPrice
)
where rn = 1
apologies but I haven't been able to test this at all.

SQL count changes in column

I need to count the changes in assigned group on a ticket. The problem is my log also count changes in assignee that are in the same group.
Here is some sample data
ticket_id | assigned_group | assignee | date
----------------------------------------------------
1001 | group A | john | 1-1-15
1001 | group A | michael | 1-2-15
1001 | group A | jacob | 1-3-15
1001 | group B | eddie | 1-4-15
1002 | group A | john | 1-1-15
1002 | group B | eddie | 1-2-15
1002 | group A | john | 1-3-15
1002 | group B | eddie | 1-4-15
1002 | group A | john | 1-5-15
I need this to return
ticket_id | count
--------------------
10001 | 2
10002 | 4
My query is like this
select ticket_id, assigned_group, count(*) from mytable group by ticket_id, assigned_group
But that gives me
ticket_id | count
--------------------
10001 | 4
10002 | 5
edit:
Also if I use
select ticket_id, count(Distinct assigned_group) as [Count] from mytable group by ticket_id
I only get
ticket_id | count
--------------------
10001 | 2
10002 | 2
Any advice?
Use Distinct Count to get the result
select ticket_id, count(Distinct assigned_group) as [Count]
from mytable
group by ticket_id
try this..
with temp as
(
select ticket_id, assigned_group, count(*) as count,date from mytable group by ticket_id, assigned_group,date
)
select ticket_id, count from temp
You can use Row_number() function to look into the next record's value.
with tbl as (select *, row_number() over(partition by ticket_id order by 1) from table)
select a.ticket_id, a.assigned_group, a.assignee_name, a.date,
count(case when a.assigned_group <> b.assigned_group then 1 else 0 end) as No_of_change
from tbl as a
left join tbl as b
on a.rn = b.rn + 1
If you are using SQL Server 2012, then you can use the LAG function to determine the previous assigned group easily. Then, if the previous assigned group is different from the current assigned group, you can increment the count, as below:
WITH previous_groups AS
(
SELECT
ticket_id,
assign_date,
assigned_group,
LAG(assigned_group, 1, NULL) OVER (PARTITION BY ticket_id ORDER BY assign_date) AS prev_assign_group
FROM mytable
)
SELECT
ticket_id,
SUM(CASE
WHEN assigned_group <> prev_assign_group THEN 1
ELSE 0
END) AS count
FROM previous_groups
WHERE prev_assign_group IS NOT NULL
GROUP BY ticket_id
ORDER BY ticket_id;
If you are using SQL Server 2008 or earlier versions, then you need an extra step to determine the previous assigned group, as below:
WITH previous_assign_dates AS
(
SELECT
mt1.ticket_id,
mt1.assign_date,
MAX(mt2.assign_date) AS prev_assign_date
FROM mytable mt1
LEFT JOIN mytable mt2
ON mt1.ticket_id = mt2.ticket_id
AND mt2.assign_date < mt1.assign_date
GROUP BY
mt1.ticket_id,
mt1.assign_date
),
previous_groups AS
(
SELECT
mt1.*,
mt2.assigned_group AS prev_assign_group
FROM mytable mt1
INNER JOIN previous_assign_dates pad
ON mt1.ticket_id = pad.ticket_id
AND mt1.assign_date = pad.assign_date
LEFT JOIN mytable mt2
ON pad.ticket_id = mt2.ticket_id
AND pad.prev_assign_date = mt2.assign_date
)
SELECT
ticket_id,
SUM(CASE
WHEN assigned_group <> prev_assign_group THEN 1
ELSE 0
END) AS count
FROM previous_groups
WHERE prev_assign_group IS NOT NULL
GROUP BY ticket_id
ORDER BY ticket_id;
SQL Fiddle demo
References:
The LAG function on MSDN
Adding an ordinal number within the ticket, then a self join where the group is different and consecutive ordinals, should work:
SELECT t1.ticket_id, COUNT(*) FROM
(SELECT *, ROW_NUMBER() OVER(PARTITION BY ticket_id ORDER BY date) ordinal
FROM mytable) t1
JOIN
(SELECT *, ROW_NUMBER() OVER(PARTITION BY ticket_id ORDER BY date) ordinal FROM nytable) t2
ON t1.ticket_id=t2.ticket_id AND t1.assigned_group<>t2.assigned_group AND t1.ordinal+1=t2.ordinal
GROUP BY t1.ticket_id