Find day difference of last two rows - sql

What I'm trying to do is find the MAX date and do a DATEDIFF between the most recent date and the second-to-last date, to create a single column for the difference in days. How do I get rid of the first two rows? I attempted to take a MAX by wrapping the query in an outer query, with no luck.
Sample Data:
ITEM ID  ITEM         LAST UPDATED  REASON
123      Pencil       4/1/2020      Correction
123      Pencil       8/1/2020      Correction
123      Pencil       9/3/2020      Correction
456      Highlighter  5/1/2020      Correction
456      Highlighter  5/10/2020     Correction
789      Pen          10/1/2020     Correction
789      Pen          10/20/2020    Correction
Expected Output:
ITEM ID  ITEM         LAST UPDATED  REASON      Days Diff Since Last Correction
123      Pencil       9/3/2020      Correction  33
456      Highlighter  5/10/2020     Correction  9
789      Pen          10/20/2020    Correction  19
Here's what I've used so far:
SELECT
[Item_ID]
,[Item]
,[Last_Updated]
,[Reason]
,DATEDIFF(day,lag([Last_Updated],1) over(partition by [Item_ID] ORDER BY [Last_Updated] asc), [Last_Updated]) AS DAY_DIFF
FROM [Table]
This is giving me the below:
Item_ID Item Last_Updated Reason DAY_DIFF
123 Pencil 2020-04-01 Correction NULL
123 Pencil 2020-08-01 Correction 122
123 Pencil 2020-09-03 Correction 33
456 Highlighter 2020-05-01 Correction NULL
456 Highlighter 2020-05-10 Correction 9
789 Pen 2020-10-01 Correction NULL
789 Pen 2020-10-20 Correction 19

select t.*
from (
    select [Item_ID]
          ,[Item]
          ,[Last_Updated]
          ,[Reason]
          ,datediff(day, lag([Last_Updated], 1, [Last_Updated]) over (partition by [Item_ID] order by [Last_Updated]), [Last_Updated]) as 'Difference Between Last Correction'
          ,row_number() over (partition by [Item_ID] order by [Last_Updated] desc) as rn
    from [Table]
) t
where rn = 1;
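To try the same idea without a SQL Server instance, here is a minimal sketch that runs the LAG plus ROW_NUMBER pattern in SQLite from Python. The table name `items`, the snake_case column names, and the use of `julianday` subtraction in place of DATEDIFF are my assumptions (SQLite has no DATEDIFF); the data is the question's sample with ISO-formatted dates.

```python
import sqlite3

# In-memory stand-in for the question's table (table/column names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (item_id INTEGER, item TEXT, last_updated TEXT, reason TEXT)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        (123, "Pencil", "2020-04-01", "Correction"),
        (123, "Pencil", "2020-08-01", "Correction"),
        (123, "Pencil", "2020-09-03", "Correction"),
        (456, "Highlighter", "2020-05-01", "Correction"),
        (456, "Highlighter", "2020-05-10", "Correction"),
        (789, "Pen", "2020-10-01", "Correction"),
        (789, "Pen", "2020-10-20", "Correction"),
    ],
)

query = """
SELECT item_id, item, last_updated, reason, day_diff
FROM (
    SELECT item_id, item, last_updated, reason,
           -- SQLite has no DATEDIFF; subtracting julianday values yields days
           CAST(julianday(last_updated)
                - julianday(LAG(last_updated, 1, last_updated)
                            OVER (PARTITION BY item_id ORDER BY last_updated))
                AS INTEGER) AS day_diff,
           -- rn = 1 marks the most recent row of each item
           ROW_NUMBER() OVER (PARTITION BY item_id ORDER BY last_updated DESC) AS rn
    FROM items
)
WHERE rn = 1
ORDER BY item_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The difference is computed over all rows first, and only then is everything except the latest row per item filtered out. Filtering first would leave LAG with nothing to look back at.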

Related

How to get the last day of the month without LAST_DAY() or EOMONTH()?

I have a table t with:
DATE        LOCATION  PRODUCT_ID  AMOUNT
2021-10-29  1         123         10
2021-10-30  1         123         9
2021-10-31  1         123         8
2021-10-29  1         456         100
2021-10-30  1         456         90
2021-10-31  1         456         80
2021-10-29  2         123         18
2021-10-30  2         123         17
2021-11-29  2         456         18
I need to find the AMOUNT on the last day of the month for each combination of LOCATION + PRODUCT_ID.
If a PRODUCT_ID has no entry for that day the AMOUNT is NULL.
So the result should look like:
DATE        LOCATION  PRODUCT_ID  AMOUNT
2021-10-31  1         123         8
2021-10-31  1         456         80
2021-10-31  2         123         NULL
2021-11-30  2         456         NULL
Sadly EXASOL has no LAST_DAY() or EOMONTH() function. How can I solve this?
You can get to the last day of the month using a date_trunc function in combination with date_add:
case
when t.date = date_add('day', -1, date_add('month', 1, date_trunc('month', t.date)))
then 'Y' else 'N' end as end_of_month
That being said, if you group your table for all combinations of locations and products, you will not get NULLs for products without sales on the last day of the month as shown in your output table.
When you group your data, any value that does not exist will simply not show up in your output table. If you want to force nulls to show up, you can create a new table that contains all combinations of products, locations, and hard-coded end of month dates.
Then, you can left join your old table with this new hard-coded table by date, location, and product. This method will give you the NULL values you expect.
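As a rough illustration of the scaffold-then-left-join idea, the sketch below uses SQLite from Python rather than EXASOL, so the date functions differ from the ones above. Two things here are my assumptions: the column is called `dt` instead of DATE, and the month-end dates are derived from the data (start of month, plus one month, minus one day) instead of being hard-coded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `dt` stands in for the question's DATE column (assumed name).
conn.execute("CREATE TABLE t (dt TEXT, location INTEGER, product_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("2021-10-29", 1, 123, 10), ("2021-10-30", 1, 123, 9), ("2021-10-31", 1, 123, 8),
        ("2021-10-29", 1, 456, 100), ("2021-10-30", 1, 456, 90), ("2021-10-31", 1, 456, 80),
        ("2021-10-29", 2, 123, 18), ("2021-10-30", 2, 123, 17),
        ("2021-11-29", 2, 456, 18),
    ],
)

query = """
WITH month_ends AS (
    -- one row per location/product/month, carrying that month's last day
    SELECT DISTINCT location, product_id,
           date(dt, 'start of month', '+1 month', '-1 day') AS month_end
    FROM t
)
SELECT m.month_end, m.location, m.product_id, t.amount
FROM month_ends m
LEFT JOIN t
       ON t.dt = m.month_end
      AND t.location = m.location
      AND t.product_id = m.product_id
ORDER BY m.location, m.product_id, m.month_end
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the scaffold is the left side of the join, combinations with no row on the month's last day still appear, with a NULL amount (None in Python).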

Identify value changes in history table

I have the following table, which, apart from other attributes, contains:
Customer ID - unique identifier
Value
CreatedDate - when the record has been created (based on ETL)
UpdatedDate - until when the record has been valid
Since there are other attributes apart from the [Value], which are being tracked for historical values, there might be cases, where there are multiple rows with the same [Value] for the same customer, but different timestamps in [CreatedDate] / [UpdatedDate]. Thus, the data may look like:
Customer ID  Value  CreatedDate       UpdatedDate
1            111    04/08/2021 15:00  04/08/2021 17:00
1            111    01/08/2021 09:00  04/08/2021 15:00
1            222    20/07/2021 01:30  01/08/2021 09:00
1            222    01/06/2021 08:00  20/07/2021 01:30
1            111    01/04/2021 07:15  01/06/2021 08:00
2            333    03/08/2021 04:30  04/08/2021 17:00
2            444    23/07/2021 01:20  03/08/2021 04:30
2            444    01/04/2021 13:50  23/07/2021 01:20
I would like to keep the unique [Value]s in the correct sequence, that is, keep each [Value] with its earliest [CreatedDate]. However, if a customer originally had Value1, then changed it to Value2, and finally changed back to Value1, I would like to keep both of these changes as well. Hence the ideal output should look like:
Customer ID  Value  CreatedDate       UpdatedDate
1            111    01/08/2021 09:00  04/08/2021 17:00
1            222    01/06/2021 08:00  01/08/2021 09:00
1            111    01/04/2021 07:15  01/06/2021 08:00
2            333    03/08/2021 04:30  04/08/2021 17:00
2            444    01/04/2021 13:50  03/08/2021 04:30
Based on CreatedDate / UpdatedDate, I want to identify the chronological sequence of changes and, for each run of the same value, the earliest CreatedDate and the latest UpdatedDate. However, if a particular value appeared multiple times, separated by a different value, I would like to keep each occurrence.
I've tried the approach below. It works fine in the simple case, but not for the scenario above; its output is shown after the query:
SELECT [Customer ID]
,Value
,MIN(CreatedDate) as CreatedDate
,MAX(UpdatedDate) as UpdatedDate
FROM #History
GROUP BY [Customer ID], Value
Customer ID  Value  CreatedDate       UpdatedDate
1            111    01/04/2021 07:15  04/08/2021 17:00
1            222    01/06/2021 08:00  01/08/2021 09:00
2            333    03/08/2021 04:30  04/08/2021 17:00
2            444    01/04/2021 13:50  03/08/2021 04:30
Any ideas, please? I've tried using LAG and LEAD as well, but was not able to make it work either.
This is a type of gaps-and-islands problem that is probably best solved by looking for overlaps using a cumulative maximum:
select customerid, value, min(createddate) as createddate, max(updateddate) as updateddate
from (select t.*,
             sum(case when prev_updatedate >= createddate then 0 else 1 end) over (partition by customerid, value order by createddate) as grp
      from (select h.*,
                   max(updateddate) over (partition by customerid, value order by createddate rows between unbounded preceding and 1 preceding) as prev_updatedate
            from #history h
           ) t
     ) g
group by customerid, value, grp;
The logic is to look at the most recent updateddate before each row for each customer and value. If this is earlier than the row's createddate (or there is no earlier row), then the row starts a new group.
The final result is just aggregating the rows in each group.
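For anyone who wants to check the grouping against the sample data, here is a minimal sketch of the same cumulative-maximum approach running in SQLite from Python. The table name `history`, the snake_case column names, and the conversion of the timestamps to ISO strings (so they compare correctly as text) are my assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (customer_id INTEGER, value INTEGER, created TEXT, updated TEXT)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?, ?)",
    [
        (1, 111, "2021-08-04 15:00", "2021-08-04 17:00"),
        (1, 111, "2021-08-01 09:00", "2021-08-04 15:00"),
        (1, 222, "2021-07-20 01:30", "2021-08-01 09:00"),
        (1, 222, "2021-06-01 08:00", "2021-07-20 01:30"),
        (1, 111, "2021-04-01 07:15", "2021-06-01 08:00"),
        (2, 333, "2021-08-03 04:30", "2021-08-04 17:00"),
        (2, 444, "2021-07-23 01:20", "2021-08-03 04:30"),
        (2, 444, "2021-04-01 13:50", "2021-07-23 01:20"),
    ],
)

query = """
SELECT customer_id, value,
       MIN(created) AS first_created,
       MAX(updated) AS last_updated
FROM (
    SELECT t.*,
           -- running count of "new island" flags = island number
           SUM(CASE WHEN prev_updated >= created THEN 0 ELSE 1 END)
               OVER (PARTITION BY customer_id, value ORDER BY created) AS grp
    FROM (
        SELECT h.*,
               -- latest `updated` among strictly earlier rows of the same value
               MAX(updated) OVER (PARTITION BY customer_id, value ORDER BY created
                                  ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_updated
        FROM history h
    ) t
) g
GROUP BY customer_id, value, grp
ORDER BY customer_id, first_created DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Customer 1 ends up with two separate rows for value 111, because the 222 period leaves a gap between the first 111 interval and the later ones.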

Continuous Date / Not continuous Date sql server

I'm encountering a problem with continuous / non-continuous dates on SQL Server 2012.
I have a table that looks like this :
Article  Creation date
1234     04/01/2021
1234     05/01/2021
1234     06/01/2021
1234     07/01/2021
1234     10/01/2021
1234     12/01/2021
12345    02/01/2021
12345    03/01/2021
12345    17/01/2021
123456   01/01/2021
123456   03/01/2021
123456   05/01/2021
The problem is :
I want to get the count of rows for each article per run of consecutive dates, together with the minimum date of each run. It's a bit difficult to explain, so here is an example of the desired result:
Article  Creation date  Count
1234     04/01/2021     4
1234     10/01/2021     1
1234     12/01/2021     1
12345    02/01/2021     2
12345    17/01/2021     1
123456   01/01/2021     1
123456   03/01/2021     1
123456   05/01/2021     1
For example :
count of 1st row = 4 because there are 4 consecutive days in the range 04/01/2021 to 07/01/2021
count of 2nd row = 1 because 10/01/2021 has no adjacent day for this article
count of 3rd row = 1 because 12/01/2021 has no adjacent day for this article
I'm starting with that :
;WITH CTE AS (
SELECT Article, [Creation date], StartDate= Dateadd(day,-ROW_NUMBER() OVER (ORDER BY [Creation date]),[Creation date])
FROM MyTable
)
SELECT Article, min([Creation date]) as [Creation date], count(Article) as count
FROM CTE
GROUP BY StartDate, Article, [Creation date]
order by Article, [Creation date]
Output :
Article  Creation date  Count
1234     04/01/2021     1
1234     05/01/2021     1
1234     06/01/2021     1
1234     07/01/2021     1
1234     10/01/2021     1
1234     12/01/2021     1
12345    02/01/2021     1
12345    03/01/2021     1
12345    17/01/2021     1
123456   01/01/2021     1
123456   03/01/2021     1
123456   05/01/2021     1
But the result is wrong, and I don't really know how to approach this problem. If someone can enlighten me, I'd appreciate it.
Thank you
This is an example of a gaps-and-islands problem. The simplest solution in this case is to subtract an increasing sequence of values from the dates and aggregate. This works because the difference is constant across consecutive dates:
select article, min(creation_date), max(creation_date), count(*)
from (select t.*,
row_number() over (partition by article order by creation_date) as seqnum
from mytable t
) t
group by article, dateadd(day, -seqnum, creation_date)
order by article, min(creation_date);
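Here is a small sketch of the same row-number subtraction running in SQLite from Python, using the question's data (read as dd/mm/yyyy and stored as ISO dates). The snake_case names and the `date(..., '-N days')` call standing in for DATEADD are my assumptions for SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (article INTEGER, creation_date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1234, "2021-01-04"), (1234, "2021-01-05"), (1234, "2021-01-06"),
        (1234, "2021-01-07"), (1234, "2021-01-10"), (1234, "2021-01-12"),
        (12345, "2021-01-02"), (12345, "2021-01-03"), (12345, "2021-01-17"),
        (123456, "2021-01-01"), (123456, "2021-01-03"), (123456, "2021-01-05"),
    ],
)

query = """
SELECT article, MIN(creation_date) AS first_date, COUNT(*) AS cnt
FROM (
    SELECT article, creation_date,
           ROW_NUMBER() OVER (PARTITION BY article ORDER BY creation_date) AS seqnum
    FROM mytable
)
-- date minus row number is identical for every row of a consecutive run
GROUP BY article, date(creation_date, '-' || seqnum || ' days')
ORDER BY article, first_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For article 1234, the first four dates all map to 2021-01-03 after the subtraction, so they collapse into one group of 4; the isolated dates each get their own key. Note the PARTITION BY article, which the attempt in the question was missing.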

Recursive query with time difference

This is my first post here even though I am a daily reader. :)
I need to produce an MS SQL Server 2014 report that shows the clients that come back to do business with me in less than or equal to 3 days. I tried with INNER JOINS but I wasn't successful.
The way I thought of the solution is using the below Logic:
If product is same
and if userId is same
and if action was donedeal but now is new
and if date diff <= 3 days
and if type is NOT same
then show results
An example of my data:
id orderId userId type product date action
1 1001 654 ordered apple 01/05/2016 new
2 1002 889 ordered peach 01/05/2016 new
3 1001 654 paid apple 01/05/2016 donedeal
4 1002 889 paid peach 03/05/2016 donedeal
5 1003 654 ordered apple 03/05/2016 new
6 1004 889 ordered peach 04/05/2016 new
7 1005 122 ordered apple 04/05/2016 new
8 1006 978 ordered peach 04/05/2016 new
9 1005 122 paid apple 04/05/2016 donedeal
10 1007 122 ordered apple 10/05/2016 new
Desired results:
id orderId userId type product date Diff
3 1001 654 paid apple 01/05/2016 2 days
4 1002 889 paid peach 03/05/2016 1 day
5 1003 654 ordered apple 03/05/2016 2 days
6 1004 889 ordered peach 04/05/2016 1 day
Could you please direct me to the functions that can be useful for me to solve this?
Thanks in advance.
Update
Gordon Linoff suggested the code in his answer, but since the type had to be different I duplicated the query and ran it as shown below, and it worked:
select t.*
from (select t.*,
max(case when action = 'donedeal' and type='paid' then date end) over
(partition by userId, product order by date) as last_donedealdate
from t
) t
where action = 'new' and type='ordered' and date <= dateadd(day, 3, last_donedealdate)
UNION ALL
select t.*
from (select t.*,
max(case when action = 'donedeal' and type='ordered' then date end) over
(partition by userId, product order by date) as last_donedealdate
from t
) t
where action = 'new' and type='paid' and date <= dateadd(day, 3, last_donedealdate)
You can use window functions for this. To get the last done deal date, use max() with partition by and order by. The rest is just where clause logic:
select t.*
from (select t.*,
             max(case when action = 'donedeal' then date end) over
                 (partition by userId, product order by date) as last_donedealdate
      from t
     ) t
where action = 'new' and date <= dateadd(day, 3, last_donedealdate);
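Below is a minimal sketch of the running "last done deal" idea in SQLite from Python, using the sample rows from the question (dates read as dd/mm/yyyy). The table name `orders`, the snake_case columns, and `date(..., '+3 days')` in place of DATEADD are my assumptions. I also added `id` as a tie-breaker in the window ORDER BY, which is not in the query above: with ORDER BY date alone, rows sharing a date are peers under the default window frame, so a 'donedeal' row dated the same day as its own order would count as preceding that order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE orders (id INTEGER, order_id INTEGER, user_id INTEGER, '
    'type TEXT, product TEXT, order_date TEXT, "action" TEXT)'
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1001, 654, "ordered", "apple", "2016-05-01", "new"),
        (2, 1002, 889, "ordered", "peach", "2016-05-01", "new"),
        (3, 1001, 654, "paid", "apple", "2016-05-01", "donedeal"),
        (4, 1002, 889, "paid", "peach", "2016-05-03", "donedeal"),
        (5, 1003, 654, "ordered", "apple", "2016-05-03", "new"),
        (6, 1004, 889, "ordered", "peach", "2016-05-04", "new"),
        (7, 1005, 122, "ordered", "apple", "2016-05-04", "new"),
        (8, 1006, 978, "ordered", "peach", "2016-05-04", "new"),
        (9, 1005, 122, "paid", "apple", "2016-05-04", "donedeal"),
        (10, 1007, 122, "ordered", "apple", "2016-05-10", "new"),
    ],
)

query = """
SELECT id, user_id, product, order_date,
       CAST(julianday(order_date) - julianday(last_donedeal) AS INTEGER) AS diff_days
FROM (
    SELECT o.*,
           -- running maximum: date of the latest done deal seen so far
           MAX(CASE WHEN "action" = 'donedeal' THEN order_date END)
               OVER (PARTITION BY user_id, product ORDER BY order_date, id) AS last_donedeal
    FROM orders o
)
WHERE "action" = 'new'
  AND order_date <= date(last_donedeal, '+3 days')
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

This returns only the returning orders (ids 5 and 6). To also list the matching 'donedeal' rows, as in the desired results, you would need to join back to them or add a second branch.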

t-sql re-rank when group field changes

I'm stuck! I am trying to create a counter which starts at 1 again when group field changes:
This is what I am trying to get:
ProdID Date counter
123 1/1/2016 1
123 1/2/2016 2
123 1/3/2016 3
123 1/4/2016 4
456 1/1/2016 1
456 1/2/2016 2
789 1/1/2016 1
789 1/2/2016 2
789 1/3/2016 3
789 1/4/2016 4
789 1/5/2016 5
When I use RANK() with OVER, the counter doesn't reset when ProdID changes. What am I missing?
If you're just trying to select the data then this should give you those results:
SELECT
ProdID,
[Date], -- A poor name for a column, since it's not only a reserved word, but also not at all descriptive
ROW_NUMBER() OVER (PARTITION BY ProdID ORDER BY [Date]) AS counter
FROM
My_Table
PARTITION BY tells SQL Server that you want the windows for the ROW_NUMBER windowed function to be partitioned by the ProdID. Imagine breaking up your data into groups by ProdID. The ORDER BY tells it to order the data within each window by the Date before applying the function.
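To see the effect of PARTITION BY side by side with an unpartitioned ranking, here is a small sketch in SQLite from Python. The table and column names (`my_table`, `prod_id`, `sale_date`) are placeholders of mine, and the dates are the question's, written in ISO form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (prod_id INTEGER, sale_date TEXT)")
data = (
    [(123, f"2016-01-0{d}") for d in range(1, 5)]
    + [(456, f"2016-01-0{d}") for d in range(1, 3)]
    + [(789, f"2016-01-0{d}") for d in range(1, 6)]
)
conn.executemany("INSERT INTO my_table VALUES (?, ?)", data)

query = """
SELECT prod_id, sale_date,
       -- restarts at 1 for every prod_id
       ROW_NUMBER() OVER (PARTITION BY prod_id ORDER BY sale_date) AS counter,
       -- no PARTITION BY: one window over the whole table, so it never resets
       RANK() OVER (ORDER BY prod_id, sale_date) AS rank_no_partition
FROM my_table
ORDER BY prod_id, sale_date
"""
rows = conn.execute(query).fetchall()
counters = [r[2] for r in rows]
ranks = [r[3] for r in rows]
print(counters)
print(ranks)
```

The `counter` column restarts for each product, while the unpartitioned rank keeps climbing across products, which is the behaviour described in the question.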
Did you try this?
SELECT ProdID, [Date],
ROW_NUMBER() OVER (PARTITION BY ProdID ORDER BY [Date] ASC) AS Counter
FROM My_Table
ORDER BY ProdID, [Date] ASC