Continuous Date / Not continuous Date sql server - sql

I'm running into a problem with continuous versus non-continuous dates on SQL Server 2012.
I have a table that looks like this:
Article  Creation date
1234     04/01/2021
1234     05/01/2021
1234     06/01/2021
1234     07/01/2021
1234     10/01/2021
1234     12/01/2021
12345    02/01/2021
12345    03/01/2021
12345    17/01/2021
123456   01/01/2021
123456   03/01/2021
123456   05/01/2021
The problem is:
For each article, I want one row per run of consecutive dates, showing the first date of the run and the number of days in it. It's a bit difficult to explain, so here is an example of the expected result:
Article  Creation date  Count
1234     04/01/2021     4
1234     10/01/2021     1
1234     12/01/2021     1
12345    02/01/2021     2
12345    17/01/2021     1
123456   01/01/2021     1
123456   03/01/2021     1
123456   05/01/2021     1
For example:
The count of the 1st row is 4 because there are 4 consecutive days in the range 04/01/2021 to 07/01/2021.
The count of the 2nd row is 1 because 10/01/2021 has no adjacent date for this article.
The count of the 3rd row is 1 because 12/01/2021 has no adjacent date for this article.
I started with this:
;WITH CTE AS (
SELECT Article, [Creation date], StartDate= Dateadd(day,-ROW_NUMBER() OVER (ORDER BY [Creation date]),[Creation date])
FROM MyTable
)
SELECT Article, min([Creation date]) as [Creation date], count(Article) as count
FROM CTE
GROUP BY StartDate, Article, [Creation date]
order by Article, [Creation date]
Output:
Article  Creation date  Count
1234     04/01/2021     1
1234     05/01/2021     1
1234     06/01/2021     1
1234     07/01/2021     1
1234     10/01/2021     1
1234     12/01/2021     1
12345    02/01/2021     1
12345    03/01/2021     1
12345    17/01/2021     1
123456   01/01/2021     1
123456   03/01/2021     1
123456   05/01/2021     1
but the result is wrong, and I don't really know how to approach this problem. If someone can enlighten me, I'd appreciate it.
Thank you

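This is an example of a gaps-and-islands problem. The simplest solution in this case is to subtract an increasing sequence of values from the dates and aggregate on the result. This works because, within a run of consecutive dates, the date and the row number both advance by one per row, so their difference stays constant: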
select article,
       min(creation_date) as first_date,
       max(creation_date) as last_date,
       count(*) as cnt
from (select t.*,
             row_number() over (partition by article order by creation_date) as seqnum
      from mytable t
     ) t
-- creation_date minus seqnum is the same for every row of one run of consecutive dates
group by article, dateadd(day, -seqnum, creation_date)
order by article, min(creation_date);
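To see why the subtraction works, here is a minimal Python sketch of the same idea applied to the sample rows from the question. It is an illustration outside SQL, not the query itself: the function name `islands` and the tuple layout are made up for this example, and it assumes each article has at most one row per date.

```python
from datetime import date, timedelta
from itertools import groupby

# Sample rows from the question: (article, creation date), dates read as dd/mm/yyyy.
rows = [
    (1234, date(2021, 1, 4)), (1234, date(2021, 1, 5)),
    (1234, date(2021, 1, 6)), (1234, date(2021, 1, 7)),
    (1234, date(2021, 1, 10)), (1234, date(2021, 1, 12)),
    (12345, date(2021, 1, 2)), (12345, date(2021, 1, 3)),
    (12345, date(2021, 1, 17)),
    (123456, date(2021, 1, 1)), (123456, date(2021, 1, 3)),
    (123456, date(2021, 1, 5)),
]

def islands(rows):
    """Return (article, first_date, last_date, count) per run of consecutive days."""
    result = []
    for article, group in groupby(sorted(rows), key=lambda r: r[0]):
        # seqnum mirrors row_number() over (partition by article order by date).
        keyed = [(d - timedelta(days=seqnum), d)
                 for seqnum, (_, d) in enumerate(group, start=1)]
        # Consecutive dates share the same (date - seqnum) anchor value.
        for _, run in groupby(keyed, key=lambda k: k[0]):
            dates = [d for _, d in run]
            result.append((article, dates[0], dates[-1], len(dates)))
    return result

for row in islands(rows):
    print(row)
```

For article 1234 the anchors come out as 03/01, 03/01, 03/01, 03/01, 05/01 and 06/01, so the first four rows fall into one group and the last two each form their own.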

Related

Find day difference of last two rows

What I'm trying to do is find the MAX date and do a datediff between the most recent date and the second-to-last date, producing a single column with the difference in days. How do I get rid of the first two rows? I tried taking a MAX by wrapping the query in another one, with no luck.
Sample Data:
ITEM ID  ITEM         LAST UPDATED  REASON
123      Pencil       4/1/2020      Correction
123      Pencil       8/1/2020      Correction
123      Pencil       9/3/2020      Correction
456      Highlighter  5/1/2020      Correction
456      Highlighter  5/10/2020     Correction
789      Pen          10/1/2020     Correction
789      Pen          10/20/2020    Correction
Expected Output:
ITEM ID  ITEM         LAST UPDATED  REASON      Days Diff Since Last Correction
123      Pencil       9/3/2020      Correction  33
456      Highlighter  5/10/2020     Correction  9
789      Pen          10/20/2020    Correction  19
Here's what I've used so far:
SELECT
[Item_ID]
,[Item]
,[Last_Updated]
,[Reason]
,DATEDIFF(day,lag([Last_Updated],1) over(partition by [Item_ID] ORDER BY [Last_Updated] asc), [Last_Updated]) AS DAY_DIFF
FROM [Table]
This is giving me the below:
Item_ID Item Last_Updated Reason DAY_DIFF
123 Pencil 2020-04-01 Correction NULL
123 Pencil 2020-08-01 Correction 122
123 Pencil 2020-09-03 Correction 33
456 Highlighter 2020-05-01 Correction NULL
456 Highlighter 2020-05-10 Correction 9
789 Pen 2020-10-01 Correction NULL
789 Pen 2020-10-20 Correction 19
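Compute the difference for every row, number each item's rows from newest to oldest, and keep only the newest row per item. Passing Last_Updated as the default argument of lag() gives 0 instead of NULL for an item that has only one row:
select t.*
from (
    select
        [Item_ID]
        ,[Item]
        ,[Last_Updated]
        ,[Reason]
        ,datediff(day, lag([Last_Updated], 1, [Last_Updated]) over (partition by [Item_ID] order by [Last_Updated]), [Last_Updated]) as [Difference Between Last Correction]
        ,row_number() over (partition by [Item_ID] order by [Last_Updated] desc) as rn
    from [Table]
) t
where rn = 1;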
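The same "keep the newest row, compare it with the one before" logic can be sketched in Python. This is an illustration only: the function name `latest_with_gap` is made up, the REASON column is left out for brevity, and the dates are taken from the query output shown in the question.

```python
from datetime import date
from itertools import groupby

# (item_id, item, last_updated), as in the question's query output.
rows = [
    (123, "Pencil", date(2020, 4, 1)),
    (123, "Pencil", date(2020, 8, 1)),
    (123, "Pencil", date(2020, 9, 3)),
    (456, "Highlighter", date(2020, 5, 1)),
    (456, "Highlighter", date(2020, 5, 10)),
    (789, "Pen", date(2020, 10, 1)),
    (789, "Pen", date(2020, 10, 20)),
]

def latest_with_gap(rows):
    """Return (item_id, item, last_updated, days_since_previous) for the newest row per item."""
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    for item_id, group in groupby(ordered, key=lambda r: r[0]):
        history = list(group)
        latest = history[-1]  # the row that gets rn = 1 in the query
        # Like lag(x, 1, x): fall back to the row itself when nothing precedes it.
        previous = history[-2] if len(history) > 1 else latest
        out.append((item_id, latest[1], latest[2], (latest[2] - previous[2]).days))
    return out

for row in latest_with_gap(rows):
    print(row)
```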

How to get value from another record in the same table?

I have the following table structure. How can I create a view or write a select statement that adds a column showing the value from another record with the same notice# in the same table?
ID notice# notice Date Sequence
1 ABCD1 1/2/2021 1
2 ABCD1 1/3/2021 2
3 ABCD1 1/3/2021 3
4 ABCD2 1/3/2021 1
5 ABCD2 1/3/2021 2
Expected result: I want to add a new column, Prior Notice Date:
ID notice# notice Date Sequence Prior Notice Date
1 ABCD1 1/2/2021 1
2 ABCD1 1/3/2021 2 1/2/2021
3 ABCD1 1/3/2021 3 1/3/2021
4 ABCD2 1/3/2021 1
5 ABCD2 1/3/2021 2 1/3/2021
If Sequence is 1, then Prior Notice Date = null
If Sequence is 2, then Prior Notice Date = notice Date of Sequence 1
If Sequence is 3, then Prior Notice Date = notice Date of Sequence 2
LAG/LEAD functions can be used to achieve this.
http://sqlfiddle.com/#!18/df7f2/4
select *,
       lag([notice date]) over (partition by notice# order by [sequence]) as [Prior Notice Date]
from table1
order by notice#, [sequence];
Here is a subquery solution:
SELECT
    ID,
    [notice#],
    [notice Date],
    Sequence,
    (SELECT yt2.[notice Date]
     FROM YourTable yt2
     WHERE yt2.Sequence = yt1.Sequence - 1
       AND yt2.[notice#] = yt1.[notice#]) AS [Prior Notice Date]
FROM YourTable yt1
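Both answers carry the previous row's date forward within each notice#. A small Python sketch of that behaviour, using the question's rows, may make it easier to check the expected result by hand. The function name `with_prior_date` is hypothetical, and dates are kept as plain strings because nothing is computed from them.

```python
from itertools import groupby

# (ID, notice#, notice Date, Sequence) from the question.
rows = [
    (1, "ABCD1", "1/2/2021", 1),
    (2, "ABCD1", "1/3/2021", 2),
    (3, "ABCD1", "1/3/2021", 3),
    (4, "ABCD2", "1/3/2021", 1),
    (5, "ABCD2", "1/3/2021", 2),
]

def with_prior_date(rows):
    """Append the previous row's notice date within each notice#, ordered by Sequence."""
    out = []
    ordered = sorted(rows, key=lambda r: (r[1], r[3]))
    for _, group in groupby(ordered, key=lambda r: r[1]):
        prior = None  # lag() yields NULL for the first row of each partition
        for row_id, notice, notice_date, seq in group:
            out.append((row_id, notice, notice_date, seq, prior))
            prior = notice_date
    return out

for row in with_prior_date(rows):
    print(row)
```

One difference between the two SQL answers is worth keeping in mind: lag() takes whatever row comes immediately before in Sequence order, while the subquery looks specifically for Sequence - 1, so they only agree when sequence numbers have no gaps.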

SQL - dynamic sum based on dynamic date range

I'm new to SQL and I'm not even sure if what I am trying to achieve is possible.
I have two tables. The first gives an account number, a 'from' date and a 'to' date. The second table shows monthly volume for each account.
Table 1 - Dates
Account# Date_from Date_to
-------- --------- -------
123 2018-01-01 2018-12-10
456 2018-06-01 2018-12-10
789 2018-04-23 2018-11-01
Table 2 - Monthly_Volume
Account# Date Volume
--------- ---------- ------
123 2017-12-01 5
123 2018-01-15 5
123 2018-02-05 5
456 2018-01-01 10
456 2018-10-01 15
789 2017-06-01 5
789 2018-01-15 10
789 2018-06-20 7
I would like to merge the two tables in such a way that each account in Table 1 has a fourth column that gives the sum of Volume between Date_from and Date_to.
Desired Result:
Account# Date_from Date_to Sum(Volume)
-------- --------- ------- -----------
123 2018-01-01 2018-12-10 10
456 2018-06-01 2018-12-10 15
789 2018-04-23 2018-11-01 7
I believe that this would be possible to achieve for each account individually by doing something like the following and joining the result to the Dates table:
SELECT
Account#,
SUM(Volume)
FROM Monthly_Volume
WHERE
Account# = '123'
AND Date >= TO_DATE('2018-01-01', 'YYYY-MM-DD')
AND Date <= TO_DATE('2018-12-10', 'YYYY-MM-DD')
GROUP BY Account#
What I'd like to know is whether it is possible to achieve this without having to individually fill in the Account#, Date_from and Date_to for each account (there are ~1,000 accounts), but have it be done automatically for each entry in the Dates table.
Thank you!
You should be able to use join and group by:
select d.account#, d.Date_from, d.Date_to, sum(mv.volume)
from dates d
left join monthly_volume mv
    on mv.account# = d.account#
   and mv.date between d.Date_from and d.Date_to
group by d.account#, d.Date_from, d.Date_to;
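As a cross-check of the join condition, here is a short Python sketch that applies the same per-account date filter to the sample data from the question. The function name `volume_in_range` is invented for this example, and the None result for an account without matching rows mirrors what SUM returns over the NULLs produced by the left join.

```python
from datetime import date

# Table 1: (account, date_from, date_to)
dates = [
    (123, date(2018, 1, 1), date(2018, 12, 10)),
    (456, date(2018, 6, 1), date(2018, 12, 10)),
    (789, date(2018, 4, 23), date(2018, 11, 1)),
]

# Table 2: (account, date, volume)
monthly_volume = [
    (123, date(2017, 12, 1), 5), (123, date(2018, 1, 15), 5), (123, date(2018, 2, 5), 5),
    (456, date(2018, 1, 1), 10), (456, date(2018, 10, 1), 15),
    (789, date(2017, 6, 1), 5), (789, date(2018, 1, 15), 10), (789, date(2018, 6, 20), 7),
]

def volume_in_range(dates, monthly_volume):
    """Sum each account's volume whose date lies in [date_from, date_to], bounds included."""
    out = []
    for account, date_from, date_to in dates:
        matched = [v for acc, d, v in monthly_volume
                   if acc == account and date_from <= d <= date_to]
        # No matching rows: the left join keeps the account and the sum is NULL.
        out.append((account, date_from, date_to, sum(matched) if matched else None))
    return out

for row in volume_in_range(dates, monthly_volume):
    print(row)
```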

Use Calendar table to generate historical view of the data

I have a created_date (timestamp) column in one of my tables, which also has a duration column for a project. I need to join it with another table that has only a first-day-of-month column (month_start_date), holding the first day of each month, along with other relevant information.
Table 1
id project_id created_date duration
1 12345 01/01/2015 10
2 12345 20/10/2015 11
3 12345 10/04/2016 13
4 12345 10/08/2016 15
Table 2
project_id month_start_date
12345 01/01/2015
12345 01/02/2015
12345 01/03/2015
12345 01/04/2015
...
12345 01/08/2016
Expected result
project_id month_start_date duration
12345 01/01/2015 10
12345 01/02/2015 10
...
12345 01/10/2015 11
12345 01/11/2015 11
...
12345 01/04/2016 13
12345 01/05/2016 13
12345 01/06/2016 13
...
12345 01/08/2016 15
I want to present the data from my second table historically. That is, for each month_start_date the query should return the duration in effect at that time, so that a value repeats until the next month for which dateadd(month,datediff(month,0,created_date),0) = month_start_date, and so on.
This is my query:
select table2.project_name,
table2.month_start_date,
table1.duration,
table1.created_date
from table1 left outer join table2
on table1.project_id=table2.project_id
where dateadd(month,datediff(month,0,table1.created_date),0)<=table2.month_start_date
group by table2.project_name,table2.month_start_date,table1.duration,table1.created_date
order by table2.month_start_date asc
but I get repeated records on this:
Result I'm getting
project_id month_start_date duration
12345 01/01/2015 10
12345 01/02/2015 10
...
12345 01/10/2015 10
12345 01/10/2015 11
...
12345 01/04/2016 10
12345 01/04/2016 11
12345 01/04/2016 13
...
12345 01/08/2016 10
12345 01/08/2016 11
12345 01/08/2016 13
12345 01/08/2016 15
Can anyone help?
Thank you!
I'd use CROSS/OUTER APPLY operator.
Here is one possible variant. For each row in your calendar table Table2 (that is, for each month), the correlated subquery inside the CROSS APPLY picks one row from Table1: among the rows with the same project_id whose created_date is earlier than month_start_date plus one month, it takes the one with the latest created_date.
SELECT
    Table2.project_id
    ,Table2.month_start_date
    ,Durations.duration
FROM
    Table2
    CROSS APPLY
    (
        SELECT TOP(1) Table1.duration
        FROM Table1
        WHERE
            Table1.project_id = Table2.project_id
            AND Table1.created_date < DATEADD(month, 1, Table2.month_start_date)
        ORDER BY Table1.created_date DESC
    ) AS Durations
;
Make sure that Table1 has an index on (project_id, created_date) that includes duration. Otherwise, performance will be poor.
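The "latest row before the start of the next month" lookup can be sketched in Python with a binary search, which is roughly what the suggested index lets the server do. This is only an illustration for a single project: the names `next_month_start` and `duration_as_of` are made up, and created_date is treated as a plain date rather than a timestamp.

```python
from bisect import bisect_left
from datetime import date

# Table1 rows for project 12345: (created_date, duration), sorted by created_date.
history = [
    (date(2015, 1, 1), 10),
    (date(2015, 10, 20), 11),
    (date(2016, 4, 10), 13),
    (date(2016, 8, 10), 15),
]

def next_month_start(d):
    """First day of the month after d; plays the role of DATEADD(month, 1, month_start_date)."""
    return date(d.year + 1, 1, 1) if d.month == 12 else date(d.year, d.month + 1, 1)

def duration_as_of(month_start, history):
    """Duration of the latest row created strictly before the following month starts."""
    created = [c for c, _ in history]
    i = bisect_left(created, next_month_start(month_start))
    # i == 0 means no row qualifies: CROSS APPLY drops the month, OUTER APPLY returns NULL.
    return history[i - 1][1] if i > 0 else None

for month in (date(2015, 9, 1), date(2015, 10, 1), date(2016, 4, 1), date(2016, 8, 1)):
    print(month, duration_as_of(month, history))
```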

Generate sequence based on the value in the previous row and current row

I have the below table having student information.
S_ID Group_ID Date Score
12345 1 1/1/2015 1
12345 1 2/1/2015 2
12345 1 3/1/2015 4
12345 1 4/1/2015 5
12345 1 9/1/2015 3
12345 1 10/1/2015 8
12345 2 1/1/2015 2
12345 2 2/1/2015 4
12345 2 3/1/2015 6
I want to generate a new table for a few students, adding a sequence column as shown below:
S_ID Group_ID Date Score Sequence
12345 1 1/1/2015 1 1
12345 1 2/1/2015 2 2
12345 1 3/1/2015 4 3
12345 1 4/1/2015 5 4
12345 1 9/1/2015 3 3
12345 1 10/1/2015 8 4
12345 2 1/1/2015 2 2
12345 2 2/1/2015 4 3
12345 2 3/1/2015 6 4
Rules:
Sequence should be generated for each combination of S_ID and Group_ID.
For the first record, the sequence number is the same as the Score.
From the 2nd record onwards, it is 1 + the previous sequence number.
If the difference between the date of the previous row and the current row is more than 100 days, the sequence number restarts (same as the Score for that record).
This is a large table and I am looking for the most optimized SQL. Any help would be greatly appreciated.
The trick here is to find where the sequence numbers start over. That happens for a new student, for a new group, and whenever the gap since the previous date is too big. For the last case, you can use lag() to compute a flag that marks the start of a new run of dates, and then take a cumulative sum of that flag to get a grouping.
select t.*,
       (first_value(score) over (partition by s_id, group_id, grp order by date) +
        row_number() over (partition by s_id, group_id, grp order by date) - 1
       ) as sequence
from (select t.*,
             sum(case when prev_date is null or prev_date < date - 100
                      then 1 else 0
                 end) over (partition by s_id, group_id order by date) as grp
      from (select t.*,
                   lag(date) over (partition by s_id, group_id order by date) as prev_date
            from t
           ) t
     ) t;
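The restart rule can also be checked with a simple loop. The Python sketch below is an iterative equivalent of the rule, not a translation of the windowed query: the function name `add_sequence` and the `max_gap_days` parameter are made up, and the dates are read as month/day/year, which is the only reading under which the 4/1/2015 to 9/1/2015 gap exceeds 100 days.

```python
from datetime import date
from itertools import groupby

# (S_ID, Group_ID, Date, Score) from the question, dates assumed to be m/d/yyyy.
rows = [
    (12345, 1, date(2015, 1, 1), 1), (12345, 1, date(2015, 2, 1), 2),
    (12345, 1, date(2015, 3, 1), 4), (12345, 1, date(2015, 4, 1), 5),
    (12345, 1, date(2015, 9, 1), 3), (12345, 1, date(2015, 10, 1), 8),
    (12345, 2, date(2015, 1, 1), 2), (12345, 2, date(2015, 2, 1), 4),
    (12345, 2, date(2015, 3, 1), 6),
]

def add_sequence(rows, max_gap_days=100):
    """Append a sequence that starts at Score and restarts after a gap over max_gap_days."""
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[2]))
    for _, group in groupby(ordered, key=lambda r: (r[0], r[1])):
        prev_date = None
        seq = None
        for s_id, group_id, d, score in group:
            # Same test as "prev_date is null or prev_date < date - 100" in the query.
            if prev_date is None or (d - prev_date).days > max_gap_days:
                seq = score  # first row of a run: take the Score
            else:
                seq += 1     # otherwise continue from the previous sequence number
            out.append((s_id, group_id, d, score, seq))
            prev_date = d
    return out

for row in add_sequence(rows):
    print(row)
```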