Concatenation of adjacent dates in SQL - sql

I would like to know how to make intersections or concatenations of adjacent date ranges in sql.
I have a list of customer start and end dates, for example (in dd/mm/yyyy format, where 31/12/9999 means the customer is still a current customer).
CustID | StartDate | Enddate |
1 | 01/08/2011|19/06/2012|
1 | 20/06/2012|07/03/2012|
1 | 03/05/2012|31/12/9999|
2 | 09/03/2009|16/08/2009|
2 | 16/01/2010|10/10/2010|
2 | 11/10/2010|31/12/9999|
3 | 01/08/2010|19/08/2010|
3 | 20/08/2010|26/12/2011|
Although the dates in different rows don't overlap, I would consider some of the ranges as a contigous period of time, e.g when the start date comes one day after an end date (for a given customer). Hence I would like to return a query that returns just the intersection of the dates,
CustID | StartDate | Enddate |
1 | 01/08/2011|07/03/2012|
1 | 03/05/2012|31/12/9999|
2 | 09/03/2009|16/08/2009|
2 | 16/01/2010|31/12/9999|
3 | 01/08/2010|26/12/2011|
I've looked at CTE tables, but I can't figure out how to return just one row for one contigous block of dates.

This should work in 2005 forward:
;WITH cte2 AS (SELECT 0 AS Number
UNION ALL
SELECT Number + 1
FROM cte2
WHERE Number < 10000)
SELECT CustID, Min(GroupStart) StartDate, MAX(EndDate) EndDate
FROM (SELECT *
, DATEADD(DAY,b.number,a.StartDate) GroupStart
, DATEADD(DAY,1- DENSE_RANK() OVER (PARTITION BY CustID ORDER BY DATEADD(DAY,b.number,a.StartDate)),DATEADD(DAY,b.number,a.StartDate)) GroupDate
FROM Table1 a
JOIN cte2 b
ON b.number <= DATEDIFF(d, startdate, EndDate)
) X
GROUP BY CustID, GroupDate
ORDER BY CustID, StartDate
OPTION (MAXRECURSION 0)
Demo: SQL Fiddle
You can build a quick table of numbers 0-something large enough to cover the spread of dates in your ranges to replace the cte so it doesn't run each time, indexed properly it will run quickly.

you can do this with recursive common table expression:
with cte as (
select t.CustID, t.StartDate, t.EndDate, t2.StartDate as NextStartDate
from Table1 as t
left outer join Table1 as t2 on t2.CustID = t.CustID and t2.StartDate = case when t.EndDate < '99991231' then dateadd(dd, 1, t.EndDate) end
), cte2 as (
select c.CustID, c.StartDate, c.EndDate, c.NextStartDate
from cte as c
where c.NextStartDate is null
union all
select c.CustID, c.StartDate, c2.EndDate, c2.NextStartDate
from cte2 as c2
inner join cte as c on c.CustID = c2.CustID and c.NextStartDate = c2.StartDate
)
select CustID, min(StartDate) as StartDate, EndDate
from cte2
group by CustID, EndDate
order by CustID, StartDate
option (maxrecursion 0);
sql fiddle demo
Quick performance tests:
Results on 750 rows, small periods of 2 days length:
sql fiddle demo
My query: 300 ms
Goat CO query with CTE: 10804 ms
Goat CO query with table of fixed numbers: 7 ms
Results on 5 rows, large periods:
sql fiddle demo
My query: 1 ms
Goat CO query with CTE: 700 ms
Goat CO query with table of fixed numbers: 36 ms

Related

SQL: How to create a daily view based on different time intervals using SQL logic?

Here is an example:
Id|price|Date
1|2|2022-05-21
1|3|2022-06-15
1|2.5|2022-06-19
Needs to look like this:
Id|Date|price
1|2022-05-21|2
1|2022-05-22|2
1|2022-05-23|2
...
1|2022-06-15|3
1|2022-06-16|3
1|2022-06-17|3
1|2022-06-18|3
1|2022-06-19|2.5
1|2022-06-20|2.5
...
Until today
1|2022-08-30|2.5
I tried using the lag(price) over (partition by id order by date)
But i can't get it right.
I'm not familiar with Azure, but it looks like you need to use a calendar table, or generate missing dates using a recursive CTE.
To get started with a recursive CTE, you can generate line numbers for each id (assuming multiple id values) in the source data ordered by date. These rows with row number equal to 1 (with the minimum date value for the corresponding id) will be used as the starting point for the recursion. Then you can use the DATEADD function to generate the row for the next day. To use the price values ​​from the original data, you can use a subquery to get the price for this new date, and if there is no such value (no row for this date), use the previous price value from CTE (use the COALESCE function for this).
For SQL Server query can look like this
WITH cte AS (
SELECT
id,
date,
price
FROM (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) AS rn
FROM tbl
) t
WHERE rn = 1
UNION ALL
SELECT
cte.id,
DATEADD(d, 1, cte.date),
COALESCE(
(SELECT tbl.price
FROM tbl
WHERE tbl.id = cte.id AND tbl.date = DATEADD(d, 1, cte.date)),
cte.price
)
FROM cte
WHERE DATEADD(d, 1, cte.date) <= GETDATE()
)
SELECT * FROM cte
ORDER BY id, date
OPTION (MAXRECURSION 0)
Note that I added OPTION (MAXRECURSION 0) to make the recursion run through all the steps, since the default value is 100, this is not enough to complete the recursion.
db<>fiddle here
The same approach for MySQL (you need MySQL of version 8.0 to use CTE)
WITH RECURSIVE cte AS (
SELECT
id,
date,
price
FROM (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) AS rn
FROM tbl
) t
WHERE rn = 1
UNION ALL
SELECT
cte.id,
DATE_ADD(cte.date, interval 1 day),
COALESCE(
(SELECT tbl.price
FROM tbl
WHERE tbl.id = cte.id AND tbl.date = DATE_ADD(cte.date, interval 1 day)),
cte.price
)
FROM cte
WHERE DATE_ADD(cte.date, interval 1 day) <= NOW()
)
SELECT * FROM cte
ORDER BY id, date
db<>fiddle here
Both queries produces the same results, the only difference is the use of the engine's specific date functions.
For MySQL versions below 8.0, you can use a calendar table since you don't have CTE support and can't generate the required date range.
Assuming there is a column in the calendar table to store date values ​​(let's call it date for simplicity) you can use the CROSS JOIN operator to generate date ranges for the id values in your table that will match existing dates. Then you can use a subquery to get the latest price value from the table which is stored for the corresponding date or before it.
So the query would be like this
SELECT
d.id,
d.date,
(SELECT
price
FROM tbl
WHERE tbl.id = d.id AND tbl.date <= d.date
ORDER BY tbl.date DESC
LIMIT 1
) price
FROM (
SELECT
t.id,
c.date
FROM calendar c
CROSS JOIN (SELECT DISTINCT id FROM tbl) t
WHERE c.date BETWEEN (
SELECT
MIN(date) min_date
FROM tbl
WHERE tbl.id = t.id
)
AND NOW()
) d
ORDER BY id, date
Using my pseudo-calendar table with date values ranging from 2022-05-20 to 2022-05-30 and source data in that range, like so
id
price
date
1
2
2022-05-21
1
3
2022-05-25
1
2.5
2022-05-28
2
10
2022-05-25
2
100
2022-05-30
the query produces following results
id
date
price
1
2022-05-21
2
1
2022-05-22
2
1
2022-05-23
2
1
2022-05-24
2
1
2022-05-25
3
1
2022-05-26
3
1
2022-05-27
3
1
2022-05-28
2.5
1
2022-05-29
2.5
1
2022-05-30
2.5
2
2022-05-25
10
2
2022-05-26
10
2
2022-05-27
10
2
2022-05-28
10
2
2022-05-29
10
2
2022-05-30
100
db<>fiddle here

Fill the data on the missing date range

I have a table will the data with exist data below:
Select Date, [Closing Balance] from StockClosing
Date | Closing Quantity
---------------------------
20200828 | 5
20200901 | 10
20200902 | 8
20200904 | 15
20200905 | 18
There are some missing date on the table, example 20200829 to 20200831 and 20200903.
Those closing quantity of the missing date will be follow as per previous day closing quantity.
I would like select the table result in a full range of date (show everyday) with the closing quantity. Expected result,
Date | Closing Quantity
---------------------------
20200828 | 5
20200829 | 5
20200830 | 5
20200831 | 5
20200901 | 10
20200902 | 8
20200903 | 8
20200904 | 15
20200905 | 18
Beside using cursor/for loop to insert the missing date and data 1 by 1, is that any SQL command can do it at once?
You have option to use recursive CTE.
For reference Click Here
;with cte as(
select max(date) date from YourTable
),cte1 as (
select min(date) date from YourTable
union all
select dateadd(day,1,cte1.date) date from cte1 where date<(select date from cte)
)select c.date,isnull(y.[Closing Quantity],
(select top 1 a.[Closing Quantity] from YourTable a where c.date>a.date order by a.date desc) )
as [Closing Quantity]
from cte1 c left join YourTable y on c.date=y.date
The easiest way to do this is to use LAST_VALUE along with the IGNORE NULLS option. Sadly, SQL Server does not support this. There is a workaround using analytic functions, but I would actually offer this simple option, which uses a correlated subquery to fill in the missing values:
WITH dates AS (
SELECT '20200828' AS Date UNION ALL
SELECT '20200829' UNION ALL
SELECT '20200830' UNION ALL
SELECT '20200831' UNION ALL
SELECT '20200901' UNION ALL
SELECT '20200902' UNION ALL
SELECT '20200903' UNION ALL
SELECT '20200904' UNION ALL
SELECT '20200905'
)
SELECT
d.Date,
(SELECT TOP 1 t2.closing FROM StockClosing t2
WHERE t2.Date <= d.Date AND t2.closing IS NOT NULL
ORDER BY t2.Date DESC) AS closing
FROM dates d
LEFT JOIN StockClosing t1
ON d.Date = t1.Date;
Demo

Frequency Distribution by Day

I have records of No. of calls coming to a call center. When a call comes into a call center a ticket is open.
So, let's say ticket 1 (T1) is open on 8/1/19 and it stays open till 8/5/19. So, if a person ran a query everyday then on 8/1 it will show 1 ticket open...same think on day 2 till day 5....I want to get records by day to see how many tickets were open for each day.....
In short, Frequency Distribution by Day.
Ticket Open_date Close_date
T1 8/1/2019 8/5/2019
T2 8/1/2019 8/6/2019
Result:
Result
Date # Tickets_Open
8/1/2019 2
8/2/2019 2
8/3/2019 2
8/4/2019 2
8/5/2019 2
8/6/2019 1
8/7/2019 0
8/8/2019 0
8/9/2019 0
8/10/2019 0
We can handle your requirement via the use of a calendar table, which stores all dates covering the full range in your data set.
WITH dates AS (
SELECT '2019-08-01' AS dt UNION ALL
SELECT '2019-08-02' UNION ALL
SELECT '2019-08-03' UNION ALL
SELECT '2019-08-04' UNION ALL
SELECT '2019-08-05' UNION ALL
SELECT '2019-08-06' UNION ALL
SELECT '2019-08-07' UNION ALL
SELECT '2019-08-08' UNION ALL
SELECT '2019-08-09' UNION ALL
SELECT '2019-08-10'
)
SELECT
d.dt,
COUNT(t.Open_date) AS num_tickets_open
FROM dates d
LEFT JOIN tickets t
ON d.dt BETWEEN t.Open_date AND t.Close_date
GROUP BY
d.dt;
Note that in practice if you expect to have this reporting requirement in the long term, you might want to replace the dates CTE above with a bona-fide table of dates.
This solution generates the list of dates from the tickets table using CTE recursion and calculates the count:
WITH Tickets(Ticket, Open_date, Close_date) AS
(
SELECT "T1", "8/1/2019", "8/5/2019"
UNION ALL
SELECT "T2", "8/1/2019", "8/6/2019"
),
Ticket_dates(Ticket, Dates) as
(
SELECT t1.Ticket, CONVERT(DATETIME, t1.Open_date)
FROM Tickets t1
UNION ALL
SELECT t1.Ticket, DATEADD(dd, 1, CONVERT(DATETIME, t1.Dates))
FROM Ticket_dates t1
inner join Tickets t2 on t1.Ticket = t2.Ticket
where DATEADD(dd, 1, CONVERT(DATETIME, t1.Dates)) <= CONVERT(DATETIME, t2.Close_date)
)
SELECT CONVERT(varchar, Dates, 1), count(*)
FROM Ticket_dates
GROUP by Dates
ORDER by Dates
A "general purpose" trick is to generate a series of numbers, which can be done using CTE's but there are many alternatives, and from that create the needed range of dates. Once that exists then you can left join your ticket data to this and then count by date.
CREATE TABLE mytable(
Ticket VARCHAR(8) NOT NULL PRIMARY KEY
,Open_date DATE NOT NULL
,Close_date DATE NOT NULL
);
INSERT INTO mytable(Ticket,Open_date,Close_date) VALUES ('T1','8/1/2019','8/5/2019');
INSERT INTO mytable(Ticket,Open_date,Close_date) VALUES ('T2','8/1/2019','8/6/2019');
Also note I am using a cross apply in this example to "attach" the min and max dates of your tickets to each numbered row. You would need to include your own logic on what data to select here.
;WITH
cteDigits AS (
SELECT 0 AS digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
)
, cteTally AS (
SELECT
[1s].digit
+ [10s].digit * 10
+ [100s].digit * 100 /* add more like this as needed */
AS num
FROM cteDigits [1s]
CROSS JOIN cteDigits [10s]
CROSS JOIN cteDigits [100s] /* add more like this as needed */
)
select
n.num + 1 rownum
, dateadd(day,n.num,ca.min_date) as on_date
, count(t.Ticket) as tickets_open
from cteTally n
cross apply (select min(Open_date), max(Close_date) from mytable) ca (min_date, max_date)
left join mytable t on dateadd(day,n.num,ca.min_date) between t.Open_date and t.Close_date
where dateadd(day,n.num,ca.min_date) <= ca.max_date
group by
n.num + 1
, dateadd(day,n.num,ca.min_date)
order by
rownum
;
result:
+--------+---------------------+--------------+
| rownum | on_date | tickets_open |
+--------+---------------------+--------------+
| 1 | 01.08.2019 00:00:00 | 2 |
| 2 | 02.08.2019 00:00:00 | 2 |
| 3 | 03.08.2019 00:00:00 | 2 |
| 4 | 04.08.2019 00:00:00 | 2 |
| 5 | 05.08.2019 00:00:00 | 2 |
| 6 | 06.08.2019 00:00:00 | 1 |
+--------+---------------------+--------------+

Aggregate a subtotal column based on two dates of that same row

Situation:
I have 5 columns
id
subtotal (price of item)
order_date (purchase date)
updated_at (if refunded or any other status change)
status
Objective:
I need the order date as column 1
I need to get the subtotal for each day regardless if of the status as column 2
I need the subtotal amount for refunds for the third column.
Example:
If a purchase is made on May 1st and refunded on May 3rd. The output should look like this
+-------+----------+--------+
| date | subtotal | refund |
+-------+----------+--------+
| 05-01 | 10.00 | 0.00 |
| 05-02 | 00.00 | 0.00 |
| 05-03 | 00.00 | 10.00 |
+-------+----------+--------+
while the row will look like that
+-----+----------+------------+------------+----------+
| id | subtotal | order_date | updated_at | status |
+-----+----------+------------+------------+----------+
| 123 | 10 | 2019-05-01 | 2019-05-03 | refunded |
+-----+----------+------------+------------+----------+
Query:
Currently what I have looks like this:
Note: Timezone discrepancy therefore bring back the dates by 8 hours.
;with cte as (
select id as orderid
, CAST(dateadd(hour,-8,order_date) as date) as order_date
, CAST(dateadd(hour,-8,updated_at) as date) as updated_at
, subtotal
, status
from orders
)
select
b.dates
, sum(a.subtotal_price) as subtotal
, -- not sure how to aggregate it to get the refunds
from Orders as o
inner join cte as a on orders.id=cte.orderid
inner join (select * from cte where status = ('refund')) as b on o.id=cte.orderid
where dates between '2019-05-01' and '2019-05-31'
group by dates
And do I need to join it twice? Hopefully not since my table is huge.
This looks like a job for a Calendar Table. Bit of a stab in the dark, but:
--Overly simplistic Calendar table
CREATE TABLE dbo.Calendar (CalendarDate date);
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1, N N2, N N3, N N4, N N5) --Many years of data
INSERT INTO dbo.Calendar
SELECT DATEADD(DAY, T.I, 0)
FROM Tally T;
GO
SELECT C.CalendarDate AS [date],
CASE C.CalendarDate WHEN V.order_date THEN subtotal ELSE 0 END AS subtotal,
CASE WHEN C.CalendarDate = V.updated_at AND V.[status] = 'refunded' THEN subtotal ELSE 0.00 END AS subtotal
FROM (VALUES(123,10.00,CONVERT(date,'20190501'),CONVERT(date,'20190503'),'refunded'))V(id,subtotal,order_date,updated_at,status)
JOIN dbo.Calendar C ON V.order_date <= C.CalendarDate AND V.updated_at >= C.CalendarDate;
GO
DROP TABLE dbo.Calendar;
Consider joining on a recursive CTE of sequential dates:
WITH dates AS (
SELECT CONVERT(datetime, '2019-01-01') AS rec_date
UNION ALL
SELECT DATEADD(d, 1, CONVERT(datetime, rec_date))
FROM dates
WHERE rec_date < '2019-12-31'
),
cte AS (
SELECT id AS orderid
, CAST(dateadd(hour,-8,order_date) AS date) as order_date
, CAST(dateadd(hour,-8,updated_at) AS date) as updated_at
, subtotal
, status
FROM orders
)
SELECT rec_date AS date,
CASE
WHEN c.order_date = d.rec_date THEN subtotal
ELSE 0
END AS subtotal,
CASE
WHEN c.updated_at = d.rec_date THEN subtotal
ELSE 0
END AS refund
FROM cte c
JOIN dates d ON d.rec_date BETWEEN c.order_date AND c.updated_at
WHERE c.status = 'refund'
option (maxrecursion 0)
GO
Rextester demo

How to duplicate data in sql with conditions

I havea table as table_A . table_A includes these columns
-CountryName
-Min_Date
-Max_Date
-Number
I want to duplicate data with seperating by months. For example
Argentina | 2015-01-04 | 2015-04-07 | 100
England | 2015-02-08 | 2015-03-11 | 90
I want to see a table as this (Monthly seperated)
Argentina | 01-2015 | 27 //(days to end of the min_date's month)
Argentina | 02-2015 | 29 //(days full month)
Argentina | 03-2015 | 31 //(days full month)
Argentina | 04-2015 | 7 //(days from start of the max_date's month)
England | 02-2015 | 21 //(days)
England | 03-2015 | 11 //(days)
I tried too much thing to made this for each records. But now my brain is so confusing and my project is delaying.
Does anybody know how can i solve this. I tried to duplicate each rows with datediff count but it is not working
WITH cte AS (
SELECT CountryName, ISNULL(DATEDIFF(M,Min_Date ,Max_Date )+1,1) as count FROM table_A
UNION ALL
SELECT CountryName, count-1 FROM cte WHERE count>1
)
SELECT CountryName,count FROM cte
-Generate all the dates between min and max dates for each country.
-Then get the month start and month end dates for each country,year,month.
-Finally get the date differences of the month start and month end.
WITH cte AS (
SELECT Country, min_date dt,min_date,max_date FROM t
UNION ALL
SELECT Country, dateadd(dd,1,dt),min_date,max_date FROM cte WHERE dt < max_date
)
,monthends as (
SELECT country,year(dt) yr,month(dt) mth,max(dt) monthend,min(dt) monthstart
FROM cte
GROUP BY country,year(dt),month(dt))
select country
,cast(mth as varchar(2))+'-'+cast(yr as varchar(4)) yr_month
,datediff(dd,monthstart,monthend)+1 days_diff
from monthends
Sample Demo
EDIT: Another option would be to generate all the dates once (the example shown here generates 51 years of dates from 2000 to 2050) and then joining it to the table to get the days by month.
WITH cte AS (
SELECT cast('2000-01-01' as date) dt,cast('2050-12-31' as date) maxdt
UNION ALL
SELECT dateadd(dd,1,dt),maxdt FROM cte WHERE dt < maxdt
)
SELECT country,year(dt) yr,month(dt) mth, datediff(dd,min(dt),max(dt))+1 days_diff
FROM cte c
JOIN t on c.dt BETWEEN t.min_date and t.max_date
GROUP BY country,year(dt),month(dt)
OPTION (MAXRECURSION 0)
I think you have the right idea. But you need to construct the months:
WITH cte AS (
SELECT CountryName, Min_Date as dte, Min_Date, Max_Date
FROM table_A
UNION ALL
SELECT CountryName, DATEADD(month, 1, dte), Min_Date, Max_Date
FROM cte
WHERE dte < Max_date
)
SELECT CountryName, dte
FROM cte;
Getting the number of days in the month is a bit more complicated. That requires some thought.
Oh, I forgot about EOMONTH():
select countryName, dte,
(case when dte = min_date
then datediff(day, min_date, eomonth(dte)) + 1
when dte = max_date
then day(dte)
else day(eomonth(dte))
end) as days
from cte;
Using a Calendar Table makes this stuff pretty easy. RexTester: http://rextester.com/EBTIMG23993
begin
create table #enderaric (
CountryName varchar(16)
, Min_Date date
, Max_Date date
, Number int
)
insert into #enderaric values
('Argentina' ,'2015-01-04' ,'2015-04-07' ,'100')
, ('England' ,'2015-02-08' ,'2015-03-11' ,'90')
end;
-- select * from #enderaric
--*/"
declare #FromDate date;
declare #ThruDate date;
set #FromDate = '2015-01-01';
set #ThruDate = '2015-12-31';
with x as (
select top (cast(sqrt(datediff(day, #FromDate, #ThruDate)) as int) + 1)
[number]
from [master]..spt_values v
)
/* Date Range CTE */
,cal as (
select top (1+datediff(day, #FromDate, #ThruDate))
DateValue = convert(date,dateadd(day,
row_number() over (order by x.number)-1,#FromDate)
)
from x cross join x as y
order by DateValue
)
select
e.CountryName
, YearMonth = convert(char(7),left(convert(varchar(10),DateValue),7))
, [Days]=count(c.DateValue)
from #enderaric as e
inner join cal c on c.DateValue >= e.min_date
and c.DateValue <= e.max_date
group by
e.CountryName
, e.Min_Date
, e.Max_Date
, e.Number
, convert(char(7),left(convert(varchar(10),DateValue),7))
results in:
CountryName YearMonth Days
---------------- --------- -----------
Argentina 2015-01 28
Argentina 2015-02 28
Argentina 2015-03 31
Argentina 2015-04 7
England 2015-02 21
England 2015-03 11
More about calendar tables:
Aaron Bertrand - Generate a set or sequence without loops
generate-a-set-1
generate-a-set-2
generate-a-set-3
David Stein - Creating a Date Table/Dimension on SQL 2008
Michael Valentine Jones - F_TABLE_DATE