SQL - Select next date query - sql

I have a table with many IDs and many dates associated with each ID, and even a few IDs with no date. For each ID and date combination, I want to select the ID, date, and the next largest date also associated with that same ID, or null as next date if none exists.
Sample Table:
ID Date
1 5/1/10
1 6/1/10
1 7/1/10
2 6/15/10
3 8/15/10
3 8/15/10
4 4/1/10
4 4/15/10
4
Desired Output:
ID Date Next_Date
1 5/1/10 6/1/10
1 6/1/10 7/1/10
1 7/1/10
2 6/15/10
3 8/15/10
3 8/15/10
4 4/1/10 4/15/10
4 4/15/10

SELECT
mytable.id,
mytable.date,
(
SELECT
MIN(mytablemin.date)
FROM mytable AS mytablemin
WHERE mytablemin.date > mytable.date
AND mytable.id = mytablemin.id
) AS NextDate
FROM mytable
This has been tested on SQL Server 2008 R2 (but it should work on other DBMSs) and produces the following output:
id date NextDate
----------- ----------------------- -----------------------
1 2010-05-01 00:00:00.000 2010-06-01 00:00:00.000
1 2010-06-01 00:00:00.000 2010-06-15 00:00:00.000
1 2010-07-01 00:00:00.000 2010-08-15 00:00:00.000
2 2010-06-15 00:00:00.000 2010-07-01 00:00:00.000
3 2010-08-15 00:00:00.000 NULL
3 2010-08-15 00:00:00.000 NULL
4 2010-04-01 00:00:00.000 2010-04-15 00:00:00.000
4 2010-04-15 00:00:00.000 2010-05-01 00:00:00.000
4 NULL NULL
Update 1:
For those that are interested, I've compared the performance of the two variants in SQL Server 2008 R2 (one uses MIN aggregate and the other uses TOP 1 with an ORDER BY):
Without an index on the date column, the MIN version had a cost of 0.0187916 and the TOP/ORDER BY version had a cost of 0.115073 so the MIN version was "better".
With an index on the date column, they performed identically.
Note that this was testing with just these 9 records so the results could be (very) spurious...
Update 2:
The results hold for 10,000 uniformly distributed random records. The TOP/ORDER BY query takes so long to run at 100,000 records I had to cancel it and give up.

If your db is oracle, you can use lead() and lag() functions.
SELECT id, date,
LEAD(date, 1, 0) OVER (PARTITION BY ID ORDER BY Date DESC NULLS LAST) NEXT_DATE,
FROM Your_table
ORDER BY ID;

SELECT
id,
date,
( SELECT date
FROM table t1
WHERE t1.date > t2.date
ORDER BY t1.date LIMIT 1 )
FROM table t2

I think self JOIN would be faster than subselect.
WITH dates AS (
SELECT 1 AS ID, '2010-05-01' AS Date
UNION ALL SELECT 1, '2010-06-01'
UNION ALL SELECT 1, '2010-07-01'
UNION ALL SELECT 2, '2010-06-15'
UNION ALL SELECT 3, '2010-08-15'
UNION ALL SELECT 3, '2010-08-15'
UNION ALL SELECT 4, '2010-04-01'
UNION ALL SELECT 4, '2010-04-15'
UNION ALL SELECT 4, ''
)
SELECT
dates.ID,
dates.Date,
nextDates.Date AS Next_Date
FROM
dates
LEFT JOIN
dates nextDates
ON nextDates.ID = dates.ID
AND nextDates.Date > dates.Date
LEFT JOIN
dates noLower
ON noLower.ID = nextDates.ID
AND noLower.Date < nextDates.Date
AND noLower.Date > dates.Date
WHERE
dates.Date > 0
AND noLower.ID IS NULL
https://www.db-fiddle.com/f/4sWRLt2hxjik5HqiJ21ez8/1

Related

SQL : create intermediate data from date range

I have a table as shown here:
USER
ROI
DATE
1
5
2021-11-24
1
4
2021-11-26
1
6
2021-11-29
I want to get the ROI for the dates in between the other dates, expected result will be as below
From 2021-11-24 to 2021-11-30
USER
ROI
DATE
1
5
2021-11-24
1
5
2021-11-25
1
4
2021-11-26
1
4
2021-11-27
1
4
2021-11-28
1
6
2021-11-29
1
6
2021-11-30
You may use a calendar table approach here. Create a table containing all dates and then join with it. Sans an actual table, you may use an inline CTE:
WITH dates AS (
SELECT '2021-11-24' AS dt UNION ALL
SELECT '2021-11-25' UNION ALL
SELECT '2021-11-26' UNION ALL
SELECT '2021-11-27' UNION ALL
SELECT '2021-11-28' UNION ALL
SELECT '2021-11-29' UNION ALL
SELECT '2021-11-30'
),
cte AS (
SELECT USER, ROI, DATE, LEAD(DATE) OVER (ORDER BY DATE) AS NEXT_DATE
FROM yourTable
)
SELECT t.USER, t.ROI, d.dt
FROM dates d
INNER JOIN cte t
ON d.dt >= t.DATE AND (d.dt < t.NEXT_DATE OR t.NEXT_DATE IS NULL)
ORDER BY d.dt;

Frequency Distribution by Day

I have records of No. of calls coming to a call center. When a call comes into a call center a ticket is open.
So, let's say ticket 1 (T1) is open on 8/1/19 and it stays open till 8/5/19. So, if a person ran a query everyday then on 8/1 it will show 1 ticket open...same think on day 2 till day 5....I want to get records by day to see how many tickets were open for each day.....
In short, Frequency Distribution by Day.
Ticket Open_date Close_date
T1 8/1/2019 8/5/2019
T2 8/1/2019 8/6/2019
Result:
Result
Date # Tickets_Open
8/1/2019 2
8/2/2019 2
8/3/2019 2
8/4/2019 2
8/5/2019 2
8/6/2019 1
8/7/2019 0
8/8/2019 0
8/9/2019 0
8/10/2019 0
We can handle your requirement via the use of a calendar table, which stores all dates covering the full range in your data set.
WITH dates AS (
SELECT '2019-08-01' AS dt UNION ALL
SELECT '2019-08-02' UNION ALL
SELECT '2019-08-03' UNION ALL
SELECT '2019-08-04' UNION ALL
SELECT '2019-08-05' UNION ALL
SELECT '2019-08-06' UNION ALL
SELECT '2019-08-07' UNION ALL
SELECT '2019-08-08' UNION ALL
SELECT '2019-08-09' UNION ALL
SELECT '2019-08-10'
)
SELECT
d.dt,
COUNT(t.Open_date) AS num_tickets_open
FROM dates d
LEFT JOIN tickets t
ON d.dt BETWEEN t.Open_date AND t.Close_date
GROUP BY
d.dt;
Note that in practice if you expect to have this reporting requirement in the long term, you might want to replace the dates CTE above with a bona-fide table of dates.
This solution generates the list of dates from the tickets table using CTE recursion and calculates the count:
WITH Tickets(Ticket, Open_date, Close_date) AS
(
SELECT "T1", "8/1/2019", "8/5/2019"
UNION ALL
SELECT "T2", "8/1/2019", "8/6/2019"
),
Ticket_dates(Ticket, Dates) as
(
SELECT t1.Ticket, CONVERT(DATETIME, t1.Open_date)
FROM Tickets t1
UNION ALL
SELECT t1.Ticket, DATEADD(dd, 1, CONVERT(DATETIME, t1.Dates))
FROM Ticket_dates t1
inner join Tickets t2 on t1.Ticket = t2.Ticket
where DATEADD(dd, 1, CONVERT(DATETIME, t1.Dates)) <= CONVERT(DATETIME, t2.Close_date)
)
SELECT CONVERT(varchar, Dates, 1), count(*)
FROM Ticket_dates
GROUP by Dates
ORDER by Dates
A "general purpose" trick is to generate a series of numbers, which can be done using CTE's but there are many alternatives, and from that create the needed range of dates. Once that exists then you can left join your ticket data to this and then count by date.
CREATE TABLE mytable(
Ticket VARCHAR(8) NOT NULL PRIMARY KEY
,Open_date DATE NOT NULL
,Close_date DATE NOT NULL
);
INSERT INTO mytable(Ticket,Open_date,Close_date) VALUES ('T1','8/1/2019','8/5/2019');
INSERT INTO mytable(Ticket,Open_date,Close_date) VALUES ('T2','8/1/2019','8/6/2019');
Also note I am using a cross apply in this example to "attach" the min and max dates of your tickets to each numbered row. You would need to include your own logic on what data to select here.
;WITH
cteDigits AS (
SELECT 0 AS digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
)
, cteTally AS (
SELECT
[1s].digit
+ [10s].digit * 10
+ [100s].digit * 100 /* add more like this as needed */
AS num
FROM cteDigits [1s]
CROSS JOIN cteDigits [10s]
CROSS JOIN cteDigits [100s] /* add more like this as needed */
)
select
n.num + 1 rownum
, dateadd(day,n.num,ca.min_date) as on_date
, count(t.Ticket) as tickets_open
from cteTally n
cross apply (select min(Open_date), max(Close_date) from mytable) ca (min_date, max_date)
left join mytable t on dateadd(day,n.num,ca.min_date) between t.Open_date and t.Close_date
where dateadd(day,n.num,ca.min_date) <= ca.max_date
group by
n.num + 1
, dateadd(day,n.num,ca.min_date)
order by
rownum
;
result:
+--------+---------------------+--------------+
| rownum | on_date | tickets_open |
+--------+---------------------+--------------+
| 1 | 01.08.2019 00:00:00 | 2 |
| 2 | 02.08.2019 00:00:00 | 2 |
| 3 | 03.08.2019 00:00:00 | 2 |
| 4 | 04.08.2019 00:00:00 | 2 |
| 5 | 05.08.2019 00:00:00 | 2 |
| 6 | 06.08.2019 00:00:00 | 1 |
+--------+---------------------+--------------+

Get max date for each from either of 2 columns

I have a table like below
AID BID CDate
-----------------------------------------------------
1 2 2018-11-01 00:00:00.000
8 1 2018-11-08 00:00:00.000
1 3 2018-11-09 00:00:00.000
7 1 2018-11-15 00:00:00.000
6 1 2018-12-24 00:00:00.000
2 5 2018-11-02 00:00:00.000
2 7 2018-12-15 00:00:00.000
And I am trying to get a result set as follows
ID MaxDate
-------------------
1 2018-12-24 00:00:00.000
2 2018-12-15 00:00:00.000
Each value in the id columns(AID,BID) should return the max of CDate .
ex: in the case of 1, its max CDate is 2018-12-24 00:00:00.000 (here 1 appears under BID)
in the case of 2 , max date is 2018-12-15 00:00:00.000 . (here 2 is under AID)
I tried the following.
1.
select
g.AID,g.BID,
max(g.CDate) as 'LastDate'
from dbo.TT g
inner join
(select AID,BID,max(CDate) as maxdate
from dbo.TT
group by AID,BID)a
on (a.AID=g.AID or a.BID=g.BID)
and a.maxdate=g.CDate
group by g.AID,g.BID
and 2.
SELECT
AID,
CDate
FROM (
SELECT
*,
max_date = MAX(CDate) OVER (PARTITION BY [AID])
FROM dbo.TT
) AS s
WHERE CDate= max_date
Please suggest a 3rd solution.
You can assemble the data in a table expression first, and the compute the max for each value is simple. For example:
select
id, max(cdate)
from (
select aid as id, cdate from t
union all
select bid, cdate from t
) x
group by id
You seem to only care about values that are in both columns. If this interpretation is correct, then:
select id, max(cdate)
from ((select aid as id, cdate, 1 as is_a, 0 as is_b
from t
) union all
(select bid as id, cdate, 1 as is_a, 0 as is_b
from t
)
) ab
group by id
having max(is_a) = 1 and max(is_b) = 1;

SQL Server 2008 - DateDiff

This is an example of some data
id CurrentMailPromoDate
1 1/1/2013
2 3/1/2013
3 6/9/2013
4 6/10/2013
5 9/18/2013
6 12/27/2013
7 12/27/2013
What I need is to extract only those id's where the date diff between the current and previous record >= 100
In other words the result set would be :
id CurrentMailPromoDate
1 1/1/2013 **initial record**
3 6/9/2013
5 9/18/2013
6 12/27/2013
7 12/27/2013
The challenge is getting the previous date. In SQL Server 2012, you can use lag(), but this is not supported in SQL Server 2008. Instead, I use a correlated subquery to get the previous date.
The following returns the set you are looking for:
with t as (
select 1 as id, CAST('2013-01-01' as DATE) as CurrentMailPromoDate union all
select 2, '2013-03-01' union all
select 3, '2013-06-09' union all
select 4, '2013-06-10' union all
select 5, '2013-09-18' union all
select 6, '2013-12-27' union all
select 7, '2013-12-27')
select t.id, t.CurrentMailPromoDate
from (select t.*,
(select top 1 CurrentMailPromoDate
from t t2
where t2.CurrentMailPromoDate < t.CurrentMailPromoDate
order by CurrentMailPromoDate desc
) as prevDate
from t
) t
where DATEDIFF(dd, prevDate, CurrentMailPromoDate) >= 100 or prevDate is null;
EDIT:
The subquery is returning the previous date for each record. For a given record, it is looking at all the dates less than the date on the record. These are sorted in descending order and the first one is picked. This is prevDate.
The where clause just filters out anything that is less than 100 days from the previous date.

Compare values with in a table in SQl

This is MyTable structure
MyTable : ID, Date
Select ID, Date from MyTable
ID Date
50 2013-01-01 00:00:00.000
51 2013-01-02 00:00:00.000
52 2013-01-02 00:00:00.000
I need to get the result like this.
ID Date
50 2013-01-01 00:00:00.000
50 2013-01-02 00:00:00.000
50 2013-01-03 00:00:00.000
51 2013-01-02 00:00:00.000
51 2013-01-03 00:00:00.000
52 2013-01-03 00:00:00.000
How do i get the results ?
Seems like you want to get a list of dates. Of so, then you can use a recursive CTE to get the list of dates:
;with data(id, date) as
(
select id, date
from mytable
union all
select id, dateadd(day, 1, date)
from data
where dateadd(day, 1, date) <= '1/3/2013'
)
select *
from data
order by id
See SQL Fiddle with Demo. This will generate the list of dates for each ID between the current date in your table and the end date that you provide. In my example it is 1/3/2013
I think this will work a bit faster ONLY if there is a CLUSTERED INDEX on the ID column, else bluefeet's solution is fantastic!
SELECT T1.ID
,Date = DATEADD(DAY, -ROW_NUMBER() OVER (PARTITION BY T1.ID ORDER BY (SELECT NULL))+1, '2013-01-03')
FROM MyTable T1
JOIN MyTable T2 ON T1.ID <= T2.ID