How to loop through dates, 30 times to find historical information at points in time? - sql

I am trying to find out the transaction details of customers at different points in time in the past. I came up with a query but I have to change the date every single time in my declare statement. Is there a way to loop through the dates to get data back x amount of days in the past?
DECLARE @StartDate AS Date
SET @StartDate = '2018-10-30'
SELECT u.member_id AS Member_id
,CAST(MAX(pe.executed_time) AS Date) AS Max_Date
,DATEDIFF(dd,@StartDate,MAX(pe.executed_time))*-1+1 AS R
,COUNT(*) AS F
,SUM(pi.Price) AS M
FROM [user] AS u
LEFT JOIN purchase_entry AS pe
ON u.member_id=pe.member_id
LEFT JOIN purchase_item AS pi
ON pe.order_id=pi.order_id
WHERE 1=1
AND pe.[status] = 'purchase'
AND pe.executed_time <= @StartDate -- Change the declared Date to have historical information.
GROUP BY u.id, u.member_id
ORDER BY R
I need a way to make my query loop through @StartDate for, say, 30 days into the past. For example: @StartDate = '2018-11-30', then @StartDate = '2018-11-29', and so on.

You can list the dates that you want:
WITH dates as (
SELECT v.*
FROM (VALUES (CONVERT(DATE, '2018-11-30')), (CONVERT(DATE, '2018-11-29'))
) v(dte)
)
SELECT dates.dte, u.member_id, . . .
FROM dates CROSS JOIN
[user] u JOIN
purchase_entry pe
ON u.member_id = pe.member_id LEFT JOIN
purchase_item pi
ON pe.order_id = pi.order_id
WHERE pe.[status] = 'purchase' AND
pe.executed_time <= dates.dte -- only purchases up to each listed date
GROUP BY dates.dte, u.id, u.member_id
ORDER BY Recency;
For a specific period of dates, you can construct them using a recursive CTE:
with dates as (
select convert(date, '2018-11-30') as dte, 1 as n
union all
select dateadd(day, -1, dte), n + 1
from dates
where n < 30
)
. . .
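As a quick sanity check, the recursive CTE can be run on its own to see which dates it yields. This is only a minimal sketch, assuming the anchor date from the question and the column names dte and n used above:

```sql
-- Sketch: list the 30 snapshot dates, newest first.
-- The anchor date '2018-11-30' is taken from the question.
WITH dates AS (
    SELECT CONVERT(date, '2018-11-30') AS dte, 1 AS n
    UNION ALL
    SELECT DATEADD(day, -1, dte), n + 1   -- step one day back per level
    FROM dates
    WHERE n < 30
)
SELECT dte, n
FROM dates
ORDER BY dte DESC;
```

This should return 30 rows, from 2018-11-30 back to 2018-11-01. Thirty levels stay under SQL Server's default recursion limit of 100, so no MAXRECURSION hint is needed here.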

Use a tally table to generate the dates and then query:
DECLARE @StartDate AS Date
SET @StartDate = '2018-10-30';
WITH N AS (
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)),
Tally AS(
SELECT TOP 30 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1
CROSS JOIN N N2
CROSS JOIN N N3),
Dates AS(
SELECT DATEADD(DAY, -I, @StartDate) AS [Date] -- count back from @StartDate
FROM Tally
)
SELECT D.[Date] AS Snapshot_Date
,u.member_id AS Member_id
,CAST(MAX(pe.executed_time) AS Date) AS Max_Date
,DATEDIFF(dd,D.[Date],MAX(pe.executed_time))*-1+1 AS R
,COUNT(*) AS F
,SUM(pi.Price) AS M
FROM [user] AS u
CROSS JOIN Dates D
LEFT JOIN purchase_entry AS pe
ON u.member_id=pe.member_id
LEFT JOIN purchase_item AS pi
ON pe.order_id=pi.order_id
WHERE pe.[status] = 'purchase'
AND pe.executed_time <= D.[Date]
GROUP BY D.[Date], u.member_id
ORDER BY D.[Date], R;

Related

How to extrapolate dates in SQL Server to calculate the daily counts?

This is what the data looks like. It's a long table.
I need to calculate the number of people employed by day.
How do I write SQL Server logic to get this result? I tried to create a DATES table and then join, but this caused an error because the table is too big. Do I need recursive logic?
For future questions, don't post images of data. Instead, use a service like dbfiddle. I'll add a sketch of an answer anyway; with a better-prepared question you could have gotten a complete one. Here it goes:
-- extrema is the least and the greatest date in staff table
with extrema(mn, mx) as (
select least(min(hired),min(retired)) as mn
, greatest(max(hired),max(retired)) as mx
from staff
), calendar (dt) as (
-- we construct a calendar with every date between extreme values
select mn from extrema
union all
select dateadd(day, 1, dt)
from calendar
where dt < (select mx from extrema)
)
-- finally we can count the number of employed people for each such date
select dt, count(1)
from calendar c
join staff s
on c.dt between s.hired and s.retired
group by dt;
If you find yourself doing this kind of calculation often, it is a good idea to create a calendar table. You can add other attributes to it, such as whether a date is a day off in the middle of the week.
With a constraint as:
CHECK(hired <= retired)
the first part can be simplified to:
with extrema(mn, mx) as (
select min(hired) as mn
, max(retired) as mx
from staff
),
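The permanent calendar table suggested above could be built once along the following lines. This is only a sketch: the table name dbo.Calendar, the is_weekend column, and the date range are assumptions for illustration, not something from the question.

```sql
-- Hypothetical table and column names; adjust the date range to your data.
CREATE TABLE dbo.Calendar (
    dt         date NOT NULL PRIMARY KEY,
    is_weekend bit  NOT NULL
);

WITH days AS (
    SELECT CONVERT(date, '2015-01-01') AS dt
    UNION ALL
    SELECT DATEADD(day, 1, dt)
    FROM days
    WHERE dt < '2030-12-31'
)
INSERT INTO dbo.Calendar (dt, is_weekend)
SELECT dt,
       -- 1900-01-01 was a Monday, so day offsets 5 and 6 (mod 7) are Saturday
       -- and Sunday; this avoids depending on the session's DATEFIRST setting.
       CASE WHEN DATEDIFF(day, '19000101', dt) % 7 >= 5 THEN 1 ELSE 0 END
FROM days
OPTION (MAXRECURSION 0);  -- the range is far longer than the default limit of 100
```

Once the table exists, the calendar CTE in the query above can be replaced by a plain range filter on dbo.Calendar.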
Assuming current employees have a NULL retirement date:
Declare @Date1 date = '2015-01-01'
Declare @Date2 date = getdate()
Select A.Date
,HeadCount = count(B.name)
From ( Select Top (DateDiff(DAY,@Date1,@Date2)+1)
Date=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),@Date1)
From master..spt_values n1,master..spt_values n2
) A
Left Join YourTable B on A.Date >= B.Hired and A.Date <= coalesce(B.Retired,getdate())
Group BY A.Date
You need a calendar table for this. You start with the calendar, and LEFT JOIN everything else, using BETWEEN logic.
You can use a real table. Or you can generate it on the fly, like this:
WITH
L0 AS ( SELECT c = 1
FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1),(1),(1),(1)) AS D(c) ),
L1 AS ( SELECT c = 1 FROM L0 A, L0 B, L0 C, L0 D ),
Nums AS ( SELECT rownum = ROW_NUMBER() OVER(ORDER BY (SELECT 1))
FROM L1 ),
Dates AS (
SELECT TOP (DATEDIFF(day, '20141231', GETDATE()))
Date = DATEADD(day, rownum, '20141231')
FROM Nums
)
SELECT
d.Date,
NumEmployed = COUNT(t.Hired) -- counts only matched rows, so empty days show 0
FROM Dates d
LEFT JOIN YourTable t ON d.Date BETWEEN t.Hired AND t.Retired
GROUP BY
d.Date;
If your dates have a time component then you need to use >= AND < logic
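To illustrate the >= AND < remark, here is a small self-contained sketch. The table variables and sample rows are made up, and it assumes a person counts as employed on a day if they were employed at any moment during it:

```sql
-- Made-up sample rows; Hired and Retired carry a time of day here.
DECLARE @Staff TABLE (Name varchar(20), Hired datetime, Retired datetime);
INSERT INTO @Staff VALUES
    ('Ann', '2018-01-01T09:00:00', '2018-01-03T17:00:00'),
    ('Bob', '2018-01-02T13:30:00', '2018-01-02T18:00:00');

DECLARE @Days TABLE (dt date);
INSERT INTO @Days VALUES ('2018-01-01'), ('2018-01-02'), ('2018-01-03'), ('2018-01-04');

SELECT d.dt, NumEmployed = COUNT(s.Name)
FROM @Days AS d
LEFT JOIN @Staff AS s
    ON s.Hired   <  DATEADD(day, 1, CAST(d.dt AS datetime))  -- hired before the day ends
   AND s.Retired >= CAST(d.dt AS datetime)                   -- not yet retired when the day starts
GROUP BY d.dt
ORDER BY d.dt;
```

With these rows the counts come out as 1, 2, 1 and 0. A plain BETWEEN against the date alone would miss Ann on 2018-01-01, because her 09:00 hire time is later than midnight of that day.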
Try limiting the scope of your date table. In this example I have a table of dates named TallyStickDT.
SELECT dt, COUNT(name)
FROM (
SELECT dt
FROM tallystickdt
WHERE dt >= (SELECT MIN(hired) FROM #employees)
AND dt <= GETDATE()
) A
LEFT OUTER JOIN #employees E ON A.dt >= E.Hired AND A.dt <= e.retired
GROUP BY dt
ORDER BY dt

Return a result even if there is no data for that day/week/month

I have an MSSQL database and I want to get a value for each day/week/month in separate queries.
I got this working just fine, except that for intervals where there is no data it won't return anything. Since I'm putting this in a graph, I want it to display a 0, or at least a NULL, instead of skipping days or weeks.
I don't know if it will be different for each query, but here is my daily query:
select CAST(Placements.CreatedOn AS DATE) AS
date,SUM(Placements.CommissionPerc * (Placements.PlacementFee / 100)) AS value
from [placements]
where [Placements].[CreatedOn] >= '2018-06-07' and [Placements].[CreatedOn] < '2018-06-12'
group by CAST(Placements.CreatedOn AS DATE)
order by CAST(Placements.CreatedOn AS DATE) ASC
This returns a result like:
So it returns 0 when the value is actually 0, but when the data is missing there is no row at all, as for days 9, 10 and 12.
How can I fix this? Thanks.
Using a recursive CTE you can generate a list of dates.
Which can then be used to LEFT JOIN your table.
Example:
WITH DATES2018 AS
(
SELECT CAST('2018-01-01' AS DATE) AS [date]
UNION ALL
SELECT DATEADD(day, 1, [date])
FROM DATES2018
WHERE [date] < CAST('2018-12-31' AS DATE)
)
SELECT
d.[Date],
SUM(p.CommissionPerc * (p.PlacementFee / 100.0)) AS [value]
FROM DATES2018 AS d
LEFT JOIN [Placements] AS p ON CAST(p.CreatedOn AS DATE) = d.[Date]
WHERE d.[Date] BETWEEN '2018-06-07' AND '2018-06-11'
GROUP BY d.[Date]
ORDER BY d.[Date] ASC
OPTION (MAXRECURSION 366)
But you could also just add a new permanent table with all dates.
And use that table to left join your table.
Btw, if variables are used for the start and end date then that SQL can be optimized.
DECLARE @StartDate DATE = '2018-06-07';
DECLARE @EndDate DATE = '2018-06-11';
WITH DATES AS
(
SELECT @StartDate AS [date]
UNION ALL
SELECT DATEADD(day, 1, [date])
FROM DATES
WHERE [date] < @EndDate
)
SELECT
d.[Date],
SUM(p.CommissionPerc * (p.PlacementFee / 100.0)) AS [value]
FROM DATES AS d
LEFT JOIN [Placements] AS p
ON p.CreatedOn BETWEEN CAST(@StartDate AS DATETIME) AND CAST(DATEADD(day, 1, @EndDate) AS DATETIME) AND
CAST(p.CreatedOn AS DATE) = d.[Date]
GROUP BY d.[Date]
ORDER BY d.[Date] ASC
OPTION (MAXRECURSION 0)
A permanent calendar table would be best, but here's an example that uses a CTE to create the dates needed for a LEFT JOIN. This uses a maximum of 1,000 days but can be extended as needed.
DECLARE
@StartDate date = '2018-06-07'
, @EndDate date = '2018-06-12';
WITH
t10 AS (SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) t(n))
,t1k AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1 AS num FROM t10 AS a CROSS JOIN t10 AS b CROSS JOIN t10 AS c)
,calendar AS (SELECT DATEADD(day, num, @StartDate) AS calendar_date
FROM t1k
WHERE num <= DATEDIFF(day, @StartDate, @EndDate)
)
SELECT
calendar.calendar_date AS date
, SUM( COALESCE(Placements.CommissionPerc * (Placements.PlacementFee / 100),0 ) ) AS value
FROM calendar
-- match on the whole day, since CreatedOn may carry a time component
LEFT JOIN [placements] ON [Placements].[CreatedOn] >= calendar.calendar_date
AND [Placements].[CreatedOn] < DATEADD(day, 1, calendar.calendar_date)
GROUP BY calendar.calendar_date
ORDER BY calendar.calendar_date ASC;

How do I find available date ranges from date ranges

SQL Server
I have already added bookings from my hotel room management system's reservation data. I want a SQL query that retrieves the date ranges in which rooms are available, and I also want to check whether a specific date range is available.
You can use something like the following. It's not an easy query, so I'll try to explain it as simply as possible.
1. Use a recursive CTE to generate dates from a specified start date to a specified end date.
2. Join each date to the different room IDs you might have in your table to create all potential available dates.
3. Determine which dates are unavailable for each room.
4. Determine which dates are available for each room by joining all potential available dates and removing unavailable ones (point 2 vs 3).
5. Determine how to group by each range (I used a ROW_NUMBER with a DENSE_RANK).
6. Display results in intervals, for each room.
Script:
-- Period to consider
DECLARE @StartDate DATE = '2018-06-20'
DECLARE @EndDate DATE = '2018-09-01'
;WITH GeneratedDates AS
(
SELECT
GeneratedDate = @StartDate
UNION ALL
SELECT
GeneratedDate = DATEADD(DAY, 1, G.GeneratedDate)
FROM
GeneratedDates AS G
WHERE
G.GeneratedDate < @EndDate
),
ExistingRooms AS
(
SELECT DISTINCT
RoomId
FROM
HotelReservation.dbo.Reservation AS R
),
UnavailableDatesByRoom AS
(
SELECT DISTINCT
R.RoomID,
UnavailableDate = G.GeneratedDate
FROM
HotelReservation.dbo.Reservation AS R
INNER JOIN GeneratedDates AS G ON G.GeneratedDate BETWEEN R.CheckIn AND R.CheckOut
),
AvailableDaysByRoom AS
(
SELECT
AvailableDate = G.GeneratedDate,
E.RoomID,
DateRanking = ROW_NUMBER() OVER (PARTITION BY E.RoomID ORDER BY G.GeneratedDate ASC)
FROM
GeneratedDates AS G
CROSS JOIN ExistingRooms AS E
WHERE
NOT EXISTS (
SELECT
'unavailable date for that room'
FROM
UnavailableDatesByRoom AS U
WHERE
U.RoomID = E.RoomID AND
G.GeneratedDate = U.UnavailableDate)
),
AvailableDaysByRoomGroupings AS
(
SELECT
A.*,
MagicRanking = DENSE_RANK() OVER (PARTITION BY A.RoomID ORDER BY DateRanking - DATEDIFF(DAY, '2010-01-01', A.AvailableDate))
FROM
AvailableDaysByRoom AS A
)
SELECT
G.RoomID,
FirstAvailableStartDate = MIN(G.AvailableDate),
LastAvailableStartDate = MAX(G.AvailableDate)
FROM
AvailableDaysByRoomGroupings AS G
GROUP BY
G.RoomID,
G.MagicRanking
ORDER BY
G.RoomID,
FirstAvailableStartDate
OPTION
(MAXRECURSION 32000)
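The grouping step is the least obvious part, so here is a stripped-down sketch of the idea on made-up data for a single room. It is a simplified variant rather than the exact expression used in the script: subtracting the row number from each date yields the same value for every day in a consecutive run. The table variable and the RunAnchor name are assumptions for illustration.

```sql
-- Made-up free dates for one room: two runs, 20-22 June and 25-26 June.
DECLARE @FreeDates TABLE (AvailableDate date);
INSERT INTO @FreeDates VALUES
    ('2018-06-20'), ('2018-06-21'), ('2018-06-22'),
    ('2018-06-25'), ('2018-06-26');

WITH Numbered AS (
    SELECT AvailableDate,
           -- Date minus row number is constant within a run of consecutive days.
           RunAnchor = DATEADD(DAY,
                               -CAST(ROW_NUMBER() OVER (ORDER BY AvailableDate) AS int),
                               AvailableDate)
    FROM @FreeDates
)
SELECT RangeStart = MIN(AvailableDate),
       RangeEnd   = MAX(AvailableDate)
FROM Numbered
GROUP BY RunAnchor
ORDER BY RangeStart;
```

The first three dates all map to 2018-06-19 and the last two to 2018-06-21, so the query returns the two ranges 2018-06-20 to 2018-06-22 and 2018-06-25 to 2018-06-26.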

SELECT DateTime not in SQL

I have the following table:
oDateTime pvalue
2017-06-01 00:00:00 70
2017-06-01 01:00:00 65
2017-06-01 02:00:00 90
ff.
2017-08-01 08:00:00 98
The oDateTime field holds hourly data and cannot contain duplicate values.
My question is: how can I verify that the oDateTime data is complete? I mean, I need to make sure the data has no jumps; it should always be on an hourly basis.
Am I missing the date? Am I missing the time?
Please advise. Thank you.
Based on this answer, you can get the missing times from your table MyLogTable like this:
DECLARE @StartDate DATETIME = '20170601', @EndDate DATETIME = '20170801'
SELECT DATEADD(hour, nbr - 1, @StartDate)
FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY c.object_id ) AS Nbr
FROM sys.columns c
) nbrs
WHERE nbr - 1 <= DATEDIFF(hour, @StartDate, @EndDate) AND
NOT EXISTS (SELECT 1 FROM MyLogTable WHERE DATEADD(hour, nbr - 1, @StartDate)= oDateTime )
If you need to check a longer period, you can add a CROSS JOIN like this:
FROM sys.columns c
CROSS JOIN sys.columns c1
This lets you check far more than the roughly one thousand records (the row count of the sys.columns table) in one query.
Since your table does not have a unique id, use ROW_NUMBER() in a CTE to number the rows, then self-join each row to the next one by row number and take the difference of oDateTime. This shows exactly which rows are not followed by a row one hour later.
;with cte(oDateTime,pValue,Rid)
As
(
select *,row_number() over(order by oDateTime) from [YourTableName] t1
)
select *,datediff(HH,c1.oDateTime,c2.oDateTime) as HourDiff from cte c1
inner join cte c2
on c1.Rid=c2.Rid-1 where datediff(HH,c1.oDateTime,c2.oDateTime) >1
You could use DENSE_RANK() to number the hours in a day from 1 to 24. Then all you have to do is check whether the maximum rank for a day is 24. If there is at least one entry for each hour, the dense ranking will have a maximum value of 24.
Use the following query to find the dates on which an oDateTime is missing.
SELECT [date]
FROM
(
SELECT *
, CAST(oDateTime AS DATE) AS [date]
, DENSE_RANK() OVER(PARTITION BY CAST(oDateTime AS DATE) ORDER BY DATEPART(HOUR, oDateTime)) AS rank_num
FROM Test
) AS t
GROUP BY [date]
HAVING(MAX(rank_num) != 24);
If you need validation for each row of oDateTime, you could do self join based on rank and get the missing hour for each oDateTime.
Perhaps you are looking for this? This will return dates having count < 24 - which indicates a "jump"
;WITH datecount
AS ( SELECT CAST(oDateTime AS DATE) AS [date] ,
COUNT(CAST(oDateTime AS DATE)) AS [count]
FROM #temp
GROUP BY ( CAST(oDateTime AS DATE) )
)
SELECT *
FROM datecount
WHERE [count] < 24;
EDIT: Since you changed the requirement from "How to know if there is missing" to "What is the missing", here's an updated query.
DECLARE @calendar AS TABLE ( oDateTime DATETIME )
DECLARE @min DATETIME = (SELECT MIN([oDateTime]) FROM #yourTable)
DECLARE @max DATETIME = (SELECT MAX([oDateTime]) FROM #yourTable)
WHILE ( @min <= @max )
BEGIN
INSERT INTO @calendar
VALUES ( @min );
SET @min = DATEADD(hh, 1, @min);
END;
SELECT t1.[oDateTime]
FROM @calendar t1
LEFT JOIN #yourTable t2 ON t1.[oDateTime] = t2.[oDateTime]
GROUP BY t1.[oDateTime]
HAVING COUNT(t2.[oDateTime]) = 0;
I first created an hourly calendar based on your MAX and MIN datetime, then compared your actual table to the calendar to find out whether there is a "jump".

SQL adding missing dates to query

I'm trying to add missing dates to a SQL query but it does not work.
Please can you tell me what I'm doing wrong.
I only have read only rights to database.
SQL query:
With cteDateGen AS
(
SELECT 0 as Offset, CAST(DATEADD(dd, 0, '2015-11-01') AS DATE) AS WorkDate
UNION ALL
SELECT Offset + 1, CAST(DATEADD(dd, Offset, '2015-11-05') AS DATE)
FROM cteDateGen
WHERE Offset < 100
), -- generate date from to --
cte AS (
SELECT COUNT(*) OVER() AS 'total' ,ROW_NUMBER()OVER (ORDER BY c.dt DESC) as row
, c.*
FROM clockL c
RIGHT JOIN cteDateGen d ON CAST(c.dt AS DATE) = d.WorkDate
WHERE
c.dt between '2015-11-01' AND '2015-11-05' and
--d.WorkDate BETWEEN '2015-11-01' AND '2015-11-05'
and c.id =10
) -- select user log and add missing dates --
SELECT *
FROM cte
--WHERE row BETWEEN 0 AND 15
--option (maxrecursion 0)
I think your problem is simply the dates in the CTE. You can also simplify it a bit:
With cteDateGen AS (
SELECT 0 as Offset, CAST('2015-11-01' AS DATE) AS WorkDate
UNION ALL
SELECT Offset + 1, DATEADD(day, 1, WorkDate)
-----------------------------------^
FROM cteDateGen
WHERE Offset < 100
), -- generate date from to --
cte AS
(SELECT COUNT(*) OVER () AS total,
ROW_NUMBER() OVER (ORDER BY c.dt DESC) as row,
c.*
FROM cteDateGen d LEFT JOIN
clockL c
ON CAST(c.dt AS DATE) = d.WorkDate AND c.id = 10
-----------------------------------------------^
WHERE d.WorkDate between '2015-11-01' AND '2015-11-05'
) -- select user log and add missing dates --
SELECT *
FROM cte
Notes:
Your query used a constant for the second date in the CTE. The constant was different from the first constant. Hence, it was missing some days.
I think that LEFT JOIN is much easier to follow than RIGHT JOIN. LEFT JOIN is basically "keep all rows in the first table".
The WHERE clause was undoing the outer join in any case. The c.id logic needs to move to the ON clause.
The date arithmetic in the first CTE was unnecessarily complex.
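The point about the WHERE clause undoing the outer join is easy to reproduce in isolation. Below is a minimal sketch with made-up table variables and rows (@Days and @Log are stand-ins for the date CTE and clockL, not the real tables):

```sql
-- Made-up data: three calendar days, user 10 has a log row on only one of them.
DECLARE @Days TABLE (WorkDate date);
INSERT INTO @Days VALUES ('2015-11-01'), ('2015-11-02'), ('2015-11-03');

DECLARE @Log TABLE (id int, dt datetime);
INSERT INTO @Log VALUES (10, '2015-11-02T08:15:00'), (11, '2015-11-03T09:00:00');

-- Filter in WHERE: unmatched days have l.id = NULL and are discarded,
-- so only 2015-11-02 survives and the LEFT JOIN behaves like an INNER JOIN.
SELECT d.WorkDate, l.dt
FROM @Days AS d
LEFT JOIN @Log AS l ON CAST(l.dt AS date) = d.WorkDate
WHERE l.id = 10;

-- Filter in ON: all three days are returned, with NULL dt where user 10 has no entry.
SELECT d.WorkDate, l.dt
FROM @Days AS d
LEFT JOIN @Log AS l ON CAST(l.dt AS date) = d.WorkDate AND l.id = 10
ORDER BY d.WorkDate;
```

The first query returns one row and the second returns three, which is the behavior the missing-dates report needs.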