Join Generated Date Sequence - sql

I'm trying to join a date table to a ledger table so I can fill the gaps in the ledger whenever there are no transactions on a given day (e.g. there are transactions on March 1st and March 3rd, but none on March 2nd; after joining both tables, March 2nd would appear in the result with 0 for the variable we're analyzing).
The challenge is that I can't create a Date object/table/dimension because I don't have permissions to create tables in the database. Therefore I've been generating a date sequence with this code:
DECLARE @startDate date = CAST('2016-01-01' AS date),
        @endDate date = CAST(GETDATE() AS date);
SELECT DATEADD(day, number - 1, @startDate) AS [Date]
FROM (
    SELECT ROW_NUMBER() OVER (
        ORDER BY n.object_id
    )
    FROM sys.all_objects n
) S(number)
WHERE number <= DATEDIFF(day, @startDate, @endDate) + 1;
So, is there the possibility to join both tables into the same statement? Let's say the ledger table looks like this:
SELECT
date,cost
FROM ledger
I'd assume it can be done by using a subquery but I don't know how.
Thank you.

There is a very good article by Aaron Bertrand showing several methods for generating a sequence of numbers (or dates) in SQL Server: Generate a set or sequence without loops – part 1.
Try them out and see for yourself which is faster or more convenient to you. (spoiler - Recursive CTE is rather slow)
Once you've picked your preferred method you can wrap it in a CTE (common-table expression).
Here I'll use your method from the question
WITH
CTE_Dates
AS
(
    SELECT
        DATEADD(day, number - 1, @startDate) AS dt
    FROM (
        SELECT ROW_NUMBER() OVER (
            ORDER BY n.object_id
        )
        FROM sys.all_objects n
    ) S(number)
    WHERE number <= DATEDIFF(day, @startDate, @endDate) + 1
)
SELECT
    ...
FROM
    CTE_Dates
    LEFT JOIN Ledger ON Ledger.dt = CTE_Dates.dt
;

You can use your generated date sequence as a CTE and LEFT JOIN that to your ledger table. For example:
DECLARE @startDate date = CAST('2020-02-01' AS date);
DECLARE @endDate date = CAST(GETDATE() AS date);
WITH dates AS (
    SELECT DATEADD(day, number - 1, @startDate) AS [Date]
    FROM (
        SELECT ROW_NUMBER() OVER (
            ORDER BY n.object_id
        )
        FROM sys.all_objects n
    ) S(number)
    WHERE number <= DATEDIFF(day, @startDate, @endDate) + 1
)
SELECT dates.Date, COALESCE(ledger.cost, 0) AS cost
FROM dates
LEFT JOIN (VALUES ('2020-02-02', 14), ('2020-02-05', 10)) AS ledger([Date], [cost]) ON dates.Date = ledger.Date
Output:
Date cost
2020-02-01 0
2020-02-02 14
2020-02-03 0
2020-02-04 0
2020-02-05 10
2020-02-06 0
Demo on dbfiddle

Related

Finding Active Clients By Date

I'm having trouble writing a recursive function that would count the number of active clients on any given day.
Say I have a table like this:
Client  Start Date  End Date
1       1-Jan-22
2       1-Jan-22    3-Jan-22
3       3-Jan-22
4       4-Jan-22    5-Jan-22
5       4-Jan-22    6-Jan-22
6       7-Jan-22    9-Jan-22
I want to return a table that would look like this:
Date      NumActive
1-Jan-22  2
2-Jan-22  2
3-Jan-22  3
4-Jan-22  4
5-Jan-22  4
6-Jan-22  3
7-Jan-22  3
8-Jan-22  3
9-Jan-22  4
Is there a way to do this? Ideally, I'd have a fixed start date and go to today's date.
Some pieces I have tried:
Creating a recursive date table
Truncated to Feb 1, 2022 for simplicity:
WITH DateDiffs AS (
SELECT DATEDIFF(DAY, '2022-02-02', GETDATE()) AS NumDays
)
, Numbers(Numbers) AS (
SELECT MAX(NumDays) FROM DateDiffs
UNION ALL
SELECT Numbers-1 FROM Numbers WHERE Numbers > 0
)
, Dates AS (
SELECT
Numbers
, DATEADD(DAY, -Numbers, CAST(GETDATE() -1 AS DATE)) AS [Date]
FROM Numbers
)
I would like to loop over the dates in that table, running the query below once per date (for example via a variable such as @loopdate), and then UNION ALL the results into a larger final query.
I'm now stuck on how to run the query that counts the number of active users:
SELECT
COUNT(Client)
FROM clients
WHERE [Start Date] >= @loopdate AND ([End Date] <= @loopdate OR [End Date] IS NULL)
Thank you!
You don't need anything recursive in this particular case. At a minimum you need a list of dates covering the range you want to report on, ideally from a permanent calendar table.
For demonstration purposes you can generate one on the fly and outer join (here, OUTER APPLY) against it, like so:
with dates as (
select top(9)
Convert(date,DateAdd(day, -1 + Row_Number() over(order by (select null)), '20220101')) dt
from master.dbo.spt_values
)
select d.dt [Date], c.NumActive
from dates d
outer apply (
select Count(*) NumActive
from t
where d.dt >= t.StartDate and (d.dt <= t.EndDate or t.EndDate is null)
)c
See this Demo Fiddle
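As a self-contained sketch of the same idea, the query below uses the sample rows from the question and swaps OUTER APPLY for an equivalent LEFT JOIN with GROUP BY. The CTE names (clients, dates), the column names StartDate/EndDate and the inline VALUES lists are assumptions made so the example runs without any existing tables.

```sql
-- Sample data from the question, inlined so no real table is needed (assumed names).
WITH clients (Client, StartDate, EndDate) AS (
    SELECT Client, CAST(StartDate AS date), CAST(EndDate AS date)
    FROM (VALUES
        (1, '20220101', NULL),
        (2, '20220101', '20220103'),
        (3, '20220103', NULL),
        (4, '20220104', '20220105'),
        (5, '20220104', '20220106'),
        (6, '20220107', '20220109')
    ) AS v (Client, StartDate, EndDate)
),
-- Nine consecutive days starting 2022-01-01, built from a small number list.
dates (dt) AS (
    SELECT DATEADD(day, n, CAST('20220101' AS date))
    FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8)) AS nums (n)
)
SELECT d.dt AS [Date],
       COUNT(c.Client) AS NumActive  -- counts only matched clients, so empty days give 0
FROM dates AS d
LEFT JOIN clients AS c
    ON c.StartDate <= d.dt
   AND (c.EndDate >= d.dt OR c.EndDate IS NULL)
GROUP BY d.dt
ORDER BY d.dt;
```

With an inclusive end date this should give 2, 2, 3, 4, 4, 3, 3, 3, 3 for 1-Jan-22 through 9-Jan-22. Note that 9-Jan-22 comes out as 3 (clients 1, 3 and 6), not the 4 shown in the question's expected table.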

Loop Through Recordset and count if Within Date Range

Please note - this is my first post, so I apologize for anything I have missed.
I have a large event table that has a record for each time someone moves within a facility. For each day over the previous 365 days, I would like to count every person who was in the facility on that day. Essentially, I need the Average Daily Population (for everyone in the facility over the previous 365 days).
thank you.
Example Data:
PersonID ArriveDt LeaveDt Location
1111 1/1/2019 1/3/2019 ABC
1122 1/1/2019 1/5/2019 ABC
1123 1/2/2019 1/6/2019 ABC
Desired output:
Date Count
1/1/2019 2
1/2/2019 3
1/3/2019 3
1/4/2019 2
1/5/2019 2
1/6/2019 1
If you don't care about the intermediate values then really you just need to compute the total man-days and divide.
select sum(datediff(day, adjusted_start, adjusted_end) + 1) / 365.0
from T
cross apply (
    select
        dateadd(day, -365, cast(getdate() as date)),
        dateadd(day, -1, cast(getdate() as date))
) d(range_start, range_end)
cross apply (
    select
        case when ArriveDt < range_start then range_start else ArriveDt end,
        case when LeaveDt > range_end then range_end else LeaveDt end
) a(adjusted_start, adjusted_end)
-- skip stays entirely outside the window; they would otherwise add negative day counts
where ArriveDt <= range_end and LeaveDt >= range_start;
If not, then come up with a table of dates (from any of numerous sources around the internet) and join from that. An outer join allows for dates with zero census.
with dates as (
    select dt from calendar -- exercise for the reader
    where
        dt >= dateadd(day, -365, cast(getdate() as date))
        and dt <= dateadd(day, -1, cast(getdate() as date))
)
select d.dt, count(t.PersonID) -- count(*) would report 1 for days with nobody present
from dates d left outer join T t
    on d.dt between t.ArriveDt and t.LeaveDt
group by d.dt;
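As a quick sanity check on the person-days shortcut, here is a sketch that runs it against the three example rows from the question over a fixed six-day window. The variables @range_start and @range_end, the inline VALUES data and the output column names are assumptions for illustration. The original answer uses a rolling 365-day window instead.

```sql
DECLARE @range_start date = '20190101',
        @range_end   date = '20190106';

-- Example rows from the question, inlined so the query runs on its own.
WITH T (PersonID, ArriveDt, LeaveDt) AS (
    SELECT PersonID, CAST(ArriveDt AS date), CAST(LeaveDt AS date)
    FROM (VALUES
        (1111, '20190101', '20190103'),
        (1122, '20190101', '20190105'),
        (1123, '20190102', '20190106')
    ) AS v (PersonID, ArriveDt, LeaveDt)
)
SELECT
    -- each stay contributes its number of days inside the window
    SUM(DATEDIFF(day, a.adjusted_start, a.adjusted_end) + 1) AS person_days,
    SUM(DATEDIFF(day, a.adjusted_start, a.adjusted_end) + 1) * 1.0
        / (DATEDIFF(day, @range_start, @range_end) + 1) AS avg_daily_population
FROM T
CROSS APPLY (
    -- clamp each stay to the reporting window
    SELECT
        CASE WHEN T.ArriveDt < @range_start THEN @range_start ELSE T.ArriveDt END,
        CASE WHEN T.LeaveDt > @range_end THEN @range_end ELSE T.LeaveDt END
) AS a (adjusted_start, adjusted_end)
WHERE T.ArriveDt <= @range_end AND T.LeaveDt >= @range_start;
```

The stays contribute 3 + 5 + 5 = 13 person-days, which matches the sum of the daily counts in the question (2 + 3 + 3 + 2 + 2 + 1 = 13), so the average over the six days is 13 / 6, about 2.17.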
Create basic calendar table for the last X days:
IF OBJECT_ID('tempdb..#Calendar') IS NOT NULL
    DROP TABLE #Calendar;
GO
DECLARE @StartDate DATE = DATEADD(d, -365, GETDATE())
DECLARE @EndDate DATE = GETDATE()
CREATE TABLE #Calendar
(
    [CalendarDate] DATE
)
WHILE @StartDate <= @EndDate
BEGIN
    INSERT INTO #Calendar
    (
        CalendarDate
    )
    SELECT
        @StartDate
    SET @StartDate = DATEADD(dd, 1, @StartDate)
END
GO
Now CROSS JOIN with the calendar table to get one row per person per day they were at a location, then GROUP BY:
SELECT
    c.CalendarDate
    ,Location
    ,COUNT(PersonID) AS PersonCount
FROM (
    SELECT PersonID ,ArriveDt ,LeaveDt ,Location FROM dbo.[Table]
) t
CROSS JOIN #Calendar c
WHERE c.CalendarDate BETWEEN t.ArriveDt AND t.LeaveDt
GROUP BY
    c.CalendarDate
    ,Location
Edit (2019-10-08):
If you're unable to create tables, you can use temp tables instead and do a DROP IF EXISTS. You can run both of these parts in one go on the fly to generate your final report.

return a result even if there is no data for that day/ week/ month

I have an MSSQL database and I want to get a value for each day/ week/ month in separate queries.
I got this working just fine, except that for intervals where there is no data it won't return anything. Since I'm putting this in a graph, I want it to display a 0, or at least a NULL, instead of skipping days or weeks.
I don't know if it will be different for each query, but here is my daily query:
select CAST(Placements.CreatedOn AS DATE) AS date,
SUM(Placements.CommissionPerc * (Placements.PlacementFee / 100)) AS value
from [placements]
where [Placements].[CreatedOn] >= '2018-06-07' and [Placements].[CreatedOn] < '2018-06-12'
group by CAST(Placements.CreatedOn AS DATE)
order by CAST(Placements.CreatedOn AS DATE) ASC
This returns a result like:
So it returns 0 when the data is actually 0, but when data is missing there is no row at all, as for days 9, 10 and 12.
How can I fix this? Thanks.
Using a recursive CTE you can generate a list of dates, which can then be LEFT JOINed to your table.
Example:
WITH DATES2018 AS
(
SELECT CAST('2018-01-01' AS DATE) AS [date]
UNION ALL
SELECT DATEADD(day, 1, [date])
FROM DATES2018
WHERE [date] < CAST('2018-12-31' AS DATE)
)
SELECT
d.[Date],
SUM(p.CommissionPerc * (p.PlacementFee / 100.0)) AS [value]
FROM DATES2018 AS d
LEFT JOIN [Placements] AS p ON CAST(p.CreatedOn AS DATE) = d.[Date]
WHERE d.[Date] BETWEEN '2018-06-07' AND '2018-06-11'
GROUP BY d.[Date]
ORDER BY d.[Date] ASC
OPTION (MAXRECURSION 366)
But you could also just add a new permanent table with all dates and left join your table to that.
Btw, if variables are used for the start and end date, the SQL can be optimized:
DECLARE @StartDate DATE = '2018-06-07';
DECLARE @EndDate DATE = '2018-06-11';
WITH DATES AS
(
    SELECT @StartDate AS [date]
    UNION ALL
    SELECT DATEADD(day, 1, [date])
    FROM DATES
    WHERE [date] < @EndDate
)
SELECT
    d.[Date],
    SUM(p.CommissionPerc * (p.PlacementFee / 100.0)) AS [value]
FROM DATES AS d
LEFT JOIN [Placements] AS p
    ON p.CreatedOn BETWEEN CAST(@StartDate AS DATETIME) AND CAST(DATEADD(day, 1, @EndDate) AS DATETIME) AND
       CAST(p.CreatedOn AS DATE) = d.[Date]
GROUP BY d.[Date]
ORDER BY d.[Date] ASC
OPTION (MAXRECURSION 0)
A permanent calendar table would be best, but here's an example that uses a CTE to create the dates needed for a LEFT JOIN. This uses a maximum of 1,000 days but can be extended as needed.
DECLARE
    @StartDate date = '2018-06-07'
  , @EndDate date = '2018-06-12';
WITH
 t10 AS (SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) t(n))
,t1k AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1 AS num FROM t10 AS a CROSS JOIN t10 AS b CROSS JOIN t10 AS c)
,calendar AS (SELECT DATEADD(day, num, @StartDate) AS calendar_date
    FROM t1k
    WHERE num <= DATEDIFF(day, @StartDate, @EndDate)
)
SELECT
      calendar.calendar_date AS date
    , SUM( COALESCE(Placements.CommissionPerc * (Placements.PlacementFee / 100),0 ) ) AS value
FROM calendar
-- cast to date so rows with a time-of-day component still match their calendar day
LEFT JOIN [placements] ON CAST([Placements].[CreatedOn] AS date) = calendar.calendar_date
GROUP BY calendar.calendar_date
ORDER BY calendar.calendar_date ASC;
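The question also asks about weekly and monthly totals, which the answers above don't cover. A minimal sketch of the monthly case, following the same generate-then-LEFT-JOIN pattern, might look like this. The variables @StartMonth and @EndMonth, the sample_placements rows and the twelve-entry number list are all hypothetical and only there to make the example self-contained.

```sql
DECLARE @StartMonth date = '20180101',
        @EndMonth   date = '20180601';

-- Hypothetical sample rows standing in for the real Placements table.
WITH sample_placements (CreatedOn, CommissionPerc, PlacementFee) AS (
    SELECT CAST(CreatedOn AS datetime), CommissionPerc, PlacementFee
    FROM (VALUES
        ('2018-02-10T09:30:00', 10.0, 2000.0),
        ('2018-02-20T14:00:00', 20.0, 1000.0),
        ('2018-05-03T08:15:00', 15.0, 4000.0)
    ) AS v (CreatedOn, CommissionPerc, PlacementFee)
),
-- One row per month start between @StartMonth and @EndMonth (up to 12 months here).
months (month_start) AS (
    SELECT DATEADD(month, n, @StartMonth)
    FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11)) AS nums (n)
    WHERE n <= DATEDIFF(month, @StartMonth, @EndMonth)
)
SELECT m.month_start,
       COALESCE(SUM(p.CommissionPerc * (p.PlacementFee / 100.0)), 0) AS [value]
FROM months AS m
LEFT JOIN sample_placements AS p
    -- half-open range: from the month start up to, but excluding, the next month start
    ON p.CreatedOn >= m.month_start
   AND p.CreatedOn <  DATEADD(month, 1, m.month_start)
GROUP BY m.month_start
ORDER BY m.month_start;
```

Months without any placements (January, March, April and June in this sample) come back with 0 rather than being skipped. A weekly version would presumably follow the same shape with DATEADD(week, ...) and a chosen week start.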

SQL Cross Join getting all dates between date range

I have a table with the following structure:
ID: StartDate: EndDate
I want to show all dates in the date range for each ID.
Eg
ID = 1: StartDate = 01/01/2018: EndDate = 03/01/2018
ID: 1 01/01/2018
ID: 1 02/01/2018
ID: 1 03/01/2018
I think I need to use a cross join, but I'm unsure how to do this for multiple rows.
Here is the CTE for SQL Server, the syntax is somewhat different:
declare @startdate date = '2018-01-01';
declare @enddate date = '2018-03-18';
with
dates as (
    select @startdate as [date]
    union all
    select dateadd(dd, 1, [date]) from dates where [date] < @enddate
)
select [date] from dates
So I ended up using a date table and just cross-referencing that:
select *
from Date d
inner join WorkingTable w
on d.Date >= w.StartDate
and d.date < w.EndDate
In standard SQL you can use a recursive CTE:
with recursive dates as (
select date '2018-01-01' as dte
union all
select dte + interval '1 day'
from dates
where dte < date '2018-01-03'
)
select dte
from dates;
The exact syntax (whether recursive is needed, and the date functions) differs among databases. Not all databases support this standard functionality.
So far I have this working for only one id:
create table #dateTable(id int, col1 date, col2 date)
insert into #dateTable values(1,'05-May-2018','08-May-2018'), (2,'05-May-2018','05-May-2018')
select * from #dateTable
-- a WITH that is not the first statement in the batch needs a preceding semicolon
;with cte(start, ends) as (
    select start = (select top 1 col1 from #dateTable), ends = (select top 1 col2 from #dateTable)
    union all
    select DATEADD(dd,1,start), ends from cte where start <> ends
)
select start from cte option (maxrecursion 10)
I'm still working on it and will update soon.
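For the multi-row case the question asks about, one possible sketch joins every range to a list of day offsets, so each ID gets one row per date in its own range. The sample rows mirror the #dateTable values above. The CTE names (ranges, digits, nums) are assumptions, and the number list covers ranges of up to 1,000 days.

```sql
-- Sample ranges mirroring the #dateTable rows above, inlined for a self-contained run.
WITH ranges (ID, StartDate, EndDate) AS (
    SELECT ID, CAST(StartDate AS date), CAST(EndDate AS date)
    FROM (VALUES
        (1, '20180505', '20180508'),
        (2, '20180505', '20180505')
    ) AS v (ID, StartDate, EndDate)
),
digits (d) AS (
    SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v (d)
),
-- Offsets 0..999 built from three digit positions.
nums (n) AS (
    SELECT ones.d + 10 * tens.d + 100 * hundreds.d
    FROM digits AS ones
    CROSS JOIN digits AS tens
    CROSS JOIN digits AS hundreds
)
SELECT r.ID,
       DATEADD(day, n.n, r.StartDate) AS [Date]
FROM ranges AS r
JOIN nums AS n
    ON n.n <= DATEDIFF(day, r.StartDate, r.EndDate)  -- inclusive of EndDate
ORDER BY r.ID, [Date];
```

This should return four rows for ID 1 (5 May through 8 May 2018) and a single row for ID 2, with no recursion and no MAXRECURSION limit to manage.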

SELECT DateTime not in SQL

I have the following table:
oDateTime pvalue
2017-06-01 00:00:00 70
2017-06-01 01:00:00 65
2017-06-01 02:00:00 90
ff.
2017-08-01 08:00:00 98
The oDateTime field holds hourly data, so it cannot contain duplicate values.
My question is: how can I verify that the oDateTime data is complete? I mean, I need to make sure the data doesn't skip anything. It should always be hourly.
Am I missing a date? Am I missing a time?
Please advise. Thank you.
Based on this answer, you can get the missing times from your table MyLogTable like this:
DECLARE @StartDate DATETIME = '20170601', @EndDate DATETIME = '20170801'
SELECT DATEADD(hour, nbr - 1, @StartDate)
FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY c.object_id ) AS Nbr
       FROM sys.columns c
     ) nbrs
WHERE nbr - 1 <= DATEDIFF(hour, @StartDate, @EndDate) AND
      NOT EXISTS (SELECT 1 FROM MyLogTable WHERE DATEADD(hour, nbr - 1, @StartDate) = oDateTime )
If you need to check a longer period, you can add a CROSS JOIN like this:
FROM sys.columns c
CROSS JOIN sys.columns c1
It lets you check many more than the roughly one thousand records (the row count of sys.columns) in one query.
Since your table has no unique id, use row_number() in a CTE to number the rows, then self-join each row to the next one (row id + 1) and take the difference in oDateTime. This shows exactly which consecutive rows are more than one hour apart:
;with cte(oDateTime,pValue,Rid)
As
(
select *,row_number() over(order by oDateTime) from [YourTableName] t1
)
select *,datediff(HH,c1.oDateTime,c2.oDateTime) as HourDiff from cte c1
inner join cte c2
on c1.Rid=c2.Rid-1 where datediff(HH,c1.oDateTime,c2.oDateTime) >1
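On SQL Server 2012 or later, the same neighbour comparison can be sketched with LAG instead of a self-join. The readings CTE below reuses the first three rows from the question and adds a hypothetical 05:00 row to create a gap. The output column names are assumptions as well.

```sql
-- First three rows come from the question; the 05:00 row is a made-up gap for illustration.
WITH readings (oDateTime, pvalue) AS (
    SELECT CAST(oDateTime AS datetime), pvalue
    FROM (VALUES
        ('2017-06-01T00:00:00', 70),
        ('2017-06-01T01:00:00', 65),
        ('2017-06-01T02:00:00', 90),
        ('2017-06-01T05:00:00', 80)
    ) AS v (oDateTime, pvalue)
),
ordered AS (
    SELECT oDateTime,
           LAG(oDateTime) OVER (ORDER BY oDateTime) AS prev_time  -- NULL for the first row
    FROM readings
)
SELECT prev_time AS last_seen_before_gap,
       oDateTime AS first_seen_after_gap,
       DATEDIFF(hour, prev_time, oDateTime) - 1 AS missing_hours
FROM ordered
WHERE DATEDIFF(hour, prev_time, oDateTime) > 1;
```

For this sample it reports a single gap between 02:00 and 05:00 with two missing hours (03:00 and 04:00). The first row is filtered out automatically because its prev_time is NULL.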
You could use DENSE_RANK() to number the hours in a day from 1 to 24. Then all you have to do is check whether the max rank for a day is 24: if there is at least one entry for each hour, the dense ranking will have a max value of 24.
Use the following query to find the dates that have a missing oDateTime.
SELECT [date]
FROM
(
SELECT *
, CAST(oDateTime AS DATE) AS [date]
, DENSE_RANK() OVER(PARTITION BY CAST(oDateTime AS DATE) ORDER BY DATEPART(HOUR, oDateTime)) AS rank_num
FROM Test
) AS t
GROUP BY [date]
HAVING(MAX(rank_num) != 24);
If you need validation for each row of oDateTime, you could do self join based on rank and get the missing hour for each oDateTime.
Perhaps you are looking for this? This will return dates having count < 24 - which indicates a "jump"
;WITH datecount
AS ( SELECT CAST(oDateTime AS DATE) AS [date] ,
COUNT(CAST(oDateTime AS DATE)) AS [count]
FROM #temp
GROUP BY ( CAST(oDateTime AS DATE) )
)
SELECT *
FROM datecount
WHERE [count] < 24;
EDIT: Since you changed the requirement from "How to know if there is missing" to "What is the missing", here's an updated query.
DECLARE @calendar AS TABLE ( oDateTime DATETIME )
DECLARE @min DATETIME = (SELECT MIN([oDateTime]) FROM #yourTable)
DECLARE @max DATETIME = (SELECT MAX([oDateTime]) FROM #yourTable)
WHILE ( @min <= @max )
BEGIN
    INSERT INTO @calendar
    VALUES ( @min );
    SET @min = DATEADD(hh, 1, @min);
END;
SELECT t1.[oDateTime]
FROM @calendar t1
LEFT JOIN #yourTable t2 ON t1.[oDateTime] = t2.[oDateTime]
GROUP BY t1.[oDateTime]
HAVING COUNT(t2.[oDateTime]) = 0;
I first created an hourly calendar based on your MAX and MIN datetime, then compared your actual table to the calendar to find out whether there is a "jump".