SQL question - how to create date + hour dimension table? - sql

I would like to create a table showing hours 0 through 24 for each date since 1/1/2020 (until current). It would look something like this:
Column 1: Date from 1/1/2020 until current
Column 2: Hour 0-24, repeating for each date

Here is one way to do this using T-SQL:
with
tally as
(
select top (10000) n = row_number() over(order by (select null)) - 1 from sys.messages -- 10,000 rows covers about 27 years of dates; top 1000 would run out after roughly 2.7 years
),
calendar as
(
select [Date] = cast(dateadd(d, n, '20200101') as date) from tally where n <= datediff(d, '20200101', getdate()) -- <= so that today is included
),
[hours] as
(
select n from tally where n < 24 -- hours 0 through 23; an explicit filter is deterministic, unlike top 24 without order by
)
--select * from tally;
--select * from calendar;
select [Date] = format(c.[Date], 'M/d/yyyy'), Hrs = h.n
from [hours] h
cross join calendar c
order by c.[Date], h.n;
The tally CTE creates rows with a zero based index.
The calendar CTE creates the dates between 1/1/2020 and today.
The hours CTE creates the hours from 0 through 23.
The final query creates a Cartesian Product of calendar and hours.
This is a query that generates the desired data. If the data is to be persisted in a table, then insert logic would need to be added to the final query.
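If the rows are to be persisted, the insert logic could look roughly like the sketch below. The table name dbo.DateHour, its column types, and the primary key are assumptions made for illustration; they are not part of the original answer.

```sql
-- Hypothetical target table (name and types are assumptions).
create table dbo.DateHour
(
    [Date] date    not null,
    Hrs    tinyint not null,
    constraint PK_DateHour primary key ([Date], Hrs)
);

with
tally as
(
    -- zero-based row numbers; 10,000 rows is far more than needed for the hours
    select top (10000) n = row_number() over(order by (select null)) - 1 from sys.messages
),
calendar as
(
    select [Date] = cast(dateadd(day, n, '20200101') as date)
    from tally
    where n <= datediff(day, '20200101', getdate())
),
[hours] as
(
    select n from tally where n < 24
)
insert into dbo.DateHour ([Date], Hrs)
select c.[Date], h.n
from calendar c
cross join [hours] h;
```

Storing [Date] as a real date column (rather than the formatted string used for display) keeps sorting and joins straightforward; formatting can be applied when the table is queried.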

Related

Finding Active Clients By Date

I'm having trouble writing a recursive function that would count the number of active clients on any given day.
Say I have a table like this:
Client  Start Date  End Date
1       1-Jan-22
2       1-Jan-22    3-Jan-22
3       3-Jan-22
4       4-Jan-22    5-Jan-22
5       4-Jan-22    6-Jan-22
6       7-Jan-22    9-Jan-22
I want to return a table that would look like this:
Date      NumActive
1-Jan-22  2
2-Jan-22  2
3-Jan-22  3
4-Jan-22  4
5-Jan-22  4
6-Jan-22  3
7-Jan-22  3
8-Jan-22  3
9-Jan-22  4
Is there a way to do this? Ideally, I'd have a fixed start date and go to today's date.
Some pieces I have tried:
Creating a recursive date table
Truncated to Feb 1, 2022 for simplicity:
WITH DateDiffs AS (
SELECT DATEDIFF(DAY, '2022-02-02', GETDATE()) AS NumDays
)
, Numbers(Numbers) AS (
SELECT MAX(NumDays) FROM DateDiffs
UNION ALL
SELECT Numbers-1 FROM Numbers WHERE Numbers > 0
)
, Dates AS (
SELECT
Numbers
, DATEADD(DAY, -Numbers, CAST(GETDATE() -1 AS DATE)) AS [Date]
FROM Numbers
)
I would like to loop over the dates in that table, running the query below once per date (for example with a @loopdate variable), and then UNION ALL the results into a larger final query.
I'm now stuck as to how I can run the query to count the number of active users:
SELECT
COUNT(Client)
FROM clients
WHERE [Start Date] >= @loopdate AND ([End Date] <= @loopdate OR [End Date] IS NULL)
Thank you!
You don't need anything recursive in this particular case. At a minimum you need a list of dates covering the range you want to report on, ideally from a permanent calendar table.
For demonstration purposes you can create one on the fly and use it as the list of dates you outer join (or outer apply) to, like so:
with dates as (
select top(9)
Convert(date,DateAdd(day, -1 + Row_Number() over(order by (select null)), '20220101')) dt
from master.dbo.spt_values
)
select d.dt [Date], c.NumActive
from dates d
outer apply (
select Count(*) NumActive
from t
where d.dt >= t.StartDate and (d.dt <= t.EndDate or t.EndDate is null)
)c
See this Demo Fiddle
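To cover the asker's wish for a fixed start date running through today, the same idea can be sketched with the row count derived from the date range. This is a sketch under assumptions: the table is called t with columns Client, StartDate and EndDate (as in the query above), and the variable names @from and @to are made up for illustration.

```sql
declare @from date = '20220101';  -- assumed fixed start date
declare @to   date = getdate();   -- through today

with dates as (
    -- one row per day from @from to @to; the cross join supplies enough rows
    select top (datediff(day, @from, @to) + 1)
           dateadd(day, row_number() over (order by (select null)) - 1, @from) as dt
    from master.dbo.spt_values a
    cross join master.dbo.spt_values b
)
select d.dt as [Date], count(t.Client) as NumActive
from dates d
left join t
  on t.StartDate <= d.dt
 and (t.EndDate >= d.dt or t.EndDate is null)  -- open-ended clients stay active
group by d.dt
order by d.dt;
```

Because count(t.Client) ignores the NULLs produced by the left join, days with no active clients still appear with a count of 0.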

Every distinct Date between DateA and Date B -TSQL

I'm looking for a calendar-like query that returns the distinct dates between "Date A" and "Date A - 49 days".
Date A is a variable. If I run the query on any day from Monday to Sunday, it should give me back
the Date of the Sunday in the previous Week
the Date of the Sunday in the Week before the previous week
2 Weeks before the Previous Week
5 Weeks before the Previous Week
For example: I ran the query on '2022-01-23'
a_end: '2022-01-16' a_beginn: '2021-12-05' and every date between
b_end:'2022-01-09' b_beginn: '2021-11-29' and every date between
etc.
You could use a recursive CTE :
WITH T(d) AS (
SELECT CAST('2022-01-01' AS date)
UNION ALL
SELECT DATEADD(day, -1, d)
FROM T
WHERE d > DATEADD(day, -49, '2022-01-01') -- the last row produced is Date A - 49 days; >= would add one extra day
)
SELECT d
FROM T
-- OPTION (MAXRECURSION 1000)
The default recursion limit is 100 levels, so to generate more than about 100 days you need to set the MAXRECURSION query hint (any value up to 32767, or 0 for no limit). Beware of infinite loops when using 0.
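A hedged variant of the recursive query with Date A held in a variable might look like this. The variable name @a is an assumption; the recursion stops once Date A - 49 days has been produced, which gives 50 rows and stays under the default limit.

```sql
declare @a date = '2022-01-01';   -- "Date A" (variable name is an assumption)

with T(d) as (
    select @a                      -- anchor row: Date A itself
    union all
    select dateadd(day, -1, d)     -- step back one day at a time
    from T
    where d > dateadd(day, -49, @a)
)
select d
from T
order by d
option (maxrecursion 100);         -- explicit, although 100 is already the default
```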
You can generate a dynamic calendar table as in this example:
with FindPrevSunday as (
select
dateadd(week,datediff(week, '1900-01-07', getdate()), '1900-01-07') PrevSunday
),
JustFourRows as (
select 1 as x union all select 1 as x union all
select 1 as x union all select 1 as x
),
LotsOfRows as (
select Dte=dateadd(day, -Row_number() over (order by a.x)+1, (select PrevSunday from FindPrevSunday))
from
JustFourRows a --4
cross Join
JustFourRows b --16
cross join
JustFourRows c --64
cross join
JustFourRows d -- 256
)
select Dte
from LotsOfRows
cross join
FindPrevSunday PrevS
where Dte between dateadd(day,-48, PrevS.PrevSunday) and PrevS.PrevSunday
'1900-01-07' is a fixed reference date known to be a Sunday. datediff(week, ...) counts whole weeks, so adding that many weeks back to the reference date always lands on a Sunday. The cross joins quickly generate rows, row_number() turns them into consecutive dates counting back from that Sunday, and the final filter keeps only the range we are interested in. This example can generate up to 256 days, but you can add more cross joins if you wish.
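The reference-Sunday trick can be checked in isolation with a few hard-coded dates. The sample dates below are chosen for illustration only (2022-01-23 is a Sunday, the other two are the following Monday and Saturday).

```sql
select v.d,
       cast(dateadd(week, datediff(week, '19000107', v.d), '19000107') as date) as PrevSunday
from (values (cast('20220123' as date)),   -- Sunday
             (cast('20220124' as date)),   -- Monday
             (cast('20220129' as date))    -- Saturday
     ) v(d);
-- all three rows return 2022-01-23
```

datediff(week, ...) always treats Sunday as the first day of the week regardless of the DATEFIRST setting, which is why the result does not depend on session settings.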

How do I get the month number with the maximum number of days from the date range?

I have a table with 10 million rows, where there are two columns that contain the start date and the end date of the range. For example, 2019-09-25 and 2019-10-20. I want to extract the month number with the maximum number of days, in this example it will be 10. In addition to dates that are separated by one month, there are also such examples: 2019-07-01 and 2019-07-29 (within one month), as well as 2019-07-01 and 2019-09-05 (more than one month). How can I implement this?
Seems like you could do something like this:
SELECT CASE WHEN DATEDIFF(DAY, DATEFROMPARTS(YEAR(EndDate),MONTH(EndDate),1),EndDate) >= DATEDIFF(DAY, StartDate, EOMONTH(StartDate)) THEN DATEPART(MONTH,EndDate)
ELSE DATEPART(MONTH,StartDate)
END
FROM (VALUES('20190925','20191020'))V(StartDate,EndDate);
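The CASE expression above compares only the start month and the end month, so for a range such as 2019-07-01 to 2019-09-05 it cannot return the month in between. A sketch that counts the days of the range falling in each month it touches could look like this. The variable names and the 144-month cap are assumptions for illustration; for a 10-million-row table the same logic would need to be applied per row, for example through CROSS APPLY.

```sql
declare @start date = '20190925', @end date = '20191020';

with months as (
    -- one row per calendar month touched by the range (first day of each month)
    select top (datediff(month, @start, @end) + 1)
           dateadd(month,
                   row_number() over (order by (select null)) - 1,
                   datefromparts(year(@start), month(@start), 1)) as month_start
    from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x)
    cross join (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) b(x)  -- up to 144 months
)
select top (1) with ties
       month(month_start) as month_number,
       -- inclusive day count between the later start and the earlier end
       datediff(day,
                case when month_start > @start then month_start else @start end,
                case when eomonth(month_start) < @end then eomonth(month_start) else @end end
               ) + 1 as days_in_range
from months
order by days_in_range desc;
```

For the sample range this counts 6 days in September and 20 in October, so it returns month 10. with ties keeps every month that shares the maximum.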
Does the following fit your requirements?
You can build a table of days-in-month (this would be permanent ideally)
and then join to it using the month numbers of your min and max dates.
declare @start date='20190925', @end date='20191020';
--declare @start date='20190701', @end date='20190729';
--declare @start date='20190701', @end date='20190905';
with dim as (
select m,DAY(DATEADD(DD,-1,DATEADD(mm, DATEDIFF(mm, 0, DateFromParts(Year(GetDate()),m,1) )+1, 0)))d
from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12))m(m)
)
select top(1) with ties m
from dim
where m between Month(@start) and Month(@end)
order by d desc
You don't state how to pick the month when several months have the same number of days, so with ties includes all qualifying months.
Edit
I don't know if there is a requirement to span years (the sample data suggests not). However, with a permanent list of dates and corresponding days-in-month values (this is often part of a calendar table), a slight tweak will accommodate it.
with dim as (
select Year(@start)*100 + m m, Day(DATEADD(DD,-1,DATEADD(mm, DATEDIFF(mm, 0, DateFromParts(Year(@start),m,1) )+1, 0)))d
from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12))m(m)
union -- union (not union all) avoids duplicate rows when both dates fall in the same year
select Year(@end)*100 + m m, Day(DATEADD(DD,-1,DATEADD(mm, DATEDIFF(mm, 0, DateFromParts(Year(@end),m,1) )+1, 0)))d
from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12))m(m)
)
select top(1) with ties m
from dim
where m between Year(@start)*100 + Month(@start) and Year(@end)*100 + Month(@end)
order by d desc
You could try something like this
with
l0(n) as (
select 1 n
from (values (1),(1),(1),(1),(1),(1),(1),(1)) as v(n))
select top(1) with ties
vTable.*, calc.dt month_with_most_days
from (values ('20190925','20191020'),
('20190925','20191120')) vTable(startdate, enddate)
cross apply (values (datediff(month, vTable.startdate, vTable.enddate))) diff(mo_count)
cross apply (select top (diff.mo_count+1)
row_number() over (order by (select null)) n
from l0 l1, l0 l2, l0 l3, l0 l4) tally /* 8^4 months possible */
cross apply (values (cast(case when tally.n=1 then startdate
when tally.n=diff.mo_count+1 then enddate
else eomonth(dateadd(month, tally.n-1, startdate)) end as date))) calc(dt)
order by row_number() over (partition by startdate, enddate
order by day(calc.dt) desc);
startdate enddate month_with_most_days
20190925 20191020 2019-09-25
20190925 20191120 2019-10-31

SELECT DateTime not in SQL

I have the following table:
oDateTime pvalue
2017-06-01 00:00:00 70
2017-06-01 01:00:00 65
2017-06-01 02:00:00 90
...
2017-08-01 08:00:00 98
The oDateTime field holds hourly data, so duplicate values are impossible.
My question is: how can I check that the oDateTime data is complete? I need to make sure no hour is skipped; the data should always be on an hourly basis.
Am I missing a date? Am I missing a time?
Please advise. Thank you.
Based on this answer, you can get the missing times from your table MyLogTable like this:
DECLARE @StartDate DATETIME = '20170601', @EndDate DATETIME = '20170801'
SELECT DATEADD(hour, nbr - 1, @StartDate)
FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY c.object_id ) AS Nbr
FROM sys.columns c
) nbrs
WHERE nbr - 1 <= DATEDIFF(hour, @StartDate, @EndDate) AND
NOT EXISTS (SELECT 1 FROM MyLogTable WHERE DATEADD(hour, nbr - 1, @StartDate)= oDateTime )
If you need to check a longer period, you can add a CROSS JOIN like this:
FROM sys.columns c
CROSS JOIN sys.columns c1
This lets you check far more than the roughly one thousand rows (the row count of the sys.columns table) in one query.
Since your table has no unique id, use row_number() in a CTE to number the rows, then self-join each row to the next one by row id and take the difference of oDateTime. This shows exactly which rows are followed by a gap of more than one hour.
;with cte(oDateTime,pValue,Rid)
As
(
select *,row_number() over(order by oDateTime) from [YourTableName] t1
)
select *,datediff(HH,c1.oDateTime,c2.oDateTime) as HourDiff from cte c1
inner join cte c2
on c1.Rid=c2.Rid-1 where datediff(HH,c1.oDateTime,c2.oDateTime) >1
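On SQL Server 2012 or later, the same comparison of each row with its predecessor can be sketched with LAG() instead of a self-join. This is an alternative technique, not the answerer's method; the table name [YourTableName] is carried over from the query above and the alias prev_dt is made up for illustration.

```sql
select x.prev_dt,
       x.oDateTime,
       datediff(hour, x.prev_dt, x.oDateTime) as HourDiff
from (
    select oDateTime,
           lag(oDateTime) over (order by oDateTime) as prev_dt  -- previous reading
    from [YourTableName]
) x
where datediff(hour, x.prev_dt, x.oDateTime) > 1;  -- gap of more than one hour
```

The first row has no predecessor, so its prev_dt is NULL and the filter drops it.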
You could use DENSE_RANK() to number the hours in a day from 1 to 24. Then all you have to do is check whether the max rank for a day is 24. If there is at least one entry for each hour, the dense rank will have a max value of 24.
Use the following query to find the dates on which an oDateTime is missing.
SELECT [date]
FROM
(
SELECT *
, CAST(oDateTime AS DATE) AS [date]
, DENSE_RANK() OVER(PARTITION BY CAST(oDateTime AS DATE) ORDER BY DATEPART(HOUR, oDateTime)) AS rank_num
FROM Test
) AS t
GROUP BY [date]
HAVING(MAX(rank_num) != 24);
If you need validation for each row of oDateTime, you could do self join based on rank and get the missing hour for each oDateTime.
Perhaps you are looking for this? This will return dates having count < 24 - which indicates a "jump"
;WITH datecount
AS ( SELECT CAST(oDateTime AS DATE) AS [date] ,
COUNT(CAST(oDateTime AS DATE)) AS [count]
FROM #temp
GROUP BY ( CAST(oDateTime AS DATE) )
)
SELECT *
FROM datecount
WHERE [count] < 24;
EDIT: Since you changed the requirement from "How to know if there is missing" to "What is the missing", here's an updated query.
DECLARE @calendar AS TABLE ( oDateTime DATETIME )
DECLARE @min DATETIME = (SELECT MIN([oDateTime]) FROM #yourTable)
DECLARE @max DATETIME = (SELECT MAX([oDateTime]) FROM #yourTable)
WHILE ( @min <= @max )
BEGIN
INSERT INTO @calendar
VALUES ( @min );
SET @min = DATEADD(hh, 1, @min);
END;
SELECT t1.[oDateTime]
FROM @calendar t1
LEFT JOIN #yourTable t2 ON t1.[oDateTime] = t2.[oDateTime]
GROUP BY t1.[oDateTime]
HAVING COUNT(t2.[oDateTime]) = 0;
I first created an hourly calendar based on your MAX and MIN datetime, then compared your actual table to the calendar to find out if there is a "jump".

SQL Server - Split year into 4 weekly periods

I would like to split up the year into 13 periods with 4 weeks in each
52 weeks a year / 4 = 13 even periods
I would like each period to start on a saturday and end on a friday.
It should look like the below image
Obviously I could do this manually, but the dates would change each year and I am looking for a way to automate this with SQL rather than manually do this for each upcoming year
Is there a way to produce this yearly split automatically?
In this previous answer I show an approach to create a numbers/date table. Such a table comes in very handy in many places.
With this approach you might try something like this:
CREATE TABLE dbo.RunningNumbers(Number INT NOT NULL,CalendarDate DATE NOT NULL, CalendarYear INT NOT NULL,CalendarMonth INT NOT NULL,CalendarDay INT NOT NULL, CalendarWeek INT NOT NULL, CalendarYearDay INT NOT NULL, CalendarWeekDay INT NOT NULL);
DECLARE @CountEntries INT = 100000;
DECLARE @StartNumber INT = 0;
WITH E1(N) AS(SELECT 1 FROM(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))t(N)), --10 ^ 1
E2(N) AS(SELECT 1 FROM E1 a CROSS JOIN E1 b), -- 10 ^ 2 = 100 rows
E4(N) AS(SELECT 1 FROM E2 a CROSS JOIN E2 b), -- 10 ^ 4 = 10,000 rows
E8(N) AS(SELECT 1 FROM E4 a CROSS JOIN E4 b), -- 10 ^ 8 = 100,000,000 rows
CteTally AS
(
SELECT TOP(ISNULL(@CountEntries,1000000)) ROW_NUMBER() OVER(ORDER BY(SELECT NULL)) -1 + ISNULL(@StartNumber,0) As Nmbr
FROM E8
)
INSERT INTO dbo.RunningNumbers
SELECT CteTally.Nmbr,CalendarDate.d,CalendarExt.*
FROM CteTally
CROSS APPLY
(
SELECT DATEADD(DAY,CteTally.Nmbr,{ts'1900-01-01 00:00:00'})
) AS CalendarDate(d)
CROSS APPLY
(
SELECT YEAR(CalendarDate.d) AS CalendarYear
,MONTH(CalendarDate.d) AS CalendarMonth
,DAY(CalendarDate.d) AS CalendarDay
,DATEPART(WEEK,CalendarDate.d) AS CalendarWeek
,DATEPART(DAYOFYEAR,CalendarDate.d) AS CalendarYearDay
,DATEPART(WEEKDAY,CalendarDate.d) AS CalendarWeekDay
) AS CalendarExt;
GO
NTILE - SQL Server 2008+ will create (almost) even chunks.
This is the actual query
SELECT *,NTILE(13) OVER(ORDER BY CalendarDate) AS Periode
FROM RunningNumbers
WHERE CalendarWeekDay=6
AND CalendarDate>={d'2017-01-01'} AND CalendarDate <= {d'2017-12-31'};
GO
--Careful with existing data!
--DROP TABLE dbo.RunningNumbers;
Hint 1: Place indexes!
Hint 2: Read the link about NTILE, especially the Remark-section.
I think this will fit this case. You might think about using Prdp's approach with ROW_NUMBER() in connection with INT division. But (big advantage!) NTILE would allow PARTITION BY CalendarYear.
Hint 3: You might add a column to the table
...where you set the period's number as a fixed value. This will make future queries very easy and would allow manual correction for special cases (53rd week..)
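For comparison, the ROW_NUMBER() with INT division idea mentioned above could be sketched against the same table. This assumes dbo.RunningNumbers was filled under the default DATEFIRST 7, so that CalendarWeekDay = 6 means Friday; the year range and the alias PeriodWeekEnd are illustrative only.

```sql
-- Each Friday closes one week; every four consecutive Fridays form one period.
SELECT CalendarYear,
       CalendarDate AS PeriodWeekEnd,
       (ROW_NUMBER() OVER(PARTITION BY CalendarYear ORDER BY CalendarDate) - 1) / 4 + 1 AS Periode
FROM dbo.RunningNumbers
WHERE CalendarWeekDay = 6
  AND CalendarYear BETWEEN 2017 AND 2018;
```

Unlike NTILE(13), which always spreads the rows over 13 buckets, this pushes a 53rd Friday into a 14th period, which may or may not be what is wanted.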
Here is one way using Calendar table
DECLARE @start DATE = '2017-04-01',
@end_date DATE = '2017-12-31'
SET DATEFIRST 7;
WITH Calendar
AS (SELECT 1 AS id,
@start AS start_date,
Dateadd(dd, 6, @start) AS end_date
UNION ALL
SELECT id + 1,
Dateadd(week, 1, start_date),
Dateadd(week, 1, end_date)
FROM Calendar
WHERE end_date < @end_date)
SELECT id,
( Row_number()OVER(ORDER BY id) - 1 ) / 4 + 1 AS Period,
start_date,
end_date
FROM Calendar
OPTION (maxrecursion 0)
I generated the dates with a recursive CTE, but it is better to create a physical calendar table and use it in queries like this.
Firstly, you will never get 52 even weeks in a year, there are overlap weeks in most calendar standards. You will occasionally get a week 53.
You can tell SQL to use Saturday as the first day of the week with datefirst, then running a datepart on today's date with getdate() will tell you the week of the year:
SET datefirst 6 -- 6 is Saturday
SELECT datepart(ww,getdate()) as currentWeek
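You could then divide this by 4 and apply CEILING to get the 4-week split. Divide by 4.0 rather than 4, because DATEPART returns an int and integer division would truncate the result before CEILING is applied:
SET datefirst 6
SELECT DATEPART(ww,getdate()) as currentWeek,
CEILING(DATEPART(ww,getdate())/4.0) as four_week_split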
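A small sketch with hard-coded week numbers (the values are illustrative, not from the original answer) shows how integer and decimal division behave inside CEILING:

```sql
select w,
       ceiling(w / 4)   as int_division,     -- int / int truncates first: 0, 1, 1, 13, 13
       ceiling(w / 4.0) as four_week_split   -- decimal division: 1, 1, 2, 13, 14
from (values (1), (4), (5), (52), (53)) v(w);
```

Week 53, when it occurs, lands in a 14th period, so it needs special handling if exactly 13 periods are required.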