Please note - this is my first post, so I apologize for anything I have missed.
I have a large event table with one record for each time someone moves within a facility. For each day in the previous 365 days, I would like to count every person who was in the facility on that day. Essentially, I need the Average Daily Population (for everyone in the facility over the previous 365 days).
Thank you.
Example Data:
PersonID  ArriveDt  LeaveDt   Location
1111      1/1/2019  1/3/2019  ABC
1122      1/1/2019  1/5/2019  ABC
1123      1/2/2019  1/6/2019  ABC
Desired result:
Date      Count
1/1/2019  2
1/2/2019  3
1/3/2019  3
1/4/2019  2
1/5/2019  2
1/6/2019  1
If you don't care about the intermediate values then really you just need to compute the total man-days and divide.
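select sum(datediff(day, adjusted_start, adjusted_end) + 1) / 365.0
from T
cross apply (
    -- reporting window: the 365 days ending yesterday
    select
        dateadd(day, -365, cast(getdate() as date)),
        dateadd(day, -1, cast(getdate() as date))
) d(range_start, range_end)
cross apply (
    -- clip each stay to the window
    select
        case when ArriveDt < range_start then range_start else ArriveDt end,
        case when LeaveDt > range_end then range_end else LeaveDt end
) a(adjusted_start, adjusted_end)
-- skip stays entirely outside the window; clipping them would yield negative day counts
where ArriveDt <= range_end and LeaveDt >= range_start;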
If not, then come up with a table of dates (from any of numerous sources around the internet) and join from that. An outer join allows for dates with zero census.
with dates as (
    select dt from calendar -- exercise for the reader
    where
        dt >= dateadd(day, -365, cast(getdate() as date))
        and dt <= dateadd(day, -1, cast(getdate() as date))
)
-- count a column from T rather than *, so dates with no matching rows report 0
select d.dt, count(t.PersonID)
from dates d left outer join T t
    on d.dt between t.ArriveDt and t.LeaveDt
group by d.dt;
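As a cross-check on the two queries above, here is a minimal sketch in Python (not T-SQL) showing that summing clipped person-days and summing a per-day census give the same total. It uses the three sample rows from the question and treats both ArriveDt and LeaveDt as inclusive; the function names and the six-day window are assumptions made for illustration only.

```python
from datetime import date, timedelta

# Sample stays from the question: (PersonID, ArriveDt, LeaveDt), both dates inclusive.
stays = [
    (1111, date(2019, 1, 1), date(2019, 1, 3)),
    (1122, date(2019, 1, 1), date(2019, 1, 5)),
    (1123, date(2019, 1, 2), date(2019, 1, 6)),
]


def person_days(stays, range_start, range_end):
    """Total days spent inside the window, clipping each stay to the window."""
    total = 0
    for _, arrive, leave in stays:
        start = max(arrive, range_start)
        end = min(leave, range_end)
        if start <= end:  # ignore stays entirely outside the window
            total += (end - start).days + 1
    return total


def daily_census(stays, range_start, range_end):
    """Head count for every day in the window, including days with zero."""
    n_days = (range_end - range_start).days + 1
    census = {}
    for offset in range(n_days):
        day = range_start + timedelta(days=offset)
        census[day] = sum(1 for _, arrive, leave in stays if arrive <= day <= leave)
    return census


# Hypothetical six-day window covering the sample data.
window_start, window_end = date(2019, 1, 1), date(2019, 1, 6)
census = daily_census(stays, window_start, window_end)
total = person_days(stays, window_start, window_end)
average_daily_population = total / len(census)
print(list(census.values()), total, average_daily_population)
```

Both routes add up to the same number of person-days, so the shortcut is enough when only the average is needed and the per-day counts are not.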
Create basic calendar table for the last X days:
IF OBJECT_ID('tempdb..#Calendar') IS NOT NULL
    DROP TABLE #Calendar;
GO
DECLARE @StartDate DATE = DATEADD(d, -365, GETDATE())
DECLARE @EndDate DATE = GETDATE()
CREATE TABLE #Calendar
(
    [CalendarDate] DATE
)
-- insert one row per day from @StartDate through @EndDate
WHILE @StartDate <= @EndDate
BEGIN
    INSERT INTO #Calendar
    (
        CalendarDate
    )
    SELECT
        @StartDate
    SET @StartDate = DATEADD(dd, 1, @StartDate)
END
GO
Now CROSS JOIN with the calendar table to get one row per person per day they were at a location, and then GROUP BY:
SELECT
    c.CalendarDate
    ,Location
    ,COUNT(PersonID) AS PersonCount
FROM (
    SELECT PersonID ,ArriveDt ,LeaveDt ,Location FROM dbo.[Table] -- placeholder for your event table
) t
CROSS JOIN #Calendar c
WHERE c.CalendarDate BETWEEN t.ArriveDt AND t.LeaveDt
GROUP BY
    c.CalendarDate
    ,Location
Edit (2019-10-08):
If you're unable to create tables, you can use temp tables instead and do a DROP IF EXISTS. You can run both of these parts in one go on the fly to generate your final report.
I'm having trouble writing a recursive function that would count the number of active clients on any given day.
Say I have a table like this:
Client  Start Date  End Date
1       1-Jan-22
2       1-Jan-22    3-Jan-22
3       3-Jan-22
4       4-Jan-22    5-Jan-22
5       4-Jan-22    6-Jan-22
6       7-Jan-22    9-Jan-22
I want to return a table that would look like this:
Date      NumActive
1-Jan-22  2
2-Jan-22  2
3-Jan-22  3
4-Jan-22  4
5-Jan-22  4
6-Jan-22  3
7-Jan-22  3
8-Jan-22  3
9-Jan-22  4
Is there a way to do this? Ideally, I'd have a fixed start date and go to today's date.
Some pieces I have tried:
Creating a recursive date table
Truncated to Feb 1, 2022 for simplicity:
WITH DateDiffs AS (
SELECT DATEDIFF(DAY, '2022-02-02', GETDATE()) AS NumDays
)
, Numbers(Numbers) AS (
SELECT MAX(NumDays) FROM DateDiffs
UNION ALL
SELECT Numbers-1 FROM Numbers WHERE Numbers > 0
)
, Dates AS (
SELECT
Numbers
, DATEADD(DAY, -Numbers, CAST(GETDATE() -1 AS DATE)) AS [Date]
FROM Numbers
)
I would like to loop over the dates in that table, for example by running the query below once per date (as @loopdate), and then UNION ALL the results into a larger final query.
I'm now stuck as to how I can run the query to count the number of active users:
SELECT
    COUNT(Client)
FROM clients
WHERE [Start Date] >= @loopdate AND ([End Date] <= @loopdate OR [End Date] IS NULL)
Thank you!
You don't need anything recursive in this particular case. At a minimum you need a list of dates covering the range you want to report on, ideally from a permanent calendar table.
For demonstration purposes you can create the list on the fly and use it as the outer side of the join (here an outer apply), like so:
with dates as (
select top(9)
Convert(date,DateAdd(day, -1 + Row_Number() over(order by (select null)), '20220101')) dt
from master.dbo.spt_values
)
select d.dt [Date], c.NumActive
from dates d
outer apply (
select Count(*) NumActive
from t
where d.dt >= t.StartDate and (d.dt <= t.EndDate or t.EndDate is null)
)c
See this Demo Fiddle
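To make the date condition concrete, here is a small sketch in Python (not T-SQL) of the same rule: a client is active on a day when the start date is on or before that day and the end date is either missing or on or after it. It assumes the rows in the question that list only one date (clients 1 and 3) are open-ended; `num_active` is a hypothetical helper name.

```python
from datetime import date, timedelta

# Client rows from the question: (client, start, end); None marks a missing end date.
clients = [
    (1, date(2022, 1, 1), None),
    (2, date(2022, 1, 1), date(2022, 1, 3)),
    (3, date(2022, 1, 3), None),
    (4, date(2022, 1, 4), date(2022, 1, 5)),
    (5, date(2022, 1, 4), date(2022, 1, 6)),
    (6, date(2022, 1, 7), date(2022, 1, 9)),
]


def num_active(rows, day):
    """Count rows whose range covers `day`; a missing end date never expires."""
    return sum(
        1
        for _, start, end in rows
        if start <= day and (end is None or day <= end)
    )


first_day = date(2022, 1, 1)
report = []
for offset in range(9):  # 1-Jan-22 through 9-Jan-22
    day = first_day + timedelta(days=offset)
    report.append((day, num_active(clients, day)))

for day, count in report:
    print(day.isoformat(), count)
```

Under that reading of the data, 9-Jan-22 comes out as 3 (clients 1, 3 and 6) rather than the 4 shown in the question's expected table; the other eight days match.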
Currently I'm trying to join a date table to a ledger table so I can fill the gaps in the ledger table whenever there are no transactions (e.g. there are transactions on March 1st and March 3rd, but none on March 2nd; after joining both tables, March 2nd would appear in the result with 0 for the variable we're analyzing).
The challenge is that I can't create a Date object/table/dimension because I don't have permissions to create tables in the database. Therefore I've been generating a date sequence with this code:
DECLARE @startDate date = CAST('2016-01-01' AS date),
        @endDate date = CAST(GETDATE() AS date);
SELECT DATEADD(day, number - 1, @startDate) AS [Date]
FROM (
    SELECT ROW_NUMBER() OVER (
        ORDER BY n.object_id
    )
    FROM sys.all_objects n
) S(number)
WHERE number <= DATEDIFF(day, @startDate, @endDate) + 1;
So, is there the possibility to join both tables into the same statement? Let's say the ledger table looks like this:
SELECT
date,cost
FROM ledger
I'd assume it can be done by using a subquery but I don't know how.
Thank you.
There is a very good article by Aaron Bertrand showing several methods for generating a sequence of numbers (or dates) in SQL Server: Generate a set or sequence without loops – part 1.
Try them out and see for yourself which is faster or more convenient to you. (spoiler - Recursive CTE is rather slow)
Once you've picked your preferred method you can wrap it in a CTE (common-table expression).
Here I'll use your method from the question
WITH
CTE_Dates
AS
(
    SELECT
        DATEADD(day, number - 1, @startDate) AS dt
    FROM (
        SELECT ROW_NUMBER() OVER (
            ORDER BY n.object_id
        )
        FROM sys.all_objects n
    ) S(number)
    WHERE number <= DATEDIFF(day, @startDate, @endDate) + 1
)
SELECT
    ...
FROM
    CTE_Dates
    LEFT JOIN Ledger ON Ledger.[date] = CTE_Dates.dt -- [date] is the ledger column from the question
;
You can use your generated date sequence as a CTE and LEFT JOIN that to your ledger table. For example:
DECLARE @startDate date = CAST('2020-02-01' AS date);
DECLARE @endDate date = CAST(GETDATE() AS date);
WITH dates AS (
    SELECT DATEADD(day, number - 1, @startDate) AS [Date]
    FROM (
        SELECT ROW_NUMBER() OVER (
            ORDER BY n.object_id
        )
        FROM sys.all_objects n
    ) S(number)
    WHERE number <= DATEDIFF(day, @startDate, @endDate) + 1
)
-- dates with no ledger row come back as NULL, which COALESCE turns into 0
SELECT dates.Date, COALESCE(ledger.cost, 0) AS cost
FROM dates
LEFT JOIN (VALUES ('2020-02-02', 14), ('2020-02-05', 10)) AS ledger([Date], [cost]) ON dates.Date = ledger.Date
Output:
Date cost
2020-02-01 0
2020-02-02 14
2020-02-03 0
2020-02-04 0
2020-02-05 10
2020-02-06 0
Demo on dbfiddle
Can you please help me with SQL Server? I have a table from which I am getting date-wise data.
Table structure:
Date Amount
-----------------
2019-05-04 16128.00
2019-05-06 527008.00
2019-05-07 407608.00
2019-05-10 407608.00
I want to fill in the missing dates in the result above. My expected output is shown below:
Date        Amount
-----------------
2019-05-04  16128.00
2019-05-05  0
2019-05-06  527008.00
2019-05-07  407608.00
2019-05-08  0
2019-05-09  0
2019-05-10  407608.00
Thanks in Advance
You can use a calendar helper table. Otherwise, declare a from date and a to date, like DECLARE @fromdate DATE = '20190504', @todate DATE = '20190510', and change them as needed.
CREATE TABLE #amounttable
(
    Dt DATE,
    Amount BIGINT
);
INSERT into #amounttable(Dt, Amount) VALUES('2019-05-04',16128);
INSERT into #amounttable(Dt, Amount) VALUES('2019-05-06',527008);
INSERT into #amounttable(Dt, Amount) VALUES('2019-05-07',407608);
INSERT into #amounttable(Dt, Amount) VALUES('2019-05-10',407608);
DECLARE @fromdate DATE = '20190504', @todate DATE = '20190510';
SELECT c.d as Date, Amount = COALESCE(s.Amount,0)
FROM
(
    -- one row per day from @fromdate through @todate
    SELECT TOP (DATEDIFF(DAY, @fromdate, @todate)+1)
        DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY number)-1, @fromdate)
    FROM [master].dbo.spt_values
    WHERE [type] = N'P' ORDER BY number
) AS c(d)
LEFT OUTER JOIN #amounttable AS s
    ON c.d = s.Dt
WHERE c.d >= @fromdate
    AND c.d < DATEADD(DAY, 1, @todate);
for more reference check the answer here from stack exchange
One pretty simple method is to use a recursive CTE to generate the dates and then a left join to bring in the data:
with cte as (
select min(t.dte) as dte, max(t.dte) as maxdate
from t
union all
select dateadd(day, 1, dte), maxdate
from cte
where dte < maxdate
)
select cte.dte, coalesce(t.amount, 0) as amount
from cte left join
t
on cte.dte = t.dte;
Here is a db<>fiddle.
Note that the default depth for recursion is 100, so for longer periods you should add OPTION (MAXRECURSION 0).
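The same walk from the earliest to the latest date can be sketched outside the database. The Python below (not T-SQL) steps one day at a time, as the recursive CTE does, and substitutes 0 wherever the source has no row. The sample amounts come from the question; `fill_missing_dates` is a hypothetical name, and the sketch assumes at most one row per date and a non-empty input.

```python
from datetime import date, timedelta

# Amounts from the question, keyed by date (assumes at most one row per date).
amounts = {
    date(2019, 5, 4): 16128.00,
    date(2019, 5, 6): 527008.00,
    date(2019, 5, 7): 407608.00,
    date(2019, 5, 10): 407608.00,
}


def fill_missing_dates(amounts):
    """Return (date, amount) for every day between the first and last date seen.

    Days without a source row get 0, which is what the left join plus
    coalesce produces. Assumes `amounts` is not empty.
    """
    day, last = min(amounts), max(amounts)
    filled = []
    while day <= last:
        filled.append((day, amounts.get(day, 0)))
        day += timedelta(days=1)
    return filled


filled = fill_missing_dates(amounts)
for day, amount in filled:
    print(day.isoformat(), amount)
```

A plain loop has no recursion limit, so the MAXRECURSION caveat applies only to the SQL version.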
What I need is to calculate the missing time periods within the calendar year given a table such as this in SQL:
DatesTable
|ID|DateStart |DateEnd |
1 NULL NULL
2 2015-1-1 2015-12-31
3 2015-3-1 2015-12-31
4 2015-1-1 2015-9-30
5 2015-1-1 2015-3-31
5 2015-6-1 2015-12-31
6 2015-3-1 2015-6-30
6 2015-7-1 2015-10-31
Expected return would be:
1 2015-1-1 2015-12-31
3 2015-1-1 2015-2-28
4 2015-10-1 2015-12-31
5 2015-4-1 2015-5-31
6 2015-1-1 2015-2-28
6 2015-11-1 2015-12-31
It's essentially work blocks. What I need to show is the part of the calendar year which was NOT worked. So for ID = 3, he worked from 3/1 through the rest of the year. But he did not work from 1/1 till 2/28. That's what I'm looking for.
You can do it using LEAD, LAG window functions available from SQL Server 2012+:
;WITH CTE AS (
SELECT ID,
LAG(DateEnd) OVER (PARTITION BY ID ORDER BY DateEnd) AS PrevEnd,
DateStart,
DateEnd,
LEAD(DateStart) OVER (PARTITION BY ID ORDER BY DateEnd) AS NextStart
FROM DatesTable
)
SELECT ID, DateStart, DateEnd
FROM (
-- Get interval right before current [DateStart, DateEnd] interval
SELECT ID,
CASE
WHEN DateStart IS NULL THEN '20150101'
WHEN DateStart > start THEN start
ELSE NULL
END AS DateStart,
CASE
WHEN DateStart IS NULL THEN '20151231'
WHEN DateStart > start THEN DATEADD(d, -1, DateStart)
ELSE NULL
END AS DateEnd
FROM CTE
CROSS APPLY (SELECT COALESCE(DATEADD(d, 1, PrevEnd), '20150101')) x(start)
-- If there is no next interval then get interval right after current
-- [DateStart, DateEnd] interval (up-to end of year)
UNION ALL
SELECT ID, DATEADD(d, 1, DateEnd) AS DateStart, '20151231' AS DateEnd
FROM CTE
WHERE DateStart IS NOT NULL -- Do not re-examine [Null, Null] interval
AND NextStart IS NULL -- There is no next [DateStart, DateEnd] interval
AND DateEnd < '20151231' -- Current [DateStart, DateEnd] interval
-- does not terminate on 31/12/2015
) AS t
WHERE t.DateStart IS NOT NULL
ORDER BY ID, DateStart
The idea behind the above query is simple: for every [DateStart, DateEnd] interval get 'not worked' interval right before it. If there is no interval following the current interval, then also get successive 'not worked' interval (if any).
Also note that I assume that if DateStart is NULL then DateEnd is also NULL for the same ID.
Demo here
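For readers who want to check the interval logic by hand, here is a minimal Python sketch (not T-SQL) of the same idea: walk each ID's work blocks in date order, emit the gap before each block, and finish with the gap up to the end of the year. It is a plain sweep rather than a translation of the LAG/LEAD query, it assumes every block lies inside 2015, and `not_worked` is a hypothetical name.

```python
from datetime import date, timedelta

YEAR_START, YEAR_END = date(2015, 1, 1), date(2015, 12, 31)
ONE_DAY = timedelta(days=1)

# DatesTable rows from the question: (ID, DateStart, DateEnd); None stands for NULL.
rows = [
    (1, None, None),
    (2, date(2015, 1, 1), date(2015, 12, 31)),
    (3, date(2015, 3, 1), date(2015, 12, 31)),
    (4, date(2015, 1, 1), date(2015, 9, 30)),
    (5, date(2015, 1, 1), date(2015, 3, 31)),
    (5, date(2015, 6, 1), date(2015, 12, 31)),
    (6, date(2015, 3, 1), date(2015, 6, 30)),
    (6, date(2015, 7, 1), date(2015, 10, 31)),
]


def not_worked(rows):
    """Return (ID, gap_start, gap_end) for each stretch of the year not worked."""
    blocks_by_id = {}
    for id_, start, end in rows:
        blocks = blocks_by_id.setdefault(id_, [])
        if start is not None and end is not None:  # a NULL row means no work at all
            blocks.append((start, end))

    gaps = []
    for id_ in sorted(blocks_by_id):
        cursor = YEAR_START  # first day not yet accounted for
        for start, end in sorted(blocks_by_id[id_]):
            if start > cursor:  # gap right before this block
                gaps.append((id_, cursor, start - ONE_DAY))
            cursor = max(cursor, end + ONE_DAY)
        if cursor <= YEAR_END:  # gap after the last block
            gaps.append((id_, cursor, YEAR_END))
    return gaps


gaps = not_worked(rows)
for gap in gaps:
    print(gap[0], gap[1].isoformat(), gap[2].isoformat())
```

On the question's data this yields the six rows listed under the expected return, with ID 2 absent because it worked the whole year.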
If your data is not too big, this approach will work. It expands all the days and ids and then re-groups them:
with d as (
      select cast('2015-01-01' as date) as d
      union all
      select dateadd(day, 1, d)
      from d
      where d < cast('2015-12-31' as date)
     ),
     td as (
      -- every (day, id) pair not covered by one of that id's own work blocks
      select d.d, t.id
      from d cross join
           (select distinct id from t) t
      where not exists (select 1
                        from t t2
                        where t2.id = t.id and
                              d.d between t2.startdate and t2.enddate
                       )
     )
select id, min(d) as startdate, max(d) as enddate
from (select td.*,
             dateadd(day, - row_number() over (partition by id order by d), d) as grp
      from td
     ) td
group by id, grp
order by id, grp
option (maxrecursion 0);  -- 365 levels exceed the default limit of 100
An alternative method relies on cumulative sums and similar functionality that is much easier to express in SQL Server 2012+.
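The regrouping step in the query above rests on one observation: within a run of consecutive days, the date minus its row number is constant. Here is a short Python sketch (not T-SQL) of just that step, with made-up sample days and a hypothetical function name; it numbers rows from 0 rather than 1, which does not change the grouping.

```python
from datetime import date, timedelta
from itertools import groupby


def collapse_to_ranges(days):
    """Collapse a set of days into (first_day, last_day) runs of consecutive days."""
    ordered = sorted(set(days))
    # Consecutive days share the same (day - position) value, so that
    # difference works as a group key, like `grp` in the query.
    keyed = [(day - timedelta(days=position), day) for position, day in enumerate(ordered)]
    ranges = []
    for _, group in groupby(keyed, key=lambda pair: pair[0]):
        run = [day for _, day in group]
        ranges.append((run[0], run[-1]))
    return ranges


# Hypothetical non-worked days: 1-3 April and 10-11 April 2015.
free_days = [date(2015, 4, 1) + timedelta(days=i) for i in range(3)]
free_days += [date(2015, 4, 10), date(2015, 4, 11)]

ranges = collapse_to_ranges(free_days)
for first, last in ranges:
    print(first.isoformat(), last.isoformat())
```

In the SQL version the same key is computed per id through the partition by clause, so each id gets its own set of runs.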
Somewhat simpler approach I think.
Basically create a list of dates for all work block ranges (A). Then create a list of dates for the whole year for each ID (B). Then remove the A from B. Compile the remaining list of dates into date ranges for each ID.
DECLARE @startdate DATETIME, @enddate DATETIME
SET @startdate = '2015-01-01'
SET @enddate = '2015-12-31'
--Build date ranges from remaining date list
;WITH dateRange(ID, dates, Grouping)
AS
(
    SELECT dt1.id, dt1.Dates, dt1.Dates + row_number() over (order by dt1.id asc, dt1.Dates desc) AS Grouping
    FROM
    (
        --Remove (A) from (B)
        SELECT distinct dt.ID, tmp.Dates FROM DatesTable dt
        CROSS APPLY
        (
            --GET (B) here
            SELECT DATEADD(DAY, number, @startdate) [Dates]
            FROM master..spt_values
            WHERE type = 'P' AND DATEADD(DAY, number, @startdate) <= @enddate
        ) tmp
        left join
        (
            --GET (A) here
            SELECT DISTINCT T.Id,
                D.Dates
            FROM DatesTable AS T
            INNER JOIN master..spt_values as N on N.number between 0 and datediff(day, T.DateStart, T.DateEnd)
            CROSS APPLY (select dateadd(day, N.number, T.DateStart)) as D(Dates)
            WHERE N.type ='P'
        ) dr
        ON dr.Id = dt.Id and dr.Dates = tmp.Dates
        WHERE dr.id is null
    ) dt1
)
SELECT ID, CAST(MIN(Dates) AS DATE) DateStart, CAST(MAX(Dates) AS DATE) DateEnd
FROM dateRange
GROUP BY ID, Grouping
ORDER BY ID
Here's the code:
http://sqlfiddle.com/#!3/f3615/1
I hope this helps!
I'll try to keep the specific details of my problem out of this question and focus only on the pertinent issues.
Let's say I have an Assets table with a primary key of AssetID.
I have another table called ProcessedDates with primary key PID and with additional columns AssetID, StartDate, EndDate.
I want to run a process for a list of assets between a start date and end date. Before I can run this process, I need to know which assets and which date ranges have already been processed.
For example, there are 2 entries in ProcessedDates:
AssetID StartDate EndDate
--------------------------
Asset1 Day4 day7
Asset1 Day10 Day12
I want to process Asset1 between day2 and day11. I don't need to waste time by processing on days that have already been done so in this example, I will only process asset1 from day2 to day3 and from day8 to day 9.
So what I need is a query that returns the gaps in the date ranges. In this case, the result set will be 2 lines:
AssetID StartDate EndDate
--------------------------
Asset1 day2 day3
Asset1 day8 day9
In my actual requirement I have many assetIDs. The ProcessedDates table may have multiple entries for each asset or none at all and each asset does not necessarily have the same processed dates as any other asset.
declare @StartDate date, @EndDate date -- assume these are given
--get distinct assets
select distinct AssetID into #Assets from (some query) q
--get the already processed date ranges
select p.AssetID, p.StartDate, p.EndDate
from ProcessedDates p inner join #Assets a on p.AssetID = a.AssetID
where p.StartDate between @StartDate and @EndDate
or p.EndDate between @StartDate and @EndDate
From here I have no clue how to proceed. How do I get it to return AssetID, StartDate, EndDate for all the gaps in between?
Something like this:
declare @StartDate date = '2015-01-01', @EndDate date = '2015-05-05'
declare @Assets table (AssetID varchar(50), StartDate date, EndDate date)
declare @AssetTypes table (AssetID varchar(50))
insert into @AssetTypes values
('Asset1'),
('Asset2')
insert into @Assets values
('Asset1', '2014-12-10', '2014-12-31'), -- Ignored
('Asset1', '2015-02-02', '2015-03-02'),
('Asset1', '2015-03-05', '2015-05-01'),
('Asset1', '2015-06-01', '2015-06-06') -- Ignored
;WITH Base AS (
    SELECT AT.AssetID
        , CASE WHEN A.AssetID IS NULL THEN 1 ELSE 0 END EmptyAsset
        , A.StartDate
        , A.EndDate
        , ROW_NUMBER() OVER (PARTITION BY AT.AssetID ORDER BY StartDate) RN
    FROM @AssetTypes AT
    LEFT JOIN @Assets A ON A.AssetID = AT.AssetID
    WHERE A.AssetID IS NULL -- case of totally missing asset
        OR (StartDate <= @EndDate AND EndDate >= @StartDate)
)
-- first missing range, before the first row
SELECT AssetID, @StartDate StartDate, DATEADD(dd, -1, StartDate) EndDate
FROM Base
WHERE RN = 1 AND StartDate > @StartDate
UNION ALL
-- each row joined with the next one
SELECT B1.AssetID, DATEADD(dd, 1, B1.EndDate), ISNULL(DATEADD(dd, -1, B2.StartDate), @EndDate)
FROM Base B1
LEFT JOIN Base B2 ON B2.AssetID = B1.AssetID AND B2.RN = B1.RN + 1
WHERE B1.EmptyAsset = 0
    AND (B2.AssetID IS NULL -- Last row case
        OR DATEADD(dd, 1, B1.EndDate) < B2.StartDate) -- Other rows case
    AND B1.EndDate < @EndDate -- If the range ends after @EndDate, nothing to do
UNION ALL
-- case of totally missing asset
SELECT AssetID, @StartDate, @EndDate
FROM Base
WHERE EmptyAsset = 1
The main idea is that each row is joined with the next one. A new range is generated (if necessary) between EndDate + 1 and the next StartDate - 1. There is special handling for the last row (B2.AssetID IS NULL and ISNULL(... @EndDate)). The first SELECT generates a row before the first range, and the last SELECT covers the special case of no ranges present for an asset.
As I've written in the comments, it gets ugly quite quickly.
Here's a simple version to get the result you want. I use integers as dates, and assume the min date is 0 and the max date is 999.
--DDL
create table Assets (AssetID integer, StartDate integer, EndDate integer);
insert into Assets values
(1,4,7),
(1,10,12),
(1,15,17),
(2,5,7),
(2,9,10);
with temp as(
select a1.AssetId,
a1.enddate+1 as StartDate,
coalesce(min(a2.startdate) - 1,999) as EndDate
from Assets a1
left join Assets a2
on a1.assetid = a2.assetid
and a1.enddate < a2.startdate
group by a1.assetid,a1.enddate
union all
select a.assetid,0,min(startdate) -1
from Assets a
group by a.assetid
)
select AssetId,
case when StartDate<2 then 2 else StartDate end as StartDate,
case when EndDate>11 then 11 else EndDate end as EndDate
from temp
where StartDate<=11 and EndDate>=2
order by AssetId,StartDate
The temp CTE produces all the missing ranges. Filtering those ranges to the ones that overlap Day2 through Day11, and clipping them to that window, gives the result you want.
AssetId StartDate EndDate
1 2 3
1 8 9
2 2 4
2 8 8
2 11 11
Here's the SqlFiddle Demo
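The two stages of that query, building candidate gaps and then clipping them to the requested window, can be followed in this Python sketch (not SQL). It uses the same integer days and the same 0 and 999 bounds, and assumes the processed ranges of one asset do not overlap. Unlike the query, it also drops empty gaps between back-to-back ranges; `gaps_in_window` is a hypothetical name.

```python
# Processed ranges from the answer's sample data: (AssetID, StartDate, EndDate).
processed = [(1, 4, 7), (1, 10, 12), (1, 15, 17), (2, 5, 7), (2, 9, 10)]


def gaps_in_window(processed, window_start, window_end, min_day=0, max_day=999):
    """Return unprocessed (asset, start, end) ranges clipped to the window."""
    candidates = []
    for asset in sorted({a for a, _, _ in processed}):
        ranges = sorted((s, e) for a, s, e in processed if a == asset)
        # Gap before the first processed range.
        candidates.append((asset, min_day, ranges[0][0] - 1))
        # Gap after each range, running up to the next later start (or max_day).
        for _, end in ranges:
            later_starts = [s for s, _ in ranges if s > end]
            next_start = min(later_starts) if later_starts else max_day + 1
            candidates.append((asset, end + 1, next_start - 1))

    result = []
    for asset, start, end in candidates:
        is_real_gap = start <= end  # back-to-back ranges leave no gap
        overlaps_window = start <= window_end and end >= window_start
        if is_real_gap and overlaps_window:
            result.append((asset, max(start, window_start), min(end, window_end)))
    return sorted(result)


gaps = gaps_in_window(processed, 2, 11)
for asset, start, end in gaps:
    print(asset, start, end)
```

With the window set to days 2 through 11, the sketch returns the same five rows as the result table above.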