Calculating Timestamp Overlap Duration - sql

I have a business that rents out one particular product. I would like to know the duration in minutes of when a particular location is out of stock.
The first data set has transaction history with the location ID, rental starting time stamp, and rental ending time stamp.
The second data set has the location ID, date, and number of units available for rent that day. The number of units can change day to day as units are added/removed frequently.
I need to calculate how many minutes per day per location that all available units were out on rent.
Ex: Location A has 3 units on 2/1/2016. How many minutes (if any) on 2/1/2016 were there 3 units out on rent at the same time?
SQL is the language I need to use.
See sample data set Y below:
LOC_ID, ACT_RNTL_BGN_TS, ACT_RNTL_END_TS
A 30Jan2016 19:54:37.000 01Feb2016 10:00:24.053
A 31Jan2016 16:30:23.000 01Feb2016 9:07:06.588
A 01Feb2016 9:22:22.000 02Feb2016 10:00:23.716
A 01Feb2016 9:36:11.000 01Feb2016 11:05:34.249
A 01Feb2016 10:27:34.000 01Feb2016 12:59:29.883
A 01Feb2016 10:40:38.000 01Feb2016 15:36:27.119
A 01Feb2016 12:43:10.000 01Feb2016 14:23:15.914
A 01Feb2016 13:28:20.000 01Feb2016 14:40:15.573
A 01Feb2016 17:03:13.000 01Feb2016 19:02:57.413
A 01Feb2016 17:17:12.000 01Feb2016 18:54:14.708
Sample data set Z below:
LOC_ID, Date, Unit_Count
A 01Feb2016 3
A 02Feb2016 4
A 03Feb2016 3
B 01Feb2016 2
B 02Feb2016 2
B 03Feb2016 2
Since Location A has 3 total units on Feb 1st, the desired output would be 25 minutes, which is the total amount of time that 3 units were out on rent at the same time on Feb 1st at Location A: between 10:40am and 11:05am, 3 units were out on rent at the same time.

Here is some T-SQL code (you didn't indicate which DBMS you are using):
-- DECLARING SAMPLE DATA
DECLARE @Z table (LOC_ID char, Date date, Unit_Count int)
INSERT @Z ( LOC_ID, Date, Unit_Count ) VALUES ( 'A', '2016-02-01', 3)
INSERT @Z ( LOC_ID, Date, Unit_Count ) VALUES ( 'A', '2016-02-02', 4)
INSERT @Z ( LOC_ID, Date, Unit_Count ) VALUES ( 'A', '2016-02-03', 3)
INSERT @Z ( LOC_ID, Date, Unit_Count ) VALUES ( 'B', '2016-02-01', 2)
INSERT @Z ( LOC_ID, Date, Unit_Count ) VALUES ( 'B', '2016-02-02', 2)
INSERT @Z ( LOC_ID, Date, Unit_Count ) VALUES ( 'B', '2016-02-03', 2)
DECLARE @Y table (LOC_ID char, ACT_RNTL_BGN_TS datetime, ACT_RNTL_END_TS datetime)
INSERT @Y ( LOC_ID, ACT_RNTL_BGN_TS, ACT_RNTL_END_TS ) VALUES ('A', '2016-01-30 19:54:37.000', '2016-02-01 10:00:24.053')
INSERT @Y ( LOC_ID, ACT_RNTL_BGN_TS, ACT_RNTL_END_TS ) VALUES ('A', '2016-01-31 16:30:23.000', '2016-02-01 09:07:06.588')
INSERT @Y ( LOC_ID, ACT_RNTL_BGN_TS, ACT_RNTL_END_TS ) VALUES ('A', '2016-02-01 09:22:22.000', '2016-02-02 10:00:23.716')
INSERT @Y ( LOC_ID, ACT_RNTL_BGN_TS, ACT_RNTL_END_TS ) VALUES ('A', '2016-02-01 09:36:11.000', '2016-02-01 11:05:34.249')
INSERT @Y ( LOC_ID, ACT_RNTL_BGN_TS, ACT_RNTL_END_TS ) VALUES ('A', '2016-02-01 10:27:34.000', '2016-02-01 12:59:29.883')
INSERT @Y ( LOC_ID, ACT_RNTL_BGN_TS, ACT_RNTL_END_TS ) VALUES ('A', '2016-02-01 10:40:38.000', '2016-02-01 15:36:27.119')
INSERT @Y ( LOC_ID, ACT_RNTL_BGN_TS, ACT_RNTL_END_TS ) VALUES ('A', '2016-02-01 12:43:10.000', '2016-02-01 14:23:15.914')
INSERT @Y ( LOC_ID, ACT_RNTL_BGN_TS, ACT_RNTL_END_TS ) VALUES ('A', '2016-02-01 13:28:20.000', '2016-02-01 14:40:15.573')
INSERT @Y ( LOC_ID, ACT_RNTL_BGN_TS, ACT_RNTL_END_TS ) VALUES ('A', '2016-02-01 17:03:13.000', '2016-02-01 19:02:57.413')
INSERT @Y ( LOC_ID, ACT_RNTL_BGN_TS, ACT_RNTL_END_TS ) VALUES ('A', '2016-02-01 17:17:12.000', '2016-02-01 18:54:14.708')
;
-- START OF QUERY
WITH SplittedRentedIntervals AS ( -- Intervals that span more than one day are split, one piece per day.
    SELECT Z.LOC_ID
         , Z.Date
         , BGN = CASE WHEN Y.ACT_RNTL_BGN_TS > CAST(Z.Date AS datetime) THEN Y.ACT_RNTL_BGN_TS ELSE CAST(Z.Date AS datetime) END
         , [END] = CASE WHEN Y.ACT_RNTL_END_TS < CAST(DATEADD(DAY, 1, Z.Date) AS datetime) THEN Y.ACT_RNTL_END_TS ELSE CAST(DATEADD(DAY, 1, Z.Date) AS datetime) END
    FROM @Z Z
    JOIN @Y Y
      ON Y.LOC_ID = Z.LOC_ID
     AND (Z.Date = CAST(Y.ACT_RNTL_BGN_TS AS date)
          OR Z.Date = CAST(Y.ACT_RNTL_END_TS AS date)
          OR (Z.Date BETWEEN Y.ACT_RNTL_BGN_TS AND Y.ACT_RNTL_END_TS))
), Times AS ( -- All times that start or end any interval, including the start and end of each day.
    SELECT DISTINCT LOC_ID, Date, TIME = BGN FROM SplittedRentedIntervals
    UNION SELECT DISTINCT LOC_ID, Date, TIME = [END] FROM SplittedRentedIntervals
    UNION SELECT LOC_ID, Date, TIME = CAST(Date AS datetime) FROM @Z
    UNION SELECT LOC_ID, Date, TIME = CAST(DATEADD(DAY, 1, Date) AS datetime) FROM @Z
), OrderedTimes AS (
    SELECT LOC_ID
         , Date
         , TIME
         , NUM = ROW_NUMBER() OVER (PARTITION BY LOC_ID, Date ORDER BY TIME ASC)
    FROM Times
), Intervals AS ( -- Each interval is formed by two consecutive times.
    SELECT OT1.LOC_ID
         , OT1.Date
         , BGN = OT1.TIME
         , [END] = OT2.TIME
    FROM OrderedTimes OT1
    JOIN OrderedTimes OT2
      ON OT2.LOC_ID = OT1.LOC_ID
     AND OT2.Date = OT1.Date
     AND OT2.NUM = OT1.NUM + 1
), StockByInterval AS ( -- Intersect every time interval with the rented intervals to count rented units.
    SELECT I.LOC_ID
         , I.Date
         , I.BGN
         , I.[END]
         , STOCK = Z.Unit_Count
                 - (SELECT COUNT(*)
                    FROM SplittedRentedIntervals SRI
                    WHERE I.Date = SRI.Date
                      AND I.LOC_ID = SRI.LOC_ID
                      AND SRI.BGN < I.[END]
                      AND SRI.[END] > I.BGN)
    FROM Intervals I
    JOIN @Z Z
      ON Z.Date = I.Date
     AND Z.LOC_ID = I.LOC_ID
), WithoutStock AS ( -- Sum the minutes of the intervals where there is no stock.
    SELECT LOC_ID
         , Date
         , MinutesWithoutStock = SUM(DATEDIFF(MINUTE, BGN, [END]))
    FROM StockByInterval
    WHERE STOCK <= 0 -- Sample data has some intervals where more items are rented than are available.
    GROUP BY LOC_ID, Date
)
SELECT Z.LOC_ID
     , Z.Date
     , MinutesWithoutStock = ISNULL(WS.MinutesWithoutStock, 0)
FROM @Z Z
LEFT JOIN WithoutStock WS
  ON WS.Date = Z.Date
 AND WS.LOC_ID = Z.LOC_ID
ORDER BY Z.LOC_ID, Z.Date
Sample output
LOC_ID Date MinutesWithoutStock
------ ---------- -------------------
A 2016-02-01 374
A 2016-02-02 0
A 2016-02-03 0
B 2016-02-01 0
B 2016-02-02 0
B 2016-02-03 0
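As a cross-check of the 374 figure, the same idea (clip each rental to the day, walk through the start/end times in order, and add up the stretches where the number of active rentals reaches the unit count) can be sketched outside the database. This is only an illustrative Python sketch, not part of the original answer: the names `out_of_stock_minutes` and `minute_boundaries` are made up here, and `minute_boundaries` is an assumption-level imitation of how `DATEDIFF(MINUTE, ...)` counts minute boundaries crossed.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Rentals for location A, copied from sample data set Y.
RENTALS = [
    ("2016-01-30 19:54:37.000", "2016-02-01 10:00:24.053"),
    ("2016-01-31 16:30:23.000", "2016-02-01 09:07:06.588"),
    ("2016-02-01 09:22:22.000", "2016-02-02 10:00:23.716"),
    ("2016-02-01 09:36:11.000", "2016-02-01 11:05:34.249"),
    ("2016-02-01 10:27:34.000", "2016-02-01 12:59:29.883"),
    ("2016-02-01 10:40:38.000", "2016-02-01 15:36:27.119"),
    ("2016-02-01 12:43:10.000", "2016-02-01 14:23:15.914"),
    ("2016-02-01 13:28:20.000", "2016-02-01 14:40:15.573"),
    ("2016-02-01 17:03:13.000", "2016-02-01 19:02:57.413"),
    ("2016-02-01 17:17:12.000", "2016-02-01 18:54:14.708"),
]


def minute_boundaries(a, b):
    """Imitate DATEDIFF(MINUTE, a, b): count the minute boundaries crossed."""
    a = a.replace(second=0, microsecond=0)
    b = b.replace(second=0, microsecond=0)
    return int((b - a).total_seconds() // 60)


def out_of_stock_minutes(rentals, day, unit_count):
    """Minutes on `day` during which at least `unit_count` rentals are active."""
    day_start = datetime.strptime(day, "%Y-%m-%d")
    day_end = day_start + timedelta(days=1)
    events = []  # (time, +1 when a rental starts, -1 when it ends)
    for bgn, end in rentals:
        b = max(datetime.strptime(bgn, FMT), day_start)  # clip to the day
        e = min(datetime.strptime(end, FMT), day_end)
        if b < e:  # the rental overlaps this day
            events.append((b, 1))
            events.append((e, -1))
    events.sort()
    total, active, prev = 0, 0, day_start
    for t, delta in events:
        if active >= unit_count:  # same test as STOCK <= 0 in the query
            total += minute_boundaries(prev, t)
        active += delta
        prev = t
    return total


print(out_of_stock_minutes(RENTALS, "2016-02-01", 3))  # 374
```

Like the query, this counts every stretch with 3 or more units out, not only the stretches with exactly 3.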

Related

Partition the dates into weeks from a given date to the last date in the record

I wanted to count the time gap between two rows for the same id if the second is less than an hour after the first, and partition the count for the week.
Suppose the given date and time is 2020-07-01 08:00.
create table #Temp (
Id integer not null,
Time datetime not null
);
insert into #Temp values (1, '2020-07-01 08:00');
insert into #Temp values (1, '2020-07-01 08:01');
insert into #Temp values (1, '2020-07-01 08:06');
insert into #Temp values (1, '2020-07-01 08:30');
insert into #Temp values (1, '2020-07-08 09:35');
insert into #Temp values (1, '2020-07-15 16:10');
insert into #Temp values (1, '2020-07-15 16:20');
insert into #Temp values (1, '2020-07-17 06:40');
insert into #Temp values (1, '2020-07-17 06:41');
insert into #Temp values (2, '2020-07-01 08:30');
insert into #Temp values (2, '2020-07-01 09:26');
insert into #Temp values (2, '2020-07-01 10:25');
insert into #Temp values (2, '2020-07-09 08:30');
insert into #Temp values (2, '2020-07-09 09:26');
insert into #Temp values (2, '2020-07-09 10:25');
insert into #Temp values (3, '2020-07-21 08:30');
insert into #Temp values (3, '2020-07-21 09:26');
insert into #Temp values (3, '2020-07-21 10:25');
The weeks should extend up to the last date in the record. Here, the last date is
2020-07-21 10:25
I have to transform the output of this piece of code and divide the duration by week.
select Id, sum(datediff(minute, Time, next_ts)) as duration_minutes
from (select t.*,
lead(Time) over (partition by id order by Time) as next_ts
from #Temp t
) t
where datediff(minute, Time, next_ts) < 60
group by Id;
Output:
id duration_minutes
1 41
2 230
3 115
The desired output should divide this duration on a weekly basis,
like Week 1, Week 2, Week 3, and so on.
Desired Output:
If the
start date is 2020-07-01 08:00
end date is 2020-07-21 10:25
id | Week 1 | Week 2 | Week 3
--------------------------------------
1 | 30 | 0 | 11
2 | 115 | 115 | 0
3 | 0 | 0 | 115
similarly, if the
start date is 2020-07-08 08:00
id | Week 1 | Week 2
---------------------------
1 | 11 | 0
2 | 115 | 0
3 | 0 | 115
Is this what you want?
select Id,
1 + datediff(second, '2020-07-01 06:00', time) / (24 * 60 * 60 * 7) as week_num,
sum(datediff(minute, Time, next_ts)) as duration_minutes
from (select t.*,
lead(Time) over (partition by id order by Time) as next_ts
from Temp t
) t
where datediff(minute, Time, next_ts) < 60
group by Id, datediff(second, '2020-07-01 06:00', time) / (24 * 60 * 60 * 7)
order by id, week_num;
Here is a db<>fiddle.
I am not able to understand the logic behind the week periods. Anyway, in the example below I am using the following code to set the week:
'Week ' + CAST(DENSE_RANK() OVER (ORDER BY DATEDIFF(DAY, @FirstDate, next_ts) / 7) AS VARCHAR(12))
You can adjust it to ignore the hours, be more precise, or do something else to match your real requirements.
Apart from that, you just need to perform a dynamic PIVOT. Here is the full working example:
DROP TABLE IF EXISTS #Temp;
create table #Temp (
Id integer not null,
Time datetime not null
);
insert into #Temp values (1, '2020-07-01 08:00');
insert into #Temp values (1, '2020-07-01 08:01');
insert into #Temp values (1, '2020-07-01 08:06');
insert into #Temp values (1, '2020-07-01 08:30');
insert into #Temp values (1, '2020-07-08 09:35');
insert into #Temp values (1, '2020-07-15 16:10');
insert into #Temp values (1, '2020-07-15 16:20');
insert into #Temp values (1, '2020-07-17 06:40');
insert into #Temp values (1, '2020-07-17 06:41');
insert into #Temp values (2, '2020-07-01 08:30');
insert into #Temp values (2, '2020-07-01 09:26');
insert into #Temp values (2, '2020-07-01 10:25');
insert into #Temp values (2, '2020-07-09 08:30');
insert into #Temp values (2, '2020-07-09 09:26');
insert into #Temp values (2, '2020-07-09 10:25');
insert into #Temp values (3, '2020-07-21 08:30');
insert into #Temp values (3, '2020-07-21 09:26');
insert into #Temp values (3, '2020-07-21 10:25');
DROP TABLE IF EXISTS #TEST
CREATE TABLE #TEST
(
[ID] INT
,[week_day] VARCHAR(12)
,[time_in_minutes] BIGINT
)
DECLARE @FirstDate DATE;
SELECT @FirstDate = MIN(Time)
FROM #Temp;
INSERT INTO #TEST
select id
      ,'Week ' + CAST(DENSE_RANK() OVER (ORDER BY DATEDIFF(DAY, @FirstDate, next_ts) / 7) AS VARCHAR(12))
      ,datediff(minute, Time, next_ts)
from (select t.*,
             lead(Time) over (partition by id order by Time) as next_ts
      from #Temp t
     ) t
where datediff(minute, Time, next_ts) < 60;
-- Build the ordered, comma-separated list of week columns for the PIVOT.
DECLARE @columns NVARCHAR(MAX);
SELECT @columns = STUFF
(
    (
        SELECT ',' + QUOTENAME([week_day])
        FROM
        (
            SELECT DISTINCT CAST(REPLACE([week_day], 'Week ', '') AS INT)
                  ,[week_day]
            FROM #TEST
        ) DS ([rowID], [week_day])
        ORDER BY [rowID]
        FOR XML PATH(''), TYPE
    ).value('.', 'VARCHAR(MAX)')
    ,1
    ,1
    ,''
);
DECLARE @DynamicSQL NVARCHAR(MAX);
SET @DynamicSQL = N'
SELECT [ID], ' + @columns + '
FROM #TEST
PIVOT
(
SUM([time_in_minutes]) FOR [week_day] IN (' + @columns + ')
) PVT';
EXEC sp_executesql @DynamicSQL;
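To see which gap lands in which week without running the dynamic PIVOT, the bucketing can be sketched in Python. This is a rough illustration under its own assumptions: the helper name `weekly_gap_minutes` is invented here, a week is taken to be a 7-day block counted from the given start date and time, and a gap is assigned to the week of its first row (the T-SQL above uses `next_ts` and `DENSE_RANK`, which can number weeks differently when a week has no rows).

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

# (Id, Time) rows copied from the question's sample data.
ROWS = [
    (1, "2020-07-01 08:00"), (1, "2020-07-01 08:01"), (1, "2020-07-01 08:06"),
    (1, "2020-07-01 08:30"), (1, "2020-07-08 09:35"), (1, "2020-07-15 16:10"),
    (1, "2020-07-15 16:20"), (1, "2020-07-17 06:40"), (1, "2020-07-17 06:41"),
    (2, "2020-07-01 08:30"), (2, "2020-07-01 09:26"), (2, "2020-07-01 10:25"),
    (2, "2020-07-09 08:30"), (2, "2020-07-09 09:26"), (2, "2020-07-09 10:25"),
    (3, "2020-07-21 08:30"), (3, "2020-07-21 09:26"), (3, "2020-07-21 10:25"),
]


def weekly_gap_minutes(rows, start, max_gap=60):
    """Sum gaps shorter than `max_gap` minutes per id and per 7-day week from `start`."""
    start_dt = datetime.strptime(start, FMT)
    by_id = defaultdict(list)
    for id_, ts in rows:
        by_id[id_].append(datetime.strptime(ts, FMT))
    result = defaultdict(lambda: defaultdict(int))
    for id_, times in by_id.items():
        times.sort()
        for cur, nxt in zip(times, times[1:]):  # plays the role of LEAD()
            gap = int((nxt - cur).total_seconds() // 60)
            if cur >= start_dt and gap < max_gap:
                week = (cur - start_dt).days // 7 + 1
                result[id_][week] += gap
    return {id_: dict(weeks) for id_, weeks in result.items()}


print(weekly_gap_minutes(ROWS, "2020-07-01 08:00"))
# {1: {1: 30, 3: 11}, 2: {1: 115, 2: 115}, 3: {3: 115}}
```

Weeks with no qualifying gaps are simply absent from the result; a report would fill them with 0, as in the desired output for the 2020-07-01 start date.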

Daily totals for sawtooth pattern local maxima

I have multiple monotonic counters that can be reset ad-hoc. These counters exhibit sawtooth behavior when graphed (however they are not strictly increasing). I want a monthly report showing daily sums of the maxima for each counter.
My strategy so far is to put a '1' on the rows where the counter is less than the previous sampling of the counter (also less than or equal to the next). Then calculate a running total on that column to identify series without resets.
Then I group over the daily intervals to calculate max-min for each series in the day, then sum those portions to get grand totals for the day.
What I have works, but it takes ~10s to run. The execution plan shows two big sorts: one in cteData and I think the other is in cteSeries. I feel like I should be able to eliminate one of them but I'm at a loss how to do it.
The result of this code is (which I can now see is actually skipping a sample across the interval boundary):
interval tagname total
2020-01-01 alpha 3
2020-01-01 bravo 4
2020-01-02 alpha 3
2020-01-02 bravo 4
IF OBJECT_ID('tempdb..#counter_data') IS NOT NULL
DROP TABLE #counter_data;
CREATE TABLE #counter_data(
t_stamp DATETIME NOT NULL
,tagname VARCHAR(32) NOT NULL
,val REAL NULL
,PRIMARY KEY(t_stamp, tagname)
);
INSERT INTO #counter_data(t_stamp, tagname, val)
VALUES
('2020-01-01 04:00', 'alpha', 0)
,('2020-01-01 04:00', 'bravo', 0)
,('2020-01-01 08:00', 'alpha', 1)
,('2020-01-01 08:00', 'bravo', 1)
,('2020-01-01 12:00', 'alpha', 2)
,('2020-01-01 12:00', 'bravo', 2)
,('2020-01-01 16:00', 'alpha', 0)
,('2020-01-01 16:00', 'bravo', 3)
,('2020-01-01 20:00', 'alpha', 1)
,('2020-01-01 20:00', 'bravo', 4)
,('2020-01-02 04:00', 'alpha', 2)
,('2020-01-02 04:00', 'bravo', 5)
,('2020-01-02 08:00', 'alpha', 3)
,('2020-01-02 08:00', 'bravo', 6)
,('2020-01-02 12:00', 'alpha', 0)
,('2020-01-02 12:00', 'bravo', 7)
,('2020-01-02 16:00', 'alpha', 1)
,('2020-01-02 16:00', 'bravo', 8)
,('2020-01-02 20:00', 'alpha', 2)
,('2020-01-02 20:00', 'bravo', 9)
;
DECLARE @dateStart AS DATETIME = '2020-01-01';
DECLARE @dateEnd AS DATETIME = DATEADD(month, 2, @dateStart);
WITH cteData AS(
    SELECT
        t_stamp
        ,tagname
        ,val
        ,CASE
            WHEN val < LAG(val) OVER(PARTITION BY tagname ORDER BY t_stamp)
                AND val <= LEAD(val) OVER(PARTITION BY tagname ORDER BY t_stamp)
            THEN 1
            ELSE 0
        END AS rn
    FROM #counter_data
    WHERE
        t_stamp >= @dateStart AND t_stamp < @dateEnd
        AND tagname IN(
            'alpha'
            ,'bravo'
        )
)
,cteSeries AS(
SELECT
CAST(t_stamp AS DATE) AS interval
,tagname
,val
,SUM(rn) OVER(PARTITION BY tagname ORDER BY t_stamp) AS series
FROM cteData
)
,cteSubtotal AS(
SELECT
interval
,tagname
,MAX(val) - MIN(val) AS subtotal
FROM cteSeries
GROUP BY interval, tagname, series
)
,cteGrandTotal AS(
SELECT
interval
,tagname
,SUM(subtotal) AS total
FROM cteSubtotal
GROUP BY interval, tagname
)
SELECT *
FROM cteGrandTotal
ORDER BY interval, tagname
I would just calculate the increase of the counter in each row by comparing it to the previous row:
with cte
as
(
SELECT *, isnull(lag(val) over (partition by tagname order by t_stamp), 0) as previousVal
FROM #counter_data
)
SELECT cast(t_stamp as date) as interval, tagname, sum(case when val > previousVal then val - previousVal else val end) as total
FROM cte
GROUP BY cast(t_stamp as date), tagname;
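The row-by-row idea can be checked against the sample data with a short Python sketch. It is only an illustration: `daily_increase` is a made-up name, a drop in the counter is assumed to be a reset to zero, and an unchanged reading is treated as no increase (the `else val` branch of the query above would add the value again in that case; the sample data never hits it).

```python
from collections import defaultdict

# (t_stamp, tagname, val) rows copied from the question's sample data.
SAMPLES = [
    ("2020-01-01 04:00", "alpha", 0), ("2020-01-01 04:00", "bravo", 0),
    ("2020-01-01 08:00", "alpha", 1), ("2020-01-01 08:00", "bravo", 1),
    ("2020-01-01 12:00", "alpha", 2), ("2020-01-01 12:00", "bravo", 2),
    ("2020-01-01 16:00", "alpha", 0), ("2020-01-01 16:00", "bravo", 3),
    ("2020-01-01 20:00", "alpha", 1), ("2020-01-01 20:00", "bravo", 4),
    ("2020-01-02 04:00", "alpha", 2), ("2020-01-02 04:00", "bravo", 5),
    ("2020-01-02 08:00", "alpha", 3), ("2020-01-02 08:00", "bravo", 6),
    ("2020-01-02 12:00", "alpha", 0), ("2020-01-02 12:00", "bravo", 7),
    ("2020-01-02 16:00", "alpha", 1), ("2020-01-02 16:00", "bravo", 8),
    ("2020-01-02 20:00", "alpha", 2), ("2020-01-02 20:00", "bravo", 9),
]


def daily_increase(samples):
    """Sum the per-row counter increases by (day, tagname)."""
    totals = defaultdict(int)
    previous = {}  # last value seen per tag, like LAG(val) with a default of 0
    for t_stamp, tag, val in sorted(samples):  # ISO strings sort chronologically
        before = previous.get(tag, 0)
        # A drop is assumed to be a reset: count the new value from zero.
        totals[(t_stamp[:10], tag)] += val - before if val >= before else val
        previous[tag] = val
    return dict(totals)


for key, total in sorted(daily_increase(SAMPLES).items()):
    print(key, total)
```

On this data the second day comes out as 4 for alpha and 5 for bravo rather than 3 and 4, because the increase across midnight is attributed to the day in which the later sample falls.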
This looks like a gaps-and-islands problem. I think that you want lag() to get the "previous" value and a conditional sum to compute the daily count.
select
    tagname,
    cast(t_stamp as date) t_date,
    sum(case when val = lag_val + 1 then 1 else 0 end) total
from (
    select
        c.*,
        lag(val) over(
            partition by tagname, cast(t_stamp as date)
            order by t_stamp
        ) lag_val
    from #counter_data c
) c
group by tagname, cast(t_stamp as date)
order by t_date, tagname

Summing up the records as per given conditions

I have a table like the one below. For any particular fund, and up to any particular date, I need logic that sums the Amount values. Let's say I need the sums for 3 dates: 01/28/2015, 03/30/2015 and 04/01/2015. The logic checks how many records exist in the table up to the first date; if it finds more than one record, it sums their Amount values. For the next date, it sums up to that date, starting after the previous date it had already summed to.
Id Fund Date Amount
1 A 01/20/2015 250
2 A 02/28/2015 300
3 A 03/20/2015 400
4 A 03/30/2015 200
5 B 04/01/2015 500
6 B 04/01/2015 600
I want result to be like below
Id Fund Date SumOfAmount
1 A 02/28/2015 550
2 A 03/30/2015 600
3 B 04/01/2015 1100
Based on your question, it seems that you want to select a set of dates, and then for each fund and selected date, get the sum of the fund amounts from the selected date to the previous selected date. Here is the result set I think you should be expecting:
Fund Date SumOfAmount
A 2015-02-28 550.00
A 2015-03-30 600.00
B 2015-04-01 1100.00
Here is the code to produce this output:
DECLARE @Dates TABLE
(
    SelectedDate DATE PRIMARY KEY
)
INSERT INTO @Dates
VALUES
    ('02/28/2015')
    ,('03/30/2015')
    ,('04/01/2015')
DECLARE @FundAmounts TABLE
(
    Id INT PRIMARY KEY
    ,Fund VARCHAR(5)
    ,Date DATE
    ,Amount MONEY
);
INSERT INTO @FundAmounts
VALUES
    (1, 'A', '01/20/2015', 250)
    ,(2, 'A', '02/28/2015', 300)
    ,(3, 'A', '03/20/2015', 400)
    ,(4, 'A', '03/30/2015', 200)
    ,(5, 'B', '04/01/2015', 500)
    ,(6, 'B', '04/01/2015', 600);
SELECT
    F.Fund
    ,D.SelectedDate AS Date
    ,SUM(F.Amount) AS SumOfAmount
FROM
(
    -- Pair each selected date with the previous selected date.
    SELECT
        SelectedDate
        ,LAG(SelectedDate,1,'1/1/1900') OVER (ORDER BY SelectedDate ASC) AS PreviousDate
    FROM @Dates
) D
JOIN
    @FundAmounts F
ON
    F.Date BETWEEN DATEADD(DAY,1,D.PreviousDate) AND D.SelectedDate
GROUP BY
    D.SelectedDate
    ,F.Fund
EDIT: Here is an alternative to the LAG function for this example:
FROM
(
    SELECT
        SelectedDate
        ,ISNULL((SELECT TOP 1 SelectedDate FROM @Dates WHERE SelectedDate < Dates.SelectedDate ORDER BY SelectedDate DESC),'1/1/1900') AS PreviousDate
    FROM @Dates Dates
) D
If I change your incorrect sample data to ...
CREATE TABLE TableName
([Id] int, [Fund] varchar(1), [Date] datetime, [Amount] int)
;
INSERT INTO TableName
([Id], [Fund], [Date], [Amount])
VALUES
(1, 'A', '2015-01-28 00:00:00', 250),
(2, 'A', '2015-01-28 00:00:00', 300),
(3, 'A', '2015-03-30 00:00:00', 400),
(4, 'A', '2015-03-30 00:00:00', 200),
(5, 'B', '2015-04-01 00:00:00', 500),
(6, 'B', '2015-04-01 00:00:00', 600)
;
this query using GROUP BY works:
SELECT MIN(Id) AS Id,
MIN(Fund) AS Fund,
[Date],
SUM(Amount) AS SumOfAmount
FROM dbo.TableName t
WHERE [Date] IN ('01/28/2015','03/30/2015','04/01/2015')
GROUP BY [Date]
Demo
Initially I used ROW_NUMBER and the MONTH function to pick the max date of every month; in the 2nd CTE I summed the amounts, and then I joined them. Maybe this result set matches your output:
declare @t table (Id int,Fund Varchar(1),Dated date,amount int)
insert into @t (id,Fund,dated,amount) values (1,'A','01/20/2015',250),
(2,'A','01/28/2015',300),
(3,'A','03/20/2015',400),
(4,'A','03/30/2015',200),
(5,'B','04/01/2015',600),
(6,'B','04/01/2015',500)
;with cte as (
    select ID,Fund,Amount,Dated,ROW_NUMBER() OVER
        (PARTITION BY DATEDIFF(MONTH, '20000101', dated) ORDER BY dated desc) AS RN from @t
    group by ID,Fund,DATED,Amount
),
CTE2 AS
(select SUM(amount) Amt from @t
 GROUP BY MONTH(dated))
,CTE3 AS
(Select Amt,ROW_NUMBER() OVER (ORDER BY amt) R from cte2)
,CTE4 AS
(
    Select DISTINCT C.ID As ID,
        C.Fund As Fund,
        C.Dated As Dated
        ,ROW_NUMBER() OVER (PARTITION BY RN ORDER BY (SELECT NULL)) R
    from cte C INNER JOIN CTE3 CC ON c.RN = CC.R
    Where C.RN = 1
    GROUP BY C.ID,C.Fund,C.RN,C.Dated )
select C.R,C.Fund,C.Dated,cc.Amt from CTE4 C INNER JOIN CTE3 CC
ON c.R = cc.R
declare @TableName table([Id] int, [Fund] varchar(1), [Date] datetime, [Amount] int)
declare @Sample table([SampleDate] datetime)
INSERT INTO @TableName
([Id], [Fund], [Date], [Amount])
VALUES
(1, 'A', '20150120 00:00:00', 250),
(2, 'A', '20150128 00:00:00', 300),
(3, 'A', '20150320 00:00:00', 400),
(4, 'A', '20150330 00:00:00', 200),
(5, 'B', '20150401 00:00:00', 500),
(6, 'B', '20150401 00:00:00', 600)
INSERT INTO @Sample ([SampleDate])
values ('20150128 00:00:00'), ('20150330 00:00:00'), ('20150401 00:00:00')
-- select * from @TableName
-- select * from @Sample
;WITH groups AS (
    SELECT [Fund], [Date], [AMOUNT], MIN([SampleDate]) [SampleDate] FROM @TableName
    JOIN @Sample ON [Date] <= [SampleDate]
    GROUP BY [Fund], [Date], [AMOUNT])
SELECT [Fund], [SampleDate], SUM([AMOUNT]) FROM groups
GROUP BY [Fund], [SampleDate]
Explanation:
The CTE groups finds the earliest SampleDate that is later than (or equal to) the date of each data row and attaches it to that row, which gives every row the group it should be summed in.
After that, you can group on the derived date.
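The "earliest sample date on or after the row's date" step described in the explanation can be illustrated with a binary search. This Python sketch is not from the answer; `sum_up_to_sample_dates` is a hypothetical helper, and the data is the sample from the query above written as ISO date strings so that string order matches date order.

```python
from bisect import bisect_left
from collections import defaultdict

# (Fund, Date, Amount) rows and the sample dates used in the query above.
ROWS = [
    ("A", "2015-01-20", 250),
    ("A", "2015-01-28", 300),
    ("A", "2015-03-20", 400),
    ("A", "2015-03-30", 200),
    ("B", "2015-04-01", 500),
    ("B", "2015-04-01", 600),
]
SAMPLE_DATES = ["2015-01-28", "2015-03-30", "2015-04-01"]


def sum_up_to_sample_dates(rows, sample_dates):
    """Assign each row to the earliest sample date >= its date, then sum per fund."""
    dates = sorted(sample_dates)
    totals = defaultdict(int)
    for fund, date, amount in rows:
        i = bisect_left(dates, date)  # index of the first sample date >= date
        if i < len(dates):  # rows after the last sample date are ignored
            totals[(fund, dates[i])] += amount
    return dict(totals)


print(sum_up_to_sample_dates(ROWS, SAMPLE_DATES))
# {('A', '2015-01-28'): 550, ('A', '2015-03-30'): 600, ('B', '2015-04-01'): 1100}
```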

T-SQL: Paging WITH TIES

I am trying to implement a paging routine that's a little different.
For the sake of a simple example, let's assume that I have a table defined and populated as follows:
DECLARE @Temp TABLE
(
    ParentId INT,
    [TimeStamp] DATETIME,
    Value INT
);
INSERT INTO @Temp VALUES (1, '1/1/2013 00:00', 6);
INSERT INTO @Temp VALUES (1, '1/1/2013 01:00', 7);
INSERT INTO @Temp VALUES (1, '1/1/2013 02:00', 8);
INSERT INTO @Temp VALUES (2, '1/1/2013 00:00', 6);
INSERT INTO @Temp VALUES (2, '1/1/2013 01:00', 7);
INSERT INTO @Temp VALUES (2, '1/1/2013 02:00', 8);
INSERT INTO @Temp VALUES (3, '1/1/2013 00:00', 6);
INSERT INTO @Temp VALUES (3, '1/1/2013 01:00', 7);
INSERT INTO @Temp VALUES (3, '1/1/2013 02:00', 8);
TimeStamp will always be the same interval, e.g. daily data, 1 hour data, 1 minute data, etc. It will not be mixed.
For reporting and presentation purposes, I want to implement paging that:
Orders by TimeStamp
Starts out using a suggested pageSize (say 4), but will automatically adjust to include additional records matching on TimeStamp. In other words, if 1/1/2013 01:00 is included for one ParentId, the suggested pageSize will be overridden and all records for hour 01:00 will be included for all ParentIds. It's almost like the TOP WITH TIES option.
So running this query with a pageSize of 4 would return 6 records. By default that would be the 3 rows for hour 00:00 and 1 row for hour 01:00, but because there are more rows for hour 01:00, the pageSize would be overridden to return all rows for hours 00:00 and 01:00.
Here's what I have so far. I think I'm close, as it works for the first iteration, but subsequent queries for the next pageSize+ rows don't work.
WITH CTE AS
(
    SELECT ParentId, [TimeStamp], Value,
        RANK() OVER(ORDER BY [TimeStamp]) AS rnk,
        ROW_NUMBER() OVER(ORDER BY [TimeStamp]) AS rownum
    FROM @Temp
)
SELECT *
FROM CTE
WHERE (rownum BETWEEN 1 AND 4) OR (rnk BETWEEN 1 AND 4)
ORDER BY TimeStamp, ParentId
The ROW_NUMBER ensures the minimum pageSize is met, but the RANK will include additional ties.
declare @Temp as Table ( ParentId Int, [TimeStamp] DateTime, [Value] Int );
insert into @Temp ( ParentId, [TimeStamp], [Value] ) values
(1, '1/1/2013 00:00', 6),
(1, '1/1/2013 01:00', 7),
(1, '1/1/2013 02:00', 8),
(2, '1/1/2013 00:00', 6),
(2, '1/1/2013 01:00', 7),
(2, '1/1/2013 02:00', 8),
(3, '1/1/2013 00:00', 6),
(3, '1/1/2013 01:00', 7),
(3, '1/1/2013 02:00', 8);
declare @PageSize as Int = 4;
declare @Page as Int = 1;
with Alpha as (
    select ParentId, [TimeStamp], Value,
        Rank() over ( order by [TimeStamp] ) as Rnk,
        Row_Number() over ( order by [TimeStamp] ) as RowNum
    from @Temp ),
Beta as (
    -- Range of ranks touched by the rows that fall inside the requested page.
    select Min( Rnk ) as MinRnk, Max( Rnk ) as MaxRnk
    from Alpha
    where ( @Page - 1 ) * @PageSize < RowNum and RowNum <= @Page * @PageSize )
select A.*
from Alpha as A inner join
    Beta as B on B.MinRnk <= A.Rnk and A.Rnk <= B.MaxRnk
order by [TimeStamp], ParentId;
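The effect of the query above (take the rows in the nominal page window, then widen the page to every row that shares a timestamp with them) can be sketched in Python to see what each page returns. `page_with_ties` is a name invented for this illustration, and the rows are the sample data with ISO-formatted timestamps.

```python
# (ParentId, TimeStamp, Value) rows from the sample data.
ROWS = [
    (1, "2013-01-01 00:00", 6), (1, "2013-01-01 01:00", 7), (1, "2013-01-01 02:00", 8),
    (2, "2013-01-01 00:00", 6), (2, "2013-01-01 01:00", 7), (2, "2013-01-01 02:00", 8),
    (3, "2013-01-01 00:00", 6), (3, "2013-01-01 01:00", 7), (3, "2013-01-01 02:00", 8),
]


def page_with_ties(rows, page, page_size):
    """Return the page's rows plus every row tied on timestamp with them."""
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))
    first = (page - 1) * page_size
    window = ordered[first:first + page_size]  # the rows selected by RowNum
    if not window:
        return []
    lo, hi = window[0][1], window[-1][1]  # timestamps at MinRnk and MaxRnk
    return [r for r in ordered if lo <= r[1] <= hi]


print(len(page_with_ties(ROWS, 1, 4)))  # 6: all 00:00 and 01:00 rows
print(len(page_with_ties(ROWS, 2, 4)))  # 6: all 01:00 and 02:00 rows
```

Page 2 repeats the 01:00 rows that page 1 already returned, so consecutive pages can overlap with this approach.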
EDIT:
An alternative query that assigns page numbers as it goes, so that next/previous page can be implemented without overlapping rows:
with Alpha as (
    select ParentId, [TimeStamp], Value,
        Rank() over ( order by [TimeStamp] ) as Rnk,
        Row_Number() over ( order by [TimeStamp] ) as RowNum
    from @Temp ),
Beta as (
    select ParentId, [TimeStamp], Value, Rnk, RowNum, 1 as Page, 1 as PageRow
    from Alpha
    where RowNum = 1
    union all
    -- Start a new page only when the page is full and the timestamp changes.
    select A.ParentId, A.[TimeStamp], A.Value, A.Rnk, A.RowNum,
        case when B.PageRow >= @PageSize and A.TimeStamp <> B.TimeStamp then B.Page + 1 else B.Page end,
        case when B.PageRow >= @PageSize and A.TimeStamp <> B.TimeStamp then 1 else B.PageRow + 1 end
    from Alpha as A inner join
        Beta as B on B.RowNum + 1 = A.RowNum
)
select * from Beta
option ( MaxRecursion 0 )
Note that recursive CTEs often scale poorly.
I think your strategy of using row_number() and rank() is overcomplicating things.
Just pick the top 4 timestamps from the data. Then choose any timestamps that match those:
select *
from @temp
where [timestamp] in (select top 4 [timestamp] from @temp order by [TimeStamp])

SQL to show concurrent cases

I need help with SQL to show concurrency by person for every minute in a day, for the data set below:
drop table test
create table test (person varchar(2), caseid varchar(3), starttime datetime, endtime datetime)
insert into test values ('aa', '1', '01/01/2013 06:42', '01/01/2013 07:06')
insert into test values ('aa', '1', '01/01/2013 07:31', '01/01/2013 09:38')
insert into test values ('aa', '2', '01/01/2013 08:37', '01/01/2013 11:44')
insert into test values ('aa', '3','01/01/2013 09:39', '01/01/2013 11:31')
insert into test values ('aa', '4','01/01/2013 11:09', '01/01/2013 13:30')
insert into test values ('aa', '5','01/01/2013 12:05', '01/01/2013 15:38')
insert into test values ('aa', '6', '01/01/2013 13:58', '01/01/2013 14:13')
insert into test values ('aa', '7', '01/01/2013 15:53', '01/01/2013 16:14')
insert into test values ('bb', '8', '01/01/2013 08:42', '01/01/2013 09:06')
insert into test values ('bb', '8', '01/01/2013 10:31', '01/01/2013 19:38')
insert into test values ('bb', '8','01/01/2013 20:37', '01/01/2013 21:44')
insert into test values ('bb', '9', '01/01/2013 09:39', '01/01/2013 11:31')
insert into test values ('bb', '9', '01/01/2013 11:45', '01/01/2013 13:30')
insert into test values ('bb', '9', '01/01/2013 12:05', '01/01/2013 15:38')
insert into test values ('bb', '10', '01/01/2013 13:58', '01/01/2013 14:13')
insert into test values ('bb', '10', '01/01/2013 15:53', '01/01/2013 16:14')
the result needs to be similar to the following:
aa 01/01/2013 6:42 1
aa 01/01/2013 6:43 1
aa 01/01/2013 6:44 1
....
....
aa 01/01/2013 8:37 2
aa 01/01/2013 8:38 2
....
....
bb 01/01/2013 8:42 1
bb 01/01/2013 8:43 1
bb 01/01/2013 10:31 2
....
....
Thanks
You can do this with a correlated subquery:
select t.*,
       (select count(*)
        from test t2
        where t2.person = t.person and
              t2.starttime <= t.starttime and
              t2.endtime >= t.starttime -- t2 is still running when t starts
       ) as numoverlaps
from test t
(Apologies for syntax errors; I'm on a mobile device)
This finds concurrency at every time in the input data. It does not do it for every minute of time.
This seems to work, but there may be a more elegant solution:
-- get range of days involved
declare @minDate date = (select MIN(starttime) from test)
declare @maxDate date = (select MAX(endtime) from test)
-- create table containing all days
if OBJECT_ID('tempdb..#days') is not null
    drop table #days
create table #days (d date)
declare @day date = @minDate
while @day <= @maxDate
begin
    insert #days (d) values (@day)
    set @day = DATEADD(day, 1, @day)
end
-- create table containing all minutes in the day
if OBJECT_ID('tempdb..#minutes') is not null
    drop table #minutes
create table #minutes (m int)
declare @minute int = 0
while @minute < 24*60
begin
    insert #minutes (m) values (@minute)
    set @minute = @minute + 1
end
select person, dateadd(minute, m, convert(datetime, startdate)), c from
(
select person, m, startdate, count(m) c from
(
-- cross join to select all days and minutes
select d.d, m.m from #days d cross join #minutes m
)
t0
inner join
(
select
person,
convert(date, starttime) startdate,
datediff(minute, convert(date, starttime), starttime) startmin,
datediff(minute, convert(date, endtime), endtime) endmin
from test
)
t1
on t0.m between t1.startmin and t1.endmin
and t0.d = t1.startdate
group by person, m, startdate
)
t2
order by person, startdate, m, c
Here is how I would approach it if using a database that supports CTE and inline views:
First a CTE to generate a two-col listing of the (24*60) minutes of the day for a specified date:
time1 time2
2013-02-12 00:00, 2013-02-12 00:01
.
.
.
2013-02-12 23:59, 2013-02-13 00:00
Left join that CTE to your cases table where cases.starttime < time2 and cases.endtime > time1, i.e. where the case overlaps that minute. That brings back either nulls where no part of the case was ongoing during that minute, or the caseid and personid when a case was ongoing during that minute.
Make the above an inline view. You end up with a set of all minutes in the day and the left-joined caseid and personid, or nulls:
time1, time2, caseid, personid
If you select from that inline view where caseid is not null you end up with the minutes where one or more cases was ongoing; if you then group by personid, time1 and count(caseid) you get the tally of cases per person in that particular one-minute time slot.
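The per-minute tally described above can be prototyped in Python before committing to a SQL shape. This is only a sketch: `concurrency_per_minute` is an invented name, only a few of the question's rows are included, and both the start and the end minute of a case are counted as active (as the `between` join in the temp-table answer does).

```python
from collections import Counter
from datetime import datetime, timedelta

FMT = "%m/%d/%Y %H:%M"

# A subset of (person, starttime, endtime) rows from the question's test table.
CASES = [
    ("aa", "01/01/2013 06:42", "01/01/2013 07:06"),
    ("aa", "01/01/2013 07:31", "01/01/2013 09:38"),
    ("aa", "01/01/2013 08:37", "01/01/2013 11:44"),
    ("bb", "01/01/2013 08:42", "01/01/2013 09:06"),
    ("bb", "01/01/2013 10:31", "01/01/2013 19:38"),
    ("bb", "01/01/2013 09:39", "01/01/2013 11:31"),
]


def concurrency_per_minute(cases):
    """Count the active cases for every (person, minute) with at least one case."""
    counts = Counter()
    for person, start, end in cases:
        minute = datetime.strptime(start, FMT)
        stop = datetime.strptime(end, FMT)
        while minute <= stop:  # inclusive of the end minute
            counts[(person, minute)] += 1
            minute += timedelta(minutes=1)
    return counts


counts = concurrency_per_minute(CASES)
print(counts[("aa", datetime(2013, 1, 1, 6, 42))])   # 1
print(counts[("aa", datetime(2013, 1, 1, 8, 37))])   # 2
print(counts[("bb", datetime(2013, 1, 1, 10, 31))])  # 2
```

Minutes with no active case do not appear in the result, which corresponds to filtering on caseid is not null in the description above.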