Using DimCalendar update days between two dates - sql

In a DWH (SQL Server) I have two tables:
DWH.Days
DayStart DayStop DaysBetween
2022-04-21 2022-04-24 null
2022-03-12 2022-04-27 null
2022-04-21 2022-04-24 null
2022-03-01 2022-04-22 null
and
DWH.Calendar
Date IsHoliday?
2022-05-11 yes
2022-05-12 no
2022-05-13 yes
2022-05-15 no
I need to update DWH.Days.DaysBetween with the number of days between DayStart and DayStop where DWH.DimCalendar.IsHoliday?='no'. I don't have permission to change the data model, and I don't know how to approach this. Any ideas?

Here is your data as table variables. Note that there are several issues with your sample data; they are flagged in the comments:
DECLARE @Days TABLE (
    DayStart DATE NOT NULL,
    DayStop DATE NOT NULL,
    DaysBetween VARCHAR(40)
);
INSERT INTO @Days (DayStart, DayStop, DaysBetween)
VALUES
('2022-04-21', '2022-04-24', NULL),
('2022-03-12', '2022-04-27', NULL),
('2022-04-21', '2022-04-24', NULL), -- repeated data; it should be deleted
('2022-03-01', '2022-04-22', NULL);
DECLARE @Calendar TABLE (
    Date DATE NOT NULL,
    IsHoliday VARCHAR(30) NOT NULL
);
INSERT INTO @Calendar (Date, IsHoliday)
VALUES
('2022-05-11', 'yes'), -- actual values
('2022-05-12', 'no'),  -- actual values
('2022-05-13', 'yes'), -- actual values
('2022-05-15', 'no'),  -- actual values
('2022-04-23', 'yes'), -- added values for test; can be deleted
('2022-03-10', 'no'),  -- added values for test; can be deleted
('2022-03-14', 'yes'), -- added values for test; can be deleted
('2022-04-22', 'no'),  -- added values for test; can be deleted
('2022-03-15', 'no');
First, cross join the two tables and filter on the condition DayStart < Date < DayStop and IsHoliday = 'no'. Then wrap that in a subquery and use DATEDIFF and GROUP BY as follows:
SELECT Daystart,
       Daystop,
       Datediff(d, Daystart, Daystop) - Count(Isholiday) AS DaysBetween
FROM (SELECT Daystart,
             Daystop,
             Daysbetween,
             Isholiday
      FROM @Days d,
           @Calendar c
      WHERE d.Daystart < c.Date
            AND d.Daystop > c.Date
            AND c.Isholiday = 'no') a
GROUP BY Daystart,
         Daystop
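The query above selects a value but does not write it back. If DaysBetween is meant to hold the number of non-holiday days themselves, one possible sketch is to count the matching calendar rows directly inside an UPDATE. This assumes the calendar contains one row per date and that both DayStart and DayStop count as part of the range; the table variables and sample rows below are made up for illustration and stand in for DWH.Days and DWH.Calendar.

```sql
-- Sketch only: table variables stand in for DWH.Days and DWH.Calendar.
-- Assumption: the calendar has exactly one row per date.
DECLARE @Days TABLE (DayStart DATE NOT NULL, DayStop DATE NOT NULL, DaysBetween INT NULL);
DECLARE @Calendar TABLE ([Date] DATE NOT NULL PRIMARY KEY, [IsHoliday?] VARCHAR(3) NOT NULL);

-- Illustrative rows (not the asker's real calendar).
INSERT INTO @Days (DayStart, DayStop) VALUES ('2022-04-21', '2022-04-24');
INSERT INTO @Calendar ([Date], [IsHoliday?]) VALUES
    ('2022-04-21', 'no'), ('2022-04-22', 'no'),
    ('2022-04-23', 'yes'), ('2022-04-24', 'no');

-- Count the non-holiday calendar rows inside each range (both ends inclusive).
UPDATE d
SET    d.DaysBetween = (SELECT COUNT(*)
                        FROM   @Calendar AS c
                        WHERE  c.[Date] BETWEEN d.DayStart AND d.DayStop
                          AND  c.[IsHoliday?] = 'no')
FROM   @Days AS d;

SELECT DayStart, DayStop, DaysBetween FROM @Days; -- DaysBetween = 3 for this row
```

Because the column name ends in a question mark, it has to be written in square brackets. To exclude the endpoints instead, replace BETWEEN with strict comparisons (c.[Date] > d.DayStart AND c.[Date] < d.DayStop).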

Related

Get the first row in the group of dates where there is a sequence of dates with no break

DECLARE @TestData TABLE (
    Idntty Int Not Null
    ,[DATE] DATE NOT NULL
    ,[TYPE] VARCHAR(20) NOT NULL
)
INSERT INTO @TestData VALUES
(1, '2016-03-01', 'Inventory'),
(2, '2016-04-01', 'Inventory'),
(3, '2016-06-01', 'Inventory'),
(4, '2016-07-01', 'Inventory'),
(5, '2016-08-01', 'Inventory'),
(6, '2016-09-01', 'Inventory'),
(7, '2017-01-01', 'Inventory'),
(8, '2017-02-01', 'Inventory'),
(9, '2017-03-01', 'Inventory')
;
Basically, I need to get the first row in the LAST group where there is a sequence of dates with no break.
For example, '2016-03-01' can't be right here because '2016-05-01' is missing, so there is a break in the sequence after this date record.
The criterion for grouping is continuous dates, so in this example there are 3 groups because there are 2 breaks: one because '2016-05-01' is missing, and a second because '2016-10-01', '2016-11-01' and '2016-12-01' are missing:
(1, '2016-03-01', 'Inventory'),
(2, '2016-04-01', 'Inventory'),
and
(3, '2016-06-01', 'Inventory'),
(4, '2016-07-01', 'Inventory'),
(5, '2016-08-01', 'Inventory'),
(6, '2016-09-01', 'Inventory'),
and
(7, '2017-01-01', 'Inventory'),
(8, '2017-02-01', 'Inventory'),
(9, '2017-03-01', 'Inventory'),
So I need '2017-01-01' to be the output, as it's the first date record of a continuous sequence and that sequence is also the LAST one.
I tried the standard gaps-and-islands solution but had no success; for example, I couldn't work out what to apply GROUP BY to.
I want to solve the problem using SQL only. I am using SQL Server 2008.
This is indeed a gaps-and-islands problem. Basically, you want the beginning of the last island. Here is one option using window functions:
select max(date) res
from (
    select t.*, lag(date) over(partition by type order by Idntty) lag_date
    from mytable t
) t
where lag_date is null or date > dateadd(month, 1, lag_date)
In the subquery, lag() gives you the date of the "previous" record. Then the outer query filters on rows whose date is more than 1 month after the previous record (that is, the beginning of each island, since the sample dates are one month apart), and gets the maximum date within this result set.
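LAG() was only introduced in SQL Server 2012, so it is not available on the SQL Server 2008 instance mentioned in the question. A minimal sketch of the classic ROW_NUMBER() variant, which does run on 2008, is shown below. It assumes every date falls on the first of a month and that dates are unique; the names grp, island_start, numbered and islands are just illustrative.

```sql
DECLARE @TestData TABLE (
    Idntty INT NOT NULL,
    [DATE] DATE NOT NULL,
    [TYPE] VARCHAR(20) NOT NULL
);
INSERT INTO @TestData VALUES
    (1, '2016-03-01', 'Inventory'), (2, '2016-04-01', 'Inventory'),
    (3, '2016-06-01', 'Inventory'), (4, '2016-07-01', 'Inventory'),
    (5, '2016-08-01', 'Inventory'), (6, '2016-09-01', 'Inventory'),
    (7, '2017-01-01', 'Inventory'), (8, '2017-02-01', 'Inventory'),
    (9, '2017-03-01', 'Inventory');

-- Subtracting the row number (in months) from each date yields the same
-- value for every row of an unbroken monthly run, so it identifies the island.
SELECT MAX(island_start) AS res
FROM (
    SELECT MIN([DATE]) AS island_start
    FROM (
        SELECT [DATE],
               DATEADD(MONTH,
                       -CAST(ROW_NUMBER() OVER (ORDER BY [DATE]) AS INT),
                       [DATE]) AS grp
        FROM @TestData
    ) AS numbered
    GROUP BY grp
) AS islands;
-- For the sample data this returns 2017-01-01.
```

If several TYPE values have to be handled separately, adding PARTITION BY [TYPE] to the window and [TYPE] to both GROUP BY levels would be the natural extension.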
Another way is to loop over the rows with a cursor. Try the following:
DECLARE CUR CURSOR FAST_FORWARD FOR SELECT DISTINCT [DATE] FROM @TestData ORDER BY [DATE]
DECLARE @DATE DATE
DECLARE @PREV_DATE DATE = NULL, @FINAL_GRP_DATE DATE
OPEN CUR
FETCH NEXT FROM CUR INTO @DATE
WHILE (@@FETCH_STATUS = 0)
BEGIN
    -- A row starts a new group when it is the first row or when it is
    -- not exactly one month after the previous row.
    IF @PREV_DATE IS NULL OR DATEDIFF(MM, @PREV_DATE, @DATE) <> 1
    BEGIN
        SET @FINAL_GRP_DATE = @DATE
    END
    SET @PREV_DATE = @DATE
    FETCH NEXT FROM CUR INTO @DATE
END
CLOSE CUR
DEALLOCATE CUR
-- After the loop this holds the first date of the last group.
SELECT @FINAL_GRP_DATE
Please find the db<>fiddle here.

Update table data, fetched from another table

I have a table which is storing the attendance information on an employee and another table that's storing the information about the shift of the employee which is basically a duty roster.
Here is the structure of the Attendance table, with sample data:
CREATE TABLE Attendance
(
ID INT,
EmpCode INT,
ShiftCode INT,
CheckIn DATETIME,
CheckOut DATETIME
)
INSERT INTO Attendance VALUES (1, 1, 1, '2019-09-01 09:16:23', NULL)
INSERT INTO Attendance VALUES (2, 1, 1, NULL, '2019-09-01 18:01:56')
INSERT INTO Attendance VALUES (3, 1, 2, '2019-09-02 09:00:00', NULL)
INSERT INTO Attendance VALUES (4, 1, 2, NULL, '2019-09-02 18:48:21')
INSERT INTO Attendance VALUES (5, 1, 1, '2019-09-13 09:27:00', NULL)
INSERT INTO Attendance VALUES (6, 1, 1, NULL, '2019-09-13 18:45:00')
INSERT INTO Attendance VALUES (7, 2, 2, '2019-09-01 21:19:17', NULL)
INSERT INTO Attendance VALUES (8, 2, 2, NULL, '2019-09-01 23:30:56')
INSERT INTO Attendance VALUES (9, 2, 2, '2019-09-05 09:23:00', NULL)
INSERT INTO Attendance VALUES (10, 2, 2, NULL, '2019-09-05 17:19:00')
Here is the structure and sample data for Duty roster.
CREATE TABLE Shifts
(
ID INT PRIMARY KEY,
EmpCode INT,
ShiftCode INT,
StartDate DATETIME,
EndDate DATETIME
)
INSERT INTO Shifts VALUES (1, 1, 24, '2019-09-01 00:00:00', '2019-09-05 00:00:00');
INSERT INTO Shifts VALUES (2, 2, 25, '2019-09-01 00:00:00', '2019-09-05 00:00:00');
The idea is to update the ShiftCode in the Attendance table according to the shifts stored in the duty roster. So if the attendance for employee 1 is between '2019-09-01' and '2019-09-05', then the shift code for this employee should be updated to 24, and likewise for the other employee. If no duty roster exists for the dates present in the Attendance table, the row should not be updated and should be left as it is.
I need an update query.
Something like this:
SELECT *
FROM Attendance A
INNER JOIN Shifts S
ON A.EmpCode = S.[EmpCode]
AND
(
A.CheckIn BETWEEN S.[StartDate] AND S.[EndDate]
OR
A.CheckOut BETWEEN S.[StartDate] AND S.[EndDate]
)
and with update:
UPDATE Attendance
SET ShiftCode = S.[ShiftCode]
FROM Attendance A
INNER JOIN Shifts S
ON A.EmpCode = S.[EmpCode]
AND
(
A.CheckIn BETWEEN S.[StartDate] AND S.[EndDate]
OR
A.CheckOut BETWEEN S.[StartDate] AND S.[EndDate]
);
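Before running an update like this against real data, it can help to see exactly which rows change. The sketch below is one way to do that with the OUTPUT clause; it is not part of the original answer. It uses table variables with a few of the question's rows, and it additionally assumes that EndDate is meant to cover the whole last day (with BETWEEN and an EndDate of '2019-09-05 00:00:00', a check-in at 09:23 on 5 September would not match).

```sql
DECLARE @Attendance TABLE (ID INT, EmpCode INT, ShiftCode INT, CheckIn DATETIME, CheckOut DATETIME);
DECLARE @Shifts TABLE (ID INT PRIMARY KEY, EmpCode INT, ShiftCode INT, StartDate DATETIME, EndDate DATETIME);

INSERT INTO @Attendance VALUES
    (1, 1, 1, '2019-09-01 09:16:23', NULL),
    (5, 1, 1, '2019-09-13 09:27:00', NULL),
    (9, 2, 2, '2019-09-05 09:23:00', NULL);
INSERT INTO @Shifts VALUES
    (1, 1, 24, '2019-09-01 00:00:00', '2019-09-05 00:00:00'),
    (2, 2, 25, '2019-09-01 00:00:00', '2019-09-05 00:00:00');

-- Update through the alias and report old/new values for every changed row.
-- Assumption: EndDate includes the entire last day, hence the half-open range.
UPDATE A
SET    A.ShiftCode = S.ShiftCode
OUTPUT deleted.ID,
       deleted.ShiftCode  AS OldShiftCode,
       inserted.ShiftCode AS NewShiftCode
FROM   @Attendance AS A
INNER JOIN @Shifts AS S
        ON A.EmpCode = S.EmpCode
       AND COALESCE(A.CheckIn, A.CheckOut) >= S.StartDate
       AND COALESCE(A.CheckIn, A.CheckOut) <  DATEADD(DAY, 1, S.EndDate);
```

With these rows, IDs 1 and 9 are reported as changed and ID 5 (13 September, outside the roster) is left alone. If EndDate should be treated as an exact cut-off instead, the original BETWEEN condition is the right one.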
I have tried this one and it works too:
UPDATE Attendance
SET ShiftCode = ISNULL((SELECT ShiftCode FROM Shifts Roster
WHERE CAST(COALESCE(CheckIn, CheckOut) AS DATE) BETWEEN StartDate AND EndDate AND EmpCode = Attendance.EmpCode),
(SELECT ShiftCode FROM EmployeeInfo WHERE EmployeeInfo.ID = Attendance.EmpCode))
Try this; it should help:
UPDATE Attendance SET ShiftCode=c.ShiftsShiftCode
FROM Attendance a
JOIN
(
SELECT a.EmpCode, a.ShiftCode, CheckIn, CheckOut, b.ShiftCode AS ShiftsShiftCode FROM Attendance a
JOIN Shifts b ON a.EmpCode=b.EmpCode
AND (a.CheckIn BETWEEN StartDate AND EndDate OR a.CheckOut BETWEEN StartDate AND EndDate)
)c
ON a.EmpCode = c.EmpCode
AND (a.checkin=c.checkin OR a.CheckOut=c.CheckOut)

Summing up the records as per given conditions

I have a table like the one below. For a particular fund and a given list of dates, I need to sum the Amount values up to each date. Let's say I need the sums for 3 dates: 01/28/2015, 03/30/2015 and 04/01/2015. The logic should first check how many records there are in the table up to the first date; if it finds more than one record, it should sum their Amount values. For the next date, it should sum up to that date, but starting after the previous date that was already summed.
Id Fund Date Amount
1 A 01/20/2015 250
2 A 02/28/2015 300
3 A 03/20/2015 400
4 A 03/30/2015 200
5 B 04/01/2015 500
6 B 04/01/2015 600
I want result to be like below
Id Fund Date SumOfAmount
1 A 02/28/2015 550
2 A 03/30/2015 600
3 B 04/01/2015 1100
Based on your question, it seems that you want to select a set of dates, and then for each fund and selected date, get the sum of the fund amounts from the selected date to the previous selected date. Here is the result set I think you should be expecting:
Fund Date SumOfAmount
A 2015-02-28 550.00
A 2015-03-30 600.00
B 2015-04-01 1100.00
Here is the code to produce this output:
DECLARE @Dates TABLE
(
    SelectedDate DATE PRIMARY KEY
)
INSERT INTO @Dates
VALUES
('02/28/2015')
,('03/30/2015')
,('04/01/2015')
DECLARE @FundAmounts TABLE
(
    Id INT PRIMARY KEY
    ,Fund VARCHAR(5)
    ,Date DATE
    ,Amount MONEY
);
INSERT INTO @FundAmounts
VALUES
(1, 'A', '01/20/2015', 250)
,(2, 'A', '02/28/2015', 300)
,(3, 'A', '03/20/2015', 400)
,(4, 'A', '03/30/2015', 200)
,(5, 'B', '04/01/2015', 500)
,(6, 'B', '04/01/2015', 600);
SELECT
    F.Fund
    ,D.SelectedDate AS Date
    ,SUM(F.Amount) AS SumOfAmount
FROM
(
    SELECT
        SelectedDate
        ,LAG(SelectedDate,1,'1/1/1900') OVER (ORDER BY SelectedDate ASC) AS PreviousDate
    FROM @Dates
) D
JOIN
    @FundAmounts F
ON
    F.Date BETWEEN DATEADD(DAY,1,D.PreviousDate) AND D.SelectedDate
GROUP BY
    D.SelectedDate
    ,F.Fund
EDIT: Here is an alternative to the LAG function for this example:
FROM
(
    SELECT
        SelectedDate
        ,ISNULL((SELECT TOP 1 SelectedDate FROM @Dates WHERE SelectedDate < Dates.SelectedDate ORDER BY SelectedDate DESC),'1/1/1900') AS PreviousDate
    FROM @Dates Dates
) D
If I change your incorrect sample data to ...
CREATE TABLE TableName
([Id] int, [Fund] varchar(1), [Date] datetime, [Amount] int)
;
INSERT INTO TableName
([Id], [Fund], [Date], [Amount])
VALUES
(1, 'A', '2015-01-28 00:00:00', 250),
(2, 'A', '2015-01-28 00:00:00', 300),
(3, 'A', '2015-03-30 00:00:00', 400),
(4, 'A', '2015-03-30 00:00:00', 200),
(5, 'B', '2015-04-01 00:00:00', 500),
(6, 'B', '2015-04-01 00:00:00', 600)
;
this query using GROUP BY works:
SELECT MIN(Id) AS Id,
MIN(Fund) AS Fund,
[Date],
SUM(Amount) AS SumOfAmount
FROM dbo.TableName t
WHERE [Date] IN ('01/28/2015','03/30/2015','04/01/2015')
GROUP BY [Date]
Demo
Initially I used ROW_NUMBER and the MONTH function to pick the max date of every month, and in a second CTE I summed the amounts and joined the two. Maybe this result set matches your output:
declare @t table (Id int,Fund Varchar(1),Dated date,amount int)
insert into @t (id,Fund,dated,amount) values (1,'A','01/20/2015',250),
(2,'A','01/28/2015',300),
(3,'A','03/20/2015',400),
(4,'A','03/30/2015',200),
(5,'B','04/01/2015',600),
(6,'B','04/01/2015',500)
;with cte as (
select ID,Fund,Amount,Dated,ROW_NUMBER() OVER
(PARTITION BY DATEDIFF(MONTH, '20000101', dated)ORDER BY dated desc)AS RN from @t
group by ID,Fund,DATED,Amount
),
CTE2 AS
(select SUM(amount)Amt from @t
GROUP BY MONTH(dated))
,CTE3 AS
(Select Amt,ROW_NUMBER()OVER (ORDER BY amt)R from cte2)
,CTE4 AS
(
Select DISTINCT C.ID As ID,
C.Fund As Fund,
C.Dated As Dated
,ROW_NUMBER()OVER (PARTITION BY RN ORDER BY (SELECT NULL))R
from cte C INNER JOIN CTE3 CC ON c.RN = CC.R
Where C.RN = 1
GROUP BY C.ID,C.Fund,C.RN,C.Dated )
select C.R,C.Fund,C.Dated,cc.Amt from CTE4 C INNER JOIN CTE3 CC
ON c.R = cc.R
declare @TableName table([Id] int, [Fund] varchar(1), [Date] datetime, [Amount] int)
declare @Sample table([SampleDate] datetime)
INSERT INTO @TableName
([Id], [Fund], [Date], [Amount])
VALUES
(1, 'A', '20150120 00:00:00', 250),
(2, 'A', '20150128 00:00:00', 300),
(3, 'A', '20150320 00:00:00', 400),
(4, 'A', '20150330 00:00:00', 200),
(5, 'B', '20150401 00:00:00', 500),
(6, 'B', '20150401 00:00:00', 600)
INSERT INTO @Sample ([SampleDate])
values ('20150128 00:00:00'), ('20150330 00:00:00'), ('20150401 00:00:00')
-- select * from @TableName
-- select * from @Sample
;WITH groups AS (
SELECT [Fund], [Date], [AMOUNT], MIN([SampleDate]) [SampleDate] FROM @TableName
JOIN @Sample ON [Date] <= [SampleDate]
GROUP BY [Fund], [Date], [AMOUNT])
SELECT [Fund], [SampleDate], SUM([AMOUNT]) FROM groups
GROUP BY [Fund], [SampleDate]
Explanation:
The CTE groups finds the earliest SampleDate that is later than (or equal to) each row's date and attaches it to that row, thus assigning the row to the group it will be summed in.
After that, you can group on the derived date.

sql server insert missing data

Table A
empid tdate transcode
1 2006-01-1 HI
1 2008-01-1 PR
1 2008-11-30 TE
1 2009-01-02 RH
2 2007-01-1 HI
2 2009-01-1 PR
2 2011-11-30 TE
I am trying to do this in SQL Server 2008. Basically, I want to fill in the missing tdate rows as shown below. However, if the RH tdate is greater than the TE tdate, then the missing tdate rows should be included up to the current year (2013) for each employee; otherwise they should stop at the TE tdate. Thanks for the help.
Final Results
Table A
empid tdate transcode
1 2006-01-1 HI
1 2007-01-1 HI
1 2008-01-1 PR
1 2008-11-30 TE
1 2009-01-02 RH
1 2010-01-02 RH
1 2011-01-02 RH
1 2012-01-02 RH
1 2013-01-02 RH
2 2007-01-1 HI
2 2008-01-1 HI
2 2009-01-1 PR
2 2010-01-1 PR
2 2011-11-30 TE
I started but I'm stuck, not sure if I am on the right path.
select t.empid, t.tdate,t.transcode
from table1 t
inner join table2 t2 on t.empid = t2.empid and t.tdate < t2.tdate
order by t.transcode desc
Here are the queries one can use to make my table quickly and run queries on it -
CREATE TABLE [dbo].[TableDates](
[empid] [int] NULL,
[tdate] [datetime] NULL,
[transcode] [varchar](2) NULL
) ON [PRIMARY]
GO
INSERT [dbo].[TableDates] ([empid], [tdate], [transcode]) VALUES (1, CAST(0x0000973C00000000 AS DateTime), N'HI')
INSERT [dbo].[TableDates] ([empid], [tdate], [transcode]) VALUES (1, CAST(0x00009A1600000000 AS DateTime), N'PR')
INSERT [dbo].[TableDates] ([empid], [tdate], [transcode]) VALUES (1, CAST(0x00009B6400000000 AS DateTime), N'TE')
INSERT [dbo].[TableDates] ([empid], [tdate], [transcode]) VALUES (1, CAST(0x00009B8500000000 AS DateTime), N'RH')
INSERT [dbo].[TableDates] ([empid], [tdate], [transcode]) VALUES (2, CAST(0x000098A900000000 AS DateTime), N'HI')
INSERT [dbo].[TableDates] ([empid], [tdate], [transcode]) VALUES (2, CAST(0x00009B8400000000 AS DateTime), N'PR')
INSERT [dbo].[TableDates] ([empid], [tdate], [transcode]) VALUES (2, CAST(0x00009FAB00000000 AS DateTime), N'TE')
It seems that you want to insert rows with a trans code of 'RH' for combinations of employee id and date that are not in the table.
The following query does this:
insert into TableDates(empid, tdate, transcode)
select etd.empid, etd.tdate, 'RH' as transcode
from (select e.empid, td.tdate
from (select distinct empid from TableDates) e cross join
(select distinct tdate from TableDates) td
) etd left outer join
TableDates td
on td.empid = etd.empid and td.tdate = etd.tdate
where td.empid is null;
You can see this work on this SQL Fiddle.

SQL to show concurrent cases

I need help with SQL to show concurrency by person for every minute in a day,
for the data set below:
drop table test
create table test (person varchar(2), caseid varchar(3), starttime datetime, endtime datetime)
insert into test values ('aa', '1', '01/01/2013 06:42', '01/01/2013 07:06')
insert into test values ('aa', '1', '01/01/2013 07:31', '01/01/2013 09:38')
insert into test values ('aa', '2', '01/01/2013 08:37', '01/01/2013 11:44')
insert into test values ('aa', '3','01/01/2013 09:39', '01/01/2013 11:31')
insert into test values ('aa', '4','01/01/2013 11:09', '01/01/2013 13:30')
insert into test values ('aa', '5','01/01/2013 12:05', '01/01/2013 15:38')
insert into test values ('aa', '6', '01/01/2013 13:58', '01/01/2013 14:13')
insert into test values ('aa', '7', '01/01/2013 15:53', '01/01/2013 16:14')
insert into test values ('bb', '8', '01/01/2013 08:42', '01/01/2013 09:06')
insert into test values ('bb', '8', '01/01/2013 10:31', '01/01/2013 19:38')
insert into test values ('bb', '8','01/01/2013 20:37', '01/01/2013 21:44')
insert into test values ('bb', '9', '01/01/2013 09:39', '01/01/2013 11:31')
insert into test values ('bb', '9', '01/01/2013 11:45', '01/01/2013 13:30')
insert into test values ('bb', '9', '01/01/2013 12:05', '01/01/2013 15:38')
insert into test values ('bb', '10', '01/01/2013 13:58', '01/01/2013 14:13')
insert into test values ('bb', '10', '01/01/2013 15:53', '01/01/2013 16:14')
the result needs to be similar to the following:
aa 01/01/2013 6:42 1
aa 01/01/2013 6:43 1
aa 01/01/2013 6:44 1
....
....
aa 01/01/2013 8:37 2
aa 01/01/2013 8:38 2
....
....
bb 01/01/2013 8:42 1
bb 01/01/2013 8:43 1
bb 01/01/2013 10:31 2
....
....
Thanks
You can do this with a correlated subquery:
select t.*,
       (select count(*)
        from test t2
        where t2.person = t.person and
              t2.starttime <= t.starttime and
              t2.endtime >= t.starttime
       ) as numoverlaps
from test t
(Apologies for syntax errors; I'm on a mobile device)
This finds concurrency at every time in the input data. It does not do it for every minute of time.
This seems to work, but there may be a more elegant solution:
-- get range of days involved
declare @minDate date = (select MIN(starttime) from test)
declare @maxDate date = (select MAX(endtime) from test)
-- create table containing all days
if OBJECT_ID('tempdb..#days') is not null
    drop table #days
create table #days (d date)
declare @day date = @minDate
while @day <= @maxDate
begin
    insert #days (d) values (@day)
    set @day = DATEADD(day, 1, @day)
end
-- create table containing all minutes in the day
if OBJECT_ID('tempdb..#minutes') is not null
    drop table #minutes
create table #minutes (m int)
declare @minute int = 0
while @minute < 24*60
begin
    insert #minutes (m) values (@minute)
    set @minute = @minute + 1
end
select person, dateadd(minute, m, convert(datetime, startdate)), c from
(
select person, m, startdate, count(m) c from
(
-- cross join to select all days and minutes
select d.d, m.m from #days d cross join #minutes m
)
t0
inner join
(
select
person,
convert(date, starttime) startdate,
datediff(minute, convert(date, starttime), starttime) startmin,
datediff(minute, convert(date, endtime), endtime) endmin
from test
)
t1
on t0.m between t1.startmin and t1.endmin
and t0.d = t1.startdate
group by person, m, startdate
)
t2
order by person, startdate, m, c
Here is how I would approach it if using a database that supports CTE and inline views:
First a CTE to generate a two-col listing of the (24*60) minutes of the day for a specified date:
time1 time2
2013-02-12 00:00, 2013-02-12 00:01
.
.
.
2013-02-12 23:59, 2013-02-13 00:00
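Left join that CTE to your cases table on the condition that the case overlaps the minute, that is, cases.starttime < time2 and cases.endtime >= time1. (Testing only whether starttime or endtime falls between time1 and time2 would match just the first and last minute of each case.) That brings back either nulls where no part of the case was ongoing during that minute, or the caseid and personid when a case was ongoing during that minute.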
Make the above an inline view. You end up with a set of all minutes in the day and the left-joined caseid and personid, or nulls:
time1, time2, caseid, personid
If you select from that inline view where caseid is not null you end up with the minutes where one or more cases was ongoing; if you then group by personid, time1 and count(caseid) you get the tally of cases per person in that particular one-minute time slot.
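The steps above can be sketched in T-SQL roughly as follows. This is only an illustration under a few assumptions: a recursive CTE generates the minutes, a single hard-coded day is processed, and an inner join is used in place of the left join plus "caseid is not null" filter (the two give the same rows). The names @test, @day, minutes and concurrent_cases are made up for the example, and only three of the question's rows are loaded.

```sql
DECLARE @test TABLE (person VARCHAR(2), caseid VARCHAR(3), starttime DATETIME, endtime DATETIME);
INSERT INTO @test VALUES
    ('aa', '1', '20130101 06:42', '20130101 07:06'),
    ('aa', '1', '20130101 07:31', '20130101 09:38'),
    ('aa', '2', '20130101 08:37', '20130101 11:44');

DECLARE @day DATETIME = '20130101';

-- One row per minute of the day: [time1, time2) is a one-minute slot.
WITH minutes AS (
    SELECT @day AS time1, DATEADD(MINUTE, 1, @day) AS time2
    UNION ALL
    SELECT time2, DATEADD(MINUTE, 1, time2)
    FROM   minutes
    WHERE  time2 < DATEADD(DAY, 1, @day)
)
-- A case counts for a slot when it starts before the slot ends
-- and ends at or after the slot begins.
SELECT   t.person, m.time1, COUNT(*) AS concurrent_cases
FROM     minutes AS m
JOIN     @test AS t
         ON t.starttime < m.time2 AND t.endtime >= m.time1
GROUP BY t.person, m.time1
ORDER BY t.person, m.time1
OPTION (MAXRECURSION 1440); -- the default limit of 100 is too low for 1440 minutes
```

For these rows the count is 1 at 06:42 and 2 at 08:37, in line with the output requested in the question. A permanent numbers or calendar table would likely be a better source of minutes than a recursive CTE if this runs often or spans many days.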