How to handle NULL in a SQL subquery?

I'm pulling my hair out over such a simple thing...
I'm recording the number of days a member attends a gym club. By default, I assume the member attends every day. When they are sick, I record the dates and the total number of days absent in a table (i.e. DateFrom, DateEnd, TotalDays). The total days absent is the difference between DateFrom and DateEnd.
Sometimes I don't know when the member is coming back to the gym, only that they stopped attending on a certain day. In that case DateEnd and TotalDays are unknown, so the total number of days is calculated as the difference between DateFrom and today's date.
Table: InactiveOnProgram
Columns: PersonId, DateFrom, DateEnd, TotalDays
Data:
1,01/01/2012,05/01/2012,5
1,05/01/2012,08/01/2012,3
2,01/02/2012,05/02/2012,5
2,05/02/2012,08/02/2012,3
2,20/02/2012,null,null
My query below works fine for personId=2. The total days absent is 8+2=10 days (2 days being 20/02/2012 till 22/02/2012 = today). But for personId=1, it returns null instead of 8 days!
sql:
(SELECT
case ( isnull(sum(TotalDays), 0) )
when 0 then 0
else CAST(SUM(TotalDays) as DECIMAL(20,2))
end
FROM InactiveOnProgram
)
+
(SELECT
case ( isnull( DateFrom, 0) )
when null then 0
when 0 then 0
else CAST(datediff(day,DateFrom, getdate()) as DECIMAL(20,2))
end
FROM InactiveOnProgram
WHERE (TotalDays is null or TotalDays =0)
AND DateTo is null
)
Any idea what I'm missing here?! As far as I can guess the second part of sql returns null and because of this it ignores the first part!
Any help is much appreciated.
Thanks

You can write it as a single query:
declare @InactiveOnProgram table
(PersonId int, DateFrom datetime, DateEnd datetime, TotalDays int)
insert into @InactiveOnProgram (PersonId , DateFrom , DateEnd , TotalDays)
select 1,'20120101','20120105',5 union all
select 1,'20120105','20120108',3 union all
select 2,'20120201','20120205',5 union all
select 2,'20120205','20120208',3 union all
select 2,'20120220',null,null
-- use the stored TotalDays when present, otherwise count the days from DateFrom until now
select PersonId,SUM(COALESCE(TotalDays,DATEDIFF(day,DateFrom,CURRENT_TIMESTAMP))) as TotalDaysAbsent
from @InactiveOnProgram group by PersonId
I'm not really happy with the storing of TotalDays, but given your data set, it seems necessary, since apparently, from 1st - 5th = 5 days, but from 5th - 8th = 3 days.

Do you only guess that second part returns null or do you know that? Because as far as I can see, first part is returning something undefined.
You need to apply SUM() and ISNULL() in a different order, like:
select cast(sum(isnull(TotalDays, 0)) as decimal(20,2)) as totdays
And in the second case you can use the following:
datediff(day, isnull(DateFrom, getdate()), getdate())
This way you eliminate null values before the calculation/conversion.

Maybe this is your solution:
select personid, sum( closed + unclosed)
from (
SELECT personid
, CAST(SUM(isnull(nullif(TotalDays,0),0)) as DECIMAL(20,2)) as closed
, case when min(isnull(nullif(DateFrom,0),0))=0 OR (SUM(isnull(nullif(TotalDays,0),0)) >0 AND min(isnull(nullif(dateend,0),0)) >0) then 0 else min(CAST(datediff(day,DateFrom, getdate()) as DECIMAL(20,2))) end as unclosed
FROM test
group by personid
--WITH ROLLUP
) as test
group by personid
WITH ROLLUP

The problem is basically that in SQL a null term in a calculation results in a null.
Make sure null is not possible in your results.
BTW, your logic is waaaaay too complicated - simplify it
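To make the point about NULL concrete, here is a minimal sketch for personId=1, assuming the sample data from the question loaded into a table variable (the variable name and the column aliases Unguarded and Guarded are only for illustration). The second subquery finds no open-ended row, so its SUM is NULL, and adding NULL to 8 gives NULL unless each part is wrapped in ISNULL:

```sql
DECLARE @InactiveOnProgram TABLE
    (PersonId int, DateFrom datetime, DateEnd datetime, TotalDays int);

INSERT INTO @InactiveOnProgram (PersonId, DateFrom, DateEnd, TotalDays)
VALUES (1, '20120101', '20120105', 5),
       (1, '20120105', '20120108', 3);

SELECT
    -- Unguarded: no row has DateEnd IS NULL, so the second SUM is NULL
    -- and 8 + NULL evaluates to NULL.
    (SELECT SUM(TotalDays)
     FROM @InactiveOnProgram
     WHERE PersonId = 1)
  + (SELECT SUM(DATEDIFF(day, DateFrom, GETDATE()))
     FROM @InactiveOnProgram
     WHERE PersonId = 1 AND DateEnd IS NULL) AS Unguarded,
    -- Guarded: each part falls back to 0, so the result is 8 + 0 = 8.
    ISNULL((SELECT SUM(TotalDays)
            FROM @InactiveOnProgram
            WHERE PersonId = 1), 0)
  + ISNULL((SELECT SUM(DATEDIFF(day, DateFrom, GETDATE()))
            FROM @InactiveOnProgram
            WHERE PersonId = 1 AND DateEnd IS NULL), 0) AS Guarded;
```

The single-query COALESCE version in the first answer avoids the problem entirely, since the fallback is applied per row before summing.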

Related

Split Hours into Multiple Months using SQL

CREATE TABLE EventLog
(
EventID INT
, EventName VARCHAR(50) NOT NULL
, EventStartDateTime DATETIME NOT NULL
, EventEndDateTime DATETIME NULL
)
INSERT INTO EventLog(EventID, EventName, EventStartDateTime, EventEndDateTime)
VALUES(100, 'Planting', '20210620 10:34:09 AM','20211018 10:54:49 PM')
,(200, 'Foundation', '20200420 10:34:09 AM','20211018 10:54:49 PM')
,(300, 'Seeding', '20210410 10:27:19 AM','')
,(400, 'Spreading', '20220310 10:24:09 PM','')
I have a requirement to split hours into multiple months and even years depending on the length of the event. In some cases, the event may have an end date but if the event is still ongoing, there will be no end date.
The result or output of the solution is to be held by another table:
CREATE TABLE EventSummary
(
EventID INT
, EventName VARCHAR(50) NOT NULL
, [Year] INT
, [MonthName] VARCHAR(25)
, [Hours] DECIMAL(12,2)
)
The image above is the output of the first row.
If the event runs over multiple years, the values should be spread across the multiple years and months likewise.
In cases where there is no end date I am to use GETUTCDATE() to do the calculation.
Some events span across months or even years. I would like to be able to break down the total duration into the duration for each individual month, in hours, and populate it into a table.
Consider that I have an event with start and end date: '20210620 10:34:09 AM','20211018 10:54:49 PM'
For the first month which basically is not a full month, I am to calculate the remaining hours of that month and store it against that month.
I do the same for the next month. If the event runs for the entire month, which here is July, I store the full hours for that month (744 hours) against July. I keep doing that until the end of the event. But if the event is still open (blank or empty end date), I use GETUTCDATE() as the end date.
The sum of Hours is grouped by EventID, EventName, Year and Months
It is expected that the first and or last month may be decimals as they may not be fully formed.
I have tried to work this out but do not know how to get the best result with SQL Server.
I would appreciate your help with this.
Thanks
If I understand correctly, you can use a recursive CTE to get the start date of each month, then use a second CTE (CTE2) to get the start and end time of each monthly segment.
;WITH CTE AS (
SELECT EventID,EventName,EventStartDateTime,IIF(EventEndDateTime = '',GETUTCDATE(),EventEndDateTime) EventEndDateTime
FROM EventLog
UNION ALL
SELECT EventID,EventName, DATEADD(month, DATEDIFF(month, 0, DATEADD(month , 1 , EventStartDateTime)), 0) , EventEndDateTime
FROM CTE
WHERE DATEADD(month, DATEDIFF(month, 0, DATEADD(month , 1 , EventStartDateTime)), 0) <= EventEndDateTime
), CTE2 AS (
SELECT EventID,EventName,EventStartDateTime,LEAD(EventStartDateTime,1,EventEndDateTime) OVER(PARTITION BY EventID,EventName ORDER BY EventStartDateTime) n_EventStartDateTime
FROM CTE
)
INSERT INTO EventSummary(EventID,EventName,Year,MonthName,Hours)
SELECT EventID,EventName,YEAR(EventStartDateTime),DATENAME(MONTH,EventStartDateTime),DATEDIFF(second, EventStartDateTime, n_EventStartDateTime) / 3600.0
FROM CTE2
option (maxrecursion 0)
sqlfiddle
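As a rough check of the month-boundary arithmetic used in the recursive step, here is a small sketch for the first segment of the 'Planting' event. The variable names are made up for illustration; the DATEADD/DATEDIFF expression is the same idiom as in the query above.

```sql
DECLARE @SegmentStart datetime = '20210620 10:34:09';

-- First day of the following month at midnight: count whole months since
-- 1900-01-01 (date 0), add one, and convert back to a datetime.
DECLARE @NextMonthStart datetime =
    DATEADD(month, DATEDIFF(month, 0, @SegmentStart) + 1, 0);

SELECT @NextMonthStart AS NextMonthStart,                 -- 2021-07-01 00:00:00
       DATEDIFF(second, @SegmentStart, @NextMonthStart) / 3600.0
           AS HoursInFirstMonth;                          -- about 253.43 hours
```

Dividing the second count by 3600.0 rather than 3600 keeps the fractional hours, which is why the first and last months can come out as decimals.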

How to calculate number of days working on tasks if we have many tasks and the date range of each tasks could have overlap

I ran into a question at work and would really appreciate it if anyone could give me some ideas.
We have a table that keeps track of the tasks each employee has finished. The table structure is as follows:
EmployeeNum | TaskID |Start Date of task | End Date of task
I want to calculate how many days each employee has invested in each task using this table. At first my code looks like this:
Select
EmployeeNum,TaskID,DateDiff(day,StartDate,EndDate)+1 as PureDay
from
TaskTable
Group by
EmployeeNum,TaskID
But then I found a problem that there are overlaps in the date range for each task.
For example, we have TaskA, TaskB, TaskC for one employee.
TaskA is from 2018-10-01 to 2018-10-05
TaskB from 2018-10-02 to 2018-10-07
TaskC from 2018-10-09 to 2018-10-10
In this way, the actual working days of this employee should be from 2018-10-01 to 2018-10-07, and then 2018-10-09 to 2018-10-10 which is 9 days. If I calculate date range of each task then add them together then actual working days become 5+6+2=13 days instead of 9.
I'm wondering whether there is a good way to solve this overlap problem. Thank you very much for any ideas!
The following query counts how many working days (excluding weekends) each employee spent on each task:
SELECT
EmployeeNum,
TaskID,
(DATEDIFF(dd, StartDate, EndDate) + 1)
-(DATEDIFF(wk, StartDate, EndDate) * 2)
-(CASE WHEN DATENAME(dw, StartDate) = 'Sunday' THEN 1 ELSE 0 END)
-(CASE WHEN DATENAME(dw, EndDate) = 'Saturday' THEN 1 ELSE 0 END) as PureDay
FROM
TaskTable
See this link for an explanation of how this computation works.
Once you know the date when a task starts, you can use a cumulative sum to assign a group to each record and then simply aggregate by that group (and other information).
The following query should do what you want:
with starts as (
select sm.*,
(case when exists (select 1
from tb_TaskMaster sm2
where sm2.EmpID = sm.EmpID and
sm2.StartDate < sm.StartDate and
sm2.EndDate >= sm.StartDate
)
then 0 else 1
end) as isstart
from tb_TaskMaster sm
)
select EmpID, count(TaskId) as cnt_TaskID, min(StartDate) as StartDate, max(EndDate) as EndDate,
datediff(Day, min(StartDate), max(EndDate)) + 1 as PureDay
from (select s.*, sum(isstart) over (partition by EmpID order by StartDate) as grp
from starts s
) s
group by EmpID, grp
order by EmpID
In this db<>fiddle, you could find the DDL & DML for my example data and the working of the code.
You can try this.
I'm not sure it will work in every case, but you can give it a try :)
declare @table table (empid int,taskid nvarchar(50),startdate date, enddate date)
insert into @table
values
(1,'TaskA','2018-10-01','2018-10-05'),
(1,'TaskB','2018-10-02','2018-10-07'),
(1,'TaskC','2018-10-09','2018-10-10')
select *,case when comparedate > startdate then datediff(dd,comparedate,enddate) else datediff(dd,startdate,enddate)+1 end as countofworkingdays from (
Select empid,taskid,startdate,enddate,lag(enddate,1,'1900-01-01') over(partition by empid order by startdate) as CompareDate from @table
)x
This eliminates overlapping ranges by adjusting the start date based on all previous end dates:
with maxEndDates as
( -- find the maximum previous end date
Select empid,taskid,startdate,enddate,
max(EndDate)
over (partition by EmpID
order by StartDate, EndDate desc
rows between unbounded preceding and 1 preceding) as maxEndDate
from TaskTable
),
daysPerTask as
( -- calculate the difference based on the adjusted start date to eliminate overlapping days
select *,
case when maxEndDate >= enddate then 0 -- range already fully covered
when maxEndDate > startdate then datediff(dd, maxEndDate, enddate) -- range partially overlapping
else datediff(dd, startdate, enddate)+1 -- new range
end as dayCount
from maxEndDates
)
-- get the final count
select EmpID, sum(dayCount)
from daysPerTask
group by EmpID;
See db<>fiddle
Thank you all very much for your responses and help. While searching Stack Overflow I found a solution; here is its link:
T-SQL date range in a table split and add the individual date to the table
The Tally table suggested by Felix in the above question is a great way to solve my problem since I have millions of records and the real situation is really complicated.
Thank you all again for your help!
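For reference, a minimal sketch of the tally-table idea applied to the example from the question. This is not the code from the linked answer; the table variable, the Tally CTE built from sys.all_objects, and the 1000-day limit are all assumptions made for illustration. Each task is expanded into one row per day, and the distinct days are then counted per employee:

```sql
DECLARE @TaskTable TABLE
    (EmployeeNum int, TaskID varchar(10), StartDate date, EndDate date);

INSERT INTO @TaskTable VALUES
(1, 'TaskA', '2018-10-01', '2018-10-05'),
(1, 'TaskB', '2018-10-02', '2018-10-07'),
(1, 'TaskC', '2018-10-09', '2018-10-10');

WITH Tally AS (
    -- Numbers 0..999, assumed to be enough for tasks shorter than 1000 days
    SELECT TOP (1000)
           CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS int) AS n
    FROM sys.all_objects
)
SELECT t.EmployeeNum,
       -- overlapping days collapse because each calendar day is counted once
       COUNT(DISTINCT DATEADD(day, y.n, t.StartDate)) AS PureDay
FROM @TaskTable t
JOIN Tally y
  ON y.n <= DATEDIFF(day, t.StartDate, t.EndDate)
GROUP BY t.EmployeeNum;
```

For the sample rows this gives 9 days for employee 1 (1st to 7th plus 9th and 10th), matching the expected result in the question. With millions of rows a permanent numbers or calendar table would probably be a better fit than generating one per query.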

Is there a way to aggregate a variable range of dates in SQL using a SET operation

I have a table like this one....
CREATE TABLE AbsentStudents
(
Id int not null primary key identity(1,1),
StudentId int not null,
AbsentDate datetime not null
)
This is a very large table that has 1 row for each student for each day that they were absent.
I have been asked to write a stored procedure that gets student absences by date range. What makes this query tricky is that I have to filter/aggregate by "absence episodes". The number of days that constitutes an "absence episode" is a procedure parameter so it can vary.
So for example, I need to get a list of students who were absent between 1/1/2016 and 1/17/2016 but only if they were absent for more than @Days (2 or 3 or whatever the parameter dictates) days.
I think that alone I could figure out. However, within the date range a student can have more than one "absence episode". So a student might have been absent for 3 days at the beginning of the date range, 2 days in the middle of the date range, and 4 days at the end of the date range, and each of those constitutes a different "absence episode". Assuming that my @Days parameter is 2, that should return 3 rows for that student. And each returned row should calculate how many days the student was absent for that "absence episode."
So I would like my procedure to require 3 parameters (@StartDate datetime, @EndDate datetime, @Days int) and return something like this...
StudentId, InitialAbsentDate, ConsecutiveDaysMissed
And ideally it would do this using a SET operation and avoid cursors. (Although cursors are fine if that is the only option.)
UPDATE (by Shnugo)
A test scenario
DECLARE @AbsentStudents TABLE(
Id int not null primary key identity(1,1),
StudentId int not null,
AbsentDate datetime not null
);
INSERT INTO @AbsentStudents VALUES
--student 1
(1,{d'2016-10-01'}),(1,{d'2016-10-02'}),(1,{d'2016-10-03'}) --three days
,(1,{d'2016-10-05'}) --one day
,(1,{d'2016-10-07'}),(1,{d'2016-10-08'}) --two days
--student 2
,(2,{d'2016-10-01'}),(2,{d'2016-10-02'}),(2,{d'2016-10-03'}),(2,{d'2016-10-04'}) --four days
,(2,{d'2016-10-08'}),(2,{d'2016-10-09'}),(2,{d'2016-10-10'}) --three days
,(2,{d'2016-10-12'}); --one day
DECLARE @startDate DATETIME={d'2016-10-01'};
DECLARE @endDate DATETIME={d'2016-10-31'};
DECLARE @Days INT = 3;
If you just want periods of times when students are absent, you can do this with a difference of row numbers approach.
Now, the following assumes that days are sequential with no gaps and uses the difference of row numbers to get periods of absences:
select StudentId,
min(AbsentDate) as first_absent_date,
max(AbsentDate) as last_absent_date,
count(*) as number_of_days
from (select a.*,
row_number() over (partition by StudentId order by AbsentDate) as seqnum_sa
from AbsentStudents a
) a
group by StudentId,
dateadd(day, - seqnum_sa, AbsentDate);
Notes:
You have additional requirements on minimum days and date ranges. These are easily handled with a where clause (for the date range) and a having clause (for the minimum number of days).
I suspect you have a hidden requirement on avoiding weekends and holidays. Neither this nor the other answers cover that. Ask another question if this is an issue.
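Here is a sketch of how the date range and minimum-days filters could be bolted onto the difference-of-row-numbers query, using the test scenario from the question. It assumes one row per student per day with the time part at midnight; the output column names follow the ones the question asked for.

```sql
DECLARE @AbsentStudents TABLE
    (StudentId int NOT NULL, AbsentDate datetime NOT NULL);

INSERT INTO @AbsentStudents VALUES
(1,'20161001'),(1,'20161002'),(1,'20161003'),(1,'20161005'),
(1,'20161007'),(1,'20161008'),
(2,'20161001'),(2,'20161002'),(2,'20161003'),(2,'20161004'),
(2,'20161008'),(2,'20161009'),(2,'20161010'),(2,'20161012');

DECLARE @StartDate datetime = '20161001',
        @EndDate   datetime = '20161031',
        @Days      int      = 3;

SELECT StudentId,
       MIN(AbsentDate) AS InitialAbsentDate,
       COUNT(*)        AS ConsecutiveDaysMissed
FROM (SELECT a.StudentId, a.AbsentDate,
             ROW_NUMBER() OVER (PARTITION BY a.StudentId
                                ORDER BY a.AbsentDate) AS seqnum
      FROM @AbsentStudents a
      WHERE a.AbsentDate BETWEEN @StartDate AND @EndDate) a
-- consecutive dates minus an increasing counter give the same anchor date
GROUP BY StudentId, DATEADD(day, -seqnum, AbsentDate)
-- keep only episodes of at least @Days days
HAVING COUNT(*) >= @Days
ORDER BY StudentId, InitialAbsentDate;
```

With @Days = 3 this returns three episodes: student 1 from 2016-10-01 (3 days), student 2 from 2016-10-01 (4 days), and student 2 from 2016-10-08 (3 days).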
You can try this query:
SELECT
StudentId
, MIN(AbsentDate) AS InitialDate
, COUNT(*) AS ConsecutiveDaysMissed
FROM (
SELECT
dateNumber - ROW_NUMBER() OVER(PARTITION BY StudentId ORDER BY dateNumber) AS PeriodId
, AbsentDate
, StudentId
FROM(
SELECT
StudentId
, AbsentDate
, CAST(CONVERT(CHAR(8), AbsentDate, 112) AS INT) AS dateNumber
FROM AbsentStudents
WHERE AbsentDate BETWEEN @StartDate AND @EndDate
) AS T
) AS StudentPeriod
GROUP BY StudentID, PeriodId
Well, you can make a table with dates and their order numbers without holidays and weekends. Then make the join with AbsentStudents by date and use order number instead of CAST(CONVERT(CHAR(8), AbsentDate, 112) AS INT) AS dateNumber.
You can use a trick. If you order by date, you can find groups of consecutive dates by taking the number of days since the smallest date and subtracting a counter that goes up by one for every row; consecutive dates end up with the same group number.
SELECT StudentId
FROM (
SELECT StudentId, GROUP_NUM, COUNT(*) AS GROUP_DAY_CNT
FROM (
SELECT StudentId,
DATEDIFF(dd, M.MinDate, AbsentDate) - ROW_NUMBER() OVER (PARTITION BY StudentId ORDER BY AbsentDate) AS GROUP_NUM
FROM AbsentStudents
CROSS JOIN (SELECT MIN(AbsentDate) AS MinDate FROM AbsentStudents WHERE AbsentDate BETWEEN @StartDate AND @EndDate) M
WHERE AbsentDate BETWEEN @StartDate AND @EndDate
) X
GROUP BY StudentId, GROUP_NUM
) Z
WHERE GROUP_DAY_CNT >= @Days

How to count number of work days and hours extracting public holidays between two dates in SQL

I am new to SQL and stuck in some complex query.
What am I trying to achieve?
I want to calculate following two types of total days between two timestamp fields.
Number of Working Days (Excluding Weekend & Public Holidays)
Number of Total Days (Including Weekend & Public Holidays)
Calculation Condition
If OrderDate time is <= 12:00 PM then start count from 0
If OrderDate Time is > 12:00 PM then start count from -1
If Delivery Date is NULL then count the difference up to today's date
Data Model
OrderDate & DeliveryDate resides in 'OrderTable'
PublicHolidayDate resides 'PublicHolidaysTable'
As with many tasks in SQL, this could be solved in multiple ways.
You can use COUNT aggregate operations on the date range with the BETWEEN operator to give you aggregate totals of the weekend days and holidays from a start date (OrderDate) to an end date (DeliveryDate).
This functionality can be combined with CTEs (Common Table Expressions) to give you the end result set you are looking for.
I've put together a query that illustrates one way you could go about doing it. I've also put together some test data and results to illustrate how the query works.
DECLARE @DateRangeBegin DATETIME = '2016-01-01'
, @DateRangeEnd DATETIME = '2016-07-01'
DECLARE @OrderTable TABLE
(OrderNum INT, OrderDate DATETIME, DeliveryDate DATETIME)
INSERT INTO @OrderTable VALUES
(1, '2016-02-12 09:30', '2016-03-01 13:00')
, (2, '2016-03-15 13:00', '2016-03-30 13:00')
, (3, '2016-03-22 14:00', NULL)
, (4, '2016-05-06 10:00', '2016-05-19 13:00')
DECLARE @PublicHolidaysTable TABLE
(PublicHolidayDate DATETIME, Description NVARCHAR(255))
INSERT INTO @PublicHolidaysTable VALUES
('2016-02-15', 'President''s Day')
, ('2016-03-17', 'St. Patrick''s Day')
, ('2016-03-25', 'Good Friday')
, ('2016-03-27', 'Easter Sunday')
, ('2016-05-05', 'Cinco de Mayo')
One consideration you may have already thought of is that you don't want to count both weekend days and holidays that occur on a weekend, unless your company observes the holiday on the next Monday. For simplicity, I've excluded any holiday that occurs on a weekend day in the query.
You'll also want to limit this type of query to a specific date range.
The first CTE (cteAllDates) gets all dates between the start and end date range.
The second CTE (cteWeekendDates) gets all weekend dates from the first CTE (cteAllDates).
The third CTE (ctePublicHolidays) gets all holidays that occur on weekdays from your PublicHolidaysTable.
The last CTE (cteOrders) fulfills the requirement that the count of total days and working days must begin from the next day if the OrderDate is after 12:00PM and the requirement that the DeliveryDate should use today's date if it is null.
The select statement at the end of the CTE statements gets your total day count, weekend count, holiday count, and working days for each order.
;WITH cteAllDates AS (
SELECT 1 [DayID]
, @DateRangeBegin [CalendarDate]
, DATENAME(dw, @DateRangeBegin) [NameOfDay]
UNION ALL
SELECT cteAllDates.DayID + 1 [DayID]
, DATEADD(dd, 1 ,cteAllDates.CalendarDate) [CalendarDate]
, DATENAME(dw, DATEADD(d, 1 ,cteAllDates.CalendarDate)) [NameOfDay]
FROM cteAllDates
WHERE DATEADD(d,1,cteAllDates.CalendarDate) < @DateRangeEnd
)
, cteWeekendDates AS (
SELECT CalendarDate
FROM cteAllDates
WHERE NameOfDay IN ('Saturday','Sunday')
)
, ctePublicHolidays AS (
SELECT PublicHolidayDate
FROM @PublicHolidaysTable
WHERE DATENAME(dw, PublicHolidayDate) NOT IN ('Saturday', 'Sunday')
)
, cteOrders AS (
SELECT OrderNum
, OrderDate
, CASE WHEN DATEPART(hh, OrderDate) >= 12 THEN DATEADD(dd, 1, OrderDate)
ELSE OrderDate
END [AdjustedOrderDate]
, CASE WHEN DeliveryDate IS NOT NULL THEN DeliveryDate
ELSE GETDATE()
END [DeliveryDate]
FROM @OrderTable
)
SELECT o.OrderNum
, o.OrderDate
, o.DeliveryDate
, DATEDIFF(DAY, o.AdjustedOrderDate, o.DeliveryDate) [TotalDayCount]
, (SELECT COUNT(*) FROM cteWeekendDates w
WHERE w.CalendarDate BETWEEN o.AdjustedOrderDate AND o.DeliveryDate) [WeekendDayCount]
, (SELECT COUNT(*) FROM ctePublicHolidays h
WHERE h.PublicHolidayDate BETWEEN o.AdjustedOrderDate AND o.DeliveryDate) [HolidayCount]
, DATEDIFF(DAY, o.AdjustedOrderDate, o.DeliveryDate)
- (SELECT COUNT(*) FROM cteWeekendDates w
WHERE w.CalendarDate BETWEEN o.AdjustedOrderDate AND o.DeliveryDate)
- (SELECT COUNT(*) FROM ctePublicHolidays h
WHERE h.PublicHolidayDate BETWEEN o.AdjustedOrderDate AND o.DeliveryDate) [WorkingDays]
FROM cteOrders o
WHERE o.OrderDate BETWEEN @DateRangeBegin AND @DateRangeEnd
OPTION (MaxRecursion 500)
Results from the above query using the test data...
What I'd probably do is simplify the above by adding a Calendar table populated with sufficiently wide date ranges. Then I'd take some of the CTE statements and turn them into views.
I think specifically valuable to you would be a view that gets you the work days without weekends or holidays. Then you could just simply get the date difference between the two dates and count the work days in the same range.
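A rough sketch of what that Calendar table idea could look like, using table variables so it can be run on its own. The names @Calendar, @Holidays and IsWorkDay are hypothetical, the calendar only covers March 2016, and the weekday names assume an English language setting:

```sql
DECLARE @Holidays TABLE (HolidayDate date PRIMARY KEY);
INSERT INTO @Holidays VALUES ('2016-03-17'), ('2016-03-25');

DECLARE @Calendar TABLE (CalendarDate date PRIMARY KEY, IsWorkDay bit NOT NULL);

-- Populate one row per day and flag weekends and holidays as non-working.
WITH Dates AS (
    SELECT CAST('2016-03-01' AS date) AS d
    UNION ALL
    SELECT DATEADD(day, 1, d) FROM Dates WHERE d < '2016-03-31'
)
INSERT INTO @Calendar (CalendarDate, IsWorkDay)
SELECT d,
       CASE WHEN DATENAME(dw, d) IN ('Saturday', 'Sunday')
              OR EXISTS (SELECT 1 FROM @Holidays h WHERE h.HolidayDate = d)
            THEN 0 ELSE 1 END
FROM Dates;

-- Work days after 2016-03-15 up to and including 2016-03-30
-- (the order was placed after noon, so counting starts the next day).
SELECT COUNT(*) AS WorkingDays
FROM @Calendar
WHERE IsWorkDay = 1
  AND CalendarDate >  '2016-03-15'
  AND CalendarDate <= '2016-03-30';   -- 15 days - 4 weekend days - 2 holidays = 9
```

With a permanent calendar table the recursive population step runs once, and every later query becomes a simple COUNT over a date range. Whether the end date should be inclusive is a business rule to decide; the sketch includes it.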

Pending Monthly SQL Counts

The query below returns accurate info, but I haven't had any luck making it:
1) More dynamic, so I'm not repeating the same lines of code for every month
2) Formatted differently, so that just 2 columns (month and year) are needed to view pending counts by field1 + field2
Example code (basically, count a row when the OPEN date is on or before the last day of the month and the CLOSE date comes after the month or it's still open):
SELECT
SUM(CAST(case when OPENDATE <= '2014-11-30 23:59:59'
and ((CLOSED >= '2014-12-01')
or (CLOSED is null)) then '1' else '0' end as int)) Nov14
,SUM(CAST(case when OPENDATE <= '2014-12-31 23:59:59'
and ((CLOSED >= '2015-01-01')
or (CLOSED is null)) then '1' else '0' end as int)) Dec14
,SUM(CAST(case when OPENDATE <= '2015-01-31 23:59:59'
and ((CLOSED >= '2015-02-01')
or (CLOSED is null)) then '1' else '0' end as int)) Jan15
,FIELD1,FIELD2
FROM T
GROUP BY FIELD1,FIELD2
Results:
FIELD1 FIELD2 NOV14 DEC14 JAN15
A A 2 5 7
A B 6 8 4
C A 5 6 5
…
Instead of:
COUNT FIELD1 FIELD2 MO YR
14 A A 12 2014
18 A B 12 2014
16 C A 1 2015
...
Is there a way to get this in one shot? Sorry if this is a repeat topic, I've looked at some boards and they've helped me get closing counts.. but using a range between two date fields, I haven't had any luck.
Thanks in advance
One way to do it is to use a table of numbers or calendar table.
In the code below, the table Numbers has a column Number, which contains integers starting from 1. There are many ways to generate such a table.
You can do it on the fly or keep an actual table. I personally have such a table in the database with 100,000 rows.
The first CROSS APPLY effectively creates a column CurrentMonth, so that I don't have to repeat the call to DATEADD many times later.
The second CROSS APPLY is the query that you want to run for each month. It can be as complicated as needed and can return more than one row.
-- Start and end dates should be the first day of the month
DECLARE @StartDate date = '20141201';
DECLARE @EndDate date = '20150201';
SELECT
CurrentMonth
,FIELD1
,FIELD2
,Counts
FROM
Numbers
CROSS APPLY
(
SELECT DATEADD(month, Numbers.Number-1, @StartDate) AS CurrentMonth
) AS CA_Month
CROSS APPLY
(
SELECT
FIELD1
,FIELD2
,COUNT(*) AS Counts
FROM T
WHERE
OPENDATE < CurrentMonth
AND (CLOSED >= CurrentMonth OR CLOSED IS NULL)
GROUP BY
FIELD1
,FIELD2
) AS CA
WHERE
Numbers.Number < DATEDIFF(month, @StartDate, @EndDate) + 1
;
If you provide a table with sample data and expected output, I could verify that the query produces correct results.
The solution is written in SQL Server 2008.
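If you don't have a Numbers table yet, here is one way to generate it on the fly. This is a sketch only: it assumes that cross joining sys.all_objects with itself yields at least 1000 rows (true on typical installations), and the cap of 1000 is arbitrary.

```sql
DECLARE @StartDate date = '20141201';
DECLARE @EndDate   date = '20150201';

WITH Numbers AS (
    -- Integers 1..1000 built from a system catalog view
    SELECT TOP (1000)
           CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS int) AS Number
    FROM sys.all_objects a
    CROSS JOIN sys.all_objects b
)
SELECT Number,
       DATEADD(month, Number - 1, @StartDate) AS CurrentMonth
FROM Numbers
WHERE Number < DATEDIFF(month, @StartDate, @EndDate) + 1;
```

For the dates above this returns two rows, 2014-12-01 and 2015-01-01, which are the month boundaries the main query iterates over. The same CTE can replace the Numbers table in that query.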
Like this:
SELECT
FIELD1,FIELD2,datepart(month, OPENDATE), datepart(year, OPENDATE), sum(1)
FROM T
GROUP BY FIELD1,FIELD2, datepart(month, OPENDATE), datepart(year, OPENDATE)
But this of course is just based on OPENDATE, if you need to have the same thing calculated into several months, that's going to be more difficult, and you'll probably need a calendar "table" that you'll have to cross apply with this data.