How to get a SUM of a DATEDIFF but provide cut-off at 24 hours IF a single day is specified - sql

This is actually my first question on stackoverflow, so I sincerely apologize if I am confusing or unclear.
That being said, here is my issue:
I work at a car manufacturing company and we have recently implemented the ability to track when our machines are idle. This is done by assessing the start and end time of the event called "idle_start."
Right now, I am trying to get the SUM of how long a machine is idle. Now, I figured this out BUT, some of the idle_times are LONGER than 24 hours.
So, when I specify that I only want to see the idle_time sums of ONE particular day, the sum is also counting the idle time past 24 hours.
I want to provide the option of CUTTING OFF at that 24 hours. Is this possible?
Here is the query:
SELECT r.`name` 'Producer'
, m.`name` 'Manufacturer'
-- , timediff(re.time_end, re.time_start) 'Idle Time Length'
, SEC_TO_TIME(SUM((TIME_TO_SEC(TIMEDIFF(re.time_end, re.time_start))))) 'Total Time'
, (SUM((TIME_TO_SEC(TIMEDIFF(re.time_end, re.time_start)))))/3600 'Total Time in Hours'
, (((SUM((TIME_TO_SEC(TIMEDIFF(re.time_end, re.time_start)))))/3600))/((IF(r.resource_status_id = 2, COUNT(r.resource_id), NULL))*24) 'Percent Machine is Idle divided by Machine Hours'
FROM resource_event re
JOIN resource_event_type ret
ON re.resource_event_type_id = ret.resource_event_type_id
JOIN resource_event_type reep
ON ret.parent_resource_event_type_id = reep.resource_event_type_id
JOIN resource r
ON r.`resource_id` = re.`resource_id`
JOIN manufacturer m
ON m.`manufacturer_id` = r.`manufacturer_id`
WHERE re.`resource_event_type_id` = 19
AND ret.`parent_resource_event_type_id` = 3
AND DATE_FORMAT(re.time_start, '%Y-%m-%d') >= '2013-08-12'
AND DATE_FORMAT(re.time_start, '%Y-%m-%d') <= '2013-08-18'
-- AND re.`resource_id` = 8
AND "Idle Time Length" IS NOT NULL
AND r.manufacturer_id = 13
AND r.resource_status_id = 2
GROUP BY 1, 2
Feel free to ignore the dash marks up top. And please tell me if I can be more specific as to figure this out easier and provide less headaches for those willing to help me out.
Thank you so much!

You'll want a conditional SUM, using CASE.
Not sure of syntax for your db exactly, but something like:
, SUM(CASE WHEN TIME_TO_SEC(TIMEDIFF(re.time_end, re.time_start))/3600 > 24 THEN 0
ELSE TIME_TO_SEC(TIMEDIFF(re.time_end, re.time_start))/3600
END) 'Total Time in Hours'

This is not an attempt to answer your question. It's being presented as an answer rather than a comment for better formatting and readability.
You have this
AND DATE_FORMAT(re.time_start, '%Y-%m-%d') >= '2013-08-12'
AND DATE_FORMAT(re.time_start, '%Y-%m-%d') <= '2013-08-18'
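in your where clause. Wrapping a column in a function like this makes your query take longer to execute, especially on indexed fields, because an index on time_start generally cannot be used for the comparison. Comparing the column directly would run quicker (the upper bound is the day after the last date, so all of 2013-08-18 is still included):
AND re.time_start >= '2013-08-12'
AND re.time_start < '2013-08-19'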

Do you want to cut off when start/end are before/after your time range?
You can use a case to adjust it based on your timeframe, e.g. for time_start
case
when re.time_start < timestamp '2013-08-12 00:00:00'
then timestamp '2013-08-12 00:00:00'
else re.time_start
end
similar for time_end and then use those CASEs within your TIMEDIFF.
Btw, your where-condition for a given date range should be:
where time_start < timestamp '2013-08-19 00:00:00'
and time_end >= timestamp '2013-08-12 00:00:00'
This will return all idle times that overlap the range 2013-08-12 through 2013-08-18.
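To make the cut-off idea concrete outside the database, here is a minimal Python sketch of the same clipping: each event's start and end are moved to the edge of the reporting window before the durations are summed. The function name and the two sample events are hypothetical, not taken from the real resource_event table.

```python
from datetime import datetime, timedelta

def clipped_idle_seconds(events, window_start, window_end):
    """Sum idle time, counting only the part of each event inside the window."""
    total = timedelta(0)
    for start, end in events:
        # Same idea as the CASE expressions: move each bound to the window edge.
        clipped_start = max(start, window_start)
        clipped_end = min(end, window_end)
        if clipped_end > clipped_start:  # skip events entirely outside the window
            total += clipped_end - clipped_start
    return total.total_seconds()

# Hypothetical events: one spans several days, one sits inside 2013-08-12.
events = [
    (datetime(2013, 8, 11, 20, 0), datetime(2013, 8, 13, 6, 0)),
    (datetime(2013, 8, 12, 9, 0), datetime(2013, 8, 12, 10, 30)),
]
day_start = datetime(2013, 8, 12)
day_end = day_start + timedelta(days=1)
hours = clipped_idle_seconds(events, day_start, day_end) / 3600
print(hours)
```

The long event contributes exactly 24 hours for that day instead of its full length, which is the cut-off the question asks for.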

Related

Get time difference between Log records

I have a log table that tracks the bug's status. I would like to extract the amount of time spent when the log changes from OPEN (OldStatus) to FIXED or REQUEST CLOSE (NewStatus). Right now, my query looks at the max and min of the log which does not produce the result I want. For example, the bug #1 was fixed in 2 hours on 2020-01-01, then reopened (OldStatus) and got a REQUEST CLOSE (NewStatus) in 3 hours on 2020-12-12. I want the query result to return two rows with date and number of hours spent to fix the bug since its most recently opened time.
Here's the data:
CREATE TABLE Log (
BugID int,
CurrentTime timestamp,
Person varchar(20),
OldStatus varchar(20),
NewStatus varchar(20)
);
INSERT INTO Log (BugID, CurrentTime, Person, OldStatus, NewStatus)
VALUES (1, '2020-01-01 00:00:00', 'A', 'OPEN', 'In Progress'),
(1, '2020-01-01 00:00:01', 'A', 'In Progress', 'REVIEW In Progress'),
(1, '2020-01-01 02:00:00', 'A', 'In Progress', 'FIXED'),
(1, '2020-01-01 06:00:00', 'B', 'OPEN', 'In Progress'),
(1, '2020-01-01 00:00:00', 'B', 'In Progress', 'REQUEST CLOSE')
SELECT DATEDIFF(HOUR, start_time, finish_time) AS Time_Spent_Min
FROM (
SELECT BugId,
MAX(CurrentTime) as finish_time,
MIN(CurrentTime) as start_time
FROM Log
WHERE (OldStatus = 'OPEN' AND NewString = 'In Progress') OR NewString = 'FIXED'
) AS TEMP
The actual data looks as below:
This is a type of gaps-and-islands problem.
There are a number of solutions, here is one:
We need to assign a grouping ID to each island of OPEN -> In Progress. We can use windowed conditional COUNT to get a grouping number for each start point.
To get a grouping for the end point, we need to assign the previous row's NewStatus using LAG, then do another conditional COUNT on that.
We then simply group by BugId and our calculated grouping and return the start and end times
WITH IslandStart AS (
SELECT *,
COUNT(CASE WHEN OldStatus = 'OPEN' AND NewStatus = 'In Progress' THEN 1 END)
OVER (PARTITION BY BugID ORDER BY CurrentTime ROWS UNBOUNDED PRECEDING) AS GroupStart,
LAG(NewStatus) OVER (PARTITION BY BugID ORDER BY CurrentTime) AS Prev_NewStatus
FROM Log l
),
IslandEnd AS (
SELECT *,
COUNT(CASE WHEN Prev_NewStatus IN ('CLAIM FIXED', 'REQUEST CLOSE') THEN 1 END)
OVER (PARTITION BY BugID ORDER BY CurrentTime ROWS UNBOUNDED PRECEDING) AS GroupEnd
FROM IslandStart l
)
SELECT
BugId,
MAX(CurrentTime) as finish_time,
MIN(CurrentTime) as start_time,
DATEDIFF(minute, MIN(CurrentTime), MAX(CurrentTime)) AS Time_Spent_Min
FROM IslandEnd l
WHERE GroupStart = GroupEnd + 1
GROUP BY
BugId,
GroupStart;
Notes:
timestamp is not meant for actual dates and times, instead use datetime or datetime2
You may need to adjust the COUNT condition if OPEN -> In Progress is not always the first row of an island
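For readers who find the windowed version hard to follow, the pairing it performs can be sketched procedurally in Python. This is not the query above, only an illustration of the same idea, assuming each OPEN -> In Progress row starts an island and the next FIXED or REQUEST CLOSE row ends it. The rows are hypothetical and follow the question's description rather than its sample data.

```python
from datetime import datetime

CLOSING = {"FIXED", "REQUEST CLOSE"}  # statuses assumed to end an island

def islands(rows):
    """rows: (bug_id, time, old_status, new_status).
    Returns a list of (bug_id, start, finish, minutes_spent)."""
    open_since = {}  # bug_id -> time the current island started
    result = []
    for bug_id, ts, old, new in sorted(rows, key=lambda r: (r[0], r[1])):
        if old == "OPEN" and new == "In Progress":
            open_since[bug_id] = ts
        elif new in CLOSING and bug_id in open_since:
            start = open_since.pop(bug_id)
            minutes = int((ts - start).total_seconds() // 60)
            result.append((bug_id, start, ts, minutes))
    return result

# Hypothetical log: fixed in 2 hours, reopened later, closed in 3 hours.
rows = [
    (1, datetime(2020, 1, 1, 0, 0), "OPEN", "In Progress"),
    (1, datetime(2020, 1, 1, 2, 0), "In Progress", "FIXED"),
    (1, datetime(2020, 12, 12, 6, 0), "OPEN", "In Progress"),
    (1, datetime(2020, 12, 12, 9, 0), "In Progress", "REQUEST CLOSE"),
]
spans = islands(rows)
for span in spans:
    print(span)
```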
You have a few competing factors here:
You should use a SmallDateTime, DateTime2 or DateTimeOffset typed column to store the actual time in the log. These types allow for calculating the difference between values using DateDiff() and DateAdd() and other date/time based comparison logic, whereas Timestamp is designed to be used as a concurrency token: you can use it to determine if one record is more recent than another, but you shouldn't try to use it to determine the actual time of the event.
What is difference between datetime and timestamp
DATETIMEOFFSET, DATE, TIME, SMALLDATETIME, DATETIME SYSUTCDATETIME and SYSUTCDATE
You have not explained the expected workflow, we can only assume that the flow is [OPEN]=>[In Progress]=>[CLAIM FIXED]. There is also no mention of 'In Progress', which we assume is an interim state. What actually happens here is that this structure can really only tell you the time spent in the 'In Progress' state, which is probably OK for your needs as this is the time spent actually working, but it is important to recognise that we do not know when the bug is changed to 'OPEN' in the first place, unless that is also logged but we need to see the data to explain that.
Your example dataset does not cover enough combinations for you to notice that the existing logic will fail as soon as you add more than 1 bug. What is more, you have asked to calculate the number of hours, but your example data only shows a variation of minutes and has no example where the bug is completed at all.
Without a realistic set of data to test with, you will find it hard to debug your logic and hard to accept that it actually works before you execute this against a larger dataset. It can help to have a scripted scenario, much like your post here, but you should create the data to reflect that script.
You use 'FIXED' in your example, but 'CLAIM FIXED' in query, so which one is it?
Step 1: Structure
Change the datatype of CurrentTime to a DateTime based column. Your application logic may drive requirements here. If your system is cloud based or international, then you may see benefits from using DateTimeOffset instead of having to convert into UTC, otherwise if you do not need high precision timing in your logs, it is very common to use SmallDateTime for logging.
Many ORM and application frameworks will allow you to configure a DateTime based column as the concurrency token, if you need one at all. If you are not happy using a lower precision value for concurrency, then you could have the two columns side by side; either way, to compare the time difference between two records we need to use a DateTime based type.
In the case of log, we rarely allow or expect logs to be edited, if your logs are read-only then having a concurrency token at all may not be necessary, especially if you only use the concurrency token to determine concurrency during edits of individual records.
NOTE: You should consider using an enum or FK for the Status concept. Already in your example dataset there was a typo, 'In Progerss'. Using a numeric comparison for the status may provide some performance benefits, and it will also help to prevent spelling mistakes, especially when FK or lookup lists are used from any application logic.
Step 2: Example Data
If the requirement is to calculate the number of hours spent between records, then we need to create some simple examples that show a difference of a few hours, and then add some examples where the same bug is opened, fixed and then re-opened.
bug #1 was fixed in 2 hours on 2020-01-01, then reopened and got fixed in 3 hours on 2020-12-12
The following table shows the known data states and the expected hrs, we need to throw in a few more data stories to validate that the end query handles obvious boundary conditions like multiple Bugs and overlapping dates
| BUG # | Time | Previous State | New State | Hrs In Progress |
|---|---|---|---|---|
| 1 | 2020-01-01 08:00:00 | OPEN | In Progress | |
| 1 | 2020-01-01 10:00:00 | In Progress | FIXED | (2 hrs) |
| 1 | 2020-12-10 09:00:00 | FIXED | OPEN | |
| 1 | 2020-12-12 09:30:00 | OPEN | In Progress | |
| 1 | 2020-12-12 12:30:00 | In Progress | FIXED | (3 hrs) |
| 2 | 2020-03-17 11:15:00 | OPEN | In Progress | |
| 2 | 2020-03-17 14:30:00 | In Progress | FIXED | (3.25 hrs) |
| 3 | 2020-08-22 10:00:00 | OPEN | In Progress | |
| 3 | 2020-08-22 16:30:00 | In Progress | FIXED | (6.5 hrs) |
Step 3: Query
What is interesting to notice here is that 'In Progress' is actually the significant state to query against. What we actually want is to see all rows where the OldStatus is 'In Progress' and we want to link that row to the most recent record before this one with the same BugID and with a NewStatus equal to 'In Progress'
What is interesting in the above table is that not all the expected hours are whole numbers (integers), which makes using DateDiff a little bit tricky because it only counts the boundary changes, not the total number of hours. To highlight this, look at the next two queries; the first one represents 59 minutes, the other only 2 minutes:
SELECT DateDiff(HOUR, '2020-01-01 08:00:00', '2020-01-01 08:59:00') -- 0 (59m)
SELECT DateDiff(HOUR, '2020-01-01 08:59:00', '2020-01-01 09:01:00') -- 1 (2m)
The first query returns 0 hours, but the second returns 1 hour. That is because DateDiff only counts how many hour boundaries are crossed between the two values; it does not actually subtract the time values.
To work around this, we can use MINUTE or MI as the date part argument and divide the result by 60.
SELECT CAST(ROUND(DateDiff(MI, '2020-01-01 08:00:00', '2020-01-01 08:59:00')/60.0,2) as Numeric(10,2)) -- 0.98
SELECT CAST(ROUND(DateDiff(MI, '2020-01-01 08:59:00', '2020-01-01 09:01:00')/60.0,2) as Numeric(10,2)) -- 0.03
You can choose to format this in other ways by calculating the modulo to get the minutes in whole numbers instead of a fraction but that is out of scope for this post, understanding the limitations of DateDiff is what is important to take this further.
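The boundary-counting behaviour described above can be imitated outside the database. The following Python sketch is only an illustration: datediff_hour and elapsed_hours are hypothetical helper names, and the first one imitates the semantics by truncating both values to the hour before subtracting.

```python
from datetime import datetime

def truncate_to_hour(t):
    """Drop minutes, seconds and microseconds."""
    return t.replace(minute=0, second=0, microsecond=0)

def datediff_hour(start, end):
    """Count hour boundaries crossed between start and end."""
    delta = truncate_to_hour(end) - truncate_to_hour(start)
    return int(delta.total_seconds() // 3600)

def elapsed_hours(start, end):
    """Elapsed time in hours, rounded to 2 decimals (the minutes / 60.0 workaround)."""
    return round((end - start).total_seconds() / 60 / 60, 2)

a = (datetime(2020, 1, 1, 8, 0), datetime(2020, 1, 1, 8, 59))   # 59 minutes
b = (datetime(2020, 1, 1, 8, 59), datetime(2020, 1, 1, 9, 1))   # 2 minutes

print(datediff_hour(*a), elapsed_hours(*a))
print(datediff_hour(*b), elapsed_hours(*b))
```

The 59-minute span crosses no hour boundary while the 2-minute span crosses one, which is why the minute-based calculation is the safer choice for fractional hours.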
There are a number of ways to correlate a previous record within the same table. If you need other values from the record, then you might use a join with a sub-query to return the TOP 1 from all the records before the current one, or you could use window queries or a CROSS APPLY to perform a nested lookup. The following uses CROSS APPLY, which is NOT standard across all RDBMS, but I feel it keeps MS SQL queries really clean:
SELECT [Fixed].BugID, [start_time], [Fixed].[CurrentTime] as [finish_time]
, DATEDIFF(MI, [start_time], [Fixed].[CurrentTime]) / 60 AS Time_Spent_Hr
, DATEDIFF(MI, [start_time], [Fixed].[CurrentTime]) % 60 AS Time_Spent_Min
FROM Log as Fixed
CROSS APPLY (SELECT MAX(CurrentTime) AS start_time
FROM Log as Started
WHERE Fixed.BugID = Started.BugID
AND Started.NewStatus = 'In Progress'
AND CurrentTime < Fixed.CurrentTime) as Started
WHERE Fixed.OldStatus = 'In Progress'
You can play with this fiddle: http://sqlfiddle.com/#!18/c408d4/3
The results look like this:
| BugID | start_time | finish_time | Time_Spent_Hr | Time_Spent_Min |
|---|---|---|---|---|
| 1 | 2020-01-01T08:00:00Z | 2020-01-01T10:00:00Z | 2 | 0 |
| 1 | 2020-12-12T09:30:00Z | 2020-12-12T12:30:00Z | 3 | 0 |
| 2 | 2020-03-17T11:15:00Z | 2020-03-17T14:30:00Z | 3 | 15 |
| 3 | 2020-08-22T10:00:00Z | 2020-08-22T16:30:00Z | 6 | 30 |
If I assume that every "open" is followed by one "fixed" before the next open, then you can basically use lead() to solve this problem.
This version unpivots the data, so you could have "open" and "fixed" in the same row:
select l.*, datediff(hour, currenttime, fixed_time)
from (select v.*,
lead(v.currenttime) over (partition by v.bugid order by v.currenttime) as fixed_time
from log l cross apply
(values (bugid, currentTime, oldStatus),
(bugid, currentTime, newStatus)
) v(bugid, currentTime, status)
where v.status in ('OPEN', 'FIXED')
) l
where status = 'OPEN';
Here is a db<>fiddle, which uses data compatible with your explanation. (Your sample data is not correct.)

How to create time slots of the day in SQL server

I apologize in advance if you think the question has already been answered, I searched on Stack, but similar answers were much too complex for me.
I have this request in SQL :
SELECT dbo.tbTransaction.dateHeure, dbo.tbPoste.nomPoste
FROM dbo.tbTransaction
INNER JOIN dbo.tbPoste ON dbo.tbTransaction.ID_poste = dbo.tbPoste.ID_POSTE
WHERE dbo.tbPoste.id_site LIKE 17
AND dbo.tbPoste.nomPoste LIKE 'POSTE1'
AND dateHeure >= DATEADD(dd, DATEDIFF(dd,0,GETDATE()), 0)
ORDER BY dateHeure DESC
It gives me all the production of the day, the problem is : it's not precise enough because there is a team shift in the afternoon and another team shift in the night.
If you see all the production of the day, you cannot see which team produced what.
I would like to create 3 different queries that allow me to see only the production between certain hours (for example, production between 6am and noon). Do you have any advice on how to do this?
I think you will have to hardcode your time slots. For example you know that morning shift is going to work from 06:00 to 12:00.
As you know the hours, you can apply DATEPART(HOUR, ...) to your datetime column (SQL Server has no HOUR() function) with the appropriate >, <, = and AND operators to get the required results.
From what I have understood, it should be something like this:
DATEPART(HOUR, dateHeure) >= 6 AND DATEPART(HOUR, dateHeure) < 12
Okay I managed to find the query I wanted :
SELECT dbo.tbTransaction.dateHeure, dbo.tbPoste.nomPoste
FROM dbo.tbTransaction
INNER JOIN dbo.tbPoste ON dbo.tbTransaction.ID_poste = dbo.tbPoste.ID_POSTE
WHERE dbo.tbPoste.id_site LIKE 17
AND dbo.tbPoste.nomPoste LIKE 'POSTE1'
AND dateHeure >= DATEADD(dd, DATEDIFF(dd,0,GETDATE()), 0)
AND DATEPART(HOUR, dateHeure) > 6
AND DATEPART(HOUR, dateHeure) < 13
ORDER BY dateHeure DESC
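One thing worth checking with hour-based filters is which boundary hours they include. A condition of the form "hour > 6 AND hour < 13" keeps hours 7 through 12, that is 07:00 to 12:59, whereas "hour >= 6 AND hour < 12" keeps 06:00 to 11:59. The Python sketch below shows the half-open style; the shift names and boundaries are hypothetical and should be replaced with the real timetable.

```python
from datetime import datetime, time

# Hypothetical shift boundaries; each slot is half-open: start <= t < end.
SHIFTS = [
    ("morning", time(6, 0), time(12, 0)),
    ("afternoon", time(12, 0), time(20, 0)),
]

def shift_for(ts):
    """Return the name of the shift a timestamp falls into."""
    t = ts.time()
    for name, start, end in SHIFTS:
        if start <= t < end:
            return name
    return "night"  # everything outside the listed slots

for sample in (datetime(2024, 1, 1, 6, 0), datetime(2024, 1, 1, 12, 0),
               datetime(2024, 1, 1, 23, 15)):
    print(sample, shift_for(sample))
```

With half-open slots every timestamp belongs to exactly one shift, so the three queries never double-count or drop a row at a shift change.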

SQL Date Calculation formatting

It's been a while since I posted. I have an issue regarding date calculations.
I am trying to find the difference between two dates as in start time and finish time.
I have been able to find the difference in days so for instance if I have the dates:
start = 12/11/2014 12:05:05
finish = 13/11/2014 09:44:19
then the query gives me -0.90224537......
However, I need the answer in the form of hours, minutes, seconds for wage purposes. What is the best way of doing this?
My query so far is:
select
time_sheet.time_sheet_id,
time_sheet.start_date_time - time_sheet.finish_date_time,
employee_case.case, employee_case.employee
from
time_sheet
inner join
employee_case on time_sheet.employee_case = employee_case.employee_case_id
where
employee_case.case = 1;
P.S. I am using an Oracle database :)
date - date yields the difference in days. So you can use the standard time conversion to convert it to hours,minutes and seconds as below:
select
time_sheet.time_sheet_id,
(time_sheet.finish_date_time - time_sheet.start_date_time)||' days'||(time_sheet.finish_date_time - time_sheet.start_date_time)*24||' hours '||(time_sheet.finish_date_time - time_sheet.start_date_time)*24*60||' minutes '||(time_sheet.finish_date_time - time_sheet.start_date_time)*24*60*60||' seconds ' ,
employee_case.case, employee_case.employee
from
time_sheet
inner join
employee_case on time_sheet.employee_case = employee_case.employee_case_id
where
employee_case.case = 1;
Also, you would need finish_date_time as the minuend and start_date_time as the subtrahend to avoid negative values.
Thanks for the help, toddlermenot, but I have decided to go with another approach I found. For anybody else who has the same issue, I did the following:
select
time_sheet.time_sheet_id,
to_number( to_char(to_date('1','J') +
(time_sheet.finish_date_time - time_sheet.start_date_time), 'J') - 1) days,
to_char(to_date('00:00:00','HH24:MI:SS') +
(time_sheet.finish_date_time - time_sheet.start_date_time), 'HH24:MI:SS') time,
employee_case.case, employee_case.employee
from
time_sheet
inner join
employee_case on time_sheet.employee_case = employee_case.employee_case_id
where
employee_case.case = 1;
This seems to do exactly what I need. It puts the days in a separate field from the hours, minutes and seconds, but for my purposes this is acceptable.
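As a cross-check on the arithmetic, the same split into whole days plus an HH:MM:SS remainder can be sketched in Python using the two timestamps from the question. The function name split_duration is hypothetical; the sketch only mirrors the idea of separating days from the time part.

```python
from datetime import datetime

def split_duration(start, finish):
    """Return (whole_days, 'HH:MM:SS') for the time between start and finish."""
    total_seconds = int((finish - start).total_seconds())
    days, rest = divmod(total_seconds, 86400)   # 86400 seconds per day
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return days, f"{hours:02d}:{minutes:02d}:{seconds:02d}"

start = datetime(2014, 11, 12, 12, 5, 5)    # 12/11/2014 12:05:05
finish = datetime(2014, 11, 13, 9, 44, 19)  # 13/11/2014 09:44:19

result = split_duration(start, finish)
day_fraction = (finish - start).total_seconds() / 86400
print(result, day_fraction)
```

The day fraction matches the 0.90224537 from the question (with the sign flipped, since finish minus start is used here), and it corresponds to 21 hours, 39 minutes and 14 seconds.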

Change select to a previous date

I have basic knowledge of SQL and have a question:
I am trying to select data from a time series (date and windspeed). I want to select the original wind speed value if it lies between hours 7 and 21. If the hour is outside this range I would like to assign the wind speed to the previous wind speed at hour 21. There is also a concern that there is the occasional point where hour 21 does not exist and would like to assign the windspeed as hour 20... 19 etc until it finds the next available hour.
SELECT
date,
CASE WHEN DATEPART(HH,date) < 7 OR DATEPART(HH,date) > 21
THEN '<WIND SPEED AT HOUR 21> ELSE <WIND SPEED> END AS ModifiedWindspeed
,WindSpeed, winddirection
from TerrainCorrectedHourlyWind w
This might make things clearer. If the hour is in the specified range, select windspeed. If not then select the wind speed from the prior day at 21 hours.
Though you've tagged the question mysql, I'm guessing this is actually SQL Server because of the DATEPART() function used. Try the following, which uses an OUTER APPLY to get your alternate value:
SELECT Date
, CASE
WHEN DATEPART(HOUR, Date)BETWEEN 7 AND 21 THEN w.WindSpeed
ELSE m.WindSpeed
END AS ModifiedWindSpeed
, w.WindSpeed
, w.WindDirection
FROM TerrainCorrectedHourlyWind AS w
OUTER APPLY(SELECT TOP 1 WindSpeed
FROM TerrainCorrectedHourlyWind
WHERE DATEPART(HOUR, Date)BETWEEN 7 AND 21
AND Date < w.Date
ORDER BY Date DESC)AS m;
Just to explain what this is doing--the OUTER APPLY will get the single most recent record (TOP 1 and ORDER BY Date DESC) for dates prior to the record in question (Date < w.Date) as well as within the hours specified. The CASE near the top chooses whether to use the current value or this alternate one based on the hour.
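The same lookup can be pictured as a single pass over the readings in time order, remembering the most recent value that fell inside the allowed hours. The Python sketch below is only an illustration of that logic, with hypothetical readings in which hour 21 is missing, so the 20:00 value is carried forward instead.

```python
from datetime import datetime

def modified_wind_speeds(readings, first_hour=7, last_hour=21):
    """readings: list of (timestamp, wind_speed).
    Returns a list of (timestamp, modified_speed, original_speed)."""
    out = []
    last_in_range = None  # most recent reading taken within the allowed hours
    for ts, speed in sorted(readings):
        if first_hour <= ts.hour <= last_hour:
            last_in_range = speed
            out.append((ts, speed, speed))
        else:
            # Outside the range: fall back to the previous in-range value
            # (None if there is none yet, like a NULL from OUTER APPLY).
            out.append((ts, last_in_range, speed))
    return out

readings = [
    (datetime(2024, 1, 1, 3, 0), 7.0),   # before any in-range reading
    (datetime(2024, 1, 1, 20, 0), 5.0),
    (datetime(2024, 1, 1, 22, 0), 9.0),  # hour 21 is missing
    (datetime(2024, 1, 2, 3, 0), 8.0),
    (datetime(2024, 1, 2, 7, 0), 4.0),
]
rows = modified_wind_speeds(readings)
for row in rows:
    print(row)
```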

Calculating working days including holidays between dates without a calendar table in oracle SQL

Okay, so I've done quite a lot of reading on the possibility of emulating the networkdays function of excel in sql, and have come to the conclusion that by far the easiest solution is to have a calendar table which will flag working days or non working days. However, due to circumstances out of my control, we don't have access to such a luxury and it's unlikely that we will any time in the near future.
Currently I have managed to bodge together what is undoubtedly a horribly inefficient query in SQL that does work; the catch is that it will only work for a single client record at a time.
SELECT O_ASSESSMENTS.ASM_ID,
O_ASSESSMENTS.ASM_START_DATE,
O_ASSESSMENTS.ASM_END_DATE,
sum(CASE
When TO_CHAR(O_ASSESSMENTS.ASM_START_DATE + rownum -1,'Day')
= 'Sunday ' THEN 0
When TO_CHAR(O_ASSESSMENTS.ASM_START_DATE + rownum -1,'Day')
= 'Saturday ' THEN 0
WHEN O_ASSESSMENTS.ASM_START_DATE + rownum - 1
IN ('03-01-2000','21-04-2000','24-04-2000','01-05-2000','29-05-2000','28-08-2000','25-12-2000','26-12-2000','01-01-2001','13-04-2001','16-04-2001','07-05-2001','28-05-2001','27-08-2001','25-12-2001','26-12-2001','01-01-2002','29-03-2002','01-04-2002','06-04-2002','03-06-2002','04-06-2002','26-08-2002','25-12-2002','26-12-2002','01-01-2003','18-04-2003','21-04-2003','05-05-2003','26-05-2003','25-08-2003','25-12-2003','26-12-2003','01-01-2004','09-04-2004','12-04-2004','03-05-2004','31-05-2004','30-08-2004','25-12-2004','26-12-2004','27-12-2004','28-12-2004','01-01-2005','03-01-2005','25-03-2005','28-03-2005','02-05-2005','30-05-2005','29-08-2005','27-12-2005','28-12-2005','02-01-2006','14-04-2006','17-04-2006','01-05-2006','29-05-2006','28-08-2006','25-12-2006','26-12-2006','02-01-2007','06-04-2007','09-04-2007','07-05-2007','28-05-2007','27-08-2007','25-12-2007','26-12-2007','01-01-2008','21-03-2008','24-03-2008','05-05-2008','26-05-2008','25-08-2008','25-12-2008','26-12-2008','01-01-2009','10-04-2009','13-04-2009','04-05-2009','25-05-2009','31-08-2009','25-12-2009','28-12-2009','01-01-2010','02-04-2010','05-04-2010','03-05-2010','31-05-2010','30-08-2010','24-12-2010','27-12-2010','28-12-2010','31-12-2010','03-01-2011','22-04-2011','25-04-2011','29-04-2011','02-05-2011','30-05-2011','29-08-2011','26-12-2011','27-12-2011')
THEN 0
ELSE 1
END)-1 AS Week_Day
From O_ASSESSMENTS,
ALL_OBJECTS
WHERE O_ASSESSMENTS.ASM_QSA_ID IN ('TYPE1')
AND O_ASSESSMENTS.ASM_END_DATE >= '01/01/2012'
AND O_ASSESSMENTS.ASM_ID = 'A00000'
AND ROWNUM <= O_ASSESSMENTS.ASM_END_DATE-O_ASSESSMENTS.ASM_START_DATE+1
GROUP BY
O_ASSESSMENTS.ASM_ID,
O_ASSESSMENTS.ASM_START_DATE,
O_ASSESSMENTS.ASM_END_DATE
Basically, I'm wondering if a) I should stop wasting my time on this or b) is it possible to get this to work for multiple clients? Any pointers appreciated thanks!
Edit: Further clarification - I already work out timescales using excel, but it would be ideal if we could do it in the report as the report in question is something that we would like end users to be able to run without any further manipulation.
Edit:
MarkBannister's answer works perfectly albeit slowly (though I had expected as much given it's not the preferred solution) - the challenge now lies in me integrating this into an existing report!
with
calendar_cte as (select
to_date('01-01-2000')+level-1 calendar_date,
case when to_char(to_date('01-01-2000')+level-1, 'day') in ('sunday ','saturday ') then 0 when to_date('01-01-2000')+level-1 in ('03-01-2000','21-04-2000','24-04-2000','01-05-2000','29-05-2000','28-08-2000','25-12-2000','26-12-2000','01-01-2001','13-04-2001','16-04-2001','07-05-2001','28-05-2001','27-08-2001','25-12-2001','26-12-2001','01-01-2002','29-03-2002','01-04-2002','06-04-2002','03-06-2002','04-06-2002','26-08-2002','25-12-2002','26-12-2002','01-01-2003','18-04-2003','21-04-2003','05-05-2003','26-05-2003','25-08-2003','25-12-2003','26-12-2003','01-01-2004','09-04-2004','12-04-2004','03-05-2004','31-05-2004','30-08-2004','25-12-2004','26-12-2004','27-12-2004','28-12-2004','01-01-2005','03-01-2005','25-03-2005','28-03-2005','02-05-2005','30-05-2005','29-08-2005','27-12-2005','28-12-2005','02-01-2006','14-04-2006','17-04-2006','01-05-2006','29-05-2006','28-08-2006','25-12-2006','26-12-2006','02-01-2007','06-04-2007','09-04-2007','07-05-2007','28-05-2007','27-08-2007','25-12-2007','26-12-2007','01-01-2008','21-03-2008','24-03-2008','05-05-2008','26-05-2008','25-08-2008','25-12-2008','26-12-2008','01-01-2009','10-04-2009','13-04-2009','04-05-2009','25-05-2009','31-08-2009','25-12-2009','28-12-2009','01-01-2010','02-04-2010','05-04-2010','03-05-2010','31-05-2010','30-08-2010','24-12-2010','27-12-2010','28-12-2010','31-12-2010','03-01-2011','22-04-2011','25-04-2011','29-04-2011','02-05-2011','30-05-2011','29-08-2011','26-12-2011','27-12-2011','01-01-2012','02-01-2012') then 0 else 1 end working_day
from dual
connect by level <= 1825 + sysdate - to_date('01-01-2000') )
SELECT
a.ASM_ID,
a.ASM_START_DATE,
a.ASM_END_DATE,
sum(c.working_day)-1 AS Week_Day
From
O_ASSESSMENTS a
join calendar_cte c
on c.calendar_date between a.ASM_START_DATE and a.ASM_END_DATE
WHERE a.ASM_QSA_ID IN ('TYPE1')
and a.ASM_END_DATE >= '01/01/2012'
GROUP BY
a.ASM_ID,
a.ASM_START_DATE,
a.ASM_END_DATE
There are a few ways to do this. Perhaps the simplest might be to create a CTE that produces a virtual calendar table, based on Oracle's connect by syntax, and then join it to the Assessments table, like so:
with calendar_cte as (
select to_date('01-01-2000')+level-1 calendar_date,
case when to_char(to_date('01-01-2000')+level-1, 'fmDay')
in ('Sunday','Saturday') then 0
when to_date('01-01-2000')+level-1
in ('03-01-2000','21-04-2000','24-04-2000','01-05-2000','29-05-2000','28-08-2000','25-12-2000','26-12-2000','01-01-2001','13-04-2001','16-04-2001','07-05-2001','28-05-2001','27-08-2001','25-12-2001','26-12-2001','01-01-2002','29-03-2002','01-04-2002','06-04-2002','03-06-2002','04-06-2002','26-08-2002','25-12-2002','26-12-2002','01-01-2003','18-04-2003','21-04-2003','05-05-2003','26-05-2003','25-08-2003','25-12-2003','26-12-2003','01-01-2004','09-04-2004','12-04-2004','03-05-2004','31-05-2004','30-08-2004','25-12-2004','26-12-2004','27-12-2004','28-12-2004','01-01-2005','03-01-2005','25-03-2005','28-03-2005','02-05-2005','30-05-2005','29-08-2005','27-12-2005','28-12-2005','02-01-2006','14-04-2006','17-04-2006','01-05-2006','29-05-2006','28-08-2006','25-12-2006','26-12-2006','02-01-2007','06-04-2007','09-04-2007','07-05-2007','28-05-2007','27-08-2007','25-12-2007','26-12-2007','01-01-2008','21-03-2008','24-03-2008','05-05-2008','26-05-2008','25-08-2008','25-12-2008','26-12-2008','01-01-2009','10-04-2009','13-04-2009','04-05-2009','25-05-2009','31-08-2009','25-12-2009','28-12-2009','01-01-2010','02-04-2010','05-04-2010','03-05-2010','31-05-2010','30-08-2010','24-12-2010','27-12-2010','28-12-2010','31-12-2010','03-01-2011','22-04-2011','25-04-2011','29-04-2011','02-05-2011','30-05-2011','29-08-2011','26-12-2011','27-12-2011')
then 0
else 1
end working_day
from dual
connect by level <= 36525 + sysdate - to_date('01-01-2000') )
SELECT a.ASM_ID,
a.ASM_START_DATE,
a.ASM_END_DATE,
sum(c.working_day) AS Week_Day
From O_ASSESSMENTS a
join calendar_cte c
on c.calendar_date between a.ASM_START_DATE and a.ASM_END_DATE
WHERE a.ASM_QSA_ID IN ('TYPE1') and
a.ASM_END_DATE >= '01/01/2012' -- and a.ASM_ID = 'A00000'
GROUP BY
a.ASM_ID,
a.ASM_START_DATE,
a.ASM_END_DATE
This will produce a virtual table populated with dates from 01 January 2000 to 36525 days (about 100 years) after the current date, with all weekends marked as non-working days and all days specified in the second in clause (i.e. up to 27 December 2011) also marked as non-working days.
The drawback of this method (or any method where the holiday dates are hardcoded into the query) is that each time new holiday dates are defined, every single query that uses this approach will have to have those dates added.
If you can't use a calendar table in Oracle, you might be better off exporting to Excel. Brute force always works.
Networkdays() "returns the number of whole working days between start_date and end_date. Working days exclude weekends and any dates identified in holidays."
Excluding weekends seems fairly straightforward. Every 7-day period will contain two weekend days. You'll just need to take some care with the leftover days.
Holidays are a different story. You have to either store them or pass them as an argument. If you could store them, you'd store them in a calendar table, and your problem would be over. But you can't do that.
So you're looking at passing them as an argument. Off the top of my head--and I haven't had any tea yet this morning--I'd consider a common table expression or a wrapper for a stored procedure.
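To illustrate the "pass the holidays as an argument" idea, here is a minimal Python sketch that counts working days inclusively between two dates. The function name working_days is hypothetical, and the holiday dates used are a small sample taken from the lists quoted in the queries above; this is a plain day-by-day loop, not an optimised per-week calculation.

```python
from datetime import date, timedelta

def working_days(start, end, holidays=frozenset()):
    """Count days from start to end (inclusive) that are neither weekend nor holiday."""
    count = 0
    day = start
    while day <= end:
        # weekday(): Monday is 0 ... Friday is 4, Saturday 5, Sunday 6
        if day.weekday() < 5 and day not in holidays:
            count += 1
        day += timedelta(days=1)
    return count

# Sample holidays from the hardcoded lists in the queries above.
holidays = {date(2011, 12, 26), date(2011, 12, 27), date(2012, 1, 2)}

plain = working_days(date(2011, 12, 19), date(2012, 1, 2))
with_holidays = working_days(date(2011, 12, 19), date(2012, 1, 2), holidays)
print(plain, with_holidays)
```

Keeping the holiday set as a parameter means the dates live in one place, which avoids the drawback of editing every query whenever new holidays are defined.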