Determining Sick Periods from ranges - sql

I have the below set of data which represents employee sick/absence days over a period (12 months) of time, in a table named Absence:
Day Date DaysSick OccasionsSick Notes
Tuesday 2016-09-27 1 Lisa A working today
Thursday 2016-09-29 1 Lisa sick today Celeste
Thursday 2017-01-05 1 Lisa sick today
I would like to update the OccasionsSick column based on the instances of being sick, so that I end up with the following:
Day Date DaysSick OccasionsSick Notes
Tuesday 2016-09-27 1 1 Lisa A working today
Thursday 2016-09-29 1 Lisa sick today Celeste
Thursday 2017-01-05 1 1 Lisa sick today
The first two entries belong to the same period of sick leave, so I need a 1 in the first row only. The last entry is a separate sick period, so it gets a 1 as well.
Now, in order to establish a sick period there would be a reference to a roster table containing the below:
Date RosterType
2016-09-27 Sick
2016-09-28 Day Off
2016-09-29 Sick
2016-09-30 Normal
So the 27th and 29th were sick days, but the 28th was a standard day off. That is a likely occurrence, so checking for consecutive days is not an option. I need to keep looking for sick days until a "Normal" RosterType is found, which breaks the sick period. The 1 then needs to be assigned as shown in the desired result set.
What is the best way of updating the data here? Apart from the logic for determining a sick period, I have drawn a blank on this.
I am presenting this data in Excel with VBA, so it might also be easier to assign the sick periods in VBA rather than in SQL on the raw data.

Please check this out.
This assumes that there is an entry in the roster for each day.
Basically I'm just building a period and counting the days in the roster.
If there are normal days in between it counts as a new period.
WITH CTE AS (
SELECT
[day]
,[date]
,LAG(date, 1) over (order by date) datebefore
,[dayssick]
FROM [dbo].[absence]
)
SELECT
*
,CASE WHEN ((SELECT COUNT(1) FROM [dbo].[rostertype] WHERE date < c.date AND date > c.datebefore AND rostertype = 'Normal') > 0
OR c.datebefore IS NULL) THEN 1 ELSE 0 END OccasionsSick
FROM CTE c
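Since the question also mentions doing this outside SQL, the same rule can be sketched in a few lines of Python. This is only an illustration, assuming the absence dates and the roster are already loaded into memory; `flag_sick_periods` and its arguments are hypothetical names, not part of the original tables.

```python
from datetime import date

def flag_sick_periods(absence_dates, roster):
    """Return 1 for the first day of each sick period, else 0.

    absence_dates: dates on which the employee was sick.
    roster: dict mapping date -> roster type ('Sick', 'Day Off', 'Normal').
    A 'Normal' roster day strictly between two absences starts a new period.
    """
    flags = []
    previous = None
    for current in sorted(absence_dates):
        if previous is None:
            # The very first absence always opens a period.
            flags.append(1)
        else:
            normal_between = any(
                kind == 'Normal' and previous < day < current
                for day, kind in roster.items()
            )
            flags.append(1 if normal_between else 0)
        previous = current
    return flags

# Sample data taken from the question.
roster = {
    date(2016, 9, 27): 'Sick',
    date(2016, 9, 28): 'Day Off',
    date(2016, 9, 29): 'Sick',
    date(2016, 9, 30): 'Normal',
}
absences = [date(2016, 9, 27), date(2016, 9, 29), date(2017, 1, 5)]
flags = flag_sick_periods(absences, roster)
print(flags)
```

Like the query above, this relies on the roster having an entry for every day between two absences.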


Time series: date/time averaging and anomaly detection

I'm dealing with a dataset of 4 weeks of sales data (refreshed every hour) and need to watch it for anomalies.
I think I'll go with a basic approach, comparing against average numbers, and I'm trying to figure out how best to break this down so I can answer questions like the ones below:
On average, how many orders were received at 9:00, 15:00 or 16:00 over the past 30 days?
On average, how many orders were received at 9:00 every Wednesday (past 4 Wednesdays), or at 15:00 every Thursday (past 4 Thursdays)?
I'm not sure how to go about this (after breaking the date/time down into Hour and Weekday columns).
date | order ID | order hour | order weekday
10/07/2022 10:26:12 PM | 1111 | 22 | 6
10/07/2022 10:27:12 PM | 2222 | 22 | 6
.... | .... | .... | ....
19/07/2022 11:34:19 AM | 9998 | 11 | 1
19/07/2022 11:34:35 AM | 9999 | 11 | 1
I would love to get your advice please
Thanks
I've ended up going with a tedious approach.
import datetime

# Get the current hour and weekday
now = datetime.datetime.now()
today = datetime.date.today()
current_hour = now.hour
current_weekday = today.weekday()
# Keep only the orders from the same hour and weekday window
# (order_df is the existing orders DataFrame)
same_hour_weekday_df = order_df[(order_df.order_hour == current_hour) & (order_df.order_weekday == current_weekday)]
# Average number of orders per week within that hour and weekday window
orders_same_hour_weekday = same_hour_weekday_df['order_created_at'].count()
same_hour_weekday_periods = same_hour_weekday_df['order_week'].nunique()
avg_orders_same_hour_weekday = orders_same_hour_weekday / same_hour_weekday_periods
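For the more general questions (average orders at a given hour, optionally on a given weekday), one option is to count orders per calendar day first and then average those counts. The sketch below uses only the standard library. `average_orders` and the sample timestamps are made up for illustration, and days with no orders in the slot are simply left out rather than counted as zero.

```python
from collections import Counter
from datetime import datetime

def average_orders(timestamps, hour, weekday=None):
    """Average number of orders per day in a given hour slot.

    weekday follows datetime.weekday(): Monday is 0, Sunday is 6.
    Only days that have at least one order in the slot are averaged.
    """
    per_day = Counter(
        ts.date()
        for ts in timestamps
        if ts.hour == hour and (weekday is None or ts.weekday() == weekday)
    )
    if not per_day:
        return 0.0
    return sum(per_day.values()) / len(per_day)

# Hypothetical sample data.
orders = [
    datetime(2022, 7, 6, 9, 5),    # Wednesday
    datetime(2022, 7, 6, 9, 40),   # Wednesday
    datetime(2022, 7, 13, 9, 15),  # Wednesday
    datetime(2022, 7, 14, 9, 30),  # Thursday
]
avg_wed_9 = average_orders(orders, hour=9, weekday=2)
avg_any_9 = average_orders(orders, hour=9)
print(avg_wed_9, avg_any_9)
```

If quiet days should pull the average down, divide by the number of days in the observation window instead of `len(per_day)`.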

how to return a specific set of data from multiple columns in a database in sql

I am new to sql and this is my first ever question. I am working with a sample database that I want to extract specific information from to display as a dashboard. The issue is that I can do this partially but I cannot seem to figure it out properly.
SELECT
S_date as date,
p_time as time,
process_id as process,
sc_gun as scannumb,
sum(line_qty) as linetotal,
sum(area_qty) as areatotal
FROM dbfile6
WHERE
process_id in('0010','0020','0030')
and sc_gun in ('10','20','30','40','50')
and s_date = curdate() - 1 and p_time between '22:00:00' and '23:59:59'
or s_date = curdate() and p_time between '00:00:00' and '06:00:00'
GROUP BY p_time, s_date, process_id, sc_gun
ORDER BY s_date, process_id
What do I want to display?
I can partly do this: the query works against yesterday's date (s_date) on a recurring basis, but I want this to happen only Monday to Friday, skipping the weekend, so that on Monday it looks at Friday's data from the database.
I want to show the time as a range, specifically a night range of 20:00:00 - 06:00:00. The range is tricky because it crosses over into the next day. That could work for Monday to Thursday, but not for Friday, since there is no weekend working, so what should I do here? In addition, I can sum up the values successfully and display them as averages for each process, but once I add the time in, each result is displayed individually.
The table below shows what the data looks like in the database. As mentioned earlier, the desired result is for each process to have line_qty and area_qty summed up by time range, split into a day cycle and a night cycle.
s_date | p_time | process_id | sc_gun | line_qty | area_qty
04/05/2022 | 04:49:52 | 0010 | 10 | 2 | 12
03/05/2022 | 11:50:00 | 0010 | 10 | 5 | 14
03/05/2022 | 19:50:00 | 0010 | 10 | 7 | 16
03/05/2022 | 13:50:00 | 0020 | 20 | 4 | 6
03/05/2022 | 19:50:00 | 0010 | 10 | 7 | 16
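One way to reason about the date logic is to compute the night window in application code and pass its bounds to the query. The Python below is a minimal sketch, not a tested solution for this database. It assumes the night shift starts at 20:00 on the previous working day (Monday to Friday) and ends at 06:00 the following calendar day; `previous_working_day` and `night_window` are hypothetical helper names.

```python
from datetime import date, datetime, time, timedelta

def previous_working_day(today):
    """Step back one day at a time until a Monday-to-Friday date is found."""
    day = today - timedelta(days=1)
    while day.weekday() >= 5:  # 5 is Saturday, 6 is Sunday
        day -= timedelta(days=1)
    return day

def night_window(today, start=time(20, 0), end=time(6, 0)):
    """Return (start, end) datetimes of the night shift that began on
    the previous working day and ended the morning after it."""
    start_day = previous_working_day(today)
    window_start = datetime.combine(start_day, start)
    window_end = datetime.combine(start_day + timedelta(days=1), end)
    return window_start, window_end

monday = date(2022, 5, 9)
friday = previous_working_day(monday)
window = night_window(monday)
print(friday, window)
```

With the bounds known, the query can filter on a single combined date-and-time range and group by process only, which avoids the per-row output caused by grouping on p_time.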

January 1st = Week 1

The below gives me week numbers where week 1 starts on 1/4/2021
date_trunc('week', transaction_date) as week_number
How can I create a week_number where the week starts on January 1st and counts up 7 days for every week thereafter (for every year)?
And round up/down to 52 weeks at the end of the year?
Code attempted:
This doesn't give me the answer, but I'm thinking something like this might work...
ceil(extract(day from transaction_date)/7) as week_number
Expected Output:
transaction_date | week_number
1/1/2020 | 1
1/8/2020 | 2
... | ...
12/31/2020 | 52
1/1/2021 | 1
1/8/2021 | 2
... | ...
12/27/2021 | 52
12/28/2021 | 52
12/29/2021 | 52
12/30/2021 | 52
12/31/2021 | 52
1/1/2022 | 1
Thanks in advance!
A simple way is to use date arithmetic:
select 1 + (transaction_date - date_trunc('year', transaction_date)) / 7 as year_week
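To illustrate the arithmetic, the Python sketch below counts whole 7-day blocks since January 1. The cap at 52 is an assumption added here to match the expected output in the question; the SQL expression above does not apply it, so December 31 would otherwise fall into week 53. `week_from_jan1` is a hypothetical name.

```python
from datetime import date

def week_from_jan1(d, max_week=52):
    """Week number where January 1 starts week 1 and each week lasts 7 days.

    The last days of the year are folded into max_week.
    """
    days_since_jan1 = (d - date(d.year, 1, 1)).days
    return min(1 + days_since_jan1 // 7, max_week)

for d in (date(2020, 1, 1), date(2020, 1, 8), date(2020, 12, 31),
          date(2021, 12, 27), date(2022, 1, 1)):
    print(d, week_from_jan1(d))
```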
Regarding "The below gives me week numbers where week 1 starts on 1/4/2021":
This is the default behaviour, and it matches the ISO definition of week numbering.
WEEK_OF_YEAR_POLICY
Type Session — Can be set for Account » User » Session
Description
Specifies how the weeks in a given year are computed.
Values
0: The semantics used are equivalent to the ISO semantics, in which a week belongs to a given year if at least 4 days of that week are in that year.
1: January 1 is included in the first week of the year and December 31 is included in the last week of the year.
Default 0 (i.e. ISO-like behavior)
It can be overridden at multiple levels. The most granular is the session level:
ALTER SESSION SET WEEK_OF_YEAR_POLICY = 1;
Then you could use the standard code:
SELECT date_trunc('week', transaction_date) as week_number
FROM ...;

How to measure an average count from a set of days each with their own data points, in SQL/LookerML

I have the following table:
id | decided_at | reviewer
1 2020-08-10 13:00 john
2 2020-08-10 14:00 john
3 2020-08-10 16:00 john
4 2020-08-12 14:00 jane
5 2020-08-12 17:00 jane
6 2020-08-12 17:50 jane
7 2020-08-12 19:00 jane
For each day, I would like to get the difference between the min and max decided_at, together with the total count of ids from the min through to the max (inclusive). Currently, I'm only able to get this data for the past day.
Desired output:
Date | Time(h) | Count | reviewer
2020-08-10 3 3 john
2020-08-12 5 4 jane
From this, I would like to get the average of this data over the past x number of days.
Example:
If today was the 13th, filter on the past 2 days (48 hours)
Output:
reviewer | reviews/hour
jane 5/4 = 1.25
Example 2:
If today was the 13th, filter on the past 3 days (72 hours)
reviewer | reviews/hour
john 3/3 = 1
jane 5/4 = 1.25
Ideally, if this is possible in LookML without the use of a derived table, it would be nicest to have that. Otherwise, a solution in SQL would be great and I can try to convert to LookerML.
Thanks!
In SQL, one solution is to use two levels of aggregation:
select reviewer, sum(cnt) / sum(time_h) review_per_hour
from (
select
reviewer,
date(decided_at) decided_date,
count(*) cnt,
timestampdiff(hour, min(decided_at), max(decided_at)) time_h
from mytable
where decided_at >= current_date - interval 2 day
group by reviewer, date(decided_at)
) t
group by reviewer
The subquery filters on the date range, aggregates by reviewer and day, and computes the number of records and the difference between the minimum and the maximum date, as hours. Then, the outer query aggregates by reviewer and does the final computation.
The actual function to compute the date difference varies across databases; timestampdiff() is supported in MySQL - other engines all have alternatives.
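The two levels of aggregation can also be traced in plain Python against the sample rows from the question. This is only a sketch of the same idea: it leaves out the date filter, truncates the span to whole hours in the way `timestampdiff(hour, ...)` does, and skips reviewers whose total span is zero hours (an assumption made here to avoid dividing by zero). `reviews_per_hour` is a hypothetical name.

```python
from collections import defaultdict
from datetime import datetime

rows = [
    ("2020-08-10 13:00", "john"),
    ("2020-08-10 14:00", "john"),
    ("2020-08-10 16:00", "john"),
    ("2020-08-12 14:00", "jane"),
    ("2020-08-12 17:00", "jane"),
    ("2020-08-12 17:50", "jane"),
    ("2020-08-12 19:00", "jane"),
]

def reviews_per_hour(rows):
    # First level: collect the timestamps per (reviewer, day).
    per_day = defaultdict(list)
    for decided_at, reviewer in rows:
        ts = datetime.strptime(decided_at, "%Y-%m-%d %H:%M")
        per_day[(reviewer, ts.date())].append(ts)

    # Second level: sum the counts and whole-hour spans per reviewer.
    counts = defaultdict(int)
    hours = defaultdict(int)
    for (reviewer, _day), stamps in per_day.items():
        counts[reviewer] += len(stamps)
        span = max(stamps) - min(stamps)
        hours[reviewer] += int(span.total_seconds() // 3600)

    return {r: counts[r] / hours[r] for r in counts if hours[r] > 0}

rates = reviews_per_hour(rows)
print(rates)
```

Here the rate is the number of reviews divided by the number of hours, matching `sum(cnt) / sum(time_h)` in the query.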

Oracle query to count no of mondays......... sundays and total no of jobs on particular day in particular time?

I have a table
JobID Date_of_Completion Region day
1 23/05/2016 South monday
2 23/05/2016 north monday
3 23/05/2016 north monday
4 23/05/2016 east monday
5 22/05/2016 South sunday
6 22/05/2016 north sunday
7 22/05/2016 south sunday
8 22/05/2016 east sunday
.
.
.
..
23 2/05/2016 north monday
24 2/05/2016 east monday
25 2/05/2016 South monday
26 2/05/2016 north monday
27 2/05/2016 south monday
28 2/05/2016 east monday
desired output :
for last two months
Day Region countofjobsonparticularday no of days
sunday south 34 8 (no of sund forlast 2 months)
sunday north 24 8 (no of sund forlast 2 months)
monday south 74 9 (no of mon forlast 2 months)
tuesday east 64 8 (no of tue forlast 2 months)
How can I write a query for this? Please help.
It seems that you need something like this:
select Day, Region, count(1), count(distinct date_of_completion)
from your_table
where date_of_completion between add_months(sysdate, -2) and sysdate
group by Day, Region
This will count the number of jobs and the number of DISTINCT days on which jobs were completed.
You should refine this based on your needs (for example, how you want to treat hours, minutes, and so on).
If - as I suspect - you mean the last column, no of days, is supposed to show the total number of Mondays, Tuesdays, etc. over the last two months (regardless of whether there were any jobs on some of the days), first create a (sub)query as below and then join to Aleksej's result on the Day column. Speaking of Day, it is an Oracle keyword; it is always best to avoid using Oracle keywords as table or column names. I use day_name below.
Result of query (can be used as subquery):
DAY_NAME CT
--------------- ----------
monday 9
thursday 8
sunday 9
saturday 9
tuesday 8
friday 9
wednesday 8
I didn't order the results (not needed if used for a join) and I used lowercase day names as the OP did. That is controlled by the format model (the middle argument to to_char in the query below; if capitalized names, like Monday, are desired, change it from 'day' to 'Day').
Query:
with x (day_name) as (
select to_char(sysdate - level + 1, 'day', 'nls_date_language = American')
from dual
connect by level <= sysdate - add_months(sysdate, -2) - 1
)
select day_name, count(*) as ct
from x
group by day_name;
Note 'nls_date_language = American': it is better to make that explicit than to rely on default parameters. (Without this third argument, someone else running this with German or Chinese as the date language wouldn't get the expected result when joining with the other table.) Also, the definition of "last two months" is fuzzy; I used all days between today (included) and two months ago, that is between March 24 and May 23, 2016. These are controlled by the two expressions containing sysdate.
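The counting step itself can be checked quickly outside the database. The Python sketch below tallies weekday names over a run of days ending on a given date. `weekday_counts` is a hypothetical helper, and the day names come from a fixed English list rather than the locale, for the same reason the date language is made explicit in the query.

```python
from collections import Counter
from datetime import date, timedelta

# Fixed English names, indexed by date.weekday() (Monday is 0).
DAY_NAMES = ["monday", "tuesday", "wednesday", "thursday",
             "friday", "saturday", "sunday"]

def weekday_counts(end_day, num_days):
    """Count each weekday among the num_days days ending on end_day (inclusive)."""
    days = (end_day - timedelta(days=offset) for offset in range(num_days))
    return Counter(DAY_NAMES[day.weekday()] for day in days)

# 60 days ending on Monday, May 23, 2016.
counts = weekday_counts(date(2016, 5, 23), 60)
print(dict(counts))
```

Joining such a per-weekday count to the per-day, per-region job counts gives the "no of days" column from the desired output.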
Thanks @mathguy and @Aleksej.
I tried this query and it worked:
select to_char(dayofcompletion,'DY') as day_name, count(1),count(distinct(trunc(dayofcompletion))) as noofdays
from tablename
where trunc(dayofcompletion)>= trunc(sysdate-60)and trunc(dayofcompletion)<=trunc(sysdate-1)
group by to_char(dayofcompletion,'DY')