I am trying to calculate worked hours for specific days in Google BigQuery (SQL).
The hourly wage is $10 for daytime work and $15 for night-time work.
Daytime is defined as 6am to 10pm, and night time as 10pm to 6am.
Employees can work flexible hours, as they are limousine drivers.
The following is an example of my table:
id      start_at  end_at    date
abc123  04:00:00  07:00:00  2020-01-05
abc123  09:00:00  15:32:00  2020-01-05
abc123  23:00:00  23:35:00  2020-01-05
abc123  23:40:00  23:59:00  2020-01-05
abc123  23:59:00  01:35:00  2020-01-05
abc123  02:02:00  04:35:00  2020-01-06
abc123  05:40:00  06:59:00  2020-01-06
So the actual work hours are calculated by taking the difference between start_at and end_at, but the daytime and night-time conditions are becoming a hassle in my query.
Note: the date column is based on start_at. Even when you start at 11:59pm and end at 12:05am the next day, the date follows the date of start_at instead of end_at.
Any ideas? Thanks in advance!
Consider below solution
create temp function night_day_split(start_at time, end_at time, date date) as (array(
select as struct
extract(date from time_point) day,
-- day time covers the hours 06:00 through 21:59; everything else is night
if(extract(hour from time_point) between 6 and 21, 'day', 'night') day_night,
count(1) minutes
-- one row per minute worked; the array is inclusive at both ends, so stop one minute before end_at
from unnest(generate_timestamp_array(
timestamp(datetime(date, start_at)),
timestamp_sub(timestamp(datetime(if(start_at < end_at, date, date + 1), end_at)), interval 1 minute),
interval 1 minute
)) time_point
group by 1, 2
));
select id, day,
sum(if(day_night = 'day', minutes, null)) day_minutes,
sum(if(day_night = 'night', minutes, null)) night_minutes
from yourtable,
unnest(night_day_split(start_at, end_at, date)) v
group by id, day
if applied to sample data in your question - output is
You can try the following code:
with mytable as (
select 'abc123' id, cast( '04:00:00' as time) start_dt, cast( '07:00:00' as time) end_dt, date('2020-01-05' ) date union all
select 'abc123', cast( '09:00:00' as time), cast( '15:32:00' as time), date('2020-01-05') union all
select 'abc123', cast( '23:00:00' as time), cast( '23:35:00' as time), date('2020-01-05' ) union all
select 'abc123', cast('23:40:00' as time), cast( '23:59:00' as time), date('2020-01-05') union all
select 'abc123', cast ('23:59:00' as time), cast( '01:35:00' as time), date('2020-01-05') union all
select 'abc123', cast('02:02:00' as time), cast( '04:35:00' as time), date('2020-01-06') union all
select 'abc123', cast('05:40:00' as time), cast( '06:59:00' as time), date('2020-01-06')
)
select id, date, sum (value) as sal from(
select id, date,
-- boundaries are inclusive so shifts starting or ending exactly at 06:00 or 22:00 are matched
case when start_dt >= cast( '06:00:00' as time) and end_dt <= cast( '22:00:00' as time) and start_dt < end_dt then (time_diff(end_dt, start_dt, Minute)/60) * 10
when start_dt < cast( '06:00:00' as time) and end_dt <= cast( '06:00:00' as time) then (time_diff(end_dt, start_dt, Minute)/60) * 15
when start_dt < cast( '06:00:00' as time) and end_dt <= cast( '22:00:00' as time) then (time_diff(cast( '06:00:00' as time), start_dt, Minute)/60) * 15 + (time_diff( end_dt,cast( '06:00:00' as time), Minute)/60) * 10
-- overnight shift: minutes up to midnight (23:59 plus one) and minutes after midnight
when start_dt >= cast( '22:00:00' as time) and end_dt <= cast( '06:00:00' as time) then ((time_diff(cast( '23:59:00' as time), start_dt, Minute) + 1)/60) * 15 + (time_diff( end_dt,cast( '00:00:00' as time), Minute)/60) * 15
when start_dt >= cast( '22:00:00' as time) and end_dt >= cast( '22:00:00' as time) then (time_diff(end_dt, start_dt, Minute)/60) * 15
-- shifts that cross 22:00 (for example 21:00 to 23:00) are not handled and fall through to 0
else 0
end as value
from mytable) group by id, date
Output:
You can further group by month to get a monthly salary.
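The day/night split in the answers above can also be expressed as plain interval arithmetic, which makes it easy to check individual rows by hand. The following is a minimal Python sketch, not a translation of either query; the function names split_shift and shift_pay are made up for illustration, and it assumes a shift never lasts longer than 24 hours.

```python
from datetime import date, datetime, time, timedelta

DAY_RATE = 10    # dollars per hour between 06:00 and 22:00 (from the question)
NIGHT_RATE = 15  # dollars per hour between 22:00 and 06:00 (from the question)


def split_shift(work_date, start_at, end_at):
    """Return (day_minutes, night_minutes) for one row of the table."""
    start = datetime.combine(work_date, start_at)
    end = datetime.combine(work_date, end_at)
    if end <= start:
        # The date column follows start_at, so an earlier end time means
        # the shift ran past midnight.
        end += timedelta(days=1)

    night = timedelta()
    # Night windows that can overlap the shift begin at 22:00 on the
    # previous day, the same day, and the next day; each lasts 8 hours.
    for offset in (-1, 0, 1):
        night_start = datetime.combine(work_date + timedelta(days=offset), time(22))
        night_end = night_start + timedelta(hours=8)
        overlap = min(end, night_end) - max(start, night_start)
        if overlap > timedelta():
            night += overlap

    day = (end - start) - night
    return day.total_seconds() / 60, night.total_seconds() / 60


def shift_pay(work_date, start_at, end_at):
    """Pay for one shift using the hourly rates above."""
    day_minutes, night_minutes = split_shift(work_date, start_at, end_at)
    return day_minutes / 60 * DAY_RATE + night_minutes / 60 * NIGHT_RATE
```

For the 04:00 to 07:00 row on 2020-01-05, split_shift returns 60 day minutes and 120 night minutes, so shift_pay gives $40.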
Related
How can you make a date range in BigQuery? A date range starts on the 29th of the month and ends on the 28th of the next month. It should look like this:
Date | Starting Date | Ending Date
03-13-2020 | 02-29-2020 | 03-28-2021
06-30-2020 | 06-29-2020 | 07-28-2021
01-01-2021 | 12-29-2020 | 01-28-2021
11-11-2021 | 10-28-2021 | 11-29-2021
I actually wrote an article on this.
Check it out:
https://www.theaccountingtactics.com/2021/12/BigQueryBQ-DateProblems-DateSituations-that-are-Hard-to-Analyze-and-Takes-Time-ToCrack%20.html?m=1
Consider below approach
create temp function set_day(date date, day int64) as (
ifnull(
safe.date(extract(year from date), extract(month from date), day),
last_day(date)
)
);
select Date,
set_day(Starting_Date, 29) as Starting_Date,
set_day(Ending_Date, 28) as Ending_Date
from (
select *, if(extract(day from Date) < 29,
struct(date_sub(Date, interval 1 month) as Starting_Date, Date as Ending_Date),
struct(Date as Starting_Date, date_add(Date, interval 1 month) as Ending_Date)
).*
from your_table
)
if applied to sample data as in your question
with your_table as (
select date '2020-03-13' Date union all
select '2021-03-13' union all
select '2020-06-30' union all
select '2021-01-01' union all
select '2021-11-11'
)
output is
You can test the whole thing using the query below
create temp function set_day(date date, day int64) as (
ifnull(
safe.date(extract(year from date), extract(month from date), day),
last_day(date)
)
);
with your_table as (
select date '2020-03-13' Date union all
select '2021-03-13' union all
select '2020-06-30' union all
select '2021-01-01' union all
select '2021-11-11'
)
select Date,
set_day(Starting_Date, 29) as Starting_Date,
set_day(Ending_Date, 28) as Ending_Date
from (
select *, if(extract(day from Date) < 29,
struct(date_sub(Date, interval 1 month) as Starting_Date, Date as Ending_Date),
struct(Date as Starting_Date, date_add(Date, interval 1 month) as Ending_Date)
).*
from your_table
)
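The idea behind set_day (use the requested day of the month, or fall back to the last day when the month is too short) can be checked outside BigQuery. Below is a minimal Python sketch of the same logic; the helper names shift_month and billing_period are hypothetical and not part of the answer above.

```python
import calendar
from datetime import date


def set_day(year, month, day):
    """Date with the given day of month, clamped to the month's last day."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day, last_day))


def shift_month(year, month, delta):
    """Move (year, month) forward or backward by delta months."""
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1


def billing_period(d):
    """Return (starting_date, ending_date) for the 29th-to-28th range containing d."""
    if d.day < 29:
        # The range started on the 29th of the previous month.
        start_year, start_month = shift_month(d.year, d.month, -1)
        end_year, end_month = d.year, d.month
    else:
        # The range starts this month and ends on the 28th of the next one.
        start_year, start_month = d.year, d.month
        end_year, end_month = shift_month(d.year, d.month, 1)
    return set_day(start_year, start_month, 29), set_day(end_year, end_month, 28)
```

For example, billing_period(date(2021, 3, 13)) starts on 2021-02-28, because February 2021 has no 29th.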
I need to split the following data in Oracle SQL:
WITH sample_data AS
(SELECT DATE '2020-12-16' Start_Date, DATE '2021-01-07' End_Date FROM DUAL)
into week ranges, one for every working week (Monday to Friday) of the given period. The final result should look like this:
NEW_STARTDATE NEW_END_DATE
2020-12-16 2020-12-18
2020-12-21 2020-12-25
2020-12-28 2021-01-01
2021-01-04 2021-01-07
So in this example, the first row starts with the initial start date (2020-12-16), which is a Wednesday, and ends with the new end date (2020-12-18), which is the following Friday. The remaining rows cover whole working weeks until the actual end date of the period.
You can use:
WITH sample_data ( start_date, end_date ) AS (
SELECT DATE '2020-12-16', DATE '2021-01-07' FROM DUAL
),
weeks ( start_date, week_start, week_end, end_date ) AS (
SELECT start_date,
TRUNC( start_date, 'IW' ),
LEAST( TRUNC( start_date, 'IW' ) + INTERVAL '4' DAY, end_date ),
end_date
FROM sample_data
UNION ALL
SELECT start_date,
week_start + INTERVAL '7' DAY,
LEAST( week_end + INTERVAL '7' DAY, end_date ),
end_date
FROM weeks
WHERE week_start + INTERVAL '7' DAY <= end_date
)
SELECT GREATEST( week_start, start_date ) AS new_start_date,
week_end AS new_end_date
FROM weeks
WHERE GREATEST( week_start, start_date ) <= week_end;
Which outputs (where the NLS_DATE_FORMAT is YYYY-MM-DD (DY)):
NEW_START_DATE    NEW_END_DATE
2020-12-16 (WED)  2020-12-18 (FRI)
2020-12-21 (MON)  2020-12-25 (FRI)
2020-12-28 (MON)  2021-01-01 (FRI)
2021-01-04 (MON)  2021-01-07 (THU)
Here is one way - compute the Monday dates for the ISO week of the input dates (start_date and end_date), while also keeping track of which is the first and which is the last such Monday in the same hierarchical (connect by) query. Then produce the requested output; for the first week, check that the start_date is not a Saturday or a Sunday (if it is, that "first week" should not produce a row in the output); that is done in the where clause. This is illustrated in the sample dates I used for testing - the input "start date" is a Saturday, so the first "new start date" is the following Monday.
with
sample_data (start_date, end_date) as (
select date '2020-12-12', date '2021-01-02' from dual
)
, mondays (dt, rn, mx) as (
select trunc(start_date, 'iw') + 7 * (level - 1), level, max(level) over ()
from sample_data
connect by level <= 1 + (trunc(end_date, 'iw') - trunc(start_date, 'iw'))/7
)
select case rn when 1 then greatest(start_date, dt)
else dt end as new_start_date,
case rn when mx then least(dt + 4, end_date)
else dt + 4 end as new_end_date
from sample_data cross join mondays
where rn >= 2 or start_date <= dt + 4
;
NEW_START_DATE NEW_END_DATE
--------------- ---------------
MON 14-DEC-2020 FRI 18-DEC-2020
MON 21-DEC-2020 FRI 25-DEC-2020
MON 28-DEC-2020 FRI 01-JAN-2021
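Both Oracle answers follow the same steps: find the Monday of each ISO week, clip the Monday-to-Friday range to the input period, and drop a week whose clipped range is empty. A minimal Python sketch of those steps, with a hypothetical function name, can be used to cross-check the expected rows:

```python
from datetime import date, timedelta


def split_working_weeks(start_date, end_date):
    """Split [start_date, end_date] into Monday-to-Friday ranges."""
    ranges = []
    # Monday of the week containing start_date (weekday() is 0 for Monday).
    monday = start_date - timedelta(days=start_date.weekday())
    while monday <= end_date:
        week_start = max(monday, start_date)
        week_end = min(monday + timedelta(days=4), end_date)
        # A period that starts on a weekend yields an empty first week; skip it.
        if week_start <= week_end:
            ranges.append((week_start, week_end))
        monday += timedelta(days=7)
    return ranges
```

With the first sample period (2020-12-16 to 2021-01-07) this returns four ranges, and with the second (2020-12-12 to 2021-01-02) it returns three, starting on Monday 2020-12-14.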
I have two tables.
The first table contains ticket records, each with a start date and an end date:
start_date | End_Date
21-02-2017 07:52:32 | 22-02-2017 09:56:32
21-02-2017 09:52:32 | 23-02-2017 17:52:32
the second table contains the details of the weekly shift:
shift_day | Start_Time | End_Time
MON | 9:00 | 18:00
TUE | 10:00 | 19:00
WED | 9:00 | 18:00
THU | 10:00 | 19:00
FRI | 9:00 | 18:00
I am looking to get the time difference for each row of the first table, counting only the time that falls within the shift hours in the second table.
Use a recursive sub-query factoring clause to generate each day within your time ranges and then correlate that with your shifts to restrict the time for each day to be within the shift hours and then aggregate to get the total:
Oracle 18 Setup:
CREATE TABLE times ( start_date, End_Date ) AS
SELECT DATE '2017-02-21' + INTERVAL '07:52:32' HOUR TO SECOND,
DATE '2017-02-22' + INTERVAL '09:56:32' HOUR TO SECOND
FROM DUAL
UNION ALL
SELECT DATE '2017-02-21' + INTERVAL '09:52:32' HOUR TO SECOND,
DATE '2017-02-23' + INTERVAL '17:52:32' HOUR TO SECOND
FROM DUAL;
CREATE TABLE weekly_shifts ( shift_day, Start_Time, End_Time ) AS
SELECT 'MON', INTERVAL '09:00' HOUR TO MINUTE, INTERVAL '18:00' HOUR TO MINUTE FROM DUAL UNION ALL
SELECT 'TUE', INTERVAL '10:00' HOUR TO MINUTE, INTERVAL '19:00' HOUR TO MINUTE FROM DUAL UNION ALL
SELECT 'WED', INTERVAL '09:00' HOUR TO MINUTE, INTERVAL '18:00' HOUR TO MINUTE FROM DUAL UNION ALL
SELECT 'THU', INTERVAL '10:00' HOUR TO MINUTE, INTERVAL '19:00' HOUR TO MINUTE FROM DUAL UNION ALL
SELECT 'FRI', INTERVAL '09:00' HOUR TO MINUTE, INTERVAL '18:00' HOUR TO MINUTE FROM DUAL;
Query 1:
WITH days ( id, start_date, day_start, day_end, end_date ) AS (
SELECT ROWNUM,
start_date,
start_date,
LEAST( TRUNC( start_date ) + INTERVAL '1' DAY, end_date ),
end_date
FROM times
UNION ALL
SELECT id,
start_date,
day_end,
LEAST( day_end + INTERVAL '1' DAY, end_date ),
end_date
FROM days
WHERE day_end < end_date
)
SELECT start_date,
end_date,
SUM( shift_end - shift_start ) AS days_worked_on_shift
FROM (
SELECT ID,
start_date,
end_date,
GREATEST( day_start, TRUNC( day_start ) + start_time ) AS shift_start,
LEAST( day_end, TRUNC( day_start ) + end_time ) AS shift_end
FROM days d
INNER JOIN
weekly_shifts w
ON ( TO_CHAR( d.day_start, 'DY' ) = w.shift_day )
)
GROUP BY id, start_date, end_date;
Result:
START_DATE END_DATE DAYS_WORKED_ON_SHIFT
------------------- ------------------- --------------------
2017-02-21 07:52:32 2017-02-22 09:56:32 0.414259259259259259
2017-02-21 09:52:32 2017-02-23 17:52:32 1.078148148148148148
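The per-day clipping that the query performs with GREATEST and LEAST can be sketched in a few lines of Python to sanity-check the result. This is an illustration under the assumption that the shift table is the one from the question; the names SHIFTS and time_on_shift are invented here.

```python
from datetime import datetime, time, timedelta

# Weekly shifts from the question, keyed by weekday (0 = Monday).
SHIFTS = {
    0: (time(9), time(18)),
    1: (time(10), time(19)),
    2: (time(9), time(18)),
    3: (time(10), time(19)),
    4: (time(9), time(18)),
}


def time_on_shift(start, end):
    """Total time between start and end that falls inside shift hours."""
    total = timedelta()
    day = start.date()
    while day <= end.date():
        hours = SHIFTS.get(day.weekday())
        if hours is not None:
            # Clip the ticket interval to this day's shift window.
            shift_start = max(start, datetime.combine(day, hours[0]))
            shift_end = min(end, datetime.combine(day, hours[1]))
            if shift_end > shift_start:
                total += shift_end - shift_start
        day += timedelta(days=1)
    return total
```

For the first ticket this gives 9:56:32, which is 35792 seconds or about 0.414259 days, matching the first row of the result above.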
Is there an easy way to count the number of 7am and 7pm datetimes that fall fully between two dates? By fully between, I mean don't count the 7am or 7pm datetimes that are equal to the start or end time.
I imagine this can be done using the unix timestamp in seconds and a bit of algebra, but I can't figure it out.
I'm happy to use something in PLSQL or plain SQL.
Examples:
start end num_7am_7pm_between_dates
2012-06-16 05:00 2012-06-16 08:00 1
2012-06-16 16:00 2012-06-16 20:00 1
2012-06-16 05:00 2012-06-16 07:00 0
2012-06-16 07:00 2012-06-16 19:00 0
2012-06-16 08:00 2012-06-16 15:00 0
2012-06-16 05:00 2012-06-16 19:01 2
2012-06-16 05:00 2012-06-18 20:00 6
I think this could be reduced further but I don't have Oracle at my disposal to completely test this Oracle SQL:
SELECT StartDate
, EndDate
, CASE WHEN TRUNC(EndDate) - TRUNC(StartDate) < 1
AND TO_CHAR(EndDate, 'HH24') > 19
AND TO_CHAR(StartDate, 'HH24') < 7
THEN 2
WHEN TRUNC(EndDate) - TRUNC(StartDate) < 1
AND (TO_CHAR(EndDate, 'HH24') > 19
OR TO_CHAR(StartDate, 'HH24') < 7)
THEN 1
WHEN TRUNC(EndDate) - TRUNC(StartDate) > 0
AND TO_CHAR(EndDate, 'HH24') > 19
AND TO_CHAR(StartDate, 'HH24') < 7
THEN 2 + ((TRUNC(EndDate) - TRUNC(StartDate)) * 2)
WHEN TRUNC(EndDate) - TRUNC(StartDate) > 0
AND TO_CHAR(EndDate, 'HH24') > 19
OR TO_CHAR(StartDate, 'HH24') < 7
THEN 1 + ((TRUNC(EndDate) - TRUNC(StartDate)) * 2)
ELSE 0
END
FROM MyTable;
Thanks to @A.B.Cade for the Fiddle, it looks like my CASE logic can be condensed further to:
SELECT SDate
, EDate
, CASE WHEN TO_CHAR(EDate, 'HH24') > 19
AND TO_CHAR(SDate, 'HH24') < 7
THEN 2 + ((TRUNC(EDate) - TRUNC(SDate)) * 2)
WHEN TO_CHAR(EDate, 'HH24') > 19
OR TO_CHAR(SDate, 'HH24') < 7
THEN 1 + ((TRUNC(EDate) - TRUNC(SDate)) * 2)
ELSE 0
END AS MyCalc2
FROM MyTable;
I had fun writing the following solution:
with date_range as (
select min(sdate) as sdate, max(edate) as edate
from t
),
all_dates as (
select sdate + (level-1)/24 as hour
from date_range
connect by level <= (edate-sdate) * 24 + 1
),
counts as (
select t.id, count(*) as c
from all_dates, t
where to_char(hour, 'HH') = '07'
and hour > t.sdate and hour < t.edate
group by t.id
)
select t.sdate, t.edate, nvl(counts.c, 0)
from t, counts
where t.id = counts.id(+)
order by t.id;
I added an id column to the table in case the range of dates aren't unique.
http://www.sqlfiddle.com/#!4/5fa19/13
This may not have the best performance but might work for you:
select sdate, edate, count(*)
from (select distinct edate, sdate, sdate + (level / 24) hr
from t
connect by sdate + (level / 24) <= edate )
where to_char(hr, 'hh') = '07'
group by sdate, edate
UPDATE: As to @FlorinGhita's comment - fixed the query to include zero occurrences
select sdate, edate, sum( decode(to_char(hr, 'hh'), '07',1,0))
from (select distinct edate, sdate, sdate + (level / 24) hr
from t
connect by sdate + (level / 24) <= edate )
group by sdate, edate
Do it like this (in SQL):
declare @table table ( start datetime, ends datetime)
insert into @table select'2012-06-16 05:00','2012-06-16 08:00' --1
insert into @table select'2012-06-16 16:00','2012-06-16 20:00' --1
insert into @table select'2012-06-16 05:00','2012-06-16 07:00' --0
insert into @table select'2012-06-16 07:00','2012-06-16 19:00' --0
insert into @table select'2012-06-16 08:00','2012-06-16 15:00' --0
insert into @table select'2012-06-16 05:00','2012-06-16 19:01' --2
insert into @table select'2012-06-16 05:00','2012-06-18 20:00' --6
insert into @table select'2012-06-16 07:00','2012-06-18 07:00' --3
Declare @From DATETIME
Declare @To DATETIME
select @From = MIN(start) from @table
select @To = max(ends) from @table
;with CTE AS
(
SELECT distinct
DATEADD(DD,DATEDIFF(D,0,start),0)+'07:00' AS AimTime
FROM @table
),CTE1 AS
(
Select AimTime
FROM CTE
UNION ALL
Select DATEADD(hour, 12, AimTime)
From CTE1
WHERE AimTime< @To
)
select start,ends, count(AimTime)
from CTE1 right join @table t
on t.start < CTE1.AimTime and t.ends > CTE1.AimTime
group by start,ends
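The question suggests that "a bit of algebra" on timestamps might be enough, without generating every hour. One way to do that, sketched here in Python as an assumption rather than as any answerer's method, is to measure both endpoints from an arbitrary 07:00 reference and count the multiples of 12 hours strictly between them. The names ANCHOR, PERIOD and count_7am_7pm are chosen for illustration.

```python
from datetime import datetime, timedelta

ANCHOR = datetime(2000, 1, 1, 7, 0)  # any 07:00 instant works as a reference
PERIOD = timedelta(hours=12)         # 7am and 7pm are 12 hours apart


def count_7am_7pm(start, end):
    """Count the 7am/7pm instants t with start < t < end."""
    # Smallest k such that ANCHOR + k * PERIOD is strictly after start.
    first = (start - ANCHOR) // PERIOD + 1
    # Largest k such that ANCHOR + k * PERIOD is strictly before end,
    # computed as ceil((end - ANCHOR) / PERIOD) - 1 with floor division.
    last = -((ANCHOR - end) // PERIOD) - 1
    return max(0, last - first + 1)
```

Dividing one timedelta by another with // yields a floored integer, so no conversion to unix seconds is needed. For 2012-06-16 05:00 to 2012-06-18 20:00 the function returns 6, as in the question's table.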
I've got a table in a Sybase DB with a column createdDateTime.
What I want to do is count how many rows were created within specific but accumulating time periods, i.e.:
7:00 - 7:15
7:00 - 7:30
7:00 - 7:45
7:00 - 8:00
...
and so on until I have the last time group, 7:00 - 18:00.
Is there a nice way to make one query in SQL that will return all the rows for me with all the row counts:
Time Rows Created
7:00 - 7:15 0
7:00 - 7:30 5
7:00 - 7:45 8
7:00 - 8:00 15
... ...
I have a solution at the moment, but it requires me running a parameterised query 44 times to get all the data.
Thanks,
I recently blogged about this exact topic. I'm not sure whether it works in Sybase, but here's the solution:
declare @interval int
set @interval = 5
select datepart(hh, DateTimeColumn)
, datepart(mi, DateTimeColumn)/@interval*@interval
, count(*)
from thetable
group by datepart(hh, DateTimeColumn)
, datepart(mi, DateTimeColumn)/@interval*@interval
and more details
http://ebersys.blogspot.com/2010/12/sql-group-datetime-by-arbitrary-time.html
Try this (mytable stands in for your table name):
select count(*)
from mytable
where createdDateTime between convert(datetime, '2011-01-01 07:00:00')
and convert(datetime, '2011-01-01 07:15:00')
Does Sybase have a CASE statement? If so try this:
SELECT SUM(CASE WHEN CreatedTime BETWEEN '7:00:00' AND '7:14:59' THEN 1 ELSE 0 END) as '7-7:15',
SUM(CASE WHEN CreatedTime BETWEEN '7:15:00' AND '7:29:59' THEN 1 ELSE 0 END) as '7:15-7:30'
FROM MyTable
Where <conditions>
I use this a LOT in SQL Server.
You could determine the quarter of the hour in which a row was created and group by that value. Please note that this is Oracle SQL, but Sybase probably has an equivalent.
select to_char(datetime_created, 'HH24') hour
, floor(to_char(datetime_created, 'MI')/15)+1 quarter
, count(1)
from my_table
group by to_char(datetime_created, 'HH24')
, floor(to_char(datetime_created, 'MI')/15)+1;
You have irregular periods (some are 15 min length, others are 1 hour length, others are a few hours length). In that case, the best you can do is running a query with case statements:
with thetable as
(
SELECT 'TM' code, convert(datetime, '2011-04-15 07:01:00 AM') date, 1 id union all
SELECT 'TM', convert(datetime, '2011-04-15 07:05:00 AM'), 2 union all
SELECT 'TM', convert(datetime, '2011-04-15 07:08:00 AM'), 3 union all
SELECT 'TM', convert(datetime, '2011-04-15 07:20:00 AM'), 4 union all
SELECT 'TM', convert(datetime, '2011-04-15 08:25:00 AM'), 5
)
SELECT '07:00 - 07:15' interval, sum(case when CONVERT(varchar, date, 108) between '07:00:00' AND '07:14:59' then 1 else 0 end) counting
FROM thetable
union
select '07:15 - 08:00', sum(case when CONVERT(varchar, date, 108) between '07:15:00' AND '07:59:59' then 1 else 0 end)
from thetable
union
select '08:00 - 09:00', sum(case when CONVERT(varchar, date, 108) between '08:00:00' AND '08:59:59' then 1 else 0 end)
from thetable
Now, if you did have regular intervals, you'd do something like this:
select counting,
dateadd(ms,500-((datepart(ms,interval)+500)%1000),interval) intini
from
(
SELECT COUNT(1) counting, CONVERT(datetime, round(floor(CONVERT(float, date) * 24 * 4) / (24 * 4), 11)) interval
FROM
(
SELECT 'TM' code, convert(datetime, '2011-04-15 07:01:00 AM') date, 1 id union all
SELECT 'TM', convert(datetime, '2011-04-15 07:05:00 AM'), 2 union all
SELECT 'TM', convert(datetime, '2011-04-15 07:08:00 AM'), 3 union all
SELECT 'TM', convert(datetime, '2011-04-15 07:20:00 AM'), 4 union all
SELECT 'TM', convert(datetime, '2011-04-15 08:25:00 AM'), 5
) thetable
group by FLOOR(CONVERT(float, date) * 24 * 4)
) thetable2
Notice that 24 * 4 is the number of 15-minute intervals in a day. If your interval is 1 hour, you should replace that with 24. If your interval is 10 minutes, it should be 24 * 6. I think you get the picture.
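The answers above count rows per bucket, while the question asks for accumulating windows (7:00 - 7:15, 7:00 - 7:30, and so on). One way to bridge the two is to bucket once and then take a running total. The Python sketch below illustrates that arithmetic; the function name and parameters are assumptions, and in SQL the same running total would need a window function or a self-join, depending on what the database supports.

```python
from datetime import datetime, timedelta
from itertools import accumulate


def cumulative_counts(created, window_start, window_end, step):
    """Return [(bucket_end, rows created in [window_start, bucket_end))]."""
    # Number of buckets, rounded up so a partial last bucket is kept.
    n_buckets = -((window_start - window_end) // step)
    per_bucket = [0] * n_buckets
    for ts in created:
        if window_start <= ts < window_end:
            # Integer division assigns each timestamp to its bucket index.
            per_bucket[(ts - window_start) // step] += 1
    running = accumulate(per_bucket)
    return [(window_start + (i + 1) * step, total) for i, total in enumerate(running)]
```

Called with the five sample timestamps from the last answer, a 07:00 to 09:00 window and a 15-minute step, the running totals are 3, 4, 4, 4, 4, 5, 5, 5.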