Redshift SQL: Get date difference based on start and end dates


My table has start_date and end_date columns, and I need to find the difference between them in hours. The issue is that the two datetimes do not fall on the same day.
user start_date end_date difference
Alex 7/25/2016 16:00 7/26/2016 0:30 8.5
Alex 7/24/2016 16:00 7/25/2016 0:30 8.5
Alex 7/21/2016 16:00 7/22/2016 0:30 8.5
Alex 7/20/2016 16:00 7/21/2016 0:30 8.5
Alex 7/19/2016 16:00 7/20/2016 0:30 8.5
Alex 7/18/2016 16:00 7/19/2016 0:30 8.5
Alex 7/17/2016 16:00 7/18/2016 0:30 8.5
Alex 7/14/2016 16:00 7/15/2016 0:30 8.5
Alex 7/13/2016 16:00 7/14/2016 0:30 8.5
Alex 7/12/2016 16:00 7/13/2016 0:30 8.5
Alex 7/11/2016 16:00 7/12/2016 0:30 8.5
Alex 7/10/2016 16:00 7/11/2016 0:30 8.5
Usually there are 5 working days, and I get the answer if I group them by start_date. But I need a new date column with the output shown below. Please note that 7/15/2016 and 7/22/2016 do not appear as start dates in the table above. I need the additional 0.5 hour, and its date, for the 6th day to be included in my derived table.
User Date difference
Alex 7/25/2016 8.5
Alex 7/24/2016 8.5
Alex 7/22/2016 0.5
Alex 7/21/2016 8.0
Alex 7/20/2016 8.5
Alex 7/19/2016 8.5
Alex 7/18/2016 8.5
Alex 7/17/2016 8.5
Alex 7/15/2016 0.5
Alex 7/14/2016 8.0
Alex 7/13/2016 8.5
Alex 7/12/2016 8.5
Alex 7/11/2016 8.5
Alex 7/10/2016 8.5
I calculate the difference as
round(cast(datediff(seconds, start_date, end_date) as decimal)/3600,2)

Whenever the logic gets complicated, I'd suggest using union queries and giving each part of the logic its own select query (or even its own table). Then you can calculate this in two steps. The main question is whether the 0.5 hours between 00:00:00 and 00:30:00 should be counted towards the previous workday or stand alone. Which one applies seems to depend on whether the day of end_date is itself a workday. I see two cases, which produce three kinds of output row:
Next day is a workday:
Report all hours on start_date
Next day is not a workday:
Report hours from start_date to midnight on start_date
Report hours from midnight to end_date on end_date
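As a language-neutral illustration of the cases above, here is a minimal Python sketch (it is not part of the SQL solution). The function name split_shift and the end_day_is_workday flag are hypothetical; how to decide whether a day is a workday is left to the caller.

```python
from datetime import datetime, timedelta


def split_shift(start, end, end_day_is_workday):
    """Return a list of (date, hours) rows for one shift.

    Assumption: a shift crosses at most one midnight.
    """
    # Midnight that closes the start day.
    midnight = datetime.combine(start.date() + timedelta(days=1), datetime.min.time())
    if end <= midnight or end_day_is_workday:
        # Report all hours on start_date.
        return [(start.date(), round((end - start).total_seconds() / 3600, 2))]
    # Split at midnight: one row for start_date, one for end_date.
    return [
        (start.date(), round((midnight - start).total_seconds() / 3600, 2)),
        (end.date(), round((end - midnight).total_seconds() / 3600, 2)),
    ]


# 7/21 16:00 to 7/22 0:30, where 7/22 has no shift of its own.
for row in split_shift(datetime(2016, 7, 21, 16, 0), datetime(2016, 7, 22, 0, 30), False):
    print(row)
```

With end_day_is_workday set to True, the same shift comes back as a single 8.5 hour row on 7/21.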
I used the following example data based on your description:
create temporary table _test (user_name varchar(20), start_date timestamp, end_date timestamp);
insert into _test values ('Alex', '7/25/2016 16:00', '7/26/2016 0:30'), ('Alex', '7/24/2016 16:00', '7/25/2016 0:30'), ('Alex', '7/21/2016 16:00', '7/22/2016 0:30'), ('Alex', '7/20/2016 16:00', '7/21/2016 0:30'), ('Alex', '7/19/2016 16:00', '7/20/2016 0:30'), ('Alex', '7/18/2016 16:00', '7/19/2016 0:30'), ('Alex', '7/17/2016 16:00', '7/18/2016 0:30'), ('Alex', '7/14/2016 16:00', '7/15/2016 0:30'), ('Alex', '7/13/2016 16:00', '7/14/2016 0:30'), ('Alex', '7/12/2016 16:00', '7/13/2016 0:30'), ('Alex', '7/11/2016 16:00', '7/12/2016 0:30'), ('Alex', '7/10/2016 16:00', '7/11/2016 0:30');
We will need to know whether the next day is a workday, so I suggest using the lead() window function (see documentation) which will give you the start_date from the next row.
create temporary table _differences as (
select
user_name
, start_date::date as start_date
, end_date::date as end_date
/**
* Calculate difference in hours between start_date and end_date: */
, round(cast(datediff(seconds, start_date, end_date) as decimal)/3600,2) as hours_start_to_end
/**
* Calculate difference in hours between start_date and midnight: */
, round(cast(datediff(seconds, start_date, dateadd(day, 1, start_date::date)) as decimal)/3600,2) as hours_start_to_midnight
/**
* Calculate difference between midnight on end_date and end_date: */
, round(cast(datediff(seconds, end_date::date, end_date) as decimal)/3600,2) as hours_midnight_to_end
/**
* Calculate number of days from end_date until next start_date: */
, datediff(day, end_date::date, lead(start_date::date) over(partition by user_name order by start_date::date)) as days_until_next_workday
from
_test
);
Then the following query:
select
user_name as user_name
, start_date as ref_date
, hours_start_to_end as difference
from
_differences
where
days_until_next_workday = 0 -- report all work hours on start_date
union
select
user_name as user_name
, start_date as ref_date
, hours_start_to_midnight as difference
from
_differences
where
days_until_next_workday > 0 -- report partial work hours on start_date
union
select
user_name as user_name
, end_date as ref_date
, hours_midnight_to_end as difference
from
_differences
where
days_until_next_workday > 0 -- report partial work hours on end_date
order by
user_name
, ref_date desc
;
Would yield the following result:
user_name | ref_date | difference
-----------+------------+------------
Alex | 2016-07-24 | 8.50
Alex | 2016-07-22 | 0.50
Alex | 2016-07-21 | 8.00
Alex | 2016-07-20 | 8.50
Alex | 2016-07-19 | 8.50
Alex | 2016-07-18 | 8.50
Alex | 2016-07-17 | 8.50
Alex | 2016-07-15 | 0.50
Alex | 2016-07-14 | 8.00
Alex | 2016-07-13 | 8.50
Alex | 2016-07-12 | 8.50
Alex | 2016-07-11 | 8.50
Alex | 2016-07-10 | 8.50
(13 rows)
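You can see that 7/25/2016 is missing. It is the last row, so there is no start_date on or after 7/26/2016 for lead() to return; days_until_next_workday is therefore NULL and matches none of the three WHERE clauses. You'll need to decide how to account for that special case.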

Here is how I did the calculation, and it works for my data:
-- Hours from start_time up to the end of the start day (or to end_time if the same day).
-- Note: 23:59:59 is one second short of midnight; rounding to 2 decimals hides this.
select user_name, trunc(start_time) as date1,
sum(round(cast(datediff(seconds, start_time, st_t1) as decimal)/3600,2)) as schedule
from
(
select user_name, start_time,
case when trunc(start_time) <> trunc(end_time) then cast(to_char(start_time,'yyyy-mm-dd 23:59:59') as timestamp) else cast(to_char(end_time,'yyyy-mm-dd hh24:mi:ss') as timestamp) end as st_t1
from table1 a
where id = 1
) s
group by user_name, trunc(start_time)
union
-- Hours from midnight on the end day up to end_time.
select user_name, trunc(end_time) as date1,
sum(round(cast(datediff(seconds, st_t2, end_time) as decimal)/3600,2)) as schedule
from
(
select user_name, end_time,
case when trunc(start_time) <> trunc(end_time) then cast(to_char(end_time,'yyyy-mm-dd 00:00:00') as timestamp) else cast(to_char(end_time,'yyyy-mm-dd hh24:mi:ss') as timestamp) end as st_t2
from table1 a
where id = 1
) e
group by user_name, trunc(end_time)

Related

postgres query to group the records by hourly interval with date field

I have a table that has some file input data with file_id and file_input_date. I want to filter / group these file_ids depending on file_input_date. The problem is my date is in format of YYYY-MM-DD HH:mm:ss and I want to go further to group them by hour and not just the date.
Edit: some sample data
file_id | file_input_date
597872 | 2023-01-12 16:06:22.92879
497872 | 2023-01-11 16:06:22.92879
397872 | 2023-01-11 16:06:22.92879
297872 | 2023-01-11 17:06:22.92879
297872 | 2023-01-11 17:06:22.92879
297872 | 2023-01-11 17:06:22.92879
297872 | 2023-01-11 18:06:22.92879
what I want to see is
1 for 2023-01-12 16:06
2 for 2023-01-11 16:06
3 for 2023-01-11 17:06
1 for 2023-01-11 18:06
the output format will be different but this kind of gives what I want.

You could convert the dates to strings with the format you want and group by it:
SELECT TO_CHAR(file_input_date, 'YYYY-MM-DD HH24:MI'), COUNT(*)
FROM mytable
GROUP BY TO_CHAR(file_input_date, 'YYYY-MM-DD HH24:MI')

To group by hour rather than minute:
create table date_grp (file_id integer, file_input_date timestamp);
INSERT INTO date_grp VALUES
(597872, '2023-01-12 16:06:22.92879'),
(497872, '2023-01-11 16:06:22.92879'),
(397872, '2023-01-11 16:06:22.92879'),
(297872, '2023-01-11 17:06:22.92879'),
(297872, '2023-01-11 17:06:22.92879'),
(297872, '2023-01-11 17:06:22.92879'),
(297872, '2023-01-11 18:06:22.92879');
SELECT
date_trunc('hour', file_input_date),
count(date_trunc('hour', file_input_date))
FROM
date_grp
GROUP BY
date_trunc('hour', file_input_date);
date_trunc | count
---------------------+-------
01/11/2023 18:00:00 | 1
01/11/2023 17:00:00 | 3
01/12/2023 16:00:00 | 1
01/11/2023 16:00:00 | 2
(4 rows)
If you want to group by minute instead:
SELECT
date_trunc('minute', file_input_date),
count(date_trunc('minute', file_input_date))
FROM
date_grp
GROUP BY
date_trunc('minute', file_input_date);
date_trunc | count
---------------------+-------
01/11/2023 18:06:00 | 1
01/11/2023 16:06:00 | 2
01/12/2023 16:06:00 | 1
01/11/2023 17:06:00 | 3
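The same idea, truncating each timestamp to the start of its hour and counting per bucket, can be sketched outside the database. This is a minimal Python illustration only; the function name count_by_hour is hypothetical, and the sample values are the ones from the question.

```python
from collections import Counter
from datetime import datetime


def count_by_hour(timestamps):
    """Count timestamps per hour bucket, like GROUP BY date_trunc('hour', ...)."""
    return Counter(ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps)


sample = [
    datetime(2023, 1, 12, 16, 6, 22, 928790),
    datetime(2023, 1, 11, 16, 6, 22, 928790),
    datetime(2023, 1, 11, 16, 6, 22, 928790),
    datetime(2023, 1, 11, 17, 6, 22, 928790),
    datetime(2023, 1, 11, 17, 6, 22, 928790),
    datetime(2023, 1, 11, 17, 6, 22, 928790),
    datetime(2023, 1, 11, 18, 6, 22, 928790),
]

for bucket, count in sorted(count_by_hour(sample).items()):
    print(bucket, count)
```

Replacing only second and microsecond in the call to replace() would give per-minute buckets instead.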

Shift start date where shift spans 2 days

I am attempting to find all employees that worked during a 24 hour period. Their shift starts at 08:00 and ends the next day at 08:00. Most start at 08:00 and end the next day at 08:00. However, the start and end times may occur at any time during the 24 hour period. My goal would be to have all of those that worked during the 24 hour period to be marked as starting on the day the shift started.
For example, in the table below even though there are multiple start times over 2 days they all occurred within the 24 hours of the shift start time beginning on 7/26/22 at 08:00. I would like a new column showing a shift day of 7/26/22.
first | start_time | end_time | hours
Nicole | 7/26/22 8:00 | 7/27/22 8:00 | 24
Callan | 7/26/22 8:00 | 7/27/22 8:00 | 24
Bob | 7/26/22 18:30 | 7/27/22 6:30 | 12
Kevin | 7/27/22 7:00 | 7/27/22 8:00 | 1
Michael | 7/27/22 7:00 | 7/27/22 8:00 | 1

If the start time is greater than or equal to 8 am you want the shift date to be the same date as the start time, otherwise it is the previous day:
create table #shifts
(
[first] varchar(32), -- this is a very weird column name. first_name, perhaps?
start_time datetime,
end_time datetime,
[hours] tinyint
);
set dateformat mdy; -- your sample data uses an ambiguous date format so I have to do this
insert #shifts values
('Nicole ', '7/26/22 08:00', '7/27/22 8:00', 24),
('Callan ', '7/26/22 08:00', '7/27/22 8:00', 24),
('Bob ', '7/26/22 18:30', '7/27/22 6:30', 12),
('Kevin ', '7/27/22 07:00', '7/27/22 8:00', 1),
('Michael', '7/27/22 07:00', '7/27/22 8:00', 1);
select [first],
start_time,
end_time,
[hours],
shift_date = iif
(
cast(start_time as time) >= '8:00',
cast(start_time as date),
cast(dateadd(day, -1, start_time) as date)
)
from #shifts;
Produces:
first | start_time | end_time | hours | shift_date
Nicole | 2022-07-26 08:00:00.000 | 2022-07-27 08:00:00.000 | 24 | 2022-07-26
Callan | 2022-07-26 08:00:00.000 | 2022-07-27 08:00:00.000 | 24 | 2022-07-26
Bob | 2022-07-26 18:30:00.000 | 2022-07-27 06:30:00.000 | 12 | 2022-07-26
Kevin | 2022-07-27 07:00:00.000 | 2022-07-27 08:00:00.000 | 1 | 2022-07-26
Michael | 2022-07-27 07:00:00.000 | 2022-07-27 08:00:00.000 | 1 | 2022-07-26
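The rule itself (a start at or after 08:00 belongs to that day's shift, an earlier start belongs to the previous day's shift) can be checked with a small Python sketch. The names SHIFT_START and shift_date are hypothetical and only mirror the iif expression in the query.

```python
from datetime import datetime, time, timedelta

SHIFT_START = time(8, 0)  # assumed shift boundary, taken from the question


def shift_date(start_time):
    """Return the calendar date of the shift that start_time belongs to."""
    if start_time.time() >= SHIFT_START:
        return start_time.date()
    return (start_time - timedelta(days=1)).date()


for name, start in [
    ("Nicole", datetime(2022, 7, 26, 8, 0)),
    ("Bob", datetime(2022, 7, 26, 18, 30)),
    ("Kevin", datetime(2022, 7, 27, 7, 0)),
]:
    print(name, shift_date(start))
```

An equivalent formulation is to subtract eight hours from start_time and take the date of the result.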

T-SQL: Get the list of hours between two datetime columns

RAW DATA
table name: guest
guest_id | guest_name | arrival | departure
1 | John | 2022-01-31 13:00:15 | 2022-01-31 17:00:12
2 | Mary | 2022-02-01 12:09:03 | 2022-02-01 14:00:03
EXPECTED RESULTS
guest_id | guest_name | time
1 | John | 2022-01-31 13:00:00
1 | John | 2022-01-31 14:00:00
1 | John | 2022-01-31 15:00:00
1 | John | 2022-01-31 16:00:00
1 | John | 2022-01-31 17:00:00
2 | Mary | 2022-02-01 12:00:00
2 | Mary | 2022-02-01 13:00:00
2 | Mary | 2022-02-01 14:00:00
This is my base query
select guest_id, guest_name, arrival, departure from guest
where guest_name in ('John', 'Mary')
Recursive CTE is fine but not preferred.

select
t1.begin_date,
t1.end_date,
DATEDIFF(hour, begin_date, end_date) as diff_dates
from (
select
convert(datetime, '2022-01-31 13:00:15') as begin_date,
convert(datetime, '2022-01-31 17:00:12') as end_date
) t1
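The query above returns only the number of hour boundaries crossed (DATEDIFF), not the list of hours the question asks for. As a sketch of the missing step, here is the expansion logic in Python: truncate the arrival to the hour, then step forward one hour at a time until the departure is passed. The function name hours_between is hypothetical; in T-SQL the same expansion is usually done by joining to a numbers table.

```python
from datetime import datetime, timedelta


def hours_between(arrival, departure):
    """List every whole hour from the arrival's hour up to the departure."""
    current = arrival.replace(minute=0, second=0, microsecond=0)
    hours = []
    while current <= departure:
        hours.append(current)
        current += timedelta(hours=1)
    return hours


for hour in hours_between(datetime(2022, 1, 31, 13, 0, 15), datetime(2022, 1, 31, 17, 0, 12)):
    print(hour)
```

For John's row this yields the five hours from 13:00 to 17:00, and for Mary's row the three hours from 12:00 to 14:00, matching the expected results in the question.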

Grouping rows based on consecutive time values

I'm trying to create a separate group with the rows that have consecutive time values within the same day.
For example, my current dataset is as follows:
Date StartTime EndTime StudentID Type Class Work Group *
2020-01-30 09:00:00 11:00:00 20789 A 178 56 1
2020-01-30 11:00:00 13:00:00 20789 A 789 67 1
2021-01-08 09:00:00 10:00:00 78945 D 195 13 2
2021-01-08 10:00:00 12:00:00 78945 D 789 12 2
2021-01-08 13:00:00 14:00:00 78945 D 398 13 3
2021-01-08 14:00:00 16:00:00 78945 D 543 13 3
If the rows have same Student ID and Type and the Start/End Time are consecutive within the same day,
I'd like to assign the same, unique Group ID number like the "Group" column in the data set. I tried to create the group column using Lag/Lead and Partition by but my current code is not working.
Could anyone please help me with this?
Thank you.

You may use window functions:
Table:
SELECT *
INTO Data
FROM (VALUES
(CONVERT(date, '2020-01-30'), CONVERT(time, '09:00:00'), CONVERT(time, '11:00:00'), 20789, 'A', 178, 56),
(CONVERT(date, '2020-01-30'), CONVERT(time, '11:00:00'), CONVERT(time, '13:00:00'), 20789, 'A', 789, 67),
(CONVERT(date, '2021-01-08'), CONVERT(time, '09:00:00'), CONVERT(time, '10:00:00'), 78945, 'D', 195, 13),
(CONVERT(date, '2021-01-08'), CONVERT(time, '10:00:00'), CONVERT(time, '12:00:00'), 78945, 'D', 789, 12),
(CONVERT(date, '2021-01-08'), CONVERT(time, '13:00:00'), CONVERT(time, '14:00:00'), 78945, 'D', 398, 13),
(CONVERT(date, '2021-01-08'), CONVERT(time, '14:00:00'), CONVERT(time, '16:00:00'), 78945, 'D', 543, 13)
) v (Date, StartTime, EndTime, StudentID, Type, Class, Work)
Statement:
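SELECT
Date, StartTime, EndTime, StudentID, Type, Class, Work,
SUM(Change) OVER (ORDER BY Date, StudentID, StartTime) AS GroupID
FROM (
SELECT
Date, StartTime, EndTime, StudentID, Type, Class, Work,
-- A row starts a new group unless it continues the previous row:
-- same date, student and type, and StartTime equal to the previous EndTime.
CASE
WHEN
Date = LAG(Date) OVER (ORDER BY Date, StudentID, StartTime) AND
StudentID = LAG(StudentID) OVER (ORDER BY Date, StudentID, StartTime) AND
Type = LAG(Type) OVER (ORDER BY Date, StudentID, StartTime) AND
StartTime = LAG(EndTime) OVER (ORDER BY Date, StudentID, StartTime) THEN 0
ELSE 1
END AS Change
FROM Data
) t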
Result:
Date StartTime EndTime StudentID Type Class Work GroupID
2020-01-30 09:00:00.0000000 11:00:00.0000000 20789 A 178 56 1
2020-01-30 11:00:00.0000000 13:00:00.0000000 20789 A 789 67 1
2021-01-08 09:00:00.0000000 10:00:00.0000000 78945 D 195 13 2
2021-01-08 10:00:00.0000000 12:00:00.0000000 78945 D 789 12 2
2021-01-08 13:00:00.0000000 14:00:00.0000000 78945 D 398 13 3
2021-01-08 14:00:00.0000000 16:00:00.0000000 78945 D 543 13 3
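The technique is a running sum over "change" flags: a flag is set whenever a row does not continue the previous one, and the group number is the count of flags seen so far. A minimal Python sketch of that logic follows. The function name assign_group_ids is hypothetical, and it assumes a row continues the previous one only when Date, StudentID and Type match and StartTime equals the previous EndTime.

```python
def assign_group_ids(rows):
    """Number runs of back-to-back rows; returns new dicts with a GroupID key."""
    ordered = sorted(rows, key=lambda r: (r["Date"], r["StudentID"], r["StartTime"]))
    group_id = 0
    previous = None
    labelled = []
    for row in ordered:
        continues_previous = (
            previous is not None
            and row["Date"] == previous["Date"]
            and row["StudentID"] == previous["StudentID"]
            and row["Type"] == previous["Type"]
            and row["StartTime"] == previous["EndTime"]
        )
        if not continues_previous:
            group_id += 1  # running sum of the change flags
        labelled.append({**row, "GroupID": group_id})
        previous = row
    return labelled


sample_rows = [
    {"Date": "2020-01-30", "StartTime": "09:00:00", "EndTime": "11:00:00", "StudentID": 20789, "Type": "A"},
    {"Date": "2020-01-30", "StartTime": "11:00:00", "EndTime": "13:00:00", "StudentID": 20789, "Type": "A"},
    {"Date": "2021-01-08", "StartTime": "09:00:00", "EndTime": "10:00:00", "StudentID": 78945, "Type": "D"},
    {"Date": "2021-01-08", "StartTime": "10:00:00", "EndTime": "12:00:00", "StudentID": 78945, "Type": "D"},
    {"Date": "2021-01-08", "StartTime": "13:00:00", "EndTime": "14:00:00", "StudentID": 78945, "Type": "D"},
    {"Date": "2021-01-08", "StartTime": "14:00:00", "EndTime": "16:00:00", "StudentID": 78945, "Type": "D"},
]

for row in assign_group_ids(sample_rows):
    print(row["Date"], row["StartTime"], row["EndTime"], row["GroupID"])
```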

Add new column to temp table based on values from a column

I would like to create a temp table from the below table.
------------------------|--------
Date | Length
------------------------|--------
2014-08-28 00:00:00.000 | 1.5
2014-08-28 00:00:00.000 | 2.6
2014-08-28 00:00:00.000 | 1.5
2014-08-28 00:00:00.000 | 3.3
2014-08-28 00:00:00.000 | 1.1
2014-08-28 00:00:00.000 | 8.5
2014-08-28 00:00:00.000 | 8.6
2014-08-28 00:00:00.000 | 11.3
And have the temp table look like the one below.
Date | Length | Length_Range
------------------------|---------|--------------
2014-08-28 00:00:00.000 | 1.5 | 1-4
2014-08-28 00:00:00.000 | 2.6 | 1-4
2014-08-28 00:00:00.000 | 6.5 | 5-10
2014-08-28 00:00:00.000 | 3.3 | 1-4
2014-08-28 00:00:00.000 | 1.1 | 1-4
2014-08-28 00:00:00.000 | 8.5 | 5-10
2014-08-28 00:00:00.000 | 8.6 | 5-10
2014-08-28 00:00:00.000 | 11.3 | 11-15
I would like to be able to define the [Length_Range].
Microsoft SQL Server 2016.
Compatibility level: SQL Server 2005 (90)

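Use a CASE expression. The conditions are checked in order, so each branch only needs its upper bound:
select t.*,
(case when length < 1 then '<1'
when length < 5 then '1-4'
when length < 11 then '5-10'
when length < 16 then '11-15'
else '16+'
end) as length_range
into #temp_t
from t;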
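With decimal lengths, labels such as '1-4' and '5-10' need a rule for values like 4.5 that fall between the printed bounds. One option is half-open bins, where each bin includes its lower edge and excludes its upper edge. The sketch below shows that rule in Python; the edge values 1, 5, 11 and 16 and the labels '<1' and '16+' are assumptions chosen for illustration, not something the question specifies.

```python
import bisect

# Assumed bin edges: [1, 5) -> '1-4', [5, 11) -> '5-10', [11, 16) -> '11-15'.
EDGES = [1, 5, 11, 16]
LABELS = ["<1", "1-4", "5-10", "11-15", "16+"]


def length_range(length):
    """Map a length to its label using half-open bins."""
    return LABELS[bisect.bisect_right(EDGES, length)]


for value in (1.5, 2.6, 4.5, 6.5, 8.6, 11.3):
    print(value, length_range(value))
```

Because every value falls into exactly one bin, no row ends up without a label.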

CREATE TABLE #TABLE1
([DATE] DATETIME, [LENGTH] FLOAT)
INSERT INTO #TABLE1
([DATE], [LENGTH])
VALUES
('2014-08-28 00:00:00', 1.5),
('2014-08-28 00:00:00', 2.6),
('2014-08-28 00:00:00', 1.5),
('2014-08-28 00:00:00', 3.3),
('2014-08-28 00:00:00', 1.1),
('2014-08-28 00:00:00', 8.5),
('2014-08-28 00:00:00', 8.6),
('2014-08-28 00:00:00', 1.3)
SELECT *,CASE
WHEN LENGTH BETWEEN 1 AND 4 THEN '1-4'
WHEN LENGTH BETWEEN 5 AND 10 THEN '5-10'
WHEN LENGTH BETWEEN 11 AND 15 THEN '11-15' END AS LENGTH_RANGE
FROM #TABLE1
Output:
Date Length LENGTH_RANGE
2014-08-28 00:00:00.000 1.5 1-4
2014-08-28 00:00:00.000 2.6 1-4
2014-08-28 00:00:00.000 1.5 1-4
2014-08-28 00:00:00.000 3.3 1-4
2014-08-28 00:00:00.000 1.1 1-4
2014-08-28 00:00:00.000 8.5 5-10
2014-08-28 00:00:00.000 8.6 5-10
2014-08-28 00:00:00.000 1.3 1-4

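You can use the CREATE TABLE ... AS SELECT ... syntax where the database supports it. Note that SQL Server itself does not; there, use SELECT ... INTO instead.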
CREATE TABLE temp_table AS
SELECT [date], [length],
CASE
WHEN length BETWEEN 1 AND 4 THEN '1-4'
WHEN length BETWEEN 5 AND 10 THEN '5-10'
WHEN length BETWEEN 11 AND 15 THEN '11-15'
END AS LENGTH_RANGE
FROM orig_table
Sources:
Tech on the Net - SQL: CREATE TABLE AS Statement
MSDN - CREATE TABLE AS SELECT
Oracle - Create Table
...

I understand your question, but an answer is already available at the following link.
Click here
Refer to that link for more details.

-- This may help
;WITH cte (
[Date]
,[Length]
)
AS (
SELECT cast('2014-08-28 00:00:00.000' AS DATETIME)
,CAST('1.5' AS DECIMAL(4, 2))
UNION ALL
SELECT '2014-08-28 00:00:00.000'
,'2.6'
UNION ALL
SELECT '2014-08-28 00:00:00.000'
,'1.5'
UNION ALL
SELECT '2014-08-28 00:00:00.000'
,'3.3'
UNION ALL
SELECT '2014-08-28 00:00:00.000'
,'1.1'
UNION ALL
SELECT '2014-08-28 00:00:00.000'
,'8.5'
UNION ALL
SELECT '2014-08-28 00:00:00.000'
,'8.6'
UNION ALL
SELECT '2014-08-28 00:00:00.000'
,'11.3'
)
SELECT *
,CASE
WHEN Length < 1
THEN '< 1'
WHEN Length BETWEEN 1
AND 4
THEN '1-4'
WHEN Length BETWEEN 5
AND 10
THEN '5-10'
WHEN Length BETWEEN 11
AND 15
THEN '11-15'
WHEN Length > 15
THEN '> 15'
END AS Length_Range
FROM cte
