SQL query hourly for each day - sql

I have a question that seems quite complex. I need to know which session is running on a given day at a given time.
In short, I have a table that lists all sessions of a given area. Each session has a start date, a start time and an end time.
You can see them in this table:
idArea | idSession | startDate | startTime | endTime
1 | 1 | 2013-01-01 | 1900-01-01 09:00:00 | 1900-01-01 12:00:00
1 | 2 | 2013-01-01 | 1900-01-01 14:00:00 | 1900-01-01 15:00:00
1 | 3 | 2013-01-04 | 1900-01-01 09:00:00 | 1900-01-01 13:00:00
1 | 4 | 2013-01-07 | 1900-01-01 10:00:00 | 1900-01-01 12:00:00
1 | 5 | 2013-01-07 | 1900-01-01 13:00:00 | 1900-01-01 18:00:00
1 | 6 | 2013-01-08 | 1900-01-01 10:00:00 | 1900-01-01 12:00:00
I also have a table that lists every half hour of the day. (I created this table specifically for this requirement; if someone has a better idea, I am happy to adapt.)
idHour | Hour
1 | 1900-01-01 00:00:00
2 | 1900-01-01 00:30:00
3 | 1900-01-01 01:00:00
............................
4 | 1900-01-01 09:00:00
5 | 1900-01-01 09:30:00
6 | 1900-01-01 10:00:00
7 | 1900-01-01 10:30:00
............................
In the end, this is what I want to present:
startDate | startTime | SessionID
2013-01-01 | 1900-01-01 09:00:00 | 1
2013-01-01 | 1900-01-01 09:30:00 | 1
2013-01-01 | 1900-01-01 10:00:00 | 1
2013-01-01 | 1900-01-01 10:30:00 | 1
2013-01-01 | 1900-01-01 11:00:00 | 1
2013-01-01 | 1900-01-01 11:30:00 | 1
2013-01-01 | 1900-01-01 12:00:00 | 1
2013-01-01 | 1900-01-01 14:00:00 | 2
2013-01-01 | 1900-01-01 14:30:00 | 2
2013-01-01 | 1900-01-01 15:00:00 | 2
This table covers only 2013-01-01; what I want is the same for all sessions. For a day without sessions, the query can return NULL.
The hard part of this query or procedure is that it has to show all the days of the month in which there are sessions for that area.
For that, I already use this query:
;WITH t1 AS
(
SELECT
startDate,
DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', startDate), '1900-01-01') firstInMonth,
DATEADD(DAY, -1, DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', startDate) + 1, '1900-01-01')) lastInMonth,
COUNT(*) cnt
FROM
#SessionsPerArea
WHERE
idArea = 1
GROUP BY
startDate
), calendar AS
(
SELECT DISTINCT
DATEADD(DAY, c.number, t1.firstInMonth) d
FROM
t1
JOIN
master..spt_values c ON type = 'P'
AND DATEADD(DAY, c.number, t1.firstInMonth) BETWEEN t1.firstInMonth AND t1.lastInMonth
)
SELECT
d date,
cnt Session
FROM
calendar c
LEFT JOIN
t1 ON t1.startDate = c.d
It is quite complex; if anyone has an easier way to do this, that would be excellent.

If I understand correctly, this is simply a join between the calendar table and #SessionsPerArea, with the right conditions:
select spa.StartDate, c.hour as StartTime, spa.idSession as SessionId
from calendar c join
#SessionsPerArea spa
on c.hour between spa.startTime and spa.EndTime
The join is matching all times between the start and end times in the data, and then returning the values.
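The effect of that join condition can be checked outside the database. The following is a minimal Python sketch, not the original poster's code: it rebuilds the two sessions of 2013-01-01 and the half-hour slots in memory (the names `sessions`, `slots` and `expand` are made up for illustration) and applies the same inclusive BETWEEN test.

```python
from datetime import date, datetime, timedelta

# Times are stored on the dummy date 1900-01-01, as in the question.
BASE = datetime(1900, 1, 1)

# Sample sessions: (idSession, startDate, startTime, endTime).
sessions = [
    (1, date(2013, 1, 1), BASE.replace(hour=9), BASE.replace(hour=12)),
    (2, date(2013, 1, 1), BASE.replace(hour=14), BASE.replace(hour=15)),
]

# Half-hour slots for one day, mirroring the hours table (48 rows).
slots = [BASE + timedelta(minutes=30 * n) for n in range(48)]


def expand(sessions, slots):
    """Return (startDate, slot, idSession) for every slot inside a session."""
    rows = []
    for id_session, start_date, start_time, end_time in sessions:
        for slot in slots:
            # Same test as "c.hour BETWEEN spa.startTime AND spa.endTime":
            # both ends are inclusive.
            if start_time <= slot <= end_time:
                rows.append((start_date, slot, id_session))
    return rows


rows = expand(sessions, slots)
for start_date, slot, id_session in rows:
    print(start_date, slot.time(), id_session)
```

Because BETWEEN is inclusive, the slot that coincides with the end time (12:00 for session 1) is part of the result. If the end slot should be excluded, the condition would need `<` on the upper bound instead.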

I think maybe you simply need an outer join between calendar and #SessionsPerArea, so that all the days in the calendar table are returned regardless of whether they match a row in #SessionsPerArea.

Related

SQL: Aggregate Timeseries Table (HourOfDay, Val) to Average Value of HourOfDay by Weekday (e.g. Avg of Mondays 10:00-11:00, 11:00-12:00,...,Tue...)

So far I have an SQL query that gives me a table with the number of customers handled in each hour of the day, for an arbitrary start and end datetime (from the Grafana interface). The result may span many weeks. My goal is to implement an hourly heatmap by weekday with averaged values.
How do I aggregate those customers per hour to show the average value of each hour per weekday?
Say I have 24 values per day over 19 days. How do I aggregate them so that I get 24 values for each of Mon, Tue, Wed, Thu, Fri, Sat, Sun, with each hour representing the average value over those days?
Also, only data from full weeks should be used, so leading and trailing days that are not part of a fully represented week must be stripped (so that every weekday contributes the same number of days to its average).
Here is a segment of what my SQL query returns so far (hour of each day, number of customers):
...
2021-12-13 11:00:00 | 0
2021-12-13 12:00:00 | 3
2021-12-13 13:00:00 | 4
2021-12-13 14:00:00 | 4
2021-12-13 15:00:00 | 7
2021-12-13 16:00:00 | 17
2021-12-13 17:00:00 | 12
2021-12-13 18:00:00 | 18
2021-12-13 19:00:00 | 15
2021-12-13 20:00:00 | 8
2021-12-13 21:00:00 | 10
2021-12-13 22:00:00 | 1
2021-12-13 23:00:00 | 0
2021-12-14 00:00:00 | 0
2021-12-14 01:00:00 | 0
2021-12-14 02:00:00 | 0
2021-12-14 03:00:00 | 0
2021-12-14 04:00:00 | 0
2021-12-14 05:00:00 | 0
2021-12-14 06:00:00 | 0
2021-12-14 07:00:00 | 0
2021-12-14 08:00:00 | 0
2021-12-14 09:00:00 | 0
2021-12-14 10:00:00 | 12
2021-12-14 11:00:00 | 12
2021-12-14 12:00:00 | 19
2021-12-14 13:00:00 | 11
2021-12-14 14:00:00 | 11
2021-12-14 15:00:00 | 12
2021-12-14 16:00:00 | 9
2021-12-14 17:00:00 | 2
...
So (schematically, example data) startDate 2021-12-10 11:00 to endDate 2021-12-31 17:00
-------------------------------
...
Mon 2021-12-13 12:00 | 3
Mon 2021-12-13 13:00 | 4
Mon 2021-12-13 14:00 | 4
...
Mon 2021-12-20 12:00 | 1
Mon 2021-12-20 13:00 | 6
Mon 2021-12-20 14:00 | 2
...
Mon 2021-12-27 12:00 | 2
Mon 2021-12-27 13:00 | 2
Mon 2021-12-27 14:00 | 3
...
-------------------------------
into this:
strip leading fri 10., sat 11., sun 12.
strip trailing tue 28., wed 29., thu 30., fri 31.
average hours per weekday
-------------------------------
...
Mon 12:00 | 2
Mon 13:00 | 4
Mon 14:00 | 3
...
Tue 12:00 | x
Tue 13:00 | y
Tue 14:00 | z
...
-------------------------------
My approach so far:
WITH CustomersPerHour as (
SELECT dateadd(hour, datediff(hour, 0, Systemdatum),0) as DayHour, Count(*) as C
FROM CustomerList
WHERE CustomerID > 0
AND Datum BETWEEN '2021-12-10T11:00:00Z' AND '2021-12-31T17:00:00Z'
AND EntryID IN (62,65)
AND CustomerID IN (SELECT * FROM udf_getActiveUsers())
GROUP BY dateadd(hour, datediff(hour, 0, Systemdatum), 0)
)
-- add null values on missing data/insert missing hours
SELECT DATEDIFF(second, '1970-01-01', dt.Date) AS time, C as Customers
FROM dbo.udf_generateHoursTable('2021-12-03T18:14:56Z', '2022-03-13T18:14:56Z') as dt
LEFT JOIN CustomersPerHour cPh ON dt.Date = cPh.DayHour
ORDER BY
time ASC
The simplest solution is to do just what you have written in your example: create a custom base (grouping key) for the aggregation.
The first step is to prepare your data in an aggregated table with date-and-hour precision and the customer count.
Then create the base.
This is an example of the basic idea:
-- EXAMPLE
SELECT
DATENAME(WEEKDAY, GETDATE()) + ' ' + CAST(DATEPART(HOUR, GETDATE()) AS varchar(8)) + ':00'
-- OUTPUT: Sunday 21:00
You can concatenate the parts into one value and then use it in the GROUP BY clause.
Adjust this query for your use case:
SELECT
DATENAME(WEEKDAY, <DATETIME_COL>) + ' ' + CAST(DATEPART(HOUR, <DATETIME_COL>) AS varchar(8)) + ':00' as base
,SUM(...) as sum_of_whatever
,AVG(...) as avg_of_whatever
FROM <YOUR_AGG_TABLE>
GROUP BY DATENAME(WEEKDAY, <DATETIME_COL>) + ' ' + CAST(DATEPART(HOUR, <DATETIME_COL>) AS varchar(8)) + ':00'
This creates the base exactly as you wanted.
You can use the same logic to create other aggregation bases.
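To see what grouping by such a base produces, here is a small Python sketch of the same idea. It is an illustration under assumptions, not a translation of the query: the sample values are modeled on the three Mondays in the question's schematic, the function name `average_by_weekday_hour` is made up, and stripping incomplete weeks is not covered.

```python
from collections import defaultdict
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def average_by_weekday_hour(rows):
    """Average the counts per 'Weekday H:00' key, like GROUP BY base with AVG."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum, number of rows]
    for ts, customers in rows:
        # Same shape as DATENAME(WEEKDAY, ...) + ' ' + hour + ':00'.
        key = f"{DAY_NAMES[ts.weekday()]} {ts.hour}:00"
        totals[key][0] += customers
        totals[key][1] += 1
    return {key: total / n for key, (total, n) in totals.items()}


# Hourly customer counts for three consecutive Mondays (sample data).
rows = [
    (datetime(2021, 12, 13, 12), 3), (datetime(2021, 12, 13, 13), 4),
    (datetime(2021, 12, 13, 14), 4),
    (datetime(2021, 12, 20, 12), 1), (datetime(2021, 12, 20, 13), 6),
    (datetime(2021, 12, 20, 14), 2),
    (datetime(2021, 12, 27, 12), 2), (datetime(2021, 12, 27, 13), 2),
    (datetime(2021, 12, 27, 14), 3),
]

averages = average_by_weekday_hour(rows)
for key, value in averages.items():
    print(key, value)
```

One difference worth keeping in mind: in T-SQL, AVG over an integer column returns an integer, so the column may need a cast to a decimal type if fractional averages matter.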

SQL to show overlapping time periods

How can I check in PostgreSQL 9.2 (with an SQL command) whether any period in the timestamp records overlaps another period of the same id_user? I need to correct an existing table.
For example, the query should return rows 1, 3 and 4.
id | id_user | timedate0 | timedate2
---------------------------------------------------
1 | 1 | 2020-04-20 12:00:00 | 2020-04-20 14:00:00
2 | 1 | 2020-04-20 17:00:00 | 2020-04-20 19:30:00
3 | 1 | 2020-04-20 14:30:00 | 2020-04-20 15:40:00
4 | 1 | 2020-04-20 13:00:00 | 2020-04-20 15:00:00
5 | 1 | 2020-04-21 13:00:00 | 2020-04-21 14:00:00
6 | 1 | 2020-04-21 14:00:00 | 2020-04-21 15:00:00
You can use exists:
select t.*
from t
where exists (select 1
from t t2
where t2.timedate0 < t.timedate2 and
t2.timedate2 > t.timedate0 and
t2.id_user = t.id_user and t2.id <> t.id
);
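The overlap condition can be verified against the sample rows with a short Python sketch. This is only a check of the predicate, assuming the six rows from the question; the names `rows` and `overlapping_ids` are made up for illustration.

```python
from datetime import datetime

# (id, id_user, timedate0, timedate2) as in the question's table.
rows = [
    (1, 1, datetime(2020, 4, 20, 12, 0), datetime(2020, 4, 20, 14, 0)),
    (2, 1, datetime(2020, 4, 20, 17, 0), datetime(2020, 4, 20, 19, 30)),
    (3, 1, datetime(2020, 4, 20, 14, 30), datetime(2020, 4, 20, 15, 40)),
    (4, 1, datetime(2020, 4, 20, 13, 0), datetime(2020, 4, 20, 15, 0)),
    (5, 1, datetime(2020, 4, 21, 13, 0), datetime(2020, 4, 21, 14, 0)),
    (6, 1, datetime(2020, 4, 21, 14, 0), datetime(2020, 4, 21, 15, 0)),
]


def overlapping_ids(rows):
    """Return ids of rows that overlap another row of the same user."""
    result = []
    for row_id, user, start, end in rows:
        for other_id, other_user, other_start, other_end in rows:
            # Same predicate as the EXISTS subquery: strict comparisons,
            # so periods that merely touch do not count as overlapping.
            if (other_user == user and other_id != row_id
                    and other_start < end and other_end > start):
                result.append(row_id)
                break
    return result


print(overlapping_ids(rows))
```

Rows 5 and 6 share the boundary 14:00 but are not reported, because both comparisons are strict. Changing them to `<=` and `>=` would treat touching periods as overlaps.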

Oracle SQL List Intervals

I need to create new interval rows based on a start datetime column and an end datetime column.
My statement looks like this currently
select id,
startdatetime,
enddatetime
from calls
result looks like this
id startdatetime enddatetime
1 01/01/2020 00:00:00 01/01/2020 04:00:00
I would like a result like this
id startdatetime enddatetime Intervals
1 01/01/2020 00:00:00 01/01/2020 03:00:00 01/01/2020 00:00:00
1 01/01/2020 00:00:00 01/01/2020 03:00:00 01/01/2020 01:00:00
1 01/01/2020 00:00:00 01/01/2020 03:00:00 01/01/2020 02:00:00
1 01/01/2020 00:00:00 01/01/2020 03:00:00 01/01/2020 03:00:00
Thanking you in advance
p.s. I'm new to SQL
You can use a recursive sub-query factoring clause to loop and incrementally add an hour:
WITH times ( id, startdatetime, enddatetime, intervals ) AS (
SELECT id,
startdatetime,
enddatetime,
startdatetime
FROM calls c
UNION ALL
SELECT id,
startdatetime,
enddatetime,
intervals + INTERVAL '1' HOUR
FROM times
WHERE intervals + INTERVAL '1' HOUR <= enddatetime
)
SELECT *
FROM times;
outputs:
ID | STARTDATETIME | ENDDATETIME | INTERVALS
-: | :------------------ | :------------------ | :------------------
1 | 2020-01-01 00:00:00 | 2020-01-01 04:00:00 | 2020-01-01 00:00:00
1 | 2020-01-01 00:00:00 | 2020-01-01 04:00:00 | 2020-01-01 01:00:00
1 | 2020-01-01 00:00:00 | 2020-01-01 04:00:00 | 2020-01-01 02:00:00
1 | 2020-01-01 00:00:00 | 2020-01-01 04:00:00 | 2020-01-01 03:00:00
1 | 2020-01-01 00:00:00 | 2020-01-01 04:00:00 | 2020-01-01 04:00:00
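The control flow of the recursive query can be mirrored in a few lines of Python. This is a sketch for illustration only (the function name `hourly_intervals` is made up): the first `yield` plays the role of the anchor member, and the loop condition is the WHERE clause of the recursive member.

```python
from datetime import datetime, timedelta

STEP = timedelta(hours=1)


def hourly_intervals(start, end):
    """Yield start, then keep adding one hour while the result is <= end."""
    current = start
    yield current  # anchor member: the start row is always emitted
    while current + STEP <= end:  # recursive member with its WHERE filter
        current += STEP
        yield current


intervals = list(hourly_intervals(datetime(2020, 1, 1, 0), datetime(2020, 1, 1, 4)))
for value in intervals:
    print(value)
```

With the `<=` comparison the end time itself is included, which gives five rows for a four-hour call. Using `<` instead would stop at 03:00.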
You can use a hierarchical query as follows:
WITH CALLS (ID, STARTDATETIME, ENDDATETIME)
AS ( SELECT 1,
TO_DATE('01/01/2020 00:00:00', 'dd/mm/rrrr hh24:mi:ss'),
TO_DATE('01/01/2020 04:00:00', 'dd/mm/rrrr hh24:mi:ss')
FROM DUAL)
-- Your query starts from here
SELECT
ID,
STARTDATETIME,
ENDDATETIME,
STARTDATETIME + ( COLUMN_VALUE / 24 ) AS INTERVALS
FROM
CALLS C
CROSS JOIN TABLE ( CAST(MULTISET(
SELECT LEVEL - 1
FROM DUAL
CONNECT BY LEVEL <= TRUNC(24 *(ENDDATETIME - STARTDATETIME))
) AS SYS.ODCINUMBERLIST) )
ORDER BY INTERVALS;
ID STARTDATETIME ENDDATETIME INTERVALS
---------- ------------------- ------------------- -------------------
1 01/01/2020 00:00:00 01/01/2020 04:00:00 01/01/2020 00:00:00
1 01/01/2020 00:00:00 01/01/2020 04:00:00 01/01/2020 01:00:00
1 01/01/2020 00:00:00 01/01/2020 04:00:00 01/01/2020 02:00:00
1 01/01/2020 00:00:00 01/01/2020 04:00:00 01/01/2020 03:00:00
Cheers!!

How to generate series for date range with minutes interval in oracle?

In Postgres, the query below works using the generate_series function:
SELECT dates
FROM generate_series(CAST('2019-03-01' as TIMESTAMP), CAST('2019-04-01' as TIMESTAMP), interval '30 mins') AS dates
The query below also works in Oracle, but only for a whole-day interval:
select to_date('2019-03-01','YYYY-MM-DD') + rownum -1 as dates
from all_objects
where rownum <= to_date('2019-03-06','YYYY-MM-DD')-to_date('2019-03-01','YYYY-MM-DD')+1
I want the same result in Oracle as for the query below:
SELECT dates
FROM generate_series(CAST('2019-03-01' as TIMESTAMP), CAST('2019-04-01' as TIMESTAMP), interval '30 mins') AS dates
Use a hierarchical query:
SELECT DATE '2019-03-01' + ( LEVEL - 1 ) * INTERVAL '30' MINUTE AS dates
FROM DUAL
CONNECT BY DATE '2019-03-01' + ( LEVEL - 1 ) * INTERVAL '30' MINUTE <= DATE '2019-04-01';
Output:
| DATES |
| :------------------ |
| 2019-03-01 00:00:00 |
| 2019-03-01 00:30:00 |
| 2019-03-01 01:00:00 |
| 2019-03-01 01:30:00 |
| 2019-03-01 02:00:00 |
| 2019-03-01 02:30:00 |
| 2019-03-01 03:00:00 |
| 2019-03-01 03:30:00 |
| 2019-03-01 04:00:00 |
| 2019-03-01 04:30:00 |
| 2019-03-01 05:00:00 |
| 2019-03-01 05:30:00 |
...
| 2019-03-31 19:30:00 |
| 2019-03-31 20:00:00 |
| 2019-03-31 20:30:00 |
| 2019-03-31 21:00:00 |
| 2019-03-31 21:30:00 |
| 2019-03-31 22:00:00 |
| 2019-03-31 22:30:00 |
| 2019-03-31 23:00:00 |
| 2019-03-31 23:30:00 |
| 2019-04-01 00:00:00 |
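The arithmetic behind the hierarchical query can be reproduced in Python to check how many rows to expect. This is a sketch under assumptions, not Oracle code: the `level` variable stands in for Oracle's LEVEL pseudocolumn, and the function name `series` is made up.

```python
from datetime import datetime, timedelta


def series(start, end, step):
    """Return start + (level - 1) * step for level = 1, 2, ... while <= end."""
    out = []
    level = 1
    # Same condition as the CONNECT BY clause: stop once the value passes end.
    while start + (level - 1) * step <= end:
        out.append(start + (level - 1) * step)
        level += 1
    return out


dates = series(datetime(2019, 3, 1), datetime(2019, 4, 1), timedelta(minutes=30))
print(len(dates), dates[0], dates[-1])
```

March has 31 days of 48 half-hour steps each, plus the inclusive end point at 2019-04-01 00:00:00, which gives 1489 rows.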

Splitting interval overlapping more days in PostgreSQL

I have a PostgreSQL table containing start timestamp and duration time.
timestamp | interval
------------------------------
2018-01-01 15:00:00 | 06:00:00
2018-01-02 23:00:00 | 04:00:00
2018-01-04 09:00:00 | 2 days 16 hours
What I would like is to have the interval split by day, like this:
timestamp | interval
------------------------------
2018-01-01 15:00:00 | 06:00:00
2018-01-02 23:00:00 | 01:00:00
2018-01-03 00:00:00 | 03:00:00
2018-01-04 09:00:00 | 15:00:00
2018-01-05 00:00:00 | 24:00:00
2018-01-06 00:00:00 | 24:00:00
2018-01-07 00:00:00 | 01:00:00
I have been playing with generate_series(), width_bucket() and range functions, but I still can't find a workable solution. Is there an existing or working solution?
I am not sure about all edge cases, but this seems to work:
with c as (select *,min(t) over (), max(t+i) over (), tsrange(date_trunc('day',t),t+i) tr from t)
, mid as (
select distinct t,i,g,tr
, case when g < t then t else g end tt
from c
right outer join (select generate_series(date_trunc('day',min),date_trunc('day',max),'1 day') g from c) e on g <@ tr order by 3,1
)
select
tt
, i
, case when tt+'1 day' > upper(tr) and t < g then upper(tr)::time::interval when upper(tr) - lower(tr) < '1 day' then i else g+'1 day' - tt end
from mid
order by tt;
tt | i | case
---------------------+-----------------+----------
2018-01-01 15:00:00 | 06:00:00 | 06:00:00
2018-01-02 23:00:00 | 04:00:00 | 01:00:00
2018-01-03 00:00:00 | 04:00:00 | 03:00:00
2018-01-04 09:00:00 | 2 days 16:00:00 | 15:00:00
2018-01-05 00:00:00 | 2 days 16:00:00 | 1 day
2018-01-06 00:00:00 | 2 days 16:00:00 | 1 day
2018-01-07 00:00:00 | 2 days 16:00:00 | 01:00:00
(7 rows)
Also keep in mind that timestamp without time zone can give misleading results when comparing timestamps, for example around daylight-saving changes.
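The expected output can be cross-checked with a short Python sketch of the splitting rule itself. This is not a translation of the query above, just the plain algorithm under the assumption that each piece should end at the next midnight or at the end of the interval, whichever comes first; the function name `split_by_day` is made up.

```python
from datetime import datetime, timedelta


def split_by_day(start, duration):
    """Split [start, start + duration) into pieces that never cross midnight."""
    end = start + duration
    pieces = []
    current = start
    while current < end:
        # Midnight that follows `current`.
        next_midnight = (datetime.combine(current.date(), datetime.min.time())
                         + timedelta(days=1))
        piece_end = min(next_midnight, end)
        pieces.append((current, piece_end - current))
        current = piece_end
    return pieces


# The three rows from the question.
table = [
    (datetime(2018, 1, 1, 15), timedelta(hours=6)),
    (datetime(2018, 1, 2, 23), timedelta(hours=4)),
    (datetime(2018, 1, 4, 9), timedelta(days=2, hours=16)),
]

result = [piece for start, duration in table
          for piece in split_by_day(start, duration)]
for ts, length in result:
    print(ts, length)
```

The sketch uses naive datetimes, so it shares the caveat about time zones: with daylight-saving changes, a calendar day is not always 24 hours long.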