compare oracle row count between different dates hourly - sql

I am using this SQL to count rows per hour over the last three days:
select trunc(sendtime ,'hh24') , count(*)
FROM t_sendedmsglog
where msgcontext like '%sm_%_tone_succ%' and sendtime > sysdate -3
group by trunc(sendtime ,'hh24')
order by trunc(sendtime ,'hh24') desc;
The result looks like this, for example:
#|TRUNC(SENDTIME,'HH24')|COUNT(*)|
1|10/15/2020 12:00:00 PM|593|
2|10/15/2020 11:00:00 AM|889|
3|10/15/2020 10:00:00 AM|854|
4|10/15/2020 9:00:00 AM|1027|
5|10/15/2020 8:00:00 AM|8409|
.
.
.
12|10/15/2020 1:00:00 AM|101|
13|10/15/2020|281|
14|10/14/2020 11:00:00 PM|722|
15|10/14/2020 10:00:00 PM|1381|
16|10/14/2020 9:00:00 PM|2123|
.
.
25|10/14/2020 12:00:00 PM|1195|
26|10/14/2020 11:00:00 AM|1699|
27|10/14/2020 10:00:00 AM|747|
28|10/14/2020 9:00:00 AM|827|
.
.
40|10/13/2020 9:00:00 PM|2058|
41|10/13/2020 8:00:00 PM|2800|
How can I make the result look like the table below instead, so that I can compare the counts for the same hour across different days?
hour|10/12/2020|10/13/2020|10/14/2020|count(*)
11:00:00 PM|618 |509 |722 |
10:00:00 PM|3181|1144|1381|
09:00:00 PM|3520|2058|2123|
08:00:00 PM|3688|2800|9347|
07:00:00 PM|3648|3166|3469|
06:00:00 PM|3628|2973|4518|
05:00:00 PM|3644|2429|3607|
04:00:00 PM|3652|3678|2291|
03:00:00 PM|1017|7711|819 |
02:00:00 PM|814 |7693|1310|
01:00:00 PM|856 |825 |848 |
12:00:00 PM|558 |1531|1195|
11:00:00 AM|0 |1132|1699|
10:00:00 AM|0 |732 |747 |
09:00:00 AM|0 |709 |827 |
08:00:00 AM|0 |1256|947 |
07:00:00 AM|0 |1465|1502|
06:00:00 AM|0 |749 |780 |
05:00:00 AM|0 |181 |169 |
04:00:00 AM|0 |46 |32 |
03:00:00 AM|0 |23 |34 |
02:00:00 AM|0 |46 |39 |
01:00:00 AM|0 |82 |81 |
00:00:00 AM|0 | |218 |

Use conditional aggregation:
select to_char(sendtime, 'hh24') as hour_of_day, count(*) as total,
sum(case when trunc(sendtime) = trunc(sysdate) - interval '2' day then 1 else 0 end) as two_days_ago,
sum(case when trunc(sendtime) = trunc(sysdate) - interval '1' day then 1 else 0 end) as yesterday,
sum(case when trunc(sendtime) = trunc(sysdate) then 1 else 0 end) as today
from t_sendedmsglog
where msgcontext like '%sm_%_tone_succ%' and
sendtime >= trunc(sysdate) - interval '2' day
group by to_char(sendtime, 'hh24')
order by to_char(sendtime, 'hh24') desc;
The query groups by to_char(sendtime, 'hh24') rather than trunc(sendtime, 'hh24'): the truncated value still carries the date, so it would produce one row per date and hour instead of one row per hour of day.
Note that I tweaked the date comparison in the where clause as well. In Oracle, sysdate has a time component, which you don't care about for filtering purposes.
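To try the pivot without an Oracle instance, here is a minimal Python sketch that applies the same conditional-aggregation idea to an in-memory SQLite database. The sample rows, the hard-coded dates, and the aliases hour_of_day, d13, d14 and d15 are made up for illustration, and SQLite's strftime and date stand in for Oracle's to_char and trunc.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_sendedmsglog (sendtime TEXT)")
# Hypothetical sample data: a few messages spread over three days.
conn.executemany(
    "INSERT INTO t_sendedmsglog VALUES (?)",
    [
        ("2020-10-13 09:05:00",),
        ("2020-10-13 09:40:00",),
        ("2020-10-14 09:10:00",),
        ("2020-10-14 10:15:00",),
        ("2020-10-15 10:20:00",),
        ("2020-10-15 10:45:00",),
    ],
)

# One output row per hour of day; one conditional SUM per calendar day.
rows = conn.execute(
    """
    SELECT strftime('%H', sendtime) AS hour_of_day,
           SUM(CASE WHEN date(sendtime) = '2020-10-13' THEN 1 ELSE 0 END) AS d13,
           SUM(CASE WHEN date(sendtime) = '2020-10-14' THEN 1 ELSE 0 END) AS d14,
           SUM(CASE WHEN date(sendtime) = '2020-10-15' THEN 1 ELSE 0 END) AS d15,
           COUNT(*) AS total
    FROM t_sendedmsglog
    GROUP BY hour_of_day
    ORDER BY hour_of_day DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Each tuple holds the hour of day, the three per-day counts, and the total, so the same hour on different days ends up side by side on one row.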

Related

postgres query to group the records by hourly interval with date field

I have a table that has some file input data with file_id and file_input_date. I want to filter / group these file_ids depending on file_input_date. The problem is that my date is in the format YYYY-MM-DD HH:mm:ss, and I want to group by hour, not just by date.
Edit: some sample data
file_id | file_input_date
597872 | 2023-01-12 16:06:22.92879
497872 | 2023-01-11 16:06:22.92879
397872 | 2023-01-11 16:06:22.92879
297872 | 2023-01-11 17:06:22.92879
297872 | 2023-01-11 17:06:22.92879
297872 | 2023-01-11 17:06:22.92879
297872 | 2023-01-11 18:06:22.92879
what I want to see is
1 for 2023-01-12 16:06
2 for 2023-01-11 16:06
3 for 2023-01-11 17:06
1 for 2023-01-11 18:06
The output format will be different, but this roughly shows what I want.
You could convert the dates to strings with the format you want and group by it:
SELECT TO_CHAR(file_input_date, 'YYYY-MM-DD HH24:MI'), COUNT(*)
FROM mytable
GROUP BY TO_CHAR(file_input_date, 'YYYY-MM-DD HH24:MI')
To group by hour rather than minute, use date_trunc:
create table date_grp (file_id integer, file_input_date timestamp);
INSERT INTO date_grp VALUES
(597872, '2023-01-12 16:06:22.92879'),
(497872, '2023-01-11 16:06:22.92879'),
(397872, '2023-01-11 16:06:22.92879'),
(297872, '2023-01-11 17:06:22.92879'),
(297872, '2023-01-11 17:06:22.92879'),
(297872, '2023-01-11 17:06:22.92879'),
(297872, '2023-01-11 18:06:22.92879');
SELECT
date_trunc('hour', file_input_date),
count(date_trunc('hour', file_input_date))
FROM
date_grp
GROUP BY
date_trunc('hour', file_input_date);
date_trunc | count
---------------------+-------
01/11/2023 18:00:00 | 1
01/11/2023 17:00:00 | 3
01/12/2023 16:00:00 | 1
01/11/2023 16:00:00 | 2
(4 rows)
If you want minute granularity instead:
SELECT
date_trunc('minute', file_input_date),
count(date_trunc('minute', file_input_date))
FROM
date_grp
GROUP BY
date_trunc('minute', file_input_date);
date_trunc | count
---------------------+-------
01/11/2023 18:06:00 | 1
01/11/2023 16:06:00 | 2
01/12/2023 16:06:00 | 1
01/11/2023 17:06:00 | 3
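For readers who want to check the hourly grouping outside the database, the following Python sketch mimics date_trunc('hour', ...) by zeroing the minute, second and microsecond fields and then counting. The helper name truncate_to_hour is hypothetical; the timestamps are the sample rows from the question.

```python
from collections import Counter
from datetime import datetime

# Sample timestamps copied from the question.
samples = [
    "2023-01-12 16:06:22.92879",
    "2023-01-11 16:06:22.92879",
    "2023-01-11 16:06:22.92879",
    "2023-01-11 17:06:22.92879",
    "2023-01-11 17:06:22.92879",
    "2023-01-11 17:06:22.92879",
    "2023-01-11 18:06:22.92879",
]


def truncate_to_hour(ts: datetime) -> datetime:
    """Drop everything below the hour, like date_trunc('hour', ts)."""
    return ts.replace(minute=0, second=0, microsecond=0)


parsed = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S.%f") for s in samples]
counts_by_hour = Counter(truncate_to_hour(ts) for ts in parsed)

for hour, n in sorted(counts_by_hour.items()):
    print(hour, n)
```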

SQL time-series resampling

I have a ClickHouse table with rows like this:
id | created_at
---------------------+---------------------
6962098097124188161 | 2022-07-01 00:00:00
6968111372399976448 | 2022-07-02 00:00:00
6968111483775524864 | 2022-07-03 00:00:00
6968465518567268352 | 2022-07-04 00:00:00
6968952917160271872 | 2022-07-07 00:00:00
6968952924479332352 | 2022-07-09 00:00:00
I need to resample the time series and get a cumulative count per date, like this:
created_at | count
---------------------+-------
2022-07-01 00:00:00 | 1
2022-07-02 00:00:00 | 2
2022-07-03 00:00:00 | 3
2022-07-04 00:00:00 | 4
2022-07-05 00:00:00 | 4
2022-07-06 00:00:00 | 4
2022-07-07 00:00:00 | 5
2022-07-08 00:00:00 | 5
2022-07-09 00:00:00 | 6
I've tried this
SELECT
arrayJoin(
timeSlots(
MIN(created_at),
toUInt32(24 * 3600 * 10),
24 * 3600
)
) as ts,
SUM(
COUNT(*)
) OVER (
ORDER BY
ts
)
FROM
table
but it counts all rows.
How can I get expected result?
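Why not group by the date of created_at? For example:
select toDate(created_at), count(*) from table_name group by toDate(created_at)
Note that this gives a plain per-day count; it does not fill in the missing days or accumulate a running total.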
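The expected output in the question needs two extra steps beyond a per-day count: filling the days that have no rows, and accumulating a running total. A minimal Python sketch of that logic, using the dates from the question (the function name cumulative_daily_counts is hypothetical, and this illustrates the resampling idea rather than a ClickHouse query):

```python
from collections import Counter
from datetime import date, timedelta

# created_at dates from the question (time of day is always midnight).
created = [
    date(2022, 7, 1),
    date(2022, 7, 2),
    date(2022, 7, 3),
    date(2022, 7, 4),
    date(2022, 7, 7),
    date(2022, 7, 9),
]


def cumulative_daily_counts(days):
    """Return (day, running_count) for every day from first to last, gaps included."""
    per_day = Counter(days)
    current, last = min(per_day), max(per_day)
    running = 0
    result = []
    while current <= last:
        running += per_day[current]  # a Counter yields 0 for days without rows
        result.append((current, running))
        current += timedelta(days=1)
    return result


resampled = cumulative_daily_counts(created)
for day, count in resampled:
    print(day, count)
```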

How to aggregate rows in the range of timestamp in vertica db (vsql)

Suppose I have a table with data like this:
ts | bandwidth_bytes
---------------------+-----------------
2021-08-27 22:00:00 | 3792
2021-08-27 21:45:00 | 1164
2021-08-27 21:30:00 | 7062
2021-08-27 21:15:00 | 3637
2021-08-27 21:00:00 | 2472
2021-08-27 20:45:00 | 1328
2021-08-27 20:30:00 | 1932
2021-08-27 20:15:00 | 1434
2021-08-27 20:00:00 | 1530
2021-08-27 19:45:00 | 1457
2021-08-27 19:30:00 | 1948
2021-08-27 19:15:00 | 1160
I need to output something like this:
ts | bandwidth_bytes
---------------------+-----------------
2021-08-27 22:00:00 | 15655
2021-08-27 21:00:00 | 7166
2021-08-27 20:00:00 | 6095
I want to sum bandwidth_bytes over each one-hour window of data, and I want to do this in vsql specifically.
The table has more columns; I have shown only these two for simplicity.
You can use date_trunc():
select date_trunc('hour', ts) as ts_hh, sum(bandwidth_bytes)
from t
group by ts_hh;
Use Vertica's lovely function TIME_SLICE().
Not only can you slice by hour; you can also use slices of 2 or 3 hours, which DATE_TRUNC() does not offer.
You seem to want everything between 20:00:01 and 21:00:00 to belong to a time slice of 21:00:00. In both DATE_TRUNC() and TIME_SLICE(), however, it's 20:00:00 to 20:59:59 that belongs to the same time slice. So I subtracted one second before applying TIME_SLICE().
WITH
-- your in data ...
indata(ts,bandwidth_bytes) AS (
SELECT TIMESTAMP '2021-08-27 22:00:00',3792
UNION ALL SELECT TIMESTAMP '2021-08-27 21:45:00',1164
UNION ALL SELECT TIMESTAMP '2021-08-27 21:30:00',7062
UNION ALL SELECT TIMESTAMP '2021-08-27 21:15:00',3637
UNION ALL SELECT TIMESTAMP '2021-08-27 21:00:00',2472
UNION ALL SELECT TIMESTAMP '2021-08-27 20:45:00',1328
UNION ALL SELECT TIMESTAMP '2021-08-27 20:30:00',1932
UNION ALL SELECT TIMESTAMP '2021-08-27 20:15:00',1434
UNION ALL SELECT TIMESTAMP '2021-08-27 20:00:00',1530
UNION ALL SELECT TIMESTAMP '2021-08-27 19:45:00',1457
UNION ALL SELECT TIMESTAMP '2021-08-27 19:30:00',1948
UNION ALL SELECT TIMESTAMP '2021-08-27 19:15:00',1160
)
SELECT
TIME_SLICE(ts - INTERVAL '1 SECOND' ,1,'HOUR','END') AS ts
, SUM(bandwidth_bytes) AS bandwidth_bytes
FROM indata
GROUP BY 1
ORDER BY 1 DESC;
ts | bandwidth_bytes
---------------------+-----------------
2021-08-27 22:00:00 | 15655
2021-08-27 21:00:00 | 7166
2021-08-27 20:00:00 | 6095
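The end-aligned slicing used above (subtract one second, then label each row with the end of its hour) can be checked with a short Python sketch. The function name hour_slice_end is hypothetical; the rows are the sample data from the question, and the sketch only covers one-hour slices.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hour_slice_end(ts: datetime) -> datetime:
    """End of the one-hour slice that contains ts minus one second."""
    shifted = ts - timedelta(seconds=1)
    slice_start = shifted.replace(minute=0, second=0, microsecond=0)
    return slice_start + timedelta(hours=1)


rows = [
    ("2021-08-27 22:00:00", 3792),
    ("2021-08-27 21:45:00", 1164),
    ("2021-08-27 21:30:00", 7062),
    ("2021-08-27 21:15:00", 3637),
    ("2021-08-27 21:00:00", 2472),
    ("2021-08-27 20:45:00", 1328),
    ("2021-08-27 20:30:00", 1932),
    ("2021-08-27 20:15:00", 1434),
    ("2021-08-27 20:00:00", 1530),
    ("2021-08-27 19:45:00", 1457),
    ("2021-08-27 19:30:00", 1948),
    ("2021-08-27 19:15:00", 1160),
]

# Sum bandwidth per end-aligned hourly slice.
totals = defaultdict(int)
for ts_text, bandwidth_bytes in rows:
    ts = datetime.strptime(ts_text, "%Y-%m-%d %H:%M:%S")
    totals[hour_slice_end(ts)] += bandwidth_bytes

for slice_end in sorted(totals, reverse=True):
    print(slice_end, totals[slice_end])
```

Because of the one-second shift, a row stamped exactly 21:00:00 lands in the 21:00:00 slice rather than the 22:00:00 one, which is what the expected output in the question requires.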

SQL - move date to within 48 hr window

I have a bunch of historic timestamp dates. Basically, I need to simulate a new date such that the historic dates are moved to within a 48 hour window of the current date.
This is an extract of the date column:
2019-05-07 17:46:57.733 UTC
2019-05-15 13:03:25.247 UTC
2019-05-07 13:27:49.453 UTC
2019-05-11 04:24:02.293 UTC
2019-04-18 08:00:54.660 UTC
2019-04-25 05:34:36.777 UTC
2019-05-14 16:48:07.863 UTC
Assume the current date is 2019-10-03 15:00:00. The resulting dates should then fall between 2019-10-01 15:00:00 and 2019-10-03 15:00:00.
The expected results should be the following.
2019-10-02 17:46:57.733 UTC
2019-10-03 13:03:25.247 UTC
2019-10-03 13:27:49.453 UTC
2019-10-03 04:24:02.293 UTC
2019-10-02 08:00:54.660 UTC
2019-10-02 05:34:36.777 UTC
2019-10-01 16:48:07.863 UTC
Why not just construct two days of random timestamps?
-- subtract a random number of seconds (up to 48 hours) so the result lies in the past
select timestamp_sub(current_timestamp(), interval cast(rand() * (60 * 60 * 24 * 2) as int64) second)
from t
It feels like you are looking for a random date function.
CREATE TEMP FUNCTION random_date()
RETURNS DATE
AS ( DATE_SUB(CURRENT_DATE(), INTERVAL CAST(FLOOR(RAND() * 29 / 10) AS INT64) DAY));
with data as (
select "2019-05-07 17:46:57.733 UTC" as date_time UNION ALL
select "2019-05-15 13:03:25.247 UTC" UNION ALL
select "2019-05-07 13:27:49.453 UTC" UNION ALL
select "2019-05-11 04:24:02.293 UTC" UNION ALL
select "2019-04-18 08:00:54.660 UTC" UNION ALL
select "2019-04-25 05:34:36.777 UTC" UNION ALL
select "2019-05-14 16:48:07.863 UTC" )
SELECT
CONCAT(FORMAT_DATE("%Y-%m-%d", random_date()), " ", SUBSTR(date_time, 12))
FROM data;
Output:
+-----------------------------+
| f0_ |
+-----------------------------+
| 2019-10-01 17:46:57.733 UTC |
| 2019-10-01 13:03:25.247 UTC |
| 2019-10-02 13:27:49.453 UTC |
| 2019-10-03 04:24:02.293 UTC |
| 2019-10-03 08:00:54.660 UTC |
| 2019-10-03 05:34:36.777 UTC |
| 2019-10-02 16:48:07.863 UTC |
+-----------------------------+
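If the goal is to keep each row's time of day and only move the date, a deterministic alternative is to place that time of day on the current date and step back one day when the result would lie in the future. The Python sketch below shows this under stated assumptions: the function name shift_into_window is hypothetical, timestamps are naive datetimes treated as UTC, and "now" is fixed to the value from the question.

```python
from datetime import datetime, timedelta


def shift_into_window(ts: datetime, now: datetime) -> datetime:
    """Most recent moment at or before `now` with the same time of day as ts."""
    candidate = datetime.combine(now.date(), ts.time())
    if candidate > now:
        candidate -= timedelta(days=1)
    return candidate


now = datetime(2019, 10, 3, 15, 0, 0)
historic = [
    datetime(2019, 5, 7, 17, 46, 57, 733000),
    datetime(2019, 5, 15, 13, 3, 25, 247000),
    datetime(2019, 4, 18, 8, 0, 54, 660000),
]

shifted = [shift_into_window(ts, now) for ts in historic]
for old, new in zip(historic, shifted):
    print(old, "->", new)
```

Every result falls within the 24 hours before now, so it is inside the 48-hour window. Subtracting one more day from a randomly chosen subset would spread the rows over the full window while still preserving the time of day.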

How to group by hour in HANA

I have the following table in HANA :
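vehicle_id | time | roaming_time | parking_time
------------+-------------------------+--------------+--------------
1 | Sep 01,2016 3:09:03 AM | 3 | 9
2 | Sep 01,2016 3:12:03 AM | 6 | 8
1 | Sep 01,2016 9:10:03 AM | 10 | 6
4 | Sep 01,2016 10:09:03 AM | 9 | 3
1 | Sep 01,2016 10:10:03 AM | 10 | 10
4 | Sep 01,2016 12:09:03 AM | 3 | 9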
From this information, I want the sum of roaming_time and the sum of parking_time for each hour across all vehicles, in this format:
time | roaming_time | parking_time
---------------------+--------------+--------------
2016-09-01 00:00:00 | 3 | 9
2016-09-01 01:00:00 | 6 | 8
2016-09-01 02:00:00 | 9 | 6
2016-09-01 03:00:00 | 3 | 6
2016-09-01 04:00:00 | 12 | 3
2016-09-01 05:00:00 | 15 | 8
2016-09-01 06:00:00 | 18 | 4
2016-09-01 07:00:00 | 8 | 3
2016-09-01 08:00:00 | 9 | 4
2016-09-01 09:00:00 | 6 | 6
2016-09-01 10:00:00 | 6 | 9
........
2016-09-01 23:00:00 | 3 | 12
I need to group the following query, which currently computes the overall sums, by hour to get the expected result:
-- t stands in for the table, which the question does not name
select sum(roaming_time) as roaming_time, sum(parking_time) as parking_time
from t
where time >= '2016-09-01 00:00:00'
and time <= '2016-09-01 23:59:59'
I do not know how to do the grouping by hour in HANA. Any help is appreciated.
Here is one method; it splits the time into a date and an hour:
select to_varchar(time, 'YYYY-MM-DD'), hour(time),
sum(roaming_time) as roaming_time, sum(parking_time) as parking_time from t
group by to_varchar(time, 'YYYY-MM-DD'), hour(time)
order by to_varchar(time, 'YYYY-MM-DD'), hour(time);
Use a group by clause with SERIES_ROUND(). Avoid date(), hour() and similar date/time functions on large data sets, as they tend to be slower.
select SERIES_ROUND(time, 'INTERVAL 1 HOUR') as time,
sum(roaming_time) as roaming_time, sum(parking_time) as parking_time from t
group by SERIES_ROUND(time, 'INTERVAL 1 HOUR')
order by SERIES_ROUND(time, 'INTERVAL 1 HOUR');
Another approach is to convert it to a string, especially if no further time calculations are required.
This could look like this:
select to_varchar(time, 'DD.MM.YYYY HH24') as parking_hour,
sum(roaming_time) as roaming_time, sum(parking_time) as parking_time from t
group by to_varchar(time, 'DD.MM.YYYY HH24')
order by to_varchar(time, 'DD.MM.YYYY HH24');