How to extract the hour of day from an epoch and count each instance that occurs during that hour? - sql

I have a question that I feel is pretty straightforward but is giving me some issues.
I have a column in table X called start_time which is an epoch. I want to extract the hour of day from it and count the number of rides that occurred during each hour.
So the output will end up being a bar chart with x values 0-23 and the y values being the number of instances that occur (bike rides, for example).
Here is what I have now, which isn't giving me the correct output:
select extract(hour from to_timestamp(start_time)::date) as hr,
count(*) as ct
from x
group by hr
order by hr asc
Any hints or help are appreciated.
Thanks

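The ::date cast in your query throws away the time of day, so there is no hour left to extract. You can use arithmetic instead: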
select floor( (start_time % (24 * 60 * 60)) / (60 * 60) ) as hour,
count(*)
from x
group by hour;
Or convert to a date/time and extract the hour:
select extract(hour from '1970-01-01'::date + start_time * interval '1 second') as hour, count(*)
from x
group by hour;
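To make the two approaches concrete, here is a minimal Python sketch of the same idea, not the SQL itself. The sample epochs are made up for illustration, and everything is computed in UTC, which is an assumption; a session time zone would shift the hours.

```python
from collections import Counter
from datetime import datetime, timezone


def hour_by_arithmetic(epoch_seconds: int) -> int:
    # Seconds elapsed since midnight UTC, reduced to whole hours.
    return (epoch_seconds % 86_400) // 3_600


def hour_by_conversion(epoch_seconds: int) -> int:
    # Convert to a UTC datetime first, then read the hour field.
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).hour


# Hypothetical start_time values (epoch seconds), not taken from the question.
start_times = [1_600_000_000, 1_600_000_500, 1_600_003_600, 1_600_086_400]

# Equivalent of "select hour, count(*) ... group by hour".
rides_per_hour = Counter(hour_by_arithmetic(t) for t in start_times)
```

Both helpers return the same hour for every sample, so either one can feed the count.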

Related

How can I aggregate time series data in postgres from a specific timestamp & fixed intervals (e.g. 1 hour , 1 day, 7 day ) without using date_trunc()?

I have a postgres table "Generation" with half-hourly timestamps spanning 2009 - present with energy data:
I need to aggregate (average) the data across different intervals from specific time points, for example data from 2021-01-07T00:00:00.000Z for one year at 7-day intervals, or 3 months at 1-day intervals, or 7 days at 1-hour intervals, etc. date_trunc() partly solves this, but it snaps weeks back to the preceding Monday, e.g.
SELECT date_trunc('week', "DATETIME") AS week,
count(*),
AVG("GAS") AS gas,
AVG("COAL") AS coal
FROM "Generation"
WHERE "DATETIME" >= '2021-01-07T00:00:00.000Z' AND "DATETIME" <= '2022-01-06T23:59:59.999Z'
GROUP BY week
ORDER BY week ASC
;
returns the first time series interval as 2021-01-04 with an incorrect count:
week                   count  gas                 coal
"2021-01-04 00:00:00"  192    18291.34375         2321.4427083333335
"2021-01-11 00:00:00"  336    14477.407738095239  2027.547619047619
"2021-01-18 00:00:00"  336    13947.044642857143  1152.047619047619
EDIT: the following returns the correct weekly intervals by computing how many days the start date lies after the start of its week (Monday) and shifting the results by that offset:
WITH vars1 AS (
SELECT '2021-01-07T00:00:00.000Z'::timestamp as start_time,
'2021-01-28T00:00:00.000Z'::timestamp as end_time
),
vars2 AS (
SELECT
((select start_time from vars1)::date - (date_trunc('week', (select start_time from vars1)::timestamp))::date) as diff
)
SELECT date_trunc('week', "DATETIME" - ((select diff from vars2) || ' day')::interval)::date + ((select diff from vars2) || ' day')::interval AS week,
count(*),
AVG("GAS") AS gas,
AVG("COAL") AS coal
FROM "Generation"
WHERE "DATETIME" >= (select start_time from vars1) AND "DATETIME" < (select end_time from vars1)
GROUP BY week
ORDER BY week ASC
returns:
week                   count  gas                 coal
"2021-01-07 00:00:00"  336    17242.752976190477  2293.8541666666665
"2021-01-14 00:00:00"  336    13481.497023809523  1483.0565476190477
"2021-01-21 00:00:00"  336    15278.854166666666  1592.7916666666667
For daily or hourly intervals (swap day for hour) you can use the following:
SELECT date_trunc('day', "DATETIME") AS day,
count(*),
AVG("GAS") AS gas,
AVG("COAL") AS coal
FROM "Generation"
WHERE "DATETIME" >= '2022-01-07T00:00:00.000Z' AND "DATETIME" < '2022-01-10T23:59:59.999Z'
GROUP BY day
ORDER BY day ASC
;
To select complete weeks, change the WHERE clause to something like:
WHERE "DATETIME" >= date_trunc('week','2021-01-07T00:00:00.000Z'::timestamp)
AND "DATETIME" < (date_trunc('week','2022-01-06T23:59:59.999Z'::timestamp) + interval '7' day)::date
This effectively gets the records from January 4, 2021 up to and including January 9, 2022.
Note: I changed <= to < so that the end date itself is excluded.
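As a rough check of those bounds, the Monday snapping can be reproduced in a few lines of Python. This is only a sketch: trunc_week is a hypothetical helper that mimics date_trunc('week', ...), and the row count assumes exactly one row per half hour with no gaps.

```python
from datetime import datetime, timedelta


def trunc_week(ts: datetime) -> datetime:
    """Midnight of the Monday on or before ts (Monday has weekday() == 0)."""
    monday = ts - timedelta(days=ts.weekday())
    return monday.replace(hour=0, minute=0, second=0, microsecond=0)


query_start = datetime(2021, 1, 7)
query_end = datetime(2022, 1, 6, 23, 59, 59, 999000)

lower = trunc_week(query_start)                      # inclusive lower bound
upper = trunc_week(query_end) + timedelta(days=7)    # exclusive upper bound
last_included_day = (upper - timedelta(microseconds=1)).date()

# Half-hourly rows of the first Monday-based week that fall on or after
# the original start date (assumes no missing rows).
rows_from_start = (lower + timedelta(days=7) - query_start) // timedelta(minutes=30)
```

The last value also explains the short first bucket in the question: only four of that week's seven days lie on or after 2021-01-07, which is 192 half-hour rows instead of 336.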
EDIT:
When you want your weeks to start on January 7, you can group by the week number relative to that date:
(date_part('day',(d-'2021-01-07'))::int-(date_part('day',(d-'2021-01-07'))::int % 7))/7
(where d is the column containing the datetime-value.)
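The grouping expression is plain integer arithmetic on the day difference, which the following Python sketch mirrors. The names ANCHOR, week_index and week_start are made up for illustration, and the sketch assumes d is never earlier than the anchor date (the modulo operator treats negative numbers differently in Python and Postgres).

```python
from datetime import datetime, timedelta

ANCHOR = datetime(2021, 1, 7)  # the date the weeks should start on


def week_index(d: datetime, anchor: datetime = ANCHOR) -> int:
    # Whole days since the anchor, rounded down to a multiple of 7, then / 7.
    days = (d - anchor).days
    return (days - days % 7) // 7


def week_start(d: datetime, anchor: datetime = ANCHOR) -> datetime:
    # First instant of the anchored week that contains d.
    return anchor + timedelta(days=7 * week_index(d, anchor))
```

For example, week_index(datetime(2021, 1, 13, 23, 30)) is 0 and week_index(datetime(2021, 1, 14)) is 1, so the bucket boundary falls on a Thursday rather than a Monday.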
see: dbfiddle
EDIT:
This gets the list starting from a given date, at a specified interval.
see DBFIDDLE
WITH vars AS (
SELECT
'2021-01-07T00:00:00.000Z'::timestamp AS qstart,
'2022-01-06T23:59:59.999Z'::timestamp AS qend,
7 as qint,
INTERVAL '1 DAY' as qinterval
)
SELECT
(select date(qstart) FROM vars) + (SELECT qinterval from vars) * ((date_part('day',("DATETIME"-(select date(qstart) FROM vars)))::int-(date_part('day',("DATETIME"-(select date(qstart) FROM vars)))::int % (SELECT qint FROM vars)))::int) AS week,
count(*),
AVG("GAS") AS gas,
AVG("COAL") AS coal
FROM "Generation"
WHERE "DATETIME" >= (SELECT qstart FROM vars) AND "DATETIME" <= (SELECT qend FROM vars)
GROUP BY week
ORDER BY week
;
I added the WITH vars CTE so that the variables sit at the top and the rest of the query does not need to be touched. (Idea borrowed here)
I only tested with qint=7, qinterval='1 DAY' and qint=14, qinterval='1 DAY' (but other values should work too...)
Using the function EXTRACT you may calculate the difference in days, weeks and hours between your timestamp ts and the start_date as follows
Difference in Days
extract (day from ts - start_date)
Difference in Weeks
This is the difference in days divided by 7 and truncated
trunc(extract (day from ts - start_date)/7)
Difference in Hours
This is the difference in days times 24 plus the hour component of the difference
extract (day from ts - start_date)*24 + extract (hour from ts - start_date)
The difference can be used in GROUP BY directly. E.g. for week grouping, the first group has difference 0 (the same week as the start date), the next group has difference 1 (the following week), and so on.
Sample Example
I'm using a CTE for the start date to avoid multiple copies of the parameter
with start_time as
(select DATE'2021-01-07' as start_ts),
prep as (
select
ts,
extract (day from ts - (select start_ts from start_time)) day_diff,
trunc(extract (day from ts - (select start_ts from start_time))/7) week_diff,
extract (day from ts - (select start_ts from start_time)) *24 + extract (hour from ts - (select start_ts from start_time)) hour_diff,
value
from test_table
where ts >= (select start_ts from start_time)
)
select week_diff, avg(value)
from prep
group by week_diff order by 1
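The three differences and the weekly average can be sketched in Python as follows. This is an illustration of the arithmetic only; the rows are invented sample data, and diffs is a hypothetical helper standing in for the prep CTE.

```python
from collections import defaultdict
from datetime import datetime

START = datetime(2021, 1, 7)


def diffs(ts: datetime):
    delta = ts - START
    day_diff = delta.days                                # extract(day from ...)
    week_diff = day_diff // 7                            # trunc(day_diff / 7)
    hour_diff = day_diff * 24 + delta.seconds // 3600    # days * 24 + hour part
    return day_diff, week_diff, hour_diff


# Hypothetical (ts, value) rows, all on or after START.
rows = [
    (datetime(2021, 1, 7, 0, 30), 10.0),
    (datetime(2021, 1, 13, 23, 30), 20.0),
    (datetime(2021, 1, 14, 5, 0), 40.0),
]

# Equivalent of "select week_diff, avg(value) ... group by week_diff".
groups = defaultdict(list)
for ts, value in rows:
    groups[diffs(ts)[1]].append(value)
weekly_avg = {week: sum(vals) / len(vals) for week, vals in sorted(groups.items())}
```

Swapping the index 1 for 0 or 2 in the grouping line gives daily or hourly buckets instead.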

Delta in minutes is not shown correctly

The column created_at has type timestamp without time zone.
I need to get the delta in minutes between the current date and the created_at column.
Query:
select id, created_at,
extract(minutes from (CURRENT_TIMESTAMP) - created_at) as delta
from shop_order order by created_at
And here is the result:
Why is the delta 19 in the record with id = 20?
The difference is 3 DAYS. Why does it show only 19 minutes?
An interval (which is the result of subtracting two timestamps) consists of several "parts" (similar to a date), and extract returns only the named part, not the whole interval expressed in that unit. If the result of the subtraction is e.g. 3 days 19 minutes, extract will return 19 minutes, similar to the way extract(year ...) or extract(month ...) work.
You can extract the number of seconds and then divide that by 60 to get the total duration in minutes:
select id,
created_at,
extract(epoch from CURRENT_TIMESTAMP - created_at) / 60 as delta
from shop_order
order by created_at
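Python's timedelta makes the same distinction between one field of a duration and the whole duration, so it can serve as a small illustration. The two timestamps are made-up values chosen to be 3 days and 19 minutes apart.

```python
from datetime import datetime

# Hypothetical values: the two timestamps are 3 days and 19 minutes apart.
created_at = datetime(2019, 5, 1, 10, 0, 0)
now = datetime(2019, 5, 4, 10, 19, 0)

delta = now - created_at

# Only the minutes field of the duration, like extract(minutes from ...).
minutes_field = (delta.seconds % 3600) // 60

# The whole duration expressed in minutes, like extract(epoch from ...) / 60.
total_minutes = delta.total_seconds() / 60
```

Here minutes_field is 19 while total_minutes is 4339.0, which is the number the question was after.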

How to get epoch function in Oracle

select *
from sample_table scr
where extract('epoch' from systimestamp - scr.created_date)/60 > :defaultTimeOut
This is a Postgres query that I am trying to convert to Oracle.
How do I get the equivalent of epoch in Oracle?
TIA.
You are actually trying to convert an interval (ie the difference between two dates) to a number of seconds. Assuming that created_date is of date datatype, one method is:
select *
from sample_table
where (sysdate - created_date) * 24 * 60 * 60 > :defaultTimeOut
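Rationale: in Oracle, subtracting two dates returns a decimal number that represents the difference in days. You can multiply that by 24 (hours per day), 60 (minutes per hour) and 60 (seconds per minute) to convert that to a number of seconds, then compare it to the target value. Note that the Postgres query divides the epoch by 60 and therefore compares minutes; if :defaultTimeOut is in minutes, multiply by 24 * 60 only.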
If created_date is of timestamp datatype, you can cast it to a date first:
select *
from sample_table
where (sysdate - cast(created_date as date)) * 24 * 60 * 60 > :defaultTimeOut
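The unit conversion can be sanity-checked outside the database. The Python sketch below imitates a difference expressed as a fraction of a day and scales it back up; day_diff and the sample timestamps are assumptions made for illustration, not Oracle behaviour reproduced exactly.

```python
from datetime import datetime

SECONDS_PER_DAY = 24 * 60 * 60


def day_diff(later: datetime, earlier: datetime) -> float:
    # Difference as a fractional number of days, as date subtraction is
    # described above.
    return (later - earlier).total_seconds() / SECONDS_PER_DAY


# Hypothetical timestamps 30 minutes apart.
created_date = datetime(2020, 1, 1, 12, 0, 0)
current = datetime(2020, 1, 1, 12, 30, 0)

days = day_diff(current, created_date)
elapsed_seconds = days * 24 * 60 * 60   # matches extract('epoch' from ...)
elapsed_minutes = days * 24 * 60        # matches extract('epoch' from ...) / 60
```

Which of the two values to compare against the timeout depends on the unit the timeout parameter is stored in.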

Getting a count of timestamps that fall into each 5 minute bucket

I have a table of timestamps. I'd like to group them into 5 minute buckets and get the count of timestamps in each bucket. I'm having some trouble getting the SQL quite right. I am using Postgres. It's telling me the timestamp column in the last line doesn't exist, but it's defined as an alias.
SELECT
TIMESTAMP WITH TIME ZONE 'epoch' +
INTERVAL '1 second' * round(extract('epoch' from my_timestamp) / 300) * 300
as timestamp,
count(my_timestamp)
FROM logs
GROUP BY
round(extract('epoch' from timestamp) / 300)
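I think your GROUP BY is off. In Postgres an output alias can be used in GROUP BY only as a bare name, not inside an expression; inside round(extract(...) / 300) the name timestamp is looked up as an input column, which does not exist. Group by the alias itself (the VALUES list here just stands in for your logs table):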
SELECT TIMESTAMP WITH TIME ZONE 'epoch' + INTERVAL '1 second' * round(extract('epoch' from my_timestamp) / 300) * 300 as timestamp,
count(*)
FROM (values (now())) logs(my_timestamp)
GROUP BY timestamp
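For comparison, here is a small Python sketch of 5 minute bucketing. One deliberate difference, stated as an assumption about what is wanted: it rounds down to the start of the bucket, whereas round() in the query assigns each timestamp to the nearest 5 minute mark. The log timestamps are invented.

```python
from collections import Counter
from datetime import datetime, timezone

BUCKET_SECONDS = 300  # 5 minutes


def bucket_start(ts: datetime) -> datetime:
    # Floor the epoch to a multiple of the bucket size, then convert back.
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % BUCKET_SECONDS, tz=timezone.utc)


# Hypothetical log timestamps (UTC).
logs = [
    datetime(2020, 1, 1, 12, 1, 0, tzinfo=timezone.utc),
    datetime(2020, 1, 1, 12, 4, 59, tzinfo=timezone.utc),
    datetime(2020, 1, 1, 12, 5, 0, tzinfo=timezone.utc),
]

counts = Counter(bucket_start(ts) for ts in logs)
```

With flooring, 12:04:59 lands in the 12:00 bucket; with rounding to the nearest mark it would be counted under 12:05.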

Count by hour interval

I have written an SQL query in PostgreSQL that works fine; it gets me the amount of work done per employee for every hour
SELECT COUNT(work_done) AS count, EXTRACT(hour from start_time) AS hour
FROM c_call
WHERE start_time >= '2018-10-13 00:00:00'
GROUP BY employee_id;
It's perfect if an employee was active in every hourly interval, but when an hour has no data for an employee it is omitted. How can I make it work so that the result contains a row for each interval, with the value field set to zero if the employee didn't work during that hour?
You can generate an hour series using the generate_series function:
SELECT * FROM generate_series(0, 23) AS foo(bar)
And then use it to fill the hour gaps:
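WITH employee_call AS (
-- one row per employee and hour that actually has calls
SELECT
employee_id,
EXTRACT(hour from start_time) AS hour_fraction,
COUNT(work_done) AS count
FROM
c_call
WHERE
start_time >= '2018-10-13 00:00:00'
GROUP BY
employee_id,
EXTRACT(hour from start_time)
), hour_series (hour_fraction) AS (
SELECT generate_series(0, 23)
), employees AS (
SELECT DISTINCT employee_id FROM c_call
)
SELECT
e.employee_id,
h.hour_fraction,
COALESCE(c.count, 0) AS count
FROM
employees e
CROSS JOIN hour_series h -- every employee paired with every hour
LEFT JOIN employee_call c
ON c.employee_id = e.employee_id
AND c.hour_fraction = h.hour_fraction
ORDER BY
e.employee_id,
h.hour_fraction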
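The gap-filling idea is independent of SQL: start from the full list of hours and look up a count for each, defaulting to zero. A minimal Python sketch, with invented employee names and call hours:

```python
from collections import Counter

# Hypothetical data: the hour of day of each call, keyed by employee.
call_hours = {
    "alice": [9, 9, 14],
    "bob": [14],
}


def hourly_counts(hours):
    # One entry per hour 0..23; hours without calls get 0 instead of
    # being dropped.
    seen = Counter(hours)
    return [seen.get(hour, 0) for hour in range(24)]


table = {employee: hourly_counts(hours) for employee, hours in call_hours.items()}
```

Each employee ends up with exactly 24 entries, which is the shape a left join against the generated hour series is meant to produce.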