I was wondering if you can help me write a query that does a SELECT count(*), but only includes data from the last hour and groups it by minute.
I have a table with a createdts column, so the timestamp is there. I just want to see how many entries I have in the last hour, with COUNT(*) grouped per minute.
SELECT COUNT(*) FROM mytable
WHERE createdts >= now()::date - interval '1 hour'
GROUP BY 'every minute'
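DATE_TRUNC() does this. Note that now()::date truncates to midnight, so now()::date - interval '1 hour' means 23:00 yesterday rather than one hour ago; use now() directly:
SELECT DATE_TRUNC('minute', createdts), COUNT(*)
FROM mytable
WHERE createdts >= now() - interval '1 hour'
GROUP BY DATE_TRUNC('minute', createdts)
ORDER BY DATE_TRUNC('minute', createdts);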
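If it helps to see what this kind of query computes, here is a rough Python equivalent. It is only a sketch: the function name count_per_minute and the sample timestamps are made up for illustration, and the current time is passed in explicitly so the result is reproducible.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_per_minute(timestamps, now):
    """Count the timestamps from the last hour, bucketed by minute."""
    cutoff = now - timedelta(hours=1)  # plays the role of now() - interval '1 hour'
    buckets = Counter(
        ts.replace(second=0, microsecond=0)  # plays the role of DATE_TRUNC('minute', ts)
        for ts in timestamps
        if ts >= cutoff
    )
    return sorted(buckets.items())  # ORDER BY the truncated minute


# Hypothetical sample data
now = datetime(2024, 1, 1, 12, 30, 0)
rows = [
    datetime(2024, 1, 1, 12, 29, 5),
    datetime(2024, 1, 1, 12, 29, 50),
    datetime(2024, 1, 1, 12, 10, 0),
    datetime(2024, 1, 1, 11, 0, 0),  # older than one hour, so excluded
]
result = count_per_minute(rows, now)
```

Like the SQL, this only returns minutes that actually contain rows; empty minutes are simply absent.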
Hey Pros,
I don't have much SQL experience and would appreciate some hints.
Currently we aggregate our data with Python, and I would like to move this to SQL (PostgreSQL server) where possible.
My goal is a single statement that computes the average of two separate columns over specific time intervals (1 hour, 1 day, 1 week, overall); the events in each period should also be counted.
I can write four separate statements, one per interval, but I am struggling to combine these four selects into one result set.
select
count(id) as hour_count,
camera_name,
round(avg("pconf")) as hour_p_conf,
round(avg("dconf")) as hour_d_conf
from camera_events where timestamp between NOW() - interval '1 HOUR' and NOW() group by camera_name;
select
count(id) as day_count,
camera_name,
round(avg("pconf")) as day_p_conf,
round(avg("dconf")) as day_d_conf
from camera_events where timestamp between NOW() - interval '1 DAY' and NOW() group by camera_name;
select
count(id) as week_count,
camera_name,
round(avg("pconf")) as week_p_conf,
round(avg("dconf")) as week_d_conf
from camera_events where timestamp between NOW() - interval '1 WEEK' and NOW() group by camera_name;
select
count(id) as overall_count,
camera_name,
round(avg("pconf")) as overall_p_conf,
round(avg("dconf")) as overall_d_conf
from camera_events group by camera_name;
If possible, the result should look like the data in the image.
Some hints would be great, thank you.
Consider conditional aggregation: move the WHERE logic into CASE expressions inside the aggregates in the SELECT list. Alternatively, in PostgreSQL (9.4 and later), use FILTER (WHERE ...) clauses:
select
camera_name,
count(id) filter (where timestamp between NOW() - interval '1 HOUR' and NOW()) as hour_count,
round(avg("pconf") filter (where timestamp between NOW() - interval '1 HOUR' and NOW())) as hour_p_conf,
round(avg("dconf") filter (where timestamp between NOW() - interval '1 HOUR' and NOW())) as hour_d_conf,
count(id) filter (where timestamp between NOW() - interval '1 DAY' and NOW()) as day_count,
round(avg("pconf") filter (where timestamp between NOW() - interval '1 DAY' and NOW())) as day_p_conf,
round(avg("dconf") filter (where timestamp between NOW() - interval '1 DAY' and NOW())) as day_d_conf,
count(id) filter (where timestamp between NOW() - interval '1 WEEK' and NOW()) as week_count,
round(avg("pconf") filter (where timestamp between NOW() - interval '1 WEEK' and NOW())) as week_p_conf,
round(avg("dconf") filter (where timestamp between NOW() - interval '1 WEEK' and NOW())) as week_d_conf,
count(id) as overall_count,
round(avg("pconf")) as overall_p_conf,
round(avg("dconf")) as overall_d_conf
from camera_events
group by camera_name;
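The CASE variant mentioned in this answer can be tried out without a Postgres server. Below is a small sketch using Python's built-in sqlite3 module as a stand-in; note that SQLite's date functions differ from Postgres (datetime(:now, '-1 hour') replaces NOW() - interval '1 HOUR'), the timestamp column is called ts here, and the sample rows are invented. The point is only to show that an aggregate over CASE ... END ignores the rows for which the CASE yields NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE camera_events ("
    "id INTEGER PRIMARY KEY, camera_name TEXT, ts TEXT, pconf REAL)"
)
# Hypothetical sample rows
conn.executemany(
    "INSERT INTO camera_events (camera_name, ts, pconf) VALUES (?, ?, ?)",
    [
        ("cam1", "2024-01-01 11:30:00", 80),
        ("cam1", "2024-01-01 02:00:00", 60),
        ("cam1", "2023-12-20 00:00:00", 40),
        ("cam2", "2023-12-31 18:00:00", 90),
    ],
)

# A fixed "now" keeps the example reproducible.
now = "2024-01-01 12:00:00"

# Each aggregate sees only the rows whose CASE returns a non-NULL value.
query = """
SELECT camera_name,
       COUNT(CASE WHEN ts >= datetime(:now, '-1 hour') THEN id END)  AS hour_count,
       AVG(CASE WHEN ts >= datetime(:now, '-1 hour') THEN pconf END) AS hour_p_conf,
       COUNT(CASE WHEN ts >= datetime(:now, '-1 day') THEN id END)   AS day_count,
       AVG(CASE WHEN ts >= datetime(:now, '-1 day') THEN pconf END)  AS day_p_conf,
       COUNT(id)  AS overall_count,
       AVG(pconf) AS overall_p_conf
FROM camera_events
GROUP BY camera_name
ORDER BY camera_name
"""
rows = conn.execute(query, {"now": now}).fetchall()
```

A camera with no events in a period still gets a row: its count is 0 and its average is NULL (None in Python), which is also how the FILTER version behaves.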
The simplest way is to join them. For example:
select
coalesce(h.camera_name, d.camera_name, w.camera_name) as camera_name,
h.hour_count, h.hour_p_conf, h.hour_d_conf,
d.day_count, d.day_p_conf, d.day_d_conf,
w.week_count, w.week_p_conf, w.week_d_conf
from (
-- hourly query here
) h
full join (
-- daily query here
) d on d.camera_name = h.camera_name
full join (
-- weekly query here
) w on w.camera_name = coalesce(h.camera_name, d.camera_name)
I have a table squitters with, amongst others, a column parsed_time. I want to know the number of records per hour for the last two days and used this query:
SELECT date_trunc('hour', parsed_time) AS hour , count(*)
FROM squitters
WHERE parsed_time > date_trunc('hour', now()) - interval '2 day'
GROUP BY hour
ORDER BY hour DESC;
This works, but hours with zero records do not appear in the result. I want those hours to appear too, with a count of zero, so I wrote this query using the generate_series function:
SELECT bins.hour, count(squitters.parsed_time)
FROM generate_series(date_trunc('hour', now() - interval '2 day'), now(), '1 hour') bins(hour)
LEFT OUTER JOIN squitters ON bins.hour = date_trunc('hours', squitters.parsed_time)
GROUP BY bins.hour
ORDER BY bins.hour DESC;
This works: the result now contains the hour bins with a count of zero, but it is considerably slower.
How can I get the speed of the first query together with the count=zero rows of the second query?
(By the way, there is an index on parsed_time.)
You could try changing the join condition so that no date function is applied to the parsed_time column, which gives the planner a chance to use the index on it:
SELECT b.hour, COUNT(s.parsed_time) cnt
FROM generate_series(date_trunc('hour', now() - interval '2 day'), now(), '1 hour') b(hour)
LEFT OUTER JOIN squitters s
ON s.parsed_time >= b.hour
AND s.parsed_time < b.hour + interval '1 hour'
GROUP BY b.hour
ORDER BY b.hour DESC;
Alternatively, you could also try using a correlated subquery (or a lateral join) instead of a left join - this avoids the need for outer aggregation:
SELECT
b.hour,
(
SELECT COUNT(*)
FROM squitters s
WHERE s.parsed_time >= b.hour AND s.parsed_time < b.hour + interval '1 hour'
) cnt
FROM generate_series(date_trunc('hour', now() - interval '2 day'), now(), '1 hour') b(hour)
ORDER BY b.hour desc
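For some intuition on why a plain range condition is friendlier to an index than date_trunc(parsed_time): on sorted data, each half-open bin [hour, hour + 1 hour) can be located with two binary searches, which is roughly what a B-tree range scan does. Here is a minimal Python sketch of that idea; the function name hourly_counts and the sample data are invented for illustration, and this is not what Postgres literally executes.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def hourly_counts(sorted_times, start, end):
    """Return (bin_start, count) for every hourly bin from start to end,
    including bins that contain no rows."""
    out = []
    bin_start = start
    while bin_start <= end:  # like generate_series(start, end, '1 hour')
        bin_end = bin_start + timedelta(hours=1)
        # Two binary searches on the sorted "index" delimit the half-open
        # range bin_start <= t < bin_end.
        lo = bisect_left(sorted_times, bin_start)
        hi = bisect_left(sorted_times, bin_end)
        out.append((bin_start, hi - lo))
        bin_start = bin_end
    return out


# Hypothetical sample data, already sorted like an index on parsed_time
times = [
    datetime(2024, 1, 1, 10, 5),
    datetime(2024, 1, 1, 10, 50),
    datetime(2024, 1, 1, 12, 0),
]
counts = hourly_counts(times, datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 12, 30))
```

The 11:00 bin shows up with a count of 0, and a row stamped exactly 12:00 lands in the 12:00 bin only, because the upper bound of each bin is exclusive.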
You could take advantage of Common Table Expressions to divide your problem into small chunks:
WITH cte AS (
--First, query your table
SELECT date_trunc('hour', parsed_time) AS sq_hour, count(*) AS cnt
FROM squitters
WHERE parsed_time > date_trunc('hour', now()) - interval '2 day'
GROUP BY sq_hour
), series AS (
--Then build the series of hours that the first query did not return
SELECT
bins.series_hour,
0
FROM
generate_series(date_trunc('hour', now() - interval '2 day'), now(), '1 hour') bins(series_hour)
WHERE
series_hour not in (SELECT sq_hour FROM cte)
)
--Union the result
SELECT * FROM cte
UNION
SELECT * FROM series
ORDER BY 1
I have a table like the one in the image below. What I need is the average value of the Volume column, grouped by User, for both the last hour and the last 24 hours. How can I use avg with two different date ranges in a single query?
You can do it like:
SELECT user, AVG(Volume)
FROM mytable
WHERE created >= NOW() - interval '1 hour'
AND created <= NOW()
GROUP BY user
A few things to keep in mind: this assumes the query runs on the same server, and in the same time zone, as the values stored in created. You need to group by the user so that all of that user's Volume values are collected, and then apply an aggregate function such as avg to compute the average. If you need both averages together, you could do the following:
SELECT u1.user, u1.average, u2.average
FROM
(SELECT user, AVG(Volume) as average
FROM mytable
WHERE created >= NOW() - interval '1 hour'
AND created <= NOW()
GROUP BY user) AS u1
INNER JOIN
(SELECT user, AVG(Volume) as average
FROM mytable
WHERE created >= NOW() - interval '1 day'
AND created <= NOW()
GROUP BY user) AS u2
ON u1.user = u2.user
Use conditional aggregation. Postgres offers very convenient syntax using the FILTER clause. (user is a reserved word in Postgres, and an unquoted user returns the current session user rather than the column, so it is double-quoted here; adjust the case to match your actual column name.)
SELECT "user",
AVG(Volume) FILTER (WHERE created >= NOW() - interval '1 hour' AND created <= NOW()) as avg_1hour,
AVG(Volume) FILTER (WHERE created >= NOW() - interval '1 day' AND created <= NOW()) as avg_1day
FROM mytable
WHERE created >= NOW() - interval '1 DAY' AND
created <= NOW()
GROUP BY "user";
This will filter out users who have had no activity in the past day. If you want all users -- even those with no recent activity -- remove the WHERE clause.
The more traditional method uses CASE:
SELECT "user",
AVG(CASE WHEN created >= NOW() - interval '1 hour' AND created <= NOW() THEN Volume END) as avg_1hour,
AVG(CASE WHEN created >= NOW() - interval '1 day' AND created <= NOW() THEN Volume END) as avg_1day
. . .
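In MySQL syntax you could also group on a flag that marks whether a row falls within the last hour (untested; the table name mytable is taken from the other answers):
SELECT User, AVG(Volume), IF(created >= DATE_SUB(NOW(), INTERVAL 1 HOUR), 1, 0) AS IntervalType
FROM mytable
WHERE created >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY User, IF(created >= DATE_SUB(NOW(), INTERVAL 1 HOUR), 1, 0)
Note that this returns one row per user and interval type, and the IntervalType = 0 row covers only the rows between 24 hours and 1 hour ago, not the whole day. Please let me know whether the result is what you need :)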
I'm having an issue generating a series of dates and then returning the COUNT of rows matching each date in the series.
SELECT generate_series(current_date - interval '30 days', current_date, '1 day':: interval) AS i, COUNT(*)
FROM download
WHERE product_uuid = 'someUUID'
AND created_at = i
GROUP BY created_at::date
ORDER BY created_at::date ASC
I want the output to be the number of rows that match the current date in the series.
05-05-2018, 35
05-06-2018, 23
05-07-2018, 0
05-08-2018, 10
...
The schema has the following columns: id, product_uuid, created_at. Any help would be greatly appreciated. I can add more detail if needed.
Put the table-generating function in the FROM clause and use a LEFT JOIN, keeping the product_uuid filter in the ON clause so that days without downloads are not dropped:
SELECT g.dte, COUNT(d.product_uuid)
FROM generate_series(current_date - interval '30 days', current_date, '1 day'::interval) g(dte)
LEFT JOIN download d
ON d.product_uuid = 'someUUID' AND
d.created_at::date = g.dte
GROUP BY g.dte
ORDER BY g.dte;
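To see why the product_uuid condition sits in the ON clause rather than in WHERE, here is a small experiment using Python's built-in sqlite3 module. This is a stand-in, not Postgres: a recursive CTE replaces generate_series, date() replaces the ::date cast, and the three-day range and sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE download ("
    "id INTEGER PRIMARY KEY, product_uuid TEXT, created_at TEXT)"
)
# Hypothetical sample rows
conn.executemany(
    "INSERT INTO download (product_uuid, created_at) VALUES (?, ?)",
    [
        ("someUUID", "2018-05-05 09:00:00"),
        ("someUUID", "2018-05-05 17:30:00"),
        ("otherUUID", "2018-05-06 08:00:00"),
        ("someUUID", "2018-05-07 12:00:00"),
    ],
)

query = """
WITH RECURSIVE gs(dte) AS (          -- stand-in for generate_series
    SELECT date(:start)
    UNION ALL
    SELECT date(dte, '+1 day') FROM gs WHERE dte < date(:stop)
)
SELECT gs.dte, COUNT(d.product_uuid)
FROM gs
LEFT JOIN download d
       ON d.product_uuid = :uuid     -- filter in ON keeps unmatched days
      AND date(d.created_at) = gs.dte
GROUP BY gs.dte
ORDER BY gs.dte
"""
params = {"start": "2018-05-05", "stop": "2018-05-07", "uuid": "someUUID"}
rows = conn.execute(query, params).fetchall()
```

The middle day has a download for another product only, so it comes back with a count of 0. COUNT(d.product_uuid) counts non-NULL values, which is why unmatched days count as 0 instead of 1.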
How can I get the minimum value of temp for each city for yesterday?
I want:
Indore:min value:yesterday date
Bhopal:min value:yesterday date
Mumbai:min value:yesterday date
In Postgres, you can do:
select name, min(temp)
from t
where write_date < current_date and
write_date >= current_date - interval '1 day'
group by name;
You can also write the where clause as:
where date_trunc('day', write_date) = current_date - interval '1 day'
However, wrapping the column in date_trunc() prevents Postgres from using a plain index on write_date for the where clause, so the range comparison above is usually preferable.
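The range form boils down to a half-open interval: from yesterday's midnight (inclusive) to today's midnight (exclusive). A minimal Python sketch of the same logic, where the function name min_temp_yesterday and the sample rows are hypothetical:

```python
from datetime import date, datetime, time, timedelta


def min_temp_yesterday(rows, today):
    """rows: iterable of (name, write_date, temp) tuples.
    Returns {name: minimum temp recorded yesterday}."""
    start = datetime.combine(today - timedelta(days=1), time.min)  # current_date - 1 day
    end = datetime.combine(today, time.min)                        # current_date
    result = {}
    for name, write_date, temp in rows:
        if start <= write_date < end:  # half-open: today's midnight is excluded
            result[name] = min(temp, result.get(name, temp))
    return result


# Hypothetical sample data
today = date(2024, 1, 10)
rows = [
    ("Indore", datetime(2024, 1, 9, 6, 0), 12.5),
    ("Indore", datetime(2024, 1, 9, 15, 0), 21.0),
    ("Bhopal", datetime(2024, 1, 9, 0, 0), 10.0),   # exactly yesterday midnight: included
    ("Bhopal", datetime(2024, 1, 10, 0, 0), 3.0),   # today midnight: excluded
    ("Mumbai", datetime(2024, 1, 8, 23, 59), 18.0), # two days ago: excluded
]
result = min_temp_yesterday(rows, today)
```

Cities without a reading yesterday do not appear at all, which matches what the GROUP BY query returns.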
In Oracle, something like this should do the job (table name t as in the answer above):
select name, min(temp) from t
where write_date BETWEEN TRUNC(SYSDATE - 1)
AND TRUNC(SYSDATE) - 1/86400
group by name