Postgres group by timestamp into 6 hourly buckets - sql

I have the following simple table:
ID TIMESTAMP VALUE
4 2011-05-27 15:50:04 1253
5 2011-05-27 15:55:02 1304
6 2011-05-27 16:00:02 1322
7 2011-05-27 16:05:01 1364
I would like to average the VALUEs and group each day's TIMESTAMPs into 6-hour buckets, e.g. 00:00 to 06:00, 06:00 to 12:00, 12:00 to 18:00 and 18:00 to 00:00.
I am able to group by year, month and day using the following query:
select avg(VALUE),
EXTRACT(year from TIMESTAMP) AS year,
EXTRACT(month from TIMESTAMP) AS month,
EXTRACT(day from TIMESTAMP) as day
from TABLE
group by year,month,day
But I am unable to group each day into the 4 periods defined above. Any help is most welcome.

I think grouping by the integer part of the quotient (hour of your timestamp / 6) should help. Try it and see.
Your group by should be something like
group by year, month, day, trunc(EXTRACT(hour from TIMESTAMP) / 6)
The logic behind this is that when the hour part of the timestamp is divided by 6, the integer part of the result can only be one of:
0: 00:00:00 to 05:59:59
1: 06:00:00 to 11:59:59
2: 12:00:00 to 17:59:59
3: 18:00:00 to 23:59:59
Grouping using this should put your data into 4 groups per day, which is what you need.
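The bucket arithmetic can be checked outside the database. Below is a minimal Python sketch, not the Postgres query itself, that applies the same integer division to the sample rows from the question; the function names `six_hour_bucket` and `average_by_bucket` are made up for this illustration.

```python
from collections import defaultdict
from datetime import datetime


def six_hour_bucket(ts):
    """Return (year, month, day, bucket), where bucket is hour // 6, i.e. 0..3."""
    return (ts.year, ts.month, ts.day, ts.hour // 6)


def average_by_bucket(rows):
    """Average the values of (timestamp, value) rows per 6-hour bucket."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, row count]
    for ts, value in rows:
        key = six_hour_bucket(ts)
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}


# Sample rows copied from the question.
rows = [
    (datetime(2011, 5, 27, 15, 50, 4), 1253),
    (datetime(2011, 5, 27, 15, 55, 2), 1304),
    (datetime(2011, 5, 27, 16, 0, 2), 1322),
    (datetime(2011, 5, 27, 16, 5, 1), 1364),
]

print(average_by_bucket(rows))
```

All four sample rows fall between 12:00 and 17:59:59, so they land in bucket 2 of 2011-05-27 and are averaged together.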

Related

Last 14 days vs before last 14 days data

orders
id order_id created_at updated_at total_amount
1 abc123 2021-06-13 11:00:00 2021-06-13 11:00:00 230.5
2 abc456 2021-06-01 07:00:00 2021-06-01 07:00:00 240
To get the number of purchases in the last 7 days vs. the 7 days before that, I wrote the following query:
select
date_trunc('week', created_at) as "Week",
count(*) "No of purchases"
from orders
group by 1
How can I get the number of purchases in the last 14 days vs. the 14 days before that?
Is there a way to pass something like '14 days' to the date_trunc function?
If not, how can I write this query?
Why not just use date comparisons?
select count(*) as num_purchases
from orders
where created_at >= current_date - interval '14 day'
Just subtract the respective intervals from now() (or any other function/variable that gives you the right time for the current moment) and compare that to the creation timestamp. Something along the lines of:
SELECT count(*)
FROM orders
WHERE created_at >= now() - '14 days'::interval
AND created_at < now() - '7 days'::interval;
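To make the "last 14 days vs. the 14 days before that" comparison concrete, here is a small Python sketch of the same interval arithmetic. It is only an illustration under assumptions: `count_by_window` is a hypothetical helper, and `now` is fixed to an arbitrary date so the result is reproducible.

```python
from datetime import datetime, timedelta


def count_by_window(created_ats, now, days=14):
    """Count rows in the last `days` days and in the `days` days before that."""
    window = timedelta(days=days)
    recent_start = now - window          # plays the role of now() - interval '14 days'
    previous_start = now - 2 * window    # plays the role of now() - interval '28 days'
    recent = sum(1 for t in created_ats if t >= recent_start)
    previous = sum(1 for t in created_ats if previous_start <= t < recent_start)
    return {"last": recent, "previous": previous}


# created_at values from the question's orders table.
created_ats = [
    datetime(2021, 6, 13, 11, 0, 0),
    datetime(2021, 6, 1, 7, 0, 0),
]

# Assumed "current time", chosen only for this example.
now = datetime(2021, 6, 20, 0, 0, 0)
print(count_by_window(created_ats, now))
```

The two windows are half-open and share the boundary `recent_start`, so no order is counted twice or dropped between them.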

Issue with Date_SUB And BETWEEN function

I came across this issue while creating a parametrised query. The intent of the query is to pull the past 5 months of data (excluding the current month) based on a date passed as a variable. The basic schema for table A is as follows:
as_of_date Y X
2019-12-31 1 AB
2019-11-30 2 CD
2019-10-31 3 EF
2019-09-30 4 GH
2019-08-31 5 MN
2019-07-31 6 XYZ
2020-01-31 7 PQR
2020-02-29 8 AAA
Following is the query I wrote:
WITH
date
AS
(
SELECT CAST("2020-02-29" AS Date) as run_date
)
SELECT DISTINCT CAST(a.as_of_date AS DATE) as_of_date,
FROM A as a
WHERE CAST(a.as_of_date AS DATE) BETWEEN DATE_SUB((SELECT run_date FROM date), INTERVAL 5 Month) AND DATE_SUB((SELECT run_date FROM date), INTERVAL 1 Month)
This query runs fine when run_date is set to "2020-01-31" and returns the past 5 months of data, i.e. December, November, October, September and August. But when run_date is set to "2020-02-29", it returns only 4 months of data.
A simple fix is to apply DATE_TRUNC(..., MONTH) to both sides of the comparison, as in the example below (here the CTE is named date_cte):
SELECT DISTINCT CAST(a.as_of_date AS DATE) as_of_date,
FROM `project.dataset.tableA` AS a
WHERE DATE_TRUNC(CAST(a.as_of_date AS DATE), MONTH)
BETWEEN DATE_TRUNC(DATE_SUB((SELECT run_date FROM date_cte), INTERVAL 5 Month), MONTH)
AND DATE_TRUNC(DATE_SUB((SELECT run_date FROM date_cte), INTERVAL 1 Month), MONTH)
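The following Python sketch reproduces the effect with the dates from the question. It assumes that subtracting months keeps the day of month and clamps it to the last day of a shorter month, which is consistent with the four-month result described above; `sub_months`, `trunc_month`, `naive_range` and `truncated_range` are hypothetical helpers written for this illustration, not BigQuery functions.

```python
import calendar
from datetime import date


def sub_months(d, months):
    """Subtract whole months, clamping the day to the target month's last day (assumed behaviour)."""
    index = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def trunc_month(d):
    """Return the first day of d's month."""
    return d.replace(day=1)


def naive_range(dates, run_date):
    """Compare raw dates, as in the original query."""
    lo, hi = sub_months(run_date, 5), sub_months(run_date, 1)
    return sorted(d for d in dates if lo <= d <= hi)


def truncated_range(dates, run_date):
    """Compare month starts on both sides, as in the suggested fix."""
    lo = trunc_month(sub_months(run_date, 5))
    hi = trunc_month(sub_months(run_date, 1))
    return sorted(d for d in dates if lo <= trunc_month(d) <= hi)


# as_of_date values from the question's table.
as_of_dates = [
    date(2019, 12, 31), date(2019, 11, 30), date(2019, 10, 31), date(2019, 9, 30),
    date(2019, 8, 31), date(2019, 7, 31), date(2020, 1, 31), date(2020, 2, 29),
]

run_date = date(2020, 2, 29)
print(sub_months(run_date, 1))                       # upper bound before truncation
print(len(naive_range(as_of_dates, run_date)))       # rows matched on raw dates
print(len(truncated_range(as_of_dates, run_date)))   # rows matched on month starts
```

Under that assumption the upper bound for run_date 2020-02-29 is 2020-01-29, which excludes the 2020-01-31 row; comparing month starts removes the dependence on the day of month.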

SQL query to include time segments with no counts

I am working in SQL Server 2014. I have a table that records 'counts' and a timestamp of the count. The counting period is a two-hour block that can start at any quarter hour. In the example data below, the count starts at 16:00 and goes through 18:00. The counting block could equally have started at 01:30 and stopped at 03:30.
Timestamp Count
16:00:31 1
16:00:42 1
16:16:04 1
16:16:06 1
16:45:10 1
16:45:31 1
16:45:32 1
17:16:45 1
17:16:52 1
17:16:53 1
17:33:19 1
17:34:01 1
17:45:03 1
17:46:08 1
I have a query which sums the counts over 15 minute intervals within the two hour block:
SELECT
FORMAT(DATEPART(HOUR, [Timestamp]), '0#') + ':' + FORMAT(DATEPART(MINUTE, [TimeStamp]) / 15 * 15, '0#') AS QtrHrBeg
, COUNT(*) AS CountTotal
FROM
[Sandbox].[trippetoe].[SURVEYCOUNTS]
GROUP BY
DATEPART(HOUR, [TIMESTAMP])
, (DATEPART(MINUTE, [TIMESTAMP]) / 15 * 15)
which results in this:
QtrHrBeg Count
16:00 2
16:15 2
16:45 3
17:15 3
17:30 2
17:45 2
I'd like to include 15 minute intervals where there are no counts - in this example the quarter hours beginning at 16:30 and 17:00, like below:
QtrHrBeg Count
16:00 2
16:15 2
16:30 0
16:45 3
17:00 0
17:15 3
17:30 2
17:45 2
How can I do that?
See below.
Begin by creating a time table of all intervals for the day, then restricting that to the intervals for the 2 hour window you want.
Then left join that to the sum of your data table, pushing 0 where the join returns null.
-- Table variable holding the sample data (table variables are prefixed with @)
DECLARE @Data TABLE ([TimeStamp] TIME, [Count] INT)
INSERT INTO @Data ([TimeStamp],[Count])
VALUES ('16:00:31',1),
('16:00:42',1),
('16:16:04',1),
('16:16:06',1),
('16:45:10',1),
('16:45:31',1),
('16:45:32',1),
('17:16:45',1),
('17:16:52',1),
('17:16:53',1),
('17:33:19',1),
('17:34:01',1),
('17:45:03',1),
('17:46:08',1)
-- Every quarter hour of the day, 00:00 through 23:45
;with AllIntervals AS
(
SELECT CONVERT(TIME,'00:00:00') AS Interval
UNION ALL
SELECT DATEADD(MINUTE,15,Interval)
FROM AllIntervals
WHERE Interval<'23:45:00'
), MyIntervals AS
(
-- Keep only the two hours starting at the quarter hour of the earliest row
SELECT CONVERT(VARCHAR(5),Interval,108) AS Interval
FROM AllIntervals
WHERE Interval >= (SELECT MIN(CONVERT(TIME,DATEADD(minute,(DATEDIFF(minute,0,[TimeStamp])/15)*15,0))) FROM @Data)
AND Interval < DATEADD(HOUR,2,(SELECT MIN(CONVERT(TIME,DATEADD(minute,(DATEDIFF(minute,0,[TimeStamp])/15)*15,0))) FROM @Data))
)
-- Left join so that quarter hours without rows still appear, with a count of 0
SELECT M.Interval, ISNULL(I.[Count],0) AS [Count]
FROM MyIntervals M
LEFT JOIN (SELECT CONVERT(TIME,DATEADD(minute,(DATEDIFF(minute,0,[TimeStamp])/15)*15,0)) AS Interval, SUM([Count]) AS Count
FROM @Data
GROUP BY CONVERT(TIME,DATEADD(minute,(DATEDIFF(minute,0,[TimeStamp])/15)*15,0))) I
ON M.Interval=I.Interval
You can use the following approach:
Find the minimum and maximum times in the data you are working with, then round both down to a 15-minute boundary.
Split that range into 15-minute intervals.
Left join your data to those intervals and group by StartTime; I used format so that only the time portion is shown.
The benefit of this approach is that it only covers the range spanned by your data and does not generate intervals outside it.
with initial as(
select dateadd(minute, datediff(minute,0,min([Time])) / 15 * 15, 0) as MinTime,
dateadd(minute, datediff(minute,0,max([Time])) / 15 * 15, 0) as MaxTime
from data
), times as(
select StartTime = MinTime,
EndTime =dateadd(millisecond,-1,dateadd(minute,15,MinTime)),
MaxTime
from initial
union all
select dateadd(millisecond,1,EndTime),
dateadd(minute,15,EndTime),
MaxTime
from times
where EndTime<MaxTime
)
select format(t.StartTime,'HH:mm') as [Time],isnull(sum(d.[Count]),0) as [Count]
from times t
left join data d on d.[Time] between t.StartTime and t.EndTime
group by t.StartTime
Here is the output
Time Count
16:00 2
16:15 2
16:30 0
16:45 3
17:00 0
17:15 3
17:30 2
17:45 2
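The same gap-filling idea (floor each timestamp to its quarter hour, generate every quarter hour between the first and last one, and default missing bins to 0) can be sketched in Python. This is only an illustration of the technique, not a replacement for the T-SQL; `floor_to_quarter` and `counts_with_gaps` are names invented here, and the times are parsed onto an arbitrary date because plain time values do not support arithmetic in Python.

```python
from collections import Counter
from datetime import datetime, timedelta

STEP = timedelta(minutes=15)


def floor_to_quarter(ts):
    """Round a datetime down to the start of its 15-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def counts_with_gaps(timestamps):
    """Return (HH:MM, count) for every quarter hour from the first to the last bin."""
    counts = Counter(floor_to_quarter(ts) for ts in timestamps)
    current, end = min(counts), max(counts)
    result = []
    while current <= end:
        # Bins with no rows are reported with a count of 0.
        result.append((current.strftime("%H:%M"), counts.get(current, 0)))
        current += STEP
    return result


# Timestamps from the question.
raw = ["16:00:31", "16:00:42", "16:16:04", "16:16:06", "16:45:10", "16:45:31",
       "16:45:32", "17:16:45", "17:16:52", "17:16:53", "17:33:19", "17:34:01",
       "17:45:03", "17:46:08"]
sample = [datetime.strptime(t, "%H:%M:%S") for t in raw]

for label, count in counts_with_gaps(sample):
    print(label, count)
```

Because each row is assigned to exactly one bin by flooring, there is no need for an inclusive end boundary on each interval.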
Here is a working demo.
Hope this helps.
EDIT
I changed the interval offset from a second to a millisecond, based on a comment from @HABO; this handles times such as 16:59:59.

How do I compare a current partial month vs a previous partial month with postgres?

I'm building some basic reports and I want to see if I'm on track to surpass last month's metrics without waiting for the month to end. Basically I want to compare June 1 (start of current month) through June 23 (current_date) against May 1 (start of previous month) through May 23 (current_date - 1 month).
My goal is to show a count of distinct users that did event1 and event2.
Here's what I have so far:
CREATE VIEW events AS
(SELECT *
FROM public.event
WHERE TYPE in ('event1',
'event2')
AND created_at > now() - interval '1 months' );
CREATE VIEW MAU AS
(SELECT EXTRACT(DOW
FROM created_at) AS month,
DATE_TRUNC('week', created_at) AS week,
COUNT(*) AS total_engagement,
COUNT(DISTINCT user_id) AS total_users
FROM events
GROUP BY 2,
1
ORDER BY week DESC);
SELECT month,
week,
SUM(total_engagement) OVER (PARTITION BY month
ORDER BY week) AS total_engagment
FROM MAU
ORDER BY 1 DESC,
2
Here's an example of what that returns:
Month Week Unique Engagement
6 2017-05-22 00:00:00 165
6 2017-05-29 00:00:00 355
6 2017-06-05 00:00:00 572
6 2017-06-12 00:00:00 723
5 2017-05-22 00:00:00 757
5 2017-05-29 00:00:00 1549
5 2017-06-05 00:00:00 2394
5 2017-06-12 00:00:00 3261
5 2017-06-19 00:00:00 3592
Expected return
Month Day Total Engagement
6 1 50
6 2 100
6 3 180
5 1 89
5 2 213
5 3 284
5 4 341
Can you point out where I've got this wrong or if there's an easier way to do it?
You are confusing days, weeks and months in your question but from the expected output I assume that you want month number, week number within a month and a count of those pairs.
SELECT
month,
week,
count(*) as total_engagement
FROM (
SELECT
extract(month from created_at) as month,
extract('day' from date_trunc('week', created_at::date) -
date_trunc('week', date_trunc('month', created_at::date))) / 7 + 1 as week
FROM public.event
WHERE type IN ('event1', 'event2')
AND created_at > now() - interval '1 month'
) t
GROUP BY 1,2
The most interesting part could be getting the week number within a month and for that you can check this answer.
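The week-within-month expression can be hard to read in SQL, so here is a minimal Python sketch of the same calculation: the Monday of the row's week minus the Monday of the week containing the first of the month, divided by 7, plus 1. The helper names `week_start` and `week_of_month` are hypothetical, and the sketch assumes weeks start on Monday, as date_trunc('week', ...) does in Postgres.

```python
from datetime import date, timedelta


def week_start(d):
    """Return the Monday of the week containing d."""
    return d - timedelta(days=d.weekday())  # weekday(): Monday == 0


def week_of_month(d):
    """1-based week number within d's month, counting Monday-based weeks."""
    first_week = week_start(d.replace(day=1))
    return (week_start(d) - first_week).days // 7 + 1


for d in (date(2017, 6, 1), date(2017, 6, 5), date(2017, 6, 23)):
    print(d, week_of_month(d))
```

Note that week 1 can be a partial week: 2017-06-01 is a Thursday, so week 1 of June 2017 covers only June 1 to June 4.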

Get the highest value of the last 7 days with SQL

I would like to fetch the highest value (from the column named value) for the 7 past days. I have tried with this sql:
SELECT MAX(value) as value_of_week
FROM events
WHERE event_date > UNIX_TIMESTAMP() -(7 * 86400);
But it gives me 86.1, which is older than 7 days from today's date. Given the rows below, I should get 55.2 with date 2014-05-16 07:07:00.
id value event_date
1 28. 2014-04-18 08:23:00
2 23.6 2014-04-22 06:43:00
3 86.1 2014-04-29 05:32:00
4 43.3 2014-05-03 08:12:00
5 55.2 2014-05-16 07:07:00
6 25.6 2014-05-19 06:11:00
You are comparing unix time stamps to date. How about this?
SELECT MAX(value) as value_of_week
FROM events
WHERE event_date > date_add(now(), interval -7 day);
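As a quick check of the intended result, the Python sketch below compares dates with dates, as the query above does, using the rows from the question. The "current time" is an assumption (the question does not state it), chosen so that the last two rows fall inside the 7-day window; `max_last_days` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def max_last_days(rows, now, days=7):
    """Return the highest value among rows newer than `now - days`, or None if there are none."""
    cutoff = now - timedelta(days=days)
    recent = [value for value, event_date in rows if event_date > cutoff]
    return max(recent) if recent else None


# (value, event_date) rows from the question.
rows = [
    (28.0, datetime(2014, 4, 18, 8, 23)),
    (23.6, datetime(2014, 4, 22, 6, 43)),
    (86.1, datetime(2014, 4, 29, 5, 32)),
    (43.3, datetime(2014, 5, 3, 8, 12)),
    (55.2, datetime(2014, 5, 16, 7, 7)),
    (25.6, datetime(2014, 5, 19, 6, 11)),
]

# Assumed "now" for this example only.
now = datetime(2014, 5, 20, 12, 0)
print(max_last_days(rows, now))
```

With that assumed date, only the 2014-05-16 and 2014-05-19 rows pass the filter, so 86.1 is no longer considered.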
I'm guessing this is MySQL, in which case you could do this:
select max(value) as value_of_week from events where event_date between date_sub(now(),INTERVAL 1 WEEK) and now();
You can use:
SELECT MAX(value) as value_of_week
FROM events
where event_date>= curdate() - INTERVAL DAYOFWEEK(curdate())+6 DAY
AND event_date< curdate() - INTERVAL DAYOFWEEK(curdate())-1 DAY;