Grouping the results of a query with CTEs - sql

I have a CTE-based query into which I pass about 2600 4-tuples of latitude/longitude values, which have been ID-tagged and stored in a second table called coordinates. These top-left and bottom-right latitude/longitude values are passed into the CTE in order to display the number of requests (hourly) made within those coordinates between two given timestamps.
However, I would like to get the total requests per day within the given timestamps. That is, I want the total count of user requests on every specified day. E.g. the user opts to see every Wednesday, or Wednesday AND Thursday etc., between 11:55 and 22:04 between January 1 and 16, 2012, for every latitude/longitude 4-tuple I pass. The output would basically be like:
coordinates_id | stamp | zcount
1 Jan 4 2012 200 (total requests on Wednesday Jan 4 between 11:55 and 22:04)
1 Jan 11 2012 121 (total requests on Wednesday Jan 11 between 11:55 and 22:04)
2 Jan 4 2012 255 (total requests on Wednesday Jan 4 between 11:55 and 22:04)
2 Jan 11 2012 211 (total requests on Wednesday Jan 11 between 11:55 and 22:04)
.
.
.
How would I do that? My query is as below:
WITH v AS (
SELECT '2012-01-1 11:55:11'::timestamp AS _from -- provide times once
,'2012-01-16 22:02:21'::timestamp AS _to
)
, q AS (
SELECT c.coordinates_id
, date_trunc('hour', t.calltime) AS stamp
, count(*) AS zcount
FROM v
JOIN mytable t ON t.calltime BETWEEN v._from AND v._to
AND (t.calltime::time >= v._from::time AND
t.calltime::time <= v._to::time) AND
(extract(DOW from t.calltime) = 3)
JOIN coordinates c ON (t.lat, t.lon)
BETWEEN (c.bottomrightlat, c.topleftlon)
AND (c.topleftlat, c.bottomrightlon)
GROUP BY c.coordinates_id, date_trunc('hour', t.calltime)
)
, cal AS (
SELECT generate_series('2011-2-2 00:00:00'::timestamp
, '2012-4-1 05:00:00'::timestamp
, '1 hour'::interval) AS stamp
FROM v
)
SELECT q.coordinates_id, cal.stamp, COALESCE (q.zcount, 0) AS zcount
FROM v, cal
LEFT JOIN q USING (stamp)
WHERE (extract(hour from cal.stamp) >= extract(hour from v._from) AND
extract(hour from cal.stamp) <= extract(hour from v._to)) AND
(extract(DOW from cal.stamp) = 3)
AND cal.stamp >= v._from AND cal.stamp <= v._to
GROUP BY q.coordinates_id, cal.stamp, q.zcount
ORDER BY q.coordinates_id ASC, stamp ASC;
And the sample result it yields is like this:
coordinates_id | stamp | zcount
1 2012-01-04 16:00:00 1
1 2012-01-04 19:00:00 1
1 2012-01-11 14:00:00 1
1 2012-01-11 17:00:00 1
1 2012-01-11 19:00:00 1
2 2012-01-04 16:00:00 1
So, as I mentioned above, I would like to see this as
coordinates_id | stamp | zcount
1 2012-01-04 2
1 2012-01-11 3
2 2012-01-04 1

Change your final SELECT to:
SELECT q.coordinates_id, cal.stamp::date, sum(q.zcount) AS zcount
FROM v, cal
LEFT JOIN q USING (stamp)
WHERE extract(hour from cal.stamp) BETWEEN extract(hour from v._from)
AND extract(hour from v._to)
AND extract(DOW from cal.stamp) = 3
AND cal.stamp >= v._from
AND cal.stamp <= v._to
GROUP BY 1,2
ORDER BY 1,2;
The crucial part is to cast cal.stamp to date: cal.stamp::date.
That, and sum(q.zcount).
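As a quick way to sanity-check what casting to date and summing does, here is a minimal Python sketch (not part of the original answer) that applies the same grouping to the hourly sample rows from the question. The names hourly_rows and daily_totals are made up for illustration.

```python
from collections import Counter
from datetime import datetime

# Hourly rows copied from the sample result in the question:
# (coordinates_id, hourly stamp, zcount)
hourly_rows = [
    (1, "2012-01-04 16:00:00", 1),
    (1, "2012-01-04 19:00:00", 1),
    (1, "2012-01-11 14:00:00", 1),
    (1, "2012-01-11 17:00:00", 1),
    (1, "2012-01-11 19:00:00", 1),
    (2, "2012-01-04 16:00:00", 1),
]


def daily_totals(rows):
    """Group by (coordinates_id, day) and sum the hourly counts."""
    totals = Counter()
    for coordinates_id, stamp, zcount in rows:
        # Dropping the time part plays the role of cal.stamp::date
        day = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").date()
        totals[(coordinates_id, day)] += zcount  # plays the role of sum(q.zcount)
    return dict(sorted(totals.items()))


result = daily_totals(hourly_rows)
for (coordinates_id, day), zcount in result.items():
    print(coordinates_id, day, zcount)
```

The three printed rows should match the desired output shown in the question: one row per coordinates_id and day.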

Related

How to select data with an unusual grouping by date?

There is a table:
id | direction_id | created_at
1 | 2 | 22 November 2021, 16:00:00
2 | 2 | 22 November 2021, 16:20:00
43 | 2 | 22 November 2021, 16:25:00
455 | 1 | 22 November 2021, 16:27:00
6567 | 2 | 22 November 2021, 17:36:00
674556 | 2 | 22 November 2021, 20:01:00
5243554 | 1 | 22 November 2021, 20:50:00
5243554 | 1 | 22 November 2021, 21:46:00
I need to get the following result:
1 | 2 | created_at_by_hour
1 | 3 | 22.11.21 17
1 | 4 | 22.11.21 18
1 | 4 | 22.11.21 19
1 | 4 | 22.11.21 20
2 | 5 | 22.11.21 21
3 | 5 | 22.11.21 22
1 and 2 in the header are all possible values of direction_id that are in the table.
created_at is truncated to the hour, and for each hour you need to count how many records satisfy the condition created_at <= created_at_by_hour. If no records were created during a given hour, the row for that hour should simply repeat the previous hour's values.
The table consists of three fields - id (int), direction_id (int), created_at (timestamptz). I need to get an hourly (based on the created_at field) data upload with the number of records created before this "grouped" time. But I need not just the number, but separately for each direction_id (there are only two of them - 1 and 2). If no records were created for a certain direction_id at a certain hour, duplicate the previous one, but the result should end at the last created_at. created_at is the time when the record was created.
In my opinion, it is better to generate a series of hours between the min and max dates and then calculate the count for each direction.
with time_range as (
select
min(created_at) + interval '1 hour' as min,
max(created_at) + interval '1 hour' as max
from test
)
select
count(*) filter (where direction_id = 1) as "1",
count(*) filter (where direction_id = 2) as "2",
to_char(gs.hour, 'dd.mm.yy HH24') as created_at_by_hour
from
test t
cross join time_range tr
inner join generate_series(tr.min, tr.max, interval '1 hour') gs(hour)
on t.created_at <= gs.hour
group by gs.hour
order by gs.hour
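To check the "count everything created up to each hour" logic against the expected result in the question, here is a rough Python sketch over the sample rows. It is an illustration only: the function names are hypothetical, and it assumes the hour boundaries are aligned to the clock hour (the query above adds one hour to min(created_at) without truncating, which gives the same boundaries here only because the first row falls exactly on 16:00:00).

```python
from datetime import datetime, timedelta

# Sample rows from the question: (direction_id, created_at)
rows = [
    (2, datetime(2021, 11, 22, 16, 0)),
    (2, datetime(2021, 11, 22, 16, 20)),
    (2, datetime(2021, 11, 22, 16, 25)),
    (1, datetime(2021, 11, 22, 16, 27)),
    (2, datetime(2021, 11, 22, 17, 36)),
    (2, datetime(2021, 11, 22, 20, 1)),
    (1, datetime(2021, 11, 22, 20, 50)),
    (1, datetime(2021, 11, 22, 21, 46)),
]


def floor_hour(ts):
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def cumulative_by_hour(rows):
    """For each hour boundary, count rows per direction with created_at <= boundary."""
    one_hour = timedelta(hours=1)
    # Assumption: boundaries run from the hour after the first row
    # to the hour after the last row, aligned to the clock hour.
    hour = floor_hour(min(ts for _, ts in rows)) + one_hour
    last = floor_hour(max(ts for _, ts in rows)) + one_hour
    out = []
    while hour <= last:
        c1 = sum(1 for d, ts in rows if d == 1 and ts <= hour)
        c2 = sum(1 for d, ts in rows if d == 2 and ts <= hour)
        out.append((c1, c2, hour.strftime("%d.%m.%y %H")))
        hour += one_hour
    return out


table = cumulative_by_hour(rows)
for c1, c2, label in table:
    print(c1, c2, label)
```

Because every boundary is visited whether or not rows were created in that hour, quiet hours repeat the previous totals automatically.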
Truncate the date down to the hour, group by it and count. Then use SUM OVER to get a running total of the counts. In order to show missing hours in the table, you must generate a series of hours and outer join your data.
with hourly as
(
select date_trunc('hour', created_at) as hour, direction_id from mytable
)
, hours(hour) as
(
select *
from generate_series
(
(select min(hour) from hourly), (select max(hour) from hourly), interval '1 hour'
)
)
select
hours.hour,
sum(count(*) filter (where hourly.direction_id = 1)) over (order by hour) as "1",
sum(count(*) filter (where hourly.direction_id = 2)) over (order by hour) as "2"
from hours
left join hourly using (hour)
group by hour
order by hour;
Demo: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=21d0c838452a09feac4ebc57906829f4

SQL query to include time segments with no counts

I am working in SQL Server 2014. I have a table that records 'counts' and a timestamp for each count. The counting period is a two-hour block that can start at any quarter hour. In the example data below, the count starts at 16:00 and goes through 18:00. The counting block could just as well have started at 01:30 and stopped at 03:30.
Timestamp Count
16:00:31 1
16:00:42 1
16:16:04 1
16:16:06 1
16:45:10 1
16:45:31 1
16:45:32 1
17:16:45 1
17:16:52 1
17:16:53 1
17:33:19 1
17:34:01 1
17:45:03 1
17:46:08 1
I have a query which sums the counts over 15 minute intervals within the two hour block:
SELECT
FORMAT(DATEPART(HOUR, [Timestamp]), '0#') + ':' + FORMAT(DATEPART(MINUTE, [TimeStamp]) / 15 * 15, '0#') AS QtrHrBeg
, COUNT(*) AS CountTotal
FROM
[Sandbox].[trippetoe].[SURVEYCOUNTS]
GROUP BY
DATEPART(HOUR, [TIMESTAMP])
, (DATEPART(MINUTE, [TIMESTAMP]) / 15 * 15)
which results in this:
QtrHrBeg Count
16:00 2
16:15 2
16:45 3
17:15 3
17:30 2
17:45 2
I'd like to include 15 minute intervals where there are no counts - in this example the quarter hours beginning at 16:30 and 17:00, like below:
QtrHrBeg Count
16:00 2
16:15 2
16:30 0
16:45 3
17:00 0
17:15 3
17:30 2
17:45 2
How can I do that?
See below.
Begin by creating a time table of all intervals for the day, then restricting that to the intervals for the 2 hour window you want.
Then left join that to the sum of your data table, pushing 0 where the join returns null.
DECLARE @Data TABLE ([TimeStamp] TIME, [Count] INT)
INSERT INTO @Data ([TimeStamp],[Count])
VALUES ('16:00:31',1),
       ('16:00:42',1),
       ('16:16:04',1),
       ('16:16:06',1),
       ('16:45:10',1),
       ('16:45:31',1),
       ('16:45:32',1),
       ('17:16:45',1),
       ('17:16:52',1),
       ('17:16:53',1),
       ('17:33:19',1),
       ('17:34:01',1),
       ('17:45:03',1),
       ('17:46:08',1)
;with AllIntervals AS
(
    -- every quarter hour of the day, 00:00 through 23:45
    SELECT CONVERT(TIME,'00:00:00') AS Interval
    UNION ALL
    SELECT DATEADD(MINUTE,15,Interval)
    FROM AllIntervals
    WHERE Interval<'23:45:00'
), MyIntervals AS
(
    -- keep only the two-hour window starting at the first quarter hour that has data
    SELECT CONVERT(VARCHAR(5),Interval,108) AS Interval
    FROM AllIntervals
    WHERE Interval >= (SELECT MIN(CONVERT(TIME,DATEADD(minute,(DATEDIFF(minute,0,[TimeStamp])/15)*15,0))) FROM @Data)
      AND Interval < DATEADD(HOUR,2,(SELECT MIN(CONVERT(TIME,DATEADD(minute,(DATEDIFF(minute,0,[TimeStamp])/15)*15,0))) FROM @Data))
)
-- intervals with no matching rows come back as NULL and are shown as 0
SELECT M.Interval, ISNULL(I.[Count],0) AS [Count]
FROM MyIntervals M
LEFT JOIN (SELECT CONVERT(TIME,DATEADD(minute,(DATEDIFF(minute,0,[TimeStamp])/15)*15,0)) AS Interval, SUM([Count]) AS [Count]
           FROM @Data
           GROUP BY CONVERT(TIME,DATEADD(minute,(DATEDIFF(minute,0,[TimeStamp])/15)*15,0))) I
ON M.Interval=I.Interval
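The same idea (floor each timestamp to its quarter hour, build the full list of intervals for the two-hour window, then fill in 0 where nothing matches) can be sketched outside the database. The Python below is only an illustration using the sample data from the question; floor_quarter and quarter_hour_counts are hypothetical names, and the two-hour window length is taken from the question's description.

```python
from collections import Counter
from datetime import datetime, timedelta

# Timestamps from the sample data in the question (each row has Count = 1)
stamps = [
    "16:00:31", "16:00:42", "16:16:04", "16:16:06", "16:45:10",
    "16:45:31", "16:45:32", "17:16:45", "17:16:52", "17:16:53",
    "17:33:19", "17:34:01", "17:45:03", "17:46:08",
]


def floor_quarter(ts):
    """Round a timestamp down to the start of its 15-minute interval."""
    return ts.replace(minute=ts.minute // 15 * 15, second=0, microsecond=0)


def quarter_hour_counts(stamps, window=timedelta(hours=2)):
    """Count rows per quarter hour, including empty quarter hours."""
    times = [datetime.strptime(s, "%H:%M:%S") for s in stamps]
    counts = Counter(floor_quarter(t) for t in times)
    start = floor_quarter(min(times))
    result = []
    slot = start
    while slot < start + window:
        # counts.get(..., 0) is the counterpart of ISNULL(..., 0) after a LEFT JOIN
        result.append((slot.strftime("%H:%M"), counts.get(slot, 0)))
        slot += timedelta(minutes=15)
    return result


intervals = quarter_hour_counts(stamps)
for label, total in intervals:
    print(label, total)
```

Generating the interval list first and looking counts up second is what guarantees the 16:30 and 17:00 rows appear even though no data falls in them.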
You can use the following approach:
Find the minimum and maximum times in the data you are going to work on, then round both down to a multiple of 15 minutes.
Split that range into 15-minute intervals.
Left join your data to those intervals and group by StartTime. I used FORMAT so that only the time part is displayed.
The benefit of this approach is that it works on the specific range covered by your data and does not generate any intervals outside it.
with initial as(
select dateadd(minute, datediff(minute,0,min([Time])) / 15 * 15, 0) as MinTime,
dateadd(minute, datediff(minute,0,max([Time])) / 15 * 15, 0) as MaxTime
from data
), times as(
select StartTime = MinTime,
EndTime =dateadd(millisecond,-1,dateadd(minute,15,MinTime)),
MaxTime
from initial
union all
select dateadd(millisecond,1,EndTime),
dateadd(minute,15,EndTime),
MaxTime
from times
where EndTime<MaxTime
)
select format(t.StartTime,'HH:mm') as [Time],isnull(sum(d.[Count]),0) as [Count]
from times t
left join data d on d.[Time] between t.StartTime and t.EndTime
group by t.StartTime
Here is the output
Time Count
16:00 2
16:15 2
16:30 0
16:45 3
17:00 0
17:15 3
17:30 2
17:45 2
Hope this will help you
EDIT
I changed the offset from a second to a millisecond based on the comment from @HABO; this solves the case where there are times like 16:59:59.

How do I compare a current partial month vs a previous partial month with postgres?

I'm building some basic reports and I want to see if I'm on track to surpass last month's metrics without waiting for the month to end. Basically I want to compare June 1 (start of current month) through June 23 (current_date) against May 1 (start of previous month) through May 23 (current_date - 1 month).
My goal is to show a count of distinct users that did event1 and event2.
Here's what I have so far:
CREATE VIEW events AS
(SELECT *
FROM public.event
WHERE TYPE in ('event1',
'event2')
AND created_at > now() - interval '1 months' );
CREATE VIEW MAU AS
(SELECT EXTRACT(DOW
FROM created_at) AS month,
DATE_TRUNC('week', created_at) AS week,
COUNT(*) AS total_engagement,
COUNT(DISTINCT user_id) AS total_users
FROM events
GROUP BY 2,
1
ORDER BY week DESC);
SELECT month,
week,
SUM(total_engagement) OVER (PARTITION BY month
ORDER BY week) AS total_engagment
FROM MAU
ORDER BY 1 DESC,
2
Here's an example of what that returns:
Month Week Unique Engagement
6 2017-05-22 00:00:00 165
6 2017-05-29 00:00:00 355
6 2017-06-05 00:00:00 572
6 2017-06-12 00:00:00 723
5 2017-05-22 00:00:00 757
5 2017-05-29 00:00:00 1549
5 2017-06-05 00:00:00 2394
5 2017-06-12 00:00:00 3261
5 2017-06-19 00:00:00 3592
Expected return
Month Day Total Engagement
6 1 50
6 2 100
6 3 180
5 1 89
5 2 213
5 3 284
5 4 341
Can you point out where I've got this wrong or if there's an easier way to do it?
You are confusing days, weeks and months in your question but from the expected output I assume that you want month number, week number within a month and a count of those pairs.
SELECT
month,
week,
count(*) as total_engagement
FROM (
SELECT
extract(month from created_at) as month,
extract('day' from date_trunc('week', created_at::date) -
date_trunc('week', date_trunc('month', created_at::date))) / 7 + 1 as week
FROM public.event
WHERE type IN ('event1', 'event2')
AND created_at > now() - interval '1 month'
) t
GROUP BY 1,2
The most interesting part could be getting the week number within a month and for that you can check this answer.
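To make the week-within-month expression easier to follow, here is a small Python sketch of the same arithmetic: take the Monday of the row's week, subtract the Monday of the week containing the first of the month, divide the day difference by 7 and add 1. It assumes Monday-based weeks, which is how PostgreSQL's date_trunc('week', ...) behaves; the function names are made up for illustration.

```python
from datetime import date, timedelta


def week_start(d):
    """Monday of the week containing d (date.weekday() is 0 for Monday)."""
    return d - timedelta(days=d.weekday())


def week_of_month(d):
    """Week number within the month, counting the week holding the 1st as week 1."""
    first_of_month = d.replace(day=1)
    return (week_start(d) - week_start(first_of_month)).days // 7 + 1


for d in (date(2017, 6, 1), date(2017, 6, 4), date(2017, 6, 5), date(2017, 6, 23)):
    print(d, week_of_month(d))
```

Note that with this definition week 1 can be short: 1 June 2017 is a Thursday, so week 1 of June covers only 1 to 4 June and week 2 starts on Monday 5 June.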

Querying average and rolling 12 month average

I want to be able to find out the average per month and rolling average over the last 12 months of a count for the number of changes per customer.
SELECT
crq_requested_by_company as 'Customer',
COUNT(crq_number) as 'Number of Changes'
FROM
change_information ci1
GROUP BY
crq_requested_by_company
At the moment I am just doing the count of the total and my results look like this
crq_requested_by_company count
A 4
B 2
C 2269
D 7696
E 110
F 91
G 33
The date column I will be using is called 'start_date'.
I assume GETDATE() will be needed to work out the rolling average for the last 12 months.
Additional info after comments:
Using the code
;WITH CTE as
(
SELECT
crq_requested_by_company as Customer,
COUNT(crq_number) Nuc,
dateadd(month, datediff(month, 0, crq_start_date),0) m
FROM
change_information ci1
WHERE
crq_start_date >= dateadd(month,datediff(month, 0,getdate()) - 12,0)
GROUP BY
crq_requested_by_company,
datediff(month, 0, crq_start_date)
)
SELECT
Customer,
avg(Nuc) over (partition by Customer order by m) running_avg,
m start_month,
avg(Nuc) over (partition by Customer) simply_average
FROM
CTE
ORDER BY Customer, start_month
This gives the results
Customer running_avg start_month simply_average
A 8 01/01/2016 00:00 13
A 10 01/02/2016 00:00 13
A 10 01/03/2016 00:00 13
A 11 01/04/2016 00:00 13
A 14 01/05/2016 00:00 13
A 13 01/06/2016 00:00 13
B 1 01/01/2016 00:00 1
C 3 01/01/2016 00:00 2
C 3 01/02/2016 00:00 2
C 2 01/03/2016 00:00 2
C 2 01/04/2016 00:00 2
C 2 01/05/2016 00:00 2
C 2 01/06/2016 00:00 2
It needs to look like this instead: one row per customer holding the average of the running averages above, i.e. the average over the 6 months shown (I currently have only 6 months of data, but it needs to cover 12 eventually).
Customer avg_of_running_avg
A 11
B 1
C 2
Try this; it should work on SQL Server 2012, using a running average:
;WITH CTE as
(
SELECT
crq_requested_by_company as Customer,
COUNT(crq_number) Nuc,
dateadd(month, datediff(month, 0, start_date),0) m
FROM
change_information ci1
WHERE
start_date >= dateadd(month,datediff(month, 0,getdate()) - 12,0)
GROUP BY
crq_requested_by_company,
datediff(month, 0, start_date)
)
SELECT
Customer,
avg(Nuc) over (partition by Customer order by m) running_avg,
m start_month,
avg(Nuc) over (partition by Customer) simply_average
FROM
CTE
ORDER BY Customer, start_month

SQL count number of users every 7 days

I am new to SQL and I need to find count of users every 7 days. I have a table with users for every single day starting from April 2015 up until now:
...
2015-05-16 00:00
2015-05-16 00:00
2015-05-17 00:00
2015-05-17 00:00
2015-05-17 00:00
2015-05-17 00:00
2015-05-17 00:00
2015-05-18 00:00
2015-05-18 00:00
...
and I need to count the number of users every 7 days (weekly) so I have data weekly.
SELECT COUNT(user_id), Activity_Date FROM TABLE_NAME
I need output like this:
TotalUsers week1 week2 week3 ..........and so on
82 80 14 16
I am using DB Visualizer to query Oracle database.
You should try the following:
Select
sum(Week1) + sum(Week2) + sum(Week3) + sum(Week4) + sum(Week5) as Total,
sum(Week1) as Week1,
sum(Week2) as Week2,
sum(Week3) as Week3,
sum(Week4) as Week4,
sum(Week5) as Week5
From (
select
case when week = 1 then 1 else 0 end as Week1,
case when week = 2 then 1 else 0 end as Week2,
case when week = 3 then 1 else 0 end as Week3,
case when week = 4 then 1 else 0 end as Week4,
case when week = 5 then 1 else 0 end as Week5
from
(
Select
CEILING(datepart(dd,visitdate)/7.0) week, -- divide by 7.0 so days 1-7 fall in week 1 (integer division put day 7 in week 2)
user_id
from visitor
)T
)D
Here is Fiddle
You need to add month & year in the result as well.
SELECT COUNT(user_id) FROM TABLE_NAME WHERE Activity_Date > SYSDATE - 7;
That would get the number of users for the last 7 days.
This is my test table:
user_id act_date
1 01/04/2015
2 01/04/2015
3 04/04/2015
4 05/04/2015
..
This is my query:
select week_offset, count(*) nb from (
select trunc((act_date-to_date('01042015','DDMMYYYY'))/7) as week_offset from test_date)
group by week_offset
order by 1
and this is the output:
week_offset nb
0 6
1 3
4 5
5 7
6 3
7 1
18 1
week_offset is the number of whole weeks elapsed since 01/04/2015; from it we can also derive the first day of each week.
See here for live testing.
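The week_offset arithmetic can be tried out quickly in Python. This is a sketch rather than the answer's code: the first four dates come from the test table above, the last two are hypothetical additions so that more than one week shows up, and it assumes no date is earlier than the anchor date (Oracle's trunc rounds toward zero, while Python's // rounds down, so the two would differ for negative offsets).

```python
from collections import Counter
from datetime import date, timedelta

ANCHOR = date(2015, 4, 1)  # same reference date as in the query

act_dates = [
    date(2015, 4, 1),
    date(2015, 4, 1),
    date(2015, 4, 4),
    date(2015, 4, 5),
    date(2015, 4, 8),   # hypothetical extra row
    date(2015, 4, 15),  # hypothetical extra row
]


def week_offset(d, anchor=ANCHOR):
    """Number of whole 7-day blocks between the anchor and d."""
    return (d - anchor).days // 7


def week_first_day(offset, anchor=ANCHOR):
    """First day of the 7-day block with the given offset."""
    return anchor + timedelta(days=7 * offset)


counts = Counter(week_offset(d) for d in act_dates)
for offset in sorted(counts):
    print(offset, week_first_day(offset), counts[offset])
```

Offsets with no rows are simply absent from the result, which mirrors the gaps (for example between 1 and 4) in the query output above.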
How do you define your weeks? Here's an approach for SQL Server that starts each seven-day block relative to the start of April. The expressions will vary according to your specific needs:
select
dateadd(
dd,
datediff(dd, cast('20150401' as date), Activity_Date) / 7 * 7,
cast('20150401' as date)
) as WeekStart,
count(*)
from T
group by datediff(dd, cast('20150401' as date), Activity_Date) / 7
Oracle:
select
trunc(Activity_date, 'DAY') as WeekStart,
count(*)
from T
group by trunc(Activity_date, 'DAY') /* D and DAY are the same thing */