BigQuery: creating timestamp buckets with a 15-minute interval

I want to produce output like this, with one row per 15-minute interval:
12:00:00 - 12:15:00
12:15:00 - 12:30:00
12:30:00 - 12:45:00
12:45:00 - 1:00:00
...
together with count(orders) from the table.
The data table has a timestamp column (for example 2022-07-05 19:45:00 UTC), and I want the number of orders in each 15-minute interval over a day.

Using the RANGE_BUCKET function (https://cloud.google.com/bigquery/docs/reference/standard-sql/mathematical_functions#range_bucket), you can assign each timestamp to a 15-minute bucket. Consider the sample query below:
CREATE TEMP TABLE sample_table AS
SELECT * FROM UNNEST(GENERATE_TIMESTAMP_ARRAY('2022-07-05 00:00:00', '2022-07-05 10:00:00', INTERVAL 3 MINUTE)) `order`
;
SELECT TIMESTAMP_SECONDS(intervals[SAFE_OFFSET(RANGE_BUCKET(UNIX_SECONDS(`order`), intervals) - 1)]) ts,
COUNT(`order`) AS orders,
FROM `sample_table`,
UNNEST ([STRUCT(GENERATE_ARRAY(UNIX_SECONDS('2022-07-05'), UNIX_SECONDS('2022-07-06'), 60 * 15) AS intervals)])
GROUP BY 1
ORDER BY 1
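The bucketing arithmetic behind the query above can be sketched outside BigQuery. The following is a minimal Python illustration, not the BigQuery implementation: the names bucket_start and count_per_bucket are hypothetical, and the sample data only mimics sample_table (one order every 3 minutes from 00:00 to 10:00).

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

BUCKET_SECONDS = 15 * 60  # bucket width, matching the 60 * 15 step in the query


def bucket_start(ts: datetime) -> datetime:
    """Floor a timezone-aware timestamp to the start of its 15-minute bucket."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % BUCKET_SECONDS, tz=timezone.utc)


def count_per_bucket(orders):
    """Return (bucket_start, order_count) pairs sorted by bucket start."""
    return sorted(Counter(bucket_start(ts) for ts in orders).items())


# Hypothetical sample mirroring sample_table: one order every 3 minutes,
# from 00:00 through 10:00 inclusive (201 timestamps).
start = datetime(2022, 7, 5, tzinfo=timezone.utc)
orders = [start + timedelta(minutes=3 * i) for i in range(201)]
counts = count_per_bucket(orders)
```

With this sample, every full bucket holds 5 orders and the final bucket (starting at 10:00) holds 1, which is the shape the query's GROUP BY should return as well.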

Related

Combine 2 series of timestamps in BigQuery

I'm trying to generate 2 series of timestamps with 30 minute interval like so:
interval_start,interval_end
2023-01-30 05:30:00.000000 +00:00,2023-01-30 06:00:00.000000 +00:00
2023-01-30 05:00:00.000000 +00:00,2023-01-30 05:30:00.000000 +00:00
2023-01-30 04:30:00.000000 +00:00,2023-01-30 05:00:00.000000 +00:00
I can generate each series but cannot combine them:
select *
from unnest(GENERATE_TIMESTAMP_ARRAY('2020-01-01', '2021-01-01', interval 30 minute)) start_times
select *
from unnest(GENERATE_TIMESTAMP_ARRAY(TIMESTAMP_ADD('2020-01-01', interval 30 MINUTE), '2021-01-01', interval 30 minute)) end_times
Consider the query below, which generates the start times once and derives each end time from its start:
WITH intervals AS (
select *
from unnest(GENERATE_TIMESTAMP_ARRAY('2020-01-01', '2021-01-01', interval 30 minute)) interval_start
)
SELECT
interval_start, TIMESTAMP_ADD(interval_start, interval 30 minute) interval_end
FROM intervals

Custom range field based on Hour and time

I want to create a Custom field like the one below, based on two fields (Hour and Time). Does anyone know how to do this in SQL?
You do not need the Hour column for this result. The Time values are enough.
Sample data
create table data
(
TimeValue time(0)
);
insert into data (TimeValue) values
('12:00:00 AM'),
('12:15:00 AM'),
('12:30:00 AM'),
('12:45:00 AM'),
( '1:00:00 AM'),
( '1:15:00 AM'),
( '1:30:00 AM'),
( '1:45:00 AM'),
( '2:00:00 AM'),
( '2:15:00 AM'),
( '2:30:00 AM'),
( '2:45:00 AM'),
( '3:00:00 AM'),
( '3:15:00 AM'),
( '3:30:00 AM'),
( '3:45:00 AM'),
( '4:00:00 AM'),
( '4:15:00 AM'),
( '4:30:00 AM');
Solution
select 'Hour ' + convert(nvarchar(2), datepart(hour, d.TimeValue)) as [Hour],
convert(nvarchar(11), d.TimeValue, 22) as [Time],
'Hour ' +
convert(nvarchar(2), case
when datepart(hour, d.TimeValue)-1 < 0 then 0
else datepart(hour, d.TimeValue)-1
end) + '-' +
convert(nvarchar(2), datepart(hour, d.TimeValue)) as [Custom]
from data d;
Result
Hour   Time        Custom
------ ----------- --------
Hour 0 12:00:00 AM Hour 0-0
Hour 0 12:15:00 AM Hour 0-0
Hour 0 12:30:00 AM Hour 0-0
Hour 0 12:45:00 AM Hour 0-0
Hour 1  1:00:00 AM Hour 0-1
Hour 1  1:15:00 AM Hour 0-1
Hour 1  1:30:00 AM Hour 0-1
Hour 1  1:45:00 AM Hour 0-1
Hour 2  2:00:00 AM Hour 1-2
Hour 2  2:15:00 AM Hour 1-2
Hour 2  2:30:00 AM Hour 1-2
Hour 2  2:45:00 AM Hour 1-2
Hour 3  3:00:00 AM Hour 2-3
Hour 3  3:15:00 AM Hour 2-3
Hour 3  3:30:00 AM Hour 2-3
Hour 3  3:45:00 AM Hour 2-3
Hour 4  4:00:00 AM Hour 3-4
Hour 4  4:15:00 AM Hour 3-4
Hour 4  4:30:00 AM Hour 3-4
I want to create a Custom field like below based on below two fields (Hour and Time).
Assuming that you want the column built from the Hour column and the hour part of the Time column, you can use a computed column:
alter table t add custom as
(concat(hour, '-', datepart(hour, time)));
This is now part of the table. If you just want the value in a result set, you can put the expression in the SELECT list instead.
Note: This doesn't return the results you have specified. You haven't explained the logic for those results.

Rolling 30 days of data from Big Query

Suppose I have this query:
SELECT ga_channelGrouping, ga_sourceMedium,ga_campaign, SUM(ga_sessions) as sessions,
SUM(ga_sessionDuration)/SUM(ga_sessions) as avg_sessionDuration,
SUM(ga_users)as Users, SUM(ga_newUsers)as New_Users, SUM(ga_bounces)/SUM(ga_sessions)
AS ga_bounceRate, SUM(ga_pageviews)/SUM(ga_sessions)as pageViews_per_sessions,
SUM( ga_transactions)/SUM(ga_sessions) AS ga_conversionRate
FROM db.table
group by ga_channelGrouping, ga_sourceMedium,ga_campaign
How do I find rolling 30 days of data from Big Query. My DATE column value is of this format: 2018-06-19 11:00:00 UTC
You can use the DATE_ADD or DATE_SUB functions to shift date values and TIMESTAMP_ADD, TIMESTAMP_SUB to shift timestamp values.
So you could try:
SELECT ga_channelGrouping, ga_sourceMedium,ga_campaign, SUM(ga_sessions) as sessions,
SUM(ga_sessionDuration)/SUM(ga_sessions) as avg_sessionDuration,
SUM(ga_users)as Users, SUM(ga_newUsers)as New_Users, SUM(ga_bounces)/SUM(ga_sessions)
AS ga_bounceRate, SUM(ga_pageviews)/SUM(ga_sessions)as pageViews_per_sessions,
SUM( ga_transactions)/SUM(ga_sessions) AS ga_conversionRate
FROM db.table
WHERE your_date_column >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24*30 HOUR)
group by ga_channelGrouping, ga_sourceMedium,ga_campaign
Here we go back 30 days as 24*30 hours. Current BigQuery Standard SQL also accepts INTERVAL 30 DAY in TIMESTAMP_SUB, where DAY is treated as 24 hours, so either form gives the same cutoff.
EDIT: If you want to roll back 30 days regardless of the time of the day you can do the following:
WHERE your_date_column >= TIMESTAMP_TRUNC(TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24*30 HOUR), DAY)
OR
WHERE CAST(your_date_column AS DATE) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
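The difference between the three WHERE clauses above is only where the cutoff falls. A small Python sketch of the same arithmetic follows; the function names and the fixed "current" time are hypothetical and chosen for illustration, and everything is assumed to be in UTC.

```python
from datetime import datetime, timedelta, timezone


def cutoff_exact(now: datetime) -> datetime:
    """Mirror TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24*30 HOUR)."""
    return now - timedelta(hours=24 * 30)


def cutoff_midnight(now: datetime) -> datetime:
    """Mirror TIMESTAMP_TRUNC(TIMESTAMP_SUB(...), DAY): the same cutoff, floored to midnight."""
    return cutoff_exact(now).replace(hour=0, minute=0, second=0, microsecond=0)


def in_window_by_date(row_ts: datetime, now: datetime) -> bool:
    """Mirror CAST(col AS DATE) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)."""
    return row_ts.date() >= now.date() - timedelta(days=30)


# Hypothetical "current" time and a row from early on the boundary day.
now = datetime(2018, 7, 19, 11, 0, tzinfo=timezone.utc)
row = datetime(2018, 6, 19, 9, 0, tzinfo=timezone.utc)

kept_exact = row >= cutoff_exact(now)        # False: 09:00 is before the 11:00 cutoff
kept_midnight = row >= cutoff_midnight(now)  # True: the cutoff is moved to 00:00
kept_by_date = in_window_by_date(row, now)   # True: only the calendar date is compared
```

The exact cutoff drops rows from the morning of the boundary day, while the truncated and date-based variants keep the whole day.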
How do I find rolling 30 days of data from Big Query. My DATE column value is of this format: 2018-06-19 11:00:00 UTC
First, note that aggregating the last 30 days is quite different from a rolling 30 days, so the answer below focuses on rolling 30 days rather than just the last 30 days.
The query below is for BigQuery Standard SQL and assumes that your date column is named your_date_column and is of the TIMESTAMP data type.
#standardSQL
SELECT
your_date_column, -- data type of TIMESTAMP with value like 2018-06-19 11:00:00 UTC
ga_channelGrouping,
ga_sourceMedium,
ga_campaign,
SUM(ga_sessions) OVER(win) AS sessions,
(SUM(ga_sessionDuration) OVER(win))/(SUM(ga_sessions) OVER(win)) AS avg_sessionDuration,
SUM(ga_users) OVER(win) AS Users,
SUM(ga_newUsers) OVER(win) AS New_Users,
(SUM(ga_bounces) OVER(win))/(SUM(ga_sessions) OVER(win)) AS ga_bounceRate,
(SUM(ga_pageviews) OVER(win))/(SUM(ga_sessions) OVER(win)) AS pageViews_per_sessions,
(SUM(ga_transactions) OVER(win))/(SUM(ga_sessions) OVER(win)) AS ga_conversionRate
FROM `project.dataset.table`
WINDOW win AS (
PARTITION BY ga_channelGrouping, ga_sourceMedium, ga_campaign
ORDER BY UNIX_DATE(DATE(your_date_column))
RANGE BETWEEN 29 PRECEDING AND CURRENT ROW
)
To see how it works, try the dummy example below (for simplicity, it uses a rolling 3 days):
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 value, TIMESTAMP '2018-06-19 11:00:00 UTC' your_date_column UNION ALL
SELECT 2, '2018-06-20 11:00:00 UTC' UNION ALL
SELECT 3, '2018-06-21 11:00:00 UTC' UNION ALL
SELECT 4, '2018-06-22 11:00:00 UTC' UNION ALL
SELECT 5, '2018-06-23 11:00:00 UTC' UNION ALL
SELECT 6, '2018-06-24 11:00:00 UTC' UNION ALL
SELECT 7, '2018-06-25 11:00:00 UTC' UNION ALL
SELECT 8, '2018-06-26 11:00:00 UTC' UNION ALL
SELECT 9, '2018-06-27 11:00:00 UTC' UNION ALL
SELECT 10, '2018-06-28 11:00:00 UTC'
)
SELECT
your_date_column,
value,
SUM(value) OVER(win) rolling_value
FROM `project.dataset.table`
WINDOW win AS (ORDER BY UNIX_DATE(DATE(your_date_column)) RANGE BETWEEN 2 PRECEDING AND CURRENT ROW)
ORDER BY your_date_column
with the result:
Row  your_date_column         value  rolling_value
1    2018-06-19 11:00:00 UTC  1      1
2    2018-06-20 11:00:00 UTC  2      3
3    2018-06-21 11:00:00 UTC  3      6
4    2018-06-22 11:00:00 UTC  4      9
5    2018-06-23 11:00:00 UTC  5      12
6    2018-06-24 11:00:00 UTC  6      15
7    2018-06-25 11:00:00 UTC  7      18
8    2018-06-26 11:00:00 UTC  8      21
9    2018-06-27 11:00:00 UTC  9      24
10   2018-06-28 11:00:00 UTC  10     27
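The window frame in the dummy example (RANGE BETWEEN 2 PRECEDING AND CURRENT ROW over the day number) can be mimicked in a few lines of Python. This is only a sketch of the semantics, with a hypothetical rolling_sum helper and the same ten sample rows; it is quadratic and not meant for real data volumes.

```python
from datetime import date, timedelta


def rolling_sum(rows, days):
    """For each (day, value) row, sum the values whose day lies in [day - (days - 1), day].

    This mirrors a RANGE frame: rows sharing the same day are all included.
    """
    result = []
    for day, _ in rows:
        lower = day - timedelta(days=days - 1)
        result.append(sum(v for other, v in rows if lower <= other <= day))
    return result


# Same sample as the dummy example: values 1..10 on consecutive days.
rows = [(date(2018, 6, 19) + timedelta(days=i), i + 1) for i in range(10)]
rolling = rolling_sum(rows, days=3)
# rolling == [1, 3, 6, 9, 12, 15, 18, 21, 24, 27]
```

For the 30-day version, days=30 corresponds to RANGE BETWEEN 29 PRECEDING AND CURRENT ROW.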

SQL: Create a table with 24 hourly rows

In PostgreSQL, it is fairly easy to create a timestamp with the date and the current hour:
vioozer=> SELECT to_char(NOW(), 'YYYY-MM-DD HH24:00');
to_char
------------------
2014-08-12 12:00
(1 row)
The previous hour can be shown using NOW()-interval '1 hour':
vioozer=> SELECT to_char(NOW()-interval '1 hour', 'YYYY-MM-DD HH24:00');
to_char
------------------
2014-08-12 11:00
(1 row)
Is there a way to easily generate a table with 24 rows for the last 24 hours, like this:
vioozer=> SELECT MAGIC_FROM_STACK_OVERFLOW();
to_char
------------------
2014-08-12 12:00
2014-08-12 11:00
2014-08-12 10:00
...
2014-08-11 13:00
(24 rows)
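Use generate_series(). Starting 23 hours back gives exactly 24 rows, because both endpoints are included:
select to_char(i, 'YYYY-MM-DD HH24:00')
from generate_series(current_timestamp - interval '23' hour, current_timestamp, interval '1' hour) i
order by i desc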
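As a cross-check on the row count and ordering the question asks for, here is a minimal Python sketch of the same series. The helper name last_24_hours and the fixed input time are hypothetical; the output format follows the 'YYYY-MM-DD HH24:00' pattern used in the question.

```python
from datetime import datetime, timedelta


def last_24_hours(now: datetime) -> list:
    """Return the current hour and the 23 hours before it, newest first."""
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    return [
        (top_of_hour - timedelta(hours=i)).strftime("%Y-%m-%d %H:00")
        for i in range(24)
    ]


# Hypothetical "now", chosen to match the date shown in the question.
hours = last_24_hours(datetime(2014, 8, 12, 12, 34, 56))
# hours[0] is '2014-08-12 12:00' and hours[-1] is '2014-08-11 13:00'
```

Note that a series running from 24 hours ago to now, with both endpoints included, has 25 entries; 24 rows need a span of 23 hours.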

how to convert date format in sql

I have date/times like these:
8/5/2014 12:00:01 AM
8/5/2014 12:00:16 AM
8/5/2014 12:00:18 AM
8/5/2014 12:17:18 AM
8/5/2014 12:19:18 AM
I want to convert them to these date/times:
if the minutes are less than 15 and greater than 00, I want the minutes to be 00
if the minutes are less than 30 and greater than 15, I want the minutes to be 15
if the minutes are less than 45 and greater than 30, I want the minutes to be 30
if the minutes are less than 60 and greater than 45, I want the minutes to be 45
8/5/2014 12:00:00 AM
...
...
8/5/2014 12:15:00AM
...
...
8/5/2014 12:30:00AM
...
...
8/5/2014 12:45:00AM
I need this for a report. How can I do this in Oracle?
Here's another method, which is really a variation of Gordon Linoff's approach:
trunc(dt, 'HH24') + ((15/1440) * (floor(to_number(to_char(dt, 'MI'))/15)))
The trunc(dt, 'HH24') gives you the value truncated to hour precision, so for your sample that's always midnight. Then floor(to_number(to_char(dt, 'MI'))/15) gives you the number of complete 15-minute periods represented by the minute value; with your data that's either zero or 1. As Gordon mentioned, when you add a numeric value to a date it's treated as a fraction of a day, so that count needs to be multiplied by '15 minutes' expressed in days (15/1440, since a day has 1440 minutes).
with t as (
select to_date('8/5/2014 12:00:01 AM', 'MM/DD/YYYY HH:MI:SS AM') as dt from dual
union all select to_date('8/5/2014 12:00:16 AM', 'MM/DD/YYYY HH:MI:SS AM') as dt from dual
union all select to_date('8/5/2014 12:00:18 AM', 'MM/DD/YYYY HH:MI:SS AM') as dt from dual
union all select to_date('8/5/2014 12:17:18 AM', 'MM/DD/YYYY HH:MI:SS AM') as dt from dual
union all select to_date('8/5/2014 12:19:18 AM', 'MM/DD/YYYY HH:MI:SS AM') as dt from dual
union all select to_date('8/5/2014 12:37:37 AM', 'MM/DD/YYYY HH:MI:SS AM') as dt from dual
union all select to_date('8/5/2014 12:51:51 AM', 'MM/DD/YYYY HH:MI:SS AM') as dt from dual
)
select dt, trunc(dt, 'HH24')
+ ((15/1440) * (floor(to_number(to_char(dt, 'MI'))/15))) as new_dt
from t;
DT NEW_DT
---------------------- ----------------------
08/05/2014 12:00:01 AM 08/05/2014 12:00:00 AM
08/05/2014 12:00:16 AM 08/05/2014 12:00:00 AM
08/05/2014 12:00:18 AM 08/05/2014 12:00:00 AM
08/05/2014 12:17:18 AM 08/05/2014 12:15:00 AM
08/05/2014 12:19:18 AM 08/05/2014 12:15:00 AM
08/05/2014 12:37:37 AM 08/05/2014 12:30:00 AM
08/05/2014 12:51:51 AM 08/05/2014 12:45:00 AM
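The "truncate to the hour, then add whole 15-minute periods" idea can be checked quickly outside the database. The Python sketch below uses a hypothetical helper, floor_to_quarter, and the same sample times as the query above (12:xx AM is hour 0).

```python
from datetime import datetime


def floor_to_quarter(dt: datetime) -> datetime:
    """Keep the hour and replace the minutes with the start of the current quarter-hour."""
    completed_quarters = dt.minute // 15  # 0, 1, 2 or 3
    return dt.replace(minute=completed_quarters * 15, second=0, microsecond=0)


# Same sample values as the query above.
samples = [
    datetime(2014, 8, 5, 0, 0, 1),
    datetime(2014, 8, 5, 0, 0, 16),
    datetime(2014, 8, 5, 0, 17, 18),
    datetime(2014, 8, 5, 0, 37, 37),
    datetime(2014, 8, 5, 0, 51, 51),
]
floored_minutes = [floor_to_quarter(s).minute for s in samples]
# floored_minutes == [0, 0, 15, 30, 45]
```

Integer division by 15 plays the role of floor(to_number(to_char(dt, 'MI'))/15) in the Oracle expression.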
To add yet another method this looks like a situation where the function NUMTODSINTERVAL() could be useful - it makes it slightly more obvious what's happening:
select trunc(dt)
+ numtodsinterval(trunc(to_char(dt, 'sssss') / 900) * 15, 'MINUTE')
from ...
TRUNC() truncates the date to the beginning of that day. The format model sssss gives the number of seconds since midnight. The number of complete quarter-hours since midnight is that number of seconds divided by 900 (as there are 900 seconds in a quarter-hour). This is truncated again to remove any part-completed quarter-hour, then multiplied by 15 to give the number of minutes (there are 15 minutes in a quarter-hour). Lastly, this is converted to an INTERVAL DAY TO SECOND and added to the truncated date.
SQL> alter session set nls_date_format = 'dd/mm/yyyy hh24:mi:ss';
Session altered.
SQL> with t as (
2 select to_date('05/08/2014 12:00:01') as dt from dual union all
3 select to_date('05/08/2014 12:00:16') as dt from dual union all
4 select to_date('05/08/2014 12:00:18') as dt from dual union all
5 select to_date('05/08/2014 12:17:18') as dt from dual union all
6 select to_date('05/08/2014 12:19:18') as dt from dual union all
7 select to_date('05/08/2014 12:37:37') as dt from dual union all
8 select to_date('05/08/2014 12:51:51') as dt from dual
9 )
10 select trunc(dt)
11 + numtodsinterval(trunc(to_char(dt, 'sssss') / 900) * 15, 'MINUTE')
12 from t
13 ;
TRUNC(DT)+NUMTODSIN
-------------------
05/08/2014 12:00:00
05/08/2014 12:00:00
05/08/2014 12:00:00
05/08/2014 12:15:00
05/08/2014 12:15:00
05/08/2014 12:30:00
05/08/2014 12:45:00
7 rows selected.
I've explicitly set my NLS_DATE_FORMAT so I can rely on implicit conversion in TO_DATE() so that it fits on the page without scrolling. It is not recommended to use implicit conversion normally.
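The seconds-since-midnight variant described above can be traced step by step in Python. This is a sketch only: floor_via_seconds is a hypothetical name, and the sample time is one of the values from the session above.

```python
from datetime import datetime, timedelta

QUARTER_SECONDS = 15 * 60  # 900 seconds in a quarter-hour


def floor_via_seconds(dt: datetime) -> datetime:
    """Floor to the quarter-hour using seconds since midnight, as in the NUMTODSINTERVAL query."""
    midnight = dt.replace(hour=0, minute=0, second=0, microsecond=0)  # like TRUNC(dt)
    seconds = int((dt - midnight).total_seconds())                    # like to_char(dt, 'sssss')
    quarters = seconds // QUARTER_SECONDS                             # complete quarter-hours
    return midnight + timedelta(minutes=quarters * 15)                # like NUMTODSINTERVAL(..., 'MINUTE')


sample = datetime(2014, 8, 5, 12, 37, 37)
floored = floor_via_seconds(sample)
# 12:37:37 is 45457 seconds after midnight; 45457 // 900 = 50 quarter-hours = 750 minutes = 12:30
```

A full day contains 86400 / 900 = 96 quarter-hours, so the quotient always lies between 0 and 95.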
Here is one method. Extract the date and then add in what you want as hours and minutes:
select trunc(dt) + extract(hour from cast(dt as timestamp)) / 24.0 +
(trunc(extract(minute from cast(dt as timestamp)) / 15) * 15) / (24.0 * 60)
from ...
This uses the fact that + for dates adds a number of days. The three terms are the original date at midnight, the number of hours converted to days (hence the / 24) and the third is the number of minutes, suitably rounded.
Here is an example for ORACLE with actual date:
select trunc(sysdate,'HH')+trunc(to_number(to_char(sysdate,'MI'))/15)*15/24/60
from dual;