Rolling 30 days of data from Big Query - google-bigquery

Suppose I have this query:
SELECT
ga_channelGrouping,
ga_sourceMedium,
ga_campaign,
SUM(ga_sessions) AS sessions,
SUM(ga_sessionDuration) / SUM(ga_sessions) AS avg_sessionDuration,
SUM(ga_users) AS Users,
SUM(ga_newUsers) AS New_Users,
SUM(ga_bounces) / SUM(ga_sessions) AS ga_bounceRate,
SUM(ga_pageviews) / SUM(ga_sessions) AS pageViews_per_sessions,
SUM(ga_transactions) / SUM(ga_sessions) AS ga_conversionRate
FROM db.table
GROUP BY ga_channelGrouping, ga_sourceMedium, ga_campaign
How do I get a rolling 30 days of data from BigQuery? The values in my DATE column look like this: 2018-06-19 11:00:00 UTC

You can use the DATE_ADD or DATE_SUB functions to shift date values and TIMESTAMP_ADD, TIMESTAMP_SUB to shift timestamp values.
So you could try:
SELECT
ga_channelGrouping,
ga_sourceMedium,
ga_campaign,
SUM(ga_sessions) AS sessions,
SUM(ga_sessionDuration) / SUM(ga_sessions) AS avg_sessionDuration,
SUM(ga_users) AS Users,
SUM(ga_newUsers) AS New_Users,
SUM(ga_bounces) / SUM(ga_sessions) AS ga_bounceRate,
SUM(ga_pageviews) / SUM(ga_sessions) AS pageViews_per_sessions,
SUM(ga_transactions) / SUM(ga_sessions) AS ga_conversionRate
FROM db.table
WHERE your_date_column >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24*30 HOUR)
GROUP BY ga_channelGrouping, ga_sourceMedium, ga_campaign
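TIMESTAMP_SUB doesn't take MONTH or larger units as an interval. DAY is accepted and is treated as exactly 24 hours, so INTERVAL 24*30 HOUR, used here to go back 30 days, is equivalent to INTERVAL 30 DAY.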
EDIT: If you want to roll back 30 days regardless of the time of the day you can do the following:
WHERE your_date_column >= TIMESTAMP_TRUNC(TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24*30 HOUR), DAY)
OR
WHERE CAST(your_date_column AS DATE) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
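To make the difference between the two cutoffs concrete, here is a minimal Python sketch of the same arithmetic. It is only an illustration, not BigQuery code: the function names and the fixed "now" value are made up for the example, and everything is assumed to be in UTC.

```python
from datetime import datetime, timedelta, timezone

def exact_cutoff(now: datetime) -> datetime:
    # Mirrors TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24*30 HOUR)
    return now - timedelta(hours=24 * 30)

def midnight_cutoff(now: datetime) -> datetime:
    # Mirrors TIMESTAMP_TRUNC(..., DAY): same day, time of day dropped
    return exact_cutoff(now).replace(hour=0, minute=0, second=0, microsecond=0)

# Hypothetical "now", chosen only for illustration
now = datetime(2018, 7, 19, 11, 0, 0, tzinfo=timezone.utc)
print(exact_cutoff(now))     # 2018-06-19 11:00:00+00:00
print(midnight_cutoff(now))  # 2018-06-19 00:00:00+00:00
```

With the first cutoff a row stamped 2018-06-19 09:00:00 UTC would be excluded; with the truncated cutoff it would be included.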

How do I find rolling 30 days of data from Big Query. My DATE column value is of this format: 2018-06-19 11:00:00 UTC
First, I want to point out that aggregating the last 30 days is quite different from a rolling 30 days: the former produces one total for a single 30-day period, while the latter produces, for every row, a total over the 30 days ending at that row's date. The answer below focuses on rolling 30 days.
Below is for BigQuery Standard SQL and assumes that your date column is named your_date_column and is of the TIMESTAMP data type
#standardSQL
SELECT
your_date_column, -- data type of TIMESTAMP with value like 2018-06-19 11:00:00 UTC
ga_channelGrouping,
ga_sourceMedium,
ga_campaign,
SUM(ga_sessions) OVER(win) AS sessions,
(SUM(ga_sessionDuration) OVER(win))/(SUM(ga_sessions) OVER(win)) AS avg_sessionDuration,
SUM(ga_users) OVER(win) AS Users,
SUM(ga_newUsers) OVER(win) AS New_Users,
(SUM(ga_bounces) OVER(win))/(SUM(ga_sessions) OVER(win)) AS ga_bounceRate,
(SUM(ga_pageviews) OVER(win))/(SUM(ga_sessions) OVER(win)) AS pageViews_per_sessions,
(SUM(ga_transactions) OVER(win))/(SUM(ga_sessions) OVER(win)) AS ga_conversionRate
FROM `project.dataset.table`
WINDOW win AS (
PARTITION BY ga_channelGrouping, ga_sourceMedium, ga_campaign
ORDER BY UNIX_DATE(DATE(your_date_column))
RANGE BETWEEN 29 PRECEDING AND CURRENT ROW
)
To see how it works, try the dummy example below (for simplicity it uses a rolling 3-day window):
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 value, TIMESTAMP '2018-06-19 11:00:00 UTC' your_date_column UNION ALL
SELECT 2, '2018-06-20 11:00:00 UTC' UNION ALL
SELECT 3, '2018-06-21 11:00:00 UTC' UNION ALL
SELECT 4, '2018-06-22 11:00:00 UTC' UNION ALL
SELECT 5, '2018-06-23 11:00:00 UTC' UNION ALL
SELECT 6, '2018-06-24 11:00:00 UTC' UNION ALL
SELECT 7, '2018-06-25 11:00:00 UTC' UNION ALL
SELECT 8, '2018-06-26 11:00:00 UTC' UNION ALL
SELECT 9, '2018-06-27 11:00:00 UTC' UNION ALL
SELECT 10, '2018-06-28 11:00:00 UTC'
)
SELECT
your_date_column,
value,
SUM(value) OVER(win) rolling_value
FROM `project.dataset.table`
WINDOW win AS (ORDER BY UNIX_DATE(DATE(your_date_column)) RANGE BETWEEN 2 PRECEDING AND CURRENT ROW)
ORDER BY your_date_column
The result is:
Row | your_date_column        | value | rolling_value
------------------------------------------------------
1   | 2018-06-19 11:00:00 UTC | 1     | 1
2   | 2018-06-20 11:00:00 UTC | 2     | 3
3   | 2018-06-21 11:00:00 UTC | 3     | 6
4   | 2018-06-22 11:00:00 UTC | 4     | 9
5   | 2018-06-23 11:00:00 UTC | 5     | 12
6   | 2018-06-24 11:00:00 UTC | 6     | 15
7   | 2018-06-25 11:00:00 UTC | 7     | 18
8   | 2018-06-26 11:00:00 UTC | 8     | 21
9   | 2018-06-27 11:00:00 UTC | 9     | 24
10  | 2018-06-28 11:00:00 UTC | 10    | 27
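If it helps to see the window logic outside SQL, the following Python sketch reproduces the rolling_value column from the dummy example. It is an illustration of what RANGE BETWEEN 2 PRECEDING AND CURRENT ROW does over day numbers, not how BigQuery evaluates it; rolling_sum is a made-up helper name.

```python
from datetime import date

def rolling_sum(rows, days):
    """rows: list of (date, value). For each row, sum the values of all rows
    whose day number lies in [current_day - (days - 1), current_day]."""
    out = []
    for current, _ in rows:
        hi = current.toordinal()          # plays the role of UNIX_DATE(...)
        lo = hi - (days - 1)              # "N PRECEDING" with N = days - 1
        out.append(sum(v for d, v in rows if lo <= d.toordinal() <= hi))
    return out

# Same data as the dummy example: values 1..10 on 2018-06-19 .. 2018-06-28
rows = [(date(2018, 6, 19 + i), i + 1) for i in range(10)]
print(rolling_sum(rows, 3))  # [1, 3, 6, 9, 12, 15, 18, 21, 24, 27]
```

Calling rolling_sum(rows, 30) corresponds to the 29 PRECEDING window in the main query.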

Related

Query that counts total records per day and total records with same time timestamp and id per day in Bigquery

I have timeseries data like this:
time                    | id | value
--------------------------------------
2018-04-25 22:00:00 UTC | A  | 1
2018-04-25 23:00:00 UTC | A  | 2
2018-04-25 23:00:00 UTC | A  | 2.1
2018-04-25 23:00:00 UTC | B  | 1
2018-04-26 23:00:00 UTC | B  | 1.3
How do I write a query to produce an output table with these columns:
date: the truncated time
records: the number of records during this date
records_conflicting_time_id: the number of records during this date where the combination of time and id is not unique. In the example data above, the two records with id==A at 2018-04-25 23:00:00 UTC would be counted for date 2018-04-25
So the output of the query should be:
date       | records | records_conflicting_time_id
---------------------------------------------------
2018-04-25 | 4       | 2
2018-04-26 | 1       | 0
Getting records is easy: I just truncate the time to get the date and then group by date. But I'm really struggling to produce a column that counts the number of records where id + time is not unique over that date...
Consider the approach below:
select date(time) date,
sum(cnt) records,
sum(if(cnt > 1, cnt, 0)) conflicting_records
from (
select time, id, count(*) cnt
from your_table
group by time, id
)
group by date
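Applied to the sample data in your question, the output is 4 records and 2 conflicting records for 2018-04-25, and 1 record and 0 conflicting records for 2018-04-26.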
with YOUR_DATA as
(
select cast('2018-04-25 22:00:00 UTC' as timestamp) as `time`, 'A' as id, 1.0 as value
union all select cast('2018-04-25 23:00:00 UTC' as timestamp) as `time`, 'A' as id, 2.0 as value
union all select cast('2018-04-25 23:00:00 UTC' as timestamp) as `time`, 'A' as id, 2.1 as value
union all select cast('2018-04-25 23:00:00 UTC' as timestamp) as `time`, 'B' as id, 1.0 as value
union all select cast('2018-04-26 23:00:00 UTC' as timestamp) as `time`, 'B' as id, 1.3 as value
)
select cast(timestamp_trunc(t1.`time`, day) as date) as `date`,
count(*) as records,
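-- note: count(*) - count(distinct ...) + 1 is exact only when at most one (time, id) pair is duplicated within a date
case when count(*)-count(distinct cast(t1.`time` as string) || t1.id) = 0 then 0
else count(*)-count(distinct cast(t1.`time` as string) || t1.id)+1
end as records_conflicting_time_id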
from YOUR_DATA t1
group by cast(timestamp_trunc(t1.`time`, day) as date)
;
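As a cross-check, here is a small Python sketch of the two counting strategies shown above: grouping by (time, id) first, and the count-minus-distinct formula. The function names and the second data set are made up for illustration. Both agree on the sample data, but they diverge as soon as two different (time, id) pairs are duplicated on the same date.

```python
from collections import Counter, defaultdict

def conflicts_by_group(rows):
    """Group by (time, id) first, then sum the groups of size > 1 per date."""
    per_day = defaultdict(lambda: [0, 0])
    for (ts, _id), cnt in Counter(rows).items():
        day = ts[:10]                      # date part of the timestamp string
        per_day[day][0] += cnt
        if cnt > 1:
            per_day[day][1] += cnt
    return {day: tuple(v) for day, v in per_day.items()}

def conflicts_by_distinct(rows):
    """count(*) - count(distinct time||id) + 1, or 0 when there is no difference."""
    per_day = defaultdict(list)
    for ts, _id in rows:
        per_day[ts[:10]].append((ts, _id))
    out = {}
    for day, pairs in per_day.items():
        diff = len(pairs) - len(set(pairs))
        out[day] = (len(pairs), 0 if diff == 0 else diff + 1)
    return out

sample = [
    ("2018-04-25 22:00:00", "A"),
    ("2018-04-25 23:00:00", "A"),
    ("2018-04-25 23:00:00", "A"),
    ("2018-04-25 23:00:00", "B"),
    ("2018-04-26 23:00:00", "B"),
]
print(conflicts_by_group(sample))     # {'2018-04-25': (4, 2), '2018-04-26': (1, 0)}
print(conflicts_by_distinct(sample))  # {'2018-04-25': (4, 2), '2018-04-26': (1, 0)}

# Hypothetical data: two different pairs, each duplicated once on the same date
two_pairs = [("2018-04-25 22:00:00", "A")] * 2 + [("2018-04-25 23:00:00", "B")] * 2
print(conflicts_by_group(two_pairs))     # {'2018-04-25': (4, 4)}
print(conflicts_by_distinct(two_pairs))  # {'2018-04-25': (4, 3)}
```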

How to convert unix timestamp and aggregate min and max date in Oracle SQL Developer?

I have a table in Oracle SQL like the one below:
ID | date | place
-----------------------------
123 | 1610295784376 | OBJ_1
444 | 1748596758291 | OBJ_1
567 | 8391749204754 | OBJ_2
888 | 1747264526789 | OBJ_3
ID - ID of client
date - date in Unix timestamp in UTC
place - place of contact with client
I need to aggregate the data above to get results like the ones below, so I need to:
convert the Unix timestamp (UTC) in column "date" to a normal date
calculate the min and max date for each value of column "place"
min_date   | max_date   | distinct_place
-----------------------------------------
2022-01-05 | 2022-02-15 | OBJ_1
2022-02-10 | 2022-03-20 | OBJ_2
2021-10-15 | 2021-11-21 | OBJ_3
You can use:
SELECT TIMESTAMP '1970-01-01 00:00:00 UTC'
+ MIN(date_column) * INTERVAL '0.001' SECOND(3)
AS min_date,
TIMESTAMP '1970-01-01 00:00:00 UTC'
+ MAX(date_column) * INTERVAL '0.001' SECOND(3)
AS max_date,
place
FROM table_name
GROUP BY place;
Note: the (3) after SECOND is optional and will just explicitly specify the precision of the fractional seconds.
or:
SELECT TIMESTAMP '1970-01-01 00:00:00 UTC'
+ NUMTODSINTERVAL( MIN(date_column) / 1000, 'SECOND')
AS min_date,
TIMESTAMP '1970-01-01 00:00:00 UTC'
+ NUMTODSINTERVAL( MAX(date_column) / 1000, 'SECOND')
AS max_date,
place
FROM table_name
GROUP BY place;
Which, for the sample data:
CREATE TABLE table_name (ID, date_column, place) AS
SELECT 123, 1610295784376, 'OBJ_1' FROM DUAL UNION ALL
SELECT 444, 1748596758291, 'OBJ_1' FROM DUAL UNION ALL
SELECT 567, 1391749204754, 'OBJ_2' FROM DUAL UNION ALL -- Fixed leading digit
SELECT 888, 1747264526789, 'OBJ_3' FROM DUAL;
Both output:
MIN_DATE                          | MAX_DATE                          | PLACE
------------------------------------------------------------------------------
2021-01-10 16:23:04.376000000 UTC | 2025-05-30 09:19:18.291000000 UTC | OBJ_1
2014-02-07 05:00:04.754000000 UTC | 2014-02-07 05:00:04.754000000 UTC | OBJ_2
2025-05-14 23:15:26.789000000 UTC | 2025-05-14 23:15:26.789000000 UTC | OBJ_3
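The conversion itself (epoch start plus the value in milliseconds) can be checked independently of Oracle. The Python sketch below is only a cross-check of the arithmetic on the sample data above; from_epoch_millis and min_max_by_place are made-up helper names.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def from_epoch_millis(ms: int) -> datetime:
    # Same idea as TIMESTAMP '1970-01-01 00:00:00 UTC' + ms * INTERVAL '0.001' SECOND
    return EPOCH + timedelta(milliseconds=ms)

def min_max_by_place(rows):
    """rows: iterable of (epoch_millis, place). Returns {place: (min_ts, max_ts)}."""
    grouped = {}
    for ms, place in rows:
        lo, hi = grouped.get(place, (ms, ms))
        grouped[place] = (min(lo, ms), max(hi, ms))
    return {p: (from_epoch_millis(lo), from_epoch_millis(hi))
            for p, (lo, hi) in grouped.items()}

rows = [
    (1610295784376, "OBJ_1"),
    (1748596758291, "OBJ_1"),
    (1391749204754, "OBJ_2"),  # leading digit fixed, as in the sample table
    (1747264526789, "OBJ_3"),
]
for place, (lo, hi) in min_max_by_place(rows).items():
    print(place, lo.isoformat(), hi.isoformat())
```

For OBJ_1 this gives 2021-01-10T16:23:04.376000+00:00 and 2025-05-30T09:19:18.291000+00:00, matching the first output row.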

BigQuery - Query for each and set elements in column

I would like to loop over several elements for a query.
Here is the query :
SELECT
timestamp_trunc(timestamp, DAY) as Day,
count(1) as Number
FROM `table`
WHERE user_id="12345" AND timestamp >= '2021-07-05 00:00:00 UTC' AND timestamp <= '2021-07-08 23:59:59 UTC'
GROUP BY 1
ORDER BY Day
This gives me a row count per day between two dates for the user "12345", which is perfect.
But I would like to run this query for each user_id in my table,
and, if possible, I would like each day as a column, so that each row is a user and each column (a day) holds that user's count.
Result wanted:
User | 2021-07-05 | 2021-07-06 | 2021-07-07
---------------------------------------------
user_1 | 345 | 16 | 41
user_2 | 555 | 53 | 26
Thank you very much
Use the approach below:
SELECT * FROM (
SELECT
user_id,
DATE(timestamp) as Day,
COUNT(1) as Number
FROM `project.dataset.table`
WHERE timestamp >= '2021-07-05 00:00:00 UTC' AND timestamp <= '2021-07-08 23:59:59 UTC'
GROUP BY 1, 2
)
PIVOT (SUM(Number) FOR Day IN ('2021-07-05','2021-07-06','2021-07-07'))
Or even simpler (w/o GROUP BY as in your original query)
SELECT * FROM (
SELECT
user_id,
DATE(timestamp) as Day,
FROM `project.dataset.table`
WHERE timestamp >= '2021-07-05 00:00:00 UTC' AND timestamp <= '2021-07-08 23:59:59 UTC'
)
PIVOT (COUNT(*) FOR Day IN ('2021-07-05','2021-07-06','2021-07-07'))
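To picture what the pivot produces, here is a rough Python equivalent: count rows per (user, day), then lay the days out as columns. This is a sketch with made-up event data and a made-up pivot_counts helper, not a description of how BigQuery implements PIVOT.

```python
from collections import Counter

def pivot_counts(events, days):
    """events: list of (user_id, timestamp string starting with YYYY-MM-DD).
    Returns {user_id: [count for each day in days]}."""
    counts = Counter((user, ts[:10]) for user, ts in events)   # COUNT(*) per user and day
    users = sorted({user for user, _ in events})
    return {user: [counts[(user, day)] for day in days] for user in users}

# Hypothetical events, for illustration only
events = [
    ("user_1", "2021-07-05 08:00:00 UTC"),
    ("user_1", "2021-07-05 09:30:00 UTC"),
    ("user_1", "2021-07-07 10:00:00 UTC"),
    ("user_2", "2021-07-06 12:00:00 UTC"),
]
days = ["2021-07-05", "2021-07-06", "2021-07-07"]
print(pivot_counts(events, days))  # {'user_1': [2, 0, 1], 'user_2': [0, 1, 0]}
```

As in the SQL, the list of days has to be known up front; days that are not listed simply do not become columns.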

How to fill the time gap after grouping date record for months in postgres

I have table records like this:
date n_count
2020-02-19 00:00:00 4
2020-07-14 00:00:00 1
2020-07-17 00:00:00 1
2020-07-30 00:00:00 2
2020-08-03 00:00:00 1
2020-08-04 00:00:00 2
2020-08-25 00:00:00 2
2020-09-23 00:00:00 2
2020-09-30 00:00:00 3
2020-10-01 00:00:00 11
2020-10-05 00:00:00 12
2020-10-19 00:00:00 1
2020-10-20 00:00:00 1
2020-10-22 00:00:00 1
2020-11-02 00:00:00 376
2020-11-04 00:00:00 72
2020-11-11 00:00:00 1
I want to group all the records by month to get a total count per month. That works, but months with no records are missing from the result. How do I fill these gaps?
time month_count
"2020-02-01" 4
"2020-07-01" 4
"2020-08-01" 5
"2020-09-01" 5
"2020-10-01" 26
"2020-11-01" 449
This is what I have tried.
SELECT (date_trunc('month', date))::date AS time,
sum(n_count) as month_count
FROM table1
group by time
order by time asc
You can use generate_series() to generate the start of every month between the earliest and latest dates in the table, then bring in the table with a left join:
select d.dt, coalesce(sum(t.n_count), 0) as month_count
from (
select generate_series(date_trunc('month', min(date)), date_trunc('month', max(date)), '1 month') as dt
from table1
) as d(dt)
left join table1 t on t.date >= d.dt and t.date < d.dt + interval '1 month'
group by d.dt
order by d.dt
I would simply UNION a date series, generated from MIN and MAX date:
WITH cte AS ( -- 1
SELECT
*,
date_trunc('month', date)::date AS time
FROM
t
)
SELECT
time,
SUM(n_count) as month_count --3
FROM (
SELECT
time,
n_count
FROM cte
UNION ALL -- plain UNION would drop duplicate (time, n_count) rows and undercount
SELECT -- 2
generate_series(
(SELECT MIN(time) FROM cte),
(SELECT MAX(time) FROM cte),
interval '1 month'
)::date,
0
) s
GROUP BY time
ORDER BY time
1. Use a CTE to calculate date_trunc only once. It could be left out if you don't mind referencing your table twice in the union.
2. Generate a monthly date series from the MIN to the MAX date, with n_count = 0 for each month, and add it to the table.
3. Do your calculation.
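The gap-filling idea in both answers (build every month start between the first and last date, default each to 0, then add the real counts) can be sketched in Python with the data from the question. The helper names are made up for illustration.

```python
from datetime import date

def month_starts(first: date, last: date):
    """Yield the first day of every month from first's month to last's month."""
    y, m = first.year, first.month
    while (y, m) <= (last.year, last.month):
        yield date(y, m, 1)
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)

def monthly_totals(rows):
    """rows: list of (date, n_count). Months without rows get a total of 0."""
    dates = [d for d, _ in rows]
    totals = {start: 0 for start in month_starts(min(dates), max(dates))}
    for d, n in rows:
        totals[d.replace(day=1)] += n      # like date_trunc('month', date)
    return totals

rows = [
    (date(2020, 2, 19), 4), (date(2020, 7, 14), 1), (date(2020, 7, 17), 1),
    (date(2020, 7, 30), 2), (date(2020, 8, 3), 1), (date(2020, 8, 4), 2),
    (date(2020, 8, 25), 2), (date(2020, 9, 23), 2), (date(2020, 9, 30), 3),
    (date(2020, 10, 1), 11), (date(2020, 10, 5), 12), (date(2020, 10, 19), 1),
    (date(2020, 10, 20), 1), (date(2020, 10, 22), 1), (date(2020, 11, 2), 376),
    (date(2020, 11, 4), 72), (date(2020, 11, 11), 1),
]
for month, total in monthly_totals(rows).items():
    print(month, total)
```

This prints ten months from 2020-02-01 to 2020-11-01, with 0 for March through June and the same totals as the question (4, 4, 5, 5, 26, 449) for the months that have data.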

how to convert date format in sql

If I have dates/times like these:
8/5/2014 12:00:01 AM
8/5/2014 12:00:16 AM
8/5/2014 12:00:18 AM
8/5/2014 12:17:18 AM
8/5/2014 12:19:18 AM
I want these dates/times instead:
if the minutes are greater than 00 and less than 15, I want the minutes to be 00
if the minutes are greater than 15 and less than 30, I want the minutes to be 15
if the minutes are greater than 30 and less than 45, I want the minutes to be 30
if the minutes are greater than 45 and less than 60, I want the minutes to be 45
8/5/2014 12:00:00 AM
...
...
8/5/2014 12:15:00 AM
...
...
8/5/2014 12:30:00 AM
...
...
8/5/2014 12:45:00 AM
I need to do this for my report.
How can I do this in Oracle?
Here's another method, which is really a variation of Gordon Linoff's approach:
trunc(dt, 'HH24') + ((15/1440) * (floor(to_number(to_char(dt, 'MI'))/15)))
The trunc(dt, 'HH24') gives you the value truncated to hour precision, so for your sample that's always midnight. Then floor(to_number(to_char(dt, 'MI'))/15) gives you the number of complete 15-minute periods represented by the minute value; with your data that's either zero or 1. As Gordon mentioned, when you add a numeric value to a date it's treated as a fraction of a day, so that number needs to be multiplied by '15 minutes' expressed in days (15/1440).
with t as (
select to_date('8/5/2014 12:00:01 AM') as dt from dual
union all select to_date('8/5/2014 12:00:16 AM') as dt from dual
union all select to_date('8/5/2014 12:00:18 AM') as dt from dual
union all select to_date('8/5/2014 12:17:18 AM') as dt from dual
union all select to_date('8/5/2014 12:19:18 AM') as dt from dual
union all select to_date('8/5/2014 12:37:37 AM') as dt from dual
union all select to_date('8/5/2014 12:51:51 AM') as dt from dual
)
select dt, trunc(dt, 'HH24')
+ ((15/1440) * (floor(to_number(to_char(dt, 'MI'))/15))) as new_dt
from t;
DT NEW_DT
---------------------- ----------------------
08/05/2014 12:00:01 AM 08/05/2014 12:00:00 AM
08/05/2014 12:00:16 AM 08/05/2014 12:00:00 AM
08/05/2014 12:00:18 AM 08/05/2014 12:00:00 AM
08/05/2014 12:17:18 AM 08/05/2014 12:15:00 AM
08/05/2014 12:19:18 AM 08/05/2014 12:15:00 AM
08/05/2014 12:37:37 AM 08/05/2014 12:30:00 AM
08/05/2014 12:51:51 AM 08/05/2014 12:45:00 AM
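The same arithmetic can be checked outside Oracle. The Python sketch below mirrors the expression step by step (truncate to the hour, count whole 15-minute periods, scale by 15/1440 of a day); it is an illustration with a made-up function name, not Oracle code.

```python
from datetime import datetime, timedelta
from fractions import Fraction

QUARTER_HOUR_IN_DAYS = Fraction(15, 1440)  # 15 minutes as a fraction of a day

def floor_to_quarter_hour(dt: datetime) -> datetime:
    hour_start = dt.replace(minute=0, second=0, microsecond=0)  # trunc(dt, 'HH24')
    periods = dt.minute // 15                                   # floor(MI / 15)
    minutes = int(QUARTER_HOUR_IN_DAYS * periods * 24 * 60)     # days -> minutes
    return hour_start + timedelta(minutes=minutes)

# Same times as the sample above (12:xx AM is hour 0)
for mm, ss in [(0, 1), (0, 16), (17, 18), (37, 37), (51, 51)]:
    dt = datetime(2014, 8, 5, 0, mm, ss)
    print(dt, "->", floor_to_quarter_hour(dt))
```

The last three lines print 00:15:00, 00:30:00 and 00:45:00 respectively, in line with the NEW_DT column.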
To add yet another method: this looks like a situation where the function NUMTODSINTERVAL() could be useful, as it makes it slightly more obvious what's happening:
select trunc(dt)
+ numtodsinterval(trunc(to_char(dt, 'sssss') / 900) * 15, 'MINUTE')
from ...
TRUNC() truncates the date to the beginning of that day. The format model sssss gives the number of seconds since midnight. The number of complete quarter-hours since midnight is the number of seconds divided by 900 (as there are 900 seconds in a quarter-hour). This is truncated again to remove any part-completed quarter-hour, then multiplied by 15 to give the number of minutes (there are 15 minutes in a quarter-hour). Lastly, this is converted to an INTERVAL DAY TO SECOND and added to the truncated date.
SQL> alter session set nls_date_format = 'dd/mm/yyyy hh24:mi:ss';
Session altered.
SQL> with t as (
2 select to_date('05/08/2014 12:00:01') as dt from dual union all
3 select to_date('05/08/2014 12:00:16') as dt from dual union all
4 select to_date('05/08/2014 12:00:18') as dt from dual union all
5 select to_date('05/08/2014 12:17:18') as dt from dual union all
6 select to_date('05/08/2014 12:19:18') as dt from dual union all
7 select to_date('05/08/2014 12:37:37') as dt from dual union all
8 select to_date('05/08/2014 12:51:51') as dt from dual
9 )
10 select trunc(dt)
11 + numtodsinterval(trunc(to_char(dt, 'sssss') / 900) * 15, 'MINUTE')
12 from t
13 ;
TRUNC(DT)+NUMTODSIN
-------------------
05/08/2014 12:00:00
05/08/2014 12:00:00
05/08/2014 12:00:00
05/08/2014 12:15:00
05/08/2014 12:15:00
05/08/2014 12:30:00
05/08/2014 12:45:00
7 rows selected.
I've explicitly set my NLS_DATE_FORMAT so I can rely on implicit conversion in TO_DATE() so that it fits on the page without scrolling. It is not recommended to use implicit conversion normally.
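For comparison, the seconds-since-midnight variant looks like this as a Python sketch (the function name is made up; a quarter-hour is 900 seconds, so a day holds 86400 / 900 = 96 of them):

```python
from datetime import datetime, timedelta

def floor_via_seconds(dt: datetime) -> datetime:
    midnight = dt.replace(hour=0, minute=0, second=0, microsecond=0)  # trunc(dt)
    seconds = int((dt - midnight).total_seconds())                    # to_char(dt, 'sssss')
    quarter_hours = seconds // 900                                    # whole quarter-hours so far
    return midnight + timedelta(minutes=quarter_hours * 15)           # numtodsinterval(..., 'MINUTE')

print(86400 // 900)                                       # 96
print(floor_via_seconds(datetime(2014, 8, 5, 12, 37, 37)))  # 2014-08-05 12:30:00
print(floor_via_seconds(datetime(2014, 8, 5, 12, 51, 51)))  # 2014-08-05 12:45:00
```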
Here is one method. Extract the date and then add in what you want as hours and minutes:
select trunc(dt) + extract(hour from dt) / 24.0 +
(trunc(extract(minute from dt) / 15) * 15) / (24.0 * 60);
This uses the fact that + for dates adds a number of days. The three terms are the original date at midnight, the number of hours converted to days (hence the / 24) and the third is the number of minutes, suitably rounded.
Here is an example for Oracle using the current date:
select trunc(sysdate,'HH')+trunc(to_number(to_char(sysdate,'MI'))/15)*15/24/60
from dual;