I have two tables: df1 contains Date1 (timestamp) and PolygonWKT (geometry), and df2 contains Date2 (timestamp) and PointWKT (geometry). I joined df1 and df2 on geometry, so each PointWKT falls within its corresponding PolygonWKT. The problem is that the Date1 and Date2 columns are not aligned, and I also need Date1 and Date2 to match.
I would like to join tables based on geometry and also closest timestamp match between Date1 and Date2.
df2
| PointWKT | Date2 |
--------------------------------------
| b | 2020-05-05 12:00:00 UTC |
| b | 2020-05-05 12:00:10 UTC |
| b | 2020-05-05 12:00:20 UTC |
| b | 2020-05-05 12:17:00 UTC |
| c | 2020-05-06 18:00:00 UTC |
df1
| PolygonWKT | Date1 |
--------------------------------------
| A | 2020-05-03 9:00:00 UTC |
| A | 2020-05-03 9:30:10 UTC |
| B | 2020-05-05 12:05:00 UTC |
| B | 2020-05-05 12:25:00 UTC |
| C | 2020-05-06 18:05:00 UTC |
The first part of the code is correct, but the second part doesn't return what I want:
SELECT *
FROM `xxx.yyy.df1` as df1 ,
`xxx.yyy.df2` as df2
WHERE ST_Contains (df1.PolygonWKT, df2.PointWKT)
AND (
df2.Date2 BETWEEN df1.Date1 AND TIMESTAMP_ADD(df1.Date1, INTERVAL 10 MINUTE)
)
Desired df:
| PointWKT | Date2 || PolygonWKT | Date1 |
----------------------------------------------------------------------------
| b | 2020-05-05 12:00:00 UTC | | B | 2020-05-05 12:05:00 UTC |
| b | 2020-05-05 12:00:10 UTC | | B | 2020-05-05 12:05:00 UTC |
| b | 2020-05-05 12:00:20 UTC | | B | 2020-05-05 12:05:00 UTC |
| b | 2020-05-05 12:17:00 UTC | | B | 2020-05-05 12:25:00 UTC |
| c | 2020-05-06 18:00:00 UTC | | C | 2020-05-06 18:05:00 UTC |
What would be a correct way to do this?
Below is for BigQuery Standard SQL
SELECT
ARRAY_AGG(STRUCT(df2.PointWKT, df2.Date2, df1.PolygonWKT, df1.Date1)
ORDER BY ABS(TIMESTAMP_DIFF(df2.Date2, df1.Date1, SECOND))
LIMIT 1)[OFFSET(0)].*
FROM `xxx.yyy.df1` AS df1 ,
`xxx.yyy.df2` AS df2
WHERE ST_CONTAINS(df1.PolygonWKT, df2.PointWKT)
GROUP BY TO_JSON_STRING(STRUCT(df2.PointWKT, df2.Date2))
Applied to sample data similar to the data in your example:
WITH `xxx.yyy.df1` AS (
SELECT ST_GEOGPOINT(1,2) PolygonWKT, TIMESTAMP '2020-05-03 9:00:00 UTC' Date1 UNION ALL
SELECT ST_GEOGPOINT(1,2), '2020-05-03 9:30:10 UTC' UNION ALL
SELECT ST_GEOGPOINT(1,3), '2020-05-05 12:05:00 UTC' UNION ALL
SELECT ST_GEOGPOINT(1,3), '2020-05-05 12:25:00 UTC' UNION ALL
SELECT ST_GEOGPOINT(1,4), '2020-05-06 18:05:00 UTC'
), `xxx.yyy.df2` AS (
SELECT ST_GEOGPOINT(1,3) PointWKT, TIMESTAMP '2020-05-05 12:00:00 UTC' Date2 UNION ALL
SELECT ST_GEOGPOINT(1,3), '2020-05-05 12:00:10 UTC' UNION ALL
SELECT ST_GEOGPOINT(1,3), '2020-05-05 12:00:20 UTC' UNION ALL
SELECT ST_GEOGPOINT(1,3), '2020-05-05 12:17:00 UTC' UNION ALL /* this value was adjusted to match the expected result sample, since the original looks like a typo */
SELECT ST_GEOGPOINT(1,4), '2020-05-06 18:00:00 UTC'
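)
SELECT
ARRAY_AGG(STRUCT(df2.PointWKT, df2.Date2, df1.PolygonWKT, df1.Date1)
ORDER BY ABS(TIMESTAMP_DIFF(df2.Date2, df1.Date1, SECOND))
LIMIT 1)[OFFSET(0)].*
FROM `xxx.yyy.df1` AS df1 ,
`xxx.yyy.df2` AS df2
WHERE ST_CONTAINS(df1.PolygonWKT, df2.PointWKT)
GROUP BY TO_JSON_STRING(STRUCT(df2.PointWKT, df2.Date2))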
output is
Row PointWKT Date2 PolygonWKT Date1
1 POINT(1 3) 2020-05-05 12:00:00 UTC POINT(1 3) 2020-05-05 12:05:00 UTC
2 POINT(1 3) 2020-05-05 12:00:10 UTC POINT(1 3) 2020-05-05 12:05:00 UTC
3 POINT(1 3) 2020-05-05 12:00:20 UTC POINT(1 3) 2020-05-05 12:05:00 UTC
4 POINT(1 3) 2020-05-05 12:17:00 UTC POINT(1 3) 2020-05-05 12:25:00 UTC
5 POINT(1 4) 2020-05-06 18:00:00 UTC POINT(1 4) 2020-05-06 18:05:00 UTC
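For anyone who wants to check the matching logic outside BigQuery, here is a minimal Python sketch of the same idea: for each point row, keep the containing polygon row whose timestamp is nearest. The `contains` helper is a hypothetical stand-in for ST_CONTAINS that simply compares labels (point "b" is assumed to lie inside polygon "B"); it is not a real geometry test.

```python
from datetime import datetime

# Sample rows from the question: (label, timestamp).
df1 = [
    ("A", datetime(2020, 5, 3, 9, 0, 0)),
    ("A", datetime(2020, 5, 3, 9, 30, 10)),
    ("B", datetime(2020, 5, 5, 12, 5, 0)),
    ("B", datetime(2020, 5, 5, 12, 25, 0)),
    ("C", datetime(2020, 5, 6, 18, 5, 0)),
]
df2 = [
    ("b", datetime(2020, 5, 5, 12, 0, 0)),
    ("b", datetime(2020, 5, 5, 12, 0, 10)),
    ("b", datetime(2020, 5, 5, 12, 0, 20)),
    ("b", datetime(2020, 5, 5, 12, 17, 0)),
    ("c", datetime(2020, 5, 6, 18, 0, 0)),
]


def contains(polygon, point):
    """Hypothetical stand-in for ST_CONTAINS: labels match case-insensitively."""
    return polygon.lower() == point


def closest_matches(polygons, points):
    """For each point row, keep the containing polygon row nearest in time."""
    result = []
    for point, date2 in points:
        candidates = [(poly, date1) for poly, date1 in polygons if contains(poly, point)]
        if not candidates:
            continue  # inner-join behaviour: points without a polygon are dropped
        poly, date1 = min(
            candidates, key=lambda row: abs((date2 - row[1]).total_seconds())
        )
        result.append((point, date2, poly, date1))
    return result


matches = closest_matches(df1, df2)
for row in matches:
    print(row)
```

The `min(..., key=abs(difference))` step plays the role of `ORDER BY ABS(TIMESTAMP_DIFF(...)) LIMIT 1` in the query, applied once per (PointWKT, Date2) group.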
Based on your sample data, you are pulling the dates in the wrong order. Does this do what you want?
df1.Date1 BETWEEN df2.Date2 AND TIMESTAMP_ADD(df2.Date2, INTERVAL 10 MINUTE)
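The direction of the window matters because, in the sample data, every Date1 is later than the Date2 it should pair with. A small Python sketch, using only the polygon B / point b rows from the question, compares the two directions of the 10-minute window; the variable names are illustrative.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)

# Date1 values for polygon B and Date2 values for point b, from the question.
date1_values = [datetime(2020, 5, 5, 12, 5), datetime(2020, 5, 5, 12, 25)]
date2_values = [
    datetime(2020, 5, 5, 12, 0, 0),
    datetime(2020, 5, 5, 12, 0, 10),
    datetime(2020, 5, 5, 12, 0, 20),
    datetime(2020, 5, 5, 12, 17, 0),
]

# Condition from the question: Date2 BETWEEN Date1 AND Date1 + 10 minutes
original = [
    (d2, d1)
    for d2 in date2_values
    for d1 in date1_values
    if d1 <= d2 <= d1 + WINDOW
]

# Reversed condition: Date1 BETWEEN Date2 AND Date2 + 10 minutes
swapped = [
    (d2, d1)
    for d2 in date2_values
    for d1 in date1_values
    if d2 <= d1 <= d2 + WINDOW
]

print(len(original), len(swapped))  # prints "0 4"
```

With the question's condition no pair survives, while the reversed window keeps exactly one Date1 for each of the four Date2 values.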
How can I generate the following table in BigQuery:
+---------------------+
| mydate |
+---------------------+
| 2010-01-01 00:00:00 |
| 2010-01-01 01:00:00 |
| 2010-01-01 02:00:00 |
| 2010-01-01 03:00:00 |
| 2010-01-01 04:00:00 |
| 2010-01-01 05:00:00 |
+---------------------+
Use below
select ts
from unnest(generate_timestamp_array('2010-01-01 00:00:00', '2010-01-01 05:00:00', interval 1 hour)) ts
The output is the same six hourly timestamps shown in the question.
Another option (based on @Daniel's comment and @Khilesh's answer)
select timestamp('2010-01-01 00:00:00') + make_interval(hour => hours_to_add)
from unnest(generate_array(0,5)) AS hours_to_add
with the same output as above
You can try this as well:
SELECT
TIMESTAMP_ADD(TIMESTAMP("2010-01-01 00:00:00"), INTERVAL hours_to_add HOUR) as mydate
from
(SELECT num1 as hours_to_add FROM UNNEST(GENERATE_ARRAY(0,2400)) AS num1)
ORDER BY mydate
Output (first six rows):
+---------------------+
| mydate |
+---------------------+
| 2010-01-01 00:00:00 |
| 2010-01-01 01:00:00 |
| 2010-01-01 02:00:00 |
| 2010-01-01 03:00:00 |
| 2010-01-01 04:00:00 |
| 2010-01-01 05:00:00 |
+---------------------+
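The answers above use two equivalent ideas: walking from a start to an end timestamp in fixed steps, or adding integer hour offsets to the start. A minimal Python sketch of both, assuming the same start and end values as the question, shows that they produce the same rows.

```python
from datetime import datetime, timedelta

start = datetime(2010, 1, 1, 0, 0, 0)
end = datetime(2010, 1, 1, 5, 0, 0)
step = timedelta(hours=1)

# Approach 1: walk from start to end inclusive (like GENERATE_TIMESTAMP_ARRAY).
by_range = []
current = start
while current <= end:
    by_range.append(current)
    current += step

# Approach 2: add integer offsets to the start (like GENERATE_ARRAY + TIMESTAMP_ADD).
by_offset = [start + timedelta(hours=h) for h in range(0, 6)]

for ts in by_range:
    print(ts)
```

The range form is handy when you know the end timestamp; the offset form is handy when you know how many rows you want.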
I have got the following table.
date2 Group number
2020-28-05 00:00:00 A 55
2020-28-05 00:00:00 B 1.09
2020-28-05 00:00:00 C 1.8
2020-29-05 00:00:00 A 68
2020-29-05 00:00:00 B 1.9
2020-29-05 00:00:00 C 1.19
2020-01-06 00:00:00 A 10
2020-01-06 00:00:00 B 15
2020-01-06 00:00:00 C 0.88
2020-02-06 00:00:00 A 22
2020-02-06 00:00:00 B 15
2020-02-06 00:00:00 C 13
2020-03-06 00:00:00 A 66
2020-03-06 00:00:00 B 88
2020-03-06 00:00:00 C 99
As you can see, the dates 2020-30-05 and 2020-31-05 are missing from this table. These dates need to be filled with the 2020-29-05 values, per Group. The final output should look like this:
date2 Group number
2020-28-05 00:00:00 A 55
2020-28-05 00:00:00 B 1.09
2020-28-05 00:00:00 C 1.8
2020-29-05 00:00:00 A 68
2020-29-05 00:00:00 B 1.9
2020-29-05 00:00:00 C 1.19
2020-30-05 00:00:00 A 68
2020-30-05 00:00:00 B 1.9
2020-30-05 00:00:00 C 1.19
2020-31-05 00:00:00 A 68
2020-31-05 00:00:00 B 1.9
2020-31-05 00:00:00 C 1.19
2020-01-06 00:00:00 A 10
2020-01-06 00:00:00 B 15
2020-01-06 00:00:00 C 0.88
2020-02-06 00:00:00 A 22
2020-02-06 00:00:00 B 15
2020-02-06 00:00:00 C 13
2020-03-06 00:00:00 A 66
2020-03-06 00:00:00 B 88
2020-03-06 00:00:00 C 99
I tried the following approach:
create a temporary table (table B) containing only the dates from 2020-28-05 to 2020-03-06, then left join (merge) it so that the new dates come back as null, and then use a CASE on null to fill in the last value. However, this does not work: after the join I get nulls for only one row per missing date, when there should be three (one per group). This is only part of a larger dataset. How can I get the required output?
PS: I use Vertica.
It's Vertica, and Vertica has the TIMESERIES clause, which seems to match exactly what you need.
From a time series with irregular intervals between rows, or with longer gaps in an otherwise regular series (like yours), it creates a regular time series whose interval between consecutive rows is the one you specify in the AS sub-clause of the TIMESERIES clause. TS_FIRST_VALUE() and TS_LAST_VALUE() are functions that rely on that clause and return the value deduced from the input rows at each generated timestamp. That value can be obtained as 'const', that is, carried over from the nearest preceding original row, or as 'linear', that is, interpolated between the original row just before and the original row just after the generated timestamp. For your needs, you would use the constant value. See here:
WITH
-- your input ....
input(tmstmp,grp,nbr) AS (
SELECT TIMESTAMP '2020-05-28 00:00:00','A',55
UNION ALL SELECT TIMESTAMP '2020-05-28 00:00:00','B',1.09
UNION ALL SELECT TIMESTAMP '2020-05-28 00:00:00','C',1.8
UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','A',68
UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','B',1.9
UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','C',1.19
UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','A',10
UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','B',15
UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','C',0.88
UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','A',22
UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','B',15
UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','C',13
UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','A',66
UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','B',88
UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','C',99
)
-- real query here ...
SELECT
ts AS tmstmp
, grp
, TS_FIRST_VALUE(nbr,'const') AS nbr
FROM input
TIMESERIES ts AS '1 DAY' OVER(PARTITION BY grp ORDER BY tmstmp)
ORDER BY 1,2
;
-- out tmstmp | grp | nbr
-- out ---------------------+-----+-------
-- out 2020-05-28 00:00:00 | A | 55.00
-- out 2020-05-28 00:00:00 | B | 1.09
-- out 2020-05-28 00:00:00 | C | 1.80
-- out 2020-05-29 00:00:00 | A | 68.00
-- out 2020-05-29 00:00:00 | B | 1.90
-- out 2020-05-29 00:00:00 | C | 1.19
-- out 2020-05-30 00:00:00 | A | 68.00
-- out 2020-05-30 00:00:00 | B | 1.90
-- out 2020-05-30 00:00:00 | C | 1.19
-- out 2020-05-31 00:00:00 | A | 68.00
-- out 2020-05-31 00:00:00 | B | 1.90
-- out 2020-05-31 00:00:00 | C | 1.19
-- out 2020-06-01 00:00:00 | A | 10.00
-- out 2020-06-01 00:00:00 | B | 15.00
-- out 2020-06-01 00:00:00 | C | 0.88
-- out 2020-06-02 00:00:00 | A | 22.00
-- out 2020-06-02 00:00:00 | B | 15.00
-- out 2020-06-02 00:00:00 | C | 13.00
-- out 2020-06-03 00:00:00 | A | 66.00
-- out 2020-06-03 00:00:00 | B | 88.00
-- out 2020-06-03 00:00:00 | C | 99.00
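For readers without Vertica at hand, here is a minimal Python sketch of what the 'const' gap filling amounts to at daily granularity: within each group, every missing day receives the value of the last day seen before it. The `fill_daily` helper and the trimmed sample rows are illustrative assumptions, not part of Vertica.

```python
from datetime import date, timedelta

# Subset of the question's rows: (day, group, value).
rows = [
    (date(2020, 5, 28), "A", 55), (date(2020, 5, 28), "B", 1.09),
    (date(2020, 5, 29), "A", 68), (date(2020, 5, 29), "B", 1.9),
    (date(2020, 6, 1), "A", 10), (date(2020, 6, 1), "B", 15),
]


def fill_daily(rows):
    """Emit one row per day and group, carrying the last seen value forward."""
    by_group = {}
    for day, grp, value in rows:
        by_group.setdefault(grp, {})[day] = value

    filled = []
    for grp, known in by_group.items():
        day, last_day = min(known), max(known)
        value = None
        while day <= last_day:
            value = known.get(day, value)  # keep the previous value inside a gap
            filled.append((day, grp, value))
            day += timedelta(days=1)
    return sorted(filled)


filled = fill_daily(rows)
for row in filled:
    print(row)
```

Note that this sketch, like the TIMESERIES query, only fills gaps between the first and last row of each group; it does not extend past the last known day.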
For example, I have the following table:
WITH
-- your input ....
input(t,grp,value) AS (
SELECT TIMESTAMP '2020-05-28 00:00:00','A',55
UNION ALL SELECT TIMESTAMP '2020-05-28 00:00:00','B',1.09
UNION ALL SELECT TIMESTAMP '2020-05-28 00:00:00','C',1.8
UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','A',68
UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','B',1.9
UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','C',1.19
UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','A',10
UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','B',15
UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','C',0.88
UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','A',22
UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','B',15
UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','C',13
UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','A',66
UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','B',88
UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','C',99
)
As you can see, the dates 2020-30-05 and 2020-31-05 are missing from this table, so they need to be filled with the 2020-29-05 values, per Group. In addition, today's date is later than the last date in the data (06-08 vs. 06-03), so the most recent observations are missing as well. The final output should look like this:
date2 Group number
2020-28-05 00:00:00 A 55
2020-28-05 00:00:00 B 1.09
2020-28-05 00:00:00 C 1.8
2020-29-05 00:00:00 A 68
2020-29-05 00:00:00 B 1.9
2020-29-05 00:00:00 C 1.19
2020-30-05 00:00:00 A 68
2020-30-05 00:00:00 B 1.9
2020-30-05 00:00:00 C 1.19
2020-31-05 00:00:00 A 68
2020-31-05 00:00:00 B 1.9
2020-31-05 00:00:00 C 1.19
2020-01-06 00:00:00 A 10
2020-01-06 00:00:00 B 15
2020-01-06 00:00:00 C 0.88
2020-02-06 00:00:00 A 22
2020-02-06 00:00:00 B 15
2020-02-06 00:00:00 C 13
2020-03-06 00:00:00 A 66
2020-03-06 00:00:00 B 88
2020-03-06 00:00:00 C 99
And for the period from 03-06 to 08-06, the same values:
2020-08-06 00:00:00 A 66
2020-08-06 00:00:00 B 88
2020-08-06 00:00:00 C 99
The following code fills the missing dates inside the data, but the gap up to today's date is not filled. How can I fix it?
SELECT ts AS t, grp, TS_FIRST_VALUE(value,'const') AS value
FROM input
TIMESERIES ts AS '1 DAY' OVER(PARTITION BY grp ORDER BY t)
ORDER BY 1,2
It's called INTERPOLATE and not EXTRAPOLATE, and that's the challenge.
You'll need to add the last row per group, but with today's date instead of the actual/original date, to the input table.
Note the padding and padded common table expressions I'm using below. Vertica has an analytic limit clause, which I'm using here: LIMIT 1 OVER(PARTITION BY grp ORDER BY tmstmp DESC).
WITH
input(tmstmp,grp,nbr) AS (
SELECT TIMESTAMP '2020-05-28 00:00:00','A',55
UNION ALL SELECT TIMESTAMP '2020-05-28 00:00:00','B',1.09
UNION ALL SELECT TIMESTAMP '2020-05-28 00:00:00','C',1.8
UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','A',68
UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','B',1.9
UNION ALL SELECT TIMESTAMP '2020-05-29 00:00:00','C',1.19
UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','A',10
UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','B',15
UNION ALL SELECT TIMESTAMP '2020-06-01 00:00:00','C',0.88
UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','A',22
UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','B',15
UNION ALL SELECT TIMESTAMP '2020-06-02 00:00:00','C',13
UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','A',66
UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','B',88
UNION ALL SELECT TIMESTAMP '2020-06-03 00:00:00','C',99
)
,
padding AS (
SELECT
CURRENT_DATE::timestamp
, grp
, nbr
FROM input
LIMIT 1 OVER(PARTITION BY grp ORDER BY tmstmp DESC)
)
,
padded AS (
SELECT * FROM input
UNION ALL
SELECT * FROM padding
)
SELECT
ts AS tmstmp
, grp
, TS_FIRST_VALUE(nbr,'const') AS nbr
FROM padded
TIMESERIES ts AS '1 DAY' OVER(PARTITION BY grp ORDER BY tmstmp)
ORDER BY 1,2
;
-- out tmstmp | grp | nbr
-- out ---------------------+-----+-------
-- out 2020-05-28 00:00:00 | A | 55.00
-- out 2020-05-28 00:00:00 | B | 1.09
-- out 2020-05-28 00:00:00 | C | 1.80
-- out 2020-05-29 00:00:00 | A | 68.00
-- out 2020-05-29 00:00:00 | B | 1.90
-- out 2020-05-29 00:00:00 | C | 1.19
-- out 2020-05-30 00:00:00 | A | 68.00
-- out 2020-05-30 00:00:00 | B | 1.90
-- out 2020-05-30 00:00:00 | C | 1.19
-- out 2020-05-31 00:00:00 | A | 68.00
-- out 2020-05-31 00:00:00 | B | 1.90
-- out 2020-05-31 00:00:00 | C | 1.19
-- out 2020-06-01 00:00:00 | A | 10.00
-- out 2020-06-01 00:00:00 | B | 15.00
-- out 2020-06-01 00:00:00 | C | 0.88
-- out 2020-06-02 00:00:00 | A | 22.00
-- out 2020-06-02 00:00:00 | B | 15.00
-- out 2020-06-02 00:00:00 | C | 13.00
-- out 2020-06-03 00:00:00 | A | 66.00
-- out 2020-06-03 00:00:00 | B | 88.00
-- out 2020-06-03 00:00:00 | C | 99.00
-- out 2020-06-04 00:00:00 | A | 66.00
-- out 2020-06-04 00:00:00 | B | 88.00
-- out 2020-06-04 00:00:00 | C | 99.00
-- out 2020-06-05 00:00:00 | A | 66.00
-- out 2020-06-05 00:00:00 | B | 88.00
-- out 2020-06-05 00:00:00 | C | 99.00
-- out 2020-06-06 00:00:00 | A | 66.00
-- out 2020-06-06 00:00:00 | B | 88.00
-- out 2020-06-06 00:00:00 | C | 99.00
-- out 2020-06-07 00:00:00 | A | 66.00
-- out 2020-06-07 00:00:00 | B | 88.00
-- out 2020-06-07 00:00:00 | C | 99.00
-- out 2020-06-08 00:00:00 | A | 66.00
-- out 2020-06-08 00:00:00 | B | 88.00
-- out 2020-06-08 00:00:00 | C | 99.00
-- out 2020-06-09 00:00:00 | A | 66.00
-- out 2020-06-09 00:00:00 | B | 88.00
-- out 2020-06-09 00:00:00 | C | 99.00
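The padding trick can also be sketched in Python: re-add each group's latest row with the target end date, then forward-fill day by day. This is only an illustration under assumptions: `pad_to`, `fill_daily` and the fixed `end_day` (standing in for CURRENT_DATE) are hypothetical names, and the rows are a trimmed copy of the sample data.

```python
from datetime import date, timedelta

# Last two days of the sample data for two groups: (day, group, value).
rows = [
    (date(2020, 6, 2), "A", 22), (date(2020, 6, 2), "B", 15),
    (date(2020, 6, 3), "A", 66), (date(2020, 6, 3), "B", 88),
]


def pad_to(rows, end_day):
    """Append each group's latest row again, re-dated to end_day."""
    latest = {}
    for day, grp, value in rows:
        if grp not in latest or day > latest[grp][0]:
            latest[grp] = (day, value)
    padding = [
        (end_day, grp, value)
        for grp, (day, value) in latest.items()
        if day < end_day  # skip groups that already reach end_day
    ]
    return rows + padding


def fill_daily(rows):
    """One row per day and group, carrying the last seen value forward."""
    known = {(day, grp): value for day, grp, value in rows}
    filled = []
    for grp in sorted({g for _, g, _ in rows}):
        days = [day for day, g, _ in rows if g == grp]
        day, value = min(days), None
        while day <= max(days):
            value = known.get((day, grp), value)
            filled.append((day, grp, value))
            day += timedelta(days=1)
    return sorted(filled)


end_day = date(2020, 6, 8)  # fixed stand-in for "today"
result = fill_daily(pad_to(rows, end_day))
for row in result:
    print(row)
```

Because the padded row carries the last known value, the forward fill has an end point to reach, which is what turns the interpolation into the extrapolation you asked for.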
In Postgres, the query below works, using the generate_series function:
SELECT dates
FROM generate_series(CAST('2019-03-01' as TIMESTAMP), CAST('2019-04-01' as TIMESTAMP), interval '30 mins') AS dates
The query below also works in Oracle, but only for a daily interval:
select to_date('2019-03-01','YYYY-MM-DD') + rownum -1 as dates
from all_objects
where rownum <= to_date('2019-03-06','YYYY-MM-DD')-to_date('2019-03-01','YYYY-MM-DD')+1
I want the same result in Oracle as for the query below:
SELECT dates
FROM generate_series(CAST('2019-03-01' as TIMESTAMP), CAST('2019-04-01' as TIMESTAMP), interval '30 mins') AS dates
Use a hierarchical query:
SELECT DATE '2019-03-01' + ( LEVEL - 1 ) * INTERVAL '30' MINUTE AS dates
FROM DUAL
CONNECT BY DATE '2019-03-01' + ( LEVEL - 1 ) * INTERVAL '30' MINUTE <= DATE '2019-04-01';
Output:
| DATES |
| :------------------ |
| 2019-03-01 00:00:00 |
| 2019-03-01 00:30:00 |
| 2019-03-01 01:00:00 |
| 2019-03-01 01:30:00 |
| 2019-03-01 02:00:00 |
| 2019-03-01 02:30:00 |
| 2019-03-01 03:00:00 |
| 2019-03-01 03:30:00 |
| 2019-03-01 04:00:00 |
| 2019-03-01 04:30:00 |
| 2019-03-01 05:00:00 |
| 2019-03-01 05:30:00 |
...
| 2019-03-31 19:30:00 |
| 2019-03-31 20:00:00 |
| 2019-03-31 20:30:00 |
| 2019-03-31 21:00:00 |
| 2019-03-31 21:30:00 |
| 2019-03-31 22:00:00 |
| 2019-03-31 22:30:00 |
| 2019-03-31 23:00:00 |
| 2019-03-31 23:30:00 |
| 2019-04-01 00:00:00 |
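To sanity-check the row count, here is a short Python sketch that mirrors the hierarchical query: a counter stands in for Oracle's LEVEL pseudocolumn and the loop condition is the same test as the CONNECT BY clause. The variable names are illustrative.

```python
from datetime import datetime, timedelta

start = datetime(2019, 3, 1)
stop = datetime(2019, 4, 1)
step = timedelta(minutes=30)

dates = []
level = 1  # plays the role of Oracle's LEVEL pseudocolumn
while start + (level - 1) * step <= stop:  # same test as the CONNECT BY condition
    dates.append(start + (level - 1) * step)
    level += 1

print(len(dates), dates[0], dates[-1])
```

March has 31 days of 48 half-hour slots each, plus the closing midnight of 2019-04-01, so the series should contain 1489 rows.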
Does anyone know how to create a single timestamp column from two timestamp columns in Google Bigquery?
I have a table with two timestamp columns and I want to bring these two columns into one single column. The table currently looks like:
id | user_id | created_at_a | created_at_b
------------------------------------------------------------------
1 | 1 | 2019-01-24 12:20:00 UTC | 2019-01-25 01:04:00 UTC
2 | 1 | 2019-01-24 12:20:00 UTC | 2019-01-25 01:03:00 UTC
3 | 1 | 2019-01-24 12:22:00 UTC | 2019-01-25 01:03:00 UTC
4 | 1 | 2019-01-24 12:22:00 UTC | 2019-01-25 01:04:00 UTC
5 | 2 | 2019-01-24 20:48:00 UTC | 2019-01-24 20:49:00 UTC
6 | 2 | 2019-01-24 11:21:00 UTC | 2019-01-24 20:49:00 UTC
So... I'm trying to merge these two timestamp columns into one column. My expected result is as follows:
id | user_id | created_at_a
----------------------------------------
1 | 1 | 2019-01-24 12:20:00 UTC
2 | 1 | 2019-01-25 01:04:00 UTC
4 | 1 | 2019-01-25 01:03:00 UTC
5 | 1 | 2019-01-24 12:22:00 UTC
6 | 2 | 2019-01-24 20:48:00 UTC
7 | 2 | 2019-01-24 20:49:00 UTC
8 | 2 | 2019-01-24 11:21:00 UTC
Could someone please help me?
Many thanks!
Below is for BigQuery Standard SQL
#standardSQL
SELECT DISTINCT user_id, created_at
FROM (
SELECT user_id,
ARRAY_CONCAT_AGG([created_at_a, created_at_b]) created_at_ab
FROM `project.dataset.table`
GROUP BY user_id
), UNNEST(created_at_ab) created_at
You can test and play with this using the sample data from your question, as below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, 1 user_id, TIMESTAMP '2019-01-24 12:20:00 UTC' created_at_a, TIMESTAMP '2019-01-25 01:04:00 UTC' created_at_b UNION ALL
SELECT 2, 1, '2019-01-24 12:20:00 UTC', '2019-01-25 01:03:00 UTC' UNION ALL
SELECT 3, 1, '2019-01-24 12:22:00 UTC', '2019-01-25 01:03:00 UTC' UNION ALL
SELECT 4, 1, '2019-01-24 12:22:00 UTC', '2019-01-25 01:04:00 UTC' UNION ALL
SELECT 5, 2, '2019-01-24 20:48:00 UTC', '2019-01-24 20:49:00 UTC' UNION ALL
SELECT 6, 2, '2019-01-24 11:21:00 UTC', '2019-01-24 20:49:00 UTC'
)
SELECT DISTINCT user_id, created_at
FROM (
SELECT user_id,
ARRAY_CONCAT_AGG([created_at_a, created_at_b]) created_at_ab
FROM `project.dataset.table`
GROUP BY user_id
), UNNEST(created_at_ab) created_at
-- ORDER BY user_id, created_at
with result
Row user_id created_at
1 1 2019-01-24 12:20:00 UTC
2 1 2019-01-24 12:22:00 UTC
3 1 2019-01-25 01:03:00 UTC
4 1 2019-01-25 01:04:00 UTC
5 2 2019-01-24 11:21:00 UTC
6 2 2019-01-24 20:48:00 UTC
7 2 2019-01-24 20:49:00 UTC
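The same unpivot-and-deduplicate idea can be sketched in plain Python, which may help when checking the expected rows by hand: put both timestamp columns into one collection per user and keep the distinct (user_id, timestamp) pairs. The `merge_timestamps` helper is a hypothetical name used only for this illustration.

```python
from datetime import datetime

# Sample rows from the question: (id, user_id, created_at_a, created_at_b).
rows = [
    (1, 1, datetime(2019, 1, 24, 12, 20), datetime(2019, 1, 25, 1, 4)),
    (2, 1, datetime(2019, 1, 24, 12, 20), datetime(2019, 1, 25, 1, 3)),
    (3, 1, datetime(2019, 1, 24, 12, 22), datetime(2019, 1, 25, 1, 3)),
    (4, 1, datetime(2019, 1, 24, 12, 22), datetime(2019, 1, 25, 1, 4)),
    (5, 2, datetime(2019, 1, 24, 20, 48), datetime(2019, 1, 24, 20, 49)),
    (6, 2, datetime(2019, 1, 24, 11, 21), datetime(2019, 1, 24, 20, 49)),
]


def merge_timestamps(rows):
    """Return the distinct (user_id, created_at) pairs from both columns, sorted."""
    seen = set()
    for _id, user_id, created_at_a, created_at_b in rows:
        seen.add((user_id, created_at_a))  # unpivot column a
        seen.add((user_id, created_at_b))  # unpivot column b
    return sorted(seen)


merged = merge_timestamps(rows)
for user_id, created_at in merged:
    print(user_id, created_at)
```

The set does the job of SELECT DISTINCT, and adding both columns to it corresponds to ARRAY_CONCAT_AGG followed by UNNEST.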