How can I generate the following table in BigQuery:
+---------------------+
| mydate |
+---------------------+
| 2010-01-01 00:00:00 |
| 2010-01-01 01:00:00 |
| 2010-01-01 02:00:00 |
| 2010-01-01 03:00:00 |
| 2010-01-01 04:00:00 |
| 2010-01-01 05:00:00 |
+---------------------+
Use the query below:
select ts
from unnest(generate_timestamp_array('2010-01-01 00:00:00', '2010-01-01 05:00:00', interval 1 hour)) ts
which returns one row per hour from 2010-01-01 00:00:00 to 2010-01-01 05:00:00, both endpoints included.
Another option (based on @Daniel's comment and @Khilesh's answer):
select timestamp('2010-01-01 00:00:00') + make_interval(hour => hours_to_add)
from unnest(generate_array(0,5)) AS hours_to_add
with the same output as above.
You can try this as well:
SELECT
TIMESTAMP_ADD(TIMESTAMP("2010-01-01 00:00:00"), INTERVAL hours_to_add HOUR) AS mydate
FROM
-- 0..5 gives the six hourly rows shown below; raise the upper bound for a longer range
(SELECT num1 AS hours_to_add FROM UNNEST(GENERATE_ARRAY(0, 5)) AS num1)
Output :
+---------------------+
| mydate |
+---------------------+
| 2010-01-01 00:00:00 |
| 2010-01-01 01:00:00 |
| 2010-01-01 02:00:00 |
| 2010-01-01 03:00:00 |
| 2010-01-01 04:00:00 |
| 2010-01-01 05:00:00 |
+---------------------+
Related
I am trying to figure out the offset that should be applied to a meeting with start and end date time.
The timezone table below stores the UTC offset in minutes and the period during which that offset is active.
Timezone Table
|TimezoneCode     | StartDate           | EndDate             | UtcOffSetInMinute
+-----------------+---------------------+---------------------+------------------
|Antarctica/Casey | 2020-04-05 02:00:00 | 2020-09-26 02:00:00 | 720
|Antarctica/Casey | 2020-09-27 05:00:00 | 2020-05-03 05:00:00 | 780
Meeting table which stores all the meetings
Meeting Table
|Id | StartDateTime | EndDateTime
+----+---------------------+----------------------
|1 | 2020-04-06 23:00:00 | 2020-09-26 05:00:00
|2 | 2020-10-21 10:00:00 | 2020-10-21 11:00:00
Using the above timezone table, I am struggling to work out the local time of each meeting.
How can we join the timezone table with meeting table and get the utcoffset for meeting based on date range?
Expected output
|Id | StartDateTime | EndDateTime | OffsetInMins
+----+---------------------+---------------------+-------------
|1 | 2020-04-06 23:00:00 | 2020-09-26 05:00:00 | 720
|2 | 2020-09-27 23:00:00 | 2020-09-29 05:00:00 | 780
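No answer to the meeting-offset question is quoted here, so the following is only a rough sketch of the lookup logic, written in Python rather than SQL so it can be run on its own. It assumes each offset stays in force until the next row's StartDate, which lets it ignore the EndDate column entirely; the names `casey_offsets` and `offset_at` are made up for illustration. It uses the meeting start times from the Meeting table.

```python
from bisect import bisect_right
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# (StartDate, UtcOffSetInMinute) pairs for one TimezoneCode, copied from the
# Timezone table above and sorted by StartDate. EndDate is deliberately
# ignored (assumption: an offset applies until the next StartDate).
casey_offsets = [
    (datetime.strptime("2020-04-05 02:00:00", FMT), 720),
    (datetime.strptime("2020-09-27 05:00:00", FMT), 780),
]

def offset_at(moment, offsets):
    """Return the offset of the latest period starting at or before `moment`."""
    starts = [start for start, _ in offsets]
    idx = bisect_right(starts, moment) - 1
    if idx < 0:
        return None  # moment precedes every known period
    return offsets[idx][1]

# Meeting Id -> StartDateTime, taken from the Meeting table above.
meetings = {
    1: datetime.strptime("2020-04-06 23:00:00", FMT),
    2: datetime.strptime("2020-10-21 10:00:00", FMT),
}

result = {meeting_id: offset_at(start, casey_offsets)
          for meeting_id, start in meetings.items()}
print(result)  # {1: 720, 2: 780}
```

In SQL the same rule would amount to joining each meeting to the timezone rows whose StartDate is not later than the meeting start and keeping the row with the latest StartDate.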
I have two tables: df1 contains Date1 (timestamp) and PolygonWKT (geometry), and df2 contains Date2 (timestamp) and PointWKT (geometry). I joined df1 and df2 based on geometry, so each PointWKT falls under the corresponding PolygonWKT. The problem is that the Date1 and Date2 columns are not matched up, and I also need Date1 and Date2 to be matched.
I would like to join tables based on geometry and also closest timestamp match between Date1 and Date2.
df2
| PointWKT | Date2 |
--------------------------------------
| b | 2020-05-05 12:00:00 UTC |
| b | 2020-05-05 12:00:10 UTC |
| b | 2020-05-05 12:00:20 UTC |
| b | 2020-05-05 12:17:00 UTC |
| c | 2020-05-06 18:00:00 UTC |
df1
| PolygonWKT | Date1 |
--------------------------------------
| A | 2020-05-03 9:00:00 UTC |
| A | 2020-05-03 9:30:10 UTC |
| B | 2020-05-05 12:05:00 UTC |
| B | 2020-05-05 12:25:00 UTC |
| C | 2020-05-06 18:05:00 UTC |
The first part of the code is correct, but the second part doesn't return what I want:
SELECT *
FROM `xxx.yyy.df1` as df1 ,
`xxx.yyy.df2` as df2
WHERE ST_Contains (df1.PolygonWKT, df2.PointWKT)
AND (
df2.Date2 BETWEEN df1.Date1 AND TIMESTAMP_ADD(df1.Date1, INTERVAL 10 MINUTE)
)
desired df
| PointWKT | Date2 || PolygonWKT | Date1 |
----------------------------------------------------------------------------
| b | 2020-05-05 12:00:00 UTC | | B | 2020-05-05 12:05:00 UTC |
| b | 2020-05-05 12:00:10 UTC | | B | 2020-05-05 12:05:00 UTC |
| b | 2020-05-05 12:00:20 UTC | | B | 2020-05-05 12:05:00 UTC |
| b | 2020-05-05 12:17:00 UTC | | B | 2020-05-05 12:25:00 UTC |
| c | 2020-05-06 18:00:00 UTC | | C | 2020-05-06 18:05:00 UTC |
What would be a correct way to do this?
I would like to join tables based on geometry and also closest timestamp match between Date1 and Date2.
Below is for BigQuery Standard SQL
SELECT
ARRAY_AGG(STRUCT(df2.PointWKT, df2.Date2, df1.PolygonWKT, df1.Date1)
ORDER BY ABS(TIMESTAMP_DIFF(df2.Date2, df1.Date1, SECOND))
LIMIT 1)[OFFSET(0)].*
FROM `xxx.yyy.df1` AS df1 ,
`xxx.yyy.df2` AS df2
WHERE ST_CONTAINS(df1.PolygonWKT, df2.PointWKT)
GROUP BY TO_JSON_STRING(STRUCT(df2.PointWKT, df2.Date2))
Applied to sample data similar to that in your example:
WITH `xxx.yyy.df1` AS (
SELECT ST_GEOGPOINT(1,2) PolygonWKT, TIMESTAMP '2020-05-03 9:00:00 UTC' Date1 UNION ALL
SELECT ST_GEOGPOINT(1,2), '2020-05-03 9:30:10 UTC' UNION ALL
SELECT ST_GEOGPOINT(1,3), '2020-05-05 12:05:00 UTC' UNION ALL
SELECT ST_GEOGPOINT(1,3), '2020-05-05 12:25:00 UTC' UNION ALL
SELECT ST_GEOGPOINT(1,4), '2020-05-06 18:05:00 UTC'
), `xxx.yyy.df2` AS (
SELECT ST_GEOGPOINT(1,3) PointWKT, TIMESTAMP '2020-05-05 12:00:00 UTC' Date2 UNION ALL
SELECT ST_GEOGPOINT(1,3), '2020-05-05 12:00:10 UTC' UNION ALL
SELECT ST_GEOGPOINT(1,3), '2020-05-05 12:00:20 UTC' UNION ALL
SELECT ST_GEOGPOINT(1,3), '2020-05-05 12:17:00 UTC' UNION ALL /* this value adjusted based on the expected result sample, as the original looks like a typo */
SELECT ST_GEOGPOINT(1,4), '2020-05-06 18:00:00 UTC'
)
output is
Row PointWKT Date2 PolygonWKT Date1
1 POINT(1 3) 2020-05-05 12:00:00 UTC POINT(1 3) 2020-05-05 12:05:00 UTC
2 POINT(1 3) 2020-05-05 12:00:10 UTC POINT(1 3) 2020-05-05 12:05:00 UTC
3 POINT(1 3) 2020-05-05 12:00:20 UTC POINT(1 3) 2020-05-05 12:05:00 UTC
4 POINT(1 3) 2020-05-05 12:17:00 UTC POINT(1 3) 2020-05-05 12:25:00 UTC
5 POINT(1 4) 2020-05-06 18:00:00 UTC POINT(1 4) 2020-05-06 18:05:00 UTC
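To make the "closest timestamp per point" rule easier to check by hand, here is a small Python sketch of the same idea as the ARRAY_AGG ... ORDER BY ABS(TIMESTAMP_DIFF(...)) LIMIT 1 pattern above. It is only an illustration: plain string keys stand in for the geometries, a case-insensitive key comparison stands in for ST_CONTAINS, and `closest_matches` is a hypothetical helper name.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def ts(text):
    """Parse a timestamp literal from the question's sample data."""
    return datetime.strptime(text, FMT)

# Stand-ins for df1 and df2 from the question.
df1 = [("A", ts("2020-05-03 09:00:00")), ("A", ts("2020-05-03 09:30:10")),
       ("B", ts("2020-05-05 12:05:00")), ("B", ts("2020-05-05 12:25:00")),
       ("C", ts("2020-05-06 18:05:00"))]
df2 = [("b", ts("2020-05-05 12:00:00")), ("b", ts("2020-05-05 12:00:10")),
       ("b", ts("2020-05-05 12:00:20")), ("b", ts("2020-05-05 12:17:00")),
       ("c", ts("2020-05-06 18:00:00"))]

def closest_matches(polygons, points):
    """For each point row, keep the matching polygon row nearest in time."""
    rows = []
    for point, date2 in points:
        # Assumption: "same letter" plays the role of ST_CONTAINS.
        candidates = [(poly, date1) for poly, date1 in polygons
                      if poly.lower() == point]
        if not candidates:
            continue  # no containing polygon, so the row is dropped
        poly, date1 = min(
            candidates,
            key=lambda c: abs((date2 - c[1]).total_seconds()),
        )
        rows.append((point, date2, poly, date1))
    return rows

matches = closest_matches(df1, df2)
for point, date2, poly, date1 in matches:
    print(point, date2, poly, date1)
```

With the sample data this pairs the three 12:00 rows with 12:05:00, the 12:17:00 row with 12:25:00, and the 18:00:00 row with 18:05:00, which agrees with the desired table in the question.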
Based on your sample data, you are pulling the dates in the wrong order. Does this do what you want?
df1.Date1 BETWEEN df2.Date2 AND TIMESTAMP_ADD(df2.Date2, INTERVAL 10 MINUTE)
I need to create new interval rows based on a start datetime column and an end datetime column.
My statement looks like this currently
select id,
startdatetime,
enddatetime
from calls
result looks like this
id startdatetime enddatetime
1 01/01/2020 00:00:00 01/01/2020 04:00:00
I would like a result like this
id startdatetime enddatetime Intervals
1 01/01/2020 00:00:00 01/01/2020 03:00:00 01/01/2020 00:00:00
1 01/01/2020 00:00:00 01/01/2020 03:00:00 01/01/2020 01:00:00
1 01/01/2020 00:00:00 01/01/2020 03:00:00 01/01/2020 02:00:00
1 01/01/2020 00:00:00 01/01/2020 03:00:00 01/01/2020 03:00:00
Thanking you in advance
p.s. I'm new to SQL
You can use a recursive sub-query factoring clause to loop and incrementally add an hour:
WITH times ( id, startdatetime, enddatetime, intervals ) AS (
SELECT id,
startdatetime,
enddatetime,
startdatetime
FROM calls c
UNION ALL
SELECT id,
startdatetime,
enddatetime,
intervals + INTERVAL '1' HOUR
FROM times
WHERE intervals + INTERVAL '1' HOUR <= enddatetime
)
SELECT *
FROM times;
outputs:
ID | STARTDATETIME | ENDDATETIME | INTERVALS
-: | :------------------ | :------------------ | :------------------
1 | 2020-01-01 00:00:00 | 2020-01-01 04:00:00 | 2020-01-01 00:00:00
1 | 2020-01-01 00:00:00 | 2020-01-01 04:00:00 | 2020-01-01 01:00:00
1 | 2020-01-01 00:00:00 | 2020-01-01 04:00:00 | 2020-01-01 02:00:00
1 | 2020-01-01 00:00:00 | 2020-01-01 04:00:00 | 2020-01-01 03:00:00
1 | 2020-01-01 00:00:00 | 2020-01-01 04:00:00 | 2020-01-01 04:00:00
You can use a hierarchical query as follows:
WITH CALLS (ID, STARTDATETIME, ENDDATETIME)
AS ( SELECT 1,
       TO_DATE('01/01/2020 00:00:00', 'dd/mm/rrrr hh24:mi:ss'),
       TO_DATE('01/01/2020 04:00:00', 'dd/mm/rrrr hh24:mi:ss')
     FROM DUAL)
-- Your query starts from here
SELECT
  ID,
  STARTDATETIME,
  ENDDATETIME,
  STARTDATETIME + ( COLUMN_VALUE / 24 ) AS INTERVALS
FROM
  CALLS C
  CROSS JOIN TABLE ( CAST(MULTISET(
    SELECT LEVEL - 1
    FROM DUAL
    CONNECT BY LEVEL <= TRUNC(24 *(ENDDATETIME - STARTDATETIME))
  ) AS SYS.ODCINUMBERLIST) )
ORDER BY INTERVALS;
ID STARTDATETIME ENDDATETIME INTERVALS
---------- ------------------- ------------------- -------------------
1 01/01/2020 00:00:00 01/01/2020 04:00:00 01/01/2020 00:00:00
1 01/01/2020 00:00:00 01/01/2020 04:00:00 01/01/2020 01:00:00
1 01/01/2020 00:00:00 01/01/2020 04:00:00 01/01/2020 02:00:00
1 01/01/2020 00:00:00 01/01/2020 04:00:00 01/01/2020 03:00:00
Cheers!!
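Note that the two Oracle answers above treat the end of the range differently: the recursive query returns five rows (00:00 through 04:00), while the CONNECT BY query returns four (00:00 through 03:00). The Python sketch below only models those two boundary rules for the sample call so the difference is easy to see; `hourly_inclusive` and `hourly_by_count` are illustrative names, not part of either answer.

```python
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)

def hourly_inclusive(start, end):
    """Model of the recursive query: start row, then +1 hour while <= end."""
    rows = [start]
    while rows[-1] + HOUR <= end:
        rows.append(rows[-1] + HOUR)
    return rows

def hourly_by_count(start, end):
    """Model of CONNECT BY LEVEL <= TRUNC(24 * (end - start)): offsets 0..n-1."""
    whole_hours = int((end - start) / HOUR)  # truncated number of whole hours
    return [start + i * HOUR for i in range(whole_hours)]

start = datetime(2020, 1, 1, 0, 0)
end = datetime(2020, 1, 1, 4, 0)

print(len(hourly_inclusive(start, end)))  # 5
print(len(hourly_by_count(start, end)))   # 4
```

Which rule is wanted depends on whether the row at enddatetime itself should appear; the question's desired output lists four rows, ending one hour before the 04:00 end time shown in the current result.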
In Postgres, the query below works using the generate_series function:
SELECT dates
FROM generate_series(CAST('2019-03-01' as TIMESTAMP), CAST('2019-04-01' as TIMESTAMP), interval '30 mins') AS dates
The query below also works in Oracle, but only for whole-day intervals:
select to_date('2019-03-01','YYYY-MM-DD') + rownum -1 as dates
from all_objects
where rownum <= to_date('2019-03-06','YYYY-MM-DD')-to_date('2019-03-01','YYYY-MM-DD')+1
I want the same result in Oracle as the Postgres generate_series query above, with 30-minute steps.
Use a hierarchical query:
SELECT DATE '2019-03-01' + ( LEVEL - 1 ) * INTERVAL '30' MINUTE AS dates
FROM DUAL
CONNECT BY DATE '2019-03-01' + ( LEVEL - 1 ) * INTERVAL '30' MINUTE <= DATE '2019-04-01';
Output:
| DATES |
| :------------------ |
| 2019-03-01 00:00:00 |
| 2019-03-01 00:30:00 |
| 2019-03-01 01:00:00 |
| 2019-03-01 01:30:00 |
| 2019-03-01 02:00:00 |
| 2019-03-01 02:30:00 |
| 2019-03-01 03:00:00 |
| 2019-03-01 03:30:00 |
| 2019-03-01 04:00:00 |
| 2019-03-01 04:30:00 |
| 2019-03-01 05:00:00 |
| 2019-03-01 05:30:00 |
...
| 2019-03-31 19:30:00 |
| 2019-03-31 20:00:00 |
| 2019-03-31 20:30:00 |
| 2019-03-31 21:00:00 |
| 2019-03-31 21:30:00 |
| 2019-03-31 22:00:00 |
| 2019-03-31 22:30:00 |
| 2019-03-31 23:00:00 |
| 2019-03-31 23:30:00 |
| 2019-04-01 00:00:00 |
In my Postgres database, I have the following table:
SELECT start_at, end_at FROM schedules;
+---------------------+---------------------+
| start_at | end_at |
|---------------------+---------------------|
| 2016-09-05 16:30:00 | 2016-09-05 17:30:00 |
| 2016-09-05 17:30:00 | 2016-09-05 18:30:00 |
| 2017-08-13 03:00:00 | 2017-08-13 07:00:00 |
| 2017-08-13 03:00:00 | 2017-08-13 07:00:00 |
| 2017-08-13 18:42:26 | 2017-08-13 21:30:46 |
| 2017-08-10 00:00:00 | 2017-08-10 03:30:00 |
| 2017-08-09 18:00:00 | 2017-08-10 03:00:00 |
| 2017-08-06 23:00:00 | 2017-08-07 03:00:00 |
| 2017-08-07 01:00:00 | 2017-08-07 03:48:20 |
| 2017-08-07 01:00:00 | 2017-08-07 03:48:20 |
| 2017-08-07 18:05:00 | 2017-08-07 20:53:20 |
| 2017-08-07 14:00:00 | 2017-08-08 01:00:00 |
| 2017-08-07 18:00:00 | 2017-08-07 20:48:20 |
| 2017-08-08 08:00:00 | 2017-08-09 00:00:00 |
| 2017-08-09 21:30:00 | 2017-08-10 00:18:20 |
| 2017-08-13 03:53:26 | 2017-08-13 06:41:46 |
+---------------------+---------------------+
Assume I also have an ID column. I want to update all the start and end times to be for today (now). What is the most efficient SQL to accomplish this? My table could have millions of rows.
The best I can think of is this:
update schedules
set start_at = current_date + start_at::time
, end_at = current_date + end_at::time
WHERE start_at::date <> current_date
or end_at::date <> current_date;
The arithmetic is fast compared to accessing the rows.
If not all rows need updating, the WHERE clause will help efficiency, since updates are expensive.
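One thing that may be worth checking before running the update: rows whose start and end fall on different dates (for example 2017-08-09 18:00:00 to 2017-08-10 03:00:00 in the table above) end up with an end time earlier than the start time once both are moved to the same day. The Python sketch below mirrors the `current_date + ts::time` arithmetic to show this; `move_to_day` is a hypothetical helper, and the fixed date is just a stand-in for current_date so the result is reproducible.

```python
from datetime import date, datetime

def move_to_day(start_at, end_at, day):
    """Keep each time of day but replace the date, like `day + ts::time`."""
    return (datetime.combine(day, start_at.time()),
            datetime.combine(day, end_at.time()))

today = date(2024, 1, 15)  # stand-in for current_date (assumption)

# Two rows taken from the schedules table above.
same_day = move_to_day(datetime(2016, 9, 5, 16, 30),
                       datetime(2016, 9, 5, 17, 30), today)
overnight = move_to_day(datetime(2017, 8, 9, 18, 0),
                        datetime(2017, 8, 10, 3, 0), today)

print(same_day[0] < same_day[1])    # True
print(overnight[0] < overnight[1])  # False: the end now precedes the start
```

Whether that matters depends on how the schedules are used; if overnight rows must stay valid, the end date would need to keep its original day offset from the start.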