How to deal with timestamps without a timezone? - google-bigquery

The NYC bike and taxi datasets list the time when events happened in local time. A timestamp like 2018-01-07 10:30:00 means it was 10:30 AM in New York at the time.
When I ingest these timestamps into BigQuery, BigQuery assumes they are in UTC, which attaches the wrong timezone information.
How can I fix this?

2 choices:
Use DATETIME instead of TIMESTAMP. A DATETIME holds the same date and time fields as a TIMESTAMP, except that no timezone is attached to it.
Since this is NY, you can attach the US/Eastern timezone when ingesting. It will correctly account for daylight saving changes and so on.
For example:
SELECT TIMESTAMP('2018-03-10 10:00:00', 'US/Eastern')
, TIMESTAMP('2018-05-10 10:00:00', 'US/Eastern')
Result:
2018-03-10 15:00:00 UTC
2018-05-10 14:00:00 UTC
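If you would rather fix the values before loading them, the same conversion can be sketched in Python with the standard library. This is only an illustration, not part of the BigQuery answer: the helper name local_to_utc is made up, and America/New_York is used on the assumption that it follows the same rules as the US/Eastern alias.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: America/New_York follows the same rules as the US/Eastern alias.
NEW_YORK = ZoneInfo("America/New_York")


def local_to_utc(naive_local: datetime, tz: ZoneInfo = NEW_YORK) -> datetime:
    """Attach a zone to a naive wall-clock time, then convert it to UTC."""
    return naive_local.replace(tzinfo=tz).astimezone(timezone.utc)


# 10:00 local on a winter-time day (UTC-5) and on a summer-time day (UTC-4).
winter = local_to_utc(datetime(2018, 3, 10, 10, 0))
summer = local_to_utc(datetime(2018, 5, 10, 10, 0))

print(winter.isoformat(sep=" "))
print(summer.isoformat(sep=" "))
```

The two values land on 15:00 and 14:00 UTC, matching the query result above, because the zone rules rather than a fixed offset decide the shift.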

Related

timestamp difference in bigquery with date

I used Dataiku to transfer 40 GB of data from Snowflake to BigQuery. Somehow the timestamp values were shifted during the copy, which changed my dates completely.
Instead of 2021-06-30 00:00:00 UTC, the copied timestamp value is 2021-06-29 22:00:00 UTC.
I am looking for a BigQuery solution to cast this back into the correct timestamp, as loading the data again is not possible.
Can someone help please? Thanks
I found the solution:
SELECT TIMESTAMP_ADD(Day_Sts, INTERVAL 120 MINUTE) AS Day_Sts, * EXCEPT (Day_Sts)
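One caveat worth checking: a fixed 120-minute shift is only right for rows where the offset really was two hours. If the shift came from a zone that observes daylight saving, winter rows would need 60 minutes instead. The sketch below shows the zone-aware version of the same repair in Python. Europe/Berlin is purely a guess (the question never names the zone; it is just a zone that is UTC+2 in June), and restore_wall_clock is a made-up helper name.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Hypothetical: the question does not say which zone caused the shift.
SOURCE_TZ = ZoneInfo("Europe/Berlin")


def restore_wall_clock(shifted_utc: datetime, tz: ZoneInfo = SOURCE_TZ) -> datetime:
    """Recover the original clock reading and label it as UTC again."""
    local = shifted_utc.astimezone(tz)         # e.g. 22:00 UTC -> 00:00 local
    return local.replace(tzinfo=timezone.utc)  # keep the reading, relabel as UTC


# A summer row (offset +2h) and a winter row (offset +1h).
summer = restore_wall_clock(datetime(2021, 6, 29, 22, 0, tzinfo=timezone.utc))
winter = restore_wall_clock(datetime(2021, 1, 14, 23, 0, tzinfo=timezone.utc))

print(summer.isoformat(sep=" "))
print(winter.isoformat(sep=" "))
```

Both rows come back to midnight, whereas adding a flat 120 minutes would push the winter row to 01:00. If all of your data falls inside one offset period, the fixed interval gives the same result.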

parse_timestamp vs format_timestamp bigquery

Could someone help me understand why these two queries return different results in BigQuery?
select FORMAT_TIMESTAMP('%F %H:%M:%E*S', "2018-10-01 00:00:00", 'Europe/London')
returns 2018-10-01 01:00:00
select PARSE_TIMESTAMP('%F %H:%M:%E*S', "2018-10-01 00:00:00", "Europe/London")
returns 2018-09-30 23:00:00 UTC
As 2018-10-01 falls within British Summer Time (UTC+1), I would have expected both queries to return 2018-09-30 23:00:00 UTC.
The first is given a timestamp which is in UTC. It then converts it to the corresponding time in Europe/London. The return value is a string representing the time in the local timezone.
The second takes a string representation and returns a UTC timestamp. The representation is assumed to be in Europe/London.
So, the two functions are going in different directions, one from UTC to the local time and the other from the local time to UTC.
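The two directions can be mimicked in Python to make the difference concrete. This is only an analogy under stated assumptions: format_in_zone and parse_in_zone are made-up names standing in for the two BigQuery functions, not their implementation.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

LONDON = ZoneInfo("Europe/London")
FMT = "%Y-%m-%d %H:%M:%S"


def format_in_zone(instant_utc: datetime, tz: ZoneInfo) -> str:
    """UTC instant -> local wall-clock string (the FORMAT direction)."""
    return instant_utc.astimezone(tz).strftime(FMT)


def parse_in_zone(text: str, tz: ZoneInfo) -> datetime:
    """Local wall-clock string -> UTC instant (the PARSE direction)."""
    return datetime.strptime(text, FMT).replace(tzinfo=tz).astimezone(timezone.utc)


formatted = format_in_zone(datetime(2018, 10, 1, 0, 0, tzinfo=timezone.utc), LONDON)
parsed = parse_in_zone("2018-10-01 00:00:00", LONDON)

print(formatted)
print(parsed.isoformat(sep=" "))
```

Formatting moves the clock one hour forward (UTC to London summer time), while parsing moves it one hour back (London summer time to UTC), which is why the two results differ by two hours.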

pytz - convert a datetime in the future to UTC

I have a file that contains forecasted events for the next two weeks. There is a datetime column which has the date and each 30 minute interval, and a time zone column.
I am using pytz to convert the different time zones (around 30+ unique ones) to UTC before loading them into a database. However, for the forecast file I am receiving an error:
NonExistentTimeError: 2016-10-16 00:00:00
Is there a way to go about this?
date        interval  time_zone
10/26/2016  22:30     US/Central
10/26/2016  22:30     US/Eastern
10/26/2016  23:00     America/Bogota
10/26/2016  23:00     Asia/Calcutta
Current code:
for tz in df['time_zone'].unique():
    # Localize this zone's rows, then convert them to UTC
    df.loc[df['time_zone'] == tz, 'datetime_utc'] = df.loc[df['time_zone'] == tz, 'datetime'].dt.tz_localize(tz).dt.tz_convert('UTC')
# Drop the timezone marker so the column holds naive UTC values
df['datetime_utc'] = df['datetime_utc'].dt.tz_localize(None)
Daylight saving time started in Brazil on 16 October 2016, so 2016-10-16 00:00:00 really is a local time that does not exist there. The clocks jumped straight from midnight to 01:00, so the value should instead read 2016-10-16 01:00:00.
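One way to handle this is to detect the nonexistent wall-clock times before converting and decide what to do with them. The sketch below uses only the standard library. The helper names are made up, and America/Sao_Paulo is an assumption, since the question only says "Brazil". In pandas itself, tz_localize also accepts a nonexistent argument (for example 'shift_forward' or 'NaT') in recent versions, which may be the simpler route.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: the Brazilian rows use America/Sao_Paulo.
SAO_PAULO = ZoneInfo("America/Sao_Paulo")


def is_nonexistent(naive_local: datetime, tz: ZoneInfo) -> bool:
    """True if the wall-clock time falls in a DST gap for this zone."""
    aware = naive_local.replace(tzinfo=tz)
    round_trip = aware.astimezone(timezone.utc).astimezone(tz)
    # A time inside a gap does not survive the round trip through UTC.
    return round_trip.replace(tzinfo=None) != naive_local


def to_utc(naive_local: datetime, tz: ZoneInfo) -> datetime:
    """Convert to UTC; a gap time is read with the pre-transition offset."""
    return naive_local.replace(tzinfo=tz).astimezone(timezone.utc)


gap_time = datetime(2016, 10, 16, 0, 0)
gap_flag = is_nonexistent(gap_time, SAO_PAULO)
normal_flag = is_nonexistent(datetime(2016, 10, 26, 23, 0), ZoneInfo("America/Bogota"))
gap_as_utc = to_utc(gap_time, SAO_PAULO)

print(gap_flag, normal_flag, gap_as_utc.isoformat(sep=" "))
```

Unlike pytz, zoneinfo does not raise here. It reads the gap time with the offset in force before the transition, so 00:00 maps to the same instant as 01:00 local, which matches the corrected value mentioned above.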

Extract date,month,year and month name from the unix timestamp with postgresql

I use PostgreSQL for a Rails app, and I have a Unix timestamp stored in the database. I need to select and group by the date (dd-mm-yyyy) and by month name.
Consider I have the following unix timestamp
1425148200
and I would need to change this to datetime and I used to_timestamp which returned
2015-02-28 18:30:00 UTC
and I tried to convert the datetime to local timezone using
::timestamp without time zone AT TIME ZONE 'IST'
but that did not give time in required timezone and instead it returned
2015-02-28 16:30:00 UTC
and I tried to get the date part using ::date which returned
Sat, 28 Feb 2015
So please help me get the dd-mm-yyyy date in the specified timezone, and the month name (March), from the Unix timestamp.
Thanks in Advance!
select to_char(to_timestamp('1425148200')::timestamptz at time zone 'UTC-5:30','DD-MM-YYYY & of course Month')
01-03-2015 & of course March
I guess it is a Postgres mistake, according to http://www.postgresql.org/docs/7.2/static/timezones.html
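To double-check the expected values outside the database, here is a small Python sketch of the same conversion. It assumes that IST in the question means Indian Standard Time (Asia/Kolkata, UTC+5:30), which is consistent with the expected month of March. The describe helper is a made-up name, and %B gives English month names only under the default C locale.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Assumption: "IST" in the question means Indian Standard Time.
KOLKATA = ZoneInfo("Asia/Kolkata")


def describe(epoch_seconds: int, tz: ZoneInfo = KOLKATA) -> tuple[str, str]:
    """Return (dd-mm-yyyy, month name) for a Unix timestamp in the given zone."""
    local = datetime.fromtimestamp(epoch_seconds, tz=tz)
    return local.strftime("%d-%m-%Y"), local.strftime("%B")


date_part, month_name = describe(1425148200)
print(date_part, month_name)
```

18:30 UTC on 28 February plus 5:30 is exactly midnight on 1 March, so the date and the month name both roll over, in line with the to_char output above.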

How to store a timestamp in Oracle that occurs within the hour skipped at the start of Daylight Savings

Our Oracle server is running in Australia/Sydney time.
We plan to store some new dates in UTC.
For example, one of the dates we may store is 5 Oct 2014 2:00AM.
However in Sydney we have Daylight Savings start at the same time, which means that times from 2:00AM to 2:59AM do not exist on that day.
For example, on 5 Oct 2014 the times that occur are:
01:58
01:59
03:00
03:01
The trouble is that if I try to store the time 2014-10-05 02:00 in the database, it's silently converted to 2014-10-05 03:00
We don't have the option to change the timezone on the server, so is there any way to store 2014-10-05 02:00 in our database?
Edit for comment from @mrjoltcola
Our server is running with timezone setting (GMT +10) Canberra, Melbourne, Sydney
If I run the command select DBTIMEZONE from dual; the output is the single value +00:00 (this was unexpected).
Our original column was a TIMESTAMP only column and I did not supply any timezone details with the insert.
For further exploration I created a test table as follows:
create table TEST
(
  ID    number,
  TS    timestamp,
  TS_TZ timestamp with time zone
);
I then run the following insert statements (with and without the timezone):
insert into TEST
VALUES
(
  1,
  TIMESTAMP '2014-10-05 02:00:00 UTC',
  TIMESTAMP '2014-10-05 02:00:00 UTC'
);
insert into TEST
VALUES
(
  2,
  TIMESTAMP '2014-10-05 02:00:00',
  TIMESTAMP '2014-10-05 02:00:00'
);
Which produces the result:
+---+---------------------------------+--------------------------------------------------+
| 1 | 05/OCT/14 03:00:00.000000000 AM | 05/OCT/14 02:00:00.000000000 AM UTC |
+---+---------------------------------+--------------------------------------------------+
| 2 | 05/OCT/14 03:00:00.000000000 AM | 05/OCT/14 03:00:00.000000000 AM AUSTRALIA/SYDNEY |
+---+---------------------------------+--------------------------------------------------+
Regarding "The trouble is that if I try to store the time 2014-10-05 02:00 in the database, it's silently converted to 2014-10-05 03:00":
On my database, I can set the session timezone to 'Australia/Sydney' and still store 2014-10-05 02:00:00 in a plain TIMESTAMP column without having it converted. But if I change the column to TIMESTAMP WITH TIME ZONE and try to store the same value, I get ORA-01878.
So I don't think your time was being converted to 03:00:00 because Oracle was adjusting for daylight saving; you are supposed to receive ORA-01878 when you specify a nonexistent timestamp. Instead, I think your client/session timezone differed from your database/server timezone by an hour, and you were seeing Oracle's normal adjustment.
Oracle will convert timestamps when the client/session timezone mismatches the server/database timezone. Since there is no timezone info in a regular TIMESTAMP field, Oracle won't mind storing it.
So your "silently converted to 2014-10-05 03:00" should have been happening for other time values (did you try inserting 01:00 and see if it resulted in an adjustment and/or an ORA-01878 error?).
When I try here by setting my session timezone to Australia/Sydney and insert that time, I get:
SQL> alter session set time_zone = 'Australia/Sydney';
Session altered.
SQL> insert into test(id, ts) values(1, timestamp '2014-10-05 02:00:00');
1 row created.
SQL> insert into test(id, ts_tz) values(2, timestamp '2014-10-05 02:00:00');
insert into test(id, ts_tz) values(2, timestamp '2014-10-05 02:00:00')
*
ERROR at line 1:
ORA-01878: specified field not found in datetime or interval
Now, as to your question, if you want to store UTC time, you should either explicitly specify timestamps with UTC, or set your database or session timezone to UTC. You don't have to have DBA privs to set the session time_zone, and it can differ from the db time_zone.
SQL> alter session set time_zone = '+00:00';
Session altered.
SQL> select dbtimezone, sessiontimezone from dual;
DBTIME SESSIONTIMEZONE
------ ---------------------------------------------------------------------------
+00:00 +00:00
SQL> insert into test values(1, timestamp '2014-10-05 02:00:00');
1 row created.
SQL> select * from test;
ID TS
---------- ---------------------------------------------------------------------------
1 05-OCT-14 02.00.00.000000 AM
However, the above TIMESTAMP field isn't necessarily UTC. It could change if the DB server timezone changed. To Oracle, a plain TIMESTAMP is just a timestamp with no timezone information. If you want UTC time, but may be dealing with clients in different timezones, you can use TIMESTAMP WITH TIME ZONE or TIMESTAMP WITH LOCAL TIME ZONE.
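The difference between a value with no zone and a value explicitly marked as UTC can be illustrated outside Oracle as well. The Python sketch below is only an analogy: a naive datetime plays the role of a plain TIMESTAMP and an aware one the role of TIMESTAMP WITH TIME ZONE. It does not reproduce Oracle's storage or conversion rules.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

SYDNEY = ZoneInfo("Australia/Sydney")

# No zone attached: nothing says whether this is UTC or Sydney time.
naive = datetime(2014, 10, 5, 2, 0)

# Explicitly UTC: an unambiguous instant, even though 02:00 does not
# exist on the Sydney wall clock that day.
as_utc = naive.replace(tzinfo=timezone.utc)

# The same instant as a Sydney client would see it.
in_sydney = as_utc.astimezone(SYDNEY)

print(naive.tzinfo)
print(as_utc.isoformat(sep=" "))
print(in_sydney.strftime("%Y-%m-%d %H:%M %z"))
```

Once the value carries UTC explicitly, 02:00 is a perfectly valid instant, and a Sydney reader sees it as 13:00 at +11:00. The skipped hour only matters when the value is interpreted as Sydney local time.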