I used Dataiku to transfer 40 GB of data from Snowflake to BigQuery. Somehow the timestamp values changed my dates completely.
Instead of 2021-06-30 00:00:00 UTC, the copied timestamp value is 2021-06-29 22:00:00 UTC.
I am looking for a BigQuery-side solution to cast this into the correct timestamp, as loading the data again is not possible.
Can someone help, please? Thanks.
Found the solution:
SELECT TIMESTAMP_ADD(Day_Sts, INTERVAL 120 MINUTE) AS Day_Sts, * EXCEPT (Day_Sts)
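A minimal sketch of applying that fix in place (the dataset and table names here are placeholders, not from the question):
-- Hypothetical names: my_dataset.my_table stands in for the copied table.
-- Rewrites the table with the two-hour shift corrected on Day_Sts.
CREATE OR REPLACE TABLE my_dataset.my_table AS
SELECT
  TIMESTAMP_ADD(Day_Sts, INTERVAL 120 MINUTE) AS Day_Sts,
  * EXCEPT (Day_Sts)
FROM my_dataset.my_table;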
I am trying to truncate UTC time in bigquery. I'm trying to remove the millisecond off my timestamp. I don't want to round down, just remove the millisecond.
From this: 2019-11-11 19:10:57.181 UTC
To this: 2019-11-11 19:10:57 UTC
I've tried truncate and date but can't seem to make it work.
Try below (BigQuery Standard SQL):
SELECT TIMESTAMP_TRUNC(TIMESTAMP '2019-11-11 19:10:57.181 UTC', SECOND)
This will produce the timestamp 2019-11-11 19:10:57 UTC.
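Applied to a column rather than a literal, a sketch might look like this (the table and column names are placeholders, not from the question):
-- Hypothetical names: event_ts in my_dataset.my_table stands in for the asker's timestamp column.
SELECT TIMESTAMP_TRUNC(event_ts, SECOND) AS event_ts_no_ms
FROM my_dataset.my_table;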
I am trying to calculate the time difference between two columns of a row which are of string data type. If the time difference between them is less than 2 hours then select the first column of that row else if the time difference is greater than 2 hours then select the second column of that row. It can be done by converting the columns to datetime format, but I want the result to be in string only. How can I do that? The data looks like this:
col1 (string type)     col2 (string type)
2018-07-16 02:23:00    2018-07-16 02:36:00
2018-07-26 12:26:00    2018-07-26 14:29:00
2018-07-26 15:32:00    2018-07-27 15:38:00
I think you don't need to convert the columns to datetime format, since the data in your case is already ordered (yyyy-MM-dd hh:mm:ss). You just need to strip out everything except the digits so each value becomes one string (yyyyMMddhhmmss); then you can check whether the difference is bigger or smaller than 2 hours (here 20000, since the hour digits are followed by mmss). Looking at your example (assuming col2 > col1), this query would work:
SELECT CASE
         WHEN regexp_replace(col2, '[^0-9]', '') - regexp_replace(col1, '[^0-9]', '') < 20000 THEN col1
         ELSE col2
       END AS col3
FROM your_table;
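For instance, on the second sample row the digit strings are 20180726142900 and 20180726122600; the difference is 20300, which is above the 20000 threshold, so col2 is returned.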
Use unix_timestamp() to convert string timestamp to seconds.
The difference in hours will be:
hive> select (unix_timestamp('2018-07-16 02:23:00')- unix_timestamp('2018-07-16 02:36:00'))/60/60;
OK
-0.21666666666666667
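Putting it together, a sketch of the full selection using this approach (your_table is a placeholder; the difference is in seconds, so 2 hours = 7200):
-- Hypothetical table name; assumes col2 >= col1 and both are 'yyyy-MM-dd HH:mm:ss' strings.
SELECT CASE
         WHEN (unix_timestamp(col2, 'yyyy-MM-dd HH:mm:ss')
             - unix_timestamp(col1, 'yyyy-MM-dd HH:mm:ss')) < 7200
         THEN col1
         ELSE col2
       END AS col3
FROM your_table;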
Important update: this method will work correctly only if the time zone is configured as UTC, because in time zones that observe DST, Hive adjusts the time during timestamp operations in some edge cases. Consider this example for the PDT time zone:
hive> select hour('2018-03-11 02:00:00');
OK
3
Note the hour is 3, not 2. This is because 2018-03-11 02:00:00 cannot exist in the PDT time zone: exactly at 2018-03-11 02:00:00 the clock is adjusted forward to 2018-03-11 03:00:00.
The same happens when converting with unix_timestamp. In the PDT time zone, unix_timestamp('2018-03-11 03:00:00') and unix_timestamp('2018-03-11 02:00:00') return the same value:
hive> select unix_timestamp('2018-03-11 03:00:00');
OK
1520762400
hive> select unix_timestamp('2018-03-11 02:00:00');
OK
1520762400
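A quick way to check whether your session is affected: under UTC the difference below is 3600 seconds, while in a DST zone such as America/Los_Angeles it comes out as 0 for this particular pair.
hive> select unix_timestamp('2018-03-11 03:00:00') - unix_timestamp('2018-03-11 02:00:00');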
And a few links for your reference:
https://community.hortonworks.com/questions/82511/change-default-timezone-for-hive.html
http://boristyukin.com/watch-out-for-timezones-with-sqoop-hive-impala-and-spark-2/
Also, please have a look at this Jira: Hive should carry out timestamp computations in UTC.
I am running below query:
select a.event_date,
date_format(date_trunc('month', a.event_date), '%m/%d/%Y') as date
from monthly_test_table a
order by 1;
Output:
2017-09-15 | 09/01/2017
2017-10-01 | 09/30/2017
2017-11-01 | 11/01/2017
Can anyone tell me why, for the date "2017-10-01", it shows the date as "09/30/2017" after using date_trunc?
Thanks in advance!
You are formatting the date components in reverse order, which is why it looks incorrect. Use the code below:
select a.event_date,
date_format(date_trunc('month', a.event_date), '%Y/%m/%d') as date
from monthly_test_table a
order by 1;
You can use date_add with the logic "add (1 - day(yourdate)) days" to replicate the truncation.
For example:
2017-10-01: day('2017-10-01') is 1, so you add 1-1 = 0 days.
2017-08-30: day('2017-08-30') is 30, so you add 1-30 = -29 days.
I faced the same issue recently and resorted to using this logic:
date_add(from_unixtime(unix_timestamp(event_date, 'yyyy-MM-dd'), 'yyyy-MM-dd'),
         1 - day(from_unixtime(unix_timestamp(event_date, 'yyyy-MM-dd'), 'yyyy-MM-dd')))
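For instance, dropped into the original query it might look like this (a sketch only: the table and column names come from the question, the alias month_start is made up, and event_date is assumed to be a 'yyyy-MM-dd' string):
SELECT event_date,
       date_add(from_unixtime(unix_timestamp(event_date, 'yyyy-MM-dd'), 'yyyy-MM-dd'),
                1 - day(from_unixtime(unix_timestamp(event_date, 'yyyy-MM-dd'), 'yyyy-MM-dd'))) AS month_start
FROM monthly_test_table
ORDER BY 1;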
PS: As far as I know, there is no date_trunc function in the Hive documentation.
As per the source code below, the UTC_CHRONOLOGY time is translated with respect to the locale, and the @Description annotation states that truncation happens to the specified precision in the session time zone; also refer to the URL below.
@Description("truncate to the specified precision in the session timezone")
@ScalarFunction("date_trunc")
@LiteralParameters("x")
@SqlType(StandardTypes.DATE)
public static long truncateDate(ConnectorSession session, @SqlType("varchar(x)") Slice unit, @SqlType(StandardTypes.DATE) long date)
{
    long millis = getDateField(UTC_CHRONOLOGY, unit).roundFloor(DAYS.toMillis(date));
    return MILLISECONDS.toDays(millis);
}
See https://prestodb.io/docs/current/release/release-0.66.html:
Time Zones:
This release has full support for time zone rules, which are needed to perform date/time calculations correctly. Typically, the session time zone is used for temporal calculations. This is the time zone of the client computer that submits the query, if available. Otherwise, it is the time zone of the server running the Presto coordinator.
Queries that operate with time zones that follow daylight saving can
produce unexpected results. For example, if we run the following query
to add 24 hours in the America/Los_Angeles time zone:
SELECT date_add('hour', 24, TIMESTAMP '2014-03-08 09:00:00');
Output: 2014-03-09 10:00:00.000
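To see which zone a session is actually using (and hence why the month boundary rolled back a day), a quick check in Presto is:
SELECT current_timezone();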
I want to convert historic dates from UTC+0 to local time in SQL, like:
2012-11-23
2013-01-08
2014-02-23
But we have two different offsets: we use UTC+2 after the last Sunday in March and UTC+3 after the last Sunday in October. I need a solution urgently. Please help me...
Try this:
SELECT CONVERT_TZ('your date', '+your time zone', '+time zone you want');
SELECT CONVERT_TZ('2004-01-01 12:00:00', '+02:00', '+03:00'); -- in your case, from +2 to +3
See this link
If you need this to be dynamic, you'll need the time zone of each historic row (maybe you can store it in a separate column), so I think you can do something like this:
SELECT CONVERT_TZ(your_date_column, 'local_timezone', time_zone_column);
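If the MySQL time zone tables are loaded, a named zone also handles the March/October switch for you. A minimal sketch, assuming a placeholder table your_table and using Europe/Istanbul purely as an example zone:
-- Sketch: your_table / your_date_column are placeholders; 'Europe/Istanbul' is only an example zone name.
-- Requires the mysql time zone tables to be populated (e.g. via mysql_tzinfo_to_sql).
SELECT CONVERT_TZ(your_date_column, 'UTC', 'Europe/Istanbul') AS local_time
FROM your_table;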
I'm doing a SQL query in Oracle 10g where I'm comparing against a cutoff date. So my query has this in it:
THING < TO_DATE('02/14/13','MM/DD/YY')
Now the THING can have a time component in it. I want to know how the cutoff date will interact with it. Does the TO_DATE function have some default implied time component in it? Does the date it creates have a default time of midnight on the specified date, or noon or some other time? Essentially my concern is if I have a column in the table like this:
THING
-------
2/4/13 11:13AM
2/13/13 3:36PM
2/14/13 2:00PM
2/15/13 1:52AM
Will I get 2 rows or 3 rows back?
The implied time is 00:00:00, so in your example you will get two rows back.
You can verify this with:
select to_char(TO_DATE('02/14/13','MM/DD/YY'),'YYYY-MM-DD HH24:MI:SS')
from dual;
You'll get two rows back. The implied time is 0:00:00 (midnight). Your dates with a 24-hour clock look like this:
2/13/13 3:36PM --> 2013-02-13 15:36:00
TO_DATE('02/14/13','MM/DD/YY') --> 2013-02-14 00:00:00
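If the intent had been to include everything on 2/14 as well, one option (a sketch, not part of the original answers; the table name is a placeholder) is to compare against the next day's midnight:
-- Sketch: my_table is hypothetical; this keeps all rows with THING before midnight on 02/15.
SELECT *
FROM my_table
WHERE thing < TO_DATE('02/15/13', 'MM/DD/YY');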