I am querying a Hive table that holds JSON payloads and am extracting the timestamp from each payload. The problem is that the timestamps come in different timezone formats, and I'm trying to convert them all to my timezone.
I am currently using the following:
select
from_unixtime(unix_timestamp(get_json_object (table.payload,
'$.timestamp'), "yyyy-MM-dd'T'HH:mm:ss.SSSXXX"))
FROM table
This is returning the correct values if the timestamp is in this format: 2018-08-16T08:54:05.543Z --> 2018-08-16 18:54:05 (changed format and converted into my timezone)
However the query above returns 'null' if the payload contains the timestamp in this format:
2018-09-13T01:35:08.460+0000
2018-09-13T11:35:09+10:00
How can I adjust my query so that it works for all of these timestamp formats, converts them to the proper timezone (+10 is my timezone!), and returns them all in the same format?
Thanks in advance!
How about the following macro:
create temporary macro extract_ts(ts string)
from_unixtime(unix_timestamp(regexp_extract(ts, '(.*)\\+(.*)', 1), "yyyy-MM-dd'T'HH:mm:ss") + 3600*cast(regexp_extract(ts, '(.*)\\+(.*)\\:(.*)', 2) as int));
e.g.,
hive> select extract_ts('2018-09-13T11:35:09+10:00');
OK
2018-09-13 21:35:09
Without a regexp, use Z for +1000 or XXX for +10:00:
select unix_timestamp('2016-07-30T10:29:33.000+03:00', "yyyy-MM-dd'T'HH:mm:ss.SSSXXX") as t1;
select unix_timestamp('2016-07-30T10:29:33.000+0300', "yyyy-MM-dd'T'HH:mm:ss.SSSZ") as t2;
Full docs about time formats:
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html
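For comparison, here is a minimal Python sketch of the same normalization done outside Hive (for example in a preprocessing step). It is not Hive code: the function name normalize_ts and the fixed +10:00 target offset are assumptions for illustration. It relies on Python 3.7+, where %z accepts Z, +0000 and +10:00 alike, and it tries a pattern with fractional seconds before one without:

```python
from datetime import datetime, timedelta, timezone

# Assumption: the desired output zone is a fixed +10:00 offset (no DST handling).
TARGET_TZ = timezone(timedelta(hours=10))

# Tried in order: with fractional seconds first, then without.
FORMATS = ("%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%dT%H:%M:%S%z")


def normalize_ts(raw):
    """Return raw as 'YYYY-MM-DD HH:MM:SS' in TARGET_TZ, or None if unparseable."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(raw, fmt)
        except ValueError:
            continue  # this pattern did not match, try the next one
        return parsed.astimezone(TARGET_TZ).strftime("%Y-%m-%d %H:%M:%S")
    return None


for sample in ("2018-08-16T08:54:05.543Z",
               "2018-09-13T01:35:08.460+0000",
               "2018-09-13T11:35:09+10:00"):
    print(sample, "->", normalize_ts(sample))
```

Returning None for unmatched input mirrors the null that the Hive query produces when the pattern does not fit.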
I am using an SQL Script to parse a json into a table using dbt. One of the cols had this date value: '2022-02-09T20:28:59+0000'. What would be the correct way to define iso date's data type in Snowflake?
Currently, I just used the date type like this in my dbt sql script:
JSON_DATA:"situation_date"::date AS MY_DATE
but clearly, date isn't the correct one, because later when I test it using select *, I get this error:
SQL Error [100040] [22007]: Date '2022-02-09T20:28:59+0000' is not recognized
so I need to know which Snowflake date data type or datetime type suits the best with this one
Correctly pulling the date from JSON is not so clear-cut:
SELECT
'{"date":"2022-02-09T20:28:59+0000"}' as json_str
,parse_json(json_str) as json
,json:date as data_from_json
,TRY_TO_TIMESTAMP_NTZ(data_from_json, 'YYYY-MM-DDTHH:MI:SS+0000') as date_1
,TRY_TO_TIMESTAMP_NTZ(substr(data_from_json,1,19), 'YYYY-MM-DDTHH:MI:SS') as date_2
;
gives the error:
Function TRY_CAST cannot be used with arguments of types VARIANT and TIMESTAMP_NTZ(9)
This is because data_from_json is of type VARIANT, while the TO_DATE/TO_TIMESTAMP functions expect TEXT, so we need to cast to that:
SELECT
'{"date":"2022-02-09T20:28:59+0000"}' as json_str
,parse_json(json_str) as json
,json:date as data_from_json
,TRY_TO_TIMESTAMP_NTZ(data_from_json::text, 'YYYY-MM-DDTHH:MI:SS+0000') as date_1
,TRY_TO_TIMESTAMP_NTZ(substr(data_from_json::text,1,19), 'YYYY-MM-DDTHH:MI:SS') as date_2
;
If all your timezones are always +0000 you can just put that in the parse format (like example date_1), OR you can truncate that part off (like example date_2)
gives:
JSON_STR | JSON | DATA_FROM_JSON | DATE_1 | DATE_2
{"date":"2022-02-09T20:28:59+0000"} | { "date": "2022-02-09T20:28:59+0000" } | "2022-02-09T20:28:59+0000" | 2022-02-09 20:28:59.000 | 2022-02-09 20:28:59.000
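The "always +0000" condition matters because truncating discards the offset. A small Python sketch (not Snowflake code) of the difference follows; the helper names and the +0530 sample value are hypothetical and chosen only to show what goes wrong when the offset is not zero:

```python
from datetime import datetime, timezone


def parse_keep_offset(raw):
    """Parse the whole string, offset included, into an aware datetime."""
    return datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S%z")


def parse_truncated(raw):
    """Parse only the first 19 characters, silently dropping the offset."""
    return datetime.strptime(raw[:19], "%Y-%m-%dT%H:%M:%S")


def to_utc_naive(aware):
    """Convert an aware datetime to UTC and strip tzinfo for comparison."""
    return aware.astimezone(timezone.utc).replace(tzinfo=None)


utc_sample = "2022-02-09T20:28:59+0000"
other_sample = "2022-02-09T20:28:59+0530"  # hypothetical non-UTC value

# With +0000 the two approaches agree.
print(to_utc_naive(parse_keep_offset(utc_sample)) == parse_truncated(utc_sample))
# With any other offset, truncation keeps the local wall-clock time instead.
print(to_utc_naive(parse_keep_offset(other_sample)), parse_truncated(other_sample))
```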
Using TRY_TO_TIMESTAMP:
SELECT TRY_TO_TIMESTAMP(JSON_DATA:"situation_date", 'format_here')
FROM tab;
so I need to know which Snowflake date data type or datetime type suits the best with this one
TIMESTAMP_INPUT_FORMAT
The input format can be set at the ACCOUNT, USER, or SESSION level.
AUTO Detection of Integer-stored Date, Time, and Timestamp Values
Avoid using AUTO format if there is any chance for ambiguous results. Instead, specify an explicit format string by:
Setting TIMESTAMP_INPUT_FORMAT and other session parameters for dates, timestamps, and times.
I think ::TIMESTAMP should work for this, i.e. JSON_DATA:"situation_date"::TIMESTAMP. If you only need the date afterwards, you could then cast with ::DATE or use to_date().
After some testing, it seems to me you have 2 options.
Either you can get rid of the +0000 at the end:
left(column_date, len(column_date)-5)::timestamp
or use the function try_to_timestamp with format:
try_to_timestamp('2022-02-09T20:28:59+0000','YYYY-MM-DD"T"HH24:MI:SS+TZHTZM')
TZH and TZM are the time zone offset hours and minutes, respectively.
I have a timestamp from the source that has been loaded to BQ as a string. I'd like to write a query in BigQuery that will return timestamp in the following format 2020-01-06 11:09:14.000-0600. Here is the current format of the string field: 2020-01-06T11:09:14.000-0600, 2018-10-01T15:45:59.000-0500, etc.
I have tried the following:
SELECT parse_timestamp ("%Y-%m-%dT%H:%M:%S.%E3S", start_timestamp, "America/Chicago"), FROM bqtable
The goal is to perform arithmetic on the timestamp fields.
Any feedback is appreciated. Thank you.
I think the %S and %E3S are conflicting, as they both parse the seconds part of the string. The format also needs %z to consume the trailing UTC offset.
Try this:
with data as (
select '2020-01-06T11:09:14.000-0600' as ts_string union all select '2018-10-01T15:45:59.000-0500'
)
select ts_string, parse_timestamp ("%Y-%m-%dT%H:%M:%E3S%z", ts_string, "America/Chicago") as ts
from data
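Since the goal is arithmetic on the timestamps, here is a rough Python sketch (not BigQuery code) of what becomes possible once the strings are parsed with their offsets. The names FMT and parse_ts are made up for illustration; in Python, %S.%f plays the role that %E3S plays in BigQuery:

```python
from datetime import datetime, timezone

# Seconds appear once (%S), followed by the fraction (%f) and the offset (%z).
FMT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_ts(raw):
    """Parse a string such as 2020-01-06T11:09:14.000-0600 into an aware datetime."""
    return datetime.strptime(raw, FMT)


later = parse_ts("2020-01-06T11:09:14.000-0600")
earlier = parse_ts("2018-10-01T15:45:59.000-0500")

# Aware datetimes compare and subtract correctly even with different offsets.
elapsed = later - earlier
print(later.astimezone(timezone.utc))
print(elapsed)
```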
I have a Hive table that contains daily records. I want to select the records from weekdays, so I use the Hive query below. I'm using the Qubole API to do this.
SELECT hour(pickup_time),
COUNT(passengerid)
FROM home_pickup
WHERE CAST(date_format(pickup_time, 'u') as INT) NOT IN (6,7)
GROUP BY hour(pickup_time)
However, when I run this code, it fails with the error below.
SemanticException [Error 10011]: Line 4:12 Invalid function 'date_format'
Doesn't Qubole support the date_format function? Is there any other way to select weekdays?
Use unix_timestamp(string date, string pattern) to convert given date format to seconds passed from 1970-01-01. Then use from_unixtime() to convert to given format:
Demo:
hive> select cast(from_unixtime(unix_timestamp('2017-08-21 10:55:00'),'u') as int);
OK
1
You can specify date pattern for unix_timestamp for non-standard format.
See docs here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions
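To make the numbering concrete: the 'u' pattern gives the day number of the week with 1 = Monday and 7 = Sunday, so NOT IN (6,7) keeps Monday to Friday. Python's isoweekday() uses the same numbering, so the filter can be sketched like this (the function name is_weekday is hypothetical, and this is only an illustration of the logic, not Hive code):

```python
from datetime import datetime


def is_weekday(ts):
    """True for Monday..Friday, using the same 1=Monday..7=Sunday numbering as 'u'."""
    parsed = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    return parsed.isoweekday() not in (6, 7)


print(datetime(2017, 8, 21, 10, 55).isoweekday())  # the date used in the demo above
print(is_weekday("2017-08-21 10:55:00"))
print(is_weekday("2017-08-26 10:55:00"))
```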
My table in Hive has a date field in the format '2016/06/01', but I find that it is not consistent with the format '2016-06-01'.
They cannot be compared, for instance.
Both of them are strings.
So I want to know how to make them consistent so that I can compare them. In other words, how do I change '2016/06/01' to '2016-06-01' so that they can be compared?
Many thanks.
To convert a date string from one format to another, you need two Hive date functions:
unix_timestamp(string date, string pattern) converts a time string with the given pattern to a Unix timestamp (in seconds), returning 0 on failure.
from_unixtime(bigint unixtime[, string format]) converts the number of seconds from the Unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone.
Using the two functions above, you can achieve your desired result.
The final query is
select from_unixtime(unix_timestamp('2016/06/01','yyyy/MM/dd'),'yyyy-MM-dd') from table1;
where table1 is the table name present in my hive database.
I hope this helps!
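The parse-then-format round trip is the same idea in any language. As a rough illustration only (Python rather than Hive, with a made-up function name slash_to_dash):

```python
from datetime import datetime


def slash_to_dash(value):
    """Parse 'yyyy/MM/dd' text and re-emit it as 'yyyy-MM-dd' text."""
    return datetime.strptime(value, "%Y/%m/%d").strftime("%Y-%m-%d")


converted = slash_to_dash("2016/06/01")
print(converted)
# Once both sides use the same zero-padded layout, plain string comparison works.
print(converted == "2016-06-01")
```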
Let's say you have a column 'birth_day' in your table which is in your format,
you should use the following query to convert birth_day into the required format.
date_Format(birth_day, 'yyyy-MM-dd')
You can use it in a query in the following way
select * from yourtable
where
date_Format(birth_day, 'yyyy-MM-dd') = '2019-04-16';
Use:
unix_timestamp(DATE_COLUMN, string pattern)
The above converts the date to a Unix timestamp, which you can then format as you want using SimpleDateFormat patterns.
Date Function
cast(to_date(from_unixtime(unix_timestamp(yourdate , 'MM-dd-yyyy'))) as date)
here is my solution (for string to real Date type):
select to_date(replace('2000/01/01', '/', '-')) as dt ;
ps: to_date() returns the Date type as of Hive 2.1; before 2.1, it returns a String.
ps2: Hive's to_date(), date_format(), and even cast() cannot recognize the 'yyyy/MM/dd' or 'yyyyMMdd' format, which I think is so sad, and it makes me a little crazy.
Here is my 1 line of data (for brevity):
73831 12/26/2014 1:00:00 AM 0.3220
The 2nd column is the time column which is in string format. I'm using this hive query:
select col2, UNIX_TIMESTAMP(col2,'MM/DD/YYYY hh:mm:ss aaa') from Table
Here is what I get: 1388296800
However, when I check with, http://www.epochconverter.com/ and also from_unixtime(1388296800), I get a different date.
Is there something wrong with my format / pattern string I enter into UNIX_TIMESTAMP in Hive?
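Your date format symbols need to conform to those in the Java SimpleDateFormat documentation. The pattern letters are case-sensitive: DD is the day of the year (dd is the day of the month), and YYYY is the week year (yyyy is the calendar year), which is why the result is off.
For your date it looks like you want MM/dd/yyyy hh:mm:ss aa. Note the lowercase hh: because the value carries an AM/PM marker, the hour is on a 12-hour clock, whereas HH means the 24-hour hour of day.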
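As a sanity check outside Hive, the value from the question can be decoded and the intended parse reproduced with a short Python sketch. The helper names are hypothetical, and Python's directives (%m, %d, %Y, %I, %p) are its own notation, not SimpleDateFormat letters:

```python
from datetime import datetime, timezone


def decode_epoch_utc(seconds):
    """Show which UTC instant a Unix timestamp actually represents."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def parse_us_12h(raw):
    """Parse month/day/year with a 12-hour clock and an AM/PM marker."""
    return datetime.strptime(raw, "%m/%d/%Y %I:%M:%S %p")


# The number returned by the original query lands in late December 2013,
# nowhere near the intended 26 December 2014.
print(decode_epoch_utc(1388296800))
print(parse_us_12h("12/26/2014 1:00:00 AM"))
```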