Hive from_unixtime for milliseconds - hive

We have an epoch timestamp column (BIGINT) stored in Hive.
We want to get the date as 'yyyy-MM-dd' for this epoch.
The problem is that my epoch is in milliseconds, e.g. 1409535303522.
So select timestamp, from_unixtime(timestamp,'yyyy-MM-dd') gives wrong dates, because from_unixtime expects the epoch in seconds.
So I tried dividing it by 1000. But then the value is converted to DOUBLE and we cannot apply the function to it. Even CAST did not work when I tried to convert this double to BIGINT.

Solved it with the following query:
select timestamp, from_unixtime(CAST(timestamp/1000 as BIGINT), 'yyyy-MM-dd') from Hadoop_V1_Main_text_archieved limit 10;
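The arithmetic behind that query can be checked outside Hive. The following is a minimal Python sketch, not Hive code: the function name epoch_ms_to_date is hypothetical, and it assumes UTC, whereas Hive's from_unixtime formats in the session time zone.

```python
from datetime import datetime, timezone


def epoch_ms_to_date(epoch_ms: int) -> str:
    """Format an epoch value given in milliseconds as 'yyyy-MM-dd' (UTC assumed)."""
    # Integer division drops the milliseconds, which mirrors
    # CAST(timestamp/1000 AS BIGINT) for non-negative values.
    epoch_s = epoch_ms // 1000
    return datetime.fromtimestamp(epoch_s, tz=timezone.utc).strftime("%Y-%m-%d")


print(epoch_ms_to_date(1409535303522))  # the sample value from the question
```

Passing the raw millisecond value instead of the divided one would be read as seconds and land tens of thousands of years in the future, which is why the undivided query returns wrong dates.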

The type should be double to ensure precision is not lost:
select from_unixtime(cast(1601256179170 as double)/1000.0, "yyyy-MM-dd HH:mm:ss.SSS") as event_timestamp

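Here timestamp_ms is Unix time in milliseconds. Note that dividing and flooring discards the millisecond part, so the .SSS field in the output is always 000: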
SELECT from_unixtime(floor(CAST(timestamp_ms AS BIGINT)/1000), 'yyyy-MM-dd HH:mm:ss.SSS') as created_timestamp FROM table_name;
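If the millisecond part matters, one option is to split the value into whole seconds and a remainder and format the two pieces separately. This is a Python sketch of that idea under the assumption of UTC; epoch_ms_to_string is a hypothetical helper, not a Hive function.

```python
from datetime import datetime, timezone


def epoch_ms_to_string(epoch_ms: int) -> str:
    """Format epoch milliseconds as 'yyyy-MM-dd HH:mm:ss.SSS' (UTC assumed)."""
    # divmod keeps both parts as exact integers, so no precision is lost
    # to floating-point division.
    seconds, millis = divmod(epoch_ms, 1000)
    whole = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return whole.strftime("%Y-%m-%d %H:%M:%S") + f".{millis:03d}"


print(epoch_ms_to_string(1601256179170))
```

The same split can be expressed in SQL with integer division and a modulo, then concatenating the zero-padded remainder onto the formatted seconds.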

The original answer returns a string. If you need a DATE, wrap the result in an extra cast to date:
select
timestamp,
cast(from_unixtime(CAST(timestamp/1000 as BIGINT), 'yyyy-MM-dd') as date) as date_col
from Hadoop_V1_Main_text_archieved
limit 10;
From the docs on casting dates and timestamps, for converting a string to a date:
cast(string as date)
If the string is in the form 'YYYY-MM-DD', then a date value corresponding to that year/month/day is returned. If the string value does not match this format, then NULL is returned.
The DATE type is only available starting with Hive 0.12.0, as mentioned here:
DATE (Note: Only available starting with Hive 0.12.0)

Related

What is the alternative of TRUNC(DATE) in Hive?

I have an Oracle SQL query that uses TRUNC(04-Aug-2017 15:35:32).
What is the equivalent in Hive to replace TRUNC?
Assuming you have a date/time, you can use the to_date() function:
select to_date(col)
If you have a timestamp, say ts, you can use trunc():
trunc(ts, 'day')
This returns a timestamp, with the time portion stripped off - which is similar to what trunc() does in Oracle when given one argument only.
On the other hand, you can also convert the timestamp to a date:
to_date(ts)
This returns a date rather than a timestamp: a different datatype that has no time component (Oracle has no such datatype: both date and timestamp store the date and time).
As per the Oracle docs, the TRUNC (date) function returns date with the time portion of the day truncated to the unit specified by the format model fmt. The value returned is always of datatype DATE, even if you specify a different datetime datatype for date. If you omit fmt, then date is truncated to the nearest day.
The to_date function in Hive behaves similarly.
It returns the date part of a timestamp string (pre-Hive 2.1.0): to_date("1970-01-01 00:00:00") = "1970-01-01".
If what you want is a timestamp at midnight (00:00:00) for the truncated date, you need some conversions, as shown below:
cast(from_unixtime(unix_timestamp(to_date(<YOUR_DATE_COL>), 'yyyy-MM-dd')) as timestamp)
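The two outcomes discussed above (a date with no time part versus a timestamp reset to midnight) can be illustrated with a short Python sketch. It only models the semantics and is not Hive or Oracle code; the function names are hypothetical.

```python
from datetime import date, datetime, time


def to_date_part(ts: datetime) -> date:
    """Drop the time component entirely, like to_date(ts)."""
    return ts.date()


def trunc_to_midnight(ts: datetime) -> datetime:
    """Keep a timestamp but reset the time to 00:00:00, like a one-argument TRUNC in Oracle."""
    return datetime.combine(ts.date(), time.min)


ts = datetime(2017, 8, 4, 15, 35, 32)  # 04-Aug-2017 15:35:32 from the question
print(to_date_part(ts))       # a date value
print(trunc_to_midnight(ts))  # a timestamp at midnight
```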

How to convert BIGINT to DATE in Redshift?

I am trying to figure out how to convert and format a BIGINT field (e.g. 20200301) to a DATE type field using Redshift SQL. I got the snippet below to work, but I believe it returns a string, and I need a valid date returned in 'YYYY-MM-DD' format. I've tried several other versions unsuccessfully. Thank you in advance.
to_char(to_date(date_column::text, 'yyyymmdd'), 'yyyy-mm-dd')
You just want the to_date() part:
select to_date(date_column::text, 'YYYYMMDD')
When the BIGINT holds an epoch timestamp in milliseconds rather than a yyyymmdd number, use the code below to convert it to the correct value:
select trunc(TIMESTAMP 'epoch' + date_column / 1000 * INTERVAL '1 second')
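The two cases differ only in how the integer is interpreted. As a rough Python sketch (hypothetical function names, UTC assumed for the epoch case), the first parses the digits as a calendar date and the second treats the number as milliseconds since 1970-01-01:

```python
from datetime import date, datetime, timezone


def yyyymmdd_to_date(value: int) -> date:
    """Interpret an integer such as 20200301 as the calendar date 2020-03-01."""
    return datetime.strptime(str(value), "%Y%m%d").date()


def epoch_ms_to_utc_date(epoch_ms: int) -> date:
    """Interpret an integer as milliseconds since the Unix epoch (UTC assumed)."""
    return datetime.fromtimestamp(epoch_ms // 1000, tz=timezone.utc).date()


print(yyyymmdd_to_date(20200301))
print(epoch_ms_to_utc_date(1583020800000))
```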

Difference between unix_timestamp and casting to timestamp

I have a Hive table with two numeric string fields (T1 and T2) that I need to convert to the timestamp format "YYYY-MM-DD hh:mm:ss.SSS" and then find the difference between the two.
I have tried two methods:
Method 1: Through CAST
Select CAST(regexp_replace(substring(t1, 1,17),'(\\d{4})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{3})','$1-$2-$3 $4:$5:$6.$7') as timestamp), CAST(regexp_replace(substring(t2, 1,17),'(\\d{4})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{3})','$1-$2-$3 $4:$5:$6.$7') as timestamp), CAST(regexp_replace(substring(t1, 1,17),'(\\d{4})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{3})','$1-$2-$3 $4:$5:$6.$7') as timestamp) - CAST(regexp_replace(substring(t2, 1,17),'(\\d{4})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{3})','$1-$2-$3 $4:$5:$6.$7') as timestamp) as time_diff
from tab1
And getting output as
Method 2: Through unix_timestamp
Select from_unixtime (unix_timestamp(substring(t1,1,17),'yyyyMMddhhmmssSSS'),'yyyy-MM-dd hh:mm:ss.SSS'), from_unixtime (unix_timestamp(substring(t2,1,17),'yyyyMMddhhmmssSSS'),'yyyy-MM-dd hh:mm:ss.SSS'), from_unixtime (unix_timestamp(substring(t1,1,17),'yyyyMMddhhmmssSSS'),'yyyy-MM-dd hh:mm:ss.SSS') - from_unixtime (unix_timestamp(substring(t2,1,17),'yyyyMMddhhmmssSSS'),'yyyy-MM-dd hh:mm:ss.SSS') as time_diff
from tab1;
And getting output as
I don't understand why the outputs differ.
unix_timestamp() gives you epoch time, i.e. the time in seconds since the Unix epoch 1970-01-01 00:00:00.
The timestamp, on the other hand, provides a date and time, viz. YYYY-MM-DD HH:MI:SS.
Hence an accurate way would be to convert the string timestamps with unix_timestamp(), subtract, and then convert back using from_unixtime(),
e.g.
select from_unixtime(unix_timestamp('2020-04-12 01:30:02.000') - unix_timestamp('2020-04-12 01:29:43.000'))
Method 2 finally equates to something like this
select ('2020-04-12 01:30:02.000' - '2020-04-12 01:29:43.000') as time_diff;
You cannot subtract dates like this; you have to use datediff.
In Hive, datediff returns a number of days, so it is non-zero only when the two values fall on different days; otherwise you get zero.
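To make the distinction concrete, here is a small Python sketch (not Hive code; the helper names are hypothetical) that computes both a difference in seconds, as the unix_timestamp() approach does, and a difference in whole calendar days, as a day-based date difference does, for the two sample timestamps above.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"


def diff_seconds(later: str, earlier: str) -> float:
    """Elapsed seconds between two 'YYYY-MM-DD HH:MM:SS.fff' strings."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return delta.total_seconds()


def diff_days(later: str, earlier: str) -> int:
    """Difference in calendar days, ignoring the time of day."""
    d1 = datetime.strptime(later, FMT).date()
    d2 = datetime.strptime(earlier, FMT).date()
    return (d1 - d2).days


print(diff_seconds("2020-04-12 01:30:02.000", "2020-04-12 01:29:43.000"))
print(diff_days("2020-04-12 01:30:02.000", "2020-04-12 01:29:43.000"))
```

The two timestamps are 19 seconds apart but fall on the same day, so a day-based difference is zero while the second-based difference is not.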

how to convert a timestamp to int in sql (vertica)

I have a timestamp such as 2017-07-19 11:45:01 and I want to convert it to an int.
Query:
select cast(max(event_timestamp) as INT) from error_messages where error_level='ERROR' and user_name='git'
Error:
SQL Error [2366] [42846]: [Vertica][VJDBC](2366) ERROR: Cannot cast type timestamptz to int
[Vertica][VJDBC](2366) ERROR: Cannot cast type timestamptz to int
com.vertica.util.ServerException: [Vertica][VJDBC](2366) ERROR: Cannot cast type timestamptz to int
You have to use TIMESTAMPDIFF() this way:
SELECT TIMESTAMPDIFF(SECOND,'001-01-01 00:00:00', '2015-02-23 03:12:35');
timestampdiff
---------------
63560257955
This gives the number of time units you want (SECONDs in the example above) since the reference timestamp you choose.
If you want the Unix timestamp of that date as an int, then search for that.
One option would be to calculate the range from '1970-01-01' to your date in seconds as an int. This is the Unix timestamp.
Use the JULIAN_DAY function in Vertica to convert the timestamp to an integer value or number.
For more details refer to the Vertica documentation: https://my.vertica.com/docs/6.1.x/HTML/index.htm#16070.htm
To extract the number of seconds since the epoch from a date-time:
SELECT EXTRACT(EPOCH FROM TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-08');
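Both suggestions reduce to counting seconds from a reference point. A minimal Python sketch of that arithmetic follows; the function names are hypothetical, the naive timestamp is assumed to be UTC, and the TIMESTAMPDIFF literal '001-01-01' is read as year 1.

```python
from datetime import datetime, timezone


def to_epoch_seconds(ts: datetime) -> int:
    """Seconds since 1970-01-01 00:00:00, treating the naive timestamp as UTC."""
    return int(ts.replace(tzinfo=timezone.utc).timestamp())


def seconds_between(start: datetime, end: datetime) -> int:
    """Seconds from start to end, like TIMESTAMPDIFF(SECOND, start, end)."""
    return int((end - start).total_seconds())


print(to_epoch_seconds(datetime(2017, 7, 19, 11, 45, 1)))
print(seconds_between(datetime(1, 1, 1), datetime(2015, 2, 23, 3, 12, 35)))
```

The second call reproduces the 63560257955 figure shown in the TIMESTAMPDIFF output above, which is a useful sanity check that the reference date there is year 1 rather than 1970.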

SQLite Current Timestamp with Milliseconds?

I am storing a timestamp field in a SQLite3 column as TIMESTAMP DATETIME DEFAULT CURRENT_TIMESTAMP and I was wondering whether there is any way for it to include milliseconds in the timestamp as well.
Instead of CURRENT_TIMESTAMP, use (STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW')) so that your column definition becomes:
TIMESTAMP DATETIME DEFAULT(STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW'))
For example:
CREATE TABLE IF NOT EXISTS event
(when_ts DATETIME DEFAULT(STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW')));
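The default can be tried out with Python's built-in sqlite3 module against an in-memory database. This is a quick sketch: the id column and the function name are additions for illustration only and are not part of the answer above.

```python
import sqlite3


def insert_and_read_default() -> str:
    """Create the table, insert a row using the default, and return when_ts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS event ("
        " id INTEGER PRIMARY KEY,"  # extra column, only for this illustration
        " when_ts DATETIME DEFAULT (STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW')))"
    )
    # DEFAULT VALUES lets SQLite fill when_ts from the column default.
    conn.execute("INSERT INTO event DEFAULT VALUES")
    value = conn.execute("SELECT when_ts FROM event").fetchone()[0]
    conn.close()
    return value


print(insert_and_read_default())  # a UTC timestamp with a .SSS fraction
```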
To get number of milliseconds since epoch you can use julianday() with some additional calculations:
-- Julian time to Epoch MS
SELECT CAST((julianday('now') - 2440587.5)*86400000 AS INTEGER);
The following method doesn't require any multiplies or divides and should always produce the correct result, as multiple calls to get 'now' in a single query should always return the same result:
SELECT strftime('%s','now') || substr(strftime('%f','now'),4);
This generates the number of seconds and concatenates it with the milliseconds part of the current second.
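The two epoch-millisecond expressions can be compared side by side with Python's sqlite3 module. This sketch runs both in one statement and returns them as integers; the function name is hypothetical, and the extra CAST around the concatenated string is added only so the results are comparable.

```python
import sqlite3


def sqlite_epoch_ms():
    """Return (julianday-based ms, strftime-based ms) from a single SELECT."""
    conn = sqlite3.connect(":memory:")
    row = conn.execute(
        "SELECT"
        " CAST((julianday('now') - 2440587.5) * 86400000 AS INTEGER),"
        " CAST(strftime('%s','now') || substr(strftime('%f','now'), 4) AS INTEGER)"
    ).fetchone()
    conn.close()
    return row


julian_ms, concat_ms = sqlite_epoch_ms()
print(julian_ms, concat_ms)
```

The julianday variant goes through floating-point arithmetic, so the two values may differ by about a millisecond; the string-based variant avoids that rounding.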
Here's a query that will generate a timestamp as a string with milliseconds:
select strftime("%Y-%m-%d %H:%M:%f", "now");
If you're really bent on using a numeric representation, you could use:
select julianday("now");
The accepted answer only gives you UTC. If you need a local time instead of UTC, use this:
strftime('%Y-%m-%d %H:%M:%f', 'now', 'localtime')