Get current unix_timestamp in Hive - sql

According to the post "How to select current date in Hive SQL", unix_timestamp can be used to get the current date in Hive.
But when I tried
select unix_timestamp();
and just
unix_timestamp();
both failed with the error messages
FAILED: ParseException line 1:23 mismatched input '<EOF>' expecting FROM near ')' in from clause
FAILED: ParseException line 1:0 cannot recognize input near 'unix_timestamp' '(' ')'
respectively.
How can I use unix_timestamp properly in Hive?
UPDATED!
https://issues.apache.org/jira/browse/HIVE-178 has resolved this issue.
If you use Hive 0.13 (released on 21 April 2014) or above, you can run
-- unix_timestamp() is deprecated
select current_timestamp();
select 1+1;
without a from <table> clause.

As Hive doesn't expose a dual table, you may want to create a single-row table and use it for this kind of query.
You'll then be able to execute queries like
select unix_timestamp() from hive_dual;
A workaround is to use any existing table, with a LIMIT 1 or a TABLESAMPLE clause, but, depending on the size of your table, it will be less efficient.
# any_existing_table contains 10 lines
# hive_dual contains 1 line
select unix_timestamp() from any_existing_table LIMIT 1;
# Time taken: 17.492 seconds, Fetched: 1 row(s)
select unix_timestamp() from any_existing_table TABLESAMPLE(1 ROWS);
# Time taken: 15.273 seconds, Fetched: 1 row(s)
select unix_timestamp() from hive_dual ;
# Time taken: 16.144 seconds, Fetched: 1 row(s)
select unix_timestamp() from hive_dual LIMIT 1;
# Time taken: 14.086 seconds, Fetched: 1 row(s)
select unix_timestamp() from hive_dual TABLESAMPLE(1 ROWS);
# Time taken: 16.148 seconds, Fetched: 1 row(s)
Update
There is no longer any need to pass a table name or a LIMIT clause. Hive now supports select unix_timestamp() directly.
More details :
Does Hive have something equivalent to DUAL?
BLOG POST : dual table in hive

To extract the date from a timestamp, use the to_date function.
Try the following:
select to_date(FROM_UNIXTIME(UNIX_TIMESTAMP())) as time from table_name;

Related

Hive Date functions not properly handling the dates

I have a daily job that handles loads based on a date I derive using Hive date functions. It ran fine until two days ago; the issue started on 12/30/2019. The year is shown as 2020 when I use date_format; otherwise it is shown as 2019. See below.
hive> select current_date;
OK
2019-12-31
Time taken: 0.182 seconds, Fetched: 1 row(s)
hive> select date_format(current_date,'dd-MMM-YYYY');
OK
31-Dec-2020
Time taken: 0.429 seconds, Fetched: 1 row(s)
hive> select cast(date_format(date_sub(CURRENT_DATE,1),'YYYYMMdd') AS string);
OK
20201230
Has anyone else faced this issue?
It looks like you have run into a classic mistake.
A common mistake is to use YYYY. yyyy specifies the calendar year
whereas YYYY specifies the year (of “Week of Year”), used in the ISO
year-week calendar. In most cases, yyyy and YYYY yield the same
number, however they may be different. Typically you should use the
calendar year.
Change your code as below (lower case yyyy) to get correct results:
hive> select date_format(current_date,'dd-MMM-yyyy');
OK
31-Dec-2019
select cast(date_format(date_sub(CURRENT_DATE,1),'yyyyMMdd') AS string);
OK
20191230
To reproduce this on another day, replace CURRENT_DATE with '2019-12-31' for testing purposes.
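The gap between the calendar year and the week-based year can be reproduced outside Hive. The sketch below is Python rather than HiveQL, and the function name is made up for illustration. It uses the ISO 8601 week-based year, which is only an approximation of the locale-dependent YYYY pattern; for 30 and 31 December 2019 it gives 2020, matching the output in the question.

```python
from datetime import date

def calendar_and_week_year(d):
    """Return (calendar year, ISO week-based year) for a date."""
    # isocalendar() returns (ISO year, ISO week number, ISO weekday).
    return d.year, d.isocalendar()[0]

for d in (date(2019, 12, 29), date(2019, 12, 30), date(2019, 12, 31)):
    cal_year, week_year = calendar_and_week_year(d)
    print(d.isoformat(), "calendar year:", cal_year, "week-based year:", week_year)
```

The two values differ only for the few days around New Year that belong to a week of the neighbouring year, which is why a job using YYYY can run correctly for most of the year.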

Apache Hive- Time Stamp query

I have two timestamp columns in a Hive database that store timestamps in the following formats:
hive> select last_date from xyz limit 2;
OK
2019-08-21 15:11:23.553
2019-08-21 15:11:23.553
[Above has milliseconds stored in it by default]
hive> select last_modify_date from xyz limit 2;
OK
2018-04-18 23:32:58
2017-09-22 04:02:32
I need a single Hive select query that converts both of the above timestamps to the 'yyyy-MM-dd HH:mm:ss.SSS' format, preserving the millisecond value if it exists, or appending '.000' if it doesn't.
What I have tried so far:
select
last_modify_date,
from_unixtime(unix_timestamp(last_modify_date), "yyyy-MM-dd HH:mm:ss.SSS") as ts
from xyz limit 3;
However, the above query displays '.000' for both timestamp columns, even when the stored value has milliseconds.
Please help.
From the UDF that implements unix_timestamp, you can see that the returned value is in seconds, represented by a LongWritable, so anything smaller than one second is discarded.
You can write your own UDF, or just use pure SQL to achieve that.
One easy way is to use rpad (implemented by GenericUDFRpad):
select rpad(your_date, 23, '.000') from your_table;
Some examples:
hive> select rpad('2018-04-18 23:32:58', 23, '.000');
OK
2018-04-18 23:32:58.000
hive> select rpad('2018-04-18 23:32:58.553', 23, '.000');
OK
2018-04-18 23:32:58.553
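To make the padding behaviour explicit, here is a minimal Python emulation of what rpad does with these arguments. It is a sketch, not Hive's implementation, and it assumes a non-empty pad string. The third sample value is hypothetical: it shows that a fraction shorter than three digits would be padded incorrectly, so this trick relies on values having either no fraction or exactly three digits.

```python
def rpad(value, length, pad):
    """Right-pad value with repeated pad, then cut to exactly length characters."""
    if len(value) >= length:
        return value[:length]
    missing = length - len(value)
    repeats = missing // len(pad) + 1  # enough copies of pad to cover the gap
    return value + (pad * repeats)[:missing]

print(rpad('2018-04-18 23:32:58', 23, '.000'))      # no fraction: '.000' appended
print(rpad('2018-04-18 23:32:58.553', 23, '.000'))  # already 23 characters: unchanged
print(rpad('2019-08-21 15:11:23.5', 23, '.000'))    # hypothetical short fraction: padded with '.0'
```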

conversion from string to timestamp is not working

The data in the table is shown below.
The column jobdate data type is string.
jobdate
1536945012211.kc
1536945014231.kc
1536945312809.kc
I want to convert it to a timestamp in a format like 2018-12-205 06:15:10.505
I have tried the following queries, but they all return NULL.
select jobdate,from_unixtime(unix_timestamp(substr(jobdate,1,14),'YYYY-MM-DD HH:mm:ss.SSS')) from job_log;
select jobdate,from_unixtime(unix_timestamp(jobdate,'YYYY-MM-DD HH:mm:ss.SSS')) from job_log;
select jobdate,cast(date_format(jobdate,'YYYY-MM-DD HH:mm:ss.SSS') as timestamp) from job_log;
Please help me.
Thanks in advance
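The original values are epoch timestamps in milliseconds (13 digits), but from_unixtime expects seconds, so use only the first 10 digits. Note also that DD means day of year (hence 257 in the first result below); use dd for day of month: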
hive> select from_unixtime(cast(substr('1536945012211.kc',1,10) as int),'yyyy-MM-DD HH:mm:ss.SSS');
OK
2018-09-257 10:10:12.000
Time taken: 0.832 seconds, Fetched: 1 row(s)
hive> select from_unixtime(cast(substr('1536945012211.kc',1,10) as int),'yyyy-MM-dd HH:mm:ss.SSS');
OK
2018-09-14 10:10:12.000
Time taken: 0.061 seconds, Fetched: 1 row(s)
hive>
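Keeping only the first 10 digits drops the milliseconds, which is why the results above end in .000. As an illustration of the arithmetic, the Python sketch below (not HiveQL; the function name is hypothetical) splits the 13-digit value into whole seconds and a millisecond remainder. It formats in UTC, whereas the Hive output above reflects the session time zone, so the hour differs.

```python
from datetime import datetime, timezone

def parse_job_timestamp(raw):
    """Convert a value like '1536945012211.kc' to 'yyyy-MM-dd HH:mm:ss.SSS' in UTC."""
    digits = raw.split(".", 1)[0]               # drop the '.kc' suffix
    seconds, millis = divmod(int(digits), 1000)  # epoch milliseconds -> seconds + remainder
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S") + ".%03d" % millis

print(parse_job_timestamp("1536945012211.kc"))
```

In Hive the same split could be done with substr on the first 10 digits for the seconds and the next 3 digits for the fraction, concatenated after formatting.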

Teradata 15: [9134 : HY000] Teradata hour of day must be in range 1-12

I am trying to convert a varchar(19) timestamp field from a flat file into Teradata timestamp, but I got the following error.
select TOP 100
TO_TIMESTAMP (SOURCE_DTTM , 'YYYY-MM-DD HH:MI:SS') AS TS1
FROM "TEST"."CUSTOMER"
WHERE SOURCE_DTTM NOT LIKE '%0000-00-00%';
Executed as Single statement. Failed [9134 : HY000] Teradata hour of day must be in range 1-12
Elapsed time = 00:00:00.078
STATEMENT 1: Select Statement failed.
I am wondering if there is a way to specify the timestamp as a 24 hour format.
Really appreciate it.
I went to info.teradata.com and found the correct syntax. HH is the 12-hour format element; HH24 must be used for 24-hour values:
select TOP 100
TO_TIMESTAMP (SOURCE_DTTM , 'YYYY-MM-DD HH24:MI:SS') AS TS1
FROM "TEST"."CUSTOMER"
WHERE SOURCE_DTTM NOT LIKE '%0000-00-00%';
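The same 12-hour versus 24-hour distinction exists in most date-parsing libraries. As an analogy only, the Python sketch below parses a hypothetical afternoon value (the real contents of SOURCE_DTTM are not shown in the question) with a 12-hour directive (%I) and a 24-hour directive (%H). It does not use Teradata, and the helper name is made up.

```python
from datetime import datetime

SAMPLE = "2019-08-21 15:11:23"  # hypothetical value with an hour above 12

def try_parse(text, pattern):
    """Return the parsed datetime, or None if the text does not fit the pattern."""
    try:
        return datetime.strptime(text, pattern)
    except ValueError:
        return None

twelve_hour = try_parse(SAMPLE, "%Y-%m-%d %I:%M:%S")       # %I accepts hours 01-12 only
twenty_four_hour = try_parse(SAMPLE, "%Y-%m-%d %H:%M:%S")  # %H accepts hours 00-23
print(twelve_hour, twenty_four_hour)
```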

Hive Unix Timestamp

I'm not getting the expected result from UNIX_TIMESTAMP in Hive.
For example :
select FROM_UNIXTIME(UNIX_TIMESTAMP('2015/02/01', 'YYYY/MM/DD')) from table limit 1;
OUTPUT :
2014-12-28 00:00:00
Time taken: 0.287 seconds, Fetched: 1 row(s)
I expected it to return 2015-02-01, but it returned something else. Is this because of how epoch time works?
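You have all caps in your date format. In these patterns, uppercase YYYY is the week-based year and DD is the day of the year, not the calendar year and the day of the month. Try this using lowercase:
select from_unixtime(unix_timestamp('2015/02/01','yyyy/MM/dd')) from table;
Results in:
2015-02-01 00:00:00
Also you can strip off the time using:
to_date(from_unixtime(unix_timestamp('2015/02/01','yyyy/MM/dd')))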
Results in:
2015-02-01
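One plausible reading of the 2014-12-28 result in the question, assuming a US-style calendar (weeks start on Sunday and week 1 is the week containing January 1) and lenient parsing in which only the week-based year effectively takes hold: the parser lands on the first day of week 1 of week-year 2015. The Python sketch below only computes that date; it does not call Hive or Java, and the function name is hypothetical.

```python
from datetime import date, timedelta

def start_of_first_week(week_year):
    """First day of the Sunday-started week that contains January 1."""
    jan1 = date(week_year, 1, 1)
    # weekday(): Monday=0 ... Sunday=6, so this counts days back to Sunday.
    days_since_sunday = (jan1.weekday() + 1) % 7
    return jan1 - timedelta(days=days_since_sunday)

print(start_of_first_week(2015))
```

Under these assumptions the computed date is 2014-12-28, the same value the uppercase pattern produced.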
You can use the substr() function as well. It works well everywhere, unless you need to compare the result with another time variable or constant.
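select FROM_UNIXTIME(UNIX_TIMESTAMP('2015/02/01', 'yyyy/MM/dd'))
Use this query to get the answer 2015-02-01 00:00:00. Without the pattern argument, UNIX_TIMESTAMP expects the input in yyyy-MM-dd HH:mm:ss format.
Thanks.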