Hive Unix Timestamp - apache

I'm not getting the expected result from UNIX_TIMESTAMP in Hive.
For example:
select FROM_UNIXTIME(UNIX_TIMESTAMP('2015/02/01', 'YYYY/MM/DD')) from table limit 1;
OUTPUT :
2014-12-28 00:00:00
Time taken: 0.287 seconds, Fetched: 1 row(s)
I expected it to return 2015-02-01, but it returned something else. I understand it's probably because of epoch time?

Your date format is all caps, and the pattern letters are case-sensitive. Use lowercase yyyy and dd (MM stays uppercase for the month):
select from_unixtime(unix_timestamp('2015/02/01','yyyy/MM/dd')) from table;
Results in:
2015-02-01 00:00:00
Also you can strip off the time using:
to_date(from_unixtime(unix_timestamp('2015/02/01','yyyy/MM/dd')))
Results in:
2015-02-01
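The reason, as far as I understand it: Hive hands the pattern to Java's SimpleDateFormat, where uppercase Y means week year and uppercase D means day of year, while lowercase y and d mean calendar year and day of month. A minimal sketch contrasting the two patterns (the literal comes from the question; the optional output-pattern argument of from_unixtime is the only addition):

```sql
-- yyyy = calendar year, MM = month, dd = day of month
select from_unixtime(unix_timestamp('2015/02/01', 'yyyy/MM/dd'));

-- YYYY = week year, DD = day of year: accepted by the parser,
-- but it does not mean "February 1st"
select from_unixtime(unix_timestamp('2015/02/01', 'YYYY/MM/DD'));

-- from_unixtime also accepts an output pattern, so the date part
-- can be produced directly
select from_unixtime(unix_timestamp('2015/02/01', 'yyyy/MM/dd'), 'yyyy-MM-dd');
```

Newer Hive releases may parse patterns with a different Java class, so the exact result of the uppercase pattern can vary by version; the lowercase pattern is the one to rely on.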

You can use the substr() function as well. It works everywhere, unless you need to compare the result with another time variable or constant.
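A sketch of what the substr() approach might look like, assuming the value is already a 'yyyy-MM-dd HH:mm:ss' string, so that the first 10 characters are the date part:

```sql
-- substr(string, start, length): positions are 1-based
select substr(from_unixtime(unix_timestamp('2015/02/01', 'yyyy/MM/dd')), 1, 10);
```

The result is a plain string, which is why comparing it against real date or timestamp values needs care.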

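select FROM_UNIXTIME(UNIX_TIMESTAMP('2015/02/01', 'yyyy/MM/dd'))
Use this query to get 2015-02-01 00:00:00 as the answer; the pattern argument is needed because the input is not in the default yyyy-MM-dd HH:mm:ss format.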
Thanks.

Related

How can I find time interval between 00:00:00 and 23:28:05 without date in PostgreSQL

So I have this table where I've got two times for each line, but no date and need to get the interval between those two, all is fine when it's:
11:00:00 - 09:38:54
Returns: 01:21:06
As there's no dates, times are stored in "time without time zone" format.
The problem arises when the time rolls over into the next day and the hour becomes 00h: as there's no date, the interval will be something absurd like -22:58:21
Example:
00:00:00 - 22:59:01
Returns: -22:59:01
00:00:00 - 22:44:06
Returns: -22:44:06
Is there any way to make SQL treat 00:00:00 as 24:00:00 for the sake of math without a date?
The hours only range between 8 and 0, and nothing from the previous day goes further than 0h30, a simple case for "00h" solves it, but I can't make SQL understand 00h as 24h so far. Any ideas?
Like:
select '24:00:00'::time - '22:59:01'::time;
?column?
----------
01:00:59
UPDATE
If the 00:00:00 is coming from somewhere you can't modify in place then:
select time_fld from time_test ;
time_fld
----------
00:00:00
01:30:00
select coalesce(nullif(time_fld, '00:00:00'::time), '24:00:00') from time_test;
coalesce
----------
24:00:00
01:30:00
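For the general case where the end time can wrap past midnight (not only an exact 00:00:00), one possible sketch adds 24 hours whenever the end time is earlier than the start time. The column names start_time and end_time and the inline VALUES rows are assumptions for illustration, not part of the original table:

```sql
-- time - time yields an interval in PostgreSQL; a negative result means
-- the end time belongs to the next day, so shift it by 24 hours
select start_time,
       end_time,
       case
         when end_time < start_time
           then end_time - start_time + interval '24 hours'
         else end_time - start_time
       end as elapsed
from (values ('22:59:01'::time, '00:00:00'::time),
             ('09:38:54'::time, '11:00:00'::time)) as t(start_time, end_time);
```

With these two sample rows the elapsed values should be 01:00:59 and 01:21:06, matching the figures in the question. This only holds if no interval is longer than 24 hours.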

Apache Hive- Time Stamp query

I have two time stamp columns in a Hive DB storing timestamp in following format:
hive> select last_date from xyz limit 2;
OK
2019-08-21 15:11:23.553
2019-08-21 15:11:23.553
[Above has milliseconds stored in it by default]
hive> select last_modify_date from xyz limit 2;
OK
2018-04-18 23:32:58
2017-09-22 04:02:32
I need a common Hive select query that converts both of the above timestamps to the 'yyyy-MM-dd HH:mm:ss.SSS' format, preserving the millisecond value if it exists, or appending '.000' if it doesn't.
What I have tried so far:
select
last_modify_date,
from_unixtime(unix_timestamp(last_modify_date), "yyyy-MM-dd HH:mm:ss.SSS") as ts
from xyz limit 3;
However, the above query displays '.000' for both the above said timestamp columns.
Please help
From the UDF that implements unix_timestamp, you can see that the returned value is in seconds, represented by a LongWritable, so anything smaller than one second is lost.
You can write your own UDF, or just use pure SQL to achieve that.
One easy way is to use rpad (GenericUDFRpad):
select rpad(your_date, 23, '.000') from your_table;
Some examples:
hive> select rpad('2018-04-18 23:32:58', 23, '.000');
OK
2018-04-18 23:32:58.000
hive> select rpad('2018-04-18 23:32:58.553', 23, '.000');
OK
2018-04-18 23:32:58.553
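A different technique from rpad, offered only as a sketch: assuming Hive 1.2.0 or later, where date_format is available, the pattern itself can request three fractional digits. The column and table names are the ones from the question; the output aliases are made up for illustration:

```sql
-- SSS asks for milliseconds; values without a fractional part
-- should come back with .000
select date_format(last_date,        'yyyy-MM-dd HH:mm:ss.SSS') as last_date_ms,
       date_format(last_modify_date, 'yyyy-MM-dd HH:mm:ss.SSS') as last_modify_date_ms
from xyz limit 2;
```

Unlike rpad, this does not depend on the input string having exactly 19 or 23 characters.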

How to extract time in HH24:MM from varchar in Oracle

I have a column in the following varchar format. I would like to extract the time based on a condition e.g. < 7:00.
Table1
Column: timer(varchar)
23:45
05:00
07:00
22:00
Expected output
test
05:00
07:30
I tried the following:
Select *
FROM Table1
where timer < 7:00
However, the result is not as expected.
Oracle does not have a time data type, so presumably the column is a string.
Use string comparisons:
where timer < '07:00'
Note that the leading 0 is important!
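To illustrate why the leading zero matters: strings are compared character by character, so '2' sorts before '7'. A small sketch against dual, using literals from the question's data:

```sql
-- Without the leading zero the comparison is true, so the row IS returned,
-- even though 23:45 is later than 7:00 on the clock
select '23:45' as timer from dual where '23:45' < '7:00';

-- With the zero-padded literal the string order matches clock order:
-- no row is returned
select '23:45' as timer from dual where '23:45' < '07:00';
```

This relies on every stored value being zero-padded to the same HH:MM width.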
If this is just time and you want a proper comparison then you can convert them to date and compare them.
Select *
FROM Table1
where to_date(timer,'hh24:mi') < to_date('07:00','hh24:mi');
Please note that your expected output contains 07:30, which is not less than 07:00, so it will not be part of the output if you compare with less than 07:00.
Cheers!!

How to convert "2019-11-02T20:18:00Z" to timestamp in HQL?

I have the datetime string "2019-11-02T20:18:00Z". How can I convert it into a timestamp in Hive HQL?
Try this:
select from_unixtime(unix_timestamp("2019-11-02T20:18:00Z", "yyyy-MM-dd'T'HH:mm:ss"))
If you want to preserve milliseconds, then remove the Z, replace the T with a space, and convert to timestamp:
select timestamp(regexp_replace("2019-11-02T20:18:00Z", '^(.+?)T(.+?)Z$','$1 $2'));
Result:
2019-11-02 20:18:00
Also it works with milliseconds:
select timestamp(regexp_replace("2019-11-02T20:18:00.123Z", '^(.+?)T(.+?)Z$','$1 $2'));
Result:
2019-11-02 20:18:00.123
The from_unixtime(unix_timestamp()) solution does not work with milliseconds.
Demo:
select from_unixtime(unix_timestamp("2019-11-02T20:18:00.123Z", "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"));
Result:
2019-11-02 20:18:00
Milliseconds are lost. The reason is that unix_timestamp returns whole seconds elapsed since the UNIX epoch (1970-01-01 00:00:00 UTC).

Calculate time difference between two columns of string type in hive without changing the data type string

I am trying to calculate the time difference between two columns of a row which are of string data type. If the time difference between them is less than 2 hours then select the first column of that row else if the time difference is greater than 2 hours then select the second column of that row. It can be done by converting the columns to datetime format, but I want the result to be in string only. How can I do that? The data looks like this:
col1(string type)
2018-07-16 02:23:00
2018-07-26 12:26:00
2018-07-26 15:32:00
col2(string type)
2018-07-16 02:36:00
2018-07-26 14:29:00
2018-07-27 15:38:00
I think you don't need to convert the columns to datetime format, since the data in your case is already in sortable order (yyyy-MM-dd HH:mm:ss). You just need to keep the digits and join them into one string (yyyyMMddHHmmss); then you can test whether the difference is bigger or smaller than 2 hours (here 20000, since the hour is followed by mmss). Looking at your example (and assuming col2 > col1), this query would work:
SELECT case when regexp_replace(col2,'[^0-9]', '')-regexp_replace(col1,'[^0-9]', '') < 20000 then col1 else col2 end as col3 from your_table;
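One limit of the digit-subtraction approach is worth checking before relying on it: when the two timestamps fall on different calendar days, the day digit changes and the difference no longer tracks elapsed time. A sketch with made-up values (not from the question) that are 1.5 hours apart across midnight:

```sql
-- 23:00 on the 26th to 00:30 on the 27th is 1.5 hours,
-- but the digit strings differ by 773000, far above the 20000 threshold
select cast(regexp_replace('2018-07-27 00:30:00', '[^0-9]', '') as bigint)
     - cast(regexp_replace('2018-07-26 23:00:00', '[^0-9]', '') as bigint);
```

So the trick seems safe only when both columns are known to share the same date.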
Use unix_timestamp() to convert string timestamp to seconds.
The difference in hours will be:
hive> select (unix_timestamp('2018-07-16 02:23:00')- unix_timestamp('2018-07-16 02:36:00'))/60/60;
OK
-0.21666666666666667
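Put together, a minimal sketch of the full selection might look like this. It assumes both columns use the default yyyy-MM-dd HH:mm:ss layout, and it sends a difference of exactly 2 hours to col2, which the question leaves open:

```sql
-- Compare the gap in seconds, but return the original string column,
-- so the result stays a string
select case
         when abs(unix_timestamp(col2) - unix_timestamp(col1)) < 2 * 60 * 60
           then col1
         else col2
       end as col3
from your_table;
```

abs() makes the check independent of which column holds the later value.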
Important update: this method works correctly only if the time zone is configured as UTC. In time zones that observe DST, Hive adjusts the time during timestamp operations in some marginal cases. Consider this example for the PDT time zone:
hive> select hour('2018-03-11 02:00:00');
OK
3
Note the hour is 3, not 2. This is because 2018-03-11 02:00:00 cannot exist in the PDT time zone: at exactly 02:00:00 the clock is moved forward and the time becomes 2018-03-11 03:00:00.
The same happens when converting with unix_timestamp. For the PDT time zone, unix_timestamp('2018-03-11 03:00:00') and unix_timestamp('2018-03-11 02:00:00') return the same value:
hive> select unix_timestamp('2018-03-11 03:00:00');
OK
1520762400
hive> select unix_timestamp('2018-03-11 02:00:00');
OK
1520762400
A few links for reference:
https://community.hortonworks.com/questions/82511/change-default-timezone-for-hive.html
http://boristyukin.com/watch-out-for-timezones-with-sqoop-hive-impala-and-spark-2/
Also have a look at this JIRA issue: Hive should carry out timestamp computations in UTC