How to select records from week days? - hive

I have a Hive table that contains daily records. I want to select only the records from weekdays, so I use the Hive query below. I'm using the Qubole API to run it.
SELECT hour(pickup_time),
COUNT(passengerid)
FROM home_pickup
WHERE CAST(date_format(pickup_time, 'u') as INT) NOT IN (6,7)
GROUP BY hour(pickup_time)
However, when I run this code, it fails with the error below.
SemanticException [Error 10011]: Line 4:12 Invalid function 'date_format'
Doesn't Qubole support the date_format function? Is there any other way to select weekdays?

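Use unix_timestamp(string date, string pattern) to convert a date string in the given format to the number of seconds since 1970-01-01. Then use from_unixtime() to format it with the pattern you need (here 'u', the day number of the week, where 1 is Monday and 7 is Sunday):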
Demo:
hive> select cast(from_unixtime(unix_timestamp('2017-08-21 10:55:00'),'u') as int);
OK
1
You can specify a date pattern for unix_timestamp if your dates are in a non-standard format.
See docs here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions
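To make the weekday numbering concrete, here is a small Python sketch of the same filter. It is not Hive code: the function names and sample rows are hypothetical, and it only mirrors the logic of keeping rows whose day number is not 6 or 7 and counting them per hour.

```python
from collections import Counter
from datetime import datetime


def is_weekday(ts: datetime) -> bool:
    # isoweekday() numbers Monday as 1 and Sunday as 7,
    # the same numbering the 'u' pattern produces in the demo above.
    return ts.isoweekday() not in (6, 7)


def pickups_per_hour(pickup_times):
    # Rough analogue of GROUP BY hour(pickup_time) over weekday rows only.
    return Counter(ts.hour for ts in pickup_times if is_weekday(ts))


# Hypothetical sample rows, for illustration only.
sample = [
    datetime(2017, 8, 21, 10, 55),  # Monday
    datetime(2017, 8, 21, 10, 5),   # Monday
    datetime(2017, 8, 26, 10, 0),   # Saturday, excluded
    datetime(2017, 8, 27, 9, 30),   # Sunday, excluded
]

print(dict(pickups_per_hour(sample)))
```

In the Hive query itself, the equivalent change would be to replace the date_format call in the WHERE clause with the from_unixtime(unix_timestamp(...), 'u') expression from the demo.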

Related

Want the last 12 months of data; tried add_months, but it throws an error in Impala

I have a requirement to push only the last 12 months of data from Hive to Impala, so I used the following query, which ran successfully in Hive.
select * from table_1 where date_ >= add_months(CAST(current_date() as string), -12,'YYYY-MM-DD')
After pushing the table to Impala, I tried to access it with a SELECT statement and got the error below:
ERROR:
AnalysisException: No matching function with signature: add_months(STRING, TINYINT, STRING).
I tried several other functions, such as unix_timestamp, which work in Hive but fail in Impala.
Please help with this; I'm new to Impala and Hive.
Thanks in advance
In Impala, add_months takes two parameters:
ADD_MONTHS(TIMESTAMP / DATE date, INT months)
or
ADD_MONTHS(TIMESTAMP / DATE date, BIGINT months)
Try this: add_months(current_date(), -12)
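As a rough illustration of what a "12 months back" cutoff means, here is a Python sketch. The helper names are hypothetical, and clamping the day to the end of a shorter month is a choice made for this sketch, not a statement about how Impala treats that edge case.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    # Shift by whole months, clamping the day to the target month's length.
    # The clamping rule is this sketch's assumption, not Impala's documented behavior.
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def in_last_12_months(row_date: date, today: date) -> bool:
    # Same shape as: WHERE date_ >= add_months(current_date(), -12)
    return row_date >= add_months(today, -12)


print(add_months(date(2021, 6, 15), -12))
```

Passing the reference date explicitly (rather than reading the clock inside the function) keeps the cutoff easy to check by hand.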

Problem in converting string format into date in Athena

I'd appreciate your help; I have been trying to solve this without success.
I have a column in Athena that is a string. I want to convert that column to a timestamp.
I used this query:
select date_parse(timestamp,'%Y-%m-%dT%H:%i:%s.%fZ') from wqmparquetformat ;
But I am getting this error:
INVALID_FUNCTION_ARGUMENT: Invalid format: "1589832352" is malformed at "832352"
I have tried every combination of Presto timestamp formats.
When I run the query below:
select to_iso8601(from_unixtime(1589832352));
I receive the below output:
2020-05-18T20:05:52.000Z
The date_parse() function expects (string, format) as parameters and returns a timestamp, so you need to pass it a string in that format, as shown below:
select date_parse(to_iso8601(from_unixtime(1589832352)),'%Y-%m-%dT%H:%i:%s.%fZ')
which gave me the output below:
2020-05-18 20:05:52.000
You need to pass the name of the column that contains the value 1589832352; in your case:
select date_parse(to_iso8601(from_unixtime(timestamp)),'%Y-%m-%dT%H:%i:%s.%fZ')
Because your column is a string, you should cast timestamp as double for this to work, as shown below:
select date_parse(to_iso8601(from_unixtime(cast(timestamp as double))),'%Y-%m-%dT%H:%i:%s.%fZ')
To test, run the query below, which works fine:
select date_parse(to_iso8601(from_unixtime(cast('1589832352' as double))),'%Y-%m-%dT%H:%i:%s.%fZ')
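The root cause is that the column holds epoch seconds stored as text, not an ISO 8601 string. The following Python sketch (hypothetical helper names, not Athena code) shows the same mismatch: an ISO pattern rejects the value, while treating it as seconds since 1970-01-01 UTC succeeds.

```python
from datetime import datetime, timezone

# Python counterpart of the ISO pattern used with date_parse in the question.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_iso_string(value: str) -> datetime:
    # Works only for strings such as "2020-05-18T20:05:52.000Z".
    return datetime.strptime(value, ISO_FORMAT)


def parse_epoch_string(value: str) -> datetime:
    # Treat the text as seconds since 1970-01-01 UTC,
    # like from_unixtime(cast(... as double)).
    return datetime.fromtimestamp(float(value), tz=timezone.utc)


try:
    parse_iso_string("1589832352")
except ValueError as exc:
    print("ISO pattern rejected the epoch string:", exc)

print(parse_epoch_string("1589832352").isoformat())
```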
For me, date_format works great in AWS Athena:
SELECT date_format(from_iso8601_timestamp(datetime), '%m-%d-%Y %H:%i') AS myDateTime FROM <table>;
OR
select date_format(from_iso8601_timestamp(timestamp),'%Y-%m-%dT%H:%i:%s.%fZ') from wqmparquetformat ;

How to extract day/month/year etc from varchar date field, using Presto?

I currently have tables with dates, set up as VARCHAR in the format of YYYY-MM-DD such as:
2017-01-01
The date column I'm working with is called 'event_dt'
I'm used to being able to use day(event_dt), month(event_dt), year(event_dt) etc. in Hive, but Presto just gives me error executing query with no other explanation when the queries fail.
So, for example, I've tried:
select
month(event_dt)
from
my_sql_table
where
event_dt = '2017-01-01'
I would expect the output to read:
01
but all I get is [Code: 0, SQL State: ] Error executing query
I've tried a few other methods listed in the Presto documentation but am having no luck at all. I realize this is probably very simple but any help would be much appreciated.
You can use the month() function after converting the varchar to a date with the date() function:
presto> select month(date('2017-01-01'));
_col0
-------
1
(1 row)
Thanks to @LukStorms in the comments to the original question, I've found two solutions:
Using month(cast(event_dt as date))
Using month(date(event_dt))
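The same two-step idea (convert the text to a date first, then extract the part) looks like this in Python. This is only an analogy with a hypothetical helper name; note that the month comes back as the integer 1, not the string 01, just as in the Presto output above.

```python
from datetime import date


def month_of(event_dt: str) -> int:
    # Parse the YYYY-MM-DD text into a date first, then read the month part.
    return date.fromisoformat(event_dt).month


print(month_of("2017-01-01"))
```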

Highest date from hive table with string data type

I am new to Hive and need your help. My requirement is to get the highest date from the table, and my date column's data type is string. I tried max(), but it's not working for the string data type. Please help me with this.
Use the built-in date function unix_timestamp(string date, string pattern).
unix_timestamp converts a date string to a Unix timestamp (a number of seconds), which is comparable.
Assume your table name is t and the time column is tt.
select max(unix_timestamp(tt, 'yyyyMMdd')) from t
would find the max unix_timestamp for you, which corresponds to the latest date.
You're asserting that MAX doesn't work on strings in Hive, but in fact it does:
Select MAX(dt) FROM (Select explode(Array("20150103", "20150102")) as dt) a;
As long as your date string is in a format that sorts lexicographically, MAX should work fine.
Since version 0.12.0, max(date) just works.
If all the values in that column match the pattern 'yyyy-mm-dd', the above syntax should do the job.
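A short Python sketch of why the string format matters (hypothetical helper names and sample values): taking the maximum of the strings agrees with the latest date only when the text sorts in date order, as yyyyMMdd does. For other layouts, parse first and compare the parsed values.

```python
from datetime import datetime


def latest_as_string(date_strings):
    # Plain string maximum, like MAX() on a string column.
    return max(date_strings)


def latest_parsed(date_strings, pattern):
    # Compare by parsed date, like MAX(unix_timestamp(col, pattern)).
    return max(date_strings, key=lambda s: datetime.strptime(s, pattern))


sortable = ["20150103", "20150102"]          # yyyyMMdd sorts in date order
not_sortable = ["03-01-2015", "02-12-2015"]  # dd-MM-yyyy does not

print(latest_as_string(sortable))
print(latest_as_string(not_sortable), "vs", latest_parsed(not_sortable, "%d-%m-%Y"))
```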

How to delete records that were inserted 1 day ago

I don't have a proper timestamp in the table; is it still possible to delete logs that are 1 day old?
I have a column named SESSION_IN, which is a VARCHAR, and its values look like
2013-10-15 02:10:27.883;1591537355
Is there any way to trim the number after the ; and compare the rest with the "sysdate" identifier?
The stored procedure should compare every session value with the current datetime and delete the row if it is older than 1 day.
You can ignore the time part and convert the date into the required format, something like this:
SYSDATE - to_date(SUBSTR(date_col, 1, 10), 'YYYY-MM-DD')
Then you can perform your operations on the result.
Use the Substring function to extract the datetime portion from the record, then use Convert to cast it to datetime, and finally use DateDiff to check whether it was inserted yesterday. Use all these clauses in a
DELETE FROM table
WHERE ___ query
For Oracle you could use something like this:
SELECT
TRUNC(to_timestamp(SUBSTR('2013-10-15 02:10:27.883;1591537355',1,
(
SELECT
instr('2013-10-15 02:10:27.883;1591537355', ';')-1
FROM
dual
)
), 'YYYY-MM-DD HH24:MI:SS.FF'))
FROM
dual;
This gives you just the date portion of your input string. Then subtract the number of days of logs you want to keep.
Hope the following query helps you:
Select Convert(Datetime,Substring('2013-10-15 02:10:27.883;1591537355',1,23)), DateDiff(dd,Convert(Datetime,Substring('2013-10-15 02:10:27.883;1591537355',1,23)),Getdate())
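The steps these answers describe (cut the value at the ;, parse the datetime part, compare with the current time) can be sketched in Python as follows. The function names are hypothetical and this is not the stored procedure itself, only the comparison logic.

```python
from datetime import datetime, timedelta


def session_time(session_in: str) -> datetime:
    # Keep only the text before the ';' and parse it as a datetime.
    stamp = session_in.split(";", 1)[0]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")


def is_older_than_one_day(session_in: str, now: datetime) -> bool:
    # True when the session timestamp is more than 24 hours before `now`.
    return now - session_time(session_in) > timedelta(days=1)


value = "2013-10-15 02:10:27.883;1591537355"
print(session_time(value))
print(is_older_than_one_day(value, datetime(2013, 10, 17, 0, 0, 0)))
```

Rows for which the check is true are the ones a DELETE ... WHERE clause would target.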