Problem converting a string into a date in Athena - SQL

Requesting your help, as I have been trying to solve this but am not able to.
I have a column in Athena which is a string. I want to convert that column into a timestamp in Athena.
I have used the query:
select date_parse(timestamp,'%Y-%m-%dT%H:%i:%s.%fZ') from wqmparquetformat ;
But I am getting the error:
INVALID_FUNCTION_ARGUMENT: Invalid format: "1589832352" is malformed at "832352"
I have tried every combination of Presto timestamp formats.
When I run the query below:
select to_iso8601(from_unixtime(1589832352));
I receive the below output:
2020-05-18T20:05:52.000Z

Your column holds Unix epoch seconds (1589832352), which is why date_parse() cannot read it with an ISO-8601 format. The date_parse() function expects (string, format) as parameters and returns a timestamp, so you need to pass it a string as shown below:
select date_parse(to_iso8601(from_unixtime(1589832352)),'%Y-%m-%dT%H:%i:%s.%fZ')
which gives the output below:
2020-05-18 20:05:52.000
In your case, you need to pass the column that contains the value 1589832352:
select date_parse(to_iso8601(from_unixtime(timestamp)),'%Y-%m-%dT%H:%i:%s.%fZ')
Since your timestamp column is a string, you should cast it to double for from_unixtime to work, as shown below:
select date_parse(to_iso8601(from_unixtime(cast(timestamp as double))),'%Y-%m-%dT%H:%i:%s.%fZ')
To test, run the query below, which works fine:
select date_parse(to_iso8601(from_unixtime(cast('1589832352' as double))),'%Y-%m-%dT%H:%i:%s.%fZ')
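The same pipeline (epoch seconds → ISO-8601 string → parsed timestamp) can be sketched outside Athena; a minimal Python illustration using the value 1589832352 from the question:

```python
from datetime import datetime, timezone

# 1589832352 is Unix epoch seconds, so date_parse() with an
# ISO-8601 format string cannot parse it directly.
epoch = 1589832352

# Equivalent of from_unixtime(): epoch seconds -> UTC timestamp.
ts = datetime.fromtimestamp(epoch, tz=timezone.utc)

# Equivalent of to_iso8601(): timestamp -> ISO-8601 string.
iso = ts.strftime("%Y-%m-%dT%H:%M:%S.000Z")
print(iso)  # 2020-05-18T20:05:52.000Z

# Equivalent of date_parse(): ISO-8601 string -> timestamp again.
parsed = datetime.strptime(iso, "%Y-%m-%dT%H:%M:%S.%fZ")
print(parsed)  # 2020-05-18 20:05:52
```

This mirrors why the nested SQL works: the inner calls first turn the raw epoch into a string that the outer date_parse() format actually matches.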

For me, date_format works great in AWS Athena:
SELECT date_format(from_iso8601_timestamp(datetime), '%m-%d-%Y %H:%i') AS myDateTime FROM <table>;
or:
select date_format(from_iso8601_timestamp(timestamp),'%Y-%m-%dT%H:%i:%s.%fZ') from wqmparquetformat ;

Related

Compare prev_execution_date in Airflow to timestamp in BigQuery using SQL

I am trying to insert data from one BigQuery table to another using an Airflow DAG. I want to filter data such that the updateDate in my source table is greater than the previous execution date of my DAG run.
The updateDate in my source table looks like this: 2021-04-09T20:11:11Z and is of STRING data type, whereas prev_execution_date looks like this: 2021-04-10T11:00:00+00:00, which is why I am trying to convert my updateDate to TIMESTAMP first and then to ISO format as shown below.
SELECT *
FROM source_table
WHERE FORMAT_TIMESTAMP("%Y-%m-%dT%X%Ez", TIMESTAMP(UpdateDate)) > TIMESTAMP('{{ prev_execution_date }}')
But I am getting the error message: No matching signature for operator > for argument types: STRING, TIMESTAMP. Supported signature: ANY > ANY. Clearly, the left-hand side of my WHERE clause above is of type STRING. How can I convert it to TIMESTAMP, or to a correct format for that matter, to be able to compare it to prev_execution_date?
I have also tried with the following:
WHERE FORMAT_TIMESTAMP("%Y-%m-%dT%X%Ez", TIMESTAMP(UpdatedWhen)) > STRING('{{ prev_execution_date }}')
which results in the error message: Could not cast literal "2021-04-11T11:50:31.284349+00:00" to type DATE
I would appreciate some help with writing my BigQuery SQL query to compare the string timestamp to the previous execution date of the Airflow DAG.
You probably wanted parse_timestamp instead:
SELECT *
FROM source_table
WHERE PARSE_TIMESTAMP("%Y-%m-%dT%X%Ez", UpdateDate) > TIMESTAMP('{{ prev_execution_date }}')
although it looks like it will work even without it:
SELECT *
FROM source_table
WHERE TIMESTAMP(UpdateDate) > TIMESTAMP('{{ prev_execution_date }}')
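Why parsing both sides to timestamps works where the string comparison failed can be sketched in Python: once both values are timezone-aware timestamps, the differing textual offsets (Z versus +00:00) no longer matter. The two values below are the samples from the question:

```python
from datetime import datetime

# The two textual forms from the question: updateDate uses "Z",
# prev_execution_date uses "+00:00". As raw strings their formats
# differ, so comparing them textually is unreliable.
update_date = "2021-04-09T20:11:11Z"
prev_execution_date = "2021-04-10T11:00:00+00:00"

fmt = "%Y-%m-%dT%H:%M:%S%z"  # %z accepts both "Z" and "+00:00"

# Equivalent of PARSE_TIMESTAMP / TIMESTAMP(): parse both sides
# to timezone-aware timestamps, then compare as timestamps.
a = datetime.strptime(update_date, fmt)
b = datetime.strptime(prev_execution_date, fmt)
print(a > b)  # False: this updateDate predates the previous run
```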

Can you apply BigQuery's PARSE_TIMESTAMP function to entire field?

I have a field "mytimestamp" which is currently of data type STRING, with the syntax "DD/MM/YYYY hh:mm:ss", and I'm looking to convert it into a field of type TIMESTAMP. The function PARSE_TIMESTAMP works for a specific argument, e.g.
SELECT PARSE_TIMESTAMP('%d/%m/%Y %H:%M:%S', '15/04/2020 15:13:52') AS mynewtimestamp
but attempting to apply this to the entire column as follows
SELECT PARSE_DATETIME('%d/%m/%Y %H:%M:%S', mytimestamp) AS mynewtimestamp
FROM `project.dataset.table`
yields instead the error "Failed to parse input string "mytimestamp""
You probably have bad data in the column. You can find the problem rows using:
select mytimestamp
from `project.dataset.table`
where SAFE.PARSE_DATETIME('%d/%m/%Y %H:%M:%S', mytimestamp) is null
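The SAFE. prefix makes the parse return NULL on bad input instead of failing the whole query, which is what turns it into a bad-row detector. A rough Python analogue of the same idea, using hypothetical sample rows:

```python
from datetime import datetime

def safe_parse(value, fmt="%d/%m/%Y %H:%M:%S"):
    """Return a datetime, or None on bad input (like SAFE.PARSE_DATETIME)."""
    try:
        return datetime.strptime(value, fmt)
    except (ValueError, TypeError):
        return None

# Hypothetical column values: one good row, two bad ones
# (the last has an impossible date, Feb 31).
rows = ["15/04/2020 15:13:52", "not a date", "31/02/2020 00:00:00"]

# Equivalent of WHERE SAFE.PARSE_DATETIME(...) IS NULL:
# keep only the rows that fail to parse.
bad = [r for r in rows if safe_parse(r) is None]
print(bad)  # ['not a date', '31/02/2020 00:00:00']
```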
yields instead the error "Failed to parse input string "mytimestamp""
The error message suggests that instead of passing mytimestamp as a column name, you are passing "mytimestamp" as a string literal - so check your query for this.

NULL values in a string to date conversion

I have a table with the following data:
logs.ip logs.fecha logs.metodo
66.249.93.79 19/Nov/2018:03:46:33 GET
All columns are of type string, and I want to convert logs.fecha into a date with the format yyyy-MM-dd HH:mm:ss.
I tried the following query:
SELECT TO_DATE(from_unixtime(UNIX_TIMESTAMP(fecha, 'yyyy-MM-dd'))) FROM logs
Results of the query are NULL in all rows.
How can I make the string-to-date conversion work for all rows? I know I must use ALTER TABLE, but I don't know how to do it.
Thanks
The reason you get NULL is that the format of the input string differs from the format passed to unix_timestamp. The second argument to unix_timestamp should specify the string format of the first argument. In from_unixtime you can specify the desired output format; if nothing is specified, a valid input to from_unixtime returns output in yyyy-MM-dd HH:mm:ss format.
The query can be fixed as below:
from_unixtime(unix_timestamp(fecha,'dd/MMM/yyyy:HH:mm:ss'),'yyyy-MM-dd HH:mm:ss')
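The same two-step conversion (parse with a pattern matching the input, then format with the desired output pattern) can be sketched in Python, using the sample row from the question and strptime/strftime codes in place of Hive's pattern letters:

```python
from datetime import datetime

fecha = "19/Nov/2018:03:46:33"  # sample value from the question

# Equivalent of unix_timestamp(fecha, 'dd/MMM/yyyy:HH:mm:ss'):
# parse with a pattern that matches the input string exactly.
ts = datetime.strptime(fecha, "%d/%b/%Y:%H:%M:%S")

# Equivalent of from_unixtime(..., 'yyyy-MM-dd HH:mm:ss'):
# format with the desired output pattern.
print(ts.strftime("%Y-%m-%d %H:%M:%S"))  # 2018-11-19 03:46:33
```

The original query returned NULL because its pattern 'yyyy-MM-dd' could not match '19/Nov/2018:03:46:33' at all.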
If you are on Oracle rather than Hive, you just have to tell Oracle the date format you are reading with TO_DATE.
Try:
SELECT TO_DATE(fecha,'DD/MON/YYYY:HH:MI:SS') FROM logs

Convert date from string to date type in Hive

I want to change string which is in format '29-MAR-17' to date type in Hive. The column in question is named "open_time".
I have tried using:
SELECT TO_DATE(from_unixtime(UNIX_TIMESTAMP('open_time', 'dd/MM/yyyy')));
But it returns NULL. Subsequently, my objective is to do something like this:
SELECT * FROM table_hive WHERE open_time BETWEEN '29-MAR-17' AND '28-MAR-17';
With strings, it will definitely not work.
Any help please?
This should work:
select to_date(from_unixtime(unix_timestamp('29-MAR-17','dd-MMM-yy')))
Returns 2017-03-29
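The fix is matching the pattern to the actual input ('dd-MMM-yy' for '29-MAR-17', not the original attempt's 'dd/MM/yyyy'). A quick Python check of the same parse; note that strptime's abbreviated-month matching is case-insensitive, so 'MAR' is accepted:

```python
from datetime import datetime

# '29-MAR-17' needs a day / abbreviated-month-name / two-digit-year
# pattern; the original attempt's 'dd/MM/yyyy' cannot match it.
parsed = datetime.strptime("29-MAR-17", "%d-%b-%y")  # %b matches 'MAR'
print(parsed.date())  # 2017-03-29
```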

How to select records from week days?

I have a Hive table which contains daily records. I want to select records from weekdays only, so I use the Hive query below. I'm using the QUBOLE API to do this.
SELECT hour(pickup_time),
COUNT(passengerid)
FROM home_pickup
WHERE CAST(date_format(pickup_time, 'u') as INT) NOT IN (6,7)
GROUP BY hour(pickup_time)
However, when I run this code, it fails with the error below:
SemanticException [Error 10011]: Line 4:12 Invalid function 'date_format'
Doesn't Qubole support the date_format function? Is there any other way to select weekdays?
Use unix_timestamp(string date, string pattern) to convert the given date format to seconds elapsed since 1970-01-01, then use from_unixtime() to convert those seconds to the desired format:
Demo:
hive> select cast(from_unixtime(unix_timestamp('2017-08-21 10:55:00'),'u') as int);
OK
1
You can specify a date pattern in unix_timestamp for non-standard formats.
See docs here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions
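The weekday filter itself can be sketched in Python with isoweekday(), which uses the same 1=Monday through 7=Sunday numbering as the 'u' pattern above. The pickup times below are hypothetical:

```python
from datetime import datetime

# Hypothetical pickup_time values: a Monday, a Saturday, a Sunday.
pickups = [
    datetime(2017, 8, 21, 10, 55),  # Monday   -> isoweekday 1
    datetime(2017, 8, 26, 14, 30),  # Saturday -> isoweekday 6
    datetime(2017, 8, 27, 9, 15),   # Sunday   -> isoweekday 7
]

# Equivalent of WHERE cast(from_unixtime(unix_timestamp(...), 'u') as int)
# NOT IN (6, 7): keep only weekday rows.
weekdays = [p for p in pickups if p.isoweekday() not in (6, 7)]
print([p.strftime("%Y-%m-%d") for p in weekdays])  # ['2017-08-21']
```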