Compare prev_execution_date in Airflow to timestamp in BigQuery using SQL

I am trying to insert data from one BigQuery table to another using an Airflow DAG. I want to filter data such that the updateDate in my source table is greater than the previous execution date of my DAG run.
The updateDate in my source table looks like this: 2021-04-09T20:11:11Z and is of STRING data type, whereas prev_execution_date looks like this: 2021-04-10T11:00:00+00:00, which is why I am trying to convert my updateDate to TIMESTAMP first and then to ISO format as shown below.
SELECT *
FROM source_table
WHERE FORMAT_TIMESTAMP("%Y-%m-%dT%X%Ez", TIMESTAMP(UpdateDate)) > TIMESTAMP('{{ prev_execution_date }}')
But I am getting the error message: No matching signature for operator > for argument types: STRING, TIMESTAMP. Supported signature: ANY > ANY. Clearly the left hand side of my WHERE-clause above is of type STRING. How can I convert it to TIMESTAMP or to a correct format for that matter to be able to compare to prev_execution_date?
I have also tried with the following:
WHERE FORMAT_TIMESTAMP("%Y-%m-%dT%X%Ez", TIMESTAMP(UpdatedWhen)) > STRING('{{ prev_execution_date }}')
which results in the error message: Could not cast literal "2021-04-11T11:50:31.284349+00:00" to type DATE
I would appreciate some help regarding how to write my BigQuery SQL query to compare the String timestamp to previous execution date of Airflow DAG.

FORMAT_TIMESTAMP returns a STRING, which is why the comparison fails. You probably wanted PARSE_TIMESTAMP instead, which returns a TIMESTAMP:
SELECT *
FROM source_table
WHERE PARSE_TIMESTAMP("%Y-%m-%dT%X%Ez", UpdateDate) > TIMESTAMP('{{ prev_execution_date }}')
although it looks like it will work even without it:
SELECT *
FROM source_table
WHERE TIMESTAMP(UpdateDate) > TIMESTAMP('{{ prev_execution_date }}')
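
To sanity-check the comparison outside BigQuery, here is a minimal Python sketch (not BigQuery SQL) that parses the two sample values from the question into timezone-aware datetimes and compares them. The helper name parse_utc is made up for illustration, and replacing the trailing Z with +00:00 is an assumption to keep it working on Python versions older than 3.11.

```python
from datetime import datetime


def parse_utc(value: str) -> datetime:
    """Parse an ISO 8601 string (hypothetical helper, for illustration only)."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Sample values taken from the question.
update_date = parse_utc("2021-04-09T20:11:11Z")
prev_execution_date = parse_utc("2021-04-10T11:00:00+00:00")

# Both sides are now the same type, so the comparison is well defined.
is_newer = update_date > prev_execution_date
print(is_newer)
```

With these two sample values the row would be filtered out, since the updateDate is earlier than the previous execution date.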

Related

Extract date from timestamp containing time zone in Big Query

I have data containing dates of the form
2020-12-14T18:58:10+01:00[Europe/Stockholm]
but I really only need the date 2020-12-14.
So, I tried:
DATE(Timestamp) as LastUpdateDate
which returned Error: Invalid time zone: +02:00[Europe/Stockholm]
So, thinking that the problem came from the time zone, I tried this instead:
TIMESTAMP(FORMAT_TIMESTAMP("%Y-%m-%d", PARSE_TIMESTAMP("%Y%m%d", Timestamp)))
which magically returned a new error, namely
Error: Failed to parse input string "2021-10-04T09:24:20+02:00[Europe/Stockholm]"
How do I solve this?
Just substring the date part from the string. Try one of these:
select left(Timestamp, 10)
select date(left(Timestamp, 10))
You should clean your data first.
select date("2020-12-14T18:58:10+01:00") as LastUpdateDate
This will work as expected.
Any chance of cleaning your data before using it in a query? I don't think +01:00[Europe/Stockholm] is a supported time zone format.
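
If the cleaning happens before the data reaches BigQuery, the same two ideas (take the first 10 characters, or drop the bracketed zone name and parse the rest) could look roughly like this in Python. This is only a sketch using the sample value from the question; the variable names are made up.

```python
from datetime import date, datetime

# Sample value from the question.
raw = "2020-12-14T18:58:10+01:00[Europe/Stockholm]"

# Option 1: keep only the first 10 characters (YYYY-MM-DD), like left(Timestamp, 10).
by_prefix = date.fromisoformat(raw[:10])

# Option 2: drop the bracketed zone name, then parse what is left as ISO 8601.
cleaned = raw.split("[", 1)[0]
by_parse = datetime.fromisoformat(cleaned).date()

print(by_prefix, by_parse)
```

Both options return the local calendar date as written in the string; neither converts to UTC first.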

Problem in converting string format into date in Athena

I would appreciate your help; I have been trying to solve this without success.
I have a column in Athena which is a string. I want to convert that column into a timestamp.
I have used the query:
select date_parse(timestamp,'%Y-%m-%dT%H:%i:%s.%fZ') from wqmparquetformat ;
But I am getting this error:
INVALID_FUNCTION_ARGUMENT: Invalid format: "1589832352" is malformed at "832352"
I have tried all the Presto timestamp format combinations.
When I run the query below:
select to_iso8601(from_unixtime(1589832352));
I receive the below output:
2020-05-18T20:05:52.000Z
The date_parse() function expects (string, format) as parameters and returns timestamp. So you need to pass your string as shown below :
select date_parse(to_iso8601(from_unixtime(1589832352)),'%Y-%m-%dT%H:%i:%s.%fZ')
which gave me the output below:
2020-05-18 20:05:52.000
In your case, you need to pass the name of the column that contains the value 1589832352:
select date_parse(to_iso8601(from_unixtime(timestamp)),'%Y-%m-%dT%H:%i:%s.%fZ')
Since the column is a string, you should cast timestamp to double for it to work, as shown below:
select date_parse(to_iso8601(from_unixtime(cast(timestamp as double))),'%Y-%m-%dT%H:%i:%s.%fZ')
To test, run the query below, which works fine:
select date_parse(to_iso8601(from_unixtime(cast('1589832352' as double))),'%Y-%m-%dT%H:%i:%s.%fZ')
For me, date_format works great in AWS Athena:
SELECT date_format(from_iso8601_timestamp(datetime), '%m-%d-%Y %H:%i') AS myDateTime FROM <table>;
OR
select date_format(from_iso8601_timestamp(timestamp),'%Y-%m-%dT%H:%i:%s.%fZ') from wqmparquetformat ;
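
The error message suggests the column holds Unix epoch seconds stored as text rather than an ISO 8601 string. As a quick cross-check outside Athena, this Python sketch converts the sample value from the error message; it is only meant to confirm what the value represents, not to replace the SQL.

```python
from datetime import datetime, timezone

# Sample value from the error message, stored as a string like the column.
raw = "1589832352"

# Interpret the string as seconds since the Unix epoch, in UTC.
parsed = datetime.fromtimestamp(int(raw), tz=timezone.utc)

print(parsed.isoformat())
```

The result should agree with the 2020-05-18T20:05:52 value that to_iso8601(from_unixtime(1589832352)) returned above.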

How to extract day/month/year etc from varchar date field, using Presto?

I currently have tables with dates, set up as VARCHAR in the format of YYYY-MM-DD such as:
2017-01-01
The date column I'm working with is called 'event_dt'
I'm used to being able to use day(event_dt), month(event_dt), year(event_dt) etc. in Hive, but Presto just gives me error executing query with no other explanation when the queries fail.
So, for example, I've tried:
select
month(event_dt)
from
my_sql_table
where
event_dt = '2017-01-01'
I would expect the output to read:
01
but all I get is [Code: 0, SQL State: ] Error executing query
I've tried a few other methods listed in the Presto documentation but am having no luck at all. I realize this is probably very simple but any help would be much appreciated.
You can use the month() function after converting the varchar to a date with the date() function:
presto> select month(date('2017-01-01'));
_col0
-------
1
(1 row)
Thanks to @LukStorms in the comments to the original question, I've found two solutions:
Using month(cast(event_dt as date))
Using month(date(event_dt))
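
Note that the month comes back as an integer (1), not the zero-padded 01 the question expected. The same convert-then-extract pattern, sketched in Python for comparison (the variable names are made up; the value is the sample from the question):

```python
from datetime import date

# Sample value from the question, stored as text in YYYY-MM-DD form.
event_dt = "2017-01-01"

# Convert the string to a real date first, then read the parts off it.
parsed = date.fromisoformat(event_dt)
day, month, year = parsed.day, parsed.month, parsed.year

# The parts are integers; zero-pad explicitly if "01" is needed.
month_padded = f"{month:02d}"

print(day, month, year, month_padded)
```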

SQL Query "time" Reference

I've got the below query in MS SQL Server Management Studio:
SELECT t1.time, value, annotations
FROM PI.piarchive..picomp2 t1
WHERE tag = 'sinusoid'
AND t1.time >= 't'
AND annotated = 1
Unfortunately, when I try to run the query, the below error is returned:
"Conversion failed when converting date and/or time from character string."
That tells me that it's trying to use the SQL in-built time reference, but preventing me from referring to the "time" attribute in the system table "PI.piarchive..picomp2".
Are you able to advise what changes in syntax I need to make to change the behaviour during query execution so it can query the "time" attribute in the "PI.piarchive..picomp2" table?
EDITED
The "time" attribute is of DateTime type, but since this is a historian I am querying via OLEDB, the reference of 't' (what I am trying to compare with) is a valid value as 't' refers to today.
As the error message says, you are trying to compare a value of data type datetime with a character string. Of course, that's not allowed. How would you compare, for example, the word 'ostrich' with the current date? Which one is bigger or smaller?
You can compare t1.time with current date this way (SQL Server 2008+):
t1.time >= CAST(GETDATE() as date)
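
The same type mismatch can be reproduced outside SQL Server. In this Python sketch, comparing a datetime with the string 't' fails, while comparing it with midnight at the start of today works. The sample_time value is a made-up stand-in for t1.time, and start_of_today only approximates what CAST(GETDATE() as date) yields.

```python
from datetime import date, datetime, time

# Made-up stand-in for a value from the t1.time column.
sample_time = datetime(2021, 4, 10, 11, 0, 0)

# Comparing a datetime with a plain string raises TypeError in Python,
# much like the conversion error SQL Server reports.
try:
    sample_time >= "t"
    comparable = True
except TypeError:
    comparable = False

# Midnight at the start of today: a datetime, so the comparison is valid.
start_of_today = datetime.combine(date.today(), time.min)
is_from_today = sample_time >= start_of_today

print(comparable, is_from_today)
```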

cast a varchar to date

I use an H2 database for a small project; all fields are VARCHAR.
For a query, I must convert a string to a date, so I tried:
SELECT * FROM USER WHERE cast(DATE_CONTRACT AS DATE) > '2005-02-21'
but there is an error:
Code :
Error code 90009, SQL state 90009:
Cannot parse date constant
"2011-02-21-15.22.07", cause:
"java.lang.NumberFormatException: For
input string: ""21-15.22.07""";
Any idea?
Thanks
You have to use parsedatetime() in order to "cast" your character data to a date.
http://h2database.com/html/functions.html#parsedatetime
Something like this:
SELECT *
FROM USER
WHERE parsedatetime(DATE_CONTRACT, 'yyyy-MM-dd-HH.mm.ss') > DATE '2005-02-21'
This is another good reason never to store dates, timestamps or numbers as character data.
Of course you could use the built-in function PARSEDATETIME as follows:
CREATE TABLE USER(DATE_CONTRACT VARCHAR);
INSERT INTO USER VALUES('2011-02-21-15.22.07');
SELECT * FROM USER
WHERE PARSEDATETIME(DATE_CONTRACT, 'yyyy-MM-dd-HH.mm.ss') > DATE '2005-02-21';
But I guess it would make more sense to store the timestamp in a more standard format, so that no special parsing is required. Example:
CREATE TABLE USER(DATE_CONTRACT TIMESTAMP);
INSERT INTO USER VALUES('2011-02-21 15:22:07');
SELECT * FROM USER
WHERE DATE_CONTRACT > DATE '2005-02-21';
H2 supports the format supported by JDBC, plus ISO 8601.
Why convert to date? If DATE_CONTRACT uses a consistent CHAR(19), you can:
SELECT *
FROM USER
WHERE DATE_CONTRACT >= '2005-02-22-00.00.00'
;
Correct me if I'm wrong but if there is an index on DATE_CONTRACT, this will be faster than converting to date/datetime.
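
Both approaches from the answers above (parse with an explicit pattern, or compare the fixed-width strings directly) can be tried outside H2 with a short Python sketch. The strptime pattern below is my translation of 'yyyy-MM-dd-HH.mm.ss', and the values are the samples from this thread.

```python
from datetime import datetime

# Sample value from the error message in the question.
raw = "2011-02-21-15.22.07"

# Approach 1: parse with an explicit pattern, then compare real datetimes.
parsed = datetime.strptime(raw, "%Y-%m-%d-%H.%M.%S")
after_cutoff = parsed > datetime(2005, 2, 21)

# Approach 2: compare the strings directly. This only works because the
# format is fixed-width and zero-padded with the most significant part first.
after_cutoff_text = raw >= "2005-02-22-00.00.00"

print(after_cutoff, after_cutoff_text)
```

The string comparison breaks as soon as one row uses a different layout (for example a non-padded month), so it relies on the data being consistent.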