BigQueryValueCheckOperator and standard sql - google-bigquery

I would like to know if there is a way to use standard SQL with the BigQueryValueCheckOperator in Apache Airflow 1.9. The BigQueryOperator normally has a flag like this,
use_legacy_sql=False, to disable legacy SQL. I can't find a way to achieve this with the BigQueryValueCheckOperator.
Rewriting the query in legacy SQL is not an option for now since I want to use _PARTITIONTIME in my WHERE clause.
Thank you.

Currently, you can't use standard SQL with this operator.
However, for your use case, you can still use _PARTITIONTIME with legacy SQL, as mentioned here in the docs: https://cloud.google.com/bigquery/docs/querying-partitioned-tables#querying_ingestion-time_partitioned_tables_using_time_zones
Sample Query:
#legacySQL
SELECT
field1
FROM
mydataset.partitioned_table
WHERE
_PARTITIONTIME BETWEEN TIMESTAMP("2016-05-01")
AND TIMESTAMP("2016-05-06")
AND DATE_ADD([MY_TIMESTAMP_FIELD], 8, 'HOUR') BETWEEN TIMESTAMP("2016-05-01 12:00:00")
AND TIMESTAMP("2016-05-05 14:00:00");


starts_with in presto?

I am new to writing sql queries in presto and was looking for a function similar to 'starts_with'.
If a string starts with a given substring then the query needs to return that record.
In PostgreSQL, I am currently doing select * from tableA where name~'^Joh'. What's the equivalent of this in Presto?
PostgreSQL and Presto both speak SQL. It is odd to see that you've learned a PostgreSQL-specific extension (the ~ regular expression operator) before the standard SQL features. In SQL you use LIKE for pattern matches:
select * from tableA where name like 'Joh%';
You can use LIKE in SQL; see https://www.w3schools.com/sql/sql_like.asp. With LIKE you can search for a specified pattern.
In Presto you can also use regexp_like(), which may run a little faster than the LIKE operator. For your case, try the query below, which should give you the expected behavior.
select regexp_like('John', '^John')
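
To make the equivalence of the two approaches concrete, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for Presto (an assumption made only so the example is runnable), and the sample names are made up. One caveat: SQLite's LIKE is case-insensitive for ASCII by default, so the comparison only lines up exactly for data like this, where case does not matter.

```python
import re
import sqlite3

# Hypothetical sample data standing in for tableA.
names = ["John", "Johanna", "Bob", "Mojoh"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (name TEXT)")
conn.executemany("INSERT INTO tableA (name) VALUES (?)", [(n,) for n in names])

# Prefix match with LIKE: % matches any run of characters after 'Joh'.
like_rows = [row[0] for row in conn.execute(
    "SELECT name FROM tableA WHERE name LIKE 'Joh%' ORDER BY name")]

# The same filter expressed as an anchored regular expression.
regex_rows = sorted(n for n in names if re.search(r"^Joh", n))

print(like_rows)
print(regex_rows)
```

Both lists contain the same two names; 'Mojoh' is excluded because the pattern is anchored to the start of the string.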

How to Use decorators in STANDARD SQL

I want to recover the truncated data from a table in BigQuery as it was an hour ago.
I found one solution in legacy SQL, like below:
SELECT COUNT(*) FROM [PROJECT_ID:DATASET.TABLE@-3600000]
How can I achieve the same in standard SQL?
Thanks
See the FOR SYSTEM_TIME AS OF documentation. You would want something like this:
SELECT *
FROM `project`.dataset.table FOR SYSTEM_TIME AS OF
TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
The relative time decorator is not supported in standard SQL yet. You can use an absolute timestamp as the decorator in standard SQL. Link from official BigQuery here.
EDIT
As per Elliott's response, it's supported now in standard SQL with a different syntax.
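
For anyone translating existing relative decorators, here is a rough sketch in Python of how a legacy offset in milliseconds could be mapped to the standard SQL clause. The helper name system_time_clause and the table name are hypothetical, and the sketch only builds the query text; it does not talk to BigQuery.

```python
def system_time_clause(offset_ms: int) -> str:
    """Build a FOR SYSTEM_TIME AS OF clause from a relative offset.

    offset_ms is a legacy-style relative offset in milliseconds,
    e.g. -3600000 for "one hour ago".
    """
    if offset_ms > 0:
        raise ValueError("relative offsets must be zero or negative")
    return (
        "FOR SYSTEM_TIME AS OF "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {-offset_ms} MILLISECOND)"
    )


# Placeholder table name; substitute your own project, dataset and table.
query = "SELECT COUNT(*) FROM `project.dataset.table` " + system_time_clause(-3600000)
print(query)
```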

REGEXP_CONTAINS not recognized

Happy new years, stackoverflow!
I am trying to use some regex functions in BigQuery, but some of them return an error as if I had the name wrong.
SELECT REGEXP_CONTAINS(path, r'^abc$') FROM [tablename]
Query Failed
Error: 2.24 - 2.26: Unrecognized function regexp_contains
Whereas if I use a similar regex function, the function text in the editor changes color and the query works.
SELECT REGEXP_EXTRACT(path, r'^abc$') FROM [tablename]
It should work since it's documented in this link.
Does anyone know how to fix this?
BigQuery legacy SQL and standard SQL support different sets of regular expression functions.
Legacy SQL Regular Expression Functions:
REGEXP_MATCH, REGEXP_EXTRACT and REGEXP_REPLACE
Standard SQL Regular Expression Functions:
REGEXP_CONTAINS, REGEXP_EXTRACT, REGEXP_EXTRACT_ALL and REGEXP_REPLACE
So, in your case, just make sure you use the proper BigQuery SQL dialect. Note that standard SQL quotes table names with backticks rather than square brackets:
#standardSQL
SELECT REGEXP_CONTAINS(path, r'^abc$') FROM `tablename`

Table ranges with BigQuery's standard SQL

How can you query a range of timestamped tables with the new syntax? Using TABLE_DATE_RANGE returns the error Unhandled node type in from clause: TVF.
The latest version of BigQuery supports an equivalent of table wildcards with Standard SQL. The documentation is available here: https://cloud.google.com/bigquery/docs/wildcard-tables.
Also please take a look at this post:
Is there an equivalent of table wildcard functions in BigQuery with standard SQL?

SQL Statement using LIKE

I want to know in a column named NUMTSO if there exists data with this format "WO#############", so what I'm doing is this:
select *
from fx1rah00
where numtso like 'WO[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
but I get nothing. What am I doing wrong?
This works fine for me in SQL Server. If you are not using SQL Server you will likely need some different syntax though as the pattern syntax is not standard SQL.
;with fx1rah00 As
(
select 'WO1234567890123' as numtso
)
select *
from fx1rah00
where numtso like
'WO[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
MySQL allows you to use regular expressions with the REGEXP keyword instead of LIKE. I suggest the following code (anchored with ^ and $ so that the whole value must match, as it would with LIKE):
SELECT *
FROM `fx1rah00`
WHERE `numtso` REGEXP '^WO[0-9]{13}$'
What DBMS is this? Some databases don't let you use regex in a LIKE clause, just wildcards. If it's Oracle you could check out REGEXP_LIKE, or REGEXP for MySQL.
I would do something like:
where NUMTSO like 'WO%'
and REGEXP_LIKE(NUMTSO, '^WO[0-9]{13}$')
By using both the LIKE and the regex check, you can still range scan on an index if there is one.
The SQL standard does not support regular expressions in LIKE. It has a much more primitive pattern language. You'll need to add a function, or post-filter, or discover a DBMS-specific extension.
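
As an illustration of the "add a function" route, here is a minimal sketch in Python using an in-memory SQLite database. SQLite is only a stand-in for whatever DBMS the question is about, and the sample rows are made up. SQLite has no built-in REGEXP implementation, but it evaluates X REGEXP Y by calling a user-defined regexp(Y, X), so registering one is enough. The LIKE 'WO%' pre-filter from the answer above is kept alongside it.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite evaluates "X REGEXP Y" as regexp(Y, X): pattern first, value second.
    # fullmatch requires the whole value to match, like a LIKE pattern would.
    return value is not None and re.fullmatch(pattern, value) is not None


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE fx1rah00 (numtso TEXT)")
conn.executemany(
    "INSERT INTO fx1rah00 (numtso) VALUES (?)",
    [
        ("WO1234567890123",),   # WO + 13 digits: should match
        ("WO123",),             # too short
        ("XX1234567890123",),   # wrong prefix
        ("WO12345678901234",),  # 14 digits: too long
    ],
)

rows = [row[0] for row in conn.execute(
    "SELECT numtso FROM fx1rah00 "
    "WHERE numtso LIKE 'WO%' AND numtso REGEXP 'WO[0-9]{13}'")]
print(rows)
```

Only the first row survives both conditions.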