BigQueryOperator : No matching signature for operator = for argument types: DATE, INT64 - google-bigquery

When I run this query on BigQuery UI:
DELETE FROM `A.Books.items` where Date='2018-08-31'
The query works great.
However when I'm trying to do it on Airflow:
delete_sql = '''DELETE FROM `A.Books.items` where Date = {0}'''.format('2018-08-31') # // later this will be variable
delete_old= BigQueryOperator(
task_id='bigquery_delete',
bql=delete_sql,
destination_dataset_table=False,
bigquery_conn_id=CONNECTION_ID,
delegate_to=False,
use_legacy_sql = 'False',
udf_config=False,
dag=dag,
)
it returns:
Exception: BigQuery job failed. Final error was: {u'reason': u'invalidQuery', u'message': u'No matching signature for operator = for argument types: DATE, INT64. Supported signatures: ANY = ANY at [1:52]', u'location': u'query'
Date is a column with type DATE in items table.
How do I fix this error?

The following should work:
delete_sql = '''DELETE FROM A.Books.items where Date = '{0}' '''.format('2018-08-31')
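The date must be wrapped in single quotes inside the SQL text. Without them the rendered query reads Date = 2018-08-31, which BigQuery parses as an integer expression, hence the DATE versus INT64 mismatch in the error.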
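To see the difference, it can help to print the SQL that each template renders. This is a minimal sketch in plain Python (no Airflow needed; the variable names are illustrative and not from the original DAG):

```python
# Render the DELETE statement with and without quotes around the placeholder.
date_value = '2018-08-31'

unquoted = "DELETE FROM `A.Books.items` WHERE Date = {0}".format(date_value)
quoted = "DELETE FROM `A.Books.items` WHERE Date = '{0}'".format(date_value)

# Without quotes the date ends up as bare tokens in the SQL text.
print(unquoted)
# With quotes it is a string literal, as in the query that worked in the UI.
print(quoted)
```

Printing the final SQL before handing it to the operator is a cheap way to catch this kind of quoting problem.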

Related

I am getting "CREATE MODEL OPTIONS() format parameter is missing" error in BigQuery while creating a model?

I am getting this error "CREATE MODEL OPTIONS() format parameter is missing" while trying to create an ARIMA model and it seems to be telling me that I need to define a certain format parameter, but I don't understand which one exactly it is asking me to add.
I am using the following script:
CREATE MODEL forecast
OPTIONS (model_type = 'ARIMA_PLUS',
time_series_timestamp_col='day',
time_series_data_col='cost',
auto_arima = TRUE,
data_frequency = 'AUTO_FREQUENCY',
decompose_time_series = TRUE) AS
SELECT
FORMAT_DATE('%Y-%m-%d', date) as day,
sum(net_cost) as cost
FROM ads_mif.logs_actual_footprint_cost_daily_raw
GROUP BY 1
'time_series_timestamp_col' must have a type of Timestamp, Date or DateTime, but instead has STRING type in query statement.
FORMAT_DATE returns a STRING, so remove the formatting and select the date column directly, for example date AS day.

SQL error when using format() function with pyodbc in Django

I want to execute a command using pyodbc in my Django app. When I do a simple update with one column, it works fine:
cursor.execute("UPDATE dbo.Table SET attr = 1 WHERE id = {}".format(id))
However, when I try to use a string as a column value, it throws an error:
cursor.execute("UPDATE dbo.Table SET attr = 1, user = '{}' WHERE id = {}".format(id, str(request.user.username)))
Here's the error message:
('42S22', "[42S22] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid column name 'Admin'. (207) (SQLExecDirectW)")
Surprisingly, this works:
cursor.execute("UPDATE dbo.Table SET attr = 1, user = 'Admin' WHERE id = {}".format(id))
What seems to be the problem? Why is SQL Server mistaking the column value for a column name?
As the other answer notes, you have your format arguments backwards. If you're going to use cursor.execute(), though, the far more important thing is to use positional parameters (%s). This passes the SQL and the values separately to the database backend and protects you from SQL injection:
from django.db import connection
cursor = connection.cursor()
cursor.execute("""
UPDATE dbo.Table
SET attr = 1,
user = %s
WHERE id = %s
""", [
request.user.username,
id,
])
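You've got your format arguments backwards: you're passing id into the user value and the username into the WHERE id clause. The rendered statement therefore ends in WHERE id = Admin, and because Admin is unquoted there, SQL Server reads it as a column name.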
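Both points can be reproduced without SQL Server. The sketch below uses Python's built-in sqlite3 module as a stand-in, which is an assumption made purely for illustration: sqlite3 uses ? placeholders where Django's cursor uses %s, and the table and column names are made up. It prints the SQL produced by the swapped format() call, then performs the update with bound parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, attr INTEGER, owner TEXT)")
conn.execute("INSERT INTO items (id, attr, owner) VALUES (7, 0, NULL)")

row_id, username = 7, "Admin"

# Swapped arguments, as in the question: the id lands inside the quotes and
# the username lands unquoted after WHERE, where it is read as a column name.
swapped = "UPDATE items SET attr = 1, owner = '{}' WHERE id = {}".format(row_id, username)
print(swapped)

# Bound parameters: the values travel separately from the SQL text,
# so no manual quoting is needed and argument order is explicit.
conn.execute("UPDATE items SET attr = 1, owner = ? WHERE id = ?", (username, row_id))
result = conn.execute("SELECT attr, owner FROM items WHERE id = ?", (row_id,)).fetchone()
print(result)
```

With bound parameters there are no quotes to get wrong, so this class of error cannot occur.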

Passing table name and list of values as argument to psycopg2 query

Context
I would like to pass a table name along with query parameters in a psycopg2 query in a python3 function.
If I understand correctly, I should not format the query string using python .format() method prior to the execution of the query, but let psycopg2 do that.
Issue
I can't manage to pass both the table name and the parameters as arguments to my query string.
Code sample
Here is a code sample:
import psycopg2
from psycopg2 import sql
connection_string = "host={} port={} dbname={} user={} password={}".format(*PARAMS.values())
conn = psycopg2.connect(connection_string)
curs = conn.cursor()
table = 'my_customers'
cities = ["Paris", "London", "Madrid"]
data = (table, tuple(cities))
query = sql.SQL("SELECT * FROM {} WHERE city = ANY (%s);")
curs.execute(query, data)
rows = curs.fetchall()
Error(s)
But I get the following error message:
TypeError: not all arguments converted during string formatting
I also tried to replace the data definition by:
data = (sql.Identifier(table), tuple(cities))
But then this error pops:
ProgrammingError: can't adapt type 'Identifier'
If I put ANY {} instead of ANY (%s) in the query string, in both previous cases this error shows:
SyntaxError: syntax error at or near "{"
LINE 1: ...* FROM {} WHERE c...
^
Initially I didn't use the sql module and tried to pass the data as the second argument to curs.execute(), but then the table name was single-quoted in the command, which caused trouble. So I gave the sql module a try, hoping it's not a deprecated habit.
If possible, I would like to keep the curly braces {} for parameters substitution instead of %s, except if it's a bad idea.
Environment
Ubuntu 18.04 64 bit 5.0.0-37-generic x86_64 GNU/Linux
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
psycopg2.__version__
'2.8.4 (dt dec pq3 ext lo64)'
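Compose the table name into the query with sql.Identifier via .format(), and pass the values separately to execute(). psycopg2 adapts a Python list to a PostgreSQL array, which is what ANY (%s) expects. You want something like: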
table = 'my_customers'
cities = ["Paris", "London", "Madrid"]
query = sql.SQL("SELECT * FROM {} WHERE city = ANY (%s)").format(sql.Identifier(table))
curs.execute(query, (cities,))
rows = curs.fetchall()
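The underlying idea is that identifiers are composed into the SQL text while values are bound at execute time. The sketch below illustrates that split with Python's built-in sqlite3 module rather than psycopg2, so it runs without a PostgreSQL server. The quote_identifier helper is hypothetical and only approximates what sql.Identifier does, and since SQLite has no ANY over arrays, it uses IN with one ? per value.

```python
import sqlite3

def quote_identifier(name: str) -> str:
    """Hypothetical helper: wrap an identifier in double quotes, doubling
    any embedded double quotes (standard SQL identifier quoting)."""
    return '"' + name.replace('"', '""') + '"'

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO my_customers VALUES (?, ?)",
    [("Ann", "Paris"), ("Bob", "Berlin"), ("Cy", "Madrid")],
)

table = "my_customers"
cities = ["Paris", "London", "Madrid"]

# Step 1: the identifier is composed into the SQL text.
# Step 2: the values stay as placeholders and are bound at execute time.
placeholders = ", ".join("?" for _ in cities)
query = "SELECT name FROM {} WHERE city IN ({}) ORDER BY name".format(
    quote_identifier(table), placeholders
)
rows = conn.execute(query, cities).fetchall()
print(query)
print(rows)
```

In psycopg2 itself, prefer sql.Identifier over a hand-written helper like this one.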

How to use sql statement in django?

I want to get the latest date from my database.
Here is my sql statement.
select "RegDate"
from "Dev"
where "RegDate" = (
select max("RegDate") from "Dev")
It works in my database.
But how can I use it in django?
I tried the following code, but it returns errors. The code is in views.py.
Version 1:
lastest_date = Dev.objects.filter(reg_date=max(reg_date))
Error:
'NoneType' object is not iterable
Version 2:
last_activation_date = Dev.objects.filter(regdate='reg_date').order_by('-regdate')[0]
Error:
"'reg_date' value has an invalid format. It must be in YYYY-MM-DD HH:MM[:ss[.uuuuuu]][TZ] format."
I've defined reg_date at beginning of the class.
What should I do for this?
You are making this more complicated than it needs to be. Simply order by regdate in descending order and take the first row:
last_activation_dev = Dev.objects.order_by('-regdate').first()
.first() returns that Dev object if one exists, or None if there are no Dev objects.
If you are only interested in the regdate column itself, you can use .values_list(..):
last_activation_date = Dev.objects.order_by('-regdate').values_list('regdate', flat=True).first()
With .filter(regdate='reg_date') you were asking for Dev records whose regdate column equals the string 'reg_date'. Since 'reg_date' is not a valid datetime, this raised the error.

Hive sql struct mismatch

I have a table with columns like this:
table<mytable>
  field<myfield1> type(array<struct>)
    item<struct>
      cars(string)
      isRed(boolean)
      information(bigint)
When I perform the following query
select myfield1.isRed
from mytable
where myfield1.isRed = true
I get an error:
Argument type mismatch '': The 1st argument of EQUAL is expected to a primitive type, but list is found
When I query without the where the data looks something like this
[true,true,true]
[true,true,true,true,true,true]
[true]
[true, true]
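myfield1 is an array of structs, so myfield1.isRed evaluates to an array of booleans, as your output shows, and an array cannot be compared with a single boolean. Index one element first and then read its field. Hive array indexes start at 0, so this checks the first element:
select myfield1[0].isRed
from mytable
where myfield1[0].isRed = true
If any element may match, array_contains(myfield1.isRed, true) in the where clause should also work. The Hive language manual section on complex types has more on how to access them.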