pandas sqlite read_sql dynamic in clause - pandas

I am trying to use the pandas read_sql function to query some data from a SQLite DB. I need a parameterized SQL statement that contains an IN clause (built from a list) plus some static parameters.
Below is my query:
battingDataQuery = ('SELECT ID, MATCH_DATE, ROLE, DOWN_NUM, NAME, RUNS,'
'MATCH_ID, TEAM_NAME, VERSUS_TEAM_NAME, GROUND_NAME '
'FROM BATTING_DATA WHERE ID in ({1}) '
'AND DOWN_NUM < {0} AND MATCH_TYPE = {0}')
I have added the placeholders appropriately using format
battingDataQuery = battingDataQuery.format('?', ','.join('?' * len(playerIdList)))
My generated SQL is as follows:
'SELECT ID FROM BATTING_DATA WHERE ID in (?,?,?,?,?) AND DOWN_NUM < ? AND MATCH_TYPE = ?'
I am stuck at the last part, where I pass the parameters as follows:
battingDataDF = pd.read_sql_query(battingDataQuery , conn, params=(playerIdList,battingDownNum,'\'T20\''))
I get the following error when using this:
Incorrect number of bindings supplied. The current statement uses 7,
and there are 3 supplied.
I have tried the following variations but still get the same error:
battingDataDF = pd.read_sql_query(battingDataQuery , conn, params=[playerIdList,battingDownNum,'\'T20\'']) # same error
battingDataDF = pd.read_sql_query(battingDataQuery , conn, params=[tuple(playerIdList),battingDownNum,'\'T20\'']) # same error

You should supply a list of 7 parameters for your 7 question marks:
battingDataDF = pd.read_sql_query(battingDataQuery , conn, params=playerIdList + [battingDownNum, "'T20'"])
(You supplied 3 parameters: a list of 5 numbers, a number, and a string. Hence the error.)

The answer given by @stef worked, but I found another variation that also works, so I am posting it for the sake of completeness:
battingDataDF = pd.read_sql_query(battingDataQuery , conn, params=(*playerIdList,battingDownNum,matchType))
The * unpacks the list, so the correct number of arguments is supplied.
I am not sure which approach is better. If someone can shed some light on this, that would be great.
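As a rough comparison of the two approaches, here is a minimal sketch that uses the standard-library sqlite3 module directly instead of pandas. The table contents and the snake_case variable names are made up for illustration. Both forms produce the same flat sequence of values, so the choice looks like a matter of style. Note also that a bound string is passed without embedded quotes: the driver handles quoting, so "'T20'" would be compared as a value that literally contains the quote characters.

```python
import sqlite3

# Hypothetical stand-in for the BATTING_DATA table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BATTING_DATA (ID INTEGER, DOWN_NUM INTEGER, MATCH_TYPE TEXT)"
)
conn.executemany(
    "INSERT INTO BATTING_DATA VALUES (?, ?, ?)",
    [(1, 2, "T20"), (2, 5, "T20"), (3, 1, "ODI"), (4, 3, "T20")],
)

player_id_list = [1, 2, 3]
batting_down_num = 4
match_type = "T20"  # plain string, no embedded quotes

# One "?" per list element, then one for each static parameter.
placeholders = ",".join("?" * len(player_id_list))
query = (
    "SELECT ID FROM BATTING_DATA "
    f"WHERE ID IN ({placeholders}) AND DOWN_NUM < ? AND MATCH_TYPE = ? "
    "ORDER BY ID"
)

# Both forms flatten the list so that every "?" gets exactly one value.
params_concat = player_id_list + [batting_down_num, match_type]
params_unpacked = (*player_id_list, batting_down_num, match_type)

rows_concat = conn.execute(query, params_concat).fetchall()
rows_unpacked = conn.execute(query, params_unpacked).fetchall()
```

The same flat list or tuple can be handed to pd.read_sql_query through its params argument.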

Related

SqlAlchemy raw SQL queries strange errors [duplicate]

I've been facing some weird issues while studying raw SQL queries with SQLAlchemy.
sqlstr = 'SELECT "City" from CHICAGO_SCHOOLS;'
with engine.connect() as conn:
    result = conn.execute(text(sqlstr))
    print(result.all())
The query above returns hundreds of "Chicago" as results. So I just tried to get unique results:
sqlstr = 'SELECT DISTINCT "City" from CHICAGO_SCHOOLS;'
with engine.connect() as conn:
    result = conn.execute(text(sqlstr))
    print(result.all())
Now all I get is a weird error:
Exception: SQLNumResultCols failed: [IBM][CLI Driver][DB2/LINUXX8664]
SQL0134N Improper use of a string column, host variable, constant, or
function "City". SQLSTATE=42907
At first I thought it was somehow related to the DISTINCT set quantifier. So I tried the same query with another column.
sqlstr = 'SELECT DISTINCT "School ID" from CHICAGO_SCHOOLS;'
with engine.connect() as conn:
    result = conn.execute(text(sqlstr))
    print(result.all())
And in this query I got all expected results.
I can't work out what is wrong!
The issue was related to the column type. It was a CLOB type and that does not allow use of DISTINCT. Thanks to HoneyBadger

Python cx_oracle bind variable with a list of items

I have a query like this:
SELECT prodId, prod_name , prod_type FROM mytable WHERE prod_type in (:list_prod_names)
I want to get the information for a product depending on its type. The possible types are "day", "week", "weekend", and "month". Depending on the date, the filter might be a single one of those options or a combination of them.
This information (a list) is returned by the function prod_names(date_search).
I am using cx_oracle bindings with code like:
def get_prod_by_type(search_date: datetime):
    query_path = r'./queries/prod_by_name.sql'
    raw_query = open(query_path).read().strip().replace('\n', ' ').replace('\t', ' ').replace('  ', ' ')
    print(raw_query)
    # Depending on the date the product types may be different
    prod_names(search_date)  # This returns a list with possible names
    qry_params = {"list_prod_names": prod_names}  # See attempts below
    try:
        db = DB(username='username', password='pss', hostname="localhost")
        df = db.get(raw_query, qry_params)
    except Exception:
        exception_error = traceback.format_exc()
        exception_error = 'Exception on DB.get_short_cov_op2() :\n%s' % exception_error
        print(exception_error)
    return df
For this: qry_params = {"list_prod_names": prod_names} I have tried multiple different things such as:
prod_names = ''.join(prod_names)
prod_names = str(prod_names)
prod_names = " \'"+''.join(prod_names)+"\'"
The only way I have managed to get it to work is:
new_query = raw_query.format(list_prod_names=prodnames_for_date(search_date)).replace('[', '').replace(']','')
df = db.query(new_query)
I am trying not to use .format(), because formatting values directly into an SQL string is bad practice and leaves the query open to injection attacks.
db.py contains among other functions:
def get(self, sql, params={}):
    cur = self.con.cursor()
    cur.prepare(sql)
    try:
        cur.execute(sql, **params)
        df = pd.DataFrame(cur.fetchall(), columns=[c[0] for c in cur.description])
    except Exception:
        exception_error = traceback.format_exc()
        exception_error = 'Exception on DB.get() :\n%s' % exception_error
        print(exception_error)
        self.con.rollback()
    cur.close()
    df.columns = df.columns.map(lambda x: x.upper())
    return df
I would like to be able to do a type binding.
I am using:
python = 3.6
cx_oracle = 6.3.1
I have read the following articles but I am still unable to find a solution:
Python cx_Oracle bind variables
Python cx_Oracle SQL with bind string variable
Search for name in cx_Oracle
Unfortunately you cannot bind an array directly unless you convert it to a SQL type and use a subquery -- which is fairly complex. So instead you need to do something like this:
inClauseParts = []
for i, inValue in enumerate(ARRAY_VALUE):
    argName = "arg_" + str(i + 1)
    inClauseParts.append(":" + argName)
clause = "%s in (%s)" % (columnName, ",".join(inClauseParts))
This works fine, but be aware that if the number of elements in the array changes regularly, this technique will create a separate statement that must be parsed for each distinct number of elements. If you know that (in general) you won't have more than, for example, 10 elements in the array, it would be better to append None to the incoming array so that the number of elements is always 10.
Hopefully that is clear enough!
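To make the padding idea concrete, here is a minimal sketch in plain Python. The helper name build_in_clause and its pad_to parameter are hypothetical, not part of cx_Oracle. It returns the clause text together with a dict of bind values, padded with None so that the statement text stays the same for any list up to the chosen size.

```python
def build_in_clause(column_name, values, pad_to=None):
    """Return (clause, binds) for a named-bind IN clause.

    Hypothetical helper: pads the value list with None up to pad_to so the
    generated SQL text is identical for every list of that size or smaller.
    """
    padded = list(values)
    if pad_to is not None and len(padded) < pad_to:
        padded += [None] * (pad_to - len(padded))
    bind_names = ["arg_%d" % (i + 1) for i in range(len(padded))]
    clause = "%s in (%s)" % (
        column_name,
        ",".join(":" + name for name in bind_names),
    )
    binds = dict(zip(bind_names, padded))
    return clause, binds


clause, binds = build_in_clause("prod_type", ["day", "week"], pad_to=4)
```

The resulting dict could then be passed as the bind parameters when executing a statement that contains the clause. A NULL in an IN list never matches a row, so the padding entries do not change the result.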
I have finally managed to do it. It might not be pretty, but it works.
I have modified my sql query to include an extra select which returns the value of my list of descriptors:
inner join (
SELECT regexp_substr(:my_list_of_items, '[^,]+', 1, LEVEL) as mylist
FROM dual
CONNECT BY LEVEL <= length(:my_list_of_items) - length(REPLACE(:my_list_of_items, ',', '')) + 1
) d
on d.mylist= a.corresponding_columns

Passing a parameter to a sql query using pyodbc failing

I have read dozens of similar posts and tried everything but I still get an error message when trying to pass a parameter to a simple query using pyodbc. Apologies if there is an answer to this elsewhere but I cannot find it
I have a very simple table:
select * from Test
yields
a
b
c
This works fine:
import pyodbc
import pandas
connection = pyodbc.connect('DSN=HyperCube SYSTEST',autocommit=True)
result = pandas.read_sql("""select * from Test where value = 'a'""",connection,params=None)
print(result)
result:
value
0 a
However if I try to do the where clause with a parameter it fails
result = pandas.read_sql("""select * from Test where value = ?""",connection,params='a')
yields
Error: ('01S02', '[01S02] Unknown column/parameter value (9001) (SQLPrepare)')
I also tried this
cursor = connection.cursor()
cursor.execute("""select * from Test where value = ?""",['a'])
pyodbcResults = cursor.fetchall()
and still received the same error
Does anyone know what is going on? Could it be an issue with the database I am querying?
PS. I looked at the following post and the syntax there in the first part of answer 9 where dates are passed by strings looks identical to what I am doing
pyodbc the sql contains 0 parameter markers but 1 parameters were supplied' 'hy000'
Thanks
pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None) (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_sql.html)
params : list, tuple or dict, optional, default: None
Example (for drivers that use the %s placeholder style; pyodbc uses ? instead):
cursor.execute("select * from Test where value = %s",['a'])
Or a named-arguments example:
result = pandas.read_sql(('select * from Test where value = %(par)s'),
db,params={"par":'p'})
In pyodbc, pass the parameters directly after the SQL argument:
cursor.execute(sql, *parameters)
for example:
onepar = 'a'
cursor.execute("select * from Test where value = ?", onepar)
cursor.execute("select a from tbl where b=? and c=?", x, y)
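For comparison, here is a small sketch of the same query with the ? placeholder, using the standard-library sqlite3 module as a stand-in for pyodbc (both use the "qmark" placeholder style). The in-memory table is made up to mirror the Test table in the question. The single value is wrapped in a list, which matches the list, tuple or dict that the pandas documentation quoted above expects for params. Whether this resolves the driver-specific error in the question is not something this sketch can show.

```python
import sqlite3

# In-memory stand-in for the Test table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (value TEXT)")
conn.executemany("INSERT INTO Test VALUES (?)", [("a",), ("b",), ("c",)])

# Even a single parameter is supplied as a sequence.
params = ["a"]
rows = conn.execute("SELECT value FROM Test WHERE value = ?", params).fetchall()
```

With pandas, the equivalent would be to pass params=['a'] rather than params='a'.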

Enter Unspecified Number of Variables into Postgres Psycopg2 SQL query

I'm trying to retrieve some data from a PostgreSQL database using psycopg2, and either exclude a variable number of rows or exclude none.
The code I have so far is:
def db_query(variables):
    cursor.execute('SELECT * '
                   'FROM database.table '
                   'WHERE id NOT IN (%s)', (variables,))
This does partially work. E.g. If I call:
db_query('593')
It works. The same for any other single value. However, I cannot seem to get it to work when I enter more than one variable, eg:
db_query('593, 595')
I get the error:
psycopg2.DataError: invalid input syntax for integer: "593, 595"
I'm not sure how to enter the query correctly or amend the SQL query. Any help appreciated.
Thanks
Pass a tuple, since psycopg2 adapts it to a record:
query = """
select *
from database.table
where id not in %s
"""
var1 = 593
argument = (var1,)
print(cursor.mogrify(query, (argument,)).decode('utf8'))
#cursor.execute(query, (argument,))
Output:
select *
from database.table
where id not in (593)
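The answer's example uses a single value. For several values, or for none, the input first has to become a tuple of integers. Below is a minimal sketch of that step in plain Python; the helper name build_exclusion_query is hypothetical. It skips the WHERE clause entirely for an empty input, because SQL does not allow an empty list in IN and the psycopg2 documentation advises guarding against empty tuples.

```python
def build_exclusion_query(raw_ids):
    """Turn a string such as '593, 595' into (query, params).

    Hypothetical helper: an empty string means "exclude nothing".
    """
    ids = tuple(int(part) for part in raw_ids.split(",") if part.strip())
    base = "SELECT * FROM database.table"
    if not ids:
        # No ids to exclude: leave out the WHERE clause altogether.
        return base, ()
    # The whole tuple is one parameter; psycopg2 expands it for NOT IN.
    return base + " WHERE id NOT IN %s", (ids,)


query_many, params_many = build_exclusion_query("593, 595")
query_none, params_none = build_exclusion_query("")
```

Each pair could then be passed on as cursor.execute(query, params).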

Pyodbc and Access with query parameter that contains a period

I recently found a bug with some Access SQL queries that I can't seem to track down. I have a fairly straightforward SQL query that I use to retrieve data from an access database that's "managed" in an older application (ie the data is already in the database and I have no real control over what's in there).
import pyodbc
MDB = '******.MDB'
DRV = '{Microsoft Access Driver (*.mdb)}'
PWD = ''
con = pyodbc.connect('DRIVER={};DBQ={};PWD={}'.format(DRV, MDB, PWD))
sql = ('SELECT Estim.PartNo, Estim.Descrip, Estim.CustCode, Estim.User_Text1, Estim.Revision, ' +
'Estim.Comments, Routing.PartNo AS RPartNo, Routing.StepNo, Routing.WorkCntr, Routing.VendCode, ' +
'Routing.Descrip AS StepDescrip, Routing.SetupTime, Routing.CycleTime, ' +
'Routing.WorkOrVend, ' +
'Materials.PartNo as MatPartNo, Materials.SubPartNo, Materials.Qty, ' +
'Materials.Unit, Materials.TotalQty, Materials.ItemNo, Materials.Vendor ' +
'FROM (( Estim ' +
'INNER JOIN Routing ON Estim.PartNo = Routing.PartNo ) ' +
'INNER JOIN Materials ON Estim.PartNo = Materials.PartNo )')
if 'PartNo' in kwargs:
    key = kwargs['PartNo']
    sql = sql + ' WHERE Estim.PartNo=?'
    cursor = con.cursor().execute(sql, key)
    # use this for debugging only
    num = 0
    for row in cursor.fetchall():
        num += 1
    return num
This works fine for all PartNo except when PartNo contains a decimal point. Curiously, when PartNo contains a decimal point AND a hyphen, I get the appropriate record(s).
kwargs['PartNo'] = "100.100-2" # returns 1 record
kwargs['PartNo'] = "200.100" # returns 0 records
Both PartNos exist when viewed in the other application, so I know there should be records returned for both queries.
My first thought was to ensure kwargs['PartNo'] is a string, using key = str(kwargs['PartNo']), but that made no difference.
I also tried to place quotes around the 'PartNo' value, with no success: key = '\'' + kwargs['PartNo'] + '\''
Finally, I tried to escape the . with no success (I realize this would break most queries, but I'm just trying to track down the issue with a single period): key = str(kwargs['PartNo']).replace('.', '"."')
I know using query parameters should handle all the escaping for me, but at this point, I'm just trying to figure out what's going on. Any thoughts on this?
So the issue isn't with the query parameters - everything works as it should. The problem is with the SQL statement. I incorrectly assumed - and never checked - that there was a record in the Materials table that matched PartNo.
INNER JOIN Materials ON Estim.PartNo = Materials.PartNo
will only return a record if PartNo is found in both tables, which in this particular case it is not.
Changing it to
LEFT OUTER JOIN Materials ON Estim.PartNo = Materials.PartNo
produces the expected results. See this for info on JOINS. https://msdn.microsoft.com/en-us/library/bb243855(v=office.12).aspx
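The difference between the two joins can be reproduced with a small sketch. It uses the standard-library sqlite3 module instead of Access, and the two tiny tables are invented for illustration: one part has a matching Materials row and the other does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Estim (PartNo TEXT);
    CREATE TABLE Materials (PartNo TEXT, Qty INTEGER);
    INSERT INTO Estim VALUES ('100.100-2'), ('200.100');
    INSERT INTO Materials VALUES ('100.100-2', 3);
""")

inner_sql = (
    "SELECT Estim.PartNo, Materials.Qty FROM Estim "
    "INNER JOIN Materials ON Estim.PartNo = Materials.PartNo "
    "WHERE Estim.PartNo = ?"
)
left_sql = inner_sql.replace("INNER JOIN", "LEFT OUTER JOIN")

# '200.100' has no Materials row, so only the outer join returns it.
inner_rows = conn.execute(inner_sql, ("200.100",)).fetchall()
left_rows = conn.execute(left_sql, ("200.100",)).fetchall()
```

Here inner_rows is empty, while left_rows contains the Estim row with None in place of the missing Materials column.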
As for print(repr(key)): Flask handles the kwarg type properly upstream:
api.add_resource(PartAPI, '/api/v1.0/part/<string:PartNo>')
so when I ran this in the browser, I got the "full length" strings. When running it from the command line using python -c ......., I was not handling the argument type properly, as Gord pointed out, so the trailing zeros were being truncated. I didn't think the Flask portion was relevant, so I never included it in the original question.
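One way trailing zeros can get lost is sketched below, on the assumption (not confirmed in the post) that the part number reached the code as a float rather than a string when run from the command line. The variable names are illustrative only.

```python
# The same part number as text and as an unquoted numeric literal.
part_no_text = "200.100"
part_no_float = 200.100

key_from_text = str(part_no_text)    # keeps every character
key_from_float = str(part_no_float)  # the float has no trailing zeros to keep
```

A key built from the float no longer equals the stored PartNo, so the lookup finds nothing even though the query itself is correct.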