I'd like to use the IN clause with a prepared Oracle statement using cx_Oracle in Python.
E.g. query - select name from employee where id in ('101', '102', '103')
On the Python side, I have a list [101, 102, 103], which I converted to a string like this: ('101', '102', '103'), and used the following code in Python -
import cx_Oracle
ids = [101, 102, 103]
ALL_IDS = "('{0}')".format("','".join(map(str, ids)))
conn = cx_Oracle.connect('username', 'pass', 'schema')
cursor = conn.cursor()
results = cursor.execute('select name from employee where id in :id_list', id_list=ALL_IDS)
names = [x[0] for x in cursor.description]
rows = results.fetchall()
This doesn't work. Am I doing something wrong?
This concept is not supported by Oracle -- and you are definitely not the first person to try this approach either! You must either:
create separate bind variables for each IN value -- something that is fairly easy and straightforward to do in Python (see the sketch after this list)
create a subquery using the cast operator on Oracle types as is shown in this post: https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::p11_question_id:210612357425
use a stored procedure to accept the array and perform multiple queries directly within PL/SQL
or do something else entirely!
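A minimal sketch of the first option, assuming the employee table, id column, and connection details from the question:
import cx_Oracle

ids = [101, 102, 103]
conn = cx_Oracle.connect('username', 'pass', 'schema')
cursor = conn.cursor()

# build one named bind variable per value: :id_0, :id_1, :id_2
bind_names = [':id_%d' % i for i in range(len(ids))]
sql = 'select name from employee where id in (%s)' % ', '.join(bind_names)
bind_values = {'id_%d' % i: v for i, v in enumerate(ids)}

cursor.execute(sql, bind_values)
rows = cursor.fetchall()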
Just transform your list into a tuple and format the SQL string with it:
ids = [101, 102, 103]
param = tuple(ids)
results = cursor.execute("select name from employee where id IN {}".format(param))
Another option is to format the query string directly.
import cx_Oracle
ids = [101, 102, 103]
ALL_IDS = "('{0}')".format("','".join(map(str, ids)))
conn = cx_Oracle.connect('username', 'pass', 'schema')
cursor = conn.cursor()
query = "select name from employee where id in {}".format(ALL_IDS)
results = cursor.execute(query)
names = [x[0] for x in cursor.description]
rows = results.fetchall()
Since you created the string, you're almost there. This should work:
results = cursor.execute('select name from employee where id in ' + ALL_IDS)
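With the ALL_IDS string built in the question, the concatenated statement would look like this:
print('select name from employee where id in ' + ALL_IDS)
# select name from employee where id in ('101','102','103')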
Related
I'm reading and executing SQL queries from a file and I need to inspect the result sets to count all the null values across all columns. Because the SQL is read from a file, I don't know the column names and thus can't refer to the columns by name when trying to find the null values.
I think using CTE is the best way to do this, but how can I call the columns when I don't know what the column names are?
WITH query_results AS
(
<sql_read_from_file_here>
)
select count_if(<column_name> is null) FROM query_results
If you are using Python to read the file of SQL statements, you can do something like this which uses pglast to parse the SQL query to get the columns for you:
import pglast
sql_read_from_file_here = "SELECT 1 foo, 1 bar"
ast = pglast.parse_sql(sql_read_from_file_here)
cols = ast[0]['RawStmt']['stmt']['SelectStmt']['targetList']
sum_stmt = "sum(iff({col} is null,1,0))"
sums = [sum_stmt.format(col = col['ResTarget']['name']) for col in cols]
print(f"select {' + '.join(sums)} total_null_count from query_results")
# outputs: select sum(iff(foo is null,1,0)) + sum(iff(bar is null,1,0)) total_null_count from query_results
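To wrap it in the CTE from the question, the pieces could be glued together like this (a sketch; query_results and total_null_count are just the names used above):
final_query = "WITH query_results AS ({}) select {} total_null_count from query_results".format(
    sql_read_from_file_here, " + ".join(sums))
print(final_query)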
I am executing a SQL query from a Python script to retrieve data from Snowflake on Windows 10, but the resulting output is missing the column names; they are replaced by 0, 1, 2, 3 and so on. Executing the query in the Snowflake interface and downloading a CSV gives the columns in the file. I am passing column names as aliases in my query.
Below is the code:
def _CONSUMPTION(con):
data2 = con.cursor().execute("""select sd.sales_force_lvl_1_code "Plan-To Code",sd.sales_force_lvl_1_desc "Plan-To Description",pd.matl_code "Product Code",pd.matl_desc "Product Description",pd.ean_upc_code "UPC",dd.fiscal_week_desc "Fiscal Week Description",f.unit_sales_qty "Sales Units",f.incr_units_qty "Incremental Units"
from DW.consumption_fact1 f, DW.market_dim md, DW.matl_dim pd, DW.fiscal_week_dim dd, (select sales_force_lvl_1_code,max(sales_force_lvl_1_desc) sales_force_lvl_1_desc from DW.mv_us_sales_force_dim group by sales_force_lvl_1_code) sd
where dd.fiscal_week_key = f.fiscal_week_key
and pd.matl_key = f.matl_key
and md.market_key = f.market_key
and sd.sales_force_lvl_1_code = md.curr_sales_force_lvl_1_code
and dd.fiscal_week_key between (select curr_fy_week_key-6 from DW.curr_date_lkp) and (select curr_fy_week_key-1 from DW.curr_date_lkp)
and f.company_key = 6006
and (f.unit_sales_qty <> 0 and f.sales_amt <> 0)
and md.curr_sales_force_lvl_1_code is not null
UNION
select '5000016240' "Plan-To Code", 'AWG TOTAL' "Plan-To Description",pd.matl_code "Product Code",pd.matl_desc "Product Description",pd.ean_upc_code "UPC",dd.fiscal_week_desc "Fiscal Week Description",f.unit_sales_qty "Sales Units",f.incr_units_qty "Incremental Units"
from DW.consumption_fact1 f, DW.market_dim md, DW.matl_dim pd, DW.fiscal_week_dim dd
where dd.fiscal_week_key = f.fiscal_week_key
and pd.matl_key = f.matl_key
and md.market_key = f.market_key
and dd.fiscal_week_key between (select curr_fy_week_key-6 from DW.curr_date_lkp) and (select curr_fy_week_key-1 from DW.curr_date_lkp)
and f.company_key = 6006
and (f.unit_sales_qty <> 0 and f.sales_amt <> 0)
and md.market_code = '20267'""").fetchall()
df = pd.DataFrame(data2)
df.head(5)
df.to_csv('CONSUMPTION.csv',index = False)
Looking at the docs, it seems the easiest way is to use the cursor method .fetch_pandas_all():
query = "SELECT 1 a, 2 b, 'a' c UNION ALL SELECT 7,4,'snow'"
cur = connection.cursor()
cur.execute(query).fetch_pandas_all()
Or if you want to dump the results into a CSV, just do so as in the question:
query = "SELECT 1 a, 2 b, 'a' c UNION ALL SELECT 7,4,'snow'"
cur = connection.cursor()
df = cur.execute(query).fetch_pandas_all()
df.to_csv('x.csv', index = False)
Looks like you haven't set the column names when building the data frame.
My recommendation is to set them explicitly first, e.g. via df.columns.
In addition, refer to the Snowflake page for details:
https://docs.snowflake.com/en/user-guide/python-connector-pandas.html
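A minimal sketch of that idea, assuming cur is the cursor that executed the query and data2 is the fetched result from the question:
df = pd.DataFrame(data2)
df.columns = [col[0] for col in cur.description]  # column names come from the cursor metadata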
Try this
import pandas as pd
def fetch_pandas_old(cur, sql):
    cur.execute(sql)
    rows = 0
    while True:
        dat = cur.fetchmany(50000)
        if not dat:
            break
        # take the column names from the cursor metadata
        df = pd.DataFrame(dat, columns=[col[0] for col in cur.description])
        rows += df.shape[0]
    print(rows)
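Called with an existing Snowflake cursor it would look something like this (the query here is just a stand-in):
cur = con.cursor()
fetch_pandas_old(cur, "select 1 a, 2 b union all select 7, 4")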
A nice way to extract the column headings from the cursor description and save in a pandas df using the Snowflake connector (also works for psycopg2 btw) is as follows:
import snowflake.connector

# Create the connection
def connect_snowflake(uname, pword, acct, role_name, whouse, dbase, schema_name):
    conn = snowflake.connector.connect(
        user=uname,
        password=pword,
        account=acct,
        role=role_name,
        warehouse=whouse,
        database=dbase,
        schema=schema_name
    )
    cur = conn.cursor()
    return conn, cur
Then execute your query. The cur.description object returns a list of tuples, the first element of each being the column name :)
conn, cur = connect_snowflake(username, password, account_name, role, warehouse, database, schema)
cur.execute('select * from my_schema.my_table')
result =cur.fetchall()
# Extract the column names
col_names = []
for elt in cur.description:
col_names.append(elt[0])
df = pd.DataFrame(result, columns=col_names)
cur.close()
conn.close()
I have a query like this:
SELECT prodId, prod_name , prod_type FROM mytable WHERE prod_type in (:list_prod_names)
I want to get the information for a product. The possible types are: "day", "week", "weekend", "month". Depending on the date it might be at least one of those options, or a combination of all of them.
This info (a Python list) is returned by the function prod_names(date_search).
I am using cx_oracle bindings with code like:
def get_prod_by_type(search_date: datetime):
    query_path = r'./queries/prod_by_name.sql'
    raw_query = open(query_path).read().strip().replace('\n', ' ').replace('\t', ' ').replace('  ', ' ')
    print(raw_query)
    # Depending on the date the product types may be different
    prod_names(search_date)  # This returns a list with possible names
    qry_params = {"list_prod_names": prod_names}  # See attempts below
    try:
        db = DB(username='username', password='pss', hostname="localhost")
        df = db.get(raw_query, qry_params)
    except Exception:
        exception_error = traceback.format_exc()
        exception_error = 'Exception on DB.get_short_cov_op2() : %s\n%s' % exception_error
        print(exception_error)
    return df
For this: qry_params = {"list_prod_names": prod_names} I have tried multiple different things such as:
prod_names = ''.join(prod_names)
prod_names = str(prod_names)
prod_names = " \'"+''.join(prod_names)+"\'"
The only way I have managed to get it to work is by doing:
new_query = raw_query.format(list_prod_names=prodnames_for_date(search_date)).replace('[', '').replace(']','')
df = db.query(new_query)
I am trying not to use .format(), because applying .format() to SQL is bad practice and opens the door to injection attacks.
db.py contains among other functions:
def get(self, sql, params={}):
    cur = self.con.cursor()
    cur.prepare(sql)
    try:
        cur.execute(sql, **params)
        df = pd.DataFrame(cur.fetchall(), columns=[c[0] for c in cur.description])
    except Exception:
        exception_error = traceback.format_exc()
        exception_error = 'Exception on DB.get() : %s\n%s' % exception_error
        print(exception_error)
        self.con.rollback()
    cur.close()
    df.columns = df.columns.map(lambda x: x.upper())
    return df
I would like to be able to do a type binding.
I am using:
python = 3.6
cx_oracle = 6.3.1
I have read the following articles but I am still unable to find a solution:
Python cx_Oracle bind variables
Python cx_Oracle SQL with bind string variable
Search for name in cx_Oracle
Unfortunately you cannot bind an array directly unless you convert it to a SQL type and use a subquery -- which is fairly complex. So instead you need to do something like this:
inClauseParts = []
for i, inValue in enumerate(ARRAY_VALUE):
    argName = "arg_" + str(i + 1)
    inClauseParts.append(":" + argName)
clause = "%s in (%s)" % (columnName, ",".join(inClauseParts))
This works fine, but be aware that if the number of elements in the array changes regularly, this technique will create a separate statement that must be parsed for each distinct number of elements. If you know that (in general) you won't have more than (for example) 10 elements in the array, it would be better to append None to the incoming array so that the number of elements is always 10.
Hopefully that is clear enough!
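A rough end-to-end sketch of that advice for the query in this question, assuming cur is an open cx_Oracle cursor and that padding to 10 bind variables is acceptable:
prod_list = ['day', 'week']                       # whatever prod_names(search_date) returned
MAX_IN_SIZE = 10
padded = prod_list + [None] * (MAX_IN_SIZE - len(prod_list))  # keep the statement shape constant

bind_names = []
bind_values = {}
for i, value in enumerate(padded):
    arg_name = "arg_" + str(i + 1)
    bind_names.append(":" + arg_name)
    bind_values[arg_name] = value

sql = "SELECT prodId, prod_name, prod_type FROM mytable WHERE prod_type in ({})".format(", ".join(bind_names))
cur.execute(sql, bind_values)
rows = cur.fetchall()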
I have finally managed to do it. It might not be pretty but it works.
I have modified my sql query to include an extra select which returns the value of my list of descriptors:
inner join (
SELECT regexp_substr(:my_list_of_items, '[^,]+', 1, LEVEL) as mylist
FROM dual
CONNECT BY LEVEL <= length(:my_list_of_items) - length(REPLACE(:my_list_of_items, ',', '')) + 1
) d
on d.mylist= a.corresponding_columns
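On the Python side the whole list is then bound as a single comma-separated string, e.g. (a sketch using the prod_names() function and db.get() helper from the question):
prod_list = prod_names(search_date)                    # e.g. ['day', 'week']
qry_params = {"my_list_of_items": ",".join(prod_list)}
df = db.get(raw_query, qry_params)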
I have read dozens of similar posts and tried everything, but I still get an error message when trying to pass a parameter to a simple query using pyodbc. Apologies if there is an answer to this elsewhere, but I cannot find it.
I have a very simple table:
select * from Test
yields
a
b
c
This works fine:
import pyodbc
import pandas
connection = pyodbc.connect('DSN=HyperCube SYSTEST',autocommit=True)
result = pandas.read_sql("""select * from Test where value = 'a'""",connection,params=None)
print(result)
result:
value
0 a
However if I try to do the where clause with a parameter it fails
result = pandas.read_sql("""select * from Test where value = ?""",connection,params='a')
yields
Error: ('01S02', '[01S02] Unknown column/parameter value (9001) (SQLPrepare)')
I also tried this
cursor = connection.cursor()
cursor.execute("""select * from Test where value = ?""",['a'])
pyodbcResults = cursor.fetchall()
and still received the same error
Does anyone know what is going on? Could it be an issue with the database I am querying?
PS. I looked at the following post, and the syntax there in the first part of answer 9, where dates are passed as strings, looks identical to what I am doing:
pyodbc the sql contains 0 parameter markers but 1 parameters were supplied' 'hy000'
Thanks
pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None) (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_sql.html)
params : list, tuple or dict, optional, default: None
example:
cursor.execute("select * from Test where value = %s",['a'])
or Named arguments example:
result = pandas.read_sql(('select * from Test where value = %(par)s'),
db,params={"par":'p'})
In pyodbc, write the parameters directly after the sql parameter:
cursor.execute(sql, *parameters)
for example:
onepar = 'a'
cursor.execute("select * from Test where value = ?", onepar)
cursor.execute("select a from tbl where b=? and c=?", x, y)
I use Jython 2.5.0 and mysql-connector-java-5.0.8-bin.jar, because my server is MySQL 5.0.38.
There is a problem in my Jython application.
Two functions:
def act_query(query):
    connection = conn  # conn is global and has been assigned when called
    cursor = connection.cursor()
    num_affected_rows = cursor.execute(query)
    cursor.close()
    connection.commit()
    return num_affected_rows

def get_row(query):
    connection = conn
    cursor = connection.cursor()
    cursor.execute(query)
    row = cursor.fetchone()
    cursor.close()
    return row
Then I do something like this:
query = """SELECT count(*) from test_db.test_table1 into @max"""
act_query(query)
get_query = """ select @max """
row = get_row(get_query)
print row
The output is something like: array('b', [49, 52, 54, 50, 56, 48, 52])
I tried to find the reason and made a test like this:
test_query = """select count(*) from test_db.test_table1"""
row = get_row(test_query)
print row
The output is the right answer.
In fact this was a CPython script with MySQLdb at first, and I turned it into Jython with zxJDBC.
It worked well in CPython but it doesn't work now.
In CPython the cursor can be defined as
cursor = connection.cursor(MySQLdb.cursors.DictCursor)
but in zxJDBC there seems to be no such option when creating a cursor.
Is this the reason?
Please show me the way. Thanks!
I got the answer from the Jython mailing list.
A session variable retrieved through JDBC is of VARBINARY type, and zxJDBC turns it into an array. The right output can be obtained by converting the array back into an int, e.g. with str(byte[]).
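A minimal sketch of that conversion in the Jython script: the returned array holds the ASCII codes of the digits, so array('b', [49, 52, 54, 50, 56, 48, 52]) is the string '1462804'.
row = get_row(get_query)
raw = row[0]                                   # array('b', [49, 52, 54, ...])
max_count = int(''.join(chr(b) for b in raw))  # decode the bytes back into the number
print max_count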