Passing query parameters in Pandas - sql-server-2005

Trying to write a simple pandas script which executes a query against SQL Server with a WHERE clause. However, the query doesn't return any values, possibly because the parameter is not passed? I thought we could pass the key-value pairs as below. Can you please point out what I am doing wrong here?
Posting just the query and relevant pieces. All the libraries have been imported as needed.
curr_sales_month = '2015-08-01'
sql_query = """SELECT sale_month,region,oem,nameplate,Model,Segment,Sales FROM [MONTHLY_SALES] WHERE Sale_Month = %(salesmonth)s"""
print ("Executed SQL Extract", sql_query)
df = pd.read_sql_query(sql_query,conn,params={"salesmonth":curr_sales_month})
The program returned with:
Closed Connection - Fetched 0 rows for Report
Process finished with exit code 0

Further to my comment, here is an example that uses pyodbc to communicate with SQL Server and demonstrates passing a variable.
import pandas as pd
import pyodbc
pd.set_option('display.max_columns',50)
pd.set_option('display.width',5000)
conn_str = r"DRIVER={0};SERVER={1};DATABASE={2};UID={3};PWD={4}".format("SQL Server",'.','master','user','pwd')
cnxn = pyodbc.connect(conn_str)
sql_statement = "SELECT * FROM sys.databases WHERE database_id = ?"
df = pd.read_sql_query(sql = sql_statement, con = cnxn, params = [2])
cnxn.close()
print(df.iloc[:,0:2].head())
which produces:
name database_id
0 tempdb 2
And if you wish to pass multiple parameters:
sql_statement = "SELECT * FROM sys.databases WHERE database_id > ? and database_id < ?"
df = pd.read_sql_query(sql = sql_statement, con = cnxn, params = [2,5])
cnxn.close()
print(df.iloc[:,0:2].head())
which produces:
name database_id
0 model 3
1 msdb 4
My preferred way is with dynamic inline SQL statements:
create_date = '2015-01-01'
name = 'mod'
sql_statement_template = r"""SELECT * FROM sys.databases WHERE database_id > {0} AND database_id < {1} AND create_date > '{2}' AND name LIKE '{3}%'"""
sql_statement = sql_statement_template.format('2','5',create_date,name)
print(sql_statement)
yields
SELECT * FROM sys.databases WHERE database_id > 2 AND database_id < 5 AND create_date > '2015-01-01' AND name LIKE 'mod%'
A further benefit of printing the statement is that you can copy and paste the SQL command into Management Studio (or equivalent) and test your SQL syntax easily.
and the result should be:
name database_id
0 model 3
So this example demonstrates handling date, string, and int datatypes, including a LIKE with the % wildcard.
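Applying the same qmark (?) style to the query from the original question would look roughly like the sketch below. This is only a sketch: the table and column names come from the question, and conn is assumed to be an open pyodbc connection like the one above (pyodbc expects ? markers, not the named %(salesmonth)s placeholder).
import pandas as pd

# Sketch only: the qmark placeholder replaces the named parameter, and the value
# is passed as a positional list; `conn` is assumed to be a pyodbc connection.
curr_sales_month = '2015-08-01'
sql_query = """SELECT sale_month, region, oem, nameplate, Model, Segment, Sales
               FROM [MONTHLY_SALES]
               WHERE Sale_Month = ?"""
df = pd.read_sql_query(sql_query, conn, params=[curr_sales_month])
print(df.head())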

Related

How to retrieve multiple rows from a SQL Server stored procedure using Pyodbc in Jupyter Notebook

I have a stored procedure in SQL Server which takes 3 input parameters and can produce multiple rows as output. In fact, it is returning 20 rows in my current case. For instance, if I manually execute the stored procedure from SSMS, the code and partial output are as follows:
Code:
DECLARE @return_value int
EXEC @return_value = [Coverage-Source].[ReadCoverageMapping]
    @client = N'Capital BlueCross',
    @lineOfBusiness = N'Commercial',
    @distributionChannel = N'Retail'
SELECT 'Return Value' = @return_value
GO
Output:
ID Attribute Value CoverageName
-----------------------------------------
1 Copay Yes Retail Base
2 Copay No Retail
. . . .
. . . .
Return Value
20
Now, when I try to call the stored procedure from a Jupyter notebook using pyodbc, I get an error:
Procedure or function ReadCoverageMapping has too many arguments specified
I want the output to be something like this:
ID Attribute Value CoverageName
------------------------------------------
1 Copay Yes Retail Base
2 Copay No Retail
. . . .
. . . .
I tried this code:
client = 'Capital BlueCross'
lineOfBusiness = 'Commercial'
distributionChannel = 'Retail'
cnxn = pyodbc.connect(r'Driver={SQL Server};Server=MyServer;Database=COV_SRCE_TEST;Trusted_Connection=yes;')
sql = """\
DECLARE @out nvarchar(max);
EXEC [cov_srce_test].[coverage-source].ReadCoverageMapping @client = ?, @lineOfBusiness = ?,
    @distributionChannel = ?, @param_out = @out OUTPUT;
SELECT @out AS the_output;
"""
values = (client, lineOfBusiness, distributionChannel)
cnxn.execute(sql, values)
rows = cnxn.fetchall()
while rows:
    print(rows)
    if cnxn.nextset():
        rows = cnxn.fetchall()
    else:
        rows = None
Is there any way to achieve this? I tried multiple approaches but couldn't find a solution.
Those batches are different. It looks like it should be:
sql = """\
DECLARE @return_value int
EXEC @return_value = [Coverage-Source].[ReadCoverageMapping]
    @client = ?,
    @lineOfBusiness = ?,
    @distributionChannel = ?
SELECT 'Return Value' = @return_value
"""
which will output two result sets: one from the stored procedure, and a second one for the return value, which you can probably skip.
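A minimal sketch of reading that first result set into a pandas DataFrame with pyodbc (the connection string and parameter values are the ones from the question, and sql is the corrected batch above):
import pandas as pd
import pyodbc

cnxn = pyodbc.connect(r'Driver={SQL Server};Server=MyServer;Database=COV_SRCE_TEST;Trusted_Connection=yes;')
values = ('Capital BlueCross', 'Commercial', 'Retail')
cursor = cnxn.execute(sql, values)  # `sql` holds the corrected batch shown above

# First result set: the rows produced by the stored procedure.
rows = cursor.fetchall()
columns = [col[0] for col in cursor.description]
df = pd.DataFrame.from_records([tuple(row) for row in rows], columns=columns)
print(df)

# Second result set: the return value, which you can read the same way or skip.
if cursor.nextset():
    print(cursor.fetchall())

cnxn.close()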

Dynamic SQL query formation using Python

I am new to Python and want to form a SQL query dynamically in Python, so I tried the below sample code:
empId = 12
query = ''' select name, ''' +
if empId > 10:
    '''basic_salary'''
else:
    ''' bonus'''
+ ''' from employee '''
print(query)
but I am getting a syntax error. Does anyone know how to form a dynamic query in Python?
You need to indicate that the assignment to query continues on the next line, which you can do with a \ at the end of the line. Also, you need to write the if statement as an inline if expression as you can't have an if statement in the middle of an assignment statement:
empId = 12
query = ''' select name, ''' + \
('''basic_salary''' if empId > 10 else ''' bonus''') + \
''' from employee '''
print(query)
Output:
select name, basic_salary from employee
If you have multiple conditions, you can just append to query inside each condition. For example:
empId = 6
query = 'select name, '
if empId > 10:
    query += 'basic_salary'
elif empId > 5:
    query += 'benefits'
else:
    query += 'bonus'
query += ' from employee'
print(query)
Output
select name, benefits from employee
# Dynamic SQL query formation using Python
# This is for PostgreSQL; you can use this approach for other queries as well:
def updateQuery(tableName, setFields, setValues, whereFields, whereValues):
    print("Generating update query started")
    querySetfields = None
    queryWhereFields = None
    # Loop for the SET fields
    for i in range(len(setFields)):
        if querySetfields is None:
            querySetfields = setFields[i] + "='" + setValues[i] + "'"
        else:
            querySetfields = querySetfields + "," + setFields[i] + "='" + setValues[i] + "'"
    # Loop for the WHERE fields (joined with AND so the SQL is valid)
    for i in range(len(whereFields)):
        if queryWhereFields is None:
            queryWhereFields = whereFields[i] + "='" + whereValues[i] + "'"
        else:
            queryWhereFields = queryWhereFields + " AND " + whereFields[i] + "='" + whereValues[i] + "'"
    # Form the complete update query
    query = "UPDATE " + tableName + " SET " + querySetfields + " WHERE " + queryWhereFields
    print("Generating update query completed")
    return query

print(updateQuery("EMPLOYEE_DETAILS",
                  ["EMP_LANID", "EMP_NAME", "EMP_EMAIL"], ["A", "B", "C"],
                  ["EMPI_ID", "EMP_LANID"], ["X", "Y"]))
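Since this builds the statement by splicing the values directly into the string, quoting and SQL injection are a concern. Below is a minimal alternative sketch, assuming a psycopg2 connection (the answer mentions PostgreSQL): the hypothetical update_query_params helper returns the statement with %s placeholders plus a parameter list, so the driver does the quoting.
def update_query_params(table_name, set_fields, set_values, where_fields, where_values):
    # Build "col = %s" pairs and let the driver substitute the values safely.
    set_clause = ", ".join("{} = %s".format(f) for f in set_fields)
    where_clause = " AND ".join("{} = %s".format(f) for f in where_fields)
    query = "UPDATE {} SET {} WHERE {}".format(table_name, set_clause, where_clause)
    return query, list(set_values) + list(where_values)

query, params = update_query_params(
    "EMPLOYEE_DETAILS",
    ["EMP_LANID", "EMP_NAME", "EMP_EMAIL"], ["A", "B", "C"],
    ["EMPI_ID", "EMP_LANID"], ["X", "Y"])
print(query)
# UPDATE EMPLOYEE_DETAILS SET EMP_LANID = %s, EMP_NAME = %s, EMP_EMAIL = %s WHERE EMPI_ID = %s AND EMP_LANID = %s
# cursor.execute(query, params)   # with a psycopg2 cursor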

Querying multiple postgres tables in python

I'm trying to query multiple SQL tables and store each of them as a pandas DataFrame.
cur = conn.cursor()
cur.execute("select relname from pg_class where relkind='r' and relname !~ '^(pg_|sql_)';")
tables_df = cur.fetchall()
##table_name_list = tables_df.table_name
select_template = ' SELECT * FROM {table_name}'
frames_dict = {}
for tname in tables_df:
    query = select_template.format(table_name=tname)
    frames_dict[tname] = pd.read_sql(query, conn)
But I'm getting an error like:
DatabaseError: Execution failed on sql ' SELECT * FROM ('customer',)': syntax
error at or near "'yesbank'"
LINE 1: SELECT * FROM ('customer',)
customer is the name of a table in my database that I get from the line
tables_df = cur.fetchall()
Per your error, it looks like you have a typo in the word format:
AttributeError: 'str' object has no attribute 'formate'
Try
query = select_template.format(table_name = tname)
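There is a second issue visible in the error output: SELECT * FROM ('customer',) shows that each row returned by cur.fetchall() is a one-element tuple, so the table name needs to be unpacked before it is formatted into the template. A minimal sketch, reusing conn and the pandas import from the question:
cur = conn.cursor()
cur.execute("select relname from pg_class where relkind='r' and relname !~ '^(pg_|sql_)';")
tables = cur.fetchall()

select_template = 'SELECT * FROM {table_name}'
frames_dict = {}
for (tname,) in tables:  # unpack the single-column row tuple
    query = select_template.format(table_name=tname)
    frames_dict[tname] = pd.read_sql(query, conn)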

Is it possible to invoke BigQuery procedures in python client?

Scripting/procedures for BigQuery just came out in beta - is it possible to invoke procedures using the BigQuery python client?
I tried:
query = """CALL `myproject.dataset.procedure`()...."""
job = client.query(query, location="US",)
print(job.results())
print(job.ddl_operation_performed)
print(job._properties)
but that didn't give me the result set from the procedure. Is it possible to get the results?
Thank you!
Edited - stored procedure I am calling
CREATE OR REPLACE PROCEDURE `Project.Dataset.Table`(IN country STRING, IN accessDate DATE, IN accessId, OUT saleExists INT64)
BEGIN
  IF EXISTS (SELECT 1 FROM dataset.table
             WHERE purchaseCountry = country AND purchaseDate = accessDate AND customerId = accessId) THEN
    SET saleExists = (SELECT 1);
  ELSE
    INSERT Dataset.MissingSalesTable (purchaseCountry, purchaseDate, customerId) VALUES (country, accessDate, accessId);
    SET saleExists = (SELECT 0);
  END IF;
END;
If you follow the CALL command with a SELECT statement, you can get the return value of the function as a result set. For example, I created the following stored procedure:
BEGIN
  -- Build an array of the top 100 names from the year 2017.
  DECLARE top_names ARRAY<STRING>;
  SET top_names = (
    SELECT ARRAY_AGG(name ORDER BY number DESC LIMIT 100)
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    WHERE year = 2017
  );
  -- Which names appear as words in Shakespeare's plays?
  SET top_shakespeare_names = (
    SELECT ARRAY_AGG(name)
    FROM UNNEST(top_names) AS name
    WHERE name IN (
      SELECT word
      FROM `bigquery-public-data.samples.shakespeare`
    )
  );
END
Running the following script will return the procedure's output as the top-level result set.
DECLARE top_shakespeare_names ARRAY<STRING> DEFAULT NULL;
CALL `my-project.test_dataset.top_names`(top_shakespeare_names);
SELECT top_shakespeare_names;
In Python:
from google.cloud import bigquery
client = bigquery.Client()
query_string = """
DECLARE top_shakespeare_names ARRAY<STRING> DEFAULT NULL;
CALL `swast-scratch.test_dataset.top_names`(top_shakespeare_names);
SELECT top_shakespeare_names;
"""
query_job = client.query(query_string)
rows = list(query_job.result())
print(rows)
Related: If you have SELECT statements within a stored procedure, you can walk the job to fetch the results, even if the SELECT statement isn't the last statement in the procedure.
# TODO(developer): Import the client library.
# from google.cloud import bigquery
# TODO(developer): Construct a BigQuery client object.
# client = bigquery.Client()
# Run a SQL script.
sql_script = """
-- Declare a variable to hold names as an array.
DECLARE top_names ARRAY<STRING>;
-- Build an array of the top 100 names from the year 2000.
SET top_names = (
SELECT ARRAY_AGG(name ORDER BY number DESC LIMIT 100)
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE year = 2000
);
-- Which names appear as words in Shakespeare's plays?
SELECT
name AS shakespeare_name
FROM UNNEST(top_names) AS name
WHERE name IN (
SELECT word
FROM `bigquery-public-data.samples.shakespeare`
);
"""
parent_job = client.query(sql_script)
# Wait for the whole script to finish.
rows_iterable = parent_job.result()
print("Script created {} child jobs.".format(parent_job.num_child_jobs))
# Fetch result rows for the final sub-job in the script.
rows = list(rows_iterable)
print("{} of the top 100 names from year 2000 also appear in Shakespeare's works.".format(len(rows)))
# Fetch jobs created by the SQL script.
child_jobs_iterable = client.list_jobs(parent_job=parent_job)
for child_job in child_jobs_iterable:
    child_rows = list(child_job.result())
    print("Child job with ID {} produced {} rows.".format(child_job.job_id, len(child_rows)))
It works if you have a SELECT inside your procedure. For example, given the procedure:
create or replace procedure dataset.proc_output() BEGIN
SELECT t FROM UNNEST(['1','2','3']) t;
END;
Code:
from google.cloud import bigquery
client = bigquery.Client()
query = """CALL dataset.proc_output()"""
job = client.query(query, location="US")
for result in job.result():
    print result
will output:
Row((u'1',), {u't': 0})
Row((u'2',), {u't': 0})
Row((u'3',), {u't': 0})
However, if there are multiple SELECT statements inside a procedure, only the last result set can be fetched this way.
Update
See the example below:
-- NOTE: an IN accessId parameter (type INT64 assumed) is added here so the INSERT below has a value for customerId.
CREATE OR REPLACE PROCEDURE zyun.exists(IN country STRING, IN accessDate DATE, IN accessId INT64, OUT saleExists INT64)
BEGIN
  SET saleExists = (WITH data AS (SELECT "US" purchaseCountry, DATE "2019-1-1" purchaseDate)
                    SELECT COUNT(*) FROM data WHERE purchaseCountry = country AND purchaseDate = accessDate);
  IF saleExists = 0 THEN
    INSERT Dataset.MissingSalesTable (purchaseCountry, purchaseDate, customerId) VALUES (country, accessDate, accessId);
  END IF;
END;

BEGIN
  DECLARE saleExists INT64;
  CALL zyun.exists("US", DATE "2019-2-1", 1, saleExists);  -- 1 is a placeholder accessId
  SELECT saleExists;
END
BTW, your example is much better served with a single MERGE statement instead of a script.

Logic to prepare SQL statements dynamically

So I have a requirement where I need to read through all the records of a file and insert them into another file if they meet a set of rules which are described in another table, as shown below.
After a record has been read from the first file, it has to meet all the sequences of at least one rule to be eligible to be written into the second table.
For example, once a record is read from the CAR file, the rules below have to be checked until all sequences of at least one rule set are satisfied. For this I was planning to create a dynamic SQL program something like this, but it does not work because prepared SQL does not support host variables.
If anybody can suggest or provide any guidance on how to create SQL statements dynamically and check whether records satisfy the required rules for them to be entered into the second file, it would be great.
So basically what I am looking for is: once I select a field from a table, how do I store it somewhere to do further validation and checking?
Update:
Based on the intelligent advice from Danny117, I have come up with the below code:
H Option(*NoDebugIO:*SrcStmt)
D RULEDS E DS EXTNAME(RULESTABLE)
D MAXRUL S 1 0
D MAXSEQ S 1 0
D STMT S 512
D WHERESTMT S 512 INZ('')
D FullSqlStmt S 512 INZ('')
D RULINDEX S 1 0 INZ(1)
D SEQINDEX S 1 0 INZ(1)
D APOS C CONST('''')
/Free
   Exec SQL SELECT MAX(RULENO) INTO :MAXRUL FROM RULESTABLE;
   Exec SQL DECLARE RULCRS CURSOR FOR SELECT * FROM RULESTABLE;
   Exec SQL OPEN RULCRS;
   Exec SQL FETCH RULCRS INTO :RULEDS;

   DoW (Sqlcod = 0 AND RULINDEX <= MAXRUL);
     Exec SQL SELECT MAX(SEQNO) INTO :MAXSEQ FROM RULESTABLE
              WHERE RULENO = :RULINDEX;

     DoW (SEQINDEX <= MAXSEQ);
       If (Position <> '');
         Field = 'SUBSTR(' + %Trim(Field) + ',' + %Trim(Position) + ','
                 + '1' + ')';
       EndIf;

       WhereStmt = %Trim(WhereStmt) + ' ' + %Trim(Field) + ' ' +
                   %Trim(Condition) + ' ' + APOS + %Trim(Value) + APOS;

       If (SeqIndex < MaxSeq);
         WhereStmt = %Trim(WhereStmt) + ' AND ';
       EndIf;

       Exec SQL FETCH NEXT FROM RULCRS INTO :RULEDS;
       SeqIndex = SeqIndex + 1;
     EndDo;

     FullSqlStmt = %Trim('INSERT INTO ITMRVAT SELECT * +
                          FROM ITMRVA WHERE ' + %Trim(WhereStmt));
     Exec SQL PREPARE InsertStmt FROM :FullSqlStmt;
     Exec SQL EXECUTE InsertStmt;
     RulIndex = RulIndex + 1;
   EndDo;
This produces the SQL statement shown below, which is what I want. Now let me go ahead and look at the other parts of the code.
> EVAL FullSqlStmt
FULLSQLSTMT =
....5...10...15...20...25...30...35...40...45...50...55...60
1 'INSERT INTO ITMRVAT SELECT * FROM ITMRVA WHERE STID = 'PLD' '
61 'AND ENGNO LIKE '%415015%' AND SUBSTR(ENGNO,1,1) = 'R' AND SU'
121 'BSTR(ENGNO,5,1) = 'Y' '
But the issue now is, as I mentioned in my comment to Danny, how to handle things if a new rule involving a second table is specified.
Embedded SQL does allow for 'dynamic statements' in ILE languages. You are able to have a query within a character field and then pass it into the Embedded SQL.
Dcl-S lQuery Varchar(100);
lQuery = 'SELECT * FROM CUST';
EXEC SQL
PREPARE SCust FROM :lQuery;
EXEC SQL
DECLARE SearchCust CURSOR FOR SCust;
//Continue working with cursor..
You may want to just prepare, execute and return a result set:
lQuery = 'SELECT * FROM CUST WHERE ID = ' + %Char(CustID);
EXEC SQL
PREPARE SCust FROM :lQuery;
EXEC SQL DECLARE c1 CURSOR FOR SCust;
EXEC SQL OPEN c1;
EXEC SQL FETCH c1 INTO :CustDS;
EXEC SQL CLOSE c1;
Optional extra: You may also want to use field markers (?) in your query.
//'SELECT * FROM CUST WHERE CUSTID = ?';
EXEC SQL OPEN SearchCust USING :CustID;
//'INSERT INTO CUST VALUES(?,?)';
EXEC SQL EXECUTE CUST USING :CustID;
You have to translate the rules into a join statement or a where clause. The join statement is more complex so go that route.
If you were smart (and you are), consider saving the rules as an SQL clause that you can join or use in a where clause. It's infinitely flexible this way and a more modern design.
rule 1 / car.year = 1990 and car.engno like '%43243%' and substring(car.vin,12,1) = 'X'
eval statement =
insert into sometable
Select car.* from car
join sysibm.sysdummy1
on car.year = 1990
and car.engno like '%43243%'
...etc on to rule 2 starting with "OR"
or car.year = PLD
and car.engno like '%1234%'
...etc other rules starting with "OR"
exec immediate statement