Django raw query fails with ILIKE? [duplicate]

Following this SO question, I'm trying to "truncate" all tables related to a certain Django application using the following raw SQL commands in Python:
cursor.execute("set foreign_key_checks = 0")
cursor.execute("select concat('truncate table ',table_schema,'.',table_name,';') as sql_stmt from information_schema.tables where table_schema = 'my_db' and table_type = 'base table' AND table_name LIKE 'some_prefix%'")
for sql in [sql[0] for sql in cursor.fetchall()]:
cursor.execute(sql)
cursor.execute("set foreign_key_checks = 1")
Alas I receive the following error:
C:\dev\my_project>my_script.py
Traceback (most recent call last):
  File "C:\dev\my_project\my_script.py", line 295, in <module>
    cursor.execute(r"select concat('truncate table ',table_schema,'.',table_name,';') as sql_stmt from information_schema.tables where table_schema = 'my_db' and table_type = 'base table' AND table_name LIKE 'some_prefix%'")
  File "C:\Python26\lib\site-packages\django\db\backends\util.py", line 18, in execute
    sql = self.db.ops.last_executed_query(self.cursor, sql, params)
  File "C:\Python26\lib\site-packages\django\db\backends\__init__.py", line 216, in last_executed_query
    return smart_unicode(sql) % u_params
TypeError: not enough arguments for format string
Is the % in the LIKE causing trouble? How can I work around it?

Have you tried %%? That quotes a % in Python string-formatting.
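A minimal sketch of both workarounds, reusing the cursor from the question (the simplified SELECT is illustrative):
# escape the literal % as %% so the string-formatting step leaves a single % behind
cursor.execute("select table_name from information_schema.tables where table_name like 'some_prefix%%'")
# or pass the pattern as a query parameter; a % inside the value needs no escaping
cursor.execute("select table_name from information_schema.tables where table_name like %s", ['some_prefix%'])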

Related

UndefinedTable: Connection doesn't exist

I have always used psycopg2 to connect my Postgres database from my scripts, but this is the first time that I get this error.
import psycopg2
import pandas as pd
conn = psycopg2.connect(
    host='dabname',
    database='db',
    user='myuser',
    password='mypassword')
cursor = conn.cursor()
cursor.execute("select relname from pg_class where relkind='r' and relname !~ '^(pg_|sql_)';")
print(cursor.fetchall())
and this returns:
[('tb_orcamento_mes',), ('tb_notificacao',), ('tb_grupo_premissa',), ('tb_premissa',), ('tb_etapas',), ('tb_orcamento_anual',)]
But, when I try to fetch all records from 'tb_premissa', I get an error:
cursor = conn.cursor()
cursor.execute("SELECT * FROM tb_premissa")
quantitativos_realizado = cursor.fetchall()
results in this error:
UndefinedTable: ERROR: relation "tb_premissa" does not exist
LINE 1: SELECT * FROM tb_premissa
Does anyone have an idea?
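One thing worth checking, as a minimal sketch assuming the same psycopg2 connection: the pg_class query lists tables from every schema, while the unqualified SELECT resolves the name through the current search_path, so a table in a non-default schema shows up in the first query but still fails in the second:
cursor = conn.cursor()
# pg_tables reports which schema each table lives in
cursor.execute("select schemaname, tablename from pg_tables where tablename = 'tb_premissa';")
print(cursor.fetchall())
# if that schema is not on the search_path, qualify the name explicitly;
# 'some_schema' is a placeholder for whatever the query above reports
cursor.execute("SELECT * FROM some_schema.tb_premissa")
print(cursor.fetchall())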

How to read a large table (>100 columns (variables) and 100,000 observations) from SQL Server into R using ODBC package

I'm getting an error when reading a large table into R from SQL Server.
Here is my connection code:
library(odbc)
library(DBI)
con <- dbConnect(odbc::odbc(),
                 .connection_string = 'driver={SQL Server};server=DW01;database=SFAF_DW;trusted_connection=true')
My table has 149 variables; its schema listing (from INFORMATION_SCHEMA.COLUMNS) is omitted here. This is the query that fails:
data1 <- dbGetQuery(con, "SELECT * FROM [eCW].[Visits]")
I got an error from this code, probably because of the size of the table.
I would like to reduce the number of observations by filtering on the "VisitDateTime" variable.
data2 <- dbGetQuery(con, "SELECT cast(VisitDateTime as DATETIME) as VisitDateTime FROM [eCW].[Visits] WHERE VisitDateTime>='2019-07-01 00:00:00' AND VisitDateTime<='2020-06-30 12:00:00'")
This code selects only the "VisitDateTime" variable, but I would like to get all 149 variables from the table.
Hoping to get some efficient code. I greatly appreciate your help on this. Thank you.
According to your schema, you have many variable-length types (varchar of up to 255 characters). As multiple answers on the similar error post suggest, you cannot rely on arbitrary column order with SELECT * but must explicitly reference each column and place variable-length types toward the end of the SELECT clause. In fact, in application code running SQL you should generally avoid SELECT * FROM; see Why is SELECT * considered harmful?
Fortunately, from your schema output using INFORMATION_SCHEMA.COLUMNS you can build that explicit column list for the SELECT dynamically. First, adjust your schema query with a calculated column that orders the types from smallest to largest by precision/length, and read the result into an R data frame.
schema_sql <- "SELECT sub.TABLE_NAME, sub.COLUMN_NAME, sub.DATA_TYPE, sub.SELECT_TYPE_ORDER
, sub.CHARACTER_MAXIMUM_LENGTH, sub.CHARACTER_OCTET_LENGTH
, sub.NUMERIC_PRECISION, sub.NUMERIC_PRECISION_RADIX, sub.NUMERIC_SCALE
FROM
(SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE
, CHARACTER_MAXIMUM_LENGTH, CHARACTER_OCTET_LENGTH
, NUMERIC_PRECISION, NUMERIC_PRECISION_RADIX, NUMERIC_SCALE
, CASE DATA_TYPE
WHEN 'tinyint' THEN 1
WHEN 'smallint' THEN 2
WHEN 'int' THEN 3
WHEN 'bigint' THEN 4
WHEN 'date' THEN 5
WHEN 'datetime' THEN 6
WHEN 'datetime2' THEN 7
WHEN 'decimal' THEN 8
WHEN 'varchar' THEN 9
WHEN 'nvarchar' THEN 10
END AS SELECT_TYPE_ORDER
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'eCW'  -- INFORMATION_SCHEMA.COLUMNS exposes the schema as TABLE_SCHEMA
AND TABLE_NAME = 'Visits'
) sub
ORDER BY sub.SELECT_TYPE_ORDER
, sub.NUMERIC_PRECISION
, sub.NUMERIC_PRECISION_RADIX
, sub.NUMERIC_SCALE
, sub.CHARACTER_MAXIMUM_LENGTH
, sub.CHARACTER_OCTET_LENGTH"
visits_schema_df <- dbGetQuery(con, schema_sql)
# BUILD COLUMN LIST FOR SELECT CLAUSE
select_columns <- paste0("[", paste(visits_schema_df$COLUMN_NAME, collapse="], ["), "]")
# RUN QUERY WITH EXPLICIT COLUMNS
data <- dbGetQuery(con, paste("SELECT", select_columns, "FROM [eCW].[Visits]"))
The above may need adjustment if the same error arises. Be proactive and test on your end by isolating the problem columns, column types, etc. A few suggestions: filter on DATA_TYPE or COLUMN_NAME, or move the ORDER BY columns around in the schema query, as in the variations below.
...
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'eCW'
AND TABLE_NAME = 'Visits'
AND DATA_TYPE IN ('tinyint', 'smallint', 'int') -- TEST WITH ONLY INTEGER TYPES
...
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'eCW'
AND TABLE_NAME = 'Visits'
AND NOT DATA_TYPE IN ('varchar', 'nvarchar') -- TEST WITHOUT VARIABLE STRING TYPES
...
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'eCW'
AND TABLE_NAME = 'Visits'
AND NOT DATA_TYPE IN ('decimal', 'datetime2') -- TEST WITHOUT HIGH PRECISION TYPES
...
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'eCW'
AND TABLE_NAME = 'Visits'
AND NOT COLUMN_NAME IN ('LastHIVTestResult') -- TEST WITHOUT LARGE VARCHARs
...
ORDER BY sub.SELECT_TYPE_ORDER -- ADJUST ORDERING
, sub.NUMERIC_SCALE
, sub.NUMERIC_PRECISION
, sub.NUMERIC_PRECISION_RADIX
, sub.CHARACTER_OCTET_LENGTH
, sub.CHARACTER_MAXIMUM_LENGTH
Still another solution is to stitch the R data frame together from per-type query results (adjusting the schema query accordingly), chain-merging on the primary key (assumed to be DW_Id):
final_data <- Reduce(function(x, y) merge(x, y, by="DW_Id"),
                     list(data_int_columns,    # SEPARATE QUERY RESULT WITH DW_Id AND INTs IN SELECT
                          data_num_columns,    # SEPARATE QUERY RESULT WITH DW_Id AND DECIMALs IN SELECT
                          data_dt_columns,     # SEPARATE QUERY RESULT WITH DW_Id AND DATE/TIMEs IN SELECT
                          data_char_columns))  # SEPARATE QUERY RESULT WITH DW_Id AND VARCHARs IN SELECT

Querying multiple postgres tables in python

I'm trying to query multiple SQL tables and store each one as a pandas dataframe.
cur = conn.cursor()
cur.execute("select relname from pg_class where relkind='r' and relname !~ '^(pg_|sql_)';")
tables_df = cur.fetchall()
##table_name_list = tables_df.table_name
select_template = ' SELECT * FROM {table_name}'
frames_dict = {}
for tname in tables_df:
    query = select_template.format(table_name=tname)
    frames_dict[tname] = pd.read_sql(query, conn)
But I'm getting an error like:
DatabaseError: Execution failed on sql ' SELECT * FROM ('customer',)': syntax error at or near "'yesbank'"
LINE 1: SELECT * FROM ('customer',)
customer is the name of a table in my database that I get from the line
tables_df = cur.fetchall()
Per your error, it looks like you have a typo in the word format:
AttributeError: 'str' object has no attribute 'formate'
Try
query = select_template.format(table_name = tname)
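The traceback also shows a second problem: fetchall() returns one-element tuples, so tname is ('customer',) and the formatted SQL becomes SELECT * FROM ('customer',). A sketch of the loop with the tuple unpacked:
for tname in tables_df:
    table_name = tname[0]  # fetchall() yields 1-tuples like ('customer',)
    query = select_template.format(table_name=table_name)
    frames_dict[table_name] = pd.read_sql(query, conn)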

Python - Execute multiple SQL query from file

I am trying to execute SQL queries from a file in Python 2.7.13 and getting the following error while displaying the result set.
The SQL statements in the file are simple, like count(*) from a table, but once this logic works I need to replace them with complex queries.
Error
Info : (7,)
Traceback (most recent call last):
  File "SQLserver_loop.py", line 19, in <module>
    fields = c.fetchall()
  File "pymssql.pyx", line 542, in pymssql.Cursor.fetchall (pymssql.c:9352)
pymssql.OperationalError: Statement not executed or executed statement has no resultset
Python Script:
import pymssql

conn = pymssql.connect(
    host=r'name',
    user=r'user',
    password='credential',
    database='Test')
c = conn.cursor()

fd = open('ZooDatabase.sql', 'r')  # Open and read the file as a single buffer
sqlFile = fd.read()
fd.close()

sqlCommands = sqlFile.split(';')  # all SQL commands (split on ';')

for command in sqlCommands:  # Execute every command from the input file
    c.execute(command)
    fields = c.fetchall()
    for row in fields:
        print "Info : %s " % str(row)

c.close()
conn.close()
**SQL File - ZooDatabase.sql**
select count(*) from emp2;
select count(*) from emp1;
**Error Log with SQL print statement output:**
C:\Python27\pycode>python SQLserver_loop.py
SELECT count(*) FROM emp2
Info : (7,)
SELECT count(*) FROM emp1
Info : (7,)
Traceback (most recent call last):
  File "SQLserver_loop.py", line 20, in <module>
    fields = c.fetchall()
  File "pymssql.pyx", line 542, in pymssql.Cursor.fetchall (pymssql.c:9352)
pymssql.OperationalError: Statement not executed or executed statement has no resultset
fields = c.fetchall() was causing the error; I commented it out and it works fine now:
for command in sqlCommands:
    #print command
    c.execute(command)
    #fields = c.fetchall()
    for row in c:
        print(row)
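For completeness, the root cause is that splitting the file on ';' leaves an empty string after the trailing semicolon, and executing that empty command is what leaves the cursor without a resultset. An alternative sketch that skips blank chunks, so fetchall() can stay:
for command in sqlCommands:
    if not command.strip():  # skip the empty chunk after the last ';'
        continue
    c.execute(command)
    fields = c.fetchall()
    for row in fields:
        print "Info : %s " % str(row)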

DatabaseError: Execution failed on sql: SELECT

I am connecting to Teradata with the following code which I execute on the python console:
conn = pyodbc.connect('DRIVER={Teradata}; DBCNAME=TdwB; UID=####;PWD=###;')
query = file(full_path).read()
opportunities = pd.read_sql(query, conn)
conn.close()
In query I read a very simple sql query from a file and everything works fine.
Then I try running a much more complex query, expected to return about 350000 rows (0.2 GB). I am sure the query works because it has executed perfectly in SQL Assistant, the Teradata query tool.
The script fails with DatabaseError: Execution failed on sql: SELECT after something like 5 minutes (I expect the query to run for about 10-20 minutes).
I am not sure how to tackle this because the error message is rather cryptic.
Is it a timeout?
Data formatting issue?
Anonymized query
Originally over 300 lines, but it's just a select. Here are the main operations on the data:
SELECT
TRUNC (CAST (a.created_at AS DATE ), 'M') AS first_day_month
,d.country_name AS country
,d.contract_id AS contract_id
,MAX (TRIM(CAST(REGEXP_REPLACE(contracts.BillingStreet, '\||\n|\r|\t', '_',1,0,'m') AS CHARACTER(2048))) || ', ' || TRIM(contracts.BillingPostalCode) || ', ' || TRIM(contracts.BillingCity)) AS FullAdress
,MIN (CAST (bills.created_at AS DATE )) AS first_transaction
,SUM (gross_revenue )
FROM db_1.billings AS bills
LEFT JOIN db_2.contracts AS contracts ON bills.deal_id = contracts.deal_id
WHERE bills.economic_area = 'NA'
AND CAST (bills.created_at AS DATE ) >= TRUNC (ADD_MONTHS (CURRENT_DATE -1, - 26) , 'MM')
AND bills.country_id = 110
GROUP BY 1,2,3
INTERESTINGLY:
conn = pyodbc.connect('DRIVER={Teradata}; DBCNAME=####; UID=####;PWD=####;', timeout=0)
cursor = conn.cursor()
query = file(full_path).read()
cursor.execute(query)
row = cursor.fetchone()
if row:
    print row
cursor.close()
conn.close()
results in
Traceback (most recent call last):
  File "C:\Users\mferrini\AppData\Local\Continuum\Anaconda\lib\site-packages\IPython\core\interactiveshell.py", line 2883, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-19-8f156c658077>", line 6, in <module>
    cursor.execute(query)
Error: ('HY000', '[HY000] [Teradata][ODBC Teradata Driver][Teradata Database] Wrong Format (-9134) (SQLExecDirectW)')
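If the problem is the format error rather than a timeout, one way to localize it, mirroring the column-isolation advice in the ODBC answer above, is to probe the expensive expressions one at a time on a small sample. A sketch; the table and column names follow the anonymized query, so they may not match the real schema:
cursor = conn.cursor()
probes = [
    "SELECT TRUNC(CAST(bills.created_at AS DATE), 'M') FROM db_1.billings bills SAMPLE 10",
    r"SELECT CAST(REGEXP_REPLACE(contracts.BillingStreet, '\||\n|\r|\t', '_', 1, 0, 'm') AS CHARACTER(2048)) FROM db_2.contracts contracts SAMPLE 10",
]
for sql in probes:
    try:
        cursor.execute(sql)
        cursor.fetchall()
        print 'OK:', sql
    except Exception as e:
        print 'FAILED:', sql, e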