I have followed other posts related to inserting a dictionary into a PostgreSQL table. The syntax is as follows:
sql_insert = 'INSERT INTO table_results (id, name, state, response, duration_time) VALUES (%(id)s, %(name)s, %(state)s, %(response)s, %(duration_time)s);'
The dictionary has the following data:
data_dict = {'id': 29, 'name': "ABC'S KINDERGARDEN", 'response': 'OK', 'duration_time': 60}
Then, I can execute the insert statement:
db.session.execute(sql_insert, data_dict)
However, the previous code throws the following error:
(psycopg2.errors.SyntaxError) syntax error at or near "%"
LINE 1: ...1, result_response, result_duration_secs) VALUES (%(id)s, %(...
^
[SQL: INSERT INTO table_results (id, name, state, response, duration_time) VALUES (%%(id)s, %%(name)s, %%(state)s, %%(response)s, %%(duration_time)s);]
(Background on this error at: http://sqlalche.me/e/13/f405)
I have tried several variations, with this being the most generally accepted format.
The following snippet includes the imports and code:
import psycopg2
from flask_sqlalchemy import SQLAlchemy
db = SQLAlchemy()
sql_insert = "INSERT INTO table_results (" + ", ".join(data_dict.keys()) + ") VALUES (" + ", ".join(["%("+k+")s" for k in data_dict]) + ");"
db.session.execute(sql_insert,data_dict )
db.session.commit()
Thanks
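A minimal sketch of the usual fix, assuming the same table and the data_dict above: pass the raw SQL through sqlalchemy.text() and use :name placeholders, which SQLAlchemy binds as parameters on any backend (state is omitted here because the sample dictionary has no such key):
from sqlalchemy import text

sql_insert = text(
    "INSERT INTO table_results (id, name, response, duration_time) "
    "VALUES (:id, :name, :response, :duration_time)"
)
db.session.execute(sql_insert, data_dict)  # SQLAlchemy binds the named parameters
db.session.commit()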
I have a very frustrating issue. At the bottom of this post is a function I created to (1) create a table in Snowflake and (2) store a dataframe to that table.
The creation of the table works fine. The issue is happening specifically with write_pandas, in this code snippet:
write_pandas(
    conn=conn,
    df=df,
    table_name=table_name,
    database=database,
    schema=schema
)
I keep getting an error that the table I created "doesn't exist" because the naming convention is off. For instance, in the database the table is created as "DATABASE"."SCHEMA"."TABLE", but the error message says 'DATABASE.SCHEMA."TABLE"' does not exist.
I know this is a simple issue, but I'm stuck for the moment. Any help would be appreciated.
from datetime import datetime, timedelta, date
from airflow import DAG
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator
from sqlalchemy import create_engine
import requests
from pandas.io.json import json_normalize
import numpy as np
from sqlalchemy.types import Integer, Text, String, DateTime
from IPython.display import display, HTML
from flatten_json import flatten
from snowflake.connector import connect
from snowflake.connector.pandas_tools import write_pandas
from airflow.operators.python_operator import PythonOperator
import os
from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
def create_store_snowflake(df, table):
    # quick transforms
    df = df.rename(columns=str.upper)
    df.columns = df.columns.str.replace('[-,/]', '')

    # Define the table name, schema, and database you want to write to
    # Note: the table, schema, and database need to already exist in Snowflake
    table_name = table
    schema = 'schema'
    database = 'database'

    # Connect to Snowflake using the required user
    conn = connect(
        user="user",
        password="password",
        account="account",
        role="role",
        database="database",
        schema='schema'
    )

    # reroute raw data to dataframe variable
    dataframe = df

    # Create the SQL statement to create or replace the table
    create_tbl_statement = "CREATE OR REPLACE TABLE " + database + "." + schema + "." + table_name + " (\n"

    # Loop through each column, find its datatype, and add it to the statement
    for column in dataframe.columns:
        if (
            dataframe[column].dtype.name == "int"
            or dataframe[column].dtype.name == "int64"
        ):
            create_tbl_statement = create_tbl_statement + column + " int"
        elif dataframe[column].dtype.name == "object":
            create_tbl_statement = create_tbl_statement + column + " varchar(16777216)"
        elif dataframe[column].dtype.name == "datetime64[ns]":
            create_tbl_statement = create_tbl_statement + column + " datetime"
        elif dataframe[column].dtype.name == "float64":
            create_tbl_statement = create_tbl_statement + column + " float8"
        elif dataframe[column].dtype.name == "bool":
            create_tbl_statement = create_tbl_statement + column + " boolean"
        else:
            create_tbl_statement = create_tbl_statement + column + " varchar(16777216)"

        # If column is not the last column, add a comma, else end the SQL query
        if dataframe[column].name != dataframe.columns[-1]:
            create_tbl_statement = create_tbl_statement + ",\n"
        else:
            create_tbl_statement = create_tbl_statement + ")"

    # Execute the SQL statement to create the table
    conn.cursor().execute(create_tbl_statement)
    print(f"{table_name} created!")

    # write df to the created table
    write_pandas(
        conn=conn,
        df=df,
        table_name=table_name,
        database=database,
        schema=schema
    )
    print(df.shape[0], f"rows written to {table_name} in Snowflake")
Just had to make sure the table name was CAPITALIZED, as everything stored to Snowflake is apparently upper-cased ::face-palm::. Instead of create_store_snowflake(df, 'mynewtable') it has to be create_store_snowflake(df, 'MYNEWTABLE').
When the table identifier is wrapped in double quotes during creation, the following rules apply:
create_tbl_statement= "CREATE OR REPLACE TABLE " + database + "." + schema + "." + table_name
Double-quoted Identifiers:
Delimited identifiers (i.e. identifiers enclosed in double quotes) are case-sensitive and can start with and contain any valid characters
Important
If an object is created using a double-quoted identifier, when referenced in a query or any other SQL statement, the identifier must be specified exactly as created, including the double quotes. Failure to include the quotes might result in an Object does not exist error (or similar type of error).
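Given those rules, a minimal sketch of the workaround, assuming the create_store_snowflake function above: normalize the table name to upper case before it is used anywhere, so the unquoted identifier in CREATE TABLE (which Snowflake folds to upper case) and the quoted identifier that write_pandas emits resolve to the same object.
def create_store_snowflake_upper(df, table):
    # hypothetical wrapper: unquoted identifiers fold to upper case in Snowflake,
    # quoted identifiers do not, so normalizing keeps both paths on the same table
    return create_store_snowflake(df, table.upper())

create_store_snowflake_upper(df, 'mynewtable')  # created and written as MYNEWTABLE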
PostgreSQL will not accept data that violates a primary key. To ignore the duplicate data, I have this code:
import pandas as pd
import psycopg2
import os
import matplotlib
from sqlalchemy import create_engine
from tqdm import tqdm_notebook
from pandas_datareader import data as web
import datetime
from dateutil.relativedelta import relativedelta
db_database = os.environ.get('123')
engine = create_engine('postgresql://postgres:{}@localhost:5433/stockdata'.format(123))
def import_data(Symbol):
    df = web.DataReader(Symbol, 'yahoo', start=datetime.datetime.now()-relativedelta(days=3), end=datetime.datetime.now())
    insert_init = """INSERT INTO stockprices
                  (Symbol, Date, Volume, Open, Close, High, Low)
                  VALUES
                  """
    vals = ",".join(["""('{}','{}','{}','{}','{}','{}','{}')""".format(
        Symbol,
        Date,
        row.High,
        row.Low,
        row.Open,
        row.Close,
        row.Volume,
    ) for Date, row in df.iterrows()])
    insert_end = """ON CONFLICT (Symbol, Date) DO UPDATE
                 SET
                 Volume = EXCLUDED.Volume,
                 Open = EXCLUDED.Open,
                 Close = EXCLUDED.Close,
                 Low = EXCLUDED.Low,
                 High = EXCLUDED.High
                 """
    query = insert_init + vals + insert_end
    engine.execute(query)

import_data('aapl')
I am getting this error:
ProgrammingError: (psycopg2.errors.UndefinedColumn) column "symbol" of relation "stockprices" does not exist
LINE 2: (Symbol,Date, Volume, Open, Close, H...
^
[SQL: INSERT INTO stockprices
Could you please advise as to what this error means? I got rid of all the double quotes, as advised in the comments.
I had used this code to create the table:
def create_price_table(symbol):
print(symbol)
df = web.DataReader(symbol, 'yahoo', start=datetime.datetime.now()-relativedelta(days=7), end= datetime.datetime.now())
df['Symbol'] = symbol
df.to_sql(name = "stockprices", con = engine, if_exists='append', index = True)
return 'daily prices table created'
create_price_table('amzn')
Also, as was mentioned in the comments, I used this to check the table name:
SELECT table_name
FROM information_schema.tables
WHERE table_schema='public'
AND table_type='BASE TABLE';
Edit 1:
I changed the code as suggested in the comments; now the column names are lower case. Below is the code:
import pandas as pd
import psycopg2
import os
import matplotlib
from sqlalchemy import create_engine
from tqdm import tqdm_notebook
from pandas_datareader import data as web
import datetime
from dateutil.relativedelta import relativedelta
db_database = os.environ.get('123')
engine = create_engine('postgresql://postgres:{}@localhost:5433/stockdata'.format(123))
def create_price_table(symbol):
    print(symbol)
    df = web.DataReader(symbol, 'yahoo', start=datetime.datetime.now()-relativedelta(days=7), end=datetime.datetime.now())
    df['symbol'] = symbol
    df = df.rename(columns={'Open': 'open'})
    df = df.rename(columns={'Close': 'close'})
    df = df.rename(columns={'High': 'high'})
    df = df.rename(columns={'Low': 'low'})
    df = df.rename(columns={'Volume': 'volume'})
    df = df.rename(columns={'Adj Close': 'adj_close'})
    df.index.name = 'date'
    df.to_sql(name="stockprices", con=engine, if_exists='append', index=True)
    return 'daily prices table created'

# create_price_table('amzn')

def import_data(Symbol):
    df = web.DataReader(Symbol, 'yahoo', start=datetime.datetime.now()-relativedelta(days=3), end=datetime.datetime.now())
    insert_init = """INSERT INTO stockprices
                  (symbol, date, volume, open, close, high, low)
                  VALUES
                  """
    vals = ",".join(["""('{}','{}','{}','{}','{}','{}','{}')""".format(
        Symbol,
        Date,
        row.High,
        row.Low,
        row.Open,
        row.Close,
        row.Volume,
    ) for Date, row in df.iterrows()])
    insert_end = """ON CONFLICT (Symbol, Date) DO UPDATE
                 SET
                 Volume = EXCLUDED.Volume,
                 Open = EXCLUDED.Open,
                 Close = EXCLUDED.Close,
                 Low = EXCLUDED.Low,
                 High = EXCLUDED.High
                 """
    query = insert_init + vals + insert_end
    engine.execute(query)

import_data('aapl')
This code, however, produces a new error:
DataError: (psycopg2.errors.InvalidTextRepresentation) invalid input syntax for type bigint: "166.14999389648438"
LINE 4: ('aapl','2022-02-23 00:00:00','166.14999...
^
Per my comment you have two issues:
You are trying to INSERT a float value (166.14999389648438) into an integer field. The first thing to figure out is why the mismatch: do you really want the database field to be an integer? The second thing is that trying to force a float into an integer will work if the value is entered as a float/numeric:
select 166.14999389648438::bigint;
166
Though, as you can see, the fractional part is lost.
It will not work if the value is entered as a string:
ERROR: invalid input syntax for type bigint: "166.14999389648438"
Which is what you are doing. This leads to the second issue below.
You are not using proper parameter passing as shown in the link, where among other things there is this warning:
Warning
Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
For the purposes of this question the important part is that using parameter passing will result in proper type adaptation.
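To make that concrete, a minimal sketch of the same upsert using psycopg2 parameter passing instead of string formatting; it assumes it runs inside import_data, reusing Symbol, df, and engine from above with the lower-cased column names:
from psycopg2.extras import execute_values

insert_sql = """INSERT INTO stockprices
             (symbol, date, volume, open, close, high, low)
             VALUES %s
             ON CONFLICT (symbol, date) DO UPDATE
             SET volume = EXCLUDED.volume,
                 open = EXCLUDED.open,
                 close = EXCLUDED.close,
                 low = EXCLUDED.low,
                 high = EXCLUDED.high"""

# native Python types, in the same order as the column list above,
# so each value lands in the intended column and is adapted properly
rows = [(Symbol, Date.to_pydatetime(), int(row.Volume), float(row.Open),
         float(row.Close), float(row.High), float(row.Low))
        for Date, row in df.iterrows()]

conn = engine.raw_connection()  # the underlying psycopg2 connection
try:
    with conn.cursor() as cur:
        execute_values(cur, insert_sql, rows)
    conn.commit()
finally:
    conn.close()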
I am trying to create a sqlite3 db table using a constructed f-string in Python 3; however, I am receiving the error below:
sqlite3.OperationalError: near "(": syntax error
I had hoped that I wouldn't need to ask here about a syntax error, but I have been searching on Stack Overflow as well as generally online and have not identified the issue.
I have compared the code to other samples and equally do not see any difference in the construction, except that it doesn't appear to be common to use f-strings.
I have read the pros/cons of passing parameters and would prefer the f-string unless it is the root cause.
I expect the issue might be obvious, however any pointers would be greatly appreciated.
Below is the full code:
import sqlite3
import pandas as pd
db_path = [PATH TO DATABASE]
db_table_name = [TABLE NAME]
header_source = [PATH TO .XLSX]
def ReadHeaders():
    df = pd.read_excel(header_source)
    col_list = list(df.columns.values)
    prep_col_list = [item.replace(" ", "_") for item in col_list]
    col_string = " TEXT, _".join(prep_col_list)
    final_col_string = col_string.replace("Primary_ID TEXT", "Primary_ID PRIMARY KEY")
    return final_col_string

def CreateSQLdb():
    cols = ReadHeaders()
    conn = sqlite3.connect(db_path)
    c = conn.cursor()
    c.execute(f""" CREATE TABLE IF NOT EXISTS {db_table_name} ({cols}) """)
    conn.commit()
    conn.close()
A sample of the string that is created for the table headers is:
_link TEXT, _Primary_ID PRIMARY KEY, _Status_Description TEXT, _Price_List_Status TEXT, _Brand TEXT, _36_Character_Description TEXT
Solved
After breaking everything down, the root cause was the constructed string. I was able to identify it when I tried to export the constructed string to a .txt file and received a Unicode error.
Code before:
return final_col_string
Code after:
return final_col_string.encode(encoding="utf-8")
I also added a simple check of the table info for confirmation:
def ShowTable(c):
    c.execute(f"PRAGMA table_info({db_table_name})")
    print(c.fetchall())
Complete code, in case anyone else comes across this issue:
import sqlite3
import pandas as pd

db_path = [PATH TO DATABASE]
db_table_name = [TABLE NAME]
header_source = [PATH TO .XLSX]

def ReadHeaders():
    df = pd.read_excel(header_source)
    col_list = list(df.columns.values)
    prep_col_list = [item.replace(" ", "_") for item in col_list]
    col_string = " TEXT, _".join(prep_col_list)
    final_col_string = col_string.replace("Primary_ID TEXT", "Primary_ID PRIMARY KEY")
    return final_col_string.encode(encoding="utf-8")

def CreateSQLdb():
    cols = ReadHeaders()
    conn = sqlite3.connect(db_path)
    c = conn.cursor()
    c.execute(f""" CREATE TABLE IF NOT EXISTS {db_table_name} ({cols}) """)
    conn.commit()
    conn.close()

def ShowTable(c):
    c.execute(f"PRAGMA table_info({db_table_name})")
    print(c.fetchall())

if __name__ == "__main__":
    CreateSQLdb()
Using Flask and SQLAlchemy, is it possible to create a query where a column can be cast from a number to a string so that .like() can be used as a filter?
The sample code below illustrates what I'm after; however, Test 3 is a broken statement (i.e. no attempt at casting, so the query fails; the error is below).
Test 1 - demonstrates a standard select
Test 2 - demonstrates a select using like on a string
Can 'test 3' be modified to permit a like on a number?
In PostgreSQL the SQL query would be:
SELECT * FROM mytable WHERE number::varchar like '%2%'
Any assistance gratefully appreciated.
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy import Table, Column, Integer, String
app = Flask(__name__)
app.debug = True
app.config.from_pyfile('config.py')
db = SQLAlchemy(app)

class MyTable(db.Model):
    '''My Sample Table'''
    __tablename__ = 'mytable'
    number = db.Column(db.Integer, primary_key=True)
    text = db.Column(db.String)

    def __repr__(self):
        return 'MyTable( ' + str(self.number) + ', ' + self.text + ')'

test_1 = (db.session.query(MyTable)
          .all())
print "Test 1 = " + str(test_1)

test_2 = (db.session.query(MyTable)
          .filter(MyTable.text.like('%orl%'))
          .all())
print "Test 2 = " + str(test_2)

test_3 = (db.session.query(MyTable)
          .filter(MyTable.number.like('%2%'))
          .all())
And the sample data:
=> select * from mytable;
number | text
--------+-------
100 | Hello
20 | World
And the error:
Traceback (most recent call last):
File "sample.py", line 33, in <module>
.filter( MyTable.number.like( '%2%' ) )
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2320, in all
return list(self)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2438, in __iter__
return self._execute_and_instances(context)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2453, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 729, in execute
return meth(self, multiparams, params)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 322, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 826, in _execute_clauseelement
compiled_sql, distilled_params
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 958, in _execute_context
context)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1159, in _handle_dbapi_exception
exc_info
File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 951, in _execute_context
context)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 436, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (ProgrammingError) operator does not exist: integer ~~ unknown
LINE 3: WHERE mytable.number LIKE '%2%'
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
'SELECT mytable.number AS mytable_number, mytable.text AS mytable_text \nFROM mytable \nWHERE mytable.number LIKE %(number_1)s' {'number_1': '%2%'}
Solved. The Query method filter can take an expression, so the solution is:
from sqlalchemy import cast, String

result = (db.session.query(MyTable)
          .filter(cast(MyTable.number, String).like('%2%'))
          .all())
With the result:
Test 3 = [MyTable( 20, World)]
Found the information in the SQLAlchemy Query API documentation.
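As a side note, newer SQLAlchemy versions also expose cast as a method on column objects, so the same filter can be written without the standalone cast() import (still using the String type imported above); a sketch under that assumption:
result = (db.session.query(MyTable)
          .filter(MyTable.number.cast(String).like('%2%'))
          .all())  # equivalent to cast(MyTable.number, String)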
I am new to the Informix database. I want to know how to insert blob and clob type columns in Informix, and I need a sample query for those two column types. If someone can help, I will appreciate it.
This is my Jython code that uses JDBC and prepared statements to test both INSERT and SELECT of a blob in an Informix database:
#!/usr/bin/env jython
# -*- coding: utf8 -*-
import time
from java.sql import DriverManager
from java.lang import Class
from java.io import FileInputStream
from java.io import ByteArrayOutputStream
Class.forName("com.informix.jdbc.IfxDriver")
"""
create table _blob_test_so (
id serial,
txt varchar(30),
column_blob blob
);
"""
def test_blob_insert(db):
    print('inserting gif picture into blob table...')
    blob = FileInputStream('snoopy_comics20121023.gif')
    insert_stmt = db.prepareStatement("INSERT INTO _blob_test_so (txt, column_blob) VALUES (?, ?)")
    insert_stmt.setString(1, 'test %s' % (time.strftime('%Y-%m-%d %H:%M:%S')))
    insert_stmt.setBinaryStream(2, blob)
    rec_cnt = insert_stmt.executeUpdate()
    blob.close()
    insert_stmt.close()
    print('records changed: %d' % (rec_cnt))

def test_blob_select(db):
    print('selecting data from blob table...')
    pstm = db.prepareStatement("SELECT id, txt, column_blob FROM _blob_test_so")
    rs = pstm.executeQuery()
    while rs.next():
        id = rs.getInt(1)
        txt = rs.getString(2)
        image_stream = rs.getBinaryStream(3)
        fout = ByteArrayOutputStream()
        while 1:
            b = image_stream.read()
            if b < 0:
                break
            fout.write(b)
        arr = fout.toByteArray()
        fname_out = 'test_%s.gif' % (id)
        print('%d:%s: fname: %s %d [b]' % (id, txt, fname_out, len(arr)))
        f = open(fname_out, 'wb')
        f.write(arr)
        f.close()
    rs.close()
db = DriverManager.getConnection('jdbc:informix-sqli://test-informix:9088/test:informixserver=ol_testifx;DB_LOCALE=pl_PL.CP1250;CLIENT_LOCALE=pl_PL.CP1250;charSet=CP1250', 'user', 'passwd')
test_blob_insert(db)
test_blob_select(db)
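For the clob half of the question, a sketch along the same lines, assuming a hypothetical _clob_test_so table and a JDBC 4 driver where the two-argument setCharacterStream overload is available; setCharacterStream is the character-stream analogue of setBinaryStream:
from java.io import FileReader
"""
create table _clob_test_so (
id serial,
txt varchar(30),
column_clob clob
);
"""
def test_clob_insert(db):
    print('inserting text file into clob table...')
    reader = FileReader('example_document.txt')  # hypothetical input file
    insert_stmt = db.prepareStatement("INSERT INTO _clob_test_so (txt, column_clob) VALUES (?, ?)")
    insert_stmt.setString(1, 'clob test %s' % (time.strftime('%Y-%m-%d %H:%M:%S')))
    insert_stmt.setCharacterStream(2, reader)  # streams characters into the clob column
    rec_cnt = insert_stmt.executeUpdate()
    reader.close()
    insert_stmt.close()
    print('records changed: %d' % (rec_cnt))

test_clob_insert(db)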