I want to save a DataFrame to a database table. What I did:
Connect to the Azure SQL Server DB:
import pyodbc
# Create the connection
server = 'XXXXXXXXXXXXXXXXXXXX'
database = 'XXXXXXXXXXXXXXXXXXX'
username = 'XXXXXXXXXXXXXXXX'
password = 'XXXXXXXXXXXX'
cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
cursor = cnxn.cursor()
Create the table:
create_table = """
CREATE TABLE forecast_data (
    CompanyEndDate text,
    Retailer text,
    Store_Name text,
    Category text,
    Description text,
    QtySold int);
"""
cursor.execute(create_table)
cnxn.commit()
Use pandas to_sql
data.to_sql('forecast_data', con=cnxn)
I get this error:
ProgrammingError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/io/sql.py in execute(self, *args, **kwargs)
1680 try:
-> 1681 cur.execute(*args, **kwargs)
1682 return cur
ProgrammingError: ('42S02', "[42S02] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW)")
The above exception was the direct cause of the following exception:
DatabaseError Traceback (most recent call last)
7 frames
/usr/local/lib/python3.7/dist-packages/pandas/io/sql.py in execute(self, *args, **kwargs)
1691
1692 ex = DatabaseError(f"Execution failed on sql '{args[0]}': {exc}")
-> 1693 raise ex from exc
1694
1695 #staticmethod
DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': ('42S02', "[42S02] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW)")
Anyone have an idea what is going on?
pandas' to_sql needs a SQLAlchemy connectable. If you hand it a raw pyodbc connection, pandas falls back to its SQLite code path, which is why SQL Server is being asked for sqlite_master. Import sqlalchemy, build an engine, and pass that as con:
Related post:
pandas to sql server
import sqlalchemy
...
engine = sqlalchemy.create_engine(
    "mssql+pyodbc://user:pwd@server/database?driver=ODBC+Driver+17+for+SQL+Server",
    echo=False)
data.to_sql('forecast_data', con=engine, if_exists='replace')
A plain pyodbc connection will not work here, because to_sql only accepts a SQLAlchemy connectable (or a sqlite3 connection). You can also read my answer in the post below:
Logon failed (pyodbc.InterfaceError) ('28000', "[28000] [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user 'xxxx'
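If the username or password contains characters like '@' or ';', building the URL by hand breaks. A common alternative (a sketch, with made-up credentials) is to URL-encode a complete ODBC connection string and pass it through the odbc_connect query parameter:

```python
from urllib.parse import quote_plus

# Made-up credentials, for illustration only.
odbc_str = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver.database.windows.net;"
    "DATABASE=mydb;UID=myuser;PWD=p@ss;word"
)
# quote_plus escapes '@', ';', '{' and '}' so the credentials cannot
# break the URL that SQLAlchemy parses.
url = "mssql+pyodbc:///?odbc_connect=" + quote_plus(odbc_str)
# engine = sqlalchemy.create_engine(url)
# data.to_sql('forecast_data', con=engine, if_exists='replace')
```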
I was previously using SQLAlchemy 1.4.44 with pandas 1.5.1, and the following code that executes a SQL stored procedure worked:
sql_connection: str = "driver={ODBC Driver 17 for SQL Server};server=localhost\SQLEXPRESS;database=Finance;trusted_connection=yes"
sql_engine: sqlalchemy.engine = sqlalchemy.create_engine(
url=sqlalchemy.engine.URL.create(drivername="mssql+pyodbc", query={"odbc_connect": sql_connection})
)
with sql_engine.connect() as connection:
    query: str = "EXEC dbo.get_latest_tickers @etf=?"
    return pandas.read_sql_query(sql=query, con=connection, params=[etf])
I switched to SQLAlchemy 2.0.3 and pandas 1.5.3 and revised the code by wrapping the stored-procedure call in sqlalchemy.text, as this version of SQLAlchemy requires. The revised code is as follows:
sql_connection: str = "driver={ODBC Driver 17 for SQL Server};server=localhost\SQLEXPRESS;database=Finance;trusted_connection=yes"
sql_engine: sqlalchemy.engine = sqlalchemy.create_engine(
url=sqlalchemy.engine.URL.create(drivername="mssql+pyodbc", query={"odbc_connect": sql_connection})
)
with sql_engine.connect() as connection:
    query = sqlalchemy.text("EXEC dbo.get_latest_tickers @etf=?")
    return pandas.read_sql_query(sql=query, con=connection, params=[etf])
The code throws the following exception:
(ArgumentError) List argument must consist only of tuples or dictionaries
I have tried revising params argument as follows but the revision also fails:
return pandas.read_sql_query(sql=query, con=connection, params={"@etf": etf})
The exception thrown is as follows:
{DBAPIError}(pyodbc.Error)('07002', '[07002] [Microsoft][ODBC Driver 17 for SQL Server]COUNT field incorrect or syntax error (0) (SQLExecDirectW)')
How do I pass parameters to execute the stored procedure?
I learned that the function sqlalchemy.text provides backend-neutral support for bind parameters. Such parameters must be written in the named colon format (:name). See https://www.tutorialspoint.com/sqlalchemy/sqlalchemy_core_using_textual_sql.htm.
The following revised code works:
with sql_engine.connect() as connection:
    query = sqlalchemy.text("EXEC dbo.get_latest_tickers @etf=:etf")
    return pandas.read_sql_query(sql=query, con=connection, params={"etf": etf})
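If the procedure takes several parameters, each one follows the same @name=:name pattern. A small helper (illustrative only, not part of SQLAlchemy) can build that string from the parameter names:

```python
def exec_proc_sql(proc_name, param_names):
    # Build "EXEC dbo.proc @a=:a, @b=:b" for use with sqlalchemy.text();
    # the matching values are passed separately as a params dict.
    assignments = ", ".join("@{0}=:{0}".format(n) for n in param_names)
    return "EXEC {} {}".format(proc_name, assignments)

sql = exec_proc_sql("dbo.get_latest_tickers", ["etf"])
# query = sqlalchemy.text(sql)
# pandas.read_sql_query(sql=query, con=connection, params={"etf": etf})
```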
I have the following data frame
CALL_DISPOSITION CITY END INCIDENT_NUMBER
0 ADV-Advised Waterloo Fri, 23 Mar 2018 01:13:27 GMT 6478983
1 AST-Assist Waterloo Sat, 18 Mar 2017 12:41:47 GMT 724030
2 AST-Assist Waterloo Sat, 18 Mar 2017 12:41:47 GMT 999000
I am trying to push this to an IBM DB2 Database.
I have the following code:
# IBM DB2 imports
import ibm_db
# instantiate db2 connection
connection = ibm_db.connect(
    conn_string,
    '',
    '',
    conn_option,
    ibm_db.QUOTED_LITERAL_REPLACEMENT_OFF)
# create list of tuples from df
records = [tuple(row) for _, row in df.iterrows()]
# Define sql statement structure to replace data into WATERLOO_911_CALLS table
column_names = df.columns
df_sql = "VALUES({}{})".format("?," * (len(column_names) - 1), "?")
sql_command = "REPLACE INTO WATERLOO_911_CALLS {} ".format(df_sql)
# Prepare SQL statement
try:
    sql_command = ibm_db.prepare(connection, sql_command)
except Exception as e:
    print(e)
# Execute query
try:
    ibm_db.execute_many(sql_command, tuple(records))
except Exception as e:
    print('Data pushing error {}'.format(e))
However, I keep getting the following error:
Exception: [IBM][CLI Driver][DB2/LINUXX8664] SQL0104N An unexpected token "REPLACE INTO WATERLOO_911_CALLS" was found following "BEGIN-OF-STATEMENT". Expected tokens may include: "<space>". SQLSTATE=42601 SQLCODE=-104
I don't understand why that is the case. I followed the steps outlined in this repo but I can't seem to get this to work. What am I doing wrong? Please let me know if there are any clarifications I can make.
It hints about a missing space; maybe it needs one between the fields in the VALUES() string.
Like df_sql = "VALUES({}{})".format("?, " * (len(column_names) - 1), "?")
instead of df_sql = "VALUES({}{})".format("?," * (len(column_names) - 1), "?")
Just a hunch.
I find that printing sql_command before executing it could also help troubleshooting.
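That placeholder string is easier to get right with str.join than with string multiplication, because join puts the separators only between the items. A sketch of just that part:

```python
def values_clause(n_cols):
    # "VALUES(?, ?, ?)" -- join inserts ", " between the placeholders,
    # so there is no trailing comma and no missing space to hunt for.
    return "VALUES({})".format(", ".join(["?"] * n_cols))

print(values_clause(4))  # VALUES(?, ?, ?, ?)
```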
I've created an 'artist' table in my database with the columns 'artistid' and 'artisttitle'. I also uploaded a CSV that has the same names for headers. I'm using the code below to upload the CSV data into the SQL table but receive the following error:
---------------------------------------------------------------------------
UndefinedColumn Traceback (most recent call last)
<ipython-input-97-80bd8826bb17> in <module>
10 with connection, connection.cursor() as cursor:
11 for row in album.itertuples(index=False, name=None):
---> 12 cursor.execute(INSERT_SQL,row)
13
14 mediatype = mediatype.where(pd.notnull(mediatype), None)
UndefinedColumn: column "albumid" of relation "album" does not exist
LINE 1: INSERT INTO zp2gz.album (albumid, albumtitle) VALUES (1,'Fo...
^
EDIT---------------------------------
I meant to say albumid and albumtitle! My apologies
Seems like a typo -- the table may actually have been created with 'albmid' rather than 'albumid' -- check the real column names, fix your models.py, and re-migrate.
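One way to confirm which column names the table really has (assuming psycopg2, since UndefinedColumn is a psycopg2 error class, and an open connection) is to query information_schema; the schema and table names below are taken from the error message:

```python
# Parameterized introspection query; run it with the two names
# from the error ("zp2gz", "album") to list the actual columns.
CHECK_COLUMNS = (
    "SELECT column_name FROM information_schema.columns "
    "WHERE table_schema = %s AND table_name = %s"
)
# with connection.cursor() as cursor:
#     cursor.execute(CHECK_COLUMNS, ("zp2gz", "album"))
#     print([row[0] for row in cursor.fetchall()])
```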
I am trying to insert my dataframe into a newly created table in Teradata. My connection and table creation with SQLAlchemy work, but I am unable to insert the data. I keep getting the same error that the schema columns do not exist.
Here is my code:
username = '..'
password= '..'
server ='...'
database ='..'
driver = 'Aster ODBC Driver'
engine_stmt = ("mssql+pyodbc://%s:%s@%s/%s?driver=%s" % (username, password, server, database, driver))
engine = sqlalchemy.create_engine(engine_stmt)
conn = engine.raw_connection()
# create table function
def create_sql_tbl_schema(conn):
    #tbl_cols_sql = gen_tbl_cols_sql(df)
    sql = "CREATE TABLE so_sandbox.mn_testCreation3 (A INTEGER NULL,B INTEGER NULL,C INTEGER NULL,D INTEGER NULL) DISTRIBUTE BY HASH (A) STORAGE ROW COMPRESS LOW;"
    cur = conn.cursor()
    cur.execute('rollback')
    cur.execute(sql)
    cur.close()
    conn.commit()
create_sql_tbl_schema(conn) # this works and the table is created
df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('abcd'))
df.to_sql('mn_testCreation3', con=engine,
schema='so_sandbox', index=False, if_exists='append') #this is giving me problems
Error message returned is:
sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42000', '[42000] [AsterData][nCluster] (34) ERROR: relation "INFORMATION_SCHEMA"."COLUMNS" does not exist. (34) (SQLPrepare)') [SQL: 'SELECT [INFORMATION_SCHEMA].[COLUMNS].[TABLE_SCHEMA], [INFORMATION_SCHEMA].[COLUMNS].[TABLE_NAME], [INFORMATION_SCHEMA].[COLUMNS].[COLUMN_NAME], [INFORMATION_SCHEMA].[COLUMNS].[IS_NULLABLE], [INFORMATION_SCHEMA].[COLUMNS].[DATA_TYPE], [INFORMATION_SCHEMA].[COLUMNS].[ORDINAL_POSITION], [INFORMATION_SCHEMA].[COLUMNS].[CHARACTER_MAXIMUM_LENGTH], [INFORMATION_SCHEMA].[COLUMNS].[NUMERIC_PRECISION], [INFORMATION_SCHEMA].[COLUMNS].[NUMERIC_SCALE], [INFORMATION_SCHEMA].[COLUMNS].[COLUMN_DEFAULT], [INFORMATION_SCHEMA].[COLUMNS].[COLLATION_NAME] \nFROM [INFORMATION_SCHEMA].[COLUMNS] \nWHERE [INFORMATION_SCHEMA].[COLUMNS].[TABLE_NAME] = ? AND [INFORMATION_SCHEMA].[COLUMNS].[TABLE_SCHEMA] = ?'] [parameters: ('mn_testCreation3', 'so_sandbox')] (Background on this error at: http://sqlalche.me/e/f405)
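The traceback shows the failure happens before any insert: to_sql first checks whether the table exists, and the mssql+pyodbc dialect does that with the bracket-quoted [INFORMATION_SCHEMA].[COLUMNS] query shown above, which Aster does not understand. One workaround sketch (assuming the table already exists and conn is the raw connection from earlier) is to bypass to_sql and insert through the cursor, so no dialect reflection runs:

```python
def build_insert(table, columns):
    # Plain "INSERT INTO t (c1, c2) VALUES (?, ?)" -- nothing here
    # touches INFORMATION_SCHEMA, so the dialect mismatch is avoided.
    placeholders = ", ".join(["?"] * len(columns))
    return "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), placeholders)

insert_sql = build_insert("so_sandbox.mn_testCreation3", list("ABCD"))
# cur = conn.cursor()
# cur.executemany(insert_sql, list(df.itertuples(index=False, name=None)))
# conn.commit()
```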
I have a number of functions written on our Microsoft SQL servers.
I can easily access and query all data normally, but I cannot execute functions on the server using RODBC.
How can I execute sql-functions using R? Are there other packages that can do this?
Or do I need to switch strategies completely?
Example:
require(RODBC)
db <- odbcConnect("db")
df <- sqlQuery(channel = db, query = paste0("USE [Prognosis]
GO
SELECT * FROM [dbo].[Functionname] (", information_variable, ")
GO"))
Error message:
"42000 102 [Microsoft][ODBC SQL Server Driver][SQL Server]Incorrect syntax near 'GO'."
[2] "[RODBC] ERROR: Could not SQLExecDirect 'USE... "
This turned out to work:
df <- sqlQuery(channel = db,
               query = paste0("SELECT * FROM [dbo].[Functionname] (",
                              information_variable, ")"))
So I dropped USE [The_SQL_TABLE] and GO. That makes sense: GO is a batch separator understood by client tools like SSMS and sqlcmd, not T-SQL itself, so the server rejects it when it arrives through ODBC.