SqlSave Error: Unable to append to table - sql

Code:
sqlSave(SQL,data.frame(df),tablename='Data',append = TRUE,rownames = FALSE)
The table into which I am trying to insert the data has an auto-increment primary key. The table has a total of 5 columns, including the primary key. My data frame has 4 columns because I don't want to insert the PK myself. However, when I run the command, I get the following error:
Error in colnames<-(*tmp*, value = c("BId", "name", "Set", :
length of 'dimnames' [2] not equal to array extent
Also, when I include the primary key in the data frame myself, it still doesn't work.
Error in sqlSave(SQL, data.frame(df), tablename = "Data", :
unable to append to table ‘Data’

Try safer = FALSE.
From the definition of sqlSave:
if (!append) {
    if (safer)
        stop("table ", sQuote(tablename), " already exists")
    ......
}
......
if (safer)
    stop("unable to append to table ", sQuote(tablename))

You can use the verbose argument to get the actual database error.
sqlSave(con, df, verbose = TRUE)

Related

PostgreSQL - Query nested json in text column

My situation is the following:
-> Table A has a column named informations whose type is text
-> The informations column stores a JSON string (but it is still a string), like this:
{
  "key": "value",
  "meta": {
    "inner_key": "inner_value"
  }
}
I'm trying to query this table by searching on its informations.meta.inner_key value with the given query:
SELECT * FROM A WHERE (informations::json#>>'{meta, inner_key}' = 'inner_value')
But I'm getting the following error:
ERROR: invalid input syntax for type json
DETAIL: The input string ended unexpectedly.
CONTEXT: JSON data, line 1:
SQL state: 22P02
I've built the query following the given link: DevHints - PostgreSQL
Does anyone know how to properly build the query?
EDIT 1:
I solved it with this workaround, but I think there are better solutions to the problem:
WITH temporary_table as (SELECT A.informations::json#>>'{meta, inner_key}' as inner_key FROM A)
SELECT * FROM temporary_table WHERE inner_key = 'inner_value'
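For reference, here is a minimal psycopg2 sketch of the same workaround issued from Python; the connection parameters are placeholders, and the table and column names are taken from the question. Parameter binding keeps the compared value out of the SQL string:

import psycopg2

# Placeholder connection parameters - adjust to your environment.
conn = psycopg2.connect(host="localhost", dbname="mydb", user="user", password="secret")

with conn, conn.cursor() as cur:
    # Extract the nested value first, then filter on it (the workaround from EDIT 1).
    cur.execute(
        """
        WITH temporary_table AS (
            SELECT A.informations::json #>> '{meta, inner_key}' AS inner_key FROM A
        )
        SELECT * FROM temporary_table WHERE inner_key = %s
        """,
        ("inner_value",),
    )
    rows = cur.fetchall()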

Writing dataframe to Postgres database psycopg2

I am trying to write a pandas DataFrame to a Postgres database.
Code is as below:
dbConnection = psycopg2.connect(user = "user1", password = "user1", host = "localhost", port = "5432", database = "postgres")
dbConnection.set_isolation_level(0)
dbCursor = dbConnection.cursor()
dbCursor.execute("DROP DATABASE IF EXISTS FiguresUSA")
dbCursor.execute("CREATE DATABASE FiguresUSA")
dbCursor.execute("DROP TABLE IF EXISTS FiguresUSAByState")
dbCursor.execute("CREATE TABLE FiguresUSAByState(Index integer PRIMARY KEY, Province_State VARCHAR(50), NumberByState integer)");
for i in data_pandas.index:
query = """
INSERT into FiguresUSAByState(column1, column2, column3) values('%s',%s,%i);
""" % (data_pandas['Index'], data_pandas['Province_State'], data_pandas['NumberByState'])
dbCursor.execute(query)
When I run this, I get an error which just says "Index". I know the problem is somewhere in my for loop; is that % notation correct? I am new to Postgres and don't see how that could be correct syntax. I know I can use to_sql, but I am trying out different techniques.
Print out of data_pandas is as below:
One slight possible anomaly is that there is an "index" in the IDE version. Could this be the problem?
If you use pd.DataFrame.to_sql, you can supply the index_label parameter to use that as a column.
data_pandas.to_sql('FiguresUSAByState', con=dbConnection, index_label='Index')
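Note that recent pandas releases only accept an SQLAlchemy connectable (or a sqlite3 connection) for to_sql, so the raw psycopg2 connection above may be rejected. A sketch of that variant, reusing the credentials from the question:

from sqlalchemy import create_engine

# Connection string built from the psycopg2.connect() arguments in the question.
engine = create_engine("postgresql+psycopg2://user1:user1@localhost:5432/postgres")

# if_exists='append' keeps the table created earlier; index_label writes the DataFrame index as the 'Index' column.
data_pandas.to_sql('FiguresUSAByState', con=engine, if_exists='append', index_label='Index')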
If you would prefer to stick with the custom SQL and for loop you have, you will need to reset_index first.
for row in data_pandas.reset_index().to_dict('records'):  # 'records', not 'rows', in current pandas
    query = """
    INSERT INTO FiguresUSAByState (index, Province_State, NumberByState) VALUES (%i, '%s', %i);
    """ % (row['index'], row['Province_State'], row['NumberByState'])
    dbCursor.execute(query)
Note that the default name for the new column is index, uncapitalized, rather than Index.
In the insert statement:
query = """
        INSERT into FiguresUSAByState (column1, column2, column3) values ​​('%s',%s,%i);
        """% (data_pandas ['Index'], data_pandas ['Province_State'], data_pandas ['NumberByState'])
You have a '%s', I think that is the problem. So remove the quotes
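Whichever fix you choose, building the SQL with % string formatting is fragile (quoting mistakes, injection risk). A safer sketch of the same loop, assuming the dbCursor and data_pandas objects from the question, lets psycopg2 bind the values itself:

for row in data_pandas.reset_index().to_dict('records'):
    # psycopg2 quotes and escapes each value itself, so no manual '%s' quoting is needed.
    dbCursor.execute(
        "INSERT INTO FiguresUSAByState (index, Province_State, NumberByState) VALUES (%s, %s, %s)",
        (row['index'], row['Province_State'], row['NumberByState']),
    )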

Querying DynamoDB table using Global Secondary Indexes

I was trying to query a DynamoDB table using a Lambda function.
My table's partition key is id. I am trying to query it on another key named dipl_idpp. I understood that this is not possible directly.
I found a solution here: I need to create a Global Secondary Index pointing to the column that I want to query on (in my case dipl_idpp).
I did that in DynamoDB. But when I execute my function, I still get the same problem:
An error occurred (ValidationException) when calling the Query operation: Query condition missed key schema element: id', 'occurred at index 0')"
This is the code I use:
def query_dipl_dynamo(key_table, valeur_query, name_table):
    dynamoDBResource = boto3.resource('dynamodb')
    table = dynamoDBResource.Table(name_table)
    response = table.query(
        KeyConditionExpression=Key(key_table).eq(valeur_query))
    df_fr = pd.DataFrame([response['Items']])
    if len(df_fr.columns) > 0:
        print("hellooo1")
        df = pd.DataFrame([response['Items'][0]])
        return valeur_query, df["dipl_libelle"].iloc[0]

df9_tmp["dipl_idpp"] = df8_tmp.apply(lambda x: query_dipl_dynamo("dipl_idpp", x["num_auto"], "ddb-dev-PS_LibreAcces_Dipl_AutExerc")[0], axis=1)
Should I change something else besides creating the index? There is too little documentation available.
Thank you!
I just found the solution. When we use indexes, we must provide an argument named IndexName, which takes the name of the index in DynamoDB.
I had to change my code to:
from boto3.dynamodb.conditions import Key  # needed for Key(...).eq(...)

def query_dipl_dynamo(key_table, valeur_query, name_table):
    dynamoDBResource = boto3.resource('dynamodb')
    table = dynamoDBResource.Table(name_table)
    response = table.query(
        IndexName="NameOfTheIndexInDynamoDB",
        KeyConditionExpression=Key(key_table).eq(valeur_query))
    df_fr = pd.DataFrame([response['Items']])
    if len(df_fr.columns) > 0:
        df = pd.DataFrame([response['Items'][0]])
        return valeur_query, df["dipl_libelle"].iloc[0]

df9_tmp["dipl_idpp"] = df8_tmp.apply(lambda x: query_dipl_dynamo("dipl_idpp", x["num_auto"], "ddb-dev-PS_LibreAcces_Dipl_AutExerc")[0], axis=1)

how to avoid updating the id column

I don't want to set the id column myself because it is auto-incremented.
When I run this code, I get:
java.sql.SQLException: No value specified for parameter 1.
So my question is: how do I avoid/bypass setting it?
String sql="";
sql = "insert into registration(first_name,last_name,gender,email_id,dob,"
+ "father_name,mother_name,contact,mobile,address,city,country,graduation,"
+ "graduation_marks,graduation_year,inter,inter_marks,inter_year,high_school,"
+ "high_marks,high_year,role,salary,resume,photo,pre_comp) value(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?);";
// int i=0;
PreparedStatement p = con.prepareStatement(sql);
p.setString(2,registration.first_name);
p.setString(3,registration.last_name);
p.setString(4,registration.gender);
p.setString(5,registration.email_id);
As specified in the error message: "No value specified for parameter 1".
The problem here is that first_name is the first column in the insert query, so it is the first parameter for which a value is expected.
But your PreparedStatement sets the parameter values starting from index 2. It should be done starting with index 1.
That is, p.setString(1, registration.first_name).

update an SQL table via R sqlSave

I have a data frame in R with 3 columns; using sqlSave I can easily create a table in an SQL database:
channel <- odbcConnect("JWPMICOMP")
sqlSave(channel, dbdata, tablename = "ManagerNav", rownames = FALSE, append = TRUE, varTypes = c(DateNav = "datetime"))
odbcClose(channel)
This data frame contains information about Managers (Name, Nav and Date), which is updated every day with new values for the current date; old values could maybe also be updated in case of errors.
How can I accomplish this task in R?
I tried to use sqlUpdate but it returns the following error:
> sqlUpdate(channel, dbdata, tablename = "ManagerNav")
Error in sqlUpdate(channel, dbdata, tablename = "ManagerNav") :
cannot update ‘ManagerNav’ without unique column
When you create a table "the white shark-way" (see documentation), it does not get a primary index, but is just plain columns, and often of the wrong type. Usually, I use your approach to get the columns names right, but after that you should go into your database and assign a primary index, correct column widths and types.
After that, sqlUpdate() might work; I say might, because I have given up on sqlUpdate() (there are too many caveats) and use sqlQuery(..., paste("Update....))) for the real work.
What I would do for this is the following
Solution 1
sqlUpdate(channel, dbdata,tablename="ManagerNav", index=c("ManagerNav"))
Solution 2
Lcolumns <- list(dbdata[0,])
sqlUpdate(channel, dbdata,tablename="ManagerNav", index=c(Lcolumns))
Index is used to specify what columns R is going to update.
Hope this helps!
If none of the other solutions work and your data is not that big, I'd suggest using sqlQuery() and looping through your data frame.
one_row_of_your_df <- function(i) {
  sql_query <-
    paste0("INSERT INTO your_table_name (column_name1, column_name2, column_name3) VALUES",
           "(",
           "'", your_dataframe[i, 1], "'", ",",
           "'", your_dataframe[i, 2], "'", ",",
           "'", your_dataframe[i, 3], "'",
           ")")
  return(sql_query)
}
This function is Exasol-specific; Exasol SQL is pretty similar to MySQL, but not identical, so small changes could be necessary.
Then use a simple for loop like this one:
for (i in 1:nrow(your_dataframe)) {
  sqlQuery(your_connection, one_row_of_your_df(i))
}