Using insert into with R - sql

I am trying to use SQL insert into syntax in R to insert a row into a data frame, but it shows the following error:
(( error in syntax ))
Below is a sample of my code:
Vector <- c("alex" ,"IT")
Tst <- data.frame( name.charcher(), major.charachter())
sqldf( c(" insert into Tst values (" , Vector[1] , "," ,Vector[2] , ")" , "select * from main.Tst "))
I hope my question is clear

A few edits to help address the syntax error:
use a lower case s in the function name (sqldf() instead of Sqldf())
add a comma between "," and Vector[2]
add quotes around select * from main.Tst
Also, to note:
a 1-d structure holding mixed content types, as in your Vector <- c("alex", 32), should be a list rather than an atomic vector, because every element of an atomic vector is coerced to a single type (here, character).
depending on which database driver you're using, sqldf() may return an error if you try to insert values into an empty R data frame, as your code does. Creating the empty table within the sqldf() call is one way to avoid this (used below, since I don't know your database setup).
For example, you could use the following to resolve the error message you're getting:
library(sqldf)
new <- list(name='alex', age=as.integer(32))
Tst <- sqldf(c("create table T1 (name char, age int)",
paste0("insert into T1 (name, age) values ('", new$name[1],"',", new$age[1],")"),
"select * from T1"))
Tst
# > Tst
# name age
# 1 alex 32
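If the goal is simply to append a row, another route is to let DBI build the INSERT so that no quotes have to be pasted by hand. This is a minimal sketch, assuming the DBI and RSQLite packages are installed and using an in-memory SQLite database as a stand-in for whatever backend you actually have. The table name Tst and its two character columns follow the question.

```r
library(DBI)

# In-memory SQLite database used as a stand-in backend (assumption).
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Create an empty table with the two character columns from the question.
dbWriteTable(con, "Tst", data.frame(name = character(), major = character()))

# Append one row; DBI takes care of quoting the string values.
new_row <- data.frame(name = "alex", major = "IT")
dbAppendTable(con, "Tst", new_row)

Tst <- dbGetQuery(con, "select * from Tst")
print(Tst)

dbDisconnect(con)
```

The same data frame could hold several rows, in which case dbAppendTable() inserts them all in one call.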

ProgrammingError: ('Expected 0 parameters, supplied 391', 'HY000') with 391 columns using dynamic approach

I have a dataframe that contains 391 columns and a number of rows. I am trying to push this to a database via pyodbc and using the following command:
cursor = conn.cursor()
cursor.fast_executemany = True
cursor.executemany(
f"INSERT INTO db.tble({', '.join(df.columns.tolist())}) VALUES ({('?,' * len(df.columns))[:-1]})",
list(df.itertuples(index=False, name=None))
)
cursor.commit()
I would have thought this method would be dynamic for a dataframe of any size yet I get the following error:
ProgrammingError: ('Expected 0 parameters, supplied 391', 'HY000')
I am struggling to understand this, as the syntax looks correct: ? has been used instead of %s, as other answers suggest. Can someone please help?
Thanks
I once wrote a piece of code where I wanted to build the insert statement dynamically, based on the number of columns in the data frame.
Here is how the insert query is passed to the database:
INSERT INTO dbo.Table (column1,column2,column3) VALUES (?,?,?)
Again, the column list and the '?' placeholders have to be generated at runtime from the number of columns in the data frame.
I wrote the piece below to build the string of placeholders (?,?,?) and concatenate it with the insert query.
Here:
df is the dataframe,
symbol_counter holds the number of columns in the dataframe minus one (the list already starts with one '?'),
sym_string is the final string, i.e. ?,?,?,...,? with one '?' per column
symbol = ['?']
sym_string = ''
symbol_counter = int(df.shape[1]) - 1
for word in range(symbol_counter):
    # add one more placeholder for each remaining column
    symbol.insert(word, "?")
sym_string = ','.join(symbol)
# then concatenate this variable with the rest of the query, as shown below
query = Variable_holding_first_partofthequery + " VALUES (" + sym_string + ")"
I know it's the long way, but that's how I got it to work. Good luck!

For Loop to iterate SQL query in R

I would like to iterate this SQL query over the 17 rows in my df. My df and code are below. I think I may need single quotes around dat$ptt_id, because I get a syntax error at the "IN" function. Any ideas how to correctly write this?
df looks like:
ptt_id
1 181787
2 181788
3 184073
4 184098
5 197601
6 197602
7 197603
8 197604
9 197605
10 197606
11 197607
12 197608
13 197609
14 200853
15 200854
16 200851
17 200852
#Load data----
dat <- read.csv("ptts.csv")
dat2<-list(dat)
#Send to database----
for(i in 1:nrow(dat)){
q <- paste("SELECT orgnl_pit, t_name, cap_date, species, sex, mass, cap_lat, cap_lon, sat_appld_id
FROM main.capev JOIN uid.turtles USING (orgnl_pit)
WHERE sat_appld_id IN", dat$ptt_id[i],";")
#Get query----
tags <- dbGetQuery(conn, q)
}
Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: syntax error at or near "181787"
LINE 3: WHERE sat_appld_id IN 181787 ;
^
Thanks for any assistance.
Two options:
Parameter binding.
qmarks <- paste0("(", paste(rep("?", nrow(df)), collapse = ","), ")")
qmarks
# [1] "(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"
qry <- paste(
"SELECT orgnl_pit, t_name, cap_date, species, sex, mass, cap_lat, cap_lon, sat_appld_id
FROM main.capev JOIN uid.turtles USING (orgnl_pit)
WHERE sat_appld_id IN", qmarks)
tags <- dbGetQuery(conn, qry, params = df[,1])
Temporary table. This might be more useful when you have a large number of ids to use. (This also works without a temp table if you get the ids from the database anyway, and can use that query in this sub-query.)
dbWriteTable(conn, "mytemp", df)
qry <- "SELECT orgnl_pit, t_name, cap_date, species, sex, mass, cap_lat, cap_lon, sat_appld_id
FROM main.capev JOIN uid.turtles USING (orgnl_pit)
WHERE sat_appld_id IN (select ptt_id from mytemp)"
tags <- dbGetQuery(conn, qry)
dbExecute(conn, "drop table mytemp")
(Name your temp table carefully. DBMSes usually have a nomenclature to ensure the table is automatically cleaned/dropped when you disconnect, often something like "#mytemp". Check with your DBMS or DBA.)
The IN operator requires a list. You can think of it as multiple OR conditions.
E.g. instead of WHERE sat_appld_id IN 181787 it should be WHERE sat_appld_id IN (181787)
And to that point, instead of a loop you could build a list from your dat$ptt_id column for a single SQL query, such as WHERE sat_appld_id IN (181787, 181788, 184073, ...), and do any additional wrangling in your R code instead of making multiple database queries.
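To make the single-query idea concrete, here is a small self-contained sketch of an IN list built from placeholders. It assumes the DBI and RSQLite packages and uses an in-memory SQLite table with made-up rows in place of main.capev. RSQLite accepts ? placeholders; PostgreSQL drivers generally expect $1, $2, ... instead, so the placeholder string would need adjusting there.

```r
library(DBI)

# In-memory SQLite database as a stand-in for the real server (assumption).
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Toy stand-in for the capture table (hypothetical data).
capev <- data.frame(
  sat_appld_id = c(181787, 181788, 184073, 999999),
  t_name = c("a", "b", "c", "d")
)
dbWriteTable(con, "capev", capev)

# Ids to look up, as they would come from dat$ptt_id.
ids <- c(181787, 184073)

# One "?" per id, wrapped in parentheses for IN (...).
qmarks <- paste0("(", paste(rep("?", length(ids)), collapse = ","), ")")
qry <- paste(
  "SELECT sat_appld_id, t_name FROM capev WHERE sat_appld_id IN", qmarks,
  "ORDER BY sat_appld_id"
)

# Each placeholder receives exactly one value, so pass an unnamed list.
tags <- dbGetQuery(con, qry, params = as.list(ids))
print(tags)

dbDisconnect(con)
```

One round trip returns all matching rows, so no loop (and no overwriting of tags on each iteration) is needed.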

SQL "where" clause failing with R JDBC HANA connection

I've had nothing but trouble connecting to my company's HANA db through R, but finally had a breakthrough. However, my SQL statement now fails when I subset the data using a "where" clause.
The following returns a data frame of 10 observations across 9 variables
# Fetch all results
rs <- dbSendQuery(jdbcConnection, 'SELECT TOP 10
VISITTYPE,
ACCOUNT,
PLANNEDSTART,
PLANNEDEND,
EXECUTIONSTART,
EXECUTIONEND,
STATUS,
SOURCE,
ACCOUNT_NAME
FROM "_SYS_BIC"."cona-reporting.field-sales/Q_CA_R_SpringVisit"')
a <- dbFetch(rs)
However when I throw a where into it, I receive an error.
rs <- dbSendQuery(jdbcConnection, 'SELECT TOP 10
VISITTYPE,
ACCOUNT,
PLANNEDSTART,
PLANNEDEND,
EXECUTIONSTART,
EXECUTIONEND,
STATUS,
SOURCE,
ACCOUNT_NAME
FROM "_SYS_BIC"."cona-reporting.field-sales/Q_CA_R_SpringVisit" WHERE VISITTYPE = ZR')
Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", :
Unable to retrieve JDBC result set for SELECT TOP 10
VISITTYPE,
ACCOUNT,
PLANNEDSTART,
PLANNEDEND,
EXECUTIONSTART,
EXECUTIONEND,
STATUS,
SOURCE,
ACCOUNT_NAME
FROM "_SYS_BIC"."cona-reporting.field-sales/Q_CA_R_SpringVisit" WHERE VISITTYPE = ZR (SAP DBTech JDBC: [260] (at 222): invalid column name: ZR: line 11 col 101 (at pos 222))
What does this mean? ZR is not a column, it is a value within the column. Tried placing ZR in quotes to no other effect.
My double and single quote syntax is based on this other question I've asked.
Issues connecting R to HANA db with many special characters
I never got it working with RODBC, so I tried RJDBC.
Likely the cause is the handling of quotes within a quote-enclosed string, further complicated by the double quotes that SQL uses for identifiers. However, consider parameterization (an industry best practice whenever running SQL from an application layer such as R), which avoids the need for quote punctuation or concatenation. Like most JDBC APIs, RJDBC supports parameterization. Also note that dbGetQuery is essentially dbSendQuery + dbFetch:
sql <- 'SELECT TOP 10 VISITTYPE,
ACCOUNT,
PLANNEDSTART,
PLANNEDEND,
EXECUTIONSTART,
EXECUTIONEND,
STATUS,
SOURCE,
ACCOUNT_NAME
FROM "_SYS_BIC"."cona-reporting.field-sales/Q_CA_R_SpringVisit"
WHERE VISITTYPE = ?'
param <- 'ZR'
df <- dbGetQuery(jdbcConnection, sql, param)
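When a driver does not accept bound parameters, DBI's sqlInterpolate() can quote the value for you instead of hand-placed quotes. This is only a sketch, assuming the DBI package is installed. The schema and view names are hypothetical, and ANSI() is DBI's dummy connection that applies standard SQL quoting rules, which may differ in detail from what HANA expects.

```r
library(DBI)

# ?visit marks the spot where the quoted value is interpolated.
sql <- 'SELECT TOP 10 VISITTYPE FROM "MY_SCHEMA"."my_view" WHERE VISITTYPE = ?visit'

# The value is wrapped in single quotes (and escaped if necessary).
query <- sqlInterpolate(ANSI(), sql, visit = "ZR")
print(query)
```

The resulting object can be passed to dbGetQuery() like any other SQL string.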
To complete the previous answer (which is of course preferable, as it uses bind variables), here is the root cause of the problem:
A single quote inside a single-quoted string must be escaped.
Unlike Oracle SQL, which escapes a quote by doubling it, R uses a backslash.
The proper usage is as follows:
> df <- dbGetQuery(jdbcConnection,
+ 'select * from "DUAL" where "DUMMY" = \'X\'')
> df
DUMMY
1 X
An alternative is to use a double-quoted string:
> df <- dbGetQuery(jdbcConnection,
+ "select * from \"DUAL\" where \"DUMMY\" = 'X'")
> df
DUMMY
1 X
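The two spellings above produce exactly the same text, which can be checked without a database. The short sketch below also shows a third option, raw strings, which assumes R 4.0.0 or later.

```r
# Three ways to write the same SQL text in R.
sql_escaped_single <- 'select * from "DUAL" where "DUMMY" = \'X\''
sql_escaped_double <- "select * from \"DUAL\" where \"DUMMY\" = 'X'"
# Raw string (R >= 4.0.0): nothing needs escaping inside r"( ... )".
sql_raw <- r"(select * from "DUAL" where "DUMMY" = 'X')"

# cat() shows the text the database receives, without R's escape characters.
cat(sql_raw, "\n")

all_same <- identical(sql_escaped_single, sql_escaped_double) &&
  identical(sql_escaped_single, sql_raw)
print(all_same)
```

Printing with cat() rather than print() is a quick way to see what the driver will actually be sent.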

Putting output from sql query into another query using R environment

I am wondering what approach to use to do what the title describes. I am using an ODBC connection, and my first SQL query returns about 40-50 rows in one column. I want to use this output as the values to search for in a second query.
How should I treat this? As an array, or as separate variables? I still do not know R well, so I just need to know where to look.
Regards
------more explanation below----
I have a list of 40-50 numbers of 10 digits each, organized in a column.
I am trying to do this:
list <- c(my_input)
sql_in <- paste0(list, collapse="")
and after these operations the characters are organized like this:
'c(1234567890, , 1234567890, 1234567890)'
Almost everything looks fine and fits my query, apart from the extra c character at the beginning and the missing apostrophes. I tried to use the gsub function, but it did not work the way I want.
You may likely do this in one SQL call using a subquery. Notice in the call below that the result of
SELECT n_gear
FROM Gear
WHERE n_gear IN (3,4)
is passed to the WHERE clause of the primary query. This is perfectly valid and will allow your query to execute entirely in SQL without having to do any intermediate steps in R.
(I use sqldf for simplicity of illustration, but this should work through just about any ODBC connection)
library(sqldf)
Gear <- data.frame(n_gear = 1:5)
sqldf(
"SELECT mpg, qsec, gear, wt
FROM mtcars
WHERE gear IN (SELECT n_gear
FROM Gear
WHERE n_gear IN (3,4))"
)
Try something like this:
list<-c("try","this") #The output from your first query
sql_in<-paste0(list, collapse="','")
The Output
paste("select * from table where table.var in ",paste("('",sql_in,"')",sep=''))
[1] "select * from table where table.var in ('try','this')"
If an element has a space as its first or last character, you can use this code:
list<-c(" first element is a space","try","this","last element is a space ") #The output from your first query
Find a space at the first or last character
first_space<-substr(list, start = 1, stop = 1)==" "
last_space<-substr(list, start = nchar(list), stop = nchar(list))==" "
Remove spaces
list[first_space]<-substr(list[first_space], start = 2, stop = nchar(list[first_space]))
list[last_space]<-substr(list[last_space], start = 1, stop = nchar(list[last_space])-1)
sql_in<-paste0(list, collapse="','")
Your output
paste0("select * from table where table.var in ",paste("('",sql_in,"')",sep=''))
"select * from table where table.var in ('first element is a space','try','this','last element is a space')"
I think you are expecting something like the code below:
data <- dbGetQuery(con, "select column from yourfirsttable")
list <- paste(data$column, collapse="','")
result <- dbGetQuery(con, statement = sprintf("select * from yourresulttable where inv in ('%s')",list))
It's not entirely clear exactly what you're wanting to achieve here. For example, one use case just means you can do it all with a join. But I have cases where I don't know the values for the test without doing some computation. Then I do a separate query having created a query string thus:
> id <- 1:5
> paste0("SELECT * FROM table WHERE ID IN (", paste0(id, collapse = ","), ")")
[1] "SELECT * FROM table WHERE ID IN (1,2,3,4,5)"
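When the values are strings, pasting quotes by hand breaks as soon as one value contains an apostrophe. A hedged sketch of a safer variant, assuming the DBI package is installed: trimws() removes stray leading and trailing whitespace, and dbQuoteString() adds the quotes and doubles any embedded ones. ANSI() is DBI's dummy connection; with a real connection you would pass that instead so the backend's own quoting rules apply. The table and column names are taken from the answer above and are placeholders.

```r
library(DBI)

# Values as they might come back from the first query (made-up examples).
values <- trimws(c(" try", "this ", "O'Brien"))

# Quote each value; embedded single quotes are doubled.
quoted <- as.character(dbQuoteString(ANSI(), values))

in_list <- paste(quoted, collapse = ", ")
query <- paste0("SELECT * FROM yourresulttable WHERE inv IN (", in_list, ")")
print(query)
```

For numeric ids no quoting is needed, so the plain paste0() with collapse = "," shown above is enough.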

How to use parameters with RPostgreSQL (to insert data)

I'm trying to insert data into a pre-existing PostgreSQL table using RPostgreSQL and I can't figure out the syntax for SQL parameters (prepared statements).
E.g. suppose I want to do the following
insert into mytable (a,b,c) values ($1,$2,$3)
How do I specify the parameters? dbSendQuery doesn't seem to understand if you just put the parameters in the ....
I've found dbWriteTable can be used to dump an entire table, but won't let you specify the columns (so no good for defaults etc.). And anyway, I'll need to know this for other queries once I get the data in there (so I suppose this isn't really insert specific)!
Sure I'm just missing something obvious...
I was looking for the same thing, for the same reasons, which is security.
Apparently the dplyr package has the capability you are interested in. It's barely documented, but it's there. Scroll down to "Postgresql" in this vignette: http://cran.r-project.org/web/packages/dplyr/vignettes/databases.html
To summarize, dplyr offers the functions sql() and escape(), which can be combined to build a safely escaped query (the values are escaped into the SQL text rather than bound as parameters). The SQL() function from the DBI package seems to work in exactly the same way.
> sql(paste0('SELECT * FROM blaah WHERE id = ', escape('random "\'stuff')))
<SQL> SELECT * FROM blaah WHERE id = 'random "''stuff'
It returns an object of classes "sql" and "character", so you can either pass it on to tbl() or possibly dbSendQuery() as well.
The escape() function correctly handles vectors as well, which I find most useful:
> sql(paste0('SELECT * FROM blaah WHERE id in ', escape(1:5)))
<SQL> SELECT * FROM blaah WHERE id in (1, 2, 3, 4, 5)
The same naturally works with variables as well:
> tmp <- c("asd", 2, date())
> sql(paste0('SELECT * FROM blaah WHERE id in ', escape(tmp)))
<SQL> SELECT * FROM blaah WHERE id in ('asd', '2', 'Tue Nov 18 15:19:08 2014')
I feel much safer now putting together queries.
As of the latest RPostgreSQL it should work:
db_connection <- dbConnect(dbDriver("PostgreSQL"), dbname = database_name,
host = "localhost", port = database_port, password=database_user_password,
user = database_user)
qry = "insert into mytable (a,b,c) values ($1,$2,$3)"
dbSendQuery(db_connection, qry, c(1, "some string", "some string with | ' "))
Here's a version using the DBI and RPostgres packages, and inserting multiple rows at once, since all these years later it's still very difficult to figure out from the documentation.
x <- data.frame(
a = c(1:10),
b = letters[1:10],
c = letters[11:20]
)
# insert your own connection info
con <- DBI::dbConnect(
RPostgres::Postgres(),
dbname = '',
host = '',
port = 5432,
user = '',
password = ''
)
RPostgres::dbSendQuery(
con,
"INSERT INTO mytable (a,b,c) VALUES ($1,$2,$3);",
list(
x$a,
x$b,
x$c
)
)
The help for dbBind() in the DBI package is the only place that explains how to format parameters:
The placeholder format is currently not specified by DBI; in the
future, a uniform placeholder syntax may be supported. Consult the
backend documentation for the supported formats.... Known examples are:
? (positional matching in order of appearance) in RMySQL and RSQLite
$1 (positional matching by index) in RPostgres and RSQLite
:name and $name (named matching) in RSQLite
? is also the placeholder for R package RJDBC.
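To tie the placeholder formats to the original question about defaults, here is a self-contained sketch of a multi-row parameterized insert that names only some of the columns. It assumes the DBI and RSQLite packages and an in-memory SQLite database, so it uses ? placeholders; with RPostgres the statement would use $1, $2 instead. The table definition, including the default on column c, is made up for illustration.

```r
library(DBI)

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbExecute(con, "CREATE TABLE mytable (a INTEGER, b TEXT, c TEXT DEFAULT 'none')")

x <- data.frame(a = 1:3, b = letters[1:3])

# One list element per placeholder; each element holds one value per row.
# Only a and b are named in the statement, so c falls back to its default.
dbExecute(
  con,
  "INSERT INTO mytable (a, b) VALUES (?, ?)",
  params = list(x$a, x$b)
)

result <- dbGetQuery(con, "SELECT * FROM mytable ORDER BY a")
print(result)

dbDisconnect(con)
```

dbExecute() is used here rather than dbSendQuery() because an INSERT returns no result set that would need to be fetched and cleared.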