Error in using variables in SQL statements [duplicate] - sql

I'm extracting some data from a table using a SQL select statement in R:
query <- "select * from MyTable where TimeCol='6/29/2012 21:05' ";
result <- fn$sqldf(query);
The above code gives correct results, but when the time value is saved in a variable, it doesn't work:
mytime <- "6/29/2012 21:05";
query <- "select * from MyTable where TimeCol = $mytime"; # OR
query <- "select * from MyTable where TimeCol = $[mytime]"; # OR
query <- "select * from MyTable where TimeCol = '$[mytime]' ";
result <- fn$sqldf(query);
None of the above three lines works.
View(result) gives the error: invalid 'x' argument

$[] and $() are not valid syntax, and the quotes that surrounded the time string in the first query are missing from the later ones. A correct version would be:
library(sqldf)
mytime <- "6/29/2012 21:05"
MyTable <- data.frame(TimeCol = mytime)
query <- "select * from MyTable where TimeCol = '$mytime' "
fn$sqldf(query)

Although the answer I linked to in a comment uses a different function to query the data.frame, the principle is the same: paste the variable to the rest of your select string, ensuring quotes are included when necessary (using shQuote), then pass that character string to your sql querying function of choice.
query <- paste0("select * from MyTable where TimeCol = ", shQuote(mytime))
result <- fn$sqldf(query)
The semicolons at the ends of your lines probably aren't necessary.
As Joran mentions in a comment, sprintf could also be used (perhaps with a gain in readability in case there are many variable components in your query string):
sprintf("select * from MyTable where %s = '%s'", "TimeCol", mytime)
# [1] "select * from MyTable where TimeCol = '6/29/2012 21:05'"
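One caveat worth checking: shQuote is meant for shell quoting, and on Windows its default style wraps the string in double quotes rather than single quotes. A minimal sketch, assuming the database expects single-quoted string literals, that requests type = "sh" explicitly and confirms the paste0 and sprintf approaches build the same query text:

```r
mytime <- "6/29/2012 21:05"

# type = "sh" requests single quotes explicitly instead of relying on the
# platform default.
q_paste <- paste0("select * from MyTable where TimeCol = ",
                  shQuote(mytime, type = "sh"))

q_sprintf <- sprintf("select * from MyTable where %s = '%s'", "TimeCol", mytime)

# Both approaches should produce the same query text.
identical(q_paste, q_sprintf)
```

Either string can then be passed to the querying function of your choice.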

Related

Run SQL script from R with variables defined in R

I have an SQL script which I need to run using R Studio. However, my SQL script has one variable that is defined in my R environment. I am using dbGetQuery; however, I do not know (and I didn't find a solution) how to pass these variables.
library(readr)
library(DBI)
library(odbc)
library(RODBC)
# create connection (fake one here)
con <- odbcConnect(...)
dt = Sys.Date()
df = dbGetQuery(.con, statement = read_file('Query.sql'))
The file 'Query.sql' makes reference to dt. How do I make the file recognize my variable dt?
There are several options, but my preferred one is "bound parameters".
If, for instance, your 'Query.sql' looks something like
select ...
from MyTable
where CreatedDateTime > ?
The ? is a place-holder for a binding.
Then you can do
con <- dbConnect(...) # from DBI
df = dbGetQuery(con, statement = read_file('Query.sql'), params = list(dt))
With more parameters, add more ?s and more objects to the list, as in
qry <- "select ... where a > ? and b < ?"
newdat <- dbGetQuery(con, qry, params = list(var1, var2))
If you need a SQL IN clause, it gets a little dicey, since a single ? binds exactly one value rather than a whole vector; you need one ? per candidate value.
candidate_values <- c(2020, 1997, 1996, 1901)
qry <- paste("select ... where a > ? and b in (", paste(rep("?", length(candidate_values)), collapse=","), ")")
qry
# [1] "select ... where a > ? and b in ( ?,?,?,? )"
df <- dbGetQuery(con, qry, params = c(list(avar), as.list(candidate_values)))
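To try the binding approach without a real server, here is a minimal sketch. It assumes the RSQLite package is installed and uses an in-memory SQLite database as a stand-in; the table contents and the value of avar are invented for illustration. Placeholder syntax can differ between backends, so check your driver's documentation.

```r
library(DBI)

# Assumption: an in-memory SQLite database (RSQLite) stands in for the real
# server; MyTable and its columns are invented for illustration.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "MyTable", data.frame(
  a = c(1, 5, 9, 12),
  b = c(2020, 1997, 1950, 1901)
))

avar <- 3
candidate_values <- c(2020, 1997, 1996, 1901)

# One "?" per candidate value, joined with commas.
placeholders <- paste(rep("?", length(candidate_values)), collapse = ",")
qry <- paste0("select a, b from MyTable where a > ? and b in (", placeholders, ")")

# Parameters are matched to the "?"s by position: first avar, then each
# candidate value.
res <- dbGetQuery(con, qry, params = c(list(avar), as.list(candidate_values)))
dbDisconnect(con)
```

Because the values travel separately from the SQL text, no manual quoting of the parameters is needed.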

new column each row results from a query to mysql database in R

I've got a simple dataframe with three columns. One of the columns contains a database name. I first need to check if data exists, and if not, insert it. Otherwise do nothing.
Sample data frame:
clientid;region;database
135;Europe;europedb
2567;Asia;asiadb
23;America;americadb
So I created a function to apply to dataframe this way:
library(RMySQL)
check_if_exist <- function(df){
con <- dbConnect(MySQL(),
user="myuser", password="mypass",
dbname=df$database, host="myhost")
query <- paste0("select count(*) from table where client_id='", df$clientid,"' and region='", df$region ,"'")
rs <- dbSendQuery(con, query)
rs
}
Function call:
df$new_column <- lapply(df, check_if_exist)
But this doesn't work.
This is a working example of what you are asking, if I understood correctly. I don't have your database, so the function just prints the query for verification and returns a random number as the result.
Note that lapply(df, ...) loops over the columns of the data frame, not over the rows as you want.
df = read.table(text="clientid;region;database
135;Europe;europedb
2567;Asia;asiadb
23;America;americadb",header=T,sep=";")
check_if_exist <- function(df){
query = paste0("select count(*) from table where client_id='", df$clientid,"' and region='", df$region ,"'")
print(query)
rs <- runif(1,0,1)
return(rs)
}
df$new_column <- sapply(split(df,seq(1,nrow(df))),check_if_exist)
Hope this helps.
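An alternative to splitting the data frame is to build all the queries at once and then walk the rows with mapply. This is only a sketch: run_count is a hypothetical stand-in for the real database call and returns a fake count so the loop can be run without a server.

```r
df <- read.table(text = "clientid;region;database
135;Europe;europedb
2567;Asia;asiadb
23;America;americadb", header = TRUE, sep = ";", stringsAsFactors = FALSE)

# sprintf is vectorised, so this yields one query string per row.
df$query <- sprintf(
  "select count(*) from table where client_id='%s' and region='%s'",
  df$clientid, df$region
)

# Hypothetical stand-in for the real database call: it returns a fake count
# (the length of the database name) instead of contacting a server.
run_count <- function(dbname, query) {
  nchar(dbname)
}

# Iterate over rows, pairing each row's database name with its query.
df$new_column <- mapply(run_count, df$database, df$query, USE.NAMES = FALSE)
```

In real use, the body of run_count would open a connection to dbname, run the query, fetch the count, and disconnect.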

R RDB Query of date using RODBC Connection

I am trying to query an old Oracle RDB database using something like Sys.Date()-1 as that works in R, but I have not been able to find the right syntax.
The following works but I want to run it on a regular schedule so a fixed time frame will not work :
Output <- SQLQuery ("SELECT * FROM tablename WHERE productionDate BETWEEN 20-FEB-2017 00:00:00 AND 21-FEB-2017 23:59:00")
I would like to have something like:
Output <- SQLQuery ("SELECT * FROM tablename WHERE productionDate >= Today()-1")
I have also tried assigning macro variables outside of the query and then calling them inside the query, with no success. The entries go back several years, so pulling everything takes a couple of minutes and I then have to subset the data after the fact. I would hope there is a better way for R to query a database, even an old one, by date.
Thank you for any help.
something like this?
startDt <- as.Date("2016-02-12")
endDate <- as.Date("2016-03-12")
sqlstr <- paste0("SELECT * FROM tablename WHERE productionDate BETWEEN '",
paste(format(c(startDt, endDate), "%d %B %Y"), collapse="' and '"),"'")
sqlstr
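For a scheduled job, the dates can be derived from Sys.Date() instead of being fixed. The sketch below assumes the database accepts quoted DD-MON-YYYY literals like those in the question's working query; rdb_date and build_query are hypothetical helper names.

```r
# Format a Date as DD-MON-YYYY using base R's English month.abb constants,
# which avoids depending on the session locale.
rdb_date <- function(d) {
  sprintf("%s-%s-%s",
          format(d, "%d"),
          toupper(month.abb[as.integer(format(d, "%m"))]),
          format(d, "%Y"))
}

# Build a query covering yesterday 00:00:00 through today 23:59:00.
# `today` defaults to the current date, so a scheduled run needs no edits.
build_query <- function(today = Sys.Date()) {
  sprintf(
    "SELECT * FROM tablename WHERE productionDate BETWEEN '%s 00:00:00' AND '%s 23:59:00'",
    rdb_date(today - 1), rdb_date(today)
  )
}

# A fixed date is passed here only to make the result reproducible.
sqlstr <- build_query(as.Date("2017-02-21"))
sqlstr
```

Calling build_query() with no argument uses the current date.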

Dynamic SQL Query in R (WHERE)

I am trying out some dynamic SQL queries using R and the postgres package to connect to my DB.
Unfortunately I get an empty data frame if I execute the following statement:
x <- "Mean"
query1 <- dbGetQuery(con, statement = paste(
"SELECT *",
"FROM name",
"WHERE statistic = '",x,"'"))
I believe that there is a syntax error somewhere in the last line. I already changed the commas and quotation marks in every possible way, but nothing seems to work.
Does anyone have an idea how I can construct this SQL Query with a dynamic WHERE Statement using a R variable?
You should use paste0 instead of paste. By default paste separates its arguments with a space, which puts unwanted spaces inside the quotes; paste(..., sep='') avoids that but is slightly less efficient (see ?paste0).
You should also consider building your SQL statement in a separate variable. That way you can always easily check what SQL is being produced.
I would use this (and I use it all the time):
x <- "Mean"
sql <- paste0("select * from name where statistic='", x, "'")
# print(sql)
query1 <- dbGetQuery(con, sql)
When the SQL lives inside a function, I always add a debug parameter so I can see what SQL is used:
get_statistic <- function(x = NA, debug = FALSE) {
  sql <- paste0("select * from name where statistic='", x, "'")
  # print the generated SQL when debugging
  if (debug) print(sql)
  query1 <- dbGetQuery(con, sql)
  query1
}
Then I can simply call get_statistic('Mean', debug=TRUE) and see immediately whether the generated SQL is really what I expected.
The Problem
The problem may be that you have spaces around Mean:
x <- "Mean"
s <- paste(
"SELECT *",
"FROM name",
"WHERE statistic = '",x,"'")
giving:
> s
[1] "SELECT * FROM name WHERE statistic = ' Mean '"
Corrected Version
Instead try:
s <- sprintf("select * from name where statistic = '%s'", x)
giving:
> s
[1] "select * from name where statistic = 'Mean'"
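The spacing difference comes from paste's default separator, which can be checked directly in base R. A small sketch using only the WHERE fragment from the question:

```r
x <- "Mean"

# Default sep = " " inserts a space on each side of x.
with_spaces <- paste("WHERE statistic = '", x, "'")

# An explicit empty separator removes the spaces; paste0 is the shorthand.
no_spaces  <- paste("WHERE statistic = '", x, "'", sep = "")
via_paste0 <- paste0("WHERE statistic = '", x, "'")

with_spaces
no_spaces
identical(no_spaces, via_paste0)
```

The first string compares against ' Mean ' with surrounding spaces, which would explain the empty result.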
gsubfn
You could also try this:
library(gsubfn)
fn$dbGetQuery(con, "SELECT *
FROM name
WHERE statistic = '$x'")
Try this:
require(stringi)
stri_paste("SELECT * ",
"FROM name ",
"WHERE statistic = '",x,"'",collapse="")
## [1] "SELECT * FROM name WHERE statistic = 'Mean'"
Or use the concatenation operator %+%, also from stringi:
"SELECT * FROM name WHERE statistic ='" %+% x %+% "'"
## [1] "SELECT * FROM name WHERE statistic ='Mean'"
A newer way to do this is with the glue package, part of the tidyverse. It is described as "An implementation of interpreted string literals, inspired by Python's Literal String Interpolation."
Using glue, you would do:
library(glue)
library(DBI)
x <- "Mean"
query1 <- glue_sql("
SELECT *
FROM name
WHERE statistic = ({x})
", .con = con)
dbGetQuery(con, query1)
It's a great package due to its flexibility. For example, say you wanted to import the mean, median and mode statistics. You would add an asterisk inside the braces, which expands the vector into a comma-separated list, and switch the comparison to IN:
x <- c("Mean", "Median", "Mode")
query2 <- glue_sql("
SELECT *
FROM name
WHERE statistic IN ({x*})
", .con = con)
dbGetQuery(con, query2)

Specifying an SQL where statement based on a string in R (using rodbc)

This is my first attempt at using R to access data from within MS Access using ODBC.
The following query works:
id <- levels(assetid)[assetid[,1]][12]
qry <- "SELECT DriverName FROM Data WHERE ID = 'idofinterest'"
sqlQuery(con, qry)
However, I would like to know if there is a way to use the variable "id" in the "qry" statement (without using paste)? I have seen some statements on the web with $ and % signs - however I haven't had any success in using them.
Thanks.
Why don't you want to use paste? Anyway, sprintf is an alternative means of string munging.
qry <- sprintf("SELECT DriverName FROM Data WHERE ID = '%s'", id)
sqlQuery(con, qry)
Try fn$ from the gsubfn package:
> library(gsubfn)
> id <- "abc"
> fn$identity("SELECT DriverName FROM Data WHERE ID = '$id'")
[1] "SELECT DriverName FROM Data WHERE ID = 'abc'"
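One thing to watch with sprintf, paste, or fn$: a value that itself contains a single quote ends the SQL string literal early. A minimal sketch, assuming the database follows the standard SQL convention of doubling single quotes inside a literal; sql_string is a hypothetical helper, not part of RODBC or gsubfn.

```r
# Hypothetical helper: double any single quotes, then wrap the value in
# single quotes so it forms one SQL string literal.
sql_string <- function(x) {
  paste0("'", gsub("'", "''", x, fixed = TRUE), "'")
}

id <- "O'Brien"

# The helper supplies the surrounding quotes, so the template uses a bare %s.
qry <- sprintf("SELECT DriverName FROM Data WHERE ID = %s", sql_string(id))
qry
```

The resulting string can be passed to sqlQuery(con, qry) as in the answers above.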