Replace " to ' in R for SQL query - sql

I'm working on an R Shiny application, and the input passed from the app to the R Markdown (RMD) document is a character vector of choices that depends on what the user selects. Printing the vector gives:
[1] "One" "Two" "Three"
I want to plug the choices into the SQL query in the RMD in the form:
('One', 'Two', 'Three')

I highly recommend you do not use paste0 due to the possibility of SQL injection.
Instead, you should use dbplyr::escape, which escapes string inputs properly (with respect to backend implied by the supplied connection). For example:
dbplyr::escape(x = c("one", "two", "three"),
               parens = TRUE,
               collapse = ",",
               con = con_oracle)
produces the string:
# <SQL> ('one','two','three')

Assuming your input is:
x <- c("One", "Two", "Three")
we can build a WHERE IN clause using paste with collapse:
sql_where <- paste0("WHERE some_col IN ('", paste(x, collapse="', '"), "')")
sql_where
[1] "WHERE some_col IN ('One', 'Two', 'Three')"

Related

Create dynamic SQL query depending on user input in R Shiny App

I have a Shiny app where the user can filter a SQL database of movies. So far, you can only filter by country.
library(shiny)
library(DBI)
con <- dbConnect(RSQLite::SQLite(), 'Movies.db')
movies_data <- dbReadTable(con, 'Movies')
ui <- fluidPage(
  fluidRow(
    selectInput(
      inputId = "country",
      label = "Country:",
      choices = movies_data$journal,
      multiple = TRUE
    ),
    br(),
    fluidRow(width = "100%",
      dataTableOutput("table")
    )
  )
)
server <- function(input, output, session) {
  output$table <- renderDataTable({
    dbGetQuery(
      conn = con,
      statement = 'SELECT * FROM movies WHERE country IN ( ? )',
      params = list(input$country))
  })
}
shinyApp(ui = ui, server = server)
Now I want to give the user more filters, for example Actor or Genre. All filters are multiselect and optional. How can I build the statement dynamically? Would I need a switch statement for every possible combination (e.g. no filter on Country but only Action movies)? This seems a bit exhausting to me.
First off, you say the filter is optional but I see no way to disable it in your code. I'm assuming that deselecting all options is your way of disabling the filter, or at least that it's intended to work that way. If all options are selected for any filter, then the current approach should work fine, and will just show all films.
You can probably just construct the overall query piece by piece, and then paste it all together at the end.
Base query: 'SELECT * FROM movies'
Country filter: 'country in ' input country
Actor filter: 'actor in' input actor
Genre filter: 'genre in' input genre
Then you put it all together with paste.
To summarize: Base query. Then, if any of the filters are active, add a WHERE. Join all filters together, separating by AND. Pass the final query in as a direct string.
You can even put the filters into a list for easier parsing.
# Here, filterList is a list containing input$country, input$actor, input$genre
# and filterNames contains the corresponding column names in the database
# e.g. filterList <- list("c1", c("a1", "a2"), "g1")
#      filterNames <- c("c", "a", "g")
baseQuery <- "SELECT * FROM movies"
# A filter is active when it has at least one selected value
active <- sapply(filterList, length) > 0
# NOTE: If you have a different selection available for None
# just modify the sapply function accordingly
if (any(active)) {
  # Drop inactive filters so they do not produce an empty "IN ()"
  filterList <- filterList[active]
  filterNames <- filterNames[active]
  # This collapses multiselects for a filter into a single string with a comma separator
  # (values are pasted as-is; string values still need quoting and escaping)
  filterValues <- sapply(filterList, paste, collapse = ", ")
  # Now you construct the filters, one "column IN (...)" clause each
  clauses <- paste0(filterNames, " IN (", filterValues, ")")
  # Paste the filters together and append them to the base query
  baseQuery <- paste(baseQuery, "WHERE", paste(clauses, collapse = " and "))
}
# Final output using the sample input above:
# "SELECT * FROM movies WHERE c IN (c1) and a IN (a1, a2) and g IN (g1)"
Now use baseQuery as the direct query statement
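Pasting the selected values straight into the SQL text leaves quoting and escaping up to you. As a rough sketch of the same piece-by-piece construction with bound parameters instead, here is a version in Python using the standard sqlite3 module rather than R. The build_query helper, the in-memory table, and its rows are invented for illustration and are not part of the original app.

```python
import sqlite3

def build_query(filters):
    """Build a SELECT with one IN (...) clause per non-empty filter.

    `filters` maps a column name to the list of selected values.
    Column names must come from trusted code, never from user input.
    """
    clauses, params = [], []
    for column, values in filters.items():
        if not values:  # an empty selection disables this filter
            continue
        placeholders = ", ".join("?" for _ in values)
        clauses.append(f"{column} IN ({placeholders})")
        params.extend(values)
    sql = "SELECT * FROM movies"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params

# Hypothetical in-memory data, only to make the sketch runnable
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE movies (title TEXT, country TEXT, genre TEXT)")
con.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [("A", "US", "Action"), ("B", "FR", "Drama"), ("C", "US", "Drama")],
)

sql, params = build_query({"country": ["US", "FR"], "genre": ["Drama"]})
rows = con.execute(sql + " ORDER BY title", params).fetchall()
print(sql)
print(rows)
```

Only the number of ? placeholders depends on the selection; the values themselves never become part of the SQL string.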

Maintain SQL operator precedence when constructing Q objects in Django

I am trying to construct a complex query in Django by adding Q objects based on a list of user inputs:
from django.db.models import Q
q = Q()
expressions = [
    {'operator': 'or', 'field': 'f1', 'value': 1},
    {'operator': 'or', 'field': 'f2', 'value': 2},
    {'operator': 'and', 'field': 'f3', 'value': 3},
    {'operator': 'or', 'field': 'f4', 'value': 4},
]
for item in expressions:
    if item['operator'] == 'and':
        q.add(Q(**{item['field']: item['value']}), Q.AND)
    elif item['operator'] == 'or':
        q.add(Q(**{item['field']: item['value']}), Q.OR)
Based on this I am expecting to get a query with the following where condition:
f1 = 1 or f2 = 2 and f3 = 3 or f4 = 4
which, based on the default operator precedence will be executed as
f1 = 1 or (f2 = 2 and f3 = 3) or f4 = 4
however, I am getting the following query:
((f1 = 1 or f2 = 2) and f3 = 3) or f4 = 4
It looks like the Q() object forces the conditions to be evaluated in the order they were added.
Is there a way that I can keep the default SQL precedence? Basically I want to tell the ORM not to add parenthesis in my conditions.
It seems that you are not the only one with a similar problem. (Edited due to @hynekcer's comment.)
A workaround would be to "parse" the incoming parameters into a list of Q() objects and create your query from that list:
from functools import reduce  # reduce() is not a builtin in Python 3
from operator import or_
from django.db.models import Q
query_list = []
for item in expressions:
    condition = Q(**{item['field']: item['value']})
    if item['operator'] == 'and' and query_list:
        # query_list must have at least one item for this to work
        query_list[-1] = query_list[-1] & condition
    elif item['operator'] == 'or':
        query_list.append(condition)
    else:
        # If you find yourself here, something went wrong...
        raise ValueError("unexpected expression: %r" % (item,))
Now the query_list contains the individual queries as Q() or the Q() AND Q() relationships between them.
The list can be reduce()d with the or_ operator to create the remaining OR relationships and used in a filter(), get() etc. query:
MyModel.objects.filter(reduce(or_, query_list))
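To see the grouping this produces without a Django project, here is a small sketch that swaps Q for a hypothetical stand-in class, Cond, which only records the parentheses as text. Cond is not part of Django; the loop mirrors the one above and reuses the expressions list from the question.

```python
from functools import reduce
from operator import or_

class Cond:
    """Hypothetical stand-in for django.db.models.Q that records grouping as text."""
    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        return Cond(f"({self.text} AND {other.text})")

    def __or__(self, other):
        return Cond(f"({self.text} OR {other.text})")

expressions = [
    {'operator': 'or', 'field': 'f1', 'value': 1},
    {'operator': 'or', 'field': 'f2', 'value': 2},
    {'operator': 'and', 'field': 'f3', 'value': 3},
    {'operator': 'or', 'field': 'f4', 'value': 4},
]

groups = []
for item in expressions:
    cond = Cond(f"{item['field']} = {item['value']}")
    if item['operator'] == 'and' and groups:
        # Fold an AND into the previous group so it binds tighter than OR
        groups[-1] = groups[-1] & cond
    elif item['operator'] == 'or':
        groups.append(cond)
    else:
        raise ValueError(f"unexpected expression: {item!r}")

# Join the AND-groups with OR
combined = reduce(or_, groups)
print(combined.text)
```

The f2/f3 pair ends up inside its own parentheses before any OR is applied, which is the precedence the question asks for.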
PS: Although Kevin's answer is clever, using eval() is considered a bad practice and should be avoided.
Since SQL precedence is the same as Python precedence when it comes to AND, OR, and NOT, you should be able to achieve what you want by letting Python parse the expression.
One quick-and-dirty way to do it would be to construct the expression as a string and let Python eval() it.
from functools import reduce
ops = ["&" if item["operator"] == "and" else "|" for item in expressions]
qs = [Q(**{item["field"]: item["value"]}) for item in expressions]
q_string = reduce(
    lambda acc, index: acc + " {op} qs[{index}]".format(op=ops[index], index=index),
    range(len(expressions)),
    "Q()"
)  # equals "Q() | qs[0] | qs[1] & qs[2] | qs[3]"
q_expression = eval(q_string)
Python will parse this expression according to its own operator precedence, and the resulting SQL clause will match your expectations:
f1 = 1 or (f2 = 2 and f3 = 3) or f4 = 4
Of course, using eval() with user-supplied strings would be a major security risk, so here I'm constructing the Q objects separately (in the same way you did) and just referring to them in the eval string. So I don't think there are any additional security implications of using eval() here.
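The precedence claim can be checked without Django, because Python's & and | behave the same way on plain sets. In this sketch each set holds made-up row ids matched by one condition; the ids are illustrative only.

```python
# Hypothetical row ids matched by each condition
f1 = {1}
f2 = {2, 3}
f3 = {3, 4}
f4 = {5}

implicit = f1 | f2 & f3 | f4           # Python applies & before |
explicit = f1 | (f2 & f3) | f4         # grouping under SQL's AND-before-OR rule
left_to_right = ((f1 | f2) & f3) | f4  # grouping from the repeated q.add() calls

print(sorted(implicit))
print(implicit == explicit)
print(implicit == left_to_right)
```

The unparenthesized expression matches the SQL-style grouping and differs from the left-to-right grouping that the question observed.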

for loop or use apply functions over a RODBC sql query [duplicate]

Is there any way to pass a variable defined within R to the sqlQuery function within the RODBC package?
Specifically, I need to pass such a variable to either a scalar/table-valued function, a stored procedure, and/or perhaps the WHERE clause of a SELECT statement.
For example, let:
x <- 1 ## user-defined
Then,
example <- sqlQuery(myDB,"SELECT * FROM dbo.my_table_fn (x)")
Or...
example2 <- sqlQuery(myDB,"SELECT * FROM dbo.some_random_table AS foo WHERE foo.ID = x")
Or...
example3 <- sqlQuery(myDB,"EXEC dbo.my_stored_proc (x)")
Obviously, none of these work, but I'm thinking that there's something that enables this sort of functionality.
Build the string you intend to pass. So instead of
example <- sqlQuery(myDB,"SELECT * FROM dbo.my_table_fn (x)")
do
example <- sqlQuery(myDB, paste("SELECT * FROM dbo.my_table_fn (",
x, ")", sep=""))
which will fill in the value of x.
If you use sprintf, you can very easily build the query string using variable substitution. For extra ease-of-use, if you pre-parse that query string (I'm using stringr) you can write it over multiple lines in your code.
e.g.
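library(stringr)
q1 <- sprintf("
  SELECT basketid, count(%s)
  FROM %s
  GROUP BY basketid
  ",
  "item_barcode",  # column name, passed as a string
  "dbo.sales"      # table name, passed as a string
)
# Collapse newlines and repeated whitespace into single spaces
q1 <- str_trim(str_replace_all(q1, "\\s+", " "))
df <- sqlQuery(shopping_database, q1)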
Side-note and hat-tip to another R chap
Recently I found I wanted to make the variable substitution even simpler by using something like Python's string.format() function, which lets you reuse and reorder variables within the string
e.g.
$: w = "He{0}{0}{1} W{1}r{0}d".format("l","o")
$: print(w)
"Hello World"
However, this function doesn't appear to exist in R, so I asked around on Twitter, and a very helpful chap, @kevin_ushey, replied with his own custom function to be used in R. Check it out!
With more variables do this:
aaa <- "
SELECT ColOne, ColTwo
FROM TheTable
WHERE HpId = AAAA and
VariableId = BBBB and
convert (date,date ) < 'CCCC'
"
# Substitute each placeholder with its value
aaa <- gsub("AAAA", toString(111), aaa)
aaa <- gsub("BBBB", toString(2222), aaa)
# The date must be a string; an unquoted 2016-01-01 is evaluated as arithmetic
aaa <- gsub("CCCC", "2016-01-01", aaa)
Try this, using the fn$ prefix from the gsubfn package, which interpolates $x into the string:
library(gsubfn)
x <- 1
example2 <- fn$sqlQuery(myDB,"SELECT * FROM dbo.some_random_table AS foo WHERE foo.ID = '$x'")

Providing lookup list from R vector as SQL table for RODBC lookup

I have a list of IDs in an R vector.
IDlist <- c(23, 232, 434, 35445)
I would like to write an RODBC sqlQuery with a clause stating something like
WHERE idname IN IDlist
Do I have to read the whole table and then merge it to the idList vector within R? Or how can I provide these values to the RODBC statement, so recover only the records I'm interested in?
Note: As the list is quite long, pasting individual values into the SQL statement, as in the answer below, won't do it.
You could always construct the statement using paste
IDlist <- c(23, 232, 434, 35445)
paste("WHERE idname IN (", paste(IDlist, collapse = ", "), ")")
#[1] "WHERE idname IN ( 23, 232, 434, 35445 )"
Clearly you would need to add more to this to construct your exact statement
I put together a solution to a similar problem by combining the tips here and here and running in batches. Approximate code follows (retyped from an isolated machine):
# assuming you have a list of IDs you want to match in vIDs and an RODBC connection in mycon
# queries that don't change
q_create_tmp <- "create table #tmptbl (ID int)"
q_get_records <- "select * from mastertbl as X join #tmptbl as Y on (X.ID = Y.ID)"
q_del_tmp <- "drop table #tmptbl"
# initialize counters and storage
start_row <- 1
batch_size <- 1000
allresults <- data.frame()
while (start_row <= length(vIDs)) {
  end_row <- min(length(vIDs), start_row + batch_size - 1)
  # build one multi-row insert for the current batch of IDs
  q_fill_tmp <- sprintf("insert into #tmptbl (ID) values %s",
                        paste(sprintf("(%d)", vIDs[start_row:end_row]), collapse = ","))
  q_all <- list(q_create_tmp, q_fill_tmp, q_get_records, q_del_tmp)
  sqlOutput <- lapply(q_all, function(x) sqlQuery(mycon, x))
  # the third query returns the matched records
  allresults <- rbind(allresults, sqlOutput[[3]])
  start_row <- end_row + 1
}
