How to let user define SQL parameters (in terminal) in Rmarkdown? - sql

I am working to make R more user friendly. I use the RODBC package (library(RODBC)) to access the SQL database directly from RStudio. In R Markdown I use code like this:
sql <- "
Select [A]
,[year]
,[model]
....
from
(SELECT ....
Where [year] between 2010 and 2020 and model ='1A' and [A] between 11 and 22... "
I need to let the user specify the parameters in the last line (for instance 2010...2020, '1A' and 11...22) in the terminal using the readline() function.

It is not the best option, but you can simply ask for the inputs first:
range_start <- readline("Range start: ")
range_end <- readline("Range end: ")
literal_and <- "and"
sql_stmt <- "select year, model from where year between"
and then concatenate them into one query:
final_sql <- paste(sql_stmt, range_start, literal_and, range_end)
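Because readline() returns whatever the user types as text, it may help to validate the values before they reach the query. The following is a minimal sketch, not a definitive implementation: build_range_query and the table name my_table are hypothetical names introduced for illustration, and the column names are taken from the question. Note also that readline() only prompts in an interactive session; in a non-interactive run (such as knitting) it returns an empty string.

```r
# Hypothetical helper: validates terminal input before it reaches the SQL string.
# "my_table" is a placeholder; the real table name is not given in the question.
build_range_query <- function(range_start, range_end, model) {
  # readline() always returns character, so convert and check explicitly.
  start_year <- suppressWarnings(as.integer(range_start))
  end_year <- suppressWarnings(as.integer(range_end))
  if (is.na(start_year) || is.na(end_year)) {
    stop("Range start and range end must be numbers.")
  }
  # Double any single quote so the model value stays inside its SQL literal.
  model <- gsub("'", "''", model, fixed = TRUE)
  sprintf(
    "select [year], [model] from my_table where [year] between %d and %d and [model] = '%s'",
    start_year, end_year, model
  )
}

# Interactive use:
# final_sql <- build_range_query(readline("Range start: "),
#                                readline("Range end: "),
#                                readline("Model: "))
final_sql <- build_range_query("2010", "2020", "1A")
final_sql
```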

Related

For Loop to iterate SQL query in R

I would like to iterate this SQL query over the 17 rows in my df. My df and code are below. I think I may need single quotes around dat$ptt_id, because I get a syntax error at the "IN" function. Any ideas how to correctly write this?
df looks like:
ptt_id
1 181787
2 181788
3 184073
4 184098
5 197601
6 197602
7 197603
8 197604
9 197605
10 197606
11 197607
12 197608
13 197609
14 200853
15 200854
16 200851
17 200852
#Load data----
dat <- read.csv("ptts.csv")
dat2<-list(dat)
#Send to database----
for(i in 1:nrow(dat)){
q <- paste("SELECT orgnl_pit, t_name, cap_date, species, sex, mass, cap_lat, cap_lon, sat_appld_id
FROM main.capev JOIN uid.turtles USING (orgnl_pit)
WHERE sat_appld_id IN", dat$ptt_id[i],";")
#Get query----
tags <- dbGetQuery(conn, q)
}
Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: syntax error at or near "181787"
LINE 3: WHERE sat_appld_id IN 181787 ;
^
Thanks for any assistance.
Two options:
Parameter binding.
qmarks <- paste0("(", paste(rep("?", nrow(df)), collapse = ","), ")")
qmarks
# [1] "(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"
qry <- paste(
"SELECT orgnl_pit, t_name, cap_date, species, sex, mass, cap_lat, cap_lon, sat_appld_id
FROM main.capev JOIN uid.turtles USING (orgnl_pit)
WHERE sat_appld_id IN", qmarks)
tags <- dbGetQuery(conn, qry, params = df[,1])
Temporary table. This might be more useful when you have a large number of ids to use. (This also works without a temp table if you get the ids from the database anyway, and can use that query in this sub-query.)
dbWriteTable(conn, "mytemp", df)
qry <- "SELECT orgnl_pit, t_name, cap_date, species, sex, mass, cap_lat, cap_lon, sat_appld_id
FROM main.capev JOIN uid.turtles USING (orgnl_pit)
WHERE sat_appld_id IN (select ptt_id from mytemp)"
tags <- dbGetQuery(conn, qry)
dbExecute(conn, "drop table mytemp")
(Name your temp table carefully. DBMSes usually have a nomenclature to ensure the table is automatically cleaned/dropped when you disconnect, often something like "#mytemp". Check with your DBMS or DBA.)
The IN operator requires a list. You can think of it as multiple OR conditions.
E.g. instead of WHERE sat_appld_id IN 181787 it should be WHERE sat_appld_id IN (181787)
And to that point, instead of a loop you could create a list from your dat$ptt_id column for just one SQL query, such as WHERE sat_appld_id IN (181787, 181788, 184073, ...), and do any additional wrangling within your R code instead of making multiple database queries.
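The single-query idea can be sketched as follows. This is only an illustration: ptt_id stands in for dat$ptt_id and holds just the first three ids from the question, and the final dbGetQuery call is left as a comment because it needs the open connection conn.

```r
# Stand-in for dat$ptt_id; the first few ids from the question.
ptt_id <- c(181787L, 181788L, 184073L)

# Refuse anything that is not a plain number before pasting it into SQL.
stopifnot(is.numeric(ptt_id), !anyNA(ptt_id))

# Collapse the ids into one parenthesised list for IN (...).
in_list <- paste0("(", paste(ptt_id, collapse = ", "), ")")
q <- paste(
  "SELECT orgnl_pit, t_name, cap_date, species, sex, mass, cap_lat, cap_lon, sat_appld_id",
  "FROM main.capev JOIN uid.turtles USING (orgnl_pit)",
  "WHERE sat_appld_id IN", in_list
)
q
# tags <- dbGetQuery(conn, q)  # conn is the open connection from the question
```

One query returns the rows for all ids at once, so nothing is overwritten on each pass of a loop.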

Putting output from sql query into another query using R environment

I am wondering which approach to choose to do what the title describes. I am using an ODBC connection, and the first SQL query returns about 40-50 rows in one column. I want to use this output as the values to search for in a second query.
How should I treat this? As an array or as separate variables? I still do not know R well, so I just need to know where to look.
Regards
------more explanation below----
I have list of 40-50 numbers of 10 digits each, organized in a column.
I am trying to do this:
list <- c(my_input)
sql_in <- paste0(list, collapse="")
and characters are organized like this after this operations:
'c(1234567890, , 1234567890, 1234567890)'
and almost everything looks fine and fits into my query, apart from the additional c character at the beginning and the missing apostrophes. I tried the gsub function, but it did not work the way I wanted.
You can likely do this in one SQL call using a subquery. Notice in the call below that the result of
SELECT n_gear
FROM Gear
WHERE n_gear IN (3,4)
is passed to the WHERE clause of the primary query. This is perfectly valid and will allow your query to execute entirely in SQL without having to do any intermediate steps in R.
(I use sqldf for simplicity of illustration, but this should work through just about any ODBC connection)
library(sqldf)
Gear <- data.frame(n_gear = 1:5)
sqldf(
"SELECT mpg, qsec, gear, wt
FROM mtcars
WHERE gear IN (SELECT n_gear
FROM Gear
WHERE n_gear IN (3,4))"
)
Try something like this:
list<-c("try","this") #The output from your first query
sql_in<-paste0(list, collapse="','")
The Output
paste("select * from table where table.var in ",paste("('",sql_in,"')",sep=''))
[1] "select * from table where table.var in ('try','this')"
If you have a space as the first or last character of a string, you can use this code:
list<-c(" first element is a space","try","this","last element is a space ") #The output from your first query
Find a space at the first or last character
first_space<-substr(list, start = 1, stop = 1)==" "
last_space<-substr(list, start = nchar(list), stop = nchar(list))==" "
Remove spaces
list[first_space]<-substr(list[first_space], start = 2, stop = nchar(list[first_space]))
list[last_space]<-substr(list[last_space], start = 1, stop = nchar(list[last_space])-1)
sql_in<-paste0(list, collapse="','")
Your output
paste0("select * from table where table.var in ",paste("('",sql_in,"')",sep=''))
"select * from table where table.var in ('first element is a space','try','this','last element is a space')"
I think you are expecting something like the code below:
data <- dbGetQuery(con, "select column from yourfirsttable")
list <- paste(data$column, collapse="','")
result <- dbGetQuery(con, statement = sprintf("select * from yourresulttable where inv in ('%s')",list))
It's not entirely clear exactly what you're wanting to achieve here. For example, one use case just means you can do it all with a join. But I have cases where I don't know the values for the test without doing some computation. Then I do a separate query having created a query string thus:
> id <- 1:5
> paste0("SELECT * FROM table WHERE ID IN (", paste0(id, collapse = ","), ")")
[1] "SELECT * FROM table WHERE ID IN (1,2,3,4,5)"
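The approaches above differ mainly in whether the values need quotes. A small helper can cover both cases; this is a sketch under stated assumptions, and sql_in_list is a hypothetical name rather than a function from any package. It writes numbers out in full (so 10-digit ids do not turn into scientific notation), trims stray spaces from character values, and doubles embedded single quotes.

```r
# Hypothetical helper: turns a vector into the parenthesised body of an SQL IN list.
sql_in_list <- function(values) {
  if (length(values) == 0) stop("IN () needs at least one value.")
  if (is.numeric(values)) {
    # scientific = FALSE keeps long ids such as 1234567890 written out in full.
    items <- format(values, scientific = FALSE, trim = TRUE)
  } else {
    # Trim leading/trailing spaces, then double embedded single quotes.
    items <- trimws(as.character(values))
    items <- paste0("'", gsub("'", "''", items, fixed = TRUE), "'")
  }
  paste0("(", paste(items, collapse = ", "), ")")
}

sql_in_list(c(" try", "this "))
sql_in_list(c(1234567890, 1234567891))
paste("select * from table where table.var in", sql_in_list(c("try", "this")))
```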

Shiny Reactive SQL: WHERE clause

I am new to SQL and I am trying to figure out how I can "select all" in a WHERE clause.
Let me make it a bit clearer why I would like to achieve that:
I have a reactive SQL query in my Shiny app. The user can filter on at least three different variables in my SQL:
In this case the user can choose Kunde (customer), Abmessung (diameter) and date. I have a hard time handling the cases where the user wants to filter on only one or two variables rather than all three (the number of possible combinations is huge, and writing a separate SQL query in an if statement for each is quite a task). For example, the user might want Kunde (customer) filtered but all Abmessung (diameters) kept.
Here is a sample SQL Query which i have used:
select * from x.xy where kdname IN (",paste0("'",paste(input$kunde,
collapse="', '"),"'"),") and abmessung IN (",paste0("'",paste(input$abm,
collapse="', '"),"'"),") and dati_create between
to_date('",format(input$date[1], '%d.%m.%Y'),"','dd.mm.yyyy') and
to_date('",format(input$date[2], '%d.%m.%Y'),"','dd.mm.yyyy') + (86399/86400)
Is there a possibility in SQL to use some kind of "*" in WHERE clause?
Why not simply build each condition with an if and then paste them together with collapse?
(I think there isn't a way to do what you want without "if/else", "case", "nvl" or "decode", so you need to code each condition explicitly.)
like:
input=list("kunde"="a","abm"="b",date=c("2016-01-01","2016-02-01"))
#input=list("kunde"="","abm"="",date=c("","")) # for test
sql_main="select * from x.xy "
sql_cond=list()
sql_cond[1]= if(input$kunde==""){NULL}else{paste0("kdname IN ('",paste(input$kunde,collapse="','"),"')")}
sql_cond[2]= if(input$abm==""){NULL}else{paste0("abmessung IN ('",paste(input$abm,collapse="', '"),"')")}
# format() applies the date format; as.character() on a Date may ignore it in recent R versions
sql_cond[3]= if(input$date[1]==""|input$date[2]==""){NULL}else{paste0("dati_create between to_date('",format(as.Date(input$date[1]),'%d.%m.%Y'),"','dd.mm.yyyy') and
to_date('",format(as.Date(input$date[2]),'%d.%m.%Y'),"','dd.mm.yyyy') + (86399/86400)")}
sql_cond=sql_cond[!sapply(sql_cond,is.null)]# needed to del NULL in list
sql_cond_all=paste(sql_cond,collapse =" and ")
sql=if(sql_cond_all!=""){paste(sql_main,"where",sql_cond_all)}else{sql_main}
sql
If you use RMySQL you can just paste together a character string as the query.
So you could do something like this:
condition <- character()
# IN needs a parenthesised list; the Shiny inputs in the question are input$kunde and input$abm
if(!all(input$kunde == "")) condition <- append(condition, paste0("kdname IN ('", paste(input$kunde, collapse = "', '"), "')"))
if(!all(input$abm == "")) condition <- append(condition, paste0("abmessung IN ('", paste(input$abm, collapse = "', '"), "')"))
query <- "SELECT * FROM x.xy"
# Only add WHERE when at least one filter is set
if(length(condition) > 0) query <- paste(query, "WHERE", paste(condition, collapse = " AND "))
I am not familiar with Shiny. However, I think you could achieve your goal by using Oracle's decode function.
The where clause will look something like this (pseudo-code)
where
decode(input_variable,null,1,input_variable)=decode(input_variable,null,1,table_column)
and dati_create between
decode(input_date1,null,'1/1/1900',input_date1)
and decode(input_date2,null,'1/1/3000',input_date2)
and table_column2 in (
SELECT decode(inputvarable2,null, (select table_column2 from dual)
,TRIM(REGEXP_SUBSTR(temp, '[^,]+', 1, level)) )
FROM (SELECT inputvarable2 temp FROM DUAL)
CONNECT BY level <= REGEXP_COUNT(temp, '[^,]+')
)
Here I assume that an input variable that is not given has a null value. Also, as no sample of the user input was provided, I assume that the user input for the in clause is a comma-separated string (e.g. 'a,b,c').
Thus, if input_variable is null, the where condition becomes 1=1 (always true); otherwise it is input_variable=table_column. Dates are a little trickier, so I substitute a very early date (1/1/1900) or a very late one (1/1/3000). The logic behind the in clause is to convert the user input from a comma-separated string into a collection and then use the same trick with decode.
This might not be the most efficient way to do it, though.
Also, I see that you are concatenating user input directly into your SQL statement. This is highly risky, as your code would be prone to SQL injection attacks.
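One way to reduce the injection risk, sketched here in R, is to build the statement with placeholders and hand the values to the driver separately. This is an illustration under assumptions: build_filter_query is a hypothetical helper, the date filter is left out for brevity, and the "?" placeholder style depends on the driver (some drivers expect a different syntax such as :1 or $1, so check your driver's documentation).

```r
# Hypothetical helper: builds SQL text with "?" placeholders plus the values to bind.
build_filter_query <- function(kunde = character(), abm = character()) {
  conditions <- character()
  params <- list()
  add_in <- function(column, values) {
    values <- values[nzchar(values)]  # ignore empty selections
    if (length(values) == 0) return(invisible(NULL))
    marks <- paste(rep("?", length(values)), collapse = ", ")
    # <<- updates the variables in the enclosing function, not globals.
    conditions <<- c(conditions, sprintf("%s IN (%s)", column, marks))
    params <<- c(params, as.list(values))
  }
  add_in("kdname", kunde)
  add_in("abmessung", abm)
  sql <- "select * from x.xy"
  # A filter the user left empty adds no condition, so all its rows are kept.
  if (length(conditions) > 0) {
    sql <- paste(sql, "where", paste(conditions, collapse = " and "))
  }
  list(sql = sql, params = params)
}

q <- build_filter_query(kunde = c("a", "b"))
q$sql
# With a DBI connection (assumed to be called conn) this could be run as:
# DBI::dbGetQuery(conn, q$sql, params = q$params)
```

Because the user's text never becomes part of the SQL string, a value containing a quote cannot change the structure of the statement.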
I find the sub function very useful when constructing queries. Try this (it assumes a single value for each filter and dates that print as yyyy-mm-dd):
myquery <- "select * from x.xy where kdname IN ('NAME') and abmessung IN ('ABM') and dati_create between to_date('DATE1','yyyy.mm.dd') and to_date('DATE2','yyyy.mm.dd') + (86399/86400)"
# gsub replaces both dashes at once, e.g. 2016-01-01 becomes 2016.01.01
date1 <- gsub("-",".",as.character(input$date[1]),fixed=TRUE)
date2 <- gsub("-",".",as.character(input$date[2]),fixed=TRUE)
myquery <- sub("DATE1",date1,myquery)
myquery <- sub("DATE2",date2,myquery)
myquery <- sub("NAME",input$kunde,myquery)
myquery <- sub("ABM",input$abm,myquery)
myquery <- noquote(myquery)
myquery

Passing characters as argument into db function in R

I am wondering how to pass two character arguments (x, y) to the user-defined R function below, and hope someone can assist:
sql.r<-function(x,y){
# Load RODBC package
library(RODBC)
# Create a connection to the database called "con"
con <- odbcConnect("odbccalc", uid=xxx, pwd=xxx, believeNRows=FALSE)
# Check that connection is working (Optional)
odbcGetInfo(con)
# Find out what tables are available (Optional)
Tables <- sqlTables(con, schema="tblData")
# Query the database and put the results into the data frame "dataframe"
dataframe <- sqlQuery(con, "
SELECT lbl,Date, dot
FROM
tblData t
WHERE t.lbl="'',x,"''
AND t.Date <"'',y,"''
ORDER BY t.Date desc")
The syntax problem might lie in the management of quotes.
Working syntax, in case it helps:
sqlQuery(con, "
SELECT lbl,Date, dot
FROM
tblData t
WHERE t.lbl='fruit'
AND t.Date < '2015-06-01'
ORDER BY t.Date desc")
Best,
As others say, you can use paste or paste0 to construct the query. However, the sprintf function can also do the trick. I think this is slightly easier to read, as you avoid mixed single and double quotation marks.
I.e. do something like the following in your function:
query <- sprintf("SELECT lbl, Date, dot
FROM
tblData t
WHERE t.lbl= '%s'
AND t.Date < '%s'
ORDER BY t.Date desc", x, y)
sqlQuery(con, query)
You have to build the query through paste or paste0. Try this:
dataframe <- sqlQuery(con, paste0("
SELECT lbl,Date, dot
FROM
tblData t
WHERE t.lbl='",x,"'
AND t.Date <'",y,"'
ORDER BY t.Date desc"))
The point is that sqlQuery takes two args: a connection and a string. The string is the sql command you want to execute. If the command depends on some inputs, you have to build the string accordingly. paste and sprintf let you do this. You are putting the values of the x and y variables in the string representing the command. This isn't sql specific, but just standard string manipulation.
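Since this is plain string manipulation, the query text can be built and inspected without a database connection. Below is a minimal sketch; build_lbl_query is a hypothetical name, and the checks (a parseable date for y, doubled single quotes in x) are additions that are not part of the answers above.

```r
# Hypothetical helper that only builds the query text; running it still needs
# the RODBC connection "con" from the question.
build_lbl_query <- function(x, y) {
  # An explicit format makes as.Date() return NA for unparseable input.
  cutoff <- as.Date(y, format = "%Y-%m-%d")
  if (length(cutoff) != 1 || is.na(cutoff)) {
    stop("y must be a single date like 2015-06-01.")
  }
  # Double single quotes so the label stays inside its SQL literal.
  label <- gsub("'", "''", x, fixed = TRUE)
  sprintf(
    "SELECT lbl, Date, dot FROM tblData t WHERE t.lbl = '%s' AND t.Date < '%s' ORDER BY t.Date desc",
    label, format(cutoff, "%Y-%m-%d")
  )
}

query <- build_lbl_query("fruit", "2015-06-01")
query
# dataframe <- sqlQuery(con, query)
```

Printing the string before sending it makes quoting mistakes like the one in the question easy to spot.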

RODBC Multiple Inputs from Shiny

I have a Shiny app that has a checkbox group input. The user can select multiple inputs. I also have an ODBC connection linked to a database. The process would be that when a user selects items from the check box group, that user input would be part of a string in the sql query to filter the data.
UI.R (partial to show example)
checkboxGroupInput('Type', 'Type', c(
"AX"="AX",
"AY"="AY",
"AZ"="AZ",
"BGB"="BGB",
"BT"="BT",
"BX"="BX",
"BXT"="BXT",
"C"="C",
"CNT"="CNT")),
The column in the table where the "Type" information is in is called COMPONENT, so my sql query using RODBC is
data <- odbcConnect("database", uid="username", pwd="password")
query <- (SELECT ID, NAME, TYPE FROM COMPONENT WHERE TYPE LIKE Input$Type)
df <- odbcQuery(data, query)
The query line does not work, and I have no idea how to take multiple inputs and place them properly in the query. There is also an added level of complexity that I am not sure how to handle: the data in the database is alphanumeric, so instead of AX it might be listed as AX14 or AX 71. Also, because there are some one-letter types, using a wildcard seems a little difficult.
To answer your initial question regarding "multiple inputs in the query", I use concatenation to achieve this.
Using paste0(), I write something as follows:
type = "AX14"
myQuery <- paste0("Select variable1, variable2 from my_table where type like ",type)
myQuery
[1] "Select variable1, variable2 from my_table where type like AX14"
You can add little things like single quotes or wildcard operators as follows:
myQuery <- paste0("Select variable1, variable2 from my_table where type like '%",type,"%'")
myQuery
[1] "Select variable1, variable2 from my_table where type like '%AX14%'"
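Then proceed with actually running the query (sqlQuery returns the result as a data frame; odbcQuery is RODBC's low-level call and only returns a status code):
df <- sqlQuery(data, myQuery)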
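For several checked boxes at once, the same concatenation can produce one LIKE condition per selected type, joined with OR. This is a sketch that assumes the table and column names from the question's query; selected_types stands in for input$Type. Be aware that a prefix pattern such as 'C%' also matches values starting with CNT, so one-letter types may need a stricter pattern.

```r
# Stand-in for input$Type, the character vector Shiny returns for the checked boxes.
selected_types <- c("AX", "C")

# One prefix pattern per selected type, joined with OR.
like_conditions <- paste0("TYPE LIKE '", selected_types, "%'", collapse = " OR ")
myQuery <- paste0("SELECT ID, NAME, TYPE FROM COMPONENT WHERE ", like_conditions)
myQuery
# df <- sqlQuery(data, myQuery)  # data is the RODBC connection from the question
```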