I am using RODBC to connect to a database. I would love for a user to be able to define wildcard strings to lookup in the SQL as part of a function. I cannot use CONTAINS as the database is not full-text indexed.
The SQL I want to create is
"SELECT *
FROM mydataTable
WHERE (ItemNM LIKE '%CT%' OR ItemNM LIKE '%MRI%' OR ItemNM LIKE '%US%')"
The user should be able to define as many wildcards as they like, all from the ItemNM field and all separated by OR.
myLookup<-function(userdefined){
paste0("SELECT *
FROM mydataTable
WHERE ( LIKE '",userdefined,"')")
}
If I vectorise the userdefined (ie userdefined<-c("US","MRI")) then I end up with separate SQL strings which is no good. How can I get the output as above but for any length of user defined string where they are just defining the wildcard?
You could use:
myLookup <- function(userdefined) {
  paste0("SELECT * FROM mydataTable WHERE (",
         paste0("ItemNM LIKE '%", userdefined, "%'", collapse = " OR "), ")")
}
userdefined<-c("US","MRI")
myLookup(userdefined)
#[1] "SELECT * FROM mydataTable WHERE (ItemNM LIKE '%US%' OR ItemNM LIKE '%MRI%')"
We can use glue
library(glue)
mylookup <- function(userdefined){
as.character(glue('SELECT * FROM mydataTable WHERE (',
glue_collapse(glue("ItemNM LIKE '%{userdefined}%'"), sep=" OR "), ')'))
}
mylookup(userdefined)
#[1] "SELECT * FROM mydataTable WHERE (ItemNM LIKE '%US%' OR ItemNM LIKE '%MRI%')"
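Both answers paste the user's text straight into the SQL string, so a keyword containing a single quote would break the statement. Below is a minimal base R sketch, assuming the standard SQL convention of doubling single quotes inside a string literal. The names escape_sql_literal and build_like_query are hypothetical helpers for illustration, not part of RODBC or glue.

```r
# Hypothetical helper: doubles single quotes so a keyword cannot end the string literal early
escape_sql_literal <- function(x) gsub("'", "''", x, fixed = TRUE)

# Hypothetical helper: builds one LIKE term per keyword and joins the terms with OR
build_like_query <- function(keywords, column = "ItemNM", table = "mydataTable") {
  stopifnot(is.character(keywords), length(keywords) > 0)
  terms <- paste0(column, " LIKE '%", escape_sql_literal(keywords), "%'")
  paste0("SELECT * FROM ", table, " WHERE (", paste(terms, collapse = " OR "), ")")
}

build_like_query(c("US", "MRI"))
build_like_query("O'Neil")
```

This only handles quotes. Any % or _ typed by the user is passed through unchanged and still acts as a LIKE wildcard.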
I am trying to pull data based on multiple keywords from the same column.
Currently I have a SQL statement that works like this.
SELECT *
FROM Customers
WHERE CustomerName LIKE 'a%'
OR CustomerName LIKE '_r%'
OR CustomerName LIKE 'si%';
That works fine. What I am trying to achieve is to pass the keywords c("a", "_r", "si") as a vector. Like this:
keywords <- c("a", "_r", "si")
SELECT *
FROM Customers
WHERE CustomerName LIKE '%' + keywords + '%';
That did not work. How do I submit a variable with a bunch of keywords into the like statement?
Use sprintf and paste with collapse=. Within a sprintf format, %s is replaced with the next argument and %% produces a literal %.
keywords <- c("a", "_r", "si")
sql <- keywords |>
sprintf(fmt = "CustomerName LIKE '%%%s%%'") |>
paste(collapse = " OR \n") |>
sprintf(fmt = "SELECT *
FROM Customers
WHERE %s")
cat(sql, "\n")
giving:
SELECT *
FROM Customers
WHERE CustomerName LIKE '%a%' OR
CustomerName LIKE '%_r%' OR
CustomerName LIKE '%si%'
Another option is to use string_split() with a JOIN (SQL Server 2016 or later).
Example
DECLARE @Find VARCHAR(max) = 'a%,_r%,si%'
Select Distinct A.*
From Customers A
Join string_split(@Find,',') B
on A.CustomerName like B.value
I want to get all rows in my database where a condition with regular expressions is met. The variable should start with "J12", "J13", "J14" or "J15".
This was my attempt:
Data <- dbGetQuery(db,
"SELECT * FROM 'XXX.XXXX.XXX'
WHERE TYPE = 'xyz' AND [xyz_DIAG] LIKE '^J1[2-5]' ")
Then a data.frame with 0 rows is returned.
When I send the query
Data <- dbGetQuery(db,
"SELECT * FROM 'XXX.XXXX.XXX'
WHERE TYPE = 'xyz'")
I get a quite large data.frame and then I call
Data %>% setDT %>% .[str_detect(xyz_DIAG, "^J1[2-5]")] and I get the expected result because in fact there are many rows that fulfill that regexp. Have I done something wrong?
For the time being, the REGEXP operator has not been added to RSQLite; see this pull request.
You thus need to "unwrap" the regex and use ORed LIKE:
Data <- dbGetQuery(db,
"SELECT * FROM 'XXX.XXXX.XXX'
WHERE TYPE = 'xyz' AND ([xyz_DIAG] LIKE 'J12%' OR [xyz_DIAG] LIKE 'J13%' OR [xyz_DIAG] LIKE 'J14%' OR [xyz_DIAG] LIKE 'J15%') ")
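The ORed LIKE terms do not have to be typed by hand. Here is a minimal sketch in base R, assuming the prefixes are known in advance; the table and column names are the placeholders from the question, and the variable names are made up for illustration.

```r
# Prefixes matched by the regex ^J1[2-5]
prefixes <- paste0("J1", 2:5)

# %% in a sprintf format yields a literal %, so each prefix becomes e.g. 'J12%'
like_terms <- sprintf("[xyz_DIAG] LIKE '%s%%'", prefixes)

# 'XXX.XXXX.XXX' is the placeholder table name used in the question
query <- paste0(
  "SELECT * FROM 'XXX.XXXX.XXX' WHERE TYPE = 'xyz' AND (",
  paste(like_terms, collapse = " OR "),
  ")"
)
cat(query, "\n")
```

The resulting string can then be passed to dbGetQuery(db, query) as in the answer above.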
I am getting an array from the front end and need to filter on its values inside a SQL query.
I want to apply a LIKE filter for each element of the array. How can I use an array inside LIKE?
I am using Angular with Html as front end and Node as back end.
Array being passed in from the front end:
[ "Sports", "Life", "Relationship", ...]
SQL query is :
SELECT *
FROM Skills
WHERE Description LIKE ('%Sports%')
SELECT *
FROM Skills
WHERE Description LIKE ('%Life%')
SELECT *
FROM Skills
WHERE Description LIKE ('%Relationship%')
But I am getting an array from the front end - how to create a query for this?
In SQL Server 2017 you can use OPENJSON to consume the JSON string as-is:
SELECT *
FROM skills
WHERE EXISTS (
SELECT 1
FROM OPENJSON('["Sports", "Life", "Relationship"]', '$') AS j
WHERE skills.description LIKE '%' + j.value + '%'
)
As an example, for SQL Server 2016+ and STRING_SPLIT():
DECLARE @Str NVARCHAR(100) = N'mast;mode'
SELECT name FROM sys.databases sd
INNER JOIN STRING_SPLIT(@Str, N';') val ON sd.name LIKE N'%' + val.value + N'%'
-- returns:
name
------
master
model
It is worth mentioning that the input data must be strictly controlled, since this approach can lead to SQL injection attacks.
As an alternative, safer and simpler approach, the SQL can be generated on the application side like this:
Select * from Skills
WHERE (
Description Like '%Sports%'
OR Description Like '%Life%'
OR Description Like '%Relationship%'
)
A simple map()-call on the words array will allow you to generate the corresponding queries, which you can then execute (with or without joining them first into a single string).
Demo:
var words = ["Sports", "Life", "Relationship"];
var template = "Select * From Skills Where Description Like ('%{0}%')";
var queries = words.map(word => template.replace('{0}', word));
var combinedQuery = queries.join("\r\n");
console.log(queries);
console.log(combinedQuery);
I am wondering what approach I should choose to do what the title describes. I am using an ODBC connection, and my first SQL query returns about 40-50 rows in a single column. I want to use that output as the values to search for.
How should I treat this: as an array or as separate variables? I still do not know R well, so I just need to know where to look.
Regards
------more explanation below----
I have list of 40-50 numbers of 10 digits each, organized in a column.
I am trying to do this:
list <- c(my_input)
sql_in <- paste0(list, collapse="")
and the characters are organized like this after these operations:
'c(1234567890, , 1234567890, 1234567890)'
Almost everything looks fine and fits into my query, apart from the extra c character at the beginning and the missing apostrophes. I tried to use the gsub function, but it did not work the way I wanted.
You can probably do this in one SQL call using a subquery. Notice in the call below that the result of
SELECT n_gear
FROM Gear
WHERE n_gear IN (3,4)
is passed to the WHERE clause of the primary query. This is perfectly valid and will allow your query to execute entirely in SQL without any intermediate steps in R.
(I use sqldf for simplicity of illustration, but this should work through just about any ODBC connection)
library(sqldf)
Gear <- data.frame(n_gear = 1:5)
sqldf(
"SELECT mpg, qsec, gear, wt
FROM mtcars
WHERE gear IN (SELECT n_gear
FROM Gear
WHERE n_gear IN (3,4))"
)
Try something like this:
list<-c("try","this") #The output from your first query
sql_in<-paste0(list, collapse="','")
The output:
paste0("select * from table where table.var in ",paste0("('",sql_in,"')"))
[1] "select * from table where table.var in ('try','this')"
If an element has a space as its first or last character, you can use this code:
list<-c(" first element is a space","try","this","last element is a space ") #The output from your first query
Find the elements with a space as the first or last character
first_space<-substr(list, start = 1, stop = 1)==" "
last_space<-substr(list, start = nchar(list), stop = nchar(list))==" "
Remove the spaces
list[first_space]<-substr(list[first_space], start = 2, stop = nchar(list[first_space]))
list[last_space]<-substr(list[last_space], start = 1, stop = nchar(list[last_space])-1)
sql_in<-paste0(list, collapse="','")
Your output
paste0("select * from table where table.var in ",paste("('",sql_in,"')",sep=''))
[1] "select * from table where table.var in ('first element is a space','try','this','last element is a space')"
I think you are expecting something like the code below:
data <- dbGetQuery(con, "select column from yourfirsttable")
list <- paste(data$column, collapse="','")
result <- dbGetQuery(con, statement = sprintf("select * from yourresulttable where inv in ('%s')",list))
It's not entirely clear exactly what you're wanting to achieve here. For example, one use case just means you can do it all with a join. But I have cases where I don't know the values for the test without doing some computation. Then I do a separate query having created a query string thus:
> id <- 1:5
> paste0("SELECT * FROM table WHERE ID IN (", paste0(id, collapse = ","), ")")
[1] "SELECT * FROM table WHERE ID IN (1,2,3,4,5)"
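The same idea can be wrapped in a small function that also copes with long numeric IDs, missing values and stray whitespace. This is a minimal sketch in base R; build_in_list is a hypothetical helper name, and the sample IDs are made up for illustration.

```r
# Hypothetical helper: turns an R vector into a SQL IN (...) list
build_in_list <- function(values) {
  values <- values[!is.na(values)]
  stopifnot(length(values) > 0)
  if (is.numeric(values)) {
    # scientific = FALSE keeps long IDs from being written in exponent notation
    items <- format(values, scientific = FALSE, trim = TRUE)
  } else {
    # trim surrounding whitespace, double embedded quotes, then quote each value
    cleaned <- gsub("'", "''", trimws(as.character(values)), fixed = TRUE)
    items <- paste0("'", cleaned, "'")
  }
  paste0("(", paste(items, collapse = ", "), ")")
}

# Made-up 10-digit IDs, with one missing value standing in for an empty row
ids <- c(1234567890, 2345678901, NA, 3456789012)
query <- paste("SELECT * FROM table WHERE ID IN", build_in_list(ids))
query
```

Character input is quoted instead, so build_in_list(c(" try", "this ")) gives ('try', 'this').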
I am new to SQL and I am trying to figure out how I can "select all" in a WHERE clause.
Let me make it a bit clearer why I would like to achieve that:
I have reactive SQL in my Shiny app. The user can filter on at least three different variables in my SQL:
In this case the user can choose Kunde (Customer), Abmessung (Diameter) and Date. I have a hard time handling the case where the user wants to filter on only one or two variables rather than all three (the number of possible combinations is huge, and writing a SQL query in an if statement for each one is a lot of work). For example, the user might want to filter on Kunde (Customer) but keep all the Abmessung (Diameter) values.
Here is a sample SQL query which I have used:
select * from x.xy where kdname IN (",paste0("'",paste(input$kunde,
collapse="', '"),"'"),") and abmessung IN (",paste0("'",paste(input$abm,
collapse="', '"),"'"),") and dati_create between
to_date('",format(input$date[1], '%d.%m.%Y'),"','dd.mm.yyyy') and
to_date('",format(input$date[2], '%d.%m.%Y'),"','dd.mm.yyyy') + (86399/86400)
Is there a possibility in SQL to use some kind of "*" in WHERE clause?
Why not simply build each condition with if and then paste them together with collapse?
(I think there isn't a way to do what you want without "if/else", "case", "nvl" or "decode", so you need to hardcode it.)
Like this:
input <- list(kunde = "a", abm = "b", date = c("2016-01-01", "2016-02-01"))
# input <- list(kunde = "", abm = "", date = c("", ""))  # for testing with no filters set
sql_main <- "select * from x.xy"
sql_cond <- list()
# a condition is NULL (and dropped below) when its input is empty
sql_cond[1] <- if (all(input$kunde == "")) {NULL} else {paste0("kdname IN ('", paste(input$kunde, collapse = "','"), "')")}
sql_cond[2] <- if (all(input$abm == "")) {NULL} else {paste0("abmessung IN ('", paste(input$abm, collapse = "','"), "')")}
sql_cond[3] <- if (any(input$date == "")) {NULL} else {paste0("dati_create between to_date('", format(as.Date(input$date[1]), '%d.%m.%Y'), "','dd.mm.yyyy') and to_date('", format(as.Date(input$date[2]), '%d.%m.%Y'), "','dd.mm.yyyy') + (86399/86400)")}
sql_cond <- sql_cond[!vapply(sql_cond, is.null, logical(1))]  # drop the NULL entries
sql_cond_all <- paste(sql_cond, collapse = " and ")
sql <- if (sql_cond_all != "") {paste(sql_main, "where", sql_cond_all)} else {sql_main}
sql
If you use RMySQL you can just paste together a character string as the query.
So you could do something like this:
condition <- character()
if(any(input$kunde != "")) condition <- append(condition, paste0("kdname IN ('", paste(input$kunde, collapse = "', '"), "')"))
if(any(input$abm != "")) condition <- append(condition, paste0("abmessung IN ('", paste(input$abm, collapse = "', '"), "')"))
query <- "SELECT * FROM x.xy"
if(length(condition) > 0) query <- paste(query, "WHERE", paste(condition, collapse = " AND "))
I am not familiar with Shiny. However, I think you could achieve your goal by using Oracle's decode function.
The where clause will look something like this (pseudo-code):
where
decode(input_variable,null,1,input_variable)=decode(input_variable,null,1,table_column)
and dati_create between
decode(input_date1,null,'1/1/1900',input_date1)
and decode(input_date2,null,'1/1/3000',input_date2)
and table_column2 in (
SELECT decode(inputvarable2,null, (select table_column2 from dual)
,TRIM(REGEXP_SUBSTR(temp, '[^,]+', 1, level)) )
FROM (SELECT inputvarable2 temp FROM DUAL)
CONNECT BY level <= REGEXP_COUNT(temp, '[^,]+')
)
Here I assume that when an input variable is not given, it has a null value. Also, as no sample of the user input was provided, I assume that the user input for the IN clause is a comma-separated string (e.g. 'a,b,c').
Thus, if input_variable is null, the condition becomes 1=1 (always true); otherwise it is input_variable=table_column. Dates are a little trickier, so I substitute a very early date (1/1/1900) or a very late one (1/1/3000). The logic behind the IN clause is to convert the user input from a comma-separated string into a collection and then use the same trick with decode.
This might not be the most efficient way to do this, though.
I also see that you are concatenating user input directly into your SQL statement. This is highly risky, as your code would be prone to SQL injection attacks.
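The "always true" trick described above can also be applied on the R side instead of with Oracle's decode. Here is a minimal sketch in base R, assuming an unselected filter arrives as NULL, an empty vector or an empty string; optional_in is a hypothetical helper name, and the sample values are made up.

```r
# Hypothetical helper: returns an IN condition, or the always-true 1=1 when nothing was chosen
optional_in <- function(column, values) {
  values <- values[!is.na(values) & nzchar(values)]
  if (length(values) == 0) return("1=1")
  # double embedded single quotes, then wrap each value in quotes
  quoted <- paste0("'", gsub("'", "''", values, fixed = TRUE), "'")
  paste0(column, " IN (", paste(quoted, collapse = ", "), ")")
}

# Example input: two customers selected, no diameter selected
kunde <- c("a", "b")
abm <- character(0)

where <- paste(optional_in("kdname", kunde), optional_in("abmessung", abm), sep = " AND ")
query <- paste("select * from x.xy where", where)
query
```

Because every filter always contributes exactly one term, the WHERE clause keeps the same shape no matter how many filters the user actually sets. Doubling the quotes reduces the injection risk but is not a substitute for parameterized queries where the driver supports them.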
I find the sub function very useful when constructing queries. Try this:
myquery <- 'select * from x.xy where kdname IN "NAME" and abmessung IN "ABM" and dati_create between DATE1 and DATE2 + (86399/86400)'
date1 <- input$date[1];date1 <- sub("-",".",date1);date1 <- sub("-",".",date1)
date2 <- input$date[2];date2 <- sub("-",".",date2);date2 <- sub("-",".",date2)
myquery <- sub("DATE1",date1,myquery)
myquery <- sub("DATE2",date2,myquery)
myquery <- sub("NAME",input$kunde,myquery)
myquery <- sub("ABM",input$abm,myquery)
myquery <- noquote(myquery)
myquery