How to choose all or a subset of a vector in a SQL query when linking R to HANA

I am using R to connect to HANA so that I can run SQL queries from R to retrieve data. My data includes a lot of store names. Sometimes my query covers all the stores, which is easy: I simply leave the store filter out of the WHERE clause. Sometimes I focus on just a few stores, such as stores 1, 2 and 3. For that I can use the answer from "Providing lookup list from R vector as SQL table for RODBC lookup".
For example:
IDlist <- c(23, 232, 434, 35445)
paste("WHERE idname IN (", paste(IDlist, collapse = ", "), ")")
But how can I combine these two situations, i.e. all names or a subset of names, in a single WHERE clause? I am looking for something like:
IDlist <- all
IDlist <- c(23, 232, 434, 35445)
paste("WHERE idname IN (", paste(IDlist, collapse = ", "), ")")
So, when IDlist is all, the query covers all the store names. When IDlist holds specific numbers, the query focuses on just those stores.
This is just my idea; I am not sure whether there is a better way to do it. Either way, I want to handle all names and some names in one WHERE clause so that I do not need to change my code.
Here is my WHERE clause:
myOffice <- c(416,247,602,428)
WHERE "a"."/BIC/ZARTICLE"<>\'GIFTCARDPU\' AND "a"."/BIC/ZRETURN"=\'X\'
AND "a"."CALDAY" BETWEEN',StartDate,'AND',EndDate,'
AND "a"."/BIC/ZSALE_OFF" IN (',paste(myOffice, collapse = ", "),')

Maybe something like this:
office_clause <- ""
# identical() returns a single TRUE/FALSE; != would compare element by element
if (!identical(IDlist, all)) {
  office_clause <- paste(
    'AND "a"."/BIC/ZSALE_OFF" IN (',
    paste(IDlist, collapse = ', '),
    ')'
  )
}
Then you can construct your query by pasting office_clause at the end of the WHERE clause. If IDlist is all, you paste on an empty string; otherwise you paste on the ID clause. (Note that I assume all is a variable because that's how you used it in the question.)
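A minimal sketch of the same idea wrapped in a function, assuming that NULL (rather than a variable named all) stands for "all stores". The name office_clause and its arguments are hypothetical and only for illustration; the column name is taken from the question.

```r
# Hypothetical helper: ids = NULL (or an empty vector) means "all stores",
# so no extra condition is added to the WHERE clause.
office_clause <- function(ids = NULL, column = '"a"."/BIC/ZSALE_OFF"') {
  if (is.null(ids) || length(ids) == 0) {
    return("")
  }
  paste0(" AND ", column, " IN (", paste(ids, collapse = ", "), ")")
}

# All stores: nothing is appended
all_stores <- office_clause()

# A subset of stores: the IN condition is appended
some_stores <- office_clause(c(416, 247, 602, 428))
```

Because the function returns an empty string for the "all" case, the rest of the query text can stay unchanged and the result can always be pasted onto the end of the existing conditions.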

Related

How to pass data.frame into SQL "IN" condition using R?

I am reading a list of values from a CSV file in R and trying to pass the values into the IN condition of a SQL query (dbGetQuery). Can someone help me out with this?
library(rJava)
library(RJDBC)
library(dbplyr)
library(tibble)
library(DBI)
library(RODBC)
library(data.table)
jdbcDriver <- JDBC("oracle.jdbc.OracleDriver",classPath="C://Users/********/Oracle_JDBC/ojdbc6.jar")
jdbcConnection <- dbConnect(jdbcDriver, "jdbc:oracle:thin:Rahul#//Host/DB", "User_name", "Password")
## Setting working directory for the data
setwd("C:\\Users\\**********\\Desktop")
## reading csv file into data frame
pii<-read.csv("sample.csv")
pii
PII_ID
S0094-5765(17)31236-5
S0094-5765(17)31420-0
S0094-5765(17)31508-4
S0094-5765(17)31522-9
S0094-5765(17)30772-5
S0094-5765(17)30773-7
PII_ID1<-dbplyr::build_sql(pii$PII_ID)
PII_ID1
<SQL> ('S0094-5765(17)31236-5', 'S0094-5765(17)31420-0', 'S0094-5765(17)31508-4', 'S0094-5765(17)31522-9', 'S0094-5765(17)30772-5', 'S0094-5765(17)30773-7')
Data<-dbGetQuery(jdbcConnection, "SELECT ARTICLE_ID FROM JRBI_OWNER.JRBI_ARTICLE_DIM WHERE PII_ID in ?",(PII_ID1))
Expected:
ARTICLE_ID
12345
23456
12356
14567
13456
Actual result:
[1] ARTICLE_ID
<0 rows> (or 0-length row.names)
The SQL code you pass in the second argument to dbGetQuery is just a text string, hence you can construct this using paste or equivalents.
You are after something like the following:
in_clause <- paste0("('", paste0(pii$PII_ID, collapse = "', '"), "')")
sql_text <- paste0("SELECT ARTICLE_ID
FROM JRBI_OWNER.JRBI_ARTICLE_DIM
WHERE PII_ID IN ", in_clause)
data <- dbGetQuery(jdbcConnection, sql_text)
However, the exact form of the first paste0 depends on the format of PII_ID (I have assumed it is text) and how this format is represented in sql (I have assumed single quotes).
Be sure to check sql_text is valid SQL before passing it to dbGetQuery.
IMPORTANT: This approach is only suitable when pii contains a small number of values (I recommend fewer than 10). If pii contains a large number of values, your query will be very large and will run very slowly. If you have many values in pii, a better approach would be a join or semi-join, as per @nicola's comment.
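Since the form of the IN list depends on the type of the values, here is a rough sketch of a helper that handles both cases. The function name sql_in_clause is hypothetical, and the quoting rules (single quotes for text, doubled single quotes as the escape) are an assumption that matches common SQL dialects but should be checked against your database.

```r
# Hypothetical helper: builds "(v1, v2, ...)" for use after IN.
# Numbers are left unquoted; everything else is single-quoted.
sql_in_clause <- function(values) {
  values <- unique(values[!is.na(values)])
  if (length(values) == 0) {
    stop("an IN clause needs at least one value")
  }
  if (is.numeric(values)) {
    # format() with scientific = FALSE avoids output such as 1e+05
    items <- vapply(values, format, character(1), scientific = FALSE)
  } else {
    # as.character() also covers factors; embedded quotes are doubled
    escaped <- gsub("'", "''", as.character(values), fixed = TRUE)
    items <- paste0("'", escaped, "'")
  }
  paste0("(", paste(items, collapse = ", "), ")")
}

text_clause <- sql_in_clause(c("S0094-5765(17)31236-5", "S0094-5765(17)31420-0"))
number_clause <- sql_in_clause(c(100000, 23))
```

The resulting string can be pasted after "WHERE PII_ID IN " in the same way as in_clause above.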

Putting output from sql query into another query using R environment

I am wondering which approach I should use to do what the title describes. I am using an ODBC connection, and the first SQL query returns about 40-50 rows in one column. I want to use this output as the values to search for in a second query.
How should I treat this? As an array or as separate variables? I still do not know R well, so I just need to know where to look.
Regards
------more explanation below----
I have a list of 40-50 numbers of 10 digits each, organized in a column.
I am trying to do this:
list <- c(my_input)
sql_in <- paste0(list, collapse="")
and the characters are organized like this after these operations:
'c(1234567890, , 1234567890, 1234567890)'
Almost everything looks fine and fits into my query, apart from the additional c character at the beginning and the missing apostrophes. I tried to use the gsub function, but it did not work the way I wanted.
You can likely do this in one SQL call using a subquery. Notice in the call below that the result of
SELECT n_gear
FROM Gear
WHERE n_gear IN (3,4)
is passed to the WHERE clause of the primary query. This is perfectly valid and lets your query execute entirely in SQL, without any intermediate steps in R.
(I use sqldf for simplicity of illustration, but this should work through just about any ODBC connection.)
library(sqldf)
Gear <- data.frame(n_gear = 1:5)
sqldf(
"SELECT mpg, qsec, gear, wt
FROM mtcars
WHERE gear IN (SELECT n_gear
FROM Gear
WHERE n_gear IN (3,4))"
)
Try something like this:
list<-c("try","this") #The output from your first query
sql_in<-paste0(list, collapse="','")
The output:
paste0("select * from table where table.var in ",paste("('",sql_in,"')",sep=''))
[1] "select * from table where table.var in ('try','this')"
If you have a space as the first or last character of a string, you can use this code:
list<-c(" first element is a space","try","this","last element is a space ") #The output from your first query
Find a space in the first or last character:
first_space<-substr(list, start = 1, stop = 1)==" "
last_space<-substr(list, start = nchar(list), stop = nchar(list))==" "
Remove spaces
list[first_space]<-substr(list[first_space], start = 2, stop = nchar(list[first_space]))
list[last_space]<-substr(list[last_space], start = 1, stop = nchar(list[last_space])-1)
sql_in<-paste0(list, collapse="','")
Your output
paste0("select * from table where table.var in ",paste("('",sql_in,"')",sep=''))
"select * from table where table.var in ('first element is a space','try','this','last element is a space')"
I think you are expecting something like the code shown below:
data <- dbGetQuery(con, "select column from yourfirsttable")
list <- paste(data$column, collapse="','")
result <- dbGetQuery(con, statement = sprintf("select * from yourresulttable where inv in ('%s')",list))
It's not entirely clear what you want to achieve here. In some use cases you can do it all with a join. But I have cases where I don't know the values to test against without doing some computation first. Then I run a separate query, having built the query string like this:
> id <- 1:5
> paste0("SELECT * FROM table WHERE ID IN (", paste0(id, collapse = ","), ")")
[1] "SELECT * FROM table WHERE ID IN (1,2,3,4,5)"

Shiny Reactive SQL: WHERE clause

I am new to SQL and I am trying to figure out how I can "select all" in a WHERE clause.
Let me make it a bit clearer why I would like to achieve that:
I have reactive SQL in my Shiny app. The user can filter on at least three different variables in my SQL:
In this case the user can choose Kunde (customer), Abmessung (diameter) and a date. I have a hard time handling the cases where the user wants to filter on only one or two variables rather than all three (the number of possible combinations is huge, and writing a separate SQL query in an if statement for each one is a lot of work). For example, the user might want to filter on Kunde (customer) but keep all the Abmessung (diameter) values.
Here is a sample SQL query that I have used:
select * from x.xy where kdname IN (",paste0("'",paste(input$kunde,
collapse="', '"),"'"),") and abmessung IN (",paste0("'",paste(input$abm,
collapse="', '"),"'"),") and dati_create between
to_date('",format(input$date[1], '%d.%m.%Y'),"','dd.mm.yyyy') and
to_date('",format(input$date[2], '%d.%m.%Y'),"','dd.mm.yyyy') + (86399/86400)
Is there a way in SQL to use some kind of "*" in a WHERE clause?
Why not simply use an if for each condition and then paste the results together with collapse?
(I think there isn't a way to do what you want without "if/else", "case", "nvl" or "decode", so you need to hardcode it.)
Like this:
input=list("kunde"="a","abm"="b",date=c("2016-01-01","2016-02-01"))
#input=list("kunde"="","abm"="",date=c("","")) # for test
sql_main="select * from x.xy "
sql_cond=list()
sql_cond[1]= if(input$kunde==""){NULL}else{paste0("kdname IN ('",paste(input$kunde,collapse="','"),"')")}
sql_cond[2]= if(input$abm==""){NULL}else{paste0("abmessung IN ('",paste(input$abm,collapse="', '"),"')")}
sql_cond[3]= if(input$date[1]==""|input$date[2]==""){NULL}else{paste0("dati_create between to_date('",format(as.Date(input$date[1]),'%d.%m.%Y'),"','dd.mm.yyyy') and
to_date('",format(as.Date(input$date[2]),'%d.%m.%Y'),"','dd.mm.yyyy') + (86399/86400)")}
sql_cond=sql_cond[!sapply(sql_cond,is.null)]# needed to del NULL in list
sql_cond_all=paste(sql_cond,collapse =" and ")
sql=if(sql_cond_all!=""){paste(sql_main,"where",sql_cond_all)}else{sql_main}
sql
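The same approach can be sketched with small helper functions, so that each optional filter either contributes one condition or nothing. The names in_condition and build_query are hypothetical, and the sketch assumes that an unset Shiny input arrives as NULL or as an empty string; the date filter is left out to keep it short.

```r
# Hypothetical helper: returns "column IN ('a', 'b')" or NULL when
# nothing is selected, so unset filters simply drop out.
in_condition <- function(column, values) {
  values <- values[!is.na(values) & nzchar(values)]
  if (length(values) == 0) {
    return(NULL)
  }
  # Double any embedded single quotes before wrapping values in quotes
  escaped <- gsub("'", "''", values, fixed = TRUE)
  paste0(column, " IN ('", paste(escaped, collapse = "', '"), "')")
}

# Hypothetical query builder for the table used in the question
build_query <- function(kunde = NULL, abm = NULL) {
  # c() silently drops NULL entries, leaving only the active conditions
  conditions <- c(in_condition("kdname", kunde), in_condition("abmessung", abm))
  if (length(conditions) == 0) {
    return("select * from x.xy")
  }
  paste("select * from x.xy where", paste(conditions, collapse = " and "))
}

no_filter <- build_query()
one_filter <- build_query(kunde = c("a", "b"))
two_filters <- build_query(kunde = "a", abm = "b")
```

Adding another optional filter then only means adding one more in_condition call, rather than another branch for every combination of inputs.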
If you use RMySQL, you can just paste together a character string as the query.
So you could do something like this:
condition <- character()
# any() keeps each test to a single TRUE/FALSE when several values are selected
if (any(input$kunde != "")) condition <- append(condition, paste0("kdname IN ('", paste(input$kunde, collapse = "', '"), "')"))
if (any(input$abm != "")) condition <- append(condition, paste0("abmessung IN ('", paste(input$abm, collapse = "', '"), "')"))
query <- "SELECT * FROM x.xy"
# only add WHERE when at least one filter is set
if (length(condition) > 0) query <- paste(query, "WHERE", paste(condition, collapse = " AND "))
I am not familiar with Shiny. However, I think you could achieve your goal by using Oracle's decode function.
The WHERE clause will look something like this (pseudo-code):
where
decode(input_variable,null,1,input_variable)=decode(input_variable,null,1,table_column)
and dati_create between
decode(input_date1,null,'1/1/1900',input_date1)
and decode(input_date2,null,'1/1/3000',input_date2)
and table_column2 in (
SELECT decode(inputvarable2,null, (select table_column2 from dual)
,TRIM(REGEXP_SUBSTR(temp, '[^,]+', 1, level)) )
FROM (SELECT inputvarable2 temp FROM DUAL)
CONNECT BY level <= REGEXP_COUNT(temp, '[^,]+')
)
Here I assume that when an input variable is not given, it has a null value. Also, as no sample of the user input was provided, I assume that the user input for the IN clause is a comma-separated string (e.g. 'a,b,c').
Thus, if input_variable is null, the condition becomes 1=1 (always true); otherwise it is input_variable=table_column. Dates are a little trickier, so I substitute a very early date (1/1/1900) or a very distant one (1/1/3000). The logic behind the IN clause is to convert the user input from a comma-separated string into a collection and then use the same trick with decode.
This might not be the most efficient way to do this, though.
Also, I see that you are concatenating user input directly into your SQL statement. This is highly risky, as your code would be prone to SQL injection attacks.
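To illustrate the injection risk, here is a small base R sketch comparing naive concatenation with one common mitigation, doubling single quotes inside the value. The function names are hypothetical, and the escaping rule is an assumption that holds for standard SQL string literals; where the driver supports them, bound parameters or the quoting functions of the database interface (for example dbQuoteString in DBI) are generally the safer choice.

```r
# Naive version: the value is pasted into the SQL text as-is
unsafe_condition <- function(value) {
  paste0("kdname = '", value, "'")
}

# Escaped version: single quotes inside the value are doubled,
# so the value cannot close the string literal early
safe_condition <- function(value) {
  paste0("kdname = '", gsub("'", "''", value, fixed = TRUE), "'")
}

# Input crafted to turn the condition into one that is always true
malicious <- "x' OR '1'='1"

unsafe_sql <- unsafe_condition(malicious)
safe_sql <- safe_condition(malicious)
```

In unsafe_sql the input ends the string literal and adds its own OR condition; in safe_sql the whole input stays inside one quoted literal.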
I find the sub function very useful when constructing queries. Try this:
myquery <- 'select * from x.xy where kdname IN "NAME" and abmessung IN "ABM" and dati_create between DATE1 and DATE2 + (86399/86400)'
date1 <- input$date[1];date1 <- sub("-",".",date1);date1 <- sub("-",".",date1)
date2 <- input$date[2];date2 <- sub("-",".",date2);date2 <- sub("-",".",date2)
myquery <- sub("DATE1",date1,myquery)
myquery <- sub("DATE2",date2,myquery)
myquery <- sub("NAME",input$kunde,myquery)
myquery <- sub("ABM",input$abm,myquery)
myquery <- noquote(myquery)
myquery

Limit number of characters imported from SQL in R

I am using the sqlQuery function in R to query the DB from R. I am using the following lines:
for (i in 1:length(Counter)){
if (Counter[i] %in% str_sub(dir(),1,29) == FALSE){
DT <- data.table(sqlQuery(con, paste0("select a.* from edp_data.sme_loan a
where a.edcode IN (", print(paste0("\'",EDCode,"\'"), quote=FALSE),
") and a.poolcutoffdate in (",print(paste0("\'",str_sub(PoolCutoffDate,1,4),"-",str_sub(PoolCutoffDate,5,6),"-",
str_sub(PoolCutoffDate,7,8),"\'"), quote=FALSE),")")))}}
Thus I am importing subsets of the DB by EDCode and PoolCutoffDate. This works perfectly; however, there is one variable in edp_data.sme for one particular EDCode which produces an undesired result.
If I take the unique values of this "as.3" variable for a particular EDCode, I get:
unique(DT$as3)
[1] 30003000000000019876240886000 30003000000000028672000424000
In reality there should be more unique IDs in this DB. The problem is that the as3 string is much longer than what is imported.
nchar(unique(DT$as3))
[1] 29 29
How can I import more characters of this string? Ideally I do not want to list each variable explicitly instead of using select a.*; I only want to make sure that the full as3 string is imported.
Any help is appreciated!

update an SQL table via R sqlSave

I have a data frame in R having 3 columns, using sqlSave I can easily create a table in an SQL database:
channel <- odbcConnect("JWPMICOMP")
sqlSave(channel, dbdata, tablename = "ManagerNav", rownames = FALSE, append = TRUE, varTypes = c(DateNav = "datetime"))
odbcClose(channel)
This data frame contains information about managers (Name, Nav and Date), which is updated every day with new values for the current date; old values may also be updated in case of errors.
How can I accomplish this task in R?
I tried to use sqlUpdate, but it returns the following error:
> sqlUpdate(channel, dbdata, tablename = "ManagerNav")
Error in sqlUpdate(channel, dbdata, tablename = "ManagerNav") :
cannot update ‘ManagerNav’ without unique column
When you create a table "the white shark-way" (see documentation), it does not get a primary index, but is just plain columns, and often of the wrong type. Usually, I use your approach to get the columns names right, but after that you should go into your database and assign a primary index, correct column widths and types.
After that, sqlUpdate() might work; I say might, because I have given up using sqlUpdate(), there are too many caveats, and use sqlQuery(..., paste("Update....))) for the real work.
What I would do for this is the following
Solution 1
sqlUpdate(channel, dbdata,tablename="ManagerNav", index=c("ManagerNav"))
Solution 2
Lcolumns <- list(dbdata[0,])
sqlUpdate(channel, dbdata,tablename="ManagerNav", index=c(Lcolumns))
The index argument names the column(s) used to identify which rows to update.
Hope this helps!
If none of the other solutions work and your data is not that big, I'd suggest using sqlQuery() and loop through your dataframe.
# Builds the INSERT statement for row i of your_dataframe
one_row_of_your_df <- function(i) {
  sql_query <-
    paste0("INSERT INTO your_table_name (column_name1, column_name2, column_name3) VALUES",
           "(",
           "'",your_dataframe[i,1],"'",",",
           "'",your_dataframe[i,2],"'",",",
           "'",your_dataframe[i,3],"'",
           ")"
    )
  return(sql_query)
}
This function is Exasol-specific; the syntax is pretty similar to MySQL but not identical, so small changes could be necessary.
Then use a simple for loop like this one:
for(i in seq_len(nrow(your_dataframe)))
{
sqlQuery(your_connection, one_row_of_your_df(i))
}
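As a variation on the loop above, the statements for all rows can be built in one vectorized step and then sent one by one. This is only a sketch: build_insert_statements is a hypothetical name, every value is quoted as a string (an assumption that may not suit numeric or date columns in every database), and the example data frame is made up for illustration.

```r
# Hypothetical helper: returns one INSERT statement per row of df.
build_insert_statements <- function(df, table_name) {
  # Quote a column as SQL string literals, doubling embedded quotes
  quote_value <- function(x) {
    paste0("'", gsub("'", "''", as.character(x), fixed = TRUE), "'")
  }
  quoted <- unname(lapply(df, quote_value))
  # Paste the quoted columns together row by row: "'v1', 'v2', ..."
  value_rows <- do.call(paste, c(quoted, sep = ", "))
  sprintf(
    "INSERT INTO %s (%s) VALUES (%s)",
    table_name,
    paste(names(df), collapse = ", "),
    value_rows
  )
}

# Made-up example data, only to show the shape of the output
example_df <- data.frame(
  Name = c("A", "O'Brien"),
  Nav = c(1.5, 2),
  stringsAsFactors = FALSE
)
statements <- build_insert_statements(example_df, "ManagerNav")
```

Each element of statements could then be passed to sqlQuery() inside the same kind of for loop as above.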