How to efficiently paste many variables into a SQL query (R Shiny)

I'm building a shiny app where the user can update a table in a database by editing a selected row of a DT::datatable.
The problem is that this process becomes tedious when the table has many columns (say 25, for instance). So I was wondering: is there a nice, efficient way to link my "vals" variables in the query below to the dataframe columns?
The code below works, but since my table has more than 60 columns I really cannot stick to this solution... :(
selected_row <- donnees[input$dt_rows_selected, ]
query <- glue_sql(
  'UPDATE myschema.mytable SET field1 = ({vals*}), field2 = ({vals2*}), field3 = ({vals3*}), field4 = ({vals4*}), field5 = ({vals5*}) WHERE id IN ({ID_field*});',
  vals = selected_row$column1, vals2 = selected_row$column2,
  vals3 = selected_row$column3, vals4 = selected_row$column4,
  vals5 = selected_row$column5, ID_field = selected_row$ID,
  .con = pool)
DBI::dbExecute(pool2, query)
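One way to avoid hand-writing 60 field = value pairs is to build the SET clause programmatically and splice it in as literal SQL. A minimal sketch (assuming the dataframe column names match the database field names, "ID" is the key column, and pool is your connection, as in the question):
library(glue)
library(DBI)

selected_row <- donnees[input$dt_rows_selected, ]

# every column except the key becomes one "field" = value pair;
# the backticks quote the identifier, the value is escaped for the dialect
cols <- setdiff(names(selected_row), "ID")
pairs <- vapply(cols, function(cl) {
  glue_sql("{`cl`} = {val}", cl = cl, val = selected_row[[cl]], .con = pool)
}, character(1))

# DBI::SQL() marks the assembled clause as literal SQL so that
# glue_sql() inserts it verbatim instead of escaping it as a string
query <- glue_sql(
  "UPDATE myschema.mytable SET {set_clause} WHERE id = {selected_row$ID}",
  set_clause = DBI::SQL(paste(pairs, collapse = ", ")),
  .con = pool)
DBI::dbExecute(pool, query)
The answer below goes a different route entirely, batching edits into a postgres upsert.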

The purpose of this answer is two-fold:
Demonstrate the (a?) proper postgres-style upsert action. I present a pg_upsert function, and in that function I've included (prefixed with #'#) what the query looks like when finished. The query is formed dynamically, so it needs no a priori knowledge of the fields other than the user-provided idfields= argument.
Demonstrate how to react to DT edits using this function. This is one way; there are definitely other ways to deal with the reactive DT. If you have a different style for keeping track of changes in the DT, then feel free to take pg_upsert and run with it!
Notes:
it does not update the database with each cell edit; the changes are "batched" until the user clicks the Upsert! button. It is feasible to update on each cell edit instead, but then each change would be a relatively trivial single-row query, with no need for upserts
since you're using postgres, the target table must have one or more unique indices (see No unique or exclusion constraint matching the ON CONFLICT); I'll create the sample data and the index on said table. If you don't understand what this means and your data doesn't have a clear "id" field (or fields), then do what I did: add an id column (both locally and in the db) that sequences along your real rows (this won't work if your data is preexisting and has no id fields)
the id field(s) must not be editable, so the editable= part of DT disables changing those columns; I included a query (found in https://stackoverflow.com/a/2213199/3358272) that will tell you these fields programmatically; if it returns nothing, go back to the previous bullet and fix that
the pg_upsert function takes a few steps to ensure things are clean (i.e., it checks for duplicate ids), but it does not check for incorrect new values (DT does some of this for you, by class, I believe); I'll assume you are verifying what you need before sending an upsert
the return value from pg_upsert is a logical indicating whether the upsert affected as many rows as expected; this might be overly aggressive, though I cannot think of a case where it would correctly return something other than nrow(value); caveat emptor
I include an optional "dbout" table in the shiny layout solely to show the current state of the database data, updated every time pg_upsert is called (indirectly). If no changes have been made, it still queries to show the current state, which makes it the best way to show the starting condition for your testing. Again, it is optional; when you remove it (and you should) and nothing else uses the do_update() reactive, then change
do_update <- eventReactive(input$upbtn, ...)
output$dbout <- renderTable({ do_update(); ... })
to
observeEvent(input$upbtn, ...)
# output$dbout <- renderTable({ do_update(); ... })
(Otherwise, a reactive(.) block that is never used downstream will never fire, so your updates would not happen.)
This app queries the database for all values (into curdata); this is likely already being done in your case. This app also finds (programmatically) the required indices. If you know ahead of time what these are, feel free to drop the query that feeds idfields and just assign it directly (case-sensitive).
When the app exits, the user-edited data is not stored in the local R console/environment; all changes are stored in the database. My assumption is that this will be formalized into a shiny-server, RStudio Connect, or similar production server, in which case "console" has little meaning. If you really need the user-changed data to be available on the local R console while you are developing your app, then in addition to using the mydata reactive values, you can overwrite curdata <<- mydata$data (note the double < in <<-) after mydata$data is reassigned. I discourage this practice in production, but it might be useful while in development.
Here is the setup for the sample data. It doesn't matter whether you have 6 columns (as here) or 60; the premise remains the same. (After this, origdata is not used; it was a throw-away to prep for this answer.)
# pgcon <- DBI::dbConnect(...)
set.seed(42)
origdata <- iris[sample(nrow(iris), 6),]
origdata$id <- seq_len(nrow(origdata))
# setup for this answer
DBI::dbExecute(pgcon, "drop table if exists mydata")
DBI::dbWriteTable(pgcon, "mydata", origdata)
# postgres upserts require 'unique' index on 'id'
DBI::dbExecute(pgcon, "create unique index mydata_id_idx on mydata (id)")
Here is the UPSERT function itself, broken out to facilitate testing, console evaluation, and similar.
#' @param value 'data.frame', values to be updated, does not need to
#'   include all columns in the database
#' @param name 'character', the table name to receive the updated
#'   values
#' @param idfields 'character', one or more id fields that are present
#'   in both 'value' and the database table, these cannot change
#' @param con database connection object, from [DBI::dbConnect()]
#' @param verbose 'logical', be verbose about operation, default true
#' @return 'logical', whether 'nrow(value)' rows were affected; if an
#'   error occurred, it is messaged to the console and a `FALSE` is
#'   returned
pg_upsert <- function(value, name, idfields, con = NULL, verbose = TRUE) {
  if (verbose) message(Sys.time(), " upsert ", name, " with ", nrow(value), " rows")
  if (any(duplicated(value[idfields]))) {
    message("'value' contains duplicates in the idfields, upsert will not work")
    return(FALSE)
  }
  tmptable <- paste(c("uptemp_", name, "_", sample(1e6, size = 1)), collapse = "")
  on.exit({
    DBI::dbExecute(con, paste("drop table if exists", tmptable))
  }, add = TRUE)
  DBI::dbWriteTable(con, tmptable, value)
  cn <- colnames(value)
  quotednms <- DBI::dbQuoteIdentifier(con, cn)
  notid <- DBI::dbQuoteIdentifier(con, setdiff(cn, idfields))
  qry <- sprintf(
    "INSERT INTO %s ( %s )
     SELECT %s FROM %s
     ON CONFLICT ( %s ) DO
     UPDATE SET %s",
    name, paste(quotednms, collapse = " , "),
    paste(quotednms, collapse = " , "), tmptable,
    paste(DBI::dbQuoteIdentifier(con, idfields), collapse = " , "),
    paste(paste(notid, paste0("EXCLUDED.", notid), sep = "="), collapse = " , "))
  #'# INSERT INTO mydata ( "Sepal.Length" , "Petal.Length" , "id" )
  #'#   SELECT "Sepal.Length" , "Petal.Length" , "id" FROM uptemp_mydata_123456
  #'#   ON CONFLICT ( "id" ) DO
  #'#   UPDATE SET "Sepal.Length"=EXCLUDED."Sepal.Length" , "Petal.Length"=EXCLUDED."Petal.Length"
  # dbExecute returns the number of rows affected; this ensures we
  # return a logical "yes, all rows were updated" or "no, something
  # went wrong"
  res <- tryCatch(DBI::dbExecute(con, qry), error = function(e) e)
  if (inherits(res, "error")) {
    msg <- paste("error upserting data:", conditionMessage(res))
    message(Sys.time(), " ", msg)
    ret <- FALSE
    attr(ret, "error") <- conditionMessage(res)
  } else {
    ret <- (res == nrow(value))
    if (!ret) {
      msg <- paste("expecting", nrow(value), "rows updated, returned", res, "rows updated")
      message(Sys.time(), " ", msg)
      attr(ret, "error") <- msg
    }
  }
  ret
}
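Before wiring this into shiny, a quick console smoke test may help (a sketch, assuming the pgcon connection and mydata table from the setup above). Note that value only needs the edited columns plus the id field:
# update one cell of the row with id 3; only the changed column
# and the id column are sent
newrow <- data.frame(Sepal.Length = 9.9, id = 3)
pg_upsert(newrow, "mydata", idfields = "id", con = pgcon)
DBI::dbGetQuery(pgcon, "select * from mydata where id = 3")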
Here's the shiny app. When you source this, you can immediately press Upsert! to get the current state of the database table (again, only an option, not required for production); no updated values are needed to requery.
library(shiny)
library(DT)

pgcon <- DBI::dbConnect(...) # fix this incomplete expression

curdata <- DBI::dbGetQuery(pgcon, "select * from mydata order by id")

# if you don't know the idfield(s) offhand, then use this:
idfields <- DBI::dbGetQuery(pgcon, "
  select
    t.relname as table_name,
    i.relname as index_name,
    a.attname as column_name
  from
    pg_class t,
    pg_class i,
    pg_index ix,
    pg_attribute a
  where
    t.oid = ix.indrelid
    and i.oid = ix.indexrelid
    and a.attrelid = t.oid
    and a.attnum = ANY(ix.indkey)
    and t.relkind = 'r'
    and t.relname = 'mydata'
  order by
    t.relname,
    i.relname;")
idfieldnums <- which(colnames(curdata) %in% idfields$column_name)

shinyApp(
  ui = fluidPage(
    DTOutput("tbl"),
    actionButton("upbtn", "UPSERT!"),
    tableOutput("dbout")
  ),
  server = function(input, output) {

    mydata <- reactiveValues(data = curdata, changes = NULL)

    output$tbl = renderDT(
      mydata$data, options = list(lengthChange = FALSE),
      editable = list(target = "cell", disable = list(columns = idfieldnums)))

    observeEvent(input$tbl_cell_edit, {
      mydata$data <- editData(mydata$data, input$tbl_cell_edit)
      mydata$changes <- rbind(
        if (!is.null(mydata$changes)) mydata$changes,
        input$tbl_cell_edit
      )
      # keep only the most recent change to the same cell
      dupes <- rev(duplicated(mydata$changes[rev(seq(nrow(mydata$changes))), c("row", "col")]))
      mydata$changes <- mydata$changes[!dupes,]
      message(Sys.time(), " pending changes: ", nrow(mydata$changes))
    })

    do_update <- eventReactive(input$upbtn, {
      if (isTRUE(nrow(mydata$changes) > 0)) {
        # always include the 'id' field(s)
        updateddata <- mydata$data[mydata$changes$row, c(mydata$changes$col, idfieldnums)]
        res <- pg_upsert(updateddata, "mydata", idfields = "id", con = pgcon)
        # clear the stored changes only if the upsert was successful
        if (res) mydata$changes <- mydata$changes[0,]
      }
      input$upbtn
    })

    output$dbout <- renderTable({
      do_update() # react when changes are attempted, the button is pressed
      message(Sys.time(), " query 'mydata'")
      DBI::dbGetQuery(pgcon, "select * from mydata order by id")
    })

  }
)
In action:
(Left) When we start, we see the original DT and no database output.
(Middle) Press the Upsert! button just to query the db and show the optional table.
(Right) Make updates, then press Upsert!, and the database is updated (and the lower table re-queried).

Related

How can I reduce SQL Query complexity and length in RShiny App?

I am building an app using RShiny which queries a SQLite database.
I have finished writing the code for the app, but I am having trouble reducing the complexity of the query / using the glue_sql function. As it is right now, there are many OR statements, which I believe might be slowing the query down.
I used sqlInterpolate to build the query to prevent SQL injection, as the RShiny website cites this as what should be done.
This is my code as of right now:
UI:
ui <- fluidPage(theme = shinytheme("cerulean"),
  titlePanel("Application"),
  sidebarLayout(
    sidebarPanel(
      checkboxGroupInput(inputId = "Assay", label = "Assay Type:",
                         choices = c("MBP1", "EV", "Oxidation", "WGAS", "WLS", "Genomics"),
                         selected = c("MBP1", "EV", "Oxidation", "WGAS", "WLS", "Genomics")), #Assay
      checkboxGroupInput(inputId = "Vendor", label = "Vendor:",
                         choices = c("AKE", "BSA"),
                         selected = c("AKE", "BSA")), #Vendor
      checkboxGroupInput(inputId = "Study", label = "Study Type",
                         choices = c("MP", "CSP"),
                         selected = c("MP", "CSP")),
      dateRangeInput(inputId = "Date", label = "Sample Date Range:", format = "yyyy-mm-dd"), #Date
      actionButton("read", "Read From Database")), # note: input ids must be unique, so only one of these buttons
    mainPanel(h1("Sample Count:"),
              dataTableOutput("Samples_Sorted_by_Study"),
              dataTableOutput("Samples_Sorted_by_Assay_and_Vendor")) #table
  ) #closing sidebarLayout
) #closing fluidPage UI
Server:
server <- function(input, output){
  # Storing values in myData variable
  myData <- reactiveValues()
  observeEvent(
    input$read,
    {
      myData$assay <- input$Assay
      myData$vendor <- input$Vendor
      myData$date_val <- input$Date
      myData$studytype <- input$Study
      # Opening database connection
      connectiontodb <- dbConnect(RSQLite::SQLite(), "example.sqlite")
      # Shortening Query??
      # myData$study_int <- c(myData$studytype[1], myData$studytype[2])
      # myData$study_int <- glue_sql("{myData$study_int*}")
      # conducting first query of samples grouped by study
      # Sample query of example.sqlite database
      myData$interpolatequery1 <- sqlInterpolate(connectiontodb,
        "Select study, count(genomic_id), count(specimen_id)
         FROM exampletable WHERE study = ?studytype1 OR study = ?studytype2
         GROUP BY study",
        studytype1 = myData$studytype[1], studytype2 = myData$studytype[2])
      myData$exampledataSQLQuery1 <- dbGetQuery(connectiontodb, myData$interpolatequery1)
      # Closing connection to database
      dbDisconnect(connectiontodb)
      # Correcting column names
      colnames(myData$exampledataSQLQuery1) <- c("Study", "Genomic Id", "Specimen Id")
    }
  )
  output$Samples_Sorted_by_Study <- renderDataTable(myData$exampledataSQLQuery1)
}
shinyApp(ui = ui, server = server)
How can I condense the query so that there isn't a need for so many OR statements, but also protect against SQL injection?
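One common way to condense this is to replace the chain of ORs with a single IN clause. glue_sql builds that safely from a vector with its {x*} splice syntax, which escapes each element and collapses them with commas. A minimal sketch (assuming the example.sqlite schema from the question, run inside the observeEvent):
library(glue)
library(DBI)

connectiontodb <- dbConnect(RSQLite::SQLite(), "example.sqlite")
# input$Study may hold any number of selections;
# {studytypes*} expands to a quoted, comma-separated list
interpolatedquery <- glue_sql(
  "SELECT study, count(genomic_id), count(specimen_id)
     FROM exampletable
    WHERE study IN ({studytypes*})
    GROUP BY study",
  studytypes = input$Study,
  .con = connectiontodb)
result <- dbGetQuery(connectiontodb, interpolatedquery)
dbDisconnect(connectiontodb)
The same pattern collapses the Assay and Vendor checkbox groups, so the query no longer grows with the number of choices.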

Checking if GroovyRowResult field is empty string

I am using sql.firstRow to check if a row exists in the postgres database based on some criteria.
def cur = sql.firstRow(r, '''
SELECT "some_thing"
FROM "my_table"
WHERE "customer_name" = :customer_name
AND "sad_date" = :sad_date
AND "forgiver" = :forgiver
''')
I find that this works:
if (cur){
log.debug("Found Some thing " + cur["some_thing"])
log.debug("Cur: " + cur.keySet())
}
however, this lets in any rows that don't have a some_thing value in them.
ISSUE
To avoid this, we try to check for the existence of a non-empty value for some_thing on the result row like this:
if (cur && "${cur.some_thing}" ){
log.debug("Found Some thing " + cur["some_thing"])
}
ERROR
I get an error suggesting that:
No signature of method: java.lang.String.positive() is applicable for argument types...
I have read this question and changed from cur.some_thing and cur['some_thing'] to "${cur.some_thing}", but the error does not go away.
I have also tried this post and tried to use cur.getProperty("some_thing"), and it still throws the same error.

Subsetting or filtering of data as per dynamic user input

I'd like to filter data per dynamic user input. The sample code below is executable; however, it currently lacks the logic to filter the data, so the table does not change regardless of which variables are selected. The code has three main components: 1. the first few lines create a data frame whose information is used in creating control widgets on the fly; 2. the ui code; 3. the server, which currently holds the logic to create control widgets on the fly. I need to figure out how to use the information from the dynamically created control widgets to filter the data (which is used for different purposes). I'm unable to figure out how to track the number of variables and filter per their ranges. I greatly appreciate suggestions.
## Create a data frame whose information is used in creating (dynamic) control widgets
varnames <- names(iris[, 1:4])             # names
varinit  <- apply(iris[, 1:4], 2, median)  # initial value used in slider
varmin   <- apply(iris[, 1:4], 2, min)     # min.
varmax   <- apply(iris[, 1:4], 2, max)     # max.
## dataframe
vardf <- data.frame(varnames, varmin, varmax, varinit)
ui <- fluidPage(
  checkboxGroupInput("ConditioningVariables", "Conditioning variables (choose one or more):",
                     varnames, inline = TRUE),
  uiOutput("ControlWidgetsofConditioningVariables"),
  tableOutput("data")
)
server <- function(input, output, session) {
  output$ControlWidgetsofConditioningVariables <- renderUI({
    if (is.null(input$ConditioningVariables)) {
      return()
    } else {
      selvarnames <- sort(input$ConditioningVariables)
      selpos <- sapply(selvarnames, function(x) which(varnames == x))
      # create a taglist of dynamic widgets
      ListofDynamicWidgets <- lapply(selpos, function(x) {
        sliderInput(as.character(vardf[x, 1]), as.character(vardf[x, 1]),
                    vardf[x, 2], vardf[x, 3], vardf[x, 4], .1)
      })
      do.call(tagList, ListofDynamicWidgets)
    }
  })
  ## filter data per the selected variables and their ranges
  ## this is where I'm kind of stuck; I think I need to track the number
  ## of variables (is a list a good idea?) and filter per the selected
  ## range of each specific variable
  newdata <- reactive({
    subset(iris)
  })
  output$data <- renderTable({ newdata() })
}
shinyApp(ui, server)
Here's one approach (note that I slightly modified the code to generate the widgets). The main idea is to generate a named list of the range vectors (conds), convert the ranges into a list of character strings that represent the filtering conditions (subs), collapse them into one long string of conditions, and use that as the subsetting argument for filter_().
library(dplyr)

server <- function(input, output, session) {
  allControls <- lapply(setNames(varnames, varnames), function(x) {
    sliderInput(x, x, varmin[x], varmax[x], c(varmin[x], varinit[x]),
                round = -2)
  })
  output$ControlWidgetsofConditioningVariables <- renderUI({
    if (is.null(input$ConditioningVariables)) {
      return()
    } else {
      allControls[input$ConditioningVariables]
    }
  })
  ## filter data per the selected variables and their ranges
  newdata <- reactive({
    if (!is.null(input$ConditioningVariables)) {
      cond_names <- input$ConditioningVariables
      conds <- lapply(setNames(cond_names, cond_names), function(x) input[[x]])
      subs <- mapply(function(name, range) {
        if (!is.null(range))
          sprintf("%1$s >= %2$f & %1$s <= %3$f", name, range[1], range[2])
      }, names(conds), conds)
      subs <- subs[!sapply(subs, is.null)]
      if (length(subs) > 0)
        filter_(iris, paste0(subs, collapse = " & "))
    } else {
      iris
    }
  })
  output$data <- renderTable({ newdata() })
}
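Note that filter_() has since been deprecated in dplyr. A minimal sketch of the same idea with current dplyr/rlang (the filter_by_ranges helper name is mine, not from the answer) builds unevaluated expressions from the named list of ranges and splices them into filter():
library(dplyr)
library(rlang)

# conds: a named list such as list(Sepal.Length = c(5, 6.5)),
# as built in the reactive above from the slider inputs
filter_by_ranges <- function(data, conds) {
  exprs <- lapply(names(conds), function(nm) {
    rng <- conds[[nm]]
    expr(between(!!sym(nm), !!rng[1], !!rng[2]))
  })
  filter(data, !!!exprs)  # splice all conditions, AND-ed together
}

filter_by_ranges(iris, list(Sepal.Length = c(5, 6.5), Petal.Width = c(0.2, 1.5)))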

SQL Query Error when selecting separate columns

I do not understand the error which appears in my SQL query. If I choose in SQL
Select * -> it works fine and I get the table;
however, if I select any of the column(s), it gives me an error:
Error in `$<-.data.frame`(`*tmp*`, "PROBE", value = structure(integer(0), .Label = character(0), class = "factor")) :
replacement has 0 rows, data has 1427
Here is my SQL code:
if (input$filter == 1) {
  sqlOutput <- reactive({
    sqlInput <- paste("select * from DWH.PROBE where DWH.PROBE.Nr =", paste0("'", d(), "'"),
                      "And DWH.PROBE.AB BETWEEN", input$abmfrom, "AND", input$abmto,
                      "ORDER BY Datum asc")
    print(sqlInput)
    dbGetQuery(con$cc, sqlInput)
  })
} else {
  sqlOutput <- reactive({
    sqlInput <- paste("select * from DWH.PROBE where DWH.PROBE.S BETWEEN", d2(),
                      "AND", input$sgehalt2, "And DWH.PROBE.AB BETWEEN", input$abmfrom2,
                      "AND", input$abmto2, "ORDER BY Datum asc")
    dbGetQuery(con$cc, sqlInput)
  })
}
And if I just change those SQL queries to
select DWH.PROBE.S, DWH.PROBE.AB.. from DWH.PROBE
then the above-mentioned error appears.
Additionally I need to say that if I use this SQL query in a simple script:
rs <- dbSendQuery(con, paste("select DWH.PROBE.AB, DWH.PROBE.S from DWH.PROBE where DWH.PROBE.Nr = '50' And DWH.PROBE.AB BETWEEN 40 AND 50 ORDER BY Datum asc"))
data <- fetch(rs)
it gives me the results...
Any ideas?
[EDIT * as my question is not a duplicate]
The question posted here: http://stackoverflow.com/questions/32048072/how-to-pass-input-variable-to-sql-statement-in-r-shiny actually has nothing to do with my topic. As we can see, the error in that post is:
Error in .getReactiveEnvironment()$currentContext() : Operation not allowed without an active reactive context. (You tried to do something that can only be done from inside a reactive expression or observer.)
I do not have a problem with passing an input variable to the SQL statement, and additionally, as you can see in my code, the query is in a reactive context!:
sqlOutput <- reactive({...
The solution for that question was to put the SQL query in a reactive context, which is not the issue in my case.
[EDIT 2] -> bits related to sqlOutput()
Here is the bit of code related to sqlOutput() which I am using in my Shiny app (at the moment this is the only bit, because I am stuck with the SQL query):
output$tabelle <- DT::renderDataTable({
  data <- sqlOutput()
  data$PROBE <- as.factor(as.character(data$PROBE))
  data
}, rownames = TRUE, filter = "top", class = 'cell-border stripe',
   options = list(pageLength = 100, lengthMenu = c(100, 200, 500),
                  columnDefs = list(list(width = '200px', targets = "_all"),
                                    list(bSortable = FALSE, targets = "_all"))))
Thanks
Thanks
The error doesn't relate to the SQL statements; however, try changing your code to the below:
sqlOutput <- reactive({
  if (input$filter == 1) {
    sqlInput <- paste("select * from DWH.PROBE where DWH.PROBE.Nr =", paste0("'", d(), "'"),
                      "And DWH.PROBE.AB BETWEEN", input$abmfrom, "AND", input$abmto,
                      "ORDER BY Datum asc")
  } else {
    sqlInput <- paste("select * from DWH.PROBE where DWH.PROBE.S BETWEEN", d2(),
                      "AND", input$sgehalt2, "And DWH.PROBE.AB BETWEEN", input$abmfrom2,
                      "AND", input$abmto2, "ORDER BY Datum asc")
  }
  dbGetQuery(con$cc, sqlInput)
})
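A plausible reading of the traceback, for what it's worth (the guard below is my addition, not part of the original answer): when specific columns are selected and PROBE is not among them, data$PROBE is NULL, so as.factor(as.character(NULL)) produces a zero-length factor, and assigning it back is what fails with "replacement has 0 rows, data has 1427". A defensive version of the render block:
output$tabelle <- DT::renderDataTable({
  data <- sqlOutput()
  # only coerce PROBE when the query actually returned that column;
  # assigning a zero-length factor to a missing column is what raises
  # "replacement has 0 rows"
  if ("PROBE" %in% names(data)) {
    data$PROBE <- as.factor(as.character(data$PROBE))
  }
  data
}, rownames = TRUE, filter = "top")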

How do I loop through each DB field to see if the range is correct

I have this response in soapUI:
<pointsCriteria>
    <calculatorLabel>Have you registered for inContact, signed up for marketing news from FNB/RMB Private Bank, updated your contact details and chosen to receive your statements</calculatorLabel>
    <description>Be registered for inContact, allow us to communicate with you (i.e. update your marketing consent to 'Yes'), receive your statements via email and keep your contact information up to date</description>
    <grades>
        <points>0</points>
        <value>No</value>
    </grades>
    <grades>
        <points>1000</points>
        <value>Yes</value>
    </grades>
    <label>Marketing consent given and Online Contact details updated in last 12 months</label>
    <name>c21_mrktng_cnsnt_cntct_cmb_point</name>
</pointsCriteria>
There are many, many pointsCriteria entries, and I use the xquery below to give me the DB field and the range of values that field is meant to hold:
<return>
{
    for $x in //pointsCriteria
    return <DBRange>
        <db>{data($x/name/text())}</db>
        <points>{data($x//points/text())}</points>
    </DBRange>
}
</return>
And I get the below response:
<return><DBRange><db>c21_mrktng_cnsnt_cntct_cmb_point</db><points>0 1000</points></DBRange>
That last bit sits in a property transfer. I need SQL to bring back all rows where that DB field is not in that points range (the field can only be 0 or 1000 in this case); my problem is I don't know how to loop through each DBRange in this manner. Please help.
I'm not sure that I really understand your question; however, I think you want to query your DB against a specific table, with the column name defined in the <db> field of your xml, using the values defined in the <points> field of the same xml.
You can try using a groovy TestStep: first parse your xml to get back the column name and the points. To iterate over the points, since the values are separated with a blank space, use split(" ") to get a list and then each() to iterate over it. Then, using groovy.sql.Sql, you can perform the queries against your DB.
Only one more thing: you need to put the JDBC drivers for your DB vendor in $SOAPUI_HOME/bin/ext and then restart SOAPUI so that it can load the necessary driver classes.
So the following code approach can achieve your goal:
import groovy.sql.Sql
import groovy.util.XmlSlurper

// soapui groovy testStep requires that you first register your
// db vendor drivers; as an example I use oracle drivers...
com.eviware.soapui.support.GroovyUtils.registerJdbcDriver("oracle.jdbc.driver.OracleDriver")

// connection properties db (example for oracle data base)
def db = [
    url      : 'jdbc:oracle:thin:@db_host:db_port/db_name',
    username : 'yourUser',
    password : '********',
    driver   : 'oracle.jdbc.driver.OracleDriver'
]

// create the db instance
def sql = Sql.newInstance("${db.url}", "${db.username}", "${db.password}", "${db.driver}")

def result = '''<return>
    <DBRange>
        <db>c21_mrktng_cnsnt_cntct_cmb_point</db>
        <points>0 1000</points>
    </DBRange>
</return>'''

def resXml = new XmlSlurper().parseText(result)
// get the field
def field = resXml.DBRange.db.text()
// get the points
def points = resXml.DBRange.points.text()
// points are separated by blank space,
// so split to get an array with the points
def pointList = points.split(" ")
// for each point make your query
pointList.each {
    def sqlResult = sql.rows "select * from your_table where ${field} = ?", [it]
    log.info sqlResult
}
sql.close()
Hope this helps,
Thanks again for your help @albciff. I had to add this into a multidimensional array (I renamed field to column, and result is a large return from the xquery above):
def resXml = new XmlSlurper().parseText(result)
// get the columns and points ranges
def Column = resXml.DBRange.db*.text()
def Points = resXml.DBRange.points*.text()
// sorting it all out into a multidimensional array (index per index)
count = 0
bigList = Column.collect {
    [it, Points[count++]]
}
// iterating through the array
bigList.each {
    // creating two smaller lists and making it readable for the sql part later
    def column = it[0]
    def points = it[1]
    // further splitting the points to test each
    pointList = points.split(" ")
    pointList.each {
        // test each points range per column
        def sqlResult = sql.rows "select * from my_table where ${column} <> ?", [it]
        log.info sqlResult
    }
}
sql.close()
return