How to modify ggboxplot (ggpubr) to suppress whiskers, but retain access to other boxplot customisations? - ggplot2

Originally I asked this question about suppressing the whiskers on a boxplot made by ggboxplot. (The expected way of setting a geom_boxplot option was not available.) A nice solution appeared which suited the original question. However, the broader question to address is how to suppress whiskers on the boxplot but still retain access to the nice additions in ggpubr, such as being able to automatically compute statistical test results and place these on a boxplot.
I tinkered with the solution from @Julian_Hn to get something like what I want.
There are two issues that someone more knowledgeable might be able to help with, now that I've asked the broader question:
Are there ways to make the solution more efficient?
How can I add in the ability to change the range of x-values? (I tried various methods using ggpar and coord_cartesian, with no effect. I might be lacking knowledge of how to use commands like ggplot_build effectively.)
Here's an example where I suppress whiskers and use stat_kruskal_test to label the boxplot:
ggboxplot_whisker_opt <- function(...)
{
opts <- list(...) # Modification of original question solution to include the original labelled ggboxplot with whiskers and stat info added
# Check if user specified a whiskers arg and set options accordingly
if("whisker" %in% names(opts))
{
whisk <- opts$whisker
opts$whisker <- NULL
} else {
whisk <- TRUE
}
# Additional arguments that might need generalising so that other statistical tests can be used in other applications
kruskal <- FALSE # default, so the test further down works when no kruskal argument is passed
if ("kruskal" %in% names(opts))
{ kruskal <- opts$kruskal
opts$kruskal <- NULL
opt.group <- opts$kruskal.options[[1]]
opt.label<- opts$kruskal.options[[2]]
opt.y <- opts$kruskal.options[[3]]
opt.x <- opts$kruskal.options[[4]]
opts$kruskal.options <- NULL
}
pl <- do.call(ggboxplot,opts) # create plot by calling ggboxplot with all user options
if (kruskal){ pl <- pl + stat_kruskal_test(group.by=opt.group,label=opt.label, label.y.npc=opt.y,label.x.npc=opt.x) }
if(!whisk)
{ pl_list <- ggplot_build(pl) # get listed version of ggboxplot object to modify
pl_list$data[[1]]$ymin <- NA # remove the ymin/max that specify the whiskers
pl_list$data[[1]]$ymax <- NA
pl <- ggplot_gtable(pl_list) # convert back to ggplot object
}
# return
pl
}
Here's the application:
set.seed(123)
x <-rnorm(100)
labels <- round(runif(100,1,2))
df <- data.frame(labels=labels, value=x)
# Define the options for the stat_kruskal_test label
KO <- list("group"="labels","label"="as_detailed_italic", "label.y.npc"=0.5,"label.x.npc"=0.5,ylim=c(-1.2, 1.2))
# call the function
output.plot <- ggboxplot_whisker_opt(df, "labels","value", col="labels", legend="none", whisker=FALSE,add=c("mean"), orientation="horizontal", kruskal=TRUE,kruskal.options=KO)
# Plot the result
plot(output.plot)

The issue with modifying was that the returned object was no longer a ggplot object (wrong comment on my side) but a gtable. Having thought about it, instead of modifying the ggplot_build output it is also possible to pass coef=0 directly to the geom_boxplot layer inside the object returned by ggboxplot:
ggboxplot_whisker_opt <- function(...)
{
opts <- list(...)
# check if user specified a whiskers argument and set options accordingly
if("whisker" %in% names(opts))
{
whisk <- opts$whisker
opts$whisker <- NULL
} else {
whisk <- TRUE
}
pl <- do.call(ggpubr::ggboxplot, opts) # create plot by calling ggboxplot with all user options
if(!whisk)
{
pl$layers[[1]]$stat_params$coef <- 0 # modify coef param of geom_boxplot layer
}
# plot the ggplot and return so other ggplot parts can be added via `+`
pl
}
This function now returns an object that is compatible with ggpar and that accepts other ggplot modifiers via +:
library(ggplot2)
library(ggpubr)
set.seed(123)
x <- rnorm(100)
labels <- round(runif(100,1,2))
df <- data.frame(labels=labels,
value=x)
testplot <- ggboxplot_whisker_opt(df,"labels","value",whisker=FALSE)
ggpar(testplot,xlim=c(0.5,1.5),
ylim=c(-0.5,0.5))
testplot +
geom_line(data=data.frame(x=c(1,2),y=c(0,0)),aes(x=x,y=y),color="red",lwd=2)
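The effect of coef = 0 can be checked on a plain ggplot2 boxplot. This is a minimal sketch, not part of the answer above: it assumes only ggplot2, and the data frame simply mirrors the example data. With coef = 0 the whisker ends (ymin, ymax) can no longer fall outside the box (lower, upper), so nothing is drawn beyond the hinges. A side effect is that every point outside the box is then classed as an outlier, so the sketch hides outlier points with outlier.shape = NA.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(labels = factor(round(runif(100, 1, 2))),
                 value  = rnorm(100))

# coef = 0 is forwarded to stat_boxplot; outlier.shape = NA hides the
# points that would otherwise be drawn as outliers
p <- ggplot(df, aes(x = labels, y = value)) +
  geom_boxplot(coef = 0, outlier.shape = NA)

# layer_data() returns the computed boxplot statistics per group
ld <- layer_data(p)
ld[, c("ymin", "lower", "middle", "upper", "ymax")]
```

If outlier points should stay visible in the ggboxplot version, the outlier settings would have to be left at their defaults instead.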


Is there a method for converting a wimids object to a mids object?

Suppose I create 10 multiply-imputed datasets and use the (wonderful) MatchThem package in R to create weights for my exposure variable. The MatchThem package takes a mids object and converts it to an object of the class wimids.
My desired output is a mids object - but with weights. I hope to pass this mids object to BRMS as follows:
library(brms)
m0 <- brm_multiple(Y|weights(weights) ~ A, data = mids_data)
Open to suggestions.
EDIT: Noah's solution below will unfortunately not work.
The package's first author, Farhad Pishgar, sent me the following elegant solution. It will create a mids object from a wimids object. Thank you Farhad!
library(mice)
library(MatchThem)
# "weighted.datasets" is our wimids object
#Extracting the original dataset with missing value
maindataset <- complete(weighted.datasets, action = 0)
#Some spit-and-polish
maindataset <- data.frame(.imp = 0, .id = seq_len(nrow(maindataset)), maindataset)
#Extracting imputed-weighted datasets in the long format
alldataset <- complete(weighted.datasets, action = "long")
#Binding them together
alldataset <- rbind(maindataset, alldataset)
#Converting to .mids
newmids <- as.mids(alldataset)
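The same long-format round trip can be tried on an ordinary mids object, which may help when checking the steps above. This is a minimal sketch that assumes only the mice package; the column w is a hypothetical placeholder standing in for the weights that MatchThem would add, and nhanes is the small example dataset shipped with mice.

```r
library(mice)

# Impute a small example dataset (m = 3 keeps it quick)
imp <- mice(nhanes, m = 3, printFlag = FALSE, seed = 1)

# Long format including the original incomplete data (.imp == 0)
long <- complete(imp, action = "long", include = TRUE)

# Hypothetical placeholder for a weights column
long$w <- 1

# Convert the long data frame back to a mids object
imp2 <- as.mids(long)
class(imp2)
```

The new column travels with the data inside the rebuilt mids object, which is the property the solution above relies on.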
Additionally, for BRMS, I worked out this solution which instead creates a list of dataframes. It will work in fewer steps.
library("mice")
library("dplyr")
library("MatchThem")
library("brms") # for bayesian estimation.
# Note, I realise that my approach here is not fully Bayesian, but that is a good thing! I need to ensure balance in the exposure.
# impute missing data
data("nhanes2")
imp <- mice(nhanes2, printFlag = FALSE, seed = 0, m = 10)
# MatchThem. This is just a fast method
w_imp <- weightthem(hyp ~ chl + age, data = imp,
approach = "within",
estimand = "ATE",
method = "ps")
# get individual data frames with weights
out <- complete(w_imp, action ="long", include = FALSE, mild = TRUE)
# assemble individual data frames into a list
m <- 10
listdat<- list()
for (i in 1:m) {
listdat[[i]] <- as.data.frame(out[[i]])
}
# pass the list to brms, and it runs as it should!
fit_1 <- brm_multiple(bmi|weights(weights) ~ age + hyp + chl,
data = listdat,
backend = "cmdstanr",
family = "gaussian",
set_prior('normal(0, 1)',
class = 'b'))
brm_multiple() can take in a list of data frames for its data argument. You can produce this from the wimids object using complete(). The output of complete() with action = "all" is a mild object, which is a list of data frames, but this is not recognized by brm_multiple() as such. So, you can just convert it to a list. This should look like the following:
df_list <- complete(mids_data, "all")
class(df_list) <- "list"
m0 <- brm_multiple(Y|weights(weights) ~ A, data = df_list)
Using complete() automatically adds a weights column to the resulting imputed data frames.

R Shiny: Build an interactive SQL query and copy data into global environment

I am trying to build a shiny app that enables users to query data. Users are supposed to provide a list of values of interest that are used to filter data stored in a database. Unfortunately, dplyr's translation does not appear to be the most efficient/performant solution, so I have to build a string and pass it to the database via sql(). The string manipulations also serve as a crude fail-safe against entry errors and variations in how the list is typed. The following code illustrates this string building and data query process:
library(tibble)
library(dplyr)
library(dbplyr)
library(shiny)
library(RSQLite)
library(DBI)
library(stringr)
# Create string variables to experiment
mtcars1 <- tibble::rownames_to_column(mtcars, var = "model") %>% # Create column based on row names
dplyr::mutate(carmaker = stringr::word(model, 1)) # Create column with first word of column with row names
# Establish example database
con <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")
copy_to(con, mtcars1, "mtcars1", temporary = FALSE)
# Query example
# Input
string_input <- "Mazda, Merc"
# Prepare input string to be used in SQL
string_filter <- string_input %>%
base::gsub("[,]+", " ", .) %>% # remove commas
stringr::str_squish(.) %>% # remove multiple blanks
base::gsub(" ", ",", .) %>% # substitute blanks for commas
base::gsub("(\\w+)", "'\\1'", .) %>% # enclose words with single quotation marks
base::paste0("carmaker in (", ., ")") # create sql where statement
# Data query
data <- tbl(con, "mtcars1") %>%
filter(sql(string_filter)) %>%
show_query() %>%
collect()
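The string preparation could also be wrapped in a small helper so that the non-shiny script and the app share one definition. This is a sketch in base R only; build_filter and its column argument are hypothetical names, not part of any package. It additionally doubles single quotes inside values, which is an extra precaution that the original gsub chain does not take.

```r
# Hypothetical helper: turn "Mazda, Merc" into an SQL "in (...)" condition
build_filter <- function(x, column = "carmaker") {
  # split on any run of commas and/or whitespace
  vals <- unlist(strsplit(x, "[,[:space:]]+"))
  # drop empty pieces caused by leading separators
  vals <- vals[nzchar(vals)]
  # escape single quotes by doubling them
  vals <- gsub("'", "''", vals, fixed = TRUE)
  paste0(column, " in (", paste0("'", vals, "'", collapse = ","), ")")
}

string_filter <- build_filter("Mazda, Merc")
string_filter
```

The result can be passed to filter(sql(...)) in the same way as string_filter above.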
I'd like to implement this code in a shiny app:
# Shiny user interface
ui <- fluidPage(
textInput(inputId = "string_input", label = "Input", value = "", placeholder = "Enter list of car models without commas"),
actionButton(inputId = "go", label = "Go"),
textOutput(outputId = "string_output")
)
# Shiny server function
server <- function(input, output){
observeEvent(input$go, {
output$string_output <- reactive({input$string_input %>%
base::gsub("[,]+", " ", .) %>% # remove commas
stringr::str_squish(.) %>% # remove multiple blanks
base::gsub(" ", ",", .) %>% # substitute blanks for commas
base::gsub("(\\w+)", "'\\1'", .) %>% # enclose words with single quotation marks
base::paste0("carmaker in (", ., ")") # create sql where statement
})
})
}
# Launch shiny app
shinyApp(ui, server)
The app takes the list of values supplied by the user as input, transforms it and shows the transformed list as output.
Here is what I want to do:
I would like to store the transformed string_input in an extra local object for further use in the app, that is, I want to pass the string_input to the data query similar to the non-shiny example above.
I would like to copy the data query result to R's global environment so that I can use it even after closing the app.
Regarding my second point: I read that one can use <- and <<-, but I could not make it work in a reactive context.
I figured out the solution:
library(tibble)
library(dplyr)
library(dbplyr)
library(shiny)
library(RSQLite)
library(DBI)
library(stringr)
# Create string variables to experiment
mtcars1 <- tibble::rownames_to_column(mtcars, var = "model") %>% # Create column based on row names
dplyr::mutate(carmaker = stringr::word(model, 1)) # Create column with first word of column with row names
# Establish example database
con <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")
copy_to(con, mtcars1, "mtcars1", temporary = FALSE)
# Shiny user interface
ui <- fluidPage(
textInput(inputId = "string_input",
label = "Input",
value = "",
placeholder = "Enter a list of car makers (e.g. Mazda, Merc)"),
textOutput(outputId = "string_output"),
actionButton(inputId = "go", label = "Go"),
tableOutput(outputId = "data_output")
)
# Custom function to save reactive object to global environment
saveData <- function(x) {
export <<- x
}
# Shiny server function
server <- function(input, output){
list <- reactive({
input$string_input %>%
base::gsub("[,]+", " ", .) %>% # remove commas
stringr::str_squish(.) %>% # remove multiple blanks
base::gsub(" ", ",", .) %>% # substitute blanks for commas
base::gsub("(\\w+)", "'\\1'", .) %>% # enclose words with single quotation marks
base::paste0("carmaker in (", ., ")") # create sql where statement
})
output$string_output <- renderText(list())
data <- eventReactive(input$go, {
tbl(con, "mtcars1") %>%
dplyr::filter(sql(!!list())) %>%
dplyr::collect()
})
output$data_output <- renderTable(data())
observeEvent(input$go, {
saveData(data())
})
}
# Launch shiny app
shinyApp(ui, server)
The trick was to define the function saveData, pass the reactive data object to it and assign it to export via <<-.
Honestly, I do not understand all of the fundamentals so any suggestions for improvement are welcome. However, it works.
For your first question:
# Shiny server function
server <- function(input, output){
string_output <- eventReactive(input$go, {
input$string_input %>%
base::gsub("[,]+", " ", .) %>% # remove commas
stringr::str_squish(.) %>% # remove multiple blanks
base::gsub(" ", ",", .) %>% # substitute blanks for commas
base::gsub("(\\w+)", "'\\1'", .) %>% # enclose words with single quotation marks
base::paste0("carmaker in (", ., ")") # create sql where statement
})
output$string_output <-renderText(string_output())
}
The string_output() reactive is now available for output as well as for the data query.
Note that you could also use input$string_input instead of input$go as the trigger, to update the output while you type the criteria.
You can then use input$go to query the data:
data <- eventReactive(input$go, { dbGetQuery(yourConnection, YourQuery(string_output())) })
output$data <- renderTable(data())
Not sure you can directly write from Shiny to R's environment, but you can for sure save data() as a file on the server.
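Saving the result to a file could look like the following. This is a minimal sketch in base R; save_result and the default file name are hypothetical, and in the app the call would sit inside an observeEvent(input$go, ...) with data() as the argument.

```r
# Hypothetical helper: write an object to an .rds file and return the path
save_result <- function(x, path = file.path(tempdir(), "query_result.rds")) {
  saveRDS(x, path)
  invisible(path)
}

# Outside shiny, with a stand-in for the query result
result_path <- save_result(head(mtcars))
restored <- readRDS(result_path)

# Inside the server function this might become (not run here):
# observeEvent(input$go, { save_result(data()) })
```

The file survives after the app is closed and can be read back with readRDS() in any later R session that has access to the path.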

Can we bind three plots using rbind_gtable in shiny dashboard to download all the plots together?

Trouble combining multiple shiny plots and downloading through a single click.
Code below is from the answer here: Solution
I tried this solution. It works fine for two plots, but as soon as I add another plot it only returns the graph information and not the graphs themselves.
Combining two plots works fine.
I also tried two other solutions, but when implemented both return either a text file instead of a PDF or a PDF file that is corrupted.
Solution1
Solution2
Any suggestion would be really appreciated. Thank you!
Code
library(shiny)
library(ggplot2)
library(grid) # for grid.newpage() and grid.draw()
ui <- shinyUI(fluidPage(
titlePanel("Test app"),
fluidRow(
column(4,
wellPanel(
downloadButton('download',label="Download plot as png")
)
),
column(8,
plotOutput("plot")
)
)
))
server <- function(input,output) {
plotting<- reactive({
data1=data.frame(x=rnorm(50),y=rnorm(50))
data2=data.frame(x=rexp(50),y=rexp(50))
data3=data.frame(x=rexp(50),y=rexp(50))
plot1=ggplot(data1,aes(x,y))+geom_point()
plot2=ggplot(data2,aes(x,y))+geom_point()
plot3=ggplot(data3,aes(x,y))+geom_point()
gb1=ggplot_build(plot1)
gb2=ggplot_build(plot2)
gb3=ggplot_build(plot3)
gA <- ggplot_gtable(gb1)
gB <- ggplot_gtable(gb2)
gC <- ggplot_gtable(gb3)
both <- gtable:::rbind_gtable(gA, gB, "last")
all <- gtable:::rbind_gtable(both, gC, "last")
return(all)
})
output$plot <- renderPlot({
grid.newpage()
grid.draw(plotting())
})
output$download <- downloadHandler(
filename = "shinytestplot.png",
# Changes:
content = function(file){ ## file = NULL --> file
png(file) # filename --> file
grid.newpage()
grid.draw(plotting())
dev.off()
}
)
}
shinyApp(server=server,ui=ui)
Your example works just fine for me. However, I'd recommend against rbinding plots. Use the patchwork package instead.
library(ggplot2)
library(shiny)
library(patchwork)
ui <- shinyUI(fluidPage(
titlePanel("Test app"),
fluidRow(
column(4,
wellPanel(
downloadButton('download',label="Download plot as png")
)
),
column(8,
plotOutput("plot")
)
)
))
server <- function(input,output) {
plotting<- reactive({
data1=data.frame(x=rnorm(50),y=rnorm(50))
data2=data.frame(x=rexp(50),y=rexp(50))
data3=data.frame(x=rexp(50),y=rexp(50))
plot1=ggplot(data1,aes(x,y))+geom_point()
plot2=ggplot(data2,aes(x,y))+geom_point()
plot3=ggplot(data3,aes(x,y))+geom_point()
# stack the plots on top of one another with patchwork
plot1 / plot2 / plot3
})
output$plot <- renderPlot({
print(plotting())
})
output$download <- downloadHandler(
filename = "shinytestplot.png",
# Changes:
content = function(file){
ggsave(file, plotting())
}
)
}
shinyApp(server=server,ui=ui)

Issue when trying to plot geom_tile using ggplotly

I would like to plot a ggplot2 image using ggplotly.
What I am trying to do is to first plot rectangles with a grey fill and no aesthetic mapping, and then in a second step plot tiles whose colours depend on an aesthetic. My code works when I use ggplot but fails when I use ggplotly to make the graph interactive.
Here is a sample code
library(ggplot2)
library(data.table)
library(plotly)
library(dplyr)
x = rep(c("1", "2", "3"), 3)
y = rep(c("K", "B","A"), each=3)
z = sample(c(NA,"A","L"), 9,replace = TRUE)
df <- data.table(x,y,z)
p<-ggplot(df)+
geom_tile(aes(x=x,y=y),width=0.9,height=0.9,fill="grey")
p<-p+geom_tile(data=filter(df,z=="A"),aes(x=x,y=y,fill=z),width=0.9,height=0.9)
p
But when I type this
ggplotly(p)
I get the following error
Error in [.data.frame(g, , c("fill_plotlyDomain", "fill")) :
undefined columns selected
The versions I use are
> packageVersion("plotly")
[1] ‘4.7.1’
> packageVersion("ggplot2")
[1] ‘2.2.1.9000’
##########Edited example for Arthur
p<-ggplot(df)+
geom_tile(aes(x=x,y=y,fill="G"),width=0.9,height=0.9)
p<- p+geom_tile(data=filter(df,z=="A"),aes(x=x,y=y,fill=z),width=0.9,height=0.9)
p<-p+ scale_fill_manual(
guide = guide_legend(title = "test",
override.aes = list(
fill =c("red","white") )
),
values = c("red","grey"),
labels=c("A",""))
p
This works,
but ggplotly(p) adds the grey bar labelled G to the legend.
The output of the ggplotly function is a list with the plotly class. It gets printed as Plotly graph but you can still work with it as a list. Moreover, the documentation indicates that modifying the list makes it possible to clear all or part of the legend. One only has to understand how the data is structured.
p<-ggplot(df)+
geom_tile(aes(x=x,y=y,fill=z),width=0.9,height=0.9)+
scale_fill_manual(values = c(L='grey', A='red'), na.value='grey')
p2 <- ggplotly(p)
str(p2)
The global legend is here in p2$x$layout$showlegend and setting this to false displays no legend at all.
The group-specific legend appears in each of the 9 p2$x$data elements, each time in another showlegend attribute. Only 3 of them are set to TRUE, corresponding to the 3 keys in the legend. The following loop thus clears all the undesired labels:
for(i in seq_along(p2$x$data)){
if(p2$x$data[[i]]$legendgroup!='A'){
p2$x$data[[i]]$showlegend <- FALSE
}
}
Voilà!
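The loop can also be written as a small function over the list of traces, which makes it easy to try on a mock structure without building a plot. This is a sketch in base R; hide_legend_except is a hypothetical name, and the mock list only imitates the legendgroup and showlegend fields described above. Unlike the loop, it also copes with traces that have no legendgroup at all.

```r
# Hypothetical helper: keep legend entries only for the given legend groups
hide_legend_except <- function(traces, keep) {
  lapply(traces, function(tr) {
    # a missing legendgroup gives logical(0), which isTRUE() treats as FALSE
    if (!isTRUE(tr$legendgroup %in% keep)) tr$showlegend <- FALSE
    tr
  })
}

# Mock traces imitating elements of p2$x$data
mock <- list(
  list(legendgroup = "A", showlegend = TRUE),
  list(legendgroup = "L", showlegend = TRUE),
  list(showlegend = TRUE)
)
out <- hide_legend_except(mock, keep = "A")
vapply(out, function(tr) tr$showlegend, logical(1))
```

On the real object the call would be p2$x$data <- hide_legend_except(p2$x$data, "A").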
This works here:
p <- ggplot(df)+
geom_tile(aes(x=x,y=y,fill=z),width=0.9,height=0.9)+
scale_fill_manual(values = c(L='grey', A='red'), na.value='grey')
ggplotly(p)
I guess your problem comes from the use of 2 different data sources, df and filter(df,z=="A"), with columns with the same name.
[Note this is not an Answer Yet]
(Putting for reference, as it is beyond the limits for comments.)
The problem is rather complicated.
I just finished debugging the code of plotly. It seems like it's occurring here.
I have opened an issue in GitHub
Here is the minimal code for the reproduction of the problem.
library(ggplot2)
set.seed(1503)
df <- data.frame(x = rep(1:3, 3),
y = rep(1:3, 3),
z = sample(c("A","B"), 9,replace = TRUE),
stringsAsFactors = F)
p1 <- ggplot(df)+
geom_tile(aes(x=x,y=y, fill="grey"), color = "black")
p2 <- ggplot(df)+
geom_tile(aes(x=x,y=y),fill="grey", color = "black")
class(plotly::ggplotly(p1))
#> [1] "plotly" "htmlwidget"
class(plotly::ggplotly(p2))
#> Error in `[.data.frame`(g, , c("fill_plotlyDomain", "fill")): undefined columns selected

ggplot plotly API messes with stacked bar graph

I am using the plotly library to get an interactive HTML graph from a plot I already generate with ggplot2, but with a stacked bar graph plotly doesn't work properly.
Here is my ggplot code :
if(file.exists(filename)) {
data = read.table(filename,sep=",",header=T)
} else {
stop(paste0("=== [E] Error : Couldn't find file : ",filename)) # stop here, since data would be undefined below
}
ReadChData <- data[data$Channel %in% c("R"),]
#head(ReadChData,10)
# calculate midpoints of bars (simplified using comment by @DWin); ddply needs library(plyr)
Data <- ddply(ReadChData, .(qos_level),
transform, pos = cumsum(AvgBandwidth) - (0.5 *AvgBandwidth)
)
# library(dplyr) ## If using dplyr...
# Data <- group_by(Data,Year) %>%
# mutate(pos = cumsum(Frequency) - (0.5 * Frequency))
# plot bars and add text
g <- ggplot(Data, aes(x = qos_level, y = AvgBandwidth)) +
scale_x_continuous(breaks = x_axis_break) +
geom_bar(aes(fill = MasterID), stat="identity", width=0.2) +
scale_colour_gradientn(colours = rainbow(7)) +
geom_text(aes(label = AvgBandwidth, y = pos), size = 3) +
theme_bw() +
ylab("Bandwidth (GB/s)") +
xlab("QoS Level") +
ggtitle("QoS Competing Stream")
png(paste0(opt$out,"/",GraphName,".png"),width=6*ppi, height=6*ppi, res=ppi)
print (g)
dev.off() # close the png device so the file is written
library(plotly)
p <- ggplotly(g)
# the libdir argument will be used to point to a common lib
htmlwidgets::saveWidget(as.widget(p), selfcontained=FALSE, paste0(opt$out,"/qos_competing_stream.html"))
and here is the HTML output from the plotly library:
http://pasteboard.co/2fHQfJwFu.jpg
Please help.
This is perhaps quite a bit late to answer. But for someone who might have the issue in future...
The geom_bar width parameter is not recognized by the ggplotly function.
Workaround:
A workaround (not a very good one) is to use the parameters colour="white", size = 1. This adds a white line around the bars, which gives the effect of white space between them.
You could try the following:
stat_summary(aes(fill = MasterID), geom="bar", colour="white", size = 1, fun.y = "sum", position = "stack")
Better solution :
Use the bargap parameter of the layout function. The code should be:
ggplotly(type='bar', ...) %>% layout(bargap = 3, autosize=T)
P.S. The code in the question is not executable; it throws an error because filename is missing.
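A runnable version of the bargap idea might look like the sketch below. It assumes ggplot2 and plotly are installed; the data frame is a hypothetical stand-in because the question's input file is not available. The plotly layout reference describes bargap as a fraction between 0 and 1, so the sketch uses 0.7 rather than 3. Whether the gap visibly changes the bars may depend on the plotly version, since ggplotly can set bar widths explicitly.

```r
library(ggplot2)
library(plotly)

# Hypothetical stand-in data; column names follow the question
d <- data.frame(
  qos_level    = rep(1:3, each = 2),
  MasterID     = rep(c("M0", "M1"), times = 3),
  AvgBandwidth = c(1.2, 0.8, 2.0, 1.5, 0.6, 0.9)
)

# Stacked bars, one stack per QoS level
g <- ggplot(d, aes(x = qos_level, y = AvgBandwidth, fill = MasterID)) +
  geom_bar(stat = "identity")

# Convert to plotly, then request a gap between neighbouring bars
p <- layout(ggplotly(g), bargap = 0.7)
```

Printing p in an interactive session shows the widget, and it can be written to HTML with htmlwidgets::saveWidget() as in the question.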