Problems with the X axis when making an animated line chart - ggplot2

I'm trying to animate a plot I have where the X axis is non-numeric. The plot itself looks great, but I get a few error messages trying to animate it using the transition_reveal function.
I've got a data set called df100m that tracks the times/speeds of 10 meter splits of the 100 meter dash for various Olympic runners. It looks like this.
splits   runners   times(s)   speed(mph)
10-20    Bolt_08   1.070      21.93
20-30    Bolt_08   0.910      24.58
84 more rows of different splits and runners omitted for space.
Plotting the average speed for this data set using stat_smooth looks great. I removed the reaction time (RT), the final time (TOTAL), and the starting 10m (Start-10), so that it only shows the numeric splits. Here is the code for the plot I have so far:
df100m %>%
  filter(!grepl("RT", splits)) %>%
  filter(!grepl("TOTAL", splits)) %>%
  filter(!grepl("Start-10", splits)) %>%
  ggplot(mapping = aes(x = splits, y = speed, col = runner, group = runner)) +
  stat_smooth(method = loess, se = F, fullrange = F) +
  theme(axis.text.x = element_text(angle = 90)) +
  theme(aspect.ratio = 3/7) +
  theme_solarized_2(light=F)
However, when I add + transition_reveal(~splits), I get the following error message:
Error in seq.default(range[1], range[2], length.out = nframes) :
'from' must be a finite number
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
Playing around with it, I sometimes also get the "invalid 'times' argument" error.
I know there are a few problems with the X axis (splits): it is character rather than numeric, and it also contains a dash (-). I've seen a few posts attempting to fix this error, but as a beginner I have been unable to apply them. Could someone point me in the right direction?

Using some minimal made-up data, one possible approach is to compute the smoothed lines before plotting, then base transition_reveal on the splits converted to integers (as splits_int).
library(tidyverse)
library(gganimate)
library(broom)
tribble(~splits, ~speed, ~runner,
  "10-20", 20.0, "A",
  "20-30", 21.0, "A",
  "30-40", 22.0, "A",
  "10-20", 19.0, "B",
  "20-30", 20.0, "B",
  "30-40", 21.0, "B"
) %>%
  mutate(splits_int = factor(splits) %>% as.integer()) %>%
  nest(data = -runner) %>%
  mutate(
    lm_model = map(data, ~loess(speed ~ splits_int, data = .x)),
    augmented = map(lm_model, augment) %>% map(select, .fitted)
  ) %>%
  unnest(c(augmented, data)) %>%
  ggplot(aes(splits, .fitted, col = runner, group = runner)) +
  geom_line() +
  transition_reveal(splits_int)
Created on 2022-12-10 with reprex v2.0.2
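The underlying reason for the error appears to be that transition_reveal needs a numeric (or date-time) variable to reveal along, which a character column such as splits does not provide. As a minimal base R sketch of the conversion step alone, using hypothetical labels that are not from the original data, the two options below turn split labels into numbers. The names splits_int and splits_start are illustrative.

```r
# Hypothetical split labels, deliberately out of order
splits <- c("20-30", "10-20", "90-100", "30-40")

# Option 1: integer codes from a factor (levels are sorted alphabetically)
splits_int <- as.integer(factor(splits))

# Option 2: parse the lower bound of each split as a number
splits_start <- as.numeric(sub("-.*$", "", splits))

print(data.frame(splits, splits_int, splits_start))
```

The factor codes depend on alphabetical level ordering, which happens to match the running order for labels from 10-20 to 90-100. Parsing the lower bound avoids relying on that.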

Related

Wilcox.test confidence intervals by group (to plot in ggplot)

Is there a way to carry out a wilcox.test by group, calculate confidence intervals, and then plot these results in ggplot?
My "data":
zero <- sample(0:0, 50, replace = TRUE)
small <- sample(1:5, 20, replace = TRUE)
medium <- sample(5:25, 15, replace = TRUE)
high <- sample(150:300, 5, replace = TRUE)
f <- function(x){
return(data.frame(ID=deparse(substitute(x)), value=x))
}
all <- bind_rows(f(zero), f(small), f(medium), f(high))
all <- as.data.frame(all[,-1])
names(all)[1] <- "value"
all$group <- c("a", "b", "c")
My attempt:
x <- ddply(all, .(group), function(x) {wilcox.test(all$value, conf.int=TRUE, conf.level=0.95)})
Error in list_to_dataframe(res, attr(.data, "split_labels"), .id, id_as_factor) :
Results must be all atomic, or all data frames
In addition: There were 12 warnings (use warnings() to see them)
I'd then like to plot the pseudo-medians with their respective confidence intervals, but I'm also not sure how to save the results for ggplot to work from.
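The ddply error arises because wilcox.test returns an htest list rather than a data frame, and the attempt also passes all$value (every row) instead of the current group's values. Below is a minimal base R sketch of one way around both issues, using made-up data rather than the data above. The names dat and res are illustrative.

```r
# Made-up data: three groups with different locations
set.seed(42)
dat <- data.frame(
  value = c(rnorm(12, mean = 5), rnorm(12, mean = 8), rnorm(12, mean = 11)),
  group = rep(c("a", "b", "c"), each = 12)
)

# Run a one-sample Wilcoxon test per group and keep only the
# pseudomedian and its confidence limits as a one-row data frame
res <- do.call(rbind, lapply(split(dat, dat$group), function(d) {
  wt <- wilcox.test(d$value, conf.int = TRUE, conf.level = 0.95)
  data.frame(
    group = d$group[1],
    pseudomedian = unname(wt$estimate),
    lower = wt$conf.int[1],
    upper = wt$conf.int[2]
  )
}))
print(res)
```

A data frame shaped like res could then be passed to ggplot, for example with geom_pointrange(aes(x = group, y = pseudomedian, ymin = lower, ymax = upper)).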

ggplot2: add title changes point colors <-> scale_color_manual removes ggtitle

I am facing a silly point color problem in a dot plot with ggplot2. I have a whole table of data from which I take the relevant rows to make a dot plot. With scale_color_manual my points get colored according to the named palette and the factor Geno specified in aes(), but when I simply want to add a title specifying the cell line used, the points revert to the automatic yellow and purple. Adding the title first and setting scale_color_manual as the last layer changes the point colors and removes the title.
What is wrong here? I don't get it, and it is a bit frustrating.
Thanks for your help!
Here's reproducible code to get my whole df and the subset for the plots:
# df of data to plot
exp <- c(rep(284, times = 6), rep(285, times = 12))
geno <- c(rep(rep(c("WT", "KO"), each =3), times = 6))
line <- c(rep(5, times = 6),rep(8, times= 12), rep(5, times =12), rep(8, times = 6))
ttt <- c(rep(c(0, 10, 60), times = 10), rep(c("ZAc60", "Cu60", "Cu200"), times = 2))
rep <- c(rep(1, times = 12), rep(2, times = 6), rep(c(1,2), times = 6), rep(1, times = 6))
rel_expr <- c(0.20688185, 0.21576131, 0.94046028, 0.30327675, 0.22865200,
0.92941881, 0.13787508, 0.13325281, 0.22114990, 0.95591724,
1.03239718, 0.83339248, 0.15332420, 0.17558160, 0.22475604,
1.02356351, 0.77882000, 0.69214403, 0.16874097, 0.15548158,
0.45207943, 0.28123760, 0.23500083, 0.51588856, 0.1399634,
0.14610184, 1.06716713, 0.16517801, 0.34736164, 0.64773650,
0.18334429, 0.05924757, 0.01803593, 0.86685230, 0.39554685,
0.25764805)
df_all <- data.frame(exp, geno, line, ttt, rep, rel_expr)
names(df_all) <- c("EXP", "Geno", "Line", "TTT", "Rep", "Rel_Expr")
str(df_all)
# make Geno an ordered factor
df_all$Geno <- ordered(df_all$Geno, levels = c("WT", "KO"))
# select set of whole dataset for current plot
df_ions <- df_all[df_all$Line == 8 & !df_all$TTT %in% c(10, 60),]
# add a treatment as factor columns fTTT
df_ions$fTTT <- ordered(df_ions$TTT, levels = c("0", "ZAc60", "Cu60", "Cu200"))
str(df_ions)
# plot rel_exp vs factor treatment, color points by geno
# with named color palette
library(ggplot2)
col_palette <- c("#000000", "#1356BC")
names(col_palette) <- c("WT", "KO")
plt <- ggplot(df_ions, aes(x = fTTT, y = Rel_Expr, color = Geno)) +
geom_jitter(width = 0.1)
plt # intermediate_plt_1.png
plt + scale_color_manual(values = col_palette) # intermediate_plt_2.png
plt + ggtitle("mRPTEC8") # final_plot.png

Barplot of percentages by groups in ggplot2

So, I've done my searches but cannot find the solution to this problem I have with a bar plot in ggplot.
I'm trying to make the bars show the percentage of the total number of cases in each group of grouping variable 2.
Right now I have it visualising the counts.
Dataframe = ASAP
Grouping variable 1 - cc_groups (seen at the top of the graph): counts the number of cases within a range (steps of 20) of a score from 0-100.
Grouping variable 2 - asap: a binary variable (intervention or control); the numbers of controls and interventions are not the same.
Initial code
``` r
ggplot(ASAP, aes(x = asap, fill = asap)) + geom_bar(position = "dodge") +
facet_grid(. ~ cc_groups) + scale_fill_manual(values = c("red",
"darkgray"))
#> Error in ggplot(ASAP, aes(x = asap, fill = asap)): could not find function "ggplot"
```
Created on 2020-05-19 by the reprex package (v0.3.0)
This gives me a graph that visualises the counts in each subgroup.
I have manually calculated the percentages that actually need to be visualised:
table_groups <- matrix(c(66/120,128/258,34/120,67/258,10/120,30/258,2/120,4/258,0,1/258,8/120,28/258),ncol = 2, byrow = T)
colnames(table_groups) <- c("ASAP","Control")
rownames(table_groups) <- c("0-10","20-39","40-59","60-79","80-99","100")
         ASAP     Control
0-10     0.55000  0.496124
20-39    0.28333  0.259690
40-59    0.08333  0.116279
60-79    0.01667  0.015504
80-99    0.00000  0.003876
100      0.06667  0.108527
When I use the solution provided by Stefan below (which was an excellent answer but didn't quite do the trick), I get the following output:
``` r
ASAP %>% count(cc_groups, asap) %>% group_by(cc_groups) %>% mutate(pct = n/sum(n)) %>%
ggplot(aes(x = asap, y = pct, fill = asap)) + geom_col(position = "dodge") +
facet_grid(~cc_groups) + scale_fill_manual(values = c("red",
"darkgray"))
#> Error in ASAP %>% count(cc_groups, asap) %>% group_by(cc_groups) %>% mutate(pct = n/sum(n)) %>% : could not find function "%>%"
```
Created on 2020-05-19 by the reprex package (https://reprex.tidyverse.org), v0.3.0
whereas (when I go analogue) I'd like it to show the percentages as above, as in my hand drawing.
I'm SO sorry about that drawing... :) Also, reprex kept feeding me errors; I'm sure I'm using it incorrectly.
The easiest way to achieve this is to aggregate the data before plotting, i.e. to compute the counts and percentages manually:
library(ggplot2)
library(dplyr)
ASAP %>%
  count(cc_groups, asap) %>%
  group_by(asap) %>%
  mutate(pct = n / sum(n)) %>%
  ggplot(aes(x = asap, y = pct, fill=asap)) +
  geom_col(position="dodge")+
  facet_grid(~cc_groups)+
  scale_fill_manual(values = c("red","darkgray"))
Using ggplot2::mpg as example data:
library(ggplot2)
library(dplyr)
# example data
mpg2 <- mpg %>%
  filter(cyl %in% c(4, 6)) %>%
  mutate(cyl = factor(cyl))
# Manually compute counts and percentages
mpg3 <- mpg2 %>%
  count(class, cyl) %>%
  group_by(class) %>%
  mutate(pct = n / sum(n))
# Plot
ggplot(mpg3, aes(x = cyl, y = pct, fill = cyl)) +
  geom_col(position = "dodge") +
  facet_grid(~ class) +
  scale_fill_manual(values = c("red","darkgray"))
Created on 2020-05-18 by the reprex package (v0.3.0)
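Which variable goes into group_by decides what the percentages add up to. As a small base R check, using the counts quoted in the question (the object names are illustrative), normalising within each asap arm reproduces the hand-calculated table, whereas normalising within each cc_groups range makes the two bars in every facet sum to 1:

```r
# Counts per score range (rows) and study arm (columns), from the question
counts <- matrix(
  c(66, 128,
    34,  67,
    10,  30,
     2,   4,
     0,   1,
     8,  28),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("0-10", "20-39", "40-59", "60-79", "80-99", "100"),
                  c("ASAP", "Control"))
)

# Share of each arm falling into each range (columns sum to 1)
within_arm <- prop.table(counts, margin = 2)

# Share of each range belonging to each arm (rows sum to 1)
within_range <- prop.table(counts, margin = 1)

print(round(within_arm, 4))
print(round(within_range, 4))
```

The first table corresponds to group_by(asap) in the dplyr pipeline, the second to grouping by the facetting variable.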

Adding percentage labels to a barplot with y-axis 'count' in R

I'd like to add percentage labels per gear to the bars but keep the count y-scale.
E.g. 10% of all 'gear 3' are '4 cyl'
library(ggplot2)
ds <- mtcars
ds$gear <- as.factor(ds$gear)
p1 <- ggplot(ds, aes(gear, fill=gear)) +
geom_bar() +
facet_grid(cols = vars(cyl), margins=T)
p1
Ideally only in ggplot, without adding dplyr or tidyr. I found some solutions of that kind, but then I run into other issues with my original data.
EDIT: It was suggested that this is a duplicate of another question.
I saw this also earlier, but wasn't able to integrate that code into what I want:
# i just copy paste some of the code bits and try to reconstruct what I had earlier
ggplot(ds, aes(gear, fill=gear)) +
facet_grid(cols = vars(cyl), margins=T) +
# ..prop.. meaning %, but i want to keep the y-axis as count
geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count") +
# not sure why, but I only get 100%
geom_text(aes( label = scales::percent(..prop..),
y= ..prop.. ), stat= "count", vjust = -.5)
The issue is that ggplot doesn't know that each facet is one group. This very useful tutorial offers a nice solution: just add aes(group = 1).
P.S. At the beginning, I was often quite reluctant, even afraid, to manipulate my data and pre-calculate data frames for plotting. But there is no need to fret! It is often much easier (and safer!) to first shape or aggregate your data into the right form and then plot or analyse the new data.
library(tidyverse)
library(scales)
ds <- mtcars
ds$gear <- as.factor(ds$gear)
First solution:
ggplot(ds, aes(gear, fill = gear)) +
geom_bar() +
facet_grid(cols = vars(cyl), margins = T) +
geom_text(aes(label = scales::percent(..prop..), group = 1), stat= "count")
Edit, in reply to a comment:
Showing percentages across facets is quite confusing to the reader of the figure, and I would probably recommend against such a visualization. You won't get around data manipulation here. The challenge here is to include your "facet margin". I create two summary data frames and bind them together.
ds_count <-
  ds %>%
  count(cyl, gear) %>%
  group_by(gear) %>%
  mutate(perc = n/sum(n)) %>%
  ungroup %>%
  mutate(cyl = as.character(cyl))
ds_all <-
  ds %>%
  count(cyl, gear) %>%
  group_by(gear) %>%
  summarise(n = sum(n)) %>%
  mutate(cyl = 'all', perc = 1)
ds_new <- bind_rows(ds_count, ds_all)
ggplot(ds_new, aes(gear, fill = gear)) +
  geom_col(aes(gear, n, fill = gear)) +
  facet_grid(cols = vars(cyl)) +
  geom_text(aes(label = scales::percent(perc)), stat= "count")
IMO, a better way would be to simply swap the x and facetting variables. Then you can use ggplot's summarising function as above.
ggplot(ds, aes(as.character(cyl), fill = gear)) +
geom_bar() +
facet_grid(cols = vars(gear), margins = T) +
geom_text(aes(label = scales::percent(..prop..), group = 1), stat= "count")
Created on 2020-02-07 by the reprex package (v0.3.0)
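If dplyr is to be avoided, the per-gear percentages can also be pre-computed in base R. The following is a sketch only; the column names n, perc and label are illustrative choices:

```r
ds <- mtcars

# Count cars per cyl/gear combination in long format
counts <- as.data.frame(table(cyl = ds$cyl, gear = ds$gear),
                        responseName = "n")

# Share of each cyl within its gear group
counts$perc <- ave(counts$n, counts$gear, FUN = function(x) x / sum(x))

# Text label for plotting, e.g. "6.7%"
counts$label <- sprintf("%.1f%%", 100 * counts$perc)

print(counts)
```

Since mtcars has 15 cars with 3 gears, of which one has 4 cylinders, that cell comes out at about 6.7%. The resulting data frame could then be drawn with geom_col(aes(gear, n)) and labelled with geom_text(aes(gear, n, label = label)), faceted by cyl.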

ggplot plotly API messes up stacked bar graph

I am using the plotly library to get an interactive HTML graph from a plot I already generate with ggplot2, but with a stacked bar graph plotly doesn't work properly.
Here is my ggplot code:
if(file.exists(filename)) {
data = read.table(filename,sep=",",header=T)
} else {
g <- paste0("=== [E] Error : Couldn't Found File : ",filename)
print (g)
}
ReadChData <- data[data$Channel %in% c("R"),]
#head(ReadChData,10)
# calculate midpoints of bars (simplified using comment by #DWin)
Data <- ddply(ReadChData, .(qos_level),
transform, pos = cumsum(AvgBandwidth) - (0.5 *AvgBandwidth)
)
# library(dplyr) ## If using dplyr...
# Data <- group_by(Data,Year) %>%
# mutate(pos = cumsum(Frequency) - (0.5 * Frequency))
# plot bars and add text
g <- ggplot(Data, aes(x = qos_level, y = AvgBandwidth)) +
scale_x_continuous(breaks = x_axis_break) +
geom_bar(aes(fill = MasterID), stat="identity", width=0.2) +
scale_colour_gradientn(colours = rainbow(7)) +
geom_text(aes(label = AvgBandwidth, y = pos), size = 3) +
theme_set(theme_bw()) +
ylab("Bandwidth (GB/s)") +
xlab("QoS Level") +
ggtitle("Qos Compting Stream")
png(paste0(opt$out,"/",GraphName,".png"),width=6*ppi, height=6*ppi, res=ppi)
print (g)
library(plotly)
p <- ggplotly(g)
# the libdir argument will be used to point to a common lib
htmlwidgets::saveWidget(as.widget(p), selfcontained=FALSE, paste0(opt$out,"/qos_competing_stream.html"))
And here is the HTML output from the plotly library:
http://pasteboard.co/2fHQfJwFu.jpg
Please help.
This is perhaps quite late to answer, but for anyone who might have the issue in future...
The width parameter of geom_bar is not recognized by the ggplotly function.
Workaround:
A workaround (not a very good one) is to use the parameters colour = "white", size = 1. This adds a white line around the bars, which gives the effect of white space.
You could try the following:
stat_summary(aes(fill = MasterID), geom="bar", colour="white", size = 1, fun.y = "sum", position = "stack")
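Better solution:
Use the bargap parameter of plotly's layout function. Note that plotly documents bargap as a fraction between 0 and 1, so the value below may need adjusting. The code should be: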
ggplotly(type='bar', ...) %>% layout(bargap = 3, autosize=T)
P.S. The code in the question is not executable; it throws an error because filename is missing.