ggplot2 - add manual legend to multiple layers

I have a ggplot in which the color of my geom_point layer depends on one of my columns (my treatment), and I use scale_color_manual to choose the colors.
The legend for that is created automatically.
The problem is that I also need to draw some vertical lines related to the experimental setup, which I am doing with geom_vline, but I don't know how to manually add a separate legend that states what those lines are without messing with the one I already have.
I have the following code
ggplot(dcons.summary, aes(x = meters, y = ymean, color = treatment, shape = treatment)) +
geom_point(size = 4) +
geom_errorbar(aes(ymin = ymin, ymax = ymax)) +
scale_color_manual(values=c("navy","seagreen3"))+
theme_classic() +
geom_vline(xintercept = c(0.23,3.23, 6.23,9.23), color= "bisque3", size=0.4) +
scale_x_continuous(limits = c(-5, 25)) +
labs(title= "Sediment erosion", subtitle= "-5 -> 25 meters; standard deviation; consistent measurements BESE & Control", x= "distance (meters)", y="erosion (cm)", color="Treatment", shape="Treatment")
So I just need an extra legend beneath the "Treatment" one that says "BESE PLOTS LOCATION" and refers to the gray lines.
I have been searching for a solution and have tried "scale_linetype_manual" and also "guides", but I'm not getting there.

As you provided no reproducible example, I used data from the mtcars dataset.
In addition, I modified this similar answer a little. Since you have already used color, and fill does not work for lines, you can map linetype as a second aesthetic inside aes, which then gets its own legend:
library(ggplot2)
library(dplyr)  # provides %>%
# the constant factor column only exists to create a legend key for the lines
xid <- data.frame(xintercept = c(15,20,30), lty=factor(1))
mtcars %>%
ggplot(aes(mpg ,cyl, col=factor(gear))) +
geom_point() +
geom_vline(data=xid, aes(xintercept=xintercept, lty=lty) , col = "red", size=0.4) +
scale_linetype_manual(values = 1, name="",labels="BESE PLOTS LOCATION")
Or without the second data.frame:
ggplot() +
geom_point(data = mtcars,aes(mpg ,cyl, col=factor(gear))) +
geom_vline(aes(xintercept=c(15,20,30), lty=factor(1) ), col = "red", size=0.4)+
scale_linetype_manual(values = 1, name="",labels="BESE PLOTS LOCATION")
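Applied to the layout in the question, the same idea might look like the sketch below. The data frame here is a made-up stand-in for dcons.summary (the real data were not posted), and the helper names plots and what are only illustrative. Color and shape are mapped inside the point and errorbar layers so that the vertical lines contribute only to the linetype legend.

```r
library(ggplot2)

# Hypothetical stand-in for dcons.summary; the real data were not posted
dcons.summary <- data.frame(
  meters    = rep(c(-2, 2, 8, 14), times = 2),
  treatment = rep(c("BESE", "Control"), each = 4),
  ymean     = c(1.0, 1.4, 0.8, 1.1, 1.6, 2.0, 1.7, 1.9)
)
dcons.summary$ymin <- dcons.summary$ymean - 0.2
dcons.summary$ymax <- dcons.summary$ymean + 0.2

# One row per line; the constant column exists only to create a legend key
plots <- data.frame(xintercept = c(0.23, 3.23, 6.23, 9.23),
                    what = "BESE PLOTS LOCATION")

p <- ggplot(dcons.summary, aes(x = meters, y = ymean)) +
  geom_vline(data = plots, aes(xintercept = xintercept, linetype = what),
             color = "bisque3") +
  geom_point(aes(color = treatment, shape = treatment), size = 4) +
  geom_errorbar(aes(ymin = ymin, ymax = ymax, color = treatment)) +
  scale_color_manual(values = c("navy", "seagreen3")) +
  # name = NULL drops the legend title, leaving only the label next to the key
  scale_linetype_manual(values = c("BESE PLOTS LOCATION" = "solid"), name = NULL) +
  scale_x_continuous(limits = c(-5, 25)) +
  theme_classic() +
  labs(x = "distance (meters)", y = "erosion (cm)",
       color = "Treatment", shape = "Treatment")
p
```

Because the line color is set outside aes, it stays fixed and does not interfere with the manual color scale used for the treatments.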

Related

Colouring line of best fit by shape whilst accounting for a separate coloured variable in ggplot?

I have a technical issue with my attempts to plot group differences while accounting for 3 variables. This all works fine until I try to plot the line of best fit for each group, which results in a plot where it is difficult to distinguish between groups (as seen below).
ggplot(iris, aes(x=Sepal.Length, y=Sepal.Width, color = Petal.Length, shape = Species)) + geom_point() +
scale_color_viridis_c() +
geom_smooth(method = "lm", se = FALSE, show.legend = TRUE)
I would like to assign a manual discrete colour to each best fit line, so that readers can distinguish between groups more easily (for example, a red line for setosa, a white line for versicolor and a black line for virginica). Below are the examples of what I have tried so far, with their associated error messages.
ggplot(iris, aes(x=Sepal.Length, y=Sepal.Width, color = Petal.Length, shape = Species)) + geom_point() +
scale_color_viridis_c() +
geom_smooth(method = "lm", se = FALSE, show.legend = TRUE, aes(color = Species))
"Error: Discrete value supplied to continuous scale"
ggplot(iris, aes(x=Sepal.Length, y=Sepal.Width, color = Petal.Length, shape = Species)) + geom_point() +
scale_color_viridis_c() +
geom_smooth(method = "lm", se = FALSE, show.legend = TRUE , aes(color = Species)) +
scale_color_discrete()
"Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
geom_smooth() using formula 'y ~ x'
Error: Continuous value supplied to discrete scale"
Any recommendations on how to manually assign a colour to each line (whilst leaving the scatter plot colours unchanged) would be very appreciated.
Many thanks in advance,
Rhys
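One way this might be approached without extra packages is sketched below. A plot has only one colour scale, so the sketch moves the continuous variable to the fill aesthetic, which requires point shapes that have a fill (21 to 25), and leaves colour free for the discrete line colours. The specific shape codes are an assumption chosen for illustration.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  # fill carries the continuous variable; shapes 21-25 are the fillable ones
  geom_point(aes(fill = Petal.Length, shape = Species), size = 2) +
  scale_shape_manual(values = c(setosa = 21, versicolor = 22, virginica = 24)) +
  scale_fill_viridis_c() +
  # colour is now unused by the points, so it can be discrete for the lines
  geom_smooth(aes(colour = Species), method = "lm", formula = y ~ x, se = FALSE) +
  scale_colour_manual(values = c(setosa = "red", versicolor = "white",
                                 virginica = "black"))
p
```

The point outlines keep their default colour here, so only the interior of each marker reflects Petal.Length.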

How can I make the line appear besides the dots?

I tried to plot my data but I only get the points; when I add "linetype" with geom_line, the line does not appear. My data set also has other columns, called SD, SD.1 and SD.2, which are standard deviation values I calculated previously and which currently appear at the bottom. I would like to remove them from the plot and use them as error bars on the lines instead.
library(tidyr)
long_data <- tidyr::pivot_longer(
data=OD,
cols=-Days,
names_to="Strain",
values_to="OD")
ggplot(long_data, aes(x=Days, y=OD, color=Strain)) +
geom_line() + geom_point(shape=16, size=1.5) +
scale_color_manual(values=c("Wildtype"="darkorange2", "Winter"="cadetblue3", "Flagella_less"="olivedrab3"))+
labs(title="Growth curve",x="Days",y="OD750",color="Legend")+
theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5,color="black",size=8),
axis.text.y=element_text(angle=0,hjust=1,vjust=0.5,color="black",size=8),
plot.title=element_text(hjust=0.5, size=13,face = "bold",margin = margin(t=0, r=10,b=10,l=10)),
axis.title.y =element_text(size=10, margin=margin(t=0,r=10,b=0,l=0)),
axis.title.x =element_text(size=10, margin=margin(t=10,r=10,b=0,l=0)),
axis.line = element_line(size = 0.5, linetype = "solid",colour = "black"))
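A possible way to handle the SD columns is sketched below: reshape the means and the standard deviations separately, join them, and pass the result to geom_errorbar. The OD data frame here is invented for illustration, and the pairing of SD, SD.1 and SD.2 with Wildtype, Winter and Flagella_less (in that order) is an assumption that should be checked against the real data. Mapping group = Strain explicitly is one way to make sure geom_line knows which points to connect.

```r
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for OD; the real data were not posted
OD <- data.frame(
  Days          = c(0, 2, 4),
  Wildtype      = c(0.10, 0.40, 0.90),
  SD            = c(0.01, 0.03, 0.05),
  Winter        = c(0.10, 0.30, 0.70),
  SD.1          = c(0.02, 0.02, 0.04),
  Flagella_less = c(0.10, 0.20, 0.50),
  SD.2          = c(0.01, 0.02, 0.03)
)
strains <- c("Wildtype", "Winter", "Flagella_less")

# Long table of means, one row per day and strain
means <- pivot_longer(OD[, c("Days", strains)],
                      cols = -Days, names_to = "Strain", values_to = "OD")

# Long table of SDs; ASSUMPTION: SD, SD.1, SD.2 follow the strain order above
sds <- OD[, c("Days", "SD", "SD.1", "SD.2")]
names(sds) <- c("Days", strains)
sds <- pivot_longer(sds, cols = -Days, names_to = "Strain", values_to = "SD")

long_data <- merge(means, sds, by = c("Days", "Strain"))

p <- ggplot(long_data, aes(x = Days, y = OD, color = Strain, group = Strain)) +
  geom_line() +
  geom_point(shape = 16, size = 1.5) +
  geom_errorbar(aes(ymin = OD - SD, ymax = OD + SD), width = 0.2)
p
```

With the SD columns kept out of the Strain names, they no longer show up as series of their own.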

Adding numeric label to geom_hline in ggplot2

I have produced the graph pictured using the following code -
ggboxplot(xray50g, x = "SupplyingSite", y = "PercentPopAff",
fill = "SupplyingSite", legend = "none") +
geom_point() +
rotate_x_text(angle = 45) +
# ADD HORIZONTAL LINE AT BASE MEAN
geom_hline(yintercept = mean(xray50g$PercentPopAff), linetype = 2)
What I would like to do is label the horizontal geom_hline with its numeric value so that it appears on the y axis.
I have provided an example of what I would like to achieve in the second image.
Could somebody please help with the code to achieve this for my plot?
Thanks!
There's a really great answer that should help you out posted here. As long as you are okay with formatting the "extra tick" to match the existing axis, the easiest solution is to just create your axis breaks manually and specify within scale_y_continuous. See below where I use an example to label a vertical dotted line on the x-axis using this method.
df <- data.frame(x=rnorm(1000, mean = 0.5))
ggplot(df, aes(x)) +
geom_histogram(binwidth = 0.1) +
geom_vline(xintercept = 0.5, linetype=2) +
scale_x_continuous(breaks=c(seq(from=-4,to=4,by=2), 0.5))
Again, for other methods, including those where you want the extra tick mark formatted differently than the rest of the axis, check the top answer here.
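The same idea carried over to a horizontal line and the y-axis could look like the sketch below. The data frame and its column names (site, pct) are made up to stand in for xray50g, and plain geom_boxplot is used instead of ggboxplot to keep the example dependent on ggplot2 only.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for xray50g
df <- data.frame(site = rep(c("A", "B", "C"), each = 20),
                 pct  = runif(60, min = 0, max = 10))
base_mean <- mean(df$pct)

p <- ggplot(df, aes(x = site, y = pct)) +
  geom_boxplot() +
  geom_hline(yintercept = base_mean, linetype = 2) +
  # regular breaks plus one extra break at the mean, rounded for display
  scale_y_continuous(breaks = c(seq(0, 10, by = 2.5), base_mean),
                     labels = function(b) as.character(round(b, 2)))
p
```

If the mean falls close to one of the regular breaks, the two labels may overlap, in which case the regular break can simply be dropped from the vector.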

Why is there a space between the bars and the axis in ggplot2 bar graphs, and how do I get rid of it?

I've been building a bar graph in R, and I noticed a problem. Whenever the graph is made, there is a very small gap between the bars and the axis, which lets a line of the background image show through (Link). How can I get rid of this?
Code:
album_cover <- image_read("https://i.scdn.co/image/ab67616d0000b273922a12ba0b5a66f034dc9959")
ggplot(data=album_df, aes(x=rev(factor(track_names, track_names)), y=-1 * track_length)) +
ggtitle("Songs vs length")+
annotation_custom(rasterGrob(album_cover,
width = unit(1,"npc"),
height = unit(1,"npc")),
-Inf, Inf, -Inf, Inf)+
#geom_image(image = "https://i.scdn.co/image/ab67616d0000b273922a12ba0b5a66f034dc9959", size = Inf) +
geom_bar(stat="identity", position = "identity", color = 'NA', alpha = 0.9, width = 1, fill = 'white') +
scale_y_continuous(expand = c(0, 0), limits = c(-1 * max_track, 0)) +
scale_x_discrete(expand = c(0, 0)) +
theme(axis.title.x=element_blank(),
axis.title.y=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank()
) +
coord_flip()
Interesting issue. I've tried many things, including modification of many of the theme elements. It works with theme_void(), but then the issue resurfaces as you add back in the plot elements (namely the song titles on the axis, for some reason).
What finally did work is squishing the image width to be ever so slightly less than 1 npc. In this case, just changing the width from 1 to 0.999 fixes the issue, and you no longer have the strip of the image hanging out on the right. For this, I made up my own data, but I'm using the same image:
df <- data.frame(
track_names=paste0('Song',1:8),
track_length=c(3.5,7.5,5,3,7,10,6,7.4)
)
album_cover <- image_read2("https://i.scdn.co/image/ab67616d0000b273922a12ba0b5a66f034dc9959")
ggplot(data=df, aes(x=track_names, y=-1*track_length)) +
annotation_custom(rasterGrob(album_cover,
width=unit(0.999,'npc'), height=unit(1,'npc')),
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
geom_col(alpha=0.9, width=1, fill='white', color=NA) +
scale_y_continuous(expand=c(0,0)) +
scale_x_discrete(expand=c(0,0)) +
ggtitle('Songs vs Length') +
coord_flip()
Note that the same code with width=unit(1, 'npc') in the rasterGrob() call still leaves a line of the image visible at the right side of the plot.

Connect observations (dots and lines) without using ggpaired

I created a bar chart using geom_bar with "Group" on the x-axis (Female, Male) and "Values" on the y-axis. Group is further subdivided into "Session", such that there is a "Session 1" and a "Session 2" for both Male and Female (i.e. four bars in total).
Since all participants took part in Sessions 1 and 2, I overlaid a dotplot (geom_dotplot) on each of the four bars to represent the individual data.
I am now trying to connect the observations for each participant ("PID") between Session 1 and 2. In other words, there should be lines connecting several pairs of points (one pair per participant) on the "Male" portion of the x-axis, and likewise on the "Female" portion.
I tried this with "geom_line" (below) but to no avail: instead, it created a single vertical line in the middle of "Male" and another in the middle of "Female". I'm not sure how to fix this.
See code below:
ggplot(data_foo, aes(x=factor(Group),y=Values, colour = factor(Session), fill = factor(Session))) +
geom_bar(stat = "summary", fun.y = "mean", position = "dodge") +
geom_dotplot(binaxis = "y", stackdir = "center", dotsize = 1.0, position = "dodge", fill = "black") +
geom_line(aes(group = PID), colour="dark grey") +
labs(title='My Data',x='Group',y='Values') +
theme_light()
Sample data (.txt)
data_foo <- readr::read_csv("PID,Group,Session,Values
P1,F,1,14
P2,F,1,13
P3,F,1,16
P4,M,1,18
P5,F,1,20
P6,M,1,27
P7,M,1,19
P8,M,1,11
P9,F,1,28
P10,F,1,20
P11,F,1,24
P12,M,1,10
P1,F,2,26
P2,F,2,21
P3,F,2,19
P4,M,2,13
P5,F,2,26
P6,M,2,15
P7,M,2,23
P8,M,2,23
P9,F,2,30
P10,F,2,21
P11,F,2,11
P12,M,2,19")
The trouble is that you want to dodge by several groups: your geom_line does not know how to split the Group variable by Session. Here are two ways to address this problem. Method 1 is probably the most "ggplot-y" way, and a neat way of adding another grouping without overcrowding the visualisation. For method 2 you need to change your x variable.
1) Use facet
2) Use interaction to split session for each Group. Define levels for the right bar order
I have also used geom_point instead, because geom_dotplot is really a specific type of histogram.
I would generally recommend boxplots for plots of values like these, because bars are more appropriate for specific measures such as counts.
Method 1: Facets
library(ggplot2)
ggplot(data_foo, aes(x = Session, y = Values, fill = as.character(Session))) +
geom_bar(stat = "summary", fun = "mean", position = "dodge") + # `fun` replaces `fun.y`, deprecated in ggplot2 3.3.0
geom_line(aes(group = PID)) +
geom_point(aes(group = PID), shape = 21, color = 'black') +
facet_wrap(~Group)
Created on 2020-01-20 by the reprex package (v0.3.0)
Method 2: Create an interaction term in your x variable. Note that you need to order the factor levels manually.
library(dplyr)  # provides %>% and mutate()
data_foo <- data_foo %>% mutate(new_x = factor(interaction(Group,Session), levels = c('F.1','F.2','M.1','M.2')))
ggplot(data_foo, aes(x = new_x, y = Values, fill = as.character(Session))) +
geom_bar(stat = "summary", fun = "mean", position = "dodge") + # `fun` replaces `fun.y`, deprecated in ggplot2 3.3.0
geom_line(aes(group = PID)) +
geom_point(aes(group = PID), shape = 21, color = 'black')
But neither result is very compelling visually.
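The boxplot recommendation above, combined with the facets from Method 1, could be sketched roughly as follows. The demo data frame is a four-participant subset of the sample data from the question, so the plot is only illustrative.

```r
library(ggplot2)

# Subset of the question's sample data (P1, P2, P4, P6), for illustration only
demo <- data.frame(
  PID     = rep(c("P1", "P2", "P4", "P6"), times = 2),
  Group   = rep(c("F", "F", "M", "M"), times = 2),
  Session = rep(1:2, each = 4),
  Values  = c(14, 13, 18, 27, 26, 21, 13, 15)
)

p <- ggplot(demo, aes(x = factor(Session), y = Values)) +
  # hide boxplot outliers because every observation is drawn as a point anyway
  geom_boxplot(aes(fill = factor(Session)), outlier.shape = NA) +
  # one line per participant, connecting Session 1 to Session 2
  geom_line(aes(group = PID), colour = "darkgrey") +
  geom_point() +
  facet_wrap(~Group) +
  labs(x = "Session", fill = "Session")
p
```

Because x is the session itself and Group is handled by the facets, no dodging is involved and the lines connect the right pairs of points.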
I suggest a few visualization tweaks to make the chart more informative. For example, giving each PID its own color helps track how each participant changes across the levels of the other variables. Something like:
library(ggplot2)
ggplot(data_foo, aes(x = factor(Session), y = Values, fill = factor(Session))) +
geom_bar(stat = "summary", fun = "mean", position = "dodge") + # `fun` replaces `fun.y`, deprecated in ggplot2 3.3.0
geom_line(aes(group = factor(PID), colour=factor(PID)), size=2, alpha=0.7) +
geom_point(aes(group = factor(PID), colour=factor(PID)), shape = 21, size=2,show.legend = F) +
theme_bw() +
labs(x='Session',fill='Session',colour='PID')+
theme(legend.position="right") +
facet_wrap(~Group)+
scale_colour_discrete(breaks=paste0('P',1:12))
And we have the following plot:
Hope it helps.