Adding prime symbol (') to ggplot2 axis label using expression()

I have the following code snippet that should (once complete) plot u_prime and v_prime on the log2(x + 1) scale using expression:
p <- ggplot(df_uv_prime, aes(x = u_prime, y = v_prime)) + geom_point(colour = "blue") +
labs(x = expression(log[2](u^{...} + 1)), y = expression(log[2](v^{...} + 1))) +
xlim(0, 1) +
ylim(0, 1)
However adding ' in place of ... doesn't work, as R expects a closing '.
Is there a base R solution without having to resort to any packages?

Based on your description of the desired outcome, one option is to enclose the single quote or backtick (not sure which one you're using) in double quotes and add 'connectors' (*) or 'spaces' (~) to either side, e.g. backtick with space on x axis, single quote with connector on y axis:
library(ggplot2)
ggplot(mtcars, aes(x = hp, y = disp)) +
geom_point(colour = "blue") +
labs(x = expression(log[2](u*"`"~ + 1)),
y = expression(log[2](v*"'"* + 1)))
Created on 2022-11-28 with reprex v2.0.2

Related

Creating graph on ggplot2 with facet_wrap using multiple separate variables

For my thesis I am using RStudio. I want to make a graph in ggplot2 with x = age (H2_lft) and y = IMT value (MeanIMT_alg). I want to plot several variables (cardiovascular risk factors) to see the relationship between a given risk factor (e.g. smoking (H2_Roken), gender (H1_geslacht) or ethnicity (H1_EtnTotaal)) and the IMT value at a given age.
First, I plotted multiple lines (each line representing a variable) in one graph, but I think this is a little too messy. What I actually want is multiple panels, each with x = age and y = IMT value, and a different variable in every panel.
I hope my explanation is clear enough and someone can help me :)
My first code (multiple lines in same plot) is:
t <- ggplot(data = Dataset, aes(x = H2_lft, y = MeanIMT_alg)) +
geom_smooth(se = FALSE, aes(group = H1_EtnTotaal, colour = H1_EtnTotaal)) +
geom_smooth(se = FALSE, aes(group = H2_Roken, colour = H2_Roken)) +
geom_smooth(se = FALSE, aes(group = H1_geslacht, colour = H1_geslacht)) +
stat_smooth(method = lm, se=FALSE) +
theme_classic()
t + labs(x = "Age (years)", y = "Mean IMT (mm)", title = "IMT", caption = "Figure 2: mean IMT", color = "cardiovascular risk factors", fill = "cardiovascular risk factors")
To get multiple panels I used facet_wrap. The problem, however, is that when I pass several variables to facet_wrap, R builds the panels from combinations of those variables, whereas I want the groups to be independent of each other. For example, I want a line for Moroccan ethnicity, a line for current smokers and a line for male participants. I do not want a panel for Moroccan women who currently smoke, or for Dutch men who never smoked. So I want the graph with all the lines, but split into several panels.
The code that I used to accomplish this is:
t <- ggplot(data = Dataset, aes(x = H2_lft, y = MeanIMT_alg)) +
geom_smooth(se = FALSE, aes(group = H1_EtnTotaal, colour = H1_EtnTotaal)) +
geom_smooth(se = FALSE, aes(group = H2_Roken, colour = H2_Roken)) +
geom_smooth(se = FALSE, aes(group = H1_geslacht, colour = H1_geslacht)) +
stat_smooth(method = lm, se=FALSE)+
facet_wrap(~H1_EtnTotaal + ~H2_Roken + ~H1_geslacht, scales = "free_y") +
theme_classic()
t + labs(x = "Age (years)", y = "Mean IMT (mm)", title = "IMT", caption = "Figure 2: mean IMT", color = "cardiovascular risk factors", fill = "cardiovascular risk factors")
I think it might be generally easier to reshape the data to a long format for plotting with ggplot2. If you want separate legends for each of the categories, you can use the {ggnewscale} package to do so. Is this (approximately) what you're looking for?
library(ggnewscale)
library(ggplot2)
# Dummy data
Dataset <- data.frame(
H2_lft = runif(100, 18, 90),
MeanIMT_alg = rnorm(100),
H1_EtnTotaal = sample(LETTERS[1:5], 100, replace = TRUE),
H2_Roken = sample(LETTERS[6:8], 100, replace = TRUE),
H1_geslacht = sample(c("M", "F"), 100, replace = TRUE)
)
# Reshape data to long format
new <- tidyr::pivot_longer(Dataset, c(H1_EtnTotaal, H2_Roken, H1_geslacht))
ggplot(new, aes(H2_lft, MeanIMT_alg, group = value)) +
geom_smooth(
data = ~ subset(.x, name == "H1_EtnTotaal"),
aes(colour = value),
se = FALSE
) +
scale_colour_discrete(name = "EtnTotaal") +
new_scale_colour() +
geom_smooth(
data = ~ subset(.x, name == "H1_geslacht"),
aes(colour = value),
se = FALSE
) +
scale_colour_discrete(name = "geslacht") +
new_scale_colour() +
geom_smooth(
data = ~ subset(.x, name == "H2_Roken"),
aes(colour = value),
se = FALSE
) +
scale_colour_discrete(name = "Roken") +
geom_smooth(
method = lm, se = FALSE,
aes(group = NULL)
) +
facet_wrap(~ name)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
#> `geom_smooth()` using formula 'y ~ x'
Created on 2022-11-04 by the reprex package (v2.0.0)

ggplot2 Expand the plot limits giving error

I have a df:
test<- data.frame (Metrics = c("PCT_PF_READS (%)" , "PCT_Q30_R1 (%)" , "PCT_Q30_R2 (%)"),
LowerLimit = c(80,80,80),
Percent = c(93.1,95.1,92.4)
)
> test
           Metrics LowerLimit Percent
1 PCT_PF_READS (%)         80    93.1
2   PCT_Q30_R1 (%)         80    95.1
3   PCT_Q30_R2 (%)         80    92.4
I am trying to plot this in ggplot2, but I want to specify the y-axis range.
If I do:
ggplot(data = test, aes(x = Metrics, y = Percent)) +
geom_bar(stat = "identity")
If I try to set the y-axis to start at 75, I get a blank plot:
ggplot(data = test, aes(x = Metrics, y = Percent)) +
geom_bar(stat = "identity") + scale_y_continuous(limits = c(75, 100))
with the message
Warning message:
Removed 3 rows containing missing values (geom_bar)
But the values are all within that range?
Does this answer your question?
library(tidyverse)
test<- data.frame (Metrics = c("PCT_PF_READS (%)" , "PCT_Q30_R1 (%)" , "PCT_Q30_R2 (%)"),
LowerLimit = c(80,80,80),
Percent = c(93.1,95.1,92.4)
)
# Starting plot:
ggplot(data = test, aes(x = Metrics, y = Percent)) +
geom_bar(stat = "identity")
# Bars are drawn from y = 0 up to the value, so if "limits" cuts off
# any part of a bar, the whole bar is removed.
# E.g. this removes the middle bar (Percent = 95.1)
ggplot(data = test, aes(x = Metrics, y = Percent)) +
geom_bar(stat = "identity") +
scale_y_continuous(limits = c(0,95))
#> Warning: Removed 1 rows containing missing values (position_stack).
# A better solution is to use "coord_cartesian()"
ggplot(data = test, aes(x = Metrics, y = Percent)) +
geom_bar(stat = "identity") +
coord_cartesian(ylim = c(75, 100))
# Although it's generally advised to keep the whole axis,
# as 'chopping off' the bottom can be misleading
# Another alternative is to write the percentages on the plot:
ggplot(data = test, aes(x = Metrics, y = Percent)) +
geom_bar(stat = "identity") +
geom_text(aes(label = paste0(Percent, "%")),
nudge_y = 2)
Created on 2022-10-19 by the reprex package (v2.0.1)

Start ggplot continuous axis with a squiggly line break? [duplicate]

I have a dataframe (dat) with two columns 1) Month and 2) Value. I would like to highlight that the x-axis is not continuous in my boxplot by interrupting the x-axis with two angled lines on the x-axis that are empty between the angled lines.
Example Data and Boxplot
library(ggplot2)
set.seed(321)
dat <- data.frame(matrix(ncol = 2, nrow = 18))
x <- c("Month", "Value")
colnames(dat) <- x
dat$Month <- rep(c(1,2,3,10,11,12),3)
dat$Value <- rnorm(18,20,2)
ggplot(data = dat, aes(x = factor(Month), y = Value)) +
geom_boxplot() +
labs(x = "Month") +
theme_bw() +
theme(panel.grid = element_blank(),
text = element_text(size = 16),
axis.text.x = element_text(size = 14, color = "black"),
axis.text.y = element_text(size = 14, color = "black"))
The ideal figure would look something like below. How can I make this discontinuous axis in ggplot?
You could make use of the extended axis guides in the ggh4x package. Alas, you won't easily be able to create the "separators" without a hack similar to the one suggested by user Zhiqiang Wang.
guide_axis_truncated accepts vectors to define the lower and upper truncation points. This also works for units, by the way; in that case you have to pass the vector inside the unit function (e.g., trunc_lower = unit(c(0, .45), "npc")).
library(ggplot2)
library(ggh4x)
set.seed(321)
dat <- data.frame(matrix(ncol = 2, nrow = 18))
x <- c("Month", "Value")
colnames(dat) <- x
dat$Month <- rep(c(1,2,3,10,11,12),3)
dat$Value <- rnorm(18,20,2)
# this is to make it slightly more programmatic
x1end <- 3.45
x2start <- 3.55
p <-
ggplot(data = dat, aes(x = factor(Month), y = Value)) +
geom_boxplot() +
labs(x = "Month") +
theme_classic() +
theme(axis.line = element_line(colour = "black"))
p +
guides(x = guide_axis_truncated(
trunc_lower = c(-Inf, x2start),
trunc_upper = c(x1end, Inf)
))
Created on 2021-11-01 by the reprex package (v2.0.1)
The code below takes user Zhiqiang Wang's hack a step further. You will see I am using simple trigonometry to calculate the segment coordinates. To make the angle actually look as it is defined in the function, you would need to use coord_equal.
# a simple function to help make the segments
add_separators <- function(x, y = 0, angle = 45, length = .1){
add_y <- length * sin(angle * pi/180)
add_x <- length * cos(angle * pi/180)
## making the list for your segments
myseg <- list(x = x - add_x, xend = x + add_x,
y = rep(y - add_y, length(x)), yend = rep(y + add_y, length(x)))
## this function returns an annotate layer with your segment coordinates
annotate("segment",
x = myseg$x, xend = myseg$xend,
y = myseg$y, yend = myseg$yend)
}
# you will need to set limits for correct positioning of your separators
# I chose 0.05 because this is the expand factor by default
y_sep <- min(dat$Value) -0.05*(min(dat$Value))
p +
guides(x = guide_axis_truncated(
trunc_lower = c(-Inf, x2start),
trunc_upper = c(x1end, Inf)
)) +
add_separators(x = c(x1end, x2start), y = y_sep, angle = 70) +
# you need to set expand to 0
scale_y_continuous(expand = c(0,0)) +
## to make the angle look like specified, you would need to use coord_equal()
coord_cartesian(clip = "off", ylim = c(y_sep, NA))
I think it is possible to get what you want. It may take some work.
Here is your graph:
library(ggplot2)
set.seed(321)
dat <- data.frame(matrix(ncol = 2, nrow = 18))
x <- c("Month", "Value")
colnames(dat) <- x
dat$Month <- rep(c(1,2,3,10,11,12),3)
dat$Value <- rnorm(18,20,2)
p <- ggplot(data = dat, aes(x = factor(Month), y = Value)) +
geom_boxplot() +
labs(x = "Month") +
theme_bw() +
theme(panel.grid = element_blank(),
text = element_text(size = 16),
axis.text.x = element_text(size = 14, color = "black"),
axis.text.y = element_text(size = 14, color = "black"))
Here is my effort:
p + annotate("segment", x = c(3.3, 3.5), xend = c(3.6, 3.8), y = c(14, 14), yend = c(15, 15))+
coord_cartesian(clip = "off", ylim = c(15, 25))
Get something like this:
If you want to go further, it may take several tries to get it right:
p + annotate("segment", x = c(3.3, 3.5), xend = c(3.6, 3.8), y = c(14, 14), yend = c(15, 15))+
annotate("segment", x = c(0, 3.65), xend = c(3.45, 7), y = c(14.55, 14.55), yend = c(14.55, 14.55)) +
coord_cartesian(clip = "off", ylim = c(15, 25)) +
theme_classic()+
theme(axis.line.x = element_blank())
Just replace axis with two new lines. This is a rough idea, it may take some time to make it perfect.
You could use facet_wrap. If you assign the first 3 months to one group, and the other months to another, then you can produce two plots that are side by side and use a single y axis.
It's not exactly what you want, but it will show the data effectively, and highlights the fact that the x axis is not continuous.
dat$group[dat$Month %in% c("1", "2", "3")] <- 1
dat$group[dat$Month %in% c("10", "11", "12")] <- 2
ggplot(data = dat, aes(x = factor(Month), y = Value)) +
geom_boxplot() +
labs(x = "Month") +
theme_bw() +
theme(panel.grid = element_blank(),
text = element_text(size = 16),
axis.text.x = element_text(size = 14, color = "black"),
axis.text.y = element_text(size = 14, color = "black")) +
facet_wrap(~group, scales = "free_x")
* Differences in the plot are likely due to using a different version of R, where set.seed gives a different result.

Add boxes with descriptive annotations to y-axis in ggplot2

I'm trying to add another label or description to my y-axis. I attached a picture for reference of what I'm trying to accomplish. I can't find anything that describes how to add additional elements to an axis. It's the "Good" and "Bad" boxes beside the y-axis that I'm trying to incorporate into my ggplot. Thanks!
One approach to achieve this is by using patchwork. You can set up the annotations of the y-axis as a second ggplot and glue it to your main plot using patchwork. Try this:
library(ggplot2)
library(patchwork)
library(dplyr)
p1 <- tibble(x = 1:10, y = 1:10) %>%
ggplot(aes(x, y)) +
geom_point() +
scale_y_reverse(breaks = seq(1, 10)) +
labs(y = NULL)
p2 <- tibble(ymin = c(0, 4), ymax = c(4, 10), fill = c("bad", "good")) %>%
ggplot() +
geom_rect(aes(xmin = 0, xmax = 1, ymin = ymin, ymax = ymax, fill = fill)) +
geom_text(aes(x = .5, y = (ymin + ymax) / 2, label = fill), angle = 90) +
scale_y_reverse(breaks = seq(1, 10), expand = expansion(mult = c(0, 0))) +
scale_x_continuous(breaks = c(0), expand = expansion(mult = c(0, 0))) +
guides(fill = "none") +
theme_void()
p2 + p1 + plot_layout(widths = c(1, 9))
Created on 2020-05-28 by the reprex package (v0.3.0)

Explain np.polyfit and np.polyval for a scatter plot

I have to make a scatter plot and a linear fit to my data. prediction_08.Dem_Adv and prediction_08.Dem_Win are two columns of data. I know that np.polyfit returns coefficients, but what is np.polyval doing here? I read the documentation, but the explanation is confusing. Can someone explain it to me clearly?
import numpy as np
import matplotlib.pyplot as plt
# prediction_08 is the asker's DataFrame (not shown here)
plt.plot(prediction_08.Dem_Adv, prediction_08.Dem_Win, 'o')
plt.xlabel("2008 Gallup Democrat Advantage")
plt.ylabel("2008 Election Democrat Win")
# Degree-1 fit: returns [slope, intercept]
fit = np.polyfit(prediction_08.Dem_Adv, prediction_08.Dem_Win, 1)
x = np.linspace(-40, 80, 10)
y = np.polyval(fit, x)
plt.plot(x, y)
print(fit)
np.polyval evaluates the polynomial whose coefficients you got from polyfit. For a linear fit the relationship is y = m*x + c, so np.polyval multiplies your x values by fit[0] (the slope m) and adds fit[1] (the intercept c).
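To see this for the linear case, here is a minimal sketch. The data is synthetic (points scattered around y = 2x + 1, made up purely for illustration, not the asker's prediction_08 data), and the names x_data, y_data, y_polyval and y_manual are my own.

```python
import numpy as np

# Synthetic data, assumed for illustration: noisy points around y = 2x + 1
rng = np.random.default_rng(0)
x_data = np.linspace(-40, 80, 25)
y_data = 2.0 * x_data + 1.0 + rng.normal(scale=0.5, size=x_data.size)

# A degree-1 fit returns [slope, intercept], highest power first
fit = np.polyfit(x_data, y_data, 1)

# Evaluate the fitted line on a new grid: once with polyval, once by hand
x = np.linspace(-40, 80, 10)
y_polyval = np.polyval(fit, x)
y_manual = fit[0] * x + fit[1]

print(np.allclose(y_polyval, y_manual))  # prints True
```

So in the question's code, y = np.polyval(fit, x) is just the fitted straight line sampled at the 10 points in x, which plt.plot then draws over the scatter.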
Polyval according to Docs:
N = len(p)
y = p[0]*x**(N-1) + p[1]*x**(N-2) + ... + p[N-2]*x + p[N-1]
If the relationship is y = ax**2 + bx + c,
fit = np.polyfit(x,y,2)
a = fit[0]
b = fit[1]
c = fit[2]
If you do not want to use the polyval function:
y = a*(x**2) + b*(x) + c
This will create the same output as polyval.
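If you want to check that claim for any degree, something like the following should work. It is only a sketch: manual_polyval is a hypothetical helper I wrote to mirror the formula quoted from the docs, and the data is an exact quadratic chosen for illustration.

```python
import numpy as np

# Assumed example data: exact samples of y = 3x^2 - 2x + 5
x_data = np.linspace(-5, 5, 21)
y_data = 3.0 * x_data**2 - 2.0 * x_data + 5.0

# Degree-2 fit returns [a, b, c], highest power first
p = np.polyfit(x_data, y_data, 2)

def manual_polyval(p, x):
    """Evaluate p[0]*x**(N-1) + p[1]*x**(N-2) + ... + p[N-1] term by term."""
    n = len(p)
    return sum(coef * x ** (n - 1 - i) for i, coef in enumerate(p))

x = np.linspace(-5, 5, 7)
y_manual = manual_polyval(p, x)
y_polyval = np.polyval(p, x)

print(np.allclose(y_manual, y_polyval))  # prints True
```

The only thing to remember is the ordering: the first coefficient belongs to the highest power of x.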