How to outline a histogram with a color and add a bell curve in ggplot2

I have been trying to add a bell curve to my histogram and outline it with a color so that it looks more pleasing.
I have added an image of what my histogram looks like to give an idea of what I am working with. Here is my code so far; thank you in advance.
ggplot(data = mammal.data.22.select2)+
geom_histogram(aes(x=Time, fill=Species))+
scale_fill_manual(values=c("paleturquoise4", "turquoise2"))+
facet_wrap(~Species, nrow=1)+
ylab("Observations")+
xlab("Time of Day")+
theme(strip.text.x = element_blank())

Let's build a histogram with a built-in dataset that seems similar-ish to your data structure.
library(ggplot2)
binwidth <- 0.25
p <- ggplot(iris, aes(Petal.Length)) +
geom_histogram(
aes(fill = Species),
binwidth = binwidth,
alpha = 0.5
) +
facet_wrap(~ Species)
You can use stat_bin() + geom_step() to give an outline to the histogram, without colouring the edge of every rectangle in the histogram. The only downside is that the first and last bins don't touch the x-axis.
p + stat_bin(
geom = "step", direction = "mid",
aes(colour = Species), binwidth = binwidth
)
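For comparison, the simpler route that the step outline avoids is to set a constant colour on geom_histogram() itself, which draws an edge around every rectangle. This is only a minimal sketch on the same iris example; the object name p_outlined and the choice of "black" are illustrative assumptions.

```r
library(ggplot2)

# Constant `colour` (outside aes) outlines every bar of the histogram,
# while `fill` is still mapped to Species.
p_outlined <- ggplot(iris, aes(Petal.Length)) +
  geom_histogram(
    aes(fill = Species),
    binwidth = 0.25,
    colour = "black",
    alpha = 0.5
  ) +
  facet_wrap(~ Species)

p_outlined
```

Whether the per-bar edges or the single step outline looks better is a matter of taste; the step version keeps the plot less busy when there are many bins.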
To overlay a density function on a histogram, you could calculate the relevant parameters yourself and use stat_function() with fun = dnorm, once for each group. Alternatively, you can use ggh4x::stat_theodensity() to achieve a similar thing. Note that whether you use stat_function() or stat_theodensity(), you should scale the density back to the counts that your histogram uses (or scale the histogram to density). In the example below, we do that by using after_stat(count * binwidth).
p + ggh4x::stat_theodensity(
aes(colour = Species,
y = after_stat(count * binwidth))
)
Created on 2022-04-15 by the reprex package (v2.0.1)
(disclaimer: I'm the author of ggh4x)
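The do-it-yourself route mentioned above (computing the parameters and passing dnorm to stat_function()) might look roughly like the sketch below for a single species. The names setosa, mu, sigma, n and scaled_dnorm are made up for this illustration, and the normal distribution is simply assumed to be a reasonable fit.

```r
library(ggplot2)

binwidth <- 0.25
setosa <- subset(iris, Species == "setosa")

# Parameters of the normal curve, estimated from the data
mu <- mean(setosa$Petal.Length)
sigma <- sd(setosa$Petal.Length)
n <- nrow(setosa)

# A density integrates to 1, the histogram bars to n * binwidth,
# so the density is multiplied by n * binwidth to match the counts.
scaled_dnorm <- function(x) {
  dnorm(x, mean = mu, sd = sigma) * n * binwidth
}

p_norm <- ggplot(setosa, aes(Petal.Length)) +
  geom_histogram(binwidth = binwidth, alpha = 0.5) +
  stat_function(fun = scaled_dnorm, colour = "red")

p_norm
```

With several facets this would have to be repeated per group, because stat_function() draws the same curve in every panel; that is the bookkeeping stat_theodensity() takes care of.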

Related

Multiple Relative frequency histogram in R, ggplot

I'm trying to plot multiple histograms of relative frequencies in R with ggplot.
Below are some basic examples with the built-in iris dataset. The relative part is obtained by multiplying the density by the bin width.
library(ggplot2)
ggplot(iris, aes(Sepal.Length, fill = Species)) +
geom_histogram(aes(y = after_stat(density * width)),
position = "identity", alpha = 0.5)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
ggplot(iris, aes(Sepal.Length)) +
geom_histogram(aes(y = after_stat(density * width))) +
facet_wrap(~ Species)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Created on 2022-03-07 by the reprex package (v2.0.1)
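To see why density * width gives relative frequencies, one can inspect the values ggplot2 computes for the bars: within each group they should add up to 1. A small sketch of that check, assuming a fixed binwidth of 0.25 (chosen here only to silence the bins message) and the illustrative names p_rel, bars and totals:

```r
library(ggplot2)

p_rel <- ggplot(iris, aes(Sepal.Length, fill = Species)) +
  geom_histogram(
    aes(y = after_stat(density * width)),
    binwidth = 0.25, position = "identity", alpha = 0.5
  )

# layer_data() returns the computed bar heights; summing them per
# group (one group per species) shows each histogram totals 1.
bars <- layer_data(p_rel)
totals <- tapply(bars$y, bars$group, sum)
totals
```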

Percentage labels in pie chart with ggplot

I'm working now in a statistics project and recently started with R. I have some problems with the visualization. I found a lot of different tutorials about how to add percentage labels in pie charts, but after one hour of trying I still don't get it. Maybe something is different with my data frame so that this doesn't work?
It's a data frame with collected survey answers, so I'm not allowed to publish them here. The column in question (geschäftliche_lage) is a factor with three levels ("Gut", "Befriedigend", "Schlecht"). I want to add percentage labels for each level.
I used the following code in order to create the pie chart:
dataset %>%
ggplot(aes(x= "", fill = geschäftliche_lage)) +
geom_bar(stat= "count", width = 1, color = "white") +
coord_polar("y", start = 0, direction = -1) +
scale_fill_manual(values = c("#00BA38", "#619CFF", "#F8766D")) +
theme_void()
This code gives me the desired pie chart, but without percentage labels. As soon as I try to add percentage labels, everything gets messed up. Do you know a clean way to add percentage labels?
If you need more information or data, just let me know!
Greetings
Using mtcars as example data. Maybe this is what you are looking for:
library(ggplot2)
ggplot(mtcars, aes(x = "", fill = factor(cyl))) +
geom_bar(stat= "count", width = 1, color = "white") +
geom_text(aes(label = after_stat(scales::percent(count / sum(count)))), stat = "count", position = position_stack(vjust = .5)) +
coord_polar("y", start = 0, direction = -1) +
scale_fill_manual(values = c("#00BA38", "#619CFF", "#F8766D")) +
theme_void()
Created on 2020-05-25 by the reprex package (v0.3.0)
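If computing the labels inside the plot feels opaque, an alternative is to tabulate the shares first and plot the summary with geom_col(). This is a sketch under the assumption that a small summary data frame is acceptable; counts, share, label and p_pie are names invented for the example.

```r
library(ggplot2)

# One row per level with its count, share and a formatted label
counts <- as.data.frame(table(cyl = mtcars$cyl))
counts$share <- counts$Freq / sum(counts$Freq)
counts$label <- scales::percent(counts$share, accuracy = 1)

p_pie <- ggplot(counts, aes(x = "", y = share, fill = cyl)) +
  geom_col(width = 1, colour = "white") +
  # vjust = 0.5 centres each label within its slice
  geom_text(aes(label = label), position = position_stack(vjust = 0.5)) +
  coord_polar("y", start = 0, direction = -1) +
  theme_void()

p_pie
```

The summary table also makes it easy to check the percentages before they end up on the chart.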

Adding numeric label to geom_hline in ggplot2

I have produced the graph pictured using the following code -
ggboxplot(xray50g, x = "SupplyingSite", y = "PercentPopAff",
fill = "SupplyingSite", legend = "none") +
geom_point() +
rotate_x_text(angle = 45) +
# ADD HORIZONTAL LINE AT BASE MEAN
geom_hline(yintercept = mean(xray50g$PercentPopAff), linetype = 2)
What I would like to do is label the horizontal geom_hline with its numeric value so that it appears on the y axis.
I have provided an example of what I would like to achieve in the second image.
Could somebody please help with the code to achieve this for my plot?
Thanks!
There's a really great answer that should help you out posted here. As long as you are okay with formatting the "extra tick" to match the existing axis, the easiest solution is to just create your axis breaks manually and specify within scale_y_continuous. See below where I use an example to label a vertical dotted line on the x-axis using this method.
df <- data.frame(x=rnorm(1000, mean = 0.5))
ggplot(df, aes(x)) +
geom_histogram(binwidth = 0.1) +
geom_vline(xintercept = 0.5, linetype=2) +
scale_x_continuous(breaks=c(seq(from=-4,to=4,by=2), 0.5))
Again, for other methods, including those where you want the extra tick mark formatted differently than the rest of the axis, check the top answer here.
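The same idea carries over to a horizontal line on the y-axis, which is closer to the plot in the question. The sketch below uses mtcars and a plain geom_boxplot() as stand-ins for the asker's data and ggboxplot(); mean_mpg and regular_breaks are illustrative names, and the break positions are an arbitrary choice.

```r
library(ggplot2)

mean_mpg <- mean(mtcars$mpg)
regular_breaks <- seq(15, 30, by = 5)

p_hline <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_boxplot() +
  geom_hline(yintercept = mean_mpg, linetype = 2) +
  # Add the mean as one extra break and give it a rounded label
  scale_y_continuous(
    breaks = c(regular_breaks, mean_mpg),
    labels = c(as.character(regular_breaks),
               format(round(mean_mpg, 1), nsmall = 1))
  )

p_hline
```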

Visualising an individual 2d graph for all points on a plane

I have an M vs N curve (let's take it to be a sigmoid, for ease of understanding) for a given value of parameters P and Q. I need to visualise the M vs N curves for a range of values of P and Q (assume values between 0 and 1 in steps of 0.1, i.e. 0.1, 0.2, ..., 0.9 for both P and Q).
The only solution that I've found for this problem is a Trellis plot (essentially a matrix of plots). I'd like to know if there is any other method to visualise this sort of 4d(?) relationship besides Trellis plots. Thanks.
I'm not sure I understand what you're hoping for, so let me know if this is on the right track. Below are three examples using R.
The first is indeed a matrix of plots where each panel represents a different value of q and, within each panel, each curve represents a different value of p. The second is a 3D plot which looks at a surface based on three of the variables with the fourth fixed. The third is a Shiny app that creates the same interactive plot as in the second example but also provides a slider that allows you to change p and see how the plot changes. Unfortunately, I'm not sure how to embed the interactive plots in Stack Overflow, so I've just provided the code.
I'm not sure if there's an elegant way to look at all four variables at the same time, but maybe someone will come along with additional options.
Matrix of plots for various values of p and q
library(tidyverse)
theme_set(theme_classic())
# Function to plot
my_fun = function(x, p, q) {
1/(1 + exp(p + q*x))
}
# Parameters
params = expand.grid(p=seq(-2,2,length=6), q=seq(-1,1,length=11))
# x-values to feed to my_fun
x = seq(-10,10,0.1)
# Generate data frame for plotting
dat = map2_df(params$p, params$q, function(p, q) {
data.frame(p=p, q=q, x, y=my_fun(x, p, q))
})
ggplot(dat, aes(x,y,colour=p, group=p)) +
geom_line() +
facet_grid(. ~ q, labeller=label_both) +
labs(colour="p") +
scale_colour_gradient(low="red", high="blue") +
theme(legend.position="bottom")
3D plot with one variable fixed
The code below will produce an interactive 3D plot that you can zoom and rotate. I've fixed the value of p and drawn a plot of the y surface for a grid of x and q values.
library(rgl)
x = seq(-10,10,0.1)
q = seq(-1,1,0.01)
y = outer(x, q, function(a, b) 1/(1 + exp(1 + b*a)))
persp3d(x, q, y, col=hcl(240,80,65), specular="grey20",
xlab = "x", ylab = "q", zlab = "y")
I'm not sure how to embed the interactive plot, but here's a static image of one viewing angle:
Shiny app
The code below will create the same plot as above, but with the added ability to vary p with a slider and see how the plot changes.
Open an R script file and paste in the code below. Save it as app.r in its own directory then run the code. Both an rgl window and the Shiny app page with the slider for controlling the value of p should open. Resize the windows as desired and then move the slider to see how the function surface changes for various values of p.
library(shiny)
# Define UI for application that draws an interactive plot
ui <- fluidPage(
# Application title
titlePanel("Plot the function 1/(1 + exp(p + q*x))"),
# Sidebar with a slider input for the parameter p
sidebarLayout(
sidebarPanel(
sliderInput("p",
"Vary the value of p and see how the plot changes",
min = -2,
max = 2,
value = 1,
step=0.2)
),
# Show the plot of the function surface
mainPanel(
plotOutput("distPlot")
)
)
)
# Define server logic required to draw the plot
server <- function(input, output) {
output$distPlot <- renderPlot({
library(rgl)
x = seq(-10,10,0.1)
q = seq(-1,1,0.01)
y = outer(x, q, function(a, b) 1/(1 + exp(input$p + b*a)))
persp3d(x, q, y, col=hcl(240,50,65), specular="grey20",
xlab = "x", ylab = "q", zlab = "y")
})
}
# Run the application
shinyApp(ui = ui, server = server)
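One more static option, offered only as a sketch: a heatmap of y over x and q, with one panel per value of p, puts all four variables on a single page. The grid resolution and the names grid and p_heat are arbitrary choices for this illustration, and my_fun is redefined so the snippet runs on its own.

```r
library(ggplot2)

# Same function as in the examples above
my_fun <- function(x, p, q) {
  1 / (1 + exp(p + q * x))
}

# Every combination of x, q and p, with y evaluated on the grid
grid <- expand.grid(
  x = seq(-10, 10, 0.5),
  q = seq(-1, 1, 0.1),
  p = seq(-2, 2, length.out = 5)
)
grid$y <- my_fun(grid$x, grid$p, grid$q)

# Fill colour encodes y; panels encode p
p_heat <- ggplot(grid, aes(x, q, fill = y)) +
  geom_raster() +
  facet_wrap(~ p, labeller = label_both) +
  scale_fill_gradient(low = "red", high = "blue")

p_heat
```

This trades the exact curve shape for an overview, so it probably works best next to the line plots rather than instead of them.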

ggplot2 - add manual legend to multiple layers

I have a ggplot in which I am using color for my geom_points as a function of one of my columns(my treatment) and then I am using the scale_color_manual to choose the colors.
The legend for that is generated automatically and looks right.
The problem is that I need to draw some vertical lines that have to do with the experimental setup, which I am doing with geom_vline, but I don't know how to manually add a separate legend that states what those lines are without messing up the one I already have.
I have the following code
ggplot(dcons.summary, aes(x = meters, y = ymean, color = treatment, shape = treatment)) +
geom_point(size = 4) +
geom_errorbar(aes(ymin = ymin, ymax = ymax)) +
scale_color_manual(values=c("navy","seagreen3"))+
theme_classic() +
geom_vline(xintercept = c(0.23,3.23, 6.23,9.23), color= "bisque3", size=0.4) +
scale_x_continuous(limits = c(-5, 25)) +
labs(title= "Sediment erosion", subtitle= "-5 -> 25 meters; standard deviation; consistent measurements BESE & Control", x= "distance (meters)", y="erosion (cm)", color="Treatment", shape="Treatment")
So I would just need an extra legend beneath the "treatment" one that says "BESE PLOTS LOCATION" and that refers to the gray lines.
I have been searching for a solution and have tried using "scale_linetype_manual" and also "guides", but I'm not getting there.
As you provided no reproducible example, I used data from the mtcars dataset.
In addition, I modified this similar answer a little bit. As you have already used the color aesthetic, and the fill aesthetic does not work for lines, you can map linetype as a second aesthetic within aes, which can then be shown in the legend:
library(ggplot2)
library(magrittr) # provides the %>% pipe
xid <- data.frame(xintercept = c(15,20,30), lty=factor(1))
mtcars %>%
ggplot(aes(mpg ,cyl, col=factor(gear))) +
geom_point() +
geom_vline(data=xid, aes(xintercept=xintercept, lty=lty) , col = "red", size=0.4) +
scale_linetype_manual(values = 1, name="",labels="BESE PLOTS LOCATION")
Or without the second data.frame:
ggplot() +
geom_point(data = mtcars,aes(mpg ,cyl, col=factor(gear))) +
geom_vline(aes(xintercept=c(15,20,30), lty=factor(1) ), col = "red", size=0.4)+
scale_linetype_manual(values = 1, name="",labels="BESE PLOTS LOCATION")
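A variation on the same trick, sketched under the assumption of ggplot2 3.4.0 or later (for the linewidth argument): map linetype to the legend text itself, so the label does not need to be set separately. The data frame plot_lines and the object p_legend are hypothetical names, and mtcars again stands in for the real data.

```r
library(ggplot2)

# Positions of the vertical marker lines (illustrative values)
plot_lines <- data.frame(xintercept = c(15, 20, 30))

p_legend <- ggplot(mtcars, aes(mpg, cyl)) +
  geom_point(aes(colour = factor(gear))) +
  # Mapping linetype to a constant string creates a second legend
  # whose single key is labelled with that string.
  geom_vline(
    data = plot_lines,
    aes(xintercept = xintercept, linetype = "BESE PLOTS LOCATION"),
    colour = "bisque3", linewidth = 0.4
  ) +
  scale_linetype_manual(
    name = NULL,
    values = c("BESE PLOTS LOCATION" = "solid")
  ) +
  labs(colour = "Gear")

p_legend
```

Because colour is set as a constant on the line layer and mapped only on the point layer, the two legends stay separate.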