Problem with alignment of geom_point and geom_errorbar - ggplot2

I am trying to plot how different predictors associate with stroke and underlying phenotypes (e.g. cholesterol). I originally had working ggplot code in which shape denoted the variable (stroke, HDL cholesterol and total cholesterol) and colour denoted the type (disease (stroke) or phenotype (HDL/total cholesterol)). To make the plot more intuitive, I want to swap shape and colour, but when I do this, position_dodge no longer aligns geom_point and geom_errorbar:
stroke_graph <- ggplot(stroke,aes(y=as.numeric(stroke$test),
x=Clock,
shape = Type,
colour = Variable)) +
geom_point(data=stroke, aes(shape=Type, colour=Variable), show.legend=TRUE,
position=position_dodge(width=0.5), size = 3) +
geom_errorbar(aes(ymin = as.numeric(stroke$LCI), ymax= as.numeric(stroke$UCI)),
position = position_dodge(0.5), width = 0.05,
colour ="black")+
ylab("standardised beta/log odds")+ xlab ("")+
geom_hline(yintercept = 0, linetype = "dotted")+
theme(axis.text.x = element_text(size = 10, vjust = 0.5), legend.position = "none",
plot.title = element_text(size = 12))+
scale_y_continuous(limit = c(-0.402, 0.7))+ scale_shape_manual(values=c(15, 17, 18))+
theme(legend.position="right") + labs(shape = "Variable") + guides(shape = guide_legend(reverse=TRUE)) +
coord_flip()
stroke_graph + ggtitle("Stroke and Associated Phenotypes") + theme(plot.title = element_text(hjust = 0.5))
Graph now: 1
Previously working graph - only difference in code is swapping "Type" and "Variable": 2
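One possible explanation, offered as an assumption since the data and images are not reproduced here: setting `colour = "black"` as a constant in geom_errorbar() removes the colour mapping from that layer, so the error bars are grouped only by the remaining discrete mapping (shape = Type), while the points are grouped by both Type and Variable. The two layers then dodge into a different number of slots. A minimal sketch with an explicit `group` aesthetic shared by both layers; the `toy` data frame and its values are invented for illustration:

```r
library(ggplot2)

# Hypothetical stand-in for the `stroke` data frame (values invented)
toy <- data.frame(
  Clock    = rep(c("ClockA", "ClockB"), each = 3),
  Variable = rep(c("Stroke", "HDL cholesterol", "Total cholesterol"), times = 2),
  Type     = rep(c("Disease", "Phenotype", "Phenotype"), times = 2),
  test     = c(0.10, -0.20, 0.15, 0.30, -0.10, 0.05),
  LCI      = c(0.00, -0.30, 0.05, 0.20, -0.20, -0.05),
  UCI      = c(0.20, -0.10, 0.25, 0.40, 0.00, 0.15)
)

# One dodge object reused by both layers so the widths cannot drift apart
dodge <- position_dodge(width = 0.5)

p <- ggplot(toy, aes(x = Clock, y = test, shape = Type, colour = Variable,
                     group = Variable)) +  # explicit group inherited by every layer
  geom_errorbar(aes(ymin = LCI, ymax = UCI), position = dodge,
                width = 0.05, colour = "black") +
  geom_point(position = dodge, size = 3) +
  geom_hline(yintercept = 0, linetype = "dotted") +
  coord_flip()
p
```

Because `group` is mapped explicitly, the constant colour on the error bars no longer changes how that layer is split before dodging.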

Related

how to color different datasets separately when overlapping them using geom_smooth and color settings

i have 2 datasets that span full genomes, separated by chromosomes (scaffolds), for 2 group comparisons, and i want to overlay them in a single graph.
this is how i was doing it:
ggplot(NULL, aes(color = as_factor(scaffold))) +
geom_smooth(data = windowStats_SBvsOC, aes(x = mid2, y = Fst_group1_group5), se=F) +
geom_smooth(data = windowStats_SCLvsSCU, aes(x = mid2, y = Fst_group3_group4), se=F) +
scale_y_continuous(expand = c(0,0), limits = c(0, 1)) +
scale_x_continuous(labels = chrom$chrID, breaks = axis_set$center) +
scale_color_manual(values = rep(c("#276FBF", "#183059"), unique(length(chrom$chrID)))) +
scale_size_continuous(range = c(0.5,3)) +
labs(x = NULL,
y = "Fst (smoothed means)") +
theme_minimal() +
theme(
legend.position = "none",
panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
axis.title.y = element_text(),
axis.text.x = element_text(angle = 60, size = 8, vjust = 0.5))
this way, i get each chromosome with alternating colors, and the smoothing is per chromosome. but i wanted the colors to be different between the 2 groups so i can distinguish when they are overlapped like this. is there a way to do it? i can only do it once i remove the color by scaffold, but then the smoothing gets done across the whole genome and i don't want that!
my dataset is big, so i'm attaching it here!
i'm running this in rstudio 2022.02.3, R v.3.6.2 and package ggplot2
EDIT: i've figured it out! i just needed to change color = as_factor(scaffold) to group = as_factor(scaffold), and then add an aes(color = ...) to each geom_smooth() call.
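The fix described in the EDIT can be sketched as follows. The toy data, the `fst` column name, the legend labels and the second colour are assumptions made for illustration; only the idea of `group` for per-scaffold smoothing and a per-layer colour comes from the post:

```r
library(ggplot2)

# Hypothetical toy data standing in for the two window-statistics tables
set.seed(1)
make_windows <- function() {
  data.frame(
    scaffold = rep(c("chr1", "chr2"), each = 50),
    mid2     = c(1:50, 51:100),
    fst      = runif(100, 0, 0.5)
  )
}
comparison_a <- make_windows()
comparison_b <- make_windows()

# group = scaffold keeps one smooth per chromosome;
# the constant string inside aes() gives each dataset its own colour and legend entry
p <- ggplot(NULL, aes(x = mid2, y = fst, group = scaffold)) +
  geom_smooth(data = comparison_a, aes(colour = "SB vs OC"),
              method = "loess", formula = y ~ x, se = FALSE) +
  geom_smooth(data = comparison_b, aes(colour = "SCL vs SCU"),
              method = "loess", formula = y ~ x, se = FALSE) +
  scale_colour_manual(name = "Comparison",
                      values = c("SB vs OC" = "#276FBF", "SCL vs SCU" = "orange"))
p
```

Grouping controls where the smoother restarts, while colour is free to encode the comparison, so the two no longer compete for the same aesthetic.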

ggplot sf variable factor with specific point colour, and size, with sf geometry

i have a shapefile with lat/lon coordinates
( shp_4283 <- sf::st_transform(shp, crs = 4283) )
and 3 variables. i would like to plot each level of the
$Substrate factor in its own colour at its $geometry location
with geom_sf:
ggplot() +
geom_sf(data = subset(shp_4283, Substrate == "Sand", show.legend = "point"), #aes(shape = YOU),
color = "yellow", size = 1) +
geom_sf(data = subset(shp_4283, Substrate == "Mixed reef and sand", show.legend = "point"), #aes(shape = YOU),
color = "green", size = 1) +
geom_sf(data = subset(shp_4283, Substrate == "None modelled with certainty", show.legend = "point"), #aes(shape = YOU),
color = "grey", size = 2) +
geom_sf(data = subset(shp_4283, Substrate == "Reef", show.legend = "point"), #aes(shape = YOU),
color = "black", size = 2, show.legend = T) +
coord_sf()
the plot works, but with no legend, as no aes() is set, and further attempts give "Error in x[j] : invalid subscript type 'list'"
I understand that I could create a new df for each factor level and its geometry, and then plot from that:
df <- shp_4283 %>%
# Your filter
filter(Substrate == "Reef") %>%
# 2 Extract coordinates
st_coordinates() %>%
# 3 to table /tibble
as.data.frame() %>%
**is this where i would code the 'column names' so that each
filtered $Substrate factor in a new df would be labelled appropriately?**
but is there a geom_.. way to plot separate variable factor from the sf df with its geometry.. and the legend mapping the color to the factor?
geom_sf() & subset() & explicit layer call
to view the smallest data-group last, on top of the progressively larger data-groups beneath
Col = c('green','grey','black','yellow') #define data-group colours
jr_map <- ggplot() +
geom_sf(data = shp_4283, aes(color = Substrate)) + # to map all data-groups to the plot, and more importantly, the legend.. legend appears now.. but not correct color-mapping
# individually map the separate groups, by factor, add point size, and legend visibility
geom_sf(data = subset(shp_4283, Substrate == "Sand", show.legend = "point"), color = "yellow", size = 1) +
geom_sf(data = subset(shp_4283, Substrate == "Mixed reef and sand", show.legend = "point"), color = "green", size = 1) +
geom_sf(data = subset(shp_4283, Substrate == "None modelled with certainty", show.legend = "point"), color = "grey", size = 2) +
geom_sf(data = subset(shp_4283, Substrate == "Reef", show.legend = "point"), color = "black", size = 2) +
# retrieve geometry
coord_sf() +
# add label for map projection/CRS
annotate(geom = "text", x = 115.095:115.1, y = -33.62:-33.6, label = "crs = 4283",
fontface = "italic", color = "grey22", size = 2.5) +
#geom_text(data=bb_df, aes(x=115,y=35), fill='black', color='black', alpha=.1, angle=35) + # incase an on-map descriptive label needed
# add north arrow
annotation_north_arrow(location = "bl", which_north = "true",
pad_x = unit(0.25, "cm"),
pad_y = unit(0.25, "cm"),
height = unit(0.6, "cm"), width = unit(0.6, "cm"),
style = north_arrow_fancy_orienteering) +
theme_bw() + # remove 'azure' ocean color
ggtitle("Jolly-Rog Drops" , subtitle = "Geographe Bay, Western Australia") +
xlab("Longitude") +
ylab("Latitude") +
## Legend : CUSTOM
#Title, colors (Col specified manually at start..ordered according to $Substrate listing),
scale_color_manual(name = "Substrate", values = Col) +
guides(color = guide_legend(override.aes = list(shape = c(2,2,2,2), size = 1, fill = Col), nrow = 2, byrow = TRUE, legend.text = element_text(size=0.2)) ) + # size and shape NOT WORKING!? HELP..
#theme(legend.position = c(0, 0))
theme(legend.position = "bottom")
jr_map
also thanks to this post (geom_ explanation)
the different levels of $Substrate are (1) plotted geographically, and (2) their respective visibility is easily adjusted through the order in which their layers are added after the initial ggplot() call (the last layer is drawn on top and is the most prominent). The geom_sf() points (not geom_point()) are then made prominent through colour, shape, and size (alpha is also available).
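For the colour mapping and legend issues noted in the comments above, one option is to map colour and size inside aes() once and tie the values to the levels by name. This is a minimal sketch only: it uses geom_point() on an invented `pts` data frame as a stand-in for the sf object, and the triangle shape is an arbitrary choice. With geom_sf() the same scales should apply, and `show.legend = "point"` is an argument of geom_sf() itself rather than of subset().

```r
library(ggplot2)

# Hypothetical stand-in for the sf object: plain x/y columns instead of geometry
pts <- data.frame(
  x = c(115.10, 115.11, 115.12, 115.13),
  y = c(-33.60, -33.61, -33.62, -33.63),
  Substrate = c("Sand", "Mixed reef and sand",
                "None modelled with certainty", "Reef")
)

# Named vectors bind each value to a level, independent of factor level order
substrate_cols  <- c("Sand" = "yellow", "Mixed reef and sand" = "green",
                     "None modelled with certainty" = "grey", "Reef" = "black")
substrate_sizes <- c("Sand" = 1, "Mixed reef and sand" = 1,
                     "None modelled with certainty" = 2, "Reef" = 2)

p <- ggplot(pts, aes(x, y, colour = Substrate, size = Substrate)) +
  geom_point() +
  scale_colour_manual(name = "Substrate", values = substrate_cols) +
  scale_size_manual(name = "Substrate", values = substrate_sizes) +
  guides(colour = guide_legend(nrow = 2, byrow = TRUE,
                               override.aes = list(shape = 17))) +
  theme(legend.position = "bottom")
p
```

Legend text size is a theme setting, so it would go in theme(legend.text = element_text(size = ...)) rather than inside guide_legend().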

What is Julia's equivalent ggplot code of R's?

I would like to plot a sophisticated graph in Julia. The code below is the Julia version, which calls ggplot through RCall.
using CairoMakie, DataFrames, Effects, GLM, StatsModels, StableRNGs, RCall
@rlibrary ggplot2
rng = StableRNG(42)
growthdata = DataFrame(; age=[13:20; 13:20],
sex=repeat(["male", "female"], inner=8),
weight=[range(100, 155; length=8); range(100, 125; length=8)] .+ randn(rng, 16))
mod_uncentered = lm(@formula(weight ~ 1 + sex * age), growthdata)
refgrid = copy(growthdata)
filter!(refgrid) do row
return mod(row.age, 2) == (row.sex == "male")
end
effects!(refgrid, mod_uncentered)
refgrid[!, :lower] = @. refgrid.weight - 1.96 * refgrid.err
refgrid[!, :upper] = @. refgrid.weight + 1.96 * refgrid.err
df= refgrid
ggplot(df, aes(x=:age, y=:weight, group = :sex, shape= :sex, linetype=:sex)) +
geom_point(position=position_dodge(width=0.15)) +
geom_ribbon(aes(ymin=:lower, ymax=:upper), fill="gray", alpha=0.5)+
geom_line(position=position_dodge(width=0.15)) +
ylab("Weight")+ xlab("Age")+
theme_classic()
However, I would like to modify this graph further. For example, I would like to change the scale of the y axis, change the colors of the ribbon, add error bars, change the text size of the legend, and so on. Since I am new to Julia, I have not succeeded in finding the equivalent code for these modifications. Could someone help me translate the R ggplot code below into Julia?
t1= filter(df, sex=="male") %>% slice_max(df$weight)
ggplot(df, aes(age, weight, group = sex, shape= sex, linetype=sex,fill=sex, colour=sex)) +
geom_line(position=position_dodge(width=0.15)) +
geom_point(position=position_dodge(width=0.15)) +
geom_errorbar(aes(ymin = lower, ymax = upper),width = 0.1,
linetype = "solid",position=position_dodge(width=0.15))+
geom_ribbon(aes(ymin = lower, ymax = upper, fill = sex, colour = sex), alpha = 0.2) +
geom_text(data = t1, aes(age, weight, label = round(weight, 1)), hjust = -0.25, size=7,show_guide = FALSE) +
scale_y_continuous(limits = c(70, 150), breaks = seq(80, 140, by = 20))+
theme_classic()+
scale_colour_manual(values = c("orange", "blue")) +
guides(color = guide_legend(override.aes = list(linetype = c('dotted', 'dashed'))),
linetype = "none")+
xlab("Age")+ ylab("Average marginal effects") + ggtitle("Title") +
theme(
axis.title.y = element_text(color="Black", size=28, face="bold", hjust = 0.9),
axis.text.y = element_text(face="bold", color="black", size=16),
plot.title = element_text(hjust = 0.5, color="Black", size=28, face="bold"),
legend.title = element_text(color = "Black", size = 13),
legend.text = element_text(color = "Black", size = 16),
legend.position="bottom",
axis.text.x = element_text(face="bold", color="black", size=11),
strip.text = element_text(face= "bold", size=15)
)
As I commented before, you can use R-strings to run R code. To be clear, this isn't like your post's approach where you piece together many Julia objects that wrap many R objects, this is RCall converting a Julia Dataframe to an R dataframe then running your R code.
Running an R script may not seem very Julian, but code reuse is very Julian. Besides, you're still using an R library and active R session either way, and there might even be a slight performance benefit from reducing how often you make wrapper objects and switch between Julia and R.
## import libraries for Julia and R; still good to do at top
using CairoMakie, DataFrames, Effects, GLM, StatsModels, StableRNGs, RCall
R"""
library(ggplot2)
library(dplyr)
"""
## your Julia code without the @rlibrary or ggplot lines
rng = StableRNG(42)
growthdata = DataFrame(; age=[13:20; 13:20],
sex=repeat(["male", "female"], inner=8),
weight=[range(100, 155; length=8); range(100, 125; length=8)] .+ randn(rng, 16))
mod_uncentered = lm(@formula(weight ~ 1 + sex * age), growthdata)
refgrid = copy(growthdata)
filter!(refgrid) do row
return mod(row.age, 2) == (row.sex == "male")
end
effects!(refgrid, mod_uncentered)
refgrid[!, :lower] = @. refgrid.weight - 1.96 * refgrid.err
refgrid[!, :upper] = @. refgrid.weight + 1.96 * refgrid.err
df= refgrid
## convert Julia's df and run your R code in R-string
## - note that $df is interpolation of Julia's df into R-string,
## not R's $ operator like in rdf$weight
## - call the R dataframe rdf because df is already an R function
R"""
rdf <- $df
t1= filter(rdf, sex=="male") %>% slice_max(rdf$weight)
ggplot(rdf, aes(age, weight, group = sex, shape= sex, linetype=sex,fill=sex, colour=sex)) +
geom_line(position=position_dodge(width=0.15)) +
geom_point(position=position_dodge(width=0.15)) +
geom_errorbar(aes(ymin = lower, ymax = upper),width = 0.1,
linetype = "solid",position=position_dodge(width=0.15))+
geom_ribbon(aes(ymin = lower, ymax = upper, fill = sex, colour = sex), alpha = 0.2) +
geom_text(data = t1, aes(age, weight, label = round(weight, 1)), hjust = -0.25, size=7,show_guide = FALSE) +
scale_y_continuous(limits = c(70, 150), breaks = seq(80, 140, by = 20))+
theme_classic()+
scale_colour_manual(values = c("orange", "blue")) +
guides(color = guide_legend(override.aes = list(linetype = c('dotted', 'dashed'))),
linetype = "none")+
xlab("Age")+ ylab("Average marginal effects") + ggtitle("Title") +
theme(
axis.title.y = element_text(color="Black", size=28, face="bold", hjust = 0.9),
axis.text.y = element_text(face="bold", color="black", size=16),
plot.title = element_text(hjust = 0.5, color="Black", size=28, face="bold"),
legend.title = element_text(color = "Black", size = 13),
legend.text = element_text(color = "Black", size = 16),
legend.position="bottom",
axis.text.x = element_text(face="bold", color="black", size=11),
strip.text = element_text(face= "bold", size=15)
)
"""
The result is the same as your post's R code:
I used Vega-Lite (https://github.com/queryverse/VegaLite.jl) which is also grounded in the "Grammar of Graphics", and LinearRegression (https://github.com/ericqu/LinearRegression.jl) which provides similar features as GLM, although I think it is possible to get comparable results with the other plotting and linear regression packages. Nevertheless, I hope that this gives you a starting point.
using LinearRegression: Distributions, DataFrames, CategoricalArrays
using DataFrames, StatsModels, LinearRegression
using VegaLite
growthdata = DataFrame(; age=[13:20; 13:20],
sex=categorical(repeat(["male", "female"], inner=8), compress=true),
weight=[range(100, 155; length=8); range(100, 125; length=8)] .+ randn(16))
lm = regress(@formula(weight ~ 1 + sex * age), growthdata)
results = predict_in_sample(lm, growthdata, req_stats="all")
fp = select(results, [:age, :weight, :sex, :uclp, :lclp, :predicted]) |> @vlplot() +
@vlplot(
mark = :errorband, color = :sex,
y = { field = :uclp, type = :quantitative, title="Average marginal effects"},
y2 = { field = :lclp, type = :quantitative },
x = {:age, type = :quantitative} ) +
@vlplot(
mark = :line, color = :sex,
x = {:age, type = :quantitative},
y = {:predicted, type = :quantitative}) +
@vlplot(
:point, color=:sex ,
x = {:age, type = :quantitative, axis = {grid = false}, scale = {zero = false}},
y = {:weight, type = :quantitative, axis = {grid = false}, scale = {zero = false}},
title = "Title", width = 400 , height = 400
)
which gives:
You can change the style of the elements by changing the "config" as indicated here (https://www.queryverse.org/VegaLite.jl/stable/gettingstarted/tutorial/#Config-1).
As the Julia Vega-Lite is a wrapper to Vega-Lite additional documentation can be found on the Vega-lite website (https://vega.github.io/vega-lite/)

How to show where networkpersons live and how they are connected

I want to show where network people live and how they are connected. First, I drew a map of the 15 municipalities (based on a SpatialPolygonsDataFrame, with geom_polygon from ggplot2). Second, I placed the network people around the centroids of the polygons. Following the third variant in "Three ways of visualizing a graph on a map" by Markus Konrad (https://datascience.blog.wzb.eu/2018/05/31/three-ways-of-visualizing-a-graph-on-a-map/), I have so far created two layers. As mapcoords I used coord_fixed(ratio = 1/1). To achieve a good result, I had to make manual adjustments in annotation_custom.
My questions:
First, is there a way to adapt the layers to each other without manual intervention?
Second, are there simpler solutions to geographically locate network people and their connections?
My result so far:
maptheme <- theme(panel.grid = element_blank()) +
theme(axis.text = element_blank()) +
theme(axis.ticks = element_blank()) +
theme(axis.title = element_blank()) +
theme(legend.position = "bottom") +
theme(panel.grid = element_blank()) +
theme(panel.background = element_rect(fill = "#596673")) +
theme(plot.margin = unit(c(0, 0, 0.5, 0), 'cm'))
mapcoords <- coord_fixed(ratio=1/1)
theme_transp_overlay <- theme(
panel.background = element_rect(fill = "transparent", color = NA),
plot.background = element_rect(fill = "transparent", color = NA))
ArlMap <- ggplot(ARLmap.data, aes(long, lat)) +
geom_polygon(aes(group=group), colour='white', fill='grey')+
theme(axis.text=element_blank())+
theme(axis.ticks=element_blank())+
theme(axis.title=element_blank())+
mapcoords + maptheme
nodes <- ggplot(nwdata) +
geom_point(aes(x = xkor, y = ykor, size = Btw),
shape = 21, fill = "white", color = "black", # draw nodes
stroke = 0.5) +
scale_size_continuous(guide = FALSE, range = c(1, 6)) +
mapcoords + maptheme + theme_transp_overlay
ArlMap +
annotation_custom(ggplotGrob(nodes), xmin = min(ARLmap.data$long)+900, xmax = max(ARLmap.data$long)-1200, ymin = min(ARLmap.data$lat)+1500, ymax = max(ARLmap.data$lat))
...
I have reached my goal. I arrived at the solution by consistently starting from a geographical approach: 1. The nodes of the network receive lon/lat coordinates. These are determined as rotation coordinates around the centroid of the geographical unit. 2. The connections between the nodes are given new start and end points based on those lon/lat coordinates. 3. The plot is limited to the base functions plot, lines and points.
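The three steps above can be sketched in base R. This is only an illustration under stated assumptions: the centroids, nodes, edges, the helper `place_around()` and the radius of 1 are all invented, and "rotation coordinates" is read here as spacing the nodes evenly on a circle around each centroid.

```r
# Place n nodes evenly on a circle of the given radius around one centroid.
# `radius` is in the same units as the map coordinates (an assumed choice).
place_around <- function(cx, cy, n, radius) {
  angle <- 2 * pi * (seq_len(n) - 1) / n
  data.frame(x = cx + radius * cos(angle), y = cy + radius * sin(angle))
}

# Hypothetical municipality centroids, node membership and connections
centroids <- data.frame(id = c("A", "B"), cx = c(0, 10), cy = c(0, 5))
nodes <- data.frame(name = c("n1", "n2", "n3", "n4", "n5"),
                    municipality = c("A", "A", "A", "B", "B"))
edges <- data.frame(from = c("n1", "n2", "n4"), to = c("n4", "n3", "n5"))

# Step 1: give every node coordinates around its municipality's centroid
nodes$x <- NA_real_
nodes$y <- NA_real_
for (m in centroids$id) {
  idx <- which(nodes$municipality == m)
  ctr <- centroids[centroids$id == m, ]
  pos <- place_around(ctr$cx, ctr$cy, length(idx), radius = 1)
  nodes$x[idx] <- pos$x
  nodes$y[idx] <- pos$y
}

# Step 2: look up start and end coordinates for each connection
from_idx <- match(edges$from, nodes$name)
to_idx   <- match(edges$to, nodes$name)

# Step 3: draw with plot, lines and points only
plot(nodes$x, nodes$y, type = "n", asp = 1, xlab = "", ylab = "")
for (i in seq_len(nrow(edges))) {
  lines(nodes$x[c(from_idx[i], to_idx[i])], nodes$y[c(from_idx[i], to_idx[i])])
}
points(nodes$x, nodes$y, pch = 21, bg = "white")
```

Because nodes and map share one coordinate system, the municipality polygons could be drawn in the same plot first, with no manual alignment of overlays.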

ggplot2: How to move y axis labels right next to the bars

I am working with the following reproducible dataset:
df<- data.frame(name=c(letters[1:10],letters[1:10]),fc=runif(20,-5,5)
,fdr=runif(20),group=c(rep("gene",10),rep("protein",10)))
Code used to plot:
df$sig<- ifelse(df$fdr<0.05 & df$fdr>0 ,"*","")
ggplot(df, aes(x=reorder(name,fc),fc))+geom_col(aes(fill=group),position = "dodge",width = 0.9)+
coord_flip()+
geom_text(aes(label = sig),angle = 90, position = position_stack(vjust = -0.2), color= "black",size=3)+
scale_y_continuous(position = "right")+
scale_fill_manual(values = c("gene"= "#FF002B","protein"="blue"))+
geom_hline(yintercept = 0, colour = "gray" )+
theme(legend.position="none", axis.title.y=element_blank(),
axis.title.x=element_blank(),
axis.text.y=element_text(),
axis.line=element_line(color="gray"),axis.line.y=element_blank(),
axis.ticks.y=element_blank(),
panel.background=element_blank(),panel.border=element_blank(),panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),plot.background=element_blank())
Resulting in the following plot:
Instead of having the y-axis labels on the left side, I would like to place them right next to the bars. I want to emulate this chart published in Nature:
https://www.nature.com/articles/ncomms2112/figures/3
Like this?
df<- data.frame(name=c(letters[1:10],letters[1:10]),fc=runif(20,-5,5)
,fdr=runif(20),group=c(rep("gene",10),rep("protein",10)))
df$sig<- ifelse(df$fdr<0.05 & df$fdr>0 ,"*","")
df$try<-c(1:10,1:10) #assign numbers to letters
x_pos<-ifelse(df$group=='gene',df$try-.2,df$try+.2) #align letters over bars
y_posneg<-ifelse(df$fc>0,df$fc+.5,df$fc-.5) #set up y axis position of letters
ggplot(df, aes(x=try,fc))+geom_col(aes(fill=group),position = "dodge",width = 0.9)+
coord_flip()+
geom_text(aes(y=y_posneg,x=x_pos,label = name),color= "black",size=6)+
scale_y_continuous(position = "right")+
scale_fill_manual(values = c("gene"= "#FF002B","protein"="blue"))+
geom_hline(yintercept = 0, colour = "gray" )+
theme(legend.position="none", axis.title.y=element_blank(),
axis.title.x=element_blank(),
axis.text.y=element_blank(),
axis.line=element_line(color="gray"),axis.line.y=element_blank(),
axis.ticks.y=element_blank(),
panel.background=element_blank(),panel.border=element_blank(),panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),plot.background=element_blank())
Gives:
Or perhaps this?
x_pos<-ifelse(df$group=='gene',df$try-.2,df$try+.2) #align letters over bars
y_pos<-ifelse(df$fc>0,-.2,.2) #set up y axis position of letters
ggplot(df, aes(x=try,fc))+geom_col(aes(fill=group),position = "dodge",width = 0.9)+
coord_flip()+
geom_text(aes(y=y_pos,x=x_pos,label = name),color= "black",size=3)+
scale_y_continuous(position = "right")+
scale_fill_manual(values = c("gene"= "#FF002B","protein"="blue"))+
geom_hline(yintercept = 0, colour = "gray" )+
theme(legend.position="none", axis.title.y=element_blank(),
axis.title.x=element_blank(),
axis.text.y=element_blank(),
axis.line=element_line(color="gray"),axis.line.y=element_blank(),
axis.ticks.y=element_blank(),
panel.background=element_blank(),panel.border=element_blank(),panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),plot.background=element_blank())
Gives: