Combining shape & color legends in ggplot2

When I plot the data below, I get 2 separate legends: factor(Type), relating to color, & factor(Category), relating to shape. I would like to have one legend (with no title) that represents both color & shape. Other StackOverflow solutions have not worked for me, please help!
library(sp)
library(sf)
library(ggplot2)
library(ggmap)
library(dplyr)
# Retrieve & format NYC area basemap
region.bb = c(left=-74.25,bottom=40.55,right=-73.7,top=40.97)
nyc.stamen <- get_stamenmap(bbox=region.bb,zoom=10,maptype="terrain-background")
# Create data frame of coordinate data
Longitude <- c(-73.950311,-73.964482,-73.953678,-73.893522,-73.815856,-74.148499,-73.9465,-73.9585,-73.9223,-73.877744,-73.8796,-73.873983,-73.7781,-74.1745,-74.193432,-74.116770,-73.816316,-74.099108,-73.765924,-73.916045)
Latitude <- c(40.815313,40.767544,40.631762,40.872481,40.734335,40.604014,40.7315,40.8217,40.7905,40.837525,40.8105,40.776969,40.6413,40.6895,40.580011,40.773013,40.857311,40.744994,40.610648,40.799044)
Category <- c(0,1,1,1,1,1,2,2,2,3,4,5,5,5,6,6,6,7,7,7)
Type <- c(1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
coordinate.data <- data.frame(Longitude,Latitude,Type,Category,stringsAsFactors=F)
rownames(coordinate.data) <- c("METER","MANH","BKLN","BRON","QUEE","STAT","NEWTOWN_CREEK","NORTH_RIVER","WARDS_ISLAND","BUS_DEPOT","HUNTS_POINT","LGA","JFK","NJT","SITS","ERIE_NJ","BRONX_PELHAM","HUSDON","BAYSWATER","PP")
# Plot points over NYC basemap
map.plot <- ggmap(nyc.stamen) +
xlab("Longitude") +
ylab("Latitude") +
geom_point(data=coordinate.data,aes(x=Longitude,y=Latitude,color=factor(Type),shape=factor(Category)),size=3) +
scale_shape_manual(values=c(8,4,0,1,2,5,6,10)) +
scale_color_manual(values=c("red","black")) +
theme(legend.background=element_rect(fill="white"),legend.key=element_rect(fill="white",color=NA))
print(map.plot)

Normally, you combine the color and shape legends when they both use the same variable as factor. In your case, color is a factor of Type and shape is a factor of Category. Combining them does not make any sense. My suggestion is to leave them with no title. While in your example there is a clear distinction of colors, you could have a red square and a black square, and in such a situation what should the legend display?
You can remove the legend titles with the statement
labs(x="Longitude" , y="Latitude" ,col=NULL, shape=NULL ) +
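Also, you can combine the legends with the following statement, which hides the color legend and colors the shape keys by hand. override.aes needs one color per Category level: in your data, categories 0 and 1 belong to Type 1 (red) and categories 2 to 7 belong to Type 2 (black).
guides(color="none", shape= guide_legend(override.aes=list(color=c("red","red",rep("black",6)) ))) +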
My suggestion is not to do so.
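If every Category really does belong to exactly one Type, as in the question's data, another way to get a single legend is to map both color and shape to Category and derive the colors from Type. The following is only a sketch on made-up toy data (pts, type_colors and category_colors are illustrative names, not from the question); it relies on ggplot2 merging legends that share the same variable, title and labels.

```r
library(ggplot2)

# Hypothetical toy data: as in the question, every Category has exactly one Type
pts <- data.frame(
  x = 1:8,
  y = 1:8,
  Category = factor(0:7),
  Type = factor(c(1, 1, 2, 2, 2, 2, 2, 2))
)

# Look up one color per Category level from the Type of that level
type_colors <- c("1" = "red", "2" = "black")
first_row <- match(levels(pts$Category), as.character(pts$Category))
category_colors <- setNames(
  unname(type_colors[as.character(pts$Type[first_row])]),
  levels(pts$Category)
)

# Same variable and same (empty) title on both scales, so the legends merge
p <- ggplot(pts, aes(x, y, color = Category, shape = Category)) +
  geom_point(size = 3) +
  scale_shape_manual(values = c(8, 4, 0, 1, 2, 5, 6, 10), name = NULL) +
  scale_color_manual(values = category_colors, name = NULL)
print(p)
```

If a Category could appear under both Types, this lookup would silently pick the first match, which is the ambiguity the answer warns about.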

Related

ggplot2 - wrap data around legend in custom position

When placing a legend in a custom position (using legend.position = c(x, y)) in a ggplot, is it possible to format the legend so that it does not overlay the data, and instead, the datapoints wrap around it?
In this example, would it be possible to, say, have ggplot insert extra space in the plot, so that datapoints are not obscured by the legend (without changing the legend.position)?
Thanks!
library(tidyverse)
data(mtcars)
ggplot(data = mtcars, aes(x = wt, y = hp))+
geom_point(aes(color = mpg))+
theme(legend.direction = "horizontal",
legend.position = c(0.5, 0.9))
An inelegant solution is to add theme(plot.title = element_text(margin = margin(a, b, c, d))), where a, b, c and d are padding values for top, right, bottom and left, respectively, and adjust the c value until there is sufficient space. Let me know if you come up with a better solution!
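Another possibility, offered only as a sketch, is to leave legend.position alone and pad the top of the y-axis so that no points fall under the legend. expansion() is part of ggplot2 (3.3.0 and later); the 0.3 used here is a guess that would need tuning to the legend's height. Newer ggplot2 releases prefer legend.position = "inside" together with legend.position.inside and may warn about the numeric form kept here from the question.

```r
library(ggplot2)

# expansion(mult = c(lower, upper)) pads the axis as a fraction of the data range;
# 0.3 at the top is a guess that needs tuning to the legend's height
p <- ggplot(mtcars, aes(x = wt, y = hp)) +
  geom_point(aes(color = mpg)) +
  scale_y_continuous(expand = expansion(mult = c(0.05, 0.3))) +
  theme(legend.direction = "horizontal",
        legend.position = c(0.5, 0.9))
print(p)
```

This does not make the points wrap around the legend; it only reserves empty space in the region where the legend sits.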

Altering size of points on map in R (geom_sf) to reflect categorical data

I am creating a map to depict density of datapoints at different locations. At some locations, there is a high density of data available, and at others, there is a low density of data available. I would like to present the map with each data point shown but with each point a certain size to represent the density.
In my data table I have the location, and each location is assigned 'A', 'B', or 'C' to depict 'Low', 'Medium', and 'High' density. When plotting using geom_sf, I am able to get the points on the map, but I would like each category to be represented by a different size circle. I.e. 'Low density' locations with a small circle, and 'High density' locations with a larger circle.
I have been approaching the aesthetics of this map in the same way I would approach it as if it were a normal ggplot situation, but have not had any luck. I feel like I must be missing something obvious related to the fact that I am using geom_sf(), so any advice would be appreciated!
Using a very simple code:
ggplot() +
geom_sf(data = stc_land, color = "grey40", fill = "grey80") +
geom_sf(data = stcdens, aes(shape = Density)) +
theme_classic()
I know that the aes() call should go in with the 'stcdens' data, and I got close with the 'shape = Density', but I am not sure how to move forward with assigning what shapes I want to each category.
You probably want to swap shape = Density for size = Density; then the plot should behave itself (and yes, it is a standard ggplot behavior, nothing sf specific :)
As your code is not exactly reproducible allow me to use my favorite example of 3 cities in NC:
library(sf)
library(ggplot2)
library(dplyr) # provides %>% and mutate()
shape <- st_read(system.file("shape/nc.shp", package="sf")) # included with sf package
cities <- data.frame(name = c("Raleigh", "Greensboro", "Wilmington"),
x = c(-78.633333, -79.819444, -77.912222),
y = c(35.766667, 36.08, 34.223333),
population = c("high", "medium","low")) %>%
st_as_sf(coords = c("x", "y"), crs = 4326) %>%
dplyr::mutate(population = ordered(population,
levels = c("low", "medium", "high")))
ggplot() +
geom_sf(data = shape, fill = NA) +
geom_sf(data = cities, aes(size = population))
Note that I turned the population from a character variable to ordered factor, where high > medium > low (so that the circles follow the expected order).
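To choose the circle size for each category yourself rather than accept the defaults, scale_size_manual() can be added. This is a minimal sketch: the stcdens object below is a hypothetical three-point stand-in for the asker's data, and the sizes 2, 4 and 6 are arbitrary.

```r
library(sf)
library(ggplot2)

# Hypothetical stand-in for the question's stcdens: three points with a Density code
stcdens <- st_as_sf(
  data.frame(
    x = c(-78.63, -79.82, -77.91),
    y = c(35.77, 36.08, 34.22),
    Density = factor(c("C", "B", "A"), levels = c("A", "B", "C"))
  ),
  coords = c("x", "y"), crs = 4326
)

# Point sizes chosen by hand; names must match the factor levels
density_sizes <- c(A = 2, B = 4, C = 6)

# Labels are given in level order (A, B, C)
p <- ggplot() +
  geom_sf(data = stcdens, aes(size = Density)) +
  scale_size_manual(values = density_sizes,
                    labels = c("Low", "Medium", "High"))
print(p)
```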

Grouping the factors in ggplot

I am trying to create a graph based on matrix similar to one below... I am trying to group the Erosion values based on "Slope"...
library(ggplot2)
new_mat<-matrix(,nrow = 135, ncol = 7)
colnames(new_mat)<-c("Scenario","Runoff (mm)","Erosion (t/ac)","Slope","Soil","Tillage","Rotation")
for ( i in 1:nrow(new_mat)){
new_mat[i,2]<-sample(10:50, 1)
new_mat[i,3]<-sample(0.1:20, 1)
new_mat[i,4]<-sample(c("S2","S3","S4","S5","S1"),1)
new_mat[i,5]<-sample(c("Deep","Moderate","Shallow"),1)
new_mat[i,7]<-sample(c("WBP","WBF","WF"),1)
new_mat[i,6]<-sample(c("Intense","Reduced","Notill"),1)
new_mat[i,1]<-paste0(new_mat[i,4],"_",new_mat[i,5],"_",new_mat[i,6],"_",new_mat[i,7],"_")
}
#### Graph part ########
grphs_mat<-as.data.frame(new_mat)
grphs_mat$`Runoff (mm)`<-as.numeric(as.character(grphs_mat$`Runoff (mm)`))
grphs_mat$`Erosion (t/ac)`<-as.numeric(as.character(grphs_mat$`Erosion (t/ac)`))
ggplot(grphs_mat, aes(Scenario, `Erosion (t/ac)`,group=Slope, colour = Slope))+
scale_y_continuous(limits=c(0,max(as.numeric((grphs_mat$`Erosion (t/ac)`)))))+
geom_point()+geom_line()
But when I run this code, the values are spread along the x-axis across all 135 scenarios. What I want is for the grouping to be done by slope, while the other common factors (Soil + Rotation + Tillage) are picked up and placed on the x-axis. For example:
For these five scenarios:
S1_Deep_Intense_WBF_
S2_Deep_Intense_WBF_
S3_Deep_Intense_WBF_
S4_Deep_Intense_WBF_
S5_Deep_Intense_WBF_
It should separate S1, S2, S3, S4 and S5, but also recognize that the other factors are the same and place them at one x-axis position, so that the slope lines are stacked on top of each other at 135/5 = 27 x-axis points. The final figure should look like this (Refer image). Apologies for not being able to explain it better.
I think I am making a mistake in grouping or in assigning the x-axis values.
I will appreciate your suggestions.
In the example you give, I didn't get every possible factor combination represented so the plots looked a bit weird. What I did instead was start with the following:
set.seed(42)
new_mat <- matrix(,nrow = 1000, ncol = 7)
And then deduplicated this by summarising the values. A possibly relevant step for your analysis is that I made a new variable with the interaction() function, which is the combination of the three other factors.
library(tidyverse)
df <- grphs_mat
df$x <- with(df, interaction(Rotation, Soil, Tillage))
# The simulation did not yield unique combinations
df <- df %>% group_by(x, Slope) %>%
summarise(n = sum(`Erosion (t/ac)`))
Next, I plotted this new x variable on the x-axis and used "stack" positions for the lines and points.
g <- ggplot(df, aes(x, y = n, colour = Slope, group = Slope)) +
geom_line(position = "stack") +
geom_point(position = "stack")
To make the x-axis slightly more readable, you can replace the . separators that the interaction() function inserts with newlines.
g + scale_x_discrete(labels = function(x){gsub("\\.", "\n", x)})
Another option is to simply rotate the x axis labels:
g + theme(axis.text.x.bottom = element_text(angle = 90))
There are a few additional options for the x-axis if you go into ggplot2 extension packages.
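The interaction() step and the label rewrite used above can be checked on a tiny example. The three vectors here are made up for illustration and describe two scenarios that differ only in rotation.

```r
# Two hypothetical scenarios that differ only in rotation
rotation <- c("WBP", "WBF")
soil     <- c("Deep", "Deep")
tillage  <- c("Intense", "Intense")

# interaction() pastes the levels together with "." as the separator
x <- interaction(rotation, soil, tillage)
as.character(x)

# The same substitution used in scale_x_discrete(labels = ...) above
axis_labels <- gsub("\\.", "\n", as.character(x))
```

Because Slope is left out of the interaction, all five slopes of a given Soil/Tillage/Rotation combination share one x position.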

Add a line at z=0 to ggplot2 heatmap

I have plotted a heatmap in ggplot2. I want to add a curved line to the plot to show where z=0 (i.e. where the value of the data used for the fill is zero), how can I do this?
Thanks
Since no example data or code is provided, I'll illustrate with the volcano dataset, representing heights of a volcano in a matrix. Since the data doesn't contain a zero point, we'll draw the line at the arbitrarily chosen 125 mark.
library(ggplot2)
# Convert matrix to data.frame
df <- data.frame(
row = as.vector(row(volcano)),
col = as.vector(col(volcano)),
value = as.vector(volcano)
)
# Set contour breaks at desired level
ggplot(df, aes(col, row, fill = value)) +
geom_raster() +
geom_contour(aes(z = value),
breaks = 125, col = 'red')
Created on 2020-04-06 by the reprex package (v0.3.0)
If this isn't a good approximation of your problem, I'd suggest to include example data and code in your question.
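For data that actually crosses zero, the same approach works with breaks = 0. A small sketch on a made-up surface (z = x^2 + y^2 - 1, which is zero on the unit circle; grid is an illustrative name):

```r
library(ggplot2)

# Hypothetical surface that crosses zero: z = x^2 + y^2 - 1 (zero on the unit circle)
grid <- expand.grid(x = seq(-2, 2, length.out = 81),
                    y = seq(-2, 2, length.out = 81))
grid$z <- grid$x^2 + grid$y^2 - 1

# Heatmap of z with a single contour line drawn where z = 0
p <- ggplot(grid, aes(x, y, fill = z)) +
  geom_raster() +
  geom_contour(aes(z = z), breaks = 0, colour = "red")
print(p)
```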

Aligning side-by-side related plots with ggplot2

I have the following two-part plot which are not aligned:
[Image: side-by-side plots, not aligned]
These plots are produced by the following code:
require(ggplot2)
require(gridExtra)
set.seed(0)
data <- data.frame(x=rpois(30,5),y=rpois(30,11),z=rpois(300,25))
left.plot <- ggplot(data,aes(x,y)) +
geom_bin2d(binwidth=1)
margin.data <- as.data.frame( margin.table(table(data),1))
right.plot <- ggplot(margin.data, aes(x=x,y=Freq)) +
geom_bar(stat="identity")+coord_flip()
grid.arrange(left.plot, right.plot, ncol=2)
How can I align the rows in the left plot to the bars in the right plot?
Your issues are simple, albeit twofold.
Ultimately you need to use scale_y_continuous() and scale_x_continuous() to set matching axis limits on each figure. That's impeded by the fact that the x value in margin.data is a factor. Convert it to numeric, set the same limits on both plots, and you're good to go.
left.plot <- ggplot(data,aes(x,y)) +
geom_bin2d(binwidth=1) +
scale_y_continuous(limits = c(1, 16))
margin.data <- as.data.frame( margin.table(table(data),1))
right.plot <- ggplot(margin.data, aes(x=as.numeric(as.character(x)),y=Freq)) +
geom_bar(stat="identity") +
scale_x_continuous(limits = c(1, 16)) +
xlab("x") +
coord_flip()
Using the ggExtra package I was able to get close to a solution:
require(ggplot2)
require(ggExtra)
set.seed(0)
data <- data.frame(x=rpois(30,5),y=rpois(30,11),z=rpois(300,25))
left.plot <- ggplot(data,aes(x,y)) + geom_bin2d(binwidth=1)
ggMarginal(left.plot, margins="y", type = "histogram", size=2,bins=(max(data$y)-min(data$y)+1),binwidth=1.06)
I say almost because I had to manually set binwidth=1.06 to align the bars with the counts.
[Image: manually aligned plots using ggExtra::ggMarginal]