Formatting Y-axis with secondary axis [duplicate] - ggplot2

I need to plot a bar chart showing counts and a line chart showing a rate, all in one chart. I can do both of them separately, but when I put them together, the scale of the first layer (i.e. the geom_bar) is overlapped by the second layer (i.e. the geom_line).
Can I move the axis of the geom_line to the right?

Starting with ggplot2 2.2.0 you can add a secondary axis like this (taken from the ggplot2 2.2.0 announcement):
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
scale_y_continuous(
"mpg (US)",
sec.axis = sec_axis(~ . * 1.20, name = "mpg (UK)")
)
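Applied to the case in the question (counts as bars, a rate as a line), the same idea might look like the sketch below. The data frame counts_df, its columns, and the factor k are made up for illustration; the requireNamespace() guard is only there so the arithmetic still runs where ggplot2 is not installed.

```r
# Hypothetical data: monthly counts (bars) and a rate between 0 and 1 (line).
counts_df <- data.frame(
  month = 1:6,
  count = c(120, 150, 90, 200, 170, 130),
  rate  = c(0.10, 0.25, 0.05, 0.40, 0.30, 0.20)
)

# Factor that stretches the rate onto the range of the counts.
k <- max(counts_df$count) / max(counts_df$rate)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(counts_df, aes(month)) +
    geom_col(aes(y = count), fill = "grey70") +
    geom_line(aes(y = rate * k), colour = "red") +  # drawn in count units
    scale_y_continuous(
      name = "Count",
      sec.axis = sec_axis(~ . / k, name = "Rate")   # labelled in rate units
    )
  print(p)
}
```

The line is drawn on the primary (count) scale, and sec_axis only relabels the right-hand axis by undoing the multiplication.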

It's not possible in ggplot2 because I believe plots with separate y scales (not y-scales that are transformations of each other) are fundamentally flawed. Some problems:
They are not invertible: given a point on the plot space, you can not uniquely map it back to a point in the data space.
They are relatively hard to read correctly compared to other options. See A Study on Dual-Scale Data Charts by Petra Isenberg, Anastasia Bezerianos, Pierre Dragicevic, and Jean-Daniel Fekete for details.
They are easily manipulated to mislead: there is no unique way to specify the relative scales of the axes, leaving them open to manipulation. Two examples from the Junkcharts blog: one, two
They are arbitrary: why have only 2 scales, not 3, 4 or ten?
You also might want to read Stephen Few's lengthy discussion on the topic Dual-Scaled Axes in Graphs Are They Ever the Best Solution?.
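The point about manipulation can be made concrete with a little arithmetic. The sketch below uses two made-up series (a_series, b_series and both divisors are assumptions for illustration) and shows that where the two lines appear to cross depends entirely on how the second axis is scaled.

```r
# Two made-up series measured in unrelated units.
a_series <- c(1, 2, 3, 4, 5)
b_series <- c(50, 40, 30, 20, 10)

# Two equally arbitrary ways to squeeze b_series onto the primary axis.
b_on_axis_1 <- b_series / 10
b_on_axis_2 <- b_series / 20

# Position at which the two drawn lines are closest, i.e. appear to cross.
cross_1 <- which.min(abs(a_series - b_on_axis_1))
cross_2 <- which.min(abs(a_series - b_on_axis_2))

cat("apparent crossing with scaling 1: x =", cross_1, "\n")
cat("apparent crossing with scaling 2: x =", cross_2, "\n")
```

Neither crossing says anything about the data; both are artefacts of the chosen scaling.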

Sometimes a client wants two y scales. Giving them the "flawed" speech is often pointless. But I do like the ggplot2 insistence on doing things the right way. I am sure that ggplot is in fact educating the average user about proper visualization techniques.
Maybe you can use faceting and scale free to compare the two data series? - e.g. look here: https://github.com/hadley/ggplot2/wiki/Align-two-plots-on-a-page

There are common use cases for dual y axes, e.g., the climatograph showing monthly temperature and precipitation. Here is a simple solution, generalized from Megatron's solution by allowing you to set the lower limit of the variables to something other than zero:
Example data:
library(ggplot2)
library(tibble)
climate <- tibble(
Month = 1:12,
Temp = c(-4,-4,0,5,11,15,16,15,11,6,1,-3),
Precip = c(49,36,47,41,53,65,81,89,90,84,73,55)
)
Set the following two values to values close to the limits of the data (you can play around with these to adjust the positions of the graphs; the axes will still be correct):
ylim.prim <- c(0, 180) # in this example, precipitation
ylim.sec <- c(-4, 18) # in this example, temperature
The following makes the necessary calculations based on these limits, and makes the plot itself:
b <- diff(ylim.prim)/diff(ylim.sec)
a <- ylim.prim[1] - b*ylim.sec[1] # intercept: maps ylim.sec[1] onto ylim.prim[1]
ggplot(climate, aes(Month, Precip)) +
geom_col() +
geom_line(aes(y = a + Temp*b), color = "red") +
scale_y_continuous("Precipitation", sec.axis = sec_axis(~ (. - a)/b, name = "Temperature")) +
scale_x_continuous("Month", breaks = 1:12) +
ggtitle("Climatogram for Oslo (1961-1990)")
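The linear mapping behind a and b can be checked numerically. The sketch below repeats the calculation with the same limits; the helper names to_primary and to_secondary are introduced here for illustration only and are not part of the answer's code.

```r
# Limits as chosen in the example above.
ylim.prim <- c(0, 180)   # precipitation
ylim.sec  <- c(-4, 18)   # temperature

b <- diff(ylim.prim) / diff(ylim.sec)   # slope
a <- ylim.prim[1] - b * ylim.sec[1]     # intercept

to_primary   <- function(temp) a + temp * b   # the expression used in geom_line()
to_secondary <- function(y) (y - a) / b       # the expression used in sec_axis()

# The secondary limits land on the primary limits ...
print(to_primary(ylim.sec))

# ... and the two functions undo each other.
temps <- c(-4, 0, 11, 16)
print(all.equal(to_secondary(to_primary(temps)), temps))
```

Because the two functions are exact inverses, the right-hand axis labels stay correct whatever limits are chosen; the limits only move the line relative to the bars.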
If you want to make it clear that the red line corresponds to the right-hand y axis, you can add a theme() call to the plot code:
ggplot(climate, aes(Month, Precip)) +
geom_col() +
geom_line(aes(y = a + Temp*b), color = "red") +
scale_y_continuous("Precipitation", sec.axis = sec_axis(~ (. - a)/b, name = "Temperature")) +
scale_x_continuous("Month", breaks = 1:12) +
theme(axis.line.y.right = element_line(color = "red"),
axis.ticks.y.right = element_line(color = "red"),
axis.text.y.right = element_text(color = "red"),
axis.title.y.right = element_text(color = "red")
) +
ggtitle("Climatogram for Oslo (1961-1990)")
This colors the right-hand axis red.

Taking above answers and some fine-tuning (and for whatever it's worth), here is a way of achieving two scales via sec_axis:
Assume a simple (and purely fictional) data set dt: for five days, it tracks the number of interruptions VS productivity:
        when numinter prod
1 2018-03-20        1 0.95
2 2018-03-21        5 0.50
3 2018-03-23        4 0.70
4 2018-03-24        3 0.75
5 2018-03-25        4 0.60
(The ranges of the two columns differ by about a factor of 5.)
The following code will draw both series so that they use up the whole y axis:
ggplot() +
geom_bar(mapping = aes(x = dt$when, y = dt$numinter), stat = "identity", fill = "grey") +
geom_line(mapping = aes(x = dt$when, y = dt$prod*5), size = 2, color = "blue") +
scale_x_date(name = "Day", labels = NULL) +
scale_y_continuous(name = "Interruptions/day",
sec.axis = sec_axis(~./5, name = "Productivity % of best",
labels = function(b) { paste0(round(b * 100, 0), "%")})) +
theme(
axis.title.y = element_text(color = "grey"),
axis.title.y.right = element_text(color = "blue"))
Here's the result (above code + some color tweaking):
The point (aside from using sec_axis when specifying the y scale) is to multiply each value of the second data series by 5 when specifying the series. To get the labels right in the sec_axis definition, the value then needs dividing by 5 (and formatting). So the crucial parts of the above code are really *5 in the geom_line and ~./5 in sec_axis (a formula dividing the current value . by 5).
In comparison (I don't want to judge the approaches here), this is what two charts on top of one another look like:
You can judge for yourself which one better transports the message (“Don’t disrupt people at work!”). Guess that's a fair way to decide.
The full code for both images (it's not really more than what's above, just complete and ready to run) is here: https://gist.github.com/sebastianrothbucher/de847063f32fdff02c83b75f59c36a7d and a more detailed explanation is here: https://sebastianrothbucher.github.io/datascience/r/visualization/ggplot/2018/03/24/two-scales-ggplot-r.html
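For anyone who wants to try the snippet without following the links, the sketch below rebuilds dt from the five rows printed in this answer and checks that the *5 and /5 steps cancel. The helper name pct_label is introduced here for illustration; its body is the label function from the code above.

```r
# The five rows from the table in this answer, rebuilt as a data frame.
dt <- data.frame(
  when     = as.Date(c("2018-03-20", "2018-03-21", "2018-03-23",
                       "2018-03-24", "2018-03-25")),
  numinter = c(1, 5, 4, 3, 4),
  prod     = c(0.95, 0.50, 0.70, 0.75, 0.60)
)

# Label function as passed to sec_axis(): fraction -> percent string.
pct_label <- function(b) paste0(round(b * 100, 0), "%")

# Values as drawn (prod * 5) and as labelled again (/ 5, then formatted).
drawn    <- dt$prod * 5
labelled <- pct_label(drawn / 5)
print(labelled)
```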

You can create a scaling factor which is applied to the second geom and right y-axis. This is derived from Sebastian's solution.
library(ggplot2)
scaleFactor <- max(mtcars$cyl) / max(mtcars$hp)
ggplot(mtcars, aes(x=disp)) +
geom_smooth(aes(y=cyl), method="loess", col="blue") +
geom_smooth(aes(y=hp * scaleFactor), method="loess", col="red") +
scale_y_continuous(name="cyl", sec.axis=sec_axis(~./scaleFactor, name="hp")) +
theme(
axis.title.y.left=element_text(color="blue"),
axis.text.y.left=element_text(color="blue"),
axis.title.y.right=element_text(color="red"),
axis.text.y.right=element_text(color="red")
)
Note: using ggplot2 v3.0.0

The technical backbone of the solution to this challenge was provided by Kohske some 3 years ago [KOHSKE]. The topic and the technicalities around its solution have been discussed in several places here on Stack Overflow [IDs: 18989001, 29235405, 21026598]. So I shall only provide a specific variation and an explanatory walkthrough, using the above solutions.
Let us assume we do have some data y1 in group G1 to which some data y2 in group G2 is related in some way, e.g. range/scale transformed or with some noise added. So one wants to plot the data together on one plot with the scale of y1 on the left and y2 on the right.
df <- data.frame(item=LETTERS[1:5], y1=c(-0.8684, 4.2242, -0.3181, 0.5797, -0.4875), y2=c(-5.719, 205.184, 4.781, 41.952, 9.911 )) # made up!
> df
item y1 y2
1 A -0.8684 -19.154567
2 B 4.2242 219.092499
3 C -0.3181 18.849686
4 D 0.5797 46.945161
5 E -0.4875 -4.721973
If we now plot our data together with something like
ggplot(data=df, aes(label=item)) +
theme_bw() +
geom_segment(aes(x='G1', xend='G2', y=y1, yend=y2), color='grey')+
geom_text(aes(x='G1', y=y1), color='blue') +
geom_text(aes(x='G2', y=y2), color='red') +
theme(legend.position='none', panel.grid=element_blank())
it doesn't align nicely, as the smaller scale y1 obviously gets collapsed by the larger scale y2.
The trick to meet the challenge is to technically plot both data sets against the first scale y1 but report the second against a secondary axis with labels showing the original scale y2.
So we build a first helper function CalcFudgeAxis which calculates and collects features of the new axis to be shown. The function can be amended to anyone's liking (this one just scales y2 by the ratio of the two ranges).
CalcFudgeAxis = function( y1, y2=y1) {
Cast2To1 = function(x) ((ylim1[2]-ylim1[1])/(ylim2[2]-ylim2[1])*x) # x gets scaled by the ratio of the y1 range to the y2 range (no offset)
ylim1 <- c(min(y1),max(y1))
ylim2 <- c(min(y2),max(y2))
yf <- Cast2To1(y2)
labelsyf <- pretty(y2)
return(list(
yf=yf,
labels=labelsyf,
breaks=Cast2To1(labelsyf)
))
}
which yields:
> FudgeAxis <- CalcFudgeAxis( df$y1, df$y2 )
> FudgeAxis
$yf
[1] -0.4094344 4.6831656 0.4029175 1.0034664 -0.1009335
$labels
[1] -50 0 50 100 150 200 250
$breaks
[1] -1.068764 0.000000 1.068764 2.137529 3.206293 4.275058 5.343822
> cbind(df, FudgeAxis$yf)
item y1 y2 FudgeAxis$yf
1 A -0.8684 -19.154567 -0.4094344
2 B 4.2242 219.092499 4.6831656
3 C -0.3181 18.849686 0.4029175
4 D 0.5797 46.945161 1.0034664
5 E -0.4875 -4.721973 -0.1009335
Now I wrapped Kohske's solution in the second helper function PlotWithFudgeAxis (into which we throw the ggplot object and the helper object of the new axis):
library(gtable)
library(grid)
PlotWithFudgeAxis = function( plot1, FudgeAxis) {
# based on: https://rpubs.com/kohske/dual_axis_in_ggplot2
plot2 <- plot1 + with(FudgeAxis, scale_y_continuous( breaks=breaks, labels=labels))
#extract gtable
g1<-ggplot_gtable(ggplot_build(plot1))
g2<-ggplot_gtable(ggplot_build(plot2))
#overlap the panel of the 2nd plot on that of the 1st plot
pp<-c(subset(g1$layout, name=="panel", se=t:r))
g<-gtable_add_grob(g1, g2$grobs[[which(g2$layout$name=="panel")]], pp$t, pp$l, pp$b,pp$l)
ia <- which(g2$layout$name == "axis-l")
ga <- g2$grobs[[ia]]
ax <- ga$children[[2]]
ax$widths <- rev(ax$widths)
ax$grobs <- rev(ax$grobs)
ax$grobs[[1]]$x <- ax$grobs[[1]]$x - unit(1, "npc") + unit(0.15, "cm")
g <- gtable_add_cols(g, g2$widths[g2$layout[ia, ]$l], length(g$widths) - 1)
g <- gtable_add_grob(g, ax, pp$t, length(g$widths) - 1, pp$b)
grid.draw(g)
}
Now everything can be put together. The code below shows how the proposed solution could be used in a day-to-day environment. The plot call no longer plots the original data y2 but a cloned version yf (held inside the pre-calculated helper object FudgeAxis), which runs on the scale of y1. The original ggplot object is then manipulated with the helper function PlotWithFudgeAxis (based on Kohske's solution) to add a second axis preserving the scale of y2. It also draws the manipulated plot.
FudgeAxis <- CalcFudgeAxis( df$y1, df$y2 )
tmpPlot <- ggplot(data=df, aes(label=item)) +
theme_bw() +
geom_segment(aes(x='G1', xend='G2', y=y1, yend=FudgeAxis$yf), color='grey')+
geom_text(aes(x='G1', y=y1), color='blue') +
geom_text(aes(x='G2', y=FudgeAxis$yf), color='red') +
theme(legend.position='none', panel.grid=element_blank())
PlotWithFudgeAxis(tmpPlot, FudgeAxis)
This now plots as desired with two axes, y1 on the left and y2 on the right.
The above solution is, to put it plainly, a limited, shaky hack. As it plays with the ggplot internals, it will throw some warnings that scales are exchanged after the fact, etc. It has to be handled with care and may produce undesired behaviour in another setting. One may also need to fiddle with the helper functions to get the layout as desired. The placement of the legend is one such issue (it would be placed between the panel and the new axis; this is why I dropped it). The scaling / alignment of the two axes is also a bit challenging: the code above works nicely when both scales contain 0, otherwise one axis gets shifted. So there are definitely some opportunities to improve...
If one wants to save the picture, the call has to be wrapped between opening and closing a graphics device:
png(...)
PlotWithFudgeAxis(tmpPlot, FudgeAxis)
dev.off()
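The shift mentioned above, which appears when the scales do not both contain 0, comes from Cast2To1 scaling without an offset. A possible variant is sketched below; make_affine_cast and the sample vectors are hypothetical and not part of the original helpers. It maps the range of y2 onto the range of y1, offset included.

```r
# Hypothetical variant of Cast2To1: map range(y2) onto range(y1) with an offset,
# so the alignment does not rely on both ranges containing 0.
make_affine_cast <- function(y1, y2) {
  r1 <- range(y1)
  r2 <- range(y2)
  slope <- diff(r1) / diff(r2)
  function(x) r1[1] + (x - r2[1]) * slope
}

# Made-up data where neither range contains 0.
y1 <- c(2, 3.5, 5)
y2 <- c(100, 250, 400)

cast   <- make_affine_cast(y1, y2)
yf     <- cast(y2)       # y2 expressed on the y1 scale
labels <- pretty(y2)     # tick labels in original y2 units
breaks <- cast(labels)   # where those ticks sit on the y1 scale
print(yf)
```

Here yf, labels and breaks play the same roles as the list elements returned by CalcFudgeAxis.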

Here are my two cents on how to do the transformations for a secondary axis. First, you want to couple the ranges of the primary and secondary data. This is usually messy in terms of polluting your global environment with variables you don't want.
To make this easier, we'll make a function factory that produces two functions, wherein scales::rescale() does all the heavy lifting. Because these are closures, they are aware of the environment in which they were created, so they 'have a memory' of the to and from parameters generated before creation.
One function does the forward transformation: it transforms the secondary data to the primary scale.
The second function does the reverse transformation: transforms data in primary units to secondary units.
library(ggplot2)
library(scales)
# Function factory for secondary axis transforms
train_sec <- function(primary, secondary, na.rm = TRUE) {
# Thanks Henry Holm for including the na.rm argument!
from <- range(secondary, na.rm = na.rm)
to <- range(primary, na.rm = na.rm)
# Forward transform for the data
forward <- function(x) {
rescale(x, from = from, to = to)
}
# Reverse transform for the secondary axis
reverse <- function(x) {
rescale(x, from = to, to = from)
}
list(fwd = forward, rev = reverse)
}
This seems all rather complicated, but making the function factory makes all the rest easier. Now, before we make a plot, we'll produce the relevant functions by showing the factory the primary and secondary data. We'll use the economics dataset which has very different ranges for the unemploy and psavert columns.
sec <- with(economics, train_sec(unemploy, psavert))
Then we use y = sec$fwd(psavert) to rescale the secondary data to primary axis, and specify ~ sec$rev(.) as the transformation argument to the secondary axis. This gives us a plot where the primary and secondary ranges occupy the same space on the plot.
ggplot(economics, aes(date)) +
geom_line(aes(y = unemploy), colour = "blue") +
geom_line(aes(y = sec$fwd(psavert)), colour = "red") +
scale_y_continuous(sec.axis = sec_axis(~sec$rev(.), name = "psavert"))
The factory is slightly more flexible than that, because if you simply want to rescale the maximum, you can pass in data that has the lower limit at 0.
# Rescaling the maximum
sec <- with(economics, train_sec(c(0, max(unemploy)),
c(0, max(psavert))))
ggplot(economics, aes(date)) +
geom_line(aes(y = unemploy), colour = "blue") +
geom_line(aes(y = sec$fwd(psavert)), colour = "red") +
scale_y_continuous(sec.axis = sec_axis(~sec$rev(.), name = "psavert"))
Created on 2021-02-05 by the reprex package (v0.3.0)
I admit the difference in this example is not very obvious, but if you look closely you can see that the maxima are the same and the red line goes lower than the blue one.
EDIT:
This approach has now been captured and expanded in the help_secondary() function in the ggh4x package. Disclaimer: I'm the author of ggh4x.
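To see what the two closures returned by train_sec() do numerically, the linear rescaling can be written out in base R. This is a stand-in for illustration: linear_rescale, fwd, back and the sample vectors are assumptions, and the answer itself relies on scales::rescale().

```r
# Base-R stand-in for a linear rescale from one range to another.
linear_rescale <- function(x, from, to) {
  (x - from[1]) / diff(from) * diff(to) + to[1]
}

# Made-up primary and secondary series with very different ranges.
primary   <- c(2000, 8000, 15000)
secondary <- c(2, 9, 17)

from <- range(secondary)
to   <- range(primary)

fwd  <- function(x) linear_rescale(x, from = from, to = to)  # secondary -> primary units
back <- function(x) linear_rescale(x, from = to, to = from)  # primary -> secondary units

on_primary <- fwd(secondary)
print(on_primary)
print(all.equal(back(on_primary), secondary))
```

The forward function is what goes into the aes() of the secondary layer, and the reverse function is what sec_axis() needs to label the right-hand axis in the original units.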

The following article helped me to combine two plots generated by ggplot2 on a single row:
Multiple graphs on one page (ggplot2) by Cookbook for R
And here is what the code may look like in this case:
# mns is assumed to be a numeric vector; multiplot() is the helper defined in the Cookbook for R article
p1 <- ggplot() + aes(mns) +
  geom_histogram(aes(y=..density..), binwidth=0.01, colour="black", fill="white") +
  geom_vline(aes(xintercept=mean(mns, na.rm=TRUE)), color="red", linetype="dashed", size=1) +
  geom_density(alpha=.2)
p2 <- ggplot() + aes(mns) +
  geom_histogram(binwidth=0.01, colour="black", fill="white") +
  geom_vline(aes(xintercept=mean(mns, na.rm=TRUE)), color="red", linetype="dashed", size=1)
multiplot(p1, p2, cols=2)

For me the tricky part was figuring out the transformation function between the two axes. I used myCurveFit for that.
> dput(combined_80_8192 %>% filter (time > 270, time < 280))
structure(list(run = c(268L, 268L, 268L, 268L, 268L, 268L, 268L,
268L, 268L, 268L, 263L, 263L, 263L, 263L, 263L, 263L, 263L, 263L,
263L, 263L, 269L, 269L, 269L, 269L, 269L, 269L, 269L, 269L, 269L,
269L, 261L, 261L, 261L, 261L, 261L, 261L, 261L, 261L, 261L, 261L,
267L, 267L, 267L, 267L, 267L, 267L, 267L, 267L, 267L, 267L, 265L,
265L, 265L, 265L, 265L, 265L, 265L, 265L, 265L, 265L, 266L, 266L,
266L, 266L, 266L, 266L, 266L, 266L, 266L, 266L, 262L, 262L, 262L,
262L, 262L, 262L, 262L, 262L, 262L, 262L, 264L, 264L, 264L, 264L,
264L, 264L, 264L, 264L, 264L, 264L, 260L, 260L, 260L, 260L, 260L,
260L, 260L, 260L, 260L, 260L), repetition = c(8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L
), module = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "scenario.node[0].nicVLCTail.phyVLC", class = "factor"),
configname = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L), .Label = "Road-Vlc", class = "factor"), packetByteLength = c(8192L,
8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L,
8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L,
8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L,
8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L,
8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L,
8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L,
8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L,
8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L,
8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L,
8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L,
8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L, 8192L
), numVehicles = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
), dDistance = c(80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L,
80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L,
80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L,
80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L,
80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L,
80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L,
80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L,
80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L,
80L, 80L, 80L, 80L, 80L, 80L, 80L, 80L), time = c(270.166006903445,
271.173853699836, 272.175873251122, 273.177524313334, 274.182946177105,
275.188959464989, 276.189675339937, 277.198250244799, 278.204619457189,
279.212562800009, 270.164199199177, 271.168527215152, 272.173072994958,
273.179210429715, 274.184351047337, 275.18980754378, 276.194816792995,
277.198598277809, 278.202398083519, 279.210634593917, 270.210674322891,
271.212395107473, 272.218871923292, 273.219060500457, 274.220486359614,
275.22401452372, 276.229646658839, 277.231060448138, 278.240407241942,
279.2437126347, 270.283554249858, 271.293168593832, 272.298574288769,
273.304413221348, 274.306272082517, 275.309023049011, 276.317805897347,
277.324403550028, 278.332855848701, 279.334046374594, 270.118608539613,
271.127947700074, 272.133887145863, 273.135726000491, 274.135994529981,
275.136563912708, 276.140120735361, 277.144298344151, 278.146885137621,
279.147552358659, 270.206015567272, 271.214618077209, 272.216566814903,
273.225435592582, 274.234014573683, 275.242949179958, 276.248417809711,
277.248800670023, 278.249750333404, 279.252926560188, 270.217182684494,
271.218357511397, 272.224698488895, 273.231112784327, 274.238740508457,
275.242715184122, 276.249053562718, 277.250325509798, 278.258488063493,
279.261141590137, 270.282904173953, 271.284689544638, 272.294220723234,
273.299749415592, 274.30628880553, 275.312075103126, 276.31579134717,
277.321905523606, 278.326305136748, 279.333056502253, 270.258991527456,
271.260224091407, 272.270076810133, 273.27052037648, 274.274119348094,
275.280808254502, 276.286353887245, 277.287064312339, 278.294444793276,
279.296772014594, 270.333066283904, 271.33877455992, 272.345842319903,
273.350858180493, 274.353972278505, 275.360454510107, 276.365088896161,
277.369166956941, 278.372571708911, 279.38017503079), distanceToTx = c(80.255266401689,
80.156059067023, 79.98823695539, 79.826647129071, 79.76678667135,
79.788239825292, 79.734539327997, 79.74766421514, 79.801243848241,
79.765920888341, 80.255266401689, 80.15850240049, 79.98823695539,
79.826647129071, 79.76678667135, 79.788239825292, 79.735078924078,
79.74766421514, 79.801243848241, 79.764622734914, 80.251248121732,
80.146436869316, 79.984682320466, 79.82292012342, 79.761908518748,
79.796988776281, 79.736920997657, 79.745038376718, 79.802638836686,
79.770029970452, 80.243475525691, 80.127918207499, 79.978303140866,
79.816259117883, 79.749322030693, 79.809916018889, 79.744456560867,
79.738655068783, 79.788697533211, 79.784288359619, 80.260412958482,
80.168426829066, 79.992034911214, 79.830845773284, 79.7756751763,
79.778156038931, 79.732399593756, 79.752769548846, 79.799967731078,
79.757585110481, 80.251248121732, 80.146436869316, 79.984682320466,
79.822062073459, 79.75884601899, 79.801590491435, 79.738335109094,
79.74347007248, 79.803215965043, 79.771471198955, 80.250257298678,
80.146436869316, 79.983831684476, 79.822062073459, 79.75884601899,
79.801590491435, 79.738335109094, 79.74347007248, 79.803849157574,
79.771471198955, 80.243475525691, 80.130180105198, 79.978303140866,
79.816881283718, 79.749322030693, 79.80984572883, 79.744456560867,
79.738655068783, 79.790548644175, 79.784288359619, 80.246349000313,
80.137056554491, 79.980581246037, 79.818924707937, 79.753176142361,
79.808777040341, 79.741609845588, 79.740770913572, 79.796316397253,
79.777593733292, 80.238796415443, 80.119021911134, 79.974810568944,
79.814065350562, 79.743657315504, 79.810146783217, 79.749945098869,
79.737122584544, 79.781650522348, 79.791554933936), headerNoError = c(0.99999999989702,
0.9999999999981, 0.99999999999946, 0.9999999928026, 0.99999873265475,
0.77080141574964, 0.99007491438593, 0.99994396605059, 0.45588747062284,
0.93484381262491, 0.99999999989702, 0.99999999999816, 0.99999999999946,
0.9999999928026, 0.99999873265475, 0.77080141574964, 0.99008458785106,
0.99994396605059, 0.45588747062284, 0.93480223051707, 0.99999999989735,
0.99999999999789, 0.99999999999946, 0.99999999287551, 0.99999876302649,
0.46903147501117, 0.98835168988253, 0.99994427085086, 0.45235035271542,
0.93496741877335, 0.99999999989803, 0.99999999999781, 0.99999999999948,
0.99999999318224, 0.99994254156311, 0.46891362282273, 0.93382613917348,
0.99994594904099, 0.93002915596843, 0.93569767251247, 0.99999999989658,
0.99999999998074, 0.99999999999946, 0.99999999272802, 0.99999871586781,
0.76935240919896, 0.99002587758346, 0.99999881589732, 0.46179415706093,
0.93417422376389, 0.99999999989735, 0.99999999999789, 0.99999999999946,
0.99999999289347, 0.99999876940486, 0.46930769326427, 0.98837353639905,
0.99994447154714, 0.16313586712094, 0.93500824170148, 0.99999999989744,
0.99999999999789, 0.99999999999946, 0.99999999289347, 0.99999876940486,
0.46930769326427, 0.98837353639905, 0.99994447154714, 0.16330039178981,
0.93500824170148, 0.99999999989803, 0.99999999999781, 0.99999999999948,
0.99999999316541, 0.99994254156311, 0.46794586553266, 0.93382613917348,
0.99994594904099, 0.9303627789484, 0.93569767251247, 0.99999999989778,
0.9999999999978, 0.99999999999948, 0.99999999311433, 0.99999878195152,
0.47101897739483, 0.93368891853679, 0.99994556595217, 0.7571113417265,
0.93553999975802, 0.99999999998191, 0.99999999999784, 0.99999999999971,
0.99999891129658, 0.99994309267792, 0.46510628979591, 0.93442584181035,
0.99894450514543, 0.99890078483692, 0.76933812306423), receivedPower_dbm = c(-93.023492290586,
-92.388378035287, -92.205716340607, -93.816400586752, -95.023489422885,
-100.86308557253, -98.464763536915, -96.175707680373, -102.06189538385,
-99.716653422746, -93.023492290586, -92.384760627397, -92.205716340607,
-93.816400586752, -95.023489422885, -100.86308557253, -98.464201120719,
-96.175707680373, -102.06189538385, -99.717150021506, -93.022927803442,
-92.404017215549, -92.204561341714, -93.814319484729, -95.016990717792,
-102.01669022332, -98.558088145955, -96.173817001483, -102.07406915124,
-99.71517574876, -93.021813165972, -92.409586309743, -92.20229160243,
-93.805335867418, -96.184419849593, -102.01709540787, -99.728735187547,
-96.163233028048, -99.772547164798, -99.706399753853, -93.024204617071,
-92.745813384859, -92.206884754512, -93.818508150122, -95.027018807793,
-100.87000577258, -98.467607232407, -95.005311380324, -102.04157607608,
-99.724619517, -93.022927803442, -92.404017215549, -92.204561341714,
-93.813803344588, -95.015606885523, -102.0157405687, -98.556982278361,
-96.172566862738, -103.21871579865, -99.714687230796, -93.022787428238,
-92.404017215549, -92.204274688493, -93.813803344588, -95.015606885523,
-102.0157405687, -98.556982278361, -96.172566862738, -103.21784988098,
-99.714687230796, -93.021813165972, -92.409950613665, -92.20229160243,
-93.805838770576, -96.184419849593, -102.02042267497, -99.728735187547,
-96.163233028048, -99.768774335378, -99.706399753853, -93.022228914406,
-92.411048503835, -92.203136463155, -93.807357409082, -95.012865008237,
-102.00985717796, -99.730352912911, -96.165675535906, -100.92744056572,
-99.708301333236, -92.735781110993, -92.408137395049, -92.119533319039,
-94.982938427575, -96.181073124017, -102.03018610927, -99.721633629806,
-97.32940323644, -97.347613268692, -100.87007386786), snr = c(49.848348091678,
57.698190927109, 60.17669971462, 41.529809724535, 31.452202106925,
8.1976890851341, 14.240447804094, 24.122884195464, 6.2202875499406,
10.674183333671, 49.848348091678, 57.746270018264, 60.17669971462,
41.529809724535, 31.452202106925, 8.1976890851341, 14.242292077376,
24.122884195464, 6.2202875499406, 10.672962852322, 49.854827699773,
57.49079026127, 60.192705735317, 41.549715223147, 31.499301851462,
6.2853718719014, 13.937702343688, 24.133388256416, 6.2028757927148,
10.677815810561, 49.867624820879, 57.417115267867, 60.224172277442,
41.635752021705, 24.074540962859, 6.2847854917092, 10.644529778044,
24.19227425387, 10.537686730745, 10.699414795917, 49.84017267426,
53.139646558768, 60.160512118809, 41.509660845114, 31.42665220053,
8.1846370024428, 14.231126423354, 31.584125885363, 6.2494585568733,
10.654622041348, 49.854827699773, 57.49079026127, 60.192705735317,
41.55465351989, 31.509340361646, 6.2867464196657, 13.941251828322,
24.140336174865, 4.765718874642, 10.679016976694, 49.856439162736,
57.49079026127, 60.196678846453, 41.55465351989, 31.509340361646,
6.2867464196657, 13.941251828322, 24.140336174865, 4.7666691818074,
10.679016976694, 49.867624820879, 57.412299088098, 60.224172277442,
41.630930975211, 24.074540962859, 6.279972363168, 10.644529778044,
24.19227425387, 10.546845071479, 10.699414795917, 49.862851240855,
57.397787176282, 60.212457625018, 41.61637603957, 31.529239767749,
6.2952688513108, 10.640565481982, 24.178672145334, 8.0771089950663,
10.694731030907, 53.262541905639, 57.43627424514, 61.382796189332,
31.747253311549, 24.093100244121, 6.2658701281075, 10.661949889074,
18.495227442305, 18.417839037171, 8.1845086722809), frameId = c(15051,
15106, 15165, 15220, 15279, 15330, 15385, 15452, 15511, 15566,
15019, 15074, 15129, 15184, 15239, 15298, 15353, 15412, 15471,
15526, 14947, 14994, 15057, 15112, 15171, 15226, 15281, 15332,
15391, 15442, 14971, 15030, 15085, 15144, 15203, 15262, 15321,
15380, 15435, 15490, 14915, 14978, 15033, 15092, 15147, 15198,
15257, 15312, 15371, 15430, 14975, 15034, 15089, 15140, 15195,
15254, 15313, 15368, 15427, 15478, 14987, 15046, 15105, 15160,
15215, 15274, 15329, 15384, 15447, 15506, 14943, 15002, 15061,
15116, 15171, 15230, 15285, 15344, 15399, 15454, 14971, 15026,
15081, 15136, 15195, 15258, 15313, 15368, 15423, 15478, 15039,
15094, 15149, 15204, 15263, 15314, 15369, 15428, 15487, 15546
), packetOkSinr = c(0.99999999314881, 0.9999999998736, 0.99999999996428,
0.99999952114066, 0.99991568416005, 3.00628034688444e-08,
0.51497487795954, 0.99627877136019, 0, 0.011303253101957,
0.99999999314881, 0.99999999987726, 0.99999999996428, 0.99999952114066,
0.99991568416005, 3.00628034688444e-08, 0.51530974419663,
0.99627877136019, 0, 0.011269851265775, 0.9999999931708,
0.99999999985986, 0.99999999996428, 0.99999952599145, 0.99991770469509,
0, 0.45861812482641, 0.99629897628155, 0, 0.011403119534097,
0.99999999321568, 0.99999999985437, 0.99999999996519, 0.99999954639936,
0.99618434878558, 0, 0.010513119213425, 0.99641022914441,
0.00801687746446111, 0.012011103529927, 0.9999999931195,
0.99999999871861, 0.99999999996428, 0.99999951617905, 0.99991456738049,
2.6525298291169e-08, 0.51328066587104, 0.9999212220316, 0,
0.010777054258914, 0.9999999931708, 0.99999999985986, 0.99999999996428,
0.99999952718674, 0.99991812902805, 0, 0.45929307038653,
0.99631228046814, 0, 0.011436292559188, 0.99999999317629,
0.99999999985986, 0.99999999996428, 0.99999952718674, 0.99991812902805,
0, 0.45929307038653, 0.99631228046814, 0, 0.011436292559188,
0.99999999321568, 0.99999999985437, 0.99999999996519, 0.99999954527918,
0.99618434878558, 0, 0.010513119213425, 0.99641022914441,
0.00821047996950475, 0.012011103529927, 0.99999999319919,
0.99999999985345, 0.99999999996519, 0.99999954188106, 0.99991896371849,
0, 0.010410830482692, 0.996384831822, 9.12484388049251e-09,
0.011877185067536, 0.99999999879646, 0.9999999998562, 0.99999999998077,
0.99992756868677, 0.9962208785486, 0, 0.010971897073662,
0.93214999078663, 0.92943956665979, 2.64925478221656e-08),
snir = c(49.848348091678, 57.698190927109, 60.17669971462,
41.529809724535, 31.452202106925, 8.1976890851341, 14.240447804094,
24.122884195464, 6.2202875499406, 10.674183333671, 49.848348091678,
57.746270018264, 60.17669971462, 41.529809724535, 31.452202106925,
8.1976890851341, 14.242292077376, 24.122884195464, 6.2202875499406,
10.672962852322, 49.854827699773, 57.49079026127, 60.192705735317,
41.549715223147, 31.499301851462, 6.2853718719014, 13.937702343688,
24.133388256416, 6.2028757927148, 10.677815810561, 49.867624820879,
57.417115267867, 60.224172277442, 41.635752021705, 24.074540962859,
6.2847854917092, 10.644529778044, 24.19227425387, 10.537686730745,
10.699414795917, 49.84017267426, 53.139646558768, 60.160512118809,
41.509660845114, 31.42665220053, 8.1846370024428, 14.231126423354,
31.584125885363, 6.2494585568733, 10.654622041348, 49.854827699773,
57.49079026127, 60.192705735317, 41.55465351989, 31.509340361646,
6.2867464196657, 13.941251828322, 24.140336174865, 4.765718874642,
10.679016976694, 49.856439162736, 57.49079026127, 60.196678846453,
41.55465351989, 31.509340361646, 6.2867464196657, 13.941251828322,
24.140336174865, 4.7666691818074, 10.679016976694, 49.867624820879,
57.412299088098, 60.224172277442, 41.630930975211, 24.074540962859,
6.279972363168, 10.644529778044, 24.19227425387, 10.546845071479,
10.699414795917, 49.862851240855, 57.397787176282, 60.212457625018,
41.61637603957, 31.529239767749, 6.2952688513108, 10.640565481982,
24.178672145334, 8.0771089950663, 10.694731030907, 53.262541905639,
57.43627424514, 61.382796189332, 31.747253311549, 24.093100244121,
6.2658701281075, 10.661949889074, 18.495227442305, 18.417839037171,
8.1845086722809), ookSnirBer = c(8.8808636558081e-24, 3.2219795637026e-27,
2.6468895519653e-28, 3.9807779074715e-20, 1.0849324265615e-15,
2.5705217057696e-05, 4.7313805615763e-08, 1.8800438086075e-12,
0.00021005320203921, 1.9147343768384e-06, 8.8808636558081e-24,
3.0694773489537e-27, 2.6468895519653e-28, 3.9807779074715e-20,
1.0849324265615e-15, 2.5705217057696e-05, 4.7223753038869e-08,
1.8800438086075e-12, 0.00021005320203921, 1.9171738578051e-06,
8.8229427230445e-24, 3.9715925056443e-27, 2.6045198111088e-28,
3.9014083702734e-20, 1.0342658440386e-15, 0.00019591630514278,
6.4692014108683e-08, 1.8600094209271e-12, 0.0002140067535655,
1.9074922485477e-06, 8.7096574467175e-24, 4.2779443633862e-27,
2.5231916788231e-28, 3.5761615214425e-20, 1.9750692814982e-12,
0.0001960392878411, 1.9748966344895e-06, 1.7515881895994e-12,
2.2078334799411e-06, 1.8649940680806e-06, 8.954486301678e-24,
3.2021085732779e-25, 2.690441113724e-28, 4.0627628846548e-20,
1.1134484878561e-15, 2.6061691733331e-05, 4.777159157954e-08,
9.4891388749738e-16, 0.00020359398491544, 1.9542110660398e-06,
8.8229427230445e-24, 3.9715925056443e-27, 2.6045198111088e-28,
3.8819641115984e-20, 1.0237769828158e-15, 0.00019562832342849,
6.4455095380046e-08, 1.8468752030971e-12, 0.0010099091367628,
1.9051035165106e-06, 8.8085966897635e-24, 3.9715925056443e-27,
2.594108048185e-28, 3.8819641115984e-20, 1.0237769828158e-15,
0.00019562832342849, 6.4455095380046e-08, 1.8468752030971e-12,
0.0010088638355194, 1.9051035165106e-06, 8.7096574467175e-24,
4.2987746909572e-27, 2.5231916788231e-28, 3.593647329558e-20,
1.9750692814982e-12, 0.00019705170257492, 1.9748966344895e-06,
1.7515881895994e-12, 2.1868296425817e-06, 1.8649940680806e-06,
8.7517439682173e-24, 4.3621551072316e-27, 2.553168170837e-28,
3.6469582463164e-20, 1.0032983660212e-15, 0.00019385229409318,
1.9830820164805e-06, 1.7760568361323e-12, 2.919419915209e-05,
1.8741284335866e-06, 2.8285944348148e-25, 4.1960751547207e-27,
7.8468215407139e-29, 8.0407329049747e-16, 1.9380328071065e-12,
0.00020004849911333, 1.9393279417733e-06, 5.9354475879597e-10,
6.4258355913627e-10, 2.6065221215415e-05), ookSnrBer = c(8.8808636558081e-24,
3.2219795637026e-27, 2.6468895519653e-28, 3.9807779074715e-20,
1.0849324265615e-15, 2.5705217057696e-05, 4.7313805615763e-08,
1.8800438086075e-12, 0.00021005320203921, 1.9147343768384e-06,
8.8808636558081e-24, 3.0694773489537e-27, 2.6468895519653e-28,
3.9807779074715e-20, 1.0849324265615e-15, 2.5705217057696e-05,
4.7223753038869e-08, 1.8800438086075e-12, 0.00021005320203921,
1.9171738578051e-06, 8.8229427230445e-24, 3.9715925056443e-27,
2.6045198111088e-28, 3.9014083702734e-20, 1.0342658440386e-15,
0.00019591630514278, 6.4692014108683e-08, 1.8600094209271e-12,
0.0002140067535655, 1.9074922485477e-06, 8.7096574467175e-24,
4.2779443633862e-27, 2.5231916788231e-28, 3.5761615214425e-20,
1.9750692814982e-12, 0.0001960392878411, 1.9748966344895e-06,
1.7515881895994e-12, 2.2078334799411e-06, 1.8649940680806e-06,
8.954486301678e-24, 3.2021085732779e-25, 2.690441113724e-28,
4.0627628846548e-20, 1.1134484878561e-15, 2.6061691733331e-05,
4.777159157954e-08, 9.4891388749738e-16, 0.00020359398491544,
1.9542110660398e-06, 8.8229427230445e-24, 3.9715925056443e-27,
2.6045198111088e-28, 3.8819641115984e-20, 1.0237769828158e-15,
0.00019562832342849, 6.4455095380046e-08, 1.8468752030971e-12,
0.0010099091367628, 1.9051035165106e-06, 8.8085966897635e-24,
3.9715925056443e-27, 2.594108048185e-28, 3.8819641115984e-20,
1.0237769828158e-15, 0.00019562832342849, 6.4455095380046e-08,
1.8468752030971e-12, 0.0010088638355194, 1.9051035165106e-06,
8.7096574467175e-24, 4.2987746909572e-27, 2.5231916788231e-28,
3.593647329558e-20, 1.9750692814982e-12, 0.00019705170257492,
1.9748966344895e-06, 1.7515881895994e-12, 2.1868296425817e-06,
1.8649940680806e-06, 8.7517439682173e-24, 4.3621551072316e-27,
2.553168170837e-28, 3.6469582463164e-20, 1.0032983660212e-15,
0.00019385229409318, 1.9830820164805e-06, 1.7760568361323e-12,
2.919419915209e-05, 1.8741284335866e-06, 2.8285944348148e-25,
4.1960751547207e-27, 7.8468215407139e-29, 8.0407329049747e-16,
1.9380328071065e-12, 0.00020004849911333, 1.9393279417733e-06,
5.9354475879597e-10, 6.4258355913627e-10, 2.6065221215415e-05
)), class = "data.frame", row.names = c(NA, -100L), .Names = c("run",
"repetition", "module", "configname", "packetByteLength", "numVehicles",
"dDistance", "time", "distanceToTx", "headerNoError", "receivedPower_dbm",
"snr", "frameId", "packetOkSinr", "snir", "ookSnirBer", "ookSnrBer"
))
Finding the transformation function
y1 --> y2
This function maps values on the first y axis to values on the second y axis. It is used inside sec_axis to label the break points of the secondary axis.
transformation function: f(y1) = 0.025*y1 + 2.75
y2 --> y1
This function is the inverse of the one above. It is used to transform the data of the secondary y axis so that it is "normalized" according to the first y axis.
transformation function: f(y2) = 40*y2 - 110
Plotting
Note how the transformation functions are used in the ggplot call to transform the data "on-the-fly"
# Note: fun.y was renamed to fun in ggplot2 3.3.0; use fun.y on older versions.
ggplot(data=combined_80_8192 %>% filter (time > 270, time < 280), aes(x=time) ) +
stat_summary(aes(y=receivedPower_dbm ), fun=mean, geom="line", colour="black") +
stat_summary(aes(y=packetOkSinr*40 - 110 ), fun=mean, geom="line", colour="black", position = position_dodge(width=10)) +
scale_x_continuous() +
scale_y_continuous(breaks = seq(-0,-110,-10), "y_first", sec.axis=sec_axis(~.*0.025+2.75, name="y_second") )
The first stat_summary call is the one that sets the base for the first y axis.
The second stat_summary call draws the secondary data. Everything in the panel is plotted against the first y axis, so that data has to be rescaled onto the first y axis. To do that I apply the y2 --> y1 transformation function to the data: y=packetOkSinr*40 - 110
To label the second axis I use the inverse function within the scale_y_continuous call: sec.axis=sec_axis(~.*0.025+2.75, name="y_second").
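The two coefficients pairs above do not have to be worked out by hand. Here is a minimal sketch in base R, assuming the primary axis spans [-110, 0] and should line up with [0, 2.75] on the secondary axis (these ranges are my assumption, chosen because they reproduce the numbers quoted above). The helper names linear_map and apply_map are made up for this illustration.

```r
# Hypothetical helper: slope and intercept of the line mapping range `from` onto range `to`
linear_map <- function(from, to) {
  slope <- diff(to) / diff(from)
  intercept <- to[1] - slope * from[1]
  c(slope = slope, intercept = intercept)
}

# Hypothetical helper: apply a map produced by linear_map()
apply_map <- function(m, v) m[["slope"]] * v + m[["intercept"]]

y1_range <- c(-110, 0)   # assumed range of the first y axis
y2_range <- c(0, 2.75)   # assumed matching range of the second y axis

to_y2 <- linear_map(y1_range, y2_range)  # use in sec_axis(~ . * slope + intercept)
to_y1 <- linear_map(y2_range, y1_range)  # use on the data inside aes()

print(to_y2)
print(to_y1)

# The two maps are inverses of each other: a round trip returns the input
round_trip <- apply_map(to_y1, apply_map(to_y2, c(-110, -55, 0)))
print(round_trip)
```

Computing both maps from the same pair of ranges guarantees they stay inverse to each other when you change the limits.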

There's always a way.
Here's a solution that allows for totally arbitrary axes without rescaling. The idea is to generate two plots, identical except for the axis, and hack them together using the insert_yaxis_grob and get_y_axis functions in the cowplot package.
library(ggplot2)
library(cowplot)
## first plot
p1 <- ggplot(mtcars,aes(disp,hp,color=as.factor(am))) +
geom_point() + theme_bw() + theme(legend.position='top', text=element_text(size=16)) +
ylab("Horse points" )+ xlab("Display size") + scale_color_discrete(name='Transmitter') +
stat_smooth(se=F)
## same plot with different, arbitrary scale
p2 <- p1 +
scale_y_continuous(position='right',breaks=seq(120,173,length.out = 3),
labels=c('little','medium little','medium hefty'))
ggdraw(insert_yaxis_grob(p1,get_y_axis(p2,position='right')))

We definitely could build a plot with dual y axes using the base R function plot.
# pseudo dataset
df <- data.frame(x = seq(1, 1000, 1), y1 = sample.int(100, 1000, replace=T), y2 = sample(50, 1000, replace = T))
# plot first plot
with(df, plot(y1 ~ x, col = "red"))
# set new plot
par(new = T)
# plot second plot, but without axis
with(df, plot(y2 ~ x, type = "l", xaxt = "n", yaxt = "n", xlab = "", ylab = ""))
# define y-axis and put y-labs
axis(4)
with(df, mtext("y2", side = 4))

This looks like a simple question, but it turns on two more fundamental ones: A) how to present data on very different scales in one comparative chart, and B) whether this can be done without the usual rule-of-thumb R practices such as i) melting the data, ii) faceting, or iii) adding another layer to an existing one.
The solution given below satisfies both conditions: it handles the data without having to rescale it, and it uses none of the techniques mentioned above.
Here is the result,
For those interested in knowing more about this method, please follow the link below.
How to plot a 2- y axis chart with bars side by side without re-scaling the data

You can use facet_wrap(~ variable, ncol= ) on a variable to create a new comparison. It's not on the same axis, but it is similar.

I acknowledge and agree with hadley (and others), that separate y-scales are "fundamentally flawed". Having said that – I often wish ggplot2 had the feature – particularly, when the data is in wide-format and I quickly want to visualise or check the data (i.e. for personal use only).
While the tidyverse library makes it fairly easy to convert the data to long-format (such that facet_grid() will work), the process is still not trivial, as seen below:
library(tidyverse)
df.wide %>%
# Select only the columns you need for the plot.
select(date, column1, column2, column3) %>%
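# Create a row id column, kept alongside `date` in the `gather()` call below.
# (`n()` would give every row the same value; `row_number()` numbers the rows.)
mutate(id = row_number()) %>%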
# The `gather()` function converts to long-format.
# In which the `type` column will contain three factors (column1, column2, column3),
# and the `value` column will contain the respective values.
# All the while we retain the `id` and `date` columns.
gather(type, value, -id, -date) %>%
# Create the plot according to your specifications
ggplot(aes(x = date, y = value)) +
geom_line() +
# Create a panel for each `type` (ie. column1, column2, column3).
# If the types have different scales, you can use the `scales="free"` option.
facet_grid(type~., scales = "free")

I found this answer helped me the most, but there were some edge cases that it didn't seem to handle correctly: negative values, and limits with zero width (which can happen if we grab our limits from the max/min of the data). Testing seems to indicate that the version below works consistently.
I use the following code. Here I assume we have [x1,x2] that we want to transform to [y1,y2]. I handled this by transforming [x1,x2] to [0,1] (a simple enough transformation), then [0,1] to [y1,y2].
climate <- tibble(
Month = 1:12,
Temp = c(-4,-4,0,5,11,15,16,15,11,6,1,-3),
Precip = c(49,36,47,41,53,65,81,89,90,84,73,55)
)
#Set the limits of each axis manually:
ylim.prim <- c(0, 180) # in this example, precipitation
ylim.sec <- c(-4, 18) # in this example, temperature
#If a range has zero width (all values the same) the slope b becomes 0, Inf or NaN
#and the transformation breaks, so widen such ranges before computing b
if (diff(ylim.sec) == 0) {
ylim.sec <- c(ylim.sec[1]-1, ylim.sec[2]+1)
}
if (diff(ylim.prim) == 0) {
ylim.prim <- c(ylim.prim[1]-1, ylim.prim[2]+1)
}
b <- diff(ylim.sec)/diff(ylim.prim)
ggplot(climate, aes(Month, Precip)) +
geom_col() +
geom_line(aes(y = ylim.prim[1]+(Temp-ylim.sec[1])/b), color = "red") +
scale_y_continuous("Precipitation", sec.axis = sec_axis(~((.-ylim.prim[1]) *b + ylim.sec[1]), name = "Temperature"), limits = ylim.prim) +
scale_x_continuous("Month", breaks = 1:12) +
ggtitle("Climatogram for Oslo (1961-1990)")
The key parts here are that we transform the secondary y axis with ~((.-ylim.prim[1]) *b + ylim.sec[1]) and then apply the inverse to the actual values: y = ylim.prim[1]+(Temp-ylim.sec[1])/b. We should also ensure that limits = ylim.prim.
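To make the pair of transformations easier to reuse and to check, they can be wrapped in one function. This is only a sketch in base R; make_dual_axis_maps, to_primary and to_secondary are names I made up for the illustration, and the zero-width guard is the same "widen by one unit" rule used above.

```r
# Hypothetical helper: builds both directions of the axis transformation
make_dual_axis_maps <- function(ylim.prim, ylim.sec) {
  # Widen a zero-width range by one unit on each side so the slope is finite
  if (diff(ylim.prim) == 0) ylim.prim <- ylim.prim + c(-1, 1)
  if (diff(ylim.sec) == 0) ylim.sec <- ylim.sec + c(-1, 1)
  b <- diff(ylim.sec) / diff(ylim.prim)
  list(
    ylim.prim = ylim.prim,
    ylim.sec = ylim.sec,
    # secondary data -> position on the primary axis (use inside aes())
    to_primary = function(v) ylim.prim[1] + (v - ylim.sec[1]) / b,
    # primary axis position -> secondary value (use inside sec_axis())
    to_secondary = function(v) (v - ylim.prim[1]) * b + ylim.sec[1]
  )
}

maps <- make_dual_axis_maps(ylim.prim = c(0, 180), ylim.sec = c(-4, 18))

# The ends of the secondary range land on the ends of the primary range
ends_on_primary <- maps$to_primary(c(-4, 18))
print(ends_on_primary)

# Round trip: secondary -> primary -> secondary gives the input back
temps <- c(-4, 0, 11, 16)
print(maps$to_secondary(maps$to_primary(temps)))

# Degenerate input: both ranges have zero width and are widened instead of failing
flat <- make_dual_axis_maps(c(5, 5), c(2, 2))
print(flat$to_secondary(5))
```

In the plot you would then write geom_line(aes(y = maps$to_primary(Temp))) and sec_axis(~ maps$to_secondary(.)), with limits = maps$ylim.prim.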

The following incorporates Dag Hjermann's basic data and programming, improves upon user4786271's strategy to create a "transformation function" to optimally combine the plots and data axis, and responds to baptist's note that such a function can be created within R.
#Climatogram for Oslo (1961-1990)
climate <- tibble(
Month = 1:12,
Temp = c(-4,-4,0,5,11,15,16,15,11,6,1,-3),
Precip = c(49,36,47,41,53,65,81,89,90,84,73,55))
#y1 identifies the position, relative to the y1 axis,
#the locations of the minimum and maximum of the y2 graph.
#Usually this will be the min and max of y1.
#y1<-(c(max(climate$Precip), 0))
#y1<-(c(150, 55))
y1<-(c(max(climate$Precip), min(climate$Precip)))
#y2 is the Minimum and maximum of the secondary axis data.
y2<-(c(max(climate$Temp), min(climate$Temp)))
#axis combines y1 and y2 into a dataframe used for regressions.
axis<-cbind(y1,y2)
axis<-data.frame(axis)
#Regression of Temperature to Precipitation:
T2P<-lm(formula = y1 ~ y2, data = axis)
T2P_summary <- summary(lm(formula = y1 ~ y2, data = axis))
T2P_summary
#Identifies the intercept and slope of regressing Temperature to Precipitation:
T2PInt<-T2P_summary$coefficients[1, 1]
T2PSlope<-T2P_summary$coefficients[2, 1]
#Regression of Precipitation to Temperature:
P2T<-lm(formula = y2 ~ y1, data = axis)
P2T_summary <- summary(lm(formula = y2 ~ y1, data = axis))
P2T_summary
#Identifies the intercept and slope of regressing Precipitation to Temperature:
P2TInt<-P2T_summary$coefficients[1, 1]
P2TSlope<-P2T_summary$coefficients[2, 1]
#Create Plot:
ggplot(climate, aes(Month, Precip)) +
geom_col() +
geom_line(aes(y = T2PSlope*Temp + T2PInt), color = "red") +
scale_y_continuous("Precipitation", sec.axis = sec_axis(~.*P2TSlope + P2TInt, name = "Temperature")) +
scale_x_continuous("Month", breaks = 1:12) +
theme(axis.line.y.right = element_line(color = "red"),
axis.ticks.y.right = element_line(color = "red"),
axis.text.y.right = element_text(color = "red"),
axis.title.y.right = element_text(color = "red")) +
ggtitle("Climatogram for Oslo (1961-1990)")
Most noteworthy is that the new "transformation function" works better with just two data points from the data set of each axis, usually the maximum and minimum values of each set. The resulting slopes and intercepts of the two regressions enable ggplot2 to exactly pair the plots of the minimums and maximums of each axis. As user4786271 pointed out, the two regressions transform each data set and plot to the other. One transforms the break points of the first y axis to the values of the second y axis. The other transforms the data of the secondary y axis to be "normalized" according to the first y axis.
The following output shows how the axes align the minimums and maximums of each dataset:
Having the maximums and minimums match may be most appropriate; however, another benefit of this method is that the plot associated with the secondary axis can be easily shifted, if desired, by altering a programming line related to the primary axis data. The output below simply changes the minimum precipitation input in the programming line of y1 to "0", and thus aligns the minimum Temperature level with the "0" Precipitation level.
From: y1<-(c(max(climate$Precip), min(climate$Precip)))
To: y1<-(c(max(climate$Precip), 0))
Notice how the resulting new regressions and ggplot2 automatically adjusted the plot and axis to correctly align the minimum Temperature to the new "base" of the "0" Precipitation level. Likewise, one is easily able to elevate the Temperature plot so that it is more obvious. The following graph is created by simply changing the above-noted line to:
"y1<-(c(150, 55))"
The above line tells the maximum of the Temperature graph to coincide with the "150" Precipitation level, and the minimum of the temperature line to coincide with the "55" Precipitation level. Again, notice how ggplot2 and the resulting new regression outputs enable the graph to maintain correct alignment with the axis.
The above may not be a desirable output; however, it is an example of how the graph can be easily manipulated and still have correct relationships between the plots and the axis.
The incorporation of Dag Hjermann's theme improves identification of the axis corresponding to the plot.
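A regression through exactly two points is just the straight line through those points, so the slopes and intercepts used above can be cross-checked by hand. The sketch below uses base R only and the same climate values; coef() is used to read the estimates straight from the fitted model, and the closed-form variable names are mine.

```r
precip <- c(49, 36, 47, 41, 53, 65, 81, 89, 90, 84, 73, 55)
temp   <- c(-4, -4, 0, 5, 11, 15, 16, 15, 11, 6, 1, -3)

y1 <- c(max(precip), min(precip))  # anchor positions on the primary axis
y2 <- c(max(temp), min(temp))      # max and min of the secondary data

# Regression of Temperature to Precipitation, as in the answer
T2P <- lm(y1 ~ y2)
T2P_coef <- unname(coef(T2P))      # c(intercept, slope)

# Closed form of the line through the two points (y2[1], y1[1]) and (y2[2], y1[2])
slope_by_hand <- diff(y1) / diff(y2)
intercept_by_hand <- y1[1] - slope_by_hand * y2[1]

print(T2P_coef)
print(c(intercept_by_hand, slope_by_hand))

# The reverse regression gives the inverse line
P2T_coef <- unname(coef(lm(y2 ~ y1)))
print(P2T_coef)
```

Changing y1 to c(max(precip), 0) or c(150, 55), as described above, only changes the two anchor points; the same check still applies.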

The answer by Hadley gives an interesting reference to Stephen Few's report Dual-Scaled Axes in Graphs Are They Ever the Best Solution?.
I do not know what the OP means with "counts" and "rate" but a quick search gives me Counts and Rates, so I get some data about Accidents in North American Mountaineering [1]:
Years<-c("1998","1999","2000","2001","2002","2003","2004")
Persons.Involved<-c(281,248,301,276,295,231,311)
Fatalities<-c(20,17,24,16,34,18,35)
rate=100*Fatalities/Persons.Involved
df<-data.frame(Years=Years,Persons.Involved=Persons.Involved,Fatalities=Fatalities,rate=rate)
print(df,row.names = FALSE)
 Years Persons.Involved Fatalities      rate
  1998              281         20  7.117438
  1999              248         17  6.854839
  2000              301         24  7.973422
  2001              276         16  5.797101
  2002              295         34 11.525424
  2003              231         18  7.792208
  2004              311         35 11.254019
And then I tried to do the graph as Few suggested on page 7 of the aforementioned report (following the OP's request to graph the counts as a bar chart and the rates as a line chart):
The other less obvious solution, which works only for time series, is
to convert all sets of values to a common quantitative scale by
displaying percentage differences between each value and a reference
(or index) value. For instance, select a particular point in time,
such as the first interval that appears in the graph, and express
each subsequent value as the percentage difference between it and the
initial value. This is done by dividing the value at each point in
time by the value for the initial point in time and then multiplying
it by 100 to convert the rate to a percentage, as illustrated below.
df2<-df
df2$Persons.Involved <- 100*df$Persons.Involved/df$Persons.Involved[1]
df2$rate <- 100*df$rate/df$rate[1]
plot(ggplot(df2)+
geom_bar(aes(x=Years,weight=Persons.Involved))+
geom_line(aes(x=Years,y=rate,group=1))+
theme(text = element_text(size=30))
)
And this is the result:
But I do not like it a lot and I am not able to easily put a legend on it...
[1] WILLIAMSON, Jed, et al. Accidents in North American Mountaineering 2005. The Mountaineers Books, 2005.
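A note on the indexing step quoted from Few: dividing by the first value and multiplying by 100 gives an index that starts at 100, while the "percentage difference" he mentions is that index minus 100. Here is a small base R sketch of both, using the same numbers as above; index_to_first is a name I made up.

```r
persons    <- c(281, 248, 301, 276, 295, 231, 311)
fatalities <- c(20, 17, 24, 16, 34, 18, 35)
rate       <- 100 * fatalities / persons

# Hypothetical helper: express every value relative to the first one (first value = 100)
index_to_first <- function(x) 100 * x / x[1]

persons_index <- index_to_first(persons)
rate_index    <- index_to_first(rate)

# Percentage difference from the first year (first value = 0)
persons_pct_diff <- persons_index - 100
rate_pct_diff    <- rate_index - 100

print(round(data.frame(persons_index, rate_index, persons_pct_diff, rate_pct_diff), 1))
```

Both series now share one unit (percent of the 1998 value), so they can go on a single y axis.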

Related

How to plot: Connected BEFORE and AFTER hormone levels with lines? [duplicate]

This question already has answers here:
Lines connecting jittered points - dodging by multiple groups
I have never posted on stack overflow (or any coding website) so I hope I can ask this well...
I am trying to make a plot showing how corticosterone (a hormone) increases in 30 minutes from baseline (base) to stress-induced (SI) levels in birds.
I captured starlings and took a baseline blood sample (Basecort), then waited 30 minutes and took a second blood sample (SIcort).
I would like to make a plot with each individual bird's Basecort and SIcort connected by lines.
(I have been on Google for 2 hours (not an exaggeration) and can't make anything work).
I used the following code to make this plot:
# create list of variables
x <- list('Base CORT' = df_adults$Base.cort, 'SI CORT' = df_adults$SI.cort)
x
# create plot that contains one strip chart per variable
stripchart(x,
main = 'Individual Changes in CORT',
xlab = 'CORT Sample',
col = c('#9A8822', '#F5CDB4'),
pch = 16,
method = 'jitter',
vertical = TRUE)
SEE PLOT HERE
I can't get any kind of "group" variable to work.
Does anyone have a clue how to connect the dots by BirdID?
This is what my dataframe looks like:
Dataframe
Thank you SO MUCH to anyone who's able to help.
As noted in the comments, this has been asked in other threads. To help you, here is a suggestion for how to do this with your data (as a wiki; I will close this question thereafter).
Please do your readers (and reviewers!) a favour and plot your data as a scatter plot!
library(tidyr)
library(dplyr)
library(ggplot2)
## I've slightly modified the data from above suggested threads.
df <- structure(list(BirdID = c("id_1", "id_2", "id_3", "id_4", "id_5", "id_6", "id_7", "id_8", "id_9", "id_10", "id_11", "id_12", "id_13", "id_14", "id_15", "id_16", "id_17", "id_18", "id_19", "id_20"), Basecort = c(9L, 7L, 2L, 2L, 1L, 5L, 6L, 7L, 5L, 9L, 5L, 2L, 9L, 4L, 6L, 10L, 4L, 10L, 7L, 9L), SIcort = c(4L, 1L, 7L, 3L, 5L, 10L, 10L, 9L, 5L, 9L, 5L, 10L, 1L, 3L, 10L, 6L, 4L, 9L, 6L, 8L)), row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"))
df %>%
pivot_longer(-BirdID, names_to = "type", values_to = "value") %>%
mutate(value = jitter(value),
x = jitter(as.integer(factor(type)))) %>%
ggplot(aes(x, value, group = BirdID)) +
geom_point() +
geom_line() +
## you will need to change the x axis labels
scale_x_continuous(breaks = 1:2, labels = c("Basecort", "SIcort"))
## MUCH better
ggplot(df) +
geom_point(aes(Basecort, SIcort)) +
## you can add a line of equality to make it even more intuitive
geom_abline(intercept = 0, slope = 1, lty = 2, linewidth = .2) +
coord_equal()
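Since the question started from base graphics, here is a rough base R sketch of the connected before/after plot as well. It is not part of the suggestion above, just an alternative under the assumption that each bird has exactly one Basecort and one SIcort value; the five values are the first five birds from the example data.

```r
# First five birds from the example data above
basecort <- c(9, 7, 2, 2, 1)
sicort   <- c(4, 1, 7, 3, 5)

# Empty plot with the two samples at x = 1 and x = 2
plot(NA, xlim = c(0.5, 2.5), ylim = range(basecort, sicort),
     xaxt = "n", xlab = "CORT Sample", ylab = "CORT",
     main = "Individual Changes in CORT")
axis(1, at = 1:2, labels = c("Base CORT", "SI CORT"))

# One line segment per bird, from its baseline value to its stress-induced value
segments(x0 = 1, y0 = basecort, x1 = 2, y1 = sicort, col = "grey50")

# Points on top of the segments
points(rep(1, length(basecort)), basecort, pch = 16, col = "#9A8822")
points(rep(2, length(sicort)), sicort, pch = 16, col = "#F5CDB4")
```

Because segments() is vectorised, the pairing is by position in the two vectors, so both must be in the same bird order.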

GGplot geom_line works but ggplotly() only connect points of identical continuous values

I can create a coloured graph in ggplot where the geom_line() works as expected. However, when I pipe to ggplotly(), the line starts/ends at seemingly random data points. How can I make it look like this, but with a tooltip?
Instead of:
library(ggplot2)
library(plotly)
# df
data <- structure(list( Percent = c(0.32, 0.23, 0.75, 0.25, 0.482, 0.421, 0.5114, 0.3423, 0.27, 0.4324, 0.347, 0.377, 0.26,
0.375, 0.18604, 0.241378, 0.3095, 0.348837209, 0.33333, 0.1875, 0.2820, 0.65, 0.72, 0.75, 0.81, 0.87, 0.8244), finalpoint = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 0.8244), date = structure(c(18262, 18293, 18322, 18353, 18383, 18414, 18444, 18475, 18506, 18536,
18567, 18597, 18628, 18659, 18687, 18718, 18748, 18779, 18809, 18840, 18871, 18901, 18932, 18962, 18993, 19024, 19052), class = "Date"), Status_perc = structure(c(1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L), levels = c("<70%", "70-80%", "≥80%"), class = "factor")), row.names = c(NA,
-27L), class = c("tbl_df", "tbl", "data.frame"))
# create and save plot
test <- data %>%
ggplot( aes ( x = date, y = Percent,
label = finalpoint ,
colour= Percent,
group = 1)) + # Not sure why, but I have to add this
geom_line( ) +
geom_point ( ) +
geom_text(aes(label = ifelse(is.na(finalpoint), "", sprintf("%1.1f%%",finalpoint*100))) ,
nudge_y = +0.2, nudge_x = -50 ) + # Add label for final point, formatted as %.
scale_y_continuous(limits = c(0,1)) +
scale_colour_gradient(low = "red", high = "green",
limits = c(0,1))
# Pipe through ggplotly for interactivity
test %>% plotly::ggplotly( ) %>%
# tooltip = c("Percent", "edtriage", "Num_Denom" , ")) %>%
config(displayModeBar = F)

R: How to add fading geom_line() colours centred on the datapoint itself rather than the subsequent linking line?

I have created a ggplot that looks like this, except I want the colours to be 'centered' around the datapoint itself and then have a gradient/fade into the colour assigned to the next datapoint. Currently, it takes the factor assigned to one month and then carries that colour in the connecting line rather than centering:
These are the fading colours I am aiming for:
library(ggplot2)
library(dplyr) # provides %>%
# Create dataframe
data <- structure(list( Percent = c(0.32, 0.23, 0.75, 0.25, 0.482, 0.421, 0.5114, 0.3423, 0.27, 0.4324, 0.347, 0.377, 0.26,
0.375, 0.18604, 0.241378, 0.3095, 0.348837209, 0.33333, 0.1875, 0.2820, 0.65, 0.72, 0.75, 0.81, 0.87, 0.8244), finalpoint = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 0.8244), date = structure(c(18262, 18293, 18322, 18353, 18383, 18414, 18444, 18475, 18506, 18536,
18567, 18597, 18628, 18659, 18687, 18718, 18748, 18779, 18809, 18840, 18871, 18901, 18932, 18962, 18993, 19024, 19052), class = "Date"), Status_perc = structure(c(1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L), levels = c("<70%", "70-80%", "≥80%"), class = "factor")), row.names = c(NA,
-27L), class = c("tbl_df", "tbl", "data.frame"))
# Create custom colours (red, yellow, green) based on indicator status (Status_perc)
cols_status <- c("#A20000","#F6BE00" , "#C4C224")
data %>%
ggplot( aes ( x = date, y = Percent,
label = finalpoint ,
colour= Status_perc,
group =1)) + # Not sure why, but I have to add this
geom_line( ) +
geom_point () +
geom_text(aes(label = ifelse(is.na(finalpoint), "", sprintf("%1.1f%%",finalpoint*100))) ,
nudge_y = +0.2, nudge_x = -50 ) + # Add label for final point, formatted as %.
geom_hline( yintercept = 0.8, colour = "darkgrey" , linetype = 2, size = 0.4, alpha = 0.8) + # 80% Goal
scale_colour_manual( values = cols_status )

Rearrange stacked barplot legend labels without changing plot (and fix tick marks) in R

Is there a way to change the order of factor levels in a stacked barplot legend without changing the order of the plot too (and without mislabeling the data)? I'd like to change the order to "Presence" first, then "Absence".
I'm also having trouble with the tick marks being slightly shifted to one side.
dput(prop)
structure(list(WYR = c(2005L, 2005L, 2006L, 2006L, 2007L, 2007L,
2008L, 2008L, 2009L, 2009L, 2010L, 2010L, 2011L, 2011L, 2012L,
2012L, 2013L, 2013L, 2014L, 2014L, 2015L, 2015L, 2016L, 2016L,
2017L, 2017L, 2018L, 2018L, 2019L, 2019L, 2020L, 2020L), CYR = c(2005L,
2005L, 2006L, 2006L, 2007L, 2007L, 2008L, 2008L, 2009L, 2009L,
2010L, 2010L, 2011L, 2011L, 2012L, 2012L, 2013L, 2013L, 2014L,
2014L, 2015L, 2015L, 2016L, 2016L, 2017L, 2017L, 2018L, 2018L,
2019L, 2019L, 2020L, 2020L), class = structure(c(1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("prop_zero",
"prop_nonzero"), class = "factor"), proportions = c(0.170212765957447,
0.829787234042553, 0.170212765957447, 0.829787234042553, 0.361702127659574,
0.638297872340426, 0.234042553191489, 0.765957446808511, 0.234042553191489,
0.765957446808511, 0.434782608695652, 0.565217391304348, 0.58695652173913,
0.41304347826087, 0.574468085106383, 0.425531914893617, 0.51063829787234,
0.48936170212766, 0.595744680851064, 0.404255319148936, 0.608695652173913,
0.391304347826087, 0.51063829787234, 0.48936170212766, 0.404255319148936,
0.595744680851064, 0.319148936170213, 0.680851063829787, 0.468085106382979,
0.531914893617021, 0.608695652173913, 0.391304347826087)), row.names = c(NA,
-32L), class = c("tbl_df", "tbl", "data.frame"))
ggplot(prop, aes(x = CYR, y = proportions, fill = class)) +
geom_bar(position = "fill", stat = "identity") +
scale_fill_manual(values = c("grey70", "grey20"), labels = c("Absence", "Presence")) +
scale_y_continuous(limits = c(0, 1.0), expand = expansion(mult = c(0, 0.05))) +
scale_x_continuous(breaks = years, labels = ~ rep("", length(.x))) +
# CYR labels
annotate(
geom = "text",
x = prop$CYR,
y = -Inf,
label = prop$CYR,
size = 6.5 / .pt,
vjust = 2.5
) +
# WYR labels
annotate(
geom = "text",
x = prop$CYR,
y = -Inf,
label = prop$WYR,
size = 6.5 / .pt,
vjust = 4,
color = "grey"
) +
# CYR title
annotate(
geom = "text",
x = -Inf,
y = -Inf,
label = c("CYR"),
vjust = 2.5, hjust = 1,
size = 6.5 / .pt
) +
# WYR title
annotate(
geom = "text",
x = -Inf,
y = -Inf,
label = c("WYR"),
vjust = 4, hjust = 1,
size = 6.5 / .pt,
color = "grey") +
coord_cartesian(clip = "off") +
theme(
axis.text.x.bottom = element_text(margin = margin(t = 8.8, b = 8.8)),
axis.title.x = element_blank(),
axis.text.y = element_text(size = 10),
axis.title.y = element_text(margin = margin(t = 0, r = 10, b = 0, l = 0), size = 14),
axis.ticks = element_line(colour = "black", size = 1),
legend.title=element_blank(),
panel.border = element_rect(fill = NA, color = "black", size = 1),
plot.title = element_text(hjust = 0.5)) +
labs(y = "% presence/absence") +
ggtitle("DRY SEASONS")
Found the answer here!: Flip ordering of legend without altering ordering in plot
Just add this code to the end of the ggplot: + guides(fill = guide_legend(reverse = TRUE))
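For anyone who wants to try it in isolation, here is a cut-down sketch. It assumes ggplot2 is installed; prop_small is a made-up four-row subset of prop (values copied from the dput above), and the x axis is simplified to a factor so the example stays short.

```r
library(ggplot2)

# Hypothetical four-row subset of `prop`, values taken from the dput above
prop_small <- data.frame(
  CYR = c(2005L, 2005L, 2006L, 2006L),
  class = factor(rep(c("prop_zero", "prop_nonzero"), 2),
                 levels = c("prop_zero", "prop_nonzero")),
  proportions = c(0.170212765957447, 0.829787234042553,
                  0.170212765957447, 0.829787234042553)
)

p <- ggplot(prop_small, aes(x = factor(CYR), y = proportions, fill = class)) +
  geom_col(position = "fill") +
  scale_fill_manual(values = c("grey70", "grey20"),
                    labels = c("Absence", "Presence")) +
  # Reverse only the order of the legend keys; the factor levels,
  # and therefore the stacking order in the bars, are left alone
  guides(fill = guide_legend(reverse = TRUE))

p
```

The labels stay attached to the right levels because the factor itself is not reordered; only the legend is drawn in reverse.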

How can I translate this SQL code to R script using dplyr?

I'm currently working on a project and I want to summarize a column from a joined table twice. SQL code is this:
SELECT M.date,T.team_long_name AS Home_Team, M.home_team_goal, Te.team_long_name AS Away_Team, M.away_team_goal
FROM Match AS M JOIN Team AS T
ON T.team_api_id = M.home_team_api_id
JOIN Team AS Te
ON Te.team_api_id = M.away_team_api_id
WHERE match_api_id = 539848;
...and the result is this:
Database tables are as shown here:
I hope that I have provided all the information needed.
Question: How can I have the same result in R by only using dplyr library?
Table names and structure for the first 10 rows as below:
Match:
structure(list(id = 1:10, country_id = c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L), league_id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), season = c("2008/2009", "2008/2009", "2008/2009",
"2008/2009", "2008/2009", "2008/2009", "2008/2009", "2008/2009",
"2008/2009", "2008/2009"), stage = c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 10L), date = c("2008-08-17 00:00:00", "2008-08-16 00:00:00",
"2008-08-16 00:00:00", "2008-08-17 00:00:00", "2008-08-16 00:00:00",
"2008-09-24 00:00:00", "2008-08-16 00:00:00", "2008-08-16 00:00:00",
"2008-08-16 00:00:00", "2008-11-01 00:00:00"), match_api_id = c(492473L,
492474L, 492475L, 492476L, 492477L, 492478L, 492479L, 492480L,
492481L, 492564L), home_team_api_id = c(9987L, 10000L, 9984L,
9991L, 7947L, 8203L, 9999L, 4049L, 10001L, 8342L), away_team_api_id = c(9993L,
9994L, 8635L, 9998L, 9985L, 8342L, 8571L, 9996L, 9986L, 8571L
), home_team_goal = c(1L, 0L, 0L, 5L, 1L, 1L, 2L, 1L, 1L, 4L),
away_team_goal = c(1L, 0L, 3L, 0L, 3L, 1L, 2L, 2L, 0L, 1L
)), row.names = c(NA, 10L), class = "data.frame")
Team:
structure(list(id = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 614L, 1034L), team_api_id = c(9987L,
9993L, 10000L, 9994L, 9984L, 8635L, 9991L, 9998L, 7947L, 9985L,
8203L, 8342L, 9999L, 8571L, 4049L, 9996L, 10001L, 9986L, 9997L,
9989L), team_long_name = c("KRC Genk", "Beerschot AC", "SV Zulte-Waregem",
"Sporting Lokeren", "KSV Cercle Brugge", "RSC Anderlecht", "KAA Gent",
"RAEC Mons", "FCV Dender EH", "Standard de Liège", "KV Mechelen",
"Club Brugge KV", "KSV Roeselare", "KV Kortrijk", "Tubize", "Royal Excel Mouscron",
"KVC Westerlo", "Sporting Charleroi", "Sint-Truidense VV", "Lierse SK"
)), row.names = c(NA, 20L), class = "data.frame")
In the desired result I used match_api_id = 539848 but as it is not included in this sample data, use one of your own choice.
The main issue is to be able to have team_long_name twice in the result but for different teams, matching by their team_api_id 's.
Up front, the dbplyr pipe:
tbl_match <- tbl(fakedb, "Match")
tbl_team <- tbl(fakedb, "Team")
tbl_match %>%
filter(match_api_id == 492477) %>%
inner_join(select(tbl_team, home_team_api_id = team_api_id, Home_Team = team_long_name),
by = "home_team_api_id") %>%
inner_join(select(tbl_team, away_team_api_id = team_api_id, Away_Team = team_long_name),
by = "away_team_api_id") %>%
select(date, Home_Team, home_team_goal, Away_Team, away_team_goal) %>%
collect()
Edited to include collect(), since without it the output is not a proper frame and/or may not include all relevant data.
The pipe corresponds to the following DBI call:
DBI::dbGetQuery(fakedb, some_long_query)
Backfill from your sample data. Note that your data is inconsistent and incomplete, so I had to make some assumptions/translations. For instance, your first structure, which I'm inferring is Match, does not match the schema as depicted in your picture: it includes extra columns like season and *_team_goal. Also, your queried match_api_id of 539848 is not in the sample data, so I used one that was present. (In the future, I suggest that your code and sample data should be consistent with regards to things like this.)
Code to generate a fake database for the purposes of this answer. Starting with your two structures as Match and Team.
library(dbplyr)
library(dplyr)
fakedb <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(fakedb, Match)
copy_to(fakedb, Team)
some_long_query <- '
SELECT
M.date, T.team_long_name AS Home_Team, M.home_team_goal,
Te.team_long_name AS Away_Team, M.away_team_goal
FROM
Match AS M
JOIN Team AS T ON T.team_api_id = M.home_team_api_id
JOIN Team AS Te ON Te.team_api_id = M.away_team_api_id
WHERE
match_api_id = 492477;' # 539848
DBI::dbGetQuery(fakedb, some_long_query)
# date Home_Team home_team_goal Away_Team away_team_goal
# 1 2008-08-16 00:00:00 FCV Dender EH 1 Standard de Liège 3
tbl_match <- tbl(fakedb, "Match")
tbl_team <- tbl(fakedb, "Team")
tbl_match %>%
filter(match_api_id == 492477) %>%
inner_join(select(tbl_team, home_team_api_id = team_api_id, Home_Team = team_long_name),
by = "home_team_api_id") %>%
inner_join(select(tbl_team, away_team_api_id = team_api_id, Away_Team = team_long_name),
by = "away_team_api_id") %>%
select(date, Home_Team, home_team_goal, Away_Team, away_team_goal) %>%
collect()
# A tibble: 1 x 5
# date Home_Team home_team_goal Away_Team away_team_goal
# <chr> <chr> <int> <chr> <int>
# 1 2008-08-16 00:00:00 FCV Dender EH 1 Standard de Liège 3
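If Match and Team are ordinary in-memory data frames rather than database tables, the same double join can be sketched with dplyr alone. This is an illustration under that assumption, not a replacement for the dbplyr pipe above; Match_small and Team_small are made-up two-match stand-ins whose values are copied from your sample data.

```r
library(dplyr)

# Hypothetical cut-down copies of the Match and Team tables
Match_small <- data.frame(
  match_api_id = c(492477L, 492478L),
  date = c("2008-08-16 00:00:00", "2008-09-24 00:00:00"),
  home_team_api_id = c(7947L, 8203L),
  away_team_api_id = c(9985L, 8342L),
  home_team_goal = c(1L, 1L),
  away_team_goal = c(3L, 1L)
)
Team_small <- data.frame(
  team_api_id = c(7947L, 9985L, 8203L, 8342L),
  team_long_name = c("FCV Dender EH", "Standard de Liège",
                     "KV Mechelen", "Club Brugge KV")
)

result <- Match_small %>%
  filter(match_api_id == 492477) %>%
  # First join: look up the home team's name
  inner_join(Team_small, by = c("home_team_api_id" = "team_api_id")) %>%
  rename(Home_Team = team_long_name) %>%
  # Second join against the same table: look up the away team's name
  inner_join(Team_small, by = c("away_team_api_id" = "team_api_id")) %>%
  rename(Away_Team = team_long_name) %>%
  select(date, Home_Team, home_team_goal, Away_Team, away_team_goal)

print(result)
```

Renaming team_long_name right after each join plays the role of the two aliases (T and Te) in the SQL, so the second join does not collide with the first.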