create separate legend for each row of facet_grid - ggplot2

I am using facet_grid() to plot a large number of experiments, yielding a 9x6 array of panels. Each panel is a stacked bar graph with variables indicated by color, and each row shares its own common set of variables.
My code...
ggplot(data, aes(x = enzyme_drug, y = counts, fill = species)) +
  geom_bar(position = "fill", stat = "identity") +
  theme(axis.text.x = element_text(angle = 90)) +
  guides(fill = guide_legend(ncol = 1)) +
  facet_grid(pH ~ combo, scales = "free")
The only problem with putting them all together into a facet grid is that the number of variables indicated by color in the figure legend is too high, resulting in adjacent colors that are hard to differentiate.
An easy way out of this would be to create a separate legend for each row, with the small number of variables for each row being much easier to differentiate.
Any ideas on how to do this?
The other alternative, I realise, is to build a separate facet_grid for each row in a loop, collect them in a list, and put them together with ggarrange(). This yields unwieldy x-axis labels repeated for each row. I could remove the x-axes manually for each row in the arrangement, but surely there is a simpler method?
Thanks in advance...

Related

why is ggplot2 geom_col misreading discrete x axis labels as continuous?

Aim: plot a column chart representing concentration values at discrete sites
Problem: the 14 site labels are numeric, so I think ggplot2 is assuming continuous data and adding spaces for what it sees as 'missing numbers'. I only want 14 columns with 14 marks/labels, matching the 14 values in the dataframe. I've tried converting the sites to factors and to characters, but neither works.
Also, how do you ensure the y-axis ends at '0', so the bottom of the columns meet the x-axis?
Thanks
Data:
Sites: 2,4,6,7,8,9,10,11,12,13,14,15,16,17
Concentration: 10,16,3,15,17,10,11,19,14,12,14,13,18,16
You have two questions in one with two pretty straightforward answers:
1. How to force a discrete axis when your column is a continuous one? To make ggplot2 draw a discrete axis, the data must be discrete. You can force your numeric data to be discrete by converting to a factor. So, instead of x=Sites in your plot code, use x=as.factor(Sites).
2. How to eliminate the white space below the columns in a column plot? You can control the padding around the y axis via the expand= argument of scale_y_continuous(). By default, the axis extends a bit past the range of the plotted data (in this case, from 0 to the max Concentration). Check the documentation for expansion() for more details; here I'm using mult=, which pads each end of the axis by a fraction of the data range. I'm using 0 for the lower end so the axis starts exactly at the base of the columns (0), and 0.05 for the upper end so the axis extends about 5% past the max value (this is the default, I believe).
Here's the code and resulting plot.
library(ggplot2)
df <- data.frame(
  Sites = c(2,4,6,7,8,9,10,11,12,13,14,15,16,17),
  Concentration = c(10,16,3,15,17,10,11,19,14,12,14,13,18,16)
)
ggplot(df, aes(x=as.factor(Sites), y=Concentration)) +
  geom_col(color="black", fill="lightblue") +
  scale_y_continuous(expand=expansion(mult=c(0, 0.05))) +
  theme_bw()

How to fix the y-axis scale in a plot

I am fitting a line to estimate the slope of each of my graphs. The data sets are the same size. Look at these two pictures: the first one seems to have the larger slope, but that is not true; the second one does. Because the two y-axes have different scales, the first one only looks steeper. Is there any way to fix the scale of the y-axis so that I can see by eye which one has the bigger slope?
code:
import numpy as np
import pandas as pd

# x is the row index 0..n-1, e.g. array([0, 1, 2, ..., 3598, 3599, 3600])
x = np.arange(df.shape[0])
df1[skill] = pd.to_numeric(df1[skill])
# Fit a straight line (degree-1 polynomial); fit[0] is the slope
fit = np.polyfit(x, df1[skill], 1)
fit_fn = np.poly1d(fit)
df['fit_fn(x)'] = fit_fn(x)
df[['Hodrick-Prescott filter', skill, 'fit_fn(x)']].plot(title=skill + date)
Two ways:
One, use matplotlib.pyplot.axis to get the axis limits of the first figure, then pass them to the same function to give the second figure the same limits. You could also use get_ylim and set_ylim, which are specific to the y-axis but require a direct reference to the Axes object.
Two, plot both in a single figure created with subplots and set the sharey argument to True (my preferred option, depending on the desired use).
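As a rough illustration of both options, here is a minimal sketch. The two series (shallow and steep), their slopes, and the helper fit_line are made up for the example and stand in for the asker's data; only the matplotlib and NumPy calls are real.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: two noisy linear series with different true slopes
rng = np.random.default_rng(0)
x = np.arange(100)
shallow = 0.5 * x + rng.normal(0, 2, x.size)
steep = 2.0 * x + rng.normal(0, 2, x.size)


def fit_line(x, y):
    """Return the fitted slope and the fitted values of a degree-1 polyfit."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, slope * x + intercept


# Way two: one figure, two panels, a shared y-axis
fig, axes = plt.subplots(1, 2, sharey=True, figsize=(8, 3))
slopes = []
for ax, y, name in zip(axes, (shallow, steep), ("series A", "series B")):
    slope, fitted = fit_line(x, y)
    slopes.append(slope)
    ax.plot(x, y, label="data")
    ax.plot(x, fitted, label="fit")
    ax.set_title(f"{name} (slope {slope:.2f})")
    ax.legend()

# Way one: separate figures, copy the y limits from one Axes to the other
fig_a, ax_a = plt.subplots()
ax_a.plot(x, steep)
fig_b, ax_b = plt.subplots()
ax_b.plot(x, shallow)
ax_b.set_ylim(ax_a.get_ylim())  # reuse the limits of the first plot
```

With a shared y-axis the steeper fit also looks steeper on screen, so the slopes can be compared by eye. Copying limits works when the plots must stay in separate figures, but the plot with the larger range should be the source of the limits, otherwise data may be clipped.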

Adjusting the y-axis in ggplot (bar size, ordering, formatting)

I have the following data.table:
I would like to have a plot which shows the columns symbol and value in a box plot. The boxes should be ordered by the column value.
My code, that I've tried:
plot1 <- ggplot(symbols, aes(symbol, value, fill = from)) +
  geom_bar(stat = 'identity') +
  ggtitle(paste0("Total quantity traded: ", format(sum(symbols$quantity), scientific = FALSE, nsmall = 2, big.mark = " "))) +
  theme_bw()
plot1
This returns the following plot:
What I would like to change:
- flip x- and y-axis
- show the correct height of boxes (y-axis)...currently the relation between the boxes is not correct.
- decreasing order of the boxes by columns value
- format the y-axis with two digits
- make the x-axis readable...currently the x-axis is just a long bunch of what is written in column symbol.
Thanks upfront for the help!
To make things a bit easier, it is suggested that you post your data frame as the output of dput(your.data.frame), which presents code that can be used to replicate your dataset in r.
With that being said, I recreated your data (it was not too big)--some numbers were rounded a bit to make things easier.
A few comments:
y-axis numbers are odd: The numbers on your y-axis are not numeric. If you type str(your.data.frame) you'll probably notice that "value" is not numeric, but a character or factor. This can be easily remedied via: df$value <- as.numeric(df$value), where df is your dataframe.
flipping axis: You can use coord_flip() (typically added to the end of your ggplot call). Be warned that this only flips how the axes are drawn: the aesthetics keep their names, so the y aesthetic (and anything like scale_y_log10()) still refers to value even though it is now drawn horizontally.
your dataframe name is also a function name in R: symbols() is a base R graphics function. This is probably not causing any issues here, but be careful about giving your datasets (and columns/variables) names that are already used elsewhere in R.
geom_col vs geom_bar: Check out this documentation link for some description on the differences between geom_bar and geom_col. Basically, you want to use geom_bar when your y-axis is count, and geom_col when your y-axis is a value. Here, you want to plot a value, so choose geom_col(), and not geom_bar().
Fixing the issues in plot
Here's the representation of your data (note I rounded... hopefully got the actual data correct, because I manually had to copy each value):
   from  symbol  quantity      usd    value
1   BTC BTCUSDT 12910.470 6776.340 87485737
2   ETH ETHUSDT  6168.730  154.398   952445
3   BNB BNBUSDT 51002.650   14.764   753017
4   BNB  BNBBTC 31071.280   14.764   458745
5   ETH  ETHBTC  2216.576  154.398   342236
6   LTC LTCUSDT  4332.024   40.481   175368
7   BNB  BNBETH  3150.030   14.764    46507
8   LTC  LTCBTC   922.560   40.481    37346
9   LTC  LTCBNB   521.476   40.481    21110
10  NEO NEOUSDT  2438.353    7.203    17564
11  NEO  NEOBTC   417.930    7.203     3010
Here's the basic plot, flipped:
ggplot(df, aes(symbol, value, fill=from)) +
  geom_col() +
  coord_flip()
The problem here is that when you plot values... BTCUSDT is huge in comparison. I would suggest you plot on log of value. See this link for some advice on how to do that. I like the scale_y_log10() function, since it just works here pretty well:
ggplot(df, aes(symbol, value, fill=from)) +
  geom_col() +
  scale_y_log10() +
  coord_flip()
If you wanted to keep the columns in the vertical orientation, you can still do that and avoid having the labels run into each other on the x-axis. In that case, rotate the labels via theme(axis.text.x=...). Note the adjustment to the horizontal alignment (hjust=1), which forces the labels to be "right-aligned":
ggplot(df, aes(symbol, value, fill=from)) +
  geom_col() +
  scale_y_log10() +
  theme(axis.text.x=element_text(angle=45, hjust=1))

R: Is it possible to render more precise outline weights in ggplot2?

Create a simple stacked bar chart:
require(ggplot2)
g <- ggplot(mpg, aes(class)) +
  geom_bar(aes(fill = drv, size=ifelse(drv=="4",2,1)),color="black",width=.5)
I can change the size values in size=ifelse(drv=="4",2,1) to lots of different values and I still get the same two line weights. Is there a way around this? (2,1) produces the same chart as (1.1,1) and (10,1). Ideally the thicker weight should be just a bit thicker than the standard outline, rather than ~10x bigger.
As added background, you can set size outside of aes() and have the outline width scale as you'd expect, but I can't put the ifelse condition outside of aes() without getting an error.
The two size values you give inside aes() are just arbitrary levels; ggplot does not take them at face value.
You can use
ggplot(mpg, aes(class)) +
  geom_bar(
    aes(fill = drv, size = drv == "4"),
    color = "black", width = .5) +
  scale_size_manual(values = c(1, 2))
Here, the values parameter lets you specify the precise size for each level you define in the aesthetics (FALSE first, then TRUE).

Display x-axis label at certain data-points in mschart

I'm trying to plot a normal distribution curve (as a SeriesChartType.Spline) with a selected item's location on that curve. My x-axis is a little messy, so I'm trying to tidy it up, but I can't figure out a way to show the axis label at specific locations.
I'd like to show the value at {x(0), x(mean), x(n)} and also the x-axis value of the selected item's data-point on the curve.
I've tried playing with the
.ChartAreas(0).AxisX.Interval
but I don't necessarily have a standard interval range.
Is there a way I can display the x-axis label only at specified data points?
[EDIT]:
As suggested I implemented several custom labels for this chart. They are not exactly what I'd call intuitive to use but they did the job in the end.
'//create x-axis labels
mu = Math.Round(mu, 2, MidpointRounding.AwayFromZero)
bci = Math.Round(CDbl(bci), 2, MidpointRounding.AwayFromZero)
Dim muLabel = String.Format("{0}({1})", "µ", mu)
'//Fit axis
With .ChartAreas(0)
    With .AxisX
        .MajorGrid.LineWidth = 0
        .MajorTickMark.Enabled = False
        .Minimum = 0
        With .CustomLabels
            .Add(New CustomLabel(0, 0.4, 0, 0, LabelMarkStyle.LineSideMark)) '//origin label
            .Add(New CustomLabel(mu-10, mu + 10, muLabel, 0, LabelMarkStyle.LineSideMark)) '//mean label
            .Add(New CustomLabel(bci-10, bci + 10, bci.ToString, 0, LabelMarkStyle.LineSideMark)) '//index label
        End With
        With .LabelStyle
            .Format = "{0.00}"
            .Font = New Font("Microsoft Sans Serif", 8)
        End With
...
The ranges I picked for the labels are a bit arbitrary. My data distribution is not going to change much immediately so I picked a range that looked reasonable with the font so the labels sit in the centre. Looks much more readable now: http://i.imgur.com/7buwdyk.png
You have a few choices:
You can replace the normal labels with CustomLabels. They are a bit tricky since you can't set their position directly. Instead you need two positions (FromPosition and ToPosition) to declare the range in which the CustomLabel will be centered. Note that once you use CustomLabels, no normal ones will show.
Or you can add TextAnnotations. You can set AnchorX to the value you want and the Y position to the minimum of your y-values. Getting these right is also a little tricky, involving the axes of the Annotation and also IsSizeAlwaysRelative, which should be false.
Or you could handle the PrePaint or PostPaint event and draw the text you want with Graphics.DrawString or TextRenderer.DrawText, using the axes' ValueToPixelPosition functions to get the coordinates. This may actually be the easiest to do.
You should be able to do this with custom labels (example here), but might need to do some extra fiddling to hide the normal labels (example here).