I am knitting an R Markdown document to PDF, but neither the graphs nor the tables appear in the correct section. How can I stop them from floating so they appear in the right order?
You should use the out.extra argument of the code chunk to include !h. From https://yihui.org/knitr/options/
out.extra: (NULL; character) Extra options for figures. It can be an arbitrary string, to be inserted in \includegraphics[] in LaTeX output (e.g., out.extra = 'angle=90' to rotate the figure by 90 degrees), or in HTML output (e.g., out.extra = 'style="border:5px solid orange;"').
For example,
```{r out.extra='!h'}
library(ggplot2)
ggplot(airquality, aes(Temp, Ozone)) +
geom_point() +
geom_smooth(method = "loess")
```
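As a complement, knitr also documents a `fig.pos` chunk option, which is the string written into `\begin{figure}[...]`. Below is a minimal sketch of setting it once for the whole document in a setup chunk. It assumes the YAML header loads the LaTeX `float` package (needed for the `H` specifier), and that `out.extra = ""` is enough to make knitr emit LaTeX figure code for chunks without a caption; treat both as assumptions to check against your own setup.

```r
# Assumption: the YAML header of the .Rmd loads the LaTeX float package, e.g.
#   header-includes:
#     - \usepackage{float}
# which is what makes the "H" (exactly here) placement available.
library(knitr)

# fig.pos is the placement string knitr writes into \begin{figure}[...].
# out.extra = "" is set so that knitr produces LaTeX figure code rather than
# Markdown image syntax, even for chunks that have no caption.
opts_chunk$set(fig.pos = "H", out.extra = "")
```

Tables are not affected by `fig.pos`. If you build them with kableExtra, `kable_styling(latex_options = "HOLD_position")` is the option I would try first for the same effect.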
I am utilizing the pcolormesh function in Matplotlib to plot a series of gridded data (in parallel) across multiple map domains. The code snippet relevant to this question is as follows:
im = ax2.pcolormesh(xgrid, ygrid, data.variable.data[0], cmap=cmap, norm=norm, alpha=0.90, facecolor=None)
Where: xgrid = array of longitude points, ygrid = array of latitude points, data.variable.data[0] = array of corresponding data values, cmap = defined colormap, & norm = defined value normalization
Consider the following image generated from the provided code:
The undesired result I've found in the image above is what appears to be outlines around each grid square, or perhaps better described as patchwork that stands out slightly as the mesh alpha is reduced below 1.
I've set facecolor=None assuming that would remove these outlines, to no avail. What additions or corrections can I make to remove this feature?
I am using nested_facet() to plot a large number of experiments, yielding a 9x6 array. Each panel in the array is a stacked bargraph with variables indicated by color, same set of variables common to each row.
My code...
ggplot(data, aes(x=enzyme_drug, y=counts, fill = species)) +
geom_bar(position = "fill", stat = "identity") +
theme(axis.text.x = element_text(angle = 90)) +
guides(fill=guide_legend(ncol=1)) +
facet_grid(pH ~ combo, scales = "free")
The only problem with putting them all together into a facet grid is that the number of variables indicated by color in the figure legend is too high, resulting in adjacent colors that are hard to differentiate.
An easy way out of this would be to create a separate legend for each row, with the small number of variables for each row being much easier to differentiate.
Any ideas on how to do this?
The other alternative I can think of is to build a separate facet_grid for each row in a loop, collect them in a list, and then combine them with ggarrange(). This yields unwieldy x-axis labels repeated for each row. I could remove the x-axes manually for each row in the arrangement, but surely there is a simpler method?
Thanks in advance...
I have the following data.table:
I would like to have a plot which shows the columns symbol and value in a box plot. The boxes should be ordered by the column value.
The code I've tried:
plot1 <- ggplot(symbols, aes(symbol, value, fill = from)) +
geom_bar(stat = 'identity') +
ggtitle(paste0("Total quantity traded: ", format(sum(symbols$quantity), scientific = FALSE, nsmall = 2, big.mark = " "))) +
theme_bw()
plot1
This returns the following plot:
What I would like to change:
- flip x- and y-axis
- show the correct height of boxes (y-axis)...currently the relation between the boxes is not correct.
- decreasing order of the boxes by columns value
- format the y-axis with two digits
- make the x-axis readable...currently the x-axis is just a long bunch of what is written in column symbol.
Thanks upfront for the help!
To make things a bit easier, please post your data frame as the output of dput(your.data.frame), which gives code that others can run to replicate your dataset in R.
With that being said, I recreated your data (it was not too big)--some numbers were rounded a bit to make things easier.
A few comments:
y-axis numbers are odd: The numbers on your y-axis are not numeric. If you type str(your.data.frame) you'll probably notice that "value" is not numeric, but a character or factor. This can be easily remedied via: df$value <- as.numeric(df$value), where df is your dataframe.
flipping axis: You can use coord_flip() (typically added to the end of your ggplot call). Be warned that when you do this, your aesthetics flip for the plot, so just keep that in mind.
your dataframe name is also a function/data name in R: This may not be causing any issues (due to your environment), but be careful not to give your dataset a name that is already used elsewhere in R. This goes for column/variable names too. I don't think it causes any issues here, but just an FYI.
geom_col vs geom_bar: Check out this documentation link for some description on the differences between geom_bar and geom_col. Basically, you want to use geom_bar when your y-axis is count, and geom_col when your y-axis is a value. Here, you want to plot a value, so choose geom_col(), and not geom_bar().
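One caveat on the as.numeric() suggestion above: if "value" turns out to be a factor rather than a character column, calling as.numeric() on it directly returns the internal level codes, not the numbers. A small sketch with a made-up vector (the `value` object here is hypothetical, not your actual column):

```r
# Hypothetical column: numbers stored as a factor, as can happen on import.
value <- factor(c("87485737", "952445", "753017"))

# as.numeric() on a factor returns the internal level codes, not the numbers.
codes <- as.numeric(value)

# Converting through character first recovers the actual values.
numbers <- as.numeric(as.character(value))

print(codes)
print(numbers)
```

So for a factor column the safer form would be df$value <- as.numeric(as.character(df$value)).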
Fixing the issues in plot
Here's the representation of your data (note I rounded... hopefully got the actual data correct, because I manually had to copy each value):
   from  symbol  quantity      usd    value
1   BTC BTCUSDT 12910.470 6776.340 87485737
2   ETH ETHUSDT  6168.730  154.398   952445
3   BNB BNBUSDT 51002.650   14.764   753017
4   BNB  BNBBTC 31071.280   14.764   458745
5   ETH  ETHBTC  2216.576  154.398   342236
6   LTC LTCUSDT  4332.024   40.481   175368
7   BNB  BNBETH  3150.030   14.764    46507
8   LTC  LTCBTC   922.560   40.481    37346
9   LTC  LTCBNB   521.476   40.481    21110
10  NEO NEOUSDT  2438.353    7.203    17564
11  NEO  NEOBTC   417.930    7.203     3010
Here's the basic plot, flipped:
ggplot(df, aes(symbol, value, fill=from)) +
geom_col() +
coord_flip()
The problem here is that when you plot the raw values, BTCUSDT is huge in comparison to everything else. I would suggest plotting on a log scale instead. See this link for some advice on how to do that. I like the scale_y_log10() function, since it works pretty well here:
ggplot(df, aes(symbol, value, fill=from)) +
geom_col() +
scale_y_log10() +
coord_flip()
If you wanted to keep the columns in the vertical orientation, you can still do that and avoid having the labels run into each other on the x-axis. In that case, you can rotate the labels via theme(axis.text.x=...). Note the horizontal alignment adjustment (hjust=1), which forces the labels to be "right-aligned":
ggplot(df, aes(symbol, value, fill=from)) +
geom_col() +
scale_y_log10() +
theme(axis.text.x=element_text(angle=45, hjust=1))
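The plots above do not yet address two of the requests in the question: ordering the bars by value and showing the value axis with two decimals. Here is a minimal sketch of one way to do both, using a subset of the recreated data. It relies on base R's reorder() and on scales::label_number(); the choice of accuracy = 0.01 is my reading of "two digits", so adjust if you meant something else.

```r
library(ggplot2)

# A few rows of the recreated data from above.
df <- data.frame(
  from   = c("BTC", "ETH", "BNB", "LTC", "NEO"),
  symbol = c("BTCUSDT", "ETHUSDT", "BNBUSDT", "LTCUSDT", "NEOUSDT"),
  value  = c(87485737, 952445, 753017, 175368, 17564)
)

# reorder() turns symbol into a factor whose levels are sorted by value
# (ascending). After coord_flip() the last level is drawn at the top,
# so the largest bar comes first when reading from top to bottom.
df$symbol <- reorder(df$symbol, df$value)

p <- ggplot(df, aes(symbol, value, fill = from)) +
  geom_col() +
  # label_number(accuracy = 0.01) prints the axis labels with two decimals.
  scale_y_log10(labels = scales::label_number(accuracy = 0.01)) +
  coord_flip()

p
```

For the vertical orientation, reorder(df$symbol, -df$value) would put the largest bar on the left instead.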
I am trying to plot my data (replicate results for each strain) and I want only one line for each strain: the replicates averaged per strain, with points along the line and error bars showing the error between replicates.
If you click on the image above, it shows the plot I have so far, which displays WT and WT.1 as separate lines, and likewise for all other replicates. However, they are replicates of each strain (WT, DrsbR, DsigB) and I want them to appear as one line of mean results per strain instead. I am using the ggplot package and melting the data with the reshape package, but I cannot figure out how to make my replicates appear as one line together with error bars (standard deviation of the mean results between replicates).
The black and white image is what I am aiming for: a separate line per strain, with the replicate data plotted as mean-value points.
library(reshape2)
library(ggplot2)
melted<-melt(abs2)
print(abs2)
melted<-melt(abs2,id.vars=1,measure.vars=c("WT","WT.1","DsigB","DsigB.1","DrsbR","DrsbR.1"))
View(melted)
colnames(melted)<-c("Time","Strain","Values")
##line graph for melted data
melted$Time<-as.factor(melted$Time)
abs2line=ggplot(melted,aes(Time,Values))+geom_line(aes(colour=Strain,group=Strain))
abs2line+
stat_summary(fun=mean,
geom="point",
aes(group=Time))+
stat_summary(fun.data=mean_cl_boot,
geom="errorbar",
width=.2)+
xlab("Time")+
ylab("OD600")+
theme_classic()+
labs(title="Growth Curve of Mutant Strains")
summary(melted)
print(melted)
One approach is to take your melted data frame and separate out the "variable" column into "species" and "strain" using the separate() function from tidyr. I don't have your dataset -- it is appreciated if you are able to share your dataset via dput(your.data.frame) for future questions -- so I made a dummy dataset that's similar to yours. Here we have two "species" (red and blue) and two "strains" for each species.
library(reshape2)  # provides melt()

df <- data.frame(
time = seq(0, 40, by=10),
blue = c(0:4),
blue.1 = c(0, 1.1, 1.9, 3.1, 4.1),
red = seq(0, 8, by=2),
red.1 = c(0, 2.1, 4.2, 5.5, 8.2)
)
df.melt <- melt(df,
id.vars = 'time',
measure.vars = c('blue', 'blue.1', 'red', 'red.1'))
We can then use tidyr::separate() to separate the resulting "variable" column into a "species" column and a "strain" column. Luckily, your data contains a "." which can be a handy character to use for the separation:
library(tidyr)  # provides separate() and re-exports %>%

df.melt.mod <- df.melt %>%
separate(col=variable, into=c('species', 'strain'), sep='\\.')
Note: The above code will give you a warning, because "blue" and "red" do not contain the "." character and therefore get NA in the "strain" column. That does not matter here, because we're not using that column for anything. In your own dataset, you can likewise ignore the warning.
Then, you can actually just use stat_summary() for all geoms... modify as you see fit for your own visual and thematic preference. Note that order matters for layering, so I plot geom_line first, then geom_point, then geom_errorbar. Also note that you can assign the group=species aesthetic in the base ggplot() call and that mapping applies to all geoms unless overwritten.
ggplot(df.melt.mod, aes(x=time, y=value, group=species)) +
stat_summary(
fun = mean,
geom='line',
aes(color=species)) +
stat_summary(
fun=mean,
geom='point') +
stat_summary(
fun.data=mean_cl_boot,  # requires the Hmisc package to be installed
geom='errorbar',
width=0.5) +
theme_bw()
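Since the question asked specifically for standard-deviation error bars, while mean_cl_boot draws a bootstrapped confidence interval, here is an alternative sketch that computes the mean and SD per group up front with base R's aggregate() and then plots them with ordinary geoms. The data is dummy data in the same shape as df.melt.mod; the names `long`, `summ`, `avg` and `sdev` are just mine for illustration, and I'm assuming mean ± 1 SD is what's wanted.

```r
library(ggplot2)

# Dummy long-format data, same shape as df.melt.mod above:
# two replicates per species at each time point.
long <- data.frame(
  time    = rep(seq(0, 40, by = 10), times = 4),
  species = rep(c("blue", "blue", "red", "red"), each = 5),
  value   = c(0:4,
              0, 1.1, 1.9, 3.1, 4.1,
              seq(0, 8, by = 2),
              0, 2.1, 4.2, 5.5, 8.2)
)

# Mean and standard deviation of the replicates per species and time point.
means <- aggregate(value ~ species + time, data = long, FUN = mean)
sds   <- aggregate(value ~ species + time, data = long, FUN = sd)
names(means)[names(means) == "value"] <- "avg"
names(sds)[names(sds) == "value"] <- "sdev"
summ <- merge(means, sds, by = c("species", "time"))

# One line per species, with points at the means and +/- 1 SD error bars.
p <- ggplot(summ, aes(time, avg, colour = species, group = species)) +
  geom_line() +
  geom_point() +
  geom_errorbar(aes(ymin = avg - sdev, ymax = avg + sdev), width = 0.5) +
  theme_bw()

p
```

This avoids the Hmisc dependency and makes it explicit what the error bars represent, at the cost of an extra summarising step.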
Create a simple stacked bar chart:
require(ggplot2)
g <- ggplot(mpg, aes(class)) +
geom_bar(aes(fill = drv, size=ifelse(drv=="4",2,1)),color="black",width=.5)
I can change the size values in size=ifelse(drv=="4",2,1) to lots of different values and I still get the same two line weights. Is there a way around this? (2,1) produces the same chart as (1.1,1) and (10,1). Ideally the thicker weight should be just a bit thicker than the standard outline, rather than ~10x bigger.
As added background, you can set size outside of aes() and have the outline width scale as you'd expect, but I can't assign the ifelse condition outside of aes() without getting an error.
The two size values you give as aesthetics are just arbitrary levels that are not taken at face value by ggplot.
You can use
ggplot(mpg, aes(class)) +
geom_bar(
aes(fill = drv, size = drv == "4"),
color = "black", width = .5) +
scale_size_manual(values = c(1, 2))
Where the values parameter allows you to specify the precise sizes for each level you define in the aesthetics.
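A version note: from ggplot2 3.4.0 onwards, line and outline widths are mapped with the linewidth aesthetic rather than size. The sketch below is the same idea under that assumption, with named values so each width is tied to a level explicitly. The widths 0.5 and 1 are arbitrary picks for "just a bit thicker".

```r
library(ggplot2)  # assumes ggplot2 >= 3.4.0, where linewidth was introduced

g <- ggplot(mpg, aes(class)) +
  geom_bar(
    aes(fill = drv, linewidth = drv == "4"),
    color = "black", width = .5) +
  # Named values tie each width to a level instead of relying on order;
  # guide = "none" hides the extra legend for the TRUE/FALSE mapping.
  scale_linewidth_manual(values = c(`FALSE` = 0.5, `TRUE` = 1), guide = "none")

g
```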