I want to include a line chart (built with matplotlib) in an interactive dashboard. The chart shows how the frequency of the word "France" evolved over one year in 7 media outlets for Central Africa. The DataFrame is called "df_france_pivot".
From what I've seen so far, I first have to convert my plot into a Plotly object with go.Figure. So I tried this code:
`app = dash.Dash()

def update_graph():
    plt.style.use('seaborn-darkgrid')
    fig, ax = plt.subplots()
    ax.set_prop_cycle(color=['304558', 'FE9235', '526683', 'FE574B', 'FFD104', '6BDF9C'])
    num = 0
    for column in df_france_pivot.drop('month_year', axis=1):
        num += 1
        plt.plot(df_france_pivot['month_year'], df_france_pivot[column], marker='',
                 linewidth=1, alpha=0.9, label=column)
    plt.xticks(rotation=45)
    plt.legend(loc=0, prop={'size': 9}, bbox_to_anchor=(1.05, 1.0), title='Media in South Africa')
    plt.title("Frequency of the word 'France' in the media ", loc='left', fontsize=12, fontweight=0, color='orange')
    plt.xlabel("Time")
    plt.ylabel("Percentage")
    figure = go.Figure(fig)
    return figure

app.layout = html.Div(id='parent', children=[
    html.H1(id='H1', children='Styling using html components',
            style={'textAlign': 'center', 'marginTop': 40, 'marginBottom': 40}),
    dcc.Graph(id='line_plot', figure=update_graph())
])`
When I run it, I get this response: "Output exceeds the size limit. Open the full output data in a text editor." Is it because my line chart is more complex, i.e. it has 7 lines?
Thank you in advance!
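One possible direction, offered only as a sketch: go.Figure expects Plotly traces or a figure dictionary rather than a matplotlib Figure, so the error is probably not about the number of lines. dcc.Graph also accepts a plain dictionary with "data" and "layout" keys, so the chart can be described directly in that form. The helper name build_line_figure, the sample values, and the media names below are all hypothetical. The real column layout of df_france_pivot is assumed, not known.

```python
def build_line_figure(x_values, series_by_name, title):
    """Return a Plotly-style figure dict with one line trace per series."""
    traces = [
        {
            "type": "scatter",
            "mode": "lines",
            "name": name,
            "x": list(x_values),
            "y": list(y_values),
            "line": {"width": 1},
        }
        for name, y_values in series_by_name.items()
    ]
    layout = {
        "title": {"text": title},
        "xaxis": {"title": {"text": "Time"}, "tickangle": 45},
        "yaxis": {"title": {"text": "Percentage"}},
    }
    return {"data": traces, "layout": layout}


# Made-up sample values standing in for df_france_pivot (hypothetical data).
months = ["2021-01", "2021-02", "2021-03"]
sample_series = {"media_a": [1.2, 0.8, 1.5], "media_b": [0.4, 0.9, 0.7]}
figure = build_line_figure(
    months, sample_series, "Frequency of the word 'France' in the media"
)

# With the real DataFrame, the inputs could be derived like this
# (assuming 'month_year' is a column and every other column is one media):
#   months = df_france_pivot["month_year"]
#   sample_series = {c: df_france_pivot[c]
#                    for c in df_france_pivot.columns.drop("month_year")}
# The result can then be passed as dcc.Graph(id="line_plot", figure=figure).
```

Building the dictionary by hand keeps the sketch free of extra dependencies. The same traces could equally be created with go.Scatter and added to a go.Figure.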
I've been trying to plot a river system over a ggmap and keep hitting walls. Hopefully there is a good solution.
Here is where I'm getting the rivers data:
https://data.review.fao.org/map/catalog/srv/api/records/6a53d768-1e20-46ea-92a8-c4040286057d
Loading in:
basemap <- get_stamenmap(bbox = c(left = 149.5, bottom = -35.9, right = 151.5, top = -32.5),
                         zoom = 3, maptype = 'terrain-background')
rr <- st_read("rivers_australia_37252/rivers_australia_37252.shp")
box = c(xmin = 145, ymin = -37, xmax = 155, ymax = -30)
rivers <- st_crop(rr, box)
class(rivers)
[1] "sf" "data.frame"
Plotting and problem code:
ggmap(basemap) +
  geom_sf(data=rivers, inherit.aes = FALSE)
# Error in st_cast.POINT(x[[1]], to, ...) : cannot create MULTILINESTRING from POINT
ggmap(basemap) +
  geom_sf(data=rivers, aes(geometry), inherit.aes = FALSE)
# Error in is.finite(x) : default method not implemented for type 'list'
I then tried unlist(), which produced a fortify error.
Any suggestions on how to transform the data, or on what to add to the geom_sf() call, would be appreciated. Thanks!
I am just starting to learn R Shiny and am trying to create a Shiny app that produces a scatter plot for principal component analysis and lets the user choose which principal components go on the X and Y axes. I know how to write R code to do PCA, but I just can't get the Shiny app to give me what I need. I have tried following the available examples for iris k-means clustering, but I am having trouble getting the scatter plot. Here is my code so far (P.S. my original dataset has genes as rows and samples as columns; columns 1 through 10 are cancer samples, 11 through 20 are normal):
data<-read.table("genes_data.txt", header=TRUE, row.names=1)
pca_data<-prcomp(t(data), scale=T)
summary(pca_data)
pca_sig.var<-pca_data$sdev^2
pca_sig.var.per<-round(pca_sig.var/sum(pca_sig.var)*100, 1)
pca_sig.data<-data.frame(Sample=rownames(pca_data$x), PC1=pca_data$x[,1], PC2=pca_data$x[,2], PC3=pca_data$x[,3], PC4=pca_data$x[,4], PC5=pca_data$x[,5])
pca_sig.data<-pca_sig.data[-1]
pca_sig.data2<-pca_sig.data
pca_sig.data2$category=rep("CANCER", 20)
pca_sig.data2$category[11:20]=rep("NORMAL", 10)
View(pca_sig.data2)
ggplot(data=pca_sig.data2, aes(x=PC1, y=PC2, label=category, colour=category))+
  geom_point(size=2, stroke=1, alpha=0.8, aes(color=category))+
  xlab(paste("PCA1 - ", pca_sig.var.per[1], "%", sep=""))+
  ylab(paste("PCA2 - ", pca_sig.var.per[2], "%", sep=""))+
  theme_bw()+
  ggtitle("My PCA Graph")
ui<-pageWithSidebar(
  headerPanel('Gene Data PCA'),
  sidebarPanel(
    selectInput('xcol', 'X Variable', names(pca_sig.data2[,1:5])),
    selectInput('ycol', 'Y Variable', names(pca_sig.data2[,1:5]),
                selected=names(pca_sig.data2)[[2]])
  ),
  mainPanel(
    plotOutput('plot1')
  )
)
server<- function(input, output, session) {
  # Combine the selected variables into a new data frame
  selectedData <- reactive({
    pca_sig.data2[, c(input$xcol, input$ycol)]
  })
  output$plot1 <- renderPlot({
    palette(c("#E41A1C", "#377EB8"))
    par(mar = c(5.1, 4.1, 0, 1))
    plot(selectedData(),
         col=selectedData()$category,
         pch = 20, cex = 3)
    points(selectedData()[,1:5], pch = 4, cex = 4, lwd = 4)
  })
}
shinyApp(ui = ui, server = server)
At the end, when I run the app, I get "Error: undefined columns selected".
Also, for simplicity's sake, let's assume that the original dataset I want to run PCA on looks something like this (in reality I have about 600 genes and 20 samples):
probeID<-c("gene1", "gene2", "gene3", "gene4","gene5")
BCR1<-c(28.005966, 30.806433, 17.341375, 17.40666, 30.039436)
BCR2<-c(30.973469, 29.236025, 30.41161, 20.914383, 20.904331)
BCR3<-c(26.322796, 25.542833, 22.460772, 19.972183, 30.409641)
BCR4<-c(26.441898, 25.837685, 23.158352, 20.379173, 33.81327)
BCR5<-c(39.750206, 19.901133, 28.180124, 22.668673, 25.748884)
CTL6<-c(23.004385, 28.472675, 23.81621, 26.433413, 28.851719)
CTL7<-c(22.239546, 28.741674, 23.754929, 26.015385, 28.16368)
CTL8<-c(29.590443, 30.041988, 21.323061, 24.272501, 18.099016)
CTL9<-c(15.856442, 22.64224, 29.629637, 25.374926, 22.356894)
CTL10<-c(38.137985, 24.753338, 26.986668, 24.578161, 19.223558)
data<-data.frame(probeID, BCR1, BCR2, BCR3, BCR4, BCR5, CTL6, CTL7, CTL8, CTL9, CTL10)
where BCR1 through BCR5 are the cancer samples and CTL6 through CTL10 are the normal samples.
Is this what you want?
server<- function(input, output, session) {
  # Combine the selected variables into a new data frame
  selectedData <- reactive({
    pca_sig.data2[c(input$xcol, input$ycol, 'category')]
  })
  output$plot1 <- renderPlot({
    palette(c("#E41A1C", "#377EB8"))
    plot(selectedData()[,c(1:2)], col=factor(selectedData()$category), pch = 20, cex = 3)
    points(selectedData()[,c(1:2)], pch = 4, cex = 4, lwd = 4)
  })
}
The result is like this:
I have the code below that is a combination of two boxplots and dot plots in one. It is a representation of barring density in 4 different species. The grey depicts the males and the tan the females.
data<-read.csv("C:/Users/Jeremy/Documents/A_Trogon rufus/Black-and-White/BARDATA_boxplots_M.csv")
datF<-read.csv("C:/Users/Jeremy/Documents/A_Trogon rufus/FEMALES_BW&Morphom.csv")
cleandataM<-subset(data, data$Age=="Adult" & data$White!="NA", select=(OTU:Density))
cleandatF<-subset(datF, datF$Age=="Adult", select=(OTU:Density))
dataM<- as.data.frame(cleandataM)
dataF<- as.data.frame(cleandatF)
library(ggplot2)
ggplot(dataM, aes(factor(OTU), Density))+
  geom_boxplot(data=dataF,aes(factor(OTU),Density), fill="AntiqueWhite")+
  geom_boxplot(fill="lightgrey", alpha=0.5)+
  geom_point(data=dataF,position = position_jitter(width = 0.1), colour="tan")+
  geom_point(data=dataM, position = position_jitter(width = 0.1), color="DimGrey")+
  scale_x_discrete(name="",limits=order)+
  scale_y_continuous(name="Bar Density (bars/cm)")+
  theme(panel.background = element_blank(),panel.grid.minor=element_blank(),
        panel.grid.major=element_blank(),axis.line = element_line(colour = "black"),
        axis.title.y = element_text(colour="black", size=14),
        axis.text.y = element_text(colour="black", size=12),
        axis.text.x = element_text(colour="black", size=14))
This works just fine.
However, when I try to add a legend with:
legend("topright", inset=.01, bty="n", cex=.75, title="Sex",
       c("Male", "Female"), fill=c("lightgrey", "black"))
It returns the following error:
Error in strwidth(legend, units = "user", cex = cex, font = text.font) :
plot.new has not been called yet
Please, is there someone who could suggest how to correct this?
I'm using R to loop through the columns of a data frame and make a graph of the resulting analysis. I don't get any errors when the script runs, but it generates a pdf that cannot be opened.
If I run the content of the script, it works fine. I wondered if there is a problem with how quickly it is looping through, so I tried to force it to pause. This did not seem to make a difference. I'm interested in any suggestions that people have, and I'm also quite new to R so suggestions as to how I can improve the approach are welcome too. Thanks.
for (i in 2:22) {
  # Organise data
  pop_den_z = subset(pop_den, pop_den[i] != "0") # Remove zeros
  y = pop_den_z[,i] # Get y col
  x = pop_den_z[,1] # get x col
  y = log(y) # Log transform
  # Regression
  lm.0 = lm(formula = y ~ x) # make linear model
  inter = summary(lm.0)$coefficients[1,1] # Get intercept
  slop = summary(lm.0)$coefficients[2,1] # Get slope
  # Write to File
  a = c(i, inter, slop)
  write(a, file = "C:/pop_den_coef.txt", ncolumns = 3, append = TRUE, sep = ",")
  ## Setup pdf
  string = paste("C:/LEED/results/Images/R_graphs/Pop_den", paste(i-2), "City.pdf")
  pdf(string, height = 6, width = 9)
  p <- qplot(
    x, y,
    xlab = "Radius [km]",
    ylab = "Population Density [log(people/km)]",
    xlim = x_range,
    main = "Analysis of Cities"
  )
  # geom_abline(intercept,slope)
  p + geom_abline(intercept = inter, slope = slop, colour = "red", size = 1)
  Sys.sleep(5)
  ### close the PDF file
  dev.off()
}
The line should be:
print(p + geom_abline(intercept = inter, slope = slop, colour = "red", size = 1))
Inside a loop, ggplot (and lattice) plots are not auto-printed, so nothing is written to the pdf device unless you call print() explicitly.