I have a M vs N curve (let's take it to be a sigmoid, for ease of understanding) for a given value of parameters P and Q. I need to visualise the M vs N curves for a range of values of P and Q (assume 10 values in 0 to 1, i.e. 0.1, 0.2, ..., 0.9 for both P and Q)
The only solution that I've found for this problem is a Trellis plot (essentially a matrix of plots). I'd like to know if there is any other method to visualise this sort of 4d(?) relationship besides Trellis plots. Thanks.
I'm not sure I understand what you're hoping for, so let me know if this is on the right track. Below are three examples using R.
The first is indeed a matrix of plots where each panel represents a different value of q and, within each panel, each curve represents a different value of p. The second is a 3D plot which looks at a surface based on three of the variables with the fourth fixed. The third is a Shiny app that creates the same interactive plot as in the second example but also provides a slider that allows you to change p and see how the plot changes. Unfortunately, I'm not sure how to embed the interactive plots in Stackoverflow so I've just provided the code.
I'm not sure if there's an elegant way to look at all four variables at the same time, but maybe someone will come along with additional options.
Matrix of plots for various values of p and q
library(tidyverse)
theme_set(theme_classic())
# Function to plot
my_fun = function(x, p, q) {
1/(1 + exp(p + q*x))
}
# Parameters
params = expand.grid(p=seq(-2,2,length=6), q=seq(-1,1,length=11))
# x-values to feed to my_fun
x = seq(-10,10,0.1)
# Generate data frame for plotting
dat = map2_df(params$p, params$q, function(p, q) {
data.frame(p=p, q=q, x, y=my_fun(x, p, q))
})
ggplot(dat, aes(x,y,colour=p, group=p)) +
geom_line() +
facet_grid(. ~ q, labeller=label_both) +
labs(colour="p") +
scale_colour_gradient(low="red", high="blue") +
theme(legend.position="bottom")
3D plot with one variable fixed
The code below will produce an interactive 3D plot that you can zoom and rotate. I've fixed the value of p and drawn a plot of the y surface for a grid of x and q values.
library(rgl)
x = seq(-10,10,0.1)
q = seq(-1,1,0.01)
y = outer(x, q, function(a, b) 1/(1 + exp(1 + b*a)))
persp3d(x, q, y, col=hcl(240,80,65), specular="grey20",
xlab = "x", ylab = "q", zlab = "y")
I'm not sure how to embed the interactive plot, but here's a static image of one viewing angle:
Shiny app
The code below will create the same plot as above, but with the added ability to vary p with a slider and see how the plot changes.
Open an R script file and paste in the code below. Save it as app.r in its own directory then run the code. Both an rgl window and the Shiny app page with the slider for controlling the value of p should open. Resize the windows as desired and then move the slider to see how the function surface changes for various values of p.
library(shiny)
# Define UI for application that draws an interactive plot
ui <- fluidPage(
  # Application title
  titlePanel("Plot the function 1/(1 + exp(p + q*x))"),
  # Sidebar with a slider input for the value of p
  sidebarLayout(
    sidebarPanel(
      sliderInput("p",
                  "Vary the value of p and see how the plot changes",
                  min = -2,
                  max = 2,
                  value = 1,
                  step = 0.2)
    ),
    # Show the plot output
    mainPanel(
      plotOutput("distPlot")
    )
  )
)
# Define server logic required to draw the plot
server <- function(input, output) {
  output$distPlot <- renderPlot({
    library(rgl)
    x = seq(-10,10,0.1)
    q = seq(-1,1,0.01)
    # Recompute the surface whenever the slider value input$p changes
    y = outer(x, q, function(a, b) 1/(1 + exp(input$p + b*a)))
    persp3d(x, q, y, col=hcl(240,50,65), specular="grey20",
            xlab = "x", ylab = "q", zlab = "y")
  })
}
# Run the application
shinyApp(ui = ui, server = server)
I'm performing curve fitting with scipy.optimize.leastsq. E.g. for a gaussian:
import numpy as np
import scipy.optimize
def fitGaussian(x, y, init=[1.0,0.0,4.0,0.1]):
    fitfunc = lambda p, x: p[0]*np.exp(-(x-p[1])**2/(2*p[2]**2))+p[3] # Target function
    errfunc = lambda p, x, y: fitfunc(p, x) - y # Distance to the target function
    final, success = scipy.optimize.leastsq(errfunc, init[:], args=(x, y))
    return fitfunc, final
Now, I want to optionally fix the values of some of the parameters in the fit. I found that suggestions are to use a different package lmfit, which I want to avoid, or are very general, like here.
Since I need a solution which
works with numpy/scipy (no further packages etc.)
is independent of the parameters themselves,
is flexible, in which parameters are fixed or not,
I came up with the following, using a condition on each of the parameters:
def fitGaussian2(x, y, init=[1.0,0.0,4.0,0.1], fix = [False, False, False, False]):
    fitfunc = lambda p, x: (p[0] if not fix[0] else init[0])*np.exp(-(x-(p[1] if not fix[1] else init[1]))**2/(2*(p[2] if not fix[2] else init[2])**2))+(p[3] if not fix[3] else init[3])
    errfunc = lambda p, x, y: fitfunc(p, x) - y # Distance to the target function
    final, success = scipy.optimize.leastsq(errfunc, init[:], args=(x, y))
    return fitfunc, final
While this works fine, it's neither practical, nor beautiful.
So my question is: Are there better ways of performing curve fitting in scipy for fixed parameters? Or are there wrappers, which already include such parameter fixing?
Using scipy, there are no builtin options that I am aware of. You will always have to do a work-around like the one you already did.
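One way to make that workaround independent of the particular model is to optimise only the free parameters and splice them back into the full parameter vector before each evaluation. Below is a minimal sketch of that idea, assuming the four-parameter Gaussian from the question; `gaussian` and `fit_with_fixed` are illustrative names, not scipy functions, and the data are synthetic and noise-free.

```python
import numpy as np
from scipy.optimize import leastsq


def gaussian(p, x):
    """Gaussian with amplitude p[0], centre p[1], width p[2] and offset p[3]."""
    return p[0] * np.exp(-(x - p[1]) ** 2 / (2 * p[2] ** 2)) + p[3]


def fit_with_fixed(model, x, y, init, fix):
    """Fit model(p, x) to y, holding p[i] at init[i] wherever fix[i] is True.

    Hypothetical helper for illustration; it is not part of scipy.
    """
    init = np.asarray(init, dtype=float)
    free = ~np.asarray(fix, dtype=bool)  # True where a parameter may vary

    def full_params(p_free):
        # Rebuild the full parameter vector from the free values.
        p = init.copy()
        p[free] = p_free
        return p

    def errfunc(p_free):
        return model(full_params(p_free), x) - y

    # Only the free parameters are handed to the optimiser.
    p_free, _ = leastsq(errfunc, init[free])
    return full_params(np.atleast_1d(p_free))


# Synthetic data with a known width of 4.0, which is then held fixed.
x = np.linspace(-10, 10, 201)
y = gaussian([2.0, 0.5, 4.0, 0.1], x)
params = fit_with_fixed(gaussian, x, y,
                        init=[1.0, 0.0, 4.0, 0.1],
                        fix=[False, False, True, False])
```

The model function never needs to know which parameters are fixed, so the same helper should work for any `model(p, x)`. The sketch assumes at least one parameter is left free.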
If you are willing to use a wrapper package however, may I recommend my own symfit? This is a wrapper to scipy with readability and less boilerplate code as its core principles. In symfit, your problem would be solved as:
from symfit import parameters, variables, exp, Fit, Parameter
a, b, c, d = parameters('a, b, c, d')
x, y = variables('x, y')
model_dict = {y: a * exp(-(x - b)**2 / (2 * c**2)) + d}
fit = Fit(model_dict, x=xdata, y=ydata)
fit_result = fit.execute()
The line a, b, c, d = parameters('a, b, c, d') makes four Parameter objects. To fix e.g. the parameter c to its initial value, do the following anywhere before calling fit.execute():
c.value = 4.0
c.fixed = True
So a possible end result might be:
from symfit import parameters, variables, exp, Fit, Parameter
a, b, c, d = parameters('a, b, c, d')
x, y = variables('x, y')
c.value = 4.0
c.fixed = True
model_dict = {y: a * exp(-(x - b)**2 / (2 * c**2)) + d}
fit = Fit(model_dict, x=xdata, y=ydata)
fit_result = fit.execute()
If you want to be more dynamic in your code, you could make the Parameter objects straight away using:
c = Parameter(4.0, fixed=True)
For more info, check the docs: http://symfit.readthedocs.io/en/latest/tutorial.html#simple-example
The above example using symfit would surely simplify the syntax of the fitting approach. However, does the example given really constrain the variable c?
If you look at fit_result.params you get the following:
OrderedDict([('a', 16.374368575343127),
('b', 0.49201249437123556),
('c', 0.5337962977235504),
('d', -9.55593614465743)])
The parameter c is not 4.0.
I am trying to plot two columns of raw data (I have used melt to combine them into one data frame) and then add separate error bars for each. However, I want to make the raw data for each column one pair of colors and the error bars another set of colors, but I can't seem to get it to work. The plot I am getting is at the link below. I want to have different color pairs for the raw data and for the error bars. A simple reproducible example is coded below, for illustrative purposes.
dat2.m<-data.frame(obs=c(2,4,6,8,12,16,2,4,6),variable=c("raw","raw","raw","ip","raw","ip","raw","ip","ip"),value=runif(9,0,10))
c <- ggplot(dat2.m, aes(x=obs, y=value, color=variable,fill=variable,size = 0.02)) +geom_jitter(size=1.25) + scale_colour_manual(values = c("blue","Red"))
c<- c+stat_summary(fun.data="median_hilow",fun.args=list(conf.int=0.95),aes(color=variable), position="dodge",geom="errorbar", size=0.5,lty=1)
print(c)
[1]: http://i.stack.imgur.com/A5KHk.jpg
For the record: I think that this is a really, really bad idea. Unless you have a use case where this is crucial, I think you should re-examine your plan.
However, you can get around it by adding a new set of variables, padded with a space at the end. You will want/need to play around with the legends, but this should work (though it is definitely ugly):
dat2.m<- data.frame(obs=c(2,4,6,8,12,16,2,4,6),variable=c("raw","raw","raw","ip","raw","ip","raw","ip","ip"),value=runif(9,0,10))
c <- ggplot(dat2.m, aes(x=obs, y=value, color=variable,fill=variable,size = 0.02)) +geom_jitter(size=1.25) + scale_colour_manual(values = c("blue","Red","green","purple"))
c<- c+stat_summary(fun.data="median_hilow",fun.args=list(conf.int=0.95),aes(color=paste(variable," ")), position="dodge",geom="errorbar", size=0.5,lty=1)
print(c)
One way around this would be to use repeated calls to geom_jitter and stat_summary. Use the data argument of those functions to feed subsets of your dataset into each call, and set the color attribute outside of aes(). It's repetitive and somewhat defeats the compactness of ggplot, but it'd do.
c <- ggplot(dat2.m, aes(x = obs, y = value, size = 0.02)) +
  geom_jitter(data = subset(dat2.m, variable == 'raw'), color = 'blue', size=1.25) +
  geom_jitter(data = subset(dat2.m, variable == 'ip'), color = 'red', size=1.25) +
  stat_summary(data = subset(dat2.m, variable == 'raw'), fun.data="median_hilow", fun.args=list(conf.int=0.95), color = 'pink', position="dodge",geom="errorbar", size=0.5,lty=1) +
  stat_summary(data = subset(dat2.m, variable == 'ip'), fun.data="median_hilow", fun.args=list(conf.int=0.95), color = 'green', position="dodge",geom="errorbar", size=0.5,lty=1)
print(c)
I'm trying to mirror an image. That is, if, e.g., a person is facing to the left, when the program terminates I want that person to now be facing instead to the right.
I understand how mirroring works in JES, but I'm unsure how to proceed here.
Below is what I'm trying; be aware that image is a global variable declared in another function.
def flipPic(image):
    width = getWidth(image)
    height = getHeight(image)
    for y in range(0, height):
        for x in range(0, width):
            left = getPixel(image, x, y)
            right = getPixel(image, width-x-1, y)
            color = getColor(left)
            setColor(right, color)
    show(image)
    return image
Try this:
width = getWidth(pic)
height = getHeight(pic)
for y in range(0, height):
    for x in range(0, width/2):
        left = getPixel(pic, x, y)
        right = getPixel(pic, width-x-1, y)
        color1 = getColor(left)
        color2 = getColor(right)
        setColor(right, color1)
        setColor(left, color2)
repaint(pic)
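Why the loop stops at width/2 and swaps both pixels can be seen on a single row of values. The sketch below uses a plain Python list in place of a row of JES pixels; `flip_row` and `copy_left_to_right` are illustrative names, not JES functions.

```python
def flip_row(row):
    """Reverse a row in place by swapping each value with its mirror partner."""
    width = len(row)
    for x in range(width // 2):  # stop at the middle, or every swap is undone
        row[x], row[width - x - 1] = row[width - x - 1], row[x]
    return row


def copy_left_to_right(row):
    """Mimic the loop in the question: copy each value onto its mirror position."""
    width = len(row)
    for x in range(width):
        row[width - x - 1] = row[x]
    return row


flipped = flip_row([1, 2, 3, 4])
mirrored = copy_left_to_right([1, 2, 3, 4])
print(flipped)   # [4, 3, 2, 1]
print(mirrored)  # [1, 2, 2, 1]
```

Copying without swapping overwrites the right half before it is ever read, so the result is a symmetric picture rather than a flipped one.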
I personally find that repaint is confusing for newbies (like me!).
I'd suggest something like this:
def mirrorImage(image):
    width = getWidth(image)
    height = getHeight(image)
    for y in range(0, height):
        for x in range(0, width/2):
            left = getPixel(image, x, y)
            right = getPixel(image, width-x-1, y)
            color1 = getColor(left)
            color2 = getColor(right)
            setColor(right, color1)
            setColor(left, color2)
    show(image)
    return image
mirrorImage(image)
This seems to work well. I put some comments in so you can rewrite it in your own style.
Feel free to ask questions, but I think your question may already be answered ^^
# this function takes the pixel values of a selected picture and
# pastes them onto a new canvas, but flipped over!
def flipPic(pict):
    # here we take the height and width of the original picture
    width = getWidth(pict)
    height = getHeight(pict)
    # here we make an empty canvas
    newPict = makeEmptyPicture(width, height)
    # the y loop walks over the rows and starts the x loop for each one
    for y in range(0, height):
        # the x loop walks over the columns of the current row
        for x in range(0, width):
            # here we collect the colour information of the original pixel at (x, y)
            colour = getColor(getPixel(pict, x, y))
            # here we set that colour at its mirrored position on the blank canvas
            setColor(getPixel(newPict, width-x-1, y), colour)
            # setColor(getPixel(newPict, width-x-1, height-y-1), colour)  # upside down
    show(newPict)
# driver code
pict = makePicture(pickAFile())
show(pict)
flipPic(pict)
Might be easier to read if you copy it over to JES first :D
BTW I got full marks for this one in my intro to programming class ;)
I asked this question yesterday about storing a plot within an object. I tried implementing the first approach (aware that I did not specify that I was using qplot() in my original question) and noticed that it did not work as expected.
library(ggplot2) # add ggplot2
string = "C:/example.pdf" # Setup pdf
pdf(string,height=6,width=9)
x_range <- range(1,50) # Specify Range
# Create a list to hold the plot objects.
pltList <- list()
pltList[]
for(i in 1 : 16){
# Organise data
y = (1:50) * i * 1000 # Get y col
x = (1:50) # get x col
y = log(y) # Use natural log
# Regression
lm.0 = lm(formula = y ~ x) # make linear model
inter = summary(lm.0)$coefficients[1,1] # Get intercept
slop = summary(lm.0)$coefficients[2,1] # Get slope
# Make plot name
pltName <- paste( 'a', i, sep = '' )
# make plot object
p <- qplot(
x, y,
xlab = "Radius [km]",
ylab = "Services [log]",
xlim = x_range,
main = paste("Sample",i)
) + geom_abline(intercept = inter, slope = slop, colour = "red", size = 1)
print(p)
pltList[[pltName]] = p
}
# close the PDF file
dev.off()
I have used sample numbers in this case so the code runs if it is just copied. I did spend a few hours puzzling over this but I cannot figure out what is going wrong. It writes the first set of pdfs without problem, so I have 16 pdfs with the correct plots.
Then when I use this piece of code:
string = "C:/test_tabloid.pdf"
pdf(string, height = 11, width = 17)
grid.newpage()
pushViewport( viewport( layout = grid.layout(3, 3) ) )
vplayout <- function(x, y){viewport(layout.pos.row = x, layout.pos.col = y)}
counter = 1
# Page 1
for (i in 1:3){
for (j in 1:3){
pltName <- paste( 'a', counter, sep = '' )
print( pltList[[pltName]], vp = vplayout(i,j) )
counter = counter + 1
}
}
dev.off()
the result I get is the last linear model line (abline) on every graph, but the data does not change. When I check my list of plots, it seems that all of them become overwritten by the most recent plot (with the exception of the abline object).
A less important secondary question was how to generate a multi-page pdf with several plots on each page, but the main goal of my code was to store the plots in a list that I could access at a later date.
Ok, so if your plot command is changed to
p <- qplot(data = data.frame(x = x, y = y),
x, y,
xlab = "Radius [km]",
ylab = "Services [log]",
xlim = x_range,
ylim = c(0,10),
main = paste("Sample",i)
) + geom_abline(intercept = inter, slope = slop, colour = "red", size = 1)
then everything works as expected. Here's what I suspect is happening (although Hadley could probably clarify things). When ggplot2 "saves" the data, what it actually does is save a data frame, and the names of the parameters. So for the command as I have given it, you get
> summary(pltList[["a1"]])
data: x, y [50x2]
mapping: x = x, y = y
scales: x, y
faceting: facet_grid(. ~ ., FALSE)
-----------------------------------
geom_point:
stat_identity:
position_identity: (width = NULL, height = NULL)
mapping: group = 1
geom_abline: colour = red, size = 1
stat_abline: intercept = 2.55595281266726, slope = 0.05543539319091
position_identity: (width = NULL, height = NULL)
However, if you don't specify a data parameter in qplot, all the variables get evaluated in the current scope, because there is no attached (read: saved) data frame.
data: [0x0]
mapping: x = x, y = y
scales: x, y
faceting: facet_grid(. ~ ., FALSE)
-----------------------------------
geom_point:
stat_identity:
position_identity: (width = NULL, height = NULL)
mapping: group = 1
geom_abline: colour = red, size = 1
stat_abline: intercept = 2.55595281266726, slope = 0.05543539319091
position_identity: (width = NULL, height = NULL)
So when the plot is generated the second time around, rather than using the original values, it uses the current values of x and y.
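The same effect can be reproduced outside ggplot2. As a rough analogy in Python (not ggplot2 code; `make_plots_lazy` and `make_plots_snapshot` are made-up names), a "plot" that only remembers the name of its data sees whatever that name holds when it is finally drawn, while one that stores a copy keeps its own values:

```python
def make_plots_lazy(n):
    """Each 'plot' looks up y only when it is drawn, like qplot without data=."""
    plots = []
    for i in range(1, n + 1):
        y = [v * i for v in range(1, 4)]
        plots.append(lambda: y)  # y is resolved at call time, not here
    return plots


def make_plots_snapshot(n):
    """Each 'plot' stores its own copy of y, like qplot with a data frame."""
    plots = []
    for i in range(1, n + 1):
        y = [v * i for v in range(1, 4)]
        plots.append(lambda data=list(y): data)  # copy taken at creation time
    return plots


lazy = [draw() for draw in make_plots_lazy(3)]
snapshot = [draw() for draw in make_plots_snapshot(3)]
print(lazy)      # [[3, 6, 9], [3, 6, 9], [3, 6, 9]]
print(snapshot)  # [[1, 2, 3], [2, 4, 6], [3, 6, 9]]
```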
I think you should use the data argument in qplot, i.e., store your vectors in a data frame.
See Hadley's book, Section 4.4:
The restriction on the data is simple: it must be a data frame. This is restrictive, and unlike other graphics packages in R. Lattice functions can take an optional data frame or use vectors directly from the global environment. ...
The data is stored in the plot object as a copy, not a reference. This has two important consequences: if your data changes, the plot will not; and ggplot2 objects are entirely self-contained so that they can be save()d to disk and later load()ed and plotted without needing anything else from that session.
There is a bug in your code concerning list subscripting. It should be
pltList[[pltName]]
not
pltList[pltName]
Note:
class(pltList[1])
[1] "list"
pltList[1] is a list containing the first element of pltList.
class(pltList[[1]])
[1] "ggplot"
pltList[[1]] is the first element of pltList.
For your second question: Multi-page pdfs are easy -- see help(pdf):
onefile: logical: if true (the default) allow multiple figures in one
file. If false, generate a file with name containing the
page number for each page. Defaults to ‘TRUE’.
For your main question, I don't understand if you want to store the plot inputs in a list for later processing, or the plot outputs. If it is the latter, I am not sure that plot() returns an object you can store and retrieve.
Another suggestion regarding your second question would be to use either Sweave or Brew as they will give you complete control over how you display your multi-page pdf.
Have a look at this related question.