How to draw lines based on intervals in matplotlib? - matplotlib

I have these three lists
odds = [1,3,5,7,9]
evens = [2,4,6,8,10]
all_nums = [2,1,4,3,6,5,8,7,10,9]
I need to first draw a line showing the values in all_nums, and then draw the other two lines that connect the values in odds and evens.
For example, after I first draw the line of all_nums, I got
And my final expected graph should be
I am not sure how to draw the red and green lines as they are produced based on an "interval 2" on the x-axis with respect to the blue line.
I have created a repl.it with my current code.
Note, my real project is more complicated than this example, in which the first line looks like
And I need to connect all the valley points and all the peak points, so I cannot simply apply tricks like changing odds = [1,3,5,7,9] to odds = [1,2,3,4,5,6,7,8,9,10] when drawing, as I wish the curve can also be smooth in the connection between points.
Thank you for your help!

I did something like this for the even and odd lines. For the odd line the x position equals the value itself (1:1), and for the even line the x position is the value minus 2.
import matplotlib.pyplot as plt

odds = [1,3,5,7,9]
evens = [2,4,6,8,10]
all_nums = [2,1,4,3,6,5,8,7,10,9]
even_sep=[]
odd_sep=[]
plt.plot(range(len(all_nums)), all_nums, label='odds and evens')
for draw_num_iter in range(len(all_nums)):
    draw_num = all_nums[draw_num_iter]
    plt.annotate(draw_num, xy=(draw_num_iter, draw_num), size=20)
# each even value sits at x = value - 2 in all_nums
for i in range(len(evens)):
    even_sep.append(evens[i]-2)
plt.plot(even_sep,evens,'ro-')
# each odd value sits at x = value in all_nums
for i in range(len(odds)):
    odd_sep.append(odds[i])
plt.plot(odd_sep,odds,'g')
plt.legend(loc='best')
plt.show()
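For the more general case mentioned in the question (connecting all peak points and all valley points of an arbitrary curve), one option is to derive the x positions from the data instead of hard-coding the offsets. This is only a sketch: local_extrema and plot_envelopes are hypothetical helper names, not matplotlib functions, and it assumes strict local extrema with no plateaus. It does not smooth the resulting lines.

```python
def local_extrema(values):
    """Return (peak_indices, valley_indices) of strict local extrema.

    Endpoints are compared with their single neighbour.
    """
    peaks, valleys = [], []
    n = len(values)
    for i, v in enumerate(values):
        neighbours = []
        if i > 0:
            neighbours.append(values[i - 1])
        if i < n - 1:
            neighbours.append(values[i + 1])
        if not neighbours:
            continue  # a single point is neither a peak nor a valley
        if all(v > x for x in neighbours):
            peaks.append(i)
        elif all(v < x for x in neighbours):
            valleys.append(i)
    return peaks, valleys


def plot_envelopes(values):
    """Plot the values plus the lines through their peaks and valleys."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    peaks, valleys = local_extrema(values)
    plt.plot(range(len(values)), values, 'b', label='all values')
    plt.plot(peaks, [values[i] for i in peaks], 'r', label='peaks')
    plt.plot(valleys, [values[i] for i in valleys], 'g', label='valleys')
    plt.legend(loc='best')
    plt.show()


all_nums = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
peaks, valleys = local_extrema(all_nums)
```

Calling plot_envelopes(all_nums) would then draw the three lines; for the example data the peaks are the even values and the valleys are the odd values.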

Related

"Zoom in" on a violinplot whilst keeping accurate quartile lines (matplotlib/seaborn)

TL;DR: How can I get a subrange of a violinplot whilst keeping accurate quartile lines?
I am using seaborn violinplots to make static charts for a report, but as far as I can tell, there's no way to redraw a particular area between limits whilst retaining the 25/median/75 quartile lines of the original dataset.
Here's my example dataset as a violin. The 25/median/75 values are left side: 1.0/5.0/9.0; right side: 2.0/5.0/9.0
My data has such a long tail that all the useful info is scrunched up into a tiny area. I want to ignore (but not throw away) the tail and show a closer look at the interesting bit.
I tried to reset the ylim using ax.set(ylim=(0, upp)), but the resultant graph is not great: it's jaggy and the inner lines don't meet the violin edge.
Is there a way to reset the y-axis limits but get a better quality result?
Next I tried to cut off the tail by dropping values from the dataset. I dropped anything over the 97th centile. The violin looks way better, but the quartile lines have been recalculated for this new dataset. They're showing a median of about 4, not 5 as per the original dataset.
I'm using inner="quartile", so the code that gets called in Seaborn is _ViolinPlotter::draw_quartiles
def draw_quartiles(self, ax, data, support, density, center, split=False):
    """Draw the quartiles as lines at width of density."""
    q25, q50, q75 = np.percentile(data, [25, 50, 75])
    self.draw_to_density(ax, center, q25, support, density, split,
                         linewidth=self.linewidth,
                         dashes=[self.linewidth * 1.5] * 2)
As you can see, it assumes (understandably) that one wants to draw the quartile lines at percentiles 25, 50 and 75. It'd be amazeballs if there was a way I could call draw_to_density with my own values (is there?).
At the moment, I am attempting to manually adjust the position of the lines. It's trivial to figure out & set the y-values:
for l in ax.lines:
    l.set_ydata(<get correct quartile value from original dataset>)
but I'm finding it hard to figure out the limits for x, i.e. the density of the distribution at the quartiles. It seems to involve gaussian kde, and tbh it's getting hacky and inelegant at this point. Is there an easy way to calculate how long each line should be?
What do you suggest?
Thanks for your help
Lnr
With thanks to @JohanC:
I added gridsize=1000 to the params of the violinplot and used ax.set(ylim=(0, upp)) to resize the y-axis to show the range from 0 to upp, where upp is the upper limit. Much prettier lookin' graph:
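The median shift described in the question (dropping everything above the 97th centile moves the quartile lines) can be reproduced without any plotting. The sketch below uses made-up stand-in data and the standard library's statistics.quantiles with method="inclusive", which applies the same linear interpolation rule as np.percentile's default; quartiles and trim_above_percentile are hypothetical helper names.

```python
import statistics


def quartiles(data):
    """25th, 50th and 75th percentiles using linear interpolation."""
    return statistics.quantiles(data, n=4, method="inclusive")


def trim_above_percentile(data, pct):
    """Drop values above the pct-th percentile (pct: whole number, 1 to 99)."""
    cutoff = statistics.quantiles(data, n=100, method="inclusive")[pct - 1]
    return [v for v in data if v <= cutoff]


data = list(range(1, 101))  # stand-in data, not the dataset from the question

original = quartiles(data)
trimmed = quartiles(trim_above_percentile(data, 97))

print("quartiles of the full data:   ", original)
print("quartiles of the trimmed data:", trimmed)
```

Because the trimmed quartiles are recomputed from fewer values, they move downward, which is why restricting the axis limits (rather than dropping data) keeps the lines at the original quartiles.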

How would one draw an arbitrary curve in createJS

I am attempting to write a function using createJS to draw an arbitrary function and I'm having some trouble. I come from a d3 background so I'm having trouble breaking out of the data-binding mentality.
Suppose I have 2 arrays xData = [-10, -9, ... 10] and yData = Gaussian(xData), which is pseudocode for mapping each element of xData to its value on the bell curve. How can I now draw yData as a function of xData?
Thanks
To graph an arbitrary function in CreateJS, you draw lines connecting all the data points you have. Because, well, that's what graphing is!
The easiest way to do this is a for loop going through each of your data points, and calling a lineTo() for each. Because the canvas drawing API starts a line where you last 'left off', you actually don't even need to specify the line start for each line, but you DO have to move the canvas 'pen' to the first point before you start drawing. Something like:
// First make our shape to draw into.
let graph = new createjs.Shape();
let g = graph.graphics;
g.beginStroke("#000");
// Move the 'pen' to the first point (normalize these values as well).
let xStart = xData[0];
let yStart = yourFunction(xData[0]);
g.moveTo(xStart, yStart);
for (let i = 1; i < xData.length; i++) {
    let nextX = xData[i];               // but normalized to fit on your graph area
    let nextY = yourFunction(xData[i]); // but similarly normalized
    g.lineTo(nextX, nextY);
}
This should get a basic version of the function drawing! Note that the line will be pretty jagged if you don't have a lot of data points, and you'll have to treat (normalize) your data to make it fit onto your screen. For instance, if you start at -10 for X, that's off the screen to the left by 10 pixels - and if it only runs from -10 to +10, your entire graph will be squashed into only 20 pixels of width.
I have a codepen showing this approach to graphing here. It's mapped to hit every pixel on the viewport and calculate a Y value for it, though, rather than your case where you have input X values. And FYI, the code for graphing is all inside the 'run' function at the top - everything in the PerlinNoiseMachine class is all about data generation, so you can ignore it for the purposes of this question.
Hope that helps! If you have any specific follow-up questions or code samples, please amend your question.
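The normalization step mentioned above is plain arithmetic and does not depend on CreateJS. As a language-neutral sketch (written in Python here; to_pixels is a hypothetical helper name and the 400x300 drawing area is an assumed size), each data value is mapped linearly from the data range onto a pixel range:

```python
def to_pixels(values, pixel_start, pixel_end):
    """Linearly map values from [min(values), max(values)] onto
    [pixel_start, pixel_end]. pixel_start may be larger than pixel_end,
    which flips the axis."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All values identical: place everything in the middle of the range.
        middle = (pixel_start + pixel_end) / 2
        return [middle for _ in values]
    scale = (pixel_end - pixel_start) / span
    return [pixel_start + (v - lo) * scale for v in values]


x_data = [-10, 0, 10]
y_data = [0.0, 1.0, 0.0]

x_pixels = to_pixels(x_data, 0, 400)   # left edge to right edge
y_pixels = to_pixels(y_data, 300, 0)   # flipped: canvas y grows downward
```

Passing the pixel range in reverse for y accounts for the canvas coordinate system, where y increases towards the bottom of the screen.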

Constructing a bubble trellis plot with lattice in R

First off, this is a homework question. The problem is ex. 2.6 from pg.26 of An Introduction to Applied Multivariate Analysis. It's laid out as:
Construct a bubble plot of the earthquake data using latitude and longitude as the scatterplot and depth as the circles, with greater depths giving smaller circles. In addition, divide the magnitudes into three equal ranges and label the points in your bubble plot with a different symbol depending on the magnitude group into which the point falls.
I have figured out that symbols, which is in base graphics, does not work well with lattice. Also, I haven't figured out whether lattice has the functionality to change symbol size (i.e. bubble size). I bought the lattice book in a fit of desperation last night, and as I see in some of the examples, it is possible to set symbol color and shape for each "cut" or panel. I am therefore working under the assumption that symbol size can also be manipulated, but I haven't been able to figure out how.
My code looks like:
plot(xyplot(lat ~ long | cut(mag, 3), data=quakes,
            layout=c(3,1), xlab="Longitude", ylab="Latitude",
            panel = function(x,y){
                grid.circle(x,y,r=sqrt(quakes$depth),draw=TRUE)
            }
))
Where I attempt to use the grid package to draw the circles, but when this executes, I just get a blank plot. Could anyone please point me in the right direction? I would be very grateful!
Here is some code for creating the plot that you need without using the lattice package. I obviously had to generate my own fake data, so you can disregard all of that and go straight to the plotting commands if you want.
####################################################################
#Pseudo Data
n = 20
latitude = sample(1:100,n)
longitude = sample(1:100,n)
depth = runif(n,0,.5)
magnitude = sample(1:100,n)
groups = rep(NA,n)
for(i in 1:n){
    if(magnitude[i] <= 33){
        groups[i] = 1
    }else if (magnitude[i] > 33 & magnitude[i] <=66){
        groups[i] = 2
    }else{
        groups[i] = 3
    }
}
####################################################################
#The actual code for generating the plot
plot(latitude[groups==1],longitude[groups==1],col="blue",pch=19,ylim=c(0,100),xlim=c(0,100),
     xlab="Latitude",ylab="Longitude")
points(latitude[groups==2],longitude[groups==2],col="red",pch=15)
points(latitude[groups==3],longitude[groups==3],col="green",pch=17)
points(latitude[groups==1],longitude[groups==1],col="blue",cex=1/depth[groups==1])
points(latitude[groups==2],longitude[groups==2],col="red",cex=1/depth[groups==2])
points(latitude[groups==3],longitude[groups==3],col="green",cex=1/depth[groups==3])
You just need to add default.units = "native" to grid.circle()
plot(xyplot(lat ~ long | cut(mag, 3), data=quakes,
            layout=c(3,1), xlab="Longitude", ylab="Latitude",
            panel = function(x,y){
                grid.circle(x,y,r=sqrt(quakes$depth),draw=TRUE, default.units = "native")
            }
))
Obviously you need to tinker with some of the settings to get what you want.
I have written a package called tactile that adds a function for producing bubbleplots using lattice.
tactile::bubbleplot(depth ~ lat*long | cut(mag, 3), data=quakes,
                    layout=c(3,1), xlab="Longitude", ylab="Latitude")
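Independently of the plotting package, the exercise needs two small data transformations: splitting the magnitudes into three equal ranges, and turning depth into a size that shrinks as depth grows. The following Python sketch shows one possible way to compute both; equal_range_groups and inverse_sizes are hypothetical helper names, the size limits are arbitrary, and the bin boundaries are handled slightly differently from R's cut().

```python
def equal_range_groups(values, k=3):
    """Assign each value to one of k equal-width ranges (0 .. k-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0 for _ in values]
    # The maximum value would land in bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def inverse_sizes(depths, smallest=1.0, largest=5.0):
    """Map depths to symbol sizes so that greater depth gives a smaller size."""
    lo, hi = min(depths), max(depths)
    if hi == lo:
        return [largest for _ in depths]
    return [largest - (d - lo) / (hi - lo) * (largest - smallest) for d in depths]


magnitudes = [0, 1, 2, 3, 4, 5, 6]   # made-up values
depths = [0, 50, 100]                # made-up values

groups = equal_range_groups(magnitudes)
sizes = inverse_sizes(depths)
```

The group index can then pick the plotting symbol and the size can feed whatever size parameter the plotting function offers.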

Contours based on a "label mask"

I have images that have had features extracted with a contouring algorithm (I'm doing astrophysical source extraction). This approach yields a "feature map" that has each pixel "labeled" with an integer (usually ~1000 unique features per map).
I would like to show each individual feature as its own contour.
One way I could accomplish this is:
for ii in range(labelmask.max()):
    contour(labelmask,levels=[ii-0.5])
However, this is very slow, particularly for large images. Is there a better (faster) way?
P.S.
A little testing showed that skimage's find-contours is no faster.
As per @tcaswell's comment, I need to explain why contour(labels, levels=np.unique(labels)+0.5) or something similar doesn't work:
1. Matplotlib spaces each subsequent contour "inward" by a linewidth to avoid overlapping contour lines. This is not the behavior desired for a labelmask.
2. The lowest-level contours encompass the highest-level contours
3. As a result of the above, the highest-level contours will be surrounded by a miniature version of whatever colormap you're using and will have extra-thick contours compared to the lowest-level contours.
Sorry for answering my own... impatience (and good luck) got the better of me.
The key is to use matplotlib's low-level C routines:
import numpy as np
import matplotlib._cntr  # private module; not available in newer matplotlib releases
from matplotlib.pyplot import imshow, plot

I = imshow(data)
E = I.get_extent()
x,y = np.meshgrid(np.linspace(E[0],E[1],labels.shape[1]), np.linspace(E[2],E[3],labels.shape[0]))
for ii in np.unique(labels):
    if ii == 0: continue
    tracer = matplotlib._cntr.Cntr(x,y,labels*(labels==ii))
    T = tracer.trace(0.5)
    contour_xcoords,contour_ycoords = T[0].T
    # to plot them:
    plot(contour_xcoords, contour_ycoords)
Note that labels*(labels==ii) will put each label's contour at a slightly different location; change it to just labels==ii if you want overlapping contours between adjacent labels.
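The location shift in that note can be checked with a little arithmetic, assuming the tracer places a contour by linear interpolation between neighbouring grid values (an assumption about the implementation, although it is the usual approach). Between a background pixel with value 0 and a label pixel with value ii, the 0.5 level is reached a fraction 0.5/ii of the way across. The helper name crossing_fraction is hypothetical.

```python
def crossing_fraction(label_value, level=0.5):
    """Fraction of the distance from a background pixel (value 0) to a
    neighbouring pixel with value `label_value` at which a linearly
    interpolated field reaches `level`."""
    if label_value <= level:
        raise ValueError("label_value must be larger than the contour level")
    return level / label_value


for value in (1, 10, 1000):
    print(value, crossing_fraction(value))
```

With labels==ii every mask has the value 1, so the contour always falls halfway between the pixels; with labels*(labels==ii) larger label values pull the contour closer to the background pixel.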

plotting matrices with gnuplot

I am trying to plot a matrix in Gnuplot as I would using imshow in Matplotlib. That means I just want to plot the actual matrix values, not the interpolation between values. I have been able to do this by trying
splot "file.dat" u 1:2:3 ps 5 pt 5 palette
This way we are telling the program to use columns 1,2 and 3 in the file, use squares of size 5 and space the points with very narrow gaps. However the points in my dataset are not evenly spaced and hence I get discontinuities.
Does anyone know a method of plotting matrix values in gnuplot even when the points are not evenly spaced along the x and y axes?
Gnuplot doesn't need to have evenly spaced X and Y axes (see another one of my answers: https://stackoverflow.com/a/10690041/748858). I frequently deal with grids that look like x[i] = f_x(i) and y[j] = f_y(j). This is quite trivial to plot; the datafile just looks like:
#datafile.dat
x1 y1 z11
x1 y2 z12
...
x1 yN z1N
#<--- blank line (leave these comments out of your datafile ;)
x2 y1 z21
x2 y2 z22
...
x2 yN z2N
#<--- blank line
...
...
#<--- blank line
xN y1 zN1
...
xN yN zNN
(note the blank lines)
A datafile like that can be plotted as:
set view map
splot "datafile.dat" u 1:2:3 w pm3d
The option set pm3d corners2color can be used to fine-tune which corner is used to color each rectangle.
Also note that you could make essentially the same plot doing this:
set view map
plot "datafile.dat" u 1:2:3 w image
Although I don't use this one myself, so it might fail with a non-equally spaced rectangular grid (you'll need to try it).
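If the matrix lives in a program rather than in a file, the blank-line-separated layout described above is easy to generate. The following Python sketch is one way to do it; format_blocks is a hypothetical helper name, and it assumes z[i][j] holds the value at (xs[i], ys[j]).

```python
def format_blocks(xs, ys, z):
    """Return gnuplot datafile text: one 'x y z' line per point, with one
    block per x value and a blank line between blocks."""
    blocks = []
    for i, x in enumerate(xs):
        lines = [f"{x} {y} {z[i][j]}" for j, y in enumerate(ys)]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


xs = [0, 1]          # need not be evenly spaced
ys = [0, 10]
z = [[1, 2],
     [3, 4]]

text = format_blocks(xs, ys, z)
print(text)
```

Writing the returned text to datafile.dat gives a file in the layout shown above.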
Response to your comment
Yes, pm3d does generate (M-1)x(N-1) quadrilaterals as you've alluded to in your comment -- It takes the 4 corners and (by default) averages their value to assign a color. You seem to dislike this -- although (in most cases) I doubt you'd be able to tell a difference in the plot for reasonably large M and N (larger than 20). So, before we go on, you may want to ask yourself if it is really necessary to plot EVERY POINT.
That being said, with a little work, gnuplot can still do what you want. The solution is to specify that a particular corner is to be used to assign the color to the entire quadrilateral.
#specify that the first corner should be used for coloring the quadrilateral
set pm3d corners2color c1 #could also be c2,c3, or c4.
Then simply append the last row and last column of your matrix so that they are plotted twice (making up an extra gridpoint to accommodate the larger dataset). You're not quite there yet: you still need to shift your grid values by half a cell so that your quadrilaterals are centered on the point in question. Which way you shift the cells depends on your choice of corner (c1, c2, c3, c4); you'll need to play around with it to figure out which one you want.
Note that the problem here isn't gnuplot. It's that there isn't enough information in the datafile to construct an MxN surface given MxN triples. At each point, you need to know its position (x,y), its value (z), and also the size of the quadrilateral to be drawn there -- which is more information than you've packed into the file. Of course, you can guess the size at the interior points (just meet halfway), but there's no guessing at the exterior points. "But why not just use the size of the next interior point?" That's a good question, and it would (typically) work well for rectangular grids, but that is only a special case (although a common one) which would (likely) fail miserably for many other grids. The point is that gnuplot decided that averaging the corners is typically "close enough", but then gives you the option to change it.
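For a rectangular grid, the "meet halfway" guess for interior points, combined with reusing the neighbouring cell size at the outer points, can be sketched as follows. This is an illustration in Python rather than gnuplot, cell_edges is a hypothetical helper name, and it only covers the rectangular special case discussed above. N cell centers produce N+1 edges, which matches the extra row and column that has to be appended.

```python
def cell_edges(centers):
    """Given N sorted cell centers (N >= 2), return N+1 cell edges.

    Interior edges sit halfway between neighbouring centers; the two outer
    edges reuse the half-width of the adjacent interior cell.
    """
    if len(centers) < 2:
        raise ValueError("need at least two centers to guess a cell size")
    first = centers[0] - (centers[1] - centers[0]) / 2
    last = centers[-1] + (centers[-1] - centers[-2]) / 2
    middle = [(a + b) / 2 for a, b in zip(centers, centers[1:])]
    return [first] + middle + [last]


x_centers = [0, 1, 3]            # unevenly spaced example values
x_edges = cell_edges(x_centers)
print(x_edges)
```

Applying the same function to the y coordinates gives the shifted grid on which each quadrilateral is centered on its data point.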
See the explanation for the input data here. You may have to change your data file's format accordingly.