Matplotlib problems with linewidths

I noticed in doing some line plots that Matplotlib exhibits strange behaviour (using Python 3.7 and the default TKAgg backend). I've created a program plotting lines of various widths to show the problem. The program creates a bunch of financial looking data and then runs through a loop showing line plots of various linewidths. At the beginning of each loop it asks the user to input the linewidth they would like to see. Just enter 0 to end the program.
import numpy as np
import matplotlib as mpl
from matplotlib.lines import Line2D
import matplotlib.pyplot as plt
# Initialize prices and arrays
initial_price = 35.24
quart_hour_prices = np.empty(32) # 32 15-minute periods per day
day_prices = np.empty([100,2]) # 100 days [high,low]
quart_hour_prices[0] = initial_price
# Create Data
for day in range(100):
    for t in range(1, 32):
        quart_hour_prices[t] = quart_hour_prices[t-1] + np.random.normal(0, .2) # 0.2 stand dev in 15 min
    day_prices[day,0] = quart_hour_prices.max()
    day_prices[day,1] = quart_hour_prices.min()
    quart_hour_prices[0] = quart_hour_prices[31]
# Setup Plot
fig, ax = plt.subplots()
# Loop through plots of various linewidths
while True:
    lw = float(input("Enter linewidth:")) # input linewidth
    if lw == 0: # enter 0 to exit program
        exit()
    plt.cla() # clear plot before adding new lines
    plt.title("Linewidth is: " + str(round(lw,2)) + " points")
    # loop through data to create lines on plot
    for d in range(100):
        high = day_prices[d,0] # column 0 holds the daily high
        low = day_prices[d,1] # column 1 holds the daily low
        hl_bar = Line2D(xdata=(d, d), ydata=(high, low), color='k', linewidth=lw, antialiased=False)
        ax.add_line(hl_bar)
    ax.autoscale_view()
    plt.show(block=False)
Matplotlib defines linewidths in points, with 72 points per inch, and uses a default figure resolution of 100 dpi. This means one point of linewidth covers 100/72 ≈ 1.39 pixels, or equivalently one pixel corresponds to 0.72 points. Thus I would expect linewidths up to 0.72 to be one pixel wide, those from 0.72 to 1.44 to be two pixels wide, and so on. But this is not what I observed.
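The point/pixel arithmetic above can be checked with a few lines. This is only a minimal sketch of the nominal conversion, assuming 72 points per inch and the default 100 dpi; the helper names are hypothetical and it says nothing about how the backend actually rasterizes the line.

```python
POINTS_PER_INCH = 72.0

def points_to_pixels(linewidth_pt, dpi=100.0):
    """Nominal width in pixels of a line that is linewidth_pt points wide."""
    return linewidth_pt * dpi / POINTS_PER_INCH

def pixels_to_points(n_pixels, dpi=100.0):
    """Linewidth in points that nominally covers n_pixels pixels."""
    return n_pixels * POINTS_PER_INCH / dpi

# Nominal widths for a few of the linewidths tried in the question
for lw in (0.35, 0.72, 1.08, 1.44):
    print(lw, "pt ->", round(points_to_pixels(lw), 3), "px")
```

Comparing these nominal values with the widths actually rendered is one way to see where the observed steps depart from the simple model.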
A 0.72 linewidth did indeed give me a line that was one pixel wide. And then when the linewidth is increased to 0.73 the line gets thicker as expected. But it is now three pixels wide, instead of the two I expected.
For linewidths less than 0.72 the plot remains the same all the way down to 0.36. But then when I enter a linewidth of 0.35 or less, the line suddenly gets thicker (2 pixels wide), as shown by the graph below. How can the line get thicker if I reduce the linewidth? This was very unexpected.
Continuing the same testing process for greater linewidths, the plot of the 0.73 linewidth remains the same all the way up until a width of 1.07. But then at 1.08 the linewidth mysteriously gets thinner (2 pixels wide) being the same as the 0.35 and below plots. How can the line get thinner if I increase the linewidth? This was also very unexpected.
This strange behavior continues with greater linewidths. Feel free to use the above code to try it for yourself. Here is a table to summarize the results:
Linewidth (points)    Rendered width (pixels)
0.01 - 0.35           2
0.36 - 0.72           1
0.73 - 1.07           3
1.08 - 1.44           2
1.45 - 1.79           4
1.80 - 2.16           3
2.17 - 2.51           5
2.52 - 2.88           4
The pattern is something like 1 step back, 2 steps forward. Does anyone know why Matplotlib produces these results?
The practical purpose behind this question is that I am trying to produce an algorithm to vary the linewidth depending upon the density of the data in the plot. But this is very difficult to do when the line thicknesses are jumping around in such a strange fashion.

Related

Matplotlib, Inkscape, Spyder, plots and SVG compatibility (true axis size)

I have been plotting data for years during my PhD and always had to fight with something that unfortunately plagues the scientific community: negligent data manipulation.
My problem is that when I plot two graphics with matplotlib whose Y-axis tick labels have different lengths, the result is two graphics with different X-axis sizes.
When I copy the resulting SVG image directly from the Spyder IPython console (Copy SVG) and paste it into Inkscape for editing, matching the axes is a painful task that requires scaling them with absolute precision. I am aware there are plugins that can rescale plots in Inkscape.
Bonus solved problem 1: for some reason, the size of an SVG created by matplotlib is scaled by 0.75 relative to Inkscape
Bonus solved problem 2: Matplotlib uses... inches, so the 25.4 that is in the following code lines is simply to convert from inch to millimeters.
Sometimes, having more control at the root is better than patching and patching and patching. So here is my solution to those who have been agonizing like me over being able to have two plots with the same absolute axis sizes:
from matplotlib import pyplot as plt
inch = False # Set to True if you want to use inch (blergh...).
width = 50 # The actual size in millimeters for the X axis to have.
height = 20 # The actual size in millimeters for the Y axis to have.
figsize = [(-0.212+width)/(1+24.4*(not inch)),(-0.212+height)/(1+24.4*(not inch))] # [W, H]
# Note the 0.212 mm, which is the thickness of the axis line: the cap at each end of the axis is half the thickness, and Inkscape counts it in the size of the axis.
# So, when you use the size of a line from Inkscape as the desired size of the axis in matplotlib, the axis linewidth should be at its default of 0.8.
# That 0.8 is in points (rcParams['axes.linewidth']); 0.8 pt * 0.75 = 0.6 pt, which is about 0.212 mm, consistent with the 0.75 scaling noted above.
height_scale = 3 # Scale to account for the axis title, labels and ticks.
width_scale = 2 # Scale to account for the axis title, labels and ticks.
figsize = [width_scale*figsize[0]/0.75, height_scale*figsize[1]/0.75]
fig = plt.figure(figsize = (figsize[0], figsize[1]))
wpos = (50/(1+24.4*(not inch)))/(figsize[0]/0.75) # Giving 50 mm mandatory position shift for the Y axis, to accommodate the title, labels and ticks.
hpos = (40/(1+24.4*(not inch)))/(figsize[1]/0.75) # Giving 40 mm mandatory position shift for the X axis to accommodate the title, labels and ticks.
# Now comes the problem. The AXIS size is defined relatively to the FIGURE size. The following values will simply use the rescaled FIGURE sizes:
wscale = 1/width_scale # = (width_scale*figsize[0]/0.75)/width_scale = figsize[0]/0.75 which is our target size for Inkscape.
hscale = 1/height_scale
ax = fig.add_axes([wpos, hpos, wscale, hscale])
Then you can plot at will, copy the SVG output (in Spyder's IPython console, at least) and paste it in Inkscape.
The only setback is that the whole FIGURE size will be abnormal and you'll have to remove the white background from it in Inkscape. But that is something probably all of us already do.
This is a minimal working code. You can paste it in your IPython console and copy the SVG output, paste it in Inkscape and check the axis line size. It will be with a width of 50 mm and a height of 20 mm.
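The same idea can be written as a small helper. This is a simplified sketch, not the code above: it only sizes the axes rectangle in millimetres, and it deliberately leaves out the 0.75 Inkscape scaling and the 0.212 mm line-thickness correction. The function name and the right/top margins are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt

MM_PER_INCH = 25.4

def fixed_size_axes(width_mm, height_mm, left_mm=50.0, bottom_mm=40.0,
                    right_mm=10.0, top_mm=10.0):
    """Create a figure whose axes rectangle is exactly width_mm x height_mm.

    The figure is made just large enough to hold the axes plus the margins
    reserved for titles, labels and ticks.
    """
    fig_w_mm = left_mm + width_mm + right_mm
    fig_h_mm = bottom_mm + height_mm + top_mm
    fig = plt.figure(figsize=(fig_w_mm / MM_PER_INCH, fig_h_mm / MM_PER_INCH))
    # add_axes takes [left, bottom, width, height] as fractions of the figure
    ax = fig.add_axes([left_mm / fig_w_mm, bottom_mm / fig_h_mm,
                       width_mm / fig_w_mm, height_mm / fig_h_mm])
    return fig, ax

fig, ax = fixed_size_axes(50, 20)
box = ax.get_position()
print(box.width * fig.get_figwidth() * MM_PER_INCH,
      box.height * fig.get_figheight() * MM_PER_INCH)
```

Because the axes size is fixed in absolute units, two figures made this way have matching axes regardless of how long their tick labels are, as long as the labels fit in the margins.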

Dynamically scaling axes during a matplotlib ArtistAnimation

It appears to be impossible to change the y and x axis view limits during an ArtistAnimation, and have the frames replayed with different axis limits.
The limits seem to be fixed to those set last before the animation function is called.
In the code below, I have two plotting stages. The input data in the second plot is a much smaller subset of the data in the 1st frame. The data in the 1st stage has a much wider range.
So, I need to "zoom in" when displaying the second plot (otherwise the plot would be very tiny if the axis limits remain the same).
The two plots are overlaid on two different images (that are of the same size, but different content).
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import matplotlib.image as mpimg
import random
# sample 640x480 image. Actual frame loops through
# many different images, but of same size
image = mpimg.imread('image_demo.png')
fig = plt.figure()
plt.axis('off')
ax = fig.gca()
artists = []
def plot_stage_1():
    # both x, y axis limits automatically set to 0 - 100
    # when we call ax.imshow with this extent
    im_extent = (0, 100, 0, 100) # (xmin, xmax, ymin, ymax)
    im = ax.imshow(image, extent=im_extent, animated=True)
    # y axis is a list of 100 random numbers between 0 and 100
    p, = ax.plot(range(100), random.choices(range(100), k=100))
    # Text label at 90, 90
    t = ax.text(im_extent[1]*0.9, im_extent[3]*0.9, "Frame 1")
    artists.append([im, t, p])
def plot_stage_2():
    # axes remain at the 0 - 100 limit from the previous
    # imshow extent so both the background image and plot are tiny
    im_extent = (0, 10, 0, 10)
    # so let's update the x, y axis limits
    ax.set_xlim(im_extent[0], im_extent[1])
    ax.set_ylim(im_extent[2], im_extent[3])
    im = ax.imshow(image, extent=im_extent, animated=True)
    p, = ax.plot(range(10), random.choices(range(10), k=10))
    # Text label at 9, 9
    t = ax.text(im_extent[1]*0.9, im_extent[3]*0.9, "Frame 2")
    artists.append([im, t, p])
plot_stage_1()
plot_stage_2()
# clear white space around plot
fig.subplots_adjust(left=0, bottom=0, right=1, top=1, wspace=None, hspace=None)
# set figure size
fig.set_size_inches(6.67, 5.0, True)
anim = animation.ArtistAnimation(fig, artists, interval=2000, repeat=False, blit=False)
plt.show()
If I call just one of the two functions above, the plot is fine. However, if I call both, the axis limits in both frames will be 0 - 10, 0 - 10. So frame 1 will be super zoomed in.
Also calling ax.set_xlim(0, 100), ax.set_ylim(0, 100) in plot_stage_1() doesn't help. The last set_xlim(), set_ylim() calls fix the axis limits throughout all frames in the animation.
I could keep the axis bounds fixed and apply a scaling function to the input data.
However, I'm curious to know whether I can simply change the axis limits -- my code will be better this way, because the actual code is complicated with multiple stages, zooming plots across many different ranges.
Or perhaps I have to rejig my code to use FuncAnimation, instead of ArtistAnimation?
FuncAnimation appears to result in the expected behavior. So I'm changing my code to use that instead of ArtistAnimation.
Still curious to know though, whether this can at all be done using ArtistAnimation.
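For reference, the FuncAnimation variant can look roughly like the sketch below. It is not my actual code: the random array stands in for the real background image, and the `stages` list and `update` function are hypothetical names. The point is only that the axis limits are set inside the per-frame callback, so each frame gets its own view.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((48, 64))  # stand-in for the real background image

# One entry per stage: (extent as (xmin, xmax, ymin, ymax), text label)
stages = [((0, 100, 0, 100), "Frame 1"),
          ((0, 10, 0, 10), "Frame 2")]

fig, ax = plt.subplots()

def update(i):
    """Redraw stage i from scratch and set its own axis limits."""
    extent, label = stages[i]
    xmin, xmax, ymin, ymax = extent
    ax.clear()
    ax.axis("off")  # clear() turns the axis back on, so switch it off again
    ax.imshow(image, extent=extent)
    n = int(xmax - xmin)
    ax.plot(np.arange(n) + xmin, rng.integers(ymin, ymax, size=n))
    ax.text(xmax * 0.9, ymax * 0.9, label)
    # Limits are applied per frame, which ArtistAnimation does not do
    ax.set_xlim(xmin, xmax)
    ax.set_ylim(ymin, ymax)

anim = animation.FuncAnimation(fig, update, frames=len(stages),
                               interval=2000, repeat=False, blit=False)
```

With an interactive backend, finish with plt.show(); the limits then follow each stage's extent as the frames play.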

python pyplot negative contour lines not displayed

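import numpy as np
import matplotlib.pyplot as plt
from matplotlib import mlab # note: mlab.griddata was removed in Matplotlib 3.1
# r_s, i_s, r_z are 1-D arrays holding the scattered x, y and z values
minr=min(r_s)
maxr=max(r_s)
mini=min(i_s)
maxi=max(i_s)
xi=np.arange(minr,maxr, 0.1)
yi=np.arange(mini,maxi, 0.1)
zi=mlab.griddata(r_s, i_s, r_z, xi, yi, interp='linear')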
plt.rcParams['contour.negative_linestyle'] = 'dashed'
CS=plt.contour(xi,yi,zi,50, linewidths =2.0)
plt.clabel(CS, inline=1, fontsize=10)
CS = plt.contourf(xi,yi,zi,15,cmap=plt.cm.rainbow)
plt.colorbar()
plt.xlabel('RS')
plt.ylabel('IS')
plt.show()
print ("END")
The above code is written to display a contour map of scattered 3D points r_s, i_s, r_z. I was able to plot the contour map/lines but only positive contour lines are displayed. Am I missing something? I want to show many contour lines including the negative ones.
data varies as follows:
r_s: from -7 to 2.0 with a step of 0.1
i_s: from -3 to 15 with a step of 0.1
r_z: from -1100 to 400 randomly
I was able to find a solution to my problem. The code is fine. The problem is in the data. In fact, some data points (few points) were above 10^6 which forced the contour plot not to show the negative points (about -1000). After fixing the data, I was able to plot contour lines including negative contour lines with the above code.
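To illustrate the kind of fix involved, here is a minimal sketch with made-up data, not my real data set: it checks the raw range, masks the stray values, and then contours the cleaned grid. The synthetic surface and the 1e5 threshold are assumptions chosen only for the illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical gridded data in roughly the range described (-1100 to 400)
x = np.arange(-7, 2, 0.1)
y = np.arange(-3, 15, 0.1)
X, Y = np.meshgrid(x, y)
Z = -350 + 750 * np.sin(X) * np.cos(Y / 3)
Z[10, 10] = 2e6  # one bad point, like the stray values above 10^6

# A quick look at the range exposes the outlier
print("raw range:", Z.min(), Z.max())

# Mask anything beyond an assumed plausibility threshold before contouring
threshold = 1e5
Z_clean = np.ma.masked_where(np.abs(Z) > threshold, Z)

plt.rcParams['contour.negative_linestyle'] = 'dashed'
fig, ax = plt.subplots()
CS = ax.contour(X, Y, Z_clean, 50, linewidths=2.0)
n_negative = int((CS.levels < 0).sum())
print("negative contour levels after masking:", n_negative)
```

With the outlier masked, the automatically chosen levels are spread over the real data range, so the negative region gets its share of contour lines.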

How to change pyplot.specgram x and y axis scaling?

I have never worked with audio signals before and know little about signal processing. Nevertheless, I need to represent an audio signal using the pyplot.specgram function from the matplotlib library. Here is how I do it.
import matplotlib.pyplot as plt
import scipy.io.wavfile as wavfile
rate, frames = wavfile.read("song.wav")
plt.specgram(frames)
The result I am getting is this nice spectrogram below:
When I look at x-axis and y-axis which I suppose are frequency and time domains I can't get my head around the fact that frequency is scaled from 0 to 1.0 and time from 0 to 80k.
What is the intuition behind it and, what's more important, how to represent it in a human friendly format such that frequency is 0 to 100k and time is in sec?
As others have pointed out, you need to specify the sample rate, else you get a normalised frequency (between 0 and 1) and sample index (0 to 80k). Fortunately this is as simple as:
plt.specgram(frames, Fs=rate)
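The effect of Fs can be seen without a WAV file. The following is a small sketch using a synthetic tone; the 8000 Hz rate and the 440 Hz tone are assumptions made up for the demonstration. It compares the frequency and time arrays returned with and without Fs.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

rate = 8000  # hypothetical sample rate in Hz
t = np.arange(0, 2.0, 1.0 / rate)  # two seconds of samples
frames = np.sin(2 * np.pi * 440 * t)  # a 440 Hz tone standing in for the audio

fig, (ax0, ax1) = plt.subplots(2, 1)
# Without Fs: normalised frequency, and "time" measured in samples
_, freqs_default, times_default, _ = ax0.specgram(frames)
# With Fs: frequency in Hz up to rate / 2, and time in seconds
_, freqs_hz, times_s, _ = ax1.specgram(frames, Fs=rate)

print("max frequency:", freqs_default.max(), "vs", freqs_hz.max())
print("max time:", times_default.max(), "vs", times_s.max())
```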
To expand on Nukolas's answer, and combining it with my answers to "Changing plot scale by a factor in matplotlib" and "matplotlib intelligent axis labels for timedelta", we can not only get kHz on the frequency axis, but also minutes and seconds on the time axis.
import matplotlib.pyplot as plt
import scipy.io.wavfile as wavfile
cmap = plt.get_cmap('viridis') # this may fail on older versions of matplotlib
vmin = -40 # hide anything below -40 dB
cmap.set_under(color='k', alpha=None)
rate, frames = wavfile.read("song.wav")
fig, ax = plt.subplots()
pxx, freq, t, cax = ax.specgram(frames[:, 0], # first channel
                                Fs=rate, # to get frequency axis in Hz
                                cmap=cmap, vmin=vmin)
cbar = fig.colorbar(cax)
cbar.set_label('Intensity dB')
ax.axis("tight")
# Prettify
import matplotlib.ticker
import datetime
ax.set_xlabel('time h:mm:ss')
ax.set_ylabel('frequency kHz')
scale = 1e3 # kHz
ticks = matplotlib.ticker.FuncFormatter(lambda x, pos: '{0:g}'.format(x/scale))
ax.yaxis.set_major_formatter(ticks)
def timeTicks(x, pos):
    # format a tick position given in seconds as h:mm:ss
    d = datetime.timedelta(seconds=x)
    return str(d)
formatter = matplotlib.ticker.FuncFormatter(timeTicks)
ax.xaxis.set_major_formatter(formatter)
plt.show()
Result:
Firstly, a spectrogram is a representation of the spectral content of a signal as a function of time - this is a frequency-domain representation of the time-domain waveform (e.g. a sine wave, your file "song.wav" or some other arbitrary wave - that is, amplitude as a function of time).
The frequency values (y-axis, Hertz) are wholly dependent on the sampling frequency of your waveform ("song.wav") and will range from 0 to sampling frequency / 2, with the upper limit being the "Nyquist frequency" or "folding frequency" (https://en.wikipedia.org/wiki/Aliasing#Folding). The sampling frequency is defined as 1 / dt, with dt being the time interval between discrete samples of the waveform. The matplotlib specgram function cannot determine it from the samples alone: if Fs is not specified it defaults to 2, which gives a normalised frequency axis from 0 to 1. You can pass the option Fs=<sampling rate> to the specgram function to define it. It will be easier for you to get your head around what is going on if you figure out and pass these variables to the specgram function yourself.
The time values (x-axis, seconds) are purely dependent on the length of your "song.wav". You may notice some whitespace or padding if you use a large window length to calculate each spectrum slice (think of the individual spectra, which are arranged vertically and tiled horizontally to create the spectrogram image).
To make the axes more intuitive in the plot, use x- and y-axis labels; you can also scale the axis values (i.e. change the units).
Take-home message: try to be a bit more verbose with your code; see below for my example.
import matplotlib.pyplot as plt
import numpy as np
# generate a 5Hz sine wave
fs = 50
t = np.arange(0, 5, 1.0/fs)
f0 = 5
phi = np.pi/2
A = 1
x = A * np.sin(2 * np.pi * f0 * t +phi)
nfft = 25
# plot x-t, time-domain, i.e. source waveform
plt.subplot(211)
plt.plot(t, x)
plt.xlabel('time')
plt.ylabel('amplitude')
# plot power(f)-t, frequency-domain, i.e. spectrogram
plt.subplot(212)
# call specgram function, setting Fs (sampling frequency)
# and nfft (number of waveform samples, defining a time window,
# for which to compute the spectra)
plt.specgram(x, Fs=fs, NFFT=nfft, noverlap=5, detrend='mean', mode='psd')
plt.xlabel('time')
plt.ylabel('frequency')
plt.show()
5Hz_spectrogram:

matplotlib: preventing a few very large (or small) values to affect my contour

When plotting data, there are sometimes a few very large (or very small) numbers which, if not taken care of, distort the contour plot. One solution is to exclude the 10% highest and lowest values from the contour color grading and treat them as "less than" and "more than" categories. The following figure shows the idea:
The two arrow shapes at the top and the bottom of the bar illustrate this. Any value above 14 is shown in white and any value below -2 is shown in black. How is this possible in matplotlib?
How can I:
- put the highest 5% and the lowest 5% of values in the two categories shown as the triangular parts at both ends of the bar? (Should I define this in the contour call, or are there other ways?)
- give specific values instead of percentages? For instance, show any value above 14 in the white triangle and any value below -2 as black areas?
Thank you so much for your help.
Taken from http://matplotlib.org/examples/api/colorbar_only.html. You can play with it and you will see if it could solve your problem.
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-1,1,100)
X,Y = np.meshgrid(x,x)
Z = np.exp(-X**2-Y**2)
vmin = 0.3 #Lower value
vmax = 0.9 #Upper value
bounds = np.linspace(vmin,vmax,4)
cmap = mpl.colors.ListedColormap([(0,0,0),(0.5,0.5,0.5),(0,1,0),(1,1,1)])
# set the out-of-range colors before the colormap is used for drawing
cmap.set_over([0,0,1])
cmap.set_under([1,0,0])
norm = mpl.colors.BoundaryNorm(bounds, cmap.N)
plt.imshow(Z,cmap=cmap,interpolation='nearest',vmin=vmin,vmax=vmax)
ax = plt.colorbar().ax
cb = mpl.colorbar.ColorbarBase(ax, norm=norm,
                               extend='both',
                               cmap=cmap)
plt.show()
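For the contour case specifically, a shorter route may be contourf with explicit levels and extend='both'. The sketch below is one possible approach rather than the method from the linked example: the data and the `clipped_levels` helper are hypothetical, and the same helper covers both the percentage variant and the fixed-value variant (-2 and 14) from the question.

```python
import copy
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data with a range somewhat wider than -2 .. 14
x = np.linspace(-3, 3, 120)
X, Y = np.meshgrid(x, x)
Z = 6 + 10 * np.exp(-(X**2 + Y**2)) - 3 * X

def clipped_levels(z, lo=None, hi=None, pct=5, n=9):
    """Contour levels between lo and hi.

    If lo or hi is not given, it is taken from the pct-th percentile of z
    (lower end) or the (100 - pct)-th percentile (upper end).
    """
    if lo is None:
        lo = np.percentile(z, pct)
    if hi is None:
        hi = np.percentile(z, 100 - pct)
    return np.linspace(lo, hi, n)

levels = clipped_levels(Z, lo=-2, hi=14)  # fixed values
# levels = clipped_levels(Z, pct=5)       # or: clip the lowest/highest 5%

cmap = copy.copy(plt.get_cmap("viridis"))  # copy so the shared colormap is untouched
cmap.set_over("white")   # color for values above the top level
cmap.set_under("black")  # color for values below the bottom level

fig, ax = plt.subplots()
cs = ax.contourf(X, Y, Z, levels=levels, cmap=cmap, extend="both")
fig.colorbar(cs)  # the colorbar gets triangular ends for the two categories
```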