Plotting audio data properties over long time periods - matplotlib

Using Python and matplotlib, I would like to plot sensor data over a period of several hours. The signal arrives via an audio card and is sampled in short chunks. In the example below, amplitude and RMS are plotted.
To plot RMS and other properties over much longer time periods than shown here, downsampling is perhaps needed. I am not sure how to accomplish that and would appreciate any advice. The intention is to run the code on a Raspberry Pi.
Update 1. A very minimal example is shown for getting a longer time view of RMS.
There is a noticeable delay in the response to audio signals, in particular when adding more plots to the figure.
I also tried FuncAnimation without blitting, because I would like to show a real-time axis, and it is equally slow. Using PyQt should give better results.
import pyaudio
import struct
import matplotlib.pyplot as plt
import numpy as np
mic = pyaudio.PyAudio()
FORMAT = pyaudio.paInt16
RATE = 44100
CHUNK = int(RATE/20)
stream =, channels=CHANNELS, rate=RATE, input=True,
fig = plt.figure()
ax1 = fig.add_subplot(2, 1, 1)
ax2 = fig.add_subplot(2, 1, 2)
ax1.set_xlabel("Samples = 2*Chunk length ")
ax1.set_title('Audio example')
x = np.arange(0, 2 * CHUNK, 2)
ax1.set_ylim(-10e3, 10e3)
ax1.set_xlim(0, CHUNK)
line1, = ax1.plot(x, np.random.rand(CHUNK))
line2, = ax2.plot(x, np.random.rand(CHUNK))
ts = []
rs = []
while True:
data =
data = np.frombuffer(data, np.int16)
d = data.astype(np.float64)  # np.float was removed in NumPy 1.24; data is already an int16 array
rms2 = np.sqrt( np.mean(d**2) )
# Add x and y to lists
#Draw x and y lists
ax2.plot(ts,rs,color= 'black')
# Format plot
ax2.set_xlabel("Time in UTC")
ax2.set_ylabel("RMS values")
plt.setp(ax2.get_xticklabels(), ha="right", rotation=45)
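For the long-period view asked about above, one possible approach is to keep only one RMS value per chunk and then merge consecutive values into coarser bins before plotting. The sketch below is not the asker's code: the helper names `chunk_rms` and `downsample_rms` are hypothetical, and synthetic noise stands in for the audio stream. It assumes all chunks have the same length, in which case averaging the squared RMS values gives the true RMS of the longer span.

```python
import numpy as np

RATE = 44100
CHUNK = RATE // 20  # 50 ms per chunk, as in the question


def chunk_rms(samples):
    """RMS of one chunk of int16 samples (hypothetical helper)."""
    d = np.asarray(samples, dtype=np.float64)
    return np.sqrt(np.mean(d ** 2))


def downsample_rms(rms_values, factor):
    """Merge every `factor` consecutive RMS values into a single RMS value.

    Averaging the squares (not the RMS values themselves) keeps the result
    equal to the RMS of the combined samples when chunks have equal length.
    """
    r = np.asarray(rms_values, dtype=np.float64)
    n = (len(r) // factor) * factor  # drop an incomplete trailing block
    blocks = r[:n].reshape(-1, factor)
    return np.sqrt(np.mean(blocks ** 2, axis=1))


# Synthetic stand-in for one minute of audio: 1200 chunks of Gaussian noise
rng = np.random.default_rng(0)
audio = rng.normal(0, 1000, size=(1200, CHUNK)).astype(np.int16)

rms_per_chunk = np.array([chunk_rms(c) for c in audio])
rms_per_second = downsample_rms(rms_per_chunk, 20)  # 20 chunks = 1 second
```

At one value per second, several hours of data come to only a few thousand points, which is cheap to redraw even on a Raspberry Pi.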


xarray : how to stack several pcolormesh figures above a map?

For an ML project I'm currently working on, I need to verify whether the trained data are good or not.
Let's say that I'm "splitting" the sky into several altitude grids (let's take 3 values for the moment) for a given region (let's say, Europe).
One grid could be a signal reception strength (RSSI), another one the signal quality (RSRQ).
Each cell of the grid is therefore a rectangle, and it holds the mean value of each measurement (i.e. RSSI or RSRQ) performed in that area.
I have hundreds of millions of data points.
In the code below, I know how to draw a coloured mesh with xarray for each altitude: I just use xr.plot.pcolormesh(lat,lon, the_data_set); that's fine
But this will only give me a "flat" figure like this:
RSSI value at 3 different altitudes
I need to draw all the pcolormesh() of a dataset for each altitude in such way that:
1: I can have the map at the bottom
2: Each pcolormesh() is stacked and "displayed" at its altitude
3: I need to add a 3d scatter plot for testing my trained data
4: Need to be interactive as I have to zoom in areas
For 2 and 3 above, I managed to do something using plt and cartopy (image not included).
However, the plt/cartopy combination is not as interactive as plotly, and plotly doesn't have the pcolormesh functionality.
In any case, I still don't know how to "stack" the pcolormesh results that I got above.
I've been searching the Internet for a few days but haven't found anything that satisfies all my criteria.
What I did to get my pcolormesh:
import numpy as np
import xarray as xr
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
class super_data():
def __init__(self, lon_bound,lat_bound,alt_bound,x_points,y_points,z_points):
self.lon_bound = lon_bound
self.lat_bound = lat_bound
self.alt_bound = alt_bound
self.x_points = x_points
self.y_points = y_points
self.z_points = z_points
self.lon, self.lat, self.alt = np.meshgrid(np.linspace(self.lon_bound[0], self.lon_bound[1], self.x_points),
np.linspace(self.lat_bound[0], self.lat_bound[1], self.y_points),
np.linspace(self.alt_bound[0], self.alt_bound[1], self.z_points))
self.this_xr = xr.Dataset(
coords={'lat': (('latitude', 'longitude','altitude'), self.lat),
'lon': (('latitude', 'longitude','altitude'), self.lon),
'alt': (('latitude', 'longitude','altitude'), self.alt)})
def add_data_array(self,ds_name,ds_min,ds_max):
def create_temp_data(ds_min,ds_max):
data = np.random.randint(ds_min,ds_max,size=self.y_points * self.x_points)
return data
temp_data = []
# Create "z_points" number of layers in the z axis
for i in range(self.z_points):
data = np.concatenate(temp_data)
data = data.reshape(self.z_points,self.x_points, self.y_points)
self.this_xr[ds_name] = (("altitude","longitude","latitude"),data)
def plot(self,dataset, extent=None, plot_center=False):
# I want t
if np.sqrt(self.z_points) == np.floor(np.sqrt(self.z_points)):
side_size = int(np.sqrt(self.z_points))
side_size = int(np.floor(np.sqrt(self.z_points) + 1))
fig = plt.figure()
for i in range(side_size):
for j in range(side_size):
if i_ax < self.z_points+1:
this_dataset = self.this_xr[dataset].sel(altitude=i_ax-1)
# Initialize figure with subplots
ax = fig.add_subplot(side_size, side_size, i_ax, projection=ccrs.PlateCarree())
i_ax += 1
this_dataset.plot.pcolormesh('lon', 'lat', ax=ax, infer_intervals=True, alpha=0.5)
if __name__ == "__main__":
# Wanted coverage :
lons = [-15, 30]
lats = [35, 65]
alts = [1000, 5000]
xarr = super_data(lons,lats,alts,10,8,3)
# Add some fake data
Thanks for your help
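One way to get the "stacking" part (point 2) with plain matplotlib is to draw each layer as a flat 3D surface at its altitude and colour it through `facecolors`. This is only a sketch under assumptions, not the asker's method: it uses synthetic RSSI-like values on a 10 x 8 grid, and it does not address the base map or plotly-level interactivity.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401  (registers the "3d" projection on older Matplotlib)

# Same coverage as in the question: 10 longitudes, 8 latitudes, 3 altitudes
lon, lat = np.meshgrid(np.linspace(-15, 30, 10), np.linspace(35, 65, 8))
altitudes = np.linspace(1000, 5000, 3)

# Synthetic values standing in for the real measurements (assumption)
rng = np.random.default_rng(0)
layers = rng.integers(-120, -60, size=(altitudes.size,) + lon.shape)

# One shared colour scale so the layers are comparable
norm = Normalize(vmin=layers.min(), vmax=layers.max())
cmap = plt.get_cmap("viridis")

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
for alt, layer in zip(altitudes, layers):
    # A flat surface at constant z = alt, coloured cell by cell
    ax.plot_surface(lon, lat, np.full_like(lon, alt),
                    facecolors=cmap(norm(layer)), shade=False, alpha=0.5)

ax.set_xlabel("lon")
ax.set_ylabel("lat")
ax.set_zlabel("altitude")
fig.savefig("stacked_layers.png")
```

A 3D scatter of test points (point 3) could be added to the same axes with `ax.scatter(lons, lats, alts)`.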

Unexplained "drops" in Savgol smoothing with higher polynomial for trends, stock, energy data (all kinds of time series basically!)

I have been trying to smooth curves with a Savitzky-Golay filter (savgol_filter from SciPy) and, in several of my attempts, raising the polynomial degree resulted in "drops" like the one shown below. This example uses Google Trends data, but I had similar problems with stock data and electricity consumption data. Any lead as to why it behaves like this, or how to solve it (and be able to raise the polynomial degree), would be highly appreciated.
Image below: "Sample output".
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
from pandas.plotting import register_matplotlib_converters
from pytrends.request import TrendReq
pytrends = TrendReq(hl='en-US', tz=360)
from scipy.signal import savgol_filter
kw_list = ["Carbon footprint"]
pytrends.build_payload(kw_list, timeframe='2004-12-14 2019-12-25', geo='', gprop='')
da1 = pytrends.interest_over_time()
#(drop last one for Savgol as need odd number, used to have 196 records)
Y3 = da1["Carbon footprint"]
fig = plt.figure(figsize=(18,9))
l = Y3.shape[0]
l = l if l%2 == 1 else l-1
# window = odd number closest to size of data
ax1 = plt.subplot(2,1,1)
ax1 = sns.lineplot(data=Y3, color="navy")
#Savgol with polynomial order = 7 is fine (but misses the initial plateau)
Y3_smooth = savgol_filter(Y3,l, 7)
ax1 = sns.lineplot(x=da1.index.to_pydatetime(),y=Y3_smooth, color="red")
plt.title(f"red = with Savgol, polynomial order = 7, window = {l}", fontsize=18)
ax2 = plt.subplot(2,1,2)
ax2 = sns.lineplot(data=Y3, color="navy")
#Savgol with polynomial order = 9 or more has a weird drop
Y3_smooth = savgol_filter(Y3,l, 10)
ax2 = sns.lineplot(x=da1.index.to_pydatetime(),y=Y3_smooth, color="red")
plt.title(f"red = with Savgol, polynomial order = 10, window = {l}", fontsize=18)
Sample output
If anyone is interested, I found this workaround using a different way to smooth. It works well including in the beginning and end, and allows a fine tuning of the degree of smoothing.
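from scipy.ndimage import gaussian_filter1d  # scipy.ndimage.filters is a deprecated namespace

def smooth(y, sigma=2):
    # Gaussian smoothing; a larger sigma gives a smoother curve
    y_smooth = gaussian_filter1d(y, sigma)
    return y_smooth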
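To compare the two smoothers side by side, here is a minimal sketch on synthetic data (the real Google Trends series is not reproduced here). The window length of 31, the polynomial order of 3 and the sigma of 4 are arbitrary illustrative choices. As an unverified assumption, a window spanning the whole series with a degree-10 polynomial amounts to one global high-degree fit, which may be where the artefacts come from; a much shorter window with a low order avoids that regime.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import savgol_filter

# Synthetic noisy trend standing in for the real series (assumption)
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 181)
y = 50 + 30 * t ** 2 + rng.normal(0, 5, t.size)

# Gaussian smoothing; mode="nearest" repeats the edge values at both ends
y_gauss = gaussian_filter1d(y, sigma=4, mode="nearest")

# Savitzky-Golay with a short odd window and a low polynomial order
y_savgol = savgol_filter(y, window_length=31, polyorder=3)
```

Both outputs have the same length as the input, so they can be plotted against the original index directly.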

Python keeps overwriting hist on previous plot but doesn't save it with the desired plot

I am saving two separate figures, each of which should contain 2 plots together.
The first figure is fine. The second one, however, is drawn on top of the previous plot rather than on a new one, and the saved figure contains only one of the two plots:
This is the first figure, which I get correctly:
import scipy.stats as s
import numpy as np
import os
import pandas as pd
import openpyxl as pyx
import matplotlib
matplotlib.rcParams["backend"] = "TkAgg"
#matplotlib.rcParams['backend'] = "Qt4Agg"
#matplotlib.rcParams['backend'] = "nbAgg"
import matplotlib.pyplot as plt
import math
data = [336256, 620316, 958846, 1007830, 1080401]
pdf = np.array([ 0.00449982, 0.0045293 , 0.00455894, 0.02397463,
                 0.02395788, 0.02394114])
fig, ax = plt.subplots();
fig = plt.figure(figsize=(40,30))
x = np.linspace(np.min(data), np.max(data), 100);
plt.plot(x, s.exponweib.pdf(x, *, 1, 1, loc=0, scale=2)))
plt.hist(data, bins = np.linspace(data[0], data[-1], 100), normed=True, alpha= 1)
text1= ' Weibull'
plt.savefig(text1+ '.png' )
datar =np.asarray(data)
mu, sigma = datar.mean() , datar.std() # mean and standard deviation
normal_std = np.sqrt(np.log(1 + (sigma/mu)**2))
normal_mean = np.log(mu) - normal_std**2 / 2
hs = np.random.lognormal(normal_mean, normal_std, 1000)
print(hs.max()) # some finite number
print(hs.mean()) # about 136519
print(hs.std()) # about 50405
count, bins, ignored = plt.hist(hs, 100, normed=True)
x = np.linspace(min(bins), max(bins), 10000)
# Vectorised evaluation of the same (unnormalised) lognormal curve over x
pdf = np.exp(-(np.log(x) - normal_mean)**2 / (2 * normal_std**2))
This is the second set :
fig, ax = plt.subplots();
fig = plt.figure(figsize=(40,40))
plt.plot(x, pdf, linewidth=2, color='r')
plt.hist(data, bins = np.linspace(data[0], data[-1], 100), normed=True, alpha= 1)
text= ' Lognormal '
plt.savefig(text+ '.png' )
The first figure is saved with both the histogram and the curve, whereas the second one is saved with only the curve.
Update 1: looking at This Question, I found that clearing the plot history keeps the figures from getting mixed up. Still, my second set of plots (the lognormal one) is not saved together: I only get the curve and not the histogram.
This happens because you have set normed=True, which normalises the histogram so that the area under it is 1. Your bins are very wide (the data span hundreds of thousands of units), so the bar heights are very small, in this case so small that they are not visible.
If you use
n, bins, _ = plt.hist(data, bins = np.linspace(data[0], data[-1], 100), normed=True, alpha= 1)
n will contain the y-value of your bins and you can confirm this yourself.
Also have a look at the documentation for plt.hist.
So if you set normed to False, the histogram will be visible.
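The bar heights can be checked without drawing anything. The sketch below uses the `data` list from the question and `np.histogram`, which applies the same normalisation as `plt.hist`. Note that `normed` was removed in Matplotlib 3.1 in favour of `density`, so `density=True` is used here.

```python
import numpy as np

data = [336256, 620316, 958846, 1007830, 1080401]
bins = np.linspace(data[0], data[-1], 100)

# density=True divides each count by (number of points * bin width)
heights, edges = np.histogram(data, bins=bins, density=True)
widths = np.diff(edges)

print(heights.max())             # tallest bar, on the order of 1e-5
print(np.sum(heights * widths))  # total area under the histogram
```

With bars this short, any curve that reaches values near 1 on the same axes will make the histogram look empty.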
Edit: number of bins
import numpy as np
import matplotlib.pyplot as plt
rand_data = np.random.uniform(0, 1.0, 100)
fig = plt.figure()
ax_1 = fig.add_subplot(211)
ax_1.hist(rand_data, bins=10)
ax_2 = fig.add_subplot(212)
ax_2.hist(rand_data, bins=100)
will give you two plots (the exact shapes vary, since the data are random), which show how the number of bins changes the histogram.
A histogram visualises the distribution of your data along one dimension, so not sure what you mean by number of inputs and bins.

A real time Spectrum analyser with pyaudio in python on Raspi

I am trying to get an fft plot on realtime audio using a USB microphone plugged into my raspi. I want to be able to activate an LED when a certain frequency is detected through the fft plot. I have so far tried to get just a live sound wave to be plotted but I am having trouble. I have followed this video:
I have tried changing the chunk size to both a greater and a lower value but have had no success. For some reason I get the -9981 error (input overflowed), but it takes a long time before the error is printed. No plot is displayed. I have even tried overclocking my Raspberry Pi, but it still doesn't work.
I was wondering if anyone else had tried something like this on their Pi and if it was possible or if I had to do it using a different package other than pyaudio.
Here is my python code:
import pyaudio
import struct
import numpy as np
import matplotlib.pyplot as plt
CHUNK = 100000
FORMAT = pyaudio.paInt16
RATE = 44100
p = pyaudio.PyAudio()
stream =
format = FORMAT,
channels = CHANNELS,
rate = RATE,
input = True,
output = True,
frames_per_buffer = CHUNK,
start = True
fig, ax = plt.subplots()
x = np.arange(0, 2 * CHUNK, 2)
line, = ax.plot(x, np.random.rand(CHUNK))
ax.set_ylim(0, 255)
ax.set_xlim(0, CHUNK)
while True:
data =
data_int = np.array(struct.unpack(str(CHUNK*2) + 'B', data), dtype='b')[::2] + 127
To display the plot, add:
ax.set_xlim(0, CHUNK)
But on a Raspberry Pi you have to configure your USB sound card as the default card.
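For the frequency-detection part of the question, here is a minimal sketch of finding the dominant frequency in one chunk with NumPy's FFT. It is not tied to pyaudio: a synthetic 440 Hz tone stands in for microphone data, and `dominant_frequency`, `TARGET_HZ` and `TOLERANCE_HZ` are hypothetical names. The `led_on` flag is where a GPIO call would go on the Pi.

```python
import numpy as np

RATE = 44100
CHUNK = 4096  # frequency resolution is RATE / CHUNK, about 10.8 Hz


def dominant_frequency(samples, rate):
    """Return the frequency (Hz) of the strongest spectral peak in one chunk."""
    windowed = samples * np.hanning(len(samples))  # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    spectrum[0] = 0  # ignore the DC component
    return freqs[np.argmax(spectrum)]


# Synthetic int16 tone standing in for one chunk read from the microphone
t = np.arange(CHUNK) / RATE
tone = (10000 * np.sin(2 * np.pi * 440.0 * t)).astype(np.int16)

peak = dominant_frequency(tone, RATE)

TARGET_HZ = 440.0
TOLERANCE_HZ = 15.0
led_on = abs(peak - TARGET_HZ) <= TOLERANCE_HZ  # switch the LED here
```

A smaller CHUNK lowers latency and the risk of input overflow but coarsens the frequency resolution, so the tolerance should be at least one bin wide.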

Legend not working for live data and while loop configuration

My code takes a continuously updating input from a Raspberry Pi and plots it on a graph. I'm trying to use the legend to display the current frequency (the most recent value in y_data), but I can't get it to display. Placing plt.legend() just before makes the legend appear but freezes the graph. Any help would be greatly appreciated.
import matplotlib
from matplotlib.figure import Figure
import matplotlib.pyplot as plt
import RPi.GPIO as GPIO
import time
import numpy as np
x_data = []
y_data = []
fig, ax = plt.subplots()
line, = plt.plot([],[], 'k-',label = 'data', drawstyle = 'steps')
avr, = plt.plot([],[], 'g--',label = 'mean')
plt.show(block = False)
def update(x_data, y_data, average):
data = round(y_data[-1], 1)
ax.legend((line, avr), (data, 'mean'))
while True: #Begin continuous loop
NUM_CYCLES = 10 #Loops to be averaged over
start = time.time()
for impulse_count in range(NUM_CYCLES):
duration = time.time() - start #seconds to run for loop
frequency = NUM_CYCLES / duration #Frequency in Hz
bpm = (frequency/1000)*60 #Frequency / no. of cogs per breath * min
x_data.append(time.time()) #add new data to data lists
average = sum(y_data)/float(len(y_data))
update(x_data,y_data, average) #call function to update graph contents
I think you should call fig.canvas.draw() at the end of the update function, not in the middle of it. I'm not sure why you add all the artists again in the update function, so you may leave that out. Concerning the legend, it's probably best to create it once at the beginning and, inside the update function, only update the relevant text.
Commenting out all the GPIO stuff, this is a version which works fine for me:
import matplotlib
from matplotlib.figure import Figure
import matplotlib.pyplot as plt
#import RPi.GPIO as GPIO
import time
import numpy as np
x_data = []
y_data = []
fig, ax = plt.subplots()
line, = plt.plot([],[], 'k-',label = 'data', drawstyle = 'steps')
avr, = plt.plot([],[], 'g--',label = 'mean')
# add legend already at the beginning
legend = ax.legend((line, avr), (0.0, 'mean'))
plt.show(block = False)
def update(x_data, y_data, average):
#fig.canvas.draw() <- use this at the end
#ax.draw_artist(ax.patch) # useless?
#ax.draw_artist(line) # useless?
#ax.draw_artist(avr) # useless?
data = round(y_data[-1], 1)
# only update legend here
#fig.canvas.update() # <- what is this one needed for?
while True: #Begin continuous loop
NUM_CYCLES = 10 #Loops to be averaged over
start = time.time()
#for impulse_count in range(NUM_CYCLES):
a = np.random.rand(700,800) # <- just something that takes a little time
duration = time.time() - start #seconds to run for loop
frequency = NUM_CYCLES / duration #Frequency in Hz
bpm = (frequency/1000)*60 #Frequency / no. of cogs per breath * min
x_data.append(time.time()) #add new data to data lists
average = sum(y_data)/float(len(y_data))
update(x_data,y_data, average) #call function to update graph contents
Add plt.draw() (or fig.canvas.draw_idle() for a more OO approach) at the end of update.
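The advice above (create the legend once, then only change its text and redraw at the end) can be sketched as follows. This is an illustration rather than the original answer's exact code: the `update` signature is simplified, the values fed in are made up, and the Agg backend is selected only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; use an interactive backend on the Pi
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
line, = ax.plot([], [], "k-", label="data", drawstyle="steps")
avr, = ax.plot([], [], "g--", label="mean")

# Create the legend once; only its first text entry changes later
legend = ax.legend((line, avr), ("0.0", "mean"), loc="upper left")

x_data, y_data = [], []


def update(x_new, y_new):
    """Append one sample, refresh both lines and the legend text, then redraw."""
    x_data.append(x_new)
    y_data.append(y_new)
    average = sum(y_data) / len(y_data)

    line.set_data(x_data, y_data)
    avr.set_data(x_data, [average] * len(x_data))
    legend.get_texts()[0].set_text(f"{y_new:.1f}")  # show the latest value

    ax.relim()            # recompute data limits from the updated lines
    ax.autoscale_view()
    fig.canvas.draw_idle()  # request a redraw at the end of the update


# Made-up readings standing in for the GPIO-derived frequency
for i, value in enumerate([12.0, 15.5, 14.3]):
    update(i, value)
```

In an interactive loop, a short `plt.pause(...)` after each update gives the GUI event loop time to process the redraw request.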