Python audio signal doesn't filter well - numpy

I need to take a .wav audio file that's noisy and filter out all that noise. I have to do it using the Fourier transform. After some days of researching and experimenting, I finally made a working function; the problem is that it doesn't work as I intend it to. Here is the function I made:
# Audio signal processing
from scipy.io.wavfile import read, write
import matplotlib.pyplot as plt
import numpy as np
from scipy.fft import fft, fftfreq, ifft

def AudioSignalProcessing(audio):
    # Import the .wav format audio into two variables:
    # sampling (int)
    # audio signal (numpy array)
    sampling, signal = read(audio)

    # time duration of the audio
    length = signal.shape[0] / sampling

    # x axis based on the time duration
    time = np.linspace(0., length, signal.shape[0])

    # show original signal
    plt.plot(time, signal)
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.title("Original signal")
    plt.show()

    # apply Fourier transform and normalize
    transform = abs(fft(signal))
    transform = transform/np.linalg.norm(transform)

    # obtain frequencies
    xf = fftfreq(transform.size, 1/sampling)

    # show transformed signal (frequency domain)
    plt.plot(xf, transform)
    plt.xlabel("Frequency (Hz)")
    plt.ylabel("Amplitude")
    plt.title("Frequency domain signal")
    plt.show()

    # filter the transformed signal to 40% of its maximum amplitude
    threshold = np.amax(transform)*0.4
    filtered = transform[np.where(transform > threshold)]
    xf_filtered = xf[np.where(transform > threshold)]

    # show filtered transformed signal
    plt.plot(xf_filtered, filtered)
    plt.xlabel("Frequency (Hz)")
    plt.ylabel("Amplitude")
    plt.title("Filtered frequency domain signal")
    plt.show()

    # transform the signal back to the time domain
    filtrada = ifft(signal)

    # show original signal filtered
    plt.plot(time, filtrada)
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.title("Filtered signal")
    plt.show()

    # convert audio signal to .wav format audio
    # write(audio.replace(".wav", " filtrado.wav"), sampling, filtrada.astype(signal.dtype))
    return None

AudioSignalProcessing("audio.wav")
Here are the output plots:
(Plots: original signal; transformed signal; filtered transformed signal; filtered audio signal.)
The filtered frequencies don't look as I think they should, and after converting the filtered signal back to audio it doesn't sound good at all. I've also tried with different audio files, but the same filter distortion happens.

I suggest asking at https://dsp.stackexchange.com/ for detailed signal processing questions.
It looks like you want to keep only those frequency components that are within at least 40% of the maximum component. If that is the case:
Keep the complex form of the DFT, or you won't be able to transform back; so remove the abs from the line transform = abs(fft(signal)).
Don't use np.where to "keep" the frequencies you want; instead, set the places where the transform magnitude is below your threshold to 0, something like
transform[abs(transform) < 0.4 * max(abs(transform))] = 0
Finally, apply the inverse DFT to this altered transform; you've applied it to signal instead (see the line filtrada = ifft(signal)). (You will probably get a warning when plotting filtrada about discarding imaginary values.)
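Putting those three fixes together, a minimal sketch of the filtering step (assuming signal and sampling come from read as in your function):

import numpy as np
from scipy.fft import fft, ifft

transform = fft(signal)                       # keep the complex values
threshold = 0.4 * np.max(np.abs(transform))   # 40% of the peak magnitude
transform[np.abs(transform) < threshold] = 0  # zero out bins instead of dropping them
filtered = ifft(transform).real               # back to the time domain, drop residual imaginary part
# write(audio.replace(".wav", " filtrado.wav"), sampling, filtered.astype(signal.dtype))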

Related

Analysis frame centering for scipy.signal.stft (and matplotlib plotting discrepancy)

I'm doing some experiments with Scipy's STFT, and would like to confirm that I'm understanding things correctly.
The following code generates the image I would expect, but labeled with the wrong time values:
from math import ceil, log
from scipy.io.wavfile import read
from scipy.signal import stft
import numpy as np
import matplotlib.pyplot as plt
# read a 2s, 440 Hz test tone, padded with 0.5s of silence on either end
fs, x = read('a440_2s_padded.wav')
nperseg = 44100
# pick an FFT size that's the smallest power of 2 >= the window size
nfft = pow(2, ceil(log(nperseg, 2)))
# N.B. no overlap between windows
f, t, Zxx = stft(x, fs, 'blackman', nperseg=nperseg, noverlap=0, nfft=nfft, boundary='zeros')
# crop the display to relevant bins
minBin, maxBin = 600, 700
# plot it
plt.pcolormesh(t, f[minBin:maxBin], np.abs(Zxx[minBin:maxBin]), vmin=None, vmax=None)
plt.title('STFT Magnitude')
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()
(Image: matplotlib STFT output.)
As noted in the code, I'm analyzing a 2s, 440 Hz test tone, padded with 0.5s of silence on either end, but in the image, the signal starts at 1s and lasts until 3s. For small nperseg values, this discrepancy doesn't make much difference, but for large values and musical data, the difference can be substantial, as it determines whether the STFT is centering its frames within beats (the desired behavior), or on beats (undesired, because then it's smearing data from two consecutive beats).
Am I misunderstanding something about the STFT analysis settings? Thanks for any insight.

Scipy Butter bandpass is not producing the desired results

So I'm trying to bandpass filter a 24-bit PCM, 44.1 kHz .wav file. What I would like to do is bandpass each frequency from 0 Hz to 22 kHz.
So far I have loaded the data and can display it with matplotlib, and it looks like the following.
But when I go to apply the bandpass filter which I got from here
http://scipy-cookbook.readthedocs.io/items/ButterworthBandpass.html
I get the following result:
So I'm trying to bandpass at 100-101Hz as a test, here is my code:
from WaveData import WaveData
import matplotlib.pyplot as plt
from scipy.signal import butter, lfilter, freqz
from scipy.io.wavfile import read
import numpy as np

class Filter:
    def __init__(self, wav):
        self.waveData = WaveData(wav)

    def butter_bandpass(self, lowcut, highcut, fs, order=5):
        nyq = 0.5 * fs
        low = lowcut / nyq
        high = highcut / nyq
        b, a = butter(order, [low, high], btype='band')
        return b, a

    def butter_bandpass_filter(self, data, lowcut, highcut, fs, order):
        b, a = self.butter_bandpass(lowcut, highcut, fs, order=order)
        y = lfilter(b, a, data)
        return y

    def getFilteredSignal(self, freq):
        return self.butter_bandpass_filter(data=self.waveData.file['Data'], lowcut=100, highcut=101, fs=44100, order=3)

    def getUnprocessedData(self):
        return self.waveData.file['Data']

    def plot(self, signalA, signalB=None):
        plt.plot(signalA)
        if signalB is not None:
            plt.plot(signalB)
        plt.show()

if __name__ == "__main__":
    # file = WaveData("kick.wav")
    # fileA = read("kick0.wav")
    f = Filter("kick.wav")
    a, b = f.butter_bandpass(lowcut=100, highcut=101, fs=44100)
    w, h = freqz(b, a, worN=22000)  # Filtered signal is not working?
    f.plot(h, w)
    print("break")
I don't understand where I have gone wrong.
Thanks
What @WoodyDev said is true: 1 Hz out of 44.1 kHz is way, way too tiny a bandpass for any kind of filter. Just look at the filter coefficients butter returns:
In [3]: butter(5, [100/(44.1e3/2), 101/(44.1e3/2)], btype='band')
Out[3]:
(array([ 1.83424060e-21,  0.00000000e+00, -9.17120299e-21,  0.00000000e+00,
         1.83424060e-20,  0.00000000e+00, -1.83424060e-20,  0.00000000e+00,
         9.17120299e-21,  0.00000000e+00, -1.83424060e-21]),
 array([   1.        ,   -9.99851389,   44.98765092, -119.95470631,
         209.90388506, -251.87018009,  209.88453023, -119.93258575,
          44.9752074 ,   -9.99482662,    0.99953904]))
Look at the b coefficients (the first array): their values are around 1e-20, meaning the filter design totally failed to converge, and if you apply it to any signal, the output will be zero, which is what you found.
You didn't mention your application, but if you really, really want to keep the signal's frequency content between 100 and 101 Hz, you could take a zero-padded FFT of the signal, zero out the portions of the spectrum outside that band, and IFFT (look at rfft, irfft, and rfftfreq in the numpy.fft module).
Here's a function that applies a brick-wall bandpass filter in the Fourier domain using FFTs:
import numpy.fft as fft
import numpy as np


def fftBandpass(x, low, high, fs=1.0):
    """
    Apply a bandpass filter via FFTs.

    Parameters
    ----------
    x : array_like
        Input signal vector. Assumed to be real-only.
    low : float
        Lower bound of the passband in Hertz. (If less than or equal
        to zero, a high-pass filter is applied.)
    high : float
        Upper bound of the passband, Hertz.
    fs : float
        Sample rate in units of samples per second. If `high > fs / 2`,
        the output is low-pass filtered.

    Returns
    -------
    y : ndarray
        Output signal vector with all frequencies outside the `[low, high]`
        passband zeroed.

    Caveat
    ------
    Note that the energy in `y` will be lower than the energy in `x`, i.e.,
    `sum(abs(y)) < sum(abs(x))`.
    """
    xf = fft.rfft(x)
    f = fft.rfftfreq(len(x), d=1 / fs)
    xf[f < low] = 0
    xf[f > high] = 0
    return fft.irfft(xf, len(x))


if __name__ == '__main__':
    fs = 44.1e3
    N = int(fs)
    x = np.random.randn(N)
    t = np.arange(N) / fs

    import pylab as plt
    plt.figure()
    plt.plot(t, x, t, 100 * fftBandpass(x, 100, 101, fs=fs))
    plt.xlabel('time (seconds)')
    plt.ylabel('signal')
    plt.legend(['original', 'scaled bandpassed'])
    plt.show()
You can put this in a file, fftBandpass.py, and just run it with python fftBandpass.py to see it create the following plot:
Note I had to scale the 1 Hz bandpassed signal by 100 because, after bandpassing that much, there's very little energy left in the signal. Also note that the signal living inside this narrow a passband is pretty much just a sinusoid at around 100 Hz.
If you put the following in your own code: from fftBandpass import fftBandpass, you can use the fftBandpass function.
Another thing you could try is to decimate the signal 100x, converting it to a signal sampled at 441 Hz. 1 Hz out of 441 Hz is still a crazy-narrow passband, but you might have better luck than trying to bandpass the original signal. See scipy.signal.decimate, but don't try to call it with q=100; instead decimate the signal in stages, by 2, then 2, then 5, then 5 (for a total decimation of 100x), as sketched below.
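A sketch of that staged decimation (x is a placeholder for the 44.1 kHz samples):

from scipy.signal import decimate

y = x.astype(float)
# decimate 44.1 kHz -> 441 Hz in stages (2 * 2 * 5 * 5 = 100)
# rather than a single call with q=100
for q in (2, 2, 5, 5):
    y = decimate(y, q)
# y is now sampled at 441 Hz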
There are some problems with your code which mean you aren't plotting the results correctly, although I believe this isn't your main problem.
Check your code
In the example you linked, they show precisely the process for calculating and plotting the filter at different orders:

for order in [3, 6, 9]:
    b, a = butter_bandpass(lowcut, highcut, fs, order=order)
    w, h = freqz(b, a, worN=2000)
    plt.plot((fs * 0.5 / np.pi) * w, abs(h), label="order = %d" % order)
You are currently not scaling your frequency axis correctly, nor taking the absolute value to get the magnitude information from h, as the correct code above does.
Check your theory
However, your main issue is your extremely steep bandpass (only 100 Hz to 101 Hz). It is very rare to see a filter this sharp, as it is very processing intensive (it requires a lot of filter coefficients), and because you are only keeping a 1 Hz range, it will remove practically all other frequency content.
So the graph you have shown with the gain at 0 may very well be correct. If you use their example and change the bandpass cutoff frequencies to 100 Hz -> 101 Hz, the output is an array of (almost if not completely) zeros. This is because the filter only passes the energy of the signal in a 1 Hz range, which is very small if you think about it.
If you are doing this for analysis, the frequency spacing tends to be much larger, i.e. octave bands (or smaller divisions of octave bands).
The Spectrogram
As I am not sure of your end purpose, I cannot clarify exactly which route you should take to get there. However, using bandpass filters on every single frequency up to 20 kHz seems kind of silly in this day and age.
If I remember correctly, some of the first spectrogram attempts, with needles on paper, used this technique with analog bandpass filter banks to analyze the frequency content. So this makes me think you may be looking for something to do with a spectrogram? It lets you analyze the whole signal's frequency information vs. time and still retains all of the signal's amplitude information. Python already has spectrogram functionality included as part of scipy and matplotlib (see the sketch below).
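A minimal sketch with scipy.signal.spectrogram (x and fs are placeholder names for your signal and sample rate):

import matplotlib.pyplot as plt
from scipy.signal import spectrogram

# frequency content vs. time, with amplitude as the colour scale
f, t, Sxx = spectrogram(x, fs=fs)
plt.pcolormesh(t, f, Sxx)
plt.xlabel('Time [s]')
plt.ylabel('Frequency [Hz]')
plt.show()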

Tensorboard histograms to matplotlib

I would like to "dump" the tensorboard histograms and plot them via matplotlib, to get plots better suited to a scientific paper.
I managed to hack my way through the Summary file using tf.train.summary_iterator and extract the histogram I wanted (a tensorflow.core.framework.summary_pb2.HistogramProto object).
By doing that and reimplementing what the JavaScript code does with the data (https://github.com/tensorflow/tensorboard/blob/c2fe054231fe77f3a5b05dbc519f713d2e738d1c/tensorboard/plugins/histogram/tf_histogram_dashboard/histogramCore.ts#L104), I managed to get something similar (same trends) to the tensorboard plots, but not the exact same plot.
Can someone shed some light on this?
Thanks
In order to plot a tensorboard histogram with matplotlib I am doing the following (the helper function name is mine; path is the event-file path and STEP_COUNT the number of steps to load):

import numpy as np
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

def load_histograms(path, STEP_COUNT):
    event_acc = EventAccumulator(path, size_guidance={
        'histograms': STEP_COUNT,
    })
    event_acc.Reload()

    tags = event_acc.Tags()
    result = {}
    for hist in tags['histograms']:
        histograms = event_acc.Histograms(hist)
        result[hist] = np.array([
            np.repeat(np.array(h.histogram_value.bucket_limit),
                      np.array(h.histogram_value.bucket).astype(int))
            for h in histograms
        ])
    return result
h.histogram_value.bucket_limit gives me the value and h.histogram_value.bucket the count of this value. So when I repeat the values accordingly (np.repeat(...)), I get a huge array of the expected size. This array can now be plotted with the default matplotlib logic.
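A possible usage sketch (the tag name 'my_tag' and the bin count are placeholders):

import matplotlib.pyplot as plt

result = load_histograms(path, STEP_COUNT)
samples = result['my_tag'][0]  # reconstructed samples at the first step
plt.hist(samples, bins=50)     # default matplotlib histogram logic
plt.show()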
The best solution is loading all events and reconstructing all the histograms (as in @khuesmann's answer), but using EventFileLoader instead of EventAccumulator. This will give you a histogram per wall time and step, like the ones Tensorboard plots. It can be extended to return a list of actions by timestep and wall time.
Don't forget to check which tag you will use.
from tensorboard.backend.event_processing.event_file_loader import EventFileLoader

# Just in case: PATH_OF_FILE is the path of the file, not the folder
loader = EventFileLoader(PATH_OF_FILE)

# Where to store values
wtimes, steps, actions = [], [], []
for event in loader.Load():
    wtime = event.wall_time
    step = event.step
    if len(event.summary.value) > 0:
        summary = event.summary.value[0]
        if summary.tag == HISTOGRAM_TAG:
            wtimes += [wtime] * int(summary.histo.num)
            steps += [step] * int(summary.histo.num)
            for num, val in zip(summary.histo.bucket, summary.histo.bucket_limit):
                actions += [val] * int(num)
Bear in mind that tensorflow approximates the actions and treats them as continuous variables, so even if you have discrete actions (e.g. 0, 1, 3) you will end up with values like 0.2, 0.4, 0.9, 1.4, ...; in that case, rounding the values will do it.
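For instance, a one-line rounding sketch (assuming the actions list built above):

import numpy as np

# recover discrete actions from the approximated bucket limits
actions = np.round(actions).astype(int).tolist()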
A good solution is the one from @khuesmann, but it only retrieves the accumulated histogram, not the histogram per step, which is the one actually shown in tensorboard.
If you want the distribution per step: from what I have understood so far, Tensorboard usually compresses the histograms to decrease the memory used to store the data (imagine storing a 2D histogram over 4 million steps; the memory can increase quickly). These compressed histograms are accessible like this:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

n2n = EventAccumulator(PATH)
n2n.Reload()

# Check the tags under histograms and choose the one you want
n2n.Tags()

# This will give you the list tensorboard uses of the
# compressed histograms, by timestep and wall time
n2n.CompressedHistograms(HISTOGRAM_TAG)
The only problem is that it compresses the histogram down to a fixed set of percentiles (in basis points: 0, 668, 1587, 3085, 5000, 6915, 8413, 9332, 10000), which correspond to (-Inf, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, Inf) in standard deviations. Check the code here.
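A sketch of walking those entries (field names assumed from the tensorboard source linked above; verify against your version):

for event in n2n.CompressedHistograms(HISTOGRAM_TAG):
    print(event.step, event.wall_time)
    for v in event.compressed_histogram_values:
        print(v.basis_point, v.value)  # percentile (in basis points) and its value
    break  # just the first timestep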
I haven't read much, but it wouldn't be hard to reconstruct the temporal histograms that tensorboard shows. If I find a way to do it, I will post it here.
The simplest way is to parse the events with tbparse and plot the histograms with seaborn kde_ridgeplot.
This tutorial generates the stacked distribution plot with around 30 lines of Python code:
(Images: Tensorboard preview; the same data parsed by tbparse and plotted by seaborn.)
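If you just need the raw values, a minimal parsing sketch (log_dir is a placeholder for your event-file directory; see the tbparse docs for the exact API):

from tbparse import SummaryReader

reader = SummaryReader(log_dir)
df = reader.histograms  # pandas DataFrame of the parsed histogram events
print(df.head())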
You can open an issue if you encounter any problems during parsing. (I'm the author of tbparse.)

How to change pyplot.specgram x and y axis scaling?

I have never worked with audio signals before and know little about signal processing. Nevertheless, I need to represent an audio signal using the pyplot.specgram function from the matplotlib library. Here is how I do it.
import matplotlib.pyplot as plt
import scipy.io.wavfile as wavfile
rate, frames = wavfile.read("song.wav")
plt.specgram(frames)
The result I am getting is this nice spectrogram below:
When I look at the x-axis and y-axis, which I suppose are the time and frequency axes, I can't get my head around the fact that frequency is scaled from 0 to 1.0 and time goes from 0 to 80k.
What is the intuition behind it and, more importantly, how do I represent it in a human-friendly format, with frequency in Hz and time in seconds?
As others have pointed out, you need to specify the sample rate, else you get a normalised frequency (between 0 and 1) and sample index (0 to 80k). Fortunately this is as simple as:
plt.specgram(frames, Fs=rate)
To expand on Nukolas' answer, and combining my answers to "Changing plot scale by a factor in matplotlib" and "matplotlib intelligent axis labels for timedelta", we can get not only kHz on the frequency axis, but also minutes and seconds on the time axis.
import matplotlib.pyplot as plt
import scipy.io.wavfile as wavfile

cmap = plt.get_cmap('viridis')  # this may fail on older versions of matplotlib
vmin = -40  # hide anything below -40 dB
cmap.set_under(color='k', alpha=None)

rate, frames = wavfile.read("song.wav")
fig, ax = plt.subplots()
pxx, freq, t, cax = ax.specgram(frames[:, 0],  # first channel
                                Fs=rate,       # to get frequency axis in Hz
                                cmap=cmap, vmin=vmin)
cbar = fig.colorbar(cax)
cbar.set_label('Intensity dB')
ax.axis("tight")

# Prettify
import matplotlib
import datetime

ax.set_xlabel('time h:mm:ss')
ax.set_ylabel('frequency kHz')

scale = 1e3  # kHz
ticks = matplotlib.ticker.FuncFormatter(lambda x, pos: '{0:g}'.format(x / scale))
ax.yaxis.set_major_formatter(ticks)

def timeTicks(x, pos):
    d = datetime.timedelta(seconds=x)
    return str(d)

formatter = matplotlib.ticker.FuncFormatter(timeTicks)
ax.xaxis.set_major_formatter(formatter)
plt.show()
Result:
Firstly, a spectrogram is a representation of the spectral content of a signal as a function of time: a frequency-domain representation of the time-domain waveform (e.g. a sine wave, your file "song.wav", or some other arbitrary wave, that is, amplitude as a function of time).
The frequency values (y-axis, Hertz) are wholly dependent on the sampling frequency of your waveform ("song.wav") and will range from 0 to "sampling frequency / 2", the upper limit being the Nyquist frequency, or "folding frequency" (https://en.wikipedia.org/wiki/Aliasing#Folding). If not otherwise specified, the matplotlib specgram function will automatically determine the sampling frequency of the input waveform, defined as 1 / dt, with dt being the time interval between discrete samples of the waveform. You can pass the option Fs=<sampling rate> to the specgram function to define it manually. It will be easier to get your head around what is going on if you figure out and pass these variables to the specgram function yourself.
The time values (x-axis, seconds) are purely dependent on the length of your "song.wav". You may notice some whitespace or padding if you use a large window length to calculate each spectral slice (think of the individual spectra, which are arranged vertically and tiled horizontally to create the spectrogram image).
To make the axes more intuitive in the plot, use x- and y-axis labels, and you can also scale the axes values (i.e. change the units) using a method similar to this.
Take-home message: try to be a bit more verbose with your code; see below for my example.
import matplotlib.pyplot as plt
import numpy as np

# generate a 5 Hz sine wave
fs = 50
t = np.arange(0, 5, 1.0/fs)
f0 = 5
phi = np.pi/2
A = 1
x = A * np.sin(2 * np.pi * f0 * t + phi)

nfft = 25

# plot x-t, time domain, i.e. source waveform
plt.subplot(211)
plt.plot(t, x)
plt.xlabel('time')
plt.ylabel('amplitude')

# plot power(f)-t, frequency domain, i.e. spectrogram
plt.subplot(212)
# call specgram function, setting Fs (sampling frequency)
# and nfft (number of waveform samples, defining a time window
# for which to compute the spectra)
plt.specgram(x, Fs=fs, NFFT=nfft, noverlap=5, detrend='mean', mode='psd')
plt.xlabel('time')
plt.ylabel('frequency')
plt.show()
(Image: spectrogram of the 5 Hz sine wave.)

Numpy: find mean coordinate of points along line

I have a bunch of points in 2D space which all reside on a line (polygon). How can I compute the mean coordinate of these points on the line?
I don't mean the centroid of the points in 2D space (as @rth initially proposed in his answer), but the mean location of the points along the line on which they reside. So basically, I could transform the line to a 1D axis, compute the mean location in 1D, and transform the location of the mean back into 2D space.
Maybe these are exactly the necessary steps, but I think (or hope) that there is a function in numpy/scipy which allows me to do this in one step.
Edit: The approach you describe in the question is indeed probably the simplest way to solve this problem.
Here is an implementation that calculates the positions of vertices along the line in 1D, takes their mean, and finally calculates the corresponding 2D position with parametric interpolation,
import numpy as np
from scipy.interpolate import splprep, splev
vert = np.random.randn(1000, 2) # vertices definition here
# calculate the Euclidean distances between consecutive vertices
# equivalent to a for loop with
# dl[i] = ((vert[i+1, 0] - vert[i, 0])**2 + (vert[i+1,1] - vert[i,1])**2)**0.5
dl = (np.diff(vert, axis=0)**2).sum(axis=1)**0.5
# pad with 0, so dl.shape[0] == vert.shape[0] for convenience
dl = np.insert(dl, 0, 0.0)
l = np.cumsum(dl) # 1D coordinates along the line
l_mean = np.mean(l) # mean in the line coordinates
# calculate the coordinate of l_mean in 2D space
# with parametric B-spline interpolation
tck, _ = splprep(x=vert.T, u=l, k=3)
res = splev(l_mean, tck)
print(res)
Edit 2: Assuming now that you have a high-resolution set of points for your path, vert_full, and some approximate measurements vert_1, vert_2, etc., what you could do is the following.
Project each point of vert_1, etc. onto the exact path. Assuming that vert_full has many more datapoints than vert_1, we can simply look for the nearest neighbours of vert_1 in vert_full:
from scipy.spatial import cKDTree

tr = cKDTree(vert_full)
d, idx = tr.query(vert_1, k=1)

# this gives the projected coordinates onto vert_full
# (I have not actually run this, so it might require minor changes)
vert_1_proj = vert_full[idx]
Then use the above mean calculation with the new vert_1_proj vector; a sketch follows.
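For concreteness, a sketch of that combination (reusing the cumulative-length parametrization from the first snippet, now built on vert_full and the idx from the k-d tree query):

import numpy as np

# 1D coordinates along the high-resolution path
dl_full = (np.diff(vert_full, axis=0)**2).sum(axis=1)**0.5
l_full = np.insert(np.cumsum(dl_full), 0, 0.0)

# positions of the projected measurement points along the path
l_proj = l_full[idx]
l_mean = np.mean(l_proj)  # mean location in 1D line coordinates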
Meanwhile I've found the answer to my question, although it uses Shapely instead of Numpy.
import numpy as np
from shapely.geometry import LineString, Point

# lists of points as (x, y) tuples
path_xy = [...]
points_xy = [...]  # should be on or near path

path = LineString(path_xy)             # create path object
pts = [Point(p) for p in points_xy]    # create point objects
dist = [path.project(p) for p in pts]  # distances along path
mean_dist = np.mean(dist)              # mean distance along path
mean = path.interpolate(mean_dist)     # mean point
mean_xy = (mean.x, mean.y)
This works perfectly!
(That is also why I have to accept it as the answer, though I highly appreciate @rth's help!)