Low-pass Chebyshev type-I filter with Scipy - numpy

I am reading a paper and trying to reproduce its results. The authors apply a low-pass Chebyshev type-I filter to the raw data and give the following parameters:
Sampling frequency = 32 Hz, Fcut = 0.25 Hz, Apass = 0.001 dB, Astop = -100 dB, Fstop = 2 Hz, order of the filter = 5. I found some material that helps me understand these parameters.
But when I look at scipy.signal.cheby1, the parameters required by this function are different:
cheby1(N, rp, Wn, btype='low', analog=False, output='ba')
Here N is the order of the filter; btype is the type of filter, in my case 'lowpass'; analog=False because the data is sampled, so the filter is digital; and output specifies the type of output. But I am not sure about rp and Wn.
In the documentation, it says:
rp : float
The maximum ripple allowed below unity gain in the passband. Specified in decibels, as a positive number.
Wn : array_like
A scalar or length-2 sequence giving the critical frequencies. For Type I filters, this is the point in the transition band at which the gain first drops below -rp. For digital filters, Wn is normalized from 0 to 1, where 1 is the Nyquist frequency, pi radians/sample. (Wn is thus in half-cycles / sample.) For analog filters, Wn is an angular frequency (e.g. rad/s).
According to this question:
How To apply a filter to a signal in python
I know how to apply the filter. But I don't know how to create a filter with the parameters given above, i.e. how to convert those parameters into the arguments that the SciPy function expects.

Take a look at the Wikipedia page on the Type I Chebyshev filter. Note that your plot illustrates the characteristics of a general filter. A lowpass Type I Chebyshev filter, however, has no ripple in the stop band.
You have three available parameters for the design of a Type I Chebyshev filter: the filter order, the ripple factor, and the cutoff frequency. These are the first three parameters of scipy.signal.cheby1:
The first argument of cheby1 is the order of the filter.
The second argument, rp, corresponds to δ on the Wikipedia page, and is apparently what you called Apass.
The third argument is Wn, the cutoff frequency expressed as a fraction of the Nyquist frequency. In your case, you could write something like
fs = 32 # Sample rate (Hz)
fcut = 0.25 # Desired filter cutoff frequency (Hz)
# Cutoff frequency relative to the Nyquist
wn = fcut / (0.5*fs)
Once those three parameters are chosen, all the other characteristics (e.g. transition band width, Astop, Fstop, etc.) are determined. So it appears that the specification you give, "Sampling frequency = 32 Hz, Fcut = 0.25 Hz, Apass = 0.001 dB, Astop = -100 dB, Fstop = 2 Hz, order of the filter = 5", is not compatible with a Type I Chebyshev filter. In particular, I get a gain of approximately -78 dB at 2 Hz.
(If you increase the order to 6, then the gain at 2 Hz is approximately -103 dB.)
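As a quick check (this snippet is my own addition, not part of the original answer), you can evaluate the frequency response at Fstop = 2 Hz for both orders:
import numpy as np
from scipy.signal import cheby1, freqz

fs = 32.0
fcut = 0.25
Apass = 0.001  # dB
for order in (5, 6):
    b, a = cheby1(order, Apass, fcut/(0.5*fs))
    # Evaluate the response at exactly 2 Hz (in rad/sample: 2*pi*f/fs)
    w, h = freqz(b, a, worN=[2*np.pi*2.0/fs])
    print("order %d: %.1f dB at 2 Hz" % (order, 20*np.log10(abs(h[0]))))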
Here's a complete script, followed by the plot that it generates. The plot shows just the pass band, but you can change the arguments of the xlim and ylim functions to see more.
import numpy as np
from scipy.signal import cheby1, freqz
import matplotlib.pyplot as plt
# Sampling parameters
fs = 32 # Hz
# Desired filter parameters
order = 5
Apass = 0.001 # dB
fcut = 0.25 # Hz
# Normalized frequency argument for cheby1
wn = fcut / (0.5*fs)
b, a = cheby1(order, Apass, wn)
w, h = freqz(b, a, worN=8000)
plt.figure(1)
plt.plot(0.5*fs*w/np.pi, 20*np.log10(np.abs(h)))
plt.axvline(fcut, color='r', alpha=0.2)
plt.plot([0, fcut], [-Apass, -Apass], color='r', alpha=0.2)
plt.xlim(0, 0.3)
plt.xlabel('Frequency (Hz)')
plt.ylim(-5*Apass, Apass)
plt.ylabel('Gain (dB)')
plt.grid()
plt.title("Chebyshev Type I Lowpass Filter")
plt.tight_layout()
plt.show()

Related

Parameters for numpy.random.lognormal function

I need to create a fictitious log-normal distribution of household income in a particular area. The data I have are: Average: 13,600 and Standard Deviation 7,900.
What should be the parameters in the function numpy.random.lognormal?
When I pass the mean and the standard deviation as they are, most of the values in the distribution are "inf", and the values also don't make sense when I set the parameters to the log of the mean and the standard deviation.
If someone can help me figure out what the parameters should be, that would be great.
Thanks!
This is indeed a nontrivial task, as the moments of the log-normal distribution have to be solved for the unknown parameters. Looking at, say, Wikipedia, you will find the mean and variance of the log-normal distribution to be exp(mu + sigma**2/2) and [exp(sigma**2)-1]*exp(2*mu+sigma**2), respectively.
The choice of mu and sigma should therefore solve exp(mu + sigma**2/2) = 13600 and [exp(sigma**2)-1]*exp(2*mu+sigma**2) = 7900**2. This can be solved analytically, because the first equation squared gives exactly exp(2*mu+sigma**2), thus eliminating the variable mu from the second equation.
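Spelling out the algebra (my own step, not in the original answer): dividing the second equation by the square of the first leaves exp(sigma**2) - 1 = (7900/13600)**2, so with mean m and standard deviation s,
$ \sigma^2 = \ln\left(1 + \frac{s^2}{m^2}\right), \qquad \mu = \ln(m) - \frac{\sigma^2}{2} $
which is exactly what the code below computes.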
A sample code is provided below. I took a large sample size to explicitly show that the mean and standard deviation of the simulated data are close to the desired numbers.
import numpy as np
# Input characteristics
DataAverage = 13600
DataStdDev = 7900
# Sample size
SampleSize = 100000
# Parameters mu and sigma of the underlying normal distribution
SigmaLogNormal = np.sqrt( np.log(1+(DataStdDev/DataAverage)**2))
MeanLogNormal = np.log( DataAverage ) - SigmaLogNormal**2/2
print(MeanLogNormal, SigmaLogNormal)
# Obtain draw from log-normal distribution
Draw = np.random.lognormal(mean=MeanLogNormal, sigma=SigmaLogNormal, size=SampleSize)
# Check
print( np.mean(Draw), np.std(Draw))

Creating a matrix with certain conditions

I am trying to create a matrix using PyTorch of size 32x10x1.
The conditions that I need to fulfill are that
torch.mean(a, dim=0) # size is 10x1 and should be almost 0
torch.mean(a, dim=1) # size is 32x1 and should be almost 0
This is a noise matrix for GANs and I am trying to sample it from a normal distribution. I tried using torch.MultiVariateNormal() but it didn't give me a matrix of that shape.
Is there any other function, or something in numpy or scikit, to get this kind of matrix?
Use numpy.random.normal
import numpy.random as npr
mean = 0
std_dev = 0.1
size = (32, 10, 1)
mat = npr.normal(loc=mean, scale=std_dev, size=size)
and set the mean and standard deviation as desired to keep the values close to 0.
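As a quick sanity check (my addition, not part of the original answer), you can verify that the means over both dimensions come out close to 0:
import numpy.random as npr

mat = npr.normal(loc=0, scale=0.1, size=(32, 10, 1))
# Mean over axis 0 has shape (10, 1); mean over axis 1 has shape (32, 1).
# Both should contain values near 0 for a zero-mean normal sample.
print(abs(mat.mean(axis=0)).max())
print(abs(mat.mean(axis=1)).max())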
The linked figure (normal distribution density curves for various means and standard deviations) illustrates the effect of changing these two parameters. Image by Inductiveload, Public Domain, https://commons.wikimedia.org/w/index.php?curid=3817954

Bad result plotting windowing FFT

I'm playing with Python and SciPy to understand windowing. I made a plot to see how windowing behaves under the FFT, but the result is not what I was expecting.
The plot is:
The middle plots are the plain FFT plots; this is where I get weird results.
Then I changed the trig function to introduce leakage, setting the first 300 items of the array straight to 1. The result:
The code:
import numpy as np
import matplotlib.pyplot as plt
from numpy import pi, sqrt

sign_freq=80
sample_freq=3000
num=np.linspace(0,1,num=sample_freq)
i=0
#wave data:
sin=np.sin(2*pi*num*sign_freq)+np.sin(2*pi*num*sign_freq*2)
while i<1000:
    sin[i]=1
    i=i+1
#wave fft:
fft_sin=np.fft.fft(sin)
fft_freq_axis=np.fft.fftfreq(len(num),d=1/sample_freq)
#wave Linear Spectrum (Rms)
lin_spec=sqrt(2)*np.abs(np.fft.rfft(sin))/len(num)
lin_spec_freq_axis=np.fft.rfftfreq(len(num),d=1/sample_freq)
#window data:
hann=np.hanning(len(num))
#window fft:
fft_hann=np.fft.fft(hann)
#window fft Linear Spectrum:
wlin_spec=sqrt(2)*np.abs(np.fft.rfft(hann))/len(num)
#window + sin
wsin=hann*sin
#window + sin fft:
wsin_spec=sqrt(2)*np.abs(np.fft.rfft(wsin))/len(num)
wsin_spec_freq_axis=np.fft.rfftfreq(len(num),d=1/sample_freq)
fig=plt.figure()
ax1 = fig.add_subplot(431)
ax2 = fig.add_subplot(432)
ax3 = fig.add_subplot(433)
ax4 = fig.add_subplot(434)
ax5 = fig.add_subplot(435)
ax6 = fig.add_subplot(436)
ax7 = fig.add_subplot(413)
ax8 = fig.add_subplot(414)
ax1.plot(num,sin,'r')
ax2.plot(fft_freq_axis,abs(fft_sin),'r')
ax3.plot(lin_spec_freq_axis,lin_spec,'r')
ax4.plot(num,hann,'b')
ax5.plot(fft_freq_axis,fft_hann)
ax6.plot(lin_spec_freq_axis,wlin_spec)
ax7.plot(num,wsin,'c')
ax8.plot(wsin_spec_freq_axis,wsin_spec)
plt.show()
EDIT: as asked in the comments, I plotted the functions in dB scale and obtained much clearer plots. Thanks a lot @SleuthEye!
It appears the plot which is problematic is the one generated by:
ax5.plot(fft_freq_axis,fft_hann)
resulting in the graph:
instead of the expected graph from Wikipedia.
There are a number of issues with the way the plot is constructed. The first is that this command essentially attempts to plot a complex-valued array (fft_hann). You may in fact be getting the warning ComplexWarning: Casting complex values to real discards the imaginary part as a result. To generate a graph which looks like the one from Wikipedia, you would have to take the magnitude (instead of the real part) with:
ax5.plot(fft_freq_axis,abs(fft_hann))
Then we notice that there is still a line striking through our plot. Looking at np.fft.fft's documentation:
The values in the result follow so-called “standard” order: If A = fft(a, n), then A[0] contains the zero-frequency term (the sum of the signal), which is always purely real for real inputs. Then A[1:n/2] contains the positive-frequency terms, and A[n/2+1:] contains the negative-frequency terms, in order of decreasingly negative frequency.
[...]
The routine np.fft.fftfreq(n) returns an array giving the frequencies of corresponding elements in the output.
Indeed, if we print the fft_freq_axis we can see that the result is:
[ 0. 1. 2. ..., -3. -2. -1.]
To get around this problem we simply need to swap the lower and upper parts of the arrays with np.fft.fftshift:
ax5.plot(np.fft.fftshift(fft_freq_axis),np.fft.fftshift(abs(fft_hann)))
Then you should note that the graph on Wikipedia is actually shown with amplitudes in decibels. You would then need to do the same with:
ax5.plot(np.fft.fftshift(fft_freq_axis),np.fft.fftshift(20*np.log10(abs(fft_hann))))
We should then be getting closer, but the result is not quite the same as can be seen from the following figure:
This is due to the fact that the plot on Wikipedia has a much higher frequency resolution and captures the value of the frequency spectrum as it oscillates, whereas your plot samples the spectrum at fewer points, and many of those points have near-zero amplitude. To resolve this, we need to evaluate the frequency spectrum of the window at more frequency points.
This can be done by zero-padding the input to the FFT, or more simply by setting the parameter n (the desired length of the output) to a value much larger than the input size:
N = 8*len(num)
fft_freq_axis=np.fft.fftfreq(N,d=1/sample_freq)
fft_hann=np.fft.fft(hann, N)
ax5.plot(np.fft.fftshift(fft_freq_axis),np.fft.fftshift(20*np.log10(abs(fft_hann))))
ax5.set_xlim([-40, 40])
ax5.set_ylim([-50, 80])
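For reference, here is a minimal self-contained version of the corrected window-spectrum plot (my own consolidation of the steps above, not part of the original answer); a tiny offset is added inside the log to avoid warnings where the magnitude is extremely small:
import numpy as np
import matplotlib.pyplot as plt

sample_freq = 3000
hann = np.hanning(sample_freq)

N = 8*len(hann)                                   # zero-pad for finer frequency resolution
fft_hann = np.fft.fft(hann, N)
fft_freq_axis = np.fft.fftfreq(N, d=1.0/sample_freq)

mag_db = 20*np.log10(np.abs(fft_hann) + 1e-12)    # offset avoids log10(0) warnings
plt.plot(np.fft.fftshift(fft_freq_axis), np.fft.fftshift(mag_db))
plt.xlim(-40, 40)
plt.ylim(-50, 80)
plt.xlabel('Frequency (Hz)')
plt.ylabel('Magnitude (dB)')
plt.show()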

comparing two frequency spectra

I'm trying to compare two frequency spectra but I am confused over a number of points.
One device samples at 40 Hz, the other at 100 Hz, and I'm not sure whether I need to take this into account. Anyway, I have produced frequency spectra from both devices and now I wish to compare them. How can I compute a correlation at each point, so that I get Pearson correlations per frequency? I know how to compute an overall correlation, of course, but I want to see which points correlate strongly and which do not.
If you are calculating power spectral densities P(f), then it doesn't matter how your original signal x(t) is sampled; you can directly and quantitatively compare both spectra. To make sure that you have calculated spectral densities, you can explicitly check Parseval's theorem:
$ \int P(f) df = \int x(t)^2 dt $
Of course you have to think about which frequencies are actually evaluated. Remember that an FFT gives you frequencies from f = 1/T up to (or just below) the Nyquist frequency f_ny = 1/(2 dt), depending on whether the number of samples in x(t) is even or odd.
Here is example Python code for the PSD:
import numpy as np

def psd(x, dt=1.):
    """Computes one-sided power spectral density of x.
    PSD estimated via abs**2 of Fourier transform of x.
    Takes care of even or odd number of elements in x:
    - if x is even both f=0 and Nyquist freq. appear once
    - if x is odd f=0 appears once and Nyquist freq. does not appear
    Note that there are no tapers applied: this may lead to leakage!
    Parseval's theorem (variance of time series equal to integral over PSD)
    holds and can be checked via
        print(np.var(x), sum(Px*f[1]))
    Accordingly, the estimated PSD is independent of time series length.
    Author/date: M. von Papen / 16.03.2017
    """
    N = np.size(x)
    xf = np.fft.fft(x)
    Px = abs(xf)**2./N*dt
    f = np.arange(N//2+1)/(N*dt)
    if np.mod(N, 2) == 0:
        Px[1:N//2] = 2.*Px[1:N//2]
    else:
        Px[1:N//2+1] = 2.*Px[1:N//2+1]
    # Take one-sided spectrum
    Px = Px[0:N//2+1]
    return Px, f
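As a quick usage check (my own addition, following the Parseval hint in the docstring), you can confirm that the integrated PSD matches the variance of a test signal:
import numpy as np

# Hypothetical test signal: sampled at 40 Hz, i.e. dt = 1/40 s
dt = 1.0/40.0
t = np.arange(4096)*dt
x = np.sin(2*np.pi*5.0*t) + 0.5*np.random.randn(t.size)

Px, f = psd(x, dt=dt)
# The two numbers should be close (Parseval's theorem); f[1] is the frequency bin width
print(np.var(x), np.sum(Px*f[1]))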

parameters for low pass fir filter using scipy

I am trying to write a simple low pass filter using scipy, but I need help defining the parameters.
I have 3.5 million records in the time series data that need to be filtered, and the data is sampled at 1000 Hz.
I am using signal.firwin and signal.lfilter from the scipy library.
The parameters I am choosing in the code below do not filter my data at all. Instead, the code below simply produces something that graphically looks like the same exact data except for a time phase distortion that shifts the graph to the right by slightly less than 1000 data points (1 second).
In another software program, running a low-pass FIR filter through graphical user interface commands produces output that has similar means for each 10-second (10,000 data point) segment, but drastically lower standard deviations, so that we essentially lose the noise in this particular data file and replace it with something that retains the mean value while showing longer-term trends that are not polluted by higher-frequency noise. The other software's parameters dialog box contains a check box that lets you select the number of coefficients so that it "optimizes based on sample size and sampling frequency." (Mine are 3.5 million samples collected at 1000 Hz, but I would like a function that uses these inputs as variables.)
Can anyone show me how to adjust the code below so that it removes all frequencies above 0.05 Hz? I would like to see smooth waves in the graph rather than just the time-shifted copy of the identical graph that I am getting from the code below now.
# Imports assumed from the original snippet (pylab-style names)
import pylab as p
from pylab import arange, plot, show
from scipy import signal

class FilterTheZ0():
    def __init__(self, ZSmoothedPylab):
        #------------------------------------------------------
        # Set the order and cutoff of the filter
        #------------------------------------------------------
        self.n = 1000
        self.ZSmoothedPylab = ZSmoothedPylab
        self.l = len(ZSmoothedPylab)
        self.x = arange(0, self.l)
        self.cutoffFreq = 0.05

        #------------------------------------------------------
        # Run the filter
        #------------------------------------------------------
        self.RunLowPassFIR_Filter(self.ZSmoothedPylab, self.n, self.l,
                                  self.x, self.cutoffFreq)

    def RunLowPassFIR_Filter(self, data, order, l, x, cutoffFreq):
        #------------------------------------------------------
        # Set a to be the denominator coefficient vector
        #------------------------------------------------------
        a = 1

        #----------------------------------------------------
        # Create the low pass FIR filter
        #----------------------------------------------------
        b = signal.firwin(self.n, cutoff=self.cutoffFreq, window="hamming")

        #---------------------------------------------------
        # Run the same data set through each of the various
        # filters that were created above.
        #---------------------------------------------------
        response = signal.lfilter(b, a, data)
        responsePylab = p.array(response)

        #--------------------------------------------------
        # Plot the input and the various outputs that are
        # produced by running each of the various filters
        # on the same inputs.
        #--------------------------------------------------
        plot(x[10000:20000], data[10000:20000])
        plot(x[10000:20000], responsePylab[10000:20000])
        show()
        return
Cutoff is normalized to the Nyquist frequency, which is half the sampling rate. So with FS = 1000 and FC = 0.05, you want cutoff = 0.05/500 = 1e-4.
from pylab import arange, cos, pi, rand, plot, grid, show
from scipy import signal

FS = 1000.0                                         # sampling rate
FC = 0.05/(0.5*FS)                                  # cutoff frequency at 0.05 Hz
N = 1001                                            # number of filter taps
a = 1                                               # filter denominator
b = signal.firwin(N, cutoff=FC, window='hamming')   # filter numerator

M = int(FS*60)                                      # number of samples (60 seconds)
n = arange(M)                                       # time index
x1 = cos(2*pi*n*0.025/FS)                           # signal at 0.025 Hz
x = x1 + 2*rand(M)                                  # signal + noise
y = signal.lfilter(b, a, x)                         # filtered output

plot(n/FS, x); plot(n/FS, y, 'r')                   # output in red
grid()
show()
The filter output is delayed half a second (the filter is centered on tap 500). Note that the DC offset added by the noise is preserved by the low-pass filter. Also, 0.025 Hz is well within the pass range, so the output swing from peak to peak is approximately 2.
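If the half-second delay itself is a problem, a common workaround (my addition, not part of the original answer) is to compensate the FIR group delay of (N-1)/2 samples, or to use scipy.signal.filtfilt for zero-phase filtering:
import numpy as np
from scipy import signal

FS = 1000.0
FC = 0.05/(0.5*FS)
N = 1001
b = signal.firwin(N, cutoff=FC, window='hamming')

t = np.arange(int(FS*60))/FS
x = np.cos(2*np.pi*0.025*t) + 2*np.random.rand(t.size)

delay = (N - 1)//2                          # 500 samples = 0.5 s at 1000 Hz
y = signal.lfilter(b, [1.0], x)
y_aligned = y[delay:]                       # crude alignment: compare with x[:t.size - delay]
y_zerophase = signal.filtfilt(b, [1.0], x)  # forward-backward filtering: zero phase, squared magnitude response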
The cutoff frequency for firwin is normalized to [0, 1), where 1.0 corresponds to the Nyquist frequency (half the sampling frequency). So if you really mean 0.05 Hz and FS = 1000 Hz, you'd want to pass cutoffFreq/500 = 1e-4, as in the answer above. You may need a longer filter to get such a low cutoff.
(BTW you are passing some arguments but then using the object attributes instead, but I don't see that introducing any obvious bugs yet...)