I wanted to create two random noise signals sampled at 2.5 GSa/s, band-limited to 200 kHz - 20 MHz, with a signal duration of 5 µs, and calculate their FFT, but I have a problem with the FFT. Thanks for any help. The code is:
import numpy as np
import matplotlib.pyplot as plot
from scipy import signal
from scipy import fft
import pandas as pd
t = np.arange(0, 5e-6, 4e-10)
s1 = 1e-8*np.random.normal(0, 1, 12500)
s2 = 1e-8*np.random.normal(0, 1, 12500)
sos1 = signal.butter(N=10, Wn=[200000, 20000000], btype='band', fs=2.5e9, output='sos')
sos2 = signal.butter(N=10, Wn=[200000, 20000000], btype='band', fs=2.5e9, output='sos')
fs1 = signal.sosfilt(sos1, s1)
fs2 = signal.sosfilt(sos2, s2)
f1 = abs(fs1.fft())
f2 = abs(fs2.fft())
ax1 = plot.subplot(311)
plot.plot(t, fs1, t, fs2)
#ax1.set_xlim([0, 5e-6])
plot.xlabel('Time (s)')
plot.ylabel('Current (A)')
ax2 = plot.subplot(312)
plot.plot(f1, f2)
plot.xlabel('Frequency (Hz)')
plot.ylabel('Current (A)')
plot.show()
I had to make some changes to your code in order to run it. The main one was to change fs1.fft() to fft.fft().
Another thing to be aware of is fft.fftshift(). You can calculate the frequency vector by hand, but this is somewhat tedious because of the order of the elements in the resulting FFT vector, which has a peculiar frequency arrangement. From the scipy.fft.fft() documentation:
The frequency term f=k/n is found at y[k]. At y[n/2] we reach the Nyquist frequency and wrap around to the negative-frequency terms. So, for an 8-point transform, the frequencies of the result are [0, 1, 2, 3, -4, -3, -2, -1]. To rearrange the fft output so that the zero-frequency component is centered, like [-4, -3, -2, -1, 0, 1, 2, 3], use fftshift.
So, the easiest way is to use scipy.fft.fftfreq() to let scipy do the calculation for you. If you want to plot it in a natural way, then you should call scipy.fft.fftshift() to shift the zero Hz frequency to the center of the array.
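As a quick illustration of that ordering (a small 8-point example of my own, not part of the original code):
from scipy import fft
# Bin indices of an 8-point transform, in FFT order and in shifted order
print(fft.fftfreq(8, d=1.0) * 8)                # [ 0.  1.  2.  3. -4. -3. -2. -1.]
print(fft.fftshift(fft.fftfreq(8, d=1.0) * 8))  # [-4. -3. -2. -1.  0.  1.  2.  3.]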
Also, since you are using real signals, for efficiency you might consider the real-input version of the FFT, scipy.fft.rfft(). Its output does not include the negative frequencies, because for real input the output of the full algorithm is always conjugate-symmetric.
Please see the code below.
import matplotlib
matplotlib.use('Qt5Agg')
import numpy as np
import matplotlib.pyplot as plot
from scipy import signal
from scipy import fft
import pandas as pd
sampling_freq_Hz = 2.5e9
sampling_period_s = 1 / sampling_freq_Hz
signal_duration_s = 5.0e-6
wanted_number_of_points = signal_duration_s / sampling_period_s
f_low_Hz = 200.0e3
f_high_Hz = 20.0e6
msg = f'''
Sampling frequency: {sampling_freq_Hz} Hz
Sampling period: {sampling_period_s} s
Signal duration: {signal_duration_s} s
Wanted number of points: {wanted_number_of_points}
Lower frequency limit: {f_low_Hz} Hz
Upper frequency limit: {f_high_Hz} Hz
'''
print(msg)
# Time axis
time_s = np.arange(0, signal_duration_s, sampling_period_s)
real_number_of_points = time_s.size
print(f'Real number of points: {real_number_of_points}')
# Zero-mean Gaussian noise (the scale argument of np.random.normal is the standard deviation)
sigma = 1.0e-8
s1 = np.random.normal(0, sigma, real_number_of_points)
s2 = np.random.normal(0, sigma, real_number_of_points)
# Since both filters are equal, you only need one
sos1 = signal.butter(N=10, Wn=[f_low_Hz, f_high_Hz], btype='band', fs=sampling_freq_Hz, output='sos')
#sos2 = signal.butter(N=10, Wn=[f_low_Hz, f_high_Hz], btype='band', fs=sampling_freq_Hz, output='sos')
# Do the actual filtering
filtered_signal_1 = signal.sosfilt(sos1, s1)
filtered_signal_2 = signal.sosfilt(sos1, s2)
# Absolute value
f_1 = abs(fft.fft(filtered_signal_1))
f_2 = abs(fft.fft(filtered_signal_2))
freqs_Hz = fft.fftfreq(time_s.size, sampling_period_s)
# Shift the FFT for understandable plotting
f_1_shift = fft.fftshift(f_1)
f_2_shift = fft.fftshift(f_2)
freqs_Hz_shift = fft.fftshift(freqs_Hz)
# Plot
ax1 = plot.subplot(311)
ax1.plot(time_s, filtered_signal_1, time_s, filtered_signal_2)
#ax1.set_xlim([0, 5e-6])
ax1.set_xlabel('Time (s)')
ax1.set_ylabel('Current (A)')
ax2 = plot.subplot(313)
ax2.plot(freqs_Hz_shift, f_1_shift, freqs_Hz_shift, f_2_shift)
ax2.set_xlabel('Frequency (Hz)')
ax2.set_ylabel('Current (A)')
plot.show()
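If you use the real-input FFT mentioned above, a minimal sketch (reusing filtered_signal_1, time_s and sampling_period_s from the code above; place it before plot.show()) could look like this. No fftshift is needed, since rfft returns only the non-negative frequencies:
# Sketch: real-input FFT of one of the filtered signals
f_1_r = abs(fft.rfft(filtered_signal_1))
freqs_r_Hz = fft.rfftfreq(time_s.size, sampling_period_s)  # 0 ... Nyquist
ax3 = plot.subplot(312)
ax3.plot(freqs_r_Hz, f_1_r)
ax3.set_xlabel('Frequency (Hz)')
ax3.set_ylabel('Current (A)')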
For some reason, when I try to fit a large amount of data to a sine wave, the fit fails and produces a horizontal line. Can somebody explain?
Minimal working code:
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize
import pandas
# Seed the random number generator for reproducibility
np.random.seed(0)
# Here it work as expected
# x_data = np.linspace(-5, 5, num=50)
# y_data = 2.9 * np.sin(1.05 * x_data + 2) + 250 + np.random.normal(size=50)
# With this data it breaks
x_data = np.linspace(0, 2500, num=2500)
y_data = -100 * np.sin(0.01 * x_data + 1) + 250 + np.random.normal(size=2500)
# And plot it
plt.figure(figsize=(6, 4))
plt.scatter(x_data, y_data)
def test_func(x, a, b, c, d):
    return a * np.sin(b * x + c) + d
# Used to fit the correct function
# params, params_covariance = optimize.curve_fit(test_func, x_data, y_data)
# making some guesses
params, params_covariance = optimize.curve_fit(test_func, x_data, y_data,
p0=[-80, 3, 0, 260])
print(params)
plt.figure(figsize=(6, 4))
plt.scatter(x_data, y_data, label='Data')
plt.plot(x_data, test_func(x_data, *params),
label='Fitted function')
plt.legend(loc='best')
plt.show()
Does anybody know how to fix this issue? Should I use a different fitting method instead of least squares? Or should I reduce the number of data points?
Given your data, you can use the more robust lmfit instead of scipy.
In particular, you can use SineModel (see here for details).
SineModel in lmfit does not include a vertical offset (a "shifted" sine wave), but you can easily deal with the offset by removing the mean:
y_data_offset = y_data.mean()
y_transformed = y_data - y_data_offset
plt.scatter(x_data, y_transformed)
plt.axhline(0, color='r')
Now you can fit the sine wave:
from lmfit.models import SineModel
mod = SineModel()
pars = mod.guess(y_transformed, x=x_data)
out = mod.fit(y_transformed, pars, x=x_data)
You can inspect the results with print(out.fit_report()) and plot them with
plt.plot(x_data, y_data, lw=7, color='C1')
plt.plot(x_data, out.best_fit+y_data_offset, color='k')
# we add the offset ^^^^^^^^^^^^^
or with the builtin plot method out.plot_fit(), see here for details.
Note that in SineModel all parameters "are constrained to be non-negative", so the negative amplitude you defined (-100) will appear as positive (+100) in the fitted parameters. Consequently, the phase won't be 1 but π+1 (note that lmfit calls the phase "shift").
print(out.best_values)
{'amplitude': 99.99631403054289,
'frequency': 0.010001193681616227,
'shift': 4.1400215410836605}
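If you want to map the result back onto the original parameterization (negative amplitude and a phase near 1), a small sketch of mine, assuming the out and y_data_offset objects from above:
import numpy as np
# Flip the sign of the amplitude and subtract pi from the reported shift,
# since A*sin(w*x + phi + pi) == -A*sin(w*x + phi)
amp = -out.best_values['amplitude']        # about -100
phase = out.best_values['shift'] - np.pi   # about 1
offset = y_data_offset                     # about 250 (the mean we removed earlier)
print(amp, phase, offset)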
Using Python and matplotlib, I would like to plot sensor data over a period of several hours. The signal arrives via an audio card and is sampled in short chunks of data. In the example below, amplitude and RMS are plotted.
In order to plot RMS and other properties over much larger time periods than shown here, perhaps downsampling is needed. I am not sure how to accomplish that and would appreciate any further advice. The intention is to run the code on a Raspberry Pi.
Update 1. A very minimal example is shown for getting a longer time view of the RMS.
Noticeable is a considerable delay in response to audio signals, in particular when adding more plots to the figure.
I also tried using FuncAnimation without blitting, because I would like to show a real-time axis, and it is equally slow. Using PyQt should give better results.
import pyaudio
import struct
import datetime as dt  # needed for the timestamps in the loop below
import matplotlib.pyplot as plt
import numpy as np
mic = pyaudio.PyAudio()
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
CHUNK = int(RATE/20)
stream = mic.open(format=FORMAT, channels=CHANNELS, rate=RATE, input=True,
output=True,
frames_per_buffer=CHUNK)
fig = plt.figure()
ax1 = fig.add_subplot(2, 1, 1)
ax2 = fig.add_subplot(2, 1, 2)
ax1.set_xlabel("Samples = 2*Chunk length ")
ax1.set_ylabel("Amplitude")
ax1.set_title('Audio example')
fig.tight_layout(pad=3.0)
x = np.arange(0, 2 * CHUNK, 2)
ax1.set_ylim(-10e3, 10e3)
ax1.set_xlim(0, CHUNK)
line1, = ax1.plot(x, np.random.rand(CHUNK))
line2, = ax2.plot(x, np.random.rand(CHUNK))
ts = []
rs = []
while True:
    data = stream.read(CHUNK)
    data = np.frombuffer(data, np.int16)
    d = data.astype(float)
    rms2 = np.sqrt(np.mean(d**2))
    #print(rms2)
    # Add x and y to lists
    ts.append(dt.datetime.now())
    rs.append(rms2)
    # Draw x and y lists
    ax2.clear()
    ax2.plot(ts, rs, color='black')
    # Format plot
    ax2.set_xlabel("Time in UTC")
    ax2.set_ylabel("RMS values")
    ax2.set_title('RMS')
    line1.set_ydata(data)
    line2.set_ydata(rms2)
    plt.setp(ax2.get_xticklabels(), ha="right", rotation=45)
    fig.gca().relim()
    fig.gca().autoscale_view()
    #fig.canvas.draw()
    #fig.canvas.flush_events()
    plt.pause(0.01)
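No downsampling is shown above; one possible sketch (not from the original post; MAX_POINTS and REDRAW_EVERY are arbitrary names and values) is to keep a bounded RMS history and redraw the slow plot only every few chunks:
from collections import deque

MAX_POINTS = 2000               # cap on stored RMS values (~100 s at 20 chunks/s)
REDRAW_EVERY = 10               # redraw the long-term plot every 10 chunks
ts = deque(maxlen=MAX_POINTS)   # timestamps; old entries fall off automatically
rs = deque(maxlen=MAX_POINTS)   # RMS values
chunk_count = 0

# Inside the while-loop above, append as before and redraw only occasionally:
#     chunk_count += 1
#     ts.append(dt.datetime.now())
#     rs.append(rms2)
#     if chunk_count % REDRAW_EVERY == 0:
#         ax2.clear()
#         ax2.plot(ts, rs, color='black')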
I have been trying to smooth curves with savgol_filter (SciPy) and, in several of my attempts, raising the polynomial order resulted in "drops" like the one I show below. This example uses Google Trends data, but I had similar problems with stock data and electricity consumption data. Any lead as to why it behaves like this, or how to solve it (and be able to raise the polynomial order), would be highly appreciated.
Image below: "Sample output".
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
from pytrends.request import TrendReq
pytrends = TrendReq(hl='en-US', tz=360)
from scipy.signal import savgol_filter
kw_list = ["Carbon footprint"]
pytrends.build_payload(kw_list, timeframe='2004-12-14 2019-12-25', geo='', gprop='')
da1 = pytrends.interest_over_time()
#(drop the last record for Savgol, as the window length must be odd; there used to be 196 records)
Y3 = da1["Carbon footprint"]
fig = plt.figure(figsize=(18,9))
l = Y3.shape[0]
l = l if l%2 == 1 else l-1
# window = odd number closest to size of data
ax1 = plt.subplot(2,1,1)
ax1 = sns.lineplot(data=Y3, color="navy")
#Savgol with polynomial order = 7 is fine (but misses the initial plateau)
Y3_smooth = savgol_filter(Y3,l, 7)
ax1 = sns.lineplot(x=da1.index.to_pydatetime(),y=Y3_smooth, color="red")
plt.title(f"red = with Savgol, polynomial order = 7, window = {l}", fontsize=18)
ax2 = plt.subplot(2,1,2)
ax2 = sns.lineplot(data=Y3, color="navy")
#Savgol with polynomial order = 9 or more has a weird drop
Y3_smooth = savgol_filter(Y3,l, 10)
ax2 = sns.lineplot(x=da1.index.to_pydatetime(),y=Y3_smooth, color="red")
plt.title(f"red = with Savgol, polynomial order = 10, window = {l}", fontsize=18)
(Image: sample output showing the drop)
If anyone is interested, I found this workaround using a different way to smooth. It works well including in the beginning and end, and allows a fine tuning of the degree of smoothing.
from scipy.ndimage import gaussian_filter1d

def smooth(y, sigma=2):
    # Gaussian smoothing; larger sigma gives heavier smoothing
    y_smooth = gaussian_filter1d(y, sigma)
    return y_smooth
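For example (a sketch of mine, reusing Y3, da1, sns and plt from the code above), it can be dropped in where the savgol_filter call was:
# Sketch: Gaussian smoothing of the trends series from above
Y3_gauss = smooth(Y3.to_numpy(), sigma=3)   # tune sigma for more or less smoothing
fig = plt.figure(figsize=(18, 6))
sns.lineplot(data=Y3, color="navy")
sns.lineplot(x=da1.index.to_pydatetime(), y=Y3_gauss, color="red")
plt.title("red = Gaussian smoothing, sigma = 3", fontsize=18)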
I am saving two separate figures, each of which should contain two plots together.
The problem is that the first figure is fine, but the second one does not get drawn on a new figure; it ends up on top of the previous one, and in the saved file I only find one of the two plots.
This is the first figure, and I get it correctly:
import scipy.stats as s
import numpy as np
import os
import pandas as pd
import openpyxl as pyx
import matplotlib
matplotlib.rcParams["backend"] = "TkAgg"
#matplotlib.rcParams['backend'] = "Qt4Agg"
#matplotlib.rcParams['backend'] = "nbAgg"
import matplotlib.pyplot as plt
import math
data = [336256, 620316, 958846, 1007830, 1080401]
pdf = np.array([0.00449982, 0.0045293, 0.00455894, 0.02397463,
                0.02395788, 0.02394114])
fig, ax = plt.subplots();
fig = plt.figure(figsize=(40,30))
x = np.linspace(np.min(data), np.max(data), 100);
plt.plot(x, s.exponweib.pdf(x, *s.exponweib.fit(data, 1, 1, loc=0, scale=2)))
plt.hist(data, bins = np.linspace(data[0], data[-1], 100), normed=True, alpha= 1)
text1= ' Weibull'
plt.savefig(text1+ '.png' )
datar =np.asarray(data)
mu, sigma = datar.mean() , datar.std() # mean and standard deviation
normal_std = np.sqrt(np.log(1 + (sigma/mu)**2))
normal_mean = np.log(mu) - normal_std**2 / 2
hs = np.random.lognormal(normal_mean, normal_std, 1000)
print(hs.max()) # some finite number
print(hs.mean()) # about 136519
print(hs.std()) # about 50405
count, bins, ignored = plt.hist(hs, 100, normed=True)
x = np.linspace(min(bins), max(bins), 10000)
pdfT = []
for el in range(len(x)):
    pdfTmp = (math.exp(-(np.log(x[el]) - normal_mean)**2 / (2 * normal_std**2)))
    pdfT += [pdfTmp]
pdf = np.asarray(pdfT)
This is the second set:
fig, ax = plt.subplots();
fig = plt.figure(figsize=(40,40))
plt.plot(x, pdf, linewidth=2, color='r')
plt.hist(data, bins = np.linspace(data[0], data[-1], 100), normed=True, alpha= 1)
text= ' Lognormal '
plt.savefig(text+ '.png' )
The first plot saves the histogram together with the curve; the second one only saves the curve.
Update 1: looking at this question, I found out that clearing the plot history helps keep the figures from getting mixed up, but my second set of plots (the lognormal one) still does not save both: I only get the curve and not the histogram.
This is happening because you have set normed=True, which means that the area under the histogram is normalized to 1. Since your bins are very wide, the actual height of the histogram bars is very small (in this case so small that they are not visible).
If you use
n, bins, _ = plt.hist(data, bins = np.linspace(data[0], data[-1], 100), normed=True, alpha= 1)
n will contain the y-values of your bins, and you can confirm this yourself.
Also have a look at the documentation for plt.hist.
So if you set normed to False, the histogram will be visible.
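For example, a quick check along those lines (a sketch reusing data and the same bins as above; the printed value is not from the original post):
# With normed=True the bar heights are densities, far below 1 for such wide bins
n, bins, _ = plt.hist(data, bins=np.linspace(data[0], data[-1], 100), normed=True, alpha=1)
print(n.max())    # a very small number, which is why the bars are invisible
# With normed=False the bar heights are raw counts and the histogram shows up
plt.hist(data, bins=np.linspace(data[0], data[-1], 100), normed=False, alpha=1)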
Edit: number of bins
import numpy as np
import matplotlib.pyplot as plt
rand_data = np.random.uniform(0, 1.0, 100)
fig = plt.figure()
ax_1 = fig.add_subplot(211)
ax_1.hist(rand_data, bins=10)
ax_2 = fig.add_subplot(212)
ax_2.hist(rand_data, bins=100)
plt.show()
will give you two plots similar (since it's random) to:
which shows how the number of bins changes the histogram.
A histogram visualises the distribution of your data along one dimension, so I am not sure what you mean by the number of inputs and bins.
I want to create a scatter plot with matplotlib where the data points have scalar data attached to them and are assigned a color depending on how large their attached value is relative to the other points in the set. I.e., I want something akin to a heatmap. However, I'm looking for a "discrete" heatmap, i.e. nothing should be plotted where there were no points in the original data set and, in particular, no interpolation (in space) should be performed.
Can this be done?
You can use scatter and pass the attached values to the c parameter:
import numpy as np
import pylab as pl
x = np.random.uniform(-1, 1, 1000)
y = np.random.uniform(-1, 1, 1000)
z = np.sqrt(x*x+y*y)
pl.scatter(x, y, c=z)
pl.colorbar()
pl.show()
Solving this in Altair.
import numpy as np
import pandas as pd
import pylab as pl
x = np.random.uniform(-1, 1, 1000)
y = np.random.uniform(-1, 1, 1000)
z = np.sqrt(x*x+y*y)
df = pd.DataFrame({'x':x,'y':y, 'z':z})
from altair import *
Chart(df).mark_circle().encode(x='x',y='y', color='z')