TypeError: 1st argument must be a real sequence 2 signal.spectrogram - pandas

I'm trying to take a signal from an electrical reading and decompose it into its spectrogram, but I keep getting a weird error. Here is the code:
f, t, Sxx = signal.spectrogram(i_data.values, 130)
plt.pcolormesh(t, f, Sxx)
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()
And here is the error:
convert_to_spectrogram(i_data.iloc[1000,:10020].dropna().values)
Traceback (most recent call last):
File "<ipython-input-140-e5951b2d2d97>", line 1, in <module>
convert_to_spectrogram(i_data.iloc[1000,:10020].dropna().values)
File "<ipython-input-137-5d63a96c8889>", line 2, in convert_to_spectrogram
f, t, Sxx = signal.spectrogram(wf, 130)
File "//anaconda3/lib/python3.7/site-packages/scipy/signal/spectral.py", line 750, in spectrogram
mode='psd')
File "//anaconda3/lib/python3.7/site-packages/scipy/signal/spectral.py", line 1836, in _spectral_helper
result = _fft_helper(x, win, detrend_func, nperseg, noverlap, nfft, sides)
File "//anaconda3/lib/python3.7/site-packages/scipy/signal/spectral.py", line 1921, in _fft_helper
result = func(result, n=nfft)
File "//anaconda3/lib/python3.7/site-packages/mkl_fft/_numpy_fft.py", line 335, in rfft
output = mkl_fft.rfft_numpy(x, n=n, axis=axis)
File "mkl_fft/_pydfti.pyx", line 609, in mkl_fft._pydfti.rfft_numpy
File "mkl_fft/_pydfti.pyx", line 502, in mkl_fft._pydfti._rc_fft1d_impl
TypeError: 1st argument must be a real sequence 2
My reading has a full cycle of 130 observations and it's stored as individual values of a pandas df. The wave I am using in particular can be found here. Anyone have any ideas what this error means?
(Small disclaimer, I do not know much about signal processing, so please forgive me if this is a naive question)

Python 3.6.9, scipy 1.3.3
Downloading your file and reading it with pandas.read_csv, I could generate the following spectrogram.
import matplotlib.pyplot as plt
import pandas as pd
from scipy.signal import spectrogram
i_data = pd.read_csv('wave.csv')
f, t, Sxx = spectrogram(i_data.values[:, 1], 130)
plt.pcolormesh(t, f, Sxx)
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()
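The thread never states the root cause. One plausible explanation, which is an assumption here and not something the original posts confirm, is that the row pulled out with .iloc[...].dropna().values had object dtype, which the FFT backend rejects. A minimal sketch using a synthetic 5 Hz sine as a stand-in for the real wave (the names wave, as_objects and clean are made up for illustration):

```python
import numpy as np
from scipy.signal import spectrogram

fs = 130  # second positional argument of spectrogram is the sampling frequency
t = np.arange(10 * fs) / fs
wave = np.sin(2 * np.pi * 5 * t)  # synthetic stand-in for the real reading

# Assumption: a row sliced out of a mixed-type DataFrame can come back
# as an object array, mimicked here with astype(object).
as_objects = wave.astype(object)

# Casting to a real floating-point array before the FFT avoids the problem.
clean = np.asarray(as_objects, dtype=float)

f, seg_t, Sxx = spectrogram(clean, fs)
print(clean.dtype, f.shape, seg_t.shape, Sxx.shape)
```

If the cast itself raises, the data contains non-numeric entries, and checking i_data.dtypes would be the next thing to look at.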

NumPy Tensordot axes=2

I know there are many questions about tensordot, and I've skimmed some of the 15 page mini-book answers that people I'm sure spent hours making, but I haven't found an explanation of what axes=2 does.
This made me think that np.tensordot(b,c,axes=2) == np.sum(b * c), but as an array:
b = np.array([[1,10],[100,1000]])
c = np.array([[2,3],[5,7]])
np.tensordot(b,c,axes=2)
Out: array(7532)
But then this failed:
a = np.arange(30).reshape((2,3,5))
np.tensordot(a,a,axes=2)
If anyone can provide a short, concise explanation of np.tensordot(x,y,axes=2), and only axes=2, then I would gladly accept it.
In [70]: a = np.arange(24).reshape(2,3,4)
In [71]: np.tensordot(a,a,axes=2)
Traceback (most recent call last):
File "<ipython-input-71-dbe04e46db70>", line 1, in <module>
np.tensordot(a,a,axes=2)
File "<__array_function__ internals>", line 5, in tensordot
File "/usr/local/lib/python3.8/dist-packages/numpy/core/numeric.py", line 1116, in tensordot
raise ValueError("shape-mismatch for sum")
ValueError: shape-mismatch for sum
In my previous post, "How does numpy.tensordot function works step-by-step?", I deduced that axes=2 translates to axes=([-2,-1],[0,1]):
In [72]: np.tensordot(a,a,axes=([-2,-1],[0,1]))
Traceback (most recent call last):
File "<ipython-input-72-efdbfe6ff0d3>", line 1, in <module>
np.tensordot(a,a,axes=([-2,-1],[0,1]))
File "<__array_function__ internals>", line 5, in tensordot
File "/usr/local/lib/python3.8/dist-packages/numpy/core/numeric.py", line 1116, in tensordot
raise ValueError("shape-mismatch for sum")
ValueError: shape-mismatch for sum
So that's trying to do a double axis reduction on the last 2 dimensions of the first a, and the first 2 dimensions of the second a. With this a that's a dimension mismatch: the last two axes have shape (3, 4), while the first two have shape (2, 3). Evidently this form of axes was intended for 2d arrays, without much thought given to 3d ones. It is not a 3 axis reduction.
These single digit axes values are something that some developer thought would be convenient, but that does not mean they were rigorously thought out or tested.
The tuple axes gives you more control:
In [74]: np.tensordot(a,a,axes=[(0,1,2),(0,1,2)])
Out[74]: array(4324)
In [75]: np.tensordot(a,a,axes=[(0,1),(0,1)])
Out[75]:
array([[ 880, 940, 1000, 1060],
[ 940, 1006, 1072, 1138],
[1000, 1072, 1144, 1216],
[1060, 1138, 1216, 1294]])
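To make the axes=2 rule concrete, here is a small sketch. The array d and the einsum comparison are additions for illustration, not part of the original posts. It shows that axes=2 works on 3d input as long as the last two axes of the first array match the first two axes of the second:

```python
import numpy as np

# 2-D case from the question: both axes are contracted, so the result
# is the 0-d array equal to np.sum(b * c).
b = np.array([[1, 10], [100, 1000]])
c = np.array([[2, 3], [5, 7]])
full = np.tensordot(b, c, axes=2)

# 3-D case: the last two axes of `a` (3, 4) are paired with the first
# two axes of `d` (3, 4), leaving a result of shape (2, 5).
a = np.arange(24).reshape(2, 3, 4)
d = np.arange(60).reshape(3, 4, 5)  # hypothetical second operand
out = np.tensordot(a, d, axes=2)

# The same contraction written with explicit axes and with einsum.
explicit = np.tensordot(a, d, axes=([-2, -1], [0, 1]))
via_einsum = np.einsum("ijk,jkl->il", a, d)

print(full, out.shape)
```

With a paired against itself the shapes (3, 4) and (2, 3) do not line up, which is exactly the shape-mismatch error shown above.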

Pandas 0.24.0 breaks my pandas dataframe with special column identifiers

I had code that worked fine until I tried to run it on a coworker's machine, whereupon I discovered that while it worked using pandas 0.22.0, it broke on pandas 0.24.0. For the moment, we've solved this problem by downgrading their copy of pandas, but I would like to find a better solution if one exists.
The problem seems to be that I am using a user-defined class as the identifiers for my columns in the dataframe. When comparing two dataframes, pandas for some reason tries to call my column labels as functions, and then throws an exception because they aren't callable.
Here's some example code:
import pandas as pd
import numpy as np
class label(object):
    def __init__(self, var):
        self.var = var
    def __eq__(self,other):
        return self.var == other.var
df = pd.DataFrame(np.eye(5),columns=[label(ii) for ii in range(5)])
df == df
This produces the following stack trace:
Traceback (most recent call last):
File "<ipython-input-4-496e4ab3f9d9>", line 1, in <module>
df==df1
File "C:\...\site-packages\pandas\core\ops.py", line 2098, in f
return dispatch_to_series(self, other, func, str_rep)
File "C:\...\site-packages\pandas\core\ops.py", line 1157, in dispatch_to_series
new_data = expressions.evaluate(column_op, str_rep, left, right)
File "C:\...\site-packages\pandas\core\computation\expressions.py", line 208, in evaluate
return _evaluate(op, op_str, a, b, **eval_kwargs)
File "C:\...\site-packages\pandas\core\computation\expressions.py", line 68, in _evaluate_standard
return op(a, b)
File "C:\...\site-packages\pandas\core\ops.py", line 1135, in column_op
for i in range(len(a.columns))}
File "C:\...\site-packages\pandas\core\ops.py", line 1135, in <dictcomp>
for i in range(len(a.columns))}
File "C:\...\site-packages\pandas\core\ops.py", line 1739, in wrapper
name=res_name).rename(res_name)
File "C:\...\site-packages\pandas\core\series.py", line 3733, in rename
return super(Series, self).rename(index=index, **kwargs)
File "C:\...\site-packages\pandas\core\generic.py", line 1091, in rename
level=level)
File "C:\...\site-packages\pandas\core\internals\managers.py", line 171, in rename_axis
obj.set_axis(axis, _transform_index(self.axes[axis], mapper, level))
File "C:\...\site-packages\pandas\core\internals\managers.py", line 2004, in _transform_index
items = [func(x) for x in index]
TypeError: 'label' object is not callable
I've found I can fix the problem by making my class callable with a single argument and returning that argument, but that breaks .loc indexing, which will default to treating my objects as callables.
This problem only occurs when the custom objects are in the columns - the index can handle them just fine.
Is this a bug or a change in usage, and is there any way I can work around it without giving up my custom labels?

python-xarray: rolling mean example

I have a file which is monthly data for one year (12 points). The data starts in December and ends in November. I'm hoping to create a 3-month running mean file which would be DJF, JFM, ..., SON (10 points).
I noticed there is a DataArray.rolling function which returns a rolling window object, and I think it would be useful for this. However, I haven't found any examples using the rolling function. I admit I'm not familiar with bottleneck, pandas.rolling_mean or the more recent pandas.rolling, so my entry level is fairly low.
Here's some code to test:
import numpy as np
import pandas as pd
import xarray as xr
lat = np.linspace(-90, 90, num=181); lon = np.linspace(0, 359, num=360)
# Define monthly average time as day in middle of month
time = pd.date_range('15/12/1999', periods=12, freq=pd.DateOffset(months=1))
# Create data as 0:11 at each grid point
a = np.linspace(0,11,num=12)
# expand to 2D
a2d = np.repeat(a[:, np.newaxis], len(lat), axis=1)
# expand to 3D
a3d = np.repeat(a2d[:, :, np.newaxis], len(lon), axis=2)
# I'm sure there was a cleaner way to do that...
da = xr.DataArray(a3d, coords=[time, lat, lon], dims=['time','lat','lon'])
# Having a stab at the 3-month rolling mean
da.rolling(dim='time',window=3).mean()
# Error output:
Traceback (most recent call last):
File "<ipython-input-132-9d64cc09c263>", line 1, in <module>
da.rolling(dim='time',window=3).mean()
File "/Users/Ray/anaconda/lib/python3.6/site-packages/xarray/core/common.py", line 478, in rolling
center=center, **windows)
File "/Users/Ray/anaconda/lib/python3.6/site-packages/xarray/core/rolling.py", line 126, in __init__
center=center, **windows)
File "/Users/Ray/anaconda/lib/python3.6/site-packages/xarray/core/rolling.py", line 62, in __init__
raise ValueError('exactly one dim/window should be provided')
ValueError: exactly one dim/window should be provided
You are very close. The rolling method takes the dimension name and window size as a keyword argument, in the form dim_name=window_size, not as separate dim and window arguments. This should work for you:
da.rolling(time=3).mean()
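As a rough end-to-end check, here is a sketch on a much smaller grid than the one in the question (the 3x4 grid and the names rolled and seasonal are made up for illustration). It also drops the two leading windows that are incomplete, which leaves the 10 points the question asks for:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Twelve monthly time stamps starting in December, as in the question.
time = pd.date_range("1999-12-15", periods=12, freq=pd.DateOffset(months=1))
lat = np.linspace(-90, 90, num=3)   # small illustrative grid
lon = np.linspace(0, 270, num=4)

# Values 0..11 along time, broadcast to every grid point.
values = np.arange(12.0)[:, None, None] * np.ones((1, lat.size, lon.size))
da = xr.DataArray(values, coords=[time, lat, lon], dims=["time", "lat", "lon"])

# 3-month rolling mean; by default each result is labelled with the
# last time stamp of its window, and incomplete windows are NaN.
rolled = da.rolling(time=3).mean()

# Drop the two leading NaN steps to keep only full 3-month windows.
seasonal = rolled.dropna("time")
print(seasonal.sizes["time"])
```

Passing center=True to rolling would label each window by its middle month instead, which may be closer to the usual DJF convention.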

Plotting Single Quiver Arrow with basemap

New to Python, I know this is an easy question with a one line answer but I can't figure it out.
I'm trying to plot a single quiver arrow on top of a basemap with a single longitude and latitude coordinate point but I'm getting errors when trying to plot the quiver.
Here's what I have so far:
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
lat,lon=28.31393296,-96.63100599
u,v=6.16,-3.02
m=Basemap(projection='mill',llcrnrlon=-96.9,llcrnrlat=28.245,
urcrnrlon=-96.587,urcrnrlat=28.485,resolution='i')
m.drawcoastlines()
m.fillcontinents(color='coral',lake_color='aqua')
m.drawmapboundary(fill_color='aqua')
m.quiver(lon,lat,u,v,latlon=True)
plt.show()
And I'm getting the error:
Traceback (most recent call last):
File "<ipython-input-46-c277f7618784>", line 1, in <module>
runfile('C:/animationtest.py', wdir='C:')
File "C:\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
execfile(filename, namespace)
File "C:\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
exec(compile(scripttext, filename, 'exec'), glob, loc)
File "C:/animationtest.py", line 41, in <module>
m.quiver(lon,lat,ve,vn,latlon=True)
File "C:\Anaconda\lib\site-packages\mpl_toolkits\basemap\__init__.py", line 556, in with_transform
x1, u = self.shiftdata(x, u)
File "C:\Anaconda\lib\site-packages\mpl_toolkits\basemap\__init__.py", line 4713, in shiftdata
raise ValueError('1-d or 2-d longitudes required')
ValueError: 1-d or 2-d longitudes required
Can someone please explain why this doesn't work?
EDIT: Figured out the solution. I had to convert the lat/lon to map projection coordinates with the line
x,y=m(lon,lat)
And then plot the quiver with
m.quiver(x,y,u,v)
This does exactly what I needed.

Pandas Group Example Errors

I am trying to replicate an example from Wes McKinney's book on pandas. The code is below (it assumes all of the names data files are under a names folder):
# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd
years = range(1880, 2011)
pieces = []
columns = ['name', 'sex', 'births']
for year in years:
    path = 'names/yob%d.txt' % year
    frame = pd.read_csv(path, names=columns)
    frame['year'] = year
    pieces.append(frame)
names = pd.concat(pieces, ignore_index=True)
names
def get_tops(group):
    return group.sort_index(by='births', ascending=False)[:1000]
grouped = names.groupby(['year','sex'])
grouped.apply(get_tops)
I am using Pandas 0.10 and Python 2.7. The error I am seeing is this:
Traceback (most recent call last):
File "names.py", line 21, in <module>
grouped.apply(get_tops)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/groupby.py", line 321, in apply
return self._python_apply_general(f)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/groupby.py", line 324, in _python_apply_general
keys, values, mutated = self.grouper.apply(f, self.obj, self.axis)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/groupby.py", line 585, in apply
values, mutated = splitter.fast_apply(f, group_keys)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/groupby.py", line 2127, in fast_apply
results, mutated = lib.apply_frame_axis0(sdata, f, names, starts, ends)
File "reduce.pyx", line 421, in pandas.lib.apply_frame_axis0 (pandas/lib.c:24934)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/frame.py", line 2028, in __setattr__
self[name] = value
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/frame.py", line 2043, in __setitem__
self._set_item(key, value)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/frame.py", line 2078, in _set_item
value = self._sanitize_column(key, value)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/frame.py", line 2112, in _sanitize_column
raise AssertionError('Length of values does not match '
AssertionError: Length of values does not match length of index
Any ideas?
I think this was a bug introduced in 0.10, namely issue #2605,
"AssertionError when using apply after GroupBy". It's since been fixed.
You can either wait for the 0.10.1 release, which shouldn't be too long from now, or you can upgrade to the development version (either via git or simply by downloading the zip of master.)
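For anyone running this example on a current pandas, where sort_index no longer accepts by=, here is a sketch of the same top-N-per-group idea using sort_values. The tiny names frame, the n parameter and group_keys=False are assumptions made for illustration; the real example reads the yob files instead:

```python
import pandas as pd

# Tiny stand-in for the concatenated baby-names data.
names = pd.DataFrame({
    "name": ["Mary", "Anna", "John", "Will", "Emma", "Liam"],
    "sex": ["F", "F", "M", "M", "F", "M"],
    "births": [7065, 2604, 9655, 9532, 2003, 5927],
    "year": [1880, 1880, 1880, 1880, 1881, 1881],
})

def get_tops(group, n=1):
    # sort_values(by=...) replaces the old sort_index(by=...) spelling.
    return group.sort_values(by="births", ascending=False)[:n]

# group_keys=False keeps the original row index instead of adding
# the (year, sex) keys as extra index levels.
top = names.groupby(["year", "sex"], group_keys=False).apply(get_tops)
print(sorted(top["name"]))
```

Depending on the pandas version, apply may warn about, or leave out, the grouping columns in each group, so the sketch only relies on the name and births columns.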