np.fft.fft off by a factor of 1000 (fitting an powerspectrum) - numpy

I'm trying to make a powerspectrum from an experimental dataset which I am reading in, and then to fit it to an theoretical curve. Now everything is working fine and I'm not getting errors, except for the fact that my curve keeps differing by a factor of 1000 from the data and I have absolutely no idea what the problem could be. I've asked a few people, but to no avail. (I hope that you guys will be able to help)
Anyways, I'm pretty sure that its not the units, as they were tripple checked by me and 2 others. Basically, I need to fit a powerspectrum to an equation by using the least squares method.
I can't post the whole code, as its rather long and a bit messy, but this is the fourier part, I added comments to all arrays and vars which have not been declared in the code)
#Calculate stuff
Nm = 10**-6 #micro to meter
KbT = 4.10E-21 #Joule
T = 297. #K
l = zvalue*Nm #meter
meany = np.mean(cleandatay*Nm) #meter (cleandata is the array that I read in from a cvs at the start.)
SDy = sum((cleandatay*Nm - meany)**2)/len(cleandatay) #meter^2
FmArray[0][i] = ((KbT*l)/SDy) #N
#print FmArray[0][i]
print float((i*100/len(filelist)))#how many % done?
#fourier
dt = cleant[1]-cleant[0] #timestep
N = len(cleandatay) #Same for cleant, its the corresponding time to cleandatay
Here is where the fourier part starts, I take the fft and turn it into a powerspectrum. Then I calculate the corresponding freq steps with the array freqs
fouriery = np.fft.fft((cleandatay*(10**-6)))
fourierpower = (np.abs(fouriery))**2
fourierpower = fourierpower[1:N/2] #remove 0th datapoint and /2 (remove negative freqs)
fourierpower = fourierpower*dt #*dt to account for steps
freqs = (1.+np.arange((N/2)-1.))/50.
#Least squares method
eta = 8.9E-4 #pa*s
Rbead = 0.5E-6#meter
constant = 2*KbT/(3*eta*pi*Rbead)
omega = 2*pi*freqs #rad/s
Wcarray = 2.*pi*np.arange(0,30, 0.02003) #0.02 = 30/len(freqs)
ChiSq = np.zeros(len(Wcarray))
for k in range(0, len(Wcarray)):
Py = (constant / (Wcarray[k]**2 + omega**2))
ChiSq[k] = sum((fourierpower - Py)**2)
pylab.loglog(omega, Py)
print k*100/len(Wcarray)
index = np.where(ChiSq == min(ChiSq))
cutoffw = Wcarray[index]
Pygoed = (constant / (Wcarray[index]**2 + omega**2))
print cutoffw
print constant
print min(ChiSq)
pylab.loglog(omega,ChiSq)
So I have no idea what could be going wrong, I think its the fft, as nothing else can really go wrong.
Below is the pic I get when I plot all the fit lines against the spectrum, as you can see it is off by about 1000 (actually exactly 1000, as this leaves a least square residue of 10^-22, but I can't just randomly multiply without knowing why)
Just to elaborate on the picture. The green dots are the fft spectrum, the lines are the fits, the red dot is where it thinks the cutoff frequency is, and the blue line is the chi-squared fit, looking for the lowest value.

Take a look at the documentation for the FFT that you are using. Many FFTs introduce a scaling factor that is usually N * result (number of samples). Multiplying by 1/N will scale the results back in line. (You said that the result is 1000 too high....could it be that you are using a 1024 size FFT?)

Your library FFT routine might include a scale factor of 1/sqrt(n).
Check the documentation for the fft you used, as the proportion of the scale factor allocated between the fft and the ifft is arbitrary.

Related

Is it possible to compute the sign of a permutation in linear time?

I was just wondering if there's a way to compute the sign of a permutation within linear (or at least better than n^2?) time
For example, let's say I have an array of n numbers and I permute two elements within this array which would flip the sign of the permutation. I have a function that can compute this in n^2 time, however, it seems there might be a more efficient algorithm.
I've attached a minimal reproducible example of computing in quadratic time,
import numpy as np
vals = np.arange(1,6,1)
pvals = np.arange(1,6,1)
pvals[0], pvals[1] = pvals[1], pvals[0] #swap
def quadratic(vals):
sgn_matrix = np.sign(np.expand_dims(vals, -1) - np.expand_dims(vals, -2))
return np.prod(np.tril(np.ones_like(sgn_matrix)) + np.triu(sgn_matrix, 1))
def sub_quadratic(vals):
#algorithm quicker than quadratic time?
sgn = quadratic(vals)
print(sgn) #prints +1
psgn = quadratic(pvals)
print(psgn) #prints -1 (because one permutation)
I have had a look around SO (here for example) and people keep talking about cyclic permutations which apparently can compute in linear time but it's something I'm unaware of completely and can't find much of myself.
TL;DR Does anyone know of a method for computing the sign of a permutation in sub-quadratic time ?
Just decompose it into transpositions and check whether you needed an even or odd number of transpositions:
def permutation_sign(perm):
parity = 1
perm = perm.copy()
for i in range(len(perm)):
while perm[i] != i+1:
parity *= -1
j = perm[i] - 1
# Note: if you try to inline the j computation into the next line,
# you'll get evaluation order bugs.
perm[i], perm[j] = perm[j], perm[i]
return parity

numpy.cov or numpy.linalg.eigvals gives wrong results

I have high (100) dimensional data. I want to get the eigenvectors of the covariance matrix of the data.
Cov = numpy.cov(data)
EVs = numpy.linalg.eigvals(Cov)
I get a vector containing some eigenvalues which are complex numbers. This is mathematically impossible. Granted, the imaginary parts of the complex numbers are very small but it still causes issues later on. Is this a numerical issue? If so, does the issue lie with cov, eigvals function or both?
To give more color on that, I did the same calculation in Mathematica which gives, of course, a correct result. Turns out there are some eigenvalues which are very close to zero but not quiet zero and numpy gets all of these wrong (magnitude wise and it makes some of them into complex numbers)
I was facing a similar issue: np.linalg.eigvals was returning a complex vector in which the imaginary part was quasi-zero everywhere.
Using np.linalg.eigvalsh instead fixed it for me.
I don't know the exact reason, but most probably it is a numerical issue and eigvalsh seems to handle it whereas eigvals doesn't. Note that the ordering of the actual eigenvalues may differ.
The following snippet illustrates the fix:
import numpy as np
from numpy.linalg import eigvalsh, eigvals
D = 10
MUL = 100
EPS = 1e-8
x = np.random.rand(1, D) * MUL
x -= x.mean()
S = np.matmul(x.T, x) + I
# adding epsilon*I avoids negative eigenvalues due to numerical error
# since the matrix is actually positive semidef. (useful for cholesky etc)
S += np.eye(D, dtype=np.float64) * EPS
print(sorted(eigvalsh(S)))
print(sorted(eigvals(S)))

Zoom in on np.fft2 result

Is there a way to chose the x/y output axes range from np.fft2 ?
I have a piece of code computing the diffraction pattern of an aperture. The aperture is defined in a 2k x 2k pixel array. The diffraction pattern is basically the inner part of the 2D FT of the aperture. The np.fft2 gives me an output array same size of the input but with some preset range of the x/y axes. Of course I can zoom in by using the image viewer, but I have already lost detail. What is the solution?
Thanks,
Gert
import numpy as np
import matplotlib.pyplot as plt
r= 500
s= 1000
y,x = np.ogrid[-s:s+1, -s:s+1]
mask = x*x + y*y <= r*r
aperture = np.ones((2*s+1, 2*s+1))
aperture[mask] = 0
plt.imshow(aperture)
plt.show()
ffta= np.fft.fft2(aperture)
plt.imshow(np.log(np.abs(np.fft.fftshift(ffta))**2))
plt.show()
Unfortunately, much of the speed and accuracy of the FFT come from the outputs being the same size as the input.
The conventional way to increase the apparent resolution in the output Fourier domain is by zero-padding the input: np.fft.fft2(aperture, [4 * (2*s+1), 4 * (2*s+1)]) tells the FFT to pad your input to be 4 * (2*s+1) pixels tall and wide, i.e., make the input four times larger (sixteen times the number of pixels).
Begin aside I say "apparent" resolution because the actual amount of data you have hasn't increased, but the Fourier transform will appear smoother because zero-padding in the input domain causes the Fourier transform to interpolate the output. In the example above, any feature that could be seen with one pixel will be shown with four pixels. Just to make this fully concrete, this example shows that every fourth pixel of the zero-padded FFT is numerically the same as every pixel of the original unpadded FFT:
# Generate your `ffta` as above, then
N = 2 * s + 1
Up = 4
fftup = np.fft.fft2(aperture, [Up * N, Up * N])
relerr = lambda dirt, gold: np.abs((dirt - gold) / gold)
print(np.max(relerr(fftup[::Up, ::Up] , ffta))) # ~6e-12.
(That relerr is just a simple relative error, which you want to be close to machine precision, around 2e-16. The largest error between every 4th sample of the zero-padded FFT and the unpadded FFT is 6e-12 which is quite close to machine precision, meaning these two arrays are nearly numerically equivalent.) End aside
Zero-padding is the most straightforward way around your problem. But it does cost you a lot of memory. And it is frustrating because you might only care about a tiny, tiny part of the transform. There's an algorithm called the chirp z-transform (CZT, or colloquially the "zoom FFT") which can do this. If your input is N (for you 2*s+1) and you want just M samples of the FFT's output evaluated anywhere, it will compute three Fourier transforms of size N + M - 1 to obtain the desired M samples of the output. This would solve your problem too, since you can ask for M samples in the region of interest, and it wouldn't require prohibitively-much memory, though it would need at least 3x more CPU time. The downside is that a solid implementation of CZT isn't in Numpy/Scipy yet: see the scipy issue and the code it references. Matlab's CZT seems reliable, if that's an option; Octave-forge has one too and the Octave people usually try hard to match/exceed Matlab.
But if you have the memory, zero-padding the input is the way to go.

comparing two frequency spectra

I'm trying to compare two frequency spectra but I am confused over a number of points.
One device samples at 40 Hz the other at 100 Hz and so I'm not sure if I need to take this into account. Anyway I have produced frequency spectra from both devices and now I wish to compare these. How can I do correlation at each point so that I get pearson correlations at each point. I know how to do an overall one of course but I want to see points of strong correlation and those less strong?
If you are calculating power spectral densities P(f), then it doesn't matter how your original signal x(t) is sampled. You can directly and quantitatuively compare both spectra. To make sure that you have calculated the spectral densities you can explicitly check Parsevals theorem:
$ \int P(f) df = \int x(t)^2 dt $
Of course you have to think about which frequencies are actually evaluated Remember that a fft gives you frequencies from f = 1/T until or below the Nyquist frequency f_ny = 1/(2 dt) depending on the number of samples in x(t) being even or odd.
Here's a python example code for psd
def psd(x,dt=1.):
"""Computes one-sided power spectral density of x.
PSD estimated via abs**2 of Fourier transform of x
Takes care of even or odd number of elements in x:
- if x is even both f=0 and Nyquist freq. appear once
- if x is odd f=0 appears once and Nyquist freq. does not appear
Note that there are no tapers applied: This may lead to leakage!
Parseval's theorem (Variance of time series equal to integral over PSD) holds and can be checked via
print ( np.var(x), sum(Px*f[1]) )
Accordingly, the etsimated PSD is independent of time series length
Author/date: M. von Papen / 16.03.2017
"""
N = np.size(x)
xf = np.fft.fft(x)
Px = abs(xf)**2./N*dt
f = np.arange(N/2+1)/(N*dt)
if np.mod(N,2) == 0:
Px[1:N/2] = 2.*Px[1:N/2]
else:
Px[1:N/2+1] = 2.*Px[1:N/2+1]
# Take one-sided spectrum
Px = Px[0:N/2+1]
return Px, f

How can DWT be used in LSB substitution steganography

In steganography, the least significant bit (LSB) substitution method embeds the secret bits in the place of bits from the cover medium, for example, image pixels. In some methods, the Discrete Wavelet Transform (DWT) of the image is taken and the secret bits are embedded in the DWT coefficients, after which the inverse trasform is used to reconstruct the stego image.
However, the DWT produces float coefficients and for the LSB substitution method integer values are required. Most papers I've read use the 2D Haar Wavelet, yet, they aren't clear on their methodology. I've seen the transform being defined in terms of low and high pass filters (float transforms), or taking the sum and difference of pair values, or the average and mean difference, etc.
More explicitly, either in the forward or the inverse transform (but not necessarily in both depending on the formulas used) eventually float numbers will appear. I can't have them for the coefficients because the substitution won't work and I can't have them for the reconstructed pixels because the image requires integer values for storage.
For example, let's consider a pair of pixels, A and B as a 1D array. The low frequency coefficient is defined by the sum, i.e., s = A + B, and the high frequency coefficient by the difference, i.e., d = A - B. We can then reconstruct the original pixels with B = (s - d) / 2 and A = s - B. However, after any bit twiddling with the coefficients, s - d may not be even anymore and float values will emerge for the reconstructed pixels.
For the 2D case, the 1D transform is applied separately for the rows and the columns, so eventually a division by 4 will occur somewhere. This can result in values with float remainders .00, .25, .50 and .75. I've only come across one paper which addresses this issue. The rest are very vague in their methodology and I struggle to replicate them. Yet, the DWT has been widely implemented for image steganography.
My question is, since some of the literature I've read hasn't been enlightening, how can this be possible? How can one use a transform which introduces float values, yet the whole steganography method requires integers?
One solution that has worked for me is using the Integer Wavelet Transform, which some also refer to as a lifting scheme. For the Haar wavelet, I've seen it defined as:
s = floor((A + B) / 2)
d = A - B
And for inverse:
A = s + floor((d + 1) / 2)
B = s - floor(d / 2)
All the values throughout the whole process are integers. The reason it works is because the formulas contain information about both the even and odd parts of the pixels/coefficients, so there is no loss of information from rounding down. Even if one modifies the coefficients and then takes the inverse transform, the reconstructed pixels will still be integers.
Example implementation in Python:
import numpy as np
def _iwt(array):
output = np.zeros_like(array)
nx, ny = array.shape
x = nx // 2
for j in xrange(ny):
output[0:x,j] = (array[0::2,j] + array[1::2,j])//2
output[x:nx,j] = array[0::2,j] - array[1::2,j]
return output
def _iiwt(array):
output = np.zeros_like(array)
nx, ny = array.shape
x = nx // 2
for j in xrange(ny):
output[0::2,j] = array[0:x,j] + (array[x:nx,j] + 1)//2
output[1::2,j] = output[0::2,j] - array[x:nx,j]
return output
def iwt2(array):
return _iwt(_iwt(array.astype(int)).T).T
def iiwt2(array):
return _iiwt(_iiwt(array.astype(int).T).T)
Some languages already have built-in functions for this purpose. For example, Matlab uses lwt2() and ilwt2() for 2D lifting-scheme wavelet transform.
els = {'p',[-0.125 0.125],0};
lshaarInt = liftwave('haar','int2int');
lsnewInt = addlift(lshaarInt,els);
[cAint,cHint,cVint,cDint] = lwt2(x,lsnewInt) % x is your image
xRecInt = ilwt2(cAint,cHint,cVint,cDint,lsnewInt);
An article example where IWT was used for image steganography is Raja, K.B. et. al (2008) Robust image adaptive steganography using integer wavelets.