Zoom in on np.fft2 result - numpy

Is there a way to chose the x/y output axes range from np.fft2 ?
I have a piece of code computing the diffraction pattern of an aperture. The aperture is defined in a 2k x 2k pixel array. The diffraction pattern is basically the inner part of the 2D FT of the aperture. The np.fft2 gives me an output array same size of the input but with some preset range of the x/y axes. Of course I can zoom in by using the image viewer, but I have already lost detail. What is the solution?
Thanks,
Gert
import numpy as np
import matplotlib.pyplot as plt
r= 500
s= 1000
y,x = np.ogrid[-s:s+1, -s:s+1]
mask = x*x + y*y <= r*r
aperture = np.ones((2*s+1, 2*s+1))
aperture[mask] = 0
plt.imshow(aperture)
plt.show()
ffta= np.fft.fft2(aperture)
plt.imshow(np.log(np.abs(np.fft.fftshift(ffta))**2))
plt.show()

Unfortunately, much of the speed and accuracy of the FFT come from the outputs being the same size as the input.
The conventional way to increase the apparent resolution in the output Fourier domain is by zero-padding the input: np.fft.fft2(aperture, [4 * (2*s+1), 4 * (2*s+1)]) tells the FFT to pad your input to be 4 * (2*s+1) pixels tall and wide, i.e., make the input four times larger (sixteen times the number of pixels).
Begin aside I say "apparent" resolution because the actual amount of data you have hasn't increased, but the Fourier transform will appear smoother because zero-padding in the input domain causes the Fourier transform to interpolate the output. In the example above, any feature that could be seen with one pixel will be shown with four pixels. Just to make this fully concrete, this example shows that every fourth pixel of the zero-padded FFT is numerically the same as every pixel of the original unpadded FFT:
# Generate your `ffta` as above, then
N = 2 * s + 1
Up = 4
fftup = np.fft.fft2(aperture, [Up * N, Up * N])
relerr = lambda dirt, gold: np.abs((dirt - gold) / gold)
print(np.max(relerr(fftup[::Up, ::Up] , ffta))) # ~6e-12.
(That relerr is just a simple relative error, which you want to be close to machine precision, around 2e-16. The largest error between every 4th sample of the zero-padded FFT and the unpadded FFT is 6e-12 which is quite close to machine precision, meaning these two arrays are nearly numerically equivalent.) End aside
Zero-padding is the most straightforward way around your problem. But it does cost you a lot of memory. And it is frustrating because you might only care about a tiny, tiny part of the transform. There's an algorithm called the chirp z-transform (CZT, or colloquially the "zoom FFT") which can do this. If your input is N (for you 2*s+1) and you want just M samples of the FFT's output evaluated anywhere, it will compute three Fourier transforms of size N + M - 1 to obtain the desired M samples of the output. This would solve your problem too, since you can ask for M samples in the region of interest, and it wouldn't require prohibitively-much memory, though it would need at least 3x more CPU time. The downside is that a solid implementation of CZT isn't in Numpy/Scipy yet: see the scipy issue and the code it references. Matlab's CZT seems reliable, if that's an option; Octave-forge has one too and the Octave people usually try hard to match/exceed Matlab.
But if you have the memory, zero-padding the input is the way to go.

Related

How to calculate a very large correlation matrix

I have an np.array of observations z where z.shape is (100000, 60). I want to efficiently calculate the 100000x100000 correlation matrix and then write to disk the coordinates and values of just those elements > 0.95 (this is a very small fraction of the total).
My brute-force version of this looks like the following but is, not surprisingly, very slow:
for i1 in range(z.shape[0]):
for i2 in range(i1+1):
r = np.corrcoef(z[i1,:],z[i2,:])[0,1]
if r > 0.95:
file.write("%6d %6d %.3f\n" % (i1,i2,r))
I realize that the correlation matrix itself could be calculated much more efficiently in one operation using np.corrcoef(z), but the memory requirement is then huge. I'm also aware that one could break up the data set into blocks and calculate bite-size subportions of the correlation matrix at one time, but programming that and keeping track of the indices seems unnecessarily complicated.
Is there another way (e.g., using memmap or pytables) that is both simple to code and doesn't put excessive demands on physical memory?
After experimenting with the memmap solution proposed by others, I found that while it was faster than my original approach (which took about 4 days on my Macbook), it still took a very long time (at least a day) -- presumably due to inefficient element-by-element writes to the outputfile. That wasn't acceptable given my need to run the calculation numerous times.
In the end, the best solution (for me) was to sign in to Amazon Web Services EC2 portal, create a virtual machine instance (starting with an Anaconda Python-equipped image) with 120+ GiB of RAM, upload the input data file, and do the calculation (using the matrix multiplication method) entirely in core memory. It completed in about two minutes!
For reference, the code I used was basically this:
import numpy as np
import pickle
import h5py
# read nparray, dimensions (102000, 60)
infile = open(r'file.dat', 'rb')
x = pickle.load(infile)
infile.close()
# z-normalize the data -- first compute means and standard deviations
xave = np.average(x,axis=1)
xstd = np.std(x,axis=1)
# transpose for the sake of broadcasting (doesn't seem to work otherwise!)
ztrans = x.T - xave
ztrans /= xstd
# transpose back
z = ztrans.T
# compute correlation matrix - shape = (102000, 102000)
arr = np.matmul(z, z.T)
arr /= z.shape[0]
# output to HDF5 file
with h5py.File('correlation_matrix.h5', 'w') as hf:
hf.create_dataset("correlation", data=arr)
From my rough calculations, you want a correlation matrix that has 100,000^2 elements. That takes up around 40 GB of memory, assuming floats.
That probably won't fit in computer memory, otherwise you could just use corrcoef.
There's a fancy approach based on eigenvectors that I can't find right now, and that gets into the (necessarily) complicated category...
Instead, rely on the fact that for zero mean data the covariance can be found using a dot product.
z0 = z - mean(z, 1)[:, None]
cov = dot(z0, z0.T)
cov /= z.shape[-1]
And this can be turned into the correlation by normalizing by the variances
sigma = std(z, 1)
corr = cov
corr /= sigma
corr /= sigma[:, None]
Of course memory usage is still an issue.
You can work around this with memory mapped arrays (make sure it's opened for reading and writing) and the out parameter of dot (For another example see Optimizing my large data code with little RAM)
N = z.shape[0]
arr = np.memmap('corr_memmap.dat', dtype='float32', mode='w+', shape=(N,N))
dot(z0, z0.T, out=arr)
arr /= sigma
arr /= sigma[:, None]
Then you can loop through the resulting array and find the indices with a large correlation coefficient. (You may be able to find them directly with where(arr > 0.95), but the comparison will create a very large boolean array which may or may not fit in memory).
You can use scipy.spatial.distance.pdist with metric = correlation to get all the correlations without the symmetric terms. Unfortunately this will still leave you with about 5e10 terms that will probably overflow your memory.
You could try reformulating a KDTree (which can theoretically handle cosine distance, and therefore correlation distance) to filter for higher correlations, but with 60 dimensions it's unlikely that would give you much speedup. The curse of dimensionality sucks.
You best bet is probably brute forcing blocks of data using scipy.spatial.distance.cdist(..., metric = correlation), and then keep only the high correlations in each block. Once you know how big a block your memory can handle without slowing down due to your computer's memory architecture it should be much faster than doing one at a time.
please check out deepgraph package.
https://deepgraph.readthedocs.io/en/latest/tutorials/pairwise_correlations.html
I tried on z.shape = (2500, 60) and pearsonr for 2500 * 2500. It has an extreme fast speed.
Not sure for 100000 x 100000 but worth trying.

Power spectrum incorrectly yielding negative values

I have a real signal in time given by:
And I am simply trying to compute its power spectrum, which is the Fourier transform of the autocorrelation of the signal, and is also a purely real and positive quantity in this case. To do this, I simply write:
import numpy as np
from scipy.fftpack import fft, arange, rfftfreq, rfft
from pylab import *
lags1, c1, line1, b1 = acorr(((Y_DATA)), usevlines=False, normed=True, maxlags=3998, lw=2)
Power_spectrum = (fft(np.real(c1)))
freqs = np.fft.fftfreq(len(c1), dx)
plt.plot(freqs,Power_spectrum)
plt.xlabel('f (Hz)')
plt.xlim([-20000,20000])
plt.show()
But the output gives:
which has negative-valued output. Although if I simply take the absolute value of the data on the y-axis and plot it (i.e. np.abs(Power_spectrum)), then the output is:
which is exactly what I expect. Although why is this only fixed by taking the absolute value of my power spectrum? I checked my autocorrelation and plotted it—it seems to be working as expected and matches what others have computed.
Although what appears odd is the next step when I take the FFT. The FFT function outputs negative values which is contrary to the theory discussed in the link above and I don't quite understand why. Any thoughts on what is going wrong?
The power spectrum is the FFT of the autocorrelation, but that's not an efficient way to calculate it.
The autocorrelation is probably calculated with an FFT and iFFT, anyway.
The power spectrum is also just the squared magnitude of the FFT coefficients.
Do that instead so that the total work will be one FFT instead of 3.
An fft produces a complex result (real and imaginary components to represent both magnitude and phase of the spectrum). You have to take the (squared) magnitude of the complex vector to get the power spectrum.

Bad result plotting windowing FFT

im playing with python and scipy to understand windowing, i made a plot to see how windowing behave under FFT, but the result is not what i was specting.
the plot is:
the middle plots are pure FFT plot, here is where i get weird things.
Then i changed the trig. function to get leak, putting a 1 straight for the 300 first items of the array, the result:
the code:
sign_freq=80
sample_freq=3000
num=np.linspace(0,1,num=sample_freq)
i=0
#wave data:
sin=np.sin(2*pi*num*sign_freq)+np.sin(2*pi*num*sign_freq*2)
while i<1000:
sin[i]=1
i=i+1
#wave fft:
fft_sin=np.fft.fft(sin)
fft_freq_axis=np.fft.fftfreq(len(num),d=1/sample_freq)
#wave Linear Spectrum (Rms)
lin_spec=sqrt(2)*np.abs(np.fft.rfft(sin))/len(num)
lin_spec_freq_axis=np.fft.rfftfreq(len(num),d=1/sample_freq)
#window data:
hann=np.hanning(len(num))
#window fft:
fft_hann=np.fft.fft(hann)
#window fft Linear Spectrum:
wlin_spec=sqrt(2)*np.abs(np.fft.rfft(hann))/len(num)
#window + sin
wsin=hann*sin
#window + sin fft:
wsin_spec=sqrt(2)*np.abs(np.fft.rfft(wsin))/len(num)
wsin_spec_freq_axis=np.fft.rfftfreq(len(num),d=1/sample_freq)
fig=plt.figure()
ax1 = fig.add_subplot(431)
ax2 = fig.add_subplot(432)
ax3 = fig.add_subplot(433)
ax4 = fig.add_subplot(434)
ax5 = fig.add_subplot(435)
ax6 = fig.add_subplot(436)
ax7 = fig.add_subplot(413)
ax8 = fig.add_subplot(414)
ax1.plot(num,sin,'r')
ax2.plot(fft_freq_axis,abs(fft_sin),'r')
ax3.plot(lin_spec_freq_axis,lin_spec,'r')
ax4.plot(num,hann,'b')
ax5.plot(fft_freq_axis,fft_hann)
ax6.plot(lin_spec_freq_axis,wlin_spec)
ax7.plot(num,wsin,'c')
ax8.plot(wsin_spec_freq_axis,wsin_spec)
plt.show()
EDIT: as asked in the comments, i plotted the functions in dB scale, obtaining much clearer plots. Thanks a lot #SleuthEye !
It appears the plot which is problematic is the one generated by:
ax5.plot(fft_freq_axis,fft_hann)
resulting in the graph:
instead of the expected graph from Wikipedia.
There are a number of issues with the way the plot is constructed. The first is that this command essentially attempts to plot a complex-valued array (fft_hann). You may in fact be getting the warning ComplexWarning: Casting complex values to real discards the imaginary part as a result. To generate a graph which looks like the one from Wikipedia, you would have to take the magnitude (instead of the real part) with:
ax5.plot(fft_freq_axis,abs(fft_hann))
Then we notice that there is still a line striking through our plot. Looking at np.fft.fft's documentation:
The values in the result follow so-called “standard” order: If A = fft(a, n), then A[0] contains the zero-frequency term (the sum of the signal), which is always purely real for real inputs. Then A[1:n/2] contains the positive-frequency terms, and A[n/2+1:] contains the negative-frequency terms, in order of decreasingly negative frequency.
[...]
The routine np.fft.fftfreq(n) returns an array giving the frequencies of corresponding elements in the output.
Indeed, if we print the fft_freq_axis we can see that the result is:
[ 0. 1. 2. ..., -3. -2. -1.]
To get around this problem we simply need to swap the lower and upper parts of the arrays with np.fft.fftshift:
ax5.plot(np.fft.fftshift(fft_freq_axis),np.fft.fftshift(abs(fft_hann)))
Then you should note that the graph on Wikipedia is actually shown with amplitudes in decibels. You would then need to do the same with:
ax5.plot(np.fft.fftshift(fft_freq_axis),np.fft.fftshift(20*np.log10(abs(fft_hann))))
We should then be getting closer, but the result is not quite the same as can be seen from the following figure:
This is due to the fact that the plot on Wikipedia actually has a higher frequency resolution and captures the value of the frequency spectrum as its oscillates, whereas your plot samples the spectrum at fewer points and a lot of those points have near zero amplitudes. To resolve this problem, we need to get the frequency spectrum of the window at more frequency points.
This can be done by zero padding the input to the FFT, or more simply setting the parameter n (desired length of the output) to a value much larger than the input size:
N = 8*len(num)
fft_freq_axis=np.fft.fftfreq(N,d=1/sample_freq)
fft_hann=np.fft.fft(hann, N)
ax5.plot(np.fft.fftshift(fft_freq_axis),np.fft.fftshift(20*np.log10(abs(fft_hann))))
ax5.set_xlim([-40, 40])
ax5.set_ylim([-50, 80])

2D DCT coeficients meaning of a gray scale image

what are the DCT coefficients mean. And what is the difference between a positive and a negative DCT's coefficient for example coeficient 5 and -5.
Thanks
The DCT is simply a 1-to-1 transformation of the data.
Suppose you have a set of blueprints on paper. You scan them in. Once scanned they are crooked. You use Photoshop or something like it to rotate the image to its aligned to the edges and easier to work with.
The DCT is like a rotation in that it simply makes the image data easier to work with. I have to say that a lot of books make this confusing by adding spectral analysis mumbo-jumbo.
Desirable attributes of the DCT for this purpose are:
That it is a transformation to an orthonormal basis set. If D is the DCT transformation matrix, X is the input and Y is the output so that
X D = Y
Then there is an inverse matrix Q that gives:
Y Q = X
And Q is the transpose of D.
Therefore, it is just as easy to go forwards as it is to go backwards with the DCT.
The DCT transformation tends to concentrate the most important image data in one corner of the output matrix. The data at the opposite corner tends to be discardable without noticeably affecting photographic images.
As to your other question, the JPEG input pixels are translated to the range -127 to 128. Your starting values usually have negative values to it's no surprise that you get negative output values. Even if you did have all positive input values you could still get negative output values. There is no real significance between positive and negative values.

Faster way to perform point-wise interplation of numpy array?

I have a 3D datacube, with two spatial dimensions and the third being a multi-band spectrum at each point of the 2D image.
H[x, y, bands]
Given a wavelength (or band number), I would like to extract the 2D image corresponding to that wavelength. This would be simply an array slice like H[:,:,bnd]. Similarly, given a spatial location (i,j) the spectrum at that location is H[i,j].
I would also like to 'smooth' the image spectrally, to counter low-light noise in the spectra. That is for band bnd, I choose a window of size wind and fit a n-degree polynomial to the spectrum in that window. With polyfit and polyval I can find the fitted spectral value at that point for band bnd.
Now, if I want the whole image of bnd from the fitted value, then I have to perform this windowed-fitting at each (i,j) of the image. I also want the 2nd-derivative image of bnd, that is, the value of the 2nd-derivative of the fitted spectrum at each point.
Running over the points, I could polyfit-polyval-polyder each of the x*y spectra. While this works, this is a point-wise operation. Is there some pytho-numponic way to do this faster?
If you do least-squares polynomial fitting to points (x+dx[i],y[i]) for a fixed set of dx and then evaluate the resulting polynomial at x, the result is a (fixed) linear combination of the y[i]. The same is true for the derivatives of the polynomial. So you just need a linear combination of the slices. Look up "Savitzky-Golay filters".
EDITED to add a brief example of how S-G filters work. I haven't checked any of the details and you should therefore not rely on it to be correct.
So, suppose you take a filter of width 5 and degree 2. That is, for each band (ignoring, for the moment, ones at the start and end) we'll take that one and the two on either side, fit a quadratic curve, and look at its value in the middle.
So, if f(x) ~= ax^2+bx+c and f(-2),f(-1),f(0),f(1),f(2) = p,q,r,s,t then we want 4a-2b+c ~= p, a-b+c ~= q, etc. Least-squares fitting means minimizing (4a-2b+c-p)^2 + (a-b+c-q)^2 + (c-r)^2 + (a+b+c-s)^2 + (4a+2b+c-t)^2, which means (taking partial derivatives w.r.t. a,b,c):
4(4a-2b+c-p)+(a-b+c-q)+(a+b+c-s)+4(4a+2b+c-t)=0
-2(4a-2b+c-p)-(a-b+c-q)+(a+b+c-s)+2(4a+2b+c-t)=0
(4a-2b+c-p)+(a-b+c-q)+(c-r)+(a+b+c-s)+(4a+2b+c-t)=0
or, simplifying,
22a+10c = 4p+q+s+4t
10b = -2p-q+s+2t
10a+5c = p+q+r+s+t
so a,b,c = p-q/2-r-s/2+t, (2(t-p)+(s-q))/10, (p+q+r+s+t)/5-(2p-q-2r-s+2t).
And of course c is the value of the fitted polynomial at 0, and therefore is the smoothed value we want. So for each spatial position, we have a vector of input spectral data, from which we compute the smoothed spectral data by multiplying by a matrix whose rows (apart from the first and last couple) look like [0 ... 0 -9/5 4/5 11/5 4/5 -9/5 0 ... 0], with the central 11/5 on the main diagonal of the matrix.
So you could do a matrix multiplication for each spatial position; but since it's the same matrix everywhere you can do it with a single call to tensordot. So if S contains the matrix I just described (er, wait, no, the transpose of the matrix I just described) and A is your 3-dimensional data cube, your spectrally-smoothed data cube would be numpy.tensordot(A,S).
This would be a good point at which to repeat my warning: I haven't checked any of the details in the few paragraphs above, which are just meant to give an indication of how it all works and why you can do the whole thing in a single linear-algebra operation.