I have a 3D datacube, with two spatial dimensions and the third being a multi-band spectrum at each point of the 2D image.
H[x, y, bands]
Given a wavelength (or band number), I would like to extract the 2D image corresponding to that wavelength. This would be simply an array slice like H[:,:,bnd]. Similarly, given a spatial location (i,j) the spectrum at that location is H[i,j].
I would also like to 'smooth' the image spectrally, to counter low-light noise in the spectra. That is, for band bnd, I choose a window of size wind and fit an n-degree polynomial to the spectrum in that window. With polyfit and polyval I can find the fitted spectral value at that point for band bnd.
Now, if I want the whole image of bnd from the fitted value, then I have to perform this windowed-fitting at each (i,j) of the image. I also want the 2nd-derivative image of bnd, that is, the value of the 2nd-derivative of the fitted spectrum at each point.
Running over the points, I could polyfit-polyval-polyder each of the x*y spectra. While this works, this is a point-wise operation. Is there some pytho-numponic way to do this faster?
If you do least-squares polynomial fitting to points (x+dx[i],y[i]) for a fixed set of dx and then evaluate the resulting polynomial at x, the result is a (fixed) linear combination of the y[i]. The same is true for the derivatives of the polynomial. So you just need a linear combination of the slices. Look up "Savitzky-Golay filters".
EDITED to add a brief example of how S-G filters work. I haven't checked any of the details and you should therefore not rely on it to be correct.
So, suppose you take a filter of width 5 and degree 2. That is, for each band (ignoring, for the moment, ones at the start and end) we'll take that one and the two on either side, fit a quadratic curve, and look at its value in the middle.
So, if f(x) ~= ax^2+bx+c and f(-2),f(-1),f(0),f(1),f(2) = p,q,r,s,t then we want 4a-2b+c ~= p, a-b+c ~= q, etc. Least-squares fitting means minimizing (4a-2b+c-p)^2 + (a-b+c-q)^2 + (c-r)^2 + (a+b+c-s)^2 + (4a+2b+c-t)^2, which means (taking partial derivatives w.r.t. a,b,c):
4(4a-2b+c-p)+(a-b+c-q)+(a+b+c-s)+4(4a+2b+c-t)=0
-2(4a-2b+c-p)-(a-b+c-q)+(a+b+c-s)+2(4a+2b+c-t)=0
(4a-2b+c-p)+(a-b+c-q)+(c-r)+(a+b+c-s)+(4a+2b+c-t)=0
or, simplifying,
34a+10c = 4p+q+s+4t
10b = -2p-q+s+2t
10a+5c = p+q+r+s+t
so a,b,c = (2p-q-2r-s+2t)/14, (2(t-p)+(s-q))/10, (p+q+r+s+t)/5-(2p-q-2r-s+2t)/7 = (-3p+12q+17r+12s-3t)/35.
And of course c is the value of the fitted polynomial at 0, and therefore is the smoothed value we want. Similarly, 2a = (2p-q-2r-s+2t)/7 is the second derivative of the fitted polynomial at 0. So for each spatial position, we have a vector of input spectral data, from which we compute the smoothed spectral data by multiplying by a matrix whose rows (apart from the first and last couple) look like [0 ... 0 -3/35 12/35 17/35 12/35 -3/35 0 ... 0], with the central 17/35 on the main diagonal of the matrix.
So you could do a matrix multiplication for each spatial position; but since it's the same matrix everywhere you can do it with a single call to tensordot. So if S contains the matrix I just described (er, wait, no, the transpose of the matrix I just described) and A is your 3-dimensional data cube, your spectrally-smoothed data cube would be numpy.tensordot(A, S, axes=1), which contracts the band axis of A with the first axis of S. (The default axes=2 would try to contract two axes and is not what you want here.)
This would be a good point at which to repeat my warning: I haven't checked any of the details in the few paragraphs above, which are just meant to give an indication of how it all works and why you can do the whole thing in a single linear-algebra operation.
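A minimal sketch of the idea above in NumPy, under some assumptions: the cube is laid out as H[x, y, bands], the window is odd-sized and lies fully inside the band range, and the helper names sg_weights and apply_at_band are made up for this illustration. The weights come from the pseudo-inverse of the Vandermonde matrix of the window offsets, so one weighted sum of band slices gives the fitted image and another gives the 2nd-derivative image.

```python
import numpy as np

def sg_weights(window, degree):
    """Weights giving the fitted value and the 2nd derivative at the window centre."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Columns are 1, x, x^2, ...; the pseudo-inverse maps samples to LSQ coefficients.
    V = np.vander(offsets, degree + 1, increasing=True)
    coeffs = np.linalg.pinv(V)      # shape (degree + 1, window)
    smooth = coeffs[0]              # constant term = polynomial value at offset 0
    second = 2.0 * coeffs[2]        # 2nd derivative at 0 = 2 * (x^2 coefficient)
    return smooth, second

def apply_at_band(H, bnd, weights):
    """Weighted sum of the band slices around bnd (assumes the window fits)."""
    half = len(weights) // 2
    window = H[:, :, bnd - half: bnd + half + 1]
    return np.tensordot(window, weights, axes=([2], [0]))

# Toy cube: every pixel has the same exactly quadratic spectrum.
bands = np.arange(9.0)
spectrum = 0.5 * bands**2 - 3.0 * bands + 1.0
H = np.tile(spectrum, (2, 3, 1))            # shape (2, 3, 9)

smooth_w, second_w = sg_weights(window=5, degree=2)
fitted_image = apply_at_band(H, 4, smooth_w)    # fitted value of band 4 at each pixel
curvature_image = apply_at_band(H, 4, second_w) # 2nd derivative of band 4 at each pixel
```

If SciPy is available, scipy.signal.savgol_filter(H, window_length, polyorder, deriv=2, axis=-1) applies the same kind of filter along the band axis for all bands at once, including its own handling of the edges.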
Related
If I have a 2-dimensional (x and y coordinates) polynomial transform function of 1st/affine, 2nd, or 3rd order (i.e. I have the coefficients/transformation matrix A), what is the mathematical or programmatic approach to getting the exact inverse of this function? Ideally, how would I implement this in Numpy? This is in the context of image warping or map georeferencing, i.e. transforming or warping the coordinates from an input image to an output image in a new warped coordinate system.
Attempted Solution
To solve this I have tried a matrix algebra approach for solving sets of equations. Mathematically, the transformation procedure is represented as Au = v. Forward transforming is easy: you calculate u as a column matrix containing the terms of the polynomial equation based on your input coordinates, and then matrix-multiply the transformation matrix A with u, in order to get the transformed output column matrix v containing the output coordinates. Backwards transforming, on the other hand, means we know the output coordinates v and want to find the input coordinates u, so we need to rearrange our equation as u = A^-1 v. By the rules of matrix algebra, the A matrix has to be inverted when moving it over. Implementing this in Numpy for a 2nd order polynomial transform, it does seem to work:
import numpy as np
# input coords
x = np.array([13])
y = np.array([13])
# terms of the 2nd order polynomial equation
x = x
y = y
xx = x*x
xy = x*y
yy = y*y
ones = np.ones(x.shape)
# u consists of each term in 2nd order polynomial equation
# with each term being array if want to transform multiple
u = np.array([xx,xy,yy,x,y,ones])
print('original input u', u)
## output:
## ('original input u', array([[169.],
## [169.],
## [169.],
## [ 13.],
## [ 13.],
## [ 1.]]))
# forward transform matrix
A = np.array([[1,2,3,1,6,8],
[5,2,9,2,0,1],
[8,1,5,8,4,3],
[1,4,8,2,3,9],
[9,3,2,1,9,5],
[4,2,5,6,2,1]])
# get forward coords
v = A.dot(u)
print('output v', v)
## output:
## ('output v', array([[1113.],
## [2731.],
## [2525.],
## [2271.],
## [2501.],
## [1964.]]))
# get backward coords (should exactly reproduce the input coords)
Ainv = np.linalg.inv(A)
u_pred = Ainv.dot(v)
print('backwards predicted input u', u_pred)
## output:
## ('backwards predicted input u', array([[169.],
## [169.],
## [169.],
## [ 13.],
## [ 13.],
## [ 1.]]))
In the above example the output v is actually a 6x1 matrix, where only the top two rows/values represent the transformed x and y coordinates. The problem is that we need all the additional values in v in order to exactly invert the coordinates. But in real-world scenarios we only know the transformed x and y values (i.e. the top two rows/values of v); we don't know the full 6x1 v matrix.
Maybe I'm thinking about this wrong, or maybe this matrix algebra approach is not the right approach, since 2nd order polynomials and higher are no longer linear? Any alternate programmatic/numpy approaches for inverting the polynomial transformation?
Some context
I've looked up many similar questions and websites as well as numpy functions such as numpy.polynomial.Polynomial.fit, but most of them relate only to inverting 1-dimensional polynomial transforms. The few links I've found that talk about 2-dimensional transforms say there is no exact way to invert them, which doesn't make sense since this is a very common operation in image warping/resampling and map georeferencing. For example, the steps for warping an image are often broken down into:
1. Forward project all original pixel (column-row) coordinates u using the transformation function/matrix A, in order to find the bounds of the transformed coordinate space v.
2. Then for every coordinate sampled at regular intervals in the transformed coordinate space bounds (found in step 1), backwards sample these v coordinates in the transformed coordinate system to find their original coordinates u. This determines which original pixels to sample for each location in the transformed image.
My problem then is that I have the forward transformation necessary for step 1, but I need to find the exact inverse of that transformation necessary for backwards sampling in step 2. Either a math answer or a numpy solution would be fine.
Inversion of a 2D affine function is pretty easy. It only requires solving a 2x2 linear system of equations.
The case of quadratic and cubic polynomials is much more problematic. If I am right, a system in two unknowns is equivalent to a single quartic or nonic (degree 9) polynomial equation. Explicit (though complicated) formulas exist for the quartic case, but none for the nonic case, and you will have to resort to numerical methods (Newton's iterations).
In addition, the solutions of these nonlinear equations are not unique (you can have 4 or 9 solutions) and you need to keep the right ones.
If your transformation remains close to affine (such as when correcting image distortion), I would suggest choosing an affine transformation that approximates the complete equation, using its inverse to find initial approximations, then refining with Newton.
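The affine-guess-then-Newton suggestion might look like the sketch below for a 2nd-order transform. It assumes the transform is stored as a 2x6 coefficient array over the terms [xx, xy, yy, x, y, 1] (the first two rows of the question's A), that the transform is close to affine so the iteration converges to the intended root, and the function names and example coefficients are invented for illustration.

```python
import numpy as np

def forward(coef, xy):
    """Map (x, y) to (x', y'); coef is 2x6 over the terms [xx, xy, yy, x, y, 1]."""
    x, y = xy
    terms = np.array([x * x, x * y, y * y, x, y, 1.0])
    return coef @ terms

def jacobian(coef, xy):
    """2x2 matrix of partial derivatives of (x', y') with respect to (x, y)."""
    x, y = xy
    d_dx = np.array([2 * x, y, 0.0, 1.0, 0.0, 0.0])   # d(terms)/dx
    d_dy = np.array([0.0, x, 2 * y, 0.0, 1.0, 0.0])   # d(terms)/dy
    return np.column_stack([coef @ d_dx, coef @ d_dy])

def invert(coef, target, tol=1e-10, max_iter=50):
    """Find (x, y) with forward(coef, (x, y)) == target by Newton's method."""
    # Initial guess: invert only the affine part (columns for x, y and 1).
    xy = np.linalg.solve(coef[:, 3:5], target - coef[:, 5])
    for _ in range(max_iter):
        residual = forward(coef, xy) - target
        if np.max(np.abs(residual)) < tol:
            break
        xy = xy - np.linalg.solve(jacobian(coef, xy), residual)
    return xy

# Hypothetical near-affine transform with small quadratic distortion terms.
coef = np.array([[1e-4, 2e-4, -1e-4, 1.00, 0.10, 5.0],
                 [-2e-4, 1e-4, 3e-4, -0.05, 1.10, -3.0]])
source = np.array([13.0, 13.0])
target = forward(coef, source)
recovered = invert(coef, target)
```

For a strongly nonlinear transform the starting guess matters, because Newton's method converges to whichever of the several roots is nearest in its own sense; checking that the recovered point lies inside the source image bounds is a cheap sanity test.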
What do the DCT coefficients mean? And what is the difference between a positive and a negative DCT coefficient, for example coefficient 5 and -5?
Thanks
The DCT is simply a 1-to-1 transformation of the data.
Suppose you have a set of blueprints on paper. You scan them in. Once scanned they are crooked. You use Photoshop or something like it to rotate the image so that it's aligned to the edges and easier to work with.
The DCT is like a rotation in that it simply makes the image data easier to work with. I have to say that a lot of books make this confusing by adding spectral analysis mumbo-jumbo.
Desirable attributes of the DCT for this purpose are:
That it is a transformation to an orthonormal basis set. If D is the DCT transformation matrix, X is the input and Y is the output so that
X D = Y
Then there is an inverse matrix Q that gives:
Y Q = X
And Q is the transpose of D.
Therefore, it is just as easy to go forwards as it is to go backwards with the DCT.
The DCT transformation tends to concentrate the most important image data in one corner of the output matrix. The data at the opposite corner tends to be discardable without noticeably affecting photographic images.
As to your other question, the JPEG input pixels are translated to the range -128 to 127. Your starting values usually include negative values, so it's no surprise that you get negative output values. Even if you did have all positive input values you could still get negative output values. There is no real significance to a value being positive rather than negative: a coefficient of -5 weights the same basis pattern as a coefficient of 5, just with that pattern inverted.
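The orthonormality claim and the point about signs can be checked with a small NumPy sketch. It assumes the orthonormal DCT-II in the row-vector convention X D = Y used above; the function name dct_matrix and the sample values are illustrative only.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix D, so that Y = X @ D and X = Y @ D.T."""
    samples = np.arange(n).reshape(-1, 1)   # input index (rows)
    freqs = np.arange(n).reshape(1, -1)     # coefficient index (columns)
    D = np.cos(np.pi * (2 * samples + 1) * freqs / (2 * n)) * np.sqrt(2.0 / n)
    D[:, 0] = np.sqrt(1.0 / n)              # the constant column has its own scaling
    return D

D = dct_matrix(8)
X = np.array([52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]) - 128.0  # level-shifted
Y = X @ D                  # forward transform
X_back = Y @ D.T           # inverse transform is just the transpose
Y_negated = (-X) @ D       # flipping the input flips the sign of every coefficient
```

Because the transform is linear, the sign of a coefficient only says whether its cosine pattern is added or subtracted when the block is rebuilt.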
Is there a way to choose the x/y output axes range from np.fft.fft2?
I have a piece of code computing the diffraction pattern of an aperture. The aperture is defined in a 2k x 2k pixel array. The diffraction pattern is basically the inner part of the 2D FT of the aperture. np.fft.fft2 gives me an output array the same size as the input but with some preset range of the x/y axes. Of course I can zoom in by using the image viewer, but I have already lost detail. What is the solution?
Thanks,
Gert
import numpy as np
import matplotlib.pyplot as plt
r= 500
s= 1000
y,x = np.ogrid[-s:s+1, -s:s+1]
mask = x*x + y*y <= r*r
aperture = np.ones((2*s+1, 2*s+1))
aperture[mask] = 0
plt.imshow(aperture)
plt.show()
ffta= np.fft.fft2(aperture)
plt.imshow(np.log(np.abs(np.fft.fftshift(ffta))**2))
plt.show()
Unfortunately, much of the speed and accuracy of the FFT come from the outputs being the same size as the input.
The conventional way to increase the apparent resolution in the output Fourier domain is by zero-padding the input: np.fft.fft2(aperture, [4 * (2*s+1), 4 * (2*s+1)]) tells the FFT to pad your input to be 4 * (2*s+1) pixels tall and wide, i.e., make the input four times larger (sixteen times the number of pixels).
Begin aside I say "apparent" resolution because the actual amount of data you have hasn't increased, but the Fourier transform will appear smoother because zero-padding in the input domain causes the Fourier transform to interpolate the output. In the example above, any feature that could be seen with one pixel will be shown with four pixels. Just to make this fully concrete, this example shows that every fourth pixel of the zero-padded FFT is numerically the same as every pixel of the original unpadded FFT:
# Generate your `ffta` as above, then
N = 2 * s + 1
Up = 4
fftup = np.fft.fft2(aperture, [Up * N, Up * N])
relerr = lambda dirt, gold: np.abs((dirt - gold) / gold)
print(np.max(relerr(fftup[::Up, ::Up] , ffta))) # ~6e-12.
(That relerr is just a simple relative error, which you want to be close to machine precision, around 2e-16. The largest error between every 4th sample of the zero-padded FFT and the unpadded FFT is 6e-12 which is quite close to machine precision, meaning these two arrays are nearly numerically equivalent.) End aside
Zero-padding is the most straightforward way around your problem. But it does cost you a lot of memory. And it is frustrating because you might only care about a tiny, tiny part of the transform. There's an algorithm called the chirp z-transform (CZT, or colloquially the "zoom FFT") which can do this. If your input is of size N (for you, 2*s+1) and you want just M samples of the FFT's output evaluated anywhere, it will compute three Fourier transforms of size N + M - 1 to obtain the desired M samples of the output. This would solve your problem too, since you can ask for M samples in the region of interest, and it wouldn't require a prohibitive amount of memory, though it would need at least 3x more CPU time. The downside is that a solid implementation of CZT was long missing from Numpy/Scipy (see the scipy issue and the code it references); SciPy 1.8 and later provide scipy.signal.czt and scipy.signal.zoom_fft. Matlab's CZT seems reliable, if that's an option; Octave-forge has one too and the Octave people usually try hard to match/exceed Matlab.
But if you have the memory, zero-padding the input is the way to go.
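As a rough sketch of the zero-padding route, the helper below pads, shifts the zero frequency to the centre, and keeps only the inner part of the pattern. The function name central_diffraction, the up and half_width parameters, and the small array size are assumptions chosen so the example runs quickly; the aperture construction follows the question's code.

```python
import numpy as np

def central_diffraction(aperture, up=4, half_width=32):
    """Power spectrum of the zero-padded aperture, cropped around zero frequency."""
    n_rows, n_cols = aperture.shape
    padded_shape = (up * n_rows, up * n_cols)
    # fft2 zero-pads the input to padded_shape before transforming.
    spectrum = np.fft.fftshift(np.fft.fft2(aperture, s=padded_shape))
    power = np.abs(spectrum) ** 2
    # After fftshift the zero-frequency bin sits at index n // 2 on each axis.
    cy, cx = padded_shape[0] // 2, padded_shape[1] // 2
    return power[cy - half_width: cy + half_width + 1,
                 cx - half_width: cx + half_width + 1]

# Smaller version of the question's aperture (s = 50 instead of 1000).
r, s = 25, 50
y, x = np.ogrid[-s:s + 1, -s:s + 1]
aperture = np.ones((2 * s + 1, 2 * s + 1))
aperture[x * x + y * y <= r * r] = 0

zoomed = central_diffraction(aperture, up=4, half_width=32)
```

The cropped array can be passed to plt.imshow (for example as np.log(zoomed + 1e-12) to avoid taking the log of zero); increasing up gives finer sampling of the same central region at the cost of memory.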
I'm looking for the most abundant frequency in a periodic signal.
I'm trying to understand what do I get if I perform a Fourier transformation on a periodic signal and filter for frequencies which have negative fft values.
In other words, what do the axes of plots 2 and 3 (see below) express? I'm plotting frequency (cycles/second) against the fft-transformed signal - what do negative values on the y axis mean, and would it make sense that I'd be interested in only those?
import numpy as np
import matplotlib.pyplot as plt
# generate data
time = np.linspace(0,120,4000)
acc = lambda t: 10*np.sin(2*np.pi*2.0*t) + 5*np.sin(2*np.pi*8.0*t) + 2*np.random.random(len(t))
signal = acc(time)
# get frequencies from decomposed fft
W = np.fft.fftfreq(signal.size, d=time[1]-time[0])
f_signal = np.fft.fft(signal)
# filter signal
# I'm getting only the "negative" part!
cut_f_signal = f_signal.copy()
# filter noisy frequencies
cut_f_signal[(W < 8.0)] = 0
cut_f_signal[(W > 8.2)] = 0
# inverse fourier to get filtered frequency
cut_signal = np.fft.ifft(cut_f_signal)
# plot
plt.subplot(221)
plt.plot(time,signal)
plt.subplot(222)
plt.plot(W, f_signal)
plt.subplot(223)
plt.plot(W, cut_f_signal)
plt.subplot(224)
plt.plot(time, cut_signal)
plt.show()
The FFT of a real-valued input signal will produce a conjugate symmetric result. (That's just the way the math works best.) So, for FFT result magnitudes only of real data, the negative frequencies are just mirrored duplicates of the positive frequencies, and can thus be ignored when analyzing the result.
However, if you want to do the inverse and compute the IFFT, you will need to feed the IFFT a conjugate symmetric negative half (or upper half, above Fs/2) of frequency data, or else your IFFT will produce a complex result (i.e. with non-zero imaginary (sqrt(-1)) components, rarely what one wants when dealing with base-band real data).
If you want to filter the FFT data and end up with real results from an IFFT, you will need to filter the positive and negative frequencies symmetrically to maintain the needed symmetry.
The FFT also produces a complex result, where the value and sign of the components (real and imaginary) of each result bin represent the phase as well as the magnitude of the component basis vector (complex sinusoid, or real cosine plus real sine components). Any negative value just represents a phase rotation relative to the same result with a positive value.
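A small sketch of the symmetry point, assuming a noise-free version of the question's signal and a pass band of 7.8 to 8.2 Hz (both chosen here for illustration): masking on the absolute frequency keeps the conjugate pairs together, while masking on the positive frequencies alone does not.

```python
import numpy as np

time = np.linspace(0, 120, 4000)
signal = 10 * np.sin(2 * np.pi * 2.0 * time) + 5 * np.sin(2 * np.pi * 8.0 * time)

W = np.fft.fftfreq(signal.size, d=time[1] - time[0])
f_signal = np.fft.fft(signal)

# Symmetric band-pass: keep the band at both positive and negative frequencies.
symmetric_mask = (np.abs(W) >= 7.8) & (np.abs(W) <= 8.2)
cut_signal = np.fft.ifft(np.where(symmetric_mask, f_signal, 0))

# One-sided band-pass: the negative-frequency partner bins are zeroed.
one_sided_mask = (W >= 7.8) & (W <= 8.2)
one_sided_signal = np.fft.ifft(np.where(one_sided_mask, f_signal, 0))

symmetric_imag = np.max(np.abs(cut_signal.imag))        # rounding noise only
one_sided_imag = np.max(np.abs(one_sided_signal.imag))  # clearly non-zero
```

With the symmetric mask, cut_signal.real is the filtered 8 Hz component; with the one-sided mask the result is complex and its real part has only half the amplitude.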
As @hotpaw2 already wrote in his comment above, the result of an FFT performed on a real signal in the time domain is a set of complex values in the frequency domain.
The input value f_signal of your plot command is a vector of complex values.
plt.subplot(222)
plt.plot(W, f_signal)
This results in meaningless output.
You should plot the absolute values of f_signal.
If you are interested in the phase you should plot the angle, too.
In Matlab this would look like this:
% Plot the absolute values of f_signal
plot(W, abs(f_signal));
% Plot the phase of f_signal
plot(W, unwrap(angle(f_signal)));
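A NumPy counterpart of the magnitude and phase plots, written as a sketch that assumes a noise-free version of the question's signal and uses the one-sided rfft since the input is real. It also picks out the strongest non-DC frequency, which is what the question was ultimately after.

```python
import numpy as np

time = np.linspace(0, 120, 4000)
signal = 10 * np.sin(2 * np.pi * 2.0 * time) + 5 * np.sin(2 * np.pi * 8.0 * time)

# One-sided spectrum: non-negative frequencies only.
freqs = np.fft.rfftfreq(signal.size, d=time[1] - time[0])
spectrum = np.fft.rfft(signal)

magnitude = np.abs(spectrum)              # what to plot instead of the raw complex values
phase = np.unwrap(np.angle(spectrum))     # phase in radians, unwrapped

# Strongest component, skipping the DC bin at index 0.
peak = np.argmax(magnitude[1:]) + 1
dominant_frequency = freqs[peak]
```

plt.plot(freqs, magnitude) and plt.plot(freqs, phase) then give the two plots; dominant_frequency comes out near 2 Hz for this signal, limited by the bin spacing of the transform.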
Suppose you have a list of 2D points with an orientation assigned to them. Let the set S be defined as:
S={ (x,y,a) | (x,y) is a 2D point, a is an orientation (an angle) }.
Given an element s of S, we will indicate with s_p the point part and with s_a the angle part. I would like to know if there exists an efficient data structure that, given a query point q, can return all the elements s in S such that
(dist(q_p, s_p) < threshold_1) AND (angle_diff(q_a, s_a) < threshold_2) (1)
where dist(p1,p2), with p1,p2 2D points, is the Euclidean distance, and angle_diff(a1,a2), with a1,a2 angles, is the difference between angles (taken to be the smallest one). The data structure should be efficient w.r.t. insertion/deletion of elements and the search as defined above. The number of vectors can grow up to 10,000 and more, but take this with a grain of salt.
Now suppose we change the above requirement: instead of using condition (1), we request all the elements s of S such that, given a distance function d, d(q,s) < threshold. If I remember correctly, this last setup is called range search. I don't know whether the first case can be transformed into the second.
For the distance search I believe the accepted best method is a Binary Space Partition tree. This can be stored as a series of bits. Each two bits (for a 2D tree) or three bits (for a 3D tree) subdivides the space one more level, increasing resolution.
Using a BSP, locating a set of objects to compare distances with is pretty easy. Just find the smallest set of squares or cubes which contain the edges of your distance box.
For the angle, I don't know of anything. I suppose that you could store each object in a second list or tree sorted by its angle. Then you would find every object at the proper distance using the BSP, every object at the proper angles using the angle tree, then do a set intersection.
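As a simple stand-in for the tree described above, the sketch below uses a uniform grid of buckets rather than a BSP tree (a deliberate simplification), and checks the angle on the candidates instead of keeping a second sorted structure. The class name OrientedPointGrid, the cell_size parameter, and the tuple layout (x, y, a) with angles in radians are assumptions for this example.

```python
import math
from collections import defaultdict

def angle_diff(a1, a2):
    """Smallest absolute difference between two angles, in radians."""
    d = abs(a1 - a2) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

class OrientedPointGrid:
    """Uniform grid of buckets keyed by integer cell coordinates."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, s):
        self.cells[self._cell(s[0], s[1])].add(s)

    def remove(self, s):
        self.cells[self._cell(s[0], s[1])].discard(s)

    def query(self, q, threshold_1, threshold_2):
        """Return all stored (x, y, a) satisfying condition (1) for query q."""
        qx, qy, qa = q
        cx0, cy0 = self._cell(qx - threshold_1, qy - threshold_1)
        cx1, cy1 = self._cell(qx + threshold_1, qy + threshold_1)
        hits = []
        # Only cells overlapping the bounding box of the distance disc are visited.
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for s in self.cells.get((cx, cy), ()):
                    if (math.hypot(s[0] - qx, s[1] - qy) < threshold_1
                            and angle_diff(s[2], qa) < threshold_2):
                        hits.append(s)
        return hits

grid = OrientedPointGrid(cell_size=1.0)
for s in [(0.0, 0.0, 0.1), (0.5, 0.5, 6.2), (3.0, 3.0, 0.1), (0.2, 0.1, 3.0)]:
    grid.insert(s)
found = sorted(grid.query((0.0, 0.0, 0.0), threshold_1=1.0, threshold_2=0.3))
```

Insertion and deletion are constant-time set operations, and a cell size close to threshold_1 keeps the number of visited cells small; a tree only starts to pay off when the points are very unevenly distributed.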
You have effectively described a "three dimensional cylindrical space", i.e. a space that is locally three dimensional but where one dimension is topologically cyclic. In other words, it is locally flat and may be modeled as the boundary of a four-dimensional object C4 in (x, y, z, w) defined by
z^2 + w^2 = 1
where
a = arctan(w/z)
With this model, the space defined by your constraints is a 2-dimensional cylinder wrapped "lengthwise" around a cross section wedge, where the wedge wraps around the 4-d cylindrical space with an angle of 2 * threshold_2. This can be modeled using a "modified k-d tree" approach (modified 3-d tree), where the data structure is not a tree but actually a graph (it has cycles). You can still partition this space into cells with hyperplane separation, but traveling along the curve defined by (z, w) in the positive direction may encounter a point encountered in the negative direction. The tree should be modified to actually lead to these nodes from both directions, so that the edges are bidirectional (in the z-w curve direction - the others are obviously still unidirectional).
These cycles do not change the effectiveness of the data structure in locating nearby points or allowing your constraint search. In fact, for the most part, those algorithms are only slightly modified (the simplest approach being to hold a visited node data structure to prevent cycles in the search - you test the next neighbors about to be searched).
This will work especially well for your criteria, since the region you define is effectively bounded by these axis-defined hyperplane-bounded cells of a k-d tree, and so the search termination will leave a region whose populated part is on average around pi / 4 (roughly 79%) of the area.
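The (z, w) circle model can be illustrated by embedding each element as (x, y, cos a, sin a). In this sketch a brute-force NumPy scan stands in for the modified k-d tree, which is not implemented here; the function names are made up, angles are in radians, and threshold_2 is assumed to be at most pi. The useful fact is that an angle difference d in [0, pi] corresponds to a chord length 2*sin(d/2) on the unit circle, so the cyclic angle test becomes an ordinary Euclidean one.

```python
import numpy as np

def embed(points):
    """Map rows (x, y, a) to (x, y, z, w) with z = cos(a), w = sin(a)."""
    points = np.asarray(points, dtype=float)
    return np.column_stack([points[:, 0], points[:, 1],
                            np.cos(points[:, 2]), np.sin(points[:, 2])])

def query(embedded, q, threshold_1, threshold_2):
    """Indices of embedded points satisfying condition (1) for query q = (x, y, a)."""
    qe = embed([q])[0]
    planar = np.hypot(embedded[:, 0] - qe[0], embedded[:, 1] - qe[1])
    chord = np.hypot(embedded[:, 2] - qe[2], embedded[:, 3] - qe[3])
    # chord = 2*sin(d/2) is increasing for d in [0, pi], so the comparison is equivalent.
    max_chord = 2.0 * np.sin(min(threshold_2, np.pi) / 2.0)
    return np.nonzero((planar < threshold_1) & (chord < max_chord))[0]

S = [(0.0, 0.0, 0.1), (0.5, 0.5, 6.2), (3.0, 3.0, 0.1), (0.2, 0.1, 3.0)]
embedded = embed(S)
matches = query(embedded, (0.0, 0.0, 0.0), threshold_1=1.0, threshold_2=0.3)
```

Since both tests are now plain Euclidean distances, the embedded points could also be handed to an off-the-shelf k-d tree for the candidate search, with the two thresholds applied as a final filter.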