I'm currently writing something which will compute the 2D FFT of an image and pick out certain peaks in the magnitude spectrum. My images all have scales on the x and y axes in nm, but I'm struggling to understand how I would convert from my length scales to frequencies.
I'm sure I'm just misunderstanding something simple here but I can't find anything on the subject that seems relevant.
Thanks in advance for any help.
Brief overview of FFT
Remember that the FFT is just a fast implementation of the Discrete Fourier Transform (DFT). It transforms data back and forth between the signal's original domain (which could be either time or space) and the corresponding frequency domain. In the case of images, you are talking about spatial frequency.
Spatial frequency
Spatial frequency is similar to regular temporal frequency, but it has units of m⁻¹ rather than s⁻¹.
Remember that, by the Nyquist-Shannon sampling theorem, the highest frequency component you can capture is half of the sampling rate. In time, this means that if you are sampling at 1000 Hz, the highest signal frequency you can sample is 500 Hz. The spatial domain is equivalent: if you are sampling every millimetre (a sampling rate of 1000 m⁻¹), the highest frequency your signal can contain is 500 m⁻¹ (a wavelength of 2 mm).
You can picture it by imagining you sample 1, -1, 1, -1, ... every 1 mm; the wave clearly repeats every 2 mm.
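A quick numpy check of that picture, using the numbers from above:
import numpy as np

positions = np.arange(10) * 1e-3              # sample positions every 1 mm
signal = np.cos(2 * np.pi * 500 * positions)  # 500 m⁻¹, i.e. a 2 mm wavelength
print(np.round(signal))                       # 1, -1, 1, -1, ... as described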
The smallest frequency component depends on the length of the signal: if you have a 1 s sample, the finest frequency spacing you can resolve is 1 Hz. As you have probably already noticed, the same applies to spatial frequencies.
Now, you could look at your sampling rate and your signal length, work out the frequency spacing yourself, and account for the shifted ordering of the FFT output... but numpy already provides a very convenient helper for generating your frequency axis: numpy.fft.fftfreq. You give it the length of your signal n and the sample spacing d, and it returns the correctly scaled and spaced frequency axis.
So in your case, where you have an x-by-y image with pixels every 1 nm, you would generate your x and y spatial frequency axes like this:
import numpy as np

d = 1e-9            # sample spacing is 1 nm
y, x = image.shape  # get the y and x size of your input image (assuming it's just 2D)
y_freq = np.fft.fftfreq(y, d)
x_freq = np.fft.fftfreq(x, d)
Frequency order
Remember that, by default, the output of an FFT is ordered so that the coefficients start at DC, continue up to the highest positive frequency, and then wrap around to the negative frequencies:
[0, 1, 2, 3, 4, -5, -4, -3, -2, -1]
fftfreq outputs the frequency axis in the same order. If you want to reorder the axis, use np.fft.fftshift. This will rearrange the frequencies into a more logical order, like this:
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
This also works for a 2D FFT. Basically, your code would look something like this:
import numpy as np
import matplotlib.pyplot as plt

image_f = np.fft.fft2(image)         # 2D FFT of the image
image_fs = np.fft.fftshift(image_f)  # shift the FFT of the image
d = 1e-9                             # sample spacing is 1 nm
y, x = image.shape                   # get the y and x size of your input image (assuming it's just 2D)
# Compute the shifted spatial frequency axes with units of m⁻¹
y_freq = np.fft.fftshift(np.fft.fftfreq(y, d))
x_freq = np.fft.fftshift(np.fft.fftfreq(x, d))
fig, ax = plt.subplots()
# Plot the magnitude of the 2D FFT. Set the axis extent to the min/max values of the x and y frequency axes.
# Note: for imshow you don't actually need the full frequency axes, just their limits.
ax.imshow(np.abs(image_fs), origin='lower',
          extent=[x_freq[0], x_freq[-1], y_freq[0], y_freq[-1]])
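Since you ultimately want to pick peaks out of the magnitude spectrum, here is a minimal sketch of how you might read one off the shifted spectrum. The peak-finding is deliberately naive, and the derived names (mag, py, px, ...) are illustrative:
mag = np.abs(image_fs)
mag[y // 2, x // 2] = 0                  # suppress the DC peak at the centre (after fftshift)
py, px = np.unravel_index(np.argmax(mag), mag.shape)
fy, fx = y_freq[py], x_freq[px]          # peak position in m⁻¹
wavelength = 1.0 / np.hypot(fx, fy)      # corresponding real-space period in metres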
Assume I have data like this:
x = np.random.randn(4, 100000)
and I fit a histogram
hist = np.histogramdd(x, density=True)
What I want is to get the probability of a number g, e.g. g = 0.1, from the fitted histogram. Assume some hypothetical function foo:
g = 0.1
prob = foo(hist, g)
print(prob)
>> 0.2223124214
How could I do something like this, where I get a probability back for a single number or a vector of numbers from a fitted histogram, especially a histogram that is N-dimensional?
histogramdd takes O(r^D) memory, and unless you have a very large dataset or a very small dimension you will get a poor estimate. Consider your example data: 100k points in 4-D space. The default histogram will be 10 x 10 x 10 x 10, so it will have 10k bins.
import numpy as np

x = np.random.randn(4, 100000)
hist = np.histogramdd(x.transpose(), density=True)
np.mean(hist[0] == 0)
gives something around 0.77, meaning that 77% of the bins in the histogram contain no points.
You probably want to smooth the distribution. Unless you have a good reason not to, I would suggest using a Gaussian kernel density estimate:
import numpy as np
import scipy.stats

x = np.random.randn(4, 100000)   # d x n array
f = scipy.stats.gaussian_kde(x)  # d-dimensional PDF estimate
f([1, 2, 3, 4])                  # evaluate the PDF at a given point
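Note that this evaluates a probability density, not a probability. If you really want a probability, integrate the density over a region. Continuing from the snippet above, a sketch using gaussian_kde's integrate_box (the point g and the box half-width eps are just for illustration):
eps = 0.05
g = np.array([0.1, 0.1, 0.1, 0.1])
prob = f.integrate_box(g - eps, g + eps)  # P(all components within ±eps of g)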
I have some timecourse data that visually appears to have differing levels of high-frequency fluctuation. I have plotted the timecourse data A and B below.
I have used numpy FFT to perform Fourier Transformation as follows:
import numpy as np
import matplotlib.pyplot as plt

Fs = 1.0                # sampling rate (Hz)
n = 864000              # number of datapoints in the timecourse data
t = np.arange(n) / Fs   # time vector
T = n / Fs              # total duration
frq = np.arange(n) / T  # two-sided frequency range
frq = frq[:n // 2]      # one-sided frequency range
Y = np.fft.fft(A1) / n  # FFT and normalization of the timecourse data (A1)
Y = Y[:n // 2]
Z = abs(Y)
##### Calculate mean frequency ########
Mean_Frequency = sum(frq * Z) / sum(Z)
##### Make sure the first (DC) value doesn't create an issue ########
Freq = frq[1:]
Z = Z[1:]
max_y = max(Z)                     # find the maximum y value
mode_Frequency = Freq[Z.argmax()]  # find the frequency corresponding to the maximum y value
###################### Plot figures
fig, ax = plt.subplots(2, 1)
fig.suptitle(str(D[k]), fontsize=14, fontweight='bold')  # D[k]: dataset name, defined elsewhere
ax[0].plot(t, A1, 'black')
ax[0].set_xlabel('Time')
ax[0].set_ylabel('Amplitude')
ax[1].plot(frq, abs(Y), 'r')  # plotting the spectrum
ax[1].set_xscale('log')
ax[1].set_xlabel('Freq (Hz)')
ax[1].set_ylabel('|Y(freq)|')
ax[1].text(2 * mode_Frequency, 0.95 * max_y, "Mode Frequency (Hz): " + str(mode_Frequency))
ax[1].text(2 * mode_Frequency, 0.85 * max_y, "Mean Frequency (Hz): " + str(Mean_Frequency))
plt.savefig("/home/phoenix/Desktop/Figures/Figure3/FourierGraphs/" + str(D[k]) + ".png")
plt.close()
The results look like this:
The timecourse data (black) for A on the left appears to me to have much more high-frequency noise than the data on the right (B),
yet the mean frequency is higher for the data on the left.
Is this because I have performed the FFT incorrectly, because I have calculated the mean frequency incorrectly, or because I need a different method to capture the really low-frequency noise in timecourse B?
Thanks for your time.
I'm new to TensorFlow. I'm processing a number of feature maps from two images. At a certain point I have N feature maps for each of the two images, and I want to obtain a single feature volume that is the concatenation of the two.
I can simply concatenate them with tf.concat([features1, features2], axis=-1) in order to obtain a new volume having, for each pixel, the features at that position from both images.
What if I want to concatenate features from pixels having different coordinates in the two images?
For example, I have a function mapping x, y in the first image to u, v in the second image. This function does not follow a single shared rule for all pixels (e.g., it is not a simple horizontal translation). Using numpy arrays, the behavior should be this:
for i in range(0, H):
    for j in range(0, W):
        u, v = f(j, i)  # map pixel (x=j, y=i) in image1 to (u, v) in image2
        concat[i][j] = np.concatenate([image1[i][j], image2[v][u]])
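For reference, here is a rough vectorized numpy equivalent of that loop, assuming f can be evaluated on whole index arrays at once and returns integer coordinates (all names are illustrative):
import numpy as np

# ii, jj hold the row/column index of every pixel
ii, jj = np.meshgrid(np.arange(H), np.arange(W), indexing='ij')
U, V = f(jj, ii)                       # H x W integer coordinate arrays into image2
warped = image2[V, U]                  # per-pixel gather from image2 (H x W x C2)
concat = np.concatenate([image1, warped], axis=-1)  # H x W x (C1 + C2)
In TensorFlow, the same per-pixel gather can be expressed with tf.gather_nd on a stacked (v, u) index tensor, which avoids the Python loops entirely.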
I tried slicing single-pixel features and concatenating them together within for loops, but as you know this is very inefficient (and infeasible with large images; the memory required is just too high).
matrix = []
for y in range(0, H):
    row = []
    for x in range(0, W):
        # u, v come from the mapping f for this (x, y)
        row.append(tf.concat([tf.slice(image1, [0, y, x, 0], [-1, 1, 1, -1]),
                              tf.slice(image2, [0, v, u, 0], [-1, 1, 1, -1])], 3))
    row_array = tf.concat(row, 2)
    matrix.append(row_array)
result = tf.concat(matrix, 1)
What's the best option, if one exists?
This is a mathematical question, but it is tied to the numpy implementation, so I decided to ask it at SO. Perhaps I'm hugely misunderstanding something, but if so I would like to be put straight.
numpy.fft.fft computes the DFT according to the equation

X[k] = sum_{n=0}^{N-1} x[n] * exp(-2πi k n / N)

numpy.fft.fftfreq is supposed to return the frequencies at which the DFT was computed.
Say we have:
x = [0, 0, 1, 0, 0]
X = np.fft.fft(x)
freq = np.fft.fftfreq(5)
Then for signal x, its DFT transformation is X, and frequencies at which X is computed are given by freq. For example X[0] is DFT of x at frequency freq[0], X[1] is DFT of x at frequency freq[1], and so on.
But when I compute the DFT of a simple signal by hand with the formula quoted above, my results indicate that X[1] is the DFT of x at frequency 1, not at freq[1]; X[2] is the DFT of x at frequency 2, not at freq[2]; and so on.
As an example:
In [32]: x
Out[32]: [0, 0, 1, 0, 0]

In [33]: X
Out[33]:
array([ 1.00000000+0.j        , -0.80901699-0.58778525j,
        0.30901699+0.95105652j,  0.30901699-0.95105652j,
       -0.80901699+0.58778525j])

In [34]: freq
Out[34]: array([ 0. ,  0.2,  0.4, -0.4, -0.2])
If I compute the DFT of the above signal for k = 0.2 (i.e. freq[1]), I get
X at freq = 0.2: 0.876 - 0.482j, which isn't X[1].
If, however, I compute it for k = 1, I get the same result as X[1], namely -0.809 - 0.588j.
So what am I misunderstanding? If numpy.fft.fft(x)[n] is the DFT of x at frequency n, not at frequency numpy.fft.fftfreq(len(x))[n], then what is the purpose of numpy.fft.fftfreq?
I think this is because the values in the array returned by numpy.fft.fftfreq are equal to (k/n) times the sampling frequency.
The frequencies of the DFT result are equal to k/n divided by the time spacing. You can view the digital signal as the analog signal multiplied by a periodic sampling function; multiplication in the time domain means convolution in the frequency domain, so the time spacing of the input data affects the frequency spacing of the DFT result: the frequency spacing becomes the original one divided by the time spacing. When the time spacing is 1, the frequency spacing of the DFT result is 1/n; in general, it becomes 1/n divided by the time spacing, which equals 1/n multiplied by the sampling frequency.
To calculate this, numpy.fft.fftfreq takes two arguments: the length of the input, and the time spacing, i.e. the inverse of the sampling rate. The length of the input is n, and the time spacing is the value that k/n is divided by (the default is 1).
I tried k = 2, and the result is equal to X[2] in your example. In this situation, k/n * 1 is equal to freq[2].
The DFT is a dimensionless basis transform or matrix multiplication. The output or result of a DFT has nothing to do with frequencies unless you know the sampling rate represented by the input vector (samples per second, per meter, per radian, etc.)
You can compute a Goertzel filter of the same length N with k = 0.2, but that result isn't contained in a DFT or FFT result of length N. A DFT only contains complex Goertzel filter results for integer values of k. And to get from k to the frequency represented by X[k], you need to know the sample rate.
Yours is not a SO question
You wrote
If I compute DFT of above signal for k = 0.2...
and I reply "You shouldn't"... the DFT can be meaningfully computed only for integer values of k.
The relationship between an index k and a frequency is given by f_k = k Δf or, if you prefer circular frequencies, ω_k = k Δω, where Δf = 1/T and Δω = 2πΔf, T being the period of the signal.
The arguments of fftfreq are a bit misleading... the required one is the number of samples n, and the optional argument is the sampling interval, by default d=1.0. At any rate, T = n*d and Δf = 1/(n*d):
>>> fftfreq(5) # d=1
array([ 0. , 0.2, 0.4, -0.4, -0.2])
>>> fftfreq(5,2)
array([ 0. , 0.1, 0.2, -0.2, -0.1])
>>> fftfreq(5,10)
array([ 0. , 0.02, 0.04, -0.04, -0.02])
and the corresponding T are 5, 10, and 50, with respective Δf of 0.2, 0.1, and 0.02, as (I) expected.
Why doesn't fftfreq simply require the signal's period? Because it is mainly intended as a helper in untangling the Nyquist frequency issue.
As you know, the DFT is periodic, for a signal x of length N you have that
DFT(x,k) is equal to DFT(x,k+mN) where m is an integer.
This implies that there are only N/2 distinct positive and N/2 distinct negative frequencies, and that, when N/2 < k < N, the frequency that must be associated with k in the most meaningful way is not k Δf but (k - N) Δf.
To do this, fftfreq needs more information than just the period T, hence the choice of requiring n and computing Δf from an assumption about the sampling interval.
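A quick numpy check ties this together: the explicit DFT sum at an integer k reproduces fft's output, and fftfreq merely maps that integer index to the physical frequency k/(n*d):
import numpy as np

x = np.array([0, 0, 1, 0, 0])
n = len(x)
k = 1
manual = np.sum(x * np.exp(-2j * np.pi * k * np.arange(n) / n))
print(np.allclose(manual, np.fft.fft(x)[k]))  # True
print(np.fft.fftfreq(n)[k], k / n)            # 0.2 0.2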
I am looking for an algorithm to solve the following problem:
I have two sets of vectors, and I want to find the matrix that best approximates the transformation from the input vectors to the output vectors.
The vectors are 3x1, so the matrix is 3x3.
This is the general problem. My particular problem is that I have a set of RGB colors, and another set that contains the desired colors. I am trying to find an RGB-to-RGB transformation that would give me colors closer to the desired ones.
There is a correspondence between the input and output vectors, so computing an error function that should be minimized is the easy part. But how can I minimize this function?
This is a classic linear algebra problem, the key phrase to search on is "multiple linear regression".
I've had to code some variation of this many times over the years. For example, code to calibrate a digitizer tablet or stylus touch-screen uses the same math.
Here's the math:
Let p be an input vector and q the corresponding output vector.
The transformation you want is a 3x3 matrix; call it A.
For a single input and output vector p and q, there is an error vector e
e = q - A x p
The square of the magnitude of the error is a scalar value:
eT x e = (q - A x p)T x (q - A x p)
(where the T operator is transpose).
What you really want to minimize is the sum of the squared errors over the whole set:
E = sum (eT x e)
This minimum satisfies the matrix equation D = 0 where
D(i,j) = the partial derivative of E with respect to A(i,j)
Say you have N input and output vectors.
Your set of input 3-vectors is a 3xN matrix; call this matrix P.
The ith column of P is the ith input vector.
So is the set of output 3-vectors; call this matrix Q.
When you grind through all of the algebra, the solution is
A = Q x PT x (P x PT)^-1
(where ^-1 is the inverse operator -- sorry about the lack of superscripts and subscripts).
Here's the algorithm:
Create the 3xN matrix P from the set of input vectors.
Create the 3xN matrix Q from the set of output vectors.
Matrix multiply R = P x transpose(P).
Compute the inverse of R.
Matrix multiply A = Q x transpose(P) x inverse(R),
using the matrix multiplication and matrix inversion routines of your linear algebra library of choice.
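Here is a short numpy sketch of those steps (P and Q are the hypothetical 3xN matrices defined above; the lstsq variant is a swapped-in, better-conditioned route to the same least-squares solution):
import numpy as np

# P: 3 x N input vectors, Q: 3 x N output vectors (columns correspond)
A = Q @ P.T @ np.linalg.inv(P @ P.T)

# Equivalent, but avoids forming the explicit inverse:
# solve P.T @ A.T ≈ Q.T in the least-squares sense.
A = np.linalg.lstsq(P.T, Q.T, rcond=None)[0].T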
However, a 3x3 matrix is capable of scaling and rotating the input vectors, but not of translating them! This might not be general enough for your problem. It's usually a good idea to append a "1" to the end of each 3-vector to make them 4-vectors, and look for the best 3x4 transform matrix that minimizes the error. This can't hurt; it can only lead to a better fit of the data.
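The augmented fit only changes how the input matrix is set up; a sketch reusing the hypothetical P and Q from above:
N = P.shape[1]
P_aug = np.vstack([P, np.ones((1, N))])                    # 4 x N homogeneous inputs
A_affine = np.linalg.lstsq(P_aug.T, Q.T, rcond=None)[0].T  # best-fit 3 x 4 transform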
You don't specify a language, but here's how I would approach the problem in Matlab.
v1 is a 3xn matrix, containing your input colors in vertical vectors
v2 is also a 3xn matrix containing your output colors
You want to solve the system
M*v1 = v2
M = v2*inv(v1)
However, v1 is not directly invertible, since it is not a square matrix. Matlab solves this automatically with the mrdivide operation (M = v2/v1), where M is the least-squares best-fit solution.
e.g.:
>> v1 = rand(3,10);
>> M = rand(3,3);
>> v2 = M * v1;
>> v2/v1 - M
ans =
1.0e-15 *
0.4510 0.4441 -0.5551
0.2220 0.1388 -0.3331
0.4441 0.2220 -0.4441
>> (v2 + randn(size(v2))*0.1)/v1 - M
ans =
0.0598 -0.1961 0.0931
-0.1684 0.0509 0.1465
-0.0931 -0.0009 0.0213
Here is a more language-agnostic description of how to solve the problem.
Some linear algebra should be enough:
Write down the average squared difference between inputs and outputs (the sum of the squares of each difference between each input and output value). I take this as the definition of "best approximates".
This is a quadratic function of your 9 unknown matrix coefficients.
To minimize it, differentiate it with respect to each of them.
You will get a linear system of 9 equations that you have to solve to obtain the solution (unique, or a whole solution space, depending on the input set).
When the difference function is not quadratic, you can proceed the same way, but you have to use an iterative method to solve the resulting system of equations.
This answer is better for beginners in my opinion:
Consider the following scenario:
We don't know the matrix M, but we know each input vector Iₙ and the corresponding output vector Oₙ, where n can range from 3 up.
If we had exactly 3 input vectors and 3 output vectors (for a 3x3 matrix), we could compute the coefficients α_r,c precisely. We would have a fully specified system.
But we have more than 3 vectors and thus we have an overdetermined system of equations.
Let's write down these equations. Say that we have these four input/output pairs (the same ones used in the code below), e.g. I₁ = [5, 6, 2] with O₁ = [3, 7, 1].
We know that to get the vector Oₙ, we must perform a matrix multiplication with the vector Iₙ. In other words: M · Iₙ = Oₙ.
If we expand this operation, we get one equation per output component: for each pair n and each row r of M, α_r,1·Iₙ[1] + α_r,2·Iₙ[2] + α_r,3·Iₙ[3] = Oₙ[r].
We do not know the alphas, but we know everything else. In fact, there are 9 unknowns and 12 equations, which is why the system is overdetermined: there are more equations than unknowns. We will approximate the unknowns using all of the equations, with least squares reconciling the surplus of equations.
So we combine the above equations into matrix form: X · b̅ = y̅, where b̅ is the [9×1] vector of the flattened unknown coefficients α, y̅ is the [12×1] vector of stacked output components, and X is the [12×9] matrix built from the input vectors.
And with some least-squares algebra magic (regression), we can solve for b̅:
b̅ = (Xᵀ · X)⁻¹ · Xᵀ · y̅
This is what is happening behind that formula:
Transposing X and multiplying it by the non-transposed X creates a square matrix of reduced dimension ([9×12] · [12×9] = [9×9]).
The inverse of this result allows us to solve for b̅.
Multiplying the vector y̅ by the transposed X reduces it to a [9×1] vector. Then, by multiplying the [9×9] inverse by this [9×1] vector, we have solved the system for b̅.
Now we take the [9×1] result vector and reshape it into a matrix. This is our approximated transformation matrix.
Python code:
import numpy as np

INPUTS = [[5, 6, 2], [1, 7, 3], [2, 6, 5], [1, 7, 5]]
OUTPUTS = [[3, 7, 1], [3, 7, 1], [3, 7, 2], [3, 7, 2]]

def get_mat(inputs, outputs, entry_len):
    n_of_vectors = len(inputs)
    # We need to construct the input matrix X.
    # The unknown matrix is linearized (flattened), e.g. [a11, a12, a21, a22]
    # for a 2x2 matrix, so for each of its rows we combine that row's
    # variables with each input vector.
    X_mat = []
    for in_n in range(n_of_vectors):    # for each input vector
        # populate all flattened matrix variables: 4 for a 2x2 matrix,
        # 9 for a 3x3 matrix, and so on
        base = 0
        for col_n in range(entry_len):  # each row of the unknown matrix is matched to all entries of the input vector
            row = [0] * (entry_len ** 2)
            for entry in inputs[in_n]:
                row[base] = entry
                base += 1
            X_mat.append(row)
    Y_mat = [item for sublist in outputs for item in sublist]
    X_np = np.array(X_mat)
    Y_np = np.array([Y_mat]).T
    # Normal equations: b = (X^T X)^-1 X^T y
    solution = np.dot(np.dot(np.linalg.inv(np.dot(X_np.T, X_np)), X_np.T), Y_np)
    var_mat = solution.reshape(entry_len, entry_len)  # create the square matrix
    return var_mat

transf_mat = get_mat(INPUTS, OUTPUTS, 3)  # 3 means a 3x3 matrix and in/out vector size 3
print(transf_mat)

for i in range(len(INPUTS)):
    o = np.dot(transf_mat, np.array([INPUTS[i]]).T)
    print(f"{INPUTS[i]} x [M] = {o.T} ({OUTPUTS[i]})")
The output looks like this:
[[ 0.13654096 0.35890767 0.09530002]
[ 0.31859558 0.83745124 0.22236671]
[ 0.08322497 -0.0526658 0.4417611 ]]
[5, 6, 2] x [M] = [[3.02675088 7.06241873 0.98365224]] ([3, 7, 1])
[1, 7, 3] x [M] = [[2.93479472 6.84785436 1.03984767]] ([3, 7, 1])
[2, 6, 5] x [M] = [[2.90302805 6.77373212 2.05926064]] ([3, 7, 2])
[1, 7, 5] x [M] = [[3.12539476 7.29258778 1.92336987]] ([3, 7, 2])
You can see that it took all the specified inputs, produced the transformed outputs, and matched them against the reference vectors. The results are not exact, since we have an approximation to an overdetermined system. If we had used INPUTS and OUTPUTS with only 3 vectors, the result would be exact.