TensorFlow - concat single pixel features

I'm new to TensorFlow. I'm processing a number of feature maps from two images. At a certain point, I have N feature maps for each of the two images, and I want to obtain a single feature volume that concatenates the two.
I can simply concat them with tf.concat([features1, features2], axis=3), obtaining a new volume that holds, for each pixel, the features at that position from both images.
What if I want to concat features of pixels having different coordinates on the two images?
For example, I have a function mapping (x, y) on the first image to (u, v) on the second image. Such a function does not follow a single rule shared by all pixels (e.g., it's not a simple horizontal translation). Using numpy arrays, the desired behavior would be:
for y in range(0, H):
    for x in range(0, W):
        u, v = f(x, y)  # map (x, y) on image1 to (u, v) on image2
        concat[y][x] = np.concatenate([image1[y][x], image2[v][u]])
I tried slicing single-pixel features and concatenating them within for loops, but as you know this is very inefficient (and infeasible with large images; the memory required is just too high).
matrix = []
for y in range(0, H):
    row = []
    for x in range(0, W):
        u, v = f(x, y)
        row.append(tf.concat([tf.slice(image1, [0, y, x, 0], [-1, 1, 1, -1]),
                              tf.slice(image2, [0, v, u, 0], [-1, 1, 1, -1])], 3))
    row_array = tf.concat(row, 2)
    matrix.append(row_array)
result = tf.concat(matrix, 1)
What's the best option, if it exists?
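For reference, a vectorized alternative would be to precompute the mapping as an integer index tensor and warp image2 with tf.gather_nd before a single concat. A minimal sketch, assuming f can be evaluated offline into a hypothetical uv_map tensor of shape [H, W, 2] holding (v, u) for every (x, y):

import tensorflow as tf

# uv_map: hypothetical precomputed int32 tensor, uv_map[y, x] = (v, u)
shape = tf.shape(image1)                       # [B, H, W, C]
B, H, W = shape[0], shape[1], shape[2]

# Batch indices broadcast to [B, H, W, 1]
b_idx = tf.tile(tf.reshape(tf.range(B), [-1, 1, 1, 1]), tf.stack([1, H, W, 1]))
# Mapping indices broadcast to [B, H, W, 2]
vu = tf.tile(uv_map[tf.newaxis], tf.stack([B, 1, 1, 1]))

indices = tf.concat([b_idx, vu], axis=3)       # [B, H, W, 3] = (batch, v, u)
warped2 = tf.gather_nd(image2, indices)        # image2 features sampled at (v, u)
result = tf.concat([image1, warped2], axis=3)  # per-pixel concat, [B, H, W, 2C]

This replaces the per-pixel loop with one gather and one concat.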

Related

How do I convert scale bars after a 2d FFT?

I'm currently writing something which will compute the 2d FFT of an image and pick out certain peaks in the magnitude spectrum. My images all have scales on the x and y axes in nm, but I'm struggling to understand how I would convert from my length scales to frequencies.
I'm sure I'm just misunderstanding something simple here but I can't find anything on the subject that seems relevant.
Thanks in advance for any help.
Brief overview of FFT
Remember that the FFT is just a fast algorithm for the Discrete Fourier Transform. It transforms data back and forth between the signal's original domain (which could be either time or space) and the corresponding frequency domain. In the case of images, you are talking about spatial frequency.
Spatial frequency
Spatial frequency is similar to regular temporal frequency. But it has units of m⁻¹ rather than s⁻¹.
Remember that, by the Nyquist–Shannon sampling theorem, the highest frequency component you can recover is half the sampling rate. For temporal frequency this means that if you sample at 1000 Hz, the highest signal frequency you can capture is 500 Hz. The spatial domain is equivalent: if you sample every millimetre (a sampling rate of 1000 m⁻¹), the highest frequency your signal can contain is 500 m⁻¹ (a wavelength of 2 mm).
You can picture it by imagining you sample 1, -1, 1, -1, ... every 1 mm; the wave clearly repeats every 2 mm.
The smallest frequency component is going to depend on the length of the signal. Clearly, if you have a 1 s sample, the smallest frequency bin you can resolve is 1 Hz. As you have probably noticed, the same applies to spatial frequencies.
Now you could take your sampling rate and your signal length, work out the frequency spacing, and account for the FFT's ordering yourself... but numpy already provides a very convenient method for generating your frequency axis: numpy.fft.fftfreq. You give it the length of your signal n and the sample spacing d, and it returns a correctly scaled and spaced frequency axis.
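For example, for a signal sampled every millimetre as above (a quick sketch to illustrate the scaling):

import numpy as np

# 1000 samples spaced 1 mm apart
freqs = np.fft.fftfreq(1000, d=1e-3)
print(freqs.max())  # 499.0 m^-1: one bin below the 500 m^-1 Nyquist limit
print(freqs.min())  # -500.0 m^-1: the Nyquist frequency itself appears as negative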
So in your case, where you have an x-by-y image with pixels every nm, you would generate your x and y spatial frequency axes like this:
import numpy as np

d = 1e-9  # sample spacing is 1 nm
y, x = image.shape  # get the y and x size of your input image (assuming it's just 2D)
y_freq = np.fft.fftfreq(y, d)
x_freq = np.fft.fftfreq(x, d)
Frequency order
Remember that by default the output of an FFT is ordered so that the coefficients start at DC, continue up to the highest positive frequency, and then wrap around to the negative frequencies:
[0, 1, 2, 3, 4, -5, -4, -3, -2, -1]
fftfreq outputs the frequency axis in the same order. If you want to reorder the axis, use np.fft.fftshift. This will rearrange the frequencies into a more natural order:
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
This also works for 2D FFT. Basically your code would look something like this:
import numpy as np
import matplotlib.pyplot as plt

image_f = np.fft.fft2(image)         # 2D FFT of the image
image_fs = np.fft.fftshift(image_f)  # shift the FFT of the image
d = 1e-9  # sample spacing is 1 nm
y, x = image.shape  # get the y and x size of your input image (assuming it's just 2D)
# Compute the shifted spatial frequency axes with units m⁻¹
y_freq = np.fft.fftshift(np.fft.fftfreq(y, d))
x_freq = np.fft.fftshift(np.fft.fftfreq(x, d))
fig, ax = plt.subplots()
# Plot the magnitude of the 2D FFT; set the axis extent to the min/max of the x and y frequency axes.
# Note: in imshow you don't actually need the full frequency axes, just their limits.
ax.imshow(np.abs(image_fs), origin='lower',
          extent=[x_freq[0], x_freq[-1], y_freq[0], y_freq[-1]])

How can one utilize the indices provided by torch.topk()?

Suppose I have a PyTorch tensor x of shape [N, N_g, 2]. It can be viewed as N * N_g 2d vectors; specifically, x[i, j, :] is the 2d vector of the jth group in the ith batch.
Now I am trying to get the coordinates of the vectors with the top-5 lengths in each group. So I tried the following:
(i) First I used x_len = (x**2).sum(dim=2).sqrt() to compute their lengths, resulting in x_len.shape==[N, N_g].
(ii) Then I used tk = x_len.topk(5) to get the top 5 lengths in each group.
(iii) The desired output would be a tensor x_top5 of shape [N, 5, 2]. Naturally I thought of using tk.indices to index x so as to obtain x_top5. But I failed as it seems such indexing is not supported.
How can I do this?
A minimal example:
x = torch.randn(10,10,2) # N=10 is the batchsize, N_g=10 is the group size
x_len = (x**2).sum(dim=2).sqrt()
tk = x_len.topk(5)
x_top5 = x[tk.indices]
print(x_top5.shape)
# torch.Size([10, 5, 10, 2])
However, this gives x_top5 as a tensor of shape [10, 5, 10, 2], instead of [10, 5, 2] as desired.
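One approach that should work is torch.gather, with the index tensor expanded to match the trailing dimension (advanced indexing with an explicit batch index is an equivalent alternative); a sketch:

import torch

x = torch.randn(10, 10, 2)
x_len = (x ** 2).sum(dim=2).sqrt()
tk = x_len.topk(5, dim=1)

# Expand the [10, 5] index tensor to [10, 5, 2] so it matches x's last dimension
x_top5 = x.gather(1, tk.indices.unsqueeze(-1).expand(-1, -1, x.size(2)))
print(x_top5.shape)  # torch.Size([10, 5, 2])

# Equivalent: advanced indexing with a broadcast batch index
batch = torch.arange(x.size(0)).unsqueeze(1)  # [10, 1] broadcasts against [10, 5]
assert torch.equal(x_top5, x[batch, tk.indices])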

how to avoid split and sum of pieces in pytorch or numpy

I want to split a long vector into smaller unequal pieces, do a summation on each piece and gather the results into a new vector.
I need to do this in pytorch but I am also interested to see how this is done with numpy.
This can easily be accomplished by splitting the vector.
import torch

sizes = [3, 7, 5, 9]
X = torch.ones(sum(sizes))
Y = torch.tensor([s.sum() for s in torch.split(X, sizes)])
or with np.ones and np.split.
Is there a more efficient way to do this?
Edit:
Inspired by the first comment:
indices = np.cumsum([0] + sizes)[:-1]
Y = np.add.reduceat(X.numpy(), indices.tolist())  # convert the torch tensor to numpy first
This solves it for numpy; I am still looking for a solution in PyTorch.
index_add_ is your friend!
# inputs
sizes = torch.tensor([3, 7, 5, 9], dtype=torch.long)
x = torch.ones(sizes.sum())
# prepare an index vector for summation (what elements of x are summed to each element of y)
ind = torch.zeros(sizes.sum(), dtype=torch.long)
ind[torch.cumsum(sizes, dim=0)[:-1]] = 1
ind = torch.cumsum(ind, dim=0)
# prepare the output
y = torch.zeros(len(sizes))
# do the actual summation
y.index_add_(0, ind, x)
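Since x is all ones here, each sum simply recovers its piece's length, which gives a quick sanity check:

print(ind)  # tensor([0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3])
print(y)    # tensor([3., 7., 5., 9.])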

Numpy - many matrices to same vector

Is there an efficient way to multiply many different rotation matrices by the same vector?
Right now I am using the following, extremely slow procedure:
for i, rm in enumerate(ray_rotation_matrices):
    scan_point = rm * np.vstack([scale, 0, 0, 1])
    scan_points[i] = np.hstack(scan_point[:3])
Each rm is a 4x4 matrix for homogeneous coordinates. Can I somehow broadcast this? And how do I make sure it applies matrix multiplication and not the element-wise product?
I want to get rid of the for loop...
Use one large array and the matrix multiplication operator @. It is vectorized out of the box. Example:
# a stack of five 4x4 matrices
>>> m = np.random.random((5, 4, 4))
>>> v = np.random.random((4,))
# all five matrix vector products in one go
>>> m@v
array([[1.08929927, 0.98770373, 1.0470138 , 1.266117  ],
       [0.71691193, 0.68655178, 1.25601832, 1.22123406],
       [1.3964922 , 1.02123137, 1.03709715, 0.72414757],
       [0.9422159 , 0.84904553, 0.8506686 , 1.29374861],
       [1.02159382, 1.36399314, 1.06503775, 0.56242674]])
# doing it one-by-one gives the same answer
>>> [mi@v for mi in m]
[array([1.08929927, 0.98770373, 1.0470138 , 1.266117 ]), array([0.71691193, 0.68655178, 1.25601832, 1.22123406]), array([1.3964922 , 1.02123137, 1.03709715, 0.72414757]), array([0.9422159 , 0.84904553, 0.8506686 , 1.29374861]), array([1.02159382, 1.36399314, 1.06503775, 0.56242674])]
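Applied to the loop in the question, this becomes a one-liner (a sketch: it assumes ray_rotation_matrices can be stacked into a single (N, 4, 4) array, and reuses scale from the question):

import numpy as np

rms = np.asarray(ray_rotation_matrices)  # stack into shape (N, 4, 4)
v = np.array([scale, 0.0, 0.0, 1.0])     # the homogeneous vector from the loop
scan_points = (rms @ v)[:, :3]           # (N, 3): all matrix-vector products at once, dropping w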

Sample from a tensor in Tensorflow along an axis

I have a tensor L of shape (2, 5, 2). The values along the last axis form a probability distribution. I want to sample another tensor S of shape (2, 5), where each entry is one of the integers 0 or 1.
For example,
L = [[[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.5, 0.5], [0.6, 0.4]],
     [[0.5, 0.5], [0.9, 0.1], [0.7, 0.3], [0.9, 0.1], [0.1, 0.9]]]
One of the samples could be,
S = [[1, 1, 1, 0, 1],
     [1, 1, 1, 0, 1]]
The distributions are binomial in the above example. However, in general, the last dimension of L can be any positive integer, so the distributions can be multinomial.
The samples need to be generated efficiently within the TensorFlow computation graph. I know how to do this in numpy, using the functions apply_along_axis and numpy.random.multinomial.
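For reference, the numpy approach alluded to above might look like this (a sketch using apply_along_axis and np.random.multinomial):

import numpy as np

L_np = np.array(L)  # shape (2, 5, 2); each row along the last axis sums to 1
# Draw one sample per distribution; multinomial returns a one-hot vector, argmax converts it to an index
S_np = np.apply_along_axis(lambda p: np.argmax(np.random.multinomial(1, p)), 2, L_np)
# S_np has shape (2, 5) with integer entries in {0, ..., N-1}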
You can use tf.multinomial() here.
You will first need to reshape your input tensor to shape [-1, N] (where N is the last dimension of L):
# L has shape [2, 5, 2]
L = tf.constant([[[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.5, 0.5], [0.6, 0.4]],
                 [[0.5, 0.5], [0.9, 0.1], [0.7, 0.3], [0.9, 0.1], [0.1, 0.9]]])
dims = L.get_shape().as_list()
N = dims[-1] # here N = 2
logits = tf.reshape(L, [-1, N]) # shape [10, 2]
Now we can apply the function tf.multinomial() to logits:
samples = tf.multinomial(logits, 1)
# We reshape to match the initial shape minus the last dimension
res = tf.reshape(samples, dims[:-1])
Be cautious when using tf.multinomial(): the function expects logits (unnormalized log-probabilities), not probability distributions. In your example, however, the last axis of L is a probability distribution, so you should take its log first.
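Putting it together with the log correction (a sketch; note that in newer TensorFlow versions tf.multinomial is deprecated in favour of tf.random.categorical, which takes the same arguments):

import tensorflow as tf

L = tf.constant([[[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.5, 0.5], [0.6, 0.4]],
                 [[0.5, 0.5], [0.9, 0.1], [0.7, 0.3], [0.9, 0.1], [0.1, 0.9]]])
dims = L.get_shape().as_list()
logits = tf.log(tf.reshape(L, [-1, dims[-1]]))  # probabilities -> log-probabilities
samples = tf.multinomial(logits, 1)             # one draw per row
S = tf.reshape(samples, dims[:-1])              # back to shape [2, 5]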