How to understand the conv2d_transpose in tensorflow

The following is a test for conv2d_transpose.
import tensorflow as tf
import numpy as np
x = tf.constant(np.array([[
[[-67], [-77]],
[[-117], [-127]]
]]), tf.float32)
# shape = (3, 3, 1, 1) -> (height, width, output_channels, input_channels) for conv2d_transpose - 3x3x1 filter
f = tf.constant(np.array([
[[[-1]], [[2]], [[-3]]],
[[[4]], [[-5]], [[6]]],
[[[-7]], [[8]], [[-9]]]
]), tf.float32)
conv = tf.nn.conv2d_transpose(x, f, output_shape=(1, 5, 5, 1), strides=[1, 2, 2, 1], padding='VALID')
The result:
tf.Tensor(
[[[[ 67.]
[ -134.]
[ 278.]
[ -154.]
[ 231.]]
[[ -268.]
[ 335.]
[ -710.]
[ 385.]
[ -462.]]
[[ 586.]
[ -770.]
[ 1620.]
[ -870.]
[ 1074.]]
[[ -468.]
[ 585.]
[-1210.]
[ 635.]
[ -762.]]
[[ 819.]
[ -936.]
[ 1942.]
[-1016.]
[ 1143.]]]], shape=(1, 5, 5, 1), dtype=float32)
To my understanding, it should work as described in Figure 4.5 in the doc.
Therefore, the first element (conv[0,0,0,0]) should be -67 * -9 = 603. Why does it turn out to be 67?
The result may be explained by the following image. But why is the convolution kernel reversed?

To explain the results that you obtained, I have made a draw.io figure.
I hope the illustration helps explain why the first element of the transposed-convolution feature map is 67.
A key thing to note:
Unlike an ordinary convolution, a transposed convolution multiplies the whole filter by each single element of the input feature map. This gives one intermediate feature map per input element, and these are overlaid on one another and summed where they overlap to create the final feature map. The stride determines how far apart the overlays are. In our case, stride = 2, so the filter's footprint shifts by 2 in both the x and y dimensions of the output for each step through the 2x2 input. The top-left output element is covered only by the first overlay, so it equals x[0,0] * f[0,0] = -67 * -1 = 67. In this view the filter is used as-is; the equivalent "convolution over a zero-padded input" view of Figure 4.5 gives the same numbers only if the filter is rotated by 180 degrees, which is why the kernel looks reversed.
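The overlay description can be checked with a small NumPy sketch. This is not how TensorFlow implements the op: `conv2d_transpose_valid` is a hypothetical helper that assumes a single batch, a single channel, and VALID padding, and it simply adds one scaled copy of the filter into the output for each input element.

```python
import numpy as np

def conv2d_transpose_valid(x, f, stride):
    """Hypothetical helper: 2-D transposed convolution, one channel, VALID padding."""
    in_h, in_w = x.shape
    k_h, k_w = f.shape
    # Output size for VALID padding: (in - 1) * stride + kernel
    out = np.zeros(((in_h - 1) * stride + k_h, (in_w - 1) * stride + k_w))
    for i in range(in_h):
        for j in range(in_w):
            # Overlay one copy of the (unflipped) filter, scaled by x[i, j],
            # at an offset of `stride` per input step; overlaps are summed.
            out[i * stride:i * stride + k_h,
                j * stride:j * stride + k_w] += x[i, j] * f
    return out

x = np.array([[-67., -77.],
              [-117., -127.]])
f = np.array([[-1., 2., -3.],
              [4., -5., 6.],
              [-7., 8., -9.]])

out = conv2d_transpose_valid(x, f, stride=2)
print(out)
```

The top-left element is touched only by the overlay from x[0, 0], giving -67 * -1 = 67, while the centre element sums contributions from all four overlays.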

Related

Get column-wise maximums from a NumPy array

I have a 2D array, say
x = np.random.rand(10, 3)
array([[ 0.51158246, 0.51214272, 0.1107923 ],
[ 0.5210391 , 0.85308284, 0.63227215],
[ 0.57239625, 0.06276943, 0.1069803 ],
[ 0.71627613, 0.66454443, 0.56771438],
[ 0.24595493, 0.01007568, 0.84959605],
[ 0.99158904, 0.25034553, 0.00144037],
[ 0.43292656, 0.9247424 , 0.5123086 ],
[ 0.07224077, 0.57230282, 0.88522979],
[ 0.55665913, 0.20119776, 0.58865823],
[ 0.55129624, 0.26226446, 0.63070611]])
Then I find the indexes of maximum elements along the columns:
indexes = np.argmax(x, axis=0)
array([5, 6, 7])
So far so good.
But how do I actually get those elements? That is, how do I get ?some_operation?(x, indexes) == [0.99158904, 0.9247424, 0.88522979]?
Note that I need both the indexes and the associated values.
The best I could come up with was x[indexes, range(x.shape[1])], but it looks kinda complicated and inefficient. Is there a more idiomatic way?
You can use np.amax to find the maximum values along an axis.
Using your example (x is the original array in your post):
In[1]: np.argmax(x, axis=0)
Out[1]:
array([5, 6, 7], dtype=int64)
In[2]: np.amax(x, axis=0)
Out[2]:
array([ 0.99158904, 0.9247424 , 0.88522979])
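If you would rather reuse the indexes you already computed, here is a small sketch of two ways to do it. The 3x3 array is made-up example data, and `np.take_along_axis` assumes NumPy 1.15 or later.

```python
import numpy as np

# Made-up example data (not the array from the question)
x = np.array([[0.5, 0.9, 0.1],
              [0.7, 0.2, 0.8],
              [0.3, 0.4, 0.6]])

indexes = np.argmax(x, axis=0)  # row index of the maximum in each column

# Option 1: integer (fancy) indexing, pairing each row index with its column
values_fancy = x[indexes, np.arange(x.shape[1])]

# Option 2: take_along_axis expects an index array with the same ndim as x
values_take = np.take_along_axis(x, indexes[np.newaxis, :], axis=0)[0]

# Both agree with computing the maxima directly
values_amax = np.amax(x, axis=0)
print(indexes, values_fancy, values_take, values_amax)
```

The fancy-indexing form is the same idea as `x[indexes, range(x.shape[1])]` from the question, so that expression is a reasonable choice when you need the indexes anyway.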

ValueError: Floating point image RGB values must be in the 0..1 range. while using matplotlib

I want to visualize weights of the layer of a neural network. I'm using pytorch.
import torch
import torchvision.models as models
from matplotlib import pyplot as plt

def plot_kernels(tensor, num_cols=6):
    if not tensor.ndim==4:
        raise Exception("assumes a 4D tensor")
    if not tensor.shape[-1]==3:
        raise Exception("last dim needs to be 3 to plot")
    num_kernels = tensor.shape[0]
    num_rows = 1+ num_kernels // num_cols
    fig = plt.figure(figsize=(num_cols,num_rows))
    for i in range(tensor.shape[0]):
        ax1 = fig.add_subplot(num_rows,num_cols,i+1)
        ax1.imshow(tensor[i])
        ax1.axis('off')
        ax1.set_xticklabels([])
        ax1.set_yticklabels([])
    plt.subplots_adjust(wspace=0.1, hspace=0.1)
    plt.show()

vgg = models.vgg16(pretrained=True)
mm = vgg.double()
filters = mm.modules
body_model = [i for i in mm.children()][0]
layer1 = body_model[0]
tensor = layer1.weight.data.numpy()
plot_kernels(tensor)
The above gives this error: ValueError: Floating point image RGB values must be in the 0..1 range.
My question is: should I normalize and take the absolute value of the weights to overcome this error, or is there any other way?
If I normalize and use the absolute value, I think the meaning of the graphs changes.
[[[[ 0.02240197 -1.22057354 -0.55051649]
[-0.50310904 0.00891289 0.15427093]
[ 0.42360783 -0.23392732 -0.56789106]]
[[ 1.12248898 0.99013627 1.6526649 ]
[ 1.09936976 2.39608836 1.83921957]
[ 1.64557672 1.4093554 0.76332706]]
[[ 0.26969245 -1.2997849 -0.64577204]
[-1.88377869 -2.0100112 -1.43068039]
[-0.44531786 -1.67845118 -1.33723605]]]
[[[ 0.71286005 1.45265901 0.64986968]
[ 0.75984162 1.8061738 1.06934202]
[-0.08650422 0.83452386 -0.04468433]]
[[-1.36591709 -2.01630116 -1.54488969]
[-1.46221244 -2.5365622 -1.91758668]
[-0.88827479 -1.59151018 -1.47308767]]
[[ 0.93600738 0.98174071 1.12213969]
[ 1.03908169 0.83749604 1.09565806]
[ 0.71188802 0.85773659 0.86840987]]]
[[[-0.48592842 0.2971966 1.3365227 ]
[ 0.47920835 -0.18186836 0.59673625]
[-0.81358945 1.23862112 0.13635623]]
[[-0.75361633 -1.074965 0.70477796]
[ 1.24439156 -1.53563368 -1.03012812]
[ 0.97597247 0.83084011 -1.81764793]]
[[-0.80762428 -0.62829626 1.37428832]
[ 1.01448071 -0.81775147 -0.41943246]
[ 1.02848887 1.39178836 -1.36779451]]]
...,
[[[ 1.28134537 -0.00482408 0.71610934]
[ 0.95264435 -0.09291686 -0.28001019]
[ 1.34494913 0.64477581 0.96984017]]
[[-0.34442815 -1.40002513 1.66856039]
[-2.21281362 -3.24513769 -1.17751861]
[-0.93520379 -1.99811196 0.72937071]]
[[ 0.63388056 -0.17022935 2.06905985]
[-0.7285465 -1.24722099 0.30488953]
[ 0.24900314 -0.19559766 1.45432627]]]
[[[-0.80684513 2.1764245 -0.73765725]
[-1.35886598 1.71875226 -1.73327696]
[-0.75233924 2.14700699 -0.71064663]]
[[-0.79627383 2.21598244 -0.57396138]
[-1.81044972 1.88310981 -1.63758397]
[-0.6589964 2.013237 -0.48532376]]
[[-0.3710472 1.4949851 -0.30245575]
[-1.25448656 1.20453358 -1.29454732]
[-0.56755757 1.30994892 -0.39370224]]]
[[[-0.67361742 -3.69201088 -1.23768616]
[ 3.12674141 1.70414758 -1.76272404]
[-0.22565465 1.66484773 1.38172317]]
[[ 0.28095332 -2.03035069 0.69989491]
[ 1.97936332 1.76992691 -1.09842575]
[-2.22433758 0.52577412 0.18292744]]
[[ 0.48471382 -1.1984663 1.57565165]
[ 1.09911084 1.31910467 -0.51982772]
[-2.76202297 -0.47073677 0.03936549]]]]
It sounds as if you already know your values are not in that range. Yes, you must rescale them to the range 0.0 to 1.0. I suggest that you keep negative and positive values visibly distinct, but let 0.5 be your new "neutral" point. Scale such that current 0.0 values map to 0.5, and your most extreme value (largest magnitude) maps to 0.0 (if negative) or 1.0 (if positive).
Thanks for the vectors. With m the largest magnitude in the tensor, I suggest the rescaling new = old / (2*m) + 0.5. The values you posted reach about -3.69, so m is at least 3.69; since part of the printout is elided, compute m from the full data rather than reading it off the dump.
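A minimal sketch of that rescaling, assuming the weights are available as a NumPy array. `rescale_symmetric` is a hypothetical helper name, and the sample values are made up apart from the -3.69 extreme taken from the printout.

```python
import numpy as np

def rescale_symmetric(weights):
    """Map weights into [0, 1] so that 0 -> 0.5 and the largest magnitude -> 0 or 1."""
    m = np.abs(weights).max()
    if m == 0:
        # All-zero weights: everything sits at the neutral point
        return np.full(weights.shape, 0.5)
    return weights / (2.0 * m) + 0.5

# Made-up sample; in the question this would be layer1.weight.data.numpy()
w = np.array([[-3.69, 0.0, 1.845],
              [3.12, -1.0, 0.5]])

scaled = rescale_symmetric(w)
print(scaled.min(), scaled.max())
```

Because the scale is symmetric around zero, the sign of each weight stays readable: anything below 0.5 was negative and anything above 0.5 was positive.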

bad result from numpy corrcoef and minimum spanning tree

I have this code:
import math
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

mm = np.array([[1, 4, 7, 8], [2, 2, 8, 4], [1, 13, 1, 5]])
mm = np.column_stack(mm)
mmCov = np.cov(mm, rowvar=0)
print("covariance\n", mmCov)
# my code to get correlations
mmResCor = np.zeros(shape=(3, 3))
for i in range(len(mmCov)):
    for j in range(len(mmCov[i])):
        mmResCor[i][j] = mmCov[i][j] / (math.sqrt(mmCov[i][i] * mmCov[j][j]))
print("correlaciones a mano\n", mmResCor)
mmCor = np.corrcoef(mmCov, rowvar=0)
print("correlations\n", mmCor)
X = csr_matrix(mmCor)
XX = minimum_spanning_tree(X)
print("minimun spanning tree\n", XX)
First: each column represents a variable, with observations in the rows.
NumPy's corrcoef uses this relation with the covariance matrix:
R_{ij} = \frac{ C_{ij} }{ \sqrt{ C_{ii} C_{jj} } }
When I use numpy corrcoef I get this matrix:
correlations
[[ 1. 0.8660254 -0.82603319]
[ 0.8660254 1. -0.99717646]
[-0.82603319 -0.99717646 1. ]]
but when I apply "my code" to get the same result...
mmResCor = np.zeros(shape=(3, 3))
for i in range(len(mmCov)):
    for j in range(len(mmCov[i])):
        mmResCor[i][j] = mmCov[i][j] / (math.sqrt(mmCov[i][i] * mmCov[j][j]))
I get this matrix
correlaciones a mano
[[ 1. 0.67082039 0. ]
[ 0.67082039 1. -0.5 ]
[ 0. -0.5 1. ]]
Why do I get different results if I am supposedly doing the same thing?
One more question:
When I apply minimum_spanning_tree I get this:
minimun spanning tree
(0, 2) -0.826033187631
(1, 2) -0.997176464953
Is there any way to represent these or can I save this result in some variables?
np.corrcoef expects the data as its input, but you are passing it the covariance matrix, so it computes the correlations of the columns of mmCov instead. If you pass the data, you get the same result as your manual computation:
>>> np.corrcoef(mm, rowvar=0)
array([[ 1. , 0.67082039, 0. ],
[ 0.67082039, 1. , -0.5 ],
[ 0. , -0.5 , 1. ]])
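The equivalence can be checked directly. The sketch below rebuilds the data from the question and normalizes the covariance matrix with the R_ij formula in vectorized form; the variable names are my own.

```python
import numpy as np

# Same data as in the question: columns are variables, rows are observations
mm = np.column_stack([[1, 4, 7, 8], [2, 2, 8, 4], [1, 13, 1, 5]])

cov = np.cov(mm, rowvar=False)

# R_ij = C_ij / sqrt(C_ii * C_jj), written with an outer product
std = np.sqrt(np.diag(cov))
corr_manual = cov / np.outer(std, std)

# corrcoef applied to the data (not to cov) gives the same matrix
corr_numpy = np.corrcoef(mm, rowvar=False)
print(corr_manual)
print(corr_numpy)
```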
Regarding the minimum spanning tree, I'm not sure what your question is, but the output XX is a sparse matrix which stores a matrix representation of the tree.
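If the goal is to keep the tree in ordinary variables, one possible approach is to convert the sparse result to a dense array or to a list of edges. The sketch below uses the correlation matrix computed from the data; the variable names are my own. Note that zero entries in a sparse graph are treated as "no edge", so the 0.0 correlation between variables 0 and 2 is not a candidate edge here.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

corr = np.array([[1.0, 0.67082039, 0.0],
                 [0.67082039, 1.0, -0.5],
                 [0.0, -0.5, 1.0]])

tree = minimum_spanning_tree(csr_matrix(corr))

# Dense 3x3 array: non-zero entries are the edges kept in the tree
dense = tree.toarray()

# Or a plain list of (row, col, weight) tuples via the COO format
coo = tree.tocoo()
edges = [(int(i), int(j), float(w))
         for i, j, w in zip(coo.row, coo.col, coo.data)]
print(dense)
print(edges)
```

Whether raw correlations are a sensible edge weight is a separate modelling question; the conversion works the same way for any weight matrix.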

How to efficiently prepare matrices (2-d array) for multiple arguments?

If you want to evaluate a 1-d array for multiple arguments efficiently i.e. without for-loop, you can do this:
from numpy import array

x = array([1, 2, 3])

def gen_1d_arr(x):
    arr = array([2 + x, 2 - x])
    return arr

gen_1d_arr(x).T
and you get:
array([[ 3, 1],
[ 4, 0],
[ 5, -1]])
Okay, but how do you do this for 2-d array like below:
def gen_2d_arr(x):
    arr = array([[2 + x, 2 - x],
                 [2 * x, 2 / x]])
    return arr
and obtain this?:
array([[[ 3. , 1. ],
[ 2. , 2. ]],
[[ 4. , 0. ],
[ 4. , 1. ]],
[[ 5. , -1. ],
[ 6. , 0.66666667]]])
Also, is this generally possible for n-d arrays?
Look at what you get with your function:
In [274]: arr = np.array([[2 + x, 2 - x,],
[2 * x, 2 / x]])
In [275]: arr
Out[275]:
array([[[ 3. , 4. , 5. ],
[ 1. , 0. , -1. ]],
[[ 2. , 4. , 6. ],
[ 2. , 1. , 0.66666667]]])
In [276]: arr.shape
Out[276]: (2, 2, 3)
The 3 comes from x. The middle 2 comes from [2+x, 2-x] pairs, and the 1st 2 from the outer list.
Looks like what you want is a (3,2,2) array. One option is to apply a transpose or axis swap to arr.
arr.transpose([2,0,1])
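Putting that together, here is a sketch of the function from the question with the axis move built in. `np.moveaxis(arr, -1, 0)` is my choice rather than something from the question; it moves the axis that came from x to the front regardless of how many list levels there are, so the same line should carry over to deeper nestings as long as x is 1-d.

```python
import numpy as np

def gen_2d_arr(x):
    # Shape (2, 2, len(x)): the last axis comes from x
    arr = np.array([[2 + x, 2 - x],
                    [2 * x, 2 / x]])
    # Move the x axis to the front -> shape (len(x), 2, 2)
    return np.moveaxis(arr, -1, 0)

x = np.array([1, 2, 3])
out = gen_2d_arr(x)

# The explicit transpose from above gives the same result
arr = np.array([[2 + x, 2 - x],
                [2 * x, 2 / x]])
same = np.array_equal(out, arr.transpose([2, 0, 1]))
print(out.shape, same)
```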
The basic operation of np.array([arr1,arr2]) is to construct a new array with a new dimension in front, i.e. with shape (2, *arr1.shape).
There are other operations that combine arrays. np.concatenate and its variants hstack, vstack, dstack and column_stack join arrays. .reshape(), indexing with [None,...], np.atleast_1d/2d/3d, etc. add dimensions. Look at the code of the stack functions to get some ideas on how to combine arrays using these tools.
On the question of efficiency, my time tests show that concatenate operations are generally faster than np.array. Often np.array converts its inputs to lists and reparses the values. This gives it more power in coercing arrays to specific dtypes, but at the expense of time. But I'd only worry about this with large arrays where construction time is significant.

fft of numpy and octave different on transpose

First off, I know there is an identical question with an answer on SO: FFT in Matlab and numpy / scipy give different results
but the answer given there does not work on the test I did:
When I do an FFT with numpy.fft I get the following result:
In [30]: numpy.fft.fft(numpy.array([1+0.5j, 3+0j, 2+0j, 8+3j]))
Out[30]: array([ 14.+3.5j, -4.+5.5j, -8.-2.5j, 2.-4.5j])
which is identical to the output of (in my case) Octave:
octave:39> fft([1+0.5j,3+0j,2+0j,8+3j])
ans =
Columns 1 through 3:
14.0000 + 3.5000i -4.0000 + 5.5000i -8.0000 - 2.5000i
Column 4:
2.0000 - 4.5000i
But if I transpose the list in Octave and Python I get:
In [9]: numpy.fft.fft(numpy.array([1+0.5j, 3+0j, 2+0j, 8+3j]).transpose())
Out[9]: array([ 14.+3.5j, -4.+5.5j, -8.-2.5j, 2.-4.5j])
and for octave:
octave:40> fft([1+0.5j,3+0j,2+0j,8+3j]')
ans =
14.0000 - 3.5000i
2.0000 + 4.5000i
-8.0000 + 2.5000i
-4.0000 - 5.5000i
I also tried to reshape in Python, but this results in:
In [33]: numpy.fft.fft(numpy.reshape(numpy.array([1+0.5j,3+0j,2+0j,8+3j]), (4,1)))
Out[33]:
array([[ 1.+0.5j],
[ 3.+0.j ],
[ 2.+0.j ],
[ 8.+3.j ]])
How do I get the same result in Python as in Octave?
P.S. I don't have Matlab to test; otherwise I would check whether it returns the same as Octave, just to be sure.
Why NumPy and Octave gave different results:
The inputs were different. In Octave, ' returns the complex conjugate transpose; the plain (non-conjugating) transpose is .':
octave:6> [1+0.5j,3+0j,2+0j,8+3j]'
ans =
1.0000 - 0.5000i
3.0000 - 0.0000i
2.0000 - 0.0000i
8.0000 - 3.0000i
So to make NumPy's result match octave's:
In [115]: np.fft.fft(np.array([1+0.5j, 3+0j, 2+0j, 8+3j]).conj()).reshape(-1, 1)
Out[115]:
array([[ 14.-3.5j],
[ 2.+4.5j],
[ -8.+2.5j],
[ -4.-5.5j]])
octave:7> fft([1+0.5j,3+0j,2+0j,8+3j]')
ans =
14.0000 - 3.5000i
2.0000 + 4.5000i
-8.0000 + 2.5000i
-4.0000 - 5.5000i
In NumPy, the transpose of a 1D array is the same 1D array.
That's why fft(np.array([1+0.5j, 3+0j, 2+0j, 8+3j]).transpose()) returns a 1D array.
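The two points so far (a 1D transpose is a no-op, and Octave's ' also conjugates) can be seen in a short sketch. The names `row`, `col_plain` and `col_conj` are my own, and the mapping to Octave operators in the comments is an informal analogy.

```python
import numpy as np

x = np.array([1+0.5j, 3+0j, 2+0j, 8+3j])

# Transposing a 1D array changes nothing: the shape stays (4,)
print(x.T.shape)

# With an explicit 2D row vector, the transpose does produce a column
row = x[np.newaxis, :]        # shape (1, 4)
col_plain = row.T             # shape (4, 1), roughly Octave's .'
col_conj = row.conj().T       # shape (4, 1), roughly Octave's '
print(col_plain.shape, col_conj[:, 0])
```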
Reshaping after taking the FFT of a 1D array:
You could take the FFT first, and then reshape. To make a 1D array 2-dimensional you could use reshape to obtain a column-like array of shape (4,1), or use np.atleast_2d followed by transpose:
In [115]: np.fft.fft(np.array([1+0.5j, 3+0j, 2+0j, 8+3j]).conj()).reshape(-1, 1)
Out[115]:
array([[ 14.-3.5j],
[ 2.+4.5j],
[ -8.+2.5j],
[ -4.-5.5j]])
or
In [116]: np.atleast_2d(np.fft.fft(np.array([1+0.5j, 3+0j, 2+0j, 8+3j]).conj())).T
Out[116]:
array([[ 14.-3.5j],
[ 2.+4.5j],
[ -8.+2.5j],
[ -4.-5.5j]])
Taking the FFT of a 2D array:
np.fft.fft takes the FFT over the last axis by default.
This is why reshaping to shape (4, 1) did not work. Instead, reshape the array to (1, 4):
In [117]: np.fft.fft(np.reshape(np.array([1+0.5j,3+0j,2+0j,8+3j]), (1,4)).conj()).T
Out[117]:
array([[ 14.-3.5j],
[ 2.+4.5j],
[ -8.+2.5j],
[ -4.-5.5j]])
Or you could use np.matrix to make a 2D matrix of shape (1, 4). Again the FFT is taken over the last axis and returns an array of shape (1, 4), which you can then transpose to get the desired result:
In [121]: np.fft.fft(np.matrix([1+0.5j, 3+0j, 2+0j, 8+3j]).conj()).T
Out[121]:
array([[ 14.-3.5j],
[ 2.+4.5j],
[ -8.+2.5j],
[ -4.-5.5j]])
This, perhaps, gives you the neatest syntax. But be aware that this passes a np.matrix as input but returns a np.ndarray as output.
As Warren Weckesser pointed out, if you already have a 2D NumPy array, and wish to take the FFT of its columns, then you could pass axis=0 to the call to np.fft.fft.
Also, the matrix class (unlike the ndarray class) has an H property which returns the complex conjugate transpose. Thus:
In [114]: np.fft.fft(np.matrix([1+0.5j, 3+0j, 2+0j, 8+3j]).H, axis=0)
Out[114]:
array([[ 14.-3.5j],
[ 2.+4.5j],
[ -8.+2.5j],
[ -4.-5.5j]])
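The axis=0 suggestion also works with a plain ndarray, which may be preferable since the NumPy documentation no longer recommends np.matrix. A minimal sketch, with variable names of my own choosing:

```python
import numpy as np

x = np.array([1+0.5j, 3+0j, 2+0j, 8+3j])

# Column vector holding the conjugated values, like Octave's x'
col = x.conj().reshape(-1, 1)          # shape (4, 1)

# FFT down the rows (axis=0): one length-4 transform
over_rows = np.fft.fft(col, axis=0)

# Default axis=-1: four length-1 transforms, so the values come back unchanged
over_last = np.fft.fft(col)

print(over_rows)
print(over_last)
```

The second call reproduces what the reshape attempt in the question showed: with shape (4, 1), each row is a one-element signal whose FFT is itself.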