Keras/Tensorflow batch matrix multiplication across axis - tensorflow

Say I have tensors
a
Out[15]: <tf.Tensor 'Placeholder_2:0' shape=(?, 1152, 8) dtype=float32>
b
Out[16]: <tf.Variable 'Variable:0' shape=(16, 8, 1152, 10) dtype=float32_ref>
a represents a batch of 1152 8-dimensional vectors and
b holds 1152*10 matrices, each of shape (16, 8).
I wish to multiply those matrices with the 8-dimensional vectors in a and get a tensor of shape (None, 16, 1152, 10) back. I know in tensorflow one can use einsum to get this job done
tf.einsum('ijkl,bkj->bikl', b, a)
gives me the correct output and shape. But tf.einsum is very slow compared to similar functions like K.batch_dot or tf.tensordot. However, I struggle to understand how those functions handle axes and broadcasting rules. Any help?

By using transpose and reshape you can achieve the same:
a : [batch, 1152, 8] --> reshape --> [batch, 1, 1, 1152, 8]
b : [16,8,1152,10] --> transpose --> [16, 10, 1152, 8]
--> expand_dims --> [1, 16, 10, 1152, 8]
multiply (a, b) --> [batch, 16, 10, 1152, 8]
reduce_sum axis 4 --> [batch, 16, 10, 1152]
Code:
# inputs
import numpy as np
import numpy.testing as npt
import tensorflow as tf

x = np.random.normal(size=(5, 1152, 8))
y = np.random.normal(size=(16, 8, 1152, 10))

a = tf.placeholder(tf.float32, shape=(None, 1152, 8))
b = tf.constant(y, tf.float32)

# [1, 16, 10, 1152, 8] * [batch, 1, 1, 1152, 8], then sum over the last axis
out = tf.reduce_sum(
    tf.expand_dims(tf.transpose(b, [0, 3, 2, 1]), 0)
    * tf.reshape(a, [-1, 1, 1, tf.shape(a)[1], tf.shape(a)[2]]),
    axis=4)
# [batch, 16, 10, 1152] -> [batch, 16, 1152, 10]
out = tf.transpose(out, [0, 1, 3, 2])

out_ein = tf.einsum('ijkl,bkj->bikl', b, a)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    o = sess.run(out, {a: x})
    e = sess.run(out_ein, {a: x})
    npt.assert_almost_equal(o, e, decimal=5)
    # almost the same
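If you want to avoid einsum entirely, the same contraction can also be written with a broadcasting matmul. This is only a sketch, assuming a TensorFlow version whose tf.matmul broadcasts the leading batch dimensions (TF >= 1.14 / 2.x):
# a: [batch, 1152, 8], b: [16, 8, 1152, 10]
a5 = tf.reshape(a, [-1, 1, 1152, 1, 8])                 # [batch, 1, 1152, 1, 8]
b5 = tf.expand_dims(tf.transpose(b, [0, 2, 1, 3]), 0)   # [1, 16, 1152, 8, 10]
out_mm = tf.squeeze(tf.matmul(a5, b5), axis=3)          # [batch, 16, 1152, 10]
The broadcast happens over the leading (batch, 16, 1152) dimensions, and each (1, 8) x (8, 10) matmul contracts the 8-dimensional axis, matching the einsum result.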

Related

Matrix size-incompatible for custom multi model

I am getting the following error:
Node: 'BGNet/dense/BiasAdd'
Matrix size-incompatible: In[0]: [1120,0], In[1]: [2048,1024]
[[{{node BGNet/dense/BiasAdd}}]] [Op:__inference_train_function_11676]
I found the root in this part of the model:
File "<ipython-input-14-3dcbdf5337b8>", line 69, in call
f = self.dense(f)
This is my custom multi model:
class BGNet(tf.keras.Model):
    def __init__(self, img_h, img_w, img_c, batch_size, classes):
        super(BGNet, self).__init__(name='BGNet')
        self.img_h = img_h
        self.img_w = img_w
        self.img_c = img_c
        self.batch_size = batch_size
        self.classes = classes

        # (224, 224, 3)
        self.bgblock0 = BGBlock(f=[32, 32, 32, 32],
                                k=[7, 5, 5, 5],
                                d=[1, 2, 2, 1],
                                stage=0)
        # (112, 112, 32)
        self.bgblock1 = BGBlock(f=[64, 64, 64, 64],
                                k=[5, 5, 5, 3],
                                d=[2, 1, 1, 2],
                                stage=1)
        # (56, 56, 64)
        self.bgblock2 = BGBlock(f=[128, 128, 128, 128],
                                k=[5, 5, 3, 3],
                                d=[2, 1, 2, 1],
                                stage=2)
        # (28, 28, 128)
        self.bgblock3 = BGBlock(f=[256, 256, 256, 256],
                                k=[5, 3, 3, 3],
                                d=[1, 2, 1, 2],
                                stage=3)
        # (14, 14, 256)
        self.bgblock4 = BGBlock(f=[512, 512, 512],
                                k=[3, 3, 3],
                                d=[1, 1, 2],
                                stage=4)
        # (7, 7, 512)
        self.bgblock5 = BGBlock(f=[1024, 1024, 1024],
                                k=[3, 3, 1],
                                d=[2, 1, 1],
                                stage=5)
        # (4, 4, 1024)
        self.bgblock6 = BGBlock(f=[2048, 2048],
                                k=[1, 1],
                                d=[1, 2],
                                stage=6)
        # (2, 2, 2048)
        self.flatten = tf.keras.layers.Flatten(name='flatten')
        self.dense = tf.keras.layers.Dense(1024, activation='tanh', name='dense')
        self.dropout = tf.keras.layers.Dropout(0.2, name='dropout')

        self.prob = tf.keras.layers.Dense(1, activation='sigmoid', name='prob')

        self.concat1 = tf.keras.layers.Concatenate(axis=-1, name='concat1')
        self.bbox1 = tf.keras.layers.Dense(512, activation='relu', name='bbox1')
        self.bbox2 = tf.keras.layers.Dropout(0.1, name='bbox2')
        self.bbox3 = tf.keras.layers.Dense(256, activation='sigmoid', name='bbox3')
        self.bbox = tf.keras.layers.Dense(4, name='bbox')

        self.concat2 = tf.keras.layers.Concatenate(axis=-1, name='concat2')
        self.cat = tf.keras.layers.Dense(len(self.classes), activation='softmax', name='cat')

    def call(self, input_tensor, training=True):
        x = self.bgblock0(input_tensor)
        x = self.bgblock1(x)
        x = self.bgblock2(x)
        x = self.bgblock3(x)
        x = self.bgblock4(x)
        x = self.bgblock5(x)
        x = self.bgblock6(x)

        f = self.flatten(x)
        f = self.dense(f)
        f = self.dropout(f)

        p = self.prob(f)

        b = self.concat1([f, p])
        b = self.bbox1(b)
        b = self.bbox2(b)
        b = self.bbox3(b)
        b = self.bbox(b)

        c = self.concat2([f, b])
        c = self.cat(c)

        return {'prob': p, 'bbox': b, 'class': c}

model1 = BGNet(H, W, C, B, N)
model1.build(input_shape=(B, H, W, C))
model1.call(tf.keras.layers.Input(shape=(H, W, C), batch_size=B))
model1.summary(print_fn=tf.print, expand_nested=True, show_trainable=True)
The custom (BGBlock) blocks are not that important, but if you are curious, they are convolution blocks consisting of conv2d, batch norm, activation, and pooling layers.
The model produces three outputs of different sizes while sharing the first dense layer. The output heads first predict the confidence score (prob in the loss) of an object being present in the image, then the bounding box (bbox in the loss), and finally the class (class in the loss) of the detected object.
The main issue is after the flatten layer. The model builds without errors with input images of (224, 224, 3). This is how the summary of the model looks: [model.summary() output image]
I have also created a custom IOU (Intersection over Union) metric for the bounding boxes. The losses are simple, built-in, and as follows:
loss = {'prob': 'binary_crossentropy', 'bbox': 'mse', 'class': 'categorical_crossentropy'}
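For reference, here is a hedged sketch of how these losses and the custom metric would typically be wired into compile; the metric name iou_metric and the optimizer choice are assumptions, not taken from the original code:
# Hypothetical compile call matching the loss dictionary above;
# `iou_metric` stands in for the custom IOU metric mentioned in the question.
model1.compile(
    optimizer='adam',  # assumed optimizer
    loss={'prob': 'binary_crossentropy',
          'bbox': 'mse',
          'class': 'categorical_crossentropy'},
    metrics={'bbox': [iou_metric]})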
How can I resolve this error?

Copy tensor using K.tile()

I have tensor (None, 196) and after reshaping, it becomes (None, 14, 14).
And now, I want to copy it to the channel axis, so that the shape becomes (None, 14, 14, 512). Lastly, I want to copy it to the timestep axis, so it becomes (None, 10, 14, 14, 512). I accomplish those steps using this code snippet:
def replicate(tensor, input_target):
    batch_size = K.shape(tensor)[0]
    nf, h, w, c = input_target
    x = K.reshape(tensor, [batch_size, 1, h, w, 1])
    # Replicate to channel dimension
    x = K.tile(x, [batch_size, 1, 1, 1, c])
    # Replicate to timesteps dimension
    x = K.tile(x, [batch_size, nf, 1, 1, 1])
    return x
x = ...
x = Lambda(replicate, arguments={'input_target':input_shape})(x)
another_x = Input(shape=input_shape) # shape (10, 14, 14, 512)
x = layers.multiply([x, another_x])
x = ...
I plot the model and the output shape is just like I want it to be. But the problem arises during model training. I set the batch size to 2. This is the error message:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [8,10,14,14,512] vs. [2,10,14,14,512]
[[{{node multiply_1/mul}} = Mul[T=DT_FLOAT, _class=["loc:#training/Adam/gradients/multiply_1/mul_grad/Sum"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](Lambda_2/Tile_1, _arg_another_x_0_0/_189)]]
[[{{node metrics/top_k_categorical_accuracy/Mean_1/_265}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6346_metrics/top_k_categorical_accuracy/Mean_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Looks like, K.tile() increases the batch size from 2 to 8. When I set the batch size to 10, it becomes 1000.
So, my question is how to achieve the result as I want? Is it good way to use tile()? Or, should I use repeat_elements()? Thanks!
I am using Tensorflow 1.12.0 and Keras 2.2.4.
As a rule of thumb, avoid bringing the batch size into the transformations happening inside a Lambda layer.
When you use the tile operation, you only set the dimensions that actually need to change; passing batch_size as a tile multiple is wrong, because tf.tile multiplies each dimension by the corresponding factor. That is why a batch of 2 becomes 2 * 2 * 2 = 8 (the batch dimension gets tiled by batch_size twice), and a batch of 10 becomes 1000. Also, I am using tf.tile instead of K.tile (TF 1.12 doesn't seem to have tile in the Keras backend).
def replicate(tensor, input_target):
    _, nf, h, w, c = input_target
    x = K.reshape(tensor, [-1, 1, h, w, 1])
    # Replicate to the channel dimension
    # (you can also combine the two tiles into a single tf.tile(x, [1, nf, 1, 1, c]))
    x = tf.tile(x, [1, 1, 1, 1, c])
    # Replicate to the timesteps dimension
    x = tf.tile(x, [1, nf, 1, 1, 1])
    return x
Simple example
input_shape= [None, 10, 14, 14, 512]
x = Input(shape=(196,))
x = Lambda(replicate, arguments={'input_target':input_shape})(x)
print(x.shape)
Which gives
>>> (?, 10, 14, 14, 512)
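On the question about K.repeat_elements(): repeating along a size-1 axis is equivalent to tiling it, so a hedged alternative sketch (same assumptions as the function above, with K being the Keras backend) would be:
def replicate_repeat(tensor, input_target):
    _, nf, h, w, c = input_target
    x = K.reshape(tensor, [-1, 1, h, w, 1])
    x = K.repeat_elements(x, c, axis=4)   # -> (?, 1, h, w, c)
    x = K.repeat_elements(x, nf, axis=1)  # -> (?, nf, h, w, c)
    return x
Either way, keep the batch-dimension factor at 1.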

K-means example(tf.expand_dims)

In the example K-means code for TensorFlow below, tf.expand_dims (which inserts a dimension of 1 into a tensor's shape) is used to build points_expanded and centroids_expanded before tf.reduce_sum is computed.
Why do the two calls use different indices (0 and 1) as the second argument?
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

points_n = 200
clusters_n = 3
iteration_n = 100

points = tf.constant(np.random.uniform(0, 10, (points_n, 2)))
centroids = tf.Variable(tf.slice(tf.random_shuffle(points), [0, 0], [clusters_n, -1]))

points_expanded = tf.expand_dims(points, 0)
centroids_expanded = tf.expand_dims(centroids, 1)

distances = tf.reduce_sum(tf.square(tf.subtract(points_expanded, centroids_expanded)), 2)
assignments = tf.argmin(distances, 0)

means = []
for c in range(clusters_n):
    means.append(tf.reduce_mean(
        tf.gather(points, tf.reshape(tf.where(tf.equal(assignments, c)), [1, -1])),
        reduction_indices=[1]))

new_centroids = tf.concat(means, 0)
update_centroids = tf.assign(centroids, new_centroids)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for step in range(iteration_n):
        [_, centroid_values, points_values, assignment_values] = sess.run(
            [update_centroids, centroids, points, assignments])

    print("centroids" + "\n", centroid_values)

    plt.scatter(points_values[:, 0], points_values[:, 1], c=assignment_values, s=50, alpha=0.5)
    plt.plot(centroid_values[:, 0], centroid_values[:, 1], 'kx', markersize=15)
    plt.show()
This is done to subtract each centroid from each point. First, make sure you understand the notion of broadcasting (https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
that is linked from tf.subtract (https://www.tensorflow.org/api_docs/python/tf/subtract). Then, you just need to draw the shapes of points, expanded_points, centroids, and expanded_centroids and understand what values get "broadcast" where. Once you do that you will see that broadcasting allows you to compute exactly what you want - subtract each point from each centroid.
As a sanity check, since there are 200 points, 3 centroids, and each is 2D, we should have 200*3*2 differences. This is exactly what we get:
In [53]: points
Out[53]: <tf.Tensor 'Const:0' shape=(200, 2) dtype=float64>
In [54]: points_expanded
Out[54]: <tf.Tensor 'ExpandDims_4:0' shape=(1, 200, 2) dtype=float64>
In [55]: centroids
Out[55]: <tf.Variable 'Variable:0' shape=(3, 2) dtype=float64_ref>
In [56]: centroids_expanded
Out[56]: <tf.Tensor 'ExpandDims_5:0' shape=(3, 1, 2) dtype=float64>
In [57]: tf.subtract(points_expanded, centroids_expanded)
Out[57]: <tf.Tensor 'Sub_5:0' shape=(3, 200, 2) dtype=float64>
If you are having trouble drawing the shapes, you can think of broadcasting points_expanded, with shape (1, 200, 2), to shape (3, 200, 2) as copying the 200x2 matrix 3 times along the first dimension. The 3x2 matrix in centroids_expanded (of shape (3, 1, 2)) gets copied 200 times along the second dimension.
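A minimal NumPy sketch of the same broadcasting, with small hypothetical sizes just to make the shapes visible:
import numpy as np

points = np.random.uniform(0, 10, (4, 2))      # 4 points in 2-D
centroids = np.random.uniform(0, 10, (3, 2))   # 3 centroids in 2-D

diffs = points[np.newaxis, :, :] - centroids[:, np.newaxis, :]  # (1, 4, 2) - (3, 1, 2)
print(diffs.shape)                              # (3, 4, 2): every point minus every centroid

distances = np.sum(diffs ** 2, axis=2)          # (3, 4) squared distances
assignments = np.argmin(distances, axis=0)      # (4,) index of the nearest centroid per point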

TensorFlow: questions regarding tf.argmax() and tf.equal()

I am learning the TensorFlow, building a multilayer_perceptron model. I am looking into some examples like the one at: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/multilayer_perceptron.ipynb
I then have some questions in the code below:
def multilayer_perceptron(x, weights, biases):
    :
    :

pred = multilayer_perceptron(x, weights, biases)
:
:
with tf.Session() as sess:
    sess.run(init)
    :
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print("Accuracy:", accuracy.eval({x: X_test, y: y_test_onehot}))
I am wondering what tf.argmax(pred, 1) and tf.argmax(y, 1) mean and return (type and value) exactly. And is correct_prediction a variable rather than real values?
Finally, how do we get the y_test_prediction array (the prediction result when the input data is X_test) from the tf session? Thanks a lot!
tf.argmax(input, axis=None, name=None, dimension=None)
Returns the index with the largest value across axis of a tensor.
input is a Tensor and axis describes which axis of the input Tensor to reduce across. For vectors, use axis = 0.
For your specific case let's use two arrays and demonstrate this
pred = np.array([[31, 23, 4, 24, 27, 34],
                 [18, 3, 25, 0, 6, 35],
                 [28, 14, 33, 22, 20, 8],
                 [13, 30, 21, 19, 7, 9],
                 [16, 1, 26, 32, 2, 29],
                 [17, 12, 5, 11, 10, 15]])

y = np.array([[31, 23, 4, 24, 27, 34],
              [18, 3, 25, 0, 6, 35],
              [28, 14, 33, 22, 20, 8],
              [13, 30, 21, 19, 7, 9],
              [16, 1, 26, 32, 2, 29],
              [17, 12, 5, 11, 10, 15]])
Evaluating tf.argmax(pred, 1) gives a tensor whose evaluation will give array([5, 5, 2, 1, 3, 0])
Evaluating tf.argmax(y, 1) gives a tensor whose evaluation will give array([5, 5, 2, 1, 3, 0])
tf.equal(x, y, name=None) takes two tensors(x and y) as inputs and returns the truth value of (x == y) element-wise.
Following our example, tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1)) returns a tensor whose evaluation will give array([True, True, True, True, True, True]).
correct_prediction is a tensor whose evaluation will give a 1-D array of True/False values (which become 1's and 0's after the cast).
y_test_prediction can be obtained by executing pred = tf.argmax(logits, 1)
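A minimal sketch of actually fetching those predictions inside the session, assuming pred and x are the tensors from the question's code and sess is the open session:
# Hypothetical: class indices predicted for X_test
y_test_prediction = sess.run(tf.argmax(pred, 1), feed_dict={x: X_test})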
The documentation for tf.argmax and tf.equal can be accessed by following the links below.
tf.argmax() https://www.tensorflow.org/api_docs/python/math_ops/sequence_comparison_and_indexing#argmax
tf.equal() https://www.tensorflow.org/versions/master/api_docs/python/control_flow_ops/comparison_operators#equal
Reading the documentation:
tf.argmax
Returns the index with the largest value across axes of a tensor.
tf.equal
Returns the truth value of (x == y) element-wise.
tf.cast
Casts a tensor to a new type.
tf.reduce_mean
Computes the mean of elements across dimensions of a tensor.
Now you can easily explain what it does. Your y is one-hot encoded, so it has a single 1 and all other entries are zero. Your pred represents class probabilities. So argmax finds the position of the best prediction and of the correct value, and then you check whether they are the same.
So now your correct_prediction is a vector of True/False values with the size equal to the number of instances you want to predict. You convert it to floats and take the average.
Actually this part is nicely explained in TF tutorial in the Evaluate the Model part
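A small self-contained illustration of that pipeline with hypothetical values, in the TF 1.x session style used by the question:
import tensorflow as tf

pred = tf.constant([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])  # predicted class probabilities
y = tf.constant([[0.0, 1.0], [1.0, 0.0], [1.0, 0.0]])     # one-hot labels

correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))  # [True, True, False]
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))     # (1 + 1 + 0) / 3

with tf.Session() as sess:
    print(sess.run(accuracy))  # ~0.6667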
tf.argmax(input, axis=None, name=None, dimension=None)
Returns the index with the largest value across axis of a tensor.
For this specific case, it receives pred as its input argument and 1 as the axis. The axis describes which axis of the input Tensor to reduce across. For vectors, use axis = 0.
Example: Given the list [2.11,1.0021,3.99,4.32] argmax will return 3 which is the index of the highest value.
correct_prediction is a tensor that will be evaluated later. It is not a regular python variable. It contains the necessary information to compute the value later.
For this specific case, it will be part of another tensor accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float")) and will be evaluated by eval on accuracy.eval({x: X_test, y: y_test_onehot}).
y_test_prediction should be your correct_prediction tensor.
For those who do not have much time to understand tf.argmax:
x = np.array([[1, 9, 3],[4, 5, 6]])
tf.argmax(x, axis = 0)
output:
[array([1, 0, 1], dtype=int64)]
tf.argmax(x, axis = 1)
Output:
[array([1, 2], dtype=int64)]

Confused about conv2d_transpose

I'm getting this error message when using conv2d_transpose:
W tensorflow/core/common_runtime/executor.cc:1102] 0x7fc81f0d6250 Compute status: Invalid argument: Conv2DBackpropInput: Number of rows of out_backprop doesn't match computed: actual = 32, computed = 4
[[Node: generator/g_h1/conv2d_transpose = Conv2DBackpropInput[T=DT_FLOAT, padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/cpu:0"](generator/g_h1/conv2d_transpose/output_shape, generator/g_h1/w/read, _recv_l_0)]]
However, it occurs after the graph is built while compiling the loss function (Adam). Any ideas on what would cause this? I suspect it's related to the input dimensions but I'm not sure exactly why.
Full error: https://gist.github.com/jimfleming/75d88e888044615dd6e3
Relevant code:
# l shape: [batch_size, 32, 32, 4]

output_shape = [self.batch_size, 8, 8, 128]
filter_shape = [7, 7, 128, l.get_shape()[-1]]
strides = [1, 2, 2, 1]
with tf.variable_scope("g_h1"):
    w = tf.get_variable('w', filter_shape, initializer=tf.random_normal_initializer(stddev=0.02))
    h1 = tf.nn.conv2d_transpose(l, w, output_shape=output_shape, strides=strides, padding='SAME')
    h1 = tf.nn.relu(h1)

output_shape = [self.batch_size, 16, 16, 128]
filter_shape = [7, 7, 128, h1.get_shape()[-1]]
strides = [1, 2, 2, 1]
with tf.variable_scope("g_h2"):
    w = tf.get_variable('w', filter_shape, initializer=tf.random_normal_initializer(stddev=0.02))
    h2 = tf.nn.conv2d_transpose(h1, w, output_shape=output_shape, strides=strides, padding='SAME')
    h2 = tf.nn.relu(h2)

output_shape = [self.batch_size, 32, 32, 3]
filter_shape = [5, 5, 3, h2.get_shape()[-1]]
strides = [1, 2, 2, 1]
with tf.variable_scope("g_h3"):
    w = tf.get_variable('w', filter_shape, initializer=tf.random_normal_initializer(stddev=0.02))
    h3 = tf.nn.conv2d_transpose(h2, w, output_shape=output_shape, strides=strides, padding='SAME')
    h3 = tf.nn.tanh(h3)
Thanks for the question! You're exactly right---the problem is that the input and output dimensions being passed to tf.nn.conv2d_transpose don't agree. (The error may be detected when computing gradients, but the gradient computation isn't the problem.)
Let's look at just the first part of your code, and simplify it a little bit:
sess = tf.Session()
batch_size = 3
output_shape = [batch_size, 8, 8, 128]
strides = [1, 2, 2, 1]
l = tf.constant(0.1, shape=[batch_size, 32, 32, 4])
w = tf.constant(0.1, shape=[7, 7, 128, 4])
h1 = tf.nn.conv2d_transpose(l, w, output_shape=output_shape, strides=strides, padding='SAME')
print sess.run(h1)
I replaced the variables with constants --- it's easier to see what's going on.
If you try to run this code, you get a similar error:
InvalidArgumentError: Conv2DCustomBackpropInput: Size of out_backprop doesn't match computed: actual = 32, computed = 4
[[Node: conv2d_transpose_6 = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/cpu:0"](conv2d_transpose_6/output_shape, Const_25, Const_24)]]
Now, the error is a little misleading --- it talks about the 'out_backprop' argument to 'Conv2DCustomBackpropInput'. The key is that tf.nn.conv2d_transpose is actually just the gradient of tf.nn.conv2d, so Tensorflow uses the same code internally (Conv2DCustomBackpropInput) to compute the gradient of tf.nn.conv2d and to compute tf.nn.conv2d_transpose.
The error means that the 'output_shape' you requested is not possible, given the shapes of 'l' and 'w'.
Since tf.nn.conv2d_transpose is the backward (gradient) counterpart of tf.nn.conv2d, one way to see what the correct shapes should be is to use the corresponding forward operation:
output = tf.constant(0.1, shape=output_shape)
expected_l = tf.nn.conv2d(output, w, strides=strides, padding='SAME')
print expected_l.get_shape()
# Prints (3, 4, 4, 4)
That is, in the forward direction, if you provided a tensor of shape 'output_shape', you would get out a tensor of shape (3, 4, 4, 4).
So one way to fix the problem is to change the shape of 'l' to (3, 4, 4, 4); if you change the code above to:
l = tf.constant(0.1, shape=[batch_size, 4, 4, 4])
everything works fine.
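For completeness, a minimal sketch of the fixed snippet under the same constants as above (only the shape of l changes):
# With strides=[1, 2, 2, 1] and padding='SAME', conv2d_transpose doubles the
# spatial dimensions, so an 8x8 output needs a 4x4 input.
l = tf.constant(0.1, shape=[batch_size, 4, 4, 4])
w = tf.constant(0.1, shape=[7, 7, 128, 4])   # [height, width, out_channels, in_channels]
h1 = tf.nn.conv2d_transpose(l, w, output_shape=[batch_size, 8, 8, 128],
                            strides=[1, 2, 2, 1], padding='SAME')
print sess.run(h1).shape  # (3, 8, 8, 128)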
In general, try using tf.nn.conv2d to get a feel for what the relationship between the tensor shapes is. Since tf.nn.conv2d_transpose is its backward counterpart, it has the same relationship between input, output and filter shapes (but with the roles of the input and output reversed.)
Hope that helps!
Using padding='SAME' in the tf.nn.conv2d_transpose() function may work too.