How to merge two dimensions of a tensor without assuming the shape of it - tensorflow

Let's assume the following function:
from tensorflow.python.keras import backend as K
def broadcast_sum(a, b):
    a = K.expand_dims(a, 1)
    b = K.expand_dims(b, 2)
    c = a + b
    cs = K.shape(c)
    return K.reshape(c, (cs[0], -1, cs[-1]))
Given the two tensors of shapes (1, 3, 2) and (1, 4, 2), it correctly returns:
>>> broadcast_sum(K.placeholder((1, 3, 2)), K.placeholder((1, 4, 2)))
>>> <tf.Tensor 'Reshape_2:0' shape=(1, 12, 2) dtype=float32>
Right now, this function works only with 3D input (because of the reshape line). My question is: how can I make the same function work with inputs of any rank, without knowing the shape in advance? Of course, I'm assuming the inputs have the same number of dimensions and are at least 3D. But how can I have a single function that works with 3D, 4D and so on?
I'm also assuming that it's always the second dimension (from the left) that the function will broadcast, and that the rest of the dimensions are identical between the two inputs. Here are the shapes that I want the same function to work with:
>>> broadcast_sum(K.placeholder((1, 3, 5, 2)), K.placeholder((1, 4, 5, 2)))
>>> <tf.Tensor 'Reshape_3:0' shape=(1, 60, 2) dtype=float32>
Of course, the returned tensor is wrong right now. It should be of shape (1, 12, 5, 2).
[UPDATE]
Please also consider that the first dimension (the batch size) could be None. In fact, any of the dimensions except the rightmost one could be None.

And I'm assuming that it's always the second dimension (from left)
that the function will broadcast and the rest of the dimensions are
identical between the two inputs.
Based on this, I reuse the shape information from one of the inputs.
from tensorflow.python.keras import backend as K
def broadcast_sum(a, b):
    final_shape = (a.shape[0], -1, *a.shape[2:])
    a = K.expand_dims(a, 1)
    b = K.expand_dims(b, 2)
    c = a + b
    return K.reshape(c, final_shape)
print(broadcast_sum(K.placeholder((1, 3, 2)), K.placeholder((1, 4, 2))))
print(broadcast_sum(K.placeholder((1, 3, 5, 2)), K.placeholder((1, 4, 5, 2))))
Tensor("Reshape:0", shape=(1, 12, 2), dtype=float32)
Tensor("Reshape_1:0", shape=(1, 12, 5, 2), dtype=float32)
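Because the update says that any dimension except the last may be None, static values such as a.shape[0] may not be available when the graph is built. The shape arithmetic itself does not depend on the rank, and it can be checked in NumPy first. The sketch below is a NumPy analogue under that assumption, not the Keras-backend code above, and the name broadcast_sum_np is hypothetical. In TensorFlow, the same target shape would presumably be assembled from the dynamic tf.shape(c) (for example with tf.concat) instead of from the static shape.

```python
import numpy as np

def broadcast_sum_np(a, b):
    """Pairwise sum over axis 1, then merge the two resulting axes into one."""
    a = np.expand_dims(a, 1)   # (batch, 1, n, ...)
    b = np.expand_dims(b, 2)   # (batch, m, 1, ...)
    c = a + b                  # (batch, m, n, ...)
    # Keep axis 0, merge axes 1 and 2, leave all trailing axes unchanged.
    return c.reshape((c.shape[0], -1) + c.shape[3:])

out3d = broadcast_sum_np(np.zeros((1, 3, 2)), np.zeros((1, 4, 2)))
out4d = broadcast_sum_np(np.zeros((1, 3, 5, 2)), np.zeros((1, 4, 5, 2)))
print(out3d.shape)  # (1, 12, 2)
print(out4d.shape)  # (1, 12, 5, 2)
```

The key point is that the target shape is built from the shape of c after the sum: the first axis, a single -1 for the merged axes, and everything from the fourth axis onward.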

Related

Tensorflow weightNorm with variable length input

import tensorflow as tf
import tensorflow_probability as tfp

in_dim, out_dim = 10, 7
bias = False
activation = None
layer = tfp.layers.weight_norm.WeightNorm(
    tf.keras.layers.Dense(out_dim, input_shape=(None, in_dim),
                          use_bias=bias, activation=activation),
    input_shape=(None, None, in_dim))
I would like to pass an input whose second dimension has variable length.
Suppose I run the below code first.
input = tf.random.normal(shape = (2, 3, 10))
output = layer(input)
output.shape
# [2, 3, 7]
After running the above code, I give another input to the network:
input2 = tf.random.normal(shape = (2, 4, 10))
output2 = layer(input2)
However, this raises an error:
Input 0 of layer "weight_norm_3" is incompatible with the layer:
expected shape=(None, 3, 10), found shape=(2, 4, 10)
I would like the second dimension to have variable length. How can I do that?

Add vs Concatenate layer in Keras

I'm looking through some different neural network architectures and trying to piece together how to recreate them on my own.
One issue I'm running into is the functional difference between the Concatenate() and Add() layers in Keras. It seems like they accomplish similar things (combining multiple layers together), but I don't quite see the real difference between the two.
Here's a sample keras model that takes two separate inputs and then combines them:
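# Imports assumed here; the original snippet does not show them (tf.keras API).
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization, ReLU,
                                     Add, Concatenate, Flatten, Dense)
from tensorflow.keras.models import Model

inputs1 = Input(shape = (32, 32, 3))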
inputs2 = Input(shape = (32, 32, 3))
x1 = Conv2D(kernel_size = 24, strides = 1, filters = 64, padding = "same")(inputs1)
x1 = BatchNormalization()(x1)
x1 = ReLU()(x1)
x1 = Conv2D(kernel_size = 24, strides = 1, filters = 64, padding = "same")(x1)
x2 = Conv2D(kernel_size = 24, strides = 1, filters = 64, padding = "same")(inputs2)
x2 = BatchNormalization()(x2)
x2 = ReLU()(x2)
x2 = Conv2D(kernel_size = 24, strides = 1, filters = 64, padding = "same")(x2)
add = Concatenate()([x1, x2])
out = Flatten()(add)
out = Dense(24, activation = 'softmax')(out)
out = Dense(10, activation = 'softmax')(out)
out = Flatten()(out)
mod = Model([inputs1, inputs2], out)
I can substitute out the Add() layer with the Concatenate() layer and everything works fine, and the models seem similar, but I have a hard time understanding the difference.
For reference, here's the plot of each one with keras's plot_model function:
[Figure: Keras model with added layers]
[Figure: Keras model with concatenated layers]
I notice that the model is bigger when I concatenate than when I add. Is it the case that with a concatenation you just stack the weights from the previous layers on top of each other, and with an Add() you add the values together?
It seems like it should be more complicated, but I'm not sure.
As you said, both of them combine inputs, but they combine them in different ways.
Their names already suggest their usage.
Add(): the inputs are added together element-wise.
For example (assume batch_size=1):
x1 = [[0, 1, 2]]
x2 = [[3, 4, 5]]
x = Add()([x1, x2])
Then x should be [[3, 5, 7]], where the corresponding elements are added.
Notice that the input shapes are (1, 3) and (1, 3), and the output shape is also (1, 3).
Concatenate(): the inputs are concatenated.
For example (assume batch_size=1):
x1 = [[0, 1, 2]]
x2 = [[3, 4, 5]]
x = Concatenate()([x1, x2])
Then x should be [[0, 1, 2, 3, 4, 5]], where the inputs are stacked horizontally.
Notice that the input shapes are (1, 3) and (1, 3), while the output shape is (1, 6).
Similar behavior applies when the tensors have more dimensions.
Concatenate creates a bigger model for a simple reason: its output size is the sum of the sizes of all inputs (along the concatenation axis), while Add produces an output the same size as each input.
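The shape difference can be reproduced with plain NumPy arrays. This is only an analogue of the Keras layers, assuming Concatenate() joins along the last axis (its default), and the zero-filled feature maps simply mimic the output shape of the two convolution branches in the question.

```python
import numpy as np

x1 = np.array([[0, 1, 2]])
x2 = np.array([[3, 4, 5]])

added = x1 + x2                              # element-wise sum, like Add()
joined = np.concatenate([x1, x2], axis=-1)   # joined on the last axis, like Concatenate()

print(added, added.shape)    # [[3 5 7]] (1, 3)
print(joined, joined.shape)  # [[0 1 2 3 4 5]] (1, 6)

# Feature maps shaped like the outputs of the two conv branches (64 filters each).
f1 = np.zeros((1, 32, 32, 64))
f2 = np.zeros((1, 32, 32, 64))
print((f1 + f2).shape)                          # (1, 32, 32, 64)
print(np.concatenate([f1, f2], axis=-1).shape)  # (1, 32, 32, 128)
```

After flattening, the concatenated tensor has twice as many values as the summed one, so the Dense layer that follows needs twice as many weights. That accounts for the larger model.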
For more information about add/concatenate, and other ways to combine multiple inputs, see this

How to efficiently mask tensors in tensorflow only given the indices of the last axis?

Imagine I have a tensor of shape (batch_size, a, ..., c, d, e), where a, ..., c, d, e are known integers. For example, (batch_size, 500, 3, 2, 2, 69) or (batch_size, 2, 2).
My question is for all tensors but let's stick to the example of tensor1.get_shape() = (?, 500, 3, 2, 2, 69)
Given that I have tensor2 with tensor2.get_shape() = (?, 500, 3, 2, 2, 14) containing indices of the last axis of tensor1, I have 2 problems:
1) I want to construct a mask for tensor1 of shape (?, 500, 3, 2, 2, 69) from tensor2. For example a possible row along the last axis for tensor2 would be [1,8,3,68,2,4,58,19,20,21,26,48,56,11] but since tensor2 is constructed from tensor1 these indices vary for new input. These are the indices of the last axis that have to be kept of tensor1. Everything else has to be masked out.
2) given that I have the mask of shape (?, 500, 3, 2, 2, 69) for tensor1, how do I mask out the undesired values while maintaining the batch size dimension? The masked out tensor should have shape (?, 500, 3, 2, 2, 14).
Answers in keras or numpy would also be neat, although knowing how to do it in numpy wouldn't solve my problem, I'd still like to know.
answer to 1:
tf.gather_nd(mask, [
    tf.range(tf.shape(tensor1)[0])[:, None, None, None, None, None],
    tf.range(tf.shape(tensor1)[1])[:, None, None, None, None],
    tf.range(tf.shape(tensor1)[2])[:, None, None, None],
    tf.range(tf.shape(tensor1)[3])[:, None, None],
    tf.range(tf.shape(tensor1)[4])[:, None],
    tensor2])
There is probably no solution to 2. I will try pytorch.
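Since the question also asks about NumPy, here is a minimal sketch of both steps there, using np.put_along_axis and np.take_along_axis. The smaller leading dimensions and the random data are assumptions made only to keep the example light. In TensorFlow, tf.gather with its batch_dims argument may play a similar role for step 2, but that is not verified here.

```python
import numpy as np

rng = np.random.default_rng(0)
keep = 14
tensor1 = rng.normal(size=(4, 5, 3, 2, 2, 69))

# 14 distinct indices into the last axis for every row (illustrative data).
tensor2 = np.argsort(rng.random(tensor1.shape), axis=-1)[..., :keep]

# (1) Boolean mask that is True exactly at the indices listed in tensor2.
mask = np.zeros(tensor1.shape, dtype=bool)
np.put_along_axis(mask, tensor2, True, axis=-1)

# (2) Gather the kept values; all leading dimensions are preserved.
kept = np.take_along_axis(tensor1, tensor2, axis=-1)

print(mask.shape)  # (4, 5, 3, 2, 2, 69)
print(kept.shape)  # (4, 5, 3, 2, 2, 14)
```

Note that kept follows the order of the indices in tensor2, whereas boolean indexing with mask would return the values in ascending index order and flatten the result.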

K-means example(tf.expand_dims)

In the TensorFlow K-means example code, tf.expand_dims (which inserts a dimension of size 1 into a tensor's shape) is applied to produce points_expanded and centroids_expanded before tf.reduce_sum is computed.
Why do the two calls use different indices (0 and 1) as the second argument?
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt  # needed for the plotting calls at the end

points_n = 200
clusters_n = 3
iteration_n = 100

points = tf.constant(np.random.uniform(0, 10, (points_n, 2)))
centroids = tf.Variable(tf.slice(tf.random_shuffle(points), [0, 0], [clusters_n, -1]))

points_expanded = tf.expand_dims(points, 0)
centroids_expanded = tf.expand_dims(centroids, 1)

# Squared distance from every centroid to every point: shape (clusters_n, points_n)
distances = tf.reduce_sum(tf.square(tf.subtract(points_expanded, centroids_expanded)), 2)
assignments = tf.argmin(distances, 0)

means = []
for c in range(clusters_n):
    means.append(tf.reduce_mean(
        tf.gather(points, tf.reshape(tf.where(tf.equal(assignments, c)), [1, -1])),
        reduction_indices=[1]))

new_centroids = tf.concat(means, 0)
update_centroids = tf.assign(centroids, new_centroids)
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    for step in range(iteration_n):
        [_, centroid_values, points_values, assignment_values] = sess.run(
            [update_centroids, centroids, points, assignments])
    print("centroids" + "\n", centroid_values)

plt.scatter(points_values[:, 0], points_values[:, 1], c=assignment_values, s=50, alpha=0.5)
plt.plot(centroid_values[:, 0], centroid_values[:, 1], 'kx', markersize=15)
plt.show()
This is done to subtract each centroid from each point. First, make sure you understand the notion of broadcasting (https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html), which is linked from the tf.subtract documentation (https://www.tensorflow.org/api_docs/python/tf/subtract). Then you just need to write out the shapes of points, points_expanded, centroids, and centroids_expanded and work out which values get "broadcast" where. Once you do that, you will see that broadcasting lets you compute exactly what you want: the difference between each point and each centroid.
As a sanity check, since there are 200 points, 3 centroids, and each is 2D, we should have 200*3*2 differences. This is exactly what we get:
In [53]: points
Out[53]: <tf.Tensor 'Const:0' shape=(200, 2) dtype=float64>
In [54]: points_expanded
Out[54]: <tf.Tensor 'ExpandDims_4:0' shape=(1, 200, 2) dtype=float64>
In [55]: centroids
Out[55]: <tf.Variable 'Variable:0' shape=(3, 2) dtype=float64_ref>
In [56]: centroids_expanded
Out[56]: <tf.Tensor 'ExpandDims_5:0' shape=(3, 1, 2) dtype=float64>
In [57]: tf.subtract(points_expanded, centroids_expanded)
Out[57]: <tf.Tensor 'Sub_5:0' shape=(3, 200, 2) dtype=float64>
If you are having trouble drawing the shapes, you can think of broadcasting points_expanded from shape (1, 200, 2) to shape (3, 200, 2) as copying the 200x2 matrix 3 times along the first dimension. Likewise, the 3x2 matrix in centroids_expanded (of shape (3, 1, 2)) gets copied 200 times along the second dimension.
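The same broadcasting rules apply in NumPy, so the shapes can be checked without a TensorFlow session. This is a small sketch under that assumption; using the first three points as centroids is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.uniform(0, 10, (200, 2))
centroids = points[:3]                            # three points as initial centroids

points_expanded = points[np.newaxis, :, :]        # (1, 200, 2)
centroids_expanded = centroids[:, np.newaxis, :]  # (3, 1, 2)

# Broadcasting pairs every centroid with every point.
diff = points_expanded - centroids_expanded       # (3, 200, 2)
distances = np.sum(np.square(diff), axis=2)       # (3, 200)
assignments = np.argmin(distances, axis=0)        # (200,)

print(diff.shape, distances.shape, assignments.shape)
# diff[c, p] equals points[p] - centroids[c]
```

Expanding points at axis 0 and centroids at axis 1 puts the two size-1 axes in different positions, which is what produces one difference per (centroid, point) pair.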

TensorFlow: questions regarding tf.argmax() and tf.equal()

I am learning TensorFlow by building a multilayer perceptron model. I am looking at some examples like the one at: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/multilayer_perceptron.ipynb
I have some questions about the code below:
def multilayer_perceptron(x, weights, biases):
    :
    :
pred = multilayer_perceptron(x, weights, biases)
    :
    :
with tf.Session() as sess:
    sess.run(init)
    :
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print ("Accuracy:", accuracy.eval({x: X_test, y: y_test_onehot}))
I am wondering what exactly tf.argmax(pred, 1) and tf.argmax(y, 1) mean and return (type and value). And is correct_prediction a variable rather than real values?
Finally, how do we get the y_test_prediction array (the prediction result when the input data is X_test) from the tf session? Thanks a lot!
tf.argmax(input, axis=None, name=None, dimension=None)
Returns the index with the largest value across axis of a tensor.
input is a Tensor and axis describes which axis of the input Tensor to reduce across. For vectors, use axis = 0.
For your specific case let's use two arrays and demonstrate this
pred = np.array([[31, 23, 4, 24, 27, 34],
                 [18, 3, 25, 0, 6, 35],
                 [28, 14, 33, 22, 20, 8],
                 [13, 30, 21, 19, 7, 9],
                 [16, 1, 26, 32, 2, 29],
                 [17, 12, 5, 11, 10, 15]])
y = np.array([[31, 23, 4, 24, 27, 34],
              [18, 3, 25, 0, 6, 35],
              [28, 14, 33, 22, 20, 8],
              [13, 30, 21, 19, 7, 9],
              [16, 1, 26, 32, 2, 29],
              [17, 12, 5, 11, 10, 15]])
Evaluating tf.argmax(pred, 1) gives a tensor whose evaluation will give array([5, 5, 2, 1, 3, 0])
Evaluating tf.argmax(y, 1) gives a tensor whose evaluation will give array([5, 5, 2, 1, 3, 0])
tf.equal(x, y, name=None) takes two tensors(x and y) as inputs and returns the truth value of (x == y) element-wise.
Following our example, tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1)) returns a tensor whose evaluation will give array([True, True, True, True, True, True]).
correct_prediction is a tensor whose evaluation will give a 1-D boolean array, which becomes an array of 0s and 1s after tf.cast.
y_test_prediction can be obtained by executing pred = tf.argmax(logits, 1)
The documentation for tf.argmax and tf.equal can be accessed by following the links below.
tf.argmax() https://www.tensorflow.org/api_docs/python/math_ops/sequence_comparison_and_indexing#argmax
tf.equal() https://www.tensorflow.org/versions/master/api_docs/python/control_flow_ops/comparison_operators#equal
Reading the documentation:
tf.argmax
Returns the index with the largest value across axes of a tensor.
tf.equal
Returns the truth value of (x == y) element-wise.
tf.cast
Casts a tensor to a new type.
tf.reduce_mean
Computes the mean of elements across dimensions of a tensor.
Now you can easily explain what it does. Your y is one-hot encoded, so each row has a single 1 and all other entries are zero. Your pred represents the class probabilities. So argmax finds the position of the best prediction and the position of the correct value. After that, you check whether they are the same.
So now your correct_prediction is a vector of True/False values with the size equal to the number of instances you want to predict. You convert it to floats and take the average.
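The same chain of operations can be traced with NumPy on a tiny made-up batch. The probabilities and labels below are illustrative assumptions, not data from the question; the NumPy calls only mirror tf.argmax, tf.equal, tf.cast and tf.reduce_mean.

```python
import numpy as np

# Made-up class probabilities for 4 instances and 3 classes.
pred = np.array([[0.1, 0.7, 0.2],
                 [0.8, 0.1, 0.1],
                 [0.3, 0.3, 0.4],
                 [0.2, 0.5, 0.3]])
# One-hot labels for the same 4 instances.
y = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 1, 0],
              [0, 1, 0]])

predicted_class = np.argmax(pred, axis=1)           # [1 0 2 1]
true_class = np.argmax(y, axis=1)                   # [1 0 1 1]
correct_prediction = predicted_class == true_class  # [ True  True False  True]
accuracy = correct_prediction.astype(float).mean()  # 0.75

print(predicted_class, true_class, correct_prediction, accuracy)
```

Here predicted_class plays the role of the prediction array the question asks about, while correct_prediction only records which of those predictions match the labels.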
This part is nicely explained in the TF tutorial, in the "Evaluate the Model" section.
tf.argmax(input, axis=None, name=None, dimension=None)
Returns the index with the largest value across axis of a tensor.
In this specific case, it receives pred as its input and 1 as the axis. The axis describes which axis of the input Tensor to reduce across. For vectors, use axis = 0.
Example: Given the list [2.11,1.0021,3.99,4.32] argmax will return 3 which is the index of the highest value.
correct_prediction is a tensor that will be evaluated later. It is not a regular python variable. It contains the necessary information to compute the value later.
For this specific case, it will be part of another tensor accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float")) and will be evaluated by eval on accuracy.eval({x: X_test, y: y_test_onehot}).
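y_test_prediction is not the correct_prediction tensor itself: correct_prediction only records whether each prediction matches its label. The predicted classes come from evaluating tf.argmax(pred, 1) on the test inputs, for example tf.argmax(pred, 1).eval({x: X_test}) inside the session.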
For those who do not have much time to understand tf.argmax:
x = np.array([[1, 9, 3],[4, 5, 6]])
tf.argmax(x, axis = 0)
output:
[array([1, 0, 1], dtype=int64)]
tf.argmax(x, axis = 1)
Output:
[array([1, 2], dtype=int64)]