no broadcasting for tf.matmul in tensorflow for 4D 3D tensors - tensorflow

First I found another question here: No broadcasting for tf.matmul in TensorFlow.
But that question does not solve my problem.
My problem is multiplying a batch of matrices by another batch of vectors.
x=tf.placeholder(tf.float32,shape=[10,1000,3,4])
y=tf.placeholder(tf.float32,shape=[1000,4])
x is a batch of matrices. There are 10*1000 matrices. Each matrix is of shape [3,4].
y is a batch of vectors. There are 1000 vectors. Each vector is of shape [4].
Dim 1 of x and dim 0 of y are the same (1000 here).
If tf.matmul supported broadcasting, I could write
y=tf.reshape(y,[1,1000,4,1])
result=tf.matmul(x,y)
result=tf.reshape(result,[10,1000,3])
But tf.matmul does not support broadcasting.
If I use the approach from the question I referenced above:
x=tf.reshape(x,[10*1000*3,4])
y=tf.transpose(y,perm=[1,0]) #[4,1000]
result=tf.matmul(x,y)
result=tf.reshape(result,[10,1000,3,1000])
The result is of shape [10,1000,3,1000], not [10,1000,3].
I don't know how to remove the redundant 1000.
How can I get the same result as a tf.matmul that supports broadcasting?

I solved it myself.
x=tf.transpose(x,perm=[1,0,2,3]) #[1000,10,3,4]
x=tf.reshape(x,[1000,30,4])
y=tf.reshape(y,[1000,4,1])
result=tf.matmul(x,y) #[1000,30,1]
result=tf.reshape(result,[1000,10,3])
result=tf.transpose(result,perm=[1,0,2]) #[10,1000,3]
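
For reference, the same contraction can also be written in one line with tf.einsum (an alternative to the reshape/transpose dance above, using the shapes from the question):
# a=10 (outer batch), b=1000 (shared batch), c=3 (rows), d=4 (summed over)
result = tf.einsum('abcd,bd->abc', x, y)  # [10, 1000, 3]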

As indicated here, you can use a function to work around the missing broadcasting:
def broadcast_matmul(A, B):
    """Compute A @ B, broadcasting over the first `N-2` ranks."""
    with tf.variable_scope("broadcast_matmul"):
        return tf.reduce_sum(A[..., tf.newaxis] * B[..., tf.newaxis, :, :],
                             axis=-2)
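
Applied to the shapes from the original question, usage would look roughly like this (this part is my sketch; note the trailing size-1 axis that has to be squeezed):
y2 = tf.reshape(y, [1000, 4, 1])   # give y matrix form
out = broadcast_matmul(x, y2)      # [10, 1000, 3, 1]
out = tf.squeeze(out, axis=-1)     # [10, 1000, 3]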

Related

How to vectorize this operation in numpy?

I have a 2D array s and I want to calculate differences elementwise, i.e. d[i, j] = s[i] - s[j] for every pair of rows.
Since it cannot be written as a single matrix multiplication, I was wondering what is the proper way to vectorize it?
You can use broadcasting for that: d = s[:, None, :] - s[None, :, :]. Note that the None lets you create a new dimension; NumPy then implicitly performs the broadcasting between the two arrays.
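
A runnable illustration (the array contents are made up):
import numpy as np

s = np.arange(6.0).reshape(3, 2)   # 3 rows of length 2
d = s[:, None, :] - s[None, :, :]  # d[i, j] == s[i] - s[j]
print(d.shape)                     # (3, 3, 2)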

How to use tensor shape parameters for something useful?

I'm trying to use the shape of an incoming tensor to form the output, sort of like this:
import tensorflow as tf
import tensorflow.keras.backend as K

def myFunc(x):
    sz = tf.shape(x)[1]
    # ... other stuff that produces y ...
    z = K.repeat_elements(y, sz, axis=1)
This results in TypeError: Tensor object cannot be interpreted as integer.
How do I get around this?
If the dimension of x is known in advance, you can use x.shape[1] instead of tf.shape(x)[1], which will return an integer.
But I would advise using tf.repeat instead of tf.keras.backend.repeat_elements. tf.repeat works regardless of whether you use tf.shape(x) or x.shape.
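
A small sketch of that advice (the shapes here are assumptions for illustration):
import tensorflow as tf

x = tf.random.normal([2, 4, 5])       # incoming tensor
y = tf.random.normal([2, 1, 5])       # produced by the "other stuff"
sz = tf.shape(x)[1]                   # a scalar tensor, not a Python int
z = tf.repeat(y, repeats=sz, axis=1)  # [2, 4, 5]; a tensor repeat count is fine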

How to get batch_size if shape method in Keras & TF returns None for the batch_size?

I'm wrapping a function as a layer. In this function, I need to know the shape of the input. The first index of the shape is the batch_size, and I need to know it! The problem is that K.int_shape returns something like (None, 2, 10). But this None should be known at runtime, right? It is still None and causes an error.
Basically, in my function I want to create a constant that is as long as the batch_size.
Here is my function, for what it's worth:
import tensorflow as tf
import tensorflow.keras.backend as K
from tensorflow.keras.layers import Lambda

def func(inputs):
    max_iter = 3
    x, y = inputs
    c = tf.complex(x, y)
    print(K.int_shape(c))
    z = tf.zeros(shape=K.int_shape(c), dtype='complex64')
    # b = K.switch(K.greater(tf.abs(c), 4), K.constant(1, shape=(1,1)), K.constant(0, shape=(1,1)))
    for i in range(max_iter):
        c = c * c + z
    return c

layer = Lambda(func)
You can see where I created the constant z. I want its shape to be equal to the input shape, but this causes an error with a massive trace. If I replace that with a fixed shape it works. I traced the error to this damn None thing.
Instead of using int_shape, you can use tf.zeros_like to create z:
z = tf.zeros_like(c, dtype='complex64')
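
Applied to func from the question, a minimal sketch of the fix:
def func(inputs):
    max_iter = 3
    x, y = inputs
    c = tf.complex(x, y)
    # tf.zeros_like picks up c's full dynamic shape (batch dim included)
    # and its complex64 dtype, so no static shape is needed
    z = tf.zeros_like(c)
    for i in range(max_iter):
        c = c * c + z
    return c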

Logical AND/OR in Keras Backend

TensorFlow has tf.logical_and() and tf.logical_or() for comparing two boolean tensors, i.e. tf.logical_and(x, y) == True if x == True and y == True (doc). I can't find anything like this in the Keras backend, though. It has keras.backend.any() and .all(), but those aggregate within a tensor, not between tensors. I've been resorting to workarounds with nested K.switch() calls, but it is painfully inelegant.
Let x and y be boolean keras tensors of the same shape.
To take elementwise or, do the following:
keras.backend.any(keras.backend.stack([x, y], axis=0), axis=0)
To take elementwise and, do the following:
keras.backend.all(keras.backend.stack([x, y], axis=0), axis=0)
Here keras.backend.stack([x, y], axis=0) stacks x and y into a new tensor with an additional dimension at position 0. After that, keras.backend.any takes a logical or along the new dimension, and keras.backend.all takes the logical and.
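
A quick runnable check of that recipe (the example values are mine):
from keras import backend as K

x = K.constant([1, 1, 0, 0]) > 0.5  # [True, True,  False, False]
y = K.constant([1, 0, 1, 0]) > 0.5  # [True, False, True,  False]

x_or_y  = K.any(K.stack([x, y], axis=0), axis=0)  # [True, True,  True,  False]
x_and_y = K.all(K.stack([x, y], axis=0), axis=0)  # [True, False, False, False]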
My solution (perhaps not the best, because I haven't found others either) is:
A = K.cast(someBooleanTensor, K.floatx())
B = K.cast(anotherBooleanTensor, K.floatx())
A_and_B = A * B  # this is also something I use a lot for gathering elements
A_or_B = 1 - ((1 - A) * (1 - B))
But thinking about it now... I never tested python operators... perhaps they work?
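
For what it's worth, with the TensorFlow backend the Python bitwise operators are overloaded on boolean tensors, so the short spelling should work there:
a_and_b = x & y  # dispatches to tf.logical_and
a_or_b  = x | y  # dispatches to tf.logical_or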

Creating new vector in tensorflow from argmax performed on another tensor

I have a tensor of shape (?, 3) that looks like [x, y, z], and I need to create a function that takes its argmax, creates a new vector, and assigns values with respect to the dimension and the argmax.
Example:
def f(y):
    v = tf.Variable(tf.zeros(y.get_shape()))
    index = tf.argmax(y)
    v[index] = 1.0  # item assignment is not supported on tensors/variables
    return v
Unfortunately this doesn't work, and I can't figure out how one can do it.
Are you sure that you want to create and assign to a tf.Variable here? It would probably be simpler to use the tf.one_hot() op (available from version 0.8 onwards) to build the result functionally, as you wouldn't have to worry about initialization, etc. For example, you could do the following:
def f(y):
    index = tf.argmax(y, 1)
    return tf.one_hot(index, tf.shape(y)[1], 1.0, 0.0)
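
For example, run eagerly on a made-up batch of two rows:
y = tf.constant([[0.1, 0.9, 0.0],
                 [0.5, 0.2, 0.3]])
f(y)  # [[0., 1., 0.],
      #  [1., 0., 0.]]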