I am using Keras with the TensorFlow backend. I want to write a Lambda layer that takes two input tensors, compares every combination of two columns from them using an indicator function, and produces a new tensor of 0/1 values. Here is an example.
Input: x = K.variable(np.array([[1,2,3],[2,3,4]])),
y = K.variable(np.array([[1,2,3],[2,3,4]]))
Output:
z = K.variable(np.array([[1,0],[0,1]]))
As far as I know, TensorFlow provides tf.equal() to compare tensors element-wise. But if I apply it here, I get
>>> z=tf.equal(x,y)
>>> K.eval(z)
array([[True, True, True],
[True, True, True]], dtype=bool)
It only compares elements at the same position.
So my questions are:
1. Is there a TensorFlow API that produces my desired output, or do I need to write my own function?
2. If it is the latter, there is another problem. In Keras the input is a mini-batch, so the input shape looks like (None, m, n). When writing my own function, how do I handle the first dimension, which is None?
Any reply would be appreciated!
You could use broadcasting.
import numpy as np
import tensorflow as tf
x = tf.constant(np.array([[1,2,3],[2,3,4]]))
y = tf.constant(np.array([[1,2,3],[2,3,4]]))
x_ = tf.expand_dims(x, 0)
y_ = tf.expand_dims(y, 1)
res = tf.reduce_all(tf.equal(x_, y_), axis=-1)
sess = tf.Session()
sess.run(res)
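For the mini-batch case in the second question, the same broadcasting idea should carry over if the singleton axes are inserted after the batch axis. Below is a minimal NumPy sketch of that logic; the sample data is made up for illustration. In TensorFlow the corresponding calls would presumably be tf.expand_dims(x, 1), tf.expand_dims(y, 2) and tf.reduce_all(..., axis=-1), and the None batch dimension needs no special handling because broadcasting does not depend on its size.

```python
import numpy as np

# Hypothetical mini-batch: 2 samples, each with m = 2 rows of n = 3 values.
x = np.array([[[1, 2, 3], [2, 3, 4]],
              [[0, 0, 0], [5, 5, 5]]])
y = np.array([[[1, 2, 3], [2, 3, 4]],
              [[5, 5, 5], [9, 9, 9]]])

# Insert singleton axes after the batch axis so rows are paired per sample:
# x_ has shape (batch, 1, m, n) and y_ has shape (batch, m, 1, n).
x_ = x[:, np.newaxis, :, :]
y_ = y[:, :, np.newaxis, :]

# Broadcasting yields (batch, m, m, n); a pair of rows matches only if
# all n entries are equal, so reduce over the last axis.
z = np.all(x_ == y_, axis=-1).astype(np.int32)
print(z.shape)  # (2, 2, 2)
print(z[0])     # indicator matrix for the first sample
```

Here z[b, i, j] is 1 exactly when row i of y[b] equals row j of x[b], which matches the orientation of the non-batched answer above.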
I ran into a problem when using tf.gradients to compute a gradient.
My x is a tf.constant() holding a vector v of shape (4, 1),
and my y is the sigmoid of v, also of shape (4, 1), so the gradient of y with respect to x should be a diagonal matrix of shape (4, 4).
My code:
c = tf.constant(sigmoid(x_0 @ w_0))
d = tf.constant(x_0 @ w_0)
Omega = tf.gradients(c, d)
_Omega = sess.run(Omega)
The error is:
Fetch argument None has invalid type <class 'NoneType'>
In addition, I suspect tf.gradients may be the wrong tool; some other function may be able to compute this.
My question:
Please point out where I went wrong and how to fix it, either using tf.gradients
or using another function.
Edit:
I want to compute the derivative as described in the vector-by-vector section of https://en.wikipedia.org/wiki/Matrix_calculus#Vector-by-vector
and the result Omega would look like the following:
[[s1(1-s1) 0 0 0 ]
[0 s2(1-s2) 0 0 ]
[0 0 s3(1-s3) 0 ]
[0 0 0 s4(1-s4)]]
where si = sigmoid(x_0i @ w_0) and x_0i is the ith row of x_0.
In general, the derivative of a vector with respect to another vector should be a matrix.
First of all, tf.gradients returns None here because c and d are two independent constants: c is created from a precomputed value, so the graph contains no path from d to c. Fetching that None is the reason for your error. One way to compute the gradient is to build the computation inside the TF graph, as in the code below. Another is to use tf.GradientTape in eager execution mode.
import tensorflow as tf
import numpy as np
arr = np.random.rand(4, 1)
ip = tf.Variable(initial_value=arr)
sess = tf.Session()
c_var = tf.math.sigmoid(ip)
Omega = tf.gradients(c_var, ip)
sess.run(tf.global_variables_initializer())
_Omega = sess.run(Omega)
print(_Omega)
Now you can pass a vector of any size. However, I am not sure how you would get a (4, 4) diagonal matrix from tf.gradients.
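On the (4, 4) question: tf.gradients sums the contributions of all outputs, so it returns a (4, 1) result holding the entries s_i(1 - s_i) rather than the full Jacobian. Because the sigmoid acts element-wise, the Jacobian described in the question is just those entries placed on a diagonal. The NumPy sketch below builds that matrix and checks it against finite differences; the input vector is made up for illustration. In TF 2.x, tf.GradientTape.jacobian may be another route, though that is not tested here.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

# Hypothetical (4, 1) input standing in for x_0 @ w_0.
v = np.array([[0.0], [1.0], [-1.0], [2.0]])
s = sigmoid(v).ravel()

# Analytic Jacobian: d s_i / d v_j = s_i * (1 - s_i) if i == j, else 0.
omega = np.diag(s * (1.0 - s))

# Central finite-difference check, perturbing one input component at a time.
eps = 1e-6
numeric = np.zeros((4, 4))
for j in range(4):
    step = np.zeros_like(v)
    step[j, 0] = eps
    numeric[:, j] = (sigmoid(v + step) - sigmoid(v - step)).ravel() / (2 * eps)

print(np.allclose(omega, numeric, atol=1e-6))
```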
input_mb = tf.placeholder(tf.int32, [None, 166, 1], name="input_minibatch")
Suppose I have the code above. Given a value a, I want to retrieve the rows of this mini-batch tensor whose first element equals a. How do I do this in TensorFlow? And how would I do it in NumPy?
To achieve this in NumPy, you just have to write:
selected_rows = myarray[myarray[:,0]== a]
In TensorFlow, use tf.where:
mytensor[tf.squeeze(tf.where(tf.equal(mytensor[:,0],a), None, None))]
I would do it like this in TensorFlow:
tf.gather(mytensor, tf.squeeze(tf.where(tf.equal(mytensor[:,0],a), None, None)), axis=0)
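Since the placeholder in the question is 3-D, with shape (None, 166, 1), the "first element" of each row sits at index [i, 0, 0]. The NumPy sketch below shows the selection on a small made-up array of shape (4, 3, 1). In TensorFlow, tf.boolean_mask(input_mb, tf.equal(input_mb[:, 0, 0], a)) would presumably express the same thing without the tf.where and tf.squeeze steps; treat that as a suggestion rather than a tested solution.

```python
import numpy as np

a = 7  # hypothetical value to match

# Small stand-in for the (None, 166, 1) mini-batch: 4 rows of shape (3, 1).
batch = np.array([[[7], [1], [2]],
                  [[3], [4], [5]],
                  [[7], [8], [9]],
                  [[0], [7], [7]]], dtype=np.int32)

# Boolean mask over the batch axis: True where the first element equals a.
mask = batch[:, 0, 0] == a

# Boolean indexing keeps whole rows, so the trailing dimensions survive.
selected = batch[mask]
print(selected.shape)  # (2, 3, 1)
```

One thing to watch in the tf.squeeze variants above: if exactly one row matches, squeezing all axes collapses the index to a scalar and the gathered result loses its leading dimension.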
Hi, TensorFlow beginner here. I'm trying to get the values of certain elements in a 2-D tensor, in my case class scores from a probability matrix.
The probability matrix has shape (1000, 81), with batch size 1000 and 81 classes. class_ids has shape (1000,) and contains the index of the highest class score for each sample. How do I get the corresponding class scores from the probability matrix using tf.gather?
class_ids = tf.cast(tf.argmax(probs, axis=1), tf.int32)
class_scores = tf.gather_nd(probs,class_ids)
class_scores should be a tensor of shape (1000,) containing the highest class_score for each sample.
Right now I'm using a workaround that looks like this:
class_score_count = []
for i in range(probs.shape[0]):
    prob = probs[i,:]
    class_score = prob[class_ids[i]]
    class_score_count.append(class_score)
class_scores = tf.stack(class_score_count, axis=0)
Thanks for the help!
You can do it with tf.gather_nd like this:
class_ids = tf.cast(tf.argmax(probs, axis=1), tf.int32)
# If shape is not dynamic you can use probs.shape[0].value instead of tf.shape(probs)[0]
row_ids = tf.range(tf.shape(probs)[0], dtype=tf.int32)
idx = tf.stack([row_ids, class_ids], axis=1)
class_scores = tf.gather_nd(probs, idx)
You could also just use tf.reduce_max. It computes the maximum a second time, but that may not be much slower if your data is not too big:
class_scores = tf.reduce_max(probs, axis=1)
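To see why the stacked (row, class) index pairs work, here is a NumPy sketch of the same indexing logic on a small made-up probability matrix. It mirrors the idx construction used with tf.gather_nd above and confirms the result agrees with the row-wise maximum.

```python
import numpy as np

# Hypothetical probability matrix: 3 samples, 3 classes.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.5, 0.3, 0.2],
                  [0.2, 0.2, 0.6]])

class_ids = np.argmax(probs, axis=1)
row_ids = np.arange(probs.shape[0])

# One (row, column) pair per sample, as in the tf.gather_nd answer.
idx = np.stack([row_ids, class_ids], axis=1)

# Pick probs[row, column] for every pair.
class_scores = probs[idx[:, 0], idx[:, 1]]
print(class_scores)
```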
You need to run the class_ids tensor to get its values. The result will be a NumPy array, which you can iterate over with an ordinary loop. You have to do something like this:
predictions = sess.run(tf.argmax(probs, 1), feed_dict={x: X_data})
The predictions variable then holds all the information you need. TensorFlow only returns the values of tensors that you run explicitly.
I think this is what the batch_dims argument for tf.gather is for.
I have two sparse matrices declared using the tf.sparse_placeholder. I need to perform the element-wise multiplication between the two matrices. But I cannot find such an implementation in tensorflow. The most related function is tf.sparse_tensor_dense_matmul, but this is a function performing matrix multiplication between one sparse matrix and one dense matrix.
What I am looking for is element-wise multiplication between two sparse matrices. Is there an implementation of this in TensorFlow?
The following example shows the multiplications that I can already perform. I'm looking forward to seeing a solution.
import tensorflow as tf
import numpy as np
# Element-wise multiplication, two dense matrices
A = tf.placeholder(tf.float32, shape=(100, 100))
B = tf.placeholder(tf.float32, shape=(100, 100))
C = tf.multiply(A, B)
sess = tf.InteractiveSession()
RandA = np.random.rand(100, 100)
RandB = np.random.rand(100, 100)
print(sess.run(C, feed_dict={A: RandA, B: RandB}))
# matrix multiplication, A is sparse and B is dense
A = tf.sparse_placeholder(tf.float32)
B = tf.placeholder(tf.float32, shape=(5,5))
C = tf.sparse_tensor_dense_matmul(A, B)
sess = tf.InteractiveSession()
indices = np.array([[3, 2], [1, 2]], dtype=np.int64)
values = np.array([1.0, 2.0], dtype=np.float32)
shape = np.array([5,5], dtype=np.int64)
Sparse_A = tf.SparseTensorValue(indices, values, shape)
RandB = np.ones((5, 5))
print(sess.run(C, feed_dict={A: Sparse_A, B: RandB}))
Thank you very much!!!
TensorFlow currently has no sparse-sparse element-wise multiplication operation.
We don't plan to add support for this currently, but contributions are definitely welcome! Feel free to create a github issue here: https://github.com/tensorflow/tensorflow/issues/new and perhaps you or someone in the community can pick it up :)
Thanks!
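You can also use tf.matmul or tf.sparse_matmul for sparse matrices by setting a_is_sparse and b_is_sparse to True. Note that these flags are optimization hints for dense tensors that contain many zeros; they do not accept tf.SparseTensor inputs.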
matmul(
a,
b,
transpose_a=False,
transpose_b=False,
adjoint_a=False,
adjoint_b=False,
a_is_sparse=False,
b_is_sparse=False,
name=None
)
For element-wise multiplication, one workaround is to convert the sparse tensors to a dense representation with tf.sparse_to_dense (or tf.sparse.to_dense, which accepts a SparseTensor directly) and then use tf.multiply.
The solution from another post works:
https://stackoverflow.com/a/45103767/2415428
Use __mul__ to perform the element-wise multiplication. Note that SparseTensor.__mul__ multiplies a SparseTensor by a dense tensor.
TF2.1 ref: https://www.tensorflow.org/api_docs/python/tf/sparse/SparseTensor#mul
I'm using Tensorflow 2.4.1.
Here's my workaround to multiply two sparse tensor element-wise:
def sparse_element_wise_mul(a: tf.SparseTensor, b: tf.SparseTensor):
    a_plus_b = tf.sparse.add(a, b)
    a_plus_b_square = tf.square(a_plus_b)
    minus_a_square = tf.negative(tf.square(a))
    minus_b_square = tf.negative(tf.square(b))
    _2ab = tf.sparse.add(
        tf.sparse.add(
            a_plus_b_square,
            minus_a_square
        ),
        minus_b_square
    )
    ab = tf.sparse.map_values(tf.multiply, _2ab, 0.5)
    return ab
Here's a simple explanation:
Given that
(a+b)^2 = a^2 + 2a*b + b^2
we can calculate a*b by
a*b = ((a+b)^2 - a^2 - b^2) / 2
It seems the gradient can be calculated correctly with such a workaround.
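The identity is easy to sanity-check on dense arrays. The NumPy sketch below uses two small made-up matrices with partly overlapping non-zero patterns and confirms that the formula reproduces the element-wise product. One caveat, offered as an assumption rather than something verified against TensorFlow: positions where only one operand is non-zero cancel to zero, so the sparse result may still store explicit zeros there, and floating-point cancellation could leave tiny residues for large values.

```python
import numpy as np

# Hypothetical operands with partly overlapping non-zero patterns.
a = np.array([[0.0, 2.0, 0.0],
              [3.0, 0.0, 0.0]])
b = np.array([[0.0, 5.0, 1.0],
              [0.0, 0.0, 4.0]])

# a*b = ((a+b)^2 - a^2 - b^2) / 2, applied element-wise.
ab = ((a + b) ** 2 - a ** 2 - b ** 2) / 2.0

print(ab)
print(np.allclose(ab, a * b))
```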
I'm trying to write a layer to merge 2 tensors with such a formula
The shapes of x[0] and x[1] are both (?, 1, 500).
M is a 500*500 Matrix.
I want the output to be (?, 500, 500), which I believe is theoretically feasible. The layer should output (1, 500, 500) for every pair of inputs of shapes (1, 1, 500) and (1, 1, 500). As the batch size is variable (dynamic), the output must be (?, 500, 500).
However, I know little about axes. I have tried all the combinations of axes, but the results don't make sense.
I tried numpy.tensordot and keras.backend.batch_dot (TensorFlow). If the batch size is fixed, taking a = (100,1,500) for example, batch_dot(a,M,(2,0)) gives an output of shape (100,1,500).
I'm new to Keras. Sorry for such a basic question, but I have spent 2 days trying to figure it out and it is driving me crazy :(
def call(self,x):
    input1 = x[0]
    input2 = x[1]
    #self.M is defined in build function
    output = K.batch_dot(...)
    return output
Update:
Sorry for the late reply. I tried Daniel's answer with TensorFlow as the Keras backend and it still raises a ValueError for unequal dimensions.
I tried the same code with Theano as the backend and it works.
>>> import numpy as np
>>> import keras.backend as K
Using Theano backend.
>>> from keras.layers import Input
>>> x1 = Input(shape=[1,500,])
>>> M = K.variable(np.ones([1,500,500]))
>>> firstMul = K.batch_dot(x1, M, axes=[1,2])
I don't know how to print a tensor's shape in Theano. It's definitely harder than TensorFlow for me... However, it works.
To understand why, I read through both versions of the backend code, for TensorFlow and for Theano. The differences are as follows.
In this case, x = (?, 1, 500), y = (1, 500, 500), axes = [1, 2]
In tensorflow_backend:
return tf.matmul(x, y, adjoint_a=True, adjoint_b=True)
In theano_backend:
return T.batched_tensordot(x, y, axes=axes)
(Assuming the subsequent changes to out._keras_shape do not affect out's value.)
Each multiplication must specify which axes batch_dot operates on.
Axis 0 - the batch dimension, it's your ?
Axis 1 - the dimension you say has length 1
Axis 2 - the last dimension, of size 500
You won't change the batch dimension, so you will always use batch_dot with axes=[1,2].
But for that to work, you must adjust M to be (?, 500, 500).
To do that, define M not as (500,500) but as (1,500,500), and repeat it along the first axis for the batch size:
import keras.backend as K
#Being M with shape (1,500,500), we repeat it.
BatchM = K.repeat_elements(x=M,rep=batch_size,axis=0)
#Not sure if repeating is really necessary, leaving M as (1,500,500) gives the same output shape at the end, but I haven't checked actual numbers for correctness, I believe it's totally ok.
#Now we can use batch dot properly:
firstMul = K.batch_dot(x[0], BatchM, axes=[1,2]) #will result in (?,500,500)
#we also need to transpose x[1]:
x1T = K.permute_dimensions(x[1],(0,2,1))
#and the second multiplication:
result = K.batch_dot(firstMul, x1T, axes=[1,2])
I prefer TensorFlow, so I have spent the past few days trying to work this out in TensorFlow.
The first approach is very similar to Daniel's solution.
x = tf.placeholder('float32',shape=(None,1,3))
M = tf.placeholder('float32',shape=(None,3,3))
tf.matmul(x, M)
# return: <tf.Tensor 'MatMul_22:0' shape=(?, 1, 3) dtype=float32>
M then has to be fed values of a matching shape, including the batch dimension.
sess = tf.Session()
sess.run(tf.matmul(x,M), feed_dict = {x: [[[1,2,3]]], M: [[[1,2,3],[0,1,0],[0,0,1]]]})
# return : array([[[ 1., 4., 6.]]], dtype=float32)
Another, simpler way is to use tf.einsum.
x = tf.placeholder('float32',shape=(None,1,3))
M = tf.placeholder('float32',shape=(3,3))
tf.einsum('ijk,kl->ijl', x, M)
# return: <tf.Tensor 'MatMul_22:0' shape=(?, 1, 3) dtype=float32>
Let's feed some values.
sess.run(tf.einsum('ijk,kl->ijl', x, M), feed_dict = {x: [[[1,2,3]]], M: [[1,2,3],[0,1,0],[0,0,1]]})
# return: array([[[ 1., 4., 6.]]], dtype=float32)
Now M is a 2-D tensor, and there is no need to give M a batch dimension.
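The subscripts 'ijk,kl->ijl' say: keep the batch axis i and the row axis j of x, and contract the shared axis k against the rows of M. A NumPy sketch with the same subscripts, on a made-up batch of two samples, shows that this is the ordinary matrix product applied to every sample, with the 2-D M shared across the batch.

```python
import numpy as np

# Hypothetical batch of two (1, 3) row vectors.
x = np.array([[[1.0, 2.0, 3.0]],
              [[0.0, 1.0, 0.0]]])      # shape (2, 1, 3)
M = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])        # shape (3, 3), no batch axis

# Contract axis k of x with axis k of M; i (batch) and j are kept.
out = np.einsum('ijk,kl->ijl', x, M)

print(out.shape)                    # (2, 1, 3)
print(np.allclose(out, x @ M))      # same as a broadcast matrix product
```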
What's more, it now seems that this kind of problem can be solved in TensorFlow with tf.einsum. Does that mean Keras ought to call tf.einsum in some situations? At least, I could not find anywhere that Keras calls tf.einsum. In my opinion, Keras also behaves strangely when batch_dot is given a 3-D tensor and a 2-D tensor. In Daniel's answer, he pads M to (1,500,500), but inside K.batch_dot() M is automatically adjusted to (500,500,1). I find that TF adjusts it according to broadcasting rules, and I'm not sure Keras does the same.