tensorflow: how to perform element-wise multiplication between two sparse matrices

I have two sparse matrices declared using tf.sparse_placeholder, and I need to perform element-wise multiplication between them. However, I cannot find such an implementation in TensorFlow. The most closely related function is tf.sparse_tensor_dense_matmul, but that performs matrix multiplication between one sparse matrix and one dense matrix.
What I hope to find is a way to perform element-wise multiplication between two sparse matrices. Is there any implementation of this in TensorFlow?
Below is an example of performing multiplication between dense matrices. I'm looking forward to seeing a solution.
import tensorflow as tf
import numpy as np
# Element-wise multiplication, two dense matrices
A = tf.placeholder(tf.float32, shape=(100, 100))
B = tf.placeholder(tf.float32, shape=(100, 100))
C = tf.multiply(A, B)
sess = tf.InteractiveSession()
RandA = np.random.rand(100, 100)
RandB = np.random.rand(100, 100)
print(sess.run(C, feed_dict={A: RandA, B: RandB}))
# matrix multiplication, A is sparse and B is dense
A = tf.sparse_placeholder(tf.float32)
B = tf.placeholder(tf.float32, shape=(5,5))
C = tf.sparse_tensor_dense_matmul(A, B)
sess = tf.InteractiveSession()
indices = np.array([[3, 2], [1, 2]], dtype=np.int64)
values = np.array([1.0, 2.0], dtype=np.float32)
shape = np.array([5,5], dtype=np.int64)
Sparse_A = tf.SparseTensorValue(indices, values, shape)
RandB = np.ones((5, 5))
print(sess.run(C, feed_dict={A: Sparse_A, B: RandB}))
Thank you very much!!!

TensorFlow currently has no sparse-sparse element-wise multiplication operation.
We don't plan to add support for this currently, but contributions are definitely welcome! Feel free to create a github issue here: https://github.com/tensorflow/tensorflow/issues/new and perhaps you or someone in the community can pick it up :)
Thanks!

You can also use tf.matmul or tf.sparse_matmul for sparse matrices by setting a_is_sparse and b_is_sparse to True.
matmul(
    a,
    b,
    transpose_a=False,
    transpose_b=False,
    adjoint_a=False,
    adjoint_b=False,
    a_is_sparse=False,
    b_is_sparse=False,
    name=None
)
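Note that a_is_sparse/b_is_sparse are just hints for ordinary dense Tensors that happen to contain many zeros; they do not accept tf.SparseTensor inputs, and the result is still a matrix product, not an element-wise one. A minimal TF 1.x sketch for illustration:
import tensorflow as tf
import numpy as np

A = tf.placeholder(tf.float32, shape=(5, 5))
B = tf.placeholder(tf.float32, shape=(5, 5))
# a_is_sparse / b_is_sparse only tell matmul that the *dense* inputs are mostly zeros
C = tf.matmul(A, B, a_is_sparse=True, b_is_sparse=True)

sess = tf.InteractiveSession()
DenseA = np.zeros((5, 5), dtype=np.float32)
DenseA[3, 2] = 1.0
DenseA[1, 2] = 2.0
DenseB = np.ones((5, 5), dtype=np.float32)
print(sess.run(C, feed_dict={A: DenseA, B: DenseB}))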
For element-wise multiplication, one workaround is to use tf.sparse_to_dense to convert the sparse tensors to a dense representation and then use tf.multiply for the element-wise multiplication.
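Here is a minimal sketch of that densify-then-multiply workaround in TF 1.x. It uses tf.sparse_tensor_to_dense, which takes a SparseTensor directly (the indices are fed in row-major order, as that conversion expects); note that densifying gives up the memory savings of the sparse format.
import tensorflow as tf
import numpy as np

A = tf.sparse_placeholder(tf.float32)
B = tf.sparse_placeholder(tf.float32)

# Densify both operands, then multiply element-wise
A_dense = tf.sparse_tensor_to_dense(A)
B_dense = tf.sparse_tensor_to_dense(B)
C = tf.multiply(A_dense, B_dense)

sess = tf.InteractiveSession()
# indices are given in row-major order, as sparse_tensor_to_dense expects
indices = np.array([[1, 2], [3, 2]], dtype=np.int64)
values = np.array([2.0, 1.0], dtype=np.float32)
shape = np.array([5, 5], dtype=np.int64)
Sparse_A = tf.SparseTensorValue(indices, values, shape)
Sparse_B = tf.SparseTensorValue(indices, values, shape)
print(sess.run(C, feed_dict={A: Sparse_A, B: Sparse_B}))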

The solution from another post works:
https://stackoverflow.com/a/45103767/2415428
Use the __mul__ operator to perform the element-wise multiplication.
TF2.1 ref: https://www.tensorflow.org/api_docs/python/tf/sparse/SparseTensor#mul
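A minimal TF 2.x sketch of that approach: SparseTensor's __mul__ (the * operator) multiplies a sparse tensor element-wise with a dense tensor, so here one of the two sparse operands is densified first (assuming it fits in memory once dense).
import tensorflow as tf

a = tf.sparse.SparseTensor(indices=[[1, 2], [3, 2]], values=[2.0, 1.0],
                           dense_shape=[5, 5])
b = tf.sparse.SparseTensor(indices=[[1, 2], [4, 4]], values=[3.0, 5.0],
                           dense_shape=[5, 5])

# __mul__ (the * operator) multiplies a SparseTensor element-wise with a dense
# tensor and returns a SparseTensor with a's sparsity pattern.
c = a * tf.sparse.to_dense(b)
print(tf.sparse.to_dense(c))  # only [1, 2] is non-zero: 2.0 * 3.0 = 6.0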

I'm using Tensorflow 2.4.1.
Here's my workaround to multiply two sparse tensor element-wise:
def sparse_element_wise_mul(a: tf.SparseTensor, b: tf.SparseTensor):
    a_plus_b = tf.sparse.add(a, b)
    a_plus_b_square = tf.square(a_plus_b)
    minus_a_square = tf.negative(tf.square(a))
    minus_b_square = tf.negative(tf.square(b))
    _2ab = tf.sparse.add(
        tf.sparse.add(
            a_plus_b_square,
            minus_a_square
        ),
        minus_b_square
    )
    ab = tf.sparse.map_values(tf.multiply, _2ab, 0.5)
    return ab
Here's a brief explanation:
Given that
(a+b)^2 = a^2 + 2a*b + b^2
we can calculate a*b by
a*b = ((a+b)^2 - a^2 - b^2) / 2
It seems the gradient can be calculated correctly with such a workaround.
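For completeness, a small usage sketch of the helper above, assuming it behaves as described and both SparseTensors share the same dense shape:
import tensorflow as tf

a = tf.sparse.SparseTensor(indices=[[0, 0], [1, 2]], values=[2.0, 3.0],
                           dense_shape=[3, 3])
b = tf.sparse.SparseTensor(indices=[[0, 0], [2, 1]], values=[4.0, 5.0],
                           dense_shape=[3, 3])

ab = sparse_element_wise_mul(a, b)
# The overlapping entry [0, 0] becomes 2.0 * 4.0 = 8.0; everything else is 0.
print(tf.sparse.to_dense(ab))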

Related

Tabular data: Implementing a custom tensor layer without resorting to iteration

I have an idea for a tensor operation that would not be difficult to implement via iteration, with batch size one. However I would like to parallelize it as much as possible.
I have two tensors with shape (n, 5) called X and Y. X is actually supposed to represent 5 one-dimensional tensors of shape (n, 1): (x_1, ..., x_5). Ditto for Y.
I would like to compute a tensor with shape (n, 25) where each column represents the output of the tensor operation f(x_i, y_j), where f is fixed for all 1 <= i, j <= 5. The operation f has output shape (n, 1), just like x_i and y_i.
I feel it is important to clarify that f is essentially a fully-connected layer from the concatenated [...x_i, ...y_i] tensor with shape (1, 10), to an output layer with shape (1,5).
Again, it is easy to see how to do this manually with iteration and slicing. However this is probably very slow. Performing this operation in batches, where the tensors X, Y now have shape (n, 5, batch_size) is also desirable, particularly for mini-batch gradient descent.
It is difficult to really articulate here why I desire to create this network; I feel it is suited for my domain of 'itemized tabular data' and cuts down significantly on the number of weights per operation, compared to a fully connected network.
Is this possible using tensorflow? Certainly not using just keras.
Below is an example in numpy per AloneTogether's request
import numpy as np
features = 16
batch_size = 256
X_batch = np.random.random((features, 5, batch_size))
Y_batch = np.random.random((features, 5, batch_size))
# one tensor operation to reduce weights in this custom 'layer'
f = np.random.random((features, 2 * features))
for b in range(batch_size):
    X = X_batch[:, :, b]
    Y = Y_batch[:, :, b]
    for i in range(5):
        x_i = X[:, i:i+1]
        for j in range(5):
            y_j = Y[:, j:j+1]
            x_i_y_j = np.concatenate([x_i, y_j], axis=0)
            # f(x_i, y_j), implemented by a fully-connected layer
            f_i_j = np.matmul(f, x_i_y_j)
All operations you need (concatenation and matrix multiplication) can be batched.
The difficult part here is that you want to concatenate the features of all items in X with the features of all items in Y (all combinations).
My recommended solution is to expand the dimensions of X to [batch, features, 5, 1] and the dimensions of Y to [batch, features, 1, 5].
Then tf.repeat() both tensors so their shapes become [batch, features, 5, 5].
Now you can concatenate X and Y. You will have a tensor of shape [batch, 2*features, 5, 5]. Observe that this way all combinations are built.
Next step is matrix multiplication. tf.matmul() can also do batch matrix multiplication, but I use here tf.einsum() because I want more control over which dimensions are considered as batch.
Full code:
import tensorflow as tf
import numpy as np

batch_size = 3
features = 6
items = 5

x = np.random.uniform(size=[batch_size, features, items])
y = np.random.uniform(size=[batch_size, features, items])
f = np.random.uniform(size=[2 * features, features])

# Expand and repeat so that every item of x is paired with every item of y
x_reps = tf.repeat(x[:, :, :, tf.newaxis], items, axis=3)  # [batch, features, items, items]
y_reps = tf.repeat(y[:, :, tf.newaxis, :], items, axis=2)  # [batch, features, items, items]

# Concatenate along the feature axis: [batch, 2*features, items, items]
xy_conc = tf.concat([x_reps, y_reps], axis=1)

# Batched matrix multiplication over the feature dimension
f_i_j = tf.einsum("bfij,fg->bgij", xy_conc, f)
f_i_j = tf.reshape(f_i_j, [batch_size, features, items * items])
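As a quick check (my addition, not part of the original answer, and assuming eager execution so .numpy() is available), one (batch, i, j) entry of the batched result can be compared against the explicit computation from the question:
import numpy as np

# Check one (batch, i, j) entry against the explicit computation
b_idx, i, j = 1, 2, 4
manual = f.T @ np.concatenate([x[b_idx, :, i], y[b_idx, :, j]])
batched = f_i_j.numpy()[b_idx, :, i * items + j]
print(np.allclose(manual, batched))  # True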

Keras Custom Merge Two Tensors

I have two tensors of shape [1,4] say,
[1,2,3,4]
[0.2,0.3,0.4,0.5]
Now I want to merge them in a merge layer (perhaps using some custom function with the TensorFlow backend) so that they become
[1,0.2,2,0.3,3,0.4,4,0.5]
How can I achieve this? The shape of the tensor is fixed. Thank you for your time.
A possible solution is to concatenate the tensors along axis 0 and then gather the values according to the indices, like this:
import tensorflow as tf
from itertools import chain
A = tf.constant([1, 2, 3, 4])
B = tf.constant([0.2, 0.3, 0.4, 0.5])
# Cast A to be compatible with B
A = tf.cast(A, tf.float32)
# Concat AB one next to the other
AB = tf.concat([A, B], axis=0)
# Generate the index sequence 0, 4, 1, 5, ... in order to index into the
# concatenated tensor, then use gather to collect the values at those positions
NEW = tf.gather(AB,
                list(
                    chain.from_iterable((i, i + A.shape[0].value)
                                        for i in range(A.shape[0].value))))
with tf.Session() as sess:
    print(sess.run([NEW]))
Using TensorFlow, you can use reshape and concat. These operations are also available in the Keras backend.
a = tf.constant([1,2,3,4])
b = tf.constant([10,20,30,40])
c = tf.reshape(tf.concat([tf.reshape(a,(-1,1)), tf.reshape(b, (-1,1))], 1), (-1,))
I don't know if there exists a more straightforward way to accomplish this.
Edit: There exists a simpler solution using tf.stack instead of tf.concat.
c = tf.reshape(tf.stack([a, b], 1),(-1,))
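For example (a small TF 2.x eager-mode sketch with the values from the question):
import tensorflow as tf

a = tf.constant([1., 2., 3., 4.])
b = tf.constant([0.2, 0.3, 0.4, 0.5])
# stacking along axis 1 gives shape (4, 2); reshaping to (-1,) interleaves the values
c = tf.reshape(tf.stack([a, b], axis=1), (-1,))
print(c.numpy())  # [1.  0.2 2.  0.3 3.  0.4 4.  0.5]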

batch_dot with variable batch size in Keras

I'm trying to write a layer to merge two tensors according to a certain formula.
The shapes of x[0] and x[1] are both (?, 1, 500).
M is a 500*500 Matrix.
I want the output to be (?, 500, 500), which I believe is theoretically feasible. The layer should output (1, 500, 500) for every pair of inputs of shape (1, 1, 500) and (1, 1, 500), and since the batch_size is variable (dynamic), the output must be (?, 500, 500).
However, I know little about axes; I have tried all the combinations of axes, but it doesn't make sense.
I have tried numpy.tensordot and keras.backend.batch_dot (TensorFlow backend). If the batch_size is fixed, taking a = (100, 1, 500) for example, batch_dot(a, M, (2, 0)) gives an output of shape (100, 1, 500).
I'm a newbie with Keras, sorry for such a stupid question, but I have spent 2 days trying to figure it out and it drove me crazy :(
def call(self, x):
    input1 = x[0]
    input2 = x[1]
    # self.M is defined in the build function
    output = K.batch_dot(...)
    return output
Update:
Sorry for being late. I tried Daniel's answer with TensorFlow as the Keras backend, and it still raises a ValueError about unequal dimensions.
I tried the same code with Theano as the backend, and now it works.
>>> import numpy as np
>>> import keras.backend as K
Using Theano backend.
>>> from keras.layers import Input
>>> x1 = Input(shape=[1,500,])
>>> M = K.variable(np.ones([1,500,500]))
>>> firstMul = K.batch_dot(x1, M, axes=[1,2])
I don't know how to print a tensor's shape in Theano. It's definitely harder than TensorFlow for me... However, it works.
To see why, I compared the two backend implementations of batch_dot for TensorFlow and Theano. The differences are as follows.
In this case, x = (?, 1, 500), y = (1, 500, 500), axes = [1, 2]
In tensorflow_backend:
return tf.matmul(x, y, adjoint_a=True, adjoint_b=True)
In theano_backend:
return T.batched_tensordot(x, y, axes=axes)
(Assuming the subsequent changes to out._keras_shape don't affect out's value.)
Your multiplications should select which axes they use in the batch_dot function.
Axis 0 - the batch dimension, it's your ?
Axis 1 - the dimension you say has length 1
Axis 2 - the last dimension, of size 500
You won't change the batch dimension, so you will use batch_dot always with axes=[1,2]
But for that to work, you must adjust M to be (?, 500, 500).
To do that, define M not as (500, 500) but as (1, 500, 500), and repeat it along the first axis to match the batch size:
import keras.backend as K

# M has shape (1, 500, 500); we repeat it along the batch axis.
BatchM = K.repeat_elements(x=M, rep=batch_size, axis=0)
# Not sure if repeating is really necessary: leaving M as (1, 500, 500) gives the
# same output shape at the end. I haven't checked the actual numbers for
# correctness, but I believe it's totally ok.

# Now we can use batch_dot properly:
firstMul = K.batch_dot(x[0], BatchM, axes=[1, 2])  # results in (?, 500, 500)

# We also need to transpose x[1]:
x1T = K.permute_dimensions(x[1], (0, 2, 1))

# And the second multiplication:
result = K.batch_dot(firstMul, x1T, axes=[1, 2])
I prefer using TensorFlow, so I spent the past few days trying to figure it out with TensorFlow.
The first approach is very similar to Daniel's solution.
x = tf.placeholder('float32',shape=(None,1,3))
M = tf.placeholder('float32',shape=(None,3,3))
tf.matmul(x, M)
# return: <tf.Tensor 'MatMul_22:0' shape=(?, 1, 3) dtype=float32>
You need to feed M values with matching shapes:
sess = tf.Session()
sess.run(tf.matmul(x,M), feed_dict = {x: [[[1,2,3]]], M: [[[1,2,3],[0,1,0],[0,0,1]]]})
# return : array([[[ 1., 4., 6.]]], dtype=float32)
Another way is simple with tf.einsum.
x = tf.placeholder('float32',shape=(None,1,3))
M = tf.placeholder('float32',shape=(3,3))
tf.einsum('ijk,kl->ijl', x, M)
# return: a Tensor of shape (?, 1, 3), dtype=float32
Let's feed some values.
sess.run(tf.einsum('ijk,kl->ijl', x, M), feed_dict = {x: [[[1,2,3]]], M: [[1,2,3],[0,1,0],[0,0,1]]})
# return: array([[[ 1., 4., 6.]]], dtype=float32)
Now M is a 2D tensor, and there is no need to feed batch_size into M.
What's more, it seems such a question can be solved in TensorFlow with tf.einsum. Does that mean it's Keras's job to invoke tf.einsum in some situations? At least I can find nowhere that Keras calls tf.einsum. And in my opinion, Keras behaves strangely when batch_dot is given a 3D tensor and a 2D tensor. In Daniel's answer, he pads M to (1, 500, 500), but in K.batch_dot() M will be adjusted to (500, 500, 1) automatically. I find that TensorFlow adjusts it with broadcasting rules, and I'm not sure Keras does the same.

How can I compare columns for equality in a matrix-multiplication manner?

I am using Keras (TensorFlow backend). What I want to do is write a lambda layer that takes 2 tensors as input, compares every combination of 2 columns of them using an indicator function, and produces a new tensor with 0-1 values. Here is an example.
Input: x = K.variable(np.array([[1,2,3],[2,3,4]])),
y = K.variable(np.array([[1,2,3],[2,3,4]]))
Output
z = K.variable(np.array([[1,0],[0,1]]))
As far as I know, TensorFlow provides tf.equal() to compare tensors element-wise. But if I apply it here, I get
>>> z=tf.equal(x,y)
>>> K.eval(z)
array([[True, True, True],
[True, True, True]], dtype=bool)
It only compares elements in the same position.
So my questions are:
1. Is there a TensorFlow API to get my desired output, or do I need to write my own function?
2. If it is the latter, there is another problem. I noticed that in Keras the input is a mini-batch, so the input format looks like (None, m, n). When writing my own method, how can I handle the first dimension, which is None?
Any reply would be appreciated!
You could use broadcasting.
import numpy as np
import tensorflow as tf

x = tf.constant(np.array([[1, 2, 3], [2, 3, 4]]))
y = tf.constant(np.array([[1, 2, 3], [2, 3, 4]]))

# Add singleton dimensions so the comparison broadcasts over all row pairs
x_ = tf.expand_dims(x, 0)  # shape (1, 2, 3)
y_ = tf.expand_dims(y, 1)  # shape (2, 1, 3)

# Two rows are equal iff all of their elements match
res = tf.reduce_all(tf.equal(x_, y_), axis=-1)  # shape (2, 2)

sess = tf.Session()
sess.run(res)
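To address the second question (the None batch dimension), the same broadcasting trick carries over to batched inputs of shape (None, m, n) by placing the singleton dimensions after the batch axis. A sketch under that assumption (pairwise_row_equal is a hypothetical helper name):
import tensorflow as tf

def pairwise_row_equal(x, y):
    # x, y: (batch, m, n); returns (batch, m, m) where entry [b, i, j] is 1.0
    # iff row i of y[b] equals row j of x[b]
    x_ = tf.expand_dims(x, 1)  # (batch, 1, m, n)
    y_ = tf.expand_dims(y, 2)  # (batch, m, 1, n)
    eq = tf.reduce_all(tf.equal(x_, y_), axis=-1)
    return tf.cast(eq, tf.float32)
In a Keras model this could be wrapped in a Lambda layer, e.g. Lambda(lambda t: pairwise_row_equal(t[0], t[1]))([x, y]).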

masked softmax in theano

I am wondering if it is possible to apply a mask before performing theano.tensor.nnet.softmax?
This is the behavior I am looking for:
>>>a = np.array([[1,2,3,4]])
>>>m = np.array([[1,0,1,0]]) # ignore index 1 and 3
>>>theano.tensor.nnet.softmax(a,m)
array([[ 0.11920292, 0. , 0.88079708, 0. ]])
Note that a and m are matrices, so I would like the softmax to work on an entire matrix and perform a row-wise masked softmax.
Also, the output should be the same shape as a, so the solution cannot use advanced indexing, e.g. theano.tensor.softmax(a[0,[0,2]]).
def masked_softmax(a, m, axis):
    e_a = T.exp(a)
    masked_e = e_a * m
    sum_masked_e = T.sum(masked_e, axis, keepdims=True)
    return masked_e / sum_masked_e
theano.tensor.switch is one way to do this.
In the computational graph you can do the following:
a_mask = theano.tensor.switch(m, a, np.NINF)
sm = theano.tensor.nnet.softmax(a_mask)
Hope it helps others.
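As a quick sanity check of both approaches, here is the same masked softmax written in plain NumPy (my sketch, not Theano) on the example from the question:
import numpy as np

a = np.array([[1., 2., 3., 4.]])
m = np.array([[1., 0., 1., 0.]])

# exp-then-mask version (same idea as masked_softmax above)
e = np.exp(a) * m
print(e / e.sum(axis=1, keepdims=True))
# [[0.11920292 0.         0.88079708 0.        ]]

# switch-to-minus-infinity version: masked entries become exp(-inf) = 0
a_mask = np.where(m > 0, a, -np.inf)
e2 = np.exp(a_mask)
print(e2 / e2.sum(axis=1, keepdims=True))
# [[0.11920292 0.         0.88079708 0.        ]]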