Keras Custom Merge Two Tensors - tensorflow

I have two tensors of shape [1,4] say,
[1,2,3,4]
[0.2,0.3,0.4,0.5]
Now I want to merge them in merge layer (perhaps using some custom function using Tensorflow backend) so that they become
[1,0.2,2,0.3,3,0.4,4,0.5]
How can I achieve this? The shape of the tensor is fixed. Thank you for your time.

A possible solution is to concatenate the tensors along axis 0 and then gather the values in interleaved index order, like this:
import tensorflow as tf
from itertools import chain
A = tf.constant([1, 2, 3, 4])
B = tf.constant([0.2, 0.3, 0.4, 0.5])
# Cast A to be compatible with B
A = tf.cast(A, tf.float32)
# Concat AB one next to the other
AB = tf.concat([A, B], axis=0)
# Generate the interleaved index sequence 0, 4, 1, 5, ...
# in order to index into the concatenated tensor, then
# use gather to collect the values at those positions
NEW = tf.gather(AB,
                list(
                    chain.from_iterable((i, i + A.shape[0].value)
                                        for i in range(A.shape[0].value))))
with tf.Session() as sess:
    print(sess.run([NEW]))

Using Tensorflow, you can use reshape and concat. These operations are also available in the keras backend.
a = tf.constant([1,2,3,4])
b = tf.constant([10,20,30,40])
c = tf.reshape(tf.concat([tf.reshape(a,(-1,1)), tf.reshape(b, (-1,1))], 1), (-1,))
I don't know if there exists a more straightforward way to accomplish this.
Edit: There exists a simpler solution using tf.stack instead of tf.concat.
c = tf.reshape(tf.stack([a, b], 1),(-1,))
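As a quick sanity check of the stack-and-reshape idea, here is a minimal NumPy sketch. NumPy is used only for illustration (an assumption on my part, since it runs without a session); np.stack and reshape play the role of tf.stack and tf.reshape. Stacking along axis 1 pairs a[i] with b[i], and flattening those pairs row by row gives the interleaved order asked for in the question.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([0.2, 0.3, 0.4, 0.5])

# Shape (4, 2): row i holds the pair (a[i], b[i])
pairs = np.stack([a, b], axis=1)

# Flatten row by row, which interleaves the two vectors
merged = pairs.reshape(-1)
print(merged)
```

The same two calls should carry over to a Keras Lambda layer through the backend, provided both inputs already share a dtype.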

Related

pytorch equivalent tf.gather

I'm having some trouble porting some code over from tensorflow to pytorch.
So I have a matrix with dimensions 10x30 representing 10 examples, each with 30 features. Then I have another matrix with dimensions 10x5 containing the indices of the 5 closest examples for each example in the first matrix. I want to 'gather', using the indices contained in the second matrix, the 5 closest examples for each example in the first matrix, leaving me with a 3D tensor of shape 10x5x30.
In tensorflow this is done with tf.gather(matrix1, matrix2). Does anyone know how I could do this in pytorch?
How about this?
matrix1 = torch.randn(10, 30)
matrix2 = torch.randint(high=10, size=(10, 5))
gathered = matrix1[matrix2]
It uses the trick of indexing with an array of integers.
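The integer-array indexing trick can be illustrated without PyTorch. The following is a minimal NumPy sketch (NumPy is my substitution for illustration; the variable names follow the answer above). Indexing a (10, 30) array with a (10, 5) integer array looks up one row per index, so the result has shape (10, 5, 30).

```python
import numpy as np

rng = np.random.default_rng(0)
matrix1 = rng.standard_normal((10, 30))       # 10 examples, 30 features each
matrix2 = rng.integers(0, 10, size=(10, 5))   # 5 neighbour indices per example

# Each integer in matrix2 selects a whole row of matrix1
gathered = matrix1[matrix2]
print(gathered.shape)
```

Entry gathered[i, j] is the feature row of the j-th neighbour of example i, that is, matrix1[matrix2[i, j]].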
I had a scenario where I had to apply gather() on an array of integers.
Example 1
# Form: my_tensor.gather(dim, input_tensor)
# here,
# input_tensor -> tensor(1)
my_list = [0, 1, 2, 3, 4]
my_tensor = torch.IntTensor(my_list)
output = my_tensor.gather(0, input_tensor) # 0 -> is the dimension
Example 2
# Form: torch.gather(param_tensor, dim, input_tensor)
# here,
# input_tensor -> tensor(1)
my_list = [0, 1, 2, 3, 4]
my_tensor = torch.IntTensor(my_list)
output = torch.gather(my_tensor, 0, input_tensor) # 0 -> is the dimension

tensorflow: how to perform element-wise multiplication between two sparse matrix

I have two sparse matrices declared using tf.sparse_placeholder. I need to perform element-wise multiplication between the two matrices, but I cannot find such an implementation in tensorflow. The most closely related function is tf.sparse_tensor_dense_matmul, but it performs matrix multiplication between one sparse matrix and one dense matrix.
What I am looking for is element-wise multiplication between two sparse matrices. Is there an implementation of this in tensorflow?
The following example shows element-wise multiplication between dense matrices, and matrix multiplication between a sparse and a dense matrix. I'm looking forward to seeing a solution.
import tensorflow as tf
import numpy as np
# Element-wise multiplication, two dense matrices
A = tf.placeholder(tf.float32, shape=(100, 100))
B = tf.placeholder(tf.float32, shape=(100, 100))
C = tf.multiply(A, B)
sess = tf.InteractiveSession()
RandA = np.random.rand(100, 100)
RandB = np.random.rand(100, 100)
print(sess.run(C, feed_dict={A: RandA, B: RandB}))
# matrix multiplication, A is sparse and B is dense
A = tf.sparse_placeholder(tf.float32)
B = tf.placeholder(tf.float32, shape=(5,5))
C = tf.sparse_tensor_dense_matmul(A, B)
sess = tf.InteractiveSession()
indices = np.array([[3, 2], [1, 2]], dtype=np.int64)
values = np.array([1.0, 2.0], dtype=np.float32)
shape = np.array([5,5], dtype=np.int64)
Sparse_A = tf.SparseTensorValue(indices, values, shape)
RandB = np.ones((5, 5))
print(sess.run(C, feed_dict={A: Sparse_A, B: RandB}))
Thank you very much!!!
TensorFlow currently has no sparse-sparse element-wise multiplication operation.
We don't plan to add support for this currently, but contributions are definitely welcome! Feel free to create a github issue here: https://github.com/tensorflow/tensorflow/issues/new and perhaps you or someone in the community can pick it up :)
Thanks!
You can also use tf.matmul or tf.sparse_matmul for sparse matrices by setting a_is_sparse and b_is_sparse to True.
matmul(
    a,
    b,
    transpose_a=False,
    transpose_b=False,
    adjoint_a=False,
    adjoint_b=False,
    a_is_sparse=False,
    b_is_sparse=False,
    name=None
)
For element-wise multiplication, one workaround is to convert the sparse tensors to a dense representation with tf.sparse_to_dense and then apply tf.multiply.
Solution from another post works.
https://stackoverflow.com/a/45103767/2415428
Use the __mul__ to perform the element-wise multiplication.
TF2.1 ref: https://www.tensorflow.org/api_docs/python/tf/sparse/SparseTensor#mul
I'm using Tensorflow 2.4.1.
Here's my workaround to multiply two sparse tensor element-wise:
def sparse_element_wise_mul(a: tf.SparseTensor, b: tf.SparseTensor):
    a_plus_b = tf.sparse.add(a, b)
    a_plus_b_square = tf.square(a_plus_b)
    minus_a_square = tf.negative(tf.square(a))
    minus_b_square = tf.negative(tf.square(b))
    _2ab = tf.sparse.add(
        tf.sparse.add(
            a_plus_b_square,
            minus_a_square
        ),
        minus_b_square
    )
    ab = tf.sparse.map_values(tf.multiply, _2ab, 0.5)
    return ab
Here's some simple explanation:
Given that
(a+b)^2 = a^2 + 2a*b + b^2
we can calculate a*b by
a*b = ((a+b)^2 - a^2 - b^2) / 2
It seems the gradient can be calculated correctly with such a workaround.
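The algebra behind this workaround can be checked on small arrays. Below is a minimal sketch using dense NumPy arrays that merely contain zeros (this is an illustration of the identity only, not of TensorFlow's sparse ops; the sample values are made up).

```python
import numpy as np

# Mostly-zero matrices standing in for the sparse inputs
a = np.array([[0.0, 2.0, 0.0],
              [3.0, 0.0, 0.0]])
b = np.array([[0.0, 5.0, 1.0],
              [4.0, 0.0, 0.0]])

# a*b recovered from squares and sums only
ab = ((a + b) ** 2 - a ** 2 - b ** 2) / 2
print(ab)
```

Wherever either input is zero the expression collapses to zero as well, which is why the result has the sparsity pattern one expects from an element-wise product.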

batch_dot with variable batch size in Keras

I'm trying to write a layer that merges 2 tensors with such a formula
The shapes of x[0] and x[1] are both (?, 1, 500).
M is a 500*500 Matrix.
I want the output to be (?, 500, 500) which is theoretically feasible in my opinion. The layer will output (1,500,500) for every pair of inputs, as (1, 1, 500) and (1, 1, 500). As the batch_size is variable, or dynamic, the output must be (?, 500, 500).
However, I know little about axes, and I have tried all the combinations of axes without getting a result that makes sense.
I tried numpy.tensordot and keras.backend.batch_dot (TensorFlow). If the batch_size is fixed, taking a = (100,1,500) for example, batch_dot(a,M,(2,0)) gives an output of (100,1,500).
I'm a newbie to Keras. Sorry for such a stupid question, but I have spent 2 days trying to figure this out and it is driving me crazy :(
def call(self,x):
    input1 = x[0]
    input2 = x[1]
    #self.M is defined in build function
    output = K.batch_dot(...)
    return output
Update:
Sorry for being late. I tried Daniel's answer with TensorFlow as the Keras backend, and it still raises a ValueError for unequal dimensions.
I tried the same code with Theano as the backend, and now it works.
>>> import numpy as np
>>> import keras.backend as K
Using Theano backend.
>>> from keras.layers import Input
>>> x1 = Input(shape=[1,500,])
>>> M = K.variable(np.ones([1,500,500]))
>>> firstMul = K.batch_dot(x1, M, axes=[1,2])
I don't know how to print tensors' shape in theano. It's definitely harder than tensorflow for me... However it works.
To understand why, I compared the TensorFlow and Theano versions of the backend code. The differences are as follows.
In this case, x = (?, 1, 500), y = (1, 500, 500), axes = [1, 2]
In tensorflow_backend:
return tf.matmul(x, y, adjoint_a=True, adjoint_b=True)
In theano_backend:
return T.batched_tensordot(x, y, axes=axes)
(Assuming the subsequent changes to out._keras_shape do not affect out's value.)
Your multiplication should select which axes it uses in the batch dot function.
Axis 0 - the batch dimension, it's your ?
Axis 1 - the dimension you say has length 1
Axis 2 - the last dimension, of size 500
You won't change the batch dimension, so you will always use batch_dot with axes=[1,2]
But for that to work, you must adjust M to be (?, 500, 500).
For that, define M not as (500,500) but as (1,500,500), and repeat it along the first axis for the batch size:
import keras.backend as K
#Being M with shape (1,500,500), we repeat it.
BatchM = K.repeat_elements(x=M,rep=batch_size,axis=0)
#Not sure if repeating is really necessary, leaving M as (1,500,500) gives the same output shape at the end, but I haven't checked actual numbers for correctness, I believe it's totally ok.
#Now we can use batch dot properly:
firstMul = K.batch_dot(x[0], BatchM, axes=[1,2]) #will result in (?,500,500)
#we also need to transpose x[1]:
x1T = K.permute_dimensions(x[1],(0,2,1))
#and the second multiplication:
result = K.batch_dot(firstMul, x1T, axes=[1,2])
I prefer using TensorFlow, so I have tried to figure it out with TensorFlow over the past few days.
The first approach is very similar to Daniel's solution.
x = tf.placeholder('float32',shape=(None,1,3))
M = tf.placeholder('float32',shape=(None,3,3))
tf.matmul(x, M)
# return: <tf.Tensor 'MatMul_22:0' shape=(?, 1, 3) dtype=float32>
M must be fed values whose shape matches, including the batch dimension.
sess = tf.Session()
sess.run(tf.matmul(x,M), feed_dict = {x: [[[1,2,3]]], M: [[[1,2,3],[0,1,0],[0,0,1]]]})
# return : array([[[ 1., 4., 6.]]], dtype=float32)
Another way is simple with tf.einsum.
x = tf.placeholder('float32',shape=(None,1,3))
M = tf.placeholder('float32',shape=(3,3))
tf.einsum('ijk,kl->ijl', x, M)
# return: <tf.Tensor 'MatMul_22:0' shape=(?, 1, 3) dtype=float32>
Let's feed some values.
sess.run(tf.einsum('ijk,kl->ijl', x, M), feed_dict = {x: [[[1,2,3]]], M: [[1,2,3],[0,1,0],[0,0,1]]})
# return: array([[[ 1., 4., 6.]]], dtype=float32)
Now M is a 2D tensor, and there is no need to feed batch_size to M.
What's more, it now seems that such a question can be solved in TensorFlow with tf.einsum. Does that mean Keras ought to invoke tf.einsum in some situations? At least, I find nowhere that Keras calls tf.einsum. And in my opinion, Keras behaves weirdly when batch_dot is applied to a 3D tensor and a 2D tensor. In Daniel's answer, he pads M to (1,500,500), but in K.batch_dot() M will be adjusted to (500,500,1) automatically. I find that tf adjusts it with broadcasting rules, and I'm not sure Keras does the same.
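The einsum contraction used in this answer can be reproduced in NumPy, which accepts the same subscript string. This is a minimal sketch with the values fed in the answer; the extra batch of size 4 is a made-up input added to show that the 2D M needs no batch dimension.

```python
import numpy as np

x = np.array([[[1.0, 2.0, 3.0]]])            # shape (batch=1, 1, 3)
M = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])              # shape (3, 3), no batch axis

# Contract the last axis of x with the first axis of M
out = np.einsum('ijk,kl->ijl', x, M)
print(out)

# The same M works for any batch size
x_batch = np.tile(x, (4, 1, 1))              # shape (4, 1, 3)
out_batch = np.einsum('ijk,kl->ijl', x_batch, M)
print(out_batch.shape)
```

The subscript k appears in both inputs and not in the output, so it is the axis that gets summed over; i and j are carried through untouched.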

how tf.space_to_depth() works in tensorflow?

I am a pytorch user. I have a pretrained model in tensorflow and I would like to port it to pytorch. In one part of the tensorflow model architecture, there is a function tf.space_to_depth which transforms an input of size (None, 38,38,64) to (None, 19,19, 256). The documentation for this function is at https://www.tensorflow.org/api_docs/python/tf/space_to_depth, but I could not understand what this function actually does. Could you please provide some numpy code to illustrate it for me?
Actually, I would like to make an exactly equivalent layer in pytorch.
Some tensorflow code reveals another puzzle.
Here is the code:
import numpy as np
import tensorflow as tf
norm = tf.random_normal([1, 2, 2, 1], mean=0, stddev=1)
trans = tf.space_to_depth(norm,2)
with tf.Session() as s:
    norm = s.run(norm)
    trans = s.run(trans)
print("Norm")
print(norm.shape)
for index,value in np.ndenumerate(norm):
    print(value)
print("Trans")
print(trans.shape)
for index,value in np.ndenumerate(trans):
    print(value)
And here is the output:
Norm
(1, 2, 2, 1)
0.695261
0.455764
1.04699
-0.237587
Trans
(1, 1, 1, 4)
1.01139
0.898777
0.210135
2.36742
As you can see above, in addition to the data being reshaped, the tensor values have changed!
tf.space_to_depth divides your input into blocks and concatenates them.
In your example the input is 38x38x64 (and I guess the block_size is 2). So the function divides your input into 4 (block_size x block_size) and concatenates them, which gives your 19x19x256 output.
You just need to divide each of your channels (input) into block_size*block_size patches (each patch has a size of width/block_size x height/block_size) and concatenate all of these patches. It should be pretty straightforward with numpy.
Hope it helps.
Conclusion: tf.space_to_depth() only outputs a copy of the input tensor where values from the height and width dimensions are moved to the depth dimension.
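The values only appeared to change because tf.random_normal draws a new sample on every s.run call, so norm and trans came from two different random tensors. If you modify your code a little so that space_to_depth is applied to the already evaluated norm, like this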
norm = tf.random_normal([1, 2, 2, 1], mean=0, stddev=1)
with tf.Session() as s:
    norm = s.run(norm)
trans = tf.space_to_depth(norm,2)
with tf.Session() as s:
    trans = s.run(trans)
Then you will have the following results:
Norm
(1, 2, 2, 1)
-0.130227
2.04587
-0.077691
-0.112031
Trans
(1, 1, 1, 4)
-0.130227
2.04587
-0.077691
-0.112031
Hope this can help you.
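Since the question asks for NumPy code, here is a minimal sketch of the rearrangement for an NHWC array. The helper name and the reshape/transpose layout are my own; to my understanding the resulting depth order (row offset, then column offset, then channel) matches what the TensorFlow documentation describes for the default layout, but treat that as an assumption to verify against real tf.space_to_depth output.

```python
import numpy as np

def space_to_depth(x, block_size):
    """NumPy sketch of space-to-depth for an NHWC array (illustrative helper)."""
    n, h, w, c = x.shape
    bs = block_size
    # Split height and width into (block index, offset within block)
    x = x.reshape(n, h // bs, bs, w // bs, bs, c)
    # Move the two within-block offsets next to the channel axis
    x = x.transpose(0, 1, 3, 2, 4, 5)
    # Merge (row offset, column offset, channel) into the new depth
    return x.reshape(n, h // bs, w // bs, bs * bs * c)

# A 2x2 single-channel image: values are rearranged, never modified
small = np.arange(1, 5, dtype=np.float32).reshape(1, 2, 2, 1)
print(space_to_depth(small, 2))

# The shapes from the question
big = np.zeros((1, 38, 38, 64), dtype=np.float32)
print(space_to_depth(big, 2).shape)
```

In PyTorch the same idea should work with view and permute in place of reshape and transpose, after accounting for the NCHW layout.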
A good reference for PyTorch is the implementation of the PixelShuffle module here. This shows the implementation of something equivalent to Tensorflow's depth_to_space. Based on that we can implement pixel_shuffle with a scaling factor less than 1 which would be like space_to_depth. E.g., downscale_factor=0.5 is like space_to_depth with block_size=2.
def pixel_shuffle_down(input, downscale_factor):
    batch_size, channels, in_height, in_width = input.size()
    # view() requires integer sizes, so cast the derived dimensions to int
    out_channels = int(channels / (downscale_factor ** 2))
    block_size = int(1 / downscale_factor)
    out_height = int(in_height * downscale_factor)
    out_width = int(in_width * downscale_factor)
    input_view = input.contiguous().view(
        batch_size, channels, out_height, block_size, out_width, block_size)
    shuffle_out = input_view.permute(0, 1, 3, 5, 2, 4).contiguous()
    return shuffle_out.view(batch_size, out_channels, out_height, out_width)
Note: I haven't verified this implementation yet and I'm not sure if it's exactly the inverse of pixel_shuffle, but this is the basic idea. I've also opened an issue on the PyTorch Github about this here. In NumPy the equivalent code would use reshape and transpose instead of view and permute respectively.
Using split and stack functions along with permute in Pytorch gives us the same result as space_to_depth in tensorflow does. Here is the code in Pytorch.
Assume that input is in BHWC format.
Based on block_size and the input shape, we can calculate the output shape.
First, it splits the input on the "width" dimension (dimension #2) into chunks of block_size. The result of this operation is a sequence of length d_width. It's just like cutting a cake (by block_size) into d_width pieces.
Then for each piece, you reshape it so it has correct output height and output depth (channel). Finally, we stack those pieces together and perform a permutation.
Hope it helps.
def space_to_depth(input, block_size):
    block_size_sq = block_size*block_size
    (batch_size, s_height, s_width, s_depth) = input.size()
    d_depth = s_depth * block_size_sq
    d_width = int(s_width / block_size)
    d_height = int(s_height / block_size)
    # split along width into d_width pieces of width block_size
    t_1 = input.split(block_size, 2)
    stack = [t_t.contiguous().view(batch_size, d_height, d_depth) for t_t in t_1]
    output = torch.stack(stack, 1)
    output = output.permute(0, 2, 1, 3)
    return output
maybe this one works:
sudo apt install nvidia-cuda-toolkit
it worked for me.

How can I compare whether columns are equal in a matrix multiplication manner?

I am using Keras (with tensorflow as backend). What I want to do is write a lambda layer that takes 2 tensors as input, compares every combination of 2 columns from them using an indicator function, and produces a new tensor with 0-1 values. Here is an example.
Input: x = K.variable(np.array([[1,2,3],[2,3,4]])),
y = K.variable(np.array([[1,2,3],[2,3,4]]))
Output:
z = K.variable(np.array([[1,0],[0,1]]))
As far as I know, tensorflow provides tf.equal() to compare tensors in an elementwise way. But if I apply it here, I get
>>> z=tf.equal(x,y)
>>> K.eval(z)
array([[True, True, True],
[True, True, True]], dtype=bool)
It only compares elements in the same position.
So my questions are:
1. Is there a tensorflow API that gives my desired output, or do I need to write my own function?
2. If it is the latter, there is another problem. I noticed that in keras the input is a mini-batch, so the input format looks like (None, m, n). When writing my own method, how can I handle the first dimension, which is None?
Any reply would be appreciated!
You could use broadcasting.
import numpy as np
import tensorflow as tf
x = tf.constant(np.array([[1,2,3],[2,3,4]]))
y = tf.constant(np.array([[1,2,3],[2,3,4]]))
x_ = tf.expand_dims(x, 0)
y_ = tf.expand_dims(y, 1)
res = tf.reduce_all(tf.equal(x_, y_), axis=-1)
sess = tf.Session()
sess.run(res)
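The broadcasting answer above can be traced step by step in NumPy, which follows the same broadcasting rules. This is a minimal sketch; the batched helper pairwise_row_equal is a hypothetical name of mine, added to suggest how a leading batch axis such as (None, m, n) could be handled by keeping it as axis 0.

```python
import numpy as np

x = np.array([[1, 2, 3], [2, 3, 4]])
y = np.array([[1, 2, 3], [2, 3, 4]])

# (1, 2, 3) against (2, 1, 3) broadcasts to (2, 2, 3): every row pair is compared
eq = x[None, :, :] == y[:, None, :]

# A pair of rows matches only if all of its elements match
res = np.all(eq, axis=-1)
print(res.astype(int))

def pairwise_row_equal(xb, yb):
    """Batched variant: xb is (batch, m, n), yb is (batch, k, n) -> (batch, k, m)."""
    return np.all(xb[:, None, :, :] == yb[:, :, None, :], axis=-1)

batched = pairwise_row_equal(np.stack([x, x]), np.stack([y, y[::-1]]))
print(batched.shape)
```

Entry res[i, j] says whether row i of y equals row j of x, so identical inputs give the identity pattern from the question.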