I've got a numpy tensor X of shape (1075, 252, 487).
For each of the 1075 slices, I want to calculate the correlation matrix X[i].T @ X[i], so I'm left with a tensor of shape (1075, 487, 487).
How do I write this as an Einstein summation?
import numpy as np

(N, R, C) = (20, 15, 30)
X = np.random.rand(N, R, C)
# Batch matrix multiply: for each i, X[i].T @ X[i], giving shape (N, C, C).
cors = np.einsum("ijk,ikl->ijl", X.transpose((0, 2, 1)), X)
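For what it's worth, the transpose can be folded straight into the subscripts, which avoids the explicit transpose entirely (a small sketch reusing X and cors from above):
# Equivalent: contract over the shared row axis j directly.
cors2 = np.einsum("ijk,ijl->ikl", X, X)
assert np.allclose(cors, cors2)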
I am trying to do element-wise multiplication of two tensors of dimensions (1,5,64) and (1,5). As far as I know, in spite of their dimension mismatch, broadcasting should allow this to work. So, I use this code:
x = tf.range(0, 64 * 5)
x = tf.reshape(x, [1, 5, 64])
y = tf.range(0, 5)
y = tf.reshape(y, [1, 5])
product = x * y
This causes this error:
InvalidArgumentError: Incompatible shapes: [1,5,64] vs. [1,5] [Op:Mul]
However, if I reshape the first tensor to dimension (1, 64, 5), then it works:
x = tf.range(0, 64 * 5)
x = tf.reshape(x, [1, 64, 5])
y = tf.range(0, 5)
y = tf.reshape(y, [1, 5])
product = x * y
I do not understand why the first code does not work.
Per NumPy's General Broadcasting Rules: when operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when
they are equal, or
one of them is 1
If these conditions are not met, a ValueError: operands could not be broadcast together exception is thrown, indicating that the arrays have incompatible shapes. The size of the resulting array is the size that is not 1 along each axis of the inputs.
TensorFlow follows the same rules; check its broadcasting documentation for more examples and details. In your case, the trailing dimensions (64 vs. 5) don't satisfy the rules, so an error is thrown:
1, 5, 64
   1,  5   (trailing 64 vs. 5: incompatible)
But this would work as it obeys the rules.
1, 64, 5
    1, 5   (trailing 5 vs. 5, then 64 vs. 1: compatible)
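You can check pairings like these programmatically; assuming NumPy >= 1.20, np.broadcast_shapes performs exactly this comparison:
import numpy as np
print(np.broadcast_shapes((1, 64, 5), (1, 5)))  # (1, 64, 5)
np.broadcast_shapes((1, 5, 64), (1, 5))  # raises ValueError: shape mismatch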
Code
Here it is in NumPy, and in TensorFlow for reference:
import numpy as np
a = np.arange(64*5).reshape(1, 64, 5)
b = np.arange(5).reshape(1,5)
(a*b).shape
(1, 64, 5)
import tensorflow as tf
x = tf.reshape(tf.range(0,64*5), [1, 64, 5])
y = tf.reshape(tf.range(0,5), [1, 5])
(x*y).shape
TensorShape([1, 64, 5])
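If you would rather keep the original (1, 5, 64) layout instead of reshaping, one option (not in the original answer, but following the same rules) is to give y a trailing axis of size 1, so it broadcasts as (1, 5, 1) against (1, 5, 64):
import tensorflow as tf
x = tf.reshape(tf.range(0, 64 * 5), [1, 5, 64])
y = tf.reshape(tf.range(0, 5), [1, 5])
product = x * y[:, :, tf.newaxis]  # (1, 5, 64) * (1, 5, 1) -> (1, 5, 64)
print(product.shape)  # TensorShape([1, 5, 64])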
I have a tuple of NumPy arrays (of length 4, 5, 6, or more). How can I convert it to a tuple of TensorFlow tensors, given input like this:
import tensorflow as tf
import numpy as np
a = np.array([[20, 20], [40, 40]], dtype=np.int32)
b = np.array([[20, 20, 20], [40, 40, 40], [60, 60, 60]], dtype=np.int32)
c = np.array([[20, 20], [40, 40]], dtype=np.int32)
d = np.array([[20, 20, 20], [40, 40, 40], [60, 60, 60]], dtype=np.int32)
e = (a, b, c, d)  # e is the tuple of numpy arrays I want to convert to tensors
tf_shapes = ((None, 2), (None, 3), (2, 2), (3, 3))
tf_types = (tf.int64, tf.float32, tf.int64, tf.float32)
Right now I have to write a generator to convert this to a tuple of TensorFlow tensors:
def data_generator():
    for i in range(16):
        yield a, b, c, d

dataset = tf.data.Dataset.from_generator(data_generator, tf_types, tf_shapes).batch(batch_size=4, drop_remainder=True)

for sample in dataset:
    res = model(sample, training=False)
How can I get a sample directly, without using tf.data.Dataset.from_generator?
I'm not sure if I understood your question correctly, but it appears that you just want to have a, b, c, and d converted to tensorflow tensors without having to use the tf.data.Dataset.from_generator function.
In that case, you can simply use tf.convert_to_tensor:
import tensorflow as tf
import numpy as np
a_tensor = tf.convert_to_tensor(a, np.int32)
b_tensor = tf.convert_to_tensor(b, np.int32)
c_tensor = tf.convert_to_tensor(c, np.int32)
d_tensor = tf.convert_to_tensor(d, np.int32)
# use the tensors however you want
Additionally, if you want something similar to e in your code, note that tf.stack(e, axis=0) only works when all elements share the same shape, which is not the case here (a is (2, 2) while b is (3, 3)). Keep e as a tuple of tensors instead:
e_tensors = (a_tensor, b_tensor, c_tensor, d_tensor)
# e_tensors[0] is a_tensor, e_tensors[1] is b_tensor, ...
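If you also want the dtypes to line up with tf_types from your question, a small sketch of the same idea in a loop (assuming the tuple order matches tf_types):
e_tensors = tuple(tf.cast(tf.convert_to_tensor(arr), t)
                  for arr, t in zip(e, tf_types))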
I have a 3D tensor called X, of shape say [2,20,300] and I would like to apply dropout to only the third dimension. However, I want the dropped elements to be the same for the 20 instances (second dimension) but not necessarily for first dimension.
What is the behaviour of the following:
tf.nn.dropout(X[0], keep_prob=p)
Would it only act on the dimension that I want? If so, then for multiple first dimensions, I could loop over them and apply the above line.
See the documentation of tf.nn.dropout:
By default, each element is kept or dropped independently. If noise_shape is specified, it must be broadcastable to the shape of x, and only dimensions with noise_shape[i] == shape(x)[i] will make independent decisions.
So it is as simple as:
import tensorflow as tf
import numpy as np
data = np.arange(300).reshape((1, 1, 300))
data = np.tile(data, (2, 20, 1))
data_op = tf.convert_to_tensor(data.astype(np.float32))
data_op = tf.nn.dropout(data_op, 0.5, noise_shape=[2, 1, 300])
with tf.Session() as sess:
    data = sess.run(data_op)

# Within each batch element, all 20 rows share the same dropout mask,
# because noise_shape has a 1 in that axis.
for b in range(2):
    for c in range(20):
        assert np.allclose(data[b, 0, :], data[b, c, :])

# The two batch elements get independent masks.
print((data[0, 0, :] - data[1, 0, :]).sum())
# prints something != 0 with high probability
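For reference, under TF 2.x the same idea looks like this; note that the second argument is now the drop rate rather than keep_prob (a sketch, assuming TF >= 2.0 with eager execution):
import tensorflow as tf
import numpy as np
data = np.tile(np.arange(300, dtype=np.float32).reshape(1, 1, 300), (2, 20, 1))
out = tf.nn.dropout(data, rate=0.5, noise_shape=[2, 1, 300]).numpy()
# The mask is shared along axis 1, so all 20 rows within a batch element match.
for b in range(2):
    for c in range(20):
        assert np.allclose(out[b, 0, :], out[b, c, :])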
There is a function in numpy that inserts given values into an array:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.insert.html
Is there something similar in tensorflow?
Alternatively, is there a function in tensorflow that can do tensor upsampling using zeros in between values of a tensor?
tf.nn.conv2d_transpose can do this upsampling (with careful design of output_shape and strides). Sample code:
import tensorflow as tf
import numpy as np

x = tf.cast(tf.convert_to_tensor(np.ones((1, 20, 20, 1))), tf.float32)

# A 3x3 kernel that is 1 at the centre and 0 elsewhere: the transposed
# convolution then scatters each input value onto a zero grid, i.e.
# pure zero-insertion upsampling.
b = np.zeros((3, 3, 1, 1))
b[1, 1, 0, 0] = 1
weight = tf.cast(tf.convert_to_tensor(b), tf.float32)

output = tf.nn.conv2d_transpose(x, weight, output_shape=(1, 40, 40, 1), strides=[1, 2, 2, 1])

sess = tf.Session()
print(sess.run(output[0, :, :, 0]))
I believe checking its API documentation will help you further.
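For reference, the zero-insertion part on its own can also be written with plain strided assignment in NumPy; this is a sketch of the same idea the transposed convolution above computes:
import numpy as np
x = np.ones((20, 20), dtype=np.float32)
up = np.zeros((40, 40), dtype=np.float32)
up[::2, ::2] = x  # original values at even indices, zeros in between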
I'm porting a numpy expression to theano. The expression finds the number of true positive predictions for each class, given a one-hot matrix Y of ground truth classes and a one-hot matrix Y_hat of predicted classes. The numpy code is:
import numpy as np
y = np.array([1, 0, 1, 2, 2])
y_hat = np.array([2, 0, 1, 1, 0])
Y = np.zeros(shape=(len(y), len(np.unique(y))))
Y_hat = np.zeros_like(Y)
rows = np.arange(len(y))
Y[rows, y] = 1
Y_hat[rows, y_hat] = 1
((Y_hat == Y) & (Y == 1)).sum(axis=0)
The last expression yields array([1, 1, 0]). I've tried using theano's nonzero:
from theano import shared
Yt = shared(Y)
Yt_hat = shared(Y_hat)
Yt_hat[Yt.nonzero()].eval()
The eval results in array([ 0., 1., 1., 0., 0.]), which is a 0-1 mask of the rows of Yt_hat where the prediction is correct. Any suggestions for how to make this work? For different ways of doing it? Thanks.
Here are three variants demonstrating how to re-implement parts of your numpy code in Theano.
Note that Theano's Unique operation does not support running on the GPU and does not appear to support gradients either. As a result, version 3 may not be of much use. Version 2 provides a workaround: compute the unique values outside Theano and pass them in. Version 1 is a Theano implementation of the final line of your numpy code only.
To address your specific issue: there is no need to use nonzero; in this case the indexing works in Theano just like it works in numpy. Maybe you were getting confused between y and Y? (common Python style is to stick with lower case for all variable and parameter names).
import numpy as np
import theano
import theano.tensor as tt
import theano.tensor.extra_ops
def numpy_ver(y, y_hat):
    Y = np.zeros(shape=(len(y), len(np.unique(y))), dtype=np.int64)
    Y_hat = np.zeros_like(Y, dtype=np.int64)
    rows = np.arange(len(y), dtype=np.int64)
    Y[rows, y] = 1
    Y_hat[rows, y_hat] = 1
    return ((Y_hat == Y) & (Y == 1)).sum(axis=0), Y, Y_hat
def compile_theano_ver1():
    Y = tt.matrix(dtype='int64')
    Y_hat = tt.matrix(dtype='int64')
    z = (tt.eq(Y_hat, Y) & tt.eq(Y, 1)).sum(axis=0)
    return theano.function([Y, Y_hat], outputs=z)
def compile_theano_ver2():
    y = tt.vector(dtype='int64')
    y_hat = tt.vector(dtype='int64')
    y_uniq = tt.vector(dtype='int64')
    Y = tt.zeros(shape=(y.shape[0], y_uniq.shape[0]), dtype='int64')
    Y_hat = tt.zeros_like(Y, dtype='int64')
    rows = tt.arange(y.shape[0], dtype='int64')
    Y = tt.set_subtensor(Y[rows, y], 1)
    Y_hat = tt.set_subtensor(Y_hat[rows, y_hat], 1)
    z = (tt.eq(Y_hat, Y) & tt.eq(Y, 1)).sum(axis=0)
    return theano.function([y, y_hat, y_uniq], outputs=z)
def compile_theano_ver3():
    y = tt.vector(dtype='int64')
    y_hat = tt.vector(dtype='int64')
    y_uniq = tt.extra_ops.Unique()(y)
    Y = tt.zeros(shape=(y.shape[0], y_uniq.shape[0]), dtype='int64')
    Y_hat = tt.zeros_like(Y, dtype='int64')
    rows = tt.arange(y.shape[0], dtype='int64')
    Y = tt.set_subtensor(Y[rows, y], 1)
    Y_hat = tt.set_subtensor(Y_hat[rows, y_hat], 1)
    z = (tt.eq(Y_hat, Y) & tt.eq(Y, 1)).sum(axis=0)
    return theano.function([y, y_hat], outputs=z)
def main():
    y = np.array([1, 0, 1, 2, 2], dtype=np.int64)
    y_hat = np.array([2, 0, 1, 1, 0], dtype=np.int64)
    y_uniq = np.unique(y)
    result, Y, Y_hat = numpy_ver(y, y_hat)
    print(result)
    theano_ver1 = compile_theano_ver1()
    print(theano_ver1(Y, Y_hat))
    theano_ver2 = compile_theano_ver2()
    print(theano_ver2(y, y_hat, y_uniq))
    theano_ver3 = compile_theano_ver3()
    print(theano_ver3(y, y_hat))
main()
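If everything is wired up correctly, all four versions print the same per-class counts, [1 1 0], matching the numpy result quoted in the question.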