I need to convert this Keras operation to PyTorch:
user_vec = keras.layers.Dot((1,1))([user_vecs,user_att])
Suppose user_vecs has shape (2,5,10) and user_att has shape (2,5). The output has shape (2,10).
torch.inner only contracts over the last dimension of its two inputs. I'm wondering if I should permute my axes, call inner, and then permute back, or if there's a better way.
user_vecs = user_vecs.permute(0,2,1)
torch.inner(user_vecs, user_att)
However, this is returning a tensor of shape (2,10,2).
It looks like the easiest way to accomplish this is using einsum:
user_att = user_att.unsqueeze(2)
torch.einsum('ijk,ijk->ik', user_vecs, user_att)
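As a sanity check on the contraction itself, here is a minimal NumPy sketch (NumPy is used only so it runs without PyTorch; torch.einsum accepts the same subscript string). It assumes random inputs with the shapes from the question and sums over the shared axis of size 5 directly, without the unsqueeze step. The names out and ref are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
user_vecs = rng.standard_normal((2, 5, 10))  # (batch, items, features)
user_att = rng.standard_normal((2, 5))       # (batch, items)

# Contract the shared axis j (size 5); keep batch i and feature k.
out = np.einsum('ijk,ij->ik', user_vecs, user_att)

# Reference: weight each of the 5 vectors by its attention score, then sum.
ref = (user_vecs * user_att[:, :, None]).sum(axis=1)

print(out.shape)
```

Both out and ref have shape (2, 10), matching what Dot((1,1)) produces for these inputs.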
I have a tensorflow.python.framework.ops.Tensor as output and need to convert it to a NumPy array.
.numpy() doesn't work because it isn't an EagerTensor.
.eval() doesn't work either, because I'm using a TensorFlow version above 2.0.
Is there any other way to do this?
img_height=330
img_width=600
img_depth=23
save_model="saved_Models/wheatModel"
prediction_data_path=["data/stacked/MOD13Q1.A2017.2738.tif","data/stacked/MOD13Q1.A2017.889.tif","data/stacked/MOD13Q1.A2017.923.tif"]
prediction_data=dataConv.preparePredictionData(prediction_data_path)
prediction_reshaped=dataConv.reshapeFiles(prediction_data,img_width,img_height,img_depth)
x_ds =tf.stack(prediction_reshaped)
model = tf.keras.models.load_model(save_model)
model.predict(x_ds)
image=model.get_layer(name='prediction_image').output
n,output_width,output_height,output_depth,output_channels=image.shape
print(type(image))
image=tf.reshape(image,(output_width,output_height,output_depth))
print(type(image))
image.numpy()
So in the code above, I:
load my trained model
predict the given images
get the output from the next-to-last layer
reshape this data
Now I want to convert this tensor to a NumPy array.
I want to compute the softmax_cross_entropy_with_logits of a batch tensor.
I have a batch of logits as input, but I first want to mask this tensor with a boolean mask. The mask is also batched, and each row's mask can contain a different number of True values.
So the result of applying the mask to the whole tensor is no longer dense.
Trying this either flattens the tensor (tf.boolean_mask) or creates a ragged one (tf.ragged.boolean_mask); the former produces wrong results and the latter doesn't work with the softmax function.
So basically I want to make the following code work:
# logits.shape = (batch, outputs), e.g. (512,8)
# mask.shape = (batch, valid), e.g. (512,8)
# expected result shape (512,)
one_hot_actions = tf.one_hot(x, logits.get_shape().as_list()[-1])
stopgradient = tf.stop_gradient(one_hot_actions)
return tf.nn.softmax_cross_entropy_with_logits_v2(
    logits=tf.boolean_mask(logits, mask),
    labels=tf.boolean_mask(stopgradient, mask))
But with tf.boolean_mask this produces just one value, not four, and with tf.ragged.boolean_mask the function does not work at all.
I tried combining the two ragged tensors row-wise (first masked logits row with first masked labels row) and computing the softmax row-wise. This did not work, since the tf.map_fn that I used does not accept ragged tensors as inputs. I also tried the same idea with SparseTensors and with a list of Tensors (tf.split), but never got any working code out of it.
The only idea I had to solve this issue is very ugly.
Replace all masked values with NaN using tf.where, then use map_fn on these now-dense tensors. Each row can then be masked again to exclude the NaNs, and the softmax function can be called row-wise.
EDIT: This is what the code currently looks like:
stopgradient = tf.stop_gradient(one_hot_actions)
# Mark masked-out entries with NaN so each row can be filtered again later.
nan_logits = tf.where(mask, logits, float('NaN') + tf.zeros_like(logits))
nan_labels = tf.where(mask, stopgradient, float('NaN') + tf.zeros_like(stopgradient))
nan_lola = tf.stack([nan_logits, nan_labels], axis=1)
def fn(x):
    nan_lo = x[0]
    nan_la = x[1]
    # Drop the NaN placeholders, leaving only the valid entries of this row.
    masked_lo = tf.boolean_mask(nan_lo, tf.logical_not(tf.math.is_nan(nan_lo)))
    masked_la = tf.boolean_mask(nan_la, tf.logical_not(tf.math.is_nan(nan_la)))
    return tf.nn.softmax_cross_entropy_with_logits_v2(
        logits=masked_lo,
        labels=masked_la
    )
result = tf.map_fn(fn, nan_lola)
return result
This works but is very slow (my training time almost doubles).
To those interested: I stumbled upon this problem trying to mask valid actions in reinforcement learning.
Do you know of any way to do this (faster)?
Can you replace the masked values with a value that does not affect the softmax?
Thank you!
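The suggestion above (fill the masked positions with a value that does not affect the softmax) can be sketched in plain NumPy. This is an illustration under assumptions, not the asker's code: masked_softmax_xent and the fill value -1e9 are hypothetical choices, and the example assumes every row has at least one valid entry and that the labelled action is valid. A large finite negative is used rather than -inf, because 0 * -inf would give NaN when multiplying by the zeroed labels.

```python
import numpy as np

def masked_softmax_xent(logits, labels, mask, fill=-1e9):
    """Row-wise softmax cross-entropy that ignores entries where mask is False."""
    # Invalid logits become a large negative number, so exp() of them is 0.
    masked_logits = np.where(mask, logits, fill)
    # Numerically stable log-softmax over the last axis.
    shifted = masked_logits - masked_logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Labels at invalid positions are zeroed so they contribute nothing.
    masked_labels = np.where(mask, labels, 0.0)
    return -(masked_labels * log_probs).sum(axis=-1)

rng = np.random.default_rng(0)
logits = rng.standard_normal((4, 8))
mask = rng.random((4, 8)) < 0.6
mask[:, 0] = True                    # make sure action 0 is always valid
labels = np.eye(8)[[0, 0, 0, 0]]     # one-hot labels, all pointing at action 0

loss = masked_softmax_xent(logits, labels, mask)
print(loss.shape)
```

Everything stays dense and batched, so no map_fn is needed. The same fill-then-softmax idea should carry over to TensorFlow by applying tf.where to the logits before the existing cross-entropy call, though that has not been verified here.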
I'm doing a matrix factorization in TensorFlow, and I want to use coo_matrix from scipy.sparse because it uses less memory and makes it easy to put all my training data into a matrix.
Is it possible to use a coo_matrix to initialize a variable in TensorFlow?
Or do I have to create a session and feed the data into TensorFlow using sess.run() with feed_dict?
I hope my question is clear; if not, please comment and I will try to fix it.
The closest thing TensorFlow has to scipy.sparse.coo_matrix is tf.SparseTensor, which is the sparse equivalent of tf.Tensor. It will probably be easiest to feed a coo_matrix into your program.
A tf.SparseTensor is a slight generalization of COO matrices, where the tensor is represented as three dense tf.Tensor objects:
indices: An N x D matrix of tf.int64 values in which each row represents the coordinates of a non-zero value. N is the number of non-zeroes, and D is the rank of the equivalent dense tensor (2 in the case of a matrix).
values: A length-N vector of values, where element i is the value of the element whose coordinates are given on row i of indices.
dense_shape: A length-D vector of tf.int64, representing the shape of the equivalent dense tensor.
For example, you could use the following code, which uses tf.sparse_placeholder() to define a tf.SparseTensor that you can feed, and a tf.SparseTensorValue that represents the actual value being fed:
sparse_input = tf.sparse_placeholder(dtype=tf.float32, shape=[100, 100])
# ...
train_op = ...
coo_matrix = scipy.sparse.coo_matrix(...)
# Wrap `coo_matrix` in the `tf.SparseTensorValue` form that TensorFlow expects.
# SciPy stores the row and column coordinates as separate vectors, so we must
# stack and transpose them to make an indices matrix of the appropriate shape.
tf_coo_matrix = tf.SparseTensorValue(
    indices=np.array([coo_matrix.row, coo_matrix.col]).T,
    values=coo_matrix.data,
    dense_shape=coo_matrix.shape)
Once you have converted your coo_matrix to a tf.SparseTensorValue, you can feed sparse_input with the tf.SparseTensorValue directly:
sess.run(train_op, feed_dict={sparse_input: tf_coo_matrix})
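To make the indices layout concrete, here is a small NumPy-only sketch. The arrays row, col and data are made-up stand-ins for the coo_matrix.row, coo_matrix.col and coo_matrix.data attributes, and dense_shape stands in for coo_matrix.shape. It builds the N x D indices matrix described above and rebuilds the equivalent dense matrix from it.

```python
import numpy as np

# Hypothetical COO data: three non-zero entries of a 4 x 3 matrix.
row = np.array([0, 1, 3])
col = np.array([2, 0, 1])
data = np.array([4.0, 5.0, 6.0], dtype=np.float32)
dense_shape = (4, 3)

# One row per non-zero value, one column per dimension: shape (N, D) = (3, 2).
indices = np.stack([row, col], axis=1).astype(np.int64)

# Rebuild the dense matrix to confirm that the three pieces describe it fully.
dense = np.zeros(dense_shape, dtype=data.dtype)
dense[indices[:, 0], indices[:, 1]] = data

print(indices)
print(dense)
```

Stacking along axis=1 gives the same result as the np.array([...]).T construction used in the answer.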
In Keras, we can use merge to concatenate two layers. It takes a parameter, concat_axis. The default value seems to be -1, but quite a lot of code sets it to 1. What do concat_axis=1 and concat_axis=-1 mean, respectively? I could not find an explanation in the Keras documentation. Thanks.
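concat_axis is the axis (dimension) along which the inputs are concatenated.
If your input tensor has shape (samples, channels, rows, cols),
set concat_axis to 1 to concatenate per feature map (channels axis). concat_axis=-1 refers to the last axis, following the usual Python negative-indexing convention.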
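The effect of the axis argument can be seen with plain NumPy arrays, whose concatenate follows the same axis convention. This is only an illustration of the shapes: the array names are made up, and the layout (samples, channels, rows, cols) is the one assumed in the answer.

```python
import numpy as np

a = np.zeros((2, 3, 4, 4))  # (samples, channels, rows, cols)
b = np.zeros((2, 5, 4, 4))  # same layout, 5 channels
c = np.zeros((2, 3, 4, 6))  # same layout, 6 columns

# axis=1 stacks along the channels axis: 3 + 5 = 8 feature maps.
along_channels = np.concatenate([a, b], axis=1)

# axis=-1 is the last axis (cols here): 4 + 6 = 10 columns.
along_last = np.concatenate([a, c], axis=-1)

print(along_channels.shape)
print(along_last.shape)
```

All axes other than the concatenation axis must match, which is why a is paired with b for axis=1 and with c for axis=-1.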
import numpy as np
import tensorflow as tf

weights = tf.placeholder("float",[5,5,1,1])
imagein = tf.placeholder("float",[1,32,32,1])
conv = tf.nn.conv2d(imagein,weights,strides=[1,1,1,1],padding="SAME")
deconv = tf.nn.conv2d_transpose(conv, weights, [1,32,32,1], [1,1,1,1],padding="SAME")
dw = np.random.rand(5,5,1,1)
noise = np.random.rand(1,32,32,1)
sess = tf.InteractiveSession()
convolved = conv.eval(feed_dict={imagein: noise, weights: dw})
deconvolved = deconv.eval(feed_dict={imagein: noise, weights: dw})
I've been trying to figure out conv2d_transpose in order to reverse a convolution in Tensorflow. My understanding is that "deconvolved" should contain the same data as "noise" after applying a normal convolution and then its transpose, but "deconvolved" just contains some completely different image. Is there something wrong with my code, or is the theory incorrect?
There's a reason it's called conv2d_transpose rather than deconv2d: it isn't deconvolution. Convolution isn't an orthogonal transformation, so its inverse (deconvolution) isn't the same as its transpose (conv2d_transpose).
Your confusion is understandable: calling the transpose of convolution "deconvolution" has been standard neural network practice for years. I am happy that we were able to fix the name to be mathematically correct in TensorFlow; more details here:
https://github.com/tensorflow/tensorflow/issues/256
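The difference between transpose and inverse can be checked on a small 1-D analogue. The following NumPy sketch is an illustration under assumptions, not what conv2d_transpose does internally: it writes a zero-padded ("SAME") convolution with a made-up 3-tap kernel as a matrix A, then compares applying the transpose of A with solving for the true inverse.

```python
import numpy as np

n = 8
kernel = np.array([0.25, 0.5, 0.25])  # hypothetical smoothing kernel

# Matrix form of a zero-padded 1-D convolution: y = A @ x.
A = np.zeros((n, n))
for i in range(n):
    for t, w in enumerate(kernel):
        j = i + t - 1
        if 0 <= j < n:
            A[i, j] = w

rng = np.random.default_rng(0)
x = rng.standard_normal(n)
y = A @ x

via_transpose = A.T @ y              # the "conv transpose" analogue
via_inverse = np.linalg.solve(A, y)  # actual deconvolution for this A

# For an orthogonal matrix, transpose and inverse coincide.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
orthogonal_roundtrip = Q.T @ (Q @ x)

print(np.allclose(via_transpose, x))
print(np.allclose(via_inverse, x))
print(np.allclose(orthogonal_roundtrip, x))
```

Only via_inverse and orthogonal_roundtrip recover x. The transpose maps the output back to the input's shape but does not undo the convolution.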