How to check whether two tensors have incompatible shapes?

Is there a simple way to check whether two tensors have incompatible shapes (i.e., one cannot be broadcast to the shape of the other) before multiplying them? Otherwise I get errors such as
RuntimeError: The size of tensor a (?) must match the size of tensor b (?) at non-singleton dimension ?
I could use t1.size()[2] != t2.size()[2] in my particular case (i.e., I know when and how the mismatch happens there, but there could be more complicated scenarios), so how do I express the general check idiomatically?
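One general approach (a minimal sketch, assuming PyTorch to match the t1.size() call and the RuntimeError above; the helper name broadcastable is hypothetical) is to apply the broadcasting rule directly: compare the shapes right-aligned, and require each pair of dimensions to be equal or to contain a 1:

import torch

def broadcastable(t1: torch.Tensor, t2: torch.Tensor) -> bool:
    # Compare shapes right-aligned; each dimension pair must be
    # equal or contain a 1 for broadcasting to succeed. Missing
    # leading dimensions are implicitly 1, so zip() stopping early
    # is correct.
    for a, b in zip(reversed(t1.shape), reversed(t2.shape)):
        if a != b and a != 1 and b != 1:
            return False
    return True

t1 = torch.ones(4, 3, 2)
t2 = torch.ones(4, 5, 2)
print(broadcastable(t1, t2))  # False: 3 vs 5 at a non-singleton dimension

On PyTorch 1.8+, the same check can be delegated to the library: torch.broadcast_shapes(t1.shape, t2.shape) raises a RuntimeError for incompatible shapes.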

Related

How can I tell tensorflow about the shape of a parse_tensor operation?

When I decode a tensor using tf.io.parse_tensor, the shape comes out as shape=<unknown>, which makes sense because tensorflow has no way to know the shape of the data that I will pass into this operation. However, if I know that the data I will provide has a certain shape, such as having exactly 2 dimensions, or having 3 rows and an unknown number of columns, how can I "tell" tensorflow this?
I need to do this because I am using functions like padded_batch later on that behave differently for different shapes (producing a ragged tensor versus a dense tensor).
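A minimal sketch of one way to attach the statically known part of the shape, assuming tf.ensure_shape is acceptable here (tf.Tensor.set_shape works similarly inside graph code):

import tensorflow as tf

def parse(serialized):
    t = tf.io.parse_tensor(serialized, out_type=tf.float32)  # shape=<unknown>
    # Attach what is known statically: 3 rows, any number of columns.
    return tf.ensure_shape(t, (3, None))

ds = tf.data.Dataset.from_tensor_slices(
    [tf.io.serialize_tensor(tf.zeros((3, 5)))])
ds = ds.map(parse)
print(ds.element_spec)  # TensorSpec(shape=(3, None), dtype=tf.float32, ...)

The attached shape then propagates to downstream ops such as padded_batch; tf.ensure_shape also verifies the shape at runtime and fails loudly if the actual data doesn't match.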

Shape of tensor for 2D image in Keras

I am a newbie to Keras (and somewhat to TF), but I have found the shape definition for the input layer very confusing.
So in the examples, when we have a 1D vector of length 20 for input, shape gets defined as
...Input(shape=(20,)...)
And when a 2D tensor for greyscale images needs to be defined for MNIST, it is defined as:
...Input(shape=(28, 28, 1)...)
So my question is: why is the tensor not defined as (20) and (28, 28)? Why, in the first case, is a second dimension added and left empty? And why, in the second, does the number of channels have to be defined?
I understand that it depends on the layer, so Conv1D, Dense, and Conv2D take different shapes, but it seems the first parameter is implicit?
According to the docs, the input to Dense needs to be (batch_size, ..., input_dim), but how is this related to the example:
Dense(32, input_shape=(784,))
Thanks
Tuples vs numbers
input_shape must be a tuple, so only (20,) can satisfy it; the number 20 is not a tuple. There is also the parameter input_dim, to make your life easier if you have only one dimension: it can take 20. (But really, I find it just confusing; I always work with input_shape and use tuples, to keep a consistent understanding.)
Dense(32, input_shape=(784,)) is the same as Dense(32, input_dim=784).
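A quick sketch of that equivalence (assuming tf.keras and a version that still accepts input_shape/input_dim in the layer constructor):

from tensorflow import keras
from tensorflow.keras.layers import Dense

# Both declarations describe the same input: 784 features per sample.
m1 = keras.Sequential([Dense(32, input_shape=(784,))])
m2 = keras.Sequential([Dense(32, input_dim=784)])

print(m1.input_shape)  # (None, 784)
print(m2.input_shape)  # (None, 784)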
Images
Images don't have only pixels, they also have channels (red, green, blue).
A black/white image has only one channel.
So, (28 pixels, 28 pixels, 1 channel).
But notice that there isn't any obligation to follow this shape for images everywhere. You can shape them the way you like. But some kinds of layers do demand a certain shape, otherwise they couldn't work.
Some layers demand specific shapes
It's the case of the 2D convolutional layers, which need (size1, size2, channels). They need this shape because they must apply the convolutional filters accordingly.
It's also the case of recurrent layers, which need (timeSteps, featuresPerStep) to perform their recurrent calculations.
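For instance (a minimal sketch, assuming tf.keras; the layer sizes are arbitrary):

from tensorflow.keras.layers import Input, Conv2D, LSTM

# 2D convolution expects (size1, size2, channels).
image_in = Input(shape=(28, 28, 1))
features = Conv2D(16, kernel_size=3)(image_in)

# A recurrent layer expects (timeSteps, featuresPerStep).
seq_in = Input(shape=(100, 8))
state = LSTM(32)(seq_in)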
MNIST models
Again, there isn't any obligation to shape your image in a specific way. You must do it according to which first layer you choose and what you intend to achieve. It's a free thing.
Many examples simply don't care about an image being a 2D structured thing, and they just use models that take 784 pixels. That's enough. They probably start with Dense layers, which demand shapes like (size,).
Other examples may care, and use a shape (28,28), but then these models will have to reshape the input to fit the needs of the next layer.
2D convolutional layers will demand (28, 28, 1).
The main idea is: input arrays must match input_shape or input_dim.
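Side by side, the two common MNIST choices look like this (a sketch, assuming tf.keras; layer sizes are arbitrary):

from tensorflow.keras.layers import Input, Dense, Reshape, Conv2D

# Dense-first model: the image is fed as a flat vector of 784 pixels.
flat_in = Input(shape=(784,))
x = Dense(128, activation="relu")(flat_in)

# Conv-first model: a (28, 28) input is reshaped to add the channel
# axis that Conv2D demands.
img_in = Input(shape=(28, 28))
y = Reshape((28, 28, 1))(img_in)
y = Conv2D(16, kernel_size=3)(y)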
Tensor shapes
Be careful, though, when reading Keras error messages or working with custom / lambda layers.
All these shapes we defined before omit an important dimension: the batch size or the number of samples.
Internally all tensors will have this additional dimension as the first dimension. Keras will report it as None (a dimension that will adapt to any batch size you have).
So, input_shape=(784,) will be reported as (None, 784).
And input_shape=(28, 28, 1) will be reported as (None, 28, 28, 1).
And your actual input data must have a shape that matches that reported shape.
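To see the reported batch dimension in practice (a sketch, assuming tf.keras):

import numpy as np
from tensorflow import keras
from tensorflow.keras.layers import Dense

model = keras.Sequential([Dense(32, input_shape=(784,))])
print(model.input_shape)  # (None, 784): None is the batch dimension

# The actual data must match that shape, with any batch size.
batch = np.random.rand(10, 784).astype("float32")  # 10 samples
print(model.predict(batch).shape)  # (10, 32)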

How to pad a 4-D tensor to the same shape as another in tensorflow?

builtins.ValueError: Dimension 0 in both shapes must be equal,
but are 13 and 14 for 'concat' (op: 'ConcatV2') with input shapes:
[4,13,17,512], [4,14,18,512], [] and with computed
input tensors: input[2] = <0>.
As you can see, concat2 = tf.concat([conv5_1, deconv5], axis=0) leads to the above error. I have no idea how to solve it; can anyone help? Thanks a lot!
Your tensors must be the same size in order to concatenate them, which is why you are getting this error.
There are a few options to make the tensors the same size, but make sure they make sense with the data you are using and aren't causing a loss of information:
Resize one tensor to the size of the other using tf.image.resize_images.
Zero pad both tensors with tf.pad so that they're the same size (a sketch of this option follows below). This is risky, since when you concatenate them certain stacked values may not represent the same information.
Finally, you can crop the larger tensor to the dimensions of the smaller one using tf.image.crop_to_bounding_box. Again, this may not be a good idea in your case, because the tensor dimensions are only off by 1, so you wouldn't be cropping centrally.
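A minimal sketch of the zero-padding option, using stand-in tensors with the shapes from the error message:

import tensorflow as tf

conv5_1 = tf.zeros([4, 13, 17, 512])
deconv5 = tf.zeros([4, 14, 18, 512])

# Pad conv5_1 by one row and one column so both are [4, 14, 18, 512].
padded = tf.pad(conv5_1, paddings=[[0, 0], [0, 1], [0, 1], [0, 0]])

concat2 = tf.concat([padded, deconv5], axis=0)  # shape [8, 14, 18, 512]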

Tensorflow: Copy variable-size matrix from one GPU to another and pretend copy has zero derivative

I have computed matrices of size [None, 1024] on each of two GPUs (call them "left GPU" and "right GPU"). The None represents the batch size. I want to copy the matrix from the right GPU to the left GPU (where it is treated as constant for differentiation purposes) and then multiply them:
result = tf.matmul(left_matrix, right_matrix_copied, transpose_b=True)
to obtain a square matrix of shape [None, None]. It's important that the matrix be square since I proceed to apply tf.diag_part to the matrix. (And in case you're wondering, I also use all the off-diagonal entries.)
I tried doing this by assigning the right matrix to a tf.Variable with trainable=False and then using assign with validate_shape=False, but I am still forced to specify the variable's initial shape statically (with no dimensions allowed to be None). And when I change the shape dynamically, the tf.diag_part op complains.
How can I do this?
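One alternative to the tf.Variable workaround, assuming tf.stop_gradient satisfies the "treated as constant" requirement, is a TF1-style sketch like this (placeholders stand in for the actual per-GPU computations):

import tensorflow as tf

left_matrix = tf.placeholder(tf.float32, [None, 1024])

with tf.device("/gpu:1"):
    right_matrix = tf.placeholder(tf.float32, [None, 1024])

with tf.device("/gpu:0"):
    # Consuming the tensor on another device copies it implicitly;
    # stop_gradient treats the copy as a constant for differentiation,
    # so no gradients flow back to the right GPU.
    right_copy = tf.stop_gradient(right_matrix)
    result = tf.matmul(left_matrix, right_copy, transpose_b=True)  # [None, None]
    diag = tf.diag_part(result)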

How to process gradients with a Dimension size of None

Using AdamOptimizer, when I get the gradients of a 2d variable, the second dimension's size ends up being None, while the first dimension is the same size as the variable's first dimension. This makes it difficult to process the gradients, since a size of None isn't compatible with other sizes for most functions. When I get the gradients of a 1d variable, the gradient's dimension size is the same as the variable's. I haven't tried variables with more than 2 dimensions.
Is this a bug? Is there a way to specify what the size of the gradient should be through the compute_gradients function? Is there a way to process the gradient that gets around the size None issue?
TL;DR: It shouldn't matter, and you can process the gradients using the tf.train.AdamOptimizer as normal. If you are seeing shape-related errors, this most likely arises from one of the known dimensions not matching.
The presence of None in a gradient tensor's shape simply means that the size in that dimension could not be statically inferred. This is not necessarily a bug: the shapes of many operators depend on their data inputs, and the TensorFlow Python front-end uses a simple heuristic (i.e., only compute a limited set of ops with constant inputs) to decide what data inputs to evaluate. Almost all of the TensorFlow ops—excluding some image processing ops—will work on inputs whose shape is unknown (or only partially known), and perform checks at runtime instead.
The main way to process gradients is using Optimizer.apply_gradients(), which defers shape checking to the shape function for the ApplyAdam operator. This shape function asserts that the variable and gradient have the same shape, but the TensorShape.merge_with() method allows false positives in the presence of None in either of the shapes.
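For example, gradient processing followed by apply_gradients works unchanged (a TF1-style sketch with a toy variable and loss; the value clipping stands in for whatever processing you need):

import tensorflow as tf

var = tf.Variable(tf.zeros([128, 10]))
loss = tf.reduce_sum(tf.square(var))

opt = tf.train.AdamOptimizer(learning_rate=0.001)
grads_and_vars = opt.compute_gradients(loss)

# Processing works even when a gradient dimension is reported as None;
# the shapes are verified at runtime.
clipped = [(tf.clip_by_value(g, -1.0, 1.0), v)
           for g, v in grads_and_vars if g is not None]
train_op = opt.apply_gradients(clipped)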
Finally, if you need to process the gradients at graph construction time, and your processing somehow depends on the gradients having known shapes, you can always use the Tensor.set_shape() method to copy the shape of the variable to the shape of the gradient, as these must be equivalent:
import tensorflow as tf

var = tf.Variable(tf.zeros([128, 10]))  # example variable; any shape works
loss = tf.reduce_sum(tf.square(var))    # example loss
grad = tf.gradients(loss, [var])[0]

# `grad` and `var` must have the same shape, so copying the variable's
# static shape onto the gradient is always valid.
grad.set_shape(var.get_shape())