Shape assertions and declarations in TensorFlow - tensorflow

I use tf.strided_slice to get one value out of a 1-d tensor. Unfortunately, the inferred shape is ?. How can I assert/declare that it has shape [1]?
P.S. I used reshape instead, but it might have performance implications in some cases.

Use x.set_shape() to provide additional information about the shape of this tensor that cannot be inferred from the graph alone.
You can get more information from the FAQ:
The tf.Tensor.set_shape method updates the static shape of a Tensor
object, and it is typically used to provide additional shape
information when this cannot be inferred directly. It does not change
the dynamic shape of the tensor.
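For example, a minimal sketch (assuming a 1-d placeholder and a slice position that is only known at run time, so the static shape cannot be inferred):

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None])  # 1-d input of unknown length
i = tf.placeholder(tf.int32, shape=[])        # slice position fed at run time
y = tf.strided_slice(x, [i], [i + 1])         # static shape comes out as (?,)

print(y.get_shape())  # (?,)
y.set_shape([1])      # declare the static shape; this adds no op to the graph
print(y.get_shape())  # (1,)

Note that set_shape() only updates the static shape metadata; as the FAQ says, it does not change (or check) the dynamic shape at run time.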

Related

What is tensorflow.matmul?

From the output of print, it is a function. But according to the official documentation:
An Operation is a node in a TensorFlow Graph that takes zero or more
Tensor objects as input, and produces zero or more Tensor objects as
output. Objects of type Operation are created by calling a Python op
constructor (such as tf.matmul) or tf.Graph.create_op.
it is a constructor. So I think it is a class name. But printing the return value of tf.matmul shows it is a tensor, not an "Object of type Operation". Is the class Tensor inherited from the class Operation? I tried to find the definition of tf.matmul in the TensorFlow source code but could not find it.
tf.matmul (or tf.linalg.matmul) is a function. You can find its definition in the math_ops module. The behavior of these functions does depend on whether you are using eager execution (the default in 2.x) or graph mode (the default in 1.x).
With eager execution, the function receives a couple of eager tensors (tensors with their actual value, as opposed to "symbolic") and runs the computation of their matrix product. What you get is another eager tensor containing the result.
In graph mode, the function does not run any computation. It just receives two symbolic tensors (whose values will not be determined until later), adds a matrix product operation to the current graph, and gives you the symbolic tensor of its result.
Tensors do not inherit from operations in any case. The graph contains nodes, which are operations; these generally have inputs and/or outputs that are tensors. In graph mode, functions like tf.linalg.matmul usually give you the resulting tensor, not the operation, because it is more convenient (you rarely need to access the operation itself).
When you give a name to these functions (e.g. name='MyMatMul'), it becomes the name of the operation, and each output tensor of the operation (in most cases there is only one) will have that name plus : and its output index (e.g. MyMatMul:0). When you have a tensor t, you can access the operation that produced it with t.op. When you have an operation op, you can access its input and output tensors with op.inputs and op.outputs, and its type (the kind of operation it represents, like MatMul) with op.type. These properties cannot be accessed with eager execution, as they only make sense when you have a graph.
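A minimal graph-mode sketch of those properties (assuming TensorFlow 1.x defaults and a fresh graph; the shapes and the name 'MyMatMul' are just illustrative):

import tensorflow as tf

a = tf.placeholder(tf.float32, shape=[2, 3])
b = tf.placeholder(tf.float32, shape=[3, 4])
t = tf.matmul(a, b, name='MyMatMul')

print(t.name)                          # MyMatMul:0, i.e. output 0 of the op
print(t.op.type)                       # MatMul
print([i.name for i in t.op.inputs])   # ['Placeholder:0', 'Placeholder_1:0']
print([o.name for o in t.op.outputs])  # ['MyMatMul:0']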

Tflite: Resize output tensor based on input tensor contents

I am writing a custom op that outputs a tensor whose shape depends on the values of the input tensor. The problem is that we don't have access to the tensor values in the Prepare method. We can get the tensor shapes but the values are not available. How do I implement this?
On a related note, how do I support outputting a tensor with partially specified shape? The tensor would need to be allocated during the eval function, but I don't see an API to allocate tensors at run time.

Tensorflow: difference get_tensor_by_name vs get_operation_by_name?

The answer here says that one returns an operation while the other returns a tensor. That is pretty obvious from the name and from the documentation. However, suppose I do the following:
logits = tf.add(tf.matmul(inputs, weights), biases, name='logits')
I am following the pattern described in Tensorflow Mechanics 101. Should I restore it as an operation or as a tensor? I am afraid that if I restore it as a tensor I will only get the last computed values for the logits; nonetheless, the post here seems to suggest that there is no difference, or that I should just use get_tensor_by_name. The idea is to compute the logits for a new set of inputs and then make predictions accordingly.
Short answer: you can use both get_operation_by_name() and get_tensor_by_name(). Long answer:
tf.Operation
When you call
op = graph.get_operation_by_name('logits')
... it returns an instance of type tf.Operation, which is a node in the computational graph that performs some computation on its inputs and produces one or more outputs. In this case, it's an Add op.
One can always evaluate an op in a session, and if this op needs some placeholder values to be fed in, the engine will force you to provide them. Some ops, e.g. reading a variable, don't have any dependencies and can be executed without placeholders.
In your case (I assume), logits are computed from the input placeholder x, so logits doesn't have any value without a particular x.
tf.Tensor
On the other hand, calling
tensor = graph.get_tensor_by_name('logits:0')
... returns an object tensor, which has the type tf.Tensor:
Represents one of the outputs of an Operation.
A Tensor is a symbolic handle to one of the outputs of an Operation.
It does not hold the values of that operation's output, but instead
provides a means of computing those values in a TensorFlow tf.Session.
So, in other words, evaluating a tensor executes the operation that produces it, and all the restrictions described above apply as well.
Why is a Tensor useful? A Tensor can be passed as an input to another Operation, thus forming the graph. But in your case, you can assume that both entities refer to the same node in the graph.
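To make the difference concrete, here is a small sketch reusing the logits line from the question (the toy shapes and values are illustrative): running the operation returns None, while running the tensor returns the computed values.

import numpy as np
import tensorflow as tf

inputs = tf.placeholder(tf.float32, shape=[None, 3], name='inputs')
weights = tf.Variable(tf.ones([3, 2]))
biases = tf.Variable(tf.zeros([2]))
logits = tf.add(tf.matmul(inputs, weights), biases, name='logits')

graph = tf.get_default_graph()
op = graph.get_operation_by_name('logits')     # the Add node itself
tensor = graph.get_tensor_by_name('logits:0')  # output 0 of that node

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    x = np.ones((1, 3), dtype=np.float32)
    print(sess.run(op, feed_dict={inputs: x}))      # None
    print(sess.run(tensor, feed_dict={inputs: x}))  # [[3. 3.]]

In both cases the values are recomputed from the feed on every run, so there is no risk of getting stale "last computed" logits.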

In Tensorflow, what is the difference between a tensor that has a type ending in _ref and a tensor that does not?

The docs say:
In addition, variants of these types with the _ref suffix are defined
for reference-typed tensors.
What exactly does this mean? What are reference-typed tensors and how do they differ from standard ones?
A reference-typed tensor is mutable. The most common way to create a reference-typed tensor is to define a tf.Variable: defining a tf.Variable whose initial value has dtype tf.float32 will create a reference-typed tensor with dtype tf.float32_ref. You can mutate a reference-typed tensor by passing it as the first argument to tf.assign().
(Note that reference-typed tensors are something of an implementation detail in the present version of TensorFlow. We'd encourage you to use higher-level wrappers like tf.Variable, which may migrate to alternative representations for mutable state in the future.)
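A small sketch of what this looks like in practice (assuming TensorFlow 1.x with classic, non-resource variables):

import tensorflow as tf

v = tf.Variable(1.0)     # initial value has dtype tf.float32
print(v.dtype)           # <dtype: 'float32_ref'>, the reference type
print(v.value().dtype)   # <dtype: 'float32'>, a dereferenced snapshot

assign_op = tf.assign(v, 2.0)  # mutates the tensor behind the reference
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(assign_op)
    print(sess.run(v))   # 2.0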

How to process gradients with a Dimension size of None

Using AdamOptimizer, when I get the gradients of a 2d variable, the second dimension's size ends up being None, while the first dimension is the same size as the variable's first dimension. This makes it difficult to process the gradients, since a size of None isn't compatible with other sizes for most functions. When I get the gradients of a 1d variable, the gradient's dimension size is the same as the variable's. I haven't tried variables with more than 2 dimensions.
Is this a bug? Is there a way to specify what the size of the gradient should be through the compute_gradients function? Is there a way to process the gradient that gets around the size None issue?
TL;DR: It shouldn't matter, and you can process and apply the gradients with tf.train.AdamOptimizer as normal. If you are seeing shape-related errors, they most likely arise from one of the known dimensions not matching.
The presence of None in a gradient tensor's shape simply means that the size in that dimension could not be statically inferred. This is not necessarily a bug: the shapes of many operators depend on their data inputs, and the TensorFlow Python front-end uses a simple heuristic (it only evaluates a limited set of ops with constant inputs) to decide which data inputs to evaluate. Almost all TensorFlow ops, excluding some image processing ops, will work on inputs whose shape is unknown (or only partially known) and perform checks at runtime instead.
The main way to process gradients is using Optimizer.apply_gradients(), which defers shape checking to the shape function for the ApplyAdam operator. This shape function asserts that the variable and gradient have the same shape, but the TensorShape.merge_with() method allows false positives in the presence of None in either of the shapes.
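For instance, a sketch of the usual compute/process/apply pattern (the tiny model and the clipping step are just illustrative choices):

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 4])
w = tf.Variable(tf.zeros([4, 2]))
loss = tf.reduce_mean(tf.square(tf.matmul(x, w)))

opt = tf.train.AdamOptimizer(learning_rate=1e-3)
grads_and_vars = opt.compute_gradients(loss)

# Process each gradient; clip_by_norm works even if a static dim is None.
processed = [(tf.clip_by_norm(g, 5.0), v)
             for g, v in grads_and_vars if g is not None]

# Strict shape checking is deferred to the runtime ApplyAdam op, so a
# gradient whose static shape contains None is accepted here.
train_op = opt.apply_gradients(processed)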
Finally, if you need to process the gradients at graph construction time, and your processing somehow depends on the gradients having known shapes, you can always use the Tensor.set_shape() method to copy the shape of the variable to the shape of the gradient, as these must be equivalent:
var = tf.Variable(...)
loss = ...
grad = tf.gradients(loss, [var])[0]
# `grad` and `var` must have the same shape.
grad.set_shape(var.get_shape())