Tensorflow: difference get_tensor_by_name vs get_operation_by_name? - tensorflow

The answer here says that one returns an operation while the other returns a tensor. That is pretty obvious from the name and from the documentation. However, suppose I do the following:
logits = tf.add(tf.matmul(inputs, weights), biases, name='logits')
I am following the pattern described in Tensorflow Mechanics 101. Should I restore it as an operation or as a tensor? I am afraid that if I restore it as a tensor I will only get the last computed values for the logits; nonetheless, the post here, seems to suggest that there is no difference or that I should just use get_tensor_by_name. The idea is to compute the logits for a new set of inputs and then make predictions accordingly.

Short answer: you can use both, get_operation_by_name() and get_tensor_by_name(). Long answer:
tf.Operation
When you call
op = graph.get_operation_by_name('logits')
... it returns an instance of type tf.Operation, which is a node in the computational graph, which performs some op on its inputs and produces one or more outputs. In this case, it's a plus op.
One can always evaluate an op in a session, and if this op needs some placehoder values to be fed in, the engine will force you to provide them. Some ops, e.g. reading a variable, don't have any dependencies and can be executed without placeholders.
In your case, (I assume) logits are computed from the input placeholder x, so logits doesn't have any value without a particular x.
tf.Tensor
On the other hand, calling
tensor = graph.get_tensor_by_name('logits:0')
... returns an object tensor, which has the type tf.Tensor:
Represents one of the outputs of an Operation.
A Tensor is a symbolic handle to one of the outputs of an Operation.
It does not hold the values of that operation's output, but instead
provides a means of computing those values in a TensorFlow tf.Session.
So, in other words, tensor evaluation is the same as operation execution, and all the restrictions described above apply as well.
Why is Tensor useful? A Tensor can be passed as an input to another Operation, thus forming the graph. But in your case, you can assume that both entities mean the same.

Related

What is tensorflow.matmul?

From the output of print, it is function. But according to the official document:
An Operation is a node in a TensorFlow Graph that takes zero or more
Tensor objects as input, and produces zero or more Tensor objects as
output. Objects of type Operation are created by calling a Python op
constructor (such as tf.matmul) or tf.Graph.create_op.
it is a constructor. So I think it is a class name. But, printing the return value of tf.matmul shows it is a tensor, not an "Object of type Operation". Is the class Tensor inherited from the class Operation? I tried to find the definition of tf.matmul in tensorflow source code but could not get it.
tf.matmul (or tf.linalg.matmul) is a function. You can find its definition in the math_ops module. The behavior of these functions do depend on whether you are using eager execution (default in 2.x) or graph mode (default in 1.x).
With eager execution, the function receives a couple of eager tensors (tensors with their actual value, as opposed to "symbolic") and runs the computation of their matrix product. What you get is another eager tensor containing the result.
In graph mode, the function does not run any computation. It just receives two symbolic tensors (for which the value will not be determined until later), adds a matrix production operation to the current graph and gives you the symbolic tensor of its result. Tensors do not inherit from operations in any case. The graph contains nodes which are operations, which generally have inputs and/or outputs that are tensors. In graph mode, functions like tf.linalg.matmul usually give you the resulting tensor, not the operation, because it is more convenient (you rarely need to access the operation itself). When you give a name to these functions (e.g. name='MyMatMul'), it will be the name of the operation, and the output tensors of the operation (which in most cases is only one) will have that name plus : and its output index (e.g. MyMatMul:0). When you have a tensor t, you can access the operation that produced it with t.op. When you have an operation op, you can access the input and output tensors of the operation with op.inputs and op.outputs, and its type (the kind of operation it is representing, like MatMul) with op.type. These properties cannot be accessed with eager execution, as they only make sense when you have a graph.

Adapting an existing keras model with multiple inputs to tensorflow federated

I'm trying to apply federated learning to an existing keras model that takes two inputs. When I call tff.learning.from_compiled_keras_model and include a dummy batch, I get this error: ValueError: Layer model_1 expects 2 inputs, but it received 1 input tensors. Inputs received: [<tf.Tensor 'packed:0' shape=(2, 20) dtype=int64>].
The model accepts two numpy arrays as inputs, so I defined my dummy_batch as:
x = tf.constant(np.random.randint(1,100, size=[20]))
collections.OrderedDict([('x', [x, x]), ('y', x)])
I dug around a little bit and saw that eventually, tf.convert_to_tensor_or_sparse_tensor gets called on the input list (in the __init__ for _KerasModel), and that returns a single tensor of shape (2,20), instead of two separate arrays or tensors. Is there some other way I can represent the list of inputs to avoid this issue?
The TFF team just pushed a commit that should contain this bugfix; this commit should be what you want. See in particular the change in tensorflow_federated/python/learning/model_utils_test.py--the added test case should have been a repro of your issue, and it now passes.
You were right to call out our call to tf.convert_to_tensor_or_sparse_tensor; now we use tf.nest.map_structure to map this function call to the leaves of the passed-in data structure. Note that Keras does some extra input normalization as well; we decided not to duplicate that logic here.
This change won't be in the pip package until the next release, but if you build from source, it will be available now.
Thanks for this catch, and pointing to the right place!

Why is "step" argument necessary when predicting using data tensors? what does this error mean?

I am trying to predict() the output for a single data point d, using my trained Keras model loaded from a file. But I get a ValueError If predicting from data tensors, you should specify the 'step' argument. What does that mean?
I tried setting step=1, but then I get a different error ValueError: Cannot feed value of shape () for Tensor u'input_1:0', which has shape '(?, 600)'.
Here is my code:
d = np.concatenate((hidden[p[i]], hidden[x[i]])).resize((1,600))
hidden[p[i]] = autoencoder.predict(d,steps=)
The model is expecting (?,600) as input. I have concatenated two numpy arrays of shape (300,) each to get (600,), which is resized to (1,600). This (1,600) is my input to predict().
In my case, the input to predict was None (because I had a bug in another part of the code).
In official doc, steps refer to the total number of steps before stopping. So steps=1 means make predictions on one batch instead of making prediction on one record (single data point).
https://keras.io/models/sequential/
-> Define value of steps argument,
d = np.concatenate((hidden[p[i]],
hidden[x[i]])).resize((1,600))
hidden[p[i]] = autoencoder.predict(d,steps=1)
If you are using a test data generator, it is good practice to define the steps, as mentioned in the documentation.
If you are predicting a single instance, no need to define the steps. Just make sure the argument (i.e. instance 'd') is not None, otherwise that error will show up. Some reshaping may also be necessary.
in my case i got the same error, i just reshaped the data to predict with numpy function reshape() to the shape of the data originally used to train the model.

Why variables and constants are operations in TensorFlow?

Intuitively, I expected an operation to be something that takes an input and modifies it (add, substract, divide, square root...). In fact, that's the definition of operation I found on the Internet. Then, why variables and constants are also operations in TensorFlow?
TensorFlow generalizes your definition of operation as something that takes zero or more inputs and produces zero or more outputs. Concretely, a TensorFlow Operation is defined as:
An Operation is a node in a TensorFlow Graph that takes zero or more Tensor objects as input, and produces zero or more Tensor objects as output.
Therefore:
A constant is an operation without inputs that produces a single Tensor as output.
A variable is a special (stateful) operation that takes one Tensor (initial value) as input and produces another Tensor as output.

Caching Computations in TensorFlow

Is there a canonical way to reuse computations from a previously-supplied placeholder in TensorFlow? My specific use case:
supply many inputs (using one placeholder) simultaneously, all of which are fed through a network to obtain smaller representations
define a loss based on various combinations of these smaller representations
train on one batch at a time, where each batch uses some subset of the inputs, without recomputing the smaller representations
Here is the goal in code, but which is defective because the same computations are carried out again and again:
X_in = some_fixed_data
combinations_in = large_set_of_combination_indices
for combination_batch_in in batches(combinations_in, batch_size=128):
session.run(train_op, feed_dict={X: X_in, combinations: combination_batch_in})
Thanks.
The canonical way to share computed values across sess.Run() calls is to use a Variable. In this case, you could set up your graph so that when the Placeholders are fed, they compute a new value of the representation that is saved into a Variable. A separate portion of the graph reads those Variables to compute the loss. This will not work if you need to compute gradients through the part of the graph that computes the representation. Computing those gradients will require recomputing every Op in the encoder.
This is the kind of thing that should be solved automatically with CSE (common subexpression elimination). Not sure what the support in TensorFlow right now, might be kind of spotty, but there's optimizer_do_cse flag for Graph options which is defaulting to false, and you can set it to true using GraphConstructorOptions. Here's a C++ example of using GraphConstructorOptions (sorry, couldn't find a Python one)
If that doesn't work, you could do "manual CSE", ie, figure out which part is being needlessly recomputed, factor it out into separate Tensor, and reference that tensor in all the calculations.