I've trained a network in TensorFlow using a session
sess.run([train_op, loss], feed_dict=feed_dict)
What's the simplest way to give the session an input of a single data row and print the output?
I've tried (many variations of)
sess.run(print_function, data_row)
But I get the result
You must feed a value for placeholder tensor 'Placeholder_1'
with dtype int32 and shape [<batch size>]
Assumption - Your first dimension represents the number of inputs (as is the case in most offical examples)
I'd suggest making the first dimension of the placeholder as None to have the option of passing any number of batches. Have a look at this tutorial for an example - http://learningtensorflow.com/lesson4/
Quoting from this tutorial,
The first dimension of the placeholder is None, meaning we can have any number of rows. The second dimension is fixed at 3, meaning each row needs to have three columns of data.
It's also well documented in the official documentation in https://www.tensorflow.org/resources/faq#how_do_i_build_a_graph_that_works_with_variable_batch_sizes.
Related
I'm trying to apply federated learning to an existing keras model that takes two inputs. When I call tff.learning.from_compiled_keras_model and include a dummy batch, I get this error: ValueError: Layer model_1 expects 2 inputs, but it received 1 input tensors. Inputs received: [<tf.Tensor 'packed:0' shape=(2, 20) dtype=int64>].
The model accepts two numpy arrays as inputs, so I defined my dummy_batch as:
x = tf.constant(np.random.randint(1,100, size=[20]))
collections.OrderedDict([('x', [x, x]), ('y', x)])
I dug around a little bit and saw that eventually, tf.convert_to_tensor_or_sparse_tensor gets called on the input list (in the __init__ for _KerasModel), and that returns a single tensor of shape (2,20), instead of two separate arrays or tensors. Is there some other way I can represent the list of inputs to avoid this issue?
The TFF team just pushed a commit that should contain this bugfix; this commit should be what you want. See in particular the change in tensorflow_federated/python/learning/model_utils_test.py--the added test case should have been a repro of your issue, and it now passes.
You were right to call out our call to tf.convert_to_tensor_or_sparse_tensor; now we use tf.nest.map_structure to map this function call to the leaves of the passed-in data structure. Note that Keras does some extra input normalization as well; we decided not to duplicate that logic here.
This change won't be in the pip package until the next release, but if you build from source, it will be available now.
Thanks for this catch, and pointing to the right place!
I am trying to predict() the output for a single data point d, using my trained Keras model loaded from a file. But I get a ValueError If predicting from data tensors, you should specify the 'step' argument. What does that mean?
I tried setting step=1, but then I get a different error ValueError: Cannot feed value of shape () for Tensor u'input_1:0', which has shape '(?, 600)'.
Here is my code:
d = np.concatenate((hidden[p[i]], hidden[x[i]])).resize((1,600))
hidden[p[i]] = autoencoder.predict(d,steps=)
The model is expecting (?,600) as input. I have concatenated two numpy arrays of shape (300,) each to get (600,), which is resized to (1,600). This (1,600) is my input to predict().
In my case, the input to predict was None (because I had a bug in another part of the code).
In official doc, steps refer to the total number of steps before stopping. So steps=1 means make predictions on one batch instead of making prediction on one record (single data point).
https://keras.io/models/sequential/
-> Define value of steps argument,
d = np.concatenate((hidden[p[i]],
hidden[x[i]])).resize((1,600))
hidden[p[i]] = autoencoder.predict(d,steps=1)
If you are using a test data generator, it is good practice to define the steps, as mentioned in the documentation.
If you are predicting a single instance, no need to define the steps. Just make sure the argument (i.e. instance 'd') is not None, otherwise that error will show up. Some reshaping may also be necessary.
in my case i got the same error, i just reshaped the data to predict with numpy function reshape() to the shape of the data originally used to train the model.
The answer here says that one returns an operation while the other returns a tensor. That is pretty obvious from the name and from the documentation. However, suppose I do the following:
logits = tf.add(tf.matmul(inputs, weights), biases, name='logits')
I am following the pattern described in Tensorflow Mechanics 101. Should I restore it as an operation or as a tensor? I am afraid that if I restore it as a tensor I will only get the last computed values for the logits; nonetheless, the post here, seems to suggest that there is no difference or that I should just use get_tensor_by_name. The idea is to compute the logits for a new set of inputs and then make predictions accordingly.
Short answer: you can use both, get_operation_by_name() and get_tensor_by_name(). Long answer:
tf.Operation
When you call
op = graph.get_operation_by_name('logits')
... it returns an instance of type tf.Operation, which is a node in the computational graph, which performs some op on its inputs and produces one or more outputs. In this case, it's a plus op.
One can always evaluate an op in a session, and if this op needs some placehoder values to be fed in, the engine will force you to provide them. Some ops, e.g. reading a variable, don't have any dependencies and can be executed without placeholders.
In your case, (I assume) logits are computed from the input placeholder x, so logits doesn't have any value without a particular x.
tf.Tensor
On the other hand, calling
tensor = graph.get_tensor_by_name('logits:0')
... returns an object tensor, which has the type tf.Tensor:
Represents one of the outputs of an Operation.
A Tensor is a symbolic handle to one of the outputs of an Operation.
It does not hold the values of that operation's output, but instead
provides a means of computing those values in a TensorFlow tf.Session.
So, in other words, tensor evaluation is the same as operation execution, and all the restrictions described above apply as well.
Why is Tensor useful? A Tensor can be passed as an input to another Operation, thus forming the graph. But in your case, you can assume that both entities mean the same.
I am not able to understand the output from tf.nn.dynamic_rnn tensorflow function. The document just tells about the size of the output, but it doesn't tell what does each row/column means. From the documentation:
outputs: The RNN output Tensor.
If time_major == False (default), this will be a Tensor shaped:
[batch_size, max_time, cell.output_size].
If time_major == True, this will be a Tensor shaped:
[max_time, batch_size, cell.output_size].
Note, if cell.output_size is a (possibly nested) tuple of integers
or TensorShape objects, then outputs will be a tuple having the
same structure as cell.output_size, containing Tensors having shapes
corresponding to the shape data in cell.output_size.
state: The final state. If cell.state_size is an int, this will
be shaped [batch_size, cell.state_size]. If it is a
TensorShape, this will be shaped [batch_size] + cell.state_size.
If it is a (possibly nested) tuple of ints or TensorShape, this will
be a tuple having the corresponding shapes.
The outputs tensor is a 3-D matrix but what does each row/column represent?
tf.dynamic_rnn provides two outputs, outputs and state.
outputs contains the output of the RNN cell at every time instant. Assuming the default time_major == False, let's say you have an input composed of 10 examples with 7 time steps each and a feature vector of size 5 for every time step. Then your input would be 10x7x5 (batch_sizexmax_timexfeatures). Now you give this as an input to a RNN cell with output size 15. Conceptually, each time step of each example is input to the RNN, and you would get a 15-long vector for each of those. So that is what outputs contains, a tensor in this case of size 10x7x15 (batch_sizexmax_timexcell.output_size) with the output of the RNN cell at each time step. If you are only interested in the last output of the cell, you can just slice the time dimension to pick just the last element (e.g. outputs[:, -1, :]).
state contains the state of the RNN after processing all the inputs. Note that, unlike outputs, this doesn't contain information about every time step, but only about the last one (that is, the state after the last one). Depending on your case, the state may or may not be useful. For example, if you have very long sequences, you may not want/be able to processes them in a single batch, and you may need to split them into several subsequences. If you ignore the state, then whenever you give a new subsequence it will be as if you are beginning a new one; if you remember the state, however (e.g. outputting it or storing it in a variable), you can feed it back later (through the initial_state parameter of tf.nn.dynamic_rnn) in order to correctly keep track of the state of the RNN, and only reset it to the initial state (generally all zeros) after you have completed the whole sequences. The shape of state can vary depending on the RNN cell that you are using, but, in general, you have some state for each of the examples (one or more tensors with size batch_sizexstate_size, where state_size depends on the cell type and size).
I want to create an LSTM in tensorflow to predict time-series data. My training data is a set of input/output sequences of different lengths. Can I include multiple sequences of different lengths in the same training batch? Or do I need to pad them to equal lengths? If so, how?
Also: What will tensorflow do if the unrolled RNN is longer than the input sequence? The rnn() method contains an optional sequence_length argument which appears designed to handle this eventuality, but I'm not clear what it does.
Do you want to build the model from scratch? Otherwise you might want to look into the translate.py-model. Here your issue is taken care of by:
- padding the input (and output) sequences with a PAD-symbol (basically a neutral "no info"-symbol)
- buckets: For different groups of lengths you can create different buckets (makes sense only if your sequence-lengths are very different shortest to longest
You DONT have to batch inputs/output sequence of same length into a batch. TF has a way to specify the input size. The parameter "sequence_length", controls the number of time steps a cell is unrolled. So the TF will unroll your cell only up to sequence_length but not to the step size.
So while feeding the inputs and outputs also feed a sequence_length array which contain the length of each input
tf.nn.bidirectional_rnn(fwd_stacked_lstm_cells, bwd_stacked_lstm_cells,
reshaped_inputs,
sequence_length=sequence_length)
.....
feed_dict={
model.inputs: x,
model.targets: y,
model.sequence_length: lengths})
where
len(lengths) == batch_size and
for all i, lengths[i] == length of input x[i] (same as length of outpu y[i])