How to add selective nodes in the log file - tensorflow

I am trying to perform testing on a trained model. I have restored the checkpoint in the session. The model has a lot of operations, but I am only testing part of it.
I would like to see the op network that I am testing in TensorBoard, but currently it's all entangled with all the other operations that I do not want.
Is there any way to store the tf.summary for only the selected operations that you are running in the session?

Instead of merging all summaries with merged_summary = tf.summary.merge_all(), you can merge just the summary ops you want, e.g. merged_summary_group1 = tf.summary.merge([op1, op2, ...]). After that, replace every merged_summary in sess.run with merged_summary_group1.
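A minimal sketch of that approach (the metric names and log directory below are placeholders, not taken from the original question):

import tensorflow as tf

# Stand-ins for the ops from the part of the graph under test.
loss = tf.placeholder(tf.float32, name="loss")
accuracy = tf.placeholder(tf.float32, name="accuracy")

loss_summary = tf.summary.scalar("test_loss", loss)
acc_summary = tf.summary.scalar("test_accuracy", accuracy)

# Merge only the summaries you care about, instead of tf.summary.merge_all().
merged_summary_group1 = tf.summary.merge([loss_summary, acc_summary])

with tf.Session() as sess:
    writer = tf.summary.FileWriter("logs/test_only", sess.graph)
    summary_str = sess.run(merged_summary_group1,
                           feed_dict={loss: 0.3, accuracy: 0.9})
    writer.add_summary(summary_str, global_step=0)
    writer.close()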

Related

Loading a model from tensorflow SavedModel onto multiple GPUs

Let's say someone hands me a TF SavedModel and I would like to replicate this model on the 4 GPUs I have on my machine so I can run inference in parallel on batches of data. Are there any good examples of how to do this?
I can load a saved model in this way:
def load_model(self, saved_model_dirpath):
    '''Loads a model from a saved model directory - this should
    contain a .pb file and a variables directory'''
    signature_key = tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
    input_key = 'input'
    output_key = 'output'
    meta_graph_def = tf.saved_model.loader.load(self.sess,
                                                [tf.saved_model.tag_constants.SERVING],
                                                saved_model_dirpath)
    signature = meta_graph_def.signature_def
    input_tensor_name = signature[signature_key].inputs[input_key].name
    output_tensor_name = signature[signature_key].outputs[output_key].name
    self.input_tensor = self.sess.graph.get_tensor_by_name(input_tensor_name)
    self.output_tensor = self.sess.graph.get_tensor_by_name(output_tensor_name)
...but this would require that I have a handle to the session. For models that I have written myself, I would have access to the inference function and I could just call it and wrap it using with tf.device(), but in this case, I'm not sure how to extract the inference function out of a SavedModel. Should I load 4 separate sessions or is there a better way? Couldn't find much documentation on this, but apologies in advance if I missed something. Thanks!
There is no support for this use case in TensorFlow at the moment. Unfortunately, "replicating the inference function" based only on the SavedModel (which is basically the computation graph with some metadata), is a fairly complex (and brittle, if implemented) graph transformation problem.
If you don't have access to the source code that produced this model, your best bet is to load the SavedModel 4 times into 4 separate graphs, rewriting the target device to the corresponding GPU each time. Then, run each graph/session separately.
Note that you can invoke sess.run() multiple times concurrently since sess.run() releases the GIL for the time of actual computation. All you need is several Python threads.
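A sketch of that suggestion, assuming four GPUs, the 'input'/'output' signature keys from the question, and a hypothetical SavedModel path and input shape; allow_soft_placement is set in case the device scope cannot be applied to every imported node:

import threading
import numpy as np
import tensorflow as tf

SAVED_MODEL_DIR = "/path/to/saved_model"   # hypothetical path
NUM_GPUS = 4

sessions, inputs, outputs = [], [], []
for i in range(NUM_GPUS):
    graph = tf.Graph()
    with graph.as_default(), tf.device("/gpu:%d" % i):
        sess = tf.Session(graph=graph,
                          config=tf.ConfigProto(allow_soft_placement=True))
        meta_graph_def = tf.saved_model.loader.load(
            sess, [tf.saved_model.tag_constants.SERVING], SAVED_MODEL_DIR)
        signature = meta_graph_def.signature_def[
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
        inputs.append(graph.get_tensor_by_name(signature.inputs['input'].name))
        outputs.append(graph.get_tensor_by_name(signature.outputs['output'].name))
        sessions.append(sess)

def infer(i, batch, results):
    # sess.run releases the GIL during the actual computation,
    # so these threads overlap across the four GPUs.
    results[i] = sessions[i].run(outputs[i], feed_dict={inputs[i]: batch})

batches = [np.zeros((1, 224, 224, 3))] * NUM_GPUS   # hypothetical input batches
results = [None] * NUM_GPUS
threads = [threading.Thread(target=infer, args=(i, b, results))
           for i, b in enumerate(batches)]
for t in threads:
    t.start()
for t in threads:
    t.join()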

How to smoothly produce Tensorflow auc summaries for training and test sets?

Tensorflow describes writing file summaries to visualize graph execution.
I envision three stages:
training the data (with optimization)
measuring accuracy on the training set (no optimization)
measuring accuracy on the test set (no optimization!)
I'd like all stages in the same script, as in the evaluate function of the wide_and_deep tutorial, but with the low-level API. I'd like three different graphs for stats like loss or AUC, one for each stage.
Suppose I use one session, and in each stage I define an AUC summary op:
# define auc
auc, auc_op = tf.metrics.auc(labels, predictions)
# summary scalar to track it
tf.summary.scalar("auc", auc_op, family=family_name)
# merge all summaries for evaluation and later writing
summary_op = tf.summary.merge_all()
...
summary_writer.add_summary(summary, step_num)
There are three graphs, but the first graph has all three runs on it, and the second graph has the last two runs (see below). What's worse, each stage starts from the previous state. This makes sense, because all the variables from the previous stages are still around.
I could use a different session for each stage, but that would throw away the model as well.
What is the smooth way to handle this?
I'd like to just clear some of the summary variables. I've tried re-initializing some variables, looked at related questions, read about name scope and variable scope and tried not to re-use variables for AUC, read about variables and sharing, looked into pruning nodes (though I don't understand it), etc. I have not made it work yet.
I am using the low-level API. I saw something like this in the high-level API in _eval_metric_ops, but I don't understand how they 'clear' the different stages. With name_scope?
Do I have to save and load the model into a new session just for this, or is there some clean way to graph each summary separately?
The metric ops will be local variables, so you could run tf.local_variables_initializer() in your Session, which will reset all of your metrics. You could also look through the local variables collection for those with "auc" in the name if you wanted to be a bit more discerning. The high-level way to do this would be to use an Estimator, which will manage metrics for you.
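A minimal sketch of resetting the metric's local variables between stages (the random data here just stands in for the training and test batches):

import numpy as np
import tensorflow as tf

labels = tf.placeholder(tf.bool, [None])
predictions = tf.placeholder(tf.float32, [None])

# tf.metrics.auc keeps its accumulators in *local* variables.
auc, auc_update = tf.metrics.auc(labels, predictions)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for stage in ("train_eval", "test_eval"):
        # Reset every metric accumulator before the stage starts. To be more
        # selective, initialize only the matching variables, e.g.
        # tf.variables_initializer([v for v in tf.local_variables() if "auc" in v.name]).
        sess.run(tf.local_variables_initializer())
        for _ in range(10):   # stand-in for the batches of this stage
            sess.run(auc_update, feed_dict={labels: np.random.rand(32) > 0.5,
                                            predictions: np.random.rand(32)})
        print(stage, "AUC:", sess.run(auc))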

How to structure the model for training and evaluation on the test set

I want to train a model. Every 1000 steps, I want to evaluate it on the test set and write the result to the TensorBoard log. However, there's a problem. I have code like this:
image_b_train, label_b_train = tf.train.shuffle_batch(...)
out_train = model.inference(image_b_train)
accuracy_train = tf.reduce_mean(...)
image_b_test, label_b_test = tf.train.shuffle_batch(...)
out_test = model.inference(image_b_test)
accuracy_test = tf.reduce_mean(...)
where model.inference declares the variables of the model. The problem is that the test set has its own input queue, and I can't swap one queue for another in TensorFlow.
Currently I solve the problem by creating 2 graphs, one for training and the other for testing, and I copy from one graph to the other with tf.train.Saver. Another solution might be to use tf.get_variable, but that relies on global variable sharing, which I don't like because my code becomes less reusable.
Yes, you need two graphs. These graphs can share variables. This can be done by:
Using Keras layers (from tf.contrib.keras) which let you define the model once and use it to compute two inference graphs
Using slim-style layers (from tf.layers) with tf.get_variable and reuse
Using tf.make_template to make your own model-like object which can be called once to build the training graph and once to build the inference graph
Using tf.estimator.Estimator which lets you define a model function once and runs it automatically for training and evaluation for you
There are other options, but any of these is well-supported and should unblock you.
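As one illustration, here is a minimal sketch of the tf.make_template option; the layer sizes and placeholders are made up, and the same pattern works with tf.variable_scope plus reuse:

import tensorflow as tf

def _inference(images):
    # Variables created here (via tf.layers) belong to the template and are
    # shared by every call to it.
    net = tf.layers.dense(images, 128, activation=tf.nn.relu, name="fc1")
    return tf.layers.dense(net, 10, name="logits")

# The first call creates the variables; every later call reuses them.
inference = tf.make_template("model", _inference)

image_b_train = tf.placeholder(tf.float32, [None, 784])
image_b_test = tf.placeholder(tf.float32, [None, 784])

out_train = inference(image_b_train)   # creates the variables
out_test = inference(image_b_test)     # reuses the same variables

print(len(tf.trainable_variables()))   # 4: one kernel and one bias per layer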

How to get both loss and model output at once, on a batch of data in Keras?

I'm using Keras w/ Tensorflow backend to train a NN.
I'm using train_on_batch for training, which returns the loss on the given batch. How do I also get the output classification on that batch? (I'd like to do some visualisations of the output.)
To do that I currently make another call to predict to get the model output, but that's redundant, since train_on_batch has already passed the input batch forward.
In Caffe, when an image is fed forward, the intermediate layer outputs stay stored in net.blobs, but in Keras/Tensorflow it seems that if we want to get an intermediate output we have to rerun the computational graph for each intermediate output we want to access on CPU, as described here. Is there a way to access many/all intermediate layers' outputs without rerunning the graph for each ?
I don't mind having a tensorflow-specific workaround.
If you use the functional API, this is pretty straightforward.
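For the intermediate-outputs part of the question, one hedged sketch with the functional API is a backend function that fetches several tensors in a single forward pass (the toy model below is made up):

import numpy as np
from keras.layers import Input, Dense
from keras.models import Model
import keras.backend as K

inp = Input(shape=(20,))
hidden = Dense(64, activation="relu", name="hidden")(inp)
out = Dense(3, activation="softmax", name="out")(hidden)

model = Model(inputs=inp, outputs=out)
model.compile(optimizer="adam", loss="categorical_crossentropy")

# One backend function that evaluates the hidden activations and the final
# predictions together, so the graph is run only once per batch.
fetch_activations = K.function([inp], [hidden, out])

x = np.random.rand(8, 20).astype("float32")
hidden_vals, preds = fetch_activations([x])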
In addition to @MohamedEzz's answer, you can create a custom callback which can perform the operations you require during the training process. Callbacks have methods that run your code at on_epoch_end, on_epoch_begin, on_train_end, and so on.
This way you can hold on to what you need from each batch.
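A minimal sketch of such a callback, collecting the per-batch loss that Keras reports (the class name is arbitrary):

import keras

class BatchLossLogger(keras.callbacks.Callback):
    '''Collects the per-batch loss that Keras reports during fit().'''
    def on_train_begin(self, logs=None):
        self.batch_losses = []

    def on_batch_end(self, batch, logs=None):
        # logs holds the metrics for the batch that just finished.
        self.batch_losses.append(logs.get("loss"))

# Usage: model.fit(x, y, callbacks=[BatchLossLogger()])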

At what stage is a tensorflow graph set up?

An optimizer typically runs the same computation graph for many steps until convergence. Does tensorflow set up the graph at the beginning and reuse it for every step? What if I change the batch size during training? What if I make some minor change to the graph, like changing the loss function? What if I make some major change to the graph? Does tensorflow pre-generate all possible graphs? Does tensorflow know how to optimize the entire computation when the graph changes?
As keveman says, from the client's perspective there is a single TensorFlow graph. In the runtime, there can be multiple pruned subgraphs that contain just the nodes that are necessary to compute the values t1, t2 etc. that you fetch when calling sess.run([t1, t2, ...]).
Calling sess.run([t1, t2]) will prune the overall graph (sess.graph) down to the subgraph required to compute those values: i.e. the operations that produce t1 and t2 and all of their antecedents. If you subsequently call sess.run([t3, t4]), the runtime will prune the graph down to the subgraph required to compute t3 and t4. Each time you pass a new combination of values to fetch, TensorFlow will compute a new pruned graph and cache it; this is why the first sess.run() can be somewhat slower than subsequent ones.
If the pruned graphs overlap, TensorFlow will reuse the "kernel" for the ops that are shared. This is relevant because some ops (e.g. tf.Variable and tf.FIFOQueue) are stateful, and their contents can be used in both pruned graphs. This allows you, for example, to initialize your variables with one subgraph (e.g. sess.run(tf.initialize_all_variables())), train them with another (e.g. sess.run(train_op)), and evaluate your model with a third (e.g. sess.run(loss, feed_dict={x: ...})). It also lets you enqueue elements to a queue with one subgraph, and dequeue them with another, which is the foundation of the input pipelines.
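A small sketch of that behaviour: three sess.run() calls with different fetches exercise three different pruned subgraphs, but the variable they share keeps its state between them (the toy model is made up):

import tensorflow as tf

x = tf.placeholder(tf.float32)
w = tf.Variable(2.0)
loss = tf.square(w * x - 1.0)
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    # Fetch 1: only the initializer subgraph.
    sess.run(tf.global_variables_initializer())
    # Fetch 2: the training subgraph; the variable kernel is updated in place.
    for _ in range(100):
        sess.run(train_op, feed_dict={x: 1.0})
    # Fetch 3: the evaluation subgraph reuses the same variable kernel,
    # so it sees the trained value of w.
    print(sess.run(loss, feed_dict={x: 1.0}))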
TensorFlow exposes only one graph that is visible to the user, namely the one specified by the user. The user can run the graph with Session.run() or by calling Tensor.eval() on some tensor. A Session.run() call can specify some tensors to be fed and others to be fetched. Depending on what needs to be fetched, the TensorFlow runtime could be internally constructing and optimizing various data structures, including a pruned version of the user-visible graph. However, this internal graph is not visible to the user in any way. No, TensorFlow doesn't 'pre-generate' all possible graphs. Yes, TensorFlow does perform extensive optimizations on the computation graph. And finally, changing the batch size of a tensor that is fed doesn't change the structure of the graph.