Let's say someone hands me a TF SavedModel and I would like to replicate this model on the 4 GPUs I have on my machine so I can run inference in parallel on batches of data. Are there any good examples of how to do this?
I can load a saved model in this way:
def load_model(self, saved_model_dirpath):
    '''Loads a model from a saved model directory - this should
    contain a .pb file and a variables directory'''
    signature_key = tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
    input_key = 'input'
    output_key = 'output'
    meta_graph_def = tf.saved_model.loader.load(self.sess,
                                                [tf.saved_model.tag_constants.SERVING],
                                                saved_model_dirpath)
    signature = meta_graph_def.signature_def
    input_tensor_name = signature[signature_key].inputs[input_key].name
    output_tensor_name = signature[signature_key].outputs[output_key].name
    self.input_tensor = self.sess.graph.get_tensor_by_name(input_tensor_name)
    self.output_tensor = self.sess.graph.get_tensor_by_name(output_tensor_name)
...but this would require that I have a handle to the session. For models that I have written myself, I would have access to the inference function and could just call it and wrap it with tf.device(), but in this case I'm not sure how to extract the inference function from a SavedModel. Should I load 4 separate sessions, or is there a better way? I couldn't find much documentation on this, but apologies in advance if I missed something. Thanks!
There is no support for this use case in TensorFlow at the moment. Unfortunately, "replicating the inference function" based only on the SavedModel (which is basically the computation graph with some metadata) is a fairly complex (and brittle, if implemented) graph transformation problem.
If you don't have access to the source code that produced this model, your best bet is to load the SavedModel 4 times into 4 separate graphs, rewriting the target device to the corresponding GPU each time. Then, run each graph/session separately.
Note that you can invoke sess.run() multiple times concurrently, since sess.run() releases the GIL for the duration of the actual computation. All you need is several Python threads.
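A rough sketch of that approach (TF 1.x), assuming the same 'input'/'output' signature keys as in the question; clear_devices=True is forwarded to import_meta_graph so the surrounding tf.device scope can take effect, and the model path and batch data are placeholders:

import threading
import tensorflow as tf

NUM_GPUS = 4
SAVED_MODEL_DIR = '/path/to/saved_model'  # placeholder

sessions, inputs, outputs = [], [], []
for gpu_id in range(NUM_GPUS):
    graph = tf.Graph()
    with graph.as_default(), tf.device('/gpu:%d' % gpu_id):
        sess = tf.Session(graph=graph,
                          config=tf.ConfigProto(allow_soft_placement=True))
        # clear_devices=True drops device annotations stored in the SavedModel
        # so the ops can be pinned to this GPU instead.
        meta_graph_def = tf.saved_model.loader.load(
            sess, [tf.saved_model.tag_constants.SERVING], SAVED_MODEL_DIR,
            clear_devices=True)
        sig = meta_graph_def.signature_def[
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
        inputs.append(graph.get_tensor_by_name(sig.inputs['input'].name))
        outputs.append(graph.get_tensor_by_name(sig.outputs['output'].name))
        sessions.append(sess)

results = [None] * NUM_GPUS

def run_inference(i, batch):
    # sess.run releases the GIL during the computation, so these calls overlap.
    results[i] = sessions[i].run(outputs[i], feed_dict={inputs[i]: batch})

batches = [...]  # one input batch per GPU
threads = [threading.Thread(target=run_inference, args=(i, batches[i]))
           for i in range(NUM_GPUS)]
for t in threads:
    t.start()
for t in threads:
    t.join()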
Related
I am training a model using tf.keras, and I have many small .npy files with single observations in a folder on local disk. I have built a DataGenerator(keras.utils.Sequence) class and it works correctly, although I get a warning:
'tensorflow:multiprocessing can interact badly with TensorFlow, causing nondeterministic deadlocks. For high performance data pipelines tf.data is recommended.'
I have found out that I can simply create something like this:
ds = tf.data.Dataset.from_generator(
    DataGenerator, args=[...],
    output_types=(tf.float16, tf.uint8),
    output_shapes=([None, 256, 256, 3], [None, 256, 256, 1]),
)
and then my Keras DataGenerator would act as a single-file reader, with the tf.data Dataset as the interface for creating batches. My question is: does this make any sense? Would it be safer? Would it read the next batch while the previous batch is being trained on when using a simple model.fit?
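Regarding the last point, whether the next batch is prepared while the current one trains mostly comes down to prefetching; a minimal sketch continuing the snippet above (tf.data.experimental.AUTOTUNE assumed available, TF 1.13+ / 2.x; the epochs value is just an example):

# The Sequence already yields whole batches (leading dim is None above),
# so only prefetch is added to overlap input preparation with training.
ds = ds.prefetch(tf.data.experimental.AUTOTUNE)
model.fit(ds, epochs=10)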
I want to train different Keras models (or in some cases multiple runs of the same model to compare the results) in a queue (using TensorFlow as the backend, if that matters). In my current setup I create and fit all of these models in one big Python script, e.g. (simplified):
for i in range(10):
    model = create_model(i)
    model.compile(...)
    model.fit(...)
    some_function_to_save_model(model)
The create_model(i) function creates the specific model for the i'th run. This includes changing the number of inputs / labels for example. The compile function can be different (e.g. different optimizer) for each run as well.
While this code works for me and I have not found any problems, I am unclear if this is the correct way to do it because all of the models reside in the same TensorFlow Graph (if I understand the way Keras / TensorFlow work together correctly). My questions are:
Is this the correct way to run multiple independent models? (I do not want the i'th run to have any influence on the (i+1)'th run.)
Is running the models from different Python scripts (in this example model1.py, model2.py, ... model9.py) in any way better, technically speaking (I am not referring to readability / reproducibility here), because each model would then have its own separate TensorFlow graph / session?
Does clearing the Session / deleting the Graph via keras.backend.clear_session() have any influence in this case if it is run after the save function (some_function_to_save_model() inside the for loop)? Is this in some way beneficial compared to the current setup?
Once again: I am not concerned with the problems that might arise from messy code if all models are crammed together in one script instead of one script per model; my question is only about creating and training the models independently.
Unfortunately I did not find a concise answer to this (only suggestions using both methods). Maybe someone here can enlighten me?
Edit: Maybe I should be more precise. Basically I would like to have a technical explanation regarding the differences (advantages & disadvantages) of the following three cases:
create_and_train.py:
for i in range(10):
    model = create_model(i)
    model.compile(...)
    model.fit(...)
    some_function_to_save_model(model)
create_and_train.py:
for i in range(10):
    model = create_model(i)
    model.compile(...)
    model.fit(...)
    some_function_to_save_model(model)
    # clear session:
    keras.backend.clear_session()
create_and_train_i.py with i in [0, 1, ..., 9]:
i = 5 # (e.g.)
model = create_model(i)
model.compile(...)
model.fit(...)
some_function_to_save_model(model)
and, e.g., a bash script that loops through these scripts.
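A hypothetical driver for case 3, written in Python rather than bash (the script names follow the create_and_train_i.py pattern above):

import subprocess
import sys

# Each run happens in its own process, so it gets a fresh TensorFlow
# graph/session and all memory is released when the process exits.
for i in range(10):
    subprocess.run([sys.executable, 'create_and_train_{}.py'.format(i)],
                   check=True)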
I was going through the tensorflow-model-analysis documentation on evaluating TensorFlow models. The getting started guide talks about a special SavedModel called the EvalSavedModel.
Quoting the getting started guide:
This EvalSavedModel contains additional information which allows TFMA
to compute the same evaluation metrics defined in your model in a
distributed manner over a large amount of data, and user-defined
slices.
My question is how can I convert an already existing saved_model.pb to an EvalSavedModel?
EvalSavedModel is exported as a SavedModel message, so there is no need for such a conversion.
EvalSavedModel uses SavedModelBuilder under the hood. It populates the estimator graph with several placeholders and creates some additional metric collections. Later on, it performs the usual SavedModelBuilder procedure.
Source - https://github.com/tensorflow/model-analysis/blob/master/tensorflow_model_analysis/eval_saved_model/export.py#L228
P.S. I suppose you want to run model-analysis on a model exported by SavedModelBuilder. Since that SavedModel has neither the metric nodes nor the related collections that are created in an EvalSavedModel, it's useless to do so - model-analysis simply couldn't find any metrics related to your estimator.
If I understand your question correctly, you have a saved_model.pb generated either by tf.saved_model.simple_save, by tf.saved_model.builder.SavedModelBuilder, or by estimator.export_savedmodel.
If my understanding is correct, then you are exporting the training and inference graphs to saved_model.pb.
The point you mention from the guide on the TF website states that, in addition to exporting the training graph, we need to export the evaluation graph as well. That is called an EvalSavedModel.
The evaluation graph contains the metrics for the model, so that you can evaluate the model's performance using visualizations.
Before we export the EvalSavedModel, we should prepare an eval_input_receiver_fn, similar to a serving_input_receiver_fn.
We can specify other functionality as well, for example if we want the metrics to be computed in a distributed manner, or if we want to evaluate the model on slices of data rather than on the entire dataset. Such options can be specified in the eval_input_receiver_fn.
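For reference, a minimal sketch of what such an eval_input_receiver_fn might look like (TF 1.x / Estimator style; the feature names and spec here are made up for illustration):

import tensorflow as tf
import tensorflow_model_analysis as tfma

def eval_input_receiver_fn():
    # Placeholder fed with serialized tf.Examples at analysis time.
    serialized_tf_example = tf.placeholder(dtype=tf.string, shape=[None],
                                           name='input_example_placeholder')
    # Hypothetical feature spec - replace with your model's features and label.
    feature_spec = {
        'age': tf.FixedLenFeature([], tf.float32),
        'label': tf.FixedLenFeature([], tf.int64),
    }
    features = tf.parse_example(serialized_tf_example, feature_spec)
    return tfma.export.EvalInputReceiver(
        features=features,
        receiver_tensors={'examples': serialized_tf_example},
        labels=features['label'])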
Then we can export the EvalSavedModel using the code below:
tfma.export.export_eval_savedmodel(estimator=estimator,
                                   export_dir_base=export_dir,
                                   eval_input_receiver_fn=eval_input_receiver_fn)
I am trying to perform testing on a trained model. I have restored the checkpoint in the session. The model has a lot of operations, but I am only testing part of it.
I would like to see the op network that I am testing in TensorBoard, but currently it's all entangled with all the other operations that I do not want.
Is there any way to store the tf.summary only for the selected operations that you are running in the session?
Instead of merging all summaries with merged_summary = tf.summary.merge_all(), you can merge only the summary ops you want, e.g. merged_summary_group1 = tf.summary.merge([op1, op2, ...]). After that, replace merged_summary in sess.run with merged_summary_group1.
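A minimal runnable sketch (TF 1.x) of that pattern; the constant tensors here stand in for whatever part of the restored graph you are actually testing:

import tensorflow as tf

# Dummy scalars standing in for the tensors you want to monitor.
loss = tf.constant(0.5)
accuracy = tf.constant(0.9)

loss_summary = tf.summary.scalar('test_loss', loss)
acc_summary = tf.summary.scalar('test_accuracy', accuracy)

# Merge only the summaries you care about instead of tf.summary.merge_all().
merged_test_summaries = tf.summary.merge([loss_summary, acc_summary])

with tf.Session() as sess:
    writer = tf.summary.FileWriter('logs/test', sess.graph)
    summary_str = sess.run(merged_test_summaries)
    writer.add_summary(summary_str, global_step=0)
    writer.close()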
I want to train a model. Every 1000 steps, I want to evaluate it on the test set and write the result to the TensorBoard log. However, there's a problem. I have code like this:
image_b_train, label_b_train = tf.train.shuffle_batch(...)
out_train = model.inference(image_b_train)
accuracy_train = tf.reduce_mean(...)
image_b_test, label_b_test = tf.train.shuffle_batch(...)
out_test = model.inference(image_b_test)
accuracy_test = tf.reduce_mean(...)
where model.inference declares the variables of the model. The problem is that I have a separate queue for the test set, and I can't swap one queue for another in TensorFlow.
Currently I have solved the problem by creating two graphs, one for training and one for testing, and I copy from one graph to the other with tf.train.Saver. Another solution might be to use tf.get_variable, but that is a global variable, and I don't like it because my code becomes less reusable.
Yes, you need two graphs. These graphs can share variables. This can be done by:
Using Keras layers (from tf.contrib.keras) which let you define the model once and use it to compute two inference graphs
Using slim-style layers (from tf.layers) with tf.get_variable and reuse
Using tf.make_template to make your own model-like object which can be called once to build the training graph and once to build the inference graph
Using tf.estimator.Estimator which lets you define a model function once and runs it automatically for training and evaluation for you
There are other options, but any of these is well-supported and should unblock you.
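As one illustration, a minimal sketch of the tf.make_template option (TF 1.x; the layer sizes and input shapes are made up), reusing the variable names from the question:

import tensorflow as tf

def model_fn(images):
    # Hypothetical two-layer model; replace with your model.inference.
    net = tf.layers.dense(images, 128, activation=tf.nn.relu, name='fc1')
    return tf.layers.dense(net, 10, name='logits')

# The template shares the same variables across every call.
model = tf.make_template('model', model_fn)

image_b_train = tf.placeholder(tf.float32, [None, 784])
image_b_test = tf.placeholder(tf.float32, [None, 784])

out_train = model(image_b_train)  # first call creates the variables
out_test = model(image_b_test)    # second call reuses them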