Graph dependencies in TensorFlow: how to validate whether dependencies exist?

op1=tf.image.random_brightness(placeholder_img3d_float32, max_delta=...)
op2=tf.image.random_contrast(placeholder_img3d_float32, lower=..., upper=...)
op3=tf.image.per_image_standardization(placeholder_img3d_float32)
If I defined these 3 ops, and then I run:
sess.run(op1, ...)
sess.run(op2, ...)
sess.run(op3, ...)
vs. running: sess.run([op1, op2, op3], ...)
Would I have executed all 3 ops 3 times? Or are they all independent, thus the 3 runs each ran just the op I requested?
How should I validate graph dependency questions like this?
Update:
The TensorBoard graph of those 3 ops looks like there are no dependencies between them, but the local_placeholder shown in the top right has 5 outputs, at least one of which feeds each of the 3 ops here. Does that mean that when I feed the placeholder it will run all 3 ops? Or is the lack of dependencies shown in the graph telling me that, although the placeholder is shared, the ops are independent and only the op I request will be processed?

In a session you can ask for all 3 operations to be run in a single call. Internally, TensorFlow automatically resolves the dependencies of whatever you request.
Let's say your 3rd operation depends on the 2nd operation, and the 2nd operation depends on the 1st. If you ask for the 3rd operation, the session will first run the 1st operation, then the 2nd, to satisfy the dependencies, and only then run the 3rd.
In the TensorBoard graph you can see the dependencies clearly. Each solid gray line shows the data flow between two operations, and dotted lines show control dependencies.
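To make the dependency rule concrete, here is a minimal sketch in plain Python (not TensorFlow code). It assumes the graph from the question, where each of the 3 ops reads only from the shared placeholder, and it walks backwards from the requested ops to collect everything that has to run. The GRAPH dictionary and the ops_needed helper are hypothetical names used only for illustration.

```python
# Toy model of a dataflow graph: each op maps to the ops it takes input from.
# The names mirror the question; this is an illustration, not TensorFlow's code.
GRAPH = {
    "placeholder": [],
    "op1": ["placeholder"],  # stands in for random_brightness
    "op2": ["placeholder"],  # stands in for random_contrast
    "op3": ["placeholder"],  # stands in for per_image_standardization
}


def ops_needed(fetches, graph=GRAPH):
    """Return every op that must run to compute the given fetches."""
    needed = set()
    stack = list(fetches)
    while stack:
        op = stack.pop()
        if op not in needed:
            needed.add(op)
            stack.extend(graph[op])  # walk backwards along the inputs
    return needed


print(sorted(ops_needed(["op1"])))                # ['op1', 'placeholder']
print(sorted(ops_needed(["op1", "op2", "op3"])))  # ['op1', 'op2', 'op3', 'placeholder']
```

Under this model, fetching op1 alone never touches op2 or op3: sharing an input does not create a dependency between the consumers of that input.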

Related

Problem when predicting via multiprocess with Tensorflow

I have 4 (or more) models (same structure but different training data). Now I want to ensemble them to make a prediction. I want to pre-load the models and then predict one input message (one message at a time) in parallel via multiprocessing. However, the program always stops at the "session.run" step. I could not figure out why.
I tried passing all arguments to the function in each process, as shown in the code below. I also tried using a Queue object and putting all the data (except the model object) in the queue. I also tried setting the number of processes to 1. It made no difference.
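from multiprocessing import Manager, Process

# predict, message, models, configs, vocabs and emoji_dict are defined elsewhere.
with Manager() as manager:
    # Shared list that each worker process can append its result to.
    first_level_test_features = manager.list()
    procs = []
    for id in range(4):
        p = Process(target=predict, args=(id, (message, models, configs, vocabs, emoji_dict, first_level_test_features)))
        procs.append(p)
        p.start()
    # Wait for all workers before the manager (and the shared list) shuts down.
    for p in procs:
        p.join()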
I did not get any error message since it just gets stuck there. I would expect the program to start multiple processes, with each process using the model passed to it to make the prediction.
I am unsure how session sharing across different processes would work, and this is probably where your issue comes from. Given the way TensorFlow works, I would advise implementing the ensemble call as a graph operation, so that it can be run through a single session.run call, with TF handling the parallelization of computations wherever possible.
In practice, if you have symbolic tensors representing the models' predictions, you could use a TF operation to aggregate them (tf.concat, tf.reduce_mean, tf.add_n... whichever suits your design) and end up with a single symbolic tensor representing the ensemble prediction.
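As a rough illustration of what such an aggregation computes, here is a plain-Python stand-in for averaging the models' predictions element-wise (the kind of result you would get from taking a mean over stacked prediction tensors). The ensemble_mean helper and the sample probabilities are made up for this sketch; in the real setting the aggregation would be a TF operation inside the graph.

```python
def ensemble_mean(predictions):
    """Element-wise mean over a list of equally long prediction vectors."""
    n_models = len(predictions)
    # zip(*predictions) groups the i-th entry of every model's output together.
    return [sum(values) / n_models for values in zip(*predictions)]


# Hypothetical class probabilities from 4 models for one input message.
model_outputs = [
    [0.9, 0.1],
    [0.7, 0.3],
    [0.8, 0.2],
    [0.6, 0.4],
]
print(ensemble_mean(model_outputs))
```

The point of doing this inside the graph is that one session.run call then evaluates all models and the aggregation together, so no process-level parallelism is needed.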
I hope this helps; if not, please provide some more details as to what your setting is, notably which form your models have.

keras with tf backend: how to identify variables (tensors) which are in a graph

I've built (in jupyter notebook with Python 3.6) a long ML proof of concept, which, in essence, has 3 parts: load & prepare data; train network; use network.
I would like to be able to re-run it from "train network" without the "cost" of preparing the data again & again (even loading the prepared data from a save file takes a noticeable amount of time).
When I run all cells from the start of the network training (the first cell of which includes a K.clear_session to wipe out any previous network - needed if the architecture changes) it fails as, part way through, there are still variables stored (with the same names) which are part of the old graph.
I can see two simple solutions (but you may be able to advise a better method to tidy up):
loop through all the defined variables (Tensors) in globals() and del any which are Tensors (implicitly all part of the old session and graph),
or (better)
loop through all the tensors defined in the (old) graph del'ing them before del'ing the (old) graph.
I can see K.get_uid but can't see how I can use this info to accomplish what I need.
In the meantime I have to reset and rerun the whole workbook every time I make adjustments to the network.
Is there a better way?
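The first idea in the question (deleting every global that is a Tensor) could be sketched roughly as follows. This is only an illustration of that idea, not a confirmed fix: drop_instances and FakeTensor are hypothetical names, and FakeTensor is a stand-in so the sketch runs without TensorFlow. In a notebook you would presumably pass globals() and the real tensor classes instead.

```python
def drop_instances(namespace, types):
    """Delete every name in `namespace` bound to an instance of `types`."""
    # Collect the names first: a dict must not change size while iterating it.
    doomed = [name for name, value in namespace.items() if isinstance(value, types)]
    for name in doomed:
        del namespace[name]
    return doomed


class FakeTensor:
    """Stand-in for a tensor class, so the sketch runs without TensorFlow."""


notebook_vars = {"x": FakeTensor(), "y": FakeTensor(), "data": [1, 2, 3]}
removed = drop_instances(notebook_vars, (FakeTensor,))
print(sorted(removed))      # ['x', 'y']
print(list(notebook_vars))  # ['data']
```

Note that this only removes the Python names. Whether that is enough to avoid the clash with the old graph is exactly what the question leaves open.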

TensorBoard doesn't show all data points

I was running a very long training (reinforcement learning with 20M steps) and writing a summary every 10k steps. Between steps 4M and 6M, I saw 2 peaks in my TensorBoard scalar chart for game score, then I let it run and went to sleep. In the morning, it was running at about step 12M, but the peaks between steps 4M and 6M that I saw earlier had disappeared from the chart. I tried to zoom in and found out that TensorBoard (randomly?) skipped some of the data points. I also tried to export the data, but some data points, including the peaks, are also missing from the exported .csv.
I looked for answers and found this in TensorFlow github page:
TensorBoard uses reservoir sampling to downsample your data so that it can be loaded into RAM. You can modify the number of elements it will keep per tag in tensorboard/backend/server.py.
Has anyone ever modified this server.py file? Where can I find the file and if I installed TensorFlow from source, do I have to recompile it after I modified the file?
You don't have to change the source code for this; there is a flag called --samples_per_plugin.
Quoting from the help command:
--samples_per_plugin: An optional comma separated list of plugin_name=num_samples pairs to explicitly
specify how many samples to keep per tag for that plugin. For unspecified plugins, TensorBoard
randomly downsamples logged summaries to reasonable values to prevent out-of-memory errors for long
running jobs. This flag allows fine control over that downsampling. Note that 0 means keep all
samples of that type. For instance, "scalars=500,images=0" keeps 500 scalars and all images. Most
users should not need to set this flag.
(default: '')
So if you want to have a slider of 100 images, use:
tensorboard --samples_per_plugin images=100
The comment is out of date - it can actually be modified in tensorboard/backend/application.py, in the "Default Size Guidance". By default, it stores 1000 scalars. You can increase that limit arbitrarily, or set it to 0 to store every scalar.
You don't need to recompile TensorBoard, or even download it from source. You can just modify this file in your installed copy of TensorBoard.
If you install TensorFlow using pip in virtualenv (ubuntu, mac), then within your virtualenv directory the path to application.py should be something like lib/python2.7/site-packages/tensorflow/tensorboard/backend. If you modify that file, you should get the new setting in your tensorboard (when you run tensorboard in that virtualenv). If you're like me, you'll put a print statement too so you can be sure that you're running modified code :)
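To see why earlier points can vanish as a run gets longer, here is a minimal sketch of classic reservoir sampling in plain Python. It is a textbook version written for illustration (the reservoir_sample helper is not TensorBoard's code, and TensorBoard's own implementation may differ in details). It assumes the numbers from this thread: one summary every 10k steps over 20M steps, and a limit of 1000 scalars.

```python
import random


def reservoir_sample(stream, k, rng):
    """Keep a uniform random sample of at most k items from a stream."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # The first k items always fit.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # both ends inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


rng = random.Random(0)
steps = range(0, 20_000_000, 10_000)  # 2000 summaries in total
kept = reservoir_sample(steps, 1000, rng)
print(len(kept))  # 1000
```

Once more than k points have been logged, every new point may evict an older one, so a peak that was visible early in the run can be dropped later. Raising the limit, or setting it to 0 as described above, avoids that.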

Can I change Inv operation into Reciprocal in an existing graph in Tensorflow?

I am working on an image classification problem with TensorFlow. I have 2 different CNNs trained separately (in fact 3 in total, but I will deal with the third later), for different tasks and on an AWS (Amazon) machine. One tells whether there is text in the image and the other tells whether the image is safe for work or not. Now I want to use them in a single script on my computer, so that I can give an image as input and get the results of both networks as output.
I load the two graphs in a single tensorflow Session, using the import_meta_graph API and the import_scope argument and putting each subgraph in a separate scope. Then I just use the restore method of the created saver, giving it the common Session as argument.
Then, in order to run inference, I retrieve the placeholders and final output with graph=tf.get_default_graph() and my_var=graph.get_operation_by_name('name').outputs[0] before using it in sess.run (I think I could just have put 'name' in sess.run instead of fetching the output tensor and putting it in a variable, but this is not my problem).
My problem is that the text CNN works perfectly fine, but the nsfw detector always gives me the same output, no matter the input (even with np.zeros()). I have tried both separately and it is the same story: text works but nsfw does not. So I don't think the problem comes from using two networks simultaneously.
I also tried on the original AWS machine I trained it on, and this time the nsfw CNN worked perfectly.
Both networks are very similar. I checked in TensorBoard whether everything was fine and I think it is OK. The differences are in the number of hidden units and the fact that I use batch normalization in the nsfw model and not in the text one. Now, why this title? I observed a warning when running the nsfw model that I didn't get when using only the text model:
W tensorflow/core/framework/op_def_util.cc:332] Op Inv is deprecated. It will cease to work in GraphDef version 17. Use Reciprocal.
So I thought maybe this was the reason, everything else being equal. I checked my GraphDef version, which seems to be 11, so Inv should still work in theory. By the way, the AWS machine uses TensorFlow version 0.10 and I use version 0.12.
I noticed that the text network has only one Inv operation (found by filtering the names of the operations returned by graph.get_operations()), and that the nsfw model has the same operation plus multiple Inv operations due to the batch normalization layers. As stated in the release notes, tf.inv has simply been renamed to tf.reciprocal, so I tried to change the names of the operations to Reciprocal with tf.group(), as proposed here, but it didn't work. I have seen that using tf.identity() and changing the name could also work, but from what I understand, a TensorFlow graph is an append-only structure, so we can't really modify its operations (which seem to be immutable anyway).
The thing is:
as I said, the Inv operation should still work in my GraphDef version;
this is only a warning;
the Inv operations only appear under name scopes that begin with 'gradients' so, from my understanding, this shouldn't be used for inference;
the text model also has an Inv operation.
For these reasons, I have serious doubts about my diagnosis. So my final questions are:
do you have another diagnosis?
if mine is correct, is it possible to replace Inv operations with Reciprocal operations, or do you have any other solution?
After a thorough examination of the output of relevant nodes, with the help of Tensorboard, I am now pretty certain that the renaming of Inv to Reciprocal has nothing to do with my problem.
It appears that the last batch normalization layer eliminates almost all variance in its output when the inputs vary. I will ask why elsewhere.

What is the difference between tf.group and tf.control_dependencies?

Aside from tf.control_dependencies being a context manager (i.e. used with Python with), what's the difference between tf.group and tf.control_dependencies?
When should which be used?
Is it that tf.group doesn't have any particular order of operations? I'd assume tf.group([op_1, op_2, op_3]) executes ops in the list's order, but maybe that's not the case? The docstring doesn't specify a behaviour.
If you look at the GraphDef, c = tf.group(a, b) produces the same graph as
with tf.control_dependencies([a, b]):
    c = tf.no_op()
There's no specific order in which the ops will run; TensorFlow tries to execute operations as soon as it can (i.e. in parallel).
Just adding a few minor points to Yaroslav Bulatov's answer.
As you can see from Yaroslav's answer:
tf.control_dependencies creates no ops by itself, and adds dependencies to whatever ops you create inside its scope
tf.group creates a single op (of type NoOp), and adds dependencies to that op.
More importantly, if tf.group arguments belong to multiple devices, tf.group will insert an intermediate layer between its inputs and the node it returns. That layer will contain one node per device, so that the dependencies are organized by device. This could reduce the cross-device data flow.
So if your dependencies are on multiple devices, tf.group adds a (possibly critical) optimization.
On the other hand, tf.control_dependencies supports nesting: the inner context will add dependencies to the union of all the ops in the outer contexts.