A tricky graph solve in TensorFlow

As shown below, I built a graph with two big variables and two input placeholders.
On every step I want to use the current values of the variables (partial values) and the input placeholders to compute delta values. The delta values are then applied to the variables using scatter_add.
Problem: the two computation paths are not the same; one needs more computation. The TensorFlow execution engine seems to order them arbitrarily: it finishes one path, then the other. For example, TF may update variable 0 first, then use this new variable 0 to compute the other path (the update of variable 1). This is not what I need.
So, any ideas?
TensorFlow graph:

I found the solution: using tf.control_dependencies() solves this problem.
https://www.tensorflow.org/api_docs/python/tf/control_dependencies
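For reference, here is a minimal sketch of that fix, with hypothetical shapes and made-up delta formulas standing in for the real ones; the control dependency forces both deltas to be computed from the old variable values before either scatter_add runs:

import tensorflow as tf

# Hypothetical setup: two big variables, two input placeholders.
v0 = tf.Variable(tf.zeros([1000, 10]))
v1 = tf.Variable(tf.zeros([1000, 10]))
idx = tf.placeholder(tf.int32, [None])         # rows to update
inp = tf.placeholder(tf.float32, [None, 10])

# Each delta reads the current value of the other variable (made-up math).
delta0 = tf.gather(v1, idx) * inp              # cheaper path
delta1 = tf.gather(v0, idx) * inp * 2.0        # more expensive path

# Force both deltas to be evaluated before either variable is updated,
# so neither update sees the other's freshly written values.
with tf.control_dependencies([delta0, delta1]):
    upd0 = tf.scatter_add(v0, idx, delta0)
    upd1 = tf.scatter_add(v1, idx, delta1)

step = tf.group(upd0, upd1)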

Related

Re-cycling the graph architecture

I'm new to TensorFlow, so apologies if my question is not relevant. I would like to recycle the graph implementation of TensorFlow for a purpose other than deep learning. The idea is to use each node to perform some calculations and output either a number or a dictionary to the dependent nodes, which then perform further calculations, and so on. At the end, a summary of all the intermediate results is returned. Does a similar use case exist?
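To make the idea concrete, a minimal sketch with plain TF ops (no layers; node names are made up) could look like:

import tensorflow as tf

# Sketch only: plain TF ops used as a generic dataflow graph.
a = tf.constant(3.0, name="node_a")
b = tf.constant(4.0, name="node_b")
c = tf.add(a, b, name="node_c")          # depends on a and b
d = tf.multiply(c, 2.0, name="node_d")   # depends on c

with tf.Session() as sess:
    # Fetching a dict of tensors returns a summary of intermediate results.
    print(sess.run({"node_c": c, "node_d": d}))   # {'node_c': 7.0, 'node_d': 14.0}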

Pytorch register_hook to Keras implementation

I'm trying to implement the following project in TensorFlow/Keras.
https://github.com/jacobgil/pytorch-pruning
I'm having a hard time understanding what register_hook does. It can be found in finetune.py, line 66:
x.register_hook(self.compute_rank)
I've searched for clear explanations of this function and tried to find Keras equivalents, without any luck. Do you have any answers to these questions?
First things first, here's the documentation:
http://pytorch.org/docs/master/autograd.html#torch.autograd.Variable.register_hook
This allows you to register a method on a Variable that is called whenever the Variable's .grad is updated, i.e. during the backward pass, and that takes the grad as input. The method can return a Variable that replaces the original .grad, or None if you just want to read the gradients to do something else.
If you update the gradients this way, the nodes further down in the compute graph see the new, updated gradient in the backward pass and will have their respective gradients calculated with the updated value.
I'm not a TensorFlow expert, but the RegisterGradient decorators (documentation) seem to be able to do the same; for an example, see this answer.
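To illustrate, a minimal PyTorch sketch (names and the scaling factor are made up; current PyTorch attaches the hook to a plain tensor rather than the old Variable wrapper):

import torch

x = torch.randn(3, requires_grad=True)
y = (x * 2).sum()

def scale_grad(grad):
    print("grad w.r.t. x:", grad)   # read the gradient...
    return grad * 0.5               # ...and optionally replace it

x.register_hook(scale_grad)
y.backward()
# x.grad now holds the halved gradient (1.0 instead of 2.0 per element)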

Tensorflow - optimizing part of a variable

Let's say I'm optimizing Ax = b, where A is a matrix and x, b are vectors.
My question: is it possible to optimize only over a subset of A? Specifically, a patch of A.
In other words, I would like to keep a subset of the parameters in A constant.
Is this possible in TensorFlow?
I thought about using tf.slice(), but it creates a new tensor rather than a view into the variable.
Thanks!
Unless I've misunderstood your question (or there's missing context), just define the parts of A you want to optimise over using tf.Variable(), and define the parts you don't using tf.constant().
You can either use tf.stop_gradient or the var_list parameter of your optimizer.
See this answer for more details: https://stackoverflow.com/a/34478044/4554460
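To illustrate the var_list approach, here is a minimal sketch for a made-up 4x4 system where only the top-left 2x2 patch of A is trainable (all names, shapes, and values are hypothetical):

import numpy as np
import tensorflow as tf

# Frozen part of A and the trainable 2x2 patch.
A_fixed = tf.constant(np.eye(4, dtype=np.float32))
patch = tf.Variable(tf.zeros([2, 2]), name="A_patch")

# Mask selecting the patch region (top-left 2x2).
mask_np = np.zeros((4, 4), dtype=np.float32)
mask_np[:2, :2] = 1.0
mask = tf.constant(mask_np)

# Rebuild the full A: frozen values outside the patch, the variable inside.
A = A_fixed * (1.0 - mask) + tf.pad(patch, [[0, 2], [0, 2]])

x = tf.constant(np.random.rand(4, 1).astype(np.float32))
b = tf.constant(np.random.rand(4, 1).astype(np.float32))
loss = tf.reduce_sum(tf.square(tf.matmul(A, x) - b))

# var_list restricts optimization to the patch; the rest of A never changes.
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss, var_list=[patch])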

How to delete an existing tensorflow variable?

I am using TensorFlow 1.0 with GPU on Windows. I want to delete a useless or wrongly defined tf variable.
For example, I run some code and generate a bad model; then I copy the code, just change the existing variable W1(shape=[3,3],name="W1") to W1(shape=[5,5],name="W1"), and run again.
However, TensorFlow generates W1(shape=[5,5],name="W1_1") rather than replacing the old W1. So in the end TensorFlow will save both the wrongly trained W1(name='W1') and the newly trained W1(name='W1_1'). When I restore W1, TensorFlow gives me the wrongly trained W1(name='W1').
Could you tell me how to delete the old variable W1 and add the new W1?
First of all, I don't know exactly what you are trying to achieve by changing the shape of your variable. If you had asked the question in a way that made your intent clear, maybe we could help you better.
That said, there are multiple options to solve your problem:
- Ignore the name of your variable (what problems does it actually cause?).
- Remove all variables with tf.reset_default_graph() (see the sketch after this list): https://www.tensorflow.org/versions/master/api_docs/python/framework/utility_functions#reset_default_graph
- Change the shape of the variable instead of defining a new one. Check mrry's answer here: How can I change the shape of a variable in TensorFlow?
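A minimal sketch of the second option (shapes are made up, TF 1.x API):

import tensorflow as tf

# First attempt with the wrong shape.
W1 = tf.get_variable("W1", shape=[3, 3])

# Wipe the default graph so the name "W1" is free again
# (this drops all previously defined nodes, not just W1).
tf.reset_default_graph()

# Redefine with the intended shape; no "W1_1" suffix this time.
W1 = tf.get_variable("W1", shape=[5, 5])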

Can I change an Inv operation into Reciprocal in an existing graph in TensorFlow?

I am working on an image classification problem with TensorFlow. I have 2 different CNNs trained separately (in fact 3 in total, but I will deal with the third later), for different tasks, on an AWS (Amazon) machine. One tells whether there is text in the image and the other tells whether the image is safe for work. Now I want to use them in a single script on my computer, so that I can give an image as input and get the results of both networks as output.
I load the two graphs in a single tensorflow Session, using the import_meta_graph API and the import_scope argument and putting each subgraph in a separate scope. Then I just use the restore method of the created saver, giving it the common Session as argument.
Then, in order to run inference, I retrieve the placeholders and final outputs with graph = tf.get_default_graph() and my_var = graph.get_operation_by_name('name').outputs[0] before using them in sess.run (I think I could have just put 'name' in sess.run instead of fetching the output tensor and putting it in a variable, but this is not my problem).
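For context, a minimal sketch of that loading scheme (checkpoint paths and tensor names are hypothetical):

import tensorflow as tf

sess = tf.Session()

# Import each meta graph under its own scope, then restore both into one session.
text_saver = tf.train.import_meta_graph("text_model.ckpt.meta", import_scope="text")
nsfw_saver = tf.train.import_meta_graph("nsfw_model.ckpt.meta", import_scope="nsfw")
text_saver.restore(sess, "text_model.ckpt")
nsfw_saver.restore(sess, "nsfw_model.ckpt")

graph = tf.get_default_graph()
# Operation names are prefixed by the import scope.
text_in = graph.get_operation_by_name("text/input").outputs[0]
text_out = graph.get_operation_by_name("text/output").outputs[0]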
My problem is that the text CNN works perfectly fine, but the NSFW detector always gives me the same output, no matter the input (even with np.zeros()). I have tried both separately and it is the same story: text works but not NSFW. So I don't think the problem comes from using two networks simultaneously.
I also tried on the original AWS machine I trained it on, and this time the nsfw CNN worked perfectly.
Both networks are very similar. I checked on TensorBoard that everything was fine, and I think it is OK. The differences are in the number of hidden units and the fact that I use batch normalization in the NSFW model but not in the text one. Now, why this title? I noticed a warning when running the NSFW model that I didn't get when using only the text model:
W tensorflow/core/framework/op_def_util.cc:332] Op Inv is deprecated. It will cease to work in GraphDef version 17. Use Reciprocal.
So I thought maybe this was the reason, everything else being equal. I checked my GraphDef version, which seems to be 11, so Inv should still work in theory. By the way, the AWS machine uses TensorFlow version 0.10 and I use version 0.12.
I noticed that the text network only had one Inv operation (via filtering the names of the operations returned by graph.get_operations()), and that the NSFW model had the same operation plus multiple Inv operations coming from the batch normalization layers. As stated in the release notes, tf.inv has simply been renamed to tf.reciprocal, so I tried to change the names of the operations to Reciprocal with tf.group(), as proposed here, but it didn't work. I have seen that using tf.identity() and changing the name could also work, but from what I understand, TensorFlow graphs are an append-only structure, so we can't really modify their operations (which seem to be immutable anyway).
The thing is:
- as I said, the Inv operation should still work in my GraphDef version;
- this is only a warning;
- the Inv operations only appear under name scopes that begin with 'gradients', so, from my understanding, they shouldn't be used for inference;
- the text model also has an Inv operation.
For these reasons, I have serious doubts about my diagnosis. So my final questions are:
- do you have another diagnosis?
- if mine is correct, is it possible to replace Inv operations with Reciprocal operations, or do you have any other solution?
After a thorough examination of the output of relevant nodes, with the help of Tensorboard, I am now pretty certain that the renaming of Inv to Reciprocal has nothing to do with my problem.
It appears that the last batch normalization layer eliminates almost all variance in its output when the input varies. I will ask why elsewhere.