I am using windows GPU tensorflow 1.0. I want to delete useless or wrong tf variable.
For example I run code and generate a bad model, then I copy code and just modify existed variable W1(shape=[3,3],name="W1") as W1(shape=[5,5],name="W1") and run.
However tensorflow generate W1(shape=[5,5],name="W1_1") rather than replace old W1. So in the end Tensorflow will save both wrong trained W1(name='w1') and trained w1(name='w1_1'). When I restore W1, tensorflow give me wrong trained W1(name='w1').
Could you tell me how to delete old variable W1 and add new W1?
(the newline edit function is useless)
First of all, I don't know exactly what you are trying to achieve by changing the shape of your variable. If you would have asked the question in such a way we knew your intent, maybe we could help you better.
Now there are multiple options to solve your problem:
- ignore the name of your variable (what bugs does it cause??)
- remove all variables: https://www.tensorflow.org/versions/master/api_docs/python/framework/utility_functions#reset_default_graph
- Change the shape of the variable instead of defining a new one. Check the answer of mrry here: How can I change the shape of a variable in TensorFlow?
Related
I want to reload some of my model variables with the saved weight in the chheckpoint and then export it to the tflite file.
The question is a bit tricky without see code, so I made this Colab jupyter notebook with the complete code to explain it better (All code is working, you can actually copy in a new collab and change if you want):
https://colab.research.google.com/drive/1wSor4CxEz36LgElVi4y_N8uiSt4-j9b2#scrollTo=XKBQzoW_wd4A
I got it working but with two issues:
The exported .tflite file is like 3Ks, so I do not believe it is the entire model with the weights in it. Only the input is an image of 128x128x3, one weight for each is more than 3K.
When I finally import the model in Android, I have this error: "Didn't find custom op for name 'VariableV2' /n Didn't find custom op for name 'ReorderAxes' /n Registration failed."
Maybe the last error is cause the save/restore operations? They look like are there when I save the graph definition.
Thanks in advance.
I realize my problem.. I'm trying to convert to TFLITE a model without previously freezing it, TFLITE do not allow "VariableV2" nodes cause they should not be there..
All the problem is corrected freezing the model like this:
output_graph_def = graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), ["output"])
I lost some time looking for that, hope it helps.
Im trying to implement the following project into Tensorflow/Keras.
https://github.com/jacobgil/pytorch-pruning
Im having a hard time understanding what register_hook does? It can be found in finetune.py, row 66.
x.register_hook(self.compute_rank)
I've searched for clear explanations regarding this function and tried to find Keras-equivalents, without any luck. Do you have answers to these questions?
First things first, here's the documentation:
http://pytorch.org/docs/master/autograd.html#torch.autograd.Variable.register_hook
This allows you to register a method to a Variable that is called whenever the Variable's .grad is updated, i.e. in a backward pass, and takes the grad as input. The method can return a Variable that would replace the original .grad or None if you just want to read the gradients to do something else.
If you update the gradients this way, the nodes further down in the compute graph see the new updated gradient in the backward pass and will have their respective gradients calculated with the updated value.
I'm not a Tensorflow expert, but the RegisterGradient decorators (documentation) seem to be able to do the same, for an example see this answer.
im getting started with tensorflow und using retrain.py to teach it some new categories - this works well - however i have some questions:
In the comments of retrain.py it says:
"This produces a new model file that can be loaded and run by any TensorFlow
program, for example the label_image sample code"
however I havent found where this new model file is saved to ?
also: it does contain the whole model, right ? not just the retrained part ?
Thanks for clearing this up
1)I think you may want to save the new model.
When you want to save a model after some process, you can use
saver.save(sess, 'directory/model-name', *optional-arg).
Check out https://www.tensorflow.org/api_docs/python/tf/train/Saver
If you change model-name by epoch or any measure you would like to use, you can save the new model(otherwise, it may overlap with previous models saved).
You can find the model saved by searching 'checkpoint', '.index', '.meta'.
2)Saving the whole model or just part of it?
It's the part you need to learn bunch of ideas on tf.session and savers. You can save either the whole or just part, it's up to you. Again, start from the above link. The moral is that you put the variables you would like to save in a list quoted as 'var_list' in the link, and you can save only for them. When you call them back, you now also need to specify which variables in your model correspond to the variables in the loaded variables.
While running retrain.py you can give --output_graph and --output_labels parameters which specify the location to save graph (default is /tmp/output_graph.pb) and the labels as well. You can change those as per your requirements.
As the following, I built a graph with two big variables and two input placeholder.
Every time, I want to use the current value of variables (partial values) and input placeholders to calculate delta values. Then the delta values are update to the variables using scatter_add.
problem: the two computing paths are not the same, one needs more computing. the tensorflow solving engine seems to prefer one of the path randomly-it solves one of path, then the other. For example, tf may update variable 0 first, then use this new variable 0 to solve another path (update variable 1). This is not my need.
so, any idea?
tensorflow graph:
I find the solution. Using the tf.control_dependencies() could solve this problem.
https://www.tensorflow.org/api_docs/python/tf/control_dependencies
I am working on an image classification problem with tensorflow. I have 2 different CNNs trained separately (in fact 3 in total but I will deal with the third later), for different tasks and on a AWS (Amazon) machine. One tells if there is text in the image and the other one tells if the image is safe for work or not. Now I want to use them in a single script on my computer, so that I can put an image as input and get the results of both networks as output.
I load the two graphs in a single tensorflow Session, using the import_meta_graph API and the import_scope argument and putting each subgraph in a separate scope. Then I just use the restore method of the created saver, giving it the common Session as argument.
Then, in order to run inference, I retrieve the placeholders and final output with graph=tf.get_default_graph() and my_var=graph.get_operation_by_name('name').outputs[0] before using it in sess.run (I think I could just have put 'name' in sess.run instead of fetching the output tensor and putting it in a variable, but this is not my problem).
My problem is the text CNN works perfectly fine, but the nsfw detector always gives me the same output, no matter the input (even with np.zeros()). I have tried both separately and same story: text works but not nsfw. So I don't think the problem comes from using two networks simultaneaously.
I also tried on the original AWS machine I trained it on, and this time the nsfw CNN worked perfectly.
Both networks are very similar. I checked on Tensorboard if everything was fine and I think it is ok. The differences are in the number of hidden units and the fact that I use batch normalization in the nsfw model and not in the text one. Now why this title ? I observed that I had a warning when running the nsfw model that I didn't have when using only the text model:
W tensorflow/core/framework/op_def_util.cc:332] Op Inv is deprecated. It will cease to work in GraphDef version 17. Use Reciprocal.
So I thougt maybe this was the reason, everything else being equal. I checked my GraphDef version, which seems to be 11, so Inv should still work in theory. By the way the AWS machine use tensroflow version 0.10 and I use version 0.12.
I noticed that the text network only had one Inv operation (via a filtering on the names of the operations given by graph.get_operations()), and that the nsfw model had the same operation plus multiple Inv operations due to the batch normalization layers. As precised in the release notes, tf.inv has simply been renamed to tf.reciprocal, so I tried to change the names of the operations to Reciprocal with tf.group(), as proposed here, but it didn't work. I have seen that using tf.identity() and changing the name could also work, but from what I understand, tensorflow graphs are an append-only structure, so we can't really modify its operations (which seems to be immutable anyway).
The thing is:
as I said, the Inv operation should still work in my GraphDef version;
this is only a warning;
the Inv operations only appear under name scopes that begin with 'gradients' so, from my understanding, this shouldn't be used for inference;
the text model also have an Inv operation.
For these reasons, I have a big doubt on my diagnosis. So my final questions are:
do you have another diagnosis?
if mine is correct, is it possible to replace Inv operations with Reciprocal operations, or do you have any other solution?
After a thorough examination of the output of relevant nodes, with the help of Tensorboard, I am now pretty certain that the renaming of Inv to Reciprocal has nothing to do with my problem.
It appears that the last batch normalization layer eliminates almost any variance of its output when the inputs varies. I will ask why elsewhere.