Is it possible to load part of a trained model into part of a newly built model in TensorFlow? - tensorflow

For instance, suppose a previously trained model is no longer useful as a whole, but part of it still is, and that part could be reused in a newly built model. The rest of the new model should be trained, but the reused part should not be retrained. Apart from that shared part, the new model is quite different from the old one.
If this can be done, how would I write the code?

One possibility:
1. Get the code for the model you are importing.
2. Find the last operation of the part of the graph that you want to keep.
3. Split the code into two parts (classes or functions) at this op.
4. Build the shared part of the graph and load the weights.
5. Proceed to add the rest of the new graph.
6. Run the terminal op in the newly added part.
Alternatively, you may build the entire old model, put a "fork" in the middle and, when running, ignore the terminal op of the old, unused part and use only the terminal op of the newly added part. The unused old branch is then neither computed nor trained. Note that gradients still flow into the shared part unless you exclude its variables from the optimizer.
If you don't have the code but only have the saved graph definition, it is a bit more complicated: you will need to find the op you want to fork in the loaded graph by name.
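The steps above can be illustrated without any framework. The sketch below is plain Python, not TensorFlow, and every name in it (OLD_WEIGHTS, shared_part, new_head, train_new_head) is hypothetical. It only shows the shape of the idea: load the weights of the shared part, initialise the new part from scratch, and update only the new part during training. In TensorFlow 1.x the same effect is usually obtained by restoring with a tf.train.Saver built over just the shared variables and passing only the new variables as var_list to the optimizer's minimize call.

```python
# Framework-free sketch: reuse a trained "shared" part, train only a new head.
# All names here are hypothetical and chosen for illustration.

# Pretend these were saved from the old model: a trunk we want, a head we do not.
OLD_WEIGHTS = {"shared/w": 2.0, "shared/b": 1.0, "old_head/w": -3.0}


def shared_part(x, weights):
    """The part of the old model that is kept; its last op is the fork point."""
    return weights["shared/w"] * x + weights["shared/b"]


def new_head(h, weights):
    """Newly added part that consumes the output of the shared part."""
    return weights["new_head/w"] * h


def train_new_head(data, weights, lr=0.01, epochs=200):
    """Plain SGD on squared error that updates only the new head's weight."""
    for _ in range(epochs):
        for x, y in data:
            h = shared_part(x, weights)  # frozen: used, never updated
            err = new_head(h, weights) - y
            # Gradient of err**2 with respect to new_head/w only.
            weights["new_head/w"] -= lr * 2.0 * err * h
    return weights


# Load only the shared variables; the old head is left behind.
weights = {k: v for k, v in OLD_WEIGHTS.items() if k.startswith("shared/")}
weights["new_head/w"] = 0.0  # fresh initialisation for the new part

# Toy target: y = 0.5 * (2x + 1), so the ideal new_head/w is 0.5.
data = [(x, 0.5 * (2.0 * x + 1.0)) for x in (-1.0, 0.0, 1.0, 2.0)]
weights = train_new_head(data, weights)
```

After training, the shared weights still hold their loaded values and only new_head/w has moved.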

Related

Continue training CoreML Model

I'm trying to get a better understanding of how to create object detection models in Turi Create (for use in CoreML). I'm trying to create a model that detects custom images I designed and printed myself. To avoid having to take a huge number of photos, I figured I'd use the one-shot object detection feature provided by Turi Create. So far so good. I feed the algorithm two starter images and it successfully generates the synthetic data set and creates a somewhat reliable model.
Now I'm wondering what happens when I want to add a third category. I could of course add a third starter image and run the code again, but it feels like two thirds of the work would be redundant...
Is there a way to continue training a previously trained model, or combine multiple models so I don't have to retrain my models from scratch every time I add a category? If not, any other ways to get this done (e.g. TensorFlow)?
Turi Create is rather limited in the options it offers for retraining (none, basically). If you want more control over the process, using a tool such as TensorFlow is the better choice.

What is the difference between a regular model checkpoint and a saved model in TensorFlow?

There's a fairly clear difference between a model and a frozen model. As described in model_files, relevant part: Freezing
...so there's the freeze_graph.py script that takes a graph definition and a set of checkpoints and freezes them together into a single file.
Is a "saved_model" most similar to a "frozen_model" (and not a saved "model_checkpoint")?
Is this defined somewhere in docs I'm missing?
In prior versions of tensorflow we would save and restore model weights, but this seems to be in context of a "model_checkpoint" not a "saved_model", is that still correct?
I'm asking more for the design overview here, not implementation specifics.
A checkpoint file contains only the variables of a specific model. It must be loaded either with exactly the same, predefined graph or with a specific assignment_map that loads only chosen variables. See https://www.tensorflow.org/api_docs/python/tf/train/init_from_checkpoint
A saved model is broader because it also contains the graph, so it can be loaded in a session and training can be continued. A frozen graph, however, has its variables converted to constants and cannot be used to continue training.
You can find all the info here https://www.tensorflow.org/guide/saved_model
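To make the idea of an assignment map concrete, here is a minimal plain-Python sketch. It does not call TensorFlow; restore_subset, checkpoint and model_vars are hypothetical names, and the real tf.train.init_from_checkpoint has additional rules that are described on the page linked above. The sketch only shows the core mapping: variables under a chosen checkpoint scope are copied into the matching scope of the new model, and everything else is left alone.

```python
# Framework-free sketch of loading only chosen variables from a checkpoint.
# `restore_subset`, `checkpoint` and `model_vars` are hypothetical names.

def restore_subset(checkpoint, model_vars, assignment_map):
    """Copy checkpoint values into model variables according to a scope map.

    assignment_map maps a checkpoint scope prefix (e.g. "old/encoder/") to the
    prefix used for the same variables in the new model (e.g. "new/encoder/").
    Returns the sorted names of the model variables that were overwritten.
    """
    restored = []
    for ckpt_name, value in checkpoint.items():
        for old_prefix, new_prefix in assignment_map.items():
            if ckpt_name.startswith(old_prefix):
                new_name = new_prefix + ckpt_name[len(old_prefix):]
                if new_name not in model_vars:
                    raise KeyError(f"{new_name} is missing from the new model")
                model_vars[new_name] = value
                restored.append(new_name)
    return sorted(restored)


# A checkpoint holds variable values only, keyed by name.
checkpoint = {
    "old/encoder/w": [1.0, 2.0],
    "old/encoder/b": [0.5],
    "old/classifier/w": [9.0],
}
# The new model has the same encoder under a different scope, plus a new part.
model_vars = {
    "new/encoder/w": [0.0, 0.0],
    "new/encoder/b": [0.0],
    "new/decoder/w": [0.1],
}

restored = restore_subset(checkpoint, model_vars, {"old/encoder/": "new/encoder/"})
print(restored)  # ['new/encoder/b', 'new/encoder/w']
```

The old classifier is never touched, and the new decoder keeps its fresh initialisation.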

Updating Tensorflow Object detection model with new images

I have trained a faster rcnn model with a custom dataset using Tensorflow's Object Detection Api. Over time I would like to continue to update the model with additional images (collected weekly). The goal is to optimize for accuracy and to weight newer images over time.
Here are a few alternatives:
1. Add images to previous dataset and train a completely new model
2. Add images to previous dataset and continue training previous model
3. New dataset with just new images and continue training previous model
Here are my thoughts:
Option 1: would be more time consuming, but all images would be treated "equally".
Option 2: would likely take less additional training time, but one concern is that the algorithm might be weighting the earlier images more.
Option 3: This seems like the best option. Take original model and simply focus on training the new stuff.
Is one of these clearly better? What would be the pros/cons of each?
In addition, I'd like to know if it's better to keep one test set as a control for accuracy or to create a new one each time that includes newer images. Perhaps adding some portion of new images to model and another to the test set, and then feeding older test set images back into model (or throwing them out)?
Consider the case where your dataset is nearly perfect. If you ran the model on new images (collected weekly), then the results (i.e. boxes with scores) would be exactly what you want from the model and it would be pointless adding these to the dataset because the model would not be learning anything new.
For the imperfect dataset, results from new images will show (some) errors and these are appropriate for further training. But there may be "bad" images already in the dataset and it is desirable to remove these. This indicates that Option 1 must occur, on some schedule, to remove entirely the effect of "bad" images.
On a shorter schedule, Option 3 is appropriate if the new images are reasonably balanced across the domain categories (in some sense a representative subset of the previous dataset).
Option 2 seems pretty safe and is easier to understand. When you say "the algorithm might be weighting the earlier images more", I don't see why this is a problem if the earlier images are "good". However, I can see that the domain may change over time (evolution) in which case you may well wish to counter-weight older images. I understand that you can modify the training data to do just that as discussed in this question:
Class weights for balancing data in TensorFlow Object Detection API
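One simple way to counter-weight older images is to give each image a weight that decays with its age and to sample training batches in proportion to that weight. The sketch below is plain Python and is not part of the Object Detection API. The exponential decay, the 8-week half-life and the helper names are all assumptions made for illustration.

```python
import random


def recency_weights(ages_in_weeks, half_life_weeks=8.0):
    """Exponential decay: an image half_life_weeks old counts half as much."""
    return [0.5 ** (age / half_life_weeks) for age in ages_in_weeks]


def sample_batch(image_ids, weights, batch_size, seed=0):
    """Draw a training batch with replacement, favouring higher-weight images."""
    rng = random.Random(seed)
    return rng.choices(image_ids, weights=weights, k=batch_size)


image_ids = ["img_week0", "img_week8", "img_week16"]  # hypothetical image names
ages = [0, 8, 16]                                     # weeks since collection
weights = recency_weights(ages)                       # [1.0, 0.5, 0.25]
batch = sample_batch(image_ids, weights, batch_size=4)
```

A shorter half-life makes the model follow recent data more closely, at the cost of forgetting older examples sooner.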

Tensorflow restored model gives different results

I have recently been working on word2vec in TensorFlow, and it works well, so I decided to try saving and loading it. When I restore the model, however, it gives different results, and this happens every time I restore it. Here is my code:
https://github.com/drok0920/cobalt/tree/master
I am sorry if I am using terms improperly, as I am relatively new to this topic.

Adjust existing Tensorflow Graphs (VGG)

I want to use Ryan's VGG model converted to TensorFlow:
https://github.com/ry/tensorflow-vgg16
Now I want to adjust the layers and add another layer or change the fully connected layers. But I don't know how to get the single layers/weights out of the graphDef or how to adjust the graph.
Short answer: you can't adjust a graph, but there are probably ways to get what you want accomplished.
Long answer: TensorFlow Graph objects are structurally immutable. You can modify some aspects of them (e.g., the shape of a tensor flowing into a node), but you can't remove a node or add a node between two existing nodes. However, there are a couple ways to get the same effect:
1. If your changes are limited to additions only, then there's no problem with doing this. For instance, if you wanted to add a layer on the end of a network, go for it. Likewise, you can "replace" the last layer by simply adding a new layer which takes the second-to-last layer as input and just ignoring the existing last layer. When you run the graph, if you never ask for the output of the original last layer, TensorFlow will never compute it.
2. If you need to do modifications, one way is to slowly build up a copy of the graph node by node. So read in the original graph definition, then build your own new graph by iterating over the original and adding similar nodes to your new copy. This is somewhat tedious and can be error-prone. Moreover...
3. ...You might not need to "adjust" the graph at all. If you want something similar to that VGG-16 implementation, you can just work off the python code directly. Don't like the width of fc6? Just edit the code that generates it.
This brings us to the real issue, though. If your goal is to modify the network and be able to re-use the weights, then 2. and 3. aren't going to work. Realistically, this isn't possible in a lot of cases. For instance, if I wanted to add or remove a layer in the middle of VGG-16 (say, adding another convolutional layer), the pre-trained weights are no longer valid. You might be able to salvage any pre-trained weights which are upstream of your changes, but everything downstream will basically be wrong. You'll need to retrain the network anyways. (Maybe you can use the pre-trained networks as initialization, but you'll still need to retrain.) Even if you're just adding to the network (as in 1.), you'll still need to train the network.
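The point about "replacing" the last layer by adding a sibling and ignoring the original can be shown with a toy dataflow graph. This is a plain-Python sketch, not TensorFlow. The node names and the run helper are hypothetical, and the lambdas stand in for real layers. It demonstrates that when only the new output is requested, the old last layer is never evaluated.

```python
# Toy dataflow graph: each node is (function, list of input node names).
# Node names and the `run` helper are hypothetical, not TensorFlow API.

def run(graph, fetch, feed, computed=None):
    """Evaluate `fetch`, computing only the nodes it depends on."""
    if computed is None:
        computed = []
    if fetch in feed:
        return feed[fetch]
    fn, inputs = graph[fetch]
    args = [run(graph, name, feed, computed) for name in inputs]
    computed.append(fetch)  # record every node that was actually evaluated
    return fn(*args)


graph = {
    "fc7": (lambda x: 2 * x, ["input"]),      # second-to-last layer (kept)
    "fc8_old": (lambda h: h - 100, ["fc7"]),  # original last layer (ignored)
    "fc8_new": (lambda h: h + 1, ["fc7"]),    # added later; the graph only grew
}

computed = []
result = run(graph, "fc8_new", {"input": 3}, computed)
print(result, computed)  # 7 ['fc7', 'fc8_new']
```

Nothing was removed from the graph, yet fc8_old never appears in the list of evaluated nodes.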
Thanks! I have recreated the graph and then loaded every single weight by getting the value of the graph definition.
This was done by graph.get_tensor_by_name('import/...') where ... is the name of the weight
https://www.tensorflow.org/versions/r0.9/how_tos/tool_developers/index.html