I have two TensorFlow Lite models (both YOLOv2 Tiny models):
Model A) Downloaded from the internet; it detects and classifies objects
with 80 classes. The .tflite file weighs 44.9 MB.
Model B) Trained by myself using Darknet; it detects and classifies objects with 52
classes. The .tflite file weighs 20.8 MB. The model was converted
to TFLite using Darkflow.
However, both on a mobile phone and on a computer, model B takes about 10x longer to predict than model A (even though model B detects fewer classes and its file is lighter). Also, both models seem to work with input images of size 416x416 and use float numbers.
What could be the reason for model A being faster than model B?
How can I find out why model A is faster?
One of the problems I have is that, since I did not train model A myself, I don't have its .cfg file with the whole setup...
There can be several reasons why a model turns out slower than expected, so you should try the following two approaches to gain more insight.
Inspect both networks with a tool like Netron. You can load your flatbuffer (TF Lite) model file and visualize the network architecture after TF Lite conversion.
There you can see where the difference between the two models lies. If, for example, model B contains additional Reshape operations (or similar) compared to model A, that could well be the reason. To download Netron, see https://github.com/lutzroeder/netron.
Measure the time the model spends on each of its layers. For this you can use the TF Lite benchmark tool provided directly in the TensorFlow repository.
Check it out here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/tools/benchmark/README.md.
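Before diving into the benchmark tool, a quick sanity check is also possible from Python. This is a minimal sketch (the model file names are placeholders) that times raw invocations with the TF Lite interpreter; if model B is consistently ~10x slower here too, the benchmark tool's per-op profiling will show which layers dominate:

    import time
    import numpy as np
    import tensorflow as tf

    def time_tflite(model_path, runs=20):
        # Load the flatbuffer and allocate tensors once.
        interpreter = tf.lite.Interpreter(model_path=model_path)
        interpreter.allocate_tensors()
        inp = interpreter.get_input_details()[0]
        # Feed random data matching the expected shape/dtype (e.g. 1x416x416x3 float32).
        dummy = np.random.random_sample(inp['shape']).astype(inp['dtype'])
        interpreter.set_tensor(inp['index'], dummy)
        interpreter.invoke()  # warm-up run
        start = time.perf_counter()
        for _ in range(runs):
            interpreter.invoke()
        return (time.perf_counter() - start) / runs

    print('model A:', time_tflite('model_a.tflite'))  # placeholder paths
    print('model B:', time_tflite('model_b.tflite'))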
Related
Does a tflite file contain data about the model architecture? That is, a graph showing which operations sit between the weights, features and biases, what kinds of layers there are (linear, convolutional, etc.), the sizes of the layers, and which activation functions come between the layers?
For example, a graph like the one you get with graphviz that contains all this information, or does a tflite file only contain the final weights of the model after training?
I am working on a project with image style transfer. I wanted to do some research on an existing project and see which parameters work best. The project I am looking at is here:
https://tfhub.dev/sayakpaul/lite-model/arbitrary-image-stylization-inceptionv3-dynamic-shapes/int8/transfer/1
I can download a tflite file, but I don't know much about these files. If it contains the architecture I need, how do I read it?
TFLite flatbuffer files contain the model structure as well. For example, TFLite has a subgraph concept, which corresponds to the function concept in programming languages, and the operator nodes form a graph in which each node takes inputs and generates outputs. The model architecture can be visualized with the Netron application.
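As a small sketch (the model path is a placeholder), the TF Lite Python interpreter can also list the stored tensors programmatically; for the operator graph itself, Netron gives the clearer picture:

    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path='model.tflite')  # placeholder path
    interpreter.allocate_tensors()
    # Each entry describes one tensor in the flatbuffer: name, shape and dtype.
    for t in interpreter.get_tensor_details():
        print(t['name'], t['shape'], t['dtype'])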
I have implemented a form of the LeNet model via TensorFlow and Python for a car number-plate recognition system. My model was trained solely on my training data and tested on the test data. My dataset contains segmented images in which every image holds only one character. My model does not perform very well, so I am now looking for models I can use via transfer learning. Since most models are already trained on a humongous dataset, I looked at a few such as AlexNet, ResNet, GoogLeNet and Inception v2. Most of these models have not been trained on the type of data I need, which would be letters and digits.
Question: Should I still go forward with one of these models and train it on my dataset, or are there better models that would help? For such models, would Keras be a better option, since it is more high-level than TensorFlow?
Question: I'd prefer to work with the LeNet model itself, since training the other models would definitely take a long time given the insufficient specs of my laptop. So is there any implementation of the model trained on machine-printed character images, which I could use and then train the final layers on my own data?
To get good results, you should use a model explicitly designed for text recognition.
First, (roughly) crop the input image to the region around the text.
Then, feed the image of the text into a neural network (NN) to detect the text.
A typical NN for text recognition extracts relevant features (with a convolutional NN), propagates those features through the image (with a recurrent NN) and finally predicts a character score for each position in the image.
Usually, those networks are trained with the CTC loss.
As a starting point I would suggest looking at the CRNN implementation (they also provide a pre-trained model) [1] and the corresponding paper [2]. As far as I remember, there is also a TensorFlow implementation on GitHub.
You can use any framework you like (e.g. TensorFlow or CNTK), as long as it features convolutional and recurrent NNs and the CTC loss.
I once attended a presentation about CNTK where they claimed to have a very fast implementation of recurrent NNs - so maybe CNTK would be a good choice for your slow computer?
[1] CRNN implementation: https://github.com/bgshih/crnn
[2] Shi et al., "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition"
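For illustration only (this is not the CRNN authors' code; the layer sizes, input shape and class count are assumptions), a CRNN-style network with the CTC loss can be sketched in TensorFlow/Keras like this:

    import tensorflow as tf

    num_classes = 37  # e.g. 26 letters + 10 digits + 1 CTC blank (assumption)

    # Convolutional feature extractor on a grayscale text crop.
    inputs = tf.keras.Input(shape=(32, 128, 1))
    x = tf.keras.layers.Conv2D(64, 3, padding='same', activation='relu')(inputs)
    x = tf.keras.layers.MaxPooling2D((2, 2))(x)    # -> 16 x 64
    x = tf.keras.layers.Conv2D(128, 3, padding='same', activation='relu')(x)
    x = tf.keras.layers.MaxPooling2D((2, 1))(x)    # -> 8 x 64, width kept
    # Treat each horizontal position as one time step of a sequence.
    x = tf.keras.layers.Permute((2, 1, 3))(x)      # (width, height, channels)
    x = tf.keras.layers.Reshape((64, 8 * 128))(x)
    # Recurrent part propagates features along the image width.
    x = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(128, return_sequences=True))(x)
    logits = tf.keras.layers.Dense(num_classes)(x)  # per-position character scores
    model = tf.keras.Model(inputs, logits)

    # CTC loss: labels are dense label-id sequences, logits are (batch, time, classes).
    def ctc_loss(labels, logits, label_length, logit_length):
        return tf.reduce_mean(tf.nn.ctc_loss(
            labels=labels, logits=logits,
            label_length=label_length, logit_length=logit_length,
            logits_time_major=False, blank_index=num_classes - 1))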
I am building a new TensorFlow model based on the SSD MobileNet V1 COCO model in order to perform real-time object detection in a video. I am trying to find out whether there is a way to add new classes to the existing model, so that my model has all 90 classes available in the SSD MobileNet COCO v1 model and also contains the new classes I want to classify.
For example, I have created training data for two classes: man, woman
Now, I have built a new TensorFlow model that identifies a man and/or a woman in a video. However, my model does not have the other 90 classes present in the original SSD MobileNet model. I am looking for a way to combine both models, or to pass more than one model to my code, to detect the objects.
If you have any questions or if I am not being clear, please feel free to ask me for more details.
The only way I can find is to get the dataset the SSD MobileNet model was trained on.
Make sure all the images are in one directory and the annotations in another.
There should be a corresponding annotation file for each image file,
e.g. myimage.jpg and myimage.xml.
If all the images of your custom dataset have the same format as the SSD MobileNet dataset, annotate them with a tool called LabelImg.
Add those images and annotation files to the respective images and annotations directories where the SSD MobileNet data is already saved.
Then regenerate a new TFRecord and continue the remaining procedure with it (a sketch of the TFRecord step is shown below).
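The TFRecord step follows the TF Object Detection API convention; a hedged sketch of writing a single image/annotation pair (the file names, image size, box values and class id are placeholders based on the myimage.jpg / myimage.xml example above):

    import tensorflow as tf

    def bytes_feature(v):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[v]))

    def int64_feature(v):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=[v]))

    def float_list_feature(v):
        return tf.train.Feature(float_list=tf.train.FloatList(value=v))

    # One record per image; keys follow the TF Object Detection API convention.
    with open('myimage.jpg', 'rb') as f:
        encoded = f.read()
    example = tf.train.Example(features=tf.train.Features(feature={
        'image/encoded': bytes_feature(encoded),
        'image/format': bytes_feature(b'jpeg'),
        'image/height': int64_feature(416),   # real values come from myimage.xml
        'image/width': int64_feature(416),
        'image/object/bbox/xmin': float_list_feature([0.1]),  # normalized boxes
        'image/object/bbox/ymin': float_list_feature([0.2]),
        'image/object/bbox/xmax': float_list_feature([0.5]),
        'image/object/bbox/ymax': float_list_feature([0.6]),
        'image/object/class/label': int64_feature(91),  # id of a new class (assumption)
    }))
    with tf.io.TFRecordWriter('train.record') as writer:
        writer.write(example.SerializeToString())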
You can use transfer learning with the TensorFlow API.
Transfer learning allows you to load a pre-trained network and modify the fully connected layer by introducing your own classes; a minimal sketch of this idea follows below.
There are full descriptions of this in the following references:
Codelab
A good explanation here
Tensorflow API here for more details
You can also use Google Cloud Platform for better and faster results.
I hope this helps you.
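As a minimal sketch of the head-replacement idea mentioned above (the class count and layer choices are assumptions, and note that a detector like SSD is retrained through the Object Detection API pipeline rather than with a plain classifier like this):

    import tensorflow as tf

    # Load an ImageNet-pretrained backbone without its classification head.
    base = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3),
                                             include_top=False, weights='imagenet')
    base.trainable = False  # freeze the pre-trained weights

    # New head for your own classes (here: 2 classes, man/woman, as an example).
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(2, activation='softmax'),
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    # model.fit(train_ds, epochs=5)  # train_ds is your labelled dataset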
I don't think there is a way to add your classes to the existing 90 classes without using the dataset the model was previously trained on. Your only option is to take that dataset plus your own and retrain the model.
I am using a TensorFlow framework and I have noticed major variance in the size of the TensorFlow model files.
For example, the framework provides two models:
one pretrained model, to be used with fine-tuning for example,
and one which contains an untrained version.
They both have a size of 172,539 KB.
When I apply fine-tuning to my model with some minor changes in the graph (there is a module in the framework for that) and save my model, the size remains essentially the same: 178,525 KB.
First, I am a bit surprised that my fine-tuned model is somewhat bigger, since I changed just the last layer from 21 to 14 classes and would therefore expect a somewhat smaller model file; but since the difference is so small, I didn't pay attention to it.
But when I trained the same model using the same model file (the pretrained one, I mean) and saved the model to disk, the file size was quite different: 340,097 KB. By the term "train" I mean that I allow the network to modify all parameters, not just the parameters of the last layer.
The model being implemented is a variation of ResNet for semantic image segmentation (in case someone can deduce the expected model file size from the model itself).
So, my questions are: why is there such variance in the model file sizes, and how come my saved fine-tuned model is larger than the original one? Is there a way to include/exclude parameters in the model to be saved?
P.S. 1: Some information that might be handy:
I am saving with TensorFlow's V2 checkpoint format, while I think the framework files use V1. I am not sure how to identify this beyond the fact that the former produces 3 files.
The framework is called tensorflow-deeplab-resnet and can be found here, and the models are here.
P.S. 2:
I am not sure Stack Overflow is the right place for this question either.
That is because, when you train a model and save it, TensorFlow also saves the optimizer's state for your variables (for example, the per-weight momentum accumulators kept by optimizers such as Momentum or Adam).
So allowing training on only the last layer increases the size of your saved model a little, while allowing training on the whole model essentially doubles the size of the save file, because every trainable variable gets its extra optimizer state saved alongside it.
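You can check this yourself by listing what a checkpoint actually stores; a small sketch (the checkpoint path is a placeholder), where the optimizer's extra state typically shows up as variables with suffixes such as /Momentum:

    import tensorflow as tf

    # Lists every variable stored in the checkpoint along with its shape.
    for name, shape in tf.train.list_variables('model.ckpt'):  # placeholder path
        print(name, shape)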
I was able to download and successfully test the brain parcellation demo of the NiftyNet package. However, this only gives me the final parcellation result of a pre-trained network, whereas I also need access to the output of the intermediate layers.
According to this demo, the following line downloads a pre-trained model and a test MR volume:
wget -c https://www.dropbox.com/s/rxhluo9sub7ewlp/parcellation_demo.tar.gz -P ${demopath}
where ${demopath} is the path to the demo folder. Extracting the downloaded file creates a .ckpt file which seems to contain a pre-trained TensorFlow model; however, I could not manage to load it into a TensorFlow session.
Is there a way to load the pre-trained model and get access to all its intermediate activation maps? In other words, how can I load the pre-trained models from the NiftyNet library into a TensorFlow session such that I can explore the model or probe a certain intermediate layer for any given input image?
Finally, on NiftyNet's website it is mentioned that "a number of models from the literature have been (re)implemented in the NiftyNet framework". Are pre-trained weights of these models also available? The demo uses a pre-trained model called HighRes3DNet. If pre-trained weights of other models are also available, what is the link to download those weights or saved TensorFlow models?
To answer your 'Finally' question first, NiftyNet has some network architectures implemented (e.g., VNet, UNet, DeepMedic, HighRes3DNet) that you can train on your own data. For a few of these, there are pre-trained weights for certain applications (e.g. brain parcellation with HighRes3DNet and abdominal CT segmentation with DenseVNet).
Some of these pre-trained weights are linked from the demos, like the parcellation one you linked to. We are starting to collect the pre-trained models into a model zoo, but this is still a work in progress.
Eli Gibson [NiftyNet developer]
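As a postscript on the loading part of the question: for TF1-style checkpoints that ship with a matching .meta file, something along these lines usually works (a hedged sketch; the paths and the tensor name are placeholders, and the real layer names in HighRes3DNet have to be looked up first, e.g. by printing the graph's operations):

    import tensorflow as tf

    with tf.compat.v1.Session() as sess:
        # Rebuild the graph from the .meta file, then restore the weights.
        saver = tf.compat.v1.train.import_meta_graph('model.ckpt.meta')  # placeholder
        saver.restore(sess, 'model.ckpt')
        graph = tf.compat.v1.get_default_graph()
        # Print operation names to find the intermediate layer you want.
        for op in graph.get_operations()[:20]:
            print(op.name)
        # 'some_block/relu:0' is a hypothetical name; substitute a real one.
        activation = graph.get_tensor_by_name('some_block/relu:0')
        # feature_map = sess.run(activation, feed_dict={input_tensor: volume})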