SnapML model issue on Lens Studio - tensorflow

I built a custom model for classifying images of cars using TensorFlow and Keras, intending to use it in a Snap lens powered by machine learning. Since Lens Studio only accepts quantized models, the model had to go through quantization using the TFLite converter.
However, the problem is that the model, once imported into Lens Studio, does not work properly. It displays classification results only once, right after initialization; after that the results (and even the class probabilities behind them) remain static no matter how the image/video input changes.
Any tips on how to solve this issue would be appreciated. The input image configuration is identical to the template provided by Snap.
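For reference, the quantization step was along these lines (a minimal post-training-quantization sketch; the model path is a placeholder, and the exact converter options Lens Studio's SnapML import expects may differ):

```python
import tensorflow as tf

# Load the trained Keras classifier (placeholder path).
model = tf.keras.models.load_model("car_classifier.h5")

# Post-training quantization via the TFLite converter.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("car_classifier_quantized.tflite", "wb") as f:
    f.write(tflite_model)
```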

Related

How to get preprocess/postprocess steps from a model created using Google Vertex AI?

A client of mine wants to run their Google Vertex AI model on NVIDIA Jetson boards using TensorRT as the accelerator. The problem is that their model uses certain operators (DecodeJpeg) that are not supported by ONNX. I've been able to isolate the feature extractor subgraph from the model, so everything in it is supported by ONNX, while the preprocessing and postprocessing will be written separately from the model.
I'm asking because I need the pre/postprocessing steps of the model so I can implement them separately. Is there a way to get them from the Google Vertex AI console?
I've tried running a loop that rescales the image to square sizes from 0 to 512 pixels, but none of those gave adequate results.
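For context, the kind of standalone preprocessing I've been experimenting with looks roughly like this (the 512x512 target size and the [0, 1] scaling are assumptions, since the exact Vertex AI preprocessing is what I'm trying to recover):

```python
import numpy as np
from PIL import Image

def preprocess(image_path, target_size=512):
    """Decode the JPEG outside the model (replacing DecodeJpeg) and
    resize/normalize it; the size and scaling here are guesses."""
    img = Image.open(image_path).convert("RGB")
    img = img.resize((target_size, target_size), Image.BILINEAR)
    arr = np.asarray(img, dtype=np.float32) / 255.0  # assumed [0, 1] scaling
    return np.expand_dims(arr, axis=0)               # add a batch dimension
```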

Convert PoseNet TensorFlow.js params to TensorFlow Lite

I'm fairly new to TensorFlow so I apologize if I'm saying something absurd.
I've been playing with the PoseNet model in the browser using TensorFlow.js. In this project, I can change the algorithm and parameters so I can get better results on the detection of certain poses. The most important params in my use case are the Multiplier, Quant Bytes and Output Stride.
So far so good, I have the results I want. However, I want to convert these results to TensorFlow Lite so I can use them in an iOS application. I managed to find the PoseNet model as a TensorFlow Lite file (tflite) and I even found an iOS example app provided by TensorFlow, so I'm able to load the model file and have it working on iOS.
The problem is that I'm unable to change the params (Multiplier, Quant Bytes and Output Stride) in the iOS app. I can't find anywhere how to do this. I've searched for these params in the iOS app source code, and I've tried to find ways to convert a TensorFlow.js model to TensorFlow Lite so I can load the model with the params I want in the app, but no luck.
I'm writing this post so maybe you guys can point me in the right direction so I'm able to "translate" what I have on TensorFlow.js to TensorFlow Lite.
EDIT:
This is what I've learned in the last couple of days:
TFLite is designed for serving a fixed model with a lightweight runtime. Thus, modifying model parameters on demand is not one of its design goals.
I looked at the TF.js code for PoseNet and found a similar design. It only seems like you can modify parameters; they actually ship a different model for each parameter combination. https://github.com/tensorflow/tfjs-models/blob/b72c10bdbdec6b04a13f780180ed904736fa52a5/posenet/src/checkpoints.ts#L37
TFLite models generally don't support dynamic parameters. Output Stride, Multiplier and Quant Bytes are fixed when the neural network is created.
So what I want to do is extract the weights from the TF.js model and put them into existing MobileNet code.
And that's where I need help now. Could anyone point me in the right direction on how to load and modify the model so I can then convert it to tflite with my own params?
EDIT2:
I found a repo that helps me convert TF.js models to TF Lite: Griffin98/posenet_tfjs2tflite. I still can't define the Quant Bytes, though.
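Assuming the TF.js graph has already been turned into a TensorFlow SavedModel (e.g. via the repo above), the closest TFLite-side equivalent of Quant Bytes I've found is the converter's weight-quantization setting; this is only a rough mapping and the paths are placeholders:

```python
import tensorflow as tf

# Placeholder path to a SavedModel produced from the TF.js PoseNet graph.
converter = tf.lite.TFLiteConverter.from_saved_model("posenet_saved_model")

# Roughly "Quant Bytes = 2": float16 weight quantization.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]

# Roughly "Quant Bytes = 1": drop the supported_types line to get
# dynamic-range int8 quantization instead.
# Roughly "Quant Bytes = 4": drop both lines and keep float32 weights.

tflite_model = converter.convert()
with open("posenet_float16.tflite", "wb") as f:
    f.write(tflite_model)
```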

how to use tensorflow object detection API for face detection

OpenCV provides a simple API to detect and extract faces from given images. (I don't think it works perfectly, though, because in my experience it sometimes crops regions from the input pictures that have nothing to do with faces.)
I wonder if the TensorFlow API can be used for face detection. I failed to find relevant information, but I'm hoping an experienced person in the field can guide me on this subject. Can TensorFlow's object detection API be used for face detection in the same way OpenCV does? (I mean, you just call the API function and it gives you the face image from the given input image.)
You can, but some work is needed.
First, take a look at the object detection README. There are some useful articles you should follow. Specifically: (1) Configuring an object detection pipeline, (2) Preparing inputs and (3) Running locally. You should start with an existing architecture and a pre-trained model. Pre-trained models can be found in the Model Zoo, and their corresponding configuration files can be found here.
The most common pre-trained models in the Model Zoo are trained on the COCO dataset. Unfortunately, this dataset doesn't contain face as a class (but it does contain person).
Instead, you can start with a pre-trained model on Open Images, such as faster_rcnn_inception_resnet_v2_atrous_oid, which does contain face as a class.
Note that this model is larger and slower than common architectures used on the COCO dataset, such as SSDLite over MobileNetV1/V2. This is because Open Images has many more classes than COCO, so a well-performing model needs to be much more expressive in order to distinguish between the larger number of classes and localize them correctly.
Since you only want face detection, you can try the following two options:
If you're okay with a slower model that will probably give better performance, start with faster_rcnn_inception_resnet_v2_atrous_oid; you only need to fine-tune the model slightly on the single class of face.
If you want a faster model, you should probably start with something like SSDLite-MobileNetV2 pre-trained on COCO, and then fine-tune it on the face class from a different dataset, such as your own or the face subset of Open Images.
Note that the fact that the pre-trained model isn't trained on faces doesn't mean you can't fine-tune it to detect them; it just might take more fine-tuning than starting from a model that was pre-trained on faces as well.
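Once you have a fine-tuned and exported frozen graph, running it and cropping out the detected faces is just a matter of feeding the image and reading the detection tensors. A rough TF1-style sketch (paths and the score threshold are placeholders; the tensor names are the standard ones in graphs exported by the Object Detection API):

```python
import numpy as np
import tensorflow.compat.v1 as tf
from PIL import Image

tf.disable_v2_behavior()  # run in TF1 graph mode

# Load the exported frozen graph (placeholder path).
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile("frozen_inference_graph.pb", "rb") as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name="")

image = Image.open("input.jpg")
image_np = np.expand_dims(np.array(image), axis=0)

with tf.Session(graph=graph) as sess:
    boxes, scores = sess.run(
        ["detection_boxes:0", "detection_scores:0"],
        feed_dict={"image_tensor:0": image_np},
    )

# Boxes are normalized [ymin, xmin, ymax, xmax]; crop everything above
# an arbitrary score threshold.
width, height = image.size
for i, (box, score) in enumerate(zip(boxes[0], scores[0])):
    if score < 0.5:
        continue
    ymin, xmin, ymax, xmax = box
    face = image.crop((xmin * width, ymin * height, xmax * width, ymax * height))
    face.save("face_%d.jpg" % i)
```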
Just increase the shape of the input; I tried it and it works much better.

Object detection project (root architecture) using Tensorflow + Keras. Image sample size for accurate training of model?

I'm currently working on a project at university, where we are using Python + TensorFlow and Keras to train an image object detector to detect different parts of the root system of Arabidopsis.
Our current results are pretty bad, as we only have about 100 images to train the model with at the moment, but we are currently cultivating more plants in order to get more images (more data) to train the TensorFlow model.
We have implemented the following Mask_RCNN model: Github - Mask_RCNN tensorflow
We are looking to detect three object classes: stem, main root and secondary root.
But the model incorrectly detects main roots where the secondary roots are located.
It should be able to detect something like this: Root detection example
Training root data set that we are using right now: training images
What is the usual sample size needed to train a neural network to accurate results?
First off: I think there is no simple rule to estimate the sample size, but at the very least it depends on:
1. Quality of your images
I downloaded the images and I think you need to preprocess them before you can use them, to reduce the "problem complexity". In some projects in which I worked with biological data, background removal (image - low pass filter) was the key to getting better results. You should also definitely remove/crop the area outside your region of interest (like the tape and the ruler). I would try to get the cleanest dataset possible (including manual adjustments with cv2/gimp/etc.) so the network focuses on solving "the right problem". After that you could apply some random distortion so it also works on fuzzy/bad/realistic images.
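As a concrete illustration of the background removal I mean (a sketch only; the crop box, blur strength and filename are arbitrary and would need tuning for your scans):

```python
import cv2

# Placeholder filename for one of the scans.
img = cv2.imread("root_scan.png", cv2.IMREAD_GRAYSCALE)

# Crop away the tape/ruler border first (coordinates are arbitrary examples).
img = img[100:-100, 100:-100]

# "image - low pass filter": subtract a heavily blurred copy to suppress the
# slowly varying background and keep the thin root structures. Swap the two
# operands of subtract() if your roots are darker than the background.
background = cv2.GaussianBlur(img, (0, 0), 25)
foreground = cv2.subtract(img, background)

# Stretch the contrast so the roots stand out, then save the cleaned image.
foreground = cv2.normalize(foreground, None, 0, 255, cv2.NORM_MINMAX)
cv2.imwrite("root_scan_clean.png", foreground)
```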
2. The way you work with your data
There are a few tricks that let you "expand" your dataset.
It's often very helpful to let a generator method crop random small patches from your input data. This lets you work with more batches (on small GPUs) and gives your network more "variety" (just think about the conv2d task: if you don't use random cropping, your filters will slide over the same areas of the same image over and over again). For the same reason, also apply random distortion and flip and rotate your images.
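A minimal generator along those lines (a sketch assuming your images and masks are same-sized numpy arrays at least as large as the patch; patch and batch sizes are arbitrary):

```python
import numpy as np

def random_patch_generator(images, masks, patch=128, batch_size=8):
    """Yield batches of randomly cropped, flipped and rotated patches
    so the network sees different regions of the same scans each epoch."""
    while True:
        xs, ys = [], []
        for _ in range(batch_size):
            i = np.random.randint(len(images))
            img, mask = images[i], masks[i]
            top = np.random.randint(img.shape[0] - patch)
            left = np.random.randint(img.shape[1] - patch)
            x = img[top:top + patch, left:left + patch]
            y = mask[top:top + patch, left:left + patch]
            if np.random.rand() < 0.5:   # random horizontal flip
                x, y = x[:, ::-1], y[:, ::-1]
            k = np.random.randint(4)     # random 90-degree rotation
            x, y = np.rot90(x, k), np.rot90(y, k)
            xs.append(x)
            ys.append(y)
        yield np.stack(xs), np.stack(ys)
```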
3. Network architecture
In your case I would prefer a U-Net architecture with a final conv2d output of 3 feature maps (your classes), a softmax activation and a categorical_crossentropy loss. This lets you play with the depth: sometimes you need a sophisticated architecture to solve a problem (close to 100%), but in your case you just want to see a first working result, so fewer layers and a simple architecture could also help you get things working. Maybe there are trained network weights for a U-Net that meet your requirements (search on Kaggle, for example), because it also helps (to reduce the data you need) to use transfer learning -> reuse the first layers (weights) of a network that is already trained. In semantic segmentation, the first filters end up as something like an edge detector for most problems/images.
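A minimal Keras U-Net in that spirit (intentionally shallow, sized for small patches; 3 output maps with softmax and categorical_crossentropy as described, expecting one-hot encoded masks):

```python
from tensorflow.keras import layers, models

def small_unet(input_shape=(128, 128, 1), n_classes=3):
    inp = layers.Input(input_shape)

    # Encoder
    c1 = layers.Conv2D(16, 3, padding="same", activation="relu")(inp)
    p1 = layers.MaxPooling2D()(c1)
    c2 = layers.Conv2D(32, 3, padding="same", activation="relu")(p1)
    p2 = layers.MaxPooling2D()(c2)

    # Bottleneck
    b = layers.Conv2D(64, 3, padding="same", activation="relu")(p2)

    # Decoder with skip connections
    u2 = layers.Concatenate()([layers.UpSampling2D()(b), c2])
    c3 = layers.Conv2D(32, 3, padding="same", activation="relu")(u2)
    u1 = layers.Concatenate()([layers.UpSampling2D()(c3), c1])
    c4 = layers.Conv2D(16, 3, padding="same", activation="relu")(u1)

    # One feature map per class, softmax over the class axis
    out = layers.Conv2D(n_classes, 1, activation="softmax")(c4)

    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    return model
```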
4. Your mental model of "accurate results"
This is the hardest part, because it evolves during your project. E.g., the moment your network starts to perform well on preprocessed input images, you will start thinking about architecture/data changes to make it work on fuzzy images as well. This is why you should start with a feasible problem, but keep improving your dataset (including rare kinds of roots) and tune your network architecture step by step.

Issue with Custom object detection using tensorflow when Training on a single type of object

I am training a pre-built TensorFlow-based model for custom object detection.
I want to detect only one type of object. I have taken a lot of images from different angles and in different lighting conditions. I am training on a K80 Nvidia GPU. Everything is working, and when I train I can see the loss falling to 0.3, but the loss drops to under 1 very quickly after training starts. I am using SSD MobileNet as the base configuration for the model. When I try to test the model, it just draws a big square on the input image rather than detecting the desired object in the image. Basically, it fails to detect the object.
I tried training the model with a different set of images of mac n cheese that had a lot of variation. Then the model worked fine and detected mac n cheese in the input image. But with the pictures of the single object, the model fails to detect it. Please help me understand what I am doing wrong here.
The issue was with my training dataset: I was not properly cropping the object from the original images. I also needed around 300 images to properly train the model. SSD worked well after it was given well-cropped images.
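If anyone hits the same problem, a quick way to catch bad annotations before training is to draw the ground-truth boxes back onto the images (a sketch assuming Pascal VOC-style XML annotations; the file names are placeholders):

```python
import xml.etree.ElementTree as ET
from PIL import Image, ImageDraw

def check_annotation(image_path, xml_path):
    """Overlay the labelled boxes so badly placed or cropped boxes are obvious."""
    img = Image.open(image_path)
    draw = ImageDraw.Draw(img)
    for obj in ET.parse(xml_path).getroot().findall("object"):
        box = obj.find("bndbox")
        xmin, ymin = int(box.find("xmin").text), int(box.find("ymin").text)
        xmax, ymax = int(box.find("xmax").text), int(box.find("ymax").text)
        draw.rectangle([xmin, ymin, xmax, ymax], outline="red", width=3)
    img.show()

check_annotation("example.jpg", "example.xml")
```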