Tensorflow Object Detection API - showing loss for training and validation on one graph - tensorflow

I am playing with Tensorflow Object Detection API and training the Faster R-CNN network on my own dataset. I am checking the progress of learning at Tensorbord. All metrics are there, but is there a way to have both loss plots, for training and validation data, on one graph? Or do I have to dive into TOD Api code and modify it? I would like to avoid the second because during every update of the API I will have to keep in mind that some of the code is changed locally.

The underlying data for the plots is saved under different tag names (loss vs loss_1). I believe TensorBoard does not natively support displaying different tags in one plot. There might be third-party extensions to do this.
If different models used the same tag, the graphs would be combined by default (see: Plot multiple graphs in one plot using Tensorboard).

Related

Using dynamically generated data with keras

I'm training a neural network using keras but I'm not sure how to feed the training data into the model in the way that I want.
My training data set is effectively infinite, I have some code to generate training examples as needed, so I just want to pipe a continuous stream of novel data into the network. keras seems to want me to specify my entire dataset in advance by creating a numpy array with everything in it, but this obviously wont work with my approach.
I've experimented with creating a generator class based on keras.utils.Sequence which seems like a better fit, but it still requires me to specify a length via the __len__ method which makes me think it will only create that many examples before recycling them. Can someone suggest a better approach?

In GluonTS, how to get the feature importance of every timestep, when using the TemporalFusionTransformer model?

Im using the MXNet implementation of the TFT model, and I want to get the feature importance for every timestep from the trained model. Unfortunately, there is no such implemented functionwhich would satisfy my demand. According to the original article for TFT, there is a way to get the feature importance by getting the weigths off of the variable selection network. Howewer, it's softmax function gives back an embedded, 3 dimensional matrix. Im stuck with this problem, due to the lack of documentation about TFT/MXNet.
Any help is highly appricated.

Missing tool in Deep Learning: data augmentation with geometrical landmarks

Data augmentation can easily be achieved using ad hoc modules in e.g. TensorFlow. This works perfectly for classification problems, however when the objective of the network is the prediction of a geometrical feature, e.g. a landmark, a problem arises. As the image is modified, e.g. flipped, or distorted, the corresponding labels also need to be adapted.
1 - Is there any tool to do this? I am sure that this is a common problem.
2 - Would it be useful to create a data augmentation script for neural networks that predict geometrical features?
I want to understand if I need to code all of this by myself or if I am missing something that already exists. If I need to do it and it could be useful I would just create an open source thing.
You can use imgaug library https://github.com/aleju/imgaug
An example of augmentation for key points using imgaug you can find here https://github.com/aleju/imgaug#example-augment-images-and-keypoints

Image Detection & Classification - general approach?

I'm trying to build a detection + classification model that will recognize an object in an image and classify it. Every image will contain at most 1 object among my 10 classes (i.e. same image cannot contains 2 classes). An image can, however, contain none of my classes/objects. I'm struggling with the general approach to this problem, especially due to the nature of my problem; my objects have different sizes. This is what I have tried:
Trained a classifier with images that only contains my objects/classes, i.e. every image is the object itself with background pre-removed. Now, since the objects/images have different shapes (aspect ratios) I had to reshape the images to the same size (destroying the aspect ratios). This would work just fine if my purpose was to only build a classifier, but since I also need to detect the objects, this didn't work so good.
The second approach was similar to (1), except that I didn't reshape the objects naively, but kept the aspect ratios by padding the image with 0 (black). This completely destroyed my classifiers ability to perform well (accuracy < 5%).
Mask RCNN - I followed this blogpost to try build a detector + classifier in the same model. The approach took forever and I wasn't sure it was the right approach. I even used external tools (RectLabel) to generate annotated image files containing information about the bounding boxes.
Question:
How should I approach this problem, on a general level:
Should I build 2 separate models? (One for detection/localization and one for classification?)
Should I be annotating my images using annotations file as in approach (3)?
Do I have to reshape my images at any stage?
Thanks,
PS. In all of my approaches, I augmented the images to generate ~500-1000 images per class.
To answer your questions:
No, you don't have to build two separate models. What you are describing is called Object detection, which is classification along with localization. There are many models which do this: Mask_RCNN, Yolo, Detectron, SSD, etc..
Yes, you do need to annotate your images for training a model for your custom classes. Each of the models mentioned above has needs a different way of annotation.
No, you don't need to do any image resizing. Most of the time it is done when the model loads the data for training or inference.
You are on the right track with trying MaskRCNN.
Other than MaskRCNN, you could also try Yolo. There is also an accompanying easy-to-use annotating tool Yolo-Mark.
If you go through this tutorial, you would understand what you care about.
How to train your own Object Detector with TensorFlow’s Object Detector API
The SSD model is small so that it would not take so much time for training.
There are some object detection models.
On RectLabel, you can save bounding boxes in the PASCAL VOC format.
You can export TFRecord for Tensorflow.
https://rectlabel.com/help#tf_record

Object detection using CNTK

I am very new to CNTK.
I wanted to train a set of images (to detect objects like alcohol glasses/bottles) using CNTK - ResNet/Fast-R CNN.
I am trying to follow below documentation from GitHub; However, it does not appear to be a straight forward procedure. https://github.com/Microsoft/CNTK/wiki/Object-Detection-using-Fast-R-CNN
I cannot find proper documentation to generate ROI's for the images with different sizes and shapes. And how to create object labels based on the trained models? Can someone point out to a proper documentation or training link using which I can work on the cntk model? Please see the attached image in which I was able to load a sample image with default ROI's in the script. How do I properly set the size and label the object in the image ? Thanks in advance!
sample image loaded for training
Not sure what you mean by proper documentation. This is an implementation of the paper (https://arxiv.org/pdf/1504.08083.pdf). Looks like you are trying to generate ROI's. Can you look through the helper functions as documented at the site to parse what you might need:
To run the toy example, make sure that in PARAMETERS.py the datasetName is set to "grocery".
Run A1_GenerateInputROIs.py to generate the input ROIs for training and testing.
Run A2_RunCntk_py3.py to train a Fast R-CNN model using the CNTK Python API and compute test results.
The algo will work on several candidate regions and then generate outputs: one for the classes of objects and another one that generates the bounding boxes for the objects belonging to those classes. Please refer to the code for getting the details of the implementation.
Can someone point out to a proper documentation or training link using which I can work on the cntk model?
You can take a look at my repository on GitHub.
It will guide you through all the steps required to train your own model for object detection and classification with CNTK.
But in short the proper steps should look something like this:
Setup environment
Prepare data
Tag images (ground truth)
Download pretrained model and create mappings for your custom dataset
Run training
Evaluate the model on test set