Tensorflow Object Detection API - Detecting the humans not wearing Helmet - tensorflow

*** Please note: my previous problem of withouthelmet detections being labelled N/A is resolved.
Now I have a new issue. I used 1000 images of humans not wearing helmets, 1000 images of humans wearing helmets, and 1000 images of humans only. I used the SSD_mobilenet_v1_pets.config file for training.
Here is my pbtxt file:
item {
id: 1
name: 'withouthelmet'
}
item {
id: 2
name: 'withhelmet'
}
item {
id: 3
name: 'person'
}
Sample training image
After training, my model detects every car as a person.
Is that because I used the ssd_mobilenet model (where id: 1 is person, but I used id: 1 for withouthelmet, and id: 3 is car, but I used id: 3 for person)?
Please help me solve this problem.

Have you set num_classes to 1 in your config?
Please note that min_negatives_per_image means the minimum number of negative anchors (not images), so your data mix has nothing to do with this parameter.
I had to modify my earlier answer: if you add background images (images with no ground-truth boxes) to the dataset, it should help reduce false positives. Sorry, I got confused with some other stuff.

Have you used the pre-trained SSD-MobileNetV1 model trained on the pets dataset?
I think you had better use a model trained on the COCO dataset, since it includes persons, unlike the pets dataset.
Of course, if you train your model it will learn to detect persons as well, but since you don't have many examples of persons without a helmet, it would probably be better to start with a model that already knows what a person is.
Regarding your question: if you only want to detect people without a helmet, you can simply drop everything else from the pbtxt file and keep only
item {
id: 1
name: 'withouthelmet'
display_name: 'withouthelmet'
}
Then change the number of classes (num_classes) in the config file to 1 and fine-tune the model.
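A minimal sketch of generating such a label map, in plain Python with no Object Detection API imports. `render_label_map` is a hypothetical helper, not part of the API; it only illustrates numbering the classes contiguously from 1, so that the length of the list is the value to put in num_classes.

```python
def render_label_map(class_names):
    """Return label-map text with ids assigned contiguously, starting at 1."""
    items = []
    for class_id, name in enumerate(class_names, start=1):
        items.append(
            "item {\n"
            f"  id: {class_id}\n"
            f"  name: '{name}'\n"
            f"  display_name: '{name}'\n"
            "}\n"
        )
    return "".join(items)


# Single-class case from the answer above: only 'withouthelmet' is kept.
class_names = ["withouthelmet"]
label_map_text = render_label_map(class_names)
num_classes = len(class_names)  # this value goes into the training config
print(label_map_text)
```

The returned text can be written to the .pbtxt file referenced by the training config.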

Related

Person recognition using ML.NET/TensorFlow

I am a noob in ML. I have a Person table with the following columns:
-----------------------------------
User
-----------------------------------
UserId | UserName | UserPicturePath
1 | MyName | MyName.jpeg
Now I have tens of millions of persons in my database. I want to train a model to predict the UserId from an image (png/jpeg/tiff) given as bytes. So the input will be an image and the output I am looking for is the UserId. Right now I am looking for a solution in ML.NET, but I am open to switching to TensorFlow.
Well, this is nothing but a mapping problem, specifically a face-to-id mapping problem, and neural nets excel at this kind of task.
As you have probably understood by now, you can do this with TensorFlow, PyTorch, or any similar library.
But if you want to use TensorFlow, read on for ready-made code at the end. The easiest way to achieve your task is transfer learning, i.e. loading a pretrained model, freezing all but the last layer, and then training the network to produce a latent one-dimensional vector (an embedding) for a given face image. You can then save this vector in a database and map it to an id.
Then, whenever there is a new image and you want to predict its id, you run the image through the network, get its vector, and compute the cosine similarity with the vectors in your database. If the highest similarity is above some threshold, you have found your id.
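The lookup step described above can be sketched as follows. This is only an illustration in plain Python: the embeddings are made-up two-dimensional vectors, `find_user_id` and the 0.8 threshold are hypothetical, and in practice the vectors would come from the trained network.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_user_id(query, stored, threshold=0.8):
    """Return the id whose stored vector is most similar to the query.

    stored maps user id -> embedding. Returns None when no similarity
    exceeds the threshold.
    """
    best_id, best_score = None, threshold
    for user_id, embedding in stored.items():
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_id, best_score = user_id, score
    return best_id


# Toy database with two users (illustrative values only).
stored = {1: [1.0, 0.0], 2: [0.0, 1.0]}
print(find_user_id([0.9, 0.1], stored))  # close to user 1
print(find_user_id([1.0, 1.0], stored))  # equally far from both users
```

With tens of millions of stored vectors, a linear scan like this would probably be too slow, and some form of nearest-neighbour index would likely be needed instead.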
There are many ways to go about this. You will certainly have to preprocess your data, and augment it at the same time, but if you want some ready-made code to play with, have a look at this well-known Happy House tutorial from Andrew Ng and his team:
https://github.com/gemaatienza/Deep-Learning-Coursera/blob/master/4.%20Convolutional%20Neural%20Networks/Keras%20-%20Tutorial%20-%20Happy%20House%20v2.ipynb
This should suffice for your needs.
Hope it helps!

Label file in tensorflow object detection training

I want to create my own .tfrecord files using the TensorFlow Object Detection API and use them for training. The records will be a subset of the original dataset, so that the model detects only specific categories.
What I don't understand, and can't find any information about, is how ids are assigned to labels in label_map.pbtxt during training.
What I do...
Step 1:
assign label_id during creation of the tfrecord file, where I put my own ids:
'image/object/class/label': dataset_util.int64_list_feature(category_ids)
'image/object/class/text': dataset_util.bytes_list_feature(category_names)
Step 2:
create labels file with e.g. two categories:
item { name: "apple" id: 53 display_name: "apple" }
item { name: "broccoli" id: 56 display_name: "broccoli" }
Step 3:
Train the model
After training, some objects are detected, but with an N/A label. When I set the ids starting from 1, it shows the correct labels.
My questions are:
Why did it not map correctly to the labels with custom ids?
Can the second id have a value other than 2? I'm sure I saw skipped ids in the label file for the COCO dataset.
How do I set the id to a custom value, if possible?
Thanks
I had the same problem with my label map. After Googling a bit, I found your question here and also this excerpt from the TensorFlow Object Detection repository:
Each dataset is required to have a label map associated with it. This label map defines a mapping from string class names to integer class Ids. The label map should be a StringIntLabelMap text protobuf. Sample label maps can be found in object_detection/data. Label maps should always start from id 1.
I also checked the source code for label_map_util.py and found this comment:
We only allow class into the list if its id-label_id_offset is
between 0 (inclusive) and max_num_classes (exclusive).
If there are several items mapping to the same id in the label map,
we will only keep the first one in the categories list
So in your example, which has only two classes, the valid ids are 1 and 2. Any higher value will be ignored.
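To make the quoted rule concrete, here is a simplified re-implementation of the filtering it describes. This is not the code from label_map_util.py; `filter_categories` is a hypothetical function, and the offset of 1 is an assumption chosen to be consistent with valid ids running from 1 to max_num_classes.

```python
def filter_categories(items, max_num_classes, label_id_offset=1):
    """Keep items whose id - label_id_offset lies in [0, max_num_classes).

    When several items share an id, only the first one is kept.
    The default offset of 1 is an assumption for this sketch.
    """
    categories = []
    seen_ids = set()
    for item in items:
        class_id = item["id"]
        if not 0 <= class_id - label_id_offset < max_num_classes:
            continue  # id is outside the accepted range
        if class_id in seen_ids:
            continue  # duplicate id: keep only the first item
        seen_ids.add(class_id)
        categories.append({"id": class_id, "name": item["name"]})
    return categories


# Label map from the question: ids 53 and 56 with only two classes.
custom_ids = [{"id": 53, "name": "apple"}, {"id": 56, "name": "broccoli"}]
# Same classes renumbered from 1.
renumbered = [{"id": 1, "name": "apple"}, {"id": 2, "name": "broccoli"}]

print(filter_categories(custom_ids, max_num_classes=2))
print(filter_categories(renumbered, max_num_classes=2))
```

Under this rule both custom ids are dropped, which would explain detections showing up with an N/A label.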

Tensorflow Object Detection Unusually large bounding boxes and wrong results

I am building an object detector in TensorFlow to detect motorbike riders with and without helmets. I have 1000 images each for riders with helmet, riders without helmet, and pedestrians (3000 images in total). My last checkpoint was at 35,267 steps. I tested it on a traffic video, but I see unusually large bounding boxes with wrong results. Can someone please explain the reason for such detections? Do I need to wait for at least 50,000 steps, or do I need to add more data (images taken from the angle of traffic cameras)?
Model - SSD Mobilenet COCO - Custom Object Detection,
Training Platform - Google Colab
Please find the images attached:
Video Snapshot 1
Video Snapshot 2
Day 2 - 10/30/2018
I tested with images today and got different results, which seem to be correct when I test with a single object in an image. Please find the results:
Single Object IMage Test 1
Single Object Image Test 2
Tested checkpoint - 52,000 steps
But if I test with images containing multiple objects on a road, the detection is wrong and the bounding boxes are weirdly large. Is it because of the dataset, since I am training with one motorbike rider (with or without helmet) per image?
Please find the wrong results:
Multi Object Image Test
Multi Object Image Test
I also tested with images where the scene contains only motorbikes. In this case I did not get any results. Please find the images:
No Result Image
No Result Image
The results are very confusing. Is there anything I am missing?
There is no need to wait until 50,000 steps; you should get decent results at 35k or even at 10k. I would suggest:
Go through your dataset again and check all the bounding boxes (data cleaning).
Check your model with the inference code for changes such as batch normalization, etc.
Add more data with different features, angles, and color complexities.
I would check these points before going further.
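As one possible starting point for the bounding-box check suggested above, the sketch below flags boxes that are empty, inverted, or outside the image. It is an illustration in plain Python: `find_bad_boxes` is a hypothetical helper, and the pixel-coordinate (xmin, ymin, xmax, ymax) layout is an assumption about how the annotations are stored.

```python
def find_bad_boxes(boxes, image_width, image_height):
    """Return the indices of boxes that look wrong.

    boxes is a list of (xmin, ymin, xmax, ymax) tuples in pixels.
    A box is flagged when it has no area, has swapped corners,
    or extends beyond the image.
    """
    bad = []
    for index, (xmin, ymin, xmax, ymax) in enumerate(boxes):
        inside = (
            0 <= xmin and 0 <= ymin
            and xmax <= image_width and ymax <= image_height
        )
        non_empty = xmin < xmax and ymin < ymax
        if not (inside and non_empty):
            bad.append(index)
    return bad


# Illustrative annotations for one 640 x 480 image.
boxes = [
    (10, 10, 50, 50),   # plausible box
    (60, 20, 40, 80),   # xmin > xmax: corners swapped
    (0, 0, 700, 100),   # extends past the right edge
]
print(find_bad_boxes(boxes, image_width=640, image_height=480))
```

A check like this only catches malformed boxes; boxes that are well-formed but drawn around the wrong object still need a visual review.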

use tensorflow object detection API for gender recognition

Can I use the TensorFlow Object Detection API for gender recognition?
I want to train SSD MobileNet for gender detection and recognition. I changed the label map to:
item {
id: 1
name: 'man'
}
item {
id: 2
name: 'woman'
}
and set num_classes=2.
I reached training_loss=8, but when I feed an image to the network to test it, the result is awful.
What should I do? Can somebody help me?
For this kind of task you will need a huge dataset and a very long training time if you don't have a supercomputer. Jokes apart, this is pretty difficult and needs careful analysis, because to a computer men and women have almost the same kinds of features, even though we humans can tell them apart at a glance; in the same way, it cannot easily tell a female dog from a male dog. Still, you should definitely try it. It is a very nice idea with a lot of applications. Good luck, and let me know if you make progress.
You can. The method to follow is:
Use SSD to extract the location of the object to be found (the face, here).
Get the relevant feature map at that location from conv5 (assuming you use VGG). For example, if you find your object at location (100, 100, 100, 100 - XYWH) within an input image of size (300, 300), cut the conv5 features at (12, 12, 12, 12 - XYWH). The math is (100 / 300) * 38 ≈ 12.7, truncated to 12, where 38 is the spatial size of the feature map.
You now have the activation features cut from conv5 (12 x 12 x 512), which are relevant only to the face whose gender you want to predict.
Flatten these feature activations and apply a DNN classifier to them (i.e. the classifier used for VGG).
Get a binary output stating either male or female.
Train your network by adding the gender loss to the global loss function.
Voila. You have the gender estimation network.
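The coordinate arithmetic in the second step can be sketched in a few lines. `box_to_feature_map` is a hypothetical helper, and truncating toward zero is an assumption made to reproduce the value of 12 used above.

```python
def box_to_feature_map(box_xywh, image_size, feature_map_size):
    """Scale an (x, y, w, h) box from image pixels to feature-map cells.

    Each value is multiplied by feature_map_size / image_size and
    truncated to an integer.
    """
    scale = feature_map_size / image_size
    return tuple(int(value * scale) for value in box_xywh)


# Example from the answer: a 300 x 300 input and a 38 x 38 feature map.
crop = box_to_feature_map((100, 100, 100, 100), image_size=300, feature_map_size=38)
print(crop)  # (12, 12, 12, 12)
```

Rounding instead of truncating would give 13 here, so whichever convention is chosen should be applied consistently during training and inference.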

Reduction axis 1 is empty in shape [x,0]

I'm trying to train on the SVHN (Street View House Numbers) dataset for object detection in TensorFlow (to do some basic OCR on numbers).
So far I have successfully followed the pet-training example from the TensorFlow object detection guides.
When I train the network based on the sample faster_rcnn_resnet101.config, after a few dozen steps I get:
INFO:tensorflow:Error reported to Coordinator:
<class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>,
Reduction axis 1 is empty in shape [3,0]
[[Node: Loss/RPNLoss/Match/cond/ArgMax_1 = ArgMax[T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"]
(Loss/RPNLoss/Match/cond/ArgMax/Switch:1,
Loss/RPNLoss/Match/cond/ArgMax_1/dimension)]]
I have no clue what to change or improve.
Has someone seen this before?
What is going wrong here?
Is it simply a wrong config-setting?
The only parameter I changed (besides path-info) is num_classes: 10 (10 digits)
Thanks for any hints.
My label-map looks like this:
item {
id: 0
name: 'none_of_the_above'
}
item {
id: 1
name: '1'
}
item {
id: 2
name: '2'
}
... with id: 10 being '0'
As suggested here: https://github.com/tensorflow/models/blob/master/object_detection/g3doc/running_pets.md
I used the pretrained COCO model faster_rcnn_resnet101 and also the config file that goes with it:
https://github.com/tensorflow/models/blob/master/object_detection/samples/configs/faster_rcnn_resnet101_pets.config
The only things I adapted are the paths and:
faster_rcnn {
num_classes: 11
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 64
max_dimension: 900
}
}
Because the images from SVHN are rather small, I adapted the dimensions here and removed all images smaller than 64px in height or width.
Until now I had not taken care to clear the training directory. I have tried that now, and the same error occurs.
I'm currently trying the pretrained Inception model; maybe that will work.
I'm not sure what caused your error but you shouldn't use the index 0 in your label map as it's the placeholder index. All indices should start at 1.
See: https://github.com/tensorflow/models/issues/1696
This happens in the ArgMaxMatcher class when the number of proposals is 0.
I made a PR that fixes the issue here: https://github.com/tensorflow/models/pull/1986
As the problem vanished once I set up the training correctly, I will post the solution here; others might be as "dumb" as me.
As proposed in the pets tutorial, I downloaded a pretrained model.
But I set the path of the training directory to the same directory that contained the downloaded pretrained model.
I think this caused the error.
You need to set the parameter pad_to_max_dimension to true in the keep_aspect_ratio_resizer section. It worked for me.
The image size of your dataset does not match the size expected by the model you used for training.
Ex: if you used the model mask_rcnn_inception_v2_coco_2018_01_28, then the image size of your dataset must be in the range [800, 1365].