I'm trying to train on the SVHN (Street View House Numbers) dataset for object detection in TensorFlow (to do some basic OCR on numbers).
So far I have successfully followed the pet-training example from the TensorFlow Object Detection guides.
When I train the network based on the sample faster_rcnn_resnet101.config, after a few dozen steps I get:
INFO:tensorflow:Error reported to Coordinator:
<class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>,
Reduction axis 1 is empty in shape [3,0]
[[Node: Loss/RPNLoss/Match/cond/ArgMax_1 = ArgMax[T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"]
(Loss/RPNLoss/Match/cond/ArgMax/Switch:1,
Loss/RPNLoss/Match/cond/ArgMax_1/dimension)]]
I have no clue what to change or improve.
Has someone seen this before?
What is going wrong here?
Is it simply a wrong config-setting?
The only parameter I changed (besides path info) is num_classes: 10 (for the 10 digits).
Thanks for any hints.
My label-map looks like this:
item {
  id: 0
  name: 'none_of_the_above'
}
item {
  id: 1
  name: '1'
}
item {
  id: 2
  name: '2'
}
... with id: 10 being '0'
As suggested here: https://github.com/tensorflow/models/blob/master/object_detection/g3doc/running_pets.md
I used the pretrained COCO-model faster_rcnn_resnet101 and also the config-file from that:
https://github.com/tensorflow/models/blob/master/object_detection/samples/configs/faster_rcnn_resnet101_pets.config
The only things I adapted are the paths and:
faster_rcnn {
  num_classes: 11
  image_resizer {
    keep_aspect_ratio_resizer {
      min_dimension: 64
      max_dimension: 900
    }
  }
}
Because the images from SVHN are rather small, I adapted the dimensions here and removed all images smaller than 64 px in height or width.
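For reference, here is a minimal sketch of that filtering step, assuming the cropped images sit in a local folder (the folder name and the use of Pillow are my assumptions, not from the original post):

import os
from PIL import Image

SRC_DIR = "svhn_images"  # hypothetical folder holding the SVHN crops

# Delete every image smaller than 64 px in height or width.
for fname in os.listdir(SRC_DIR):
    path = os.path.join(SRC_DIR, fname)
    with Image.open(path) as img:
        too_small = img.width < 64 or img.height < 64
    if too_small:
        os.remove(path)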
Until now I hadn't paid attention to clearing the training directory, but I tried that just now and the same error occurs.
I'm currently trying the pretrained Inception model; maybe that will work out.
I'm not sure what caused your error, but you shouldn't use index 0 in your label map, as it's the placeholder index. All indices should start at 1.
See: https://github.com/tensorflow/models/issues/1696
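For illustration, a digit label map starting at id 1 (keeping the asker's convention of mapping the digit '0' to id 10) could look like this; a sketch, not an official SVHN map:

item {
  id: 1
  name: '1'
}
item {
  id: 2
  name: '2'
}
# ... items for '3' through '9' ...
item {
  id: 10
  name: '0'
}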
This happens in the ArgMaxMatcher class when the number of proposals is 0.
I made a PR that fixes the issue here: https://github.com/tensorflow/models/pull/1986
Since the problem vanished once I did the training correctly, I will post this solution ... others might make the same mistake I did.
As proposed in the pets tutorial, I downloaded a pretrained model.
But I set the training directory to the same directory where the downloaded pretrained model was.
I think this caused the error.
You need to set the parameter pad_to_max_dimension: true in the keep_aspect_ratio_resizer section. It worked for me.
The image sizes in your dataset do not match the input size of the model you used for training.
For example, if you used the model mask_rcnn_inception_v2_coco_2018_01_28, the image sizes in your dataset must be in the range [800, 1365].
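A sketch of such a resizer block (the min/max values here just mirror the range mentioned above; adjust them to your model's config):

image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 800
    max_dimension: 1365
    pad_to_max_dimension: true
  }
}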
Related
I am a beginner in convolutional deep learning. I saw the following architecture in the paper Simultaneous Feature Learning and Hash Coding with Deep Neural Networks: for images of size 256×256,
I do not understand the output size of the first 2D convolution: 96×54×54. 96 seems fine, as the number of filters is 96. But if we apply the formula for the output size, size = [(W − K + 2P)/S] + 1 = [(256 − 11 + 2·0)/4] + 1 = 62.25 ≈ 62. I have assumed the padding P to be 0, as it is not mentioned anywhere in the paper. The Keras Conv2D API produces the same 96×62×62 output. So why does the paper say 96×54×54? What am I missing?
Well, it reminds me of the AlexNet paper, where there was a similar mistake. Your calculation is correct. I think they mistakenly wrote 256×256 instead of 224×224, in which case the calculation for the input layer is
[(224 − 11 + 2·0)/4] + 1 = 54.25 ≈ 54
It's highly possible that the authors mistakenly wrote 256×256 while the real architecture input size was 224×224 (as was the case in AlexNet too). The other, less likely option is that 256×256 really was the architecture's input size but they did the calculations for 224×224; that would be such a silly mistake that I don't consider it an option.
Thus, I believe the true input size was 224×224 instead of 256×256.
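A quick way to verify this is to build the layer in Keras and inspect the output shape; a minimal sketch, where the 96 filters, 11×11 kernel, and stride 4 come from the question and the 224×224 input is the corrected size:

import tensorflow as tf

# First convolution under discussion: 96 filters, 11x11 kernel,
# stride 4, no padding ('valid').
x = tf.keras.Input(shape=(224, 224, 3))
y = tf.keras.layers.Conv2D(96, kernel_size=11, strides=4, padding="valid")(x)
print(y.shape)  # (None, 54, 54, 96) -- matches the paper's 96x54x54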
I trained my own model with darkflow YOLOv2 for just one class, and the results are pretty good when running this in the terminal with a threshold of 0.55:
python3 flow --model cfg/yolov2-tiny-voc-1c.cfg --load 5250 --demo BARCELONA_WALK.mp4
but then I converted the checkpoint to .pb and .meta files to use in code,
and when I specify the threshold in the code like this:
options = {"model": "cfg/yolov2-tiny-voc-1c.cfg",
"pbload": "built_graph/yolov2-tiny-voc-1c.pb",
"metaload": "built_graph/yolov2-tiny-voc-1c.meta",
"threshold": 0.55,
"gpu": 0.9}
it detects nothing in my image samples. But when the threshold is 0.5 or lower, it detects around 280 objects, of which around 190 have confidence greater than 0.5. So why is the neural network not behaving the same way in code as in the terminal demo, if I'm using the same weights and the same threshold?
SOLVED!!! In my options I had to put "pbLoad" and "metaLoad" instead of "pbload" and "metaload". Too bad it didn't throw any errors, but anyway, I realized it might be the uppercase letters when reading this post. I hope it helps someone in the future!!
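For anyone landing here later, a corrected sketch of the options (the sample image path is hypothetical; TFNet and return_predict are darkflow's usual entry points):

from darkflow.net.build import TFNet
import cv2

options = {"model": "cfg/yolov2-tiny-voc-1c.cfg",
           "pbLoad": "built_graph/yolov2-tiny-voc-1c.pb",     # note the capital L
           "metaLoad": "built_graph/yolov2-tiny-voc-1c.meta", # note the capital L
           "threshold": 0.55,
           "gpu": 0.9}

tfnet = TFNet(options)
img = cv2.imread("sample.jpg")  # hypothetical test image
results = tfnet.return_predict(img)
print(len(results), "detections above the 0.55 threshold")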
*** Please note, my previous problem of detecting withouthelmet as NA is resolved.
Now I have a new issue: I used 1000 images of people not wearing helmets, 1000 images of people wearing helmets, and 1000 images of people alone. I used the ssd_mobilenet_v1_pets.config file for training.
Here is my pbtxt file
item {
  id: 1
  name: 'withouthelmet'
}
item {
  id: 2
  name: 'withhelmet'
}
item {
  id: 3
  name: 'person'
}
[sample training image]
After the training, my model detects every car as a person.
Is that because of the ssd_mobilenet model I started from (its id 1 is person, but I used id 1 for withouthelmet; its id 3 is car, but I used id 3 for person)?
Please help me solve this problem.
Have you set num_classes to 1 in your config?
Please note that min_negatives_per_image means the minimum number of negative anchors (not images), so your data mix has nothing to do with this parameter.
I had to modify my earlier answer: if you add background images (images with no ground-truth boxes) to the dataset, it should help reduce false positives. Sorry, I got confused with some other stuff.
Have you used the pre-trained SSD-MobileNetV1 model trained on the pets dataset?
I think you'd do better to use a model trained on the COCO dataset, since it contains persons, in contrast to pets.
Of course, if you train your model it will learn to detect persons as well, but since you don't have many examples of persons without a helmet, it would probably be better to start with a model that already knows what a person is.
Regarding your questions: if you only want to detect people without helmets, you can simply drop everything else from the pbtxt file and only put
item {
  id: 1
  name: 'withouthelmet'
  display_name: 'withouthelmet'
}
change num_classes in the config file to 1, and fine-tune the model.
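The relevant change would look something like this (a sketch; the surrounding structure follows the standard ssd_mobilenet_v1 sample configs):

model {
  ssd {
    num_classes: 1
    # ... rest of the ssd_mobilenet_v1 settings unchanged ...
  }
}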
I want to see more evaluation steps in TensorBoard while I'm training and evaluating my object detection model (standard code from TensorFlow object detection).
Here you can see what I mean by number of evaluation steps. As you can see, it's fixed at 10 visualizations.
I can't find where to change and increase this parameter. Moreover, these visualizations are random, not the last 10.
Is it possible to set a different number of visualizations?
And what can I do to see the last N evaluations instead of N random ones?
Thank you in advance.
Added: [image from the link]
I assume you're using this code:
https://github.com/tensorflow/models/tree/master/research/object_detection
(you should include that link in future questions to clarify, and if that assumption is wrong you should edit your question to specify what code you're using)
If you look at the bottom of the trainer.py code, they have:
slim.learning.train(
    train_tensor,
    logdir=train_dir,
    master=master,
    is_chief=is_chief,
    session_config=session_config,
    startup_delay_steps=train_config.startup_delay_steps,
    init_fn=init_fn,
    summary_op=summary_op,
    number_of_steps=(
        train_config.num_steps if train_config.num_steps else None),
    save_summaries_secs=120,
    sync_optimizer=sync_optimizer,
    saver=saver)
It looks like they've hard-coded save_summaries_secs=120 to save a summary every 120 seconds. That's what you want to edit to change the TensorBoard summary update period.
Edit: I've added the image to the question to help clarify. I believe the answer is that tf.summary.image has a max_outputs parameter which controls how many images from the batch are shown. To choose a specific subset of images, you should simply write your own code to select them however you see fit, randomly or in some order, and then pass that new set of images to tf.summary.image.
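For illustration, a minimal TF1-style sketch (the summary name, placeholder shape, and max_outputs value are mine, not from the object_detection code):

import tensorflow as tf

# Batch of images to visualize; shape [batch, height, width, channels].
images = tf.placeholder(tf.float32, shape=[None, 300, 300, 3])

# max_outputs (default 3) caps how many images from the batch
# appear in TensorBoard under this summary tag.
image_summary = tf.summary.image("eval_images", images, max_outputs=20)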
You may want to consider looking at the eval_config section of the model config file.
eval_config: {
  num_examples: 100
  num_visualizations: 50
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  # max_evals: 10
}
I'm guessing that max_evals is what you're looking for.
I am trying to run the entire CIFAR-10 pipeline as-is, with data from SVHN.
http://ufldl.stanford.edu/housenumbers/
I formatted the data in exactly the same format as the bin file from Alex Krizhevsky's website.
http://www.cs.toronto.edu/~kriz/cifar.html
I did not edit the code other than changing a few variable names to make it work in another directory. It now gives me an error.
W tensorflow/core/common_runtime/executor.cc:1076] 0x218fec0 Compute status: Invalid argument: Indices are not valid (out of bounds). Shape: dim { size: 128 } dim { size: 10 }
[[Node: SparseToDense = SparseToDense[T=DT_FLOAT, Tindices=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](concat, SparseToDense/output_shape, SparseToDense/sparse_values, SparseToDense/default_value)]]
Specifically, the line that fails in cifar.py is:
dense_labels = tf.sparse_to_dense(concated, [FLAGS.batch_size, NUM_CLASSES], 1.0, 0.0)
I have checked this solution too; it does not work:
TensorFlow Indices are not valid (out of bounds)
Does anyone have any idea how to make it work?
I realized the mistake. The SVHN dataset gives the digit 0 a label value of 10 instead of 0. I made this fatal assumption from the start, and it wasted a lot of my time.
Given 10 classes, the labels should range from 0 to 9 inclusive. The error happened because the labels ranged from 1 to 10.
http://ufldl.stanford.edu/housenumbers/
Do remember to read the dataset overview in the future!
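A minimal sketch of the fix, assuming the labels are already loaded into a NumPy array:

import numpy as np

# SVHN stores the digit '0' as label 10, so raw labels span 1..10.
labels = np.array([10, 1, 2, 10, 5])

# Map 10 -> 0 and leave 1..9 unchanged, giving the 0..9 range the code expects.
labels = labels % 10
print(labels)  # [0 1 2 0 5]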