I trained my own single-class model with darkflow (YOLOv2), and the results are pretty good when I run it from the terminal with the threshold configured to 0.55:
python3 flow --model cfg/yolov2-tiny-voc-1c.cfg --load 5250 --demo BARCELONA_WALK.mp4
Then I converted the checkpoint to .pb and .meta files to use from code,
and when I specify the threshold in the code like this
options = {"model": "cfg/yolov2-tiny-voc-1c.cfg",
"pbload": "built_graph/yolov2-tiny-voc-1c.pb",
"metaload": "built_graph/yolov2-tiny-voc-1c.meta",
"threshold": 0.55,
"gpu": 0.9}
it detects nothing in my sample images. When the threshold is 0.5 or lower, it detects about 280 objects, of which about 190 have a confidence greater than 0.5. Why does the network behave differently from code than in the terminal demo if I'm using the same weights and the same threshold?
SOLVED! In my options I had to use "pbLoad" and "metaLoad" instead of "pbload" and "metaload". Too bad it didn't throw any errors. I realized it might be the uppercase letters when reading this post. I hope this helps someone in the future!
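Since the wrong-case keys were accepted silently, a small check before building the network can catch this kind of typo. The sketch below is not part of darkflow: `find_miscased_keys` is a hypothetical helper, and `KNOWN_KEYS` only lists the option names that appear in this thread, not everything darkflow accepts.

```python
# Option names used in this thread (an assumption: darkflow accepts more than these).
KNOWN_KEYS = {"model", "pbLoad", "metaLoad", "threshold", "gpu"}


def find_miscased_keys(options, known=KNOWN_KEYS):
    """Return {given_key: expected_key} for keys that match a known key except for case."""
    by_lower = {key.lower(): key for key in known}
    return {
        key: by_lower[key.lower()]
        for key in options
        if key not in known and key.lower() in by_lower
    }


options = {
    "model": "cfg/yolov2-tiny-voc-1c.cfg",
    "pbload": "built_graph/yolov2-tiny-voc-1c.pb",      # wrong case
    "metaload": "built_graph/yolov2-tiny-voc-1c.meta",  # wrong case
    "threshold": 0.55,
    "gpu": 0.9,
}

for given, expected in find_miscased_keys(options).items():
    print(f'option "{given}" should probably be "{expected}"')
```

Running the check on the options from the question flags both "pbload" and "metaload".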
I built a digital scale reader using Darknet's YOLOv4-Tiny. It tends to confuse 2s and 5s, which leads me to believe that I am doing some unwanted data augmentation during training. (The results are mostly correct, and glare could be a factor, but I was expecting better results.)
I have referenced this post:
Understanding darknet's yolo.cfg config files
and the darknet github:
https://github.com/AlexeyAB/darknet/wiki/CFG-Parameters-in-the-%5Bnet%5D-section
Below is a link to the yolov4-tiny.cfg that I modified for my model:
https://github.com/AlexeyAB/darknet/blob/master/cfg/yolov4-tiny.cfg
And a snippet from the link above:
[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=1
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
Am I correct that angle=0 means that there is no rotation?
Are there any other possible ways I might be augmenting my data that could cause an issue?
Edit: If I wanted, how could I eliminate all data augmentation?
Or do I just need more data (currently 2484 images for 10 digit classes)?
Horizontal flip is applied by default; add "flip=0" to disable it.
https://github.com/AlexeyAB/darknet
Here, angle, saturation, exposure, and hue are all data augmentation parameters. You can turn them off by setting angle and hue to 0; saturation and exposure act as scale factors, so as far as I can tell their neutral value is 1 rather than 0. Because these values are hyperparameters, changing them can make accuracy either better or worse. I suggest keeping the values darknet provides, since they were found to give good accuracy. Darknet applies these four augmentations to the images it generates during training. Adding more images, as long as they are not duplicates, always helps a deep learning model learn the necessary complexity during training.
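To experiment with augmentation switched off, the [net] section of the cfg can be rewritten with a small script. This is only a sketch: `disable_augmentation` is a hypothetical helper, and the neutral values in `NEUTRAL` are an assumption (0 for angle, hue, and flip; 1 for saturation and exposure, on the understanding that those two are scale factors), so check them against the darknet wiki before relying on them.

```python
# Assumed "no augmentation" values; verify against the darknet documentation.
NEUTRAL = {"angle": "0", "saturation": "1", "exposure": "1", "hue": "0", "flip": "0"}


def disable_augmentation(cfg_text, neutral=NEUTRAL):
    """Return cfg_text with the [net] augmentation keys set to their neutral values."""
    out, seen, in_net = [], set(), False
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            if in_net:
                # Leaving [net]: append any key that was not present (e.g. flip).
                out.extend(f"{k}={v}" for k, v in neutral.items() if k not in seen)
            in_net = stripped == "[net]"
        elif in_net and "=" in stripped and not stripped.startswith("#"):
            key = stripped.split("=", 1)[0].strip()
            if key in neutral:
                line = f"{key}={neutral[key]}"
                seen.add(key)
        out.append(line)
    if in_net:
        # The file ended while still inside [net].
        out.extend(f"{k}={v}" for k, v in neutral.items() if k not in seen)
    return "\n".join(out)


sample_cfg = """[net]
batch=64
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

[convolutional]
filters=32"""

print(disable_augmentation(sample_cfg))
```

Only keys inside [net] are touched, so layer sections keep their original settings.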
I need to classify small images into 4 different categories, plus 1 "background" category for false detections.
During training the loss quickly drops to 0.7 but stays there even after 800k steps. In the end, the frozen graph seems to classify most images with the background label.
I'm probably missing something. I'll detail the steps I used below; any feedback is welcome.
I'm new to tf-slim, so it may be an obvious mistake. Maybe too few samples? I'm not looking for top accuracy, just something that works for prototyping.
Source materials can be found here: https://www.dropbox.com/s/k55xoygdzb2efag/TilesDataset.zip?dl=0
I used tensorflow-gpu 1.15.3 on windows 10.
I created the dataset using :
python ./createTfRecords.py --tfrecord_filename=tilesV2_40 --dataset_dir=.\tilesV2\Tiles_40
I added a dataset provider in models-master\research\slim\datasets based on the flowers provider.
I modified mobilenet_v2.py in models-master\research\slim\nets\mobilenet, changing num_classes=5 and mobilenet.default_image_size = 40.
I trained the net with:
python ./models-master/research/slim/train_image_classifier.py --model_name "mobilenet_v2" --learning_rate 0.045 --preprocessing_name "inception_v2" --label_smoothing 0.1 --moving_average_decay 0.9999 --batch_size 96 --learning_rate_decay_factor 0.98 --num_epochs_per_decay 2.5 --train_dir ./weight --dataset_name Tiles_40 --dataset_dir .\tilesV2\Tiles_40
When I run
python .\models-master\research\slim\eval_image_classifier.py --alsologtostderr --checkpoint_path ./weight/model.ckpt-XXX --dataset_dir ./tilesV2/Tiles_40 --dataset_name Tiles_40 --dataset_split_name validation --model_name mobilenet_v2
I get eval/Recall_5[1] and eval/Accuracy[1].
I then export the graph with
python .\models-master\research\slim\export_inference_graph.py --alsologtostderr --model_name mobilenet_v2 --image_size 40 --output_file .\export\output.pb --dataset_name Tiles_40
and freeze it with
freeze_graph --input_graph .\export\output.pb --input_checkpoint .\weight\model.ckpt-XXX --input_binary true --output_graph .\export\frozen.pb --output_node_names MobilenetV2/Predictions/Reshape_1
I then try the net on images from the dataset with
python .\label_image.py --graph .\export\frozen.pb --labels .\tilesV2\Tiles_40\labels.txt --image .\tilesV2\Tiles_40\photos\lac\1_1.png --input_layer input --output_layer MobilenetV2/Predictions/Reshape_1
This is where I get wrong classifications, such as
0:background 0.92839915 2:lac 0.020171663 1:house 0.019106707 3:road 0.01677236 4:start 0.0155500565
for a "lac" image from the dataset.
I tried changing the depth_multiplier and the learning rate, training on a CPU, and removing --preprocessing_name "inception_v2" from the training command. I have no ideas left...
Change your learning rate, maybe start from the usual choice of 3e-5.
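To see why the learning rate matters over a run this long, here is a rough sketch of how the rate would evolve under the flags from the question. It assumes the training script applies a staircase exponential decay driven by --learning_rate, --learning_rate_decay_factor, --num_epochs_per_decay, and --batch_size; that is my reading of the flags, not something confirmed in the question. `NUM_SAMPLES` is a made-up dataset size, since the real one is not given.

```python
def decayed_learning_rate(step, initial_lr, decay_factor, num_samples,
                          batch_size, epochs_per_decay):
    """Staircase exponential decay: the rate is multiplied by decay_factor
    once every `decay_steps` training steps."""
    decay_steps = int(num_samples * epochs_per_decay / batch_size)
    return initial_lr * decay_factor ** (step // decay_steps)


NUM_SAMPLES = 2000  # hypothetical dataset size, not taken from the question

for step in (0, 10_000, 100_000, 800_000):
    lr = decayed_learning_rate(
        step,
        initial_lr=0.045,
        decay_factor=0.98,
        num_samples=NUM_SAMPLES,
        batch_size=96,
        epochs_per_decay=2.5,
    )
    print(f"step {step:>7}: learning rate ~ {lr:.3e}")
```

With a small dataset the decay interval is only a few dozen steps, so under these assumptions the rate shrinks to practically zero long before 800k steps, and additional steps would change very little.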
I am using DeepLabv3+ and I am running some tests. For my first run I used an output_stride=16 and atrous_rates=[6, 12, 18] and in the 2nd run I used output_stride=8 and atrous_rates=[12,24, 36]. Then I used tensorboard to see the results and I could notice that the heatmaps look larger and one "unit" is 4x bigger than the run with output_stride=16.
output_stride=16
output_stride=8
I would like to know the reason behind this behaviour and its consequences for my mIOU metric.
regards
According to the paper Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation (Section 3.1, DeeplabV3+ as an encoder), output_stride is simply the ratio between the input image size and the output feature map size (before global pooling). So changing output_stride changes the output.
This is just copied from the link.
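The ratio described in the answer can be worked through with a few lines of arithmetic. This is only an illustration: `feature_map_side` is a hypothetical helper, the 512-pixel input is a made-up size, and the function assumes the input side is an exact multiple of the stride.

```python
def feature_map_side(input_side, output_stride):
    """Side length of the encoder feature map for a square input image."""
    if input_side % output_stride:
        raise ValueError("this sketch assumes input_side is a multiple of output_stride")
    return input_side // output_stride


INPUT_SIDE = 512  # hypothetical input resolution

for stride in (16, 8):
    side = feature_map_side(INPUT_SIDE, stride)
    print(f"output_stride={stride}: {side}x{side} feature map, {side * side} cells")
```

Halving the stride doubles each side of the feature map, so the number of cells differs by a factor of 4 between the two runs.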
I have been comparing the performance of source-built and Google-provided .whl files for tensorflow-gpu. I have run dozens of benchmark tests, and I always see slow performance on every n x 100 step (0, 100, 200, ...). I cannot figure out the reason. Can any TensorFlow expert explain this?
I am running Ubuntu (18.04), Fedora (27, 28), and Windows, with CUDA 9.0/9.1/9.2.
I've tested with tf1.6, 1.7, 1.8, 1.9.
My GPU is 1080ti/11GB.
My cpu is intel 4690k with 32G dram.
One sample is attached.
Thank you very much in advance.
Dae-Chul Jo
dcjo00#gmail.com
It could be for some different reasons:
Every 100 steps you are saving the model
Every 100 steps you are testing validation data
Every 100 steps you are saving logs to tensorboard
These are my first guesses in order of probability, if you provide code I could study it more deeply.
Hope it helps! :)
EDIT: it ended up being:
tf.train.MonitoredTrainingSession has a default of saving summaries every 100 steps. Which was proposal 3.
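Before digging into the training code, it can help to confirm that the slow steps really are periodic. The sketch below does not use TensorFlow; `slow_steps` and `common_gap` are hypothetical helpers, and the timings are synthetic numbers made up to mimic the pattern described in the question.

```python
from statistics import median


def slow_steps(durations, factor=3.0):
    """Indices of steps that took more than `factor` times the median step time."""
    typical = median(durations)
    return [i for i, d in enumerate(durations) if d > factor * typical]


def common_gap(indices):
    """Return the spacing between consecutive indices if it is constant, else None."""
    gaps = {b - a for a, b in zip(indices, indices[1:])}
    return gaps.pop() if len(gaps) == 1 else None


# Synthetic timings: every 100th step is ten times slower than the rest.
durations = [0.5 if i % 100 == 0 else 0.05 for i in range(500)]

spikes = slow_steps(durations)
print("slow steps:", spikes)
print("period:", common_gap(spikes))
```

A constant period that matches a logging, checkpoint, or summary interval points at that hook rather than at the build itself.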
I want to see more evaluation steps in TensorBoard while I'm training and evaluating my object detection model (the standard code in the TensorFlow object detection repository).
Here you can see what I mean by the number of evaluation steps. As you can see, it's fixed at 10 visualizations.
I can't find where to change and increase this parameter. Moreover, these visualizations are random, not the last 10.
Is it possible to set a different number of visualizations?
And what can I do to see the last N evaluations instead of N random ones?
Thank you in advance.
Added: Image from link:
I assume you're using this code:
https://github.com/tensorflow/models/tree/master/research/object_detection
(you should include that link to clarify in future questions, and if that assumption is wrong you should edit your question to specify what code you're using)
If you look at the trainer.py code at the bottom they have:
slim.learning.train(
    train_tensor,
    logdir=train_dir,
    master=master,
    is_chief=is_chief,
    session_config=session_config,
    startup_delay_steps=train_config.startup_delay_steps,
    init_fn=init_fn,
    summary_op=summary_op,
    number_of_steps=(
        train_config.num_steps if train_config.num_steps else None),
    save_summaries_secs=120,
    sync_optimizer=sync_optimizer,
    saver=saver)
It looks like they've hard-coded save_summaries_secs=120 to save a summary every 120 seconds. That's what you want to edit to change the TensorBoard summary update period.
Edit: I've added the image to the question to help clarify. I believe the answer is in tf.summary.image, which has a max_outputs argument that controls how many images from the batch are written. To choose a specific subset of images, write your own code to select them however you see fit (randomly or in some order), then pass that new set of images to tf.summary.image.
You may want to consider looking at the eval_config section of the model config file.
eval_config: {
  num_examples: 100
  num_visualizations: 50
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  #max_evals: 10
}
I'm guessing that max_evals is what you're looking for.