How to train tensorflow object detection image segmentation mask_rcnn_inception_resnet_v2_atrous_coco instance segmentation on my own dataset - tensorflow

Please help me train the mask_rcnn_inception_resnet_v2_atrous_coco model on my own dataset.
https://github.com/tensorflow/models/tree/master/research/object_detection
Model: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
I have referred to https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/instance_segmentation.md, but I can't clearly understand the steps.
Do we have to give the bounding box coordinates of the object along with the mask.png file?
How do I convert the mask data to TFRecord files (for instance segmentation)?
Can anyone suggest a labelling tool that produces the bounding box as well as the mask.png file?
Tools like LabelBox, labelme, and labelimg give either the bounding box coordinates, the mask.png file, or the polygon coordinates for the object.
Please help.

The best approach is to provide PNG masks together with XML annotations; this should work with create_pet_tf_record.py if you set faces_only to false in that file. You can read the code to see exactly what input it expects.
Then change the paths in the pipeline configuration so that they point to your directories.

Do we have to give the Bounding box coordinates of the object along with the mask.png file?
Answer: Yes, you need the original images, bounding box files, and mask images.
Use the following tool to annotate each object in your original images: Label image.
Once you're done with this, you need to annotate each pixel inside each bounding box. There are several tools you can use for this, for example VGG annotator.
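If your labelling tool only exports masks, a tight bounding box can be derived from each mask instead of being drawn by hand. Below is a minimal sketch of that idea in plain Python, assuming one binary mask per object stored as a list of rows; mask_to_bbox and example_mask are hypothetical names, not part of the Object Detection API.

```python
def mask_to_bbox(mask):
    """Return (xmin, ymin, xmax, ymax) of the nonzero pixels in a 2D mask.

    `mask` is a list of rows; any nonzero value counts as an object pixel.
    Returns None when the mask contains no object pixels.
    """
    # Row indices that contain at least one object pixel.
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    # Column indices of every object pixel.
    cols = [x for row in mask for x, value in enumerate(row) if value]
    return (min(cols), min(rows), max(cols), max(rows))


# Hypothetical 4x5 mask containing a single object.
example_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(mask_to_bbox(example_mask))  # (1, 1, 3, 2)
```

The coordinates are pixel indices; whether your conversion script wants them inclusive, exclusive, or normalized is something to check in that script.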

Related

What format does YOLOv2 txt file data use?

My YOLOv2Tiny model is drawing bounding boxes in the correct place, but they are consistently smaller than the object. This got me wondering whether I annotated the images correctly.
Does YOLOv2 use the Pascal VOC .xml format or the same YOLO .txt format that YOLOv4 uses?

Change image labels .xml after resizing actual images sizes

What I am doing: I've collected images for a TensorFlow Object Detection API retraining job and labeled them using the labelImg application. I then resized the collected images to reduce the training time.
I guess the labels generated for the originally collected images no longer correspond to the resized images. Is there a script that can update the previously generated labels to match the resized images? Thank you!
Usually one converts the XML files generated by labelImg into a single CSV, and this CSV is then converted into a TFRecord file which contains both the images and the annotations. During this conversion the coordinates are stored as relative values (fractions of the image width/height), so you don't need to recalculate them. I guess this is your case too.
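To see why relative coordinates survive a resize, here is a small sketch. It assumes the pixel coordinates are divided by the width/height of the image they were drawn on (the size recorded in the XML); if your conversion script divides by the size of the resized file instead, the result will be wrong, as the last line shows. The function names and numbers are made up for illustration.

```python
def normalize_box(box, width, height):
    """Convert a pixel box (xmin, ymin, xmax, ymax) to fractions of the image size."""
    xmin, ymin, xmax, ymax = box
    return (xmin / width, ymin / height, xmax / width, ymax / height)


def scale_box(box, sx, sy):
    """Scale a pixel box the same way the image was resized."""
    xmin, ymin, xmax, ymax = box
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)


# Hypothetical annotation drawn on a 1000x800 image.
original_box = (100, 200, 500, 600)
relative = normalize_box(original_box, 1000, 800)

# The same box after the image is resized to 500x400.
resized_box = scale_box(original_box, 0.5, 0.5)
relative_after_resize = normalize_box(resized_box, 500, 400)

print(relative == relative_after_resize)  # True: nothing to recalculate

# Pitfall: old pixel coordinates divided by the NEW image size.
mismatched = normalize_box(original_box, 500, 400)
print(mismatched)  # (0.2, 0.5, 1.0, 1.5): box no longer fits the image
```

So it is worth checking which width and height your CSV-to-TFRecord script actually uses before relying on this.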

Should I include negative examples for Tensorflow object detection API?

I am building a RCNN detection network using Tensorflow's object detection API.
My goal is to detect bounding boxes for animals in outdoor videos. Most frames do not have animals and are just of dynamic backgrounds.
Most tutorials focus on training custom labels, but make no mention of negative training samples. How does this class of detectors deal with images which do not contain objects of interest? Does it just output a low probability, or will it be forced to try to draw a bounding box within the image?
My current plan is to use traditional background subtraction in opencv to generate potential frames and pass them to a trained network. Should I also include a class of 'background' bounding boxes as 'negative data'?
The final option would be to use opencv for background subtraction, RCNN to generate bounding boxes, then a classification model of crops to identify animals versus background.
In general it's not necessary to explicitly include "negative images". What happens in these detection models is that they use the parts of the image that don't belong to the annotated objects as negatives.
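As a rough illustration of how image regions away from the annotations become negatives, here is a sketch that labels candidate boxes by their overlap (IoU) with the ground truth. The 0.7 and 0.3 thresholds and the function names are assumptions for the example; the actual matching in your pipeline depends on the model configuration.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_candidates(candidates, ground_truth, pos_thresh=0.7, neg_thresh=0.3):
    """Label each candidate box by its best overlap with any ground-truth box."""
    labels = []
    for cand in candidates:
        # With no ground-truth boxes the best overlap is 0, so everything is negative.
        best = max((iou(cand, gt) for gt in ground_truth), default=0.0)
        if best >= pos_thresh:
            labels.append("positive")
        elif best < neg_thresh:
            labels.append("negative")
        else:
            labels.append("ignored")  # ambiguous overlap, not used for training
    return labels


ground_truth = [(10, 10, 50, 50)]
candidates = [(10, 10, 50, 50), (100, 100, 140, 140), (30, 10, 70, 50)]
print(label_candidates(candidates, ground_truth))  # ['positive', 'negative', 'ignored']
print(label_candidates(candidates, []))            # ['negative', 'negative', 'negative']
```

The second call shows what an image without any annotated object contributes under this scheme: every candidate region counts as background.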
If you expect your model to differentiate between "found a figure" and "no figure", then you will almost certainly need to train it on negative examples. Label these as "no image". In the "no image" case, yes, use the entire image as the bounding box; don't suggest that the model recognize anything smaller.
In "no image" cases, you may get a smaller bounding box, but that doesn't matter: in inference, you'll simply ignore whatever box is returned for "no image".
Of course, the critical issue here is to try it out, and see how well it works for you.
I have found success by scanning my ground truth, copying the box areas plus a margin, then pasting tilings of those box areas onto new background images (guaranteed to have no objects), and creating corresponding XML files with the box category assertions.
I collect non-objects as "uncategorised" boxes - usually from glitches in the output from my latest model. These are tiled (just like the "is-objects") but are not updated in the XML files.
I produce tilings at various scales to build each new training set.
A further explanation and sample python code is here:
https://github.com/brentcroft/ground-truth-productions
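The tiling step can be sketched roughly as follows. This is not the code from the linked repository, only an illustration under my own assumptions: each crop already includes the margin, crops are laid out on a simple grid starting at the top-left corner, and tile_boxes is a hypothetical helper that returns the object boxes you would write into the XML.

```python
def tile_boxes(crop_w, crop_h, bg_w, bg_h, margin=0):
    """Lay crops of size crop_w x crop_h on a grid over a background image.

    Each crop is assumed to include `margin` pixels of context on every side,
    so the returned object box is the crop shrunk by that margin.
    Returns a list of (xmin, ymin, xmax, ymax) boxes in background coordinates.
    """
    boxes = []
    y = 0
    while y + crop_h <= bg_h:
        x = 0
        while x + crop_w <= bg_w:
            boxes.append((x + margin, y + margin,
                          x + crop_w - margin, y + crop_h - margin))
            x += crop_w
        y += crop_h
    return boxes


# A 40x30 crop (5 px margin) tiled over a 100x60 background gives a 2x2 grid.
for box in tile_boxes(40, 30, 100, 60, margin=5):
    print(box)
```

For the "uncategorised" tiles you would paste the crops in the same way but skip writing the boxes to the XML file.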

Nvidia digits object detection own dataset

According to the information on the Nvidia website, DIGITS uses datasets in KITTI format. Is there a possibility, in DIGITS or in an external application, to prepare such a dataset, or will I have to write it on my own?
I would like to simply draw bounding boxes on the displayed image and then have them converted to an appropriate txt file.
Thanks in advance!
Yep, you can use one of the available tools for bounding box annotation, e.g. RectLabel, save the annotations in Pascal VOC format, and then transform them to KITTI using one of the freely available converters, e.g. VOD Converter.
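If you would rather see what such a conversion involves, here is a minimal sketch that turns the objects in a Pascal VOC XML annotation into KITTI label lines. It assumes that the fields a 2D annotation cannot supply (truncation, occlusion, alpha, 3D dimensions, location, rotation) are simply zero-filled placeholders and that class names contain no spaces; voc_to_kitti_lines and the sample XML are made up for the example.

```python
import xml.etree.ElementTree as ET


def voc_to_kitti_lines(voc_xml):
    """Convert a Pascal VOC annotation (XML string) to KITTI label lines.

    KITTI label layout (15 fields): type, truncated, occluded, alpha,
    bbox left/top/right/bottom, 3D dimensions (3), 3D location (3), rotation_y.
    Only type and bbox come from the VOC file; the rest are zero placeholders.
    """
    root = ET.fromstring(voc_xml)
    lines = []
    for obj in root.findall("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        coords = [float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        fields = [name, "0.0", "0", "0.0"]
        fields += [f"{value:.2f}" for value in coords]
        fields += ["0.0"] * 7  # dimensions, location, rotation_y
        lines.append(" ".join(fields))
    return lines


# Hypothetical annotation with a single object.
sample_xml = """
<annotation>
  <filename>img001.jpg</filename>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>
"""
print("\n".join(voc_to_kitti_lines(sample_xml)))
# dog 0.0 0 0.0 48.00 240.00 195.00 371.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0
```

You would write one such txt file per image, with the same base name as the image.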

Saving "heavy" figure to PDF in MATLAB - rendering problem

I generate a figure in MATLAB with a large number of elements (100000+) and want to save it to a PDF file. With the zbuffer or painters renderer I get a very large file (over 4 Mb) that is slow to open; all points are in vector format. Using the OpenGL renderer rasterizes the figure in the PDF, which is OK for the plot but not good for the text labels. The file size is then about 150 Kb.
Try this simplified code, for example:
% 100000 noisy points plotted as dots
x=linspace(1,10,100000);
y=sin(x)+randn(size(x));
plot(x,y,'.')
% save the same figure with each renderer for comparison
set(gcf,'Renderer','zbuffer')
print -dpdf -r300 testpdf_zb
set(gcf,'Renderer','painters')
print -dpdf -r300 testpdf_pa
set(gcf,'Renderer','opengl')
print -dpdf -r300 testpdf_op
The actual figure is much more complex with several axes and different types of plots.
Is there a way to rasterize the figure, but keep text labels as vectors?
Another problem with OpenGL is that it does not work in terminal mode (-nosplash -nodesktop -nodisplay) under Mac OS X. It looks like OpenGL is not supported there. I have to use terminal mode for automation. The MATLAB version I run is 2007b, on Mac OS X Server 10.4.
This is a funny one. Your problem is not Matlab, it's Ghostscript (Matlab creates PDFs by calling Ghostscript, at least on Windows). When I run
x=linspace(1,10,100000);
y=sin(x)+randn(size(x));
plot(x,y,'.')
print -dpsc2 test.ps
I get a 2 Mb PS file (all vector, of course), which compresses to a 164 Kb ZIP. One would expect more or less the same result when converting PS to PDF, but ps2pdf test.ps produced your 4 Mb file!
Since you are on a Mac, you probably have Distiller. I'd give it a try — generate PS files as above, and then run them through Distiller; you should get a 150K vector PDF.
If you insist on rasterizing, I can suggest printing the figure without any axes or labels to a tiff, opening the tiff, and recreating axes and labels on top of it.
If you don't want to go with a 2D histogram (i.e. an image where pixel brightness corresponds to density of points) as BlessedKey suggests, it looks like the only good way is to do the rasterizing yourself, as mentioned by AB.
getframe followed by frame2im seems to be the way to go for that. Unfortunately, getframe returns empty if you run with -nodisplay. Therefore, you'd have to save the figure as .fig, and on another computer run a script that opens the figure, gets the content of the axes with getframe, displays the image from getframe, and then saves to PDF.
As an alternative to simple plotting or a 2D histogram, you may also want to look into scattercloud, which combines plotting the points with density information.
If at all possible you should try to subsample your problem before building the illustration. If you are plotting points on a curve then 10,000 is probably more than you need. A modern printer is only about 600 DPI after all.
If the points are illustrating a cloud with some density properties, a better solution may be to build a two dimensional histogram first, and illustrate that with imshow or imagesc.
If multiple clouds are being illustrated with different colors, you may be interested in building one such image for each cloud and then combining them with transparency.