Calculating IOU for bounding box predictions - object-detection

I have these two bounding boxes as shown in the image. The box coordinates are given below:
box 1 = [0.23072851 0.44545859 0.56389928 0.67707491]
box 2 = [0.22677664 0.38237819 0.85152483 0.75449795]
The coordinates are ordered as: ymin, xmin, ymax, xmax
I am calculating IoU as follows:
def get_iou(box1, box2):
    """
    Implement the intersection over union (IoU) between box1 and box2

    Arguments:
    box1 -- first box, numpy array with coordinates (ymin, xmin, ymax, xmax)
    box2 -- second box, numpy array with coordinates (ymin, xmin, ymax, xmax)
    """
    # unpack the coordinates: ymin, xmin, ymax, xmax
    y11, x11, y21, x21 = box1
    y12, x12, y22, x22 = box2
    # corners of the intersection rectangle
    yi1 = max(y11, y12)
    xi1 = max(x11, x12)
    yi2 = min(y21, y22)
    xi2 = min(x21, x22)
    # clamp each side separately so non-overlapping boxes give zero area
    inter_area = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)
    # Calculate the union area using the formula: Union(A,B) = A + B - Inter(A,B)
    box1_area = (x21 - x11) * (y21 - y11)
    box2_area = (x22 - x12) * (y22 - y12)
    union_area = box1_area + box2_area - inter_area
    # compute the IoU
    iou = inter_area / union_area
    return iou
Based on my understanding, these two boxes completely overlap each other, so the IoU should be 1. However, I get an IoU of 0.33193138665968164. Is there something I am doing wrong, or am I interpreting it incorrectly? Any suggestions in this regard would be helpful.

You are interpreting the IoU incorrectly.
If you look at your example, you will notice that the union of the areas of the two bounding boxes is much bigger than their intersection. So it makes sense that the IoU, which is indeed intersection / union, is much smaller than one.
When you say
Based on my understanding these 2 boxes completely overlap each other so IOU should be 1.
that is not true. In your situation the two bounding boxes overlap only in the sense that one is completely contained in the other. If this situation weren't penalized, the IoU could always be maximized by predicting a bounding box as big as the whole image, which clearly doesn't make sense.
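To make this concrete, here is a quick numerical check with your two boxes (a minimal sketch that reuses the get_iou function from the question): box 1 lies entirely inside box 2, so the intersection equals the area of box 1, while the union equals the area of the much larger box 2, and the ratio comes out to roughly 0.33.
import numpy as np

box1 = np.array([0.23072851, 0.44545859, 0.56389928, 0.67707491])  # ymin, xmin, ymax, xmax
box2 = np.array([0.22677664, 0.38237819, 0.85152483, 0.75449795])

# area of each box (width * height)
area1 = (box1[3] - box1[1]) * (box1[2] - box1[0])   # ~0.077
area2 = (box2[3] - box2[1]) * (box2[2] - box2[0])   # ~0.232

print(area1, area2)          # box 2 is roughly 3x larger than box 1
print(get_iou(box1, box2))   # ~0.3319, i.e. area1 / area2, since box 1 is inside box 2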


Final Editing of a grid.arranged ggplot

I will try to explain my problems, but perhaps there are too many, so I don't know where to start. And I am running out of time :(
I have tested the ability of fungi to alter plastic surfaces after two different timepoints and in two batches. The method of surface investigation was ATR-FT-IR. I now have spectral IR data from 4 different substrates, each exposed to 5 different fungi for two different exposure times. Every sample was measured 10 times (mostly, but sadly not always). I also ran control samples (no fungi and no treatment, and samples treated but without fungi), again for the two batches. So for each substrate I end up with around 140 columns and 1820 rows. I shrunk the data to the respective means and standard deviations with Excel and imported it as .xlsx, because the .csv import failed completely and I couldn't figure out why. Catastrophe.
> head(pet)
Wavenumbers MEAN_PET_untreated SD_PET_untreated MEAN_c_PET_B1_AL1 SD_PET_B1_AL1 MEAN_c_PET_B1_AL2 SD_c_PET_B1_AL2
1 3997.805 8.021747e-05 0.0003198024 -5.862401e-05 0.0002445300 0.0001309613 0.0004636534
2 3995.877 7.575977e-05 0.0003168603 -4.503153e-05 0.0002384142 0.0001185064 0.0004360579
3 3993.948 7.713719e-05 0.0003169468 -3.218230e-05 0.0002414230 0.0001145128 0.0004352532
4 3992.020 7.847460e-05 0.0003191443 -3.255098e-05 0.0002519945 0.0001258732 0.0004388980
5 3990.091 7.835603e-05 0.0003159916 -4.792059e-05 0.0002617358 0.0001325122 0.0004465352
6 3988.163 7.727790e-05 0.0003063113 -6.286794e-05 0.0002593732 0.0001297744 0.0004532126
My goal was a multiplot showing the averaged spectral data with geom_path and geom_ribbon per fungus, yielding 5 elements per plot (substrate pure, control t1, control t2, fungus treatment 1, fungus treatment 2). The dataset is really large, so I had trouble handling it and created these plots manually, i.e. NOT by faceting.
F4<-ggplot(pet)+
geom_errorbar(aes(x = Wavenumbers, y = MEAN_c_PET_B2_AL2, ymin = MEAN_c_PET_B2_AL2 - SD_c_PET_B2_AL2, ymax = MEAN_c_PET_B2_AL2 + SD_c_PET_B2_AL2, group=1), alpha= .1, stat="identity", position = "identity", colour="red")+
geom_path(aes(x = Wavenumbers, y = MEAN_c_PET_B2_AL2), stat="identity", group= 1, colour= "red")+
geom_errorbar(aes(x = Wavenumbers, y = MEAN_c_PET_B2_AL1 ,ymax = MEAN_c_PET_B2_AL1 + SD_c_PET_B2_AL1, ymin = MEAN_c_PET_B2_AL1 - SD_c_PET_B2_AL1, group=1), alpha= .1, stat="identity", position = "identity", colour="purple")+
geom_path(aes(x = Wavenumbers, y = MEAN_c_PET_B2_AL1), stat="identity", group= 1, colour= "purple")+
geom_errorbar(aes(x = Wavenumbers, y = MEAN_PET_untreated, ymax = MEAN_PET_untreated + SD_PET_untreated, ymin = MEAN_PET_untreated - SD_PET_untreated, group=1), alpha= .1, stat="identity", position = "identity", colour="yellow")+
geom_path(aes(x = Wavenumbers, y = MEAN_PET_untreated), stat="identity", group= 1, colour= "yellow")+
geom_errorbar(aes(x = Wavenumbers, y = MEAN_F4_PET_B2_AL1, ymax = MEAN_F4_PET_B2_AL1 + SD_F4_PET_B2_AL1, ymin = MEAN_F4_PET_B2_AL1 - SD_F4_PET_B2_AL1, group=1), alpha= .1, stat="identity", position = "identity", colour="orange")+
geom_path(aes(x = Wavenumbers, y = MEAN_F4_PET_B2_AL1), stat="identity", group= 1, colour= "orange")+
geom_errorbar(aes(x = Wavenumbers, y = MEAN_F4_PET_B2_AL2, ymax = MEAN_F4_PET_B2_AL2 + SD_F4_PET_B2_AL2, ymin = MEAN_F4_PET_B2_AL2 - SD_F4_PET_B2_AL2, group=1), alpha= .1, stat="identity", position = "identity", colour="darkgreen")+
geom_path(aes(x = Wavenumbers, y = MEAN_F4_PET_B2_AL2), stat="identity", group= 1, colour= "darkgreen")+xlab(NULL)+ylab(NULL)+
scale_x_reverse(limits=c(4000 , 500))
So far I have combined the different ggplots with:
pets<-grid.arrange(F1, F2, F7,F4, F19, ncol = 1, nrow = 5)
ggsave("Multi.pdf", width = 210, height = 297, units = "mm", pets)
This is nearly fine, not elegant and very complicated, but I won't give up at this stage of the work as it cost me a whole week. Sadly, I am not really happy with the design; to be honest, I cannot use it like it is. Currently, I am trying to find solutions for:
a) Getting rid of the empty grid areas left and right of the plotted values. I use scale_x_reverse(limits=c(4000, 500)), but the range is still extended on both sides of the x axis.
b) Creating a legend manually, because even if it were possible to do this via a shared legend or similar, it would always yield too many elements. I only want 5 elements with the same, always repeating colours (red = substrate pure, orange = cT_t1, yellow = cT_t2, green = f_t1, purple = f_t2).
c) Creating a y label ("Absorbance") manually, spanning invisibly over all plots (vertically). I tried to label only the 3rd plot in the middle, but this leads to an indentation of this plot, and the ones above and below appear more left-ragged. If this were possible, I could use direct labelling to indicate the respective fungus (e.g. F4).
d) Creating a global x label, because if I label only the last element, the height of the last plot is reduced by the height of the label.
e) Giving it an overall title.
What also makes me nervous is that I get a warning only for geom_path, telling me that 1 row was removed. But shouldn't this also affect the geom_ribbon? Does it have something to do with the fact that I have to call the ribbon BEFORE I call geom_path? Otherwise, the lines would have been hidden by the ribbon.
Removed 1 row(s) containing missing values (geom_path).
I am also wondering about the long duration of code execution: one element needs 20 seconds, the whole plot 2 minutes to compute. But at least it is not collapsing like Excel did before, including data loss. Is this normal for such huge datasets? Or could it indicate a more serious problem?
OK, finally, I hope someone is out there who has had similar workaround solutions. Because, like I said, I am not willing to spend another week on tidyr or reshape or mutate or whatever.
Thanks in advance! :)

convert a .csv file to yolo darknet format

I have a few annotations that are originally in .csv format. I need to convert them to the YOLO Darknet format in order to train my model with YOLOv4.
My .csv file:
The YOLO format is: object-class x y width height
where object_class, width and height are known from my .csv format. But finding x and y is confusing. Note that x and y are the center of the rectangle (not the top-left corner).
Any help would be appreciated :)
You can use this function to convert bounding boxes to the YOLO format. Of course, you will need to write some code to read the csv; just use this function as a template for your needs.
This function was extracted from the labelImg app:
https://github.com/tzutalin/labelImg/blob/master/libs/yolo_io.py
def BndBox2YoloLine(self, box, classList=[]):
    xmin = box['xmin']
    xmax = box['xmax']
    ymin = box['ymin']
    ymax = box['ymax']

    # normalized center coordinates (imgSize[0] is the height, imgSize[1] the width)
    xcen = float((xmin + xmax)) / 2 / self.imgSize[1]
    ycen = float((ymin + ymax)) / 2 / self.imgSize[0]

    # normalized width and height
    w = float((xmax - xmin)) / self.imgSize[1]
    h = float((ymax - ymin)) / self.imgSize[0]

    # PR387
    boxName = box['name']
    if boxName not in classList:
        classList.append(boxName)
    classIndex = classList.index(boxName)

    return classIndex, xcen, ycen, w, h
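If your csv already stores pixel-space corner coordinates, a standalone version of the same conversion (not tied to labelImg) could look roughly like this. This is only a sketch: the column names filename, class, xmin, ymin, xmax, ymax, the class_names list, and the image size are assumptions you would adapt to your actual csv layout and dataset.
import csv

def csv_row_to_yolo(row, img_width, img_height, class_names):
    """Convert one csv row with pixel corner coordinates into a YOLO line."""
    xmin, ymin = float(row['xmin']), float(row['ymin'])
    xmax, ymax = float(row['xmax']), float(row['ymax'])

    # center, width and height, normalized to [0, 1]
    x_center = (xmin + xmax) / 2.0 / img_width
    y_center = (ymin + ymax) / 2.0 / img_height
    w = (xmax - xmin) / img_width
    h = (ymax - ymin) / img_height

    class_index = class_names.index(row['class'])
    return "%d %.6f %.6f %.6f %.6f" % (class_index, x_center, y_center, w, h)

# usage sketch: one .txt file per image, one line per box
# (hypothetical file name, image size and class list -- replace with your own)
with open('annotations.csv', newline='') as f:
    for row in csv.DictReader(f):
        line = csv_row_to_yolo(row, img_width=1920, img_height=1080,
                               class_names=['person', 'car'])
        with open(row['filename'].rsplit('.', 1)[0] + '.txt', 'a') as out:
            out.write(line + '\n')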

Return coordinates that pass the threshold value for bounding boxes in Google's Object Detection API

Does anyone know how to get only the bounding box coordinates that pass the threshold value?
I found this answer (here's a link), so I tried using it and did the following:
vis_util.visualize_boxes_and_labels_on_image_array(
    image,
    np.squeeze(boxes),
    np.squeeze(classes).astype(np.int32),
    np.squeeze(scores),
    category_index,
    use_normalized_coordinates=True,
    line_thickness=1,
    min_score_thresh=0.80)

for i, b in enumerate(boxes[0]):
    ymin = boxes[0][i][0] * height
    xmin = boxes[0][i][1] * width
    ymax = boxes[0][i][2] * height
    xmax = boxes[0][i][3] * width
    print("Top left")
    print(xmin, ymin)
    print("Bottom right")
    print(xmax, ymax)
But I noticed that the approach from the linked answer returns all the values, i.e. coordinates from all the bounding boxes detected by the classifier (which I do not want). What I want are only the values from bounding boxes that pass min_score_thresh.
I feel like this should be very simple, but I lack knowledge in this area.
If I find the answer, I'll be sure to post it right here, but if anyone else knows the answer and could save me some time, I would be grateful.
Update:
The boxes and scores returned by the previous functions are both numpy arrays, so you can use boolean indexing to filter out boxes below the threshold.
This should give you the boxes that pass the threshold:
true_boxes = boxes[0][scores[0] > min_score_thresh]
And then you can do
for i in range(true_boxes.shape[0]):
    ymin = true_boxes[i, 0] * height
    xmin = true_boxes[i, 1] * width
    ymax = true_boxes[i, 2] * height
    xmax = true_boxes[i, 3] * width
    print("Top left")
    print(xmin, ymin)
    print("Bottom right")
    print(xmax, ymax)
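If you also need the class labels and confidence values of the surviving detections, the same boolean mask can be applied to the other output arrays as well. A small sketch, assuming boxes, scores and classes are the usual [1, N, ...] numpy outputs of the detection graph:
mask = scores[0] > min_score_thresh

true_boxes = boxes[0][mask]      # shape (M, 4), normalized ymin, xmin, ymax, xmax
true_scores = scores[0][mask]    # shape (M,)
true_classes = classes[0][mask]  # shape (M,)

for box, score, cls in zip(true_boxes, true_scores, true_classes):
    print(int(cls), float(score), box)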

CGAL : Can I use alpha shape to simplify a path?

I would like to get the outer part of a robot trajectory (the boundary of this trajectory).
I read in several posts that the best way to retrieve the boundary of a point cloud is to use alpha shapes.
So I use the alpha shape implementation of CGAL.
The picture above represents:
Blue dots: the robot trajectory
Red crosses: vertices of the optimal alpha shape
Cyan edges: edges of the optimal alpha shape.
The optimal alpha is, according to the CGAL documentation, the alpha for which:
All data points are either on the boundary or in the interior of the regularized version of the alpha shape.
The number of solid components of the alpha shape is equal to or smaller than 1.
If I increase alpha, I get the convex hull (as expected).
But I can't find an alpha that gives me the following boundary (the black one in the figure below):
So my question is:
Can the black shape in the figure above be found with alpha shapes, using the blue points as input?
For those who want to see how to use the CGAL Python bindings to generate alpha shapes, here is my code:
def computeAlphaShape(val):
    alpha_shape = Alpha_shape_2(points, 10000.0)
    it = alpha_shape.find_optimal_alpha(1)
    optimal_alpha = it.next()
    alpha_shape.set_alpha(val)
    print("Optimal alpha : " + str(optimal_alpha) + " current alpha : " + str(val))
    if val == 0:
        salpha.set_val(optimal_alpha)
        return
    print("Solid components : " + str(alpha_shape.number_of_solid_components()))
    drawResult(alpha_shape)

salpha.on_changed(computeAlphaShape)

def drawResult(alpha_shape):
    ax.clear()
    ax.plot(X, Y, 'ob')
    edges = alpha_shape.alpha_shape_edges()
    while edges.hasNext():
        eresX = []
        eresY = []
        edge = edges.next()
        segment = alpha_shape.segment(edge)
        eresX.append(segment.source().x())
        eresY.append(segment.source().y())
        eresX.append(segment.target().x())
        eresY.append(segment.target().y())
        classe = alpha_shape.classify(edge)
        color = 'g-'
        if classe == EXTERIOR:
            color = 'b-'
        elif classe == INTERIOR:
            color = 'r-'
        elif classe == SINGULAR:
            color = 'y-'
        elif classe == REGULAR:
            color = 'c-'
        ax.plot(eresX, eresY, color)
    vertices = alpha_shape.alpha_shape_vertices()
    v_res_x = []
    v_res_y = []
    while vertices.hasNext():
        vertex = vertices.next()
        v_res_x.append(vertex.point().x())
        v_res_y.append(vertex.point().y())
    ax.plot(v_res_x, v_res_y, '+r')
For such a task I would use the simplification package if you already have the segments, or the 2D reconstruction package if you only have points.
An alpha shape will work well only if the density of the points is uniform, by picking all edges that are not EXTERIOR. Alpha should be the squared distance between two consecutive points on the trajectory (just a bit more, to be sure the edge is picked). I'm not even sure what the outcome will be if you have some parts with a small local feature size. In such a case, only SINGULAR and REGULAR edges should be picked.
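As a rough illustration of that suggestion, here is a sketch of how the non-EXTERIOR edges could be collected, using the same binding calls as in the question (Alpha_shape_2, alpha_shape_edges, classify, segment). The choice of alpha from the squared spacing of consecutive trajectory points and the 1.1 safety factor are assumptions to adapt to your data:
# assumed: points is the robot trajectory as a list of Point_2, in path order
# pick alpha a bit above the largest squared distance between consecutive samples
spacings = [(points[i].x() - points[i + 1].x()) ** 2 +
            (points[i].y() - points[i + 1].y()) ** 2
            for i in range(len(points) - 1)]
alpha = 1.1 * max(spacings)

alpha_shape = Alpha_shape_2(points, alpha)

boundary = []
edges = alpha_shape.alpha_shape_edges()
while edges.hasNext():
    edge = edges.next()
    if alpha_shape.classify(edge) != EXTERIOR:  # keep REGULAR / SINGULAR / INTERIOR edges
        boundary.append(alpha_shape.segment(edge))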

After detecting objects in a video stream, I want to crop and save these objects

I have detected objects in a video and I want to crop these objects. I tried TensorFlow APIs, but none of them worked for me. When trying
tf.image.crop_to_bounding_box(
    image,
    offset_height,
    offset_width,
    target_height,
    target_width
)
it tells me that offset_height is not defined.
So, I need a guide on how to crop an object from an image using TensorFlow.
Try to use:
tf.image.crop_and_resize(image, boxes, box_ind, crop_size, method='bilinear', extrapolation_value=0, name=None)
For example:
# bounding box coordinates
ymin = boxes[0][0][0]
xmin = boxes[0][0][1]
ymax = boxes[0][0][2]
xmax = boxes[0][0][3]

test = tf.image.crop_and_resize(image=frame_expanded/255,
                                boxes=[[ymin, xmin, ymax, xmax]],
                                box_ind=[0],
                                crop_size=[100, 100])
Worked for me!
or if you want to use tf.image.crop_to_bounding_box(...), try this:
# bounding box coordinates
ymin = boxes[0][0][0]
xmin = boxes[0][0][1]
ymax = boxes[0][0][2]
xmax = boxes[0][0][3]

# image size
(im_width, im_height) = image.size
(xminn, xmaxx, yminn, ymaxx) = (xmin * im_width, xmax * im_width, ymin * im_height, ymax * im_height)

test = tf.image.crop_to_bounding_box(frame, int(yminn), int(xminn), int(ymaxx - yminn), int(xmaxx - xminn))
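If the frame is already available as a numpy array (as it typically is when reading a video with OpenCV), you can also skip the TensorFlow ops entirely and crop/save with plain array slicing. A minimal sketch, assuming frame is an HxWx3 uint8 array and boxes/scores come from the detector in normalized coordinates; the 0.5 threshold and the output file names are illustrative assumptions:
import cv2

im_height, im_width = frame.shape[:2]

for i, box in enumerate(boxes[0]):
    if scores[0][i] < 0.5:        # assumed confidence threshold
        continue
    ymin, xmin, ymax, xmax = box  # normalized coordinates
    crop = frame[int(ymin * im_height):int(ymax * im_height),
                 int(xmin * im_width):int(xmax * im_width)]
    cv2.imwrite("object_%d.png" % i, crop)  # hypothetical output file name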