Convert raw YOLO output to bounding boxes

I'm trying to convert the raw output of my tiny YOLOv3 model to bounding box coordinates. My input is a 416×416 image and the raw output has shape [2535, 6], corresponding to [center_x, center_y, width, height, objectness score, class probability] for each box. I want to convert the first four elements of this array into actual pixel coordinates, but I'm not sure how to interpret the values. As an example, most arrays look something like this:
[-4.3124937e+03 -4.5493687e+03 4.7279790e+03 5.0129067e+03 9.9985057e-01 9.9992472e-01]
Why are the values for center_x and center_y negative? And why are the width and height so large?
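For context, 2535 = 3 × (13×13 + 26×26), i.e. three boxes per cell over tiny YOLOv3's two detection scales at 416×416. If the exported model emits undecoded logits, the standard YOLOv3 transform is sigmoid on the center offsets and exp-times-anchor on width/height. A sketch of that decode, assuming raw logits; note that values in the thousands (as in your example) would overflow the exp, which suggests your export may already apply part of this transform or uses a different tensor layout:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_box(t, cell_x, cell_y, anchor_w, anchor_h, stride):
    """Standard YOLOv3 decode for one raw prediction
    t = [t_x, t_y, t_w, t_h] from grid cell (cell_x, cell_y)."""
    t_x, t_y, t_w, t_h = t
    center_x = (sigmoid(t_x) + cell_x) * stride   # pixels in the 416x416 image
    center_y = (sigmoid(t_y) + cell_y) * stride
    width = anchor_w * np.exp(t_w)                # anchors come from the model cfg
    height = anchor_h * np.exp(t_h)
    return center_x, center_y, width, height
```

The strides are 32 for the 13×13 grid and 16 for the 26×26 grid; the typical tiny-YOLOv3 anchor pairs are 10,14, 23,27, 37,58, 81,82, 135,169, 344,319, but check your own cfg, as they may differ.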

Related

Interpolate 3D surface input into 4th dimension

My source data consists of a set of (x,y,z,e) samples. It can be visualized like this, where the dots are the (x,y,z) samples in 3D space and the color reflects the e value. The (x,y,z) samples compose a surface.
I want some kind of interpolation method that I can feed with random (x,y,z) coordinates close to the surface and that outputs an interpolated e value.
I tried the scipy LinearNDInterpolator. It works fine, but only for input (x,y,z) points that lie inside the convex hull of the surface. When the input is only slightly outside, the interpolator returns 'nan'.
I'm a bit out of ideas how to solve this.
I can only think of iterating over each line in the grid to find the points closest to the random (x,y,z) input and doing linear interpolation from these points. But if I could somehow reconstruct the surface, that would be more accurate.
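A common workaround for the 'nan' outside the convex hull is to pair LinearNDInterpolator with NearestNDInterpolator as a fallback: use the linear result where it exists and fill the NaNs with the nearest-neighbour value. A minimal sketch with synthetic data (the sample points and e values below are placeholders for your own):

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator, NearestNDInterpolator

rng = np.random.default_rng(0)
pts = rng.random((200, 3))          # stand-in for your (x, y, z) samples
e = pts[:, 0] + pts[:, 1]           # stand-in for your e values

linear = LinearNDInterpolator(pts, e)
nearest = NearestNDInterpolator(pts, e)

def interp(xyz):
    """Linear interpolation inside the hull, nearest-neighbour outside."""
    vals = linear(xyz)
    mask = np.isnan(vals)           # queries outside the convex hull
    if mask.any():
        vals[mask] = nearest(xyz[mask])
    return vals
```

This stays exact inside the hull and degrades gracefully just outside it; for queries far from the surface the nearest-neighbour value may be a poor extrapolation, so you may want to reject points beyond some distance threshold.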

How do I see the actual color of a single RGB value in Google Colab?

Very basic question. I have a single vector (e.g., [53, 21, 110]) and I want to print the RGB color it represents in a colab notebook. Like a color swatch. What's the simplest way to do this?
The simplest way would be using the Image module from PIL. According to the documentation, you can construct an image with:
PIL.Image.new(mode, size, color=0)
mode [required]: determines the mode used for the image; it can be 'RGB', 'RGBA', 'HSV', etc. You can find more modes in the docs.
size [required]: a tuple (width, height) giving the dimensions of your image in pixels.
color [optional]: the color of the image; in your case it can receive a tuple representing the RGB color. The default color is black.
Then, to show the image within colab, you would use
display(img)
Given your question, the mode would need to be 'RGB', and if your vector is a list, you need to convert it to a tuple first. To show a 300×300 px image, the code would look like this:
from PIL import Image
img = Image.new('RGB', (300, 300), color=(53, 21, 110))  # color as a tuple, not a list
display(img)

How to fill a line in 2D image along a given radius with the data in a given line image?

I want to fill a 2D image along its polar radius; the data are stored in an image where each row or column corresponds to the radius in the target image. How can I fill the target image efficiently, for example with iradius or similar functions? I would prefer to avoid a pixel-by-pixel operation.
Are you looking for something like this?
number maxR = 100
image rValues := realimage("I(r)", 4, maxR)
rValues = 10 + trunc(100*random())
image plot := realimage("Ring", 4, 2*maxR, 2*maxR)
rValues.ShowImage()
plot.ShowImage()
plot = rValues.warp(iradius, 0)
You might also want to check out the relevant example code in the F1 help documentation of GMS itself.
Explaining warp a bit:
plot = rValues.warp(iradius,0)
Assigns values to plot based on a value-lookup in rValues.
For each pixel in plot a coordinate position in rValues is computed, and the value is simply looked up. If the computed coordinate is non-integer, bilinear interpolation between the 4 closest points is used.
In the example, the two 'formulas' for the coordinate calculation are simple x' = iradius and y' = 0 where iradius is an expression computed from the coordinate in plot, for convenience.
You can feed any expression into the parameters of warp(), and the command is closely related to simply using the square-bracket notation for addressing values. In fact, the only difference is that warp performs bilinear interpolation of the values instead of truncating the coordinates to integers.
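For readers outside GMS, the same radial fill can be sketched in NumPy (an analogue of warp's coordinate lookup, not GMS code): compute each output pixel's radius from the image centre and use it to interpolate into the 1-D radial profile, mirroring the x' = iradius, y' = 0 lookup above.

```python
import numpy as np

max_r = 100
r_values = 10 + np.trunc(100 * np.random.rand(max_r))   # 1-D radial profile I(r)

# Radius of every pixel in a (2*max_r, 2*max_r) output image, centred in the middle
y, x = np.indices((2 * max_r, 2 * max_r))
radius = np.hypot(x - max_r, y - max_r)

# Look up the profile at each pixel's radius with linear interpolation
# (radii beyond the profile length are clamped to the last profile value)
plot = np.interp(radius.ravel(), np.arange(max_r), r_values).reshape(radius.shape)
```

np.interp plays the role of warp's bilinear lookup here; since the lookup is 1-D along the radius, linear interpolation suffices.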

TensorFlow: outputting labels as values in a 2D grid, or locating an image in the grid

My final output should be a 2D grid that contains a value for each grid point. Is there a way to implement this in TensorFlow, where I can input a number of images and each image corresponds to a specific point in a 2D grid? I want my model to detect that specific grid cell when I input a similar image. I mean that each input image belongs to a specific area in the output image (which I divided into a grid for simplicity, to make it a finite number of locations).
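One way to frame this is to treat each of the H×W grid cells as a class and train an ordinary classifier, reshaping the softmax output back into the grid. A minimal Keras sketch assuming an 8×8 grid and 64×64 RGB inputs; the grid size, input shape, and layer choices are illustrative, not prescribed by the question:

```python
import tensorflow as tf

GRID_H, GRID_W = 8, 8   # assumed grid resolution; match it to your output image

# Treat each grid cell as one class: label = row * GRID_W + col
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 3)),          # assumed input image size
    tf.keras.layers.Conv2D(16, 3, activation='relu'),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(GRID_H * GRID_W, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# At inference, reshape the class probabilities back into the 2D grid:
probs = model(tf.random.uniform((1, 64, 64, 3)))       # stand-in for a real image
grid = tf.reshape(probs, (-1, GRID_H, GRID_W))         # probability per grid cell
```

The predicted location is then the argmax cell of `grid`. If nearby cells should count as "almost right", a regression head that outputs the (row, col) coordinates directly may generalise better than a flat classifier.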

How can I get the text's height from a PDF's transformation matrix?

I am making a PDF parser and I have a problem when trying to read the transformation matrix (Tm) of a text object.
For example, when I have a horizontal text, the transformation matrix looks like this:
"71.9871 0 0 73.5 178.668 522.2227 Tm"
which means that the text's height is the d parameter (73.5), the aspect ratio of each character is a/d (71.9871/73.5), and the text is translated to the point (178.668, 522.2227).
If I rotate this text, then the transformation matrix looks like this:
"63.1614 -34.5367 35.2625 64.4888 181.8616 575.8494 Tm"
How can I get the height of the text, which is 73.5?
If I export the same file as an svg file I get this matrix:
"0.8593 0.4699 -0.4798 0.8774 181.8616 266.0405"
and the text's height is 73.5. (I have noticed that if I divide the d parameter of my rotated text matrix by the text's height (73.5), I get the d parameter of the SVG matrix (0.8774), but again, how can I know the text's height?)
Thank you.
As already mentioned in a comment, you actually have a multitude of matrices and scalars to deal with, at least the current transformation matrix, the text matrix, the font size, the horizontal scaling, and the page user unit setting. Of course, though, you can combine all these into one matrix.
Thus, let's assume the matrix you have is this combined one.
To determine the factors by which the font is stretched from its size-1 default state, you could simply apply that matrix to a vertical and a horizontal line segment of length 1, e.g. [0, 0, 1] to [1, 0, 1] and [0, 0, 1] to [0, 1, 1], and then calculate the lengths of the resulting line segments.
PS Doing some minor linear algebra, you will see that for a matrix
a b 0
c d 0
e f 1
this amounts to a horizontal font extent of sqrt(a² + b²) and a vertical font extent (the height) of sqrt(c² + d²).
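Plugging the rotated matrix from the question into these formulas recovers exactly the values asked about; a minimal check:

```python
import math

# Rotated text matrix from the question: "a b c d e f Tm"
a, b, c, d, e, f = 63.1614, -34.5367, 35.2625, 64.4888, 181.8616, 575.8494

h_extent = math.hypot(a, b)   # horizontal font extent: sqrt(a^2 + b^2)
v_extent = math.hypot(c, d)   # vertical font extent (the height): sqrt(c^2 + d^2)

print(round(h_extent, 4))     # ~ 71.9871, the a parameter of the unrotated matrix
print(round(v_extent, 4))     # ~ 73.5, the text height being sought
```

This also explains the SVG observation: dividing each matrix entry by the recovered extents leaves the pure rotation part (e.g. 64.4888 / 73.5 ≈ 0.8774).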