finding bounding box of centroid with limited information - object-detection

I have detected blob keypoints in OpenCV (C++). The centroid displays fine. How do I then draw a bounding box around the detected blob if I only have the blob center coordinates? I can't work backwards from center because of too many unknowns (or so I believe).
threshold(imageUndistorted, binary_image, 30, 255, THRESH_BINARY);
Ptr<SimpleBlobDetector> detector = SimpleBlobDetector::create(params);
// Detect blob
detector->detect(binary_image, binary_keypoints);
drawKeypoints(binary_image, binary_keypoints, bin_image_keypoints, Scalar(0, 0, 255), DrawMatchesFlags::DRAW_RICH_KEYPOINTS);
//draw BBox ?
What am I overlooking to draw the bounding box around the single blob?

I said:
I can't work backwards from center because of too many unknowns (or so I believe).
The information is not limited if the blob size is used: each keypoint has a size member (KeyPoint::size), which gives the diameter of the blob in question. The results may be inaccurate for highly asymmetric or lopsided targets, but this worked well for me because I used spheroid objects. Moments are probably the better approach for asymmetrical targets.
keypoint.size should not be confused with keypoints.size(). The former is the diameter of a single keypoint; the latter counts the objects in the vector. I use both.
Using the diameter, I can calculate the rest with no problem:
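// Assumption: a single blob, so the first keypoint is used (variable names from the question).
float ctr_x = binary_keypoints[0].pt.x;
float ctr_y = binary_keypoints[0].pt.y;
float r = binary_keypoints[0].size / 2.0f; // KeyPoint::size is the diameter
float TLx = (ctr_x - r);
float TLy = (ctr_y - r);
float BRx = (ctr_x + r);
float BRy = (ctr_y + r);
Point TLp(TLx-10, TLy-10); //works fine without but more visible with enhancement
Point BRp(BRx+10, BRy+10); //same here
std::cout << "Top Left: " << TLp << std::endl << "Right Lower:" << BRp << std::endl;
cv::rectangle(bin_with_keypoints, TLp, BRp, cv::Scalar(0, 255, 0));
imshow("With Green Bounding Box:", bin_with_keypoints);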
TLp = top-left point, with a 10 px adjustment to make the box bigger.
BRp = bottom-right point, adjusted the same way.
TLx, TLy, BRx and BRy are calculated from the blob center coordinates. If you are going to track many targets, I would suggest the contours approach (with moments). I have only 1 - 2 blobs to keep track of, which is a lot easier and keeps resource usage down.
The rectangle drawing function also works with a Rect (diameter = keypoint.size):
Rect rect(cvRound(center_x - diameter / 2), cvRound(center_y - diameter / 2), cvRound(diameter), cvRound(diameter)); // Rect(x, y, width, height); Rect(TLp, BRp) also works
cv::rectangle(bin_with_keypoints, rect, cv::Scalar(0, 255, 0));
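For reference, the arithmetic from center and diameter to a box can be sketched without OpenCV. This is only an illustration: Box and boxFromBlob are hypothetical names, Box is a plain stand-in for cv::Rect, and the margin parameter plays the role of the 10 px enlargement used above.

```cpp
#include <cassert>
#include <cmath>

// Plain stand-in for cv::Rect so the sketch has no OpenCV dependency (hypothetical type).
struct Box {
    int x;       // left edge
    int y;       // top edge
    int width;
    int height;
};

// Build a square box around a blob from its center and diameter
// (KeyPoint::pt and KeyPoint::size in the answer), grown by `margin`
// pixels on every side. The edges are rounded outwards so the box
// never cuts into the blob.
Box boxFromBlob(float centerX, float centerY, float diameter, int margin) {
    assert(diameter >= 0.0f && margin >= 0);
    const float radius = diameter / 2.0f;
    const int left   = static_cast<int>(std::floor(centerX - radius)) - margin;
    const int top    = static_cast<int>(std::floor(centerY - radius)) - margin;
    const int right  = static_cast<int>(std::ceil(centerX + radius)) + margin;
    const int bottom = static_cast<int>(std::ceil(centerY + radius)) + margin;
    return Box{left, top, right - left, bottom - top};
}
```

With OpenCV, the four values would go straight into cv::Rect(box.x, box.y, box.width, box.height) before calling cv::rectangle.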

Related

Can you change the bounds of a Sampler in a Metal Shader?

In the fragment function of a Metal Shader file, is there a way to redefine the "bounds" of the texture with respect to what the sampler will consider its normalized coordinates to be?
By default, a value of 0,0 for the sample is the top-left "pixel" and 1,1 is the bottom right "pixel" of the texture. However, I'm re-using textures for drawing and at any given render pass there's only a portion of the texture that contains the relevant data.
For example, in a texture of width: 500 and height: 500, I might have only copied data into the region of 0,0,250,250. In my fragment function, I'd like the sampler to interpret a normalized coordinate of 1.0 to be 250 and not 500. Is that possible?
I realize I can just change the sampler to use pixel addressing, but that comes with a few restrictions as noted in the Metal Shader Specification.
No, but if you know the region you want to sample from, it's quite easy to do a little math in the shader to fix up your sampling coordinates. This is used often with texture atlases.
Suppose you have an image that's 500x500 and you want to sample the bottom-right 125x125 region (just to make things more interesting). You could pass this sampling region in as a float4, storing the bounds as (left, top, width, height) in the xyzw components. In this case, the bounds would be (375, 375, 125, 125). Your incoming texture coordinates are "normalized" with respect to this square. The shader simply scales and biases these coordinates into texel coordinates, then normalizes them to the dimensions of the whole texture:
fragment float4 fragment_main(FragmentParams in [[stage_in]],
                              texture2d<float, access::sample> tex2d [[texture(0)]],
                              sampler sampler2d [[sampler(0)]],
                              // ...
                              constant float4 &spriteBounds [[buffer(0)]])
{
    // original coordinates, normalized with respect to subimage
    float2 texCoords = in.texCoords;
    // texture dimensions
    float2 texSize = float2(tex2d.get_width(), tex2d.get_height());
    // adjusted texture coordinates, normalized with respect to full texture
    texCoords = (texCoords * spriteBounds.zw + spriteBounds.xy) / texSize;
    // sample color at modified coordinates
    float4 color = tex2d.sample(sampler2d, texCoords);
    // ...
}
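The scale-and-bias step can be checked on the CPU with the numbers from the example. The sketch below mirrors the shader's one-line formula in plain C++; Vec2, Region and remapToFullTexture are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>

struct Vec2 { float x; float y; };

// Sub-image bounds in texels, matching the (left, top, width, height)
// layout of spriteBounds in the shader.
struct Region { float left; float top; float width; float height; };

// Map a coordinate normalized to the sub-image onto a coordinate
// normalized to the whole texture: scale and bias into texels, then
// divide by the full texture size.
Vec2 remapToFullTexture(Vec2 uv, Region region, Vec2 textureSize) {
    assert(textureSize.x > 0.0f && textureSize.y > 0.0f);
    return Vec2{(uv.x * region.width + region.left) / textureSize.x,
                (uv.y * region.height + region.top) / textureSize.y};
}
```

For the 500x500 texture and the bounds (375, 375, 125, 125), the corner (0, 0) lands on (0.75, 0.75) and the corner (1, 1) lands on (1, 1), which is the bottom-right 125x125 region.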

NSBitmapImageRep: Inconsistent set of values

I am trying to read a 12-bit grayscale (DICOM:MONOCHROME2) image. I can read DICOM RGB files fine. When I attempt to load a grayscale image into NSBitmapImageRep, I get the following error message:
Inconsistent set of values to create NSBitmapImageRep
I have the following code fragment:
NSBitmapImageRep *rep = [[NSBitmapImageRep alloc]
initWithBitmapDataPlanes : nil
pixelsWide : width
pixelsHigh : height
bitsPerSample : bitsStored
samplesPerPixel : 1
hasAlpha : NO
isPlanar : NO
colorSpaceName : NSCalibratedWhiteColorSpace
bytesPerRow : width * bitsAllocated / 8
bitsPerPixel : bitsAllocated];
With these values:
width = 256
height = 256
bitsStored = 12
bitsAllocated = 16
Nothing seems inconsistent to me. I have verified that the image data is width*height*2 bytes long, so I am fairly sure it is in a 2-byte grayscale format. I have tried many variations of the parameters, but nothing works. If I change "bitsPerSample" to 16, the error message goes away, but I get a solid black image. The closest I have come to success is setting "bitsPerPixel" to zero. When I do this, I do get an image, but it is clearly rendered incorrectly (you can barely make out the original image). Any suggestions would be appreciated. I have spent a long time trying to get this to work and have searched Stack Overflow and the web many times. Thanks very much for any help!
SOLUTION:
After the very helpful suggestions from LEADTOOLS Support, I was able to solve my problem. Here is the code fragment that works (assuming a MONOCHROME2 DICOM image):
// If, and only if, MONOCHROME2:
NSBitmapImageRep *imageRep = [[NSBitmapImageRep alloc]
initWithBitmapDataPlanes : &pixelData
pixelsWide : width
pixelsHigh : height
bitsPerSample : bitsAllocated /*bitsStored-this will not work*/
samplesPerPixel : samplesPerPixel
hasAlpha : NO
isPlanar : NO
colorSpaceName : NSCalibratedWhiteColorSpace
bytesPerRow : width * bitsAllocated / 8
bitsPerPixel : bitsAllocated];
int scale = USHRT_MAX / largestImagePixelValue;
uint16_t *ptr = (uint16_t *)imageRep.bitmapData;
for (int i = 0; i < width * height; i++) *ptr++ *= scale;
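The scaling loop is what turns the solid black image into a visible one: 12-bit samples only reach 4095 out of 65535, so they are multiplied up towards the top of the 16-bit range. A standalone sketch of that step follows; stretchTo16Bit is a hypothetical name, and the clamp is an addition for samples that exceed the stated largest value.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stretch samples so that `largestValue` lands near the top of the
// 16-bit range. Uses the same integer factor as the answer,
// 65535 / largestValue, and clamps anything that would overflow.
std::vector<std::uint16_t> stretchTo16Bit(std::vector<std::uint16_t> pixels,
                                          std::uint16_t largestValue) {
    assert(largestValue > 0);
    const std::uint32_t scale = 65535u / largestValue;
    for (std::uint16_t& p : pixels) {
        const std::uint32_t scaled = static_cast<std::uint32_t>(p) * scale;
        p = static_cast<std::uint16_t>(scaled > 65535u ? 65535u : scaled);
    }
    return pixels;
}
```

With a largest value of 4095 the factor is 16, so a sample of 4095 becomes 65520. Because the factor is an integer, the brightest sample does not always reach exactly 65535.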
It is important to know the Transfer Syntax (0002:0010) and the Number of Frames in the dataset. Also, try to get the value length and VR of the Pixel Data (7FE0:0010) element. Using the value length of the pixel data element, you will be able to validate your size calculation for an uncompressed image.
As for displaying the image, you will also need the values of High Bit (0028:0102) and Pixel Representation (0028:0103). An image could be 16-bit allocated, 12-bit stored, with the high bit set to 15 and one sample per pixel. That means the 4 least significant bits of each word do not contain pixel data. Pixel Representation set to 1 means the samples are signed, with the sign bit in the high bit of the pixel sample.
In addition, you may need to apply the modality LUT transformation (rescale slope and rescale intercept for a linear transformation), when present in the dataset, to prepare the data for display. Finally, you apply the VOI LUT transformation (Window Center and Window Width) to display the image.
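The display steps described here can be sketched as three small functions. This is a rough illustration, not a complete DICOM pipeline: the function names are hypothetical, only unsigned samples (Pixel Representation 0) and the linear rescale case are handled, and the window formula follows the linear window function as I understand it from the DICOM standard.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Pull the stored bits out of an allocated 16-bit word.
// Example: 12 bits stored with high bit 15 means the low 4 bits are padding.
std::uint16_t extractStored(std::uint16_t word, int bitsStored, int highBit) {
    assert(bitsStored >= 1 && bitsStored <= 16);
    assert(highBit >= bitsStored - 1 && highBit <= 15);
    const int shift = highBit + 1 - bitsStored;
    const std::uint32_t mask = (1u << bitsStored) - 1u;
    return static_cast<std::uint16_t>((word >> shift) & mask);
}

// Modality LUT, linear case: stored value -> modality units.
double applyRescale(double storedValue, double slope, double intercept) {
    return storedValue * slope + intercept;
}

// VOI LUT, linear window: map a value to the 0..255 display range
// using Window Center and Window Width.
std::uint8_t applyWindow(double value, double center, double width) {
    assert(width >= 1.0);
    const double lower = center - 0.5 - (width - 1.0) / 2.0;
    const double upper = center - 0.5 + (width - 1.0) / 2.0;
    if (value <= lower) return 0;    // below the window: black
    if (value > upper) return 255;   // above the window: white
    const double t = (value - (center - 0.5)) / (width - 1.0) + 0.5;
    return static_cast<std::uint8_t>(std::lround(t * 255.0));
}
```

The functions are meant to be chained per pixel: extract the stored bits, apply the rescale, then apply the window.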

OpenTK OpenGL Drawing text

I am trying to learn how to do OpenGL using OpenTK and I can successfully draw polygons, circles, and triangles so far but my next question is how to draw text? I have looked at the example on their homepage which was in C# and I translated it to VB .NET.
It currently just draws a white rectangle so I was hoping that someone could spot an error in my code or suggest another way to draw text. I will just list my paint event.
Paint event:
GL.Clear(ClearBufferMask.ColorBufferBit)
GL.Clear(ClearBufferMask.DepthBufferBit)
Dim text_bmp As Bitmap
Dim text_texture As Integer
text_bmp = New Bitmap(ClientSize.Width, ClientSize.Height)
text_texture = GL.GenTexture()
GL.BindTexture(TextureTarget.Texture2D, text_texture)
GL.TexParameter(TextureTarget.Texture2D, TextureParameterName.TextureMagFilter, All.Linear)
GL.TexParameter(TextureTarget.Texture2D, TextureParameterName.TextureMinFilter, All.Linear)
GL.TexImage2D(TextureTarget.Texture2D, 0, PixelInternalFormat.Rgba, text_bmp.Width, text_bmp.Height, 0 _
, PixelFormat.Bgra, PixelType.UnsignedByte, IntPtr.Zero)
Dim gfx As Graphics
gfx = Graphics.FromImage(text_bmp)
gfx.DrawString("TEST", Me.Font, Brushes.Red, 0, 0)
Dim data As Imaging.BitmapData
data = text_bmp.LockBits(New Rectangle(0, 0, text_bmp.Width, text_bmp.Height), Imaging.ImageLockMode.ReadOnly, System.Drawing.Imaging.PixelFormat.Format32bppArgb)
GL.TexImage2D(TextureTarget.Texture2D, 0, PixelInternalFormat.Rgba, Width, Height, 0, PixelFormat.Bgra, PixelType.UnsignedByte, data.Scan0)
text_bmp.UnlockBits(data)
GL.MatrixMode(MatrixMode.Projection)
GL.LoadIdentity()
GL.Ortho(0, width, Height, 0, -1, 1)
GL.Enable(EnableCap.Texture2D)
GL.Enable(EnableCap.Blend)
GL.BlendFunc(BlendingFactorSrc.One, BlendingFactorDest.OneMinusSrcAlpha)
GL.Begin(BeginMode.Quads)
GL.TexCoord2(0.0F, 1.0F)
GL.Vertex2(0.0F, 0.0F)
GL.TexCoord2(1.0F, 1.0F)
GL.Vertex2(1.0F, 0.0F)
GL.TexCoord2(1.0F, 0.0F)
GL.Vertex2(1.0F, 1.0F)
GL.TexCoord2(0.0F, 0.0F)
GL.Vertex2(0.0F, 1.0F)
GL.End()
GlControl1.SwapBuffers()
You'll get a white rectangle if your card doesn't support NPOT (non-power-of-two) texture sizes. Try testing by setting the bitmap size to e.g. 256x256.
That is an OK method, but if you plan to draw lots of text, or even a medium amount, it will absolutely destroy performance. What you want to do is look into a program called BMFont:
www.angelcode.com/products/bmfont/
What this does is create a texture atlas of text, along with an XML file with the position, width, height and offsets of every letter. You start off by reading that XML file and loading each character into a class with the various values. Then you simply make a function that takes a string, binds the atlas, and then, depending on the letters in the string, draws a quad for each letter with texture coordinates taken from the XML data. So you might make a:
for each _char in string
    create quad according to xml size
    assign texture coordinates relative to xml position
    increase position so letters don't draw on top of each other
There are tutorials in other languages on the BMFont website which can be helpful.
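The loop above can be sketched in C++ as follows. It only computes the quads and leaves parsing and drawing out. The Glyph field names follow the per-character attributes BMFont writes (x, y, width, height, xoffset, yoffset, xadvance); Quad, layoutText and demoFont are hypothetical, and the two glyphs in demoFont are made-up values, not data from a real font.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Per-character metrics, as loaded from the font description file.
struct Glyph {
    int x, y;              // top-left of the glyph inside the atlas, in pixels
    int width, height;     // glyph size inside the atlas, in pixels
    int xoffset, yoffset;  // offset from the cursor to the glyph's top-left
    int xadvance;          // how far the cursor moves after this glyph
};

struct Quad {
    float x0, y0, x1, y1;  // screen rectangle
    float u0, v0, u1, v1;  // texture coordinates, normalized to the atlas
};

// Two invented glyphs, only so the sketch can be exercised.
std::map<char, Glyph> demoFont() {
    return {{'A', Glyph{0, 0, 16, 32, 1, 2, 18}},
            {'B', Glyph{16, 0, 16, 32, 0, 2, 17}}};
}

// One quad per character, advancing the cursor so letters do not overlap.
std::vector<Quad> layoutText(const std::string& text, const std::map<char, Glyph>& font,
                             float atlasWidth, float atlasHeight,
                             float startX, float startY) {
    assert(atlasWidth > 0.0f && atlasHeight > 0.0f);
    std::vector<Quad> quads;
    float cursorX = startX;
    for (char c : text) {
        const auto it = font.find(c);
        if (it == font.end()) continue;  // character missing from the atlas: skip it
        const Glyph& g = it->second;
        Quad q;
        q.x0 = cursorX + g.xoffset;
        q.y0 = startY + g.yoffset;
        q.x1 = q.x0 + g.width;
        q.y1 = q.y0 + g.height;
        q.u0 = g.x / atlasWidth;
        q.v0 = g.y / atlasHeight;
        q.u1 = (g.x + g.width) / atlasWidth;
        q.v1 = (g.y + g.height) / atlasHeight;
        quads.push_back(q);
        cursorX += g.xadvance;
    }
    return quads;
}
```

The resulting quads can then be drawn in one batch with the atlas bound once, which is where the performance gain over rendering a bitmap per frame comes from.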

Can't correctly rotate cylinder in openGL to desired position

I've got a little objective-c utility program that renders a convex hull. (This is to troubleshoot a bug in another program that calculates the convex hull in preparation for spatial statistical analysis). I'm trying to render a set of triangles, each with an outward-pointing vector. I can get the triangles without problems, but the vectors are driving me crazy.
I'd like the vectors to be simple cylinders. The problem is that I can't just declare coordinates for where the top and bottom of the cylinders belong in 3D (e.g., like I can for the triangles). I have to make them and then rotate and translate them from their default position along the z-axis. I've read a ton about Euler angles, and angle-axis rotations, and quaternions, most of which is relevant, but not directed at what I need: most people have a set of objects and then need to rotate the object in response to some input. I need to place the object correctly in the 3D "scene".
I'm using the Cocoa3DTutorial classes to help me out, and they work great as far as I can tell, but the rotation bit is killing me.
Here is my current effort. It gives me cylinders that are located correctly, but they all point along the z-axis. (In the rendering, we are looking in the -z direction. The triangle poking out behind is not part of the hull; it is there for testing/debugging. The orthogonal cylinders are coordinate axes, more or less, and the spheres are to make sure the axes are located correctly, since I have to use rotation to place those cylinders too. And BTW, when I use that algorithm, the out-vectors fail as well, although in a different way: they come out normal to the planes, but all point in +z instead of some pointing in -z.)
from Render3DDocument.m:
// Make the out-pointing vector
C3DTCylinder *outVectTube;
C3DTEntity *outVectEntity;
Point3DFloat *sideCtr = [thisSide centerOfMass];
outVectTube = [C3DTCylinder cylinderWithBase: tubeRadius top: tubeRadius height: tubeRadius*10 slices: 16 stacks: 16];
outVectEntity = [C3DTEntity entityWithStyle:triColor
geometry:outVectTube];
Point3DFloat *outVect = [[thisSide inVect] opposite];
Point3DFloat *unitZ = [Point3DFloat pointWithX:0 Y:0 Z:1.0f];
Point3DFloat *rotAxis = [outVect crossWith:unitZ];
double rotAngle = [outVect angleWith:unitZ];
[outVectEntity setRotationX: rotAxis.x
Y: rotAxis.y
Z: rotAxis.z
W: rotAngle];
[outVectEntity setTranslationX:sideCtr.x - ctrX
Y:sideCtr.y - ctrY
Z:sideCtr.z - ctrZ];
[aScene addChild:outVectEntity];
(Note that Point3DFloat is basically a vector class, and that a Side (like thisSide) is a set of four Point3DFloats, one for each vertex, and one for a vector that points towards the center of the hull).
from C3DTEntity.m:
if (_hasTransform) {
    glPushMatrix();
    // Translation
    if ((_translation.x != 0.0) || (_translation.y != 0.0) || (_translation.z != 0.0)) {
        glTranslatef(_translation.x, _translation.y, _translation.z);
    }
    // Scaling
    if ((_scaling.x != 1.0) || (_scaling.y != 1.0) || (_scaling.z != 1.0)) {
        glScalef(_scaling.x, _scaling.y, _scaling.z);
    }
    // Rotation
    glTranslatef(-_rotationCenter.x, -_rotationCenter.y, -_rotationCenter.z);
    if (_rotation.w != 0.0) {
        glRotatef(_rotation.w, _rotation.x, _rotation.y, _rotation.z);
    } else {
        if (_rotation.x != 0.0)
            glRotatef(_rotation.x, 1.0f, 0.0f, 0.0f);
        if (_rotation.y != 0.0)
            glRotatef(_rotation.y, 0.0f, 1.0f, 0.0f);
        if (_rotation.z != 0.0)
            glRotatef(_rotation.z, 0.0f, 0.0f, 1.0f);
    }
    glTranslatef(_rotationCenter.x, _rotationCenter.y, _rotationCenter.z);
}
I added the bit in the above code that uses a single rotation around an axis (the "if (_rotation.w != 0.0)" bit), rather than a set of three rotations. My code is likely the problem, but I can't see how.
If your outvects don't all point in the correct direction, you might have to check your triangles' winding: are they all oriented the same way?
Additionally, it might be helpful to draw a line for each outvect. Use the average of the three vertices of your triangle as the origin, and draw a line a few units long (depending on your scene's scale) in the direction of the outvect. This way, you can be sure that all your vectors are oriented correctly.
How do you calculate your outvects?
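The problem turned out to be that glRotatef() expects degrees and I was giving it radians. In addition, the sign of the rotation was wrong: glRotatef() treats counterclockwise rotation about the axis as positive, and the axis outVect x unitZ with a positive angle turns outVect onto the z-axis, whereas the cylinder has to turn the other way, from the z-axis onto outVect. This is the corrected code: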
double rotAngle = -[outVect angleWith:unitZ]; // radians
[outVectEntity setRotationX: rotAxis.x
Y: rotAxis.y
Z: rotAxis.z
W: rotAngle * 180.0 / M_PI ];
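The same rotation can also be built with the cross product taken in the opposite order (unitZ x outVect), in which case the angle stays positive. The sketch below is a standalone illustration with hypothetical names (Vec3, AxisAngle, rotationFromZ); it returns the axis and the angle in degrees, the form glRotatef() takes, and handles the case where the target is parallel to the z-axis and the cross product vanishes.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
struct AxisAngle { Vec3 axis; double degrees; };

// Axis and angle (in degrees) that rotate the +z axis onto `target`.
// The axis is z x target, so a positive, counterclockwise angle is correct.
// The axis is left unnormalized.
AxisAngle rotationFromZ(Vec3 target) {
    const double len = std::sqrt(target.x * target.x + target.y * target.y +
                                 target.z * target.z);
    assert(len > 0.0);
    const double kPi = std::acos(-1.0);

    // (0, 0, 1) x (tx, ty, tz) = (-ty, tx, 0)
    Vec3 axis{-target.y, target.x, 0.0};

    // Angle between +z and target, from the dot product; clamp against rounding.
    double cosAngle = target.z / len;
    if (cosAngle > 1.0) cosAngle = 1.0;
    if (cosAngle < -1.0) cosAngle = -1.0;
    const double degrees = std::acos(cosAngle) * 180.0 / kPi;

    // Target parallel or antiparallel to z: the cross product is zero,
    // so fall back to any axis perpendicular to z.
    if (axis.x == 0.0 && axis.y == 0.0) axis = Vec3{1.0, 0.0, 0.0};
    return AxisAngle{axis, degrees};
}
```

For a target along +x this gives the axis (0, 1, 0) and an angle of 90 degrees, which turns a cylinder built along z so that it lies along x.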
I can now see that my other program has the inVects wrong (the outVects are poking through the hull instead of pointing out from each face), and I can now track down that bug in the other program...tomorrow.

Fragment-shader blur ... how does this work?

uniform sampler2D sampler0;
uniform vec2 tc_offset[9];
void blur()
{
    vec4 sample[9];
    for(int i = 0; i < 9; ++i)
        sample[i] = texture2D(sampler0, gl_TexCoord[0].st + tc_offset[i]);
    gl_FragColor = (sample[0] + (2.0 * sample[1]) + sample[2] +
                    (2.0 * sample[3]) + sample[4] + 2.0 * sample[5] +
                    sample[6] + 2.0 * sample[7] + sample[8] ) / 13.0;
}
How does the sample[i] = texture2D(sampler0, ...) line work?
It seems like to blur an image, I have to first generate the image, yet here, I'm somehow trying to query the very image I'm generating. How does this work?
It applies a blur kernel to the image. tc_offset needs to be properly initialized by the application to form a 3x3 area of sampling points around the actual texture coordinate:
0 0 0
0 x 0
0 0 0
(assuming x is the original coordinate). The offset for the upper-left sampling point would be -1/width,-1/height. The offset for the center point needs to be carefully aligned to texel center (the off-by-0.5 problem). Also, the hardware bilinear filter can be used to cheaply increase the amount of blur (by sampling between texels).
The rest of the shader scales the samples by their distance. Usually, this is precomputed as well:
for(int i = 0; i < NUM_SAMPLES; ++i) {
result += texture2D(sampler,texcoord+offsetscaling[i].xy)*offsetscaling[i].z;
}
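On the application side, the offsets and weights could be precomputed along these lines. This is a sketch under assumptions: Tap and makeBlurTaps are hypothetical names, the taps run row by row starting at the (-1, -1) corner, and the weights are the ones from the question's shader (1 2 1 2 1 2 1 2 1, summing to 13) divided by their total.

```cpp
#include <array>
#include <cassert>

// One sampling point: offset in normalized texture coordinates plus its weight,
// i.e. the xy and z parts of offsetscaling[i].
struct Tap { float dx, dy, weight; };

// Build the nine taps of a 3x3 kernel around the current texel.
// Neighbouring texels are 1/width and 1/height apart in normalized coordinates.
std::array<Tap, 9> makeBlurTaps(int textureWidth, int textureHeight) {
    assert(textureWidth > 0 && textureHeight > 0);
    const float kernel[9] = {1.0f, 2.0f, 1.0f,
                             2.0f, 1.0f, 2.0f,
                             1.0f, 2.0f, 1.0f};
    float total = 0.0f;
    for (float k : kernel) total += k;  // 13 for this kernel

    std::array<Tap, 9> taps{};
    for (int row = 0; row < 3; ++row) {
        for (int col = 0; col < 3; ++col) {
            const int i = row * 3 + col;
            taps[i].dx = static_cast<float>(col - 1) / static_cast<float>(textureWidth);
            taps[i].dy = static_cast<float>(row - 1) / static_cast<float>(textureHeight);
            taps[i].weight = kernel[i] / total;  // normalized so the weights sum to 1
        }
    }
    return taps;
}
```

The dx and dy values would be uploaded as tc_offset (or as the xy part of offsetscaling), and the weights as its z part. Whether row 0 is the upper or lower row on screen depends on the texture coordinate convention in use.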
One way is to render your original image to a texture, not to the screen.
Then you draw a full-screen quad using this shader, with the texture as its input, to post-process the image.
As you note, in order to make a blurred image, you first need to make an image, and then blur it. This shader does (just) the second step, taking an image that was generated previously and blurring it. There needs to be additional code elsewhere to generate the original non-blurred image.