How to detect an image between shapes from camera - objective-c

I've been searching the web for how to do this, and I know it needs to be done with OpenCV. The problem is that all the tutorials and examples I find are about detecting separate shapes or about template matching.
What I need is a way to detect the contents between 3 circles (which can be a photo or something else). From what I've found, it's not too difficult to find the circles with the camera using contours, but how do I extract what is between them? The circles work like a pattern on the image to grab what is "inside the pattern".
Do I need to use the contours of each circle and measure the distance between them to grab my contents? If so, what if the image is a bit rotated/distorted on the camera?
I'm using Xamarin.iOS for this, but from what I've seen so far I believe I need to go native, so any Objective-C example is welcome too.
EDIT
Imagining that the image captured by the camera is this:
What I want is to match the 3 circles and get the following part of the image as result:
Since the images come from the camera, they can be rotated or scaled up/down.

The warpAffine function will let you map the desired area of the source image to a destination image, performing cropping, rotation and scaling in a single go.
Talking about rotation and scaling seems to indicate that you want to extract a rectangle of a given aspect ratio, and hence perform a similarity transform. To define such a transform, three points are more than you need; two suffice. The construction of the affine matrix is a little tricky.
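As a minimal sketch of that construction (not the answerer's own code): treating 2D points as complex numbers, a similarity transform is z' = a*z + b, and two point correspondences determine a and b. The function names similarity_matrix and apply_affine are made up for this illustration. The result is laid out as the 2x3 source-to-destination matrix that warpAffine takes by default.

```python
def similarity_matrix(src0, src1, dst0, dst1):
    """Return the 2x3 matrix of the similarity mapping src0->dst0 and src1->dst1.

    Points are (x, y) tuples. Illustrative helper, not an OpenCV function.
    """
    p0, p1 = complex(*src0), complex(*src1)
    q0, q1 = complex(*dst0), complex(*dst1)
    if p0 == p1:
        raise ValueError("source points must be distinct")
    a = (q1 - q0) / (p1 - p0)  # rotation and uniform scale in one complex number
    b = q0 - a * p0            # translation
    return [[a.real, -a.imag, b.real],
            [a.imag, a.real, b.imag]]


def apply_affine(m, pt):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = pt
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


# Two detected circle centres (source) and where they should land (destination).
m = similarity_matrix((10, 10), (10, 30), (0, 0), (100, 0))
print(apply_affine(m, (10, 30)))
```

The third circle is not needed to define the transform, but it can serve as a check: mapping its centre through the matrix should land close to its expected destination.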


GODOT: What is an efficient calculation for the AABB of a simple 3D model from a camera's view

I am attempting to come up with a quick and efficient means of translating a 3D mesh into a projected AABB. In the end, I would like to accomplish something similar to figure 1, wherein only the area of the screen covered by the cube is located inside the bounding box highlighted in red. (If it is at all possible, getting the area as small as possible, highlighted in blue, would increase efficiency down the road.)
Figure 1. https://i.imgur.com/pd0E20C.png
Currently, I have tried:
Calculating the point position on the screen using camera.unproject_position(). This failed largely due to my inability to wrap my head around the pixel positions trending towards infinity. I understand it has something to do with tan, but frankly, it is too late for my brain to function anymore.
Getting the area of collision between the view frustum and the AABB of the mesh instance. This method seems convoluted, and to get it in a usable format I would need to project the result into 2D coordinates again.
Using the MeshInstance VisualInstance to create a texture wherein a pixel is white if it contains the mesh instance, and black otherwise. Visual instances in general just baffle me, and I did not think it would be efficient to have another viewport just to output this texture.
What I am looking for:
An output that can be passed to a shader informing where to complete certain calculations. Right now this is set up to use a bounding box, but it could easily be rewritten to also use a texture. It also could be rewritten to use polygons, but I am trying to keep calculations to a minimum in the shader.
Certain solutions I have tried before have worked, slightly, but this must be robust. The camera interfacing with the 3D object will be able to move completely around and through it, meaning that at times the view will be completely surrounded by the 3D model, with points both in front and behind.
Thank you for any help you can provide.
I will try my best to update this post with information if needed.

Perspective distortions correction

I'm searching for methods of text recognition based on document borders.
Or methods that can solve the problem of finding a new viewpoint.
For example, the camera is at point (x1,y1,z1) and the resulting picture has perspective distortions, but we can find a point (x2,y2,z2) for the camera from which the picture would be corrected.
Thanks.
The usual approach, which assumes that the document's page is approximately flat in 3D space, is to warp the quadrangle encompassing the page into a rectangle. To do so you must estimate a homography, i.e. a (linear) projective transformation between the original image and its warped counterpart.
The estimation requires matching points (or lines) between the two images, and a common choice for documents is to map the page corners in the original images to the image corners of the warped image. This will in general produce a rectangle with an incorrect aspect ratio (i.e. the warped page will look "wider" or "taller" than the real one), but this can be easily corrected if you happen to know in advance what the real aspect ratio is (for example, because you know the type of paper used, whether letter, A4, etc.).
A simple algorithm to perform the estimation is the so-called Direct Linear Transformation.
The OpenCV library contains routines that help accomplish all these tasks; look into it.
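To make the estimation step concrete, here is a minimal sketch for the special case of exactly four correspondences (page corners to rectangle corners). It uses the inhomogeneous form of the Direct Linear Transformation equations, fixing the bottom-right entry of the homography to 1 and solving the resulting 8x8 system by Gaussian elimination; the general algorithm handles more points and solves the homogeneous system with an SVD. All function names here are made up for the illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_4_points(src, dst):
    """Estimate the 3x3 homography mapping four src points to four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h, pt):
    """Map a point through the homography, including the perspective divide."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical page corners found in a photo, mapped to an A4-proportioned rectangle.
page = [(12, 20), (310, 45), (290, 400), (30, 380)]
rect = [(0, 0), (210, 0), (210, 297), (0, 297)]
H = homography_from_4_points(page, rect)
```

Choosing the destination rectangle with the known paper proportions (210 x 297 for A4 here) is what takes care of the aspect-ratio issue mentioned above.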

Which pixels did that drawmesh operation just draw to?

Ok, it's a relatively simple problem: I want to know where, in screen space, a particular mesh was just drawn. I plan on then storing that information in a data store of some kind, so that when I interact with something in screen space I can look it up in the register and find the object, i.e. click on the spaceship drawn on the screen and then select a target, etc.
I can't find any way of finding out which pixels the mesh was drawn to though...
Alternatively, if I'm missing something obvious regarding what it is that I want to do, please let me know!
There is no easy way to do that. But you can use another texture as render target and render those meshes with unique colors.
So, for example, you assign #FF0000 to your mesh A and also draw it to your second render target in that color. Now when you pick a pixel from the second render target and look at its color, if it is #FF0000 you know that the pixel is part of mesh A. Thus you can easily find the mesh drawn at a certain pixel when you click on it.
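The bookkeeping side of this can be sketched as follows. This is only an illustration of the ID-to-color mapping, not rendering code: id_buffer is a hypothetical stand-in for the pixels read back from the second render target (rows of (r, g, b) tuples), and the function names are made up. ID 0 is reserved for the background.

```python
def id_to_rgb(mesh_id):
    """Encode a mesh ID (1..0xFFFFFF) as an (r, g, b) color; 0 is background."""
    if not 0 < mesh_id <= 0xFFFFFF:
        raise ValueError("mesh_id must fit in 24 bits and be non-zero")
    return ((mesh_id >> 16) & 0xFF, (mesh_id >> 8) & 0xFF, mesh_id & 0xFF)


def rgb_to_id(rgb):
    """Decode an (r, g, b) color back into the mesh ID it encodes."""
    r, g, b = rgb
    return (r << 16) | (g << 8) | b


def pick(id_buffer, x, y, meshes_by_id):
    """Return the mesh drawn at pixel (x, y), or None for background."""
    return meshes_by_id.get(rgb_to_id(id_buffer[y][x]))


# A tiny 2x2 "render target": mesh A (#FF0000) covers the top-left pixel only.
meshes = {0xFF0000: "mesh A"}
buffer = [[id_to_rgb(0xFF0000), (0, 0, 0)],
          [(0, 0, 0), (0, 0, 0)]]
print(pick(buffer, 0, 0, meshes))
print(pick(buffer, 1, 1, meshes))
```

For the decoded colors to be exact, the ID pass has to be drawn flat: no lighting, blending, or anti-aliasing, since any of those would alter the color values.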
Why don't you unproject your screen-space coordinates into 3D space? The only complication I had was that I'd be left with a ray rather than a single point. I could check whether a mesh intersected that ray, but I often had multiple candidates for 'picking'.
Check out Google for DirectX Unproject and there are various articles discussing it. It's sometimes complicated for some to implement but done well it's actually pretty nifty; don't get put off by the people online who say it doesn't work, it does work!

Simple algorithm for tracking a rectangular blob

I have created an experimental fast rectangular object tracking system; it will be used for head tracking and controlling objects in a 3D engine (Ogre3D).
For now I am able to show the webcam any kind of brightly colored rectangle (text markers are good objects), and the system registers basic properties of this object (hue/value/lightness and initial width and height at 0 degrees rotation).
After I have registered the trackable object, I do some simple frame processing to create a grayscale probability map.
So now I have 2 known things:
1) 4 corners for the last object position (it's always a rectangle but it may be rotated)
2) a pretty rectangular (but still far from perfect) blob which is the brightest in the frame. I can get coordinates of any point of the blob without problems, point detection is stable enough.
I can find a bounding rectangle of the object without problems, but I have a problem with detecting the object corners themselves.
I need the simplest possible (quick&dirty would be great) algorithm to scan the image starting with some known coordinates (a point inside the blob) and detect new 4 x,y coordinates of a "blobish" rectangle corners (not corners of a bounding box but corners of the rectangular blob itself).
Ready-to-use C++ function would be awesome, but somehow google doesn't like me today :(
I think that it would be overkill to use some complicated function from the OpenCV library just to extract 4 points of a single rectangular blob. But if you know a quick and efficient way to do it using OpenCV (it must be real-time and light on CPU because I'll run the 3D engine at the same time), then I would be really grateful.
You can apply a Hough transform to the segmented image to detect lines. Using the detected lines, you can calculate their intersections to find the corner coordinates of the blob.
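The intersection step can be sketched like this, assuming the four edge lines have already been detected and are given in the (rho, theta) normal form x*cos(theta) + y*sin(theta) = rho, which is the parameterisation OpenCV's HoughLines returns. The function names and the 30-degree threshold are choices made for this illustration.

```python
import math
from itertools import combinations


def intersect(line_a, line_b, min_angle=math.radians(1.0)):
    """Intersect two (rho, theta) lines; return None if they are near-parallel."""
    rho1, th1 = line_a
    rho2, th2 = line_b
    c1, s1 = math.cos(th1), math.sin(th1)
    c2, s2 = math.cos(th2), math.sin(th2)
    det = c1 * s2 - s1 * c2  # equals sin(th2 - th1)
    if abs(det) < math.sin(min_angle):
        return None
    # Cramer's rule on the 2x2 system formed by the two line equations
    x = (rho1 * s2 - rho2 * s1) / det
    y = (rho2 * c1 - rho1 * c2) / det
    return (x, y)


def rectangle_corners(lines, min_angle=math.radians(30.0)):
    """Intersect every pair of lines that meet at a reasonably large angle."""
    corners = []
    for a, b in combinations(lines, 2):
        p = intersect(a, b, min_angle)
        if p is not None:
            corners.append(p)
    return corners


# Four edges of an axis-aligned 10x5 rectangle: x=0, x=10, y=0, y=5.
edges = [(0.0, 0.0), (10.0, 0.0), (0.0, math.pi / 2), (5.0, math.pi / 2)]
corners = rectangle_corners(edges)
```

The angle threshold skips the two pairs of opposite edges, which on a noisy blob are nearly parallel and would otherwise intersect somewhere far outside the frame. With four clean edge lines this leaves exactly four corners.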

Is it possible to animate markers in ArcMap?

I'm completely new to ArcGIS and ArcMap, but someone suggested this program to me for a project I'm working on.
I would like to animate individual entities on a map, and was wondering if it is possible to do so in ArcMap. I asked this earlier here and a member directed me to a tutorial on animating in ArcGIS. The animation in the guide was over a map spread (ie. each pixel on the map displays, say, a different color to indicate population data in the area). However I realized that if I zoom in a lot, eventually the image will degenerate into pixels, which is why I need an actual object to mark a certain point. I checked some online tutorials and it seems like we can place markers on the map. Can someone tell me if it is possible to animate these markers (for example via a for-loop)? And if so, could you point me in a direction where to start?
Thanks in advance!
The short answer is that you can animate layers in ArcMap. It's not as simple as using the timeline feature in Google Earth, for example, but then ArcMap is much more than just a visualization tool.
This help page on the ESRI web help looks like a good place to start.
I'm not 100% sure what you mean by the image degenerating into pixels. Are you saying that the markers were single points in the layer? Unlike in Google Earth, you are not confined to simply plotting points on the map. You can draw completely arbitrary shapes in ArcMap, which can be defined to cover actual areas of the map, so when you zoom in the shape gets larger.
The way you need to load data into ArcMap to produce an animation isn't too simple. There might be other ways to do this, but the way I know of is to generate a NetCDF file. This file contains a 3D matrix of layer data, where each layer is separated through time. Because you generate a matrix, you are effectively placing a raster image over the map. Thus if you want to cover a large area, each matrix becomes large, and you multiply that by the number of time slices you wish to animate over.
Once you have a NetCDF file containing your data, however, getting ArcMap to animate it and produce, say, an .avi file is pretty simple.
You could try just loading some of the example NetCDF datasets into ArcMap to see how/if they will work to get you started.
Hope that helps.
The upcoming v10 will have better time-aware capabilities, which will allow for animation.