I segmented objects from a webcam image using image segmentation, but because of the bad lighting I get a lot of noise in the picture. I now want to improve the shape of the found objects. The only method I found is image opening and closing, but the result is not as good as I wished. Does anyone know some other methods?
In this folder I have the original image after segmentation, the image after closing, and a picture of the kind of result I'm looking for.
Thanks in advance
You could try the opening-closing combination operator, which smooths the shapes better than either operation alone. Furthermore, repeating it with increasing radii for the opening and closing (called ASF, alternating sequential filtering) produces smoother results.
There are a few demos online that use Fourier descriptors. It's also important to choose a good shape for the structuring element, since your closing seems to close convex holes in the image.
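As a rough illustration of alternating sequential filtering, here is a minimal pure-Python sketch. It assumes a binary image stored as a list of rows of 0/1 values and a square structuring element; the function names are made up for this example, and a real project would more likely use a library's morphology routines.

```python
def _morph(img, radius, op):
    """Apply op (min = erosion, max = dilation) over a square window.

    The window is clipped at the image border, so pixels outside the
    image are simply ignored (an assumption made for this sketch).
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            out[y][x] = op(window)
    return out


def erode(img, radius):
    return _morph(img, radius, min)


def dilate(img, radius):
    return _morph(img, radius, max)


def opening(img, radius):
    # Erosion followed by dilation: removes small bright specks.
    return dilate(erode(img, radius), radius)


def closing(img, radius):
    # Dilation followed by erosion: fills small dark holes.
    return erode(dilate(img, radius), radius)


def alternating_sequential_filter(img, max_radius):
    """Open then close, repeating with radii 1, 2, ..., max_radius."""
    out = img
    for radius in range(1, max_radius + 1):
        out = closing(opening(out, radius), radius)
    return out
```

Starting with a small radius and growing it removes fine noise first, so that larger structuring elements act on an already cleaned shape. Note that an opening with a large radius will also remove objects that are thinner than the structuring element.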
If it is salt/pepper noise, a simple median filter may be better.
I am attempting to come up with a quick and efficient means of translating a 3D mesh into a projected AABB. In the end, I would like to accomplish something similar to figure 1, wherein only the area of the screen covered by the cube is located inside the bounding box highlighted in red. (If at all possible, getting the area as small as possible, highlighted in blue, would increase efficiency down the road.)
Figure 1. https://i.imgur.com/pd0E20C.png
Currently, I have tried:
Calculating the point position on the screen using camera.unproject_position(). This failed largely due to my inability to wrap my head around the pixel positions trending towards infinity. I understand it has something to do with tan, but frankly, it is too late for my brain to function anymore.
Getting the area of collision between the view frustum and the AABB of the mesh instance. This method seems convoluted, and to get it in a usable format I would need to project the result into 2d coordinates again.
Using the MeshInstance VisualInstance to create a texture wherein a pixel is white if it contains the mesh instance, and black otherwise. Visual instances in general just baffle me, and I did not think it would be efficient to have another viewport just to output this texture.
What I am looking for:
An output that can be passed to a shader informing where to complete certain calculations. Right now this is set up to use a bounding box, but it could easily be rewritten to also use a texture. It also could be rewritten to use polygons, but I am trying to keep calculations to a minimum in the shader.
Certain solutions I have tried before have worked, slightly, but this must be robust. The camera interfacing with the 3d object will be able to move completely around and through it, meaning at times the view will be completely surrounded by the 3d model with points both in front, and behind.
Thank you for any help you can provide.
I will try my best to update this post with information if needed.
I'm searching for methods of text recognition based on document borders,
or for methods that can solve the problem of finding a new viewpoint.
For example, the camera is at point (x1,y1,z1) and the resulting picture has perspective distortions, but we can find a camera position (x2,y2,z2) that corrects the picture.
Thanks.
The usual approach, which assumes that the document's page is approximately flat in 3D space, is to warp the quadrangle encompassing the page into a rectangle. To do so you must estimate a homography, i.e. a (linear) projective transformation between the original image and its warped counterpart.
The estimation requires matching points (or lines) between the two images, and a common choice for documents is to map the page corners in the original image to the image corners of the warped image. This will in general produce a rectangle with an incorrect aspect ratio (i.e. the warped page will look "wider" or "taller" than the real one), but this can be easily corrected if you happen to know in advance what the real aspect ratio is (for example, because you know the type of paper used, whether letter, A4, etc.).
A simple algorithm to perform the estimation is the so-called Direct Linear Transformation.
The OpenCV library contains routines that help accomplish all these tasks; look into it.
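To make the estimation step concrete, here is a minimal pure-Python sketch of the four-point case. It assumes exactly four corner correspondences and fixes the last matrix entry to 1 (the inhomogeneous form of the Direct Linear Transformation), which reduces the problem to an 8x8 linear system. The function names and the sample corner coordinates are illustrative assumptions, not taken from any library.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def estimate_homography(src, dst):
    """Return the 3x3 matrix H (with H[2][2] = 1) mapping src[i] to dst[i]."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2D point through H, including the perspective divide."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


# Hypothetical page corners in the photo, mapped to an A4-shaped rectangle.
page_corners = [(120, 80), (520, 110), (560, 640), (90, 600)]
a4_corners = [(0, 0), (210, 0), (210, 297), (0, 297)]
H = estimate_homography(page_corners, a4_corners)
```

Choosing the target rectangle with the known paper proportions (210 by 297 for A4 here) is what corrects the aspect ratio. With more than four correspondences or noisy points, a least-squares solution such as the one OpenCV provides would be the more robust choice.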
I am working with the Kinect in openFrameworks using the ofxKinect addon, which is great and plenty fun!
Anyway I am looking for some pointers or a direction when dealing with multiple bodies on the screen. I was thinking of making a rect around each detected body and when the rects intersect something could happen, an effect or anything.
So what I am looking for are ideas or something that could point me to the right direction of detecting multiple bodies when using a kinect.
Right now, based on the depth image I get from the Kinect, I go through each pixel and create a bunch of smaller rectangles with a padding, and group them in a larger bounding rectangle if they are separate from another rectangle group. This is not ideal, as it only deals with the pixel values and is not really separating bodies from each other, and it is not giving me the results I am looking for.
So any ideas would be greatly appreciated!
If you want to use ofxKinect, a quick solution would be to threshold on depth and assume that bodies, and no other objects, will be within a given depth range. This should make it easy to use OpenCV's contour finder to isolate the outlines of the bodies and get the bounding rectangles. If the rectangles intersect (and ofRectangle already does the math for you), trigger the reaction you need. Also don't forget to trigger it only once, when the effect isn't showing already; otherwise you will trigger the effect multiple times per second while the two bodies' bounding rectangles intersect.
You could try something a bit more hardcore and use ofxCv (not just ofxOpenCV) to tap into the HoG functionality. This is slow in itself and not ideal with the depth map, but hopefully you can run it every few seconds just to detect a person and the depth, then keep tracking that movement.
Personally, if you want to track people with the Kinect, I recommend using ofxOpenNI, as it already provides the scene segmentation feature, and even if you don't track the skeletons you can still get useful information like the pixels pertaining to each body and their centre of mass. I'm guessing the Microsoft Kinect SDK has a similar feature and there should be an oF addon, but it's Windows only.
ofxKinect/libfreenect does not offer any people detection features, so you will need to roll your own.
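As a rough sketch of the first suggestion (depth threshold, bounding rectangles, one-shot trigger), here is a small pure-Python illustration. It is not openFrameworks code: the depth frame is assumed to be a plain list of rows of depth values, a simple flood fill stands in for the contour finder, and all names (`bounding_boxes`, `boxes_intersect`, `OverlapTrigger`) are hypothetical.

```python
from collections import deque


def bounding_boxes(depth, near, far):
    """Return (x0, y0, x1, y1) boxes of 4-connected blobs with near <= depth <= far."""
    h, w = len(depth), len(depth[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if seen[y][x] or not (near <= depth[y][x] <= far):
                continue
            # Flood-fill one blob and grow its bounding box as we go.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < w and 0 <= ny < h and not seen[ny][nx]
                            and near <= depth[ny][nx] <= far):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def boxes_intersect(a, b):
    """True if two inclusive pixel boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class OverlapTrigger:
    """Fires once when an overlap starts, not on every frame it lasts."""

    def __init__(self):
        self.active = False

    def update(self, boxes):
        overlapping = any(
            boxes_intersect(a, b)
            for i, a in enumerate(boxes)
            for b in boxes[i + 1:]
        )
        fired = overlapping and not self.active
        self.active = overlapping
        return fired
```

Calling `update` once per frame returns `True` only on the frame where an overlap begins, which avoids retriggering the effect many times per second. In practice you would also want to discard very small blobs before building the boxes, since depth noise produces many tiny regions.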
Imagine I have a rectangle, say 400px x 300px. Then let’s say I want to load an image into it. All of this is very easy using System.Drawing.DrawImage.
Rectangle http://img576.imageshack.us/img576/2363/rectangle.gif
But then I want to leave the left hand side as 300px but change the right hand side to 250px. I can draw the box using 4 DrawLines, but I don’t know how to squash the image into the new shape. I want the right hand side of the shape to be 250px, the left side 300px, and the top and bottom 400px.
Resized http://img85.imageshack.us/img85/3479/rectangle2.gif
I can’t use DrawImage as it expects the left and right sizes to be the same. Is there a way to manipulate the image into the new shape?
I've looked at other questions, but they only apply where the left and right hand sides are equal.
Any thoughts on how to squash an image into a shape which does not have parallel sides?
(If it helps, I'm happy to sacrifice image quality to fit the right shape.)
Disclaimer: I work for Atalasoft.
Our DotImage product has a command called QuadrilateralWarpCommand that can do this. It's in DotImage Photo.
What you want to do is non-trivial (but also very powerful).
@Heinzi is correct, the general class is called warp transformations. What you're trying to do is specifically a perspective transformation. At a high level, it involves running the individual coordinates through a transformation matrix to get their new positions, and then doing interpolation between pixel values based on the old and new locations.
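To illustrate those two steps, here is a minimal sketch in Python rather than C#, kept deliberately library-free. It assumes a grayscale image stored as a list of rows and a 3x3 matrix that maps each destination pixel back to a source position (inverse mapping, so every output pixel gets a value); the function names and the matrix are assumptions for this example, not part of System.Drawing.

```python
import math


def transform_point(m, x, y):
    """Run (x, y) through a 3x3 matrix, including the perspective divide."""
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return (
        (m[0][0] * x + m[0][1] * y + m[0][2]) / w,
        (m[1][0] * x + m[1][1] * y + m[1][2]) / w,
    )


def sample_bilinear(img, x, y):
    """Interpolate between the four pixels surrounding (x, y)."""
    h, w = len(img), len(img[0])
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp(img, dst_to_src, out_w, out_h, fill=0.0):
    """Build the output by looking up each destination pixel in the source."""
    h, w = len(img), len(img[0])
    out = [[fill] * out_w for _ in range(out_h)]
    for v in range(out_h):
        for u in range(out_w):
            try:
                x, y = transform_point(dst_to_src, u, v)
            except ZeroDivisionError:
                continue  # point maps to infinity; keep the fill value
            if 0 <= x <= w - 1 and 0 <= y <= h - 1:
                out[v][u] = sample_bilinear(img, x, y)
    return out
```

The matrix itself would come from the four corner correspondences between the rectangle and the target quadrilateral. Pixels whose source position falls outside the image keep the fill value, which is what leaves the area around the squashed shape empty.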
This article talks about some transformations, one of them being a shear, so it might be helpful overall. I'm not sure, I haven't read it closely. In general, you want to google for something approximately like "c# image transformation" or "c# image perspective transform".
Depending on what you're planning on using it for, buying a library might be the best way to go about it, although there is a lot to learn about image manipulation by doing it yourself.
I did not find a solution to your problem, but I have some information which might help you along:
What you want is called a warp transformation.
As far as I know, the .net framework natively supports this kind of transformation only for a GraphicsPath, namely, the GraphicsPath.Warp method. Unfortunately, I don't think that this will help you, unless you are willing to redraw your image using a .net GraphicsPath object.
If you need the transformation directly in the UI layer, your UI library might help: Silverlight, for example, includes the PlaneProjection class, which can be used for such effects; in WPF, the 3D engine might be useful for this (requiring more programming effort, though).
I'm completely new to ArcGIS and ArcMap, but someone suggested this program to me for a project I'm working on.
I would like to animate individual entities on a map, and was wondering if it is possible to do so in ArcMap. I asked this earlier here and a member directed me to a tutorial on animating in ArcGIS. The animation in the guide was over a map spread (ie. each pixel on the map displays, say, a different color to indicate population data in the area). However I realized that if I zoom in a lot, eventually the image will degenerate into pixels, which is why I need an actual object to mark a certain point. I checked some online tutorials and it seems like we can place markers on the map. Can someone tell me if it is possible to animate these markers (for example via a for-loop)? And if so, could you point me in a direction where to start?
Thanks in advance!
The short answer is that you can animate layers in ArcMap. It's not as simple as using the timeline feature in Google Earth, for example, but then ArcMap is much more than just a visualization tool.
This help page on the ESRI web help looks like a good place to start.
I'm not 100% sure what you mean by the image degenerating into pixels. Are you saying that the markers were single points in the layer? Unlike in Google Earth, you are not confined to simply plotting points on the map. You can draw completely arbitrary shapes in ArcMap, which can be defined to cover actual areas of the map, so when you zoom in the shape gets larger.
The way you need to load data into ArcMap to produce an animation isn't too simple. There might be other ways to do this, but the way I know of is to generate a NetCDF file. This file contains a 3D matrix of layer data, where each layer is separated through time. Because you generate a matrix, you are effectively placing a raster image over the map. Thus if you want to cover a large area, each matrix becomes large, and you multiply that by the number of time slices you wish to animate over.
Once you have a NetCDF file with your data in however, getting ArcMap to animate it and produce say a .avi file is pretty simple.
You could try just loading some of the example NetCDF datasets into ArcMap to see how/if they will work to get you started.
Hope that helps.
The upcoming v10 will have better time-aware capabilities, which will allow for animation.