Detect multiple bodies with a Kinect?

I am working with the Kinect in openFrameworks using the ofxKinect addon, which is great and plenty of fun!
Anyway, I am looking for some pointers or a direction for dealing with multiple bodies on the screen. I was thinking of putting a rectangle around each detected body; when the rectangles intersect, something could happen, such as an effect.
So what I am looking for are ideas, or anything that could point me in the right direction for detecting multiple bodies with a Kinect.
Right now, based on the depth image I get from the Kinect, I go through each pixel and create a bunch of smaller rectangles with some padding, grouping them into a larger bounding rectangle when they are separate from another rectangle group. This is not ideal, as it only deals with the pixel values; it does not really separate bodies from each other and is not giving me the results I am looking for.
So any ideas would be greatly appreciated!

If you want to use ofxKinect, a quick solution would be to threshold on depth and assume that bodies, and no other objects, will be within a certain depth range. This makes it easy to use OpenCV's contour finder to isolate the outlines of the bodies and get their bounding rectangles. If the rectangles intersect (and ofRectangle already does the math for you), trigger the reaction you need. Also, make sure you trigger it only once, while the effect isn't showing yet; otherwise you will trigger it many times per second for as long as the two bodies' bounding rectangles intersect.
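A minimal sketch of that pipeline, loosely following the ofxKinect depth-threshold example (the threshold and blob-area values are placeholders to tune, and accessor names vary a little between openFrameworks versions):

    // ofApp.h (excerpt)
    #include "ofxKinect.h"
    #include "ofxOpenCv.h"

    class ofApp : public ofBaseApp {
    public:
        void setup();
        void update();

        ofxKinect kinect;
        ofxCvGrayscaleImage grayImage;    // depth frame as grayscale
        ofxCvContourFinder contourFinder;
        bool effectRunning = false;
    };

    // ofApp.cpp (excerpt)
    void ofApp::setup() {
        kinect.init();
        kinect.open();
        grayImage.allocate(kinect.width, kinect.height);
    }

    void ofApp::update() {
        kinect.update();
        if (!kinect.isFrameNew()) return;

        grayImage.setFromPixels(kinect.getDepthPixels());
        // Keep only pixels brighter (nearer) than the threshold, assuming
        // bodies are the only things inside that depth band.
        grayImage.threshold(55);
        // Look for up to 2 blobs between the min and max pixel areas.
        contourFinder.findContours(grayImage, 1000, 640 * 480 / 2, 2, false);

        if (contourFinder.nBlobs == 2) {
            ofRectangle a = contourFinder.blobs[0].boundingRect;
            ofRectangle b = contourFinder.blobs[1].boundingRect;
            if (a.intersects(b)) {
                if (!effectRunning) {   // trigger the reaction only once
                    effectRunning = true;
                    // startEffect();   // hypothetical: your reaction here
                }
            } else {
                effectRunning = false;  // re-arm once the bodies separate
            }
        }
    }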
You could try something a bit more hardcore and use ofxCv (not just ofxOpenCV) to tap into OpenCV's HOG person-detection functionality. This is slow in itself and not ideal with the depth map, but you could hopefully run it every few seconds just to detect a person and their depth, then keep tracking that movement.
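For reference, a bare-bones version of that HOG pass using plain OpenCV (ofxCv can convert between ofPixels and cv::Mat); the function name is illustrative:

    // Minimal OpenCV HOG people-detection pass; run it on the RGB frame
    // every few seconds rather than every frame, since it is slow.
    #include <opencv2/opencv.hpp>
    #include <vector>

    std::vector<cv::Rect> detectPeople(const cv::Mat& rgbFrame) {
        static cv::HOGDescriptor hog;
        static bool initialized = false;
        if (!initialized) {
            hog.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());
            initialized = true;
        }
        std::vector<cv::Rect> found;   // bounding boxes of detected people
        hog.detectMultiScale(rgbFrame, found);
        return found;
    }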
Personally, if you want to track people with the Kinect, I recommend using ofxOpenNI, as it already provides scene segmentation; even if you don't track the skeletons, you can still get useful information such as the pixels pertaining to each body and their centre of mass. I'm guessing the Microsoft Kinect SDK has a similar feature, and there should be an oF addon for it, but it's Windows-only.
ofxKinect/libfreenect does not offer any people detection features, so you will need to roll your own.

Related

GODOT: What is an efficient calculation for the AABB of a simple 3D model from a camera's view

I am attempting to come up with a quick and efficient means of translating a 3D mesh into a projected AABB. In the end, I would like to accomplish something similar to figure 1, wherein only the area of the screen covered by the cube is located inside the bounding box highlighted in red. (If it is at all possible, getting the area as small as possible, highlighted in blue, would increase efficiency down the road.)
Figure 1. https://i.imgur.com/pd0E20C.png
Currently, I have tried:
Calculating the point positions on the screen using camera.unproject_position(). This failed largely due to my inability to wrap my head around the pixel positions trending towards infinity (a sketch of this corner-projection approach, including the behind-camera case, follows this list). I understand it has something to do with tan, but frankly, it is too late for my brain to function anymore.
Getting the area of collision between the view frustum and the AABB of the mesh instance. This method seems convoluted, and to get the result in a usable format I would need to project it back into 2D coordinates.
Using the MeshInstance VisualInstance to create a texture wherein a pixel is white if it contains the mesh instance, and black otherwise. Visual instances in general just baffle me, and I did not think it would be efficient to have another viewport just to output this texture.
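For the first attempt, a minimal engine-agnostic sketch of the math: project all eight corners of the AABB and take the screen-space min/max. A corner behind the camera (clip-space w <= 0) is exactly where the projected position blows up towards infinity, so the robust fallback there is to cover the whole viewport (Godot's Camera also exposes is_position_behind() for this check, if I recall correctly). The types and the mul() helper below are illustrative, not Godot API:

    #include <algorithm>

    struct Vec3 { float x, y, z; };
    struct Vec4 { float x, y, z, w; };
    struct Rect2 { float minX, minY, maxX, maxY; };

    // Column-major 4x4 view-projection matrix applied to (p, 1).
    Vec4 mul(const float m[16], const Vec3& p) {
        return { m[0]*p.x + m[4]*p.y + m[8]*p.z  + m[12],
                 m[1]*p.x + m[5]*p.y + m[9]*p.z  + m[13],
                 m[2]*p.x + m[6]*p.y + m[10]*p.z + m[14],
                 m[3]*p.x + m[7]*p.y + m[11]*p.z + m[15] };
    }

    Rect2 screenAabb(const Vec3& lo, const Vec3& hi,
                     const float viewProj[16], float screenW, float screenH) {
        Rect2 r{1e30f, 1e30f, -1e30f, -1e30f};
        for (int i = 0; i < 8; ++i) {
            // Enumerate the 8 corners of the box.
            Vec3 c{ (i & 1) ? hi.x : lo.x,
                    (i & 2) ? hi.y : lo.y,
                    (i & 4) ? hi.z : lo.z };
            Vec4 clip = mul(viewProj, c);
            if (clip.w <= 0.0f) {
                // Corner is behind the camera; its projection is meaningless,
                // so fall back to the full screen (or clip the box first).
                return Rect2{0.0f, 0.0f, screenW, screenH};
            }
            float sx = (clip.x / clip.w * 0.5f + 0.5f) * screenW;
            float sy = (1.0f - (clip.y / clip.w * 0.5f + 0.5f)) * screenH;
            r.minX = std::min(r.minX, sx); r.maxX = std::max(r.maxX, sx);
            r.minY = std::min(r.minY, sy); r.maxY = std::max(r.maxY, sy);
        }
        // Clamp to the viewport.
        r.minX = std::max(r.minX, 0.0f);    r.minY = std::max(r.minY, 0.0f);
        r.maxX = std::min(r.maxX, screenW); r.maxY = std::min(r.maxY, screenH);
        return r;
    }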
What I am looking for:
An output that can be passed to a shader, indicating where to perform certain calculations. Right now this is set up to use a bounding box, but it could easily be rewritten to use a texture instead. It could also be rewritten to use polygons, but I am trying to keep calculations in the shader to a minimum.
Certain solutions I have tried before have worked, slightly, but this one must be robust. The camera interfacing with the 3D object will be able to move completely around and through it, meaning at times the view will be completely surrounded by the 3D model, with points both in front of and behind the camera.
Thank you for any help you can provide.
I will try my best to update this post with information if needed.

GameMaker Studio Physics Object Precise Collisions

Is it possible to have a physics object in GameMaker Studio use precise collisions?
Here's some context for my question. I'm making a pirate game where the player sails around a large ocean with a number of islands. I've been using the physics engine to control the movement of the ship, and that is working well. However, the problem arises when trying to introduce collisions between the ship and the various islands. As far as I can tell, the underlying physics fixtures can only be formed into fairly simple shapes. Specifically, the collision shape editor is limited to 12 points, and only convex shapes. This is a problem, because many of my islands are relatively complicated non-convex shapes, and aren't necessarily a single piece. It would be nice to be able to use the island sprite as a precise collision mask, as would be possible for non physics based objects.
Is there a way to do this, or a possible work-around that I'm missing? Here's an example of one of my islands:
I can see two solutions to your problem.
1 - The easiest, but performance-unfriendly.
In the sprite editor, click "Modify mask". There should be a "precise collision checking" box you can tick. This means that your sprite will be checked pixel by pixel for collisions. As you can guess, this is not performance friendly, but will do exactly what you want.
2 - The one I would recommend.
What you could do is just draw the island sprites, either through the background or via a dedicated object, and then create some simple, invisible shape objects (rectangle, circle, and diamond) and place them over your islands in the room editor. (Don't forget that you can stretch them.)
These simple shaped objects would be the ones to check for collisions.
I used this technique to make a hitbox for complex-shaped clouds in one of my games, so I know it works.
I believe that the island you show us can be fairly well covered with a few ovals and a long rectangle.
Bonus: after doing that graphically, you can copy the creation code of the shapes from the room create event to the island create event, so it repeats for multiple identical islands. Just don't forget the position/angle offset!
By using the Shape options when defining the collision shape, you can have any kind of convex polygon as your collision shape. Example:
The spot where you choose the Shape option is in red.
After you select that option, you can just click and drag to add or edit a vertex of the polygon. Just bear in mind that it has to be a convex polygon; GameMaker is very strict about that. You can also remove vertices by right-clicking on them.

Drawing an Isometric image split into layers

I've seen quite a few questions about how to draw isometric tiles, and almost all of them point to drawing back to front, top down. However, I'm trying to find a way to prevent clipping with a single isometric image.
While normally drawing a sprite on top of a single image would not prevent overdrawing on walls and such, I split the image up into three layers: a floor, a lower wall, and a top wall. The player checks the floor for collision, is always drawn in front of the lower wall, and is always drawn behind the top wall. The result looks like the following:
While this seems to work decently well, I'd like to know the most efficient way to draw these sorts of isometric scenes. I've considered tiles, but that raises the question of how to draw multi-tile buildings and such. If tiling turns out to be the better option, I will create a new question about it. For now, let's assume I'm using a single image broken into layers.
This approach is somewhat easier for my artist, however: they can draw a single scene in isometric and split it up into layers, eliminating the need for a map creator, while pixel collision gives precise collision with the environment.
Is using a multi-layered scene even a good approach for this? My biggest concern is preventing overdraw and breaking perspective. I've also seen many good examples of drawing everything using tiles, but then I'm limited to a certain scale, and that raises even more questions. Do you know of the best way to approach this? Should I use tiles instead of a single image split into layers?
I plan to code this in either MonoGame or Processing.
(I would have posted this on gamedev but I cannot post images there.)
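For illustration, a minimal sketch of the layered draw order and a pixel-based walkability check, written here in openFrameworks-style C++ since the idea is the same in MonoGame or Processing (the asset names and the alpha-based collision test are assumptions):

    #include "ofMain.h"

    class IsoScene {
    public:
        ofImage floor, lowerWall, topWall, player;
        glm::vec2 playerPos;

        void setup() {
            floor.load("floor.png");           // hypothetical asset names
            lowerWall.load("lower_wall.png");
            topWall.load("top_wall.png");
            player.load("player.png");
        }

        // Treat any opaque pixel of the floor layer as walkable.
        bool isWalkable(int x, int y) {
            return floor.getColor(x, y).a > 0;
        }

        void draw() {
            floor.draw(0, 0);                      // 1. ground
            lowerWall.draw(0, 0);                  // 2. walls behind the player
            player.draw(playerPos.x, playerPos.y); // 3. the player
            topWall.draw(0, 0);                    // 4. walls that occlude the player
        }
    };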

Which pixels did that drawmesh operation just draw to?

OK, it's a relatively simple problem: I want to know where, in screen space, a particular mesh was just drawn. I plan on then storing that information in a data store of some kind so that when I interact with something in screen space, I can look it up in the register and find the object, i.e., click on the spaceship drawn on the screen and then select it as a target, etc.
I can't find any way of finding out which pixels the mesh was drawn to though...
Alternatively, if I'm missing something obvious regarding what it is that I want to do, please let me know!
There is no easy way to do that, but you can use another texture as a render target and render those meshes to it with unique colors.
So, for example, you give #FF0000 to your mesh A and also draw it to your second render target with that color. When you pick a pixel from the second render target and look at its color, if it is #FF0000 you know that the pixel is part of mesh A. Thus you can easily find the mesh drawn at a certain pixel when you click on it.
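A sketch of the bookkeeping for that color-ID trick; the render-to-texture part is API-specific (Direct3D/OpenGL/XNA) and only outlined in comments, and pickRegister is a hypothetical name:

    #include <cstdint>
    #include <string>
    #include <unordered_map>

    struct IdColor { uint8_t r, g, b; };

    // Pack an object ID into a 24-bit RGB color and back; 8 bits per
    // channel allows ~16.7 million distinct meshes.
    IdColor idToColor(uint32_t id) {
        return { uint8_t((id >> 16) & 0xFF),
                 uint8_t((id >> 8) & 0xFF),
                 uint8_t(id & 0xFF) };
    }
    uint32_t colorToId(IdColor c) {
        return (uint32_t(c.r) << 16) | (uint32_t(c.g) << 8) | uint32_t(c.b);
    }

    std::unordered_map<uint32_t, std::string> pickRegister; // id -> object

    // Per frame (API-specific, in outline):
    //   1. bind the picking render target
    //   2. for each mesh: draw it flat-shaded with idToColor(mesh.id)
    //      and record pickRegister[mesh.id] = mesh
    //   3. on click at (x, y): read back that single pixel from the
    //      target and look up colorToId({r, g, b}) in pickRegister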
Why don't you unproject your screen-space coordinates into 3D space? The only complication I had was that I'd be left with a ray; I could check whether a mesh intersected that ray, but I often had multiple candidates for 'picking'.
Search for DirectX unproject and you'll find various articles discussing it. It's sometimes complicated to implement, but done well it's actually pretty nifty; don't be put off by the people online who say it doesn't work, because it does.
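To make the idea concrete: unprojecting the clicked pixel at the near and far planes (e.g. with Viewport.Unproject in XNA or XMVector3Unproject in DirectXMath) gives a ray, and a standard slab test tells you which bounding boxes it hits. A self-contained sketch; resolving the 'multiple candidates' just means keeping the hit with the smallest distance:

    #include <algorithm>
    #include <cfloat>

    struct Vec3 { float x, y, z; };

    // Ray/AABB slab test; on a hit, tHit receives the entry distance
    // along the ray, so the nearest of several candidates wins.
    bool rayHitsAabb(const Vec3& origin, const Vec3& dir,
                     const Vec3& boxMin, const Vec3& boxMax, float& tHit) {
        float tMin = 0.0f, tMax = FLT_MAX;
        const float o[3]  = {origin.x, origin.y, origin.z};
        const float d[3]  = {dir.x, dir.y, dir.z};
        const float lo[3] = {boxMin.x, boxMin.y, boxMin.z};
        const float hi[3] = {boxMax.x, boxMax.y, boxMax.z};
        for (int i = 0; i < 3; ++i) {
            if (d[i] == 0.0f) {             // ray parallel to this slab pair
                if (o[i] < lo[i] || o[i] > hi[i]) return false;
                continue;
            }
            float t1 = (lo[i] - o[i]) / d[i];
            float t2 = (hi[i] - o[i]) / d[i];
            if (t1 > t2) std::swap(t1, t2);
            tMin = std::max(tMin, t1);      // latest entry
            tMax = std::min(tMax, t2);      // earliest exit
            if (tMin > tMax) return false;  // slabs don't overlap: miss
        }
        tHit = tMin;
        return true;
    }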

Finding significant images in a set of surveillance camera images

I've had theft problems outside my house, so I set up a simple webcam to capture a frame every second with Dorgem (http://dorgem.sf.net).
Dorgem does offer a feature to use motion detection to only capture frames where something is moving on the screen. The problem is that the motion detection algorithm it uses is extremely sensitive. It goes off because of variations in color between successive shots on my cheap webcam, and it also goes off because the trees in front of the house are blowing in the wind. Additionally, the front of my house is a high traffic area so there is also a large number of legitimately captured frames.
I average capturing 2800/3600 frames every hour using Dorgem's motion detection. This is too much for me to search through to find out where the interesting activity is.
I wish I could re-position the camera to a more optimal position where it would only capture the areas I'm interested in, so that motion detection would be simpler, however this is not an option for me.
I think that because my camera has a fixed position and each picture frames the same area in front of my house, I should be able to scan the images and figure out which ones have motion in some interesting region of the image, throwing out all the other frames.
For example: if there's a change at pixel (320,240) then someone has stepped in front of my house and I want to see that frame, but if there's a change at pixel (1,1) then it's just the trees blowing in the wind and the frame can be discarded.
I've looked at pdiff, a tool for finding diffs in sets of pictures, but it seems to be focused on diffing the entire picture rather than a specific region of it:
http://pdiff.sourceforge.net/
I've also looked at phash, a tool for calculating a hash based on human perception of an image, but it seems too complex:
http://www.phash.org/
I suppose I could implement it as a shell script, using ImageMagick's mogrify -crop to cherry-pick the regions of the image I'm interested in, then running pdiff to find the interesting ones and using that to pick out the interesting frames.
Any thoughts? ideas? existing tools?
Cropping and then using pdiff seems like the best choice to me.
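If the shell pipeline gets unwieldy, the same region-restricted diff is only a few lines of OpenCV; a minimal sketch, where the region of interest and both thresholds are placeholders to tune for your camera:

    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main(int argc, char** argv) {
        if (argc != 3) {
            std::cerr << "usage: roidiff prev.jpg cur.jpg\n";
            return 2;
        }
        // The interesting patch of the frame, e.g. around pixel (320,240).
        cv::Rect roi(270, 190, 100, 100);

        cv::Mat prev = cv::imread(argv[1], cv::IMREAD_GRAYSCALE);
        cv::Mat cur  = cv::imread(argv[2], cv::IMREAD_GRAYSCALE);
        if (prev.empty() || cur.empty()) return 2;

        cv::Mat diff;
        cv::absdiff(prev(roi), cur(roi), diff);
        // Ignore small per-pixel sensor noise from a cheap webcam.
        cv::threshold(diff, diff, 25, 255, cv::THRESH_BINARY);
        int changed = cv::countNonZero(diff);

        std::cout << changed << " changed pixels in region\n";
        // Exit 0 ("interesting frame") only if enough pixels changed.
        return changed > 50 ? 0 : 1;
    }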