General considerations for NUI/touch interface - kinect

For the past few months I've been looking into developing a Kinect based multitouch interface for a variety of software music synthesizers.
The overall strategy I've come up with is to create objects, either programmatically or (if possible) algorithmically, to represent the various controls of the soft synth. Each should have:
X position
Y position
Height
Width
MIDI output channel
MIDI data scaler (convert x-y coords to midi values)
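A minimal sketch of such a control object, assuming a rectangular control and a plain linear scaler to 7-bit MIDI values (0-127). The class name, field names and the `to_midi` helper are made up for illustration and are not taken from any existing library:

```python
from dataclasses import dataclass


@dataclass
class Control:
    """Hypothetical on-screen control: a rectangle plus a MIDI channel."""
    x: float            # left edge, in screen pixels
    y: float            # top edge, in screen pixels
    width: float
    height: float
    midi_channel: int   # assumed 0-15

    def contains(self, px, py):
        """True if the point (px, py) lies inside the control's rectangle."""
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)

    def to_midi(self, px, py):
        """Linearly scale a point to a pair of MIDI values in 0-127.

        Points outside the rectangle are clamped to the nearest edge.
        """
        def scale(value, origin, extent):
            fraction = (value - origin) / extent
            fraction = min(1.0, max(0.0, fraction))
            return round(fraction * 127)

        return scale(px, self.x, self.width), scale(py, self.y, self.height)
```

A rotary or slider would probably use only one of the two returned values; an X/Y pad would use both.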
Two strategies I've considered for algorithmic creation are an XML description and somehow pulling the layout right off the screen (i.e. given a running program, find the x/y coords of all controls). I have no idea how to go about that second one, which is why I express it in such specific technical language ;). I could do some intermediate solution, like using mouse clicks on the corners of controls to generate an XML file. Another thing I could do, which I've seen frequently in Flash apps, is to put the screen size into a variable and use math to build all interface objects in terms of screen size. Note that it isn't strictly necessary to make the objects the same size as the onscreen controls, or to represent all onscreen objects (some are just indicators, not interactive controls).
Other considerations:
Given (for now) two sets of X/Y coords as input (left and right hands), what is my best option for using them? My first instinct was to create some kind of focus test: if the x/y coords fall within the interface object's bounds, that object becomes active, and it becomes inactive again if they fall outside some other smaller bounds for some period of time. The cheap solution I found was to use the left hand as the pointer/selector and the right as a controller, but it seems like I can do more. I have a few gesture solutions (hidden Markov models) I could screw around with. Not that they'd be easy to get to work, exactly, but it's something I could see myself doing given sufficient incentive.
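The focus test described above could be sketched roughly as follows. Everything here is an assumption for illustration: the `FocusTracker` name, the tuple layout of `bounds`, and the idea of deriving the release region from the control's bounds plus a margin (a negative margin gives a release region smaller than the control, a positive one a larger region). Timestamps are passed in by the caller, in seconds.

```python
class FocusTracker:
    """Hypothetical focus test with a separate release region and a timeout."""

    def __init__(self, bounds, release_margin=20.0, timeout=0.5):
        self.bounds = bounds                  # (x, y, width, height)
        self.release_margin = release_margin  # grows (or, if negative, shrinks) the release region
        self.timeout = timeout                # seconds outside before focus is dropped
        self.active = False
        self._outside_since = None

    @staticmethod
    def _inside(rect, px, py):
        x, y, w, h = rect
        return x <= px <= x + w and y <= py <= y + h

    def update(self, px, py, now):
        """Feed one hand position; returns whether the control has focus."""
        x, y, w, h = self.bounds
        m = self.release_margin
        release_rect = (x - m, y - m, w + 2 * m, h + 2 * m)

        if not self.active:
            # Focus is gained as soon as the hand enters the control's bounds.
            if self._inside(self.bounds, px, py):
                self.active = True
                self._outside_since = None
        elif self._inside(release_rect, px, py):
            # Still within the release region: reset the timer.
            self._outside_since = None
        elif self._outside_since is None:
            # First sample outside the release region: start the timer.
            self._outside_since = now
        elif now - self._outside_since >= self.timeout:
            # Stayed outside long enough: drop focus.
            self.active = False
            self._outside_since = None
        return self.active
```

The timeout keeps a jittery hand position from dropping focus the instant it strays over the edge.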
So, to summarize, the problem is:
represent the interface (necessary because the default interface always expects mouse input)
select a control
manipulate it using two sets of x/y coords (rotary/continuous controller) or, in the case of switches, preferably use a gesture to switch it without giving/taking focus.
Any comments, especially from people who have worked/are working in multitouch io/NUI, are greatly appreciated. Links to existing projects and/or some good reading material (books, sites, etc) would be a big help.

Whoa, lots of stuff here. I worked on lots of NUI stuff during my time at Microsoft, so let's see what we can do...
But first, I need to get this pet peeve out of the way: you say "Kinect based multitouch". That's just wrong. Kinect inherently has nothing to do with touch (which is why you have the "select a control" challenge). The types of UI consideration needed for touch, body tracking, and mouse are totally different. For example, in touch UI you have to be very careful about resizing things based on screen size/resolution/DPI. Regardless of the screen, fingers are always the same physical size and people have the same degree of physical accuracy, so you want your buttons and similar controls to always be roughly the same physical size. Research has found 3/4 of an inch to be the sweet spot for touchscreen buttons. This isn't so much of a concern with Kinect, though, since you aren't directly touching anything: accuracy is dictated not by finger size but by sensor accuracy and the user's ability to precisely control finicky and lagging virtual cursors.
If you spend time playing with Kinect games, it quickly becomes clear that there are 4 interaction paradigms.
1) Pose-based commands. User strikes and holds a pose to invoke some application-wide command (usually bringing up a menu)
2) Hover buttons. User moves a virtual cursor over a button and holds still for a certain period of time to select the button
3) Swipe-based navigation and selection. User waves their hands in one direction to scroll a list and another direction to select from the list
4) Voice commands. User just speaks a command.
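As a rough illustration of paradigm 2, a hover button boils down to a dwell timer. This is only a sketch under assumptions: the `HoverButton` class and its parameters are invented here, it only checks that the cursor stays over the button (not that it is literally still), and the caller supplies timestamps in seconds.

```python
class HoverButton:
    """Hypothetical dwell-to-select button."""

    def __init__(self, x, y, width, height, dwell=1.5):
        self.x, self.y = x, y
        self.width, self.height = width, height
        self.dwell = dwell          # seconds the cursor must stay over the button
        self._entered_at = None     # time the cursor entered, or None if outside

    def update(self, px, py, now):
        """Feed one cursor sample; returns True when the button is selected."""
        inside = (self.x <= px <= self.x + self.width
                  and self.y <= py <= self.y + self.height)
        if not inside:
            self._entered_at = None
            return False
        if self._entered_at is None:
            self._entered_at = now
            return False
        if now - self._entered_at >= self.dwell:
            # Restart the timer so the button fires once per dwell period
            # instead of on every subsequent frame.
            self._entered_at = None
            return True
        return False
```

In a real UI the elapsed fraction of the dwell time would normally drive a progress ring around the cursor, so the user can see the selection coming.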
There are other mouse-like ideas that have been tried by hobbyists (I haven't seen these in an actual game), but frankly they suck: 1) using one hand for the cursor and the other hand to "click" where the cursor is, or 2) using the z-coordinate of the hand to determine whether to "click".
It's not clear to me whether you are asking about how to make some existing mouse widgets work with Kinect. If so, there are some projects on the web that will show you how to control the mouse with Kinect input but that's lame. It may sound super cool but you're really not at all taking advantage of what the device does best.
If I was building a music synthesizer, I would focus on approach #3 - swiping. Something like Dance Central. On the left side of the screen, show a list of your MIDI controllers with some small visual indication of their status. Let the user swipe their left hand to scroll through and select a controller from this list. On the right side of the screen, show how you are tracking the user's right hand within some plane in front of their body. Now you're letting them use both hands at the same time, giving immediate visual feedback of how each hand is being interpreted, and not requiring them to be super precise.
ps... I'd also like to give a shout out to Josh Blake's upcoming NUI book. It's good stuff. If you really want to master this area, go order a copy :) http://www.manning.com/blake/

Related

How did they make the controls in "Fast Like a Fox"?

I am trying to make a basic rhythm game in Godot, but with unique controls. A few years ago, I played a cool game called Fast Like a Fox. The controls were unique, because you tapped on the back of your device to move your character, not on the screen. I thought the controls were cool, and I want to try to replicate them in a simple one-button rhythm game for mobile. Does anyone know if it would be possible for Godot to take that kind of input, either in a built-in function or something else?
They read the accelerometer (and maybe other sensors), which Godot supports through accelerometer, gravity and gyroscope. Accelerometers are accurate enough to read passwords as they're being typed, so you can even get a rough estimate of where the user is tapping. That is what the Fast Like a Fox use case relies on: internally they poll the sensor and raise an event when particular changes happen in one or more axes. In your case, if you only care whether the user tapped anywhere, it might be enough to treat any sudden change as an event.
Try writing an app that displays the delta of each axis measurement, then tap your phone around; you'll figure it out. Remember to test under various conditions (device held upside down while lying on a bed, sitting on a chair, lying on one's side, etc.), since different axes will register the changes.
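The "treat any sudden change as an event" idea can be tried offline on recorded samples before wiring it into a game. The sketch below is plain Python rather than Godot code, and the function name, the threshold and the cooldown are all assumptions that would need tuning against real sensor data:

```python
def detect_taps(samples, threshold=2.0, cooldown=3):
    """Return the indices of samples that look like taps.

    samples   -- list of (x, y, z) accelerometer readings, in whatever
                 unit the sensor reports
    threshold -- minimum jump on any single axis between consecutive
                 samples to count as a tap (assumed value, tune it)
    cooldown  -- number of samples to ignore after a tap, so the spike
                 and its rebound are not counted twice
    """
    taps = []
    last_tap = None
    for i in range(1, len(samples)):
        if last_tap is not None and i - last_tap <= cooldown:
            continue
        previous, current = samples[i - 1], samples[i]
        # Per-axis delta, so it does not matter which way the device is held.
        deltas = [abs(c - p) for c, p in zip(current, previous)]
        if max(deltas) > threshold:
            taps.append(i)
            last_tap = i
    return taps
```

Because it looks at deltas on every axis instead of absolute values, the constant pull of gravity drops out regardless of orientation.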

Make your own mouse driver

I have a mouse from Speedlink that is able to do a lot of things, like changing the colours of the LEDs, but I can only do those things with the software provided by Speedlink.
Is it possible to code your own software that controls the led lights of the mouse?
Yes, but you would have to have the hardware specifications to know what needs to be sent to the mouse to accept the commands you're looking for. Usually these things are not published or readily accessible.
I bought two Microsoft Basic Optical mice, identical to one that performed the functions I needed really well. With the original mouse, a right click would grab and flip the grid, with the object staying in one position within the grid. I use the mice with the Blender 2.79 3D modelling app. I plugged the two new mice into two computers to try them out, and they would NOT do the grid grabbing and flipping, though they would perform the other functions. The grid-grabbing function is important because it lets you move the scene and inspect the model, much like walking around a solid object in the real world.

Objective-C draw a path and detect when it closes (forms a closed shape)

I'm fairly new to game programming (but not to programming) and I want to create a space ship which leaves a trail on the screen. Now my problem is to come up with a solution how to detect if the trail left from the ship forms a closed shape - eg. if the ship left a trail around an object, the object is caught inside its trail so to speak.
The direction I'm thinking is to draw the path of the trail on an image not visible on the screen and every now and then try to fill it with certain color and then check if fill is caught within the trail path. However it seems like a lot of overhead.
Any ideas how to do that? I'm using cocos2d if that's of any help
In game programming you often need to think more mathematically than visually.
First, does your ship continuously leave a trail on the screen? If yes, then it will be easier to know when the shape closes: you just have to remember the coordinate where your ship started to leave a trail, then wait for the trail to approach this coordinate again (for example within a radius of 10 pixels, or else the user will need to be really accurate and hit exactly the same pixel to close the shape).
The visual representation of the trail is only here for the user, you'll never use it to compute anything. What you will do is to keep in memory the path followed by the ship's trail : a polygon, which is nothing else than the list of coordinates it followed.
Then, once you know that your shape is closed, you have to determine whether an object is inside your polygon or not. It's possible that Objective-C or cocos2d (I don't know much about it) already contains a built-in function to tell whether a point is inside a polygon. In Java there is the Polygon class, which makes this really easy. If you don't find anything you can do it yourself; there are already great answers on this subject on SO. Here is a nice one: How can I determine whether a 2D Point is within a Polygon?
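To make the two steps concrete, here is a small sketch in Python (not Objective-C or cocos2d). The function names are made up, and the `min_points` guard is an assumption added so that the first few trail points, which are naturally close to the start, do not count as a closed shape. The containment check uses the standard ray-casting (even-odd) test.

```python
import math


def trail_closed(trail, close_radius=10.0, min_points=3):
    """True when the newest trail point is back within close_radius of the start."""
    if len(trail) <= min_points:
        return False
    (x0, y0), (x1, y1) = trail[0], trail[-1]
    return math.hypot(x1 - x0, y1 - y0) <= close_radius


def point_in_polygon(point, polygon):
    """Ray casting: cast a ray from the point towards +x and count edge crossings.

    An odd number of crossings means the point is inside.
    """
    px, py = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line y == py can be crossed.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside
```

In the game loop you would append the ship's position to the trail each frame, and once `trail_closed` returns true, run `point_in_polygon` for each object's position against the stored trail.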

Extending Functionality of Magic Mouse: Do I Need a kext?

I recently purchased a Magic Mouse. It is fantastic and full of potential. Unfortunately, it is seriously hindered by the software support. I want to fix that. I have done quite a lot of research and these are my findings regarding the event chain thus far:
The Magic Mouse sends full multitouch events to the system.
Multitouch events are processed in the MultitouchSupport.framework (Carbon)
The events are interpreted in the framework and sent up to the system as normal events
When you scroll with one finger it sends actual scroll wheel events.
When you swipe with two fingers it sends a swipe event.
No NSTouch events are sent up to the system. You cannot use the NSTouch API to interact with the mouse.
After I discovered all of the above, I disassembled the MultitouchSupport.framework file and, with some googling, figured out how to insert a callback of my own into the chain so I would receive the raw touch event data. If you enumerate the list of devices, you can attach a callback for each device (trackpad and mouse). This finding would enable us to create a framework for using multitouch on the mouse, but only in a single application. See my post here: Raw Multitouch Tracking.
I want to add new functionality to the mouse across the entire system, not just a single app.
In an attempt to do so, I figured out how to use Event Taps to see if the lowest level event tap would allow me to get the raw data, interpret it, and send up my own events in its place. Unfortunately, this is not the case. The event tap, even at the HID level, is still a step above where the input is being interpreted in MultitouchSupport.framework.
See my event tap attempt here: Event Tap - Attempt Raw Multitouch.
An interesting side note: when a multitouch event is received, such as a swipe, the default case is hit and prints out an event number of 29. The header shows 28 as being the max.
On to my question, now that you have all the information and have seen what I have tried: what would be the best approach to extending the functionality of the Magic Mouse? I know I need to insert something at a low enough level to get the input before it is processed and predefined events are dispatched. So, to boil it down to single sentence questions:
Is there some way to override the default callbacks used in MultitouchSupport.framework?
Do I need to write a kext and handle all the incoming data myself?
Is it possible to write a kext that sits on top of the kext that is handling the input now, and filters it after that kext has done all the hard work?
My first goal is to be able to dispatch a middle button click event if there are two fingers on the device when you click. Obviously there is far, far more that could be done, but this seems like a good thing to shoot for, for now.
Thanks in advance!
-Sastira
How does what is happening in MultitouchSupport.framework differ between the Magic Mouse and a glass trackpad? If it is based on IOKit device properties, I suspect you will need a KEXT that emulates a trackpad but actually communicates with the mouse. Apple have some documentation on Darwin kernel programming and kernel extensions specifically:
About Kernel Extensions
Introduction to I/O Kit Device Driver Design Guidelines
Kernel Programming Guide
(Personally, I'd love something that enabled pinch magnification and more swipe/button gestures; as it is, the Magic Mouse is a functional downgrade from the Mighty Mouse's four buttons and [albeit ever-clogging] 2D scroll wheel. Update: last year I wrote Sesamouse to do just that, and it does NOT need a kext (just a week or two staring at hex dumps :-) See my other answer for the deets and source code.)
Sorry I forgot to update this answer, but I ended up figuring out how to inject multitouch and gesture events into the system from userland via Quartz Event Services. I'm not sure how well it survived the Lion update, but you can check out the underlying source code at https://github.com/calftrail/Touch
It requires two hacks: using the private Multitouch framework to get the device input, and injecting undocumented CGEvent structures into Quartz Event Services. It was incredibly fun to figure out how to pull it off, but these days I recommend just buying a Magic Trackpad :-P
I've implemented a proof of concept of a customizable userspace wrapper for multi-touch events.
You can read about it here: http://aladino.dmi.unict.it/?a=multitouch (see in WaybackMachine)
--
all the best
If you get to that point, you may want to consider making the middle click three fingers on the mouse instead of two. I've thought about this middle click issue with the Magic Mouse, and I notice that I often leave my second finger on the mouse even though I am only pressing for a left click. So a plain left click might be mistaken for a "2 finger" click, and avoiding that would require more effort from the user, who would always have to keep the second finger off the mouse. Therefore, if it's possible to detect, three fingers would cause less confusion and fewer headaches. I wonder where the first "middle button click" solution will come from, as I am anxious for my middle click Expose feature to return :) Best of luck.

Dojo dnd: Avatar positioning

Is it possible to change the positioning of the avatar with dojo toolkit's dnd api? At the moment, when dragging, the avatar of the dragged item appears to the right and below the mouse cursor. I want it to be in the same position as the mouse cursor. I ran some usability tests on my application, and most people seem to attempt to try and drag the avatar into the drop area, as opposed to moving the cursor over the drop area. Any input would be nice. Thanks!
Sorry, not possible for technical reasons.
UPDATE: by popular demand, here are the technical reasons:
When you have a node right under the mouse, the node gets all mouse events.
The mouse events bubble up the parent chain.
Now imagine that you move this node with the mouse — this node would always get all mouse events.
It means that any other node, e.g., a target cannot get mouse events unless it is a parent of the moved node. Typically this is not the case.
But I know that other people can do it! It should be possible! Yes, it is possible … in principle:
Let's register all target nodes.
Let's catch relevant mouse move events directly on the topmost parent (the document).
When we detect a drag operation, let's do the following:
Calculate geometry (bounding boxes) of all targets.
On every mouse move, let's check whether the current mouse position overlaps with a target. Bonus points for an "A+" student: detect overlaps with other nodes, e.g., when a target is partially obscured for cosmetic reasons, and process this situation correctly.
If the current mouse position overlaps with a target, let's initiate "drop is possible" actions, e.g., show some cues so the end user knows that she can drop now.
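For illustration only, the geometric approach outlined in these steps amounts to something like the following. It is written in Python rather than JavaScript, it is not Dojo code, and the `Box` class and `target_under` function are hypothetical names. The boxes are assumed to have been measured once, when the drag started.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical bounding box of a registered drop target."""
    name: str
    left: float
    top: float
    width: float
    height: float

    def contains(self, x, y):
        return (self.left <= x <= self.left + self.width
                and self.top <= y <= self.top + self.height)


def target_under(boxes, x, y):
    """Linear scan over all registered boxes, run on every mouse move.

    Returns the name of the first target containing (x, y), or None.
    The cost grows with the number of targets: O(n) per mouse event.
    """
    for box in boxes:
        if box.contains(x, y):
            return box.name
    return None
```

Note that the scan runs inside the mouse move handler, and that the boxes go stale if the page reflows during the drag.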
Why doesn't Dojo do that? For a number of technical reasons (finally we got there!):
A node's geometry calculations are notoriously buggy in most browsers. As soon as tables are involved, or any other non-trivial means of placement, you cannot be 100% sure that the bounding box is correct.
Geometry calculation is an expensive operation, and we have to do it at least once on every drag operation for all targets, assuming that no changes can be made during the drag operation (not always the case). A browser may reflow nodes for many reasons ⇒ it can move/resize existing targets, so we have to be vigilant.
Typically the calculated boxes are kept in a list ⇒ checking the list for intersections is O(n) (linear) ⇒ it doesn't scale well as the number of targets grows.
All mouse event handlers should be fast, otherwise a browser's mouse event handling facility can be "broken" leading to unpredictable side-effects. See the previous points for reasons why mouse event processing can be slow.
Improving on the linear search is possible, e.g., 2D spatial trees can be used, but it leads to more (much more) JavaScript code ⇒ more stuff to download on the client side ⇒ typically it isn't worth it.
How do I know that? Because Dojo used to have this kind of drag'n'drop in earlier versions, and we got sick and tired of fighting the problems I described above. Any improvement was an uphill battle, which increased the code size. Finally we decided against reinventing and replicating mechanisms already built into a browser. A browser does virtually the same work: it calculates the geometry of nodes, finds the underlying node, and dispatches a mouse move event appropriately.
The current implementation doesn't use mouse move events and does not calculate the geometry. Instead it relies on mouse over/out events detected by targets after a drag was started. It works reliably and scales well.
Another wrinkle in this story: Dojo treats targets as containers — a very common use case (shopping carts, rearranging items, editing hierarchies). Linear containers and generic trees are implemented at the moment, custom containers are possible. When dragging and dropping you can see and drop dragged items in a proper position within a target container, e.g., inserting them between existing items. Implementing this feature using geometric calculations and checks would be prohibitively expensive.