How to monitor for swipe gesture globally in OS X - objective-c

I'd like to make an OSX application that runs in the background and performs some function when a swipe down with four fingers is detected on the trackpad.
Seems easy enough. Apple's docs show almost exactly this here. Their example monitors for mouse down events. As a simple test, I put the following in applicationDidFinishLaunching: in my AppDelegate.
void (^handler)(NSEvent *e) = ^(NSEvent *e) {
NSLog(#"Left Mouse Down!");
};
[NSEvent addGlobalMonitorForEventsMatchingMask:NSLeftMouseDownMask handler:handler];
This works as expected. However, changing NSLeftMouseDownMask to NSEventMaskSwipe does not work. What am I missing?

Well, the documentation for NSEvent's +addGlobalMonitorForEventsMatchingMask:handler: gives a list of event it supports and NSEventMaskSwipe is not listed so... it's to be expected that it not work.
While the API obviously supports the tracking of gesture locally within your own application (through NSResponder), I believe gestures can't be track globally by design. Unlike key combinations, there are much lower forms/types of gestures... essentially only:
pinch in/out (NSEventTypeMagnify)
rotations (NSEventTypeRotation)
directional swipes with X amount of fingers (NSEventTypeSwipe)
There's not as much freedom. With keys, you have plenty of modifiers (control, option, command, shift) and the whole alphanumeric keys making plenty of possible combinations so it'd be easier to reduce the amount of conflicts with local-events and global-events. Similarly, mouse events are region-based; clicking in one region can easily be differenciated from clicking in another region (from both the program's and user's point-of-view).
Because of this lower possible combination of touch events, I believe Apple might purposely be restricting global (as in, one app, responding to one or more gestures for the whole system) usage for its own usage (Mission Control, Dashboard, etc.)

Related

How to track the location of a window belonging to another app

When screen sharing a specific window on macOS with Zoom or Skype/Teams, they draw a red or green highlight border around that window (which belongs to a different application) to indicate it is being shared. The border is following the target window in real time, with resizing, z-order changes etc.
See example:
What macOS APIs and techniques might be used to achieve this effect?
You can find the location of windows using CGWindowListCopyWindowInfo and related API, which is available to Sandboxed apps.
This is a very fast and efficient API, fast enough to be polled. The SonOfGrab sample code is great platform to try out this stuff.
You can also install a global event tap using +[NSEvent addGlobalMonitorForEventsMatchingMask:handler:] (available in sandbox) to track mouse down, drag and mouse up events and then you can respond immediately whenever the user starts or releases a drag. This way your response will be snappy.
(Drawing a border would be done by creating your own transparent window, slightly larger than, and at the same window layer as, the window you are tracking. And then simply draw a pretty green box into it. I'm not exactly sure about setting the z-order. The details of this part would be best as a separate question.)

How did they make the controls in "Fast Like a Fox"?

I am trying to make a basic rhythm game in Godot, but with unique controls. A few years ago, I played a cool game called Fast Like a Fox. The controls were unique, because you tapped on the back of your device to move your character to move, not on the screen. I thought the controls were cool, and I want to try to replicate them in a simple one-button rhythm game for mobile. Does anyone know if it would be possible for Godot to take that kind of input, either in a built-in function or something else?
They read the accelerometer (and maybe other sensors), which Godot supports through accelerometer, gravity and gyroscope. Accelerometers are accurate enough to read passwords as they're being typed so you can even get a rough estimate on where the user is tapping, which is used in Fast Like a Fox use case where internally they poll the sensor and raise an event when particular changes happen in one or multiple axes. In your case, it might be enough to just treat any sudden changes as an event if you simply care about the user tapping anything.
Try writing an app that will display the delta of each axis measurement then tap your phone around, you'll figure it out. Remember to test on various conditions (device being held upside down while laying on a bed, sitting on a chair, laying on one's side, etc) since different axes will register the changes.

Is there a way to get push to scroll functionality in Windows 8 metro Apps?

In the Windows 8 Consumer Preview, moving the mouse towards the left or right edge in the start screen causes the content to scroll.
The standard controls (and currently released preview apps) does not seem to support this.
Is there a way to make this work?
I asked this question at TechEd North America this year, after one of the sessions given by Paul Gusmorino, a lead program manager for the UI platform.
His answer was that no, apps can't do push-against-the-edge-to-scroll. WinJS and WinRT/XAML apps don't even get the events you would need to implement it yourself. Apps get events at the level of the mouse pointer, and once the mouse pointer hits the edge of the screen, it can't move any farther and you don't get any more events. (Well, it might wiggle up and down a little bit, but not if it hit a corner. At any rate, it's not good enough to scroll the way the Start screen does.)
He mentioned that, if you were writing a C++/DirectX app, you would be able to get the raw mouse input you needed to do this yourself -- you can get low-level "device moved by DX,DY" rather than the high-level "pointer moved to X,Y". I'm guessing this is how the Start screen does it, though I didn't think to ask.
But no, it's not built-in, it's not something you can implement yourself (unless you write your app in low-level C++/DirectX), and it sounds like they have no plans to add it before Windows 8 ships.
Personally, I think it's pretty short-sighted of them to have apps feel cripped compared to the Start screen, but evidently they're not concerned about little things like usability. </rant>
You can do the following to get information on mouse moving beyond the screen and use the delta information to scroll your content.
using Windows.Devices.Input;
var mouse = MouseDevice.GetForCurrentView();
mouse.MouseMoved += mouse_MouseMoved;
private void mouse_MouseMoved(MouseDevice sender, MouseEventArgs args)
{
tb.Text = args.MouseDelta.X.ToString();
}

General considerations for NUI/touch interface

For the past few months I've been looking into developing a Kinect based multitouch interface for a variety of software music synthesizers.
The overall strategy I've come up with is to create objects, either programatically or (if possible) algorithmically to represent various controls of the soft synth. These should have;
X position
Y position
Height
Width
MIDI output channel
MIDI data scaler (convert x-y coords to midi values)
2 strategies I've considered for agorithmic creation are XML description and somehow pulling stuff right off the screen (ie given a running program, find xycoords of all controls). I have no idea how to go about that second one, which is why I express it in such specific technical language ;). I could do some intermediate solution, like using mouse clicks on the corners of controls to generate an xml file. Another thing I could do, that I've seen frequently in flash apps, is to put the screen size into a variable and use math to build all interface objects in terms of screen size. Note that it isn't strictly necessary to make the objects the same size as onscreen controls, or to represent all onscreen objects (some are just indicators, not interactive controls)
Other considerations;
Given (for now) two sets of X/Y coords as input (left and right hands), what is my best option for using them? My first instinct is/was to create some kind of focus test, where if the x/y coords fall within the interface object's bounds that object becomes active, and then becomes inactive if they fall outside some other smaller bounds for some period of time. The cheap solution I found was to use the left hand as the pointer/selector and the right as a controller, but it seems like I can do more. I have a few gesture solutions (hidden markov chains) I could screw around with. Not that they'd be easy to get to work, exactly, but it's something I could see myself doing given sufficient incentive.
So, to summarize, the problem is
represent the interface (necessary because the default interface always expects mouse input)
select a control
manipulate it using two sets of x/y coords (rotary/continuous controller) or, in the case of switches, preferrably use a gesture to switch it without giving/taking focus.
Any comments, especially from people who have worked/are working in multitouch io/NUI, are greatly appreciated. Links to existing projects and/or some good reading material (books, sites, etc) would be a big help.
Woah lots of stuff here. I worked on lots of NUI stuff during my at Microsoft so let's see what we can do...
But first, I need to get this pet peeve out of the way: You say "Kinect based multitouch". That's just wrong. Kinect inherently has nothing to do with touch (which is why you have the "select a control" challenge). The types of UI consideration needed for touch, body tracking, and mouse are totally different. For example, in touch UI you have to be very careful about resizing things based on screen size/resolution/DPI... regardless of the screen, fingers are always the same physical size and people have the same degreee of physical accuracy so you want your buttons and similar controls to always be roughly the same physical size. Research has found 3/4 of an inch to be the sweet spot for touchscreen buttons. This isn't so much of a concern with Kinect though since you aren't directly touching anything - accuracy is dictated not by finger size but by sensor accuracy and users ability to precisely control finicky & lagging virtual cursors.
If you spend time playing with Kinect games, it quickly becomes clear that there are 4 interaction paradigms.
1) Pose-based commands. User strikes and holds a pose to invoke some application-wide or command (usually brining up a menu)
2) Hover buttons. User moves a virtual cursor over a button and holds still for a certain period of time to select the button
3) Swipe-based navigation and selection. User waves their hands in one direction to scroll and list and another direction to select from the list
4) Voice commands. User just speaks a command.
There are other mouse-like ideas that have been tried by hobbyists (havent seen these in an actual game) but frankly they suck: 1) using one hand for cursor and another hand to "click" where the cursor is or 2) using z-coordinate of the hand to determine whether to "click"
It's not clear to me whether you are asking about how to make some existing mouse widgets work with Kinect. If so, there are some projects on the web that will show you how to control the mouse with Kinect input but that's lame. It may sound super cool but you're really not at all taking advantage of what the device does best.
If I was building a music synthesizer, I would focus on approach #3 - swiping. Something like Dance Central. On the left side of the screen show a list of your MIDI controllers with some small visual indication of their status. Let the user swipe their left hand to scroll through and select a controller from this list. On the right side of the screen show how you are tracking the users right hand within some plane in front of their body. Now you're letting them use both hands at the same time, giving immediate visual feedback of how each hand is being interpretted, and not requiring them to be super precise.
ps... I'd also like to give a shout out to Josh Blake's upcomming NUI book. It's good stuff. If you really want to master this area, go order a copy :) http://www.manning.com/blake/

Extending Functionality of Magic Mouse: Do I Need a kext?

I recently purchased a Magic Mouse. It is fantastic and full of potential. Unfortunately, it is seriously hindered by the software support. I want to fix that. I have done quite a lot of research and these are my findings regarding the event chain thus far:
The Magic Mouse sends full multitouch events to the system.
Multitouch events are processed in the MultitouchSupport.framework (Carbon)
The events are interpreted in the framework and sent up to the system as normal events
When you scroll with one finger it sends actual scroll wheel events.
When you swipe with two fingers it sends a swipe event.
No NSTouch events are sent up to the system. You cannot use the NSTouch API to interact with the mouse.
After I discovered all of the above, I diassembled the MultitouchSupport.framework file and, with some googling, figured out how to insert a callback of my own into the chain so I would receive the raw touch event data. If you enumerate the list of devices, you can attach for each device (trackpad and mouse). This finding would enable us to create a framework for using multitouch on the mouse, but only in a single application. See my post here: Raw Multitouch Tracking.
I want to add new functionality to the mouse across the entire system, not just a single app.
In an attempt to do so, I figured out how to use Event Taps to see if the lowest level event tap would allow me to get the raw data, interpret it, and send up my own events in its place. Unfortunately, this is not the case. The event tap, even at the HID level, is still a step above where the input is being interpreted in MultitouchSupport.framework.
See my event tap attempt here: Event Tap - Attempt Raw Multitouch.
An interesting side note: when a multitouch event is received, such as a swipe, the default case is hit and prints out an event number of 29. The header shows 28 as being the max.
On to my question, now that you have all the information and have seen what I have tried: what would be the best approach to extending the functionality of the Magic Mouse? I know I need to insert something at a low enough level to get the input before it is processed and predefined events are dispatched. So, to boil it down to single sentence questions:
Is there some way to override the default callbacks used in MultitouchSupport.framework?
Do I need to write a kext and handle all the incoming data myself?
Is it possible to write a kext that sits on top of the kext that is handling the input now, and filters it after that kext has done all the hard work?
My first goal is to be able to dispatch a middle button click event if there are two fingers on the device when you click. Obviously there is far, far more that could be done, but this seems like a good thing to shoot for, for now.
Thanks in advance!
-Sastira
How does what is happening in MultitouchSupport.framework differ between the Magic Mouse and a glass trackpad? If it is based on IOKit device properties, I suspect you will need a KEXT that emulates a trackpad but actually communicates with the mouse. Apple have some documentation on Darwin kernel programming and kernel extensions specifically:
About Kernel Extensions
Introduction to I/O Kit Device Driver Design Guidelines
Kernel Programming Guide
(Personally, I'd love something that enabled pinch magnification and more swipe/button gestures; as it is, the Magic Mouse is a functional downgrade from the Mighty Mouse's four buttons and [albeit ever-clogging] 2D scroll wheel. Update: last year I wrote Sesamouse to do just that, and it does NOT need a kext (just a week or two staring at hex dumps :-) See my other answer for the deets and source code.)
Sorry I forgot to update this answer, but I ended up figuring out how to inject multitouch and gesture events into the system from userland via Quartz Event Services. I'm not sure how well it survived the Lion update, but you can check out the underlying source code at https://github.com/calftrail/Touch
It requires two hacks: using the private Multitouch framework to get the device input, and injecting undocumented CGEvent structures into Quartz Event Services. It was incredibly fun to figure out how to pull it off, but these days I recommend just buying a Magic Trackpad :-P
I've implemented a proof-of-concept of userspace customizable multi-touch events wrapper.
You can read about it here: http://aladino.dmi.unict.it/?a=multitouch (see in WaybackMachine)
--
all the best
If you get to that point, you may want to consider the middle click being three fingers on the mouse instead of two. I've thought about this middle click issue with the magic mouse and I notice that I often leave my 2nd finger on the mouse even though I am only pressing for a left click. So a "2 finger" click might be mistaken for a single left click, and it would also require the user more effort in always having to keep the 2nd finger off the mouse. Therefor if it's possible to detect, three fingers would cause less confusion and headaches. I wonder where the first "middle button click" solution will come from, as I am anxious for my middle click Expose feature to return :) Best of luck.