CGWindowListCreate generates a hugely long list of windows - objective-c

When I use CGWindowListCreate from Quartz Window Services, it generates a very long array of window IDs. I tried turning on the option to exclude desktop elements, but I still get a list of 30-40 windows even when only 3 or 4 of what I would call windows are open.
Here is how I am doing it:
CGWindowListOption opt = 1 << 4;
CFArrayRef windowids =CGWindowListCreate(opt,kCGNullWindowID);
I am wondering what I am doing wrong that is causing this problem, and what I can do to fix it. I simply want the program to list windows created by applications, such as finder windows or browser windows, and not whatever else it is including. Thank you in advance for your help.

That option on its own returns every window, whether it is on screen or off screen. You should combine it with kCGWindowListOptionOnScreenOnly, and use the named constant kCGWindowListExcludeDesktopElements instead of hardcoding 1 << 4. It will look like this:
CGWindowListOption opt = kCGWindowListOptionOnScreenOnly|kCGWindowListExcludeDesktopElements;
CFArrayRef windowids =CGWindowListCreate(opt,kCGNullWindowID);
That is what I gathered from the docs anyway.
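To make the difference between the two calls concrete, here is a minimal C sketch of how the option bits combine. The Mirror-prefixed names are hypothetical stand-ins that copy the values I believe CGWindow.h assigns to CGWindowListOption, so the snippet compiles without Core Graphics; check them against your SDK header and use the real constants in app code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the CGWindowListOption bit values (assumed from
   CGWindow.h, for illustration only). */
enum {
    MirrorOptionAll                 = 0,
    MirrorOptionOnScreenOnly        = 1u << 0,
    MirrorOptionOnScreenAboveWindow = 1u << 1,
    MirrorOptionOnScreenBelowWindow = 1u << 2,
    MirrorOptionIncludingWindow     = 1u << 3,
    MirrorExcludeDesktopElements    = 1u << 4
};

/* True when every bit of `flag` is set in `options`. */
static bool has_option(uint32_t options, uint32_t flag) {
    return (options & flag) == flag;
}

/* The question's hardcoded value: only the exclude-desktop bit is set,
   so off-screen windows are still included. */
static uint32_t question_options(void) {
    return 1u << 4;
}

/* The answer's value: on-screen windows only, minus desktop elements. */
static uint32_t answer_options(void) {
    return MirrorOptionOnScreenOnly | MirrorExcludeDesktopElements;
}
```

With these helpers, has_option(question_options(), MirrorOptionOnScreenOnly) is false, which is why the original call also lists windows that are not on screen.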

One solution I found is to filter the window list to only those windows "below" the Dock (in terms of window layering).
The code below worked well for me. It fetches all on-screen windows (excluding desktop elements), extracts the window ID of the "Dock" window from the list, and then fetches the on-screen windows again, keeping only those "below" the Dock window.
// Fetch all on screen windows
CFArrayRef windowListArray = CGWindowListCreate(kCGWindowListOptionOnScreenOnly|kCGWindowListExcludeDesktopElements, kCGNullWindowID);
NSArray *windows = CFBridgingRelease(CGWindowListCreateDescriptionFromArray(windowListArray));
NSLog(@"All on screen windows: %@", windows);
// Find window ID of "Dock" window
NSNumber *dockWindowNumber = nil;
for (NSDictionary *window in windows) {
    if ([(NSString *)window[(__bridge NSString *)kCGWindowName] isEqualToString:@"Dock"]) {
        dockWindowNumber = window[(__bridge NSString *)kCGWindowNumber];
        break;
    }
}
NSLog(@"dockWindowNumber: %@", dockWindowNumber);
CFRelease(windowListArray);
if (dockWindowNumber) {
    // Fetch on screen windows again, filtering to those "below" the Dock window
    // This filters out all but the "standard" application windows
    windowListArray = CGWindowListCreate(kCGWindowListOptionOnScreenBelowWindow|kCGWindowListExcludeDesktopElements, [dockWindowNumber unsignedIntValue]);
    NSArray *appWindows = CFBridgingRelease(CGWindowListCreateDescriptionFromArray(windowListArray));
    NSLog(@"On screen application windows: %@", appWindows);
    // Release the second ID list as well
    CFRelease(windowListArray);
}
else {
    NSLog(@"Could not find Dock window description");
}
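Conceptually, "below the Dock" means "everything that comes after the Dock entry in a front-to-back list". The following C sketch models only that idea, under my own assumptions: WindowEntry and first_index_below are hypothetical names, and the real filtering is done by the window server, not by walking an array like this.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the window description array. */
typedef struct {
    unsigned int number;   /* plays the role of kCGWindowNumber */
    const char  *name;     /* plays the role of kCGWindowName, may be NULL */
} WindowEntry;

/* `entries` is ordered front to back. Returns the index of the first entry
   behind the one named `anchor`, or `count` when the anchor is missing or
   nothing lies behind it. */
static size_t first_index_below(const WindowEntry *entries, size_t count,
                                const char *anchor) {
    for (size_t i = 0; i < count; i++) {
        /* Many windows have no name, so guard against NULL before comparing. */
        if (entries[i].name != NULL && strcmp(entries[i].name, anchor) == 0) {
            return i + 1;
        }
    }
    return count;
}
```

Everything from the returned index to the end of the list corresponds to what the second CGWindowListCreate call is asked for.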

Related

Capture screenshot of macOS window

Note: this question is intentionally very general (e.g. both Objective-C and Swift code examples are requested), as it is intended to document how to capture a window screenshot on macOS as accessibly as possible.
I want to capture a screenshot of a macOS window in Objective-C/Swift code. I know this is possible because of the multitude of ways to take a screenshot on macOS (⇧⌘4, the Grab utility, screencapture on the command line, …), but I’m not sure how to do it in my own code. Ideally, I’d be able to specify a window of a particular application, and then capture it in an NSImage or CGImage that I could then process and display to the user or store in a file.
Screen capture on macOS is possible through Quartz Window Services, a facility of the Core Graphics framework. Our key function here is CGWindowListCreateImage, which “returns a composite image based on a dynamically generated list of windows,” or, in other words, finds windows based on specified criteria and creates an image with the contents of each. Perfect! Its declaration is as follows:
CGImageRef CGWindowListCreateImage(CGRect screenBounds,
CGWindowListOption listOption,
CGWindowID windowID,
CGWindowImageOption imageOption);
So, in order to capture one specific window on the screen, we’ll need its window ID (CGWindowID). To go about retrieving that, we’ll first need a list of all of the windows available on the system. We get that through CGWindowListCopyWindowInfo, which takes CGWindowListOptions and a corresponding CGWindowID that, together, select which windows to include in the resulting list. To get ALL the windows, we specify kCGWindowListOptionAll, and kCGNullWindowID, respectively. Also, if you haven’t figured it out already, this is a C API, so we’ll use a bridging cast to work with the friendlier Objective-C containers rather than the Core Foundation ones.
Objective-C:
NSArray<NSDictionary*> *windowInfoList = (__bridge_transfer id)
CGWindowListCopyWindowInfo(kCGWindowListOptionAll, kCGNullWindowID);
Swift:
let windowInfoList = CGWindowListCopyWindowInfo(.optionAll, kCGNullWindowID)!
as NSArray
From here, we need to filter our windowInfoList down to the specific window that we want. Chances are we want to filter first by application. To do that, we’ll need the process ID of our application of choice. We can use NSRunningApplication to accomplish this:
Objective-C:
NSArray<NSRunningApplication*> *apps =
[NSRunningApplication runningApplicationsWithBundleIdentifier:
/* Bundle ID of the application, e.g.: */ @"com.apple.Safari"];
if (apps.count == 0) {
// Application is not currently running
puts("The application is not running");
return; // Or whatever
}
pid_t appPID = apps[0].processIdentifier;
Swift:
let apps = NSRunningApplication.runningApplications(withBundleIdentifier:
/* Bundle ID of the application, e.g.: */ "com.apple.Safari")
if apps.isEmpty {
// Application is not currently running
print("The application is not running")
return // Or whatever
}
let appPID = apps[0].processIdentifier
With appPID in hand, we can now go ahead and filter down our window info list to only windows with a matching owner PID:
Objective-C:
NSMutableArray<NSDictionary*> *appWindowsInfoList = [NSMutableArray new];
for (NSDictionary *info in windowInfoList) {
if ([info[(__bridge NSString *)kCGWindowOwnerPID] integerValue] == appPID) {
[appWindowsInfoList addObject:info];
}
}
Swift:
var appWindowsInfoList = [NSDictionary]()
for info_ in windowInfoList {
let info = info_ as! NSDictionary
if (info[kCGWindowOwnerPID as NSString] as! NSNumber).intValue == appPID {
appWindowsInfoList.append(info)
}
}
We could have done additional filtering above by testing other keys of the info dictionary—for example, by name (kCGWindowName), or by whether the window is on-screen (kCGWindowIsOnscreen)—but for now, we’ll just take the first window in the list:
Objective-C:
NSDictionary *appWindowInfo = appWindowsInfoList[0];
CGWindowID windowID = [appWindowInfo[(__bridge NSString *)kCGWindowNumber] unsignedIntValue];
Swift:
let appWindowInfo: NSDictionary = appWindowsInfoList[0];
let windowID: CGWindowID = (appWindowInfo[kCGWindowNumber as NSString] as! NSNumber).uint32Value
And we have our window ID! Now, what else did we need for that call again?
CGImageRef CGWindowListCreateImage(CGRect screenBounds,
CGWindowListOption listOption,
CGWindowID windowID,
CGWindowImageOption imageOption);
First, we need a screenBounds to capture. According to the documentation, we can specify CGRectNull for this parameter to enclose all specified windows as tightly as possible. Works for me.
Second, we have to specify how we want to select our windows with listOption. We actually used one of these earlier, in our call to CGWindowListCopyWindowInfo, but there we wanted all the windows on the system; here, we only want one, so we’ll specify kCGWindowListOptionIncludingWindow, which, contrary to its documentation page, is meaningful on its own for CGWindowListCreateImage in that it specifies the window we pass, and only the window we pass.
Third, we pass our windowID as the window we want captured.
Fourth and finally, we can specify CGWindowImageOptions with the imageOption parameter. These affect the appearance of the resulting image; you can combine them through bitwise OR. The full list is here, but common ones include either kCGWindowImageDefault, which captures the window's contents along with its frame and shadow, or kCGWindowImageBoundsIgnoreFraming, which captures only the content, and kCGWindowImageBestResolution, which captures the window's content at the best resolution available, regardless of actual size (and, depending on the window, may be considerably large), or kCGWindowImageNominalResolution, which captures the window at its current size on the screen. Here, I’ve gone with kCGWindowImageBoundsIgnoreFraming and kCGWindowImageNominalResolution to capture only the content at the same size as on the screen.
Aaand, drumroll please:
Objective-C:
CGImageRef windowImage =
CGWindowListCreateImage(CGRectNull, kCGWindowListOptionIncludingWindow,
windowID, kCGWindowImageBoundsIgnoreFraming|
kCGWindowImageNominalResolution);
// NOTE: windowImage may be NULL if the capture failed
Swift:
let windowImage: CGImage? =
CGWindowListCreateImage(.null, .optionIncludingWindow, windowID,
[.boundsIgnoreFraming, .nominalResolution])
Here's the Objective-C code without all the exposition. It captures a window belonging to the current process, so there is no need to know a bundle ID ahead of time:
pid_t processID = [[NSProcessInfo processInfo] processIdentifier];
NSArray<NSDictionary*>* windowInfoList = (__bridge_transfer id) CGWindowListCopyWindowInfo(kCGWindowListOptionOnScreenOnly, kCGNullWindowID);
// CGWindowID is unsigned, so use kCGNullWindowID rather than -1 as the "not found" marker
CGWindowID windowID = kCGNullWindowID;
for (NSDictionary* info in windowInfoList) {
    pid_t thisProcess = [info[(__bridge NSString *)kCGWindowOwnerPID] intValue];
    if (thisProcess == processID) {
        windowID = [info[(__bridge NSString *)kCGWindowNumber] unsignedIntValue];
        break;
    }
}
// screenCG stays NULL if no window was found or the capture failed;
// release a non-NULL result with CGImageRelease when done
CGImageRef screenCG = NULL;
if (windowID != kCGNullWindowID)
    screenCG = CGWindowListCreateImage(CGRectNull, kCGWindowListOptionIncludingWindow, windowID, kCGWindowImageBoundsIgnoreFraming);
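A window ID is an unsigned 32-bit value, so 0 (the value of kCGNullWindowID) is a safer "not found" marker than -1 stored in a signed int. Here is a minimal C sketch of the same PID search over plain structs; WindowInfo, NULL_WINDOW_ID and first_window_for_pid are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened form of the two dictionary keys the loop reads. */
typedef struct {
    int32_t  owner_pid;      /* plays the role of kCGWindowOwnerPID */
    uint32_t window_number;  /* plays the role of kCGWindowNumber   */
} WindowInfo;

/* Mirrors kCGNullWindowID, which is 0 and is never a valid window ID. */
#define NULL_WINDOW_ID 0u

/* Returns the number of the first window owned by `pid`,
   or NULL_WINDOW_ID when that process owns no window in the list. */
static uint32_t first_window_for_pid(const WindowInfo *list, size_t count,
                                     int32_t pid) {
    for (size_t i = 0; i < count; i++) {
        if (list[i].owner_pid == pid) {
            return list[i].window_number;
        }
    }
    return NULL_WINDOW_ID;
}
```

Because the return type is unsigned, large IDs survive intact and the caller only has to compare against NULL_WINDOW_ID before attempting a capture.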

Simulating mouse-down event on another window from CGWindowListCreate

You can list the currently open windows using CGWindowListCreate, e.g.:
CFArrayRef windowListArray = CGWindowListCreate(kCGWindowListOptionOnScreenOnly|kCGWindowListExcludeDesktopElements, kCGNullWindowID);
NSArray *windows = CFBridgingRelease(CGWindowListCreateDescriptionFromArray(windowListArray));
// stuff here with windows array
CFRelease(windowListArray);
With this you can get a specific window, e.g. a window from Chrome that's somewhere in the background of the current workspace, but not minimized. I also found that you can simulate mouse-clicks anywhere on-screen using CGEventCreateMouseEvent:
CGEventRef click_down = CGEventCreateMouseEvent(NULL, kCGEventLeftMouseDown, point, kCGMouseButtonLeft);
CGEventPost(kCGHIDEventTap, click_down);
Instead of sending this event to the front-most window under this point on the screen, can I send it to a window in the background? Or is the only way to temporarily switch to that window (bring it to the front), click, and switch back to the previous frontmost window?
This post suggests the latter is possible:
Cocoa switch focus to application and then switch it back
However, I'm very interested in whether this can be avoided: can we send mouse clicks directly to a specific window in the background without bringing it into view? I can consider any Objective-C, C, or C++ options for this.
Finally got this working after finding an alternative to the deprecated GetPSNForPID function:
NSEvent *customEvent = [NSEvent mouseEventWithType: NSEventTypeLeftMouseDown
location: point
modifierFlags: NSEventModifierFlagCommand
timestamp:[NSDate timeIntervalSinceReferenceDate]
windowNumber:[self.windowID intValue]
context: nil
eventNumber: 0
clickCount: 1
pressure: 0];
CGEventRef cgEvent = [customEvent CGEvent];
CGEventPostToPid(PID, cgEvent);
I hadn't realised there is a CGEventPostToPid function as an alternative to CGEventPostToPSN, so there is no need for the ProcessSerialNumber struct.

How to check CGWindowID still valid

If I am getting a CGWindowID (_windowID) as follows...
// CGWindowListCopyWindowInfo returns a +1 reference, so hand ownership to ARC
NSArray *windowList = CFBridgingRelease(CGWindowListCopyWindowInfo(kCGWindowListOptionOnScreenOnly, kCGNullWindowID));
for (NSDictionary *info in windowList) {
    if ([[info objectForKey:(__bridge NSString *)kCGWindowOwnerName] isEqualToString:@"window name"] && ![[info objectForKey:(__bridge NSString *)kCGWindowName] isEqualToString:@""]) {
        _windowID = [[info objectForKey:(__bridge NSString *)kCGWindowNumber] unsignedIntValue];
    }
}
How do I properly test that the window ID is still valid and the window has not been closed? Do I just run similar code and check that the window ID still exists?
Thanks in advance
The documentation for the kCGWindowListOptionOnScreenOnly constant says:
List all windows that are currently onscreen. Windows are returned in
order from front to back. When retrieving a list with this option, the
relativeToWindow parameter should be set to kCGNullWindowID.
So the windows would certainly be on screen, since nothing appears to be happening between the call to CGWindowListCopyWindowInfo and your action on it.
Maybe you'd want to test to make sure they are not hidden or minimized?
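If you do go the "run similar code again" route from the question, the check itself is just a membership test against a freshly fetched list of IDs. A minimal C sketch, assuming you have already copied the current window IDs into a plain array (window_id_is_listed is a hypothetical helper, not part of Quartz Window Services):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when `wanted` appears in the freshly fetched `ids` array.
   An empty list (count == 0) never contains the ID. */
static bool window_id_is_listed(const uint32_t *ids, size_t count,
                                uint32_t wanted) {
    for (size_t i = 0; i < count; i++) {
        if (ids[i] == wanted) {
            return true;
        }
    }
    return false;
}
```

Keep in mind that the answer is only as fresh as the list: the window can still close between the check and whatever you do next, so the later call has to handle failure anyway.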

Find the currently active Finder window/folder

Is there any way to determine the currently active window, or a folder, in the Finder? I need this to determine, in some sense, an appropriate "default" location in which to do some particular things in my app.
Actually, does this question even make sense? Does this concept of a "currently active Finder window/folder" even exist in the first place? If it does not, I kindly ask how to get the currently selected Finder item.
Yes, the concept of the currently active Finder window does exist, as well as the currently selected item.
For example, the following AppleScript gets the selection, which is the current selection in the frontmost window. Since this returns a list of files or folders even when only a single item is selected, the next lines take the first item out of that list (after making sure that the list is not empty). You can then ask the Finder for the container window of the selected item, which returns a Finder window object.
tell application "Finder"
set selectedItems to selection
if ((count of selectedItems) > 0) then
set selectedItem to (item 1 of selectedItems) as alias
container window of selectedItem
end if
end tell
I'm pretty sure the code sidyll posted will work okay in 10.5 and earlier, but it errors out in 10.6 due to the inevitable changes and quirkiness that AppleScript seems to have from one version of OS X to the next.
[EDIT] Actually, I just figured out what's going on.
I usually have the Finder's Inspector window open all the time (the dynamic Get Info window you get through Command-Option-i), the upper right panel in the image below:
That image shows 3 different classes of windows:
1) The upper left, a Get Info window, is an information window, which inherits from the generic window class.
2) The upper right, an Inspector window, is a plain window.
3) The lower image shows a Finder window, which inherits from the generic window class.
If I run the following script with the setup of windows shown above:
tell app "Finder"
every window
end tell
it returns the following result:
{window "mdouma46 Info" of application
"Finder", information window "mdouma46
Info" of application "Finder", Finder
window id 1141 of application
"Finder"}
So, what was happening in my case is that, since the Inspector window is a floating utility panel, if it's currently being shown, asking the Finder for window 1 will always return the Inspector panel, since it's always floating in front of the other windows.
So the error I was getting when running the code was:
error "Can’t make «class fvtg» of
window 1 of application \"Finder\"
into type alias." number -1700 from
«class fvtg» of window 1 to alias
(In other words, the Inspector window, a plain window, doesn't have the FileViewer target (fvtg) property; only Finder windows do).
So, your code will work fine as long as the user doesn't have the Inspector window, the Preferences window, or a Get Info window that is frontmost. By changing window to Finder window, though, you can make sure that you only look at the file viewer windows that have the target property.
So, like this:
NSDictionary *errorMessage = nil;
NSAppleScript *script = [[[NSAppleScript alloc] initWithSource:
    @"tell application \"Finder\"\n"
    " if ((count of Finder windows) > 0) then\n"
    " return (POSIX path of (target of Finder window 1 as alias))\n"
    "end if\n"
    "end tell"] autorelease];
if (script == nil) {
    NSLog(@"failed to create script!");
    return nil;
}
NSAppleEventDescriptor *result = [script executeAndReturnError:&errorMessage];
if (result) {
    // POSIX path returns trailing /'s, so standardize the path
    NSString *path = [[result stringValue] stringByStandardizingPath];
    return path;
}
// The script failed; errorMessage holds the details
return nil;
I needed to do this in a project in the past and resorted to AppleScript:
// Get path
NSAppleScript *script = [[NSAppleScript alloc] initWithSource:
    @"tell application \"Finder\"\n"
    " return POSIX path of (target of window 1 as alias)\n"
    "end tell"];
NSDictionary *errors = nil;
NSAppleEventDescriptor *descriptor = [script executeAndReturnError:&errors];
if ((errors != nil) || (descriptor == nil)) {
    // There is no open window or an error occurred
} else {
    // what was retrieved by the script
    path = [descriptor stringValue];
}
[script release];

addGlobalMonitorForEventsMatchingMask only returning mouse position

I'm trying to learn to code for the Mac. I've been a Java guy for a while, so I hope the problem I'm running into is a simple misunderstanding of Cocoa.
I've got the following code:
-(IBAction)beginEventMonitor:(id)sender {
    _eventMonitor = [NSEvent addGlobalMonitorForEventsMatchingMask:(NSLeftMouseUpMask)
                                                          handler:^(NSEvent *incomingEvent) {
        //NSWindow *targetWindowForEvent = [incomingEvent window];
        NSLog(@"Got a mouse click event at %@", NSStringFromPoint([incomingEvent locationInWindow]));
    }];
}
-(IBAction)stopEventMonitor:(id)sender {
    if (_eventMonitor) {
        [NSEvent removeMonitor:_eventMonitor];
        _eventMonitor = nil;
    }
}
This is a simple hook to tell me when a mouse click happens at a global level. The handler is working, but the contents of the incomingEvent don't seem to be set to anything. The only useful information that I can find is the location of the mouse at the time of the click, and the windowId of the window that was clicked in.
Shouldn't I be able to get more information? Am I not setting up the monitor correctly? I'd really like to be able to know which window was clicked in, but I can't even find a way to turn the mouse location or windowId into something useful.
You can retrieve more information about the window using the CGWindow APIs (new in Leopard), for example:
CGWindowID windowID = (CGWindowID)[incomingEvent windowNumber];
// The array holds each window ID as a pointer-sized value, not a pointer to the ID.
// Passing &windowID directly would read 8 bytes from a 4-byte variable on 64-bit.
const void *values[] = { (const void *)(uintptr_t)windowID };
CFArrayRef a = CFArrayCreate(NULL, values, 1, NULL);
NSArray *windowInfos = (NSArray *)CGWindowListCreateDescriptionFromArray(a);
CFRelease(a);
if ([windowInfos count] > 0) {
    NSDictionary *windowInfo = [windowInfos objectAtIndex:0];
    NSLog(@"Name: %@", [windowInfo objectForKey:(NSString *)kCGWindowName]);
    NSLog(@"Owner: %@", [windowInfo objectForKey:(NSString *)kCGWindowOwnerName]);
    //etc.
}
[windowInfos release];
There's lots of information there (look in CGWindow.h or refer to the docs for available keys). There are also functions to create screenshots of just one window (which even works if it's partially covered by another window), cool stuff!