On Windows, with C#, I was able to capture another window's "screen", use that for processing, and then send program-generated user input events to that window.
I would like to do the same with Objective-C on Mac OS X. So: any resources, or even a name for what I'm trying to do, would be great. It's very frustrating to try to find information on this when the only ways I can think to phrase my searches are too ambiguous.
Thanks!
EDIT: As a specific example, there might be a particular game that I want to make an AI for. In that case I would need to be able to send mouse and keyboard events to the game.
If you're looking to do automated user interface testing, look at Squish, eggPlant, expect and OS X Accessibility. If you're looking to programmatically control another program, use scripting (if you wrote the program to control, add scripting support; otherwise, see what scripting support the program offers), or try Automator.
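If you really do want raw pixels and synthesized input rather than scripting, the names to search for are Quartz Window Services (the CGWindowList functions) and Quartz Event Services (CGEvent). Below is a rough Swift sketch; "SomeGame" is a placeholder for the target application's name, and note that recent macOS versions gate window capture behind the Screen Recording permission and event posting behind the Accessibility permission (and Apple now steers new capture code toward ScreenCaptureKit):

import AppKit
import CoreGraphics

// Find the target window's ID by its owning application's name.
// "SomeGame" is a placeholder; substitute the real app name.
let windowList = CGWindowListCopyWindowInfo(.optionOnScreenOnly, kCGNullWindowID) as? [[String: Any]] ?? []
guard let windowID = windowList
    .first(where: { $0[kCGWindowOwnerName as String] as? String == "SomeGame" })?[kCGWindowNumber as String] as? CGWindowID
else { fatalError("window not found") }

// Capture that window's contents for processing.
let capture = CGWindowListCreateImage(.null, .optionIncludingWindow, windowID, [.boundsIgnoreFraming])
print("captured:", capture?.width ?? 0, "x", capture?.height ?? 0)

// Synthesize a mouse click at a point in global screen coordinates.
let point = CGPoint(x: 200, y: 300)
for type in [CGEventType.leftMouseDown, .leftMouseUp] {
    CGEvent(mouseEventSource: nil, mouseType: type,
            mouseCursorPosition: point, mouseButton: .left)?
        .post(tap: .cghidEventTap)
}

// Synthesize a key press (virtual key 0 is "A" on ANSI keyboard layouts).
for keyDown in [true, false] {
    CGEvent(keyboardEventSource: nil, virtualKey: 0, keyDown: keyDown)?
        .post(tap: .cghidEventTap)
}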
So I was thinking: is there a way I can program an AI that reads something (mostly numbers, in a font and a particular area of the screen that I will specify) and then performs some clicks on the screen according to what it read? The data (numbers) will change constantly, and the AI will have to watch for these changes and act accordingly. I am not asking exactly how to do that; I am asking whether it is possible and, if so, which approach I should take (for example, Python or something else) and where to start.
You need an OCR library, such as Tesseract, to recognize the digits; a computer-vision library like OpenCV can help with locating and preprocessing them. The rest is regular programming.
It is quite likely that your operating system doesn't allow you access to parts of the screen that are not owned by your application, so you are either blocked at this point or restricted to parts of the screen your application owns. (If I enter my banking details on the screen, I definitely don't want another app to be able to read them.)
Next, you'd need to find a way to read the pixels on the screen programmatically. That differs greatly from OS to OS, so it is very unlikely to be built into your language's standard library. You might be able to interface with whatever is available on your OS, or find a library that does it for you. This will give you an image, made of pixels.
Then you need some OCR software to read the text. AI doesn't seem to be involved in any of this.
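For what it's worth, on macOS (to pick one OS, since the details are OS-specific as noted above) the built-in Vision framework can handle the OCR step. A minimal Swift sketch, assuming you have already captured the area of interest as a CGImage:

import Vision
import CoreGraphics

// Run text recognition over a captured screen region and hand back the strings found.
func recognizeText(in image: CGImage, completion: @escaping ([String]) -> Void) {
    let request = VNRecognizeTextRequest { request, _ in
        let observations = request.results as? [VNRecognizedTextObservation] ?? []
        // Take the single best candidate for each piece of text found.
        completion(observations.compactMap { $0.topCandidates(1).first?.string })
    }
    request.recognitionLevel = .accurate
    try? VNImageRequestHandler(cgImage: image, options: [:]).perform([request])
}

// Usage: poll the region, parse the numbers, and decide what to click.
// recognizeText(in: capturedRegion) { strings in
//     let numbers = strings.compactMap { Int($0) }
//     print(numbers)
// }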
I'd like to programmatically find the currently running programs (or at least the program in the foreground), and also key events, on OS X.
I found the inter-application communication guidelines, but they don't seem to explain how to find out which applications are running.
I've found key events, but the documentation seems to imply that the frontmost application is the one that gets the key events, and only if it doesn't handle them do they go up the event chain. I'd like to programmatically intercept them.
Seems dubious, I know. I'm trying to use key events along with screen captures to try to best learn text on the screen - it's for research.
I'm using Swift, but I understand that an Objective-C example is still helpful, since both use the same libraries.
To get a list of running applications use:
NSWorkspace.shared.runningApplications
You can build a key logger (so to speak) by creating an event monitor (same document you linked, just a different section).
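Roughly, putting the two together (note that a global monitor only observes key events headed to other apps; truly intercepting them before delivery needs a CGEvent tap, and both require the user to grant a privacy permission):

import AppKit

// Enumerate running applications; `isActive` marks the frontmost one.
for app in NSWorkspace.shared.runningApplications {
    print(app.localizedName ?? "unknown", app.isActive ? "(frontmost)" : "")
}

// Observe key-down events delivered to *other* applications.
// Requires the Accessibility/Input Monitoring permission, and a running event loop.
let monitor = NSEvent.addGlobalMonitorForEvents(matching: .keyDown) { event in
    print("key down, keyCode:", event.keyCode)
}

// Get notified when the frontmost application changes.
let observer = NSWorkspace.shared.notificationCenter.addObserver(
    forName: NSWorkspace.didActivateApplicationNotification,
    object: nil, queue: .main
) { note in
    let app = note.userInfo?[NSWorkspace.applicationUserInfoKey] as? NSRunningApplication
    print("now frontmost:", app?.localizedName ?? "unknown")
}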
I am looking for a bit of simple code that will take whatever data comes from the keyboard at a given moment and let me check it for certain key presses. It doesn't matter what language it is in.
I am looking for a console solution, no GUI.
It sounds like you are looking for the curses library (you may also find an implementation as "ncurses", which is probably already installed on your system). This library powers most of the "full-screen" console programs that you might see on Linux.
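For instance, reading single key presses without waiting for Enter looks roughly like this in Swift, which can import the system ncurses on macOS through the Darwin module (on Linux you'd wrap ncurses in a system module instead; the same calls exist in every language with a curses binding):

import Darwin.ncurses

initscr()              // enter curses mode (takes over the terminal)
cbreak()               // deliver keys immediately instead of line-buffering
noecho()               // don't echo typed characters back
keypad(stdscr, true)   // translate arrow/function keys into single codes

addstr("Press q to quit\n")
refresh()

var key = getch()
while key != Int32(UInt8(ascii: "q")) {
    addstr("got key code \(key)\n")
    refresh()
    key = getch()
}
endwin()               // restore normal terminal behavior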
I'm working on an idea for a type of window manager for OS X, similar to Cinch or SizeUp. In order to do this I need to be able to determine the positions of various windows, and which window is active. Some kind of callback when the active window changes would also be useful, as would knowing how to handle multiple screens and multiple Spaces.
I'm resigned to the fact that I'll probably need to learn Objective-C for this, but if there is a way to do this type of thing from Java, that would be particularly awesome.
Can anyone point me to the appropriate place in the OSX API for these things?
You'll want to take a look at the Accessibility API, as discussed here.
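To give you a taste, here is a Swift sketch of reading the frontmost application's window positions through that API (the same C calls work from Objective-C; from Java you'd need a JNI/JNA bridge, and nothing works until the user grants your app Accessibility access):

import AppKit
import ApplicationServices

// The Accessibility API refuses to work until the user trusts this process
// in System Preferences > Security & Privacy > Accessibility.
guard AXIsProcessTrusted() else { fatalError("grant Accessibility access first") }

// Inspect the frontmost application's windows.
guard let frontApp = NSWorkspace.shared.frontmostApplication else { exit(1) }
let appElement = AXUIElementCreateApplication(frontApp.processIdentifier)

var windowsRef: CFTypeRef?
AXUIElementCopyAttributeValue(appElement, kAXWindowsAttribute as CFString, &windowsRef)

for window in (windowsRef as? [AXUIElement]) ?? [] {
    var positionRef: CFTypeRef?
    AXUIElementCopyAttributeValue(window, kAXPositionAttribute as CFString, &positionRef)
    var origin = CGPoint.zero
    if let positionRef = positionRef {
        AXValueGetValue(positionRef as! AXValue, .cgPoint, &origin)
    }
    print("\(frontApp.localizedName ?? "?") window at \(origin)")
}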
I've started programming on Mac OS X in Objective-C and decided to create a little card game. First I'm creating a command-line version; if everything works fine, I would like to add a GUI: nothing big, just a green window with cards that can be dragged and dropped.
Since I don't have any idea about how to do this: What can I use to implement my card game GUI?
Since Objective-C and Cocoa look like a "bundle" on Mac OS X, is it possible to use Cocoa for this (and how)? If not, what else should I use, or is there already something like this?
Regards,
inno
Apple has some sample code here that could point you in the right direction.
This is a fine study in MVC.
Your Model (an active game, player and non-player characters, the cards in the game, etc.) will be entirely in Foundation, at least at first. You can add AppKit-dependent properties such as images later, and use the C preprocessor to conditionalize that code if you want to continue maintaining your command-line program.
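For example (sketched in Swift for brevity, where canImport plays the role the C preprocessor plays in Objective-C; the Card type and the image naming are just placeholders):

import Foundation

// Model: pure Foundation, compiled into both the command-line and GUI targets.
struct Card {
    enum Suit: String, CaseIterable { case clubs, diamonds, hearts, spades }
    let suit: Suit
    let rank: Int   // 1 = ace ... 13 = king
}

#if canImport(AppKit)
import AppKit

// GUI-only extension: the command-line target never compiles this,
// so the model itself stays free of AppKit dependencies.
extension Card {
    var image: NSImage? { NSImage(named: "\(suit.rawValue)-\(rank)") }
}
#endif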
Your Controllers will also generally be in pure Foundation. I say “Controllers” because you'll have one for both programs (owning the model, responding to user actions, and running the game—that is, dealing cards, enforcing the rules, etc.), one specifically for the command-line program (holding the readline/output loop), and at least one specifically for the GUI program (owning the game window).
In the GUI app, you'll write your Views with AppKit, of course. In the command-line app, you may want to make a View class, separate from the Controllers, to make it easy to radically change the output quickly (even at runtime, if you want to allow that). Of course, that View will not descend from NSView, and will use terminal output instead of graphical drawing.
I recommend keeping the command-line version of the program alive after you make the GUI version, at least for a little while. You'll know you're doing it right when you can maintain both working versions of the program without much fuss, and even find a bug in one version of the program and fix it in both.