I am trying to develop a Cocoa application that needs to read highlighted (selected) text from any application. So far I cannot find a decent solution, because the accessibility API doesn't always work (e.g. with Firefox). Does anyone know how this is implemented in the text-to-speech feature included in Leopard?
I have found this SO question, but it hasn't been solved.
That's a Service, so anywhere that an application identifies its selection as text, the text-to-speech service will work.
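For what it's worth, here is a rough, untested sketch of a Service handler that speaks the current selection. The class name, the speakSelectedText: selector, and the use of NSSpeechSynthesizer are my own illustration, not how Apple's built-in service is actually implemented:

    #import <Cocoa/Cocoa.h>

    @interface SelectionSpeaker : NSObject
    @property (strong) NSSpeechSynthesizer *synthesizer;
    @end

    @implementation SelectionSpeaker

    // Services hand you the selection on a pasteboard. The selector must match
    // the NSMessage entry declared under NSServices in Info.plist.
    - (void)speakSelectedText:(NSPasteboard *)pboard
                     userData:(NSString *)userData
                        error:(NSString **)error {
        NSString *text = [pboard stringForType:NSPasteboardTypeString];
        if (text.length == 0) {
            *error = @"No text was found on the pasteboard.";
            return;
        }
        if (!self.synthesizer) {
            self.synthesizer = [[NSSpeechSynthesizer alloc] initWithVoice:nil];
        }
        [self.synthesizer startSpeakingString:text];
    }

    @end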
I want to make an app similar to TextExpander, but I am not sure how you would intercept the text. As far as I can tell, I need to start with NSAccessibility. Could anyone share some snippets, or at least point me in the right direction?
First off, you should be aware that, because of the sandboxing requirement, this isn't possible at all if you want to sell your app in the App Store.
If you don't intend to sandbox your app, you can use the NSEvent class method addGlobalMonitorForEventsMatchingMask: to create a global key event handler that gets called when keys are pressed in other apps (but not your own app; use addLocalMonitor... for that).
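Something like this untested sketch is the general shape (keyMonitor is assumed to be a property you declare so the monitor object stays alive; note that globally monitoring key events may itself require "access for assistive devices" to be enabled):

    #import <Cocoa/Cocoa.h>

    // In the app delegate. Global monitors only see events delivered to *other*
    // applications; add a local monitor as well if you also want your own app's
    // keystrokes. Keep the returned monitor object so you can remove it later
    // with +[NSEvent removeMonitor:].
    - (void)applicationDidFinishLaunching:(NSNotification *)notification {
        self.keyMonitor = [NSEvent addGlobalMonitorForEventsMatchingMask:NSKeyDownMask
                                                                  handler:^(NSEvent *event) {
            NSLog(@"Key pressed in another app: %@", [event charactersIgnoringModifiers]);
        }];
    }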
To actually insert snippets, like TextExpander, there are several ways. You can use the accessibility APIs, but that requires that the app(s) you're targeting support accessibility, which isn't always the case.
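If you go the accessibility route, a rough sketch (assuming the frontmost app exposes the focused element and its selected text, which not all apps do) looks roughly like this:

    #import <Cocoa/Cocoa.h>
    #import <ApplicationServices/ApplicationServices.h>

    // Ask the system-wide accessibility object for the focused UI element and
    // try to replace its selected text. Only works with "access for assistive
    // devices" enabled and in apps that actually expose these attributes.
    static void ReplaceSelectedText(NSString *replacement) {
        AXUIElementRef systemWide = AXUIElementCreateSystemWide();
        AXUIElementRef focused = NULL;
        AXError err = AXUIElementCopyAttributeValue(systemWide,
                                                    kAXFocusedUIElementAttribute,
                                                    (CFTypeRef *)&focused);
        if (err == kAXErrorSuccess && focused != NULL) {
            AXUIElementSetAttributeValue(focused,
                                         kAXSelectedTextAttribute,
                                         (__bridge CFTypeRef)replacement);
            CFRelease(focused);
        }
        CFRelease(systemWide);
    }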
Another option is to use the Quartz Event Services (CGEvent) APIs which provide (among other things) a low-level method to simulate key events.
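A minimal sketch of simulating a key press with Quartz Event Services (key code 0 is 'a' on a US keyboard layout; mapping snippet text to key codes is up to you):

    #import <ApplicationServices/ApplicationServices.h>

    // Simulate a press and release of a single key.
    static void SimulateKeyPress(CGKeyCode keyCode) {
        CGEventRef keyDown = CGEventCreateKeyboardEvent(NULL, keyCode, true);
        CGEventRef keyUp   = CGEventCreateKeyboardEvent(NULL, keyCode, false);
        CGEventPost(kCGHIDEventTap, keyDown);
        CGEventPost(kCGHIDEventTap, keyUp);
        CFRelease(keyDown);
        CFRelease(keyUp);
    }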
Edit: Never mind. You're asking about Mac OS; I thought you were asking about iOS.
You should look at how TextExpander is used by other apps. The target app has to build in support for TE by making an object provided by TE a delegate of the text field. You can't run your code in someone else's app. They have to compile your code into their app. That's why there's a TextExpander SDK.
Once the TextExpander code is in the target app, the text field delegate gets the shared snippets by looking for snippets put into a shared pasteboard.
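Purely to illustrate the mechanism, reading from a named pasteboard looks like this. The pasteboard name below is invented; the real name is defined by the TextExpander SDK:

    // Hypothetical: read snippet data another process has published on a named
    // pasteboard. "com.example.sharedsnippets" is an invented name.
    NSPasteboard *shared = [NSPasteboard pasteboardWithName:@"com.example.sharedsnippets"];
    NSString *snippetData = [shared stringForType:NSPasteboardTypeString];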
I want to add a similar feature to a tool I'm making. I'm interested in how it works code-wise. I want to be able to get an HTML page and exclude everything but the article.
The Readability project does something similar for Chrome and iOS. I'm not sure how it detects the article content automatically, but I know that Readability has an API for people who want to integrate its features. You might want to check that out.
http://www.readability.com/learn-more
If you're working with Ruby, you could use Pismo. It extracts an article from a given document.
I need to automatically transcribe some short MP3s as part of a proof of concept I am working on. I am currently looking into cloud solutions or web API services to send the MP3 as a simple HTTP request and receive a transcription back.
The only free/open-source solution I have found is here, but the demos don't seem to work (at least not on the files I need to transcribe). I have found some enterprise solutions for call centers, but so far nothing I can simply integrate into a project.
Are there any web-based speech recognition services available? One that can filter out minor background noise would be a plus.
Here is an unofficial method to access Google's ASR capability. I tested it just yesterday and it still works: you can get JSON-style ASR output with words and associated confidence scores from FLAC audio sampled at 16 kHz.
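If it helps, here is a rough sketch of posting FLAC audio over HTTP from a Cocoa app and reading the JSON back. The endpoint URL and query parameters are placeholders, not the actual unofficial Google endpoint:

    #import <Foundation/Foundation.h>

    // Placeholder endpoint and parameters -- substitute the real service URL.
    static void TranscribeFile(NSString *flacPath) {
        NSURL *url = [NSURL URLWithString:@"https://example.com/recognize?lang=en-US"];
        NSMutableURLRequest *request = [NSMutableURLRequest requestWithURL:url];
        request.HTTPMethod = @"POST";
        [request setValue:@"audio/x-flac; rate=16000" forHTTPHeaderField:@"Content-Type"];
        request.HTTPBody = [NSData dataWithContentsOfFile:flacPath];

        NSURLResponse *response = nil;
        NSError *error = nil;
        NSData *result = [NSURLConnection sendSynchronousRequest:request
                                               returningResponse:&response
                                                           error:&error];
        if (result) {
            id json = [NSJSONSerialization JSONObjectWithData:result options:0 error:&error];
            NSLog(@"Transcription response: %@", json);
        }
    }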
You can also try the speech recognition engine in Windows 7 to produce subtitles. Here is the tool for that.
This may be a good match. Also, their TechCrunch profile (see this) lists competitors as SimulScribe, SpinVox, Vlingo, Nuance, Microsoft, and Google.
Some of these links may be helpful.
Vlingo, Bing and Google have recognizers in the cloud, but I don't think they make them publicly programmable. I believe they are accessible only from their authorized clients.
For a proof of concept (and low volume), have you considered just using the desktop speech engines that come with Windows 7? The question "What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?" may be helpful. The MS desktop recognizers ship with a dictation grammar, and it sounds like that is what you will need.
I'm here to ask if any of you know how to develop an app for Mac OS X that keeps reading everything the user types in. An example of an app that implements this behavior is TextExpander.
TextExpander reads everything the user types, searching for abbreviations previously added to it. When one of these abbreviations is found, TextExpander replaces the abbreviation with the full content associated with it.
So, I would like to know what Objective-C or Cocoa facilities let you do this kind of thing.
P.S.: Just to mention, I'm not thinking about developing something like a keylogger. I'm just curious and thinking about developing a snippet platform.
This can be done with CGEventTap, but it requires that your process be running as root or that "access for assistive devices" be enabled.
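An untested sketch of a listen-only keyboard tap (without the required privileges, the tap simply fails to create):

    #import <Foundation/Foundation.h>
    #import <ApplicationServices/ApplicationServices.h>

    // Called for every key-down event in the session; return the event
    // unmodified so typing still reaches the target application.
    static CGEventRef KeyTapCallback(CGEventTapProxy proxy, CGEventType type,
                                     CGEventRef event, void *refcon) {
        int64_t keyCode = CGEventGetIntegerValueField(event, kCGKeyboardEventKeycode);
        NSLog(@"key code: %lld", keyCode);
        return event;
    }

    static void InstallKeyTap(void) {
        CFMachPortRef tap = CGEventTapCreate(kCGSessionEventTap,
                                             kCGHeadInsertEventTap,
                                             kCGEventTapOptionListenOnly,
                                             CGEventMaskBit(kCGEventKeyDown),
                                             KeyTapCallback,
                                             NULL);
        if (!tap) return; // fails without root or assistive-device access
        CFRunLoopSourceRef source = CFMachPortCreateRunLoopSource(kCFAllocatorDefault, tap, 0);
        CFRunLoopAddSource(CFRunLoopGetCurrent(), source, kCFRunLoopCommonModes);
        CGEventTapEnable(tap, true);
    }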
Check up on Services:
http://homepage.mac.com/simx/technonova/tips/creating_a_service_for_mac_os_x.html
http://developer.apple.com/mac/library/documentation/Cocoa/Conceptual/SysServices/introduction.html
That is one way to achieve this.
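To give a rough idea of the moving parts, registering a service provider looks something like this. The class name and the expandSnippet: selector are invented; whatever selector you use must match the NSMessage entry you declare under NSServices in your Info.plist:

    #import <Cocoa/Cocoa.h>

    // Hypothetical provider object: it receives the current selection when the
    // user invokes the service from another app's Services menu.
    @interface SnippetServiceProvider : NSObject
    @end

    @implementation SnippetServiceProvider

    - (void)expandSnippet:(NSPasteboard *)pboard
                 userData:(NSString *)userData
                    error:(NSString **)error {
        NSString *selection = [pboard stringForType:NSPasteboardTypeString];
        NSLog(@"Service invoked with selection: %@", selection);
    }

    @end

    // In the app delegate, typically applicationDidFinishLaunching::
    //   [NSApp setServicesProvider:[[SnippetServiceProvider alloc] init]];
    //   NSUpdateDynamicServices(); // ask the system to re-read service definitions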
I'm building a simple Cocoa app and I want to direct the audio output to a specific device instead of the system-selected one. I know some apps, like Skype, let you select where to send the output. How do they do this?
I tried the MTCoreAudio framework but I can't even compile my app (or their AudioMonitor demo) with it included and the errors aren't helpful (_objc_fatal). Are there any complete examples that I can learn from? So far my searches haven't turned anything up.
Thanks!
The CAPlayThrough example on the Mac Dev Center Sample Code library shows how to list all of the available input and output devices, and select a default device from a menu.
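If you'd rather enumerate devices yourself than dig through CAPlayThrough, here is a rough sketch (error handling omitted) of listing every device the HAL knows about:

    #import <Foundation/Foundation.h>
    #import <CoreAudio/CoreAudio.h>

    // Print the AudioDeviceID and name of every audio device on the system.
    static void ListAudioDevices(void) {
        AudioObjectPropertyAddress address = {
            kAudioHardwarePropertyDevices,
            kAudioObjectPropertyScopeGlobal,
            kAudioObjectPropertyElementMaster
        };
        UInt32 dataSize = 0;
        AudioObjectGetPropertyDataSize(kAudioObjectSystemObject, &address, 0, NULL, &dataSize);

        UInt32 deviceCount = dataSize / sizeof(AudioDeviceID);
        AudioDeviceID *devices = malloc(dataSize);
        AudioObjectGetPropertyData(kAudioObjectSystemObject, &address, 0, NULL, &dataSize, devices);

        for (UInt32 i = 0; i < deviceCount; i++) {
            CFStringRef name = NULL;
            UInt32 nameSize = sizeof(name);
            address.mSelector = kAudioObjectPropertyName;
            AudioObjectGetPropertyData(devices[i], &address, 0, NULL, &nameSize, &name);
            NSLog(@"Device %u: %@", (unsigned)devices[i], (__bridge NSString *)name);
            if (name) CFRelease(name);
        }
        free(devices);
    }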
Have you looked through the sample code on http://developer.apple.com ?
Look at these projects: http://developer.apple.com/mac/library/navigation/index.html?section=Resource+Types&topic=Sample+Code
Namely the DefaultAudioUnit project.
I should say that working with Core Audio is more challenging than working with Cocoa. Most of the APIs are C-based (I find that harder). You should also read the Core Audio Programming Guide to get a sense of how the audio system is put together.
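For the original question of routing output to a specific device, my understanding is that the usual approach is to create a HAL output unit and set kAudioOutputUnitProperty_CurrentDevice on it. A rough, untested sketch:

    #import <CoreAudio/CoreAudio.h>
    #import <AudioUnit/AudioUnit.h>

    // Create a HAL output unit and point it at a specific AudioDeviceID
    // (e.g. one obtained from device enumeration or the CAPlayThrough sample).
    static OSStatus RouteOutputToDevice(AudioDeviceID deviceID, AudioUnit *outUnit) {
        AudioComponentDescription desc = {0};
        desc.componentType = kAudioUnitType_Output;
        desc.componentSubType = kAudioUnitSubType_HALOutput;
        desc.componentManufacturer = kAudioUnitManufacturer_Apple;

        AudioComponent comp = AudioComponentFindNext(NULL, &desc);
        if (!comp) return -1;
        OSStatus status = AudioComponentInstanceNew(comp, outUnit);
        if (status != noErr) return status;

        // Tell the output unit which hardware device to render to.
        status = AudioUnitSetProperty(*outUnit,
                                      kAudioOutputUnitProperty_CurrentDevice,
                                      kAudioUnitScope_Global,
                                      0,
                                      &deviceID,
                                      sizeof(deviceID));
        if (status != noErr) return status;

        return AudioUnitInitialize(*outUnit);
    }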