Talking head library for Mac OS X - objective-c

Is there a "talking head" library for Mac OS X / Cocoa / Objective-C? Specifically, one that simplifies translating spoken text into visemes / facial expressions? Microsoft has "Microsoft Agent" as part of its Text to Speech API; does the Mac have a worthy competitor for this feature?

There is nothing to generate the face, but you can use the NSSpeechSynthesizerDelegate protocol to receive -speechSynthesizer:willSpeakPhoneme: messages so you can sync your own artwork with the speech.
I just posted a quick and dirty demo complete with project files and test mouth images. It's intended to show how easy this is to do. The hard part is the artwork. :-) The project and blog post should be enough to get anybody started.
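To illustrate the lookup step behind that approach, here is a minimal sketch. It is written in Python rather than Objective-C so it can be run anywhere, and it is not taken from the demo project. The phoneme symbols, the `MOUTH_FOR_PHONEME` table, and the image file names are all hypothetical placeholders. The real delegate callback supplies a numeric phoneme opcode, so in a Cocoa app the table would be keyed on those opcodes instead.

```python
# Hypothetical phoneme-to-mouth-image table. The symbols and file names
# are placeholders for illustration only, not values from Apple's API.
MOUTH_FOR_PHONEME = {
    "AA": "mouth_open.png",
    "IY": "mouth_wide.png",
    "UW": "mouth_round.png",
    "m": "mouth_closed.png",
    "b": "mouth_closed.png",
    "p": "mouth_closed.png",
    "f": "mouth_teeth_on_lip.png",
    "v": "mouth_teeth_on_lip.png",
}

# Image shown for any phoneme that has no dedicated artwork.
REST_IMAGE = "mouth_rest.png"


def mouth_image_for(phoneme):
    """Return the image to show while `phoneme` is being spoken.

    Unknown phonemes fall back to the resting mouth so the face
    never goes blank.
    """
    return MOUTH_FOR_PHONEME.get(phoneme, REST_IMAGE)


def frames_for(phonemes):
    """Map a sequence of phonemes to the sequence of images to display."""
    return [mouth_image_for(p) for p in phonemes]
```

In an app, the equivalent of `mouth_image_for` would run each time the synthesizer announces the next phoneme, and the result would be swapped into the image view.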

Related

Use Webcam to Track User's Finger Location

I have a project just starting up that requires the kind of expertise I have none of (yet!). Basically, I want to be able to use the user's webcam to track the position of their index finger, and make a particular graphic follow their finger around, including scaling and rotating (side to side of course, not up and down).
As I said, this requires the kind of expertise I have very little of - my experience consists mostly of PHP and some JavaScript. Obviously I'm not expecting anyone to solve this for me, but if anyone could direct me to a library or piece of software that could get me started, I'd appreciate it.
Cross compatibility is of course always preferred but not always possible.
Thanks a lot!
I suggest you start by reading about OpenCV.
OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real-time computer vision, originally developed by Intel's Russian research center in Nizhny Novgorod and now supported by Willow Garage and Itseez. It is free for use under the open-source BSD license. The library is cross-platform and focuses mainly on real-time image processing.
But first, since PHP and JavaScript are mainly used for web development, you should start learning a programming language that OpenCV supports, such as C, C++, Java, Python, or even C# (using EmguCV).
Also, here are some nice tutorials to get you started with hand gesture recognition using OpenCV. Link
Good luck!
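As a rough idea of what happens after the finger has been separated from the background, the sketch below computes the position, size, and rotation of a blob using image moments. It is a plain-Python illustration under the assumption that some earlier step (not shown) has already produced a binary mask where 1 marks finger pixels; the function name `blob_pose` is hypothetical. OpenCV exposes the same quantities through its own moments function, which would be far faster on real frames.

```python
import math


def blob_pose(mask):
    """Estimate centroid, area, and orientation of the set pixels in `mask`.

    `mask` is a list of rows, each a list of 0/1 values (1 = finger pixel).
    Returns (cx, cy, area, angle_radians), or None if the mask is empty.
    """
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    area = len(points)
    if area == 0:
        return None

    # Centroid: where the graphic should be drawn.
    cx = sum(x for x, _ in points) / area
    cy = sum(y for _, y in points) / area

    # Second-order central moments describe how the blob is spread out.
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    mu11 = sum((x - cx) * (y - cy) for x, y in points)

    # Angle of the blob's long axis relative to the image x axis.
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return cx, cy, area, angle
```

The centroid gives the position for the graphic, the area can drive its scale, and the angle can drive its side-to-side rotation.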

Is there a way to use text-to-speech directly in bb10

I want to generate speech from text in my BB10 app to give audio feedback to the user.
(The screen reader from the accessibility features is not sufficient.)
Has anybody already successfully implemented text-to-speech?
There are countless open source projects that do this on PC platforms. Your best bet may be adapting one of them to your needs. – Josh C
Any library you would recommend? It should have a C or C++ interface, must work offline (no server-based solution), and should not occupy too much memory. – thowa
I had to check to make sure it was written in C++, which it is. It is called eSpeak. I heard about it nearly 7 years ago when I was looking for a speech synthesizer that was powerful and robust enough to sound like a human. I believe it was eSpeak, and back then it was a complicated task to get it to produce realistic-sounding speech.
http://sourceforge.net/projects/espeak/files/
This one looks promising as well; however, it is written in Java.
http://mary.dfki.de/Download/openmary-open-source-emotional-text-to-speech-synthesis-system-released
Found here https://github.com/marytts/marytts

POS-Tagging API for Italian Language on Mac OSX

I am looking for a POS-tagging API that works on documents in Italian.
My preference is for open source code (possibly Ruby, JRuby, MacRuby, Java, or Scala).
The program I am writing will run on Mac OS X. I have already explored this list, but there is not much for "Italian Language".
As of 10.8, Cocoa's NSLinguisticTagger provides part-of-speech tags and lemmas for Spanish and Italian. I could try it, but before I upgrade from OS X 10.7, please let me know whether you think it is really worth it or whether I have other good options.
TreeTagger comes with support for Italian and runs on OS X.

General tutorial on OS X audio programming

I would like to do some fairly simple audio programming on OS X (using Lion and Xcode 4.3) -- synthesizing tones with given frequencies, mainly. Trouble is, Apple's documentation on the subject is way too high-level for my current knowledge of the subject. I've searched for weeks now for something that will get me started, to no avail.
Does anyone know of some Core Audio basic tutorial, or even some sample code, that will help me do fairly simple Core Audio tasks so that I can progress to understanding the Apple documentation?
I would suggest the book Learning Core Audio. There is also sample code from the book at that site.
If you are looking to synthesize audio fairly easily, you are going to want to use a third-party library. Two possible solutions are FMOD and SuperCollider.
The main difference between the two is that SuperCollider runs as a server that your app connects to as a client, whereas FMOD is compiled into your app and uses Core Audio to synthesize the sound. FMOD is clearly the choice if you are planning on distributing this app. SuperCollider also has its own language, and you would have to learn its basics to start tailoring your sound synthesis. Here are some links:
FMOD:
FMOD Downloads (Comes with a bunch of sample code)
SuperCollider:
SC Server Download
Sine Wave Generator Sample App
Great source of SC scripts and examples
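Independent of any of these libraries, the arithmetic behind "a tone with a given frequency" is small enough to sketch directly. The example below is a minimal Python illustration, not Core Audio, FMOD, or SuperCollider code. It computes sine samples and writes them to a WAV file using only the standard library. The function names, the 44100 Hz default sample rate, and the 16-bit mono format are assumptions chosen for the example.

```python
import math
import struct
import wave


def sine_samples(frequency_hz, duration_s, sample_rate=44100, amplitude=0.5):
    """Return 16-bit signed integer samples of a sine tone.

    `amplitude` is a fraction of full scale (0.0 to 1.0).
    """
    count = int(duration_s * sample_rate)
    peak = int(amplitude * 32767)
    return [
        int(peak * math.sin(2 * math.pi * frequency_hz * n / sample_rate))
        for n in range(count)
    ]


def write_wav(path, samples, sample_rate=44100):
    """Write mono 16-bit PCM samples to a WAV file at `path`."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        # Pack the samples as little-endian signed 16-bit integers.
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

For example, `write_wav("a440.wav", sine_samples(440.0, 1.0))` would produce one second of concert A. A Core Audio render callback fills its output buffer with the same per-sample formula, just in real time.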

Mac OS X speech to text API. Howto?

I have a program that receives an audio (mono) stream of bits from TCP/IP. I am wondering whether the speech (speech-recognition) API in Mac OS X would be able to do a speech-to-text transform for me.
(I don't mind saving the audio into a .wav file first and reading it, as opposed to doing the transform on the fly.)
I have read the official docs online, but they are a bit confusing, and I couldn't find any good example on this topic.
Also, should I do it in Cocoa/Carbon/Java or Objective-C?
Can someone please shed some light?
Thanks.
There are a number of examples that get copied under /Developer/Examples/Speech/Recognition when you install Xcode.
The Cocoa class for speech recognition is NSSpeechRecognizer.
I've not used it, but as far as I know speech recognition requires you to build a grammar to help the engine choose from a number of choices rather than allowing you to pass free-form input. This is all explained in the examples referred to above.
This comes a bit late perhaps, but I'll chime in anyway.
The speech recognition facilities in OS X (on both the Carbon and Cocoa side of things) are for speech command recognition, which means that they will recognize words (or phrases, commands) that have been loaded into the speech system language model. I've done some stuff with small dictionaries and it works pretty well, but if you want to recognize arbitrary speech things may turn hairier.
Something else to keep in mind is that the functionality that the speech APIs in OS X provide is not one to one. The Carbon stuff provides functionality that has not made it to NSSpeechRecognizer (the docs make some mention of this).
I don't know about Cocoa, but the Carbon Speech Recognition Manager does allow you to specify inputs other than a microphone so a sound stream would work just fine.
Here's a good O'Reilly article to get you started.
You can use either ApplicationServices's SpeechSynthesis (10.0+)
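// Create a CFString, convert it to a Pascal string, and speak it.
CFStringRef cfstr = CFStringCreateWithCString(NULL, "Hello World!", kCFStringEncodingMacRoman);
Str255 pstr;
CFStringGetPascalString(cfstr, pstr, 255, kCFStringEncodingMacRoman);
SpeakString(pstr);
CFRelease(cfstr); // release the string created above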
or AppKit's NSSpeechSynthesizer (10.3+)
NSSpeechSynthesizer *synth = [[NSSpeechSynthesizer alloc] initWithVoice:@"com.apple.speech.synthesis.voice.Alex"];
[synth startSpeakingString:@"Hello world!"];