How to identify the voice in command trigger - voice-recognition

I tried the Android demo of CMU Sphinx. It works well, but it also triggers when anyone else says "Oh mighty computer".
So I want to make something like Ok Google, which only activates when I myself say "Oh mighty computer".
Is there any way to do it on CMU Sphinx?

This feature is not supported. You have to implement voice identification yourself.
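A minimal sketch of what "implementing it yourself" could look like, assuming you already have some way to turn an utterance into a fixed-length speaker embedding (that step is not part of CMU Sphinx and is not shown here). The function names and the 0.8 threshold are illustrative assumptions, not anything from the Sphinx API. The idea is to accept the trigger only if the keyword was spotted and the speaker matches the enrolled voice:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def should_trigger(keyword_detected, utterance_embedding, enrolled_embedding,
                   threshold=0.8):
    """Fire only when the keyword was spotted AND the speaker looks enrolled.

    The embeddings are hypothetical inputs from a speaker-embedding model of
    your choice; the threshold is a placeholder to tune on your own recordings.
    """
    if not keyword_detected:
        return False
    return cosine_similarity(utterance_embedding, enrolled_embedding) >= threshold
```

The keyword spotter stays as it is in the demo; this check would simply run on the same audio before the app reacts.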

Related

Using different intonations with Watson text to speech

I am developing a PoC using Watson text to speech and Watson conversation.
Sometimes the chatbot needs to ask a question, so I'd like text to speech to synthesize the voice with an interrogative intonation.
Is this possible?
Watson Text to Speech supports SSML, and has expressive SSML tags.
The one you want to use is Uncertainty, which is defined as "conveys an uncertain, interrogative message".
Example:
<express-as type="Uncertainty">
Could she still be in the office? She told me that she might leave early.
</express-as>
More details on its usage are here:
https://console.bluemix.net/docs/services/text-to-speech/SSML-expressive.html#the-express-as-element
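If the chatbot's text is generated at run time, the wrapping can be done in code before the string is sent to the service. This is a rough sketch, assuming the voice you use accepts the express-as element shown above; the helper name as_question is made up for illustration, and the call to the Text to Speech service itself is not shown:

```python
from xml.sax.saxutils import escape

def as_question(text, express_type="Uncertainty"):
    """Wrap plain text in an express-as element (hypothetical helper).

    The text is XML-escaped so characters like < and & do not break the SSML.
    """
    return '<express-as type="{}">{}</express-as>'.format(
        express_type, escape(text)
    )

ssml = as_question("Could she still be in the office?")
```

The resulting string is what you would pass as the text to synthesize instead of the raw question.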
Yes, you can certainly use text-to-speech (TTS) for output and speech-to-text (STT) for input. You would need to use a middleware or app layer to drive the conversation and route the input/output to the other services (see "how to use" in the docs).
I have used the following TJBot recipe as a simple, good starter for some projects: https://github.com/damiancummins/tell_the_time
Unfortunately, concatenative TTS may have trouble producing the correct intonation for questions. If you think this happens consistently or too often, please open a bug.
If a specific question gets the wrong intonation, try rephrasing it a little if possible. A useful trick for this voice could be to use a double question mark '??'.

How do I install OS X TTS voices from Objective-C

OS X Lion comes with some fantastic voices from Nuance. I would like to use them from my software. Currently, however, the user has to go manually to System Preferences, Dictation and Speech, System Voice -> Customize, and then download the voices from Apple. I would like to call something from Objective-C so that voices that are missing (say, Chinese voices) are automatically downloaded in the background. How can this be done?
Incidentally, the available voices on the system can be found using
[NSSpeechSynthesizer availableVoices]
but all the possible voices (so far) can only be found in the Dictation and Speech dialog.
I highly doubt there is any reliable way to do this.
You should just point the user to a help page explaining the process (preferably a help page on Apple's website if possible, since they'll keep it updated if the process changes).
You're going to need root access to do it, which is not allowed for App Store apps. And it almost certainly needs to tie into the software update process, which is extremely complex, completely undocumented, and subject to change at any time.
Alternatively, contact Nuance for a license to distribute the voices yourself. They are all for sale. Maybe you can even get a discounted price if you only install them on Macs where the user can get them for free.
You can use this AppleScript to install a TTS voice programmatically, but macOS already ships with all the voices!
Not all of them are shown in "Dictation & Speech -> System Voice", though.
The reason is that the preinstalled ones are compact versions, and macOS lists only the full versions of installed voices, such as "Alex".
Examples:
My configuration has only English voices installed, but the following examples work perfectly with Spanish text.
Swift example:
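import AppKit
// Compact voices can be selected by identifier even if they are not listed in System Preferences.
let synth = NSSpeechSynthesizer(voice: NSSpeechSynthesizer.VoiceName(rawValue: "com.apple.speech.synthesis.voice.diego"))
synth?.startSpeaking("España y los españoles")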
Terminal example:
say -v monica "En España a los españoles les encanta paella"
How to get the download list is still a mystery.
Using Wireshark, it is possible to determine the voice location. Indeed, a pkg is downloaded from Apple. Whether credentials etc. are needed is a separate issue.
Here are additional sample URLs:
/content/downloads/10/29/041-5265/kKRs5z32DcbrGdV4TZvY9hcqy8tcpkbX43/MLV_en_AU_karen.pkg
After downloading, you could then run the installer to install the package.
Update: I was able to download the pkg from Windows over HTTP. Also, this other link
http://swcdn.apple.com/content/downloads/59/49/041-4469/WNshMBxxbxx5qXv27Mrvxm7yKcSRM75sVw/041-4469.English.dist
gives information about 041-4469 including the package name.

How to make a library for the webcam?

I'm learning Factor and I thought it would be great to have a small program to capture images from the webcam that comes with my Mac Pro. I know every webcam will be very different, but it sounds like something I should be able to do. I want to create a library with support for Mac, Linux and Windows. The problem is that I'm not sure where to start.
Factor-based answers are welcome, but I'm looking for a language-agnostic solution. When I google for it, all I get is programs that capture images. I want to learn how to interact with the drivers, I guess, on the three big operating systems.
I think the only clue I have is the ioctl wiki page. How would you start such a project? What kind of google keywords would you use? Books?
It's not clear if you want to write a driver for your particular webcam or a library that makes talking to the existing driver easier.
If you want to write a driver for your webcam, you probably want to investigate libusb for Mac and Linux and libusb-win32 for Windows. You would need to understand the protocol that your webcam speaks, though. You could probably read the source code of the existing Linux driver (assuming there is one, which is pretty likely).
As for Google search terms, you might try "video capture". Looking for Python/Ruby etc. code or open source programs may also turn up code you can read to see how to do what you want.
Perhaps if you describe in a little more detail what you're trying to accomplish, someone could give you better suggestions.

Google Wave extension for Programmers and their Code

Sorry if this is well known but Googling for my answer only came up with links about making Google Wave gadgets.
My question is, are there any Google Wave gadgets that allow for better collaborative code editing? I mean, I can set the font to fixed width etc., but are there any gadgets designed for it?
Responses shouldn't include anything about git or svn. I use those when I want to use those. This is about Google Wave!
Here is a huge list of robots available for Wave: http://www.chaaps.com/huge-list-of-125-google-wave-robots-add-bots-and-enjoy-wave.html
Maybe there is one in there?
Don't know how well it works but found an extension called CodeBot.
http://aaron.oirt.rutgers.edu/myapp/root/gadget/codeGadget
It's a work in progress. Let me know if you want the source code.
I will package it and release it, or something like it, for the next WHIFF release.

How can you programmatically control a Casio Exilim EX-F1 digital camera?

I would like to be able to programmatically emulate a shutter button press on my Casio Exilim EX-F1 digital camera.
It comes with a USB tethered remote control that can emulate a shutter press, so I would think there is a way to emulate that from a PC.
I've looked and can't find any libraries or anything for controlling this camera.
Anybody have any ideas? How about a way to "sniff" the USB being sent from the remote (I can't imagine that's easy).
OK, it might be out-of-the-box thinking, but the easiest solution I can think of, without having the patent documentation and technical specs in front of me (the normal route people use for this sort of thing), is to use Lego Mindstorms robotics to press the shutter button physically.
Edit: Anyway, besides the Lego, which would be my course of action, the hard-core way is to use the spec sheets. You can normally get them off the manufacturer's website, but then you're basically into driver programming. If you find that prospect attractive in any way, this link will give you some ideas for doing it on Windows.
In case anyone finds this question, Casio finally released an official way to do this with the free EX-F1 Controller software (with special firmware included):
http://support.casio.com/download.php?rgn=5&cid=001&pid=573
It has its limitations, but it certainly makes more possible.
There's an unofficial API to control the Casio EX-F1. It's a reverse-engineered, free (as in freedom) product.
http://code.google.com/p/exf1ctrl/