Nuance: speech recognition and an algorithm for answering the user's queries

For voice-activated apps (virtual assistants) like AIVC for Android, which use Nuance speech recognition, which API is used to get the answer to the user's question? For example, if the user types "what is your name", the app gives an answer.
Is there a standard algorithm that understands the user's query and comes up with a probable answer?

Nuance has several speech recognition platforms/products, so it depends on which one you use. For your situation, you would probably start with Nuance Mix, a self-service speech recognition platform targeted at mobile developers. There is also Nina, which is a more full-service offering from Nuance.
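As for the algorithm: there is no single standard one. Assistants typically map the recognized text to an "intent" and return the answer attached to that intent, using anything from simple pattern matching to a full NLU service such as Nuance Mix. The sketch below only illustrates that idea; the class name and the canned answers are made up for the example and are not part of any Nuance API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration of intent matching; a real assistant would use
// a trained NLU service rather than hand-written patterns.
public class SimpleIntentMatcher {
    private final Map<Pattern, String> answers = new LinkedHashMap<>();

    public SimpleIntentMatcher() {
        // Each pattern stands in for one "intent" and maps to a canned answer.
        answers.put(Pattern.compile(".*what is your name.*", Pattern.CASE_INSENSITIVE),
                "My name is AIVC.");
        answers.put(Pattern.compile(".*what time is it.*", Pattern.CASE_INSENSITIVE),
                "Let me check the clock.");
    }

    public String answer(String query) {
        for (Map.Entry<Pattern, String> entry : answers.entrySet()) {
            if (entry.getKey().matcher(query).matches()) {
                return entry.getValue();
            }
        }
        return "Sorry, I don't know that one.";
    }

    public static void main(String[] args) {
        System.out.println(new SimpleIntentMatcher().answer("What is your name?"));
    }
}
```

A production system replaces the hand-written patterns with a trained intent classifier, but the control flow is the same: recognize the speech, classify the intent, then look up or compute the answer.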

Related

What is the difference between Google Cloud Vision API and Mobile Vision?

I have been playing with the Cloud Vision API. I did some label and face detection. During this Google I/O there was a session where they talked about Mobile Vision. I understand both APIs are related to machine learning in Google Cloud.
Can anyone explain (with use cases) when to use one over the other?
What sort of applications can we build using both?
Many different sets of application requirements and use cases fit one API or the other, so the decision is best made on a case-by-case basis.
It is worth noting that Mobile Vision runs on-device and offers real-time face detection and tracking, which the network-based Cloud Vision API does not. Mobile Vision is geared towards the use cases most likely to be encountered on a mobile device, and comprises the Face, Barcode Scanner and Text Recognition APIs.
A use case that needed both would combine on-device face detection with one of the features specific to the Cloud Vision API, such as detecting inappropriate content in an image.
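For example, on-device face detection with the Mobile Vision Face API looks roughly like this. This is a sketch that assumes the play-services-vision dependency is in the project; error handling is kept to a minimum.

```java
import android.content.Context;
import android.graphics.Bitmap;
import android.util.SparseArray;

import com.google.android.gms.vision.Frame;
import com.google.android.gms.vision.face.Face;
import com.google.android.gms.vision.face.FaceDetector;

// Rough sketch of on-device face detection with the Mobile Vision Face API.
public class FaceCounter {
    public static int countFaces(Context context, Bitmap bitmap) {
        FaceDetector detector = new FaceDetector.Builder(context)
                .setTrackingEnabled(false) // single still image, no tracking
                .build();
        try {
            if (!detector.isOperational()) {
                return -1; // detector dependencies not yet downloaded to the device
            }
            Frame frame = new Frame.Builder().setBitmap(bitmap).build();
            SparseArray<Face> faces = detector.detect(frame);
            return faces.size();
        } finally {
            detector.release(); // always free the native resources
        }
    }
}
```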
As I understand it, Google is planning to "wind down" Mobile Vision:
The Mobile Vision API is now a part of ML Kit. We strongly encourage you to try it out, as it comes with new capabilities like on-device image labeling! Also, note that we ultimately plan to wind down the Mobile Vision API, with all new on-device ML capabilities released via ML Kit. Feel free to reach out to Firebase support for help.
https://developers.google.com/vision/introduction?hl=en

Android offline voice recognition

I already use HTK (Hidden Markov Model Toolkit) for recognizing specific commands
used to control my Android application, but in this case I need to pass some voice data to a server and that may consume more time.
To prevent this latency, I am thinking about using pocketsphinx to recognize the voice data locally with the Android application so that I won't need to pass that audio to the server.
If this is a good idea, is it easy to learn pocketsphinx from scratch? Also, what are advantages and disadvantages of both techniques (server-based and local voice recognition), and which one is better?
CMUSphinx is definitely a great idea; it has a number of advantages over HTK:
Better license
Works offline on Android
Fast
Supports multiple languages out of the box
Easier to use and learn
You should definitely try Pocketsphinx; for more information, see
http://cmusphinx.sourceforge.net/2011/05/building-pocketsphinx-on-android/
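For reference, a minimal on-device command recognizer with pocketsphinx-android looks roughly like the sketch below. The acoustic model and dictionary file names come from the CMUSphinx tutorial assets and may differ in your project; treat this as an outline rather than a drop-in implementation.

```java
import java.io.File;
import java.io.IOException;

import android.content.Context;

import edu.cmu.pocketsphinx.Assets;
import edu.cmu.pocketsphinx.Hypothesis;
import edu.cmu.pocketsphinx.RecognitionListener;
import edu.cmu.pocketsphinx.SpeechRecognizer;
import edu.cmu.pocketsphinx.SpeechRecognizerSetup;

// Sketch of on-device keyword spotting with pocketsphinx-android.
public class LocalCommandRecognizer implements RecognitionListener {
    private static final String COMMAND_SEARCH = "command";
    private static final String COMMAND_PHRASE = "open settings"; // example phrase

    private SpeechRecognizer recognizer;

    public void start(Context context) throws IOException {
        File assetsDir = new Assets(context).syncAssets(); // copies bundled models
        recognizer = SpeechRecognizerSetup.defaultSetup()
                .setAcousticModel(new File(assetsDir, "en-us-ptm"))
                .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
                .getRecognizer();
        recognizer.addListener(this);
        // Spot a single command phrase; a JSGF grammar search could be added
        // instead for a richer command set.
        recognizer.addKeyphraseSearch(COMMAND_SEARCH, COMMAND_PHRASE);
        recognizer.startListening(COMMAND_SEARCH);
    }

    @Override public void onPartialResult(Hypothesis hypothesis) {
        if (hypothesis != null && COMMAND_PHRASE.equals(hypothesis.getHypstr())) {
            recognizer.stop(); // triggers onResult with the final hypothesis
        }
    }

    @Override public void onResult(Hypothesis hypothesis) { /* handle the command */ }
    @Override public void onBeginningOfSpeech() { }
    @Override public void onEndOfSpeech() { }
    @Override public void onError(Exception e) { }
    @Override public void onTimeout() { }
}
```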

Speech Recognition API match grammar when there is no sound (Microsoft)

I have built a speech recognition tool using the Microsoft SAPI and a Kinect.
Following a code sample, I load an XML grammar and start a SpeechRecognitionEngine.
Sometimes, when there is little or no sound, the SpeechRecognitionEngine gets a match with very high confidence (0.85) on a simple sentence: "Sarah what time is it".
Why does the engine trigger this strong match in silence?
Is there any workaround?
Here is my main class on GitHub.
I also wrote a blog post (in French) with a dump (WAV + XML).
I am not sure exactly which wave file you are talking about (I haven't spoken French since middle school), but I think this one from your dump qualifies: dump_2012_12.16_12.47.33.wav. It has a high confidence value of 0.857 and does not appear to have any speech in it, yet a spectrogram of the file shows that the audio does contain energy in the speech range.
Most speech recognition engines these days use a Hidden Markov Model (HMM) to match audio vector patterns to speech. The state of the art today is not always accurate at doing this, and HMMs tend to be really sensitive to background noise.
This is why most speech features in production today (like Siri) are push-to-talk: you push a button and have 5 seconds to speak into the microphone, so the system can be reasonably sure there is some kind of speech signal. Systems that are open mic (the Kinect is the only one I know of) try to use a form of echo cancellation to suppress background audio, but even with the state of the art there is still bleed-through.
The only relatively easy workarounds (again, not 100%) that I know of involve editing your grammar to include a garbage rule and shortening the possible phrase list. The garbage rule gives the speech engine a "run home to momma" option when it does not know what to do:
http://www.w3.org/TR/speech-grammar/#S2.2.3
Although I don't think this is the recommended usage, I have seen some systems behave better when using the garbage rule to help filter out background noise. Of course, they then have to ignore the garbage recognition events.
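For reference, a garbage rule in an SRGS XML grammar (the format linked above) can look roughly like this; the rule id and the phrase are placeholders for illustration:

```xml
<grammar version="1.0" xml:lang="en-US" root="command"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="command" scope="public">
    <!-- GARBAGE soaks up noise or filler before and after the real phrase -->
    <ruleref special="GARBAGE"/>
    <item>Sarah what time is it</item>
    <ruleref special="GARBAGE"/>
  </rule>
</grammar>
```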

Good speech recognition engine for Mac, not iOS?

Sorry if this is a repeat question, but I didn't see it anywhere.
I'm working on a Mac program that will take voice commands, and NSSpeechRecognizer isn't quite doing it for me.
I want something a little more dynamic so I can set alarms, make dates, give more natural commands, etc.
Every open source speech engine I've found is tailored toward iOS. Do OpenEars/VocalKit etc. still work just as well for Mac programs?
Speech recognition is exceptionally non-trivial. The engines that are free are free for a reason. If you expect dictation in any amount (like an alarm label), you're out of luck. There are reasons Siri requires an entire data center. The open source packages available won't get you much further than simple telephone auto-attendants.
Unless you have an extensive statistics background and free time, I'd recommend that you pursue licensing a commercial library or server implementation.
Pocketsphinx from Carnegie Mellon is about the only option:
http://cmusphinx.sourceforge.net/

Best mobile application development tool/environment? [closed]

I would like to develop a mobile application that is able to access all the features of the mobile device it runs on (camera, files, phone and network connectivity). I intend to build a series of applications that each have a specific function to perform, rather than a single application with a large feature set. My programming background is C, C# and web applications.
What would be the best tool set to use to do this? I have looked at using the NetBeans IDE to create Java ME applications using LWUIT; this looks promising, but what are the caveats?
I want to target the largest universe of mobile devices possible.
J2ME is the way to go to reach the masses, in the consumer or the business market. From a consumer standpoint, most of the world's mobile phones support J2ME. From a business standpoint, most of the world's smart phones support J2ME.
Nokia owns a 40% share of the smartphone market (and of the whole market) worldwide. Next in line is BlackBerry with a 13% share. Both have standard implementations of J2ME on their devices (though BlackBerry also has a proprietary version of Java). On top of this, most devices that run Windows Mobile come with a JVM as well (I developed a game recently that initially targeted the Sony Ericsson W810i, and it ran flawlessly on my HTC Tilt's JVM). Add in the fact that Android has a Java SDK, and the only segments you are really missing out on are the BREW-only phones and the iPhone.
I'm not a huge backer of J2ME. I just know that every other mobile platform has disappeared from my life over the last 3 years, because it makes financial sense for companies to target only the J2ME segment of the industry.
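For the NetBeans + LWUIT route mentioned in the question, a minimal LWUIT MIDlet looks roughly like the sketch below. The package and class names are from the LWUIT library as I recall them; treat this as an outline, not production code.

```java
import javax.microedition.midlet.MIDlet;

import com.sun.lwuit.Display;
import com.sun.lwuit.Form;
import com.sun.lwuit.Label;

// Minimal LWUIT "hello world" MIDlet for Java ME.
public class HelloMidlet extends MIDlet {
    public void startApp() {
        Display.init(this);               // bind LWUIT's display to this MIDlet
        Form form = new Form("Hello");    // an LWUIT top-level screen
        form.addComponent(new Label("Hello, mobile world"));
        form.show();
    }

    public void pauseApp() { }

    public void destroyApp(boolean unconditional) { }
}
```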
You are facing the usual main issue of mobile development: targeting as many handsets as possible with only one programming language means using J2ME, which doesn't quite give you access to all the features of the handset.
Most open handsets will support J2ME, but different phone manufacturers implement it in different ways, and fragmentation is enormous across the board. Unfortunately, the majority of open handsets (the ones where you can install third-party applications) only allow you to develop in J2ME.
The only good news is that your intent to write only small applications will provide a lot of relief from the fragmentation issues.
J2ME also has huge limitations in terms of file system access, a complete lack of a telephony API, very poor interaction with system application management, and so on.
In order to get full access to the features, you always need to use the native technology of the open platform you are targeting, be it Android, iPhone, the several variants of Symbian OS, BREW, Windows Mobile or Palm OS handsets. Each of these has its own native technology.
Writing your application many times in many different languages is the costly price of wanting both a large number of targeted handsets and access to the full features of each of them.
I'm a Symbian/J2ME veteran myself and, given your stated background and goals, I suspect that you are trying to learn about mobile technologies. I'll therefore shamelessly plug my book, which is meant as an introduction to the Symbian development ecosystem:
http://www.amazon.com/Quick-Recipes-Symbian-Smartphone-Development/dp/0470997834/
Good luck
If your primary goal is to reach the consumer market, Java 2 Micro Edition (J2ME) is probably your best bet. All popular mobile phones (from Nokia, Sony Ericsson, Samsung) come with a Java Virtual Machine installed. If you look at Google, for example, they are developing all of their mobile applications (like Gmail and Google Maps) in Java.
If you instead are targeting business customers, the Microsoft .NET Compact Framework is the way to go in my opinion. The Windows Mobile operating system holds a strong position in the business market, mainly because of Outlook Mobile and its integration with Exchange.
I respectfully disagree with Fostah.
If you want to reach the masses, the Web is your best bet. It's far easier to write a simple Web application that will work on millions of devices.
And the best bit is that with the Web you can easily update your application and improve the user experience every day.