I would like to build a simple text-dependent voice verification system: an application that receives a WAV/MP3 file containing a recorded word or sentence and checks both that it was spoken by the given user and that the word is correct. Basically, I want to authenticate a user by voice and password against previously recorded and processed samples. What is the simplest approach to do this?
I have googled this term: 'Text Dependent Voicer Verification', but haven't found any simple solutions or algorithms.
Thank you in advance
It looks like Sensory has it:
http://www.sensory.com/products/technologies/trulyhandsfree/
I'm not sure how accurate it is.
There are no simple solutions for such a complex problem. You can study the algorithms involved in the following textbook: http://www.amazon.com/Fundamentals-Speaker-Recognition-Homayoon-Beigi/dp/0387775919
In order to implement a CAPTCHA for my login page, I would like to understand how a translation test can be considered secure compared to popular image recognition patterns.
All customers will be bilingual speakers of an orally learnt and used Polynesian language i.e., no formal spelling conventions (hence the translation to English not the reverse), so instead of asking them to read distorted letters I would like to ask them to translate a simple sentence into English to be validated from the PHP server side.
Is this secure/accurate?
The basic argument that this kind of CAPTCHA ("Completely Automated Public Turing test to tell Computers and Humans Apart") is totally insecure is this: although the OP states that Google Translate does not "currently" support the Polynesian language, it cannot be ruled out that it will do so in the future.
More generally, translation is not a valid CAPTCHA test because of the following considerations:
Comparing a random sentence against its automated translation from a public translator (e.g. a future version of Google or Bing) offers no protection, because an attacker can submit the same phrase to the same translation engine.
A whitelist of sentences and their translations will eventually be defeated as automated public translators become more accurate.
I mean that modern public computer translators are perfecting their accuracy. If you assume that a public translator is unable to perform an accurate job today and challenge the user with a known phrase the translator cannot process, technology will tend to eventually fix that translation and you will get the challenge sentence easily spotted by robots.
That is the main principle behind ReCaptcha being used as an OCR, but from the opposite side. I suggest reading this paper; briefly, the researchers state that ReCaptcha is destined to improve its accuracy far more than automated OCRs because of user input.
Since Google and Bing Translate make wide use of user-submitted data to improve their translations, they are subject to human-aided machine learning that will eventually break the Turing test for this kind of challenge (e.g. ReCaptcha will read like a human, Translate will translate like a human).
After reading the comments, it seems the only danger I face is a vague future Google Translate one, which is unlikely to eventuate. So I'm going to stick my head out and say that this is indeed a good security measure which could conceivably be useful to many businesses or organisations that have such a customer base. Thanks for the assist.
A major point in its favor is ease of use for the customers, all of whom so far prefer it to trying to read a captcha. I put it on a live system, so 80+ people used it today.
I presume they all speak English too then? Unusual to require your users to be bilingual. Even if this is the case today, is it possible that with future growth you might be excluding certain users? What if someone moves into the area who wants to signup but only speaks English?
Language is a funny imprecise thing. You could take a sentence and probably translate it a number of different ways. Computers deal in precision so you need a question where there can only be one answer.
Also, the whole idea of a CAPTCHA is to make sure it's a real person, but it may not be too hard to write a program that uses Google Translate or something similar. It may not always get it right, but it would probably get through some of the time.
How can I slow down the speech of normal content inside the "Say" verb? The Spanish accent is VERY fast and most people have trouble following/understanding what is said. Ideally something like the following would be perfect:
<Say voice="woman" language="es" speed="0.5">El siguiente mensaje se repetirá en español</Say>
Note that I made up the speed="0.5" param. That is not an option for Twilio, but slowing the reading of the content of that "Say" verb by half is what I am looking for.
I don't think this is currently supported in any explicit way, so ideas on hackier ways to accomplish this are welcome too. The text is dynamic.
Thank you for your insight.
Friendly Neighbourhood Twilio Evangelist here:
Could I suggest you use Twilio's new voice, 'Alice', I think she sounds a lot better, but I'm afraid I don't speak Spanish so it's difficult to test:
<Say voice="alice" language="es-ES">...</Say>
Hope this helps!
You can configure the default voice Twilio uses for each locale in the Twilio Console here.
In particular, you can set it to the “Kimberly” or “Kendra” Amazon Polly voice, both of which are noticeably slower than the default “Salli” voice.
There are also clunkier solutions, like adding a comma or a period between each word, but they break the natural flow of reading the sentence. Those solutions work better for slowing the reading of numbers, as described in this StackOverflow question.
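If you do go the punctuation route with dynamic text, a minimal sketch could look like the following. The helper name `slow_say` and its default arguments are made up for illustration; it only builds the TwiML string and does not call any Twilio API.

```python
from xml.sax.saxutils import escape, quoteattr


def slow_say(text, voice="alice", language="es-ES", separator=","):
    """Build a <Say> element with a pause mark after every word.

    Hypothetical helper: the separator is inserted between words so the
    synthesizer pauses briefly, which slows the overall reading.
    """
    words = text.split()
    paced = (separator + " ").join(words)
    # escape() protects the dynamic text, quoteattr() the attribute values.
    return "<Say voice={} language={}>{}</Say>".format(
        quoteattr(voice), quoteattr(language), escape(paced)
    )


print(slow_say("El siguiente mensaje"))
```

How much this actually slows the speech depends on the voice, so it is worth listening to the result before relying on it.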
This was asked in an interview.
I think this can be solved by constructing a trie of all valid words; suggestions can then be made by following valid paths in the trie that lie close to the incorrect input.
Say the user types apfle: the system would detect that after ap a possible valid path is app, which then leads to apple.
Is there any better solution than this? Perhaps the one implemented by spell checkers.
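For reference, one way to sketch the trie idea is to walk the trie while keeping one row of the Levenshtein edit-distance table per node, and to prune branches that can no longer come within the allowed distance. This is only an illustration under my own assumptions (the names `build_trie`, `suggest` and the `max_distance` parameter are invented here), not what any particular spell checker does.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.word = None  # set on the node that ends a complete word


def build_trie(words):
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.word = word
    return root


def suggest(root, typed, max_distance=1):
    """Return sorted (distance, word) pairs within max_distance edits of typed."""
    results = []
    first_row = list(range(len(typed) + 1))
    for ch, child in root.children.items():
        _search(child, ch, typed, first_row, results, max_distance)
    return sorted(results)


def _search(node, ch, typed, prev_row, results, max_distance):
    # Extend the edit-distance table by one row for the trie edge `ch`.
    row = [prev_row[0] + 1]
    for i in range(1, len(typed) + 1):
        insert_cost = row[i - 1] + 1
        delete_cost = prev_row[i] + 1
        replace_cost = prev_row[i - 1] + (typed[i - 1] != ch)
        row.append(min(insert_cost, delete_cost, replace_cost))
    if node.word is not None and row[-1] <= max_distance:
        results.append((row[-1], node.word))
    # Prune: if every cell exceeds the limit, nothing below can match.
    if min(row) <= max_distance:
        for next_ch, child in node.children.items():
            _search(child, next_ch, typed, row, results, max_distance)


trie = build_trie(["apple", "apply", "ape", "maple", "banana"])
print(suggest(trie, "apfle"))  # [(1, 'apple')]
```

Raising `max_distance` widens the search at the cost of visiting more of the trie.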
See:
How does the Google "Did you mean?" Algorithm work?
How do I approximate "Did you mean?" without using Google?
How to write a spelling corrector
Youtube Video: Search 101
Typical search engines ship with a number of analyzers that address the same underlying problem. A very popular one is the n-gram analyzer.
Perhaps this helps.
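To give a rough idea of what n-gram matching does, here is a small sketch that compares words by the overlap of their character trigrams (Jaccard similarity). The function names and the `$` padding character are my own choices for illustration; real search-engine analyzers are configured differently.

```python
def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, padded at both ends."""
    padded = "$" * (n - 1) + word.lower() + "$" * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard similarity of the two words' n-gram sets (0.0 to 1.0)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def did_you_mean(query, vocabulary, n=3):
    """Pick the vocabulary word whose n-grams overlap most with the query."""
    return max(vocabulary, key=lambda word: ngram_similarity(query, word, n))


print(did_you_mean("apfle", ["apple", "banana", "maple"]))  # apple
```

Unlike edit distance, this needs no character-by-character alignment, which is why it fits well into an inverted index.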
I am a bit confused about how reCAPTCHA works. I have implemented it using RoR.
Sometimes, even if I enter only one of the two words, it returns true, while at other times it fails.
I am really confused and not able to understand the behaviour of reCAPTCHA.
Only one of the recaptcha words is "known" by the system - it is relying on the user performing the captcha to tell the system what the other word is, because it is not machine-readable.
That is the "point" of recaptcha, or the added benefit: it is not only performing a human test, it is also crowdsourcing, on a massive scale, the transcription of words where automated OCR has failed.
Recaptcha shows two words. One that a computer scanner has scanned and recognized and one that the computer scanner cannot recognize. Recaptcha checks for the word it knows the answer to and saves the response for the unknown word. These responses to the unknown words are compiled and analyzed so that it is essentially "solved" by humans and not by the computer scanner.
Here's more info, in their own words:
"But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct."
source - http://www.google.com/recaptcha/learnmore
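The control-word logic described above can be sketched roughly as follows. This is only an illustration of the idea, not reCAPTCHA's actual code: the function names, the exact-match comparison and the `min_votes` threshold are all assumptions of mine.

```python
from collections import Counter, defaultdict

# Guesses collected per unknown image; the image ids are hypothetical.
votes = defaultdict(Counter)


def check_challenge(control_answer, control_input, unknown_id, unknown_input):
    """Pass or fail on the control word only; record the guess for the unknown word."""
    passed = control_input.strip().lower() == control_answer.strip().lower()
    guess = unknown_input.strip().lower()
    if passed and guess:
        # The user proved to be human, so their reading of the unknown word counts.
        votes[unknown_id][guess] += 1
    return passed


def resolved_word(unknown_id, min_votes=3):
    """Return the most common guess once enough users agree, else None."""
    if not votes[unknown_id]:
        return None
    word, count = votes[unknown_id].most_common(1)[0]
    return word if count >= min_votes else None
```

With this logic, typing only the control word passes and typing only the unknown word fails, which would explain the behaviour described in the question.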
Recaptcha uses two words, one of which is known and one which is unknown (the unknown word is the one that the program is trying to help decipher; it's probably scanned out of an old book somewhere!). So really, all the service is looking for is the right answer to the KNOWN word. If that's the word you put in, it will succeed even if you don't put in anything for the unknown word. If you put in only the other word (the unknown one), it will fail.
I think that's the main point of recaptcha. It helps developers tell humans from robots, and it also helps digitize books.
There are always two words. One is easier to read. If you can read this word, it's fine, you're human.
The second word is a scan from a book where the automatic OCR (recognition) is not sure about the word. So users are helping to read this word so that books can be digitized better.
All,
I am a developer but like to know more about testing process and methods. I believe this helps me write more solid code as it improves the cases I can test using my unit tests before delivering product to the test team. I have recently started looking at Test Driven Development and Exploratory testing approach to software projects.
Now it's easier for me to find test cases for the code that I have written. But I am curious to know how to discover test cases when I am not the developer of the functionality under test.
For example, take a basic user registration form of the kind we see on various websites. Assuming the person testing it is not the developer of the form, how should one go about testing the input fields on the form? What would be your strategy? How would you discover test cases? I believe this kind of testing benefits from an exploratory testing approach, though I may be wrong here.
I would appreciate your views on this.
Thanks,
Byte
Bugs! One of my favorite starting places on a project for adding new test cases is to take a look at the bug tracking system. The existing bugs are test cases in their own right, but they also can steer you towards new test cases. If a particular module is buggy, it can lead you to develop more test cases in that area. If a particular developer seems to introduce a certain class of bugs, it can guide testing of future projects by that developer.
Another useful consideration is to look more at testing techniques, than test cases. In your example of a registration form, how would you attack it from a business requirements perspective? Security? Concurrency? Valid/invalid input?
Testing Computer Software is a good book on how to do all kinds of different types of testing; black box, white box, test case design, planning, managing a testing project, and probably a lot more I missed.
For the example you give, I would do something like this:
For each field, I would think about the possible values you can enter, both valid and invalid. I would look for boundary cases; if a field is numeric, what happens if I enter a value one less than the lower bound? What happens if I enter the lower bound as a value? Etc.
I would then use a tool like Microsoft's Pairwise Independent Combinatorial Testing (PICT) Tool to generate as few test scenarios as I could across the cases for all input fields.
I would also write an automated test to pound away on the form using random input, capture the results and see if the responses made sense (virtual monkeys at a keyboard).
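As a small sketch of the boundary-case and random-input steps above: the validator `is_valid_age` and its 18 to 120 range are hypothetical stand-ins for whatever the real form checks, and the pairwise step is left out because PICT is a separate tool.

```python
import random
import string


def boundary_values(lower, upper):
    """Values just outside, on, and just inside an inclusive integer range."""
    return [lower - 1, lower, lower + 1, upper - 1, upper, upper + 1]


def is_valid_age(value):
    # Hypothetical validator standing in for the form's real check.
    return isinstance(value, int) and 18 <= value <= 120


def random_inputs(count, seed=0, max_length=20):
    """Yield random printable strings for 'monkey' testing."""
    rng = random.Random(seed)  # seeded so a failing run can be reproduced
    for _ in range(count):
        length = rng.randint(0, max_length)
        yield "".join(rng.choice(string.printable) for _ in range(length))
```

You would feed `boundary_values(18, 120)` and the strings from `random_inputs(...)` to the form and check that each response makes sense.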
Ask questions. Keep a list of question words and force yourself to come up with questions about the product or a feature. Lists like this can help you get out of the proverbial box or rut. Don't spend too much time on a question word if nothing comes to you.
Who
Whose
What
Where
When
Why
How
How much
Then, when you answer them, ask "else" questions. This forces you to distrust, for a moment at least, your initial conclusions.
Who else
Whose else
etc..
Then, ask the "not" questions--negate or refute your assumptions, and challenge them.
Who not (eg, Who might not need access to this secure feature, and why?)
What not (what data will the user not care about? What will the user not put in this text box? Are you sure?)
etc...
Other modifiers to the questions could be:
W else
W not
W risks
W different
Combine two question words, eg, Who and when.
In the case of the form, I'd look at what I can enter into it and test various boundary conditions there, e.g. what happens if no username is supplied? I'm reminded that there are a few different forms of testing:
Black box testing - This is where you test without looking inside what is being tested. The challenge here is that not being able to see inside makes it hard to judge which tests are useful and how many different tests are worthwhile. This is of course what testing often looks like by default.
White box testing - This is where you can look at the code and use metrics like code coverage to ensure that you are covering a given percentage of the code base. This is generally better, as in this case you know more about what is being done.
There are also performance tests, as opposed to logic tests, that are worth noting, e.g. how fast does the form validate my input, rather than just whether the form does what it should.
Identify your assumptions from different perspectives:
How can users possibly misunderstand this?
Why do I think it acts or should act this way?
What biases might I have about how this software should work?
How do I know the requirements/design/implementation is what's needed?
What other perspectives (users, administrators, managers, developers, legal) might exist on priority, importance, goals, etc, of this software?
Is the right software being built?
Do I really know what a valid name/phone number/ID number/address/etc looks like?
What am I missing?
How might I be mistaken about (insert noun here)?
Also, use any of the mnemonics and testing lists noted here:
http://www.qualityperspectives.ca/resources_mnemonics.html
Discussing test ideas with others. When you explain your ideas to someone else, you tend to see ways to refine or expand on them.
Group brainstorming sessions. (or informally in pairs when necessary)
see these brainstorming techniques
Make data tables with major features listed across the top and side, and consider possible interactions between each pair. Doing this in three dimensions can get unwieldy.
Keep test catalogs with common questions and problem types for different kinds of tasks such as integer validation and workflow steps etc.
Make use of Exploratory Testing Dynamics and Satisfice Heuristic Test Strategy Model by James Bach. Both offer general ways to start thinking more broadly or differently about the product, which can help you switch between boxes and heuristics in testing.
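For the feature interaction table mentioned above, the pairs to consider can be enumerated mechanically. A tiny sketch, with a made-up feature list for a registration form:

```python
from itertools import combinations


def interaction_pairs(features):
    """List every unordered pair of features, i.e. the cells above the table's diagonal."""
    return list(combinations(features, 2))


# Hypothetical features of a registration form.
features = ["username", "password", "email", "captcha"]
for first, second in interaction_pairs(features):
    print(f"How does {first} interact with {second}?")
```

Four features give six pairs; the count grows quadratically, which is why the three-dimensional version gets unwieldy so quickly.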