Does TensorFlow Audio/Speech Recognition work with multi-word trigger keywords?

Related link: https://www.tensorflow.org/tutorials/sequences/audio_recognition
How should I modify my TensorFlow "Simple Audio Recognition" training environment (number of input samples, choice of trigger keywords, training parameters, etc.) to get robust recognition of a unique trigger keyword (multi-word or single-word) in normal conversation?
The original TensorFlow "Simple Audio Recognition" comes with 10 single trigger keywords, each 1 second in duration. To avoid single trigger keywords being detected in normal conversation and causing false positives, I have recorded 400 samples (100 each from 4 different people) of the following two multi-word trigger keywords, each 1.5 seconds in duration: PLAY MUSIC, STOP MUSIC. After following the exact same training steps and compensating for the new 1.5-second duration in the code, I am getting 100% recognition of these two multi-word trigger keywords when pronounced correctly; however, further testing also shows that I am getting false positives during normal speech when any word of these trigger keywords is pronounced, e.g. STOP BLA BLA BLA, STOP VIDEO, PLAY BLA BLA BLA, PLAY VIDEO, etc.
Thank you for your kind response,
PM

You should add garbage speech (negative examples from ordinary conversation) to the training dataset; it is not clear whether you did that.
For very long phrases, it is more reliable to detect smaller chunks and ensure they are all present, i.e. to have a separate detector for "play" and for "music".
For example, Google separately detects "ok" and "google" in its "ok google" hotword, as described in "Small-Footprint Keyword Spotting Using Deep Neural Networks".
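As a rough illustration of the two-detector idea (this is not code from the tutorial), here is a minimal sketch that only fires the phrase when the second word is detected shortly after the first; the per-frame scores, window size and threshold are all assumptions you would tune for your own models:
from collections import deque

class PhraseDetector:
    """Fires only when the first word is followed by the second within a short window."""
    def __init__(self, window_frames=15, threshold=0.7):
        self.window_frames = window_frames      # max gap (in frames) between the two words
        self.threshold = threshold              # per-word detection threshold
        self.recent_first_word = deque(maxlen=window_frames)

    def update(self, first_word_score, second_word_score):
        # first_word_score / second_word_score are hypothetical per-frame
        # probabilities produced by your two single-word models.
        self.recent_first_word.append(first_word_score >= self.threshold)
        second_hit = second_word_score >= self.threshold
        # The phrase fires only if the first word was seen recently AND the second
        # word is firing now, which suppresses "PLAY VIDEO", "STOP BLA BLA", etc.
        return second_hit and any(self.recent_first_word)

# Usage: call update(play_score, music_score) once per audio frame.
play_music_detector = PhraseDetector()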


Convert all items in a list to string format

I am trying to separate sentences (with spaCy's sentencizer) within a larger text to process them in a transformers pipeline.
Unfortunately, this pipeline is not able to process the sentences correctly, since the sentences are not yet in string format after sentencizing the text. Please see the following information.
string = 'The Chromebook is exactly what it was advertised to be. It is super simple to use. The picture quality is great, stays connected to WIfi with no interruption. Quick, lightweight yet sturdy. I bought the Kindle Fire HD 3G and had so much trouble with battery life, disconnection problems etc. that I hate it and so I bought the Chromebook and absolutely love it. The battery life is good. Finally a product that lives up to its hype!'
# Added the sentencizer model so all the sentences in the review text are separated from each other
import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp(string)
sentences = list(doc.sents)
sentences
This leads to the following list:
[The Chromebook is exactly what it was advertised to be.,
It is super simple to use.,
The picture quality is great, stays connected to WIfi with no interruption.,
Quick, lightweight yet sturdy.,
I bought the Kindle Fire HD 3G and had so much trouble with battery life, disconnection problems etc.,
that I hate it,
and so I bought the Chromebook and absolutely love it.,
The battery life is good.,
Finally a product that lives up to its hype!]
When I provide this list to the following pipeline, I get this error: ValueError: args[0]: The Chromebook is exactly what it was advertised to be. have the wrong format. The should be either of type str or type list
# Now the list of sentences is processed into triplets
from transformers import pipeline
triplet_extractor = pipeline('text2text-generation', model='Babelscape/rebel-large', tokenizer='Babelscape/rebel-large')
model_output = triplet_extractor(sentences, return_tensors=True, return_text=False)
extracted_text = triplet_extractor.tokenizer.batch_decode([x["generated_token_ids"] for x in model_output])
print("\n".join(extracted_text))
Therefore, can someone please indicate how I can convert all the sentences in the 'sentences' list to string format?
Looking forward to your response. :)
Your sentences are Span objects. You can convert them to strings by using sentence.text, so [ss.text for ss in sentences] for all of them.
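For example, a minimal sketch of the fix, reusing the variable names from the question above:
# Convert the spaCy Span objects to plain strings before handing them to the pipeline.
sentence_texts = [ss.text for ss in sentences]   # Span -> str
model_output = triplet_extractor(sentence_texts, return_tensors=True, return_text=False)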
What is triplet_extractor? You don't explain it anywhere.

Postgres INSERT returning 'invalid input syntax' for json

Problem: Attempting to insert a JSON string into a Postgres table column of json datatype intermittently returns this error for some record insertion attempts but not others.
I have confirmed with multiple third-party JSON validators that the JSON I am inserting is indeed valid, and that any single quote ' characters have been escaped using the doubled '' technique, yet the issue persists.
What are some additional troubleshooting steps to consider?
Here is a scrubbed sample JSON I have attempted:
{"id": "jf4ba72kFNQ","publishedAt": "2012-09-02T06:07:28Z","channelId": "UCrbUQCaozffv1soNdfDROXQ","title": "Scout vs. Witch: a tale of boy meets ghoul (Official Version)","tags": ["L4D","TF2","SFM","animation","zombies","Valve","video game"],"description": "Howdy folks (he''s alive!). I made a new SFM video (October 2015), called \"Nick in a Hotel Room\". Please check it out: https://www.youtube.com/watch?v=FOCTgwBIun0\n\nAlso check out some early behind the scenes of Scout vs. Witch:\nhttps://www.youtube.com/watch?v=73tQEBgD09I\n\nYou can find links to my stuff on my website: http://nailbiter.net\n\n-----\n\nhey gang,\nI''m the animator who made this cartoon. Hope you like it.\n\nThis is my little mash-up of a bunch of stuff I like. What happens when the Scout from Valve''s Team Fortress 2 video-game walks into the wrong neighborhood (Left 4 Dead). Hilarity (and a bodycount) ensues. It was created using Source Film Maker (for all the dialog stuff and the montage at the beginning), and with TF2/Source SDK for the entire 300 alley-run sequence. I had already completed that part before SFM was released. The big zombie horde scenes and a couple others were shot in Left 4 Dead. I hope you get a kick out of it.\n\nStuff I did:\nI animated all of the characters (using Maya) except for the big crowd scenes and parts of the headcrab zombie (the crawling and the legs). The faces in the dialog scenes were animated in SFM.\n\nAlso did additional mapping, particles, motion graphics, zombie maya rigging, and created blendshapes for the Witch''s face to enable her to talk/emote. I didn''t do a full set, just the phonemes I needed for this performance. Inspiration for her performance was based on Meg Mucklebones (if you''ve ever seen Legend) mixed with the demon ladies in Army of Darkness. I have a feeling Valve had seen those movies too when they designed her..\n\nthanks for watching."}
I am answering my own question by enumerating all the troubleshooting steps I have found so far, ranging from 'working knowledge' that practitioners will already have to more obscure insights (some buried in the Postgres docs, which are thorough but esoteric) that I found through my own trial and error.
Steps
Make sure you have escaped any single quote ' characters by doubling them, i.e. ''
Make sure your JSON string is actually a single-line string - it is very easy to copy JSON as a multi-line string, and Postgres JSON columns will not accept literal newlines inside string values (the fix is as easy as deleting each newline)
The most obscure issue I have found: even when it appears inside a JSON string value, a ? question mark can break the insert for Postgres. Something like {"url": "myurl.com?queryParam=someId"} will be rejected as invalid. Solve this by escaping the question mark, e.g. {"url": "myurl.com\?queryParam=someId"}
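A related sketch, in case it helps others hitting the same errors: if you build the INSERT from application code, letting the driver serialise and bind the JSON avoids most of the manual escaping above. This assumes Python with psycopg2 and a hypothetical videos table with a json column named data:
import psycopg2
from psycopg2.extras import Json

payload = {"url": "myurl.com?queryParam=someId", "title": "Scout vs. Witch"}

conn = psycopg2.connect("dbname=mydb user=me")   # assumed connection string
with conn, conn.cursor() as cur:
    # 'videos' and its json column 'data' are hypothetical names.
    # Json() serialises the dict with json.dumps and binds it as a parameter,
    # so quotes, newlines and question marks need no manual escaping.
    cur.execute("INSERT INTO videos (data) VALUES (%s)", (Json(payload),))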

Change images slider step in TensorBoard

This is about TensorBoard 1.1.0's images history. I would like to set the position of the step slider (shown on top of the black image with the 7 in the screenshot) more precisely, to be able to select any step. Currently I can only select e.g. either step 2050 or step 2810. Is that possible?
Perhaps there is a place in the sources where the constant 10 is hardcoded?
I answered this in "TensorBoard doesn't show all data points", but this question seems to be more popular, so I will quote my answer here.
You don't have to change the source code for this, there is a flag called --samples_per_plugin.
Quoting from the help output:
--samples_per_plugin: An optional comma separated list of plugin_name=num_samples pairs to explicitly
specify how many samples to keep per tag for that plugin. For unspecified plugins, TensorBoard
randomly downsamples logged summaries to reasonable values to prevent out-of-memory errors for long
running jobs. This flag allows fine control over that downsampling. Note that 0 means keep all
samples of that type. For instance, "scalars=500,images=0" keeps 500 scalars and all images. Most
users should not need to set this flag.
(default: '')
So if you want to have a slider of 100 images, use:
tensorboard --samples_per_plugin images=100
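And, combining it with the example from the help text above, something like the following (the log directory path is illustrative) keeps 500 scalars per tag and every logged image:
tensorboard --logdir /path/to/logs --samples_per_plugin "scalars=500,images=0"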
I managed to do this by changing this line in TensorBoard backend
This question is covered in the FAQ:
Is my data being downsampled? Am I really seeing all the data?
TensorBoard uses reservoir sampling to downsample your data so that it
can be loaded into RAM. You can modify the number of elements it will
keep per tag in tensorboard/backend/application.py. See this
StackOverflow question for some more information.

PsychoPy Builder - How do I take a rest part way through a set of trials?

In PsychoPy builder, I have a lot of trials and I want to let the participant take a rest/break part way through and then press SPACE to continue when they're ready.
Any suggestions about how best to do this?
PsychoPy Builder uses the TrialHandler class, and you can make use of its attributes to control when you want to take a rest.
Assuming your trial loop is using an Excel/csv file to get the trial data, make use of the TrialHandler attribute thisTrialN.
e.g.
1/ Add a routine containing a text component into your loop (probably at the beginning) with your 'now take a rest...' message and a keyboard component to take the response when they are ready to continue.
2/ Add a custom code component as well and place something similar to this code into its "Begin Routine" tab:
if trials.thisTrialN not in [int(trials.nTotal / 2)]:
    continueRoutine = False
where 'trials' is the 'name' of your trial loop.
The above will put a rest in the middle of the current set of trials but you could replace it with something like this
if trials.thisTrialN not in [10, 20]:
    continueRoutine = False
if you wanted to stop after 10 and again after 20 trials.
Note, if you're NOT using an Excel file but are simply using the 'repeat' feature of a simple trial loop, then you'll need to replace thisTrialN with thisRepN
If you're using an Excel file AND reps you'll need to factor in both when working out when you want to rest.
This works by using one of Builder's own variables, continueRoutine, and setting it to False on most trials, so that most of the time the 'take a rest' message is not displayed.
If you want to understand more, then use the 'compile script' button (or F5) and take a look at the python code that Builder generates for you.
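As a small variant of the snippet above (my own addition, not part of the original answer), you could also rest every N trials with a modulo check in the same "Begin Routine" tab; again, 'trials' is the name of your loop:
rest_every = 20                                   # arbitrary choice: take a break every 20 trials
if trials.thisTrialN == 0 or trials.thisTrialN % rest_every != 0:
    continueRoutine = False                       # skip the rest screen on all other trials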

JAVA/JUNIT: I am trying to simulate a user's choice input in my unit test; what should I use without printing any prompt to System.out?

I have a Game interface and the logic works perfectly, but reading the user's input is causing a lot of trouble; I have tried System.in, BufferedReader, etc. with no luck at all.
The game is dominoes,
and this is my output:
YARD
(5-1)
MY HAND
(6-0)(5-4)(5-2)(6-3)(4-1)(3-1)(5-3)
AVAILABLE
(5-4)(5-2)(4-1)(3-1)(5-3)
Options:
{1} {2} {3} {4} {5}
Please Select one bone from above options : (here I will input my integer)
As you can see, the user is prone to making mistakes even though they only need to type an integer;
how do I bypass this step in JUnit after I set up my table, hand and available moves?
Move the input-reading part into its own method (or behind a small interface) and stub/mock that method in your test cases, so the test supplies the chosen integer directly instead of reading from System.in.