Speech training files and registry locations - sapi

I have a speech project that requires acoustic training to be done in code. I am able to successfully create training files with transcripts and their associated registry entries under Windows 7 using SAPI. However, I am unable to determine whether the recognition engine is actually using these files and adapting its model. My questions are as follows:
When performing training through the Control Panel training UI, the system stores the training files in "{AppData}\Local\Microsoft\Speech\Files\TrainingAudio". Do the audio training files HAVE to be stored in this location, or can I store them elsewhere as long as the registry entries for the profile reflect the correct path?
The Speech Control Panel creates registry entries for the training audio files in the key "HKCU\Software\Microsoft\Speech\RecoProfiles\Tokens\{ProfileGUID}\{00000000-0000-0000-0000-000000000000}\Files".
a) Do the registry entries created by my training code HAVE to be placed in "{00000000-0000-0000-0000-000000000000}\Files" or can I create a new random GUID under {ProfileGUID}?
b) Does the subkey HAVE to be named "Files"?
c) And do the registry values HAVE to follow the form "TrainingAudio-xxxx-xxxxxxxx-xxxxxxxx" or can I use other values?
d) Finally, the Registry Value Data is of the form "%1c%\Microsoft\Speech\Files\TrainingAudio\SP-xxx....xxx". Can I specify an absolute path?
e) Do the file names HAVE to follow the form "SP-xxx....xxx.wav" or can I use any unique file names?
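For context, here is a minimal sketch of the kind of registry write my code currently performs (the profile GUID, value name, and file name below are placeholders for illustration, not the values my code actually generates):
#include <windows.h>
#include <string.h>

/* Sketch only: mirrors the layout the Control Panel creates. */
void RegisterTrainingFile(void)
{
    HKEY hKey;
    const char *subKey =
        "Software\\Microsoft\\Speech\\RecoProfiles\\Tokens\\"
        "{ProfileGUID}\\{00000000-0000-0000-0000-000000000000}\\Files";
    const char *valueName = "TrainingAudio-0001-00000001-00000001";            /* placeholder */
    const char *valueData =
        "%1c%\\Microsoft\\Speech\\Files\\TrainingAudio\\SP-0001.wav";           /* placeholder */

    if (RegCreateKeyExA(HKEY_CURRENT_USER, subKey, 0, NULL, 0,
                        KEY_SET_VALUE, NULL, &hKey, NULL) == ERROR_SUCCESS)
    {
        RegSetValueExA(hKey, valueName, 0, REG_SZ,
                       (const BYTE *)valueData, (DWORD)(strlen(valueData) + 1));
        RegCloseKey(hKey);
    }
}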
Thanks.
Giri

Related

Upload the same file with a different name each time in LoadRunner

I need to upload an Excel file in a LoadRunner HTTP/HTML script with a unique filename each time. The file must be present in the directory. Copying files and renaming them would be a manual task. Can anyone tell me whether there is an automated way to do this, or whether LoadRunner itself can perform such tasks? Thank you.
On each of your load generators, make sure that you have a RAM drive for the file I/O for the new files. You are going to have tens, perhaps hundreds or thousands of virtual users on your load generator, and you do not want contention for the read/write heads of a physical hard drive acting as a drag anchor on the performance of the entire machine. It is for this same reason that logging is minimized during test execution.
Include the base file as part of your virtual user
Use the appropriate language API to make a copy of the file from the virtual user directory to the RAM drive on your load generator under a new name. I would recommend a name which includes the virtual user number and iteration number at the end to ensure uniqueness across your virtual user population (see the sketch after this list).
Upload your file from the RAM drive as the source
Delete your newly created file to return to the same initial condition as the beginning of the iteration.
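A rough VuGen C sketch of the copy/upload/delete steps (the RAM drive letter, base file name, form field names, and URL are assumptions for illustration, not part of the question):
Action()
{
    int  vuser_id, scid;
    char *group;
    char unique_name[256], copy_cmd[512], del_cmd[512];
    char value_arg[300], filepath_arg[300];
    static int iteration = 0;                 /* simple per-vuser iteration counter */

    iteration++;
    lr_whoami(&vuser_id, &group, &scid);      /* current virtual user id */

    /* Copy the base file (assumed to be added to the script's files) to the
       RAM drive under a name that is unique per virtual user and iteration. */
    sprintf(unique_name, "R:\\upload_%d_%d.xlsx", vuser_id, iteration);
    sprintf(copy_cmd, "copy /Y base.xlsx \"%s\"", unique_name);
    system(copy_cmd);

    /* Upload from the RAM drive (field names and URL are illustrative). */
    sprintf(value_arg, "Value=upload_%d_%d.xlsx", vuser_id, iteration);
    sprintf(filepath_arg, "FilePath=%s", unique_name);
    web_submit_data("upload",
        "Action=https://{LR_SERVER}/upload",
        "Method=POST",
        "EncType=multipart/form-data",
        "Mode=HTML",
        ITEMDATA,
        "Name=file", value_arg, "File=yes", filepath_arg, ENDITEM,
        LAST);

    /* Delete the copy so the next iteration starts from the same state. */
    sprintf(del_cmd, "del \"%s\"", unique_name);
    system(del_cmd);

    return 0;
}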
As you will be engaging in a large amount of file I/O for the virtual users, it is highly recommended that you monitor the load generators just as you would monitor your application under test. If you are new to LoadRunner and performance testing, then this is an excellent opportunity for your mentor/trainer to guide you on a monitoring strategy.
Assuming the upload is done using an HTML form:
Use web_submit_data() with the FilePath argument.
But first, let's create some parameters to get a truly unique filename (very important):
create a parameter VUSERID which outputs the current vuser id.
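If you prefer to do it in the script rather than defining a Vuser ID parameter in the parameter list, a small sketch (these lines would go near the top of Action()):
int  vuser_id, scid;
char *group;
lr_whoami(&vuser_id, &group, &scid);   /* fills vuser_id with the current vuser id */
lr_save_int(vuser_id, "VUSERID");      /* now available as {VUSERID} */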
get/save the current timestamp
web_save_timestamp_param("TIMESTAMP", LAST);
and here is the request:
web_submit_data("i1",
"Action=https://{LR_SERVER}/{LR_JUNCTION}/upload",
"Method=POST",
"EncType=multipart/form-data",
"Snapshot=t1.inf",
"Mode=HTML",
ITEMDATA,
"Name=FIELDNAME", "Value={VUSERID}{TIMESTAMP}_LOADTEST.xlsx", "File=yes", "FilePath=REALFILEPATH.xlsx", "ContentType=WHATEVERCONTENTTYPE", ENDITEM,
LAST);
The Value={VUSERID}{TIMESTAMP}_LOADTEST.xlsx will be the new (unique) filename. (It is unique for each user and iteration! Very important.)
The FilePath points to the real file and its content will be uploaded.

"Empty table from specified data source" error in Create ML

I'm trying to train a new object detection model using the Create ML tool from Apple. I've already used RectLabel to generate annotations for all of the JPEG images in my directory of training images.
However, every time I try loading the directory in Create ML, I receive this error message:
Empty table from specified data source
I already looked on the Apple Developer forums; a thread there incorrectly claims the problem was solved in a previous update.
What causes this error? How can I get Create ML to accept my training data?
I'm using Create ML Version 2.0 (53.2.2) and RectLabel Version 3.04.2 (3.04.2) on macOS Big Sur 11.0.1 (20B29).
The “Empty table from specified data source” error occurs if any of the filenames contain spaces.
My solution was to rename all the files so the filenames don't contain spaces.
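If there are a lot of images, a small helper can do the renaming in one pass; here is a rough POSIX C sketch (the directory path is illustrative):
#include <dirent.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *dir = "training_images";          /* illustrative path */
    char oldpath[1024], newpath[1024];
    struct dirent *entry;
    char *p;
    DIR *dp = opendir(dir);

    if (dp == NULL)
        return 1;
    while ((entry = readdir(dp)) != NULL) {
        if (strchr(entry->d_name, ' ') == NULL)
            continue;                              /* no spaces, nothing to do */
        snprintf(oldpath, sizeof oldpath, "%s/%s", dir, entry->d_name);
        snprintf(newpath, sizeof newpath, "%s/%s", dir, entry->d_name);
        /* replace spaces with underscores in the filename part of the new path */
        for (p = strrchr(newpath, '/') + 1; *p != '\0'; p++)
            if (*p == ' ')
                *p = '_';
        rename(oldpath, newpath);
    }
    closedir(dp);
    return 0;
}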
Make sure that there are only images and the annotations.json file in your directory of training images.
If there are any other files in the folder, including a .mlproj file, Create ML shows the "Empty table from specified data source" error.
When you create a new project in Create ML, specify a location outside the directory of training images.

Google Vision AutoML > Datasets | Validation data in csv doesn't upload

I am using Google Vision AutoML. In order to train a model, the data needs to be uploaded. There are the following two ways:
Upload directly from your computer
Upload to a Google Cloud Storage bucket and make a CSV which contains the paths to the image files.
Since I want to compare my locally pre-trained model with the model I will train on Google AutoML, I want to ensure that the same data splits are used (train, test, validation). So the second way is the best way.
Issue:
I have made the CSV, but when I upload it, only the train and test sets are loaded.
I solved it by putting "Validation" instead of "Validate" in the set column.
So the issue was the wording used on the upload form, which says the following:
Optionally, you can specify the TRAIN, VALIDATE, or TEST split.
This is misleading, and they also did not show a sample row for validation.
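For example, rows along these lines should give you all three splits (the bucket, file names, and labels are only illustrative):
TRAIN,gs://my-bucket/images/img_001.jpg,dog
VALIDATION,gs://my-bucket/images/img_002.jpg,dog
TEST,gs://my-bucket/images/img_003.jpg,cat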
For more details:
https://cloud.google.com/vision/automl/docs/prepare#csv

Programmatically set default column values based on folder in SharePoint Online

I'm working on enhancing metadata in our SharePoint online (O365) environment. Since a portion of my user base is used to foldering (explorer style), I've started using default column values to automatically set values on any files added to that specific folder (we have content organized categorically by folder currently). An example is our HR documents library - we have separate folders for recruiting, payroll, personnel files, etc. that automatically categorize files added to that folder with the same categories (recruiting, payroll, personnel, etc.). This supports both "search" and "click" users and makes adoption WAY easier while getting important metadata.
I want to implement this in a larger, more dynamic fashion, so manually setting default column values on each folder is not going to be scalable.
How can I reference the top level folder within the library (or even the current folder) for each newly added file and populate the "category" field for that new file with that folder name? I can do some very basic C# or Java code copy/paste, but bonus points for non-coding solutions =)
This problem can be solved without coding. You can use a SharePoint Designer workflow to implement this.
Create a different view for each functional team, and then use the view filter to show the documents.
When you upload a file, use the workflow to set the metadata of the file. There are some known limitations: if you upload multiple files at the same time, the metadata may not be set correctly for every file, and if you upload a folder, the workflow will not run for the folder itself, so the files inside it may not get the right metadata.
I was actually able to use MS Flow to accomplish this in a pretty simple and straightforward fashion without managing custom views per team. The concept at a high level was:
(Trigger) When a new document is created in a folder in the library
Get the link of the parent folder of the newly added document
Create a variable (or just code it out in the Flow step) to parse out the name of the parent folder from the parent folder link (should be all text to the right of the last "/")
Set the category field to the variable
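The parsing in the third step is just "take everything after the last slash"; here is a sketch of that logic in C, shown only to illustrate the string handling with a made-up folder link (in Flow you would do the same with an expression on the parent folder link):
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Illustrative parent folder link; the real one comes from the Flow step. */
    const char *parent_link =
        "https://contoso.sharepoint.com/sites/HR/Shared Documents/Recruiting";
    const char *slash = strrchr(parent_link, '/');
    const char *folder_name = (slash != NULL) ? slash + 1 : parent_link;

    printf("%s\n", folder_name);   /* prints: Recruiting */
    return 0;
}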
I'm sure that you could do the same in a SharePoint Designer workflow, but I prefer Flow due to its visual aspect and because it is far easier to troubleshoot.

Executing Abaqus Model in Taverna

I'm pretty new to both Taverna and Abaqus, but I am trying to run an Abaqus model using a "Tool" in Taverna remotely on an HPC. This works fine if I already have my model file and inputs on the HPC, but I need a way of uploading the files dynamically in Taverna (I am trying to generically wrap Abaqus models).
I've tried adding an input port that takes a file list, but I don't know how I can copy it to the "location" that I've set for the tool. Could a Beanshell service be the answer, or can I iterate through the file list and copy the files up before executing the Abaqus model?
Thanks
When you say that you created an input port that takes a file list, I guess you mean an input to the tool service.
Assuming the input port is called my_file_list, when the tool service is run, it will take a list of data values on port my_file_list. As an example, say it has "hello", "hi", and "hola" as the three values in the list.
On the location where the tool service is run, it executes in a temporary directory - a different directory for each execution of the service. It is normally something like /tmp/usecase-2029778474741087696
Three files will be created in the temporary directory; those files contain the (in this example) three values the tool service received on port my_file_list. The files could be called
/tmp/usecase-2029778474741087696/tempfile.0.tmp containing hello
/tmp/usecase-2029778474741087696/tempfile.1.tmp containing hi
/tmp/usecase-2029778474741087696/tempfile.2.tmp containing hola
There will also be a file called my_file_list. That file will contain
/tmp/usecase-2029778474741087696/tempfile.0.tmp
/tmp/usecase-2029778474741087696/tempfile.1.tmp
/tmp/usecase-2029778474741087696/tempfile.2.tmp
The script of your tool service would normally read the contents of my_file_list line by line and do something with the contents of the listed file(s).
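As a sketch of that pattern (written in C here purely for illustration; a real tool script is usually a few shell commands doing the equivalent):
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* my_file_list holds one temporary file path per line. */
    char path[1024], buffer[4096];
    FILE *f;
    FILE *list = fopen("my_file_list", "r");

    if (list == NULL)
        return 1;
    while (fgets(path, sizeof path, list) != NULL) {
        path[strcspn(path, "\r\n")] = '\0';        /* strip the newline */
        f = fopen(path, "r");
        if (f == NULL)
            continue;
        /* do something with each listed file; here we just print its contents */
        while (fgets(buffer, sizeof buffer, f) != NULL)
            fputs(buffer, stdout);
        fclose(f);
    }
    fclose(list);
    return 0;
}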
I have also seen some scripts that 'cheat' and iterate directly over tempfile*.tmp, but that would be "a bad thing". The problem with that trick is that if you want to add a second list of files to the tool service, then the file my_file_list could contain
/tmp/usecase7932018053449784034/tempfile.4.tmp
/tmp/usecase7932018053449784034/tempfile.5.tmp
/tmp/usecase7932018053449784034/tempfile.6.tmp
as other temporary files were used for the other file list port.
I hope that helps
The tool service allows you to upload files, but if you are using the HPC through a job submission node, then you would have to modify your command-line tool to use the job file staging command to push the files further as part of the job. The files would be available in the current (temporary) directory of the specified tool script.
I would try to do it through the Tool service and not involve the beanshell - then you can keep your workflow simpler.
A good thing to remember is that you can write multiple shell commands in the box.
Similarly, you would probably want to retrieve the results so that you can process them further in the workflow (unless they are massive, in which case you should just output their remote filenames and send those in to the next HPC job).
The exact commands to use for staging files and retrieving them depends on the HPC job submission system. Which one are you using?
Thanks for the input guys.
It was my misunderstanding of how Taverna uses the File list. All the files in the list are copied to the temp "sandbox" and are therefore available for use.
Another nice easy way is to zip the directory and pass the zipped files into an input port for the service. Then just unzip the files inside the command.
Thanks again