Pocketsphinx, Russian language and keyword spotting - voice-recognition

The goal is:
Control a smart house via predefined text commands, without any activation word like "okay google": it is always listening.
What's done:
pocketsphinx_continuous -inmic yes -hmm /path/zero_ru.cd_cont_4000 -dict /path/my_dictionary_out -lm /path/lmbase.lm.DMP
where my_dictionary_out lists the words I use, one per line. It works, but I think KWS (keyword spotting) mode might be better. So I've written a text file like:
включи свет комнате /1e-50/
выключи свет комнате /1e-50/
and tried to run sphinx without a language model (no -lm option), but got nothing. It doesn't recognize the commands from the keywords file.
What's wrong?

Finite state machine visualizer

I need an application that prints or visualizes input/output pairs while an FST runs. That is, for each state of the FST, it should print a tuple containing the input to that state and the output of that state. Right now I can generate FST files that are compatible with the foma, hfst and xfst tools, so a visualization tool that is compatible with any one of them should be enough. Does anyone know of such a tool?
foma can produce dot format files that can be visualized by graphviz. On Debian/Ubuntu, install graphviz with
$ sudo apt-get install graphviz
foma can read att format files (produced with hfst-fst2txt for anything HFST can read, or lt-print for anything from lttoolbox); assuming you've got such a file named myfst.att, you can do
$ foma
foma[0]: read att myfst.att
foma[1]: view
to display the full FST. That will show each input/output pair on each edge between states of the FST.
But you say "during runs" – are you talking about also showing the queue of "live states"? If so, I don't know of a tool that does this, that would be nice! One thing you could do is to modify the HFST source to output the list of live states and string vectors as it's processing, and then combine that with the dot file to e.g. colour in the live states. (If so, you may want to take this to the #hfst channel on irc.freenode.net.)
There is also a script att2dot.py on https://ftyers.github.io/2017-%D0%9A%D0%9B_%D0%9C%D0%9A%D0%9B/hfst.html that can be used on the command line, if you prefer something more scriptable:
hfst-fst2txt chv.lexc.hfst | python3 att2dot.py | dot -Tpng -ochv.lexc.png
If you use that from the Python library of HFST, you might be able to get the "live states" for every part of an analysis more easily.
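To illustrate what such a conversion involves, here is a minimal sketch in Python. It is not the att2dot.py script linked above, only an assumption about what a converter of that kind does: it reads tab-separated AT&T lines (source, target, input symbol, output symbol, optional weight; a line with fewer than four fields marks a final state) and emits Graphviz dot with one input:output label per edge. The function name att_to_dot and the sample transducer are made up for the example.

```python
def att_to_dot(att_text):
    """Convert AT&T-format transducer text into a Graphviz dot string."""
    edges, finals = [], set()
    for line in att_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 4:
            # Transition line: source, target, input, output[, weight]
            src, dst, isym, osym = fields[:4]
            edges.append((src, dst, isym, osym))
        elif fields[0].strip():
            # Final-state line: state[, weight]
            finals.add(fields[0].strip())
    out = ["digraph fst {", "  rankdir=LR;"]
    for state in sorted(finals):
        out.append('  "%s" [shape=doublecircle];' % state)
    for src, dst, isym, osym in edges:
        # Escape backslashes and quotes so the label stays valid dot
        label = "%s:%s" % (isym, osym)
        label = label.replace("\\", "\\\\").replace('"', '\\"')
        out.append('  "%s" -> "%s" [label="%s"];' % (src, dst, label))
    out.append("}")
    return "\n".join(out)


# Hypothetical four-transition transducer, in the layout hfst-fst2txt prints
SAMPLE = "0\t1\tc\tc\n1\t2\ta\ta\n2\t3\tt\tt\n3\t4\t+N\t@0@\n4\n"

if __name__ == "__main__":
    print(att_to_dot(SAMPLE))
```

The printed text can be saved to a file and rendered with dot -Tpng, as in the pipeline above.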

Moving the "cursor" back a line for stdout

I have a little command line tool (written in Objective C, runs under MacOS) that tracks changes to folders and applies rules to files. The tool also informs the user about its progress, with messages like:
"Found 3 files of type Z and applied rule"
"Found 6 files of type x and applied rules"
Currently, the tool prints this feedback as an endless list, which is not very readable. What I'm after is a way to print the line for each file type only once and then update the number in the terminal whenever the tool finds another file of that type, very similar to how "top" under Unix presents its output.
However, to do so, I'll need to move the cursor in the terminal back to the beginning of the line and also one or more lines up.
Is this possible, and does anybody know how to do it?
Thanks
Norbert
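For what it's worth, terminals that understand ANSI/VT100 escape sequences (an assumption here, though most terminal emulators do) support exactly this: a carriage return moves to the start of the line, ESC [ n A moves the cursor up n lines, and ESC [ 2 K erases the current line. The sketch below is in Python rather than Objective C, and the function name render is made up for the example; the same byte sequences can be written with printf from C or Objective C.

```python
import sys
import time

CURSOR_UP = "\x1b[%dA"   # CSI n A: move the cursor up n lines
ERASE_LINE = "\x1b[2K"   # CSI 2 K: erase the whole current line


def render(counts, previous_lines=0):
    """Return the text that redraws one status line per file type in place."""
    parts = []
    if previous_lines:
        # Jump back to the first status line drawn last time
        parts.append(CURSOR_UP % previous_lines)
    for file_type, n in counts.items():
        parts.append("\r" + ERASE_LINE)
        parts.append("Found %d files of type %s and applied rule\n" % (n, file_type))
    return "".join(parts)


if __name__ == "__main__":
    counts = {}
    drawn = 0  # number of status lines currently on screen
    for file_type in ["Z", "x", "Z", "x", "x"]:  # simulated discoveries
        counts[file_type] = counts.get(file_type, 0) + 1
        sys.stdout.write(render(counts, drawn))
        sys.stdout.flush()
        drawn = len(counts)
        time.sleep(0.2)
```

Each pass moves up over the lines drawn previously and rewrites them, so the counts update in place instead of scrolling.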

Lua syntax highlighting latex for arXiv

I have a latex file which needs to include snippets of Lua code (for display, not execution), so I used the minted package. It requires latex to be run with the -shell-escape flag.
I am trying to upload a PDF submission to arXiv. The site requires these to be submitted as .tex, .sty and .bbl, which they will automatically compile to PDF from latex. When I tried to submit to arXiv, I learned that there was no way for them to activate the -shell-escape flag.
So I was wondering if any of you knew a way to highlight Lua code in latex without the -shell-escape flag. I tried the listings package, but I can't get it to work for Lua on my Ubuntu computer.
You can set whichever style you want inline using listings. Its predefined Lua language has all the keywords and associated styles identified, so you can just change it to suit your needs:
\documentclass{article}
\usepackage{listings,xcolor}
\lstdefinestyle{lua}{
language=[5.1]Lua,
basicstyle=\ttfamily,
keywordstyle=\color{magenta},
stringstyle=\color{blue},
commentstyle=\color{black!50}
}
\begin{document}
\begin{lstlisting}[style=lua]
-- defines a factorial function
function fact (n)
  if n == 0 then
    return 1
  else
    return n * fact(n-1)
  end
end
print("enter a number:")
a = io.read("*number") -- read a number
print(fact(a))
\end{lstlisting}
\end{document}
Okay, so lhf found a good solution by suggesting the GNU source-highlight package. I basically took each snippet of Lua code out of the latex file, put it into an appropriately named [snippet].lua file and ran the following on it to generate a [snippet]-lua.tex:
source-highlight -s lua -f latex -i [snippet].lua -o [snippet]-lua.tex
And then I included each such file in the main latex file using:
\input{[snippet]-lua}
The result really isn't as nice as that of the minted package, but I am tired of trying to convince the arXiv admin to support minted...

I am trying to run Pocketsphinx on uClinux, but I keep getting "Phone is missing in acoustic model" errors

I am trying to run Pocketsphinx on a microcontroller running uClinux. I have installed pocketsphinx on the controller, but I keep getting several different errors regarding acoustic models and definitions. The current one I am facing is:
"Phone ... is missing in the acoustic model"
Replace the ... with every possible phonetic combination. It starts off with A, then AE, then progresses to B etc.
I am trying to take a .wav file as input, and so this is the command I am using to run the software:
pocketsphinx_continuous -hmm /usr/share/pocketsphinx/model/hmm/en/tidigits/ -lm /usr/share/pocketsphinx/model/lm/en/tidigits.DMP -infile 1.wav -samprate 8000 -dict cmu07a.dic
Has anyone encountered this issue? If so, do you know a way to resolve it?
For the tidigits model there is a special dictionary, tidigits.dic, in pocketsphinx/model/lm/en/tidigits.dic. You need to pass it with the -dict option instead of cmu07a.dic in your command line.

Translation from plain English to a set of instructions

I have a set of instructions, for example Linux's "ls", "grep", "cd", etc.
I want the user to be able to execute these commands without knowing the exact names and parameters, using instead something close to their meaning in plain English, e.g. "show me all folders", "filter all files by name", "go to directory". In other words, the user input "Show me all files and then show me all that contains 'foo'" should be translated to "ls | grep foo".
I understand that I will need some kind of meta-information about each instruction, and some way to evaluate how close the user's query is to each instruction. Something like:
<instruction>
<command>ls</command>
<semantic>lists all files</semantic>
<plainEnglish>List all the files in this directory</plainEnglish>
<synonyms>
<synonym>Show all files</synonym>
...etc
</synonyms>
</instruction>
So which information is important, and how should this evaluation be done?
Are there any general guidelines for translating the user's input to a specific instruction from my set? (This sounds like quite a challenge to me.)
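As a starting point for the "how close is the query to each instruction" evaluation, here is a minimal sketch that scores word overlap (Jaccard similarity) between the query and each phrase stored for an instruction. Everything in it is an assumption made for illustration: the INSTRUCTIONS table stands in for the XML metadata above, and the phrases and the function name best_command are invented. It handles a single instruction only; splitting "... and then ..." into a pipeline is left out.

```python
import re

# Hypothetical metadata standing in for the <instruction> entries above
INSTRUCTIONS = [
    {"command": "ls",
     "phrases": ["list all the files in this directory",
                 "show all files",
                 "show me all folders"]},
    {"command": "grep",
     "phrases": ["filter all files by name",
                 "show all that contain text"]},
    {"command": "cd",
     "phrases": ["go to directory",
                 "change directory"]},
]


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_command(query):
    """Return (score, command) for the instruction closest to the query."""
    q = tokens(query)
    scored = []
    for ins in INSTRUCTIONS:
        # An instruction scores as well as its best-matching phrase
        score = max(jaccard(q, tokens(p)) for p in ins["phrases"])
        scored.append((score, ins["command"]))
    return max(scored)


if __name__ == "__main__":
    for query in ["show me all files", "go to directory"]:
        print(query, "->", best_command(query))
```

With this table, the stored synonyms matter most: the more paraphrases each instruction carries, the better plain word overlap works. Stemming or weighting rare words would be natural next steps.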