Unwanted pauses when using <prosody> tag in SSML for TTS - text-to-speech

I am writing and marking up spoken utterances for a VUI tool. We are using Google Cloud WaveNet voices for our TTS service, and I have been trying to use SSML to make the TTS output more natural. When I add the "prosody" tag, the TTS output adds a pause before the start of the tag, as in the example below:
<speak>
Rebecca is allergic to <prosody rate="slow" range="high">soybean oil.</prosody> Would you like to cancel this order?
</speak>
In this example, the TTS output pauses between "to" and "soybean oil". This is just a silly example sentence, but in our real product we need to use this kind of tag to provide emphasis and differentiation between complex words.
Has anyone else experienced this issue? Any tips?

It looks like range isn't part of the Google Cloud TTS SSML spec. It is part of Microsoft's spec though, so maybe that's what you were thinking of.
If you're still trying to get rid of a gap like that, you could theoretically use the <seq> tag to get the segments to slightly overlap, but that seems like it'd be super difficult.
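If the unsupported attribute is a suspect, one way to test that is to strip it before sending the request. The following is only a sketch using Python's standard library. It assumes that rate, pitch, and volume are the only <prosody> attributes the service accepts, and the function name is made up for illustration. It does not guarantee the pause goes away.

```python
import xml.etree.ElementTree as ET

# Assumption: these are the <prosody> attributes the TTS service accepts.
# Adjust the set if your provider documents others.
SUPPORTED_PROSODY_ATTRS = {"rate", "pitch", "volume"}


def strip_unsupported_prosody_attrs(ssml: str) -> str:
    """Return the SSML with unsupported attributes removed from <prosody>."""
    root = ET.fromstring(ssml)
    for prosody in root.iter("prosody"):
        # Copy the attribute names so we can delete while looping.
        for name in list(prosody.attrib):
            if name not in SUPPORTED_PROSODY_ATTRS:
                del prosody.attrib[name]
    return ET.tostring(root, encoding="unicode")


ssml = (
    "<speak>Rebecca is allergic to "
    '<prosody rate="slow" range="high">soybean oil.</prosody> '
    "Would you like to cancel this order?</speak>"
)
cleaned = strip_unsupported_prosody_attrs(ssml)
print(cleaned)
```

Comparing the audio for the original and the cleaned markup should show whether the extra attribute has anything to do with the gap.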

Related

QnA Markdown formatting gets lost when using Translation API

I have been using Microsoft Translator API v3.0, which does not seem to work in my case.
The translation appends spaces and the Markdown gets scrambled. How can I fix this?
Thanks,
deeepss
The ! indicates a sentence end, which is not what you want in this case.
You can escape the exclamation point to a tag like <exclamation>. Then it will be handled as a word in the context of the sentence.

What is the meaning of 'cimode' in react-i18next and why isn't it properly documented?

I started using react-i18next a few days ago and I am very satisfied with it. However, I've been seeing this 'cimode' language here and there, in some posts and while debugging, but have no clue what it means. I've searched all over, I believe, and can't find any documentation on it.
In my particular case, I am generating some boilerplate code in a new website and created a demo page to show how to use localization in the website. I am generating toggle language buttons from the languages I set on the whitelist and, to my surprise, I have a 'cimode' button. I know I can filter it out and I will, but I would like to know what it should be used for and maybe to see better documentation for it in https://react.i18next.com/.
From my understanding, 'cimode' is used for testing: it consistently returns the translation key instead of the translated value.
It seems rather hidden in the FAQ.

Is there a way of using the <prosody> tag in SSML to adjust individual words without a pause (without using a post-processor)

When using the prosody tag in SSML with Google Cloud TTS, I cannot adjust the attributes of individual words without creating an unwanted pause.
The code below creates a lag between 'New' and 'Video'. It has been suggested that a postprocessor can remove these pauses, but I'd like to know if there's a way of doing it directly within the code itself?
<speak>
Hello, and welcome to this<prosody pitch="+3st">New</prosody>Video Tutorial.
</speak>
After testing, it appears there isn't a way of doing this using Google Cloud TTS. You can manually edit the sound file after generating it, but that defeats the object of the exercise.
I don't have the cleanest answer, as what you are asking is not well supported. Prosody's pitch contour lets you change the tone of voice at different parts of the sentence.
Example of Prosody contour
<speak><prosody contour="(0%, +20Hz) (20%, +30%) (100%, +20%)"> Hello friends! </prosody></speak>
I am still playing around with this, but it seems like a tedious way of getting what you want done.
Using contour
contour takes a string of tuples: "(position in the sentence as a %, pitch adjustment) (..., ...)"
I hope this helped and best of luck on your work!
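To make the contour format less tedious to write by hand, the string can be generated. This is a minimal sketch, not an official API: build_contour and prosody_with_contour are hypothetical helper names, and it assumes the W3C SSML form of whitespace-separated "(position%,change)" pairs. Whether a given TTS voice honors contour is something to verify by listening.

```python
from xml.sax.saxutils import escape, quoteattr


def build_contour(points):
    """Format (position_percent, pitch_change) pairs as a contour value."""
    parts = []
    for position, change in points:
        if not 0 <= position <= 100:
            raise ValueError(f"position must be 0-100, got {position}")
        parts.append(f"({position}%,{change})")
    return " ".join(parts)


def prosody_with_contour(text, points):
    """Wrap text in <speak><prosody contour=...> with proper XML escaping."""
    contour = quoteattr(build_contour(points))  # adds the surrounding quotes
    return f"<speak><prosody contour={contour}>{escape(text)}</prosody></speak>"


# Same points as the hand-written example above.
points = [(0, "+20Hz"), (20, "+30%"), (100, "+20%")]
ssml = prosody_with_contour("Hello friends!", points)
print(ssml)
```

Keeping the points in a list makes it easier to nudge one position or pitch value at a time while experimenting.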

Using Elm for front-end development + serving dynamic Elm pages through Haskell

I started with Elm yesterday and I really enjoy using it. Without any experience in front-end development, I could build a nice-looking webpage in only 30 lines of code, which is amazing.
Now I want to use it in a real-life example: I want to build a small blog.
But I need a way to communicate with Elm. For example, I query my database and get a list of blog entries [Blog], and now I need to pass them to Elm.
I am not sure how I would do it. I was looking through the popular Haskell frameworks like Yesod, Snap, and Happstack, and the first thing I found was http://hackage.haskell.org/package/snap-elm-0.1.1.2/docs/Snap-Elm.html
It seems to be intended for serving static Elm files, but I need to pass arguments to it.
Is there any framework you would recommend that already has Elm support for serving dynamic Elm pages?
And if not, how would you do it?
My idea was to use Elm as a skeleton, generate a normal HTML file with Yesod, Snap, or Happstack, and integrate this file into Elm. Would this be possible?
Something that would look like this:
container 1000 1000 middle <| displayHtml "/pages/my_generated_html_page.html"
Edit:
My first hacky solution was this
tPage = plainText "<script src=\"http://code.jquery.com/jquery-1.10.1.min.js\"></script>\n
<script> \n
$(function(){\n
$(\"#includedContent\").load(\"/home/maik/b.html\"); \n
});\n
</script> \n
<div id=\"includedContent\"></div>\n"
Unfortunately, I am not allowed to use script tags in Elm.
I recommend studying elm-lang.org's source code. The majority of it is pure Elm but there are pages that are generated on the server side with Haskell.

How to handle LaTeX/PDF doc reviews?

I am a Ph.D. student, and I usually write articles which are later proofread by my supervisor. I write them in LaTeX, and the reviews are done on the PDF output in Adobe Reader itself. The corrections are mostly grammatical: I tend to miss prepositions and conjunctions when writing fast. To apply the corrections, I have to manually re-enter everything in my LaTeX source.
This is a hell of a lot of work, and it sometimes goes through multiple rounds. Is there any software that makes the task easier? For example, if a piece of text is struck out for a grammar error and an alternative is suggested, can I accept the change so that the old phrase or sentence is replaced with the new one and the struck-out text is removed? Please suggest a tool that would make my life easier.
You may want to take a look at the following link. It has some good information about version control.
http://en.wikibooks.org/wiki/LaTeX/Collaborative_Writing_of_LaTeX_Documents
You could attach the LaTeX sources to the PDF (with the attachfile2 package), so reviewers can directly edit the source and send that back. Or you could try accepting comments on the PDF, but currently only Adobe Reader and Foxit allow that, and not on Linux.