Is there a way to convert speech directly into SSML? - text-to-speech

Just as one can use various speech-to-text 'dictation' tools to convert spoken word into its corresponding text, I would like to know whether there are similar tools for converting spoken word into its corresponding SSML. That is, the tool would provide the text along with the relevant SSML tags for any intonation, prosody, pauses/breaks, inflection, etc. present in the speaker's voice.

I work on building Voice apps. In a recent project I was working on, we needed the text to sound exactly right, with all the associated intonations, prosody, pauses/breaks, inflection, etc.
After extensive research, we found that the only way to make the text sound as if it were spoken by a real person is either to use SSML (still not perfect) or a recorded mp3.
If you're trying to get the real-person feel for a project, the best way to achieve it is to use a human. I would suggest you record the mp3 (or have it recorded by a professional) instead of trying to get SSML from voice.
The reason we use SSML is exactly that computers cannot understand the associated intonations, prosody, pauses/breaks, inflection, etc. of human speech.
If your goal is to get SSML, then the best way would be to convert text to SSML. For this, I'd suggest taking a peek here:
W3C SSML
Google SSML
Amazon SSML
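As a rough illustration of the text-to-SSML route, here is a minimal Python sketch. The function name `to_ssml` and its parameters are made up for this example. It only uses the `<speak>`, `<prosody>`, `<s>` and `<break>` elements from the W3C spec, and it assumes your engine accepts a bare `<speak>` root without the version/namespace attributes, so check the vendor docs above before relying on it.

```python
from xml.sax.saxutils import escape


def to_ssml(sentences, pause_ms=400, rate="medium", pitch="medium"):
    """Wrap plain sentences in SSML, with a fixed pause between them.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    # Escape &, < and > so the text cannot break the XML markup.
    parts = [f"<s>{escape(sentence)}</s>" for sentence in sentences]
    # Insert an explicit pause between consecutive sentences.
    body = f'<break time="{pause_ms}ms"/>'.join(parts)
    return (
        f'<speak><prosody rate="{rate}" pitch="{pitch}">'
        f"{body}</prosody></speak>"
    )


if __name__ == "__main__":
    print(to_ssml(["Hello & welcome.", "Let's get started."], pause_ms=300))
```

You would still have to decide by ear where the pauses and prosody changes go; the sketch only saves you from writing the tags by hand.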
This is to the best of our knowledge as of mid-July 2018.
If anyone has more info, please feel free to add to this answer.
Hope this helps :3

Related

Is there a way to use YouTube's API to batch search all videos by transcript?

I'm working on a documentary and looking for specific sound bites – wondering if anyone has ever developed a way to search YouTube transcripts en masse.
Like, as an example: if I'm looking for a clip of someone talking about pounds of e-waste, I could search for "million pounds of e-waste" and find any video where that phrase pops up in the transcript.
I'm surprised this doesn't already exist, since it would be so valuable to many different aspects of crediting, sourcing, and media production. So that leads me to think it's not possible or allowed w/the API for some reason.
I doubt there is such a way provided by YouTube to batch search all videos by transcript.
However, if you really want to get this result, the best way I can think of is as follows:
Make a search of videos about your topic - in this case, using the search term: "million pounds of e-waste"
For each video, get its transcripts/captions
Read the obtained transcripts/captions and mark those videos where the search term is found (either fully or partially).
You can check my answer (https://stackoverflow.com/a/70438847/12511801) for a starting point on reading the captions and searching their text.
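The last step (marking videos where the term is found fully or partially) could look roughly like the Python sketch below. It does not call YouTube at all: it assumes you have already downloaded the captions yourself and stored them as a dict of video id to a list of `(start_seconds, text)` segments. That data shape, the function names and the `min_word_ratio` threshold are all assumptions made for this example.

```python
import re


def normalize(text):
    """Lowercase and collapse anything that is not a letter or digit."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def search_transcripts(transcripts, phrase, min_word_ratio=0.75):
    """Return {video_id: "full" | "partial"} for videos matching the phrase.

    "full"    = the whole phrase appears in order in the transcript.
    "partial" = at least min_word_ratio of the phrase's words appear
                somewhere in the transcript (in any order).
    """
    target = normalize(phrase)
    target_words = target.split()
    results = {}
    for video_id, segments in transcripts.items():
        # Join segments so a phrase spanning two captions is still found.
        full_text = normalize(" ".join(text for _start, text in segments))
        if f" {target} " in f" {full_text} ":
            results[video_id] = "full"
            continue
        words = set(full_text.split())
        found = sum(1 for word in target_words if word in words)
        if target_words and found / len(target_words) >= min_word_ratio:
            results[video_id] = "partial"
    return results


if __name__ == "__main__":
    demo = {
        "vid_a": [(0.0, "We throw away"), (2.5, "a million pounds of e-waste")],
        "vid_b": [(0.0, "Cooking pasta at home")],
    }
    print(search_transcripts(demo, "million pounds of e-waste"))
```

The partial match here is a simple bag-of-words check, so it will produce false positives on long transcripts; treat it as a first filter before listening to the clip.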

Text-to-speech: prosody transfer

Probably "prosody transfer" isn't the right term, but I don't know what would be.
I'm looking for some solution, most likely a software library, that would allow me to:
Record an utterance.
Also provide a text transcription.
Play it back using a different voice, but with intonation, stress, rhythm, etc. preserved from the original recording.
Alternatively, I'm also interested in ways of annotating the text with prosody information before feeding it into a text-to-speech system.
Could you please suggest some software or libraries (preferably free and open-source, or at least cheap), or even just pointers about where to read up on such topics? Thanks.
(Note that I'm a software engineer, but new to TTS.)

API for document language translation

I searched for an API that takes a document as input and translates it into a specified language, but I'm unable to find one.
I know Google provides an API to translate text but here I need to translate a whole document and return it translated. Please let me know if any API like this is available.
You might want to look into the Pairaphrase translation API.
It will translate documents (and other files) from one language into another.
Just note that Pairaphrase is really meant to be used as a translation productivity tool for businesses. It won't give you perfect translation.
Instead, it facilitates secure, high-quality translation by doing the following.
It uses machine translation and translation memory to translate your files quickly (as a 1st draft).
Then, it's up to the user to share (using the "share" feature inside Pairaphrase) the translated files with bilingual users inside and/or outside their organizations to edit the translations within Pairaphrase.
When this is done correctly, Pairaphrase learns human-translated words and phrases so that you never need to translate the same phrases or sentences twice.
All translation data is encrypted and two-factor authentication is available to provide data security.
They have a demo video that you can check out. This will give you a feel for the functionality and whether this translation API is right for you.

Correcting misunderstandings from Google's Speech2Text service

I am using Google's Speech2Text API and would like to optimize the results I get, correcting misunderstood words by performing Google searches to find possible phrases within a chosen topic. Is there a service I can use for that?
No, I don't think there is a service like that already in existence. But it wouldn't be too much work to write one.
You're already using Google's API to perform the speech recognition, so presumably you'd be comfortable using it to perform a search for the "chosen topic".
Once you've done that you can take all the answer pages and concatenate them together to make a corpus of phrases that match the chosen topic. From those you can implement an algorithm to find the closest possible substring (try these). You're looking for the substring of that whole corpus that is closest to the speech recognition results you got from Google. That should give you your answer.
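To make the "closest substring" idea concrete, here is a minimal Python sketch. It is not one of the linked algorithms: it simply slides word windows over the corpus and scores each one with `difflib.SequenceMatcher` from the standard library. The function name `closest_phrase` and the `slack` parameter are made up for this example.

```python
from difflib import SequenceMatcher


def closest_phrase(corpus, recognized, slack=1):
    """Find the run of corpus words most similar to the recognized text.

    Windows of n - slack .. n + slack words are tried, where n is the
    number of words in the recognized text.
    """
    corpus_words = corpus.split()
    n = len(recognized.split())
    best_phrase, best_score = "", 0.0
    for size in range(max(1, n - slack), n + slack + 1):
        for start in range(len(corpus_words) - size + 1):
            candidate = " ".join(corpus_words[start:start + size])
            # ratio() is a character-level similarity between 0.0 and 1.0.
            score = SequenceMatcher(
                None, recognized.lower(), candidate.lower()
            ).ratio()
            if score > best_score:
                best_phrase, best_score = candidate, score
    return best_phrase, best_score


if __name__ == "__main__":
    corpus = "recurrent neural networks are widely used"
    print(closest_phrase(corpus, "nural networks"))
```

This brute-force scan is fine for a few pages of text but gets slow on a large corpus, and character similarity is only a stand-in for how alike two phrases sound, so a phonetic comparison may work better for speech recognition errors.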

Tools to effectively manage the information?

How do you guys manage information overload?
What tools do you guys use?
One useful tool is an RSS feed reader.
Does anybody use any other tools or methods to manage information effectively?
Be an information snob.
If the blog doesn't absolutely rock your world, don't read it. It's so easy to get bogged down, even obsessed, with too much information. No matter what tools you have, you're still human and can only read so many words per day.
I use Evernote to keep notes and search through them.
I use Google Reader for the feeds. I split it into multiple categories: 'A' with the more unique stuff, 'B' with the spam (Digg for example, easy to ignore because the important stuff shows up in 'A'), and 'C' for my webcomics.
I always read the stuff in 'A'; I read 'C' when I'm bored and 'B' when I have spare time. I often mark 'B' as read just to get rid of it.
For work I'm stuck with Outlook, so I use the 'Tasks' function of Outlook a lot to get things sorted. I'm also a big believer in 'Inbox Zero' (http://www.43folders.com/izero).
I use a small number of tools and techniques, because it is easy to get distracted managing the information management tools, rather than managing the information.
Google Reader - The key for me was creating #work and #home labels, for the appropriate location.
TiddlyWiki - I keep track of all my notes for work projects in a TiddlyWiki file.
Delicious - I keep my bookmarks here. When I come across a link I want to read later (usually in my RSS Reader), I tag it #readreview. When I read it, I delete it unless it is useful reference, then I retag appropriately.
Local bookmarks - I store bookmarks on the browser toolbar in folders so I can middle-click and open all in tabs. Obviously these would be limited in number :-). I also have a bookmarklets folder.
I don't have a PDA. I have a pad of graph paper on my desk that I use for writing temporary notes and diagrams (permanent notes go into the TiddlyWiki). A lot of "productivity blogs" like to promote various tools, and some of these caught on for people, but I find my system is pretty simple and easy for me to manage. This makes it useful.
Well, this is an obvious one, but iGoogle seems to do a great job for me.
Depends on what information you are looking to manage. Can you be more specific?
I use Google Reader to handle the things I read, RememberTheMilk to remind me of what I have to do, and Gmail overall to quickly store and search data and correspondence.
Oh, and I use the hipster PDA too!
You should probably check out Lifehacker for more tools and Getting Things Done apps.
Like you say, most sites have an RSS feed today. If you use more than one computer, get an RSS reader that syncs between them, so you don't have to mark a lot of posts as read twice. A good program is FeedDemon: it's free and syncs between computers, and there is an online version as well if you are on the road. FeedDemon also has tools to help you identify the feeds that you don't really read, and gives you a top 10 of the feeds you look at most. This can help you delete bad RSS feeds and organize your feeds.
I also use Delicious to keep my bookmarks in sync, which is very handy if you bookmark a lot.
Other than that, I don't really use any more tools, just the common sense that there are only 24 hours in the day, so don't spend them reading information you don't need. Bookmark interesting blog posts from RSS and read them later when you need to.
I've been using Delicious quite a bit over the past 2 years and it's been a great help.
If you're primarily interested in blogs, what I think we need is a way to prioritize the information that we, personally, are interested in. There used to be an RSS reader called wTicker (now defunct) that used Bayesian filtering to rate articles for you. Another product under development, Particls, would similarly watch what you read and highlight similar content.
What about other types of information, though? For example, the tasks that OneNote or Evernote, or more obscure tools like Zoot, aim to facilitate?
It depends on the type of information you mean. The answers above cover most of the tools. But if you use MS Office, you should explore OneNote.
iGoogle: News, RSS, Weather, New Films, E-Mail widgets
ToDoList: everyday work tasks
Local MediaWiki, for local company knowledge
Smartphone MS Excel for personal finances.
I still read news and blogs from RSS feeds. Feedly is the best tool for that right now.
When I find something interesting in Feedly, I add it to Pocket and read later. A Premium account allows me to highlight paragraphs I would like to save.
I also set up a recipe on IFTTT that monitors my likes on Twitter and adds links from the liked tweets to Pocket too.
As Substack grows, there is a new email newsletter boom. But my inbox is also the place where I do my work. So I wrote an Apps Script file to receive newsletters once or twice a day and prevent them from distracting me from work. Then I published it as the Silent Inbox add-on, which plugs into your Gmail.