How to convert text into IPA? - text-to-speech

I want to convert some text into IPA so that the pronunciation is always the same across all TTS engines.
I haven't found a good way to do that; in particular, I need this for German.
I hoped for a service from a big player like Google, because of their advanced TTS, but I didn't find one.
Does anyone know a good tool / API / plugin for that?

Check out espeak-ng, which prints IPA with the --ipa option (the -x flag prints espeak's own phoneme mnemonics rather than IPA symbols). There are also a couple of related libraries based on espeak/espeak-ng if you don't want to use the CLI directly.
Also, Amazon Polly can output "visemes" (see Speech Marks), which in theory you could translate back into IPA (they have a lookup table).
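
For the CLI route, here is a minimal Python sketch. It assumes espeak-ng is installed and on the PATH, and that its German voice is named "de". The helper names build_ipa_command and text_to_ipa are made up for illustration; --ipa (print IPA symbols), -q (no audio) and -v (voice) are espeak-ng's own options.

```python
import subprocess


def build_ipa_command(text, voice="de"):
    """Return the espeak-ng argument list that prints IPA for `text`."""
    # -q: do not play audio; --ipa: write IPA phonemes to stdout; -v: voice.
    return ["espeak-ng", "-q", "--ipa", "-v", voice, text]


def text_to_ipa(text, voice="de"):
    """Run espeak-ng and return its IPA transcription (needs espeak-ng)."""
    result = subprocess.run(
        build_ipa_command(text, voice),
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout.strip()
```

Calling text_to_ipa("Guten Tag") should give you an IPA string you can paste into an SSML phoneme element, but check the transcription by ear; espeak-ng's German pronunciation rules are not guaranteed to match what another engine expects.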

Related

Using the SSML phoneme element

I am using Visual Basic.net Ultimate and am developing a TTS application. May I please have some help with the phoneme element?
Here is the text that I wish to speak:
As you release the tension in your shoulders and neck, take another deep breath in... and out.
Currently, the two words "breath in" seem to run together and sound like "breath thin".
I would like to modify this statement (via SSML) so that the words sound like "breath in".
What would be the best way to do this via SSML? I am thinking that the phoneme element is the best way to do this.
Here is an example I have found to pronounce the word tomato:
<phoneme alphabet="ipa" ph="təˈmeɪ.ɾoʊ"> tomato </phoneme>
The text in the ph attribute of the above code seems to be in a totally different language (:)). How do I use this language to spell out a word?
This "totally different language" is the International Phonetic Alphabet. That's why on the same line it says: alphabet="ipa".
Different TTS systems use different phonetic representations of sounds, the IPA being one that some systems support.
I think a break element would be more appropriate for what you are doing: you are not trying to change the system's pronunciation of a word, you just want a short pause between the two words.
If you specify which TTS engine you are using I can try to help more.
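
As a rough illustration of the break suggestion, here is a small Python sketch that builds the SSML string. The function name with_break and the 200ms pause are assumptions chosen for the example, and whether a given engine honors the break element, and how long the pause should be, depends on the engine.

```python
from xml.sax.saxutils import escape, quoteattr


def with_break(before, after, pause="200ms"):
    """Return an SSML document with a pause between two pieces of text."""
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f"{escape(before)}<break time={quoteattr(pause)}/>{escape(after)}"
        "</speak>"
    )


ssml = with_break(
    "As you release the tension in your shoulders and neck, "
    "take another deep breath",
    " in... and out.",
)
print(ssml)
```

The resulting string can then be passed to whatever "speak SSML" call your engine provides.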

How does Safari's reader feature work?

I want to add a similar feature to a tool I'm making. I'm interested in how it works code-wise. I want to be able to take an HTML page and strip out everything but the article.
The Readability project does something similar for Chrome and iOS. I'm not sure how it detects the content automatically, but I know that Readability has an API for people who want to integrate its features. You might want to check that out.
http://www.readability.com/learn-more
If you're working with Ruby, you could use Pismo. It extracts an article from a given document.

Script or piece of code to get a quick list of links per page in a website

How can I quickly produce a report of a website in the format:
Page Name.
- Links within the page
Page Name.
- Links within the page
Any programming or scripting language will do.
Although I prefer a solution on Windows, we have all of: Windows, Mac and Linux platforms available in the office.
Just looking for a way to do it without much fanfare.
There might be tools that can do this for you, but it isn't all that hard to put together yourself. One possible solution would be to:
- use wget (also available for Windows) to download all HTML files, and
- use an XPath tool, or grep with regular expressions, to get the title and the links from the pages.
///Jens
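
The second step can also be sketched in Python with only the standard library, instead of XPath or grep. This is a minimal sketch, not a crawler: it assumes the pages have already been downloaded (for example with wget), and the names PageLinks and report are invented for the example.

```python
from html.parser import HTMLParser


class PageLinks(HTMLParser):
    """Collect the <title> text and every <a href> of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def report(html):
    """Format one page as 'Title.' followed by one '- link' line per link."""
    parser = PageLinks()
    parser.feed(html)
    parser.close()
    lines = [parser.title.strip() + "."]
    lines += ["- " + link for link in parser.links]
    return "\n".join(lines)


sample = (
    "<html><head><title>Home</title></head><body>"
    '<a href="/about.html">About</a> <a href="/contact.html">Contact</a>'
    "</body></html>"
)
print(report(sample))
```

To cover a whole site, read each downloaded file and call report on its contents.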
There are loads of link analysers that will do exactly that. Here's the first I found in Google.
For something a little more interesting, Don Syme did a great F# demo in which he wrote a really simple asynchronous URL processing class. I can't find the exact link, but here's something similar from an F# MVP. You would need to adapt it to pull out links, and to follow them recursively if you want nesting.

Create wmv-file from code?

I have a project where the requirement is that an end user selects a template and enters some information, and my program then creates a WMV movie file that has the entered information encoded in the movie.
So from my perspective I would like to have a framework that allows me to add graphics and text to a movie. Something like this:
movie.addframes(framecount, templateimage)
movie.frame(x).drawtext(x,y,text,font,size,color)
movie.frame(x).drawRectangle(rect,color,bordersize)
movie.frame(x).drawImage(rect,borderstyle,bordersize, image,sizemode)
movie.save(filename,filetype)
Does this exist?
I have searched and only found information about ffmpeg that doesn't seem to do what I want.
I don't need it to be real-time encoding.
I don't care if the framework/library is expensive.
If there is information on how to do this with, for example, DirectX or DirectShow, pointing to real working vb.net examples, then I'll be happy too. ;) (Believe me, I have tried to search and haven't found anything.)
I have not found any good information about how to use Windows Media Encoder for this, but it seems like Windows Media Encoder is the way to go if I do it myself.
You can use DirectShow; check Samples\Capture\CapWMV here

A technology for reading pdfs online with annotations?

Is there an open source solution that displays PDFs for online reading? It has to be searchable, much like Google Books, and if possible it should be able to display annotations.
By "online reading" I'll assume you mean without a PDF reader plugin on the client. In that case you'll need to convert to HTML:
http://pdftohtml.sourceforge.net/
If you don't mind losing the ability to copy text, then converting to PNG may give you a more accurate rendering:
http://www.imagemagick.org/
Regardless of the output format, you can run your searches against the original PDF data. One technology for this is mnogosearch:
http://www.mnogosearch.org/
mnogosearch uses pdftotext internally; you may find this useful if you want to write your own search routines. pdftotext is part of the Xpdf suite of utilities:
http://www.foolabs.com/xpdf/about.html
All of the tools listed above are available on Windows or Linux
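
If you do write your own search routine on top of pdftotext, it could look roughly like the Python sketch below. It assumes pdftotext is installed and on the PATH; the function names pdf_to_text and find_pages are invented for the example. It relies on two pdftotext behaviours: a "-" output name writes to stdout, and pages are separated by a form feed character by default.

```python
import subprocess


def pdf_to_text(pdf_path):
    """Return the text of a PDF using the pdftotext CLI (must be installed)."""
    # "-enc UTF-8" fixes the output encoding; the trailing "-" means stdout.
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def find_pages(text, query):
    """Return the 1-based page numbers whose text contains `query`."""
    # pdftotext separates pages with a form feed character by default.
    pages = text.split("\f")
    needle = query.lower()
    return [n for n, page in enumerate(pages, start=1) if needle in page.lower()]
```

A call such as find_pages(pdf_to_text("book.pdf"), "annotation") would then tell you which page images or HTML pages to show for a hit; "book.pdf" is just a placeholder file name.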
You may also be interested in the Vuzit DocuPub Platform: http://vuzit.com/products/docupub_platform
The display technology itself is not open source, but they provide an API to access their service, so perhaps it is worth investigating.
I don't know if you are looking for software to install or for a service to pay for...
I've read a lot about www.getbackboard.com (this is not advertising, only reporting something I've read about, that maybe fits your needs.. ;)
Not sure if they do annotations, but both of these will show PDFs quite well:
http://pdfmenot.com
http://docs.google.com
ICEPdf recently released their code as open source. It is Java based.
PyPdf is really nice. It supports reading the text as well as encryption, which I know that itextsharp does not.
Of course you'd have to program in Python, as IronPython's class libraries aren't quite at the point where you can reference them from another language and use them. (But I imagine they will be someday soon.)
PyPdf
This is not open source, but check it out anyway. You can download a free trial of their SDK to try it out. Reading PDFs and their annotations is not simple, and I wouldn't trust a production app to open source decoders.
Here is an online demo.
http://www.atalasoft.com/ajaxannotations/default.aspx
Another good pdf reader is FoxitReader.