How to use neural voices in Azure Direct Line Speech bot - text-to-speech

I am trying to update the Speak() method of the experimental DirectLineSpeech Echo Bot sample to use neural voices, but it doesn't seem to work.
Here is the code I am trying to get working:
public IActivity Speak(string message)
{
    var activity = MessageFactory.Text(message);
    string body = @"<speak version='1.0' xmlns='https://www.w3.org/2001/10/synthesis' xml:lang='en-US'>
        <voice name='Microsoft Server Speech Text to Speech Voice (en-US, JessaNeural)'>
        <mstts:express-as type='chat'>" + $"{message}" + "</mstts:express-as></voice></speak>";
    activity.Speak = body;
    return activity;
}
This is based on the recommendation provided in the SSML guide.
Here is the standard text-to-speech version for reference:
public IActivity Speak(string message)
{
    var activity = MessageFactory.Text(message);
    string body = @"<speak version='1.0' xmlns='https://www.w3.org/2001/10/synthesis' xml:lang='en-US'>
        <voice name='Microsoft Server Speech Text to Speech Voice (en-US, JessaRUS)'>" +
        $"{message}" + "</voice></speak>";
    activity.Speak = body;
    return activity;
}
Can someone help me understand how this works or what I am doing wrong?
In case it helps identify any restrictions: I have deployed the bot as an App Service on the F1 free tier in the westus2 region.
Edit: Updated the code to use the full name, i.e. Microsoft Server Speech Text to Speech Voice (en-US, JessaNeural), instead of the short name en-US-JessaNeural, as suggested by Nicholas. But this doesn't seem to help either.

The exact name of the neural voice is Microsoft Server Speech Text to Speech Voice (en-US, JessaNeural). But the main point is that you want to use a speaking style, via mstts:express-as.
The problem is that you forgot to declare the mstts namespace in the XML (xmlns:mstts='https://www.w3.org/2001/mstts'):
"<speak version='1.0' xmlns='https://www.w3.org/2001/10/synthesis' xml:lang='en-US'>
<voice name='en-US-JessaNeural'>
<mstts:express-as type='chat'>" + $"{message}" + "</mstts:express-as>
</voice>
</speak>";
Should be:
"<speak version='1.0' xmlns='https://www.w3.org/2001/10/synthesis' xmlns:mstts='https://www.w3.org/2001/mstts' xml:lang='en-US'>
<voice name='en-US-JessaNeural'>
<mstts:express-as type='chat'>" + $"{message}" + "</mstts:express-as>
</voice>
</speak>";
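As a quick way to see why the declaration matters, here is a minimal Python sketch (not part of the bot sample) that builds the same SSML and checks whether it is well-formed XML. The helper names `build_ssml` and `is_well_formed` are hypothetical; the voice name, style, and namespace URIs are copied from the snippets above. A namespace-aware XML parser rejects an `mstts:` element whose prefix is not declared, which is one plausible reason the request fails without it.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SSML_NS = "https://www.w3.org/2001/10/synthesis"
MSTTS_NS = "https://www.w3.org/2001/mstts"


def build_ssml(message, declare_mstts=True):
    """Hypothetical helper: wrap a message in SSML with a speaking style."""
    mstts_attr = f" xmlns:mstts='{MSTTS_NS}'" if declare_mstts else ""
    return (
        f"<speak version='1.0' xmlns='{SSML_NS}'{mstts_attr} xml:lang='en-US'>"
        "<voice name='en-US-JessaNeural'>"
        # Escape the text so characters such as & or < cannot break the markup.
        f"<mstts:express-as type='chat'>{escape(message)}</mstts:express-as>"
        "</voice></speak>"
    )


def is_well_formed(ssml):
    """Return True if the string parses as namespace-valid XML."""
    try:
        ET.fromstring(ssml)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(build_ssml("Hello & welcome")))              # prefix declared
print(is_well_formed(build_ssml("Hello", declare_mstts=False)))   # prefix undeclared
```

Escaping the message is a design choice worth carrying over to the C# version: user text containing `&` or `<` would otherwise produce invalid SSML even with the namespace declared.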

Related

How can I get a human-sounding accent (WaveNet or SSML voices)?

I am using Google Cloud Text-to-Speech as described on their website (https://codelabs.developers.google.com/codelabs/cloud-text-speech-csharp/#6).
But there are no details about how to get output from the WaveNet voices (SSML). This code produces the standard voice.
My question is: with this code, how can I get a human-sounding accent (WaveNet or SSML voices)?
using Google.Cloud.TextToSpeech.V1;
using System;
using System.IO;

namespace TextToSpeechApiDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            var client = TextToSpeechClient.Create();

            // The input to be synthesized, can be provided as text or SSML.
            var input = new SynthesisInput
            {
                Text = "This is a demonstration of the Google Cloud Text-to-Speech API"
            };

            // Build the voice request.
            var voiceSelection = new VoiceSelectionParams
            {
                LanguageCode = "en-US",
                SsmlGender = SsmlVoiceGender.Female
            };

            // Specify the type of audio file.
            var audioConfig = new AudioConfig
            {
                AudioEncoding = AudioEncoding.Mp3
            };

            // Perform the text-to-speech request.
            var response = client.SynthesizeSpeech(input, voiceSelection, audioConfig);

            // Write the response to the output file.
            using (var output = File.Create("output.mp3"))
            {
                response.AudioContent.WriteTo(output);
            }
            Console.WriteLine("Audio content written to file \"output.mp3\"");
        }
    }
}
You can check the languages and voices supported by the Text-to-Speech API in its documentation. As described in the tutorial, the voice is characterized by three parameters: language_code, name, and ssml_gender.
You can use the following Python code to synthesize the text "Hello my name is John. How are you?" in English with the en-GB-Standard-A voice:
def synthesize_text(text):
    """Synthesizes speech from the input string of text."""
    from google.cloud import texttospeech
    client = texttospeech.TextToSpeechClient()

    input_text = texttospeech.types.SynthesisInput(text=text)

    # Note: the voice can also be specified by name.
    # Names of voices can be retrieved with client.list_voices().
    voice = texttospeech.types.VoiceSelectionParams(
        language_code='en-GB',
        name='en-GB-Standard-A',
        ssml_gender=texttospeech.enums.SsmlVoiceGender.FEMALE)

    audio_config = texttospeech.types.AudioConfig(
        audio_encoding=texttospeech.enums.AudioEncoding.MP3)

    response = client.synthesize_speech(input_text, voice, audio_config)

    # The response's audio_content is binary.
    with open('output.mp3', 'wb') as out:
        out.write(response.audio_content)
        print('Audio content written to file "output.mp3"')


text = "Hello my name is John. How are you?"
synthesize_text(text)
I am not familiar with C#, but judging by the C# and Java documentation, you should be able to set the name parameter there as well to select the voice.
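To make the role of the name parameter concrete, here is a minimal Python sketch that builds the JSON body for the REST text:synthesize method with a WaveNet voice selected by name. It uses only the standard library and does not send the request. The helper `build_synthesize_request` is hypothetical, and `en-US-Wavenet-D` is an assumed example voice name; check the list of supported voices (or `client.list_voices()`) for the names available to you.

```python
import json


def build_synthesize_request(text, language_code="en-US", voice_name="en-US-Wavenet-D"):
    """Hypothetical helper: build a text:synthesize request body as a dict."""
    return {
        "input": {"text": text},
        # Selecting the voice by name is what picks a WaveNet voice
        # instead of the default standard voice for the language.
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {"audioEncoding": "MP3"},
    }


body = build_synthesize_request("Hello my name is John. How are you?")
print(json.dumps(body, indent=2))
```

In the C# client from the question, the equivalent would presumably be a Name property set on VoiceSelectionParams next to LanguageCode; I have not verified this against the C# library.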

BizTalk 2010 WCF Remove processing instruction

I need to download an XML file from a public website (http://www.tcmb.gov.tr/kurlar/201707/10072017.xml) to get exchange rates.
But I have a problem, because the XML contains an xml-stylesheet processing instruction.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="isokur.xsl"?>
<Tarih_Date Tarih="07.07.2017" Date="07/07/2017" Bulten_No="2017/131" >
I use a WCF-Custom port with webHttpBinding and the BizTalk REST starter kit from bLogical. Everything works fine, but when I try to parse the incoming XML, I get an error on that processing instruction.
System.Xml.XmlException: Processing instructions (other than the XML declaration) and DTDs are not supported. Line 2, position 2.
I'm not sure what the best way to fix this would be. I tried to follow the guide "WCF Errors on XML Deserialization", but it still fails when I try to access the message content using the CreateBufferedCopy method.
using (var readStream = new System.IO.MemoryStream())
{
    using (var buffer = reply.CreateBufferedCopy(int.MaxValue))
    {
        buffer.WriteMessage(readStream);
    }
    readStream.Position = 0;
    xdoc.Load(readStream);
}
Does anybody know how I can access the content of my message without actually parsing the XML? I'm just trying to find a way to either remove that line or make the parser ignore it.
I found the solution myself in the end. Instead of a message inspector, I created a message encoder based on the CustomTextMessageEncoder sample that you can find online.
In the ReadMessage method I just added a little bit of code:
public override Message ReadMessage(System.IO.Stream stream, int maxSizeOfHeaders, string contentType)
{
    // Tell the reader to skip processing instructions such as <?xml-stylesheet ...?>.
    XmlReaderSettings xsettings = new XmlReaderSettings();
    xsettings.IgnoreProcessingInstructions = true;
    XmlReader reader = XmlReader.Create(stream, xsettings);
    return Message.CreateMessage(reader, maxSizeOfHeaders, this.MessageVersion);
}
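The question also asks about simply removing the offending line before parsing. As a language-neutral illustration of that alternative (this is not BizTalk or WCF code), here is a minimal Python sketch that strips every processing instruction except the XML declaration from the raw text. The helper name `strip_processing_instructions` is hypothetical, and the sample root element is closed artificially so the fragment from the question parses. A regular expression like this is only a sketch: it would also match PI-like text inside comments or CDATA sections.

```python
import re
import xml.etree.ElementTree as ET

# Matches any processing instruction except the XML declaration itself.
_PI_PATTERN = re.compile(r"<\?(?!xml[\s?])[\s\S]*?\?>")


def strip_processing_instructions(xml_text):
    """Hypothetical helper: drop PIs such as <?xml-stylesheet ...?> from raw XML text."""
    return _PI_PATTERN.sub("", xml_text)


# First lines of the document from the question; the root element is
# self-closed here (an assumption) so that the fragment is parseable.
raw = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<?xml-stylesheet type="text/xsl" href="isokur.xsl"?>\n'
    '<Tarih_Date Tarih="07.07.2017" Date="07/07/2017" Bulten_No="2017/131" />'
)

cleaned = strip_processing_instructions(raw)
root = ET.fromstring(cleaned.encode("utf-8"))
print(root.tag, root.attrib["Bulten_No"])
```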

Walmart Marketplace API Integration and Authentication

I am working on integrating my application with the Walmart Marketplace API using Ruby on Rails.
  1. If I try to generate an Auth signature for a URL with multiple parameters, it is not generated and exceptions are returned. I am using a Jar file to generate the Auth signature.
    For example: https://marketplace.walmartapis.com/v3/orders?createdStartDate=2016-09-13&createdEndDate=2016-09-23
Has anyone generated an Auth Signature and timestamp for multiple parameters for the Walmart Marketplace API?
  2. Do the Auth Signature and timestamp need to be generated for each API call, e.g. for pagination calls as well?
Does authentication need to be done for each call?
Additional Comments
I know it is a month later and you have already figured out your program, but in case you or anyone else needs help with these parts, I thought I would share the information I have on the Walmart API.
1. You might want to consider building a method in Ruby, since it will integrate better with the rest of your Ruby program. It was kind of difficult; when I was doing it, the hardest part was signing the string with the SHA256 digest of the string to sign. So I threw together a few methods and it works:
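require 'openssl'
require 'base64'

# Wrap lines of Base64 text in PEM BEGIN/END markers.
def box(tag, lines)
  lines.unshift "-----BEGIN #{tag}-----"
  lines.push "-----END #{tag}-----"
  lines.join("\n")
end

# Convert DER-encoded key bytes into a PEM string.
def make_pem(tag, der)
  box tag, Base64.strict_encode64(der).scan(/.{1,64}/)
end

# encodedKeyBytes and stringToSign are built as shown further down.
pem = make_pem('PRIVATE KEY', encodedKeyBytes)
digest = OpenSSL::Digest::SHA256.new
pkey = OpenSSL::PKey::RSA.new(pem)
signature = pkey.sign(digest, stringToSign)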
It's not perfect, but Ruby doesn't really have this functionality built in, so you have to adapt it to get it to work. If this still doesn't work, feel free to contact me. I started out using the jar they provide, and I promise that a Ruby implementation is necessary when you are making thousands of different calls a day with different parameters and URLs: you need to be able to find the point of failure, and if the signing isn't in Ruby it's going to be a lot harder to work with and fix.
2/3. You already answered that these need to be included in every call to the API, and I don't really have anything to add except: don't try to find a way around this, such as submitting the same timestamp for a batch of calls. Even though it might work if the calls are made within a certain time window, Walmart uses the timestamp to determine which call came in last, which is especially important for things like their price API. Again, feel free to email me with any questions. I'll try to respond here too, but I don't check this website that often.
I am using these variable names just to match the code provided in the Walmart developer guide. I am just going to translate the Java code there to Ruby to show how I got the values for stringToSign and encodedKeyBytes.
# This is provided to you by walmart
consumerId = "b68d2a72...."
# Also provided by walmart
privateEncodedStr = "MIICeAIBADANBgkqhkiG9w0BAQEFAA......"
# Full path
baseUrl = "https://marketplace.walmartapis.com/v2/feeds"
# HTTP Method Verb
httpMethod = "GET"
timestamp = (Time.now.to_f * 1000).to_i.to_s
stringToSign = consumerId + "\n" + baseUrl + "\n" + httpMethod + "\n" + timestamp + "\n"
encodedKeyBytes = Base64.decode64(privateEncodedStr)
From there you just run it through the original code, then Base64-encode the signature and remove any whitespace, and you're good to make a request.
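For the multiple-parameter case from the question, the following Python sketch shows one way to assemble the string to sign. It assumes that the URL being signed is the full request URL including its query string, and it leaves out the RSA signing step itself (which needs a crypto library). The helper name `build_string_to_sign` is hypothetical; the consumer ID placeholder and the example URL and parameters come from the posts above.

```python
import time
from urllib.parse import urlencode


def build_string_to_sign(consumer_id, base_url, params, http_method, timestamp_ms=None):
    """Hypothetical helper: return (full_url, timestamp, string_to_sign)."""
    # Append the query string so the signed URL is exactly the URL requested.
    full_url = f"{base_url}?{urlencode(params)}" if params else base_url
    # A fresh millisecond timestamp is generated for every call by default.
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    timestamp = str(timestamp_ms)
    string_to_sign = "\n".join([consumer_id, full_url, http_method, timestamp]) + "\n"
    return full_url, timestamp, string_to_sign


full_url, timestamp, string_to_sign = build_string_to_sign(
    "b68d2a72....",  # placeholder consumer ID, trimmed as in the original post
    "https://marketplace.walmartapis.com/v3/orders",
    {"createdStartDate": "2016-09-13", "createdEndDate": "2016-09-23"},
    "GET",
)
print(full_url)
print(repr(string_to_sign))
```

Because the timestamp defaults to the current time, calling the helper once per request (including each pagination request) naturally yields a new timestamp and therefore a new signature each time.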
To generate a signature for multiple parameters, pass the URL string with the parameters escaped.
The Auth Signature and timestamp need to be generated for each API call, including pagination calls.
"If I try to generate an Auth signature for multiple parameters, it does not generate it and returns exceptions. I am using a Jar file to generate the Auth signature."
Use the SHA class below instead of the jar file.
It will generate the signature for multiple parameters as well.
import org.apache.commons.codec.binary.Base64;
import java.security.KeyFactory;
import java.security.PrivateKey;
import java.security.Signature;
import java.security.spec.PKCS8EncodedKeySpec;

public class SHA256WithRSAAlgo {
    private static String consumerId = "b68d2a72...."; // Trimmed for security reasons
    private static String baseUrl = "https://marketplace.walmartapis.com/v2/feeds";
    private static String privateEncodedStr = "MIICeAIBADANBgkqhkiG9w0BAQEFAA......"; // Trimmed for security reasons

    public static void main(String[] args) {
        String httpMethod = "GET";
        String timestamp = String.valueOf(System.currentTimeMillis());
        String stringToSign = consumerId + "\n" +
                baseUrl + "\n" +
                httpMethod + "\n" +
                timestamp + "\n";
        String signedString = SHA256WithRSAAlgo.signData(stringToSign, privateEncodedStr);
        System.out.println("Signed String: " + signedString);
    }

    public static String signData(String stringToBeSigned, String encodedPrivateKey) {
        String signatureString = null;
        try {
            byte[] encodedKeyBytes = Base64.decodeBase64(encodedPrivateKey);
            PKCS8EncodedKeySpec privSpec = new PKCS8EncodedKeySpec(encodedKeyBytes);
            KeyFactory kf = KeyFactory.getInstance("RSA");
            PrivateKey myPrivateKey = kf.generatePrivate(privSpec);
            Signature signature = Signature.getInstance("SHA256withRSA");
            signature.initSign(myPrivateKey);
            byte[] data = stringToBeSigned.getBytes("UTF-8");
            signature.update(data);
            byte[] signedBytes = signature.sign();
            signatureString = Base64.encodeBase64String(signedBytes);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return signatureString;
    }
}
Do the Auth Signature and timestamp need to be generated for each API call, e.g. for pagination calls as well?
Yes. For each and every call, including pagination, you need to generate a new signature and timestamp.
Does authentication need to be done for each call?
Yes, authentication needs to be done for each call.

Cloud service for outbound call messaging

I would like to integrate a service, as part of our product sign-in, that will call the user to give them a code.
I can generate codes as MP3 without issue, but I don't know of a service that can place the call and then play the MP3 to the user.
Any thoughts or feedback on this sort of app need?
I highly recommend Twilio Cloud Communications; it can be used to send the code via SMS or by a phone call.
Using PHP as an example, here is how it would look:
Text Message
<?php
// Get the Twilio PHP Library from http://twilio.com/docs/libraries
require "Services/Twilio.php";
//Random Code
$code = rand(pow(10, 6-1), pow(10, 6)-1);
// Set your Twilio Account settings
$AccountSid = "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX";
$AuthToken = "YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY";
//Initial Twilio instance
$client = new Services_Twilio($AccountSid, $AuthToken);
//Recipient number - must be +15555555555 format
$recipient = '+18882224444';
$caller_id = '+18008885555'; //Twilio phone number
//Send the message
$sms = $client->account->sms_messages->create(
    $caller_id,
    $recipient,
    "Your activation code is: $code"
);
Phone Call
<?php
// Get the Twilio PHP Library from http://twilio.com/docs/libraries
require "Services/Twilio.php";
// Set your Twilio Account settings
$AccountSid = "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX";
$AuthToken = "YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY";
//Initial Twilio instance
$client = new Services_Twilio($AccountSid, $AuthToken);
$call = $client->account->calls->create(
    '9991231234', // From a valid Twilio number
    '8881231234', // Call this number
    // When the call is connected, code at this URL will be executed
    '/say-code.php'
);
say-code.php
<?php
// Get the Twilio PHP Library from http://twilio.com/docs/libraries
require "Services/Twilio.php";
// Random Code
$code = rand(pow(10, 6-1), pow(10, 6)-1);
// Generate TwiML - XML for Twilio
// This will execute when the caller is connected and use Text-To-Speech to
// play their activation code.
// You may also use an MP3 like so:
// $response->play('http://example.com/code.mp3');
$response = new Services_Twilio_Twiml();
$response->say("Your activation code is $code");
print $response;
There are a lot of other helper libraries out there to accomplish this with Twilio. Hope this helps!
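To show what say-code.php ultimately returns, here is a minimal Python sketch that generates a six-digit code and builds the equivalent TwiML document by hand with the standard library, without the Twilio helper library. The function names `generate_code` and `build_twiml` are hypothetical; `<Response>`, `<Say>`, and `<Play>` are the TwiML elements the PHP example produces through `say()` and `play()`.

```python
import secrets
import xml.etree.ElementTree as ET


def generate_code(digits=6):
    """Return a random numeric code with exactly `digits` digits."""
    low = 10 ** (digits - 1)
    return low + secrets.randbelow(9 * low)


def build_twiml(code, mp3_url=None):
    """Hypothetical helper: TwiML that plays an MP3 if given, else speaks the code."""
    response = ET.Element("Response")
    if mp3_url:
        ET.SubElement(response, "Play").text = mp3_url
    else:
        ET.SubElement(response, "Say").text = f"Your activation code is {code}"
    return ET.tostring(response, encoding="unicode")


print(build_twiml(generate_code()))
print(build_twiml(123456, mp3_url="http://example.com/code.mp3"))
```

The sketch uses the `secrets` module rather than a general-purpose random generator because the code is part of a sign-in flow. Note also that the code spoken on the call has to be the same one your application stored for verification, so in practice it would be generated once and passed to the page that renders the TwiML.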
I am wondering if something like Twilio Cloud Communications would work.

What is API URL executed by Twitter4j? (Search API)

I use the Twitter4J library to access Twitter through their Search API.
I provide a query like this to Twitter4J:
Query{query='#hungergames', lang='null', locale='null', maxId=-1, rpp=100, page=-1, since='null', sinceId=241378725860618240, geocode='null', until='null', resultType='recent', nextPageQuery='null'}
and
result = twitter.search(query);
but I am not sure what URL it executes internally.
Any insights into how I can find that out?
I know the Twitter API documentation describes how I should form the URL to query something, but I want to know what URL Twitter4J actually executed.
The easiest way would probably be to sniff network traffic with a tool like Wireshark.
I used the following code to replicate your query:
public static void main(String[] args) throws TwitterException {
    Twitter twitter = new TwitterFactory().getInstance();
    Query query = new Query("#hungergames");
    query.rpp(100);
    query.setSinceId(241378725860618240L);
    query.setResultType(Query.RECENT);
    System.out.println(query);
    QueryResult result = twitter.search(query);
    for (Tweet tweet : result.getTweets()) {
        System.out.println(tweet.getFromUser() + ":" + tweet.getText());
    }
}
The line that prints the query gives me:
Query{query='#hungergames', lang='null', locale='null', maxId=-1, rpp=100, page=-1, since='null', sinceId=241378725860618240, geocode='null', until='null', resultType='recent'}
By sniffing the network traffic I found that the code is requesting the following URL:
http://search.twitter.com/search.json?q=%23hungergames&rpp=100&since_id=241378725860618240&result_type=recent&with_twitter_user_id=true&include_entities=true
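To make the mapping from the printed Query fields to the captured URL explicit, here is a small Python sketch that rebuilds the same URL with the standard library. The helper name `build_search_url` is hypothetical, and the parameter names are simply read off the sniffed URL above; this mirrors what was observed on the wire and is not taken from Twitter4J's source.

```python
from urllib.parse import urlencode

BASE_URL = "http://search.twitter.com/search.json"


def build_search_url(query, rpp, since_id, result_type):
    """Hypothetical helper: rebuild the search URL from the query fields."""
    params = {
        "q": query,
        "rpp": rpp,
        "since_id": since_id,
        "result_type": result_type,
        # These two flags appear in the captured URL even though they
        # are not fields of the printed Query object.
        "with_twitter_user_id": "true",
        "include_entities": "true",
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("#hungergames", 100, 241378725860618240, "recent"))
```

Fields that print as 'null' or -1 in the Query output (lang, locale, maxId, page, and so on) do not show up in the captured URL at all, which suggests unset fields are simply omitted from the request.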