Get Speech Studio files from the Azure Cognitive Services Text-to-Speech API (or SDK)

Is there any way to get the files generated in Speech Studio using the REST API or SDK?
I'm working on a project where I want to create several audio files from text. I like the Speech Studio tool, so we are thinking of integrating it into the workflow: creating the audio in Speech Studio and then requesting it from the app.

There is no API to export audio from the Azure Speech Studio audio content creation tool, but you can generate your audio directly via the API/SDK and export it.
REST API example:
curl --location --request POST "https://${SPEECH_REGION}.tts.speech.microsoft.com/cognitiveservices/v1" \
--header "Ocp-Apim-Subscription-Key: ${SPEECH_KEY}" \
--header 'Content-Type: application/ssml+xml' \
--header 'X-Microsoft-OutputFormat: audio-16khz-128kbitrate-mono-mp3' \
--header 'User-Agent: curl' \
--data-raw '<speak version='\''1.0'\'' xml:lang='\''en-US'\''>
<voice xml:lang='\''en-US'\'' xml:gender='\''Female'\'' name='\''en-US-JennyNeural'\''>
my voice is my passport verify me
</voice>
</speak>' > output.mp3
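If you would rather drive the same REST endpoint from Python instead of curl, here is a minimal sketch. It assumes the same SPEECH_KEY/SPEECH_REGION environment variables; the helper names (build_ssml, synthesize_to_file) and the output path are illustrative, not part of any official SDK:

```python
import os
import urllib.request

def build_ssml(text, voice="en-US-JennyNeural", lang="en-US"):
    """Build the SSML payload expected by the v1 endpoint."""
    return (
        f"<speak version='1.0' xml:lang='{lang}'>"
        f"<voice xml:lang='{lang}' name='{voice}'>{text}</voice>"
        "</speak>"
    )

def synthesize_to_file(text, path="output.mp3"):
    """POST the SSML to the regional TTS endpoint and save the MP3 bytes."""
    region = os.environ["SPEECH_REGION"]
    req = urllib.request.Request(
        f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1",
        data=build_ssml(text).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": os.environ["SPEECH_KEY"],
            "Content-Type": "application/ssml+xml",
            "X-Microsoft-OutputFormat": "audio-16khz-128kbitrate-mono-mp3",
            "User-Agent": "python-urllib",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp, open(path, "wb") as f:
        f.write(resp.read())

# No network call here; just show the payload that would be sent.
print(build_ssml("my voice is my passport verify me"))
```

Calling synthesize_to_file("hello world") would then write output.mp3, mirroring the curl command above.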
Python SDK example:
import os
import azure.cognitiveservices.speech as speechsdk

# This example requires environment variables named "SPEECH_KEY" and "SPEECH_REGION"
speech_config = speechsdk.SpeechConfig(subscription=os.environ.get('SPEECH_KEY'), region=os.environ.get('SPEECH_REGION'))
audio_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)

# The language of the voice that speaks.
speech_config.speech_synthesis_voice_name = 'en-US-JennyNeural'

speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)

# Get text from the console and synthesize to the default speaker.
print("Enter some text that you want to speak >")
text = input()

speech_synthesis_result = speech_synthesizer.speak_text_async(text).get()

if speech_synthesis_result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Speech synthesized for text [{}]".format(text))
elif speech_synthesis_result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = speech_synthesis_result.cancellation_details
    print("Speech synthesis canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        if cancellation_details.error_details:
            print("Error details: {}".format(cancellation_details.error_details))
            print("Did you set the speech resource key and region values?")
For more examples, including how to synthesize to a file, see https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/get-started-text-to-speech?tabs=windows%2Cterminal&pivots=programming-language-python#synthesize-to-a-file
In Speech Studio itself, the only way to export the audio is the "Export" button.

Related

Can we make an internal Google API call by installing all prerequisites locally for OCR?

I'm trying to implement OCR in a bank environment, but the challenge is that we don't have access to an internet connection, for security reasons.
"Handwritten and scanned documents to be digitalised"
Open-source engines like Tesseract OCR are good for normal English, but most of our documents are in handwritten Arabic. I have tried the Google OCR API, which, with its AI & ML, works better with handwritten Arabic and gives more accuracy.
I have Google Cloud: I created storage, uploaded a handwritten Arabic image to a bucket, then executed an internal command in the cloud terminal, which gives proper results.
External API Call:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
--data "{
'requests': [
{
'image': {
'source': {
'imageUri': 'gs://vision-api-handwriting-ocr-bucket/handwriting_image.png'
}
},
'features': [
{
'type': 'DOCUMENT_TEXT_DETECTION'
}
]
}
]
}" "https://vision.googleapis.com/v1/images:annotate"
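For reference, the same external request can be issued from Python. This is just a sketch of the curl call above (token acquisition is omitted; the bucket URI is the one from the question):

```python
import json
import urllib.request

def build_annotate_request(image_uri):
    """Assemble the images:annotate body for DOCUMENT_TEXT_DETECTION."""
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
            }
        ]
    }

def annotate(image_uri, access_token):
    """POST the request to the public Vision endpoint (requires internet access)."""
    data = json.dumps(build_annotate_request(image_uri)).encode("utf-8")
    req = urllib.request.Request(
        "https://vision.googleapis.com/v1/images:annotate",
        data=data,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# No network call here; just show the request body that would be sent.
body = build_annotate_request(
    "gs://vision-api-handwriting-ocr-bucket/handwriting_image.png")
print(json.dumps(body))
```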
Internal Google Cloud cmd:
gcloud ml vision detect-document "gs://vision-api-handwriting-ocr-bucket/handwriting_image.png"
The above internal Google Cloud command works fine in my case. I need to implement the same in our local system. Is there any possibility of installing the same Google Cloud environment and its OCR engine locally? If it's priced, that's no problem; we're ready to pay. Resources like high-spec servers and networking are all already available in our bank.
It's been a while since this was posted, but just in case: there is a solution available now.
You can check out this OCR on-prem application on Google Cloud Marketplace, which can be deployed as a container to any GKE cluster:
https://cloud.google.com/vision/on-prem

Is there a Cognito SDK that uses the `amazoncognito.com/oauth2/token` endpoint?

According to the following docs, I can exchange a code for an access_token using this curl:
curl -X POST \
https://mysubdomain.auth.us-east-2.amazoncognito.com/oauth2/token \
-H 'Content-Type: application/x-www-form-urlencoded' \
-H 'authorization: Basic ...' \
-d 'grant_type=authorization_code&client_id=client_id&code=code&redirect_uri=https%3A%2F%2Fwww.somewhere.com'
https://docs.aws.amazon.com/cognito/latest/developerguide/token-endpoint.html
I got this working no problem in Postman. Now I want to replicate this HTTP request in a .NET Core Web API application, and I'm having a very hard time finding any SDK to manage this. I could build and issue the HttpRequest myself and deserialize the response JSON into models, but I find it hard to believe there isn't some AWS library that handles and maintains this much better than I ever could.
Is there an SDK for the amazoncognito.com/oauth2/token endpoint, preferably for dotnet core?
Unfortunately, not yet. You should make a native HTTP call with the POST method.
Here is the GitHub issue tracking this (for the Java SDK):
https://github.com/aws/aws-sdk-java/issues/1792
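Until an SDK covers it, the raw call is small in any language. Here is a sketch of the exchange in Python (the domain, client credentials, code, and redirect URI are placeholders taken from the question; the .NET version is the same request built with HttpClient):

```python
import base64
import urllib.parse
import urllib.request

def build_token_request(domain, client_id, client_secret, code, redirect_uri):
    """Build the POST request for the Cognito /oauth2/token endpoint."""
    # The Basic header is base64("client_id:client_secret"), per the token endpoint docs.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode({
        "grant_type": "authorization_code",
        "client_id": client_id,
        "code": code,
        "redirect_uri": redirect_uri,
    }).encode("utf-8")
    return urllib.request.Request(
        f"https://{domain}/oauth2/token",
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": f"Basic {basic}",
        },
        method="POST",
    )

# No network call here; just build and inspect the request.
req = build_token_request(
    "mysubdomain.auth.us-east-2.amazoncognito.com",
    "client_id", "client_secret", "code", "https://www.somewhere.com")
print(req.get_method(), req.full_url)
```

Sending it with urllib.request.urlopen(req) and parsing the JSON body would yield the access_token, id_token, and refresh_token fields.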

Programmatically configure the AWS IoT Button

I'm trying to create a React Native app, and I want to know if there is any way to programmatically configure the AWS IoT Button:
http://docs.aws.amazon.com/iot/latest/developerguide/configure-iot.html
The app already connects to the Button ConfigureMe - XXX network.
The next step is configuring the button. I want to configure it without the user opening the browser at 192.168.0.1/index.html.
I want to build an AJAX request to 192.168.0.1/configure, passing the form data it needs:
wifi_ssid
wifi_password
aws_iot_certificate
aws_iot_private_key
endpoint_region
endpoint_subdomain
I can see the action endpoint, 192.168.0.1/configure, in the network console.
I was able to do this with a curl command on Linux. First you need the body of the multipart/form-data request; you can get it from the browser console after you click the Configure button. Once you have the body saved to a file, just run this, where $BOUNDARY is the boundary string used inside the body. Note --data-binary sends the pre-built body verbatim, which matches the explicit Content-Type header (curl's -F would wrap the file in a second multipart envelope):
curl \
-X POST \
-H "Content-Type: multipart/form-data; boundary=$BOUNDARY" \
--data-binary @body.txt \
http://192.168.0.1/configure
I'm still trying to make it work in PowerShell via Invoke-RestMethod, but no luck so far.
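If it helps to script it instead of capturing the body from a browser, a multipart body can also be assembled by hand. A Python sketch (the field names come from the configuration form listed in the question; the values and the boundary string are placeholders):

```python
def build_multipart_body(fields, boundary="----iotbtn"):
    """Assemble a multipart/form-data body from a dict of plain form fields."""
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # blank line separates headers from the value
        lines.append(value)
    lines.append(f"--{boundary}--")  # closing boundary
    return "\r\n".join(lines) + "\r\n"

# Placeholder values; the real form also takes the certificate and private key.
body = build_multipart_body({
    "wifi_ssid": "my-network",
    "wifi_password": "secret",
    "endpoint_region": "us-east-1",
    "endpoint_subdomain": "abc123",
})
print(body.splitlines()[0])
```

The resulting string can be POSTed to http://192.168.0.1/configure with a Content-Type of multipart/form-data; boundary=----iotbtn, which is the same shape as the curl call above.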

URLFetchApp with certificate: Google scripts with Apple ads reporting API

Hi, I'm attempting to pull data from the Apple Ads API into a Google Sheet, and I'm getting completely stuck on providing the security certificates. I've been able to successfully pull my data using Postman, so I'm confident I can structure the request properly.
I'm trying to use UrlFetchApp, but I can't see any means of including the PEM and KEY files, or even following the curl example provided by Apple, which combines them into a P12 file. Am I missing something here, or is UrlFetchApp unable to do this?
It doesn't appear to me that this would fit into any of the existing params for UrlFetchApp: https://developers.google.com/apps-script/reference/url-fetch/url-fetch-app#fetchurl-params
curl \
--cert ./<FILENAME>.p12 \
--pass <PASSWORD> \
-H "Authorization: orgId=<ORG_ID>" \
-H "Content-Type: application/json" \
-d "<CAMPAIGN_DATA_FILE>.json" \
-X POST "https://api.searchads.apple.com/api/v1/campaigns"
You're right in that Google Apps Script (GAS) does not support client-side SSL certificates in their UrlFetchApp class, which appears to be their only way to make outbound HTTP(S) requests.
Your best bet is probably to build a small service on Google App Engine (GAE) in a language of your choice and expose an endpoint there which, when called from GAS, makes a new request to your destination and provides the needed certificates. However, GAE is not free like GAS (since Google changed their cloud terms of service a couple of years back), so that's something to keep in mind.
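On the proxy side, attaching a client certificate is straightforward in most languages. A Python standard-library sketch, assuming you export the P12 into separate PEM/key files first (the file names, org ID, and payload are hypothetical):

```python
import http.client
import ssl

def make_client_cert_context(cert_file, key_file):
    """Build an SSL context that presents a client certificate to the server."""
    context = ssl.create_default_context()
    # Hypothetical paths: PEM files exported from the P12 Apple issued you.
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context

def post_campaigns(org_id, payload, context):
    """POST campaign JSON to the Apple Search Ads API with the cert attached."""
    conn = http.client.HTTPSConnection("api.searchads.apple.com", context=context)
    conn.request(
        "POST", "/api/v1/campaigns", body=payload,
        headers={
            "Authorization": f"orgId={org_id}",
            "Content-Type": "application/json",
        },
    )
    return conn.getresponse()
```

The GAS side then only needs an ordinary UrlFetchApp.fetch call to your proxy endpoint, with the certificate handling done entirely server-side.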

Auto-register a new TeamCity build agent via API

We use Ansible to configure build agents for different technology stacks, e.g. a frontend builder (Node.js, libs, Dart SDK) and a backend builder (JDK). With the orchestration tool it is easy to replace one Linux box with another by configuring the new one from zero, except for registering the new TeamCity agent.
Is it possible to generate a new authorizationToken for a new agent with an API call that can be used from a programming language, or to register a new agent via an API call, so that a new Linux box can be connected without an admin/human?
There's a REST API call to achieve this: just pass the string true or false as request data in a PUT request to /httpAuth/app/rest/agents/<agentLocator>/authorized; the <agentLocator> syntax is described here.
Here's an example of a curl command:
curl -X PUT "http://teamcity/httpAuth/app/rest/agents/id:3/authorized" --data true --header "Content-Type: text/plain" -u user:pass
The PUT method must be used, and the Content-Type: text/plain header must be provided.
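The same call can live inside your bootstrap tooling instead of a shell script. A Python sketch (the server URL, agent locator, and credentials are placeholders matching the curl example):

```python
import base64
import urllib.request

def build_authorize_request(base_url, agent_locator, authorized, user, password):
    """Build the PUT request that flips a TeamCity agent's authorized flag."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/httpAuth/app/rest/agents/{agent_locator}/authorized",
        # The request body is literally the string "true" or "false".
        data=str(authorized).lower().encode("ascii"),
        headers={
            "Content-Type": "text/plain",
            "Authorization": f"Basic {token}",
        },
        method="PUT",
    )

# No network call here; just build and inspect the request.
req = build_authorize_request("http://teamcity", "id:3", True, "user", "pass")
print(req.get_method(), req.full_url)
```

Sending it with urllib.request.urlopen(req) from the Ansible-provisioned box would authorize the freshly registered agent without an admin touching the UI.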