When requesting Azure TTS through an SSML statement, there is a delay of about 2 seconds - text-to-speech

As shown in the code below, when Azure TTS is requested with an SSML statement, the response is delayed by about 2 seconds.
public static async Task SynthesizeAudioAsync()
{
    var config = SpeechConfig.FromSubscription("xxxxxxxxxKey", "xxxxxxxRegion");
    using var synthesizer = new SpeechSynthesizer(config, null);
    var ssml = File.ReadAllText("C:/ssml.xml");
    var result = await synthesizer.SpeakSsmlAsync(ssml); // <=== The delay is right here
    using var stream = AudioDataStream.FromResult(result);
    await stream.SaveToWaveFileAsync("C:/file.wav");
}
What is the problem?
Or is it normal for the response to be delayed by about 2 seconds?

var result = await synthesizer.SpeakSsmlAsync(ssml);
As far as I know, the time taken is roughly proportional to the amount of text in the SSML that has to be converted to speech.
What happens in the backend is that the SSML data is relayed to Azure Speech Services, the service processes the text, builds the required audio bytes using the ML model, and returns them to the requesting client.
I am assuming the delay you are talking about is the latency of the above process. If the text to be synthesized is large, building the required audio bytes will take considerable time.
Since you are using await, your code waits until the entire synthesis has completed.
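If what matters is the time to the first audio rather than getting the whole file in one call, you could start the synthesis and consume audio as it is produced instead of awaiting the complete result. A minimal sketch (not from the original code), assuming the Speech SDK's StartSpeakingSsmlAsync and AudioDataStream streaming APIs:

using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

public static async Task SynthesizeAudioStreamingAsync()
{
    var config = SpeechConfig.FromSubscription("xxxxxxxxxKey", "xxxxxxxRegion");
    using var synthesizer = new SpeechSynthesizer(config, null);
    var ssml = File.ReadAllText("C:/ssml.xml");

    // Returns as soon as synthesis has started, rather than when it has finished.
    using var result = await synthesizer.StartSpeakingSsmlAsync(ssml);
    using var stream = AudioDataStream.FromResult(result);

    // Read audio chunks as the service produces them.
    var buffer = new byte[16000];
    uint bytesRead;
    while ((bytesRead = stream.ReadData(buffer)) > 0)
    {
        // Feed each chunk to a player, a network stream, etc.
        Console.WriteLine($"Received {bytesRead} bytes of audio");
    }
}

The overall synthesis still takes about the same time; this only changes when the first bytes become available to your code.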


React Native FileSystem file could not be read? What the Blob?

I am trying to send an audio file recorded using expo-av library to my server using web sockets.
WebSocket will only allow me to send a String, ArrayBuffer or Blob. I spent the whole day trying to find out how to convert my .wav recording into a blob, but without success. I tried to use the expo-file-system method FileSystem.readAsStringAsync to read the file as a string, but I get an error that the file could not be read. How is that possible? I passed it the correct URI (using recording.getURI()).
I tried to re-engineer my approach to use a fetch and FormData POST request with the same URI, and the audio gets sent correctly. But I would really like to use WebSockets so that later I could try to stream the sound to the server in real time instead of recording it first and then sending it.
You can try this ... but I can't find a way to read the blob itself:
// this is from my code ...
let recording = new Audio.Recording();
const info = await FileSystem.getInfoAsync(recording.getURI() || "");
console.log(`FILE INFO: ${JSON.stringify(info)}`);
// get the file as a blob
const response = await fetch(info.uri);
const blob = await response.blob(); // slow - Takes a lot of time

How to work around maximum execution time when uploading to S3 Bucket?

I am using the S3-for-Google-Apps-Script library to export full attachments from Gmail to an S3 bucket. I changed the S3 code to upload the actual content of the attachment rather than an encoded string, as detailed in this post.
However, when attempting to upload an attachment larger than roughly 5 MB, Apps Script throws the following error: "Maximum Execution Time Exceeded". I used timestamps to measure the elapsed time and confirm that the issue occurs in the s3.putObject(bucket, objectKey, file) function.
It might also be helpful to note that for a file barely over the limit, the upload to my S3 bucket still succeeds, but Apps Script still reports to the user that the execution time (30 seconds) has been exceeded, disrupting the user flow.
Reproducible Example
This is basically a simple button that scrapes the current email for all attachments; if they are PDFs, it calls the export function, which exports those attachments to our S3 instance. The problem is that when a file is larger than 5 MB, it throws the error:
"exportHandler exceeded execution time"
If you're trying to reproduce this, be aware that you need to copy an instance of S3-for-Google-Apps-Script and initialize it as a separate library in Apps Script, with the changes made here.
To link the libraries, go to File > Libraries in the Google Apps Script console and add the respective library ID, version, and development mode. You'll also need to save your AWS access key and secret key in your PropertiesService cache, as detailed in the library documentation.
An initial button that triggers an export of a single attachment on the current Gmail thread:
export default function testButton() {
  const Card = CardService.newCardBuilder();
  const exportButtonSection = CardService.newCardSection();
  const exportWidget = CardService.newTextButton()
    .setText('Export File')
    .setOnClickAction(CardService.newAction().setFunctionName('exportHandler'));
  exportButtonSection.addWidget(exportWidget);
  Card.addSection(exportButtonSection);
  return Card.build();
}
Export an attachment to a specified S3 bucket. Note that S3Modified is an instance of S3-for-Google-Apps-Script modified in accordance with the post outlined above; it lives in a separate Apps Script file. s3.putObject is where processing an attachment takes a long time (this is where the error occurs, I think).
The credentials (your AWS access key ID, secret access key, and bucket) can be stored in PropertiesService.
function exportAttachment(attachment) {
  const fileName = attachment.getName();
  const timestamp = Date.now();
  const credentials = PropertiesService.getScriptProperties().getProperties();
  const s3 = S3Modified.getInstance(credentials.awsAccessKeyId, credentials.awsSecretAccessKey);
  s3.putObject(credentials.awsBucket, fileName, attachment, { logRequests: true });
  const timestamp2 = Date.now();
  Logger.log('difference: %s', timestamp2 - timestamp);
}
This gets all the attachments that are PDFs in the current email message. The function is pretty much the same as the one on the Apps Script site for handling Gmail attachments, except that it specifically looks for PDFs (not a requirement for the code):
function getAttachments(event) {
  const gmailAccessToken = event.gmail.accessToken;
  const messageIdVal = event.gmail.messageId;
  GmailApp.setCurrentMessageAccessToken(gmailAccessToken);
  const mailMessage = GmailApp.getMessageById(messageIdVal);
  const thread = mailMessage.getThread();
  const messages = thread.getMessages();
  const filteredAttachments = [];
  for (let i = 0; i < messages.length; i += 1) {
    const allAttachments = messages[i].getAttachments();
    for (let j = 0; j < allAttachments.length; j += 1) {
      if (allAttachments[j].getContentType() === 'application/pdf') {
        filteredAttachments.push(allAttachments[j]);
      }
    }
  }
  return filteredAttachments;
}
The global handler that gets the attachments and exports them to the S3 bucket when the button is clicked:
function exportHandler(event) {
  const currAttachment = getAttachments(event).flat()[0];
  exportAttachment(currAttachment);
}

global.export = exportHandler;
To be absolutely clear, the bulk of the time is spent in the second code sample (exportAttachment), since that is where the object is uploaded to S3.
The timestamps log how long that function takes: a 300 KB file takes about 2 seconds, a 4 MB file about 20 seconds, and anything over 5 MB roughly 30 seconds. This part contributes the most to the maximum execution time.
So this is what leads me to my question: why do I get the maximum execution time exceeded error, and how can I fix it? Here are my two thoughts on potential solutions:
Why does the execution limit occur? The quotas say that the runtime limit for a custom function is 30 seconds, and the runtime limit for the script is 6 minutes.
After some research, I only found custom functions mentioned in the context of add-ons for Google Sheets, but the function where I'm getting the error is a global function (so that it can be recognized by a callback) in my script. Is there a way to make it not be treated as a custom function, so that I'm not limited to the 30-second execution limit?
Now, how can I work around this execution limit? Is this an issue with the recommendation to modify the S3 library in this post? Essentially, the modification suggests that we export the actual bytes of the attachment rather than the encoded string.
This definitely increases the load that Apps Script has to handle, which is why it increases the required execution time. How can I work around this issue? Is there a way to change the S3 library to improve processing speed?
Regarding the first question
From https://developers.google.com/gsuite/add-ons/concepts/actions#callback_functions
Warning: The Apps Script Card service limits callback functions to a maximum of 30 seconds of execution time. If the execution takes longer than that, your add-on UI may not update its card display properly in response to the Action.
Regarding the second question
The answer to Google Apps Script Async function execution on Server side suggests a "hack": use an "open link" action to call something that can asynchronously run the task that requires a long time to complete.
Related
How to use HtmlService in Gmail add-on using App Script
Handling Gmail Addon Timeouts
Can't serve HTML on the Google Apps Script callback page in a GMail add-on
Answer to rev 1.
Regarding the first question
In Google Apps Script, a custom function is a function to be used in a Google Sheets formula. There is no way to extend this limit. Reference: https://developers.google.com/apps-script/guides/sheets/functions
The onOpen and onEdit simple triggers also have a 30-second execution time limit. Reference: https://developers.google.com/apps-script/guides/triggers
Functions executed from the Google Apps Script editor, a custom menu, an image that has the function assigned, installable triggers, client-side code, or the Google Apps Script API have an execution time limit of 6 minutes for regular Google accounts (those with a @gmail.com email address); G Suite accounts, on the other hand, have a 30-minute limit.

MassTransit RPC (RabbitMQ) throughput limit

We are using MassTransit with RabbitMQ for making RPCs from one component of our system to others.
Recently we hit a throughput limit on the client side, measured at about 80 completed responses per second.
While trying to investigate where the problem was, I found that requests were processed quickly by the RPC server and responses were put into the callback queue, but the queue was then consumed at only about 80 messages per second.
This limit exists only on the client side. Starting another process of the same client app on the same machine doubles the request throughput on the server side, but then I see two callback queues filled with messages, each being consumed at the same 80 messages per second.
We are using a single instance of IBus:
builder.Register(c =>
{
    var busSettings = c.Resolve<RabbitSettings>();
    var busControl = MassTransitBus.Factory.CreateUsingRabbitMq(cfg =>
    {
        var host = cfg.Host(new Uri(busSettings.Host), h =>
        {
            h.Username(busSettings.Username);
            h.Password(busSettings.Password);
        });
        cfg.UseSerilog();
        cfg.Send<IProcessorContext>(x =>
        {
            x.UseCorrelationId(context => context.Scope.CommandContext.CommandId);
        });
    });
    return busControl;
})
.As<IBusControl>()
.As<IBus>()
.SingleInstance();
The send logic looks like this:
var busResponse = await _bus.Request<TRequest, TResult>(
    destinationAddress: _settings.Host.GetServiceUrl<TCommand>(queueType),
    message: commandContext,
    cancellationToken: default(CancellationToken),
    timeout: TimeSpan.FromSeconds(_settings.Timeout),
    callback: p => { p.WithPriority(priority); });
Has anyone faced a problem of this kind?
My guess is that there is some programmatic limit in the response dispatch logic. It might be the max thread pool size, the size of a buffer, or the prefetch count of the response queue.
I tried to play with the .NET thread pool size, but nothing helped.
I'm fairly new to MassTransit and would appreciate any help with my problem.
I hope it can be fixed through configuration.
There are a few things you can try to optimize the performance. I'd also suggest checking out the MassTransit-Benchmark and running it in your environment - this will give you an idea of the possible throughput of your broker. It allows you to adjust settings like prefetch count, concurrency, etc. to see how they affect your results.
Also, I would suggest using one of the request clients to reduce the setup for each request/response. For example, create the request client once, and then use that same client for each request.
var serviceUrl = yourMethodToGetIt<TRequest>(...);
var client = Bus.CreateRequestClient<TRequest>(serviceUrl);
Then, use that IRequestClient<TRequest> instance whenever you need to perform a request.
Response<Value> response = await client.GetResponse<TResponse>(new Request());
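As a sketch of how to wire that up (the MyRequest contract, queue address, and timeout below are placeholders, not from your code), the client could be registered once in your existing Autofac container and then reused everywhere:

builder.Register(c =>
{
    var bus = c.Resolve<IBus>();
    var serviceUrl = new Uri("rabbitmq://localhost/my-service-queue"); // placeholder address
    return bus.CreateRequestClient<MyRequest>(serviceUrl, TimeSpan.FromSeconds(30));
})
.As<IRequestClient<MyRequest>>()
.SingleInstance();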
Since you are just using RPC, I'd highly recommend setting the receive endpoint queue to non-durable, to avoid writing RPC requests to disk. Also adjust the bus prefetch count to a higher value (at least 2x the maximum number of concurrent requests you may have) to ensure that responses are always delivered directly to your awaiting response consumer (it's an internal detail of how RabbitMQ delivers messages).
var busControl = Bus.Factory.CreateUsingRabbitMq(cfg =>
{
    cfg.PrefetchCount = 1000;
});
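Putting both suggestions together, a rough sketch of the server-side configuration might look like the following (the "rpc-service" queue name and the consumer are placeholders; depending on your MassTransit version, ReceiveEndpoint may also take the host as its first argument):

var busControl = Bus.Factory.CreateUsingRabbitMq(cfg =>
{
    // Deliver responses ahead of demand so the awaiting response consumer is never starved.
    cfg.PrefetchCount = 1000;

    // A non-durable, auto-delete queue avoids writing RPC requests to disk.
    cfg.ReceiveEndpoint("rpc-service", e =>
    {
        e.Durable = false;
        e.AutoDelete = true;
        e.PrefetchCount = 1000;
        // e.Consumer<YourRequestConsumer>(); // register your actual consumer here
    });
});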

Akka HTTP Source Streaming vs regular request handling

What is the advantage of using Source Streaming vs the regular way of handling requests? My understanding is that in both cases:
The TCP connection will be reused
Back-pressure will be applied between the client and the server
The only advantage of Source Streaming I can see is if there is a very large response and the client prefers to consume it in smaller chunks.
My use case is that I have a very long list of users (millions), and I need to call a service that performs some filtering on the users and returns a subset.
Currently, on the server side I expose a batch API, and on the client I just split the users into chunks of 1000 and make X batch calls in parallel using the Akka HTTP Host API.
I am considering switching to HTTP streaming, but cannot quite figure out what the value would be.
You are missing one other huge benefit: memory efficiency. By having a streamed pipeline, client/server/client, all parties safely process data without running the risk of blowing up the memory allocation. This is particularly useful on the server side, where you always have to assume the clients may do something malicious...
Client Request Creation
Suppose the ultimate source of your millions of users is a file. You can create a stream source from this file:
val userFilePath : java.nio.file.Path = ???
val userFileSource = akka.stream.scaladsl.FileIO.fromPath(userFilePath)
This source can then be used to create your HTTP request, which will stream the users to the service:
import akka.http.scaladsl.model.HttpEntity.{Chunked, ChunkStreamPart}
import akka.http.scaladsl.model.{RequestEntity, ContentTypes, HttpRequest}
val httpRequest : HttpRequest =
  HttpRequest(uri = "http://filterService.io",
              entity = Chunked.fromData(ContentTypes.`text/plain(UTF-8)`, userFileSource))
This request will now stream the users to the service without loading the entire file into memory. Only a few chunks of data are buffered at a time, so you can send a request with a potentially infinite number of users and your client will be fine.
Server Request Processing
Similarly, your server can be designed to accept a request with an entity that can potentially be of infinite length.
Your question says the service will filter the users; assuming we have a filtering function:
val isValidUser : (String) => Boolean = ???
This can be used to filter the incoming request entity and create a response entity which will feed the response:
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.model.{ContentTypes, HttpResponse}
import akka.http.scaladsl.model.HttpEntity.Chunked
import akka.stream.scaladsl.Source
import akka.util.ByteString

val route = extractDataBytes { userSource =>
  val responseSource : Source[ByteString, _] =
    userSource
      .map(_.utf8String)      // assumes each streamed chunk holds exactly one user record
      .filter(isValidUser)
      .map(ByteString.apply)

  complete(HttpResponse(entity = Chunked.fromData(ContentTypes.`text/plain(UTF-8)`, responseSource)))
}
Client Response Processing
The client can similarly process the filtered users without reading them all into memory. We can, for example, dispatch the request and send all of the valid users to the console:
import akka.actor.ActorSystem
import akka.http.scaladsl.Http

implicit val system : ActorSystem = ActorSystem() // on Akka 2.5 you also need an implicit ActorMaterializer in scope
import system.dispatcher // ExecutionContext for mapping over the Future

Http()
  .singleRequest(httpRequest)
  .map { response =>
    response
      .entity
      .dataBytes
      .map(_.utf8String)
      .runForeach(System.out.println) // Source has no foreach; run the stream to consume it
  }

ServiceStack.Redis Pub/Sub limitations with other nested Redis commands

I am having a great experience with ServiceStack & Redis, but I'm confused by ThreadPool and Pub/Sub within a thread, and an apparent limitation on accessing Redis within a message callback. The actual error I get states that I can only call "Subscribe" or "Publish" within the "current context". This happens when I try to perform another Redis action from the message callback.
I have a process that must run continuously. In my case I can't just service a request once; I must keep a thread alive all the time doing calculations (and controlling these threads from a REST API route is ideal). Data must come into the process on a regular basis, and data must be published. The process must also store and retrieve data from Redis. I am using routes and services to take data in and store it in Redis, so this must happen asynchronously from the "calculation" process. I thought pub/sub would be the answer to glue the pieces together, but so far that does not seem possible.
Here is how my code is currently structured (the code with the above error). This is the callback for the route that starts the long-term "calculation" thread:
public object Get(SystemCmd request)
{
    object ctx = new object();
    TradingSystemCmd SystemCmd = new TradingSystemCmd(request, ctx);

    ThreadPool.QueueUserWorkItem(x =>
    {
        SystemCmd.signalEngine();
    });

    return (retVal); // retVal defined elsewhere
}
Here is the SystemCmd.signalEngine():
public void signalEngine()
{
    using (var subscription = Redis.CreateSubscription())
    {
        subscription.OnSubscribe = channel =>
        {
        };
        subscription.OnUnSubscribe = channel =>
        {
        };
        subscription.OnMessage = (channel, msg) =>
        {
            TC_CalcBar(channel, redisTrade);
        };
        subscription.SubscribeToChannels(dmx_key); // blocking
    }
}
The "TC_CalcBar" call does processing on data as it becomes available. Within this call is a call to Redis for a regular database accesses (and the error). What I could do would be to remove the Subscription and use another method to block on data being available in Redis. But the current approach seemed quite nice until it failed to work. :-)
I also don't know if the ThreadPool has anything to do with the error, or not.
As per Redis documentation:
Once the client enters the subscribed state it is not supposed to
issue any other commands, except for additional SUBSCRIBE, PSUBSCRIBE,
UNSUBSCRIBE and PUNSUBSCRIBE commands.
Source : http://redis.io/commands/subscribe
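In practice, that means the connection holding the subscription must be dedicated to SUBSCRIBE/UNSUBSCRIBE, and any other Redis work inside OnMessage has to go through a different client. A minimal sketch, assuming an IRedisClientsManager (e.g. a RedisManagerPool) is available to the class as redisManager, and with a hypothetical key lookup standing in for the real work passed to TC_CalcBar:

public void signalEngine()
{
    // This connection stays in the subscribed state and is used only for pub/sub.
    using (var subscription = Redis.CreateSubscription())
    {
        subscription.OnMessage = (channel, msg) =>
        {
            // Borrow a separate connection from the pool for regular Redis commands.
            using (var client = redisManager.GetClient())
            {
                var redisTrade = client.Get<string>("trade:" + msg); // hypothetical key
                TC_CalcBar(channel, redisTrade);
            }
        };
        subscription.SubscribeToChannels(dmx_key); // blocking
    }
}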