torchserve with celery / redis - redis

I need to do a long lasting image inferencing, and for some reason, I have to use a torchserve version that has a response timeout limited to 120s.
As inferencing will exceed this limit, does it make sense to let a celery/redis task send an inference request to torchserve (version with timeout bug) custom handler, and wait for a callback, telling that the detection is done now, before the task finishes and proceeds with the next one?
I want furthermore avoid HTTP "too many requests", so just retry N times seems inefficient..

Related

Catch the event of a blocked instance only after a timeout

I have a program where I start several process instances using a cron. For each process instance I have a maximum time, and if the execution time exceeds it, I have to consider it as failure and use some specific methods.
For now what I did was simply to check, once my process instance has finished, if the elapsed time exceeds or not the given maximum time.
But what if my process instance gets blocked for some reason (e.g. server not responding)? I need to catch this event and perform failure operations as soon as the process gets blocked and timeout is exceeded.
How can I catch these two conditions?
I had a look at the FlowableEngineEventType, but there isn’t a PROCESS_BLOCKED/SUSPENDED type of event. But, even if it were, how do I fire it only if a certain amount of time has passed?
I assume that this is the same question as this from the Flowable Forum.
If you are using the Flowable HTTP Task then have a look at the documentation to see how you can set the timeouts on it and how you can react on errors there. If you are firing GET requests from your own code you would need to write your own business logic that would throw some kind of BpmnError and you would then handle that in your process.
The Flowable Process instance does not have the concept of being blocked, and you have to manually to that in your modelling.

Why is async used when using http request?

i can't understand why use Asynchronous and await when fetch data from server
A network request from a client to a server, possibly over a long distance and slow internet can take an eternity in CPU time scales.
If it weren't async, the UI would block until the request is completed.
With async execution the UI thread is free to update a progress bar or render other stuff while the framework or Operating System stack is busy on another thread to send and receive the request your code made.
Most other calls that reach out to the Operating System for files or other resources are async for the same reason, while not all of them are as slow as requests to a remote server, but often you can't know in advance if it will be fast enough to not hurt your frame rate and cause visible disruption or janks in the UI.
await is used to make code after that statement starting with wait is executed only when the async request is completed. async / await is used to make async code look more like sync code to make it easier to write and reason about.
Async helps a lot with scalability and responsiveness.
Using synchronous request blocks the client until a response has been received. As you increase concurrent users you basically have a thread per user. This can create a lot of idle time, and wasted computation. One request gets one response in the order received.
Using asynchronous requests allows the client to receive requests/send responses in any random order of execution, as they are able to be received/sent. This lets your threads work smarter.
Here's a pretty simple and solid resource from Mozilla:
https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/Synchronous_and_Asynchronous_Requests#Asynchronous_request

How to tell for a particular request that all available worker threads are BUSY

I have a high-rate UDP server using Netty (3.6.6-Final) but notice that the back-end servers can take 1 to 10 seconds to respond - i have no control over those, so cannot improve latency there.
What happens is that all handler worker threads are busy waiting for response and that any new request must wait to get processed, over time this response comes very late. Is it possible to discover for a given request that the thread pool is exhausted, so as to intercept the request early and issue a server busy response?
I would use an ExecutionHandler configured with an appropriate ThreadPoolExecutor, with a max thread count and a bounded task queue. By choosing differenr RejectedExecutionHandler policies, you can either catch the RejectedExecutionException to answer with a "server busy", or use a "caller runs policy", in which case the IO worker thread will execute the task and create a push back (but that is what you wanted to avoid).
Either way, an execution handler with a limited capacity is the way forward.

Heroku: I have a request that takes more than 30 seconds and it breaks

I have a request that takes more than 30 seconds and it breaks.
What is the solution for this? I am not sure if I add more dynos this will work.
Thanks
You should probably see the Heroku devcenter article regarding this, as the information will be more helpful, here's a small summary:
To answer the timeout question:
Cedar supports long-polling and streaming responses. Your app has an initial 30 second window to respond with a single byte back to the client. After each byte sent (either recieved from the client or sent by your application) you reset a rolling 55 second window. If no data is sent during the 55 second window your connection will be terminated.
(That is, if you had Cedar instead of Aspen or Bamboo you could send a byte every thirty seconds or so just to trick the system. It might work.)
To answer your dynos question:
Additional concurrency is of no help whatsoever if you are encountering request timeouts. You can crank your dynos to the maximum and you'll still get a request timeout, since it is a single request that is failing to serve in the correct amount of time. Extra dynos increase your concurrency, not the speed of your requests.
(That is, don't bother adding more dynos.)
On request timeouts:
Check your code for infinite loops, if you're doing something big:
If so, you should move this heavy lifting into a background job which can run asynchronously from your web request. See Queueing for details.

NServiceBus delay retries

We need to be able to specify a delay in retrying failed messages. NServiceBus retries more or less instantly up to n times (as configured) before moving the message to error queue.
What I need to be able to do is for a given message type specify that its not to be retried for an arbitrary period of time
I've read the post here:
NServiceBus Retry Delay
but this doesn't give what I'm looking for.
Kind regards
Ben
This isn't supported as of right now. What you can do is let the messages go to the error queue and setup and endpoint to monitor that queue. Your code could then determine the rules for replaying messages. You could use a Saga to achieve this in combination with the Timeout manager.
Typically you'll have some rules around when to replay messages. In NSB 3.0 we have a better way to do this using the FaultManager. This gives you options on where to put failed messages and includes the exception. One of the options is a DB which you could then set up a job to inspect the exception and determine what to do with it.
Lastly a low tech way of getting this is to schedule a job that runs the ReturnToSourceQueue tool periodically to "clean up". We are doing this and including an alert so we don't endlessly cycle messages around.