Handling long blocking API calls in ASP.NET Core - asp.net-core

I'm building an API that will handle HTTP calls that trigger a significant amount of work.
For example, one controller has an action (a POST request) that takes data in the body and performs some processing that should start immediately but will take about 1 to 2 minutes to complete.
I'm using CQRS with MediatR, and inside this POST request I call a command to handle the processing.
Taking this into consideration, I want the POST request to launch the command and then return an Ok to the user even though the command is still running in the background. The user will be notified via email once everything is done.
Does someone know any best practice to accomplish this?
Even though I'm using MediatR to send the request to the handler, the call is not executed in parallel.
[HttpPost]
public async Task<IActionResult> RequestReports([FromBody] ReportsIntent reportsIntent)
{
    await _mediator.Send(new GetEndOfWeekReports.Command(companyId, clientId, reportsIntent));
    return Ok();
}

Does someone know any best practice to accomplish this?
Yes (as described on my blog).
You need what I call the "basic distributed architecture". Specifically, you need at least:
A durable queue. "Durable" here means "on disk; not in-memory".
A backend processor.
So the web API will serialize all the necessary data from the request into a queue message, place that message on the queue, and then return to the client.
Then a backend processor retrieves work from that queue and does the actual work. In your case, this work concludes with sending an email.
In a way, this is kind of like MediatR, but explicitly out of process: the on-disk queue and a separate process take the place of your handler.
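For illustration, here's a minimal sketch of that shape using Azure Storage Queues and an ASP.NET Core BackgroundService; the queue name, connection string, and GenerateReportsAndEmailAsync are placeholders, ReportsIntent is the type from the question, and any durable queue (RabbitMQ, SQS, MSMQ) follows the same pattern:

using System;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;
using Azure.Storage.Queues;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Hosting;

[ApiController]
[Route("api/reports")]
public class ReportsController : ControllerBase
{
    private readonly QueueClient _queue = new("UseDevelopmentStorage=true", "reports");

    // Web API side: serialize the request into a queue message and return at once.
    [HttpPost]
    public async Task<IActionResult> RequestReports([FromBody] ReportsIntent reportsIntent)
    {
        await _queue.SendMessageAsync(JsonSerializer.Serialize(reportsIntent));
        return Accepted(); // or Ok(); the command now runs out of process
    }
}

// Backend processor: pulls messages off the durable queue and does the real work.
public class ReportProcessor : BackgroundService
{
    private readonly QueueClient _queue = new("UseDevelopmentStorage=true", "reports");

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            var messages = (await _queue.ReceiveMessagesAsync(cancellationToken: stoppingToken)).Value;
            foreach (var message in messages)
            {
                var intent = JsonSerializer.Deserialize<ReportsIntent>(message.MessageText);
                await GenerateReportsAndEmailAsync(intent); // the 1-2 minute job, ending with the email
                await _queue.DeleteMessageAsync(message.MessageId, message.PopReceipt, stoppingToken);
            }
            await Task.Delay(TimeSpan.FromSeconds(1), stoppingToken); // idle back-off
        }
    }

    // Placeholder for the actual report generation and notification email.
    private Task GenerateReportsAndEmailAsync(ReportsIntent intent) => Task.CompletedTask;
}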

I would handle API calls that take some time separately from calls that can be completed directly.
When an API call takes time, I queue it up and process it on the backend.
A typical API call can look something like this:
POST http://api.example.com/orders HTTP/1.1
Host: api.example.com

HTTP/1.1 201 Created
Date: Fri, 05 Oct 2012 17:17:11 GMT
Content-Length: 123
Content-Type: application/json
Location: http://poll.example.com/orders/59cc233e-4068-4d4a-931d-cd5eb93f8c52.xml
ETag: "c180de84f951g8"

{ "uri": "http://poll.example.com/orders/59cc233e-4068-4d4a-931d-cd5eb93f8c52.xml" }
The returned URL is unique; the client can then poll/query it to find out the status of the job.
When the client queries this URL while the job is still in progress, the response can look something like this (the exact fields are up to you):
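HTTP/1.1 200 OK
Content-Type: application/json

{ "status": "processing" }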
And when the job is later done, the result would be something like:
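HTTP/1.1 200 OK
Content-Type: application/json

{ "status": "completed", "dataurl": "http://api.example.com/reports/59cc233e-4068-4d4a-931d-cd5eb93f8c52" }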
Where dataurl is a link to the result/report that the client can then download.

Related

How does calling a 3rd-party API work in a loop?

I'm building a project that sends data to Zapier's webhook. The data can range from hundreds to thousands, even hundreds of thousands, of records, and they are processed one by one in a loop.
Basically, what I did was use Laravel's chunked results with a pool of HTTP requests inside each chunk. So it looks like this...
**Note that I chunked the results by 25 to avoid resource exhaustion when sending the data to Zapier.
If you have a better suggestion besides chunking, that would be great.**
Model::where(...someCondition)->chunk(25, function ($models) {
    Http::pool(function (Pool $pool) use ($models) {
        foreach ($models as $key => $model) {
            $pool->as($key)->withBody(json_encode($model), 'json')->post('https://the-zapier/update/hook/endpoint/...');
        }
        return $pool;
    });
});
Now, Zapier receives the request body and processes it, and of course it takes time to finish.
Here's what I understand:
Inside the loop, the HTTP request is sent, but Zapier hasn't finished the specific task yet.
Now, I'm curious: does the loop proceed to the next item AFTER it sends the data to the endpoint, BUT while Zapier hasn't yet finished the task for this particular request?
OR
Does it wait for Zapier to finish the particular task before it proceeds to the next iteration?

What exactly does the expected HTTP response for a Reverse-AJAX request look like?

I'm trying to implement a simple Web Service (running on an Arduino board using an Ethernet shield) that can provide (push) information to a subscribed client by means of Reverse-AJAX.
The web service hosts a single web page that presents information from a (2D-LIDAR) sensor connected to that server board. Whenever the sensor output changes (very frequently and rapidly) the clients viewing that page should be instantly updated. For this application Reverse-AJAX / AJAX Push seems to be the option of choice, however I'm struggling to get the server part working.
This is what's in my aforementioned web page to "listen" for updates:
var xhr = new XMLHttpRequest();
xhr.multipart = true;
xhr.open('GET', 'push', true);
xhr.onreadystatechange = function() {
    if (xhr.readyState == 4) {
        processEvents(window.JSON.parse(xhr.responseText));
    }
};
xhr.send(null);
I'd like to keep the XmlHttpRequest running forever and have it call the processEvents function whenever a chunk of (JSON) data comes in from the server side. However I'm not sure what the server response, especially the HTTP response header should look like to make this work as expected.
Whenever I have the server send an HTTP response header like this
HTTP/1.1 200 OK\r\n
Connection: keep-alive\r\n
Content-Length: 100\r\n
Content-Type: text/json\r\n
\r\n
the XmlHttpRequest finishes after receiving exactly one "chunk" of data. I also tried leaving out the "Content-Length" header and using "Content-Type: multipart/mixed; boundary=..." or "Transfer-Encoding: chunked" instead, but neither ever fired processEvents, supposedly because the browser was waiting for the response to complete, whatever that means.
I'm therefore looking for a working example of such a HTTP Response to an AJAX-Push request. What does a HTTP Response generally need to look like to be accepted by the indefinitely running XmlHttpRequest and to fire processEvents whenever new data arrives?
Btw. I tried those things using Firefox 64.0.
If you are looking for low-latency HTTP-based communication, you should take a look at WebSockets. XMLHttpRequest with the POST method will have a high latency of about 50 milliseconds; trust me, I tried to develop an RGB controller based on a rainbow color picker before, and neither synchronous nor asynchronous POST worked well for me - the script just sat there waiting for a response.
Now, to answer your question specifically: download Postman, a tool that lets you simulate any HTTP method, request, and header you wish. It also gives you the code to implement the request in many languages. And don't forget F12 in Chrome > the Network tab; that way you can check how your HTTP requests are being handled.
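For reference, if you do go down the "Transfer-Encoding: chunked" road, each push is framed on the wire as a hexadecimal length followed by the payload, and a zero-length chunk terminates the response. A response pushing two JSON events might look like this (the payloads are illustrative):
HTTP/1.1 200 OK\r\n
Content-Type: application/json\r\n
Transfer-Encoding: chunked\r\n
\r\n
1b\r\n
{"event":"lidar","dist":42}\r\n
1b\r\n
{"event":"lidar","dist":43}\r\n
0\r\n
\r\n
Note that the snippet in the question only reacts to readyState == 4, which fires once the whole response has completed; to process chunks as they arrive you would handle readyState == 3 (or the onprogress event) and parse whatever new data has been appended to responseText.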

Best practice for initial return in a REST-like image upload api endpoint?

When sending a file, e.g. an image, over HTTP to an API, how should the server respond?
Examples:
respond as soon as file is written to disk
respond only when file is written, processed, checksummed, thumbnailed, watermarked etc.
respond as fast as possible with a link to the resource (even if it's a 404 for a few moments afterwards)
add a 'task' endpoint and respond instantly with a task ID to track the progress before data transfer & processing (eventually including path to resource)
Edit: Added one idea from an answer to a similar question: rest api design and workflow to upload images.
The client doesn't know about disks, processing, checksumming, thumbnailing, etc.
The options then are pretty simple. You either want to return the HTTP request as quickly as possible, or you want the client to wait until you know the operation was successful.
If you want the client to wait, return 201 Created. If you want to return as quickly as possible, return 202 Accepted.
Both are acceptable designs. You should let your own requirements dictate which is better for your case. I would say it's a good idea to default to waiting until the HTTP request is guaranteed to have succeeded, and to use 202 Accepted only when that is a specific requirement.
You could also let the client decide with a Prefer header:
Prefer: respond-async, wait=100
See https://www.rfc-editor.org/rfc/rfc7240#section-4.3
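A minimal sketch of honoring that header in an ASP.NET Core controller might look like this; IUploadJobQueue, IImageProcessor, StoredImage, and the stub Get actions are hypothetical stand-ins for your own services and endpoints:

using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;

public interface IUploadJobQueue { Task<string> EnqueueAsync(IFormFile file); }
public interface IImageProcessor { Task<StoredImage> ProcessAsync(IFormFile file); }
public record StoredImage(string Id);

[ApiController]
[Route("images")]
public class ImagesController : ControllerBase
{
    private readonly IUploadJobQueue _jobs;
    private readonly IImageProcessor _processor;

    public ImagesController(IUploadJobQueue jobs, IImageProcessor processor)
        => (_jobs, _processor) = (jobs, processor);

    [HttpPost]
    public async Task<IActionResult> Upload(IFormFile image)
    {
        // Client opted in to async handling: queue the work and answer 202 right away.
        if (Request.Headers["Prefer"].ToString().Contains("respond-async"))
        {
            var taskId = await _jobs.EnqueueAsync(image);
            Response.Headers["Preference-Applied"] = "respond-async";
            return AcceptedAtAction(nameof(GetTask), new { id = taskId }, null);
        }

        // Otherwise wait until the file is written, checksummed, thumbnailed, etc.
        var stored = await _processor.ProcessAsync(image);
        return CreatedAtAction(nameof(GetImage), new { id = stored.Id }, stored);
    }

    [HttpGet("tasks/{id}")]
    public IActionResult GetTask(string id) => Ok();  // task-status endpoint (stub)

    [HttpGet("{id}")]
    public IActionResult GetImage(string id) => Ok(); // resource endpoint (stub)
}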

Yielding Data to the Client Early in MVC 4.0 Web Api

Using MVC 4.0 Web Api I have a long running DB query which is running asynchronously and, before it completes, the controller completes its "Get" or "Post" operation. This is all as expected/wanted.
However, although it looks like MVC has sent the data back to the client, nothing actually gets transmitted until the long-running query completes.
Is there any way I can force an early "yield" of the data to the client or even to create and transmit a new response?
The point is that I don't need the results from the query - I just want to fire and forget, and it's important to return a response (saying the query has started) to the client straight away.
If it is fire-and-forget and you do not need to send the result to the client, simply start the task
Task.Factory.StartNew(() => db.DoThatQueryThatBroughtDownChicago());
and return a string, a JSON result saying "Task started".
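Putting that together, the Web API action might look something like this (db.DoThatQueryThatBroughtDownChicago is the long-running call from the question; the controller shape is an illustrative sketch):

using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using System.Web.Http;

public class QueriesController : ApiController
{
    [HttpPost]
    public HttpResponseMessage StartQuery()
    {
        // Fire and forget: StartNew schedules the query on the thread pool
        // and the action returns without awaiting it.
        Task.Factory.StartNew(() => db.DoThatQueryThatBroughtDownChicago());

        // Tell the client the query has started.
        return Request.CreateResponse(HttpStatusCode.Accepted, new { status = "Task started" });
    }
}

One caveat: work started this way lives and dies with the web process, so if the application pool recycles before the query finishes, the work is silently lost; that is exactly the problem the durable-queue approach described earlier avoids.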

In the new ASP.NET Web API, how do I design for "Batch" requests?

I'm creating a web API based on the new ASP.NET Web API. I'm trying to understand the best way to handle people submitting multiple data-sets at the same time. If they have 100,000 requests it would be nice to let them submit 1,000 at a time.
Let's say I have a create new Contact method in my Contacts Controller:
public string Put(Contact _contact)
{
    // add new _contact to repository
    repository.Add(_contact);

    // return success
    return "success";
}
What's the proper way to allow users to "Batch" submit new contacts? I'm thinking:
public string BatchPut(IEnumerable<Contact> _contacts)
{
    foreach (var contact in _contacts)
    {
        repository.Add(contact);
    }
    return "success";
}
Is this a good practice? Will this parse a GET request with a JSON array of Contacts (assuming they are correctly formatted)?
Lastly, any tips on how best to respond to Batch requests? What if 4 out of 300 fail?
Thanks a million!
When you PUT a collection, you are either inserting the whole collection or replacing an existing collection as if it were a single resource. It is very similar to GET, DELETE or POST on a collection. It is an atomic operation. Using it as a substitute for individual calls to PUT a contact may not be very RESTful (but that is really open for debate).
You may want to look at HTTP pipelining and send multiple PutContact requests over the same socket. With each request you can return a standard HTTP status for that single request.
I implemented batch updates in the past with SOAP and we encountered a number of unforeseen issues when the system was under load. I suspect you will run into the same issues if you don't pay attention.
For example, the database might time out in the middle of the batch update, and then all hell broke loose in terms of failures, reliability, transactions, etc. And the poor client had to figure out what was actually updated and try again.
When there were too many records to update, the HTTP request would time out because we took too long. That opened another can of worms.
Another concern was how much data we would accept during the update. Was 10 MB of contacts enough? Perhaps 1 MB? Larger buffers have numerous implications in terms of memory usage and security.
Hence my suggestion to look at HTTP pipelining.
Update
My suggestion would be to handle batch creation of contacts as an async process. Just assume that a "job" is the same as a "batch create" process. So the service may look as follows:
public class JobService
{
    // Post
    public void Create(CreateJobRequest job)
    {
        // 1. Create job in the database with status "pending"
        // 2. Save job details to disk (or S3)
        // 3. Submit the job to MSMQ (or SQS)
        // 4. For 20 seconds, poll the database to see if the job completed
        // 5. If the job completed, return 201 with a URI to the "Get" method below
        // 6. If not, return 202 (aka the request was accepted for processing, but has not completed)
    }

    // Get
    public Job Get(string id)
    {
        // 1. Fetch the job from the database
        // 2. Return the job if it exists or 404
    }
}
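A rough sketch of steps 4-6, assuming the Post action actually returns an HttpResponseMessage and _db is a hypothetical data accessor:

public HttpResponseMessage Create(CreateJobRequest job)
{
    var jobId = _db.CreateJob(job); // steps 1-3 elided: persist as "pending", save details, enqueue
    var deadline = DateTime.UtcNow.AddSeconds(20);
    while (DateTime.UtcNow < deadline) // step 4: poll for up to 20 seconds
    {
        if (_db.GetJobStatus(jobId) == "completed")
        {
            // step 5: 201 with a Location header pointing at the Get method
            var created = Request.CreateResponse(HttpStatusCode.Created);
            created.Headers.Location = new Uri(Url.Link("DefaultApi", new { id = jobId }));
            return created;
        }
        Thread.Sleep(500); // simple sketch; production code would poll asynchronously
    }
    return Request.CreateResponse(HttpStatusCode.Accepted); // step 6
}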
The background process that consumes stuff from the queue can update the database or alternatively perform a PUT to the service to update the status of Job to running and completed.
You'll need another service to navigate through the data that was just processed, address errors and so forth.
Your background process may need to be tolerant of validation errors. If not, or if your service does the validation (assuming you are not doing database calls etc. for which response times cannot be guaranteed), you can return a structure like CreateJobResponse that contains enough information for your client to fix the issue and resubmit the request. If you have to do validation that is time consuming, do it in the background process, mark the job as failed, and update the job with the information that will allow the client to fix the errors and resubmit the request. This assumes that the client can do something with the fact that the job failed.
If the Create method breaks the job request into many smaller "jobs", you'll have to deal with the fact that the operation may not be atomic, which poses numerous challenges for monitoring whether the jobs completed successfully.
A PUT operation is supposed to replace a resource. Normally you do this against a single resource, but when doing it against a collection, that would mean you replace the original collection with the set of data passed. Not sure if you mean to do that, but I am assuming you are just updating a subset of the collection, in which case a PATCH method would be more appropriate.
Lastly, any tips on how best to respond to Batch requests? What if 4 out of 300 fail?
That is really up to you. There is only a single response, so you can send a 200 OK or a 400 Bad Request and put the details in the body.
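For example (the exact shape is entirely up to you), the body could report per-item outcomes so the client knows which 4 of the 300 to fix and resubmit:

{
  "processed": 300,
  "succeeded": 296,
  "failed": [
    { "index": 12, "error": "email address is invalid" },
    { "index": 57, "error": "duplicate contact" },
    { "index": 121, "error": "name is required" },
    { "index": 284, "error": "phone number is malformed" }
  ]
}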