Why should I build an API with an asynchronous/non-blocking framework?

I have been looking into the Play Framework as a possible candidate for helping me to build a simple API. However, the Django Rest Framework (DRF) also seems to be a pretty strong contender.
As far as I can tell, the DRF does not advertise itself as an asynchronous (or non-blocking) framework like the Play Framework does, but I am interested in whether or not this even matters. The situation that I keep thinking of is sending an email to a user via Mandrill -- I do not want my API to get bogged down waiting for the Mandrill API to tell it whether or not the email was sent.
Thus, I think the question can be summarized like this: is there a benefit from the client's perspective that will result from my building an API with an asynchronous/non-blocking framework like Play over the DRF, or am I missing the point?

I'm a Django REST framework contributor (and user), so my perspective is biased towards that.
Django REST framework is built on Django, which is a synchronous framework for web applications. If you're already using a synchronous framework like Django, having a synchronous API is less of an issue.
Now, just because it is synchronous, that doesn't mean that only a single request can ever be handled at a time. Most web servers that run Django applications can handle multiple requests, and some of them even do it somewhat asynchronously across multiple threads. Usually this isn't actually an issue, as your web server can typically handle many concurrent requests, even if some of them are blocking. And when you have long, blocking calls, you usually don't want them done within the API - you should be delegating them to background workers like Celery or Resque.
This isn't specific to Django; many of the same principles apply to other synchronous frameworks like Rails and ASP.NET MVC. If you have long-running requests, you generally should be delegating work to other processes instead of holding up the request. It's common to return the 202 (Accepted) response code in these cases, which tells the client that the request was received but has not finished processing.
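To make the "delegate and return 202" idea concrete, here is a minimal sketch using only the Python standard library. It is not Celery or Resque; the in-memory queue, the jobs dictionary, and the function names handle_request and send_email are all made up for illustration, and a real setup would use a separate worker process and a persistent broker.

```python
import queue
import threading
import time
import uuid

jobs = {}                   # job_id -> status (hypothetical in-memory store)
work_queue = queue.Queue()  # stand-in for a real task broker


def send_email(recipient):
    """Stand-in for a slow call such as a third-party email API."""
    time.sleep(0.05)


def worker():
    """Background worker: pulls jobs off the queue and runs them."""
    while True:
        job_id, recipient = work_queue.get()
        try:
            send_email(recipient)
            jobs[job_id] = "done"
        finally:
            work_queue.task_done()


threading.Thread(target=worker, daemon=True).start()


def handle_request(recipient):
    """Request handler: enqueue the work and answer 202 straight away."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = "queued"
    work_queue.put((job_id, recipient))
    return 202, {"job_id": job_id, "status": "queued"}


status, body = handle_request("user@example.com")
```

The handler returns before the email is sent; the client can use the job id to check on progress later if the API exposes such an endpoint.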
Now, that doesn't necessarily mean that asynchronous frameworks are useless. In runtimes such as Node.js, most frameworks handle requests asynchronously. It doesn't make sense to use a synchronous framework in these runtimes, so most libraries are built to be asynchronous.
What you choose very much depends on the tools that you are already using.

Regarding the clients connecting to your app, there should be no difference at all whether your server uses asynchronous/non-blocking (ANB) technologies or not. But it may make a big difference in the number of requests your app can handle.
Suppose the following scenario: a request that checks if a FB/Google/etc. access token is valid, then uses it to get the social profile of your user, and then returns something back.
If you are using a blocking HTTP client in your server, then during each of the two HTTP requests the thread serving that request can be blocked for a long time doing nothing.
If you are using a non-blocking HTTP client (like the one Play brings), then between the moment the HTTP request is made and the moment the response comes back, the thread can be used to do something else (e.g. process part of another request).
Note that to solve this "problem" you wouldn't need an ANB framework, just an ANB HTTP client. So you should look more at the kind of operations you will have in your app and check how your chosen framework will deal with them. For example: if your app consists almost entirely of DB CRUD operations and the DB driver is blocking (like JDBC in Java and probably the ones used by Django), it really does not matter much whether the framework is asynchronous or not; you will be blocking most of the time on that specific component.
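A rough illustration of the difference, in Python with asyncio. The two outbound calls are simulated with asyncio.sleep, so fake_http_call and handle_request are hypothetical names and no real network traffic is involved. Five requests, each making two 0.1 s calls, are served on a single thread:

```python
import asyncio
import time


async def fake_http_call(delay):
    """Stand-in for a non-blocking HTTP request (no real network involved)."""
    await asyncio.sleep(delay)


async def handle_request(request_id):
    await fake_http_call(0.1)  # validate the access token
    await fake_http_call(0.1)  # fetch the social profile
    return f"profile for request {request_id}"


async def main():
    start = time.perf_counter()
    # All five requests share one thread; while one waits, another runs.
    results = await asyncio.gather(*(handle_request(i) for i in range(5)))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
```

With a blocking client on a single thread the same work would take about one second in total (5 x 0.2 s); here the waits overlap, so the elapsed time stays close to that of a single request.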
In your email example, Django+Celery will probably do just as well as Play/Akka.

Non-async frameworks usually handle long-running tasks by passing them to some external process (e.g. Resque/DelayedJob/Sidekiq for Rails development).
I just wanted to add that the Mandrill API supports an async parameter for sending emails.
Here is what their docs say:
enable a background sending mode that is optimized for bulk sending. In async mode, messages/send will immediately return a status of "queued" for every recipient. To handle rejections when sending in async mode, set up a webhook for the 'reject' event.
So with async set to true, the call to the API returns immediately, without waiting for all the emails to be sent.
https://mandrillapp.com/api/docs/messages.JSON.html#method-send
(I used the JSON version of the API just as an example.)

The Django community is still working on async support. In the meantime, if you want, you can use the sync_to_async() adapter.
It comes with some limitations and performance penalties, but the community is still working on those.
The link below will help you work with the sync_to_async() adapter:
https://docs.djangoproject.com/en/3.2/topics/async/
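The general idea behind such an adapter is to run blocking code in a worker thread so that the event loop is not held up. The sketch below shows that idea with the standard library's run_in_executor; it is a stand-in for illustration only, not Django's sync_to_async(), and blocking_lookup is a made-up function.

```python
import asyncio
import time


def blocking_lookup(user_id):
    """Stand-in for synchronous code such as an ORM query."""
    time.sleep(0.1)
    return {"id": user_id}


async def lookup(user_id):
    # Run the blocking function in a worker thread so the event loop stays free.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_lookup, user_id)


async def main():
    # The three lookups wait in separate threads instead of one after another.
    return await asyncio.gather(lookup(1), lookup(2), lookup(3))


users = asyncio.run(main())
```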

Related

Micro-service architecture in .NET Core: pattern or library for services to call each other

I am implementing a micro-service architecture for the first time.
Some of my services (.NET Core Web APIs) need to communicate with each other through HTTP requests. For that purpose, I am injecting a wrapper around HttpClient.
But I suspect that I am reinventing the wheel. Among micro-service practitioners, is there a pattern or even a third-party library to solve this problem?
In a micro-service architecture, the most important thing is a clear separation of concerns and application boundaries. Imagine a simple setup with Product and Price micro-services.
An important concept is that each service is the master of its data and owns its own database. In this example,
a client of the 'Product' service will make an HTTP call to the Product API.
the product API will make a call to the Price API to get prices for the products
the product API therefore depends on the Price API to create a response
These are the synchronous parts of the process, generally achieved through HTTP calls across boundaries. You'll also have asynchronous parts of your solution, in this example,
the Price API publishes an event to a bus whenever a price is changed
the product API publishes an event whenever a product is created
There may be one or more subscribers to these events, that will respond and probably call an API to retrieve the changed data.
The critical parts of this are clearly defining your API and message contracts, understanding if things will be async or sync, having the right level of telemetry across the entire architecture to track and understand distributed system behaviour, and keeping everything as independently buildable/testable/deployable components.
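The synchronous and asynchronous parts described above can be sketched together in a few lines of Python. Everything here is hypothetical: the EventBus class is a tiny in-memory stand-in that delivers events synchronously, get_price stands in for an HTTP call to the Price API, and the event name "PriceChanged" is invented for the example.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-memory stand-in for a message bus (hypothetical, synchronous)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Deliver the event to every handler registered for this type.
        for handler in self._subscribers.get(event_type, []):
            handler(payload)


bus = EventBus()

# Price service: owns the price data.
prices = {"sku-1": 10.0}


def get_price(product_id):
    """Stand-in for an HTTP call to the Price API."""
    return prices[product_id]


def change_price(product_id, new_price):
    prices[product_id] = new_price
    bus.publish("PriceChanged", {"product_id": product_id})


# Product service: keeps its own copy, refreshed when an event arrives.
product_prices = {}


def on_price_changed(event):
    product_id = event["product_id"]
    product_prices[product_id] = get_price(product_id)


bus.subscribe("PriceChanged", on_price_changed)
change_price("sku-1", 12.5)
```

The event carries only the product id; the subscriber then calls the owning service to retrieve the changed data, so the Price service remains the master of its data.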
First and foremost, if you're not using containers, start, along with orchestration (both natively supported in Visual Studio, assuming you have Docker, etc. actually installed). Among the many benefits, you can reference your services via hostname without having to worry about ports and different locations for different environments.
As far as actual communication goes, there's not really a magic solution here. HttpClient is what you use, of course, and generally, yes, you want to have a wrapper around that to abstract away the low-level HTTP communication stuff, so the rest of your code can simply call simple methods on that wrapper.
If you aren't using IHttpClientFactory, start. If you already have a wrapper class, you're halfway there, and with that, not only do you get efficient management of HttpMessageHandlers so you don't exhaust your server's connection pool, but you can also use the Polly integration to handle transient HTTP errors and even do retry policies, circuit breakers, etc. for your microservice connections.
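For readers unfamiliar with retry policies, this is roughly what one does, sketched in Python rather than with Polly. The names TransientError, call_with_retry, and flaky_price_lookup are invented for the example, and the delays are kept tiny so it runs quickly.

```python
import time


class TransientError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


def call_with_retry(operation, attempts=3, base_delay=0.01):
    """Call operation(), retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the failure
            time.sleep(base_delay * 2 ** attempt)


calls = {"count": 0}


def flaky_price_lookup():
    """Stand-in for a call to another service that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("price service unavailable")
    return 9.99


price = call_with_retry(flaky_price_lookup)
```

A circuit breaker goes one step further and stops calling the failing service altogether for a while; that part is not shown here.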
Finally, there is the Refit library which can make things a tad more straight-forward. I find it to have more use with huge third-party APIs like Facebook, Google, etc., though. Since microservices should by design be simple, you're probably not saving much code over just having your own wrapper class. Regardless, the way it works is that you define an interface that represents the API, and then Refit uses that to actually make appropriate requests. It's kind of like a wrapper class for free, but you still need to create the interface.

Does LoadRunner support JavaScript execution in response page?

Does LoadRunner support JavaScript execution once the response is received, unlike JMeter?
In JMeter, if the response page contains JavaScript or AJAX calls, they are not processed by JMeter. Is this supported by LoadRunner or not?
Yes: the TruClient Virtual User type, versions 11.x and later.
Unless your code is truly asynchronous, where separate threads are kicking off JavaScript and the server requests arrive in a substantially different sequence every time, you really don't need JavaScript processing. Most of the AJAX clients out there are less 'A' and more 'S'ynchronous in their behavior when you look at the sequence of calls for a given business process across multiple recording sessions. Of the remainder that are truly 'A'synchronous in behavior, a substantial majority of the 'A' calls are to third-party components that would not be included in your performance test anyway. (Can you imagine trying to coordinate your performance test with people at Google because your app includes Google Maps?)
So, back to your core question. Yes, LoadRunner does include a Virtual User type which supports JavaScript processing, the TruClient Virtual User. You could also use a GUI Virtual User or a Citrix|RDP Virtual User if you wanted to run full browsers. To your larger question: do you really need a virtual user which processes JavaScript? Look carefully at your request sequences across multiple recording sessions to understand whether your business process is truly asynchronous in nature (with your servers and your code) or is synchronous in behavior with your application.

NServiceBus Sagas and REST API Integration best-practices

What is the most sensible approach to integrate/interact NServiceBus Sagas with REST APIs?
The scenario is as follows,
We have a load balanced REST API. Depending on the load we can add more nodes.
REST API is a wrapper around a DomainServices API. This means the API can be consumed directly.
We would like to use Sagas for workflow and implement NServiceBus Distributor to scale-out.
The question is: if we use the REST API from Sagas, the actual processing happens in the API farm. This in a way defeats the purpose of implementing the distributor pattern.
On the other hand, using the DomainServices API directly from Sagas allows processing locally within worker nodes. With this approach we will have to maintain API assemblies in multiple locations, but the throughput could be higher.
I am trying to understand the best approach. Personally, I’d prefer to consume the API (if readily available), but this could introduce chattiness to the system and could take longer to complete compared to in-process calls.
A typical sequence could be similar to publishing an online advertisement,
Advertiser submits a new advertisement request via a web application.
Web application invokes the relevant API endpoint and sends a command message.
Command message initiates a new publish advertisement Saga instance.
Saga sends a command to validate caller permissions (in process/out of process API call).
Saga sends a command to validate the advertisement data (in process/out of process API call).
Saga sends a command to the fraud service (third party service).
Once the content and fraud verifications are successful, Saga sends a command to the billing system.
Saga invokes an API call to save ad details (in process/out of process API call).
And this goes on until the advertisement is expired, there are a number of retry and failure condition paths.
After a number of design iterations we came up with the following guidelines,
Treat REST API layer as the integration platform.
Assume API endpoints are capable of abstracting fairly complex micro work-flows. Micro work-flows are operations that execute in a single burst (not interruptible) and complete within a short time span (<1 second).
Assume API farm is capable of serving many concurrent requests and can be easily scaled-out.
Favor synchronous invocations over asynchronous message based invocations when the target operation is fairly straightforward.
When asynchronous processing is required use a single message handler and invoke API from the handlers. This will delegate work to the API farm. This will also eliminate the need for a distributor and extra hardware resources.
Avoid Sagas unless the business work-flow contains multiple transactions, compensation logic and resumes. Tests reveal that Sagas do not perform well under load.
Avoid consuming DomainServices directly from a message handler. This will do the work locally and also introduces a deployment hassle by distributing business logic.
Happy to hear your thoughts.
You are right on with identifying that you will need Sagas to manage workflow. I'm willing to bet that your Domain hooks up to a common database. If that is true then it will be faster to use your Domain directly and remove the serialization/network overhead. You will also lose the ability to easily manage the transactions at the database level.
Assuming you are directly calling your Domain, the performance becomes a question of how the Domain performs. You may take steps to optimize the database, drive down distributed transaction costs, shard the data, etc. You may end up using the Distributor to have multiple Saga processing nodes, but it sounds like you have some more testing to do once a design is chosen.
Generically speaking, we use REST APIs to model the commands as resources (via POST) to allow interaction with NSB from clients that don't have direct access to messaging. This is a potential solution to get things onto NSB from your web app.

Streaming API vs Rest API?

The canonical example here is Twitter's API. I understand conceptually how the REST API works: essentially it's just a query to their server for your particular request, for which you then receive a response (JSON, XML, etc.). Great.
However, I'm not exactly sure how a streaming API works behind the scenes. I understand how to consume it. For example, with Twitter you listen for a response. From the response you listen for data, and the tweets come in chunks. You build up the chunks in a string buffer and wait for a line feed, which signifies the end of a tweet. But what are they doing to make this work?
Let's say I had a bunch of data and I wanted to set up a streaming API locally for other people on the net to consume (just like Twitter). How is this done, and with what technologies? Is this something Node.js could handle? I'm just trying to wrap my head around what they are doing to make this thing work.
Twitter's streaming API is essentially a long-running request that's left open; data is pushed into it as and when it becomes available.
The repercussion of that is that the server has to be able to deal with lots of concurrent open HTTP connections (one per client). A lot of existing servers don't manage that well. For example, Java servlet engines assign one thread per request, which can (a) get quite expensive and (b) quickly hit the normal max-threads setting and prevent subsequent connections.
As you guessed the Node.js model fits the idea of a streaming connection much better than say a servlet model does. Both requests and responses are exposed as streams in Node.js, but don't occupy an entire thread or process, which means that you could continue pushing data into the stream for as long as it remained open without tying up excessive resources (although this is subjective). In theory you could have a lot of concurrent open responses connected to a single process and only write to each one when necessary.
If you haven't looked at it already the HTTP docs for Node.js might be useful.
I'd also take a look at technoweenie's Twitter client to see what the consumer end of that API looks like with Node.js, the stream() function in particular.
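The consumer side described in the question (buffer the chunks, split on line feeds) can be sketched like this in Python. The chunk contents are made up, and iter_messages is a hypothetical helper; a real client would read the chunks from an open HTTP response instead of a list.

```python
import json


def iter_messages(chunks):
    """Yield complete newline-delimited messages from arbitrary chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # A chunk may end mid-message, so only emit what precedes a line feed.
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            line = line.strip()
            if line:  # skip blank keep-alive lines
                yield line


# Made-up chunks: message boundaries do not line up with chunk boundaries.
chunks = [
    '{"id": 1, "te',
    'xt": "hello"}\n{"id"',
    ': 2, "text": "world"}\n',
    "\n",
]

messages = [json.loads(line) for line in iter_messages(chunks)]
```

On the server side the mirror image applies: keep the response open and write one line per message whenever new data becomes available.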

Is having a function call block a bad design process?

I'm writing an API which is used to receive some data from another application. Currently the function is designed to block until data is received. In my mind this forces developers using the API to use multithreading or some sort of multi-process design. So is it better for a function to block, or to return a null and then sleep for a few milliseconds before trying again?
Note the other application may not have any data to send through the API for an unknown period of time.
The API is written in C++
Why not use a callback?
You could define the API to allow the user to pass an optional timeout value. If the timeout is not specified, then the API function waits indefinitely, much like how select() works.
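A sketch of that optional-timeout shape, written in Python for brevity even though the API in question is C++. The Receiver class and its deliver/receive methods are hypothetical; the same interface could be built in C++ with a condition variable and a timed wait.

```python
import queue


class Receiver:
    """Hypothetical receive API with an optional timeout."""

    def __init__(self):
        self._incoming = queue.Queue()

    def deliver(self, data):
        """Called when the other application sends something."""
        self._incoming.put(data)

    def receive(self, timeout=None):
        """Block until data arrives; with a timeout, return None if none came."""
        try:
            return self._incoming.get(timeout=timeout)
        except queue.Empty:
            return None


receiver = Receiver()
first_try = receiver.receive(timeout=0.05)   # nothing has been sent yet
receiver.deliver("payload")
second_try = receiver.receive(timeout=0.05)  # data is already waiting
```

Callers that are happy to block pass no timeout; callers that need to stay responsive pass a small one and check for None.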
Consider another option: use an async transaction -> issue a request and provide a callback address with a ticket id. When the response is available, the service end-point calls back your application with the ticket id and, of course, the result ;-)
You should avoid blocking as much as you possibly can.
As you say:
Note the other application may not have any data to send through the API for an unknown period of time.
In this case, using a synchronous interface ties up resources unnecessarily.
You haven't said what language this is, but it sounds like your API is listening or checking for some event, and the users of the API are either blocking or polling your API to determine if the event happened?
Is it possible to use a callback? Users of the API would register for notifications of the event happening, and when your library detects the event it will use the callback to notify all listeners.
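A callback-based version might look like the following sketch, again in Python rather than C++. DataSource, on_data, and data_arrived are invented names; in C++ the listeners would typically be std::function objects or function pointers.

```python
class DataSource:
    """Hypothetical API object that notifies listeners when data arrives."""

    def __init__(self):
        self._listeners = []

    def on_data(self, callback):
        """Register a function to be called whenever data arrives."""
        self._listeners.append(callback)

    def data_arrived(self, data):
        """Called by the library when the other application sends data."""
        # Iterate over a copy so a callback may register further listeners.
        for callback in list(self._listeners):
            callback(data)


received = []
source = DataSource()
source.on_data(received.append)
source.on_data(lambda data: received.append(data.upper()))
source.data_arrived("ping")
```

Neither the library nor its users have to block or poll here; the cost is that the callbacks run on whatever thread detects the event, which the API documentation should spell out.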
When your application calls the OS API function read(), do you expect it to block? Of course you do, at least by default. In some circumstances, ioctls allow a programmer to change the behavior to be asynchronous, which is particularly common in network applications.
You've shed very little light on what your API is about, so consider:
Does it make sense that an API user would want to be blocked? That is, is there little to do until it returns?
If you were writing an application for the API, what would you expect it to do? You should definitely write a few sample applications for your own education, as well as to document the API.
Is there any reason why the API user would not multithread (or fork, etc.) requests to the API?
If you want a reusable solution you could apply the Asynchronous Design 'Pattern' which is common in .NET but can also be implemented in C++ as demonstrated in this CodeProject project.
There's nothing wrong with providing both synchronous and asynchronous calls to the same feature in the interface.
Personally I would only go these lengths if I need to service multiple requests (in which case you can queue 'BeginOperation' requests for example), or there are many potentially asynchronous operations in the interface (and I want a standardised, flexible pattern). If you can only handle one request at a time a time-out is usually sufficient.