Best way to handle a synchronous call - lagom

Say your ReadProcessor needs to insert records using JDBC, or you need to integrate with a SOAP layer via a JAXWS call.
What is the best way to handle synchronous calls on Lagom, a platform that is asynchronous by design?

In contrast to Vert.x, which provides dedicated facilities for handling blocking calls, Lagom does not seem to provide such an integrated feature.
According to its documentation (which uses JDBC as the example), you have to create your own handling mechanism, which internally creates threads to run the blocking code on.
So the answer would be "do it yourself": create executors and runnables/callables, and use Futures to build your own non-blocking wrapper around the blocking calls.
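To illustrate the "do it yourself" wrapper, here is a minimal sketch of the pattern. It is written in Python rather than against Lagom's Java API, so treat it as a language-neutral illustration only. The function insert_record_blocking is a hypothetical stand-in for the JDBC or JAX-WS call, and the pool size is an arbitrary assumption.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Dedicated pool so blocking calls never occupy the event-loop thread.
# The pool size is an arbitrary value chosen for illustration.
blocking_pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix="blocking-io")


def insert_record_blocking(record: dict) -> int:
    """Hypothetical stand-in for a JDBC insert or a JAX-WS call."""
    time.sleep(0.05)     # simulates time spent waiting on the remote system
    return len(record)   # pretend this is the number of columns written


async def insert_record(record: dict) -> int:
    """Non-blocking wrapper: run the blocking call on the dedicated pool."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(blocking_pool, insert_record_blocking, record)


async def main() -> list:
    records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b", "extra": True}]
    # Both inserts run concurrently on pool threads; the caller only awaits.
    return await asyncio.gather(*(insert_record(r) for r in records))
```

Calling asyncio.run(main()) returns one result per record. The point of the separate pool is that a slow database only exhausts that pool, not the threads that serve everything else.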

Related

Why should I build an API with an asynchronous/non-blocking framework?

I have been looking into the Play Framework as a possible candidate for helping me to build a simple API. However, the Django Rest Framework (DRF) also seems to be a pretty strong contender.
As far as I can tell, the DRF does not advertise itself to be an asynchronous (or non-blocking) framework like the Play Framework does, but I am interested in whether or not this even matters. The situation that I keep thinking of is sending an email to a user via Mandrill -- I do not want my API to get bogged-down waiting for the Mandrill API to tell it whether or not the email was sent.
Thus, I think the question can be summarized like this: is there a benefit from the client's perspective that will result from my building an API with an asynchronous/non-blocking framework like Play over the DRF, or am I missing the point?
I'm a Django REST framework contributor (and user), so my perspective is biased towards that.
Django REST framework is built on Django, which is a synchronous framework for web applications. If you're already using a synchronous framework like Django, having a synchronous API is less of an issue.
Now, just because it is synchronous, that doesn't mean that only a single request can ever be handled at a time. Most web servers that host Django applications can handle multiple requests, and some of them even do so somewhat asynchronously across multiple threads. Usually this isn't actually an issue, as your web server can typically handle many concurrent requests, even if some of them are blocking. And when you have long, blocking calls you usually don't want them done within the API; you should be delegating them to background workers like Celery or Resque.
This isn't specific to Django; many of the same principles apply to other synchronous frameworks like Rails and ASP.NET MVC. If you have long-running requests, you generally should be delegating work to other processes instead of holding up the request. It's common to use the 202 response code for these cases.
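A rough sketch of the "hand it off and answer 202" idea, using only the Python standard library instead of Celery or Resque. The handler name handle_send_email, the in-memory queue and the results dictionary are all assumptions made for illustration; a real setup would use a proper task queue and a durable store.

```python
import queue
import threading
import uuid

jobs = queue.Queue()   # work handed off to the background worker
results = {}           # job_id -> status, updated by the worker


def worker() -> None:
    """Background worker: drains the queue outside the request cycle."""
    while True:
        job_id, payload = jobs.get()
        if job_id is None:            # sentinel used to stop the worker
            jobs.task_done()
            break
        # Pretend to send the email; a real worker would call the mail API here.
        results[job_id] = f"sent to {payload['to']}"
        jobs.task_done()


def handle_send_email(payload: dict) -> tuple:
    """Hypothetical request handler: enqueue the work and answer 202 right away."""
    job_id = str(uuid.uuid4())
    results[job_id] = "queued"
    jobs.put((job_id, payload))
    return 202, {"job_id": job_id, "status": "queued"}


threading.Thread(target=worker, daemon=True).start()
```

The client gets the job id back immediately and can poll for the outcome later, which is what the 202 status is meant to signal.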
Now, that doesn't necessarily mean that asynchronous frameworks are useless. In runtimes such as Node.js, most frameworks handle requests asynchronously. It doesn't make sense to use a synchronous framework in these languages, so most libraries are built to be asynchronous.
What you choose very much depends on the tools that you are already using.
Regarding the clients connecting to your app there should be no difference at all if your server uses asynchronous/non-blocking (ANB) technologies or not. But it may make a lot of difference in the number of requests your app can handle.
Suppose the following scenario: a request that checks if a FB/Google/etc access token is valid, and then uses it to get the social profile of your user and then returns something back.
If you are using a blocking HTTP client in your server, then during each of the two HTTP requests the thread serving that request can be blocked for a long time doing nothing.
If you are using a non-blocking HTTP client (like the one Play brings), then while the HTTP request is in flight and the response has not yet come back, the thread can be used to do something else (e.g. process part of another request).
Note that to solve this "problem" you wouldn't need an ANB framework, just an ANB HTTP client. So you should look more at the kind of operations you will have in your app and check how your chosen framework deals with them. For example: if your app consists almost entirely of DB CRUD operations and the DB driver is blocking (like JDBC in Java and probably the ones used by Django), it really does not matter much whether the framework is asynchronous or not; you will be blocking most of the time on that specific component.
In your email example, Django+Celery will probably do just as well as Play/Akka.
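To make the token/profile scenario concrete, here is a small sketch with Python's asyncio. There is no real network I/O: fake_http_get is a made-up stand-in that just sleeps, and the provider.example URLs are placeholders. It only shows that several requests, each making two outbound calls, can be interleaved on one thread.

```python
import asyncio
import threading


async def fake_http_get(url: str, delay: float) -> dict:
    """Stand-in for a non-blocking HTTP client call (no real network I/O)."""
    await asyncio.sleep(delay)  # while waiting, the thread runs other tasks
    return {"url": url, "thread": threading.get_ident()}


async def handle_request(user: str) -> dict:
    # Two sequential outbound calls, as in the scenario above.
    token = await fake_http_get(f"https://provider.example/validate/{user}", 0.05)
    profile = await fake_http_get(f"https://provider.example/profile/{user}", 0.05)
    return {"user": user, "threads": {token["thread"], profile["thread"]}}


async def serve(users: list) -> list:
    # All requests are interleaved on the single event-loop thread.
    return await asyncio.gather(*(handle_request(u) for u in users))
```

Running asyncio.run(serve(["ann", "bob", "cy"])) handles all three users on the calling thread. With a blocking client, the same work would need one thread per in-flight request.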
Non-async frameworks usually handle long-running tasks by passing them to some external process (e.g. Resque/DelayedJob/sidekiq for Rails development).
I just wanted to add that the Mandrill API supports an async parameter for sending emails.
Here is what their docs say:
enable a background sending mode that is optimized for bulk sending. In async mode, messages/send will immediately return a status of "queued" for every recipient. To handle rejections when sending in async mode, set up a webhook for the 'reject' event.
So with async set to true, the call to the API returns immediately, without waiting for all the emails to be sent.
https://mandrillapp.com/api/docs/messages.JSON.html#method-send
(I took JSON version of the API just as example)
The Django community is working on async support; in the meantime you can use the sync_to_async() adapter.
It comes with some limitations and performance penalties, but the community is still working on it.
The link below will help you work with the sync_to_async() adapter:
https://docs.djangoproject.com/en/3.2/topics/async/

NServiceBus Sagas and REST API Integration best-practices

What is the most sensible approach to integrate/interact NServiceBus Sagas with REST APIs?
The scenario is as follows,
We have a load balanced REST API. Depending on the load we can add more nodes.
REST API is a wrapper around a DomainServices API. This means the API can be consumed directly.
We would like to use Sagas for workflow and implement NServiceBus Distributor to scale-out.
Question is, if we use the REST API from Sagas, the actual processing happens in the API farm. This in a way defeats the purpose of implementing distributor pattern.
On the other hand, using the DomainServices API directly from Sagas allows processing locally within worker nodes. With this approach we will have to maintain API assemblies in multiple locations, but the throughput could be higher.
I am trying to understand the best approach. Personally, I'd prefer to consume the API (if readily available), but this could introduce chattiness to the system and could take longer to complete compared to in-process calls.
A typical sequence could be similar to publishing an online advertisement,
Advertiser submits a new advertisement request via a web application.
Web application invokes the relevant API endpoint and sends a command message.
Command message initiates a new publish advertisement Saga instance.
Saga sends a command to validate caller permissions (in process/out of process API call)
Saga sends a command to validate the advertisement data (in process/out of process API call)
Saga sends a command to the fraud service (third party service)
Once the content and fraud verifications are successful, Saga sends a command to the billing system.
Saga invokes an API call to save ad details. (in process/out of process API call)
And this goes on until the advertisement is expired, there are a number of retry and failure condition paths.
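For readers who have not worked with sagas, the sequence above can be sketched as an ordered list of steps with an undo action for each. This is a toy, in-memory illustration in Python, not NServiceBus: real sagas are message-driven and persist their state between messages. The class PublishAdSaga, the step names and the validation rules are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# (step name, action returning success, compensation for that step)
Step = Tuple[str, Callable[[dict], bool], Callable[[dict], None]]


@dataclass
class PublishAdSaga:
    """Toy saga: runs steps in order, compensates completed ones on failure."""
    steps: List[Step]
    completed: List[str] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def run(self, ad: dict) -> bool:
        for name, action, _ in self.steps:
            if action(ad):
                self.completed.append(name)
                self.log.append(f"done:{name}")
            else:
                self.log.append(f"failed:{name}")
                self._compensate(ad)
                return False
        return True

    def _compensate(self, ad: dict) -> None:
        # Undo the completed steps in reverse order.
        undo = {name: comp for name, _, comp in self.steps}
        for name in reversed(self.completed):
            undo[name](ad)
            self.log.append(f"undone:{name}")


def noop(ad: dict) -> None:
    """Placeholder compensation; a real one would send a compensating command."""


def build_saga() -> PublishAdSaga:
    return PublishAdSaga(steps=[
        ("validate_permissions", lambda ad: ad["advertiser"] == "alice", noop),
        ("validate_content", lambda ad: bool(ad["text"]), noop),
        ("fraud_check", lambda ad: not ad.get("suspicious", False), noop),
        ("bill", lambda ad: True, noop),
    ])
```

Whether each action is an in-process call or an HTTP call to the API farm is hidden behind the callable, which is exactly the choice being discussed here.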
After a number of design iterations we came up with the following guidelines,
Treat REST API layer as the integration platform.
Assume API endpoints are capable of abstracting fairly complex micro work-flows. Micro work-flows are operations that execute in a single burst (not interruptible) and complete within a short time span (<1 second).
Assume API farm is capable of serving many concurrent requests and can be easily scaled-out.
Favor synchronous invocations over asynchronous message based invocations when the target operation is fairly straightforward.
When asynchronous processing is required use a single message handler and invoke API from the handlers. This will delegate work to the API farm. This will also eliminate the need for a distributor and extra hardware resources.
Avoid Sagas unless the business work-flow contains multiple transactions, compensation logic and resumes. Tests reveal that Sagas do not perform well under load.
Avoid consuming DomainServices directly from a message handler. This will do the work locally and also introduces a deployment hassle by distributing business logic.
Happy to hear your thoughts.
You are right on with identifying that you will need Sagas to manage workflow. I'm willing to bet that your Domain hooks up to a common database. If that is true then it will be faster to use your Domain directly and remove the serialization/network overhead. You will also lose the ability to easily manage the transactions at the database level.
Assuming you are directly calling your Domain, the performance becomes a question of how the Domain performs. You may take steps to optimize the database, drive down distributed transaction costs, shard the data, etc. You may end up using the Distributor to have multiple Saga processing nodes, but it sounds like you have some more testing to do once a design is chosen.
Generically speaking, we use REST APIs to model the commands as resources (via POST) to allow interaction with NSB from clients who don't have direct access to messaging. This is a potential solution to get things onto NSB from your web app.

How to create an async WCF service

I want to implement a WCF service that responds immediately to the caller, but queues up an asynchronous job to be handled later. What is the best way to go about doing this? I've read the MSDN article on how to implement an asynchronous service operation, but that solution seems to still require the task to finish before responding to the caller.
There are many ways to accomplish this, depending on what you want to do and what technologies you are using (e.g. unless you are using Silverlight, you may not need to have your app call the service asynchronously). The most straightforward way to achieve your goal would be to have your service method start up a thread to perform the bulk of the processing and return immediately.
Another would be to create some kind of request (e.g. create an entry in a datastore of some kind) and return. Another process (e.g. a Windows service) could then pick up the request and perform the processing.
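The second option (record a request, return, let another process pick it up) could look roughly like the sketch below. It is Python with an in-memory SQLite database standing in for the shared datastore, not WCF code. The table layout and the function names submit and process_pending are assumptions for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for a shared, durable datastore
db.execute(
    "CREATE TABLE requests (id INTEGER PRIMARY KEY, payload TEXT, status TEXT)"
)


def submit(payload: str) -> int:
    """Service operation: record the request and return its id immediately."""
    cur = db.execute(
        "INSERT INTO requests (payload, status) VALUES (?, 'pending')", (payload,)
    )
    db.commit()
    return cur.lastrowid


def process_pending() -> int:
    """What a separate worker process would run periodically."""
    rows = db.execute(
        "SELECT id, payload FROM requests WHERE status = 'pending'"
    ).fetchall()
    for row_id, payload in rows:
        # ... the long-running work on `payload` would happen here ...
        db.execute("UPDATE requests SET status = 'done' WHERE id = ?", (row_id,))
    db.commit()
    return len(rows)
```

Compared with spawning a thread inside the service, this survives a service restart, because the pending work lives in the datastore rather than in memory.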
Any WCF service can be made asynchronous -
One of the nice things about WCF is you can write a service synchronously. When you add a ServiceReference in the client, you have the option of generating asynchronous methods.
This will automatically make the service call asynchronous. The service will return when it's done, but the client will get two methods - BeginXXX and EndXXX, as well as XXXAsync + an XXXCompleted event, either of which allows for completely asynchronous operation.

Concurrent WCF calls via shared channel

I have a web tier that forwards calls onto an application tier. The web tier uses a shared, cached channel to do so. The application tier services in question are stateless and have concurrency enabled.
But they are not being called concurrently.
If I alter the web tier to create a new channel on every call, then I do get concurrent calls onto the application tier. But I want to avoid that cost since it is functionally unnecessary for my scenario. I have no session state, nor do I need to re-authenticate the caller each time. I understand that the creation of the channel factory is far more expensive than the creation of the channels, but it is still a cost I'd like to avoid if possible.
I found this article on MSDN that states:
While channels and clients created by the channels are thread-safe, they might not support writing more than one message to the wire concurrently. If you are sending large messages, particularly if streaming, the send operation might block waiting for another send to complete.
Firstly, I'm not sending large messages (just a lot of small ones since I'm doing load testing) but am still seeing the blocking behavior. Secondly, this is rather open-ended and unhelpful documentation. It says they "might not" support writing more than one message but doesn't explain the scenarios under which they would support concurrent messages.
Can anyone shed some light on this?
Addendum: I am also considering creating a pool of channels that the web server uses to fulfill requests. But again, I see no reason why my existing approach should block and I'd rather avoid the complexity if possible.
After much ado, this all came down to the fact that I wasn't calling Open explicitly on the channel before using it. Apparently an implicit Open can preclude concurrency in some scenarios.
You can cache the WCF proxy, but still create a channel for each service call - this will ensure concurrency, is not very expensive in comparison to creating a channel from scratch, and re-authentication for each call will not be necessary. This is explained on Wenlong Dong's blog - "Performance Improvement for WCF Client Proxy Creation in .NET 3.5 and Best Practices" (a much better source of WCF information and guidance than MSDN).
Just for completeness: Here is a blog entry explaining the observed behavior of request serialization when not opening the channel explicitly:
http://blogs.msdn.com/b/wenlong/archive/2007/10/26/best-practice-always-open-wcf-client-proxy-explicitly-when-it-is-shared.aspx

Is having a function call block a bad design process?

I'm writing an API which is used to receive some data from another application. Currently the function is designed to block until data is received. In my mind this forces developers using the API into multithreading or some sort of multi-process design. So is it better for a function to block, or to return a null and have the caller sleep for a few milliseconds before trying again?
Note the other application may not have any data to send through the API for an unknown period of time.
The API is written in C++
Why not use a callback?
You could define the API to allow the user to pass an optional timeout value. If the timeout is not specified, then the API function waits indefinitely, much like how select() works.
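A minimal sketch of the optional-timeout idea. The question is about C++, so this Python version only illustrates the shape of the interface. The names receive and deliver and the internal queue are hypothetical.

```python
import queue
from typing import Optional

_inbox = queue.Queue()  # filled by whatever transport delivers the data


def deliver(data: bytes) -> None:
    """Called by the (hypothetical) transport layer when data arrives."""
    _inbox.put(data)


def receive(timeout: Optional[float] = None) -> Optional[bytes]:
    """Block until data arrives; with a timeout, return None once it expires."""
    try:
        # timeout=None waits indefinitely, like select() without a timeout.
        return _inbox.get(timeout=timeout)
    except queue.Empty:
        return None
```

A caller that wants to stay responsive can loop on receive(timeout=0.1) and do other work between calls, while a caller with nothing else to do can simply call receive().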
Consider another option: use an async transaction -> issue a request and provide a callback address with a ticket id. When the response is available, the service end-point calls back your application with the ticket id and the result ;-)
You should avoid blocking as much as you possibly can.
As you say:
Note the other application may not have any data to send through the API for an unknown period of time.
In this case, using a synchronous interface ties up resources unnecessarily.
You haven't said what language this is, but it sounds like your API is listening or checking for some event, and the users of the API are either blocking or polling your API to determine if the event happened?
Is it possible to use a callback? Users of the API would register for notifications of the event happening, and when your library detects the event it will use the callback to notify all listeners.
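The callback registration described here might look like the following sketch. It is in Python for brevity; the same shape works in C++ with std::function. The class DataSource and its method names are made up for the example.

```python
from typing import Callable, List

Listener = Callable[[bytes], None]


class DataSource:
    """Hypothetical API object that notifies listeners when data arrives."""

    def __init__(self) -> None:
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        """Register a callback to be invoked for every piece of incoming data."""
        self._listeners.append(listener)

    def on_data(self, data: bytes) -> None:
        """Called internally by the library when the other application sends data."""
        # Iterate over a copy so a listener may subscribe others during dispatch.
        for listener in list(self._listeners):
            listener(data)
```

With this shape nobody blocks or polls: the library calls on_data when something arrives and each registered listener runs in turn. One design point to decide is which thread the callbacks run on, since a slow listener here would delay the others.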
When your application calls the O/S API function read(), do you expect it to block? Of course you do, at least by default. In some circumstances, ioctls allow a programmer to change the behavior to be asynchronous, which is particularly common in network applications.
You've shed very little light on what your API is about, so consider:
Does it make sense that an API user would want to be blocked? That is, is there little to do until it returns?
If you were writing an application for the API, what would you expect it to do? You should definitely write a few sample applications for your own education, as well as to document the API.
Is there any reason why the API user would not multithread (or fork, etc.) requests to the API?
If you want a reusable solution you could apply the Asynchronous Design 'Pattern' which is common in .NET but can also be implemented in C++ as demonstrated in this CodeProject project.
There's nothing wrong with providing both synchronous and asynchronous calls to the same feature in the interface.
Personally I would only go these lengths if I need to service multiple requests (in which case you can queue 'BeginOperation' requests for example), or there are many potentially asynchronous operations in the interface (and I want a standardised, flexible pattern). If you can only handle one request at a time a time-out is usually sufficient.
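As a rough illustration of offering both forms of the same feature, the sketch below builds the synchronous call on top of the asynchronous one. It is Python, not the .NET Begin/End pattern itself. The names begin_receive and receive, the background thread and the placeholder payload are assumptions.

```python
import threading
import time
from concurrent.futures import Future
from typing import Optional


def begin_receive(delay: float = 0.05) -> Future:
    """Asynchronous form: returns at once with a Future for the data."""
    future: Future = Future()

    def produce() -> None:
        time.sleep(delay)               # simulates waiting for the other application
        future.set_result(b"payload")   # placeholder data for the sketch

    threading.Thread(target=produce, daemon=True).start()
    return future


def receive(timeout: Optional[float] = None) -> bytes:
    """Synchronous form: the same feature, built on top of the async one."""
    # Raises concurrent.futures.TimeoutError if the data does not arrive in time.
    return begin_receive().result(timeout=timeout)
```

Callers who want notification instead of blocking can attach add_done_callback to the Future returned by begin_receive, so one implementation serves both styles.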