I'm writing an API which is used to receive some data from another application. Currently the function is designed to block until data is received. In my mind this forces developers using the API into multithreading or some sort of multi-process design. So is it better for a function to block, or to return null so the caller can sleep for a few milliseconds before trying again?
Note the other application may not have any data to send through the API for an unknown period of time.
The API is written in C++
Why not use a callback?
You could define the API to allow the user to pass an optional timeout value. If the timeout is not specified, then the API function waits indefinitely, much like how select() works.
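A minimal sketch of that optional time-out, written in Python for brevity (the `Receiver` class and its `deliver` method are hypothetical names, not part of the asker's API). In C++ the same shape could be built on `std::condition_variable::wait_for`.

```python
import queue
import threading


class Receiver:
    """Hypothetical receive endpoint; names are illustrative only."""

    def __init__(self):
        self._inbox = queue.Queue()

    def deliver(self, data):
        # Called by whatever transport feeds the API (assumed here).
        self._inbox.put(data)

    def receive(self, timeout=None):
        """Block until data arrives.

        timeout=None waits indefinitely; otherwise None is returned once
        `timeout` seconds pass without any data.
        """
        try:
            return self._inbox.get(timeout=timeout)
        except queue.Empty:
            return None


receiver = Receiver()
no_data = receiver.receive(timeout=0.05)  # nothing queued yet: gives up after 50 ms
threading.Timer(0.05, receiver.deliver, args=(b"payload",)).start()
data = receiver.receive(timeout=5.0)      # wakes as soon as the data is delivered
```

The caller picks the policy: no time-out for a dedicated reader thread, a short one for a loop that also has other work to do.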
Consider another option: use an async transaction -> issue a request & provide a callback address with a ticket id. When the response is available, the service end-point calls back your application with the ticket id and, of course, the result ;-)
You should avoid blocking as much as you possibly can.
As you say:
Note the other application may not have any data to send through the API for an unknown period of time.
In this case, using a synchronous interface ties up resources unnecessarily.
You haven't said what language this is, but it sounds like your API is listening or checking for some event, and the users of the API are either blocking or polling your API to determine if the event happened?
Is it possible to use a callback? Users of the API would register for notifications of the event happening, and when your library detects the event it will use the callback to notify all listeners.
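A rough sketch of that registration scheme, in Python for brevity. `EventSource`, `subscribe` and `publish` are made-up names; the real library would call `publish` from wherever it detects the event.

```python
import threading


class EventSource:
    """Hypothetical notifier: users register callbacks, the library fires them."""

    def __init__(self):
        self._listeners = []
        self._lock = threading.Lock()

    def subscribe(self, callback):
        with self._lock:
            self._listeners.append(callback)

    def unsubscribe(self, callback):
        with self._lock:
            self._listeners.remove(callback)

    def publish(self, data):
        # Copy under the lock so a callback may subscribe or unsubscribe
        # without deadlocking or mutating the list being iterated.
        with self._lock:
            listeners = list(self._listeners)
        for callback in listeners:
            callback(data)


received = []
source = EventSource()
source.subscribe(received.append)
source.subscribe(lambda data: received.append(len(data)))
source.publish(b"hello")  # the library would do this when data arrives
```

Note that callbacks run on the library's thread here, so the API documentation should say whether listeners are allowed to block.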
When your application calls the OS API function read(), do you expect it to block? Of course you do, at least by default. In some circumstances, ioctls allow a programmer to change the behavior to be asynchronous, which is particularly common in network applications.
You've shed very little light on what your API is about, so consider:
Does it make sense that an API user would want to be blocked? That is, is there little to do until it returns?
If you were writing an application for the API, what would you expect it to do? You should definitely write a few sample applications for your own education, as well as to document the API.
Is there any reason why the API user would not multithread (or fork, etc.) requests to the API?
If you want a reusable solution you could apply the Asynchronous Design 'Pattern' which is common in .NET but can also be implemented in C++ as demonstrated in this CodeProject project.
There's nothing wrong with providing both synchronous and asynchronous calls to the same feature in the interface.
Personally I would only go to these lengths if I need to service multiple requests (in which case you can queue 'BeginOperation' requests for example), or there are many potentially asynchronous operations in the interface (and I want a standardised, flexible pattern). If you can only handle one request at a time, a time-out is usually sufficient.
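One way to offer both forms without duplicating logic is to make the synchronous call a thin wrapper over the asynchronous one. The sketch below is Python rather than C++ or .NET, and `Client`, `begin_operation` and `operation` are illustrative names only.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class Client:
    """Hypothetical interface exposing the same feature both ways."""

    def __init__(self, fetch):
        self._fetch = fetch  # the slow, blocking operation (assumed)
        self._pool = ThreadPoolExecutor(max_workers=2)

    def begin_operation(self, request) -> Future:
        """Asynchronous form: returns immediately with a Future."""
        return self._pool.submit(self._fetch, request)

    def operation(self, request, timeout=None):
        """Synchronous form: the async call plus a wait.

        Raises concurrent.futures.TimeoutError if `timeout` elapses first.
        """
        return self.begin_operation(request).result(timeout=timeout)

    def close(self):
        self._pool.shutdown(wait=True)


completed = []
client = Client(fetch=lambda request: request.upper())
pending = client.begin_operation("first")
pending.add_done_callback(lambda future: completed.append(future.result()))
sync_result = client.operation("second", timeout=1.0)
client.close()
```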
Related
I am creating a new .NET Core 2.2 API for use with a JavaScript client. Some examples from Microsoft have controllers with all async methods and some don't. Should the methods on my API be async? We will be using IIS, if this is a factor. One example method will call another API and return the result, whilst another will make a database request using Entity Framework.
It is best practice to use async for your controller methods, especially if your services are doing things like accessing a database. Whether your controller methods are async or not doesn't matter to IIS; the .NET Core runtime will be invoking them. Both will work, but you should always try to use async when possible.
First, you need to understand what async does. Simply put, it allows the thread handling the request to be returned to the pool to field other requests, if the thread enters a wait state. This is almost invariably caused by I/O operations, such as querying a database, writing/reading a file, etc. CPU-bound work such as calculations requires active use of the thread and therefore cannot be handled asynchronously. A side benefit of async is the ability to "cancel" work. If the client closes the connection prematurely, this will fire a cancellation token which can be used by supported asynchronous methods to cancel work in progress. For example, assuming you passed the cancellation token into a call to something like ToListAsync(), and the client closes the connection, EF will see this and subsequently cancel the query. It's actually a little more complex than that, but you get the idea.
Therefore, you need to simply evaluate whether async is beneficial in a particular scenario. If you're going to be doing I/O and/or want to be able to cancel work in progress, then go async. Otherwise, you can stick with sync, if you like.
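To make the two points concrete (the thread is freed while I/O is pending, and pending work can be cancelled), here is a small sketch using Python's asyncio rather than ASP.NET Core. The `handle` coroutine and its delays are stand-ins for real request handlers, and `task.cancel()` plays the role the cancellation token plays in .NET.

```python
import asyncio

log = []


async def handle(name, io_delay):
    log.append(f"{name} start")
    try:
        # Stand-in for a database or HTTP call; the single thread is
        # released to run other handlers while this await is pending.
        await asyncio.sleep(io_delay)
    except asyncio.CancelledError:
        log.append(f"{name} cancelled")
        raise
    log.append(f"{name} done")


async def main():
    slow = asyncio.create_task(handle("slow", 0.05))
    fast = asyncio.create_task(handle("fast", 0.01))
    abandoned = asyncio.create_task(handle("abandoned", 10))
    await asyncio.gather(slow, fast)
    abandoned.cancel()  # e.g. the client closed the connection
    await asyncio.gather(abandoned, return_exceptions=True)


asyncio.run(main())
```

All three handlers start before any of them finishes, on one thread, and the abandoned one stops at its pending I/O instead of running for ten seconds.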
That said, while there's a slight performance cost to async, it's usually negligible, and the benefits it provides in terms of scalability are generally worth the trade-off. As such, it's pretty much preferred to just always go async. Additionally, if you're doing anything async, then your action should also be async. For example, everything EF Core does is async. The "sync" methods (ToList rather than ToListAsync) merely block on the async methods. As such, if you're doing a query via EF, use async. The sync methods are only there to support certain limited scenarios where there's no choice but to process sync, and in such cases, you should run in a separate thread (Task.Run) to prevent deadlocks.
UPDATE
I should also mention that things are a little murky with actions and particularly Razor Page handlers. There's an entire request pipeline, of which an action/handler is just a part. Having a "sync" action does not preclude doing something async in your view, or in some policy handler, view component, etc. The action itself only needs to be async if it itself is doing some sort of asynchronous work.
Razor Page handlers, in particular, will often be sync, because very little processing typically happens in the handler itself; it's all in subordinate processes.
Async is a very important concept to understand, and Microsoft puts a lot of focus on it. But sometimes we don't realise its importance. Every time you are not using async, you are blocking the calling thread.
Why Use Async
Even if your API controller performs a single operation (let's say a DB fetch), you should be using async. The reason is that your server has a limited number of threads to handle client requests. Let's assume your application can handle 20 requests at a time. If you are not using async, each handler thread is blocked for the duration of the DB operation, even though it is only waiting for the result. In turn, your request queue grows because the threads are tied up and unable to look after new requests, and at some stage your application will stop responding. If you use async, the thread is free to handle more client requests while the operation completes in the background.
More Resources
I would definitely recommend watching this very informative official video from Microsoft on performance issues.
https://www.youtube.com/watch?v=_5T4sZHbfoQ
I have been looking into the Play Framework as a possible candidate for helping me to build a simple API. However, the Django Rest Framework (DRF) also seems to be a pretty strong contender.
As far as I can tell, the DRF does not advertise itself to be an asynchronous (or non-blocking) framework like the Play Framework does, but I am interested in whether or not this even matters. The situation that I keep thinking of is sending an email to a user via Mandrill -- I do not want my API to get bogged-down waiting for the Mandrill API to tell it whether or not the email was sent.
Thus, I think the question can be summarized like this: is there a benefit from the client's perspective that will result from my building an API with an asynchronous/non-blocking framework like Play over the DRF, or am I missing the point?
I'm a Django REST framework contributor (and user), so my perspective is biased towards that.
Django REST framework is built on Django, which is a synchronous framework for web applications. If you're already using a synchronous framework like Django, having a synchronous API is less of an issue.
Now, just because it is synchronous, that doesn't mean that only a single request can ever be handled at a time. Most web servers that are handling Django applications can handle multiple requests, and some of them even do it somewhat asynchronously across multiple threads. Usually this isn't actually an issue, as your web server can typically handle many concurrent requests, even if some of them are blocking. And when you have long, blocking calls you usually don't want that done within the API - you should be delegating that to background workers like Celery or Resque.
This isn't just specific to Django, many of the same principles apply to other synchronous frameworks like Rails and ASP.NET MVC. If you have long-running requests, you generally should be delegating work to other processes instead of holding up the request. It's common to use the 202 response code for these cases.
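A framework-free sketch of the 202 pattern, to show the shape of it. Everything here is an assumption for illustration: the thread pool stands in for Celery/Resque workers, and `post_email`, `get_job` and the URLs are invented handler names, not DRF APIs.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

workers = ThreadPoolExecutor(max_workers=4)  # stand-in for Celery/Resque workers
jobs = {}                                    # job id -> Future


def send_email(address):
    # Placeholder for the slow call to an external mail service.
    return f"sent to {address}"


def post_email(address):
    """Handler for a hypothetical POST /emails: enqueue and return at once."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = workers.submit(send_email, address)
    return 202, {"job": job_id, "status_url": f"/jobs/{job_id}"}


def get_job(job_id):
    """Handler for a hypothetical GET /jobs/<id>: report progress."""
    future = jobs.get(job_id)
    if future is None:
        return 404, {"error": "unknown job"}
    if not future.done():
        return 202, {"state": "pending"}
    return 200, {"state": "done", "result": future.result()}


status, body = post_email("user@example.com")
workers.shutdown(wait=True)  # only so this demo can read a finished result
final_status, final_body = get_job(body["job"])
```

The request thread is held only long enough to enqueue the job; the client learns the outcome by polling the status URL (or via a webhook, if you prefer push).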
Now, that doesn't necessarily mean that asynchronous frameworks are useless. In runtimes such as Node.js, most frameworks handle requests asynchronously. It doesn't make sense to use a synchronous framework in these languages, so most libraries are built to be asynchronous.
What you choose very much depends on the tools that you are already using.
Regarding the clients connecting to your app there should be no difference at all if your server uses asynchronous/non-blocking (ANB) technologies or not. But it may make a lot of difference in the number of requests your app can handle.
Suppose the following scenario: a request that checks if a FB/Google/etc access token is valid, and then uses it to get the social profile of your user and then returns something back.
If you are using a blocking HTTP client in your server, then during each of the 2 HTTP requests the thread serving that request can be blocked for a long time doing nothing.
If you are using a non-blocking http client (like the one Play brings) while the HTTP request is made and the response comes back the thread can be used to do something else (ex: process part of another request).
Note that to solve this "problem" you wouldn't need an ANB framework, just an ANB HTTP client. So you should look more at the kind of operations you will have in your app and check how your chosen framework will deal with them. For example: if your app consists almost entirely of DB CRUD operations and the DB driver is blocking (like JDBC in Java and probably the ones used by Django), it really does not matter much if the framework is asynchronous or not; you will be blocking most of the time on that specific component.
In your email example, Django+Celery will probably do just as well as Play/Akka.
Non-async frameworks usually handle long-running tasks by passing them to some external process (e.g. Resque/DelayedJob/sidekiq for Rails development).
Just wanted to add that the Mandrill API supports an async parameter for sending emails.
Here is what their docs say:
enable a background sending mode that is optimized for bulk sending. In async mode, messages/send will immediately return a status of "queued" for every recipient. To handle rejections when sending in async mode, set up a webhook for the 'reject' event.
So with async set to true, the call to the API returns immediately, without waiting for all the emails to be sent.
https://mandrillapp.com/api/docs/messages.JSON.html#method-send
(I took JSON version of the API just as example)
The Django community is still working on async support. For now, if you want, you can use the sync_to_async() adapter.
It comes with some limitations and performance penalties, but the community is still working on it.
The link below will help you work with the sync_to_async() adapter:
https://docs.djangoproject.com/en/3.2/topics/async/
I have a web application that uses the jQuery autocomplete plugin, which sends an Ajax request containing the text typed into a textbox to our web server. Once the web server receives this request, it hands it off to RabbitMQ.
I know that we do get benefits from using messaging, but it seems like using it for blocking rpc calls is a misuse and that something like WCF is far more appropriate in this instance, is this the case or is it considered acceptable architecture?
It's possible to perform RPC synchronous requests with RabbitMQ. Here it's explained very well, with its drawback included! So it's considered an acceptable architecture. Discouraged, but acceptable whenever the synchronous response is mandatory.
A possible downside is that adding RabbitMQ in the middle adds some latency to the solution.
However you have the possibility to gain in terms of reliability, flexibility, scalability,...
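For readers who have not seen the pattern, here is a rough in-process sketch of blocking RPC over a queue. It does not use RabbitMQ or any client library: plain `queue.Queue` objects stand in for the broker, and the `correlation_id` and `reply_to` fields mirror the AMQP message properties of the same names. The word list and the `call` helper are invented for the example.

```python
import queue
import threading
import uuid

requests = queue.Queue()  # stand-in for the broker's request queue


def server():
    # Worker: consume requests and publish each reply to the caller's queue.
    while True:
        message = requests.get()
        if message is None:  # shutdown signal for this demo
            break
        matches = [word for word in ("rabbit", "rabbitmq", "rpc")
                   if word.startswith(message["body"])]
        message["reply_to"].put(
            {"correlation_id": message["correlation_id"], "body": matches})


def call(prefix, timeout=1.0):
    """Blocking RPC: publish a request, then wait on a private reply queue."""
    reply_queue = queue.Queue()
    correlation_id = str(uuid.uuid4())
    requests.put({"correlation_id": correlation_id,
                  "reply_to": reply_queue, "body": prefix})
    while True:
        reply = reply_queue.get(timeout=timeout)  # raises queue.Empty on time-out
        if reply["correlation_id"] == correlation_id:
            return reply["body"]


worker = threading.Thread(target=server)
worker.start()
suggestions = call("rab")
requests.put(None)
worker.join()
```

The time-out matters: with a broker in the middle, a lost reply would otherwise block the web request forever.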
What benefit would you get from it? And in fairness, if you put the message in the queue, how is it synchronous? Unless the same process that placed the message in the queue is the one removing it, but that is pretty much useless, no?
Now, if all you want to do is place the message in the queue and process it later on, that is grand.
Also, the fact that you added WCF to the mix is IMHO a symptom that something is perhaps not clear enough. You could use WCF as an API gateway and use it to write the message to the queue, so this is not really about WCF or queues, but more like sync vs async.
The way you are putting your ideas, does not look alright to me.
In the project I'm currently working we're using WCF.
Company policy forces us to use async calls and the reason should be security.
I've asked why this is so much more secure but I don't get clear answers.
Can someone explain why this is so much more secure?
They are not. The same security (authentication, encryption) mechanisms and considerations apply whether a call blocks until it gets a response or it uses a callback.
The only way someone may be confused into thinking that asynch calls are more "safe/secure", is they think that unhandled WCF exceptions will not bring down the main thread if they are asynchronous, as they will be raised inside the callback.
In this case, I would advise extreme caution when approaching the owner of this policy to avoid career-limiting consequences. Some people can get emotionally attached to their policies.
There is no reason why an async call would be more secure than a sync call. I think you should talk to the owner of the policy about it.
No they are not more or less secure than synchronous calls. The only difference is the client waits for a response on synchronous calls, whereas on async it is notified of a response.
Are they coming from the angle that synchronous calls leave the connection open longer or something?
Just exposing a WCF operation using an async signature (BeginBlah/EndBlah) doesn't actually affect the exposed operation at all. When you view the metadata, an operation like
[OperationContract(AsyncPattern = true)]
IAsyncResult BeginSomething(AsyncCallback callback, object state);
void EndSomething(IAsyncResult result);
...actually still ends up being represented as an operation called 'Something'. And actually this is one of the nice things about WCF: the client and server can differ in whether they choose to implement/consume an operation synchronously.
So if you are generating WCF proxies (e.g. through Add Service Reference), you will get synchronous versions of each operation whether they are implemented asynchronously or not, unless you tick the little checkbox to generate the async overloads. And when you do, you then get async versions of operations that might only be declared synchronously on the server.
All WCF is doing is, on both the client and server, giving you a choice about your threading model: do you want WCF to wait for the result, or are you going to signal it that you've finished. How the actual transport connection is managed is - to the best of my knowledge - totally unaffected. E.g. for a NetTcpBinding the socket still stays open for the duration of the call, either way.
So, to get to the point, I really struggle to imagine how this could possibly make any difference to the security envelope of a WCF service. If a service is exposed using an async pattern, and is genuinely implemented in an async way (async for outbound IO, or queues work via the thread pool or something) then there's probably an argument that it would be harder to DoS the service (by exhausting the pool of WCF IO threads), but that'd be about it.
See Synchronous and Asynchronous Operations in MSDN
NB: If you are sharing the contract interface between the client and server then obviously the synchronicity of the two ends matches (because they are both using the same interface type), but that's just a limitation of using a shared interface. If you made another equivalent interface, differing only by the async pattern, you could still create a ChannelFactory against it just fine.
I agree with the other answers - definitely not more secure.
Fire up Fiddler and watch a synchronous request vs. an asynchronous request. You'll basically see the same type of traffic (although the sync may send and receive more data since it's probably a postback). But you can intercept both of those requests, manipulate them, and resend them and cause havoc on your server.
Fiddler's a great tool, by the way. It's an eye-opener in terms of what kind of data and how much data you're sending to the server.
Why do the Silverlight-generated WCF proxy class(es) offer only async calls?
There are cases where I don't really need the async pattern (for example in a BackgroundWorker)
EDIT : Sometimes I need to process the results of two WCF calls. It would have been much simpler if I could have waited (the business of the app allows that) for both calls to end and then process.. but noooo.... async! :P
As I understand it, the aim here is to make it hard for people to do the wrong thing (sync. IO from the UI). If you are using the WCF classes, you'll probably have to live with it.
There's actually a technical reason you can't do sync calls, at least from the 'main' browser thread, which is that the browser invokes all the plug-in API calls on the same thread, so if SL were to block that thread while waiting for the network callback, the network callback wouldn't get through and the app would deadlock. That said, the sync API would work fine if initiated from a different thread -- ie, if the application first does a QueueUserWorkItem to get off the browser thread -- but we felt it would be confusing to offer the sync option and have it only work some of the time.
Andrei, there are methods that, even using the async pattern, allow you to write expressive code that is easy to read and maintain, without going crazy waiting for async requests, just by simplifying the way you write your code.
Take a look at this library: http://syncwcf.codeplex.com/
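For the "process the results of two calls" case from the question, one generic approach is a small join helper that fires a continuation once every callback has reported back. This is a sketch of the idea in Python, not the Silverlight API and not what the linked library does; `when_all` and the two fake service calls are invented for illustration.

```python
import threading


def when_all(starters, on_all_done):
    """Start every callback-style call and invoke on_all_done(results) once
    the last one has reported back. Results keep the order of `starters`."""
    results = [None] * len(starters)
    remaining = len(starters)
    lock = threading.Lock()

    def make_callback(index):
        def callback(value):
            nonlocal remaining
            with lock:
                results[index] = value
                remaining -= 1
                finished = remaining == 0
            if finished:
                on_all_done(results)
        return callback

    for index, start in enumerate(starters):
        start(make_callback(index))


# Two hypothetical async service calls that report through a callback.
def get_customer_async(callback):
    threading.Timer(0.02, callback, args=({"name": "Ada"},)).start()


def get_orders_async(callback):
    threading.Timer(0.01, callback, args=([101, 102],)).start()


done = threading.Event()
combined = []


def process(results):
    customer, orders = results
    combined.append((customer["name"], len(orders)))
    done.set()


when_all([get_customer_async, get_orders_async], process)
done.wait(timeout=2.0)  # only the demo blocks here; a UI thread would not
```

The processing code stays in one place and nothing blocks the UI thread, which is the constraint the async-only proxies are enforcing.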