Handle Bulk API load with MuleSoft

I am new to MuleSoft. I have an API in my application which can't handle more than 2000 parallel requests. I am thinking of using MuleSoft as a proxy API which takes the request and hits my API, so that even if my API reaches its capacity, MuleSoft will pause for some time and retry my API without losing any data.
Does MuleSoft solve my issue? If so, can anyone please guide me through the process?
Thanks

You probably want something as simple as the until-successful scope. You can read up more about it in the MuleSoft documentation. The premise of it is this:
You wrap a component in the until-successful scope, and you define the following:
How a failure or a success is defined,
How many times you want to retry the component before declaring an overall failure,
How much time should elapse between each attempt.
There are examples in that documentation, and those should help guide you!
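To make that concrete, here is a rough sketch of the scope's retry semantics. This is plain Python for illustration only, not Mule code; in Mule you configure the until-successful scope in XML as described in the documentation, and the retry count, interval and failure predicate below are just placeholders.

import time

def until_successful(component, max_retries=5, interval_secs=10, is_failure=None):
    # Call component() until it succeeds, or give up after max_retries attempts.
    is_failure = is_failure or (lambda result: False)
    for attempt in range(1, max_retries + 1):
        try:
            result = component()
            if not is_failure(result):
                return result  # success: stop retrying
        except Exception:
            pass  # an exception counts as a failed attempt
        if attempt < max_retries:
            time.sleep(interval_secs)  # wait before the next attempt
    raise RuntimeError("still failing after %d attempts" % max_retries)

In the proxy scenario above, component would be the HTTP call to your backend API, so a temporarily overloaded backend gets retried instead of the request being lost.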

What is the peak load (number of requests) your application API is expected to handle?
A Mule API proxy can be used for response caching here. This means that for similar requests, your API won't get hit; the responses would be sent back from the proxy itself. But this alone probably won't solve the load issue.
You might have to load-balance your API depending on the peak load requirement.

Related

How would I go about creating a streaming API that receives data from a POST request and pushes it out?

I am interested in developing an API that is capable of receiving data in real time and pushing it out to clients connected to an endpoint. I have looked into socket.io and WebSockets. However, these depend on events being triggered to send/receive data. This isn't ideal for my use case. What alternatives are there for me to achieve this?
Any help and advice are greatly appreciated.
So if I understand it right, you want to write a streaming service that can push updates on some data in real time over an endpoint exposed to the clients. I guess webhooks might be a solution to your problem statement. I'd also recommend looking at https://www.youtube.com/watch?v=63grynZmo7c. It has elementary information on how to create a webhook and start receiving real-time updates on it.
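As a starting point, a webhook receiver is just an HTTP endpoint that the upstream service POSTs to whenever it has new data. A minimal sketch, assuming Flask (the /webhook path and payload shape are placeholders):

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/webhook", methods=["POST"])
def webhook():
    payload = request.get_json(silent=True) or {}
    # Hand the update off to whatever pushes it on to connected clients.
    print("received update:", payload)
    return jsonify({"status": "received"}), 200

if __name__ == "__main__":
    app.run(port=8000)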

Why should I build an API with an asynchronous/non-blocking framework?

I have been looking into the Play Framework as a possible candidate for helping me to build a simple API. However, the Django Rest Framework (DRF) also seems to be a pretty strong contender.
As far as I can tell, the DRF does not advertise itself as an asynchronous (or non-blocking) framework like the Play Framework does, but I am interested in whether or not this even matters. The situation that I keep thinking of is sending an email to a user via Mandrill: I do not want my API to get bogged down waiting for the Mandrill API to tell it whether or not the email was sent.
Thus, I think the question can be summarized like this: is there a benefit from the client's perspective that will result from my building an API with an asynchronous/non-blocking framework like Play over the DRF, or am I missing the point?
I'm a Django REST framework contributor (and user), so my perspective is biased towards that.
Django REST framework is built on Django, which is a synchronous framework for web applications. If you're already using a synchronous framework like Django, having a synchronous API is less of an issue.
Now, just because it is synchronous, that doesn't mean that only a single request can ever be handled at a time. Most web servers that handle Django applications can handle multiple requests, and some of them even do it somewhat asynchronously across multiple threads. Usually this isn't actually an issue: your web server can typically handle many concurrent requests, even if some of them are blocking. And when you have long, blocking calls, you usually don't want them done within the API - you should be delegating them to background workers like Celery or Resque.
This isn't just specific to Django, many of the same principles apply to other synchronous frameworks like Rails and ASP.NET MVC. If you have long-running requests, you generally should be delegating work to other processes instead of holding up the request. It's common to use the 202 response code for these cases.
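As a sketch of that delegation pattern, assuming Celery with a Redis broker (the module layout, names and addresses here are illustrative, not from the question):

from celery import Celery
from django.http import JsonResponse

celery_app = Celery("tasks", broker="redis://localhost:6379/0")

@celery_app.task
def send_email(recipient, subject, body):
    # Placeholder for the slow, blocking call (e.g. an HTTP request to an
    # email provider); it runs in a worker process, not in the web request.
    print("sending to", recipient, subject, body)

def notify(request):
    # The view enqueues the work and answers immediately with 202 Accepted.
    send_email.delay("user@example.com", "Hello", "Your report is ready.")
    return JsonResponse({"status": "queued"}, status=202)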
Now, that doesn't necessarily mean that asynchronous frameworks are useless. In runtimes such as Node.js, most frameworks handle requests asynchronously. It doesn't make sense to use a synchronous framework in these languages, so most libraries are built to be asynchronous.
What you choose very much depends on the tools that you are already using.
Regarding the clients connecting to your app, there should be no difference at all whether your server uses asynchronous/non-blocking (ANB) technologies or not. But it may make a lot of difference in the number of requests your app can handle.
Suppose the following scenario: a request that checks whether a FB/Google/etc. access token is valid, then uses it to get the social profile of your user, and then returns something back.
If you are using a blocking HTTP client in your server, then during each of the two HTTP requests the thread serving that request can be blocked for a long time doing nothing.
If you are using a non-blocking HTTP client (like the one Play brings), then while the HTTP request is in flight and until the response comes back, the thread can be used to do something else (e.g. process part of another request).
Note that to solve this "problem" you wouldn't need an ANB framework, just an ANB HTTP client. So you should look more at the kind of operations you will have in your app and check how your chosen framework deals with them. For example: if your app consists almost entirely of DB CRUD operations and the DB driver is blocking (like JDBC in Java, and probably the ones used by Django), it really does not matter much whether the framework is asynchronous or not; you will be blocking most of the time on that specific component.
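For illustration, here is that token/profile scenario with a non-blocking HTTP client in Python (aiohttp is assumed, and the URLs are placeholders). While each await is in flight, the event loop is free to work on other requests instead of parking a thread:

import asyncio
import aiohttp

async def social_profile(token):
    async with aiohttp.ClientSession() as session:
        # First call: check that the access token is valid.
        async with session.get("https://provider.example/validate",
                               params={"token": token}) as resp:
            if resp.status != 200:
                return None
        # Second call: fetch the social profile with the same token.
        async with session.get("https://provider.example/profile",
                               params={"token": token}) as resp:
            return await resp.json()

asyncio.run(social_profile("some-token"))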
In your email example probably Django+Celery will do just as fine as Play/Akka.
Non-async frameworks usually handle long-running tasks by passing them to some external process (e.g. Resque/DelayedJob/Sidekiq in the Rails world).
Just wanted to add that the Mandrill API supports an async parameter for sending emails.
Here is what their docs say:
enable a background sending mode that is optimized for bulk sending. In async mode, messages/send will immediately return a status of "queued" for every recipient. To handle rejections when sending in async mode, set up a webhook for the 'reject' event.
So with async set to true, you get a handle back immediately after the call to the API, without waiting for all emails to be sent.
https://mandrillapp.com/api/docs/messages.JSON.html#method-send
(I took JSON version of the API just as example)
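A sketch of what such a call might look like from Python, based on the docs linked above (the API key and addresses are placeholders):

import requests

payload = {
    "key": "YOUR_MANDRILL_API_KEY",
    "message": {
        "from_email": "sender@example.com",
        "to": [{"email": "user@example.com"}],
        "subject": "Hello",
        "text": "Your report is ready.",
    },
    "async": True,  # return as soon as the messages are queued
}

resp = requests.post("https://mandrillapp.com/api/1.0/messages/send.json",
                     json=payload)
print(resp.json())  # each recipient should come back with status "queued"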
The Django community is working on this; for now, you can utilise the sync_to_async() adapter.
It comes with some limitations and performance penalties, but the community is still working on it.
The link below will help you work with the sync_to_async() adapter:
https://docs.djangoproject.com/en/3.2/topics/async/
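A minimal sketch of the adapter, following the Django docs linked above (the blocking function here is just a stand-in):

import asyncio
import time
from asgiref.sync import sync_to_async

def blocking_send(recipient):
    time.sleep(2)  # stands in for a slow, synchronous call
    return "sent to %s" % recipient

async def my_view_logic():
    # The blocking function runs in a thread, so the event loop stays free.
    send = sync_to_async(blocking_send, thread_sensitive=True)
    return await send("user@example.com")

print(asyncio.run(my_view_logic()))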

Is there some kind of service to queue api calls?

I need to call the desk.com API to create cases when a customer completes a form on my site. However, sometimes the API is down for maintenance (too often!) and my call will fail.
Presently I just write the details to a log on error and send myself an email. Then I create the case manually.
So I'm thinking of writing some kind of message queue, so that instead of calling the API in-process, I can put the request in a queue and then have some process work the queue and make the API calls. That way, if the API call fails, the process will just try again at the next scheduled interval.
Since there are so many web APIs in the world, I figure other people must surely be having the same problem. So are there third-party solutions which effectively do what I'm trying to do, or some open-source project or something that deals with this issue?
Cheers!
Amazon Simple Queue Service (SQS) is a fast, reliable, scalable, fully managed queue service. SQS makes it simple and cost-effective to decouple the components of a cloud application. You can use SQS to transmit any volume of data, at any level of throughput, without losing messages or requiring other services to be always available.
http://aws.amazon.com/sqs/
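A minimal sketch with boto3, assuming an existing queue (the queue name and the create_desk_case helper are hypothetical placeholders for your real desk.com call):

import boto3

def create_desk_case(body):
    print("creating case:", body)  # placeholder for the real desk.com API call

sqs = boto3.client("sqs", region_name="us-east-1")
queue_url = sqs.get_queue_url(QueueName="desk-case-requests")["QueueUrl"]

# Producer: instead of calling the API in-process, enqueue the request.
sqs.send_message(QueueUrl=queue_url,
                 MessageBody='{"customer": "jane@example.com"}')

# Consumer (run on a schedule): read messages and delete each one only after
# the API call succeeds, so failed calls stay queued and get retried.
msgs = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10,
                           WaitTimeSeconds=20).get("Messages", [])
for msg in msgs:
    try:
        create_desk_case(msg["Body"])
        sqs.delete_message(QueueUrl=queue_url,
                           ReceiptHandle=msg["ReceiptHandle"])
    except Exception:
        pass  # message becomes visible again and is retried later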

Is having a function call block a bad design process?

I'm writing an API which is used to receive some data from another application. Currently the function is designed to block until data is received. In my mind this forces developers using the API to use multithreading or some sort of multi-process design. So is it better for a function to block, or to return null and then sleep for a few milliseconds before trying again?
Note the other application may not have any data to send through the API for an unknown period of time.
The API is written in C++.
Why not use a callback?
You could define the API to allow the user to pass an optional timeout value. If the timeout is not specified, then the API function waits indefinitely, much like how select() works.
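The question is about C++, but the shape of that interface is easy to sketch; here it is in Python, using a thread-safe queue to stand in for the real data source:

import queue

class Receiver:
    def __init__(self):
        self._inbox = queue.Queue()

    def push(self, item):
        # Called by the side that produces the data.
        self._inbox.put(item)

    def receive(self, timeout=None):
        # Block until data arrives; wait indefinitely if timeout is None,
        # much like select() with a null timeval.
        try:
            return self._inbox.get(timeout=timeout)
        except queue.Empty:
            return None  # lets the caller distinguish a timeout from data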
Consider another option: use an async transaction. Issue a request and provide a callback address with a ticket id. When the response is available, the service endpoint calls your application back with the ticket id and the result ;-)
You should avoid blocking whenever you possibly can.
As you say:
Note the other application may not have any data to send through the API for an unknown period of time.
In this case, using a synchronous interface ties up resources unnecessarily.
You haven't said what language this is, but it sounds like your API is listening or checking for some event, and the users of the API are either blocking on or polling your API to determine whether the event happened.
Is it possible to use a callback? Users of the API would register for notifications of the event happening, and when your library detects the event it will use the callback to notify all listeners.
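A sketch of that callback style (names are illustrative): users of the API register a function, and the library invokes every registered callback when the event fires, so nobody has to block or poll:

class DataSource:
    def __init__(self):
        self._listeners = []

    def register(self, callback):
        self._listeners.append(callback)

    def _on_data(self, data):
        # Called internally when the other application sends data.
        for callback in self._listeners:
            callback(data)

source = DataSource()
source.register(lambda data: print("got:", data))
source._on_data({"reading": 42})  # simulate the event for demonstration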
When your application calls the OS API function read(), do you expect it to block? Of course you do, at least by default. In some circumstances, ioctls allow a programmer to change the behavior to be asynchronous, which is particularly common in network applications.
You've shed very little light on what your API is about, so consider:
Does it make sense that an API user would want to be blocked? That is, is there little to do until it returns?
If you were writing an application for the API, what would you expect it to do? You should definitely write a few sample applications for your own education, as well as to document the API.
Is there any reason why the API user would not multithread (or fork, etc.) requests to the API?
If you want a reusable solution you could apply the Asynchronous Design 'Pattern', which is common in .NET but can also be implemented in C++, as demonstrated in a CodeProject article.
There's nothing wrong with providing both synchronous and asynchronous calls to the same feature in the interface.
Personally, I would only go to these lengths if I needed to service multiple requests (in which case you can queue 'BeginOperation' requests, for example), or if there were many potentially asynchronous operations in the interface (and I wanted a standardised, flexible pattern). If you can only handle one request at a time, a time-out is usually sufficient.
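For example, here is a sketch of offering the same feature both ways, loosely following the Begin/End shape of the .NET pattern mentioned above (the class and its work are placeholders):

from concurrent.futures import ThreadPoolExecutor

class Client:
    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=4)

    def fetch(self):
        # Synchronous variant: blocks until the result is ready.
        return self._do_fetch()

    def begin_fetch(self):
        # Asynchronous variant: returns a future to poll or wait on,
        # which also lets you queue multiple requests.
        return self._pool.submit(self._do_fetch)

    def _do_fetch(self):
        return "data"  # placeholder for the real work

client = Client()
future = client.begin_fetch()  # start the operation...
print(future.result())         # ...and collect the result later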

Batching in REST

With web services it is considered good practice to batch several service calls into one message to reduce the number of remote calls. Is there any way to do this with RESTful services?
If you really need to batch, HTTP 1.1 supports a concept called HTTP pipelining that allows you to send multiple requests before receiving a response. Check out the specification for details.
I don't see how batching requests makes any sense in REST. Since the URL in a REST-based service represents the operation to perform and the data on which to perform it, making batch requests would seriously break the conceptual model.
An exception would be if you were performing the same operation on the same data multiple times. In this case you can either pass in multiple values for a request parameter or encode this repetition in the body (however this would only really work for PUT or POST). The Gliffy REST API supports adding multiple users to the same folder via
POST /folders/ROOT/the/folder/name/users?userId=56&userId=87&userId=45
which is essentially:
PUT /folders/ROOT/the/folder/name/users/56
PUT /folders/ROOT/the/folder/name/users/87
PUT /folders/ROOT/the/folder/name/users/45
As the other commenter pointed out, paging results from a GET can be done via request parameters:
GET /some/list/of/resources?startIndex=10&pageSize=50
if the REST service supports it.
I agree with Darrel Miller. HTTP already supports HTTP pipelining, and it also supports keep-alive, letting you stream multiple HTTP operations concurrently down the same socket to avoid having to wait for the responses before streaming new requests to the server.
So with HTTP pipelining and keep-alive you get the effect of batching while using the same underlying REST API - so there's usually no need for another REST API for your service.
The Astoria team made good use of multipart MIME to send a batch of calls. It differs from pipelining in that the multipart message can convey the intent of an atomic operation. It seems rather elegant.
Original blog post explaining the rationale
MSDN documentation
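As a rough sketch of the shape of such a request (the endpoint, entity and boundary are placeholders, and real services are pickier about the exact format), each part of the multipart/mixed body wraps one ordinary HTTP operation:

import requests

boundary = "batch_1"
part = ("--%s\r\n"
        "Content-Type: application/http\r\n"
        "Content-Transfer-Encoding: binary\r\n"
        "\r\n"
        "GET /service/Users(42) HTTP/1.1\r\n"
        "Host: example.com\r\n"
        "\r\n") % boundary
body = part + "--%s--\r\n" % boundary

resp = requests.post("https://example.com/service/$batch",
                     data=body,
                     headers={"Content-Type":
                              "multipart/mixed; boundary=" + boundary})
print(resp.status_code)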
Of course there is a way, but it would require server-side support. There is no magical one-size-fits-all methodology that I know of.