How to limit the number of outgoing web requests per second? - asp.net-core

Right now, I am working on an ASP.NET Core Web API that calls an external web service and uses the returned data in its own response. This is working fine.
However, I discovered that the external service is not as scalable as I would like. Therefore, as agreed with the company providing this external service, the number of outgoing requests needs to be limited to one per second. I also use caching to reduce the number of outgoing requests, but this has proven not to be effective enough, because (logically) it only helps when many requests are identical, so that cached data can be reused.
I have been doing some investigation into rate limiting, but the documented examples and types are far more complicated than what I need. This is not about partitions, tokens, or concurrency. What I want is far simpler: just one outgoing request per second, and that's all.
I am not that experienced with rate limiting. I just started reading the referenced documentation and found that there is a nice, professional package for it, but as explained, its examples address more complex scenarios than mine.
Possibly there is a way to use the System.Threading.RateLimiting package for this by applying the RateLimiter class correctly. Otherwise, I may need to write my own DelegatingHandler implementation. But there must be a straightforward way that people with experience in rate limiting can explain to me.
So how do I limit the number of outgoing web requests per second?
In addition, I want to avoid a 429 response in case of too many requests. In such a situation, the process should simply wait longer to complete, so that the number of outgoing requests stays within the limit.
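For illustration, a minimal sketch of the DelegatingHandler idea using System.Threading.RateLimiting; the handler class name is made up, and the queue options are assumptions chosen so that excess calls wait instead of failing:

using System.Threading.RateLimiting;

public class OutgoingRateLimitHandler : DelegatingHandler
{
    // One permit per one-second window; excess requests queue oldest-first,
    // so callers simply wait longer instead of receiving an error or a 429.
    private static readonly RateLimiter Limiter = new FixedWindowRateLimiter(
        new FixedWindowRateLimiterOptions
        {
            PermitLimit = 1,
            Window = TimeSpan.FromSeconds(1),
            QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
            QueueLimit = int.MaxValue
        });

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Waits asynchronously until a permit is available before sending.
        using RateLimitLease lease = await Limiter.AcquireAsync(1, cancellationToken);
        return await base.SendAsync(request, cancellationToken);
    }
}

Registered on the HttpClient used for the external service, e.g. services.AddHttpClient("external").AddHttpMessageHandler(() => new OutgoingRateLimitHandler()), every outgoing call then flows through the one limiter; it is kept static so it survives handler-chain rebuilds.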

Related

ISO-8583 message processing (defining priority of messages)

I need to get an understanding of the ISO-8583 message platform. Let's say I want to perform an authorization of a card transaction, and in real time, at a particular instant, I get 100000 requests from the network (VISA/MASTERCARD), all for authorization. How do I define the priority of these requests and responses? Can the connection pool handle it (in my case it's HikariCP)? How is this done at banks/financial institutions when authorizing a request? Please give me some insights on how to manage all these requests. Should I go for an MQ?
Tech used: Spring Boot, Hibernate, spring-tcp-starter
Your question doesn't seem to be very well researched, as there are a ton of switch platforms out there that do this today, and many of their technology guides can be found on the web, including those from major vendors like ACI, FIS, AJB, etc., if you look hard enough.
I have worked with several ISO interface specifications, commercial switches, and home-grown platforms, and they are actually pretty consistent in how they do the core real-time processing.
Information on prioritization is generally in each ISO-8583 message processing specification, and it is made explicitly clear in almost every specification I've ever read that was written by someone familiar with ISO-8583, as opposed to someone making up their own variant or copying someone else's.
That said, in general, at a high level, authorization/financial requests (0100, 0200) always have higher priority than force-post (0x20) messages.
Administrative messages in the 05xx, 06xx, and 08xx ranges sometimes also get bumped above other advices, but these are still advices; auths/financials are almost always processed first, as they (A) impact the customer and (B) have much tighter timers than any advice, usually by a factor of two or more.
Most switches I have seen keep the core authorization process entirely in memory, without going to an MQ or some other disk-based queue to manage these, though sometimes there is home-grown middleware involved. Non-real-time processes regularly use an MQ or disk queuing to feed transactions into processes that sit outside the approval path, such as store-and-forward (SAF) processing, but many of these still use memory-only processing for the front of their queue.
It is also important to differentiate between 100000 requests and 100000 transactions. The various exchanges, both internal and external, make a big difference in the number of actual requests/responses in flight at any given time. A basic transaction can be accomplished in about two messages, but some of the more complex ones can easily exceed 20 messages just for a pre-authorization or a completion component.
If you are dealing with largely batch transaction bursts, I can see the challenge of queuing, but almost every application I have seen has a separate max-in-flight limit for advices and requests, sometimes even with different timers, and the apps pumping the transactions almost always wait for the response before sending more. This tends to work fine for just about everyone, including big posting batches from retailers and card networks. So if your app doesn't have these limits, you probably need to add them. The in-memory prioritization idea is sketched below.
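A hedged sketch of that in-memory prioritization, in C# since the idea is language-agnostic (the message type and the MTI buckets are illustrative, not taken from any particular specification):

using System.Collections.Generic;

// Hypothetical minimal message type; real platforms parse full ISO-8583 fields.
public record Iso8583Message(string Mti, byte[] Payload);

public class AuthorizationQueue
{
    private readonly PriorityQueue<Iso8583Message, int> _queue = new();

    // Lower value = dequeued sooner: auths/financials first, then admin
    // messages, then advices and everything else, per the answer above.
    private static int PriorityFor(string mti) => mti switch
    {
        "0100" or "0200" => 0, // customer-facing, tightest timers
        var m when m.StartsWith("05") || m.StartsWith("06") || m.StartsWith("08") => 1,
        _ => 2
    };

    public void Enqueue(Iso8583Message msg) => _queue.Enqueue(msg, PriorityFor(msg.Mti));

    public bool TryDequeue(out Iso8583Message? msg) => _queue.TryDequeue(out msg, out _);
}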
In fact, your 100000 requests should be sorted by (Terminal ID and/or Merchant ID) + (timestamp/local timestamp) + (STAN and/or RRN).
Duplicate transaction requests are expected to be rejected.
If you are simulating multiple requests from a single terminal (or host) with the same test card details, incrementing the STAN/RRN per request would cover that case. A sort along these lines is sketched below.
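For illustration, that ordering as a LINQ sort; the requests collection and its property names are hypothetical:

using System.Linq;

var ordered = requests
    .OrderBy(r => r.TerminalId)   // and/or MerchantId
    .ThenBy(r => r.LocalTimestamp)
    .ThenBy(r => r.Stan)          // and/or RRN
    .ToList();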
Please refer to previous answers about STAN and RRN ISO 8583 fields.
In ISO message, what's the use of stan and rrn ?

how would I expose 200k+ records via an API?

what would be the best option for exposing 220k records to third party applications?
SF style 'bulk API' - independent of the standard API to maintain availability
server-side pagination
a callback to an FTP-generated file?
webhooks?
This bulk will have to happen once a day or so. ANY OTHER SUGGESTIONS WELCOME!
How are the 220k records being used?
Must serve it all at once
Not ideal for human consumers of this endpoint without special GUI considerations and communication.
A. I think that using a 'bulk API' would be marginally better than reading a file of the same data. (Not 100% sure on this.) Opening and interpreting a file might take a little bit more time than directly accessing data provided in an endpoint's response body.
Can send it in pieces
B. If only a small amount of data is needed at once, then server-side pagination should be used; it allows the consumer to request new batches of data as desired. This reduces unnecessary server load by not sending data that has not been specifically requested. (A pagination sketch follows after option E below.)
C. If all of it needs to be received during a user-session, then find a way to send the consumer partial information along the way. Often users can be temporarily satisfied with partial data while the rest loads, so update the client periodically with information as it arrives. Consider AJAX Long-Polling, HTML5 Server Sent Events (SSE), HTML5 Websockets as described here: What are Long-Polling, Websockets, Server-Sent Events (SSE) and Comet?. Tech stack details and third party requirements will likely limit your options. Make sure to communicate to users that the application is still working on the request until it is finished.
Can send less data
D. If the third party applications only need to show updated records, could a different endpoint be created for exposing this more manageable (hopefully) subset of records?
E. If the end-result is displaying this data in a user-centric application, then maybe a manageable amount of summary data could be sent instead? Are there user-centric applications that show 220k records at once, instead of fetching individual ones (or small batches)?
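A hedged sketch of option B's server-side pagination, keyset-style, in ASP.NET Core minimal-API form; AppDbContext, the Records set, and the route are placeholders:

using Microsoft.EntityFrameworkCore;

app.MapGet("/records", async (int? afterId, int? pageSize, AppDbContext db) =>
{
    int take = Math.Min(pageSize ?? 100, 500); // cap the page size server-side
    var page = await db.Records
        .Where(r => afterId == null || r.Id > afterId) // seek past the last id seen
        .OrderBy(r => r.Id)
        .Take(take)
        .ToListAsync();
    return Results.Ok(new { items = page, nextAfterId = page.LastOrDefault()?.Id });
});

Keyset (seek) pagination avoids the deep-OFFSET cost that page-number pagination runs into on large tables.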
I would use a streaming API. This is an API that does a "select * from table" and then streams the results to the consumer. You do this with a loop that fetches and outputs the records one at a time. This way you never use much memory, and as long as you flush the output frequently, the web server will not close the connection, so you can support any size of result set.
I know this works as I (shameless plug) wrote the mysql-crud-api that actually does this.
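For example, a minimal sketch of that streaming loop in ASP.NET Core; the connection string, table, and columns are placeholders:

using System.Text.Json;
using Microsoft.Data.SqlClient;

app.MapGet("/records", async (HttpResponse response) =>
{
    response.ContentType = "application/x-ndjson"; // one JSON object per line
    await using var conn = new SqlConnection(connectionString); // assumed configured elsewhere
    await conn.OpenAsync();
    await using var cmd = new SqlCommand("SELECT id, name FROM records", conn);
    await using var reader = await cmd.ExecuteReaderAsync();
    while (await reader.ReadAsync())
    {
        var row = new { id = reader.GetInt32(0), name = reader.GetString(1) };
        await response.WriteAsync(JsonSerializer.Serialize(row) + "\n");
        await response.Body.FlushAsync(); // keeps memory flat and the connection alive
    }
});

Flushing every row is the simplest form; flushing every few hundred rows trades a little latency for fewer syscalls.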

File Based Processing versus REST API [closed]

We have a requirement to process 10,000 transactions once daily, in an offline (non-real-time) mode.
Which of the two options is preferable?
A batch file with 10,000 rows sent once a day and processed,
or
An API call in small batches (as I am presuming sending 10K rows at once is not an option).
I was advised by my architects that option 1 is preferable and that an API would only make sense when batch sizes are small, the disadvantage of option 2 being that the caller of the API has to break the payload down into small chunks even though they have all the information available to them at once.
I am keen to see how option 2 could be viable, so any comments/suggestions to help make the case would be very helpful.
Thanks
Rahul
This is not a full answer; however, I would like to mention one reason in favor of a REST API: validation. This is better managed through the API. Once a file is dropped into an FTP location, it becomes your responsibility to validate the format of the file. Will it be easy to route a "bad" file back to its source with a message explaining the bounce?
With an API call, if the incoming representation does not adhere to a valid schema (e.g. XML, JSON), your service can respond with a "400 Bad Request" HTTP status code. This keeps the responsibility for sending data in a valid format with the consumer of the service and helps achieve a better separation of concerns.
Additional reasoning for a REST API:
Since your file contains transactions, each record should be atomic (if this were not true, e.g. if there were relationships between the records in the file, then those records should not be considered "transactions"). Therefore, chunking the file into smaller batches should be trivial.
Regardless, you can define a service that accepts transactions in batch and responds with an HTTP status code of "202 Accepted". A 202 code indicates that the request was received and will be processed asynchronously. The response can also contain callback links to check the status of individual transactions, or of the batch as a whole. At that point you would be implementing HATEOAS (Hypermedia as the Engine of Application State) and be in a position to automate the entire process and report on status.
Alternatively, with batch files, even if the file passes an upfront format validation check, you still have to process each transaction individually downstream. Some records may load, others may not. My assumption is that the records that fail to load still need to be handled, and you may need to give the users a view of what succeeded vs. failed. This can all be handled outside a REST API, but the API pattern is simple and elegant, IMHO, for this purpose. (A sketch of the 202 pattern follows below.)
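For illustration, the 202-plus-status-links pattern in ASP.NET Core minimal-API form; the routes and the TransactionBatch shape are made up:

app.MapPost("/transactions/batches", (TransactionBatch batch) =>
{
    var batchId = Guid.NewGuid();
    // Enqueue the batch for asynchronous processing here.
    return Results.Accepted($"/transactions/batches/{batchId}", new
    {
        batchId,
        links = new[]
        {
            new { rel = "status", href = $"/transactions/batches/{batchId}" }
        }
    });
});

public record TransactionBatch(List<Transaction> Transactions);
public record Transaction(string Id, decimal Amount);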
Using a batch process is always a better idea, and you can trigger the batch process using a REST API.
With batch processing you can always send an email saying "improper file format", or report which records were processed and which were not. With REST alone you cannot easily keep track of records and transactions.
As mentioned in the comment above, you can use a REST API to trigger a batch process asynchronously and send the status response using HATEOAS.
Spring Batch + Spring REST using Spring Boot
I had the same question, and every answer I found gave the same subjective advice. I would like to put forward some ideas for comparing the two concepts:
A batch solution requires more storage than a REST API. You will need to store your results in an intermediate storage area and write them in an open format. Perhaps you can compress them, but then you are trading storage for processing.
A REST API may use more network bandwidth than a batch solution, but only if the intermediate storage is not on a network drive. Fetch requests and status polling can require a lot of network bandwidth, but this can be mitigated with webhooks or websockets.
A REST API makes automatic recovery easier than a batch solution. REST API response codes help you make automatic decisions about recovering from a failure, and you reduce the number of services required to identify one. If the network is down, an email notification can fail just as a REST API call can. And a REST API pushes you to define a good API for these cases.
A REST API can handle a high number of rows, like any other TCP-based protocol (such as FTP). But in case of a failure you will need logic to manage it, which means the REST API will require a chunk-aware protocol too. For a batch service this logic lives in the FTP protocol, but with FTP's own logic, not your business logic.
A batch service does not need an instance reserved all the time (CPU, IP address, port, etc.); it just runs when needed. You will need a scheduler to start it, or manual intervention, and someone to restart it if it fails. Outside of a scheduler, it is not straightforward to automate.
A batch service requires less security setup on the developer's side: a REST API must take care of authentication and must consider injection and other attack methods. A REST API can use helper services to prevent all of this, but that means more configuration.
Batch services are easy to deploy: they can run on your machine or on a server whenever the business needs. A REST API requires continuous health checks, a deployment strategy to keep it up, attention to DNS configuration, and so on. Check whether your company provides all these services.
If this solution is for your company, check what your company is already doing. The common policy nowadays is to move to REST APIs, but if your support team does not know them and has a lot of experience with batch solutions, it could be a good idea not to change.

ASP.NET MVC site, shared WCF client object, causing a single-threaded bottleneck?

I'm trying to nail down a performance issue under load in an application which I didn't build, but have become very familiar with the workings of.
The architecture is: mobile apps call an ASP.NET MVC 3 website to get data to display. The ASP.NET site calls a third-party SOAP API using WCF clients (basicHttpBinding), caching results as much as it can to minimize load on that third party.
The load from the mobile apps is in the order of 200+ requests per second at peak times, which translates to something in the order of 20 SOAP requests per second to the third-party, after caching.
Normally it runs fine but we get periods of cascading slowness where every request to the API starts taking 5 seconds.. then 10.. 15.. 20.. 25.. 30.. at which point they time out (we set the WCF client timeout to 30 seconds). Clearly there is a bottleneck somewhere which is causing an increasingly long queue until requests can't be serviced inside 30 seconds.
Now, the third-party API is out of my control but they swear that it should not be having any issues whatsoever with 20 requests per second. So I've been looking into the possibility of a bottleneck at my end.
I've read questions on StackOverflow about ServicePointManager.DefaultConnectionLimit and connectionManagement, but digging through the source, I think the problem is somewhat more fundamental. It seems that our WCF client object (which is a standard System.ServiceModel.ClientBase<T> auto-generated by "Add Service Reference") is being stored in the cache, and thus when multiple requests come in to the ASP.NET site simultaneously, they will share a single Client object.
From a quick experiment with a couple of console apps and spawning multiple threads to call a deliberately slow WCF service with a shared Client object, it seems to me that only one call will occur at a time when multiple threads use a single ClientBase. This would explain a bottleneck when e.g. 20 calls need to be made per second and each one takes more than 50ms to complete.
Can anyone confirm that this is indeed the case?
And if so, if I switched to having every request create its own WCF client object, would I just need to set ServicePointManager.DefaultConnectionLimit to something greater than the default (which I believe is 2?) before creating the client objects, in order to increase my maximum number of simultaneous connections?
(sorry for the verbose question, I figured too much information was better than too little)
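For what it's worth, a sketch of the per-request client approach; ThirdPartyServiceClient, GetData, and MyData stand in for the "Add Service Reference" generated proxy, so the real names will differ:

using System.Net;

public class ThirdPartyCaller
{
    static ThirdPartyCaller()
    {
        // Raise the outgoing-connection cap before any requests are made.
        // The plain client default is indeed 2; ASP.NET's autoConfig may
        // already raise it, so check what your host actually uses.
        ServicePointManager.DefaultConnectionLimit = 100;
    }

    public MyData CallThirdParty()
    {
        var client = new ThirdPartyServiceClient(); // new proxy per request
        try
        {
            MyData result = client.GetData();
            client.Close();
            return result;
        }
        catch
        {
            client.Abort(); // a faulted channel must be aborted, not closed
            throw;
        }
    }
}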

ZMQ device queue does not load balance properly

I know that ZMQ offers all the flexibility to do your own load-balancing. However, I would expect the out-of-the-box broker, about four lines of code using the line
zmq_device (ZMQ_QUEUE, frontend, backend);
to load-balance quite well, as the documentation says it does:
ZMQ_QUEUE creates a shared queue that collects requests from a set of clients, and distributes these fairly among a set of services. Requests are fair-queued from frontend connections and load-balanced between backend connections. Replies automatically return to the client that made the original request.
I have an army of back-end services, and yet I find that my front-end clients often have to wait several seconds for something that takes less than 1/10 of a second in a 1:1 setting (there are the same number of client and service machines). I suspect that ZMQ is not load-balancing properly out of the box: it is sending too many requests to the same service even when that service has no spare capacity.
I think this is partly because the services are multithreaded in a way that lets them take up to 10 concurrent requests, yet they slow down greatly near the 10th request even though they can still accept more. Random distribution would be ideal. Is there an out-of-the-box way to do this, can it be done in a few lines of code, or do I have to write my own broker from scratch?
FWIW, the issue turned out to be that the workers were taking on work when they didn't have room for it; the issue was not in the ZMQ layer per se. A readiness-based worker loop, sketched below, avoids this.
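A hedged sketch of that fix using the zguide's load-balancing (LRU) pattern, written here in C# with NetMQ for consistency with the rest of this page; the address and framing are illustrative, and the broker side must be a ROUTER that hands each job to the worker that signalled readiness:

using NetMQ;
using NetMQ.Sockets;

using var worker = new RequestSocket();
worker.Connect("tcp://broker:5560");

string reply = "READY"; // the first send only signals spare capacity
while (true)
{
    worker.SendFrame(reply);                      // deliver previous reply / signal room
    string request = worker.ReceiveFrameString(); // broker sends work only to ready workers
    reply = Process(request);
}

static string Process(string request) => request.ToUpperInvariant(); // placeholder work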