On Heroku, does utilising Node.js prevent the need for queues + worker dynos for third-party API calls?

The Heroku Dev Center page about worker dynos and background jobs states that you need to use workers + queues to handle API calls, such as fetching an RSS feed, because the operation may take some time if the remote server is slow, and doing this on a web dyno would block it from receiving additional requests.
However, from what I've read, it seems to me that one of the major points of Node.js is that it doesn't suffer from blocking under these conditions due to its asynchronous event-based runtime model.
I'm confused because wouldn't this imply that it would be ok to do API calls (asynchronously) in the web dynos? Perhaps the docs were written more for the Ruby/Python/etc use cases where a synchronous model was more prevalent?

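NodeJS is an implementation of the reactor pattern. Network IO, such as an outbound HTTP request, is handled by the event loop with non-blocking sockets, while some other operations (file system access, DNS lookups, some crypto and compression work) run on a small thread pool, which has 4 threads by default. Once all pool threads are busy, further pool-bound operations wait in a queue; the event loop itself keeps running.
A common misconception about NodeJS is that it is a system that allows you to do many things at once. This is not necessarily the case: your JavaScript still runs on a single thread, and what NodeJS allows is doing other things while waiting on IO bound tasks.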
Any CPU bound tasks are always executed in the main event loop, meaning they will block.
This means if your "job" is IO bound, like putting things in databases, then you can probably get away with not using worker dynos. This of course depends on how many things you plan on having going on at once. Remember, any task you put in your main app will take away resources from other incoming requests.
Generally, though, this is not recommended: if you have a job that does some processing, it belongs in a queue that is executed in its own process or thread.
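To make the difference between IO bound and CPU bound work concrete, here is a minimal sketch. It uses Python's asyncio as a stand-in for Node's event loop (an assumption made purely for illustration; the names io_task and cpu_task are hypothetical). The two simulated API calls wait at the same time, but the CPU bound function never yields, so nothing else can run until it finishes.

```python
import asyncio


async def io_task(name, log):
    log.append(f"{name} start")
    # Stands in for a slow third-party API call: the loop is free while we wait.
    await asyncio.sleep(0.05)
    log.append(f"{name} done")


async def cpu_task(log):
    log.append("cpu start")
    # No await inside this computation, so the event loop cannot switch tasks.
    sum(i * i for i in range(200_000))
    log.append("cpu done")


async def run_demo():
    log = []
    await asyncio.gather(io_task("a", log), io_task("b", log), cpu_task(log))
    return log


log = asyncio.run(run_demo())
print(log)
```

Both IO tasks start before either one finishes, which is the overlap the event loop gives you. The CPU task, in contrast, runs from start to finish without anything else getting a turn, which is why that kind of work is usually moved to a worker.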


Would a blocking web server hang to the point that it needs restarting if many HTTP clients send requests in parallel?

I read that some web servers are described as blocking, whereas Node.js is said to be non-blocking.
Would a blocking web server hang to the point that it needs restarting if many HTTP clients send requests in parallel?
To clarify, I would not count it as needing a restart if it works fine again after the flood of parallel requests has stopped.
And I currently don't understand how request buffers and overflows work for web servers.
Although technically it could be possible to make a single-thread, single-process blocking server that can only handle 1 request at a time, it doesn't really practically make sense. Concurrency is kind of important.
The three main paradigms for parallelism (that I know of) are:
Using an event loop/reactor pattern
Node falls in the third category, and also a bit in the second category depending on how you look at it.
Most languages can look at a socket and read from it, and immediately move on if there was nothing to read. Therefore most languages can have this non-blocking behavior.
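As a small illustration of "look at a socket and move on if there was nothing to read", here is a sketch using Python's standard socket module. The helper name try_read is hypothetical, and a local socket pair stands in for a real client connection.

```python
import socket


def try_read(sock, max_bytes=1024):
    """Return the waiting bytes, or None if there is nothing to read right now."""
    try:
        return sock.recv(max_bytes)
    except BlockingIOError:
        # A non-blocking socket raises instead of waiting for data to arrive.
        return None


left, right = socket.socketpair()
right.setblocking(False)

first = try_read(right)   # nothing has been sent yet, so this does not block
left.sendall(b"ping")
second = try_read(right)  # now there is data waiting

left.close()
right.close()
```

An event loop is essentially this check repeated over many sockets, with the operating system telling the program which ones are ready instead of the program polling each one in turn.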

Will BackgroundService play nicely on a Kubernetes cluster

I have a kubernetes cluster into which I'm intending to implement a service in a pod - the service will accept a grpc request, start a long running process but return to the caller indicating the process has started. Investigation suggests that IHostedService (BackgroundService) is the way to go for this.
My question is, will use of BackgroundService behave nicely with various neat features of asp.net and k8s:
Will horizontal scaling understand that a service is getting overloaded and spin up a new instance even though the service will appear to have no pending grpc requests because all the work is background (I appreciate there's probably hooks that can be implemented, I'm wondering what's default behaviour)
Will the notion of awaiting, which allows the current process to be swapped out and another to run, work okay with background services (I've only experienced it where one message received hits an await so allows another message to be processed, but background services are not a messaging context)
I think asp.net will normally manage throttling too many requests, backing off if the server is too busy, but will any of that still work if the 'busy' is background processes
What's the best method to mitigate against overloading the service (if horizontal scaling is not an option) - I can have the grpc call return 'too busy' but would need to detect it (not quite sure if that's cpu bound, memory or just number of background services)
Should I be considering something other than BackgroundService for this task
I'm hoping the answer is that "it all just works" but feel it's better to have that confirmed than to just hope...
Investigation suggests that IHostedService (BackgroundService) is the way to go for this.
I strongly recommend using a durable queue with a separate background service. It's not that difficult to split into two images, one running ASP.NET GRPC requests, and the other processing the durable queue (this can be a console app - see the Worker Service template in VS). Note that solutions using non-durable queues are not reliable (i.e., work may be lost whenever a pod restarts or is scaled down). This includes in-memory queues, which are commonly suggested as a "solution".
If you do make your own background service in a console app, I recommend applying a few tweaks (noted on my blog):
Wrap ExecuteAsync in Task.Run.
Always have a top-level try/catch in ExecuteAsync.
Call IHostApplicationLifetime.StopApplication when the background service stops for any reason.
Will horizontal scaling understand that a service is getting overloaded and spin up a new instance even though the service will appear to have no pending grpc requests because all the work is background (I appreciate there's probably hooks that can be implemented, I'm wondering what's default behaviour)
One reason I prefer using two different images is that they can scale on different triggers: GRPC requests for the API and queued messages for the worker. Depending on your queue, using "queued messages" as the trigger may require a custom metric provider. I do prefer using "queued messages" because it's a natural scaling mechanism for the worker image; out-of-the-box solutions like CPU usage don't always work well - in particular for asynchronous processors, which you mention you are using.
Will the notion of awaiting, which allows the current process to be swapped out and another to run, work okay with background services (I've only experienced it where one message received hits an await so allows another message to be processed, but background services are not a messaging context)
Background services can be asynchronous without any problems. In fact, it's not uncommon to grab messages in batches and process them all concurrently.
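The batch-and-process-concurrently idea can be sketched as follows. This is a Python asyncio illustration rather than .NET code, and handle_message and process_batch are hypothetical names; a real worker would pull the batch from the durable queue instead of a literal list.

```python
import asyncio


async def handle_message(message):
    # Stands in for awaited IO such as a database write or an outgoing call.
    await asyncio.sleep(0.01)
    return message.upper()


async def process_batch(messages):
    # Start every handler, then wait for all of them; results keep input order.
    return await asyncio.gather(*(handle_message(m) for m in messages))


results = asyncio.run(process_batch(["a", "b", "c"]))
print(results)
```

Because each handler spends most of its time awaiting, the whole batch takes roughly as long as the slowest message rather than the sum of all of them.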
I think asp.net will normally manage throttling too many requests, backing off if the server is too busy, but will any of that still work if the 'busy' is background processes
No. ASP.NET only throttles requests. Background services do register with ASP.NET, but that is only to provide a best-effort at graceful shutdown. ASP.NET has no idea how busy the background services are, in terms of pending queue items, CPU usage, or outgoing requests.
What's the best method to mitigate against overloading the service (if horizontal scaling is not an option) - I can have the grpc call return 'too busy' but would need to detect it (not quite sure if that's cpu bound, memory or just number of background services)
Not a problem if you use the durable queue + independent worker image solution. GRPC calls can pretty much always stick another message in the queue (very simple and fast), and K8s can autoscale based on your (possibly custom) metric of "outstanding queue messages".
Generally, "it all works".
For automatic horizontal scaling, you need an autoscaler; see the Kubernetes documentation on the Horizontal Pod Autoscaler.
But you can just scale it yourself (kubectl scale deployment yourDeployment --replicas=10).
Let's assume you have a deployment of your backend, which starts with one pod. Your autoscaler will watch your pod (e.g. its CPU usage) and will start a new pod for you when the load is high.
A second pod will be started. New requests will then be distributed across the pods (round-robin).
There is no need for your backend to throttle calls. It should just handle as many calls as possible.
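For a feel of how the autoscaler picks a replica count, here is a rough sketch of the scaling rule described in the Kubernetes Horizontal Pod Autoscaler documentation: the desired count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The function name, the min/max defaults and the example numbers are assumptions for illustration, and the real controller also applies a tolerance band and stabilization that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Simplified HPA rule: ceil(current * observed / target), clamped to bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical metric: outstanding queue messages per worker pod, target 10.
scale_up = desired_replicas(current_replicas=3, current_metric=40,
                            target_metric=10, max_replicas=20)
scale_down = desired_replicas(current_replicas=4, current_metric=5,
                              target_metric=10)
capped = desired_replicas(current_replicas=1, current_metric=200,
                          target_metric=50, max_replicas=3)
```

The same rule applies whether the metric is CPU usage or a custom one such as queue length; only the source of the observed value changes.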

RestKit network limits blocks other calls when parallel requests are running

We are facing a problem.
We have background requests that constantly download files (up to 5 MB per file). Meanwhile, we have a UI in which most navigations require REST calls.
We limited the number of background downloads so they won't suffocate the operationQueue that RestKit uses.
When several files are downloading in the background, we see network usage of 1 to 2 MB (which is understandable).
The problem is this: the user navigates through the app, and each navigation makes a quick REST call that should return very little data. But because of the background downloads, the UI call takes forever (~10 seconds).
Priority did not help. I saw that the UI call is picked up by the operation queue immediately (because we limited the number of downloads, the NSOperationQueue had room for other requests).
When we limited the concurrent REST download calls to 5, the REST calls from the UI took 10 seconds.
When we limited the concurrent REST download calls to 2, everything worked fine.
The issue here is that with only 2 downloads allowed in the background, the whole background download operation takes forever.
The best scenario would be for every UI call to be treated as the most important one network-wise, even pausing the background operations so that only the UI call is handled and then resuming them, but I'm not sure that's possible.
Any other ideas to address this issue?
You could use 2 RKObjectManagers so that you have 2 separate queues, then use one for 'UI' and the other for 'background'. On top of that you can set the concurrent limits for each queue differently and you could suspend the background queue. Note that suspending the queue doesn't mean already running operations are paused, it just stops new operations from being started.
By doing this you can gain some control, but better options really are to limit the data flow, particularly when running on a mobile data network, and to inform the user what is happening so they can accept the situation or pause it till later.
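The two-queue idea can be sketched independently of RestKit. The following is a Python asyncio illustration, not RestKit or NSOperationQueue code: the OperationQueue class and its methods are hypothetical stand-ins for a queue with a concurrency limit and a suspended flag. As described above, suspending only prevents new operations from starting.

```python
import asyncio


class OperationQueue:
    """Hypothetical queue with a concurrency limit that can be suspended."""

    def __init__(self, max_concurrent):
        self._slots = asyncio.Semaphore(max_concurrent)
        self._resumed = asyncio.Event()
        self._resumed.set()

    def suspend(self):
        self._resumed.clear()  # running operations continue; new ones wait

    def resume(self):
        self._resumed.set()

    async def run(self, operation):
        async with self._slots:
            await self._resumed.wait()  # do not start while suspended
            return await operation()


async def demo():
    ui = OperationQueue(max_concurrent=4)
    background = OperationQueue(max_concurrent=2)
    order = []

    async def download():
        order.append("download started")
        await asyncio.sleep(0.01)
        order.append("download finished")

    async def ui_call():
        order.append("ui call finished")

    background.suspend()
    pending = asyncio.create_task(background.run(download))
    await asyncio.sleep(0)      # let the download reach the suspended queue
    await ui.run(ui_call)       # the UI queue is unaffected by the suspension
    background.resume()
    await pending
    return order


order = asyncio.run(demo())
print(order)
```

Separate queues only control when requests start. They do not divide bandwidth, so a download that is already in flight still competes with the UI call for the network.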

NServiceBus Sagas and REST API Integration best-practices

What is the most sensible approach to integrate/interact NServiceBus Sagas with REST APIs?
The scenario is as follows,
We have a load balanced REST API. Depending on the load we can add more nodes.
REST API is a wrapper around a DomainServices API. This means the API can be consumed directly.
We would like to use Sagas for workflow and implement NServiceBus Distributor to scale-out.
Question is, if we use the REST API from Sagas, the actual processing happens in the API farm. This in a way defeats the purpose of implementing distributor pattern.
On the other hand, using the DomainServices API directly from Sagas allows processing locally within worker nodes. With this approach we will have to maintain API assemblies in multiple locations, but the throughput could be higher.
I am trying to understand the best approach. Personally, I’d prefer to consume the API (if readily available), but this could introduce chattiness to the system and could take longer to complete compared to in-process calls.
A typical sequence could be similar to publishing an online advertisement,
Advertiser submits a new advertisement request via a web application.
Web application invokes the relevant API endpoint and sends a command
Command message initiates a new publish advertisement Saga
Saga sends a command to validate caller permissions (in process/out of process API call)
Saga sends a command to validate the advertisement data (in process/out of process API call)
Saga sends a command to the fraud service (third party service)
Once the content and fraud verifications are successful, Saga sends a command to the billing system.
Saga invokes an API call to save ad details. (in process/out of process API call)
And this goes on until the advertisement is expired, there are a number of retry and failure condition paths.
After a number of design iterations we came up with the following guidelines,
Treat REST API layer as the integration platform.
Assume API endpoints are capable of abstracting fairly complex micro work-flows. Micro work-flows are operations that execute in a single burst (not interruptible) and complete within a short time span (<1 second).
Assume API farm is capable of serving many concurrent requests and can be easily scaled-out.
Favor synchronous invocations over asynchronous message based invocations when the target operation is fairly straightforward.
When asynchronous processing is required use a single message handler and invoke API from the handlers. This will delegate work to the API farm. This will also eliminate the need for a distributor and extra hardware resources.
Avoid Sagas unless the business work-flow contains multiple transactions, compensation logic and resumes. Tests reveal that Sagas do not perform well under load.
Avoid consuming DomainServices directly from a message handler. This will do the work locally and also introduces a deployment hassle by distributing business logic.
Happy to hear your thoughts.
You are right on with identifying that you will need Sagas to manage workflow. I'm willing to bet that your Domain hooks up to a common database. If that is true then it will be faster to use your Domain directly and remove the serialization/network overhead. You will also lose the ability to easily manage the transactions at the database level.
Assuming you are directly calling your Domain, the performance becomes a question of how the Domain performs. You may take steps to optimize the database, drive down distributed transaction costs, shard the data, etc. You may end up using the Distributor to have multiple Saga processing nodes, but it sounds like you have some more testing to do once a design is chosen.
Generally speaking, we use REST APIs to model the commands as resources (via POST) to allow interaction with NSB from clients who don't have direct access to messaging. This is a potential solution to get things onto NSB from your web app.

WCF polling, background processing, and resource starvation

I have a web service, implemented with WCF and hosted in IIS7, with a submit-poll communication pattern. An initial request is made, which returns quickly and kicks off a background process. The client polls for the status of the background process. This interface is set and can't be changed (it's a simulation of an external service we depend on).
I implemented the background processing by adding another service contract to the existing service with a one-way message contract that starts the long-running process. The "background" service keeps a database updated with the status in order to communicate with the main service. This avoids creating any new web services or items to deploy.
The problem is that the background process is very CPU intensive, and it seems to be starving the other service calls out. It will take up an entire processor, and while a single instance of the background process is running, status polling calls to the main service can take over a minute. I don't care how long the background process takes.
Is there any way to throttle the resource usage of the background method? Or an obvious way to do long running async processes in WCF without changing my submit/poll service contract? Would separating them into different web services help if the two services were still running on the same server?
The first thing I would try would be to lower the priority.
If you're actually spinning off a separate process for the background work, then you can do it like this:
Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.BelowNormal;
If it's really just a background thread, use this instead (from within the thread):
Thread.CurrentThread.Priority = ThreadPriority.BelowNormal;
(Actually, it's better to start the thread suspended and change the priority at the caller before running it, but it's generally OK to lower your own priority.)
At the very least it should help determine whether or not it's really a CPU issue. If you still have problems after lowering the priority then it might be something else that's getting starved, like file or network I/O.