Adjust Thread pool size when using twistd - twisted

I am going to deploy my app the twistd way (Application, Service, etc.).
I'm wondering if there is a way to adjust the thread pool size of Twisted, like using reactor.suggestPoolSize().
I found an API called "adjustPoolsize" in twisted.python.threadpool.ThreadPool.
Can I call it directly for my purpose?
Thank you!

Recent versions of Twisted let you access the reactor's thread pool:
from twisted.internet import reactor
threadpool = reactor.getThreadPool()
threadpool.adjustPoolsize(3, 7)
However, there's no guarantee that the reactor itself won't re-adjust the size as it sees fit. If you need to control the size of the threadpool used by your application, it may be better to create your own ThreadPool instance, rather than using the reactor's.
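For example, here's a minimal sketch of a dedicated pool (the sizes and the "app-pool" name are arbitrary, and blocking_work stands in for whatever blocking call you need to make):
from twisted.internet import reactor
from twisted.python.threadpool import ThreadPool

# A pool your application owns, independent of the reactor's pool.
pool = ThreadPool(minthreads=1, maxthreads=5, name="app-pool")
pool.start()
# Tear the pool down when the reactor shuts down.
reactor.addSystemEventTrigger("during", "shutdown", pool.stop)

def blocking_work():
    pass  # some blocking call goes here

pool.callInThread(blocking_work)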

Related

Will BackgroundService play nicely on a Kubernetes cluster

I have a Kubernetes cluster in which I intend to implement a service in a pod - the service will accept a grpc request, start a long-running process, but return to the caller indicating the process has started. Investigation suggests that IHostedService (BackgroundService) is the way to go for this.
My question is, will use of BackgroundService behave nicely with various neat features of asp.net and k8s:
Will horizontal scaling understand that a service is getting overloaded and spin up a new instance even though the service will appear to have no pending grpc requests because all the work is background (I appreciate there's probably hooks that can be implemented, I'm wondering what's default behaviour)
Will the notion of awaiting allowing the current process to be swapped out and another run work okay with background services (I've only experienced it where one message received hits an await so allows another message to be processed, but background services are not a messaging context)
I think asp.net will normally manage throttling too many requests, backing off if the server is too busy, but will any of that still work if the 'busy' is background processes
What's the best method to mitigate against overloading the service (if horizontal scaling is not an option) - I can have the grpc call return 'too busy' but would need to detect it (not quite sure if that's cpu bound, memory or just number of background services)
Should I be considering something other than BackgroundService for this task
I'm hoping the answer is that "it all just works" but feel it's better to have that confirmed than to just hope...
Investigation suggests that IHostedService (BackgroundService) is the way to go for this.
I strongly recommend using a durable queue with a separate background service. It's not that difficult to split into two images, one running ASP.NET GRPC requests, and the other processing the durable queue (this can be a console app - see the Service Worker template in VS). Note that solutions using non-durable queues are not reliable (i.e., work may be lost whenever a pod restarts or is scaled down). This includes in-memory queues, which are commonly suggested as a "solution".
If you do make your own background service in a console app, I recommend applying a few tweaks (noted on my blog):
Wrap ExecuteAsync in Task.Run.
Always have a top-level try/catch in ExecuteAsync.
Call IHostApplicationLifetime.StopApplication when the background service stops for any reason.
Will horizontal scaling understand that a service is getting overloaded and spin up a new instance even though the service will appear to have no pending grpc requests because all the work is background (I appreciate there's probably hooks that can be implemented, I'm wondering what's default behaviour)
One reason I prefer using two different images is that they can scale on different triggers: GRPC requests for the API and queued messages for the worker. Depending on your queue, using "queued messages" as the trigger may require a custom metric provider. I do prefer using "queued messages" because it's a natural scaling mechanism for the worker image; out-of-the-box solutions like CPU usage don't always work well - in particular for asynchronous processors, which you mention you are using.
Will the notion of awaiting allowing the current process to be swapped out and another run work okay with background services (I've only experienced it where one message received hits an await so allows another message to be processed, but background services are not a messaging context)
Background services can be asynchronous without any problems. In fact, it's not uncommon to grab messages in batches and process them all concurrently.
I think asp.net will normally manage throttling too many requests, backing off if the server is too busy, but will any of that still work if the 'busy' is background processes
No. ASP.NET only throttles requests. Background services do register with ASP.NET, but that is only to provide a best-effort at graceful shutdown. ASP.NET has no idea how busy the background services are, in terms of pending queue items, CPU usage, or outgoing requests.
What's the best method to mitigate against overloading the service (if horizontal scaling is not an option) - I can have the grpc call return 'too busy' but would need to detect it (not quite sure if that's cpu bound, memory or just number of background services)
Not a problem if you use the durable queue + independent worker image solution. GRPC calls can pretty much always stick another message in the queue (very simple and fast), and K8s can autoscale based on your (possibly custom) metric of "outstanding queue messages".
Generally, "it all works".
For automatic horizontal scaling you need an autoscaler; read this: Horizontal Pod Autoscale
But you can also scale it yourself (kubectl scale deployment yourDeployment --replicas=10).
Let's assume you have a deployment of your backend which starts with one pod. Your autoscaler will watch your pod (e.g. CPU used) and will start a new pod for you when the load is high.
A second pod will be started, and each new request will be sent to a different pod (round-robin).
There is no need for your backend to throttle calls; it should just handle as many calls as possible.
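As a concrete starting point, a CPU-based autoscaler can be created with a single command (the deployment name and thresholds here are placeholders to adjust for your workload):
kubectl autoscale deployment yourDeployment --cpu-percent=80 --min=1 --max=10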

How to use a different named worker pool in same verticle?

I have one verticle in my service which takes in the http requests and uses executeBlocking to talk to a MySQL db. I am using a named worker pool to interact with the DB. Now, for pushing the application metrics (using a library which is blocking) I want to use a different named worker pool, as I don't want the DB operations to be interrupted by metrics.
I could use the event bus and a worker verticle to push the metrics, but as that has the overhead of transforming to a JsonObject, I want to use executeBlocking itself from the same verticle.
As mentioned here https://groups.google.com/d/msg/vertx/eSf3AQagGGU/9m8RizIJeNQJ, the worker pool used in both cases is the same. So will making a new worker verticle really help me decouple the threads used for DB operations from the ones used to push metrics?
Can anyone help me with a better design choice, or how can I use a different worker pool from the same verticle?
Try the following code (written in Kotlin, but you get the idea):
val workerExecutor1 = vertx.createSharedWorkerExecutor("executor1", 4)
val workerExecutor2 = vertx.createSharedWorkerExecutor("executor2", 4)
workerExecutor1.executeBlocking(...) // execute your db code here
workerExecutor2.executeBlocking(...) // execute your metrics code here
Don't forget to close the worker executors once they're no longer needed:
workerExecutor1.close()
workerExecutor2.close()

How to ACK celery tasks with parallel code in reactor?

I have a celery task that, when called, simply ignites the execution of some parallel code inside a twisted reactor. Here's some sample (not runnable) code to illustrate:
def run_task_in_reactor():
    # this takes a while to run
    do_something()
    do_something_more()

@celery.task
def run_task():
    print "Started reactor"
    reactor.callFromThread(run_task_in_reactor)
(For the sake of simplicity, please assume that the reactor is already running when the task is received by the worker; I used the @worker_process_init.connect signal to start my reactor in another thread as soon as the worker comes up.)
When I call run_task.delay(), the task finishes pretty quickly (since it does not wait for run_task_in_reactor() to finish, only schedules its execution in the reactor). And when run_task_in_reactor() finally runs, do_something() or do_something_more() can throw an exception, which will go unnoticed.
Using pika to consume from my queue, I can use an ACK inside do_something_more() to make the worker notify the correct completion of the task, for instance. However, inside Celery, this does not seem to be possible (or, at least, I don't know how to accomplish the same effect).
Also, I cannot remove the reactor, since it is a requirement of some third-party code I'm using. Other ways to achieve the same result are appreciated as well.
Use twisted.internet.threads.blockingCallFromThread instead. It runs your function in the reactor thread but blocks the calling thread until the function finishes, and any exception the function raises is re-raised in the caller, so the Celery task only completes once the reactor-side work has actually run, and fails visibly if that work fails.
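A minimal sketch of the task under that approach (reusing the hypothetical run_task_in_reactor from the question):
from twisted.internet import reactor
from twisted.internet.threads import blockingCallFromThread

@celery.task
def run_task():
    # Blocks this worker thread until run_task_in_reactor has finished
    # in the reactor thread; any exception propagates here, so Celery
    # sees the real outcome of the work.
    blockingCallFromThread(reactor, run_task_in_reactor)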

Using Appkit Framework in Launch Daemon

I want to use NSWorkspace to check if an application is launched or closed.
But the process is a Launch Daemon, and the Apple documentation says AppKit is not thread-safe.
However, the part of the code that makes use of NSWorkspace will not be executed at startup or login time. It will be executed after some commands are received from another application via BSD communication, and the process is a background process without UI.
Is it safe to use the AppKit framework in this situation? Only the NSWorkspace API and no other? Is polling an alternate solution? What is your suggestion?
Generally you can use any code that isn't thread-safe, as long as you are only doing one instance of the thread-unsafe operation at any given time. I would go ahead and try it, and just be aware that whatever you are doing you can't do concurrently. If you absolutely need to do something concurrently, you can try throwing a couple of @synchronized blocks around the code, either in callbacks of a long-running background process or in delegate calls.

Simple way to rig an "Activity monitor" for a Twisted socket Factory

I'd like to have a real-time 'system status'/'activity monitor' console for my Twisted application.
The app is basically a protocol.ServerFactory which accepts connections and performs different jobs.
Kind of like the twisted.manhole, I'm looking for the simplest way to create an admin application where I can check the current stats of my app.
The admin can be a simple ascii-based shell or html/json setup.
I'm aware that I could build this with a bunch of counters and a separate protocol for authenticating and monitoring these, but I'm thinking Twisted might already have such a thing, since it at least knows the number of connections, protocol types, etc.
Tips?
There's the unmaintained, slowly rotting twisted.manhole.gladereactor. If you're using twistd, then you can use this trivially:
twistd --reactor debug-gui ...
If you're running the reactor directly yourself, then it's only slightly more effort:
from twisted.manhole import gladereactor
gladereactor.install()
from twisted.internet import reactor
...
The Inspect feature appears to have been broken for some time, but it will still show you a list of established connections and what state they are in, and it will also apparently give you a traffic log for each connection. Fixing Inspect may also be a fairly straightforward effort, in case you're looking for a little project. :)
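If gladereactor doesn't work out, the counter approach you mention is also only a handful of lines. A minimal sketch (the class names here are made up, and in practice you'd fold the counting into your existing protocol):
from twisted.internet.protocol import Protocol, ServerFactory

class CountedProtocol(Protocol):
    def connectionMade(self):
        # Factory.buildProtocol sets self.factory for us.
        self.factory.active += 1
        self.factory.total += 1

    def connectionLost(self, reason):
        self.factory.active -= 1

class MonitoredFactory(ServerFactory):
    protocol = CountedProtocol

    def __init__(self):
        self.active = 0  # connections currently open
        self.total = 0   # connections accepted since startup

Any status protocol (or a manhole session) can then read the factory's active and total attributes to report current activity.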