NServiceBus Outbox with RavenDB persistence cleanup / DeduplicationDataCleanup design

I'm looking into an issue where CPU usage is maxed out on a RavenDB instance that backs the NServiceBus outbox implementation.
The design currently has every outbox worker configure the deduplication data cleanup settings in its startup configuration, i.e. these settings:
endpointConfiguration.SetTimeToKeepDeduplicationData(TimeSpan.FromDays(7));
endpointConfiguration.SetFrequencyToRunDeduplicationDataCleanup(TimeSpan.FromMinutes(1));
If you've got multiple workers processing messages for the system, should the cleanup be configured on each of them, or should it be treated like a cron job that runs on just one of the workers, or on a dedicated machine in the environment that is not a worker but has more of a utility role?
I would imagine the latter; otherwise, if you scale out workers, all of them will try to run the cleanup process every minute. Or am I misunderstanding how this configuration executes the cleanup?
Thanks

Related

Flink on YARN: use yarn-session or not?

There are two ways to deploy Flink applications on YARN. The first uses a yarn-session, and all Flink applications are deployed into that session. The second deploys each Flink application as its own YARN application.
My question is: what is the difference between these two methods, and which one should I choose for a production environment?
I can't find any material about this.
I think the first method saves resources, since it needs only one JobManager (the YARN application master). That is also its disadvantage, since the single JobManager can become a bottleneck as the number of Flink applications grows.
Both modes have their uses in production environments.
Session mode generally makes sense when you will be running a bunch of short-lived jobs, and want to avoid the overhead of starting up a cluster for each one. On the other hand, there are security implications, as any credentials available to any of the jobs will be accessible to all of the jobs. Cluster-per-job mode may use more resources overall, but is, in some sense, more straightforward.

Distributed job management system

I'm using beeQueue for video transcoding job scheduling and processing.
So far everything works, but I'm now facing the challenges of a distributed environment, such as auto scaling Amazon instances to add more workers for the jobs pending in the queue. We scale well, but we need a system that is fail-safe: if an instance on which workers were processing jobs shuts down and we receive no job status or events, the jobs that were running on that instance disappear into a black hole and cannot be recovered and processed again.
What I did:
I'm looking for a ready-made solution that is fail-safe in a distributed environment.
Thanks

Multiple Mobilefirst-Server artifacts concurrent deploy

I use a batch procedure for deploying MFP v7 artifacts (wlapps and adapters).
The procedure is based on the standard ant tasks defined in worklight-ant-deployer.jar.
The MFP environment runs on a WAS cell and consists of a single AdminService application managing multiple WLRuntimes.
Is it possible to run two (or more) deploy tasks concurrently against different WLRuntime targets?
Furthermore, sticking to a single WLRuntime, is it possible to deploy multiple different artifacts concurrently?
Thanks in advance for any answer/comment.
Ciao, Stefano.
For a single WL runtime, all deployments are internally done sequentially. You can start the deployments concurrently, but internally they are processed one after the other because of a transaction locking mechanism. If you start too many deployments in parallel, timeouts can occur, although this is rare. By default, a deployment transaction waits up to 20 minutes before it may time out.
Note: starting deployments in parallel here means using the ant tasks, the wladm tool, or the REST service directly. In the MobileFirst Admin Console UI, the deploy buttons are disabled while another deployment transaction is ongoing, so it is not easy to start deployments in parallel from the UI; the UI tries to prevent that.
Note 2: the 20 minutes mentioned above apply to the locking mechanism itself. Ant/wladm has its own timeout parameters, which may be lower, so ant tasks might time out sooner than 20 minutes.
For multiple WL runtimes, deployments can run concurrently. The locking mechanism is per runtime, so deployments in one WL runtime do not influence any other WL runtime.
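The per-runtime locking described above can be illustrated with a small Python model. This is only a sketch of the general idea, not MobileFirst code: `RuntimeLocks`, `deploy`, the `wait_timeout` parameter and the in-memory `log` are all hypothetical names invented for the example, and the real server uses a transaction lock rather than thread locks.

```python
import threading
import time
from typing import Dict, List, Tuple


class RuntimeLocks:
    """Hypothetical model: one lock per runtime name, shared by all deploy calls."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locks: Dict[str, threading.Lock] = {}

    def lock_for(self, runtime: str) -> threading.Lock:
        # Create the lock for a runtime on first use, then always reuse it.
        with self._guard:
            return self._locks.setdefault(runtime, threading.Lock())


def deploy(locks: RuntimeLocks, runtime: str, artifact: str,
           log: List[Tuple[str, str, str]], wait_timeout: float = 5.0) -> bool:
    """Deploy one artifact; returns False if the runtime lock was not obtained in time."""
    lock = locks.lock_for(runtime)
    if not lock.acquire(timeout=wait_timeout):
        return False  # models a deployment that timed out while waiting
    try:
        log.append((runtime, artifact, "start"))
        time.sleep(0.01)  # stand-in for the actual deployment work
        log.append((runtime, artifact, "end"))
        return True
    finally:
        lock.release()
```

Starting several `deploy` calls in threads against the same runtime name yields strictly alternating start/end entries for that runtime, while calls against a different runtime name are free to interleave with them.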

What exactly is a pre-fork web server model?

I want to know what exactly it means when a web server describes itself as a pre-fork web server. I have a few examples, such as unicorn for Ruby and gunicorn for Python.
More specifically, these are the questions:
What problem does this model solve?
What happens when a pre-fork web server is initially started?
How does it handle requests?
Also, a more specific question for unicorn/gunicorn:
Let's say that I have a webapp that I want to run with (g)unicorn. On initialization, the webapp will do some initialization stuff (e.g. fill in additional database entries). If I configure (g)unicorn with multiple workers, will the initialization stuff be run multiple times?
Pre-forking basically means a master creates forks which handle each request. A fork is a completely separate *nix process.
Update as per the comments below. The pre in pre-fork means that these processes are forked before a request comes in. They can however usually be increased or decreased as the load goes up and down.
Pre-forking can be used when you have libraries that are NOT thread safe. It also means issues within a request causing problems will only affect the process which they are processed by and not the entire server.
The initialisation running multiple times all depends on what you are deploying. Usually however connection pools and stuff of that nature would exist for each process.
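Because initialisation may run once per worker, a common way to stay safe is to make it idempotent. The following is a minimal sketch using Python's built-in `sqlite3` module; the `settings` table, the `schema_version` row and the `init_db` function are hypothetical and only stand in for whatever the webapp actually writes on startup. Whether startup code runs in the master or in every worker depends on the server's configuration (gunicorn, for instance, has a `preload_app` setting).

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Seed data in a way that is safe to repeat from every worker."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value TEXT)"
    )
    # INSERT OR IGNORE skips the row if the primary key already exists,
    # so a second or third worker running this changes nothing.
    conn.execute(
        "INSERT OR IGNORE INTO settings (key, value) VALUES (?, ?)",
        ("schema_version", "1"),
    )
    conn.commit()
```

Calling `init_db` several times on the same database leaves exactly one `schema_version` row.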
In a threading model, the master would create lighter-weight threads to dispatch requests to. But if a thread causes massive issues, it could have repercussions for the master process.
Tools such as Nginx, Apache 2.4's Event MPM, or gevent (which can be used with Gunicorn) are asynchronous, meaning a single process can handle hundreds of requests without blocking.
How does a "pre-fork worker model" work?
Master Process: There is a master process that spawns and kills workers, depending on the load and the capacity of the hardware. More incoming requests would cause the master to spawn more workers, up to a point where the "hardware limit" (e.g. all CPUs saturated) is reached, at which point queuing will set in.
Workers: A worker can be understood as an instance of your application/server. So if there are 4 workers, your server is booted 4 times. That means it occupies roughly 4 times the "base RAM" of a single worker, unless you do shared memory wizardry.
Initialization: Your initialization logic needs to be robust enough to account for multiple servers. For example, if you write db entries, check whether they are already there, or add a setup job that runs before your app server starts.
Pre-fork: The "pre" in prefork means that the master always adds a bit more capacity than currently required, such that if the load goes up the system is "already ready". So it preemptively spawns some workers. For example, in Apache's prefork MPM you control this with the MinSpareServers directive.
Requests: Depending on the server, the requests (TCP connection handles) are either accepted by the master process and passed to the children, or accepted directly by the children on a listening socket they inherit from the master. Unicorn and gunicorn workers use the latter approach.
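The points above can be condensed into a very small pre-fork server in Python. This is a sketch under several assumptions, not how unicorn, gunicorn or Apache are implemented: it requires a Unix-like system (`os.fork`), the workers accept connections directly from the listening socket they inherit from the master, each worker exits after a fixed number of requests, and the names `handle`, `worker_loop`, `prefork` and `request` are made up for the example. A real master would also monitor and respawn its workers.

```python
import os
import socket
from typing import List, Tuple


def handle(conn: socket.socket) -> None:
    """Reply with the PID of the worker process that served the connection."""
    conn.recv(1024)
    conn.sendall(f"served by worker {os.getpid()}\n".encode())
    conn.close()


def worker_loop(listener: socket.socket, max_requests: int) -> None:
    """Runs in a child: serve a fixed number of connections, then exit."""
    try:
        for _ in range(max_requests):
            conn, _addr = listener.accept()
            handle(conn)
    finally:
        os._exit(0)  # never fall back into the master's code path


def prefork(num_workers: int, max_requests: int = 1) -> Tuple[socket.socket, List[int]]:
    """Bind once in the master, then fork all workers before any request arrives."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(16)
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:  # child process
            worker_loop(listener, max_requests)
        pids.append(pid)  # master keeps track of its children
    return listener, pids


def request(port: int) -> str:
    """Tiny client used to exercise the server."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"ping")
        return client.recv(1024).decode()
```

After `listener, pids = prefork(2)`, calling `request(listener.getsockname()[1])` returns a line naming one of the PIDs in `pids`, which shows that a pre-forked child, not the master, answered the request.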
What problem do pre-fork servers solve?
Multiprocessing: If you have a program that can only target one CPU core, you potentially waste some of your hardware's capacity by only spawning one server. The forked workers tackle this problem.
Stability: When one worker crashes, the master process isn't affected. It can just spawn a new worker.
Thread safety: Since it's really like your server is booted multiple times, in separate processes, you don't need to worry about threadsafety (since there are no threads). This means it's an appropriate model when you have non-threadsafe code or use non-threadsafe libs.
Speed: Since the child processes aren't forked (spawned) right when needed, but pre-emptively, the server can always respond fast.
Alternatives and Sidenotes
Container orchestration: If you're familiar with containerization and container orchestration tools such as Kubernetes, you'll notice that many of the problems are solved by those as well. Kubernetes spawns multiple pods for multiprocessing, it has the same (or better) stability and things like "horizontal pod autoscalers" that also spawn and kill workers.
Threading: A server may spawn a thread for each incoming request, which allows many requests to be handled "simultaneously". This is the default for most web servers based on Java, since Java natively has good support for threads. Good support means the threads run truly in parallel, on different CPU cores. Python's threads, on the other hand, cannot truly parallelize (i.e. spread work over multiple cores) due to the GIL (Global Interpreter Lock); they only provide a means of context switching. That's why "pre-forkers" like gunicorn are so popular for Python servers, and why people coming from Java might never have heard of such a thing.
Async / non-blocking processing: If your servers spend a lot of time "waiting", for example on disk I/O, HTTP requests to external services or database requests, then multiprocessing might not be what you want. Instead, consider making your code "non-blocking", meaning that it can handle many requests concurrently. Systems based on async/await (coroutines), such as FastAPI (an ASGI framework) in Python, and runtimes such as Node.js or Go use non-blocking I/O, so that even a single server process can handle many requests concurrently.
CPU bound tasks: If you have CPU bound tasks, the non-blocking processing mentioned above won't help much. You'll then need some form of multiprocessing to distribute the load over your CPU cores, using one of the solutions mentioned above, that is: container orchestration, threading (on systems that allow true parallelization) or... pre-forked workers.
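The non-blocking point can be demonstrated with Python's standard `asyncio` module. In this sketch `asyncio.sleep` stands in for an I/O wait such as a database call; `fake_io_request`, `serve_all` and `timed_run` are illustrative names, not part of any framework.

```python
import asyncio
import time
from typing import List, Tuple


async def fake_io_request(i: int, delay: float) -> int:
    """Pretend to wait on I/O; the event loop runs other requests meanwhile."""
    await asyncio.sleep(delay)
    return i


async def serve_all(n: int, delay: float) -> List[int]:
    # gather() schedules all coroutines concurrently on one thread.
    return await asyncio.gather(*(fake_io_request(i, delay) for i in range(n)))


def timed_run(n: int = 100, delay: float = 0.1) -> Tuple[List[int], float]:
    """Run n simulated requests and return the results plus elapsed seconds."""
    start = time.perf_counter()
    results = asyncio.run(serve_all(n, delay))
    return results, time.perf_counter() - start
```

With the defaults, the 100 simulated requests finish in roughly the time of a single wait rather than 100 waits in sequence, all within one process. Replacing the sleep with a CPU-heavy loop would remove that benefit, which is the case the CPU bound note describes.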
Sources
https://www.reddit.com/r/learnprogramming/comments/25vdm8/what_is_a_prefork_worker_model_for_a_server/
https://httpd.apache.org/docs/2.4/mod/prefork.html

Cluster-wide singleton in Websphere Cluster

I need to run a component using Apache Camel (or Spring Integration) in a WAS ND 8.0 cluster. Both start some threads on startup and normally stop them on shutdown. Supplying a WAS-managed thread pool is no problem. But those threads must run on only one node of the cluster at any given time. Moreover, the component must be highly available, i.e. it must switch to another node when the active node fails.
The solution I found is the WAS Partitioning Facility, which requires additional Extended Deployment licenses. Is it the only way, or is there some way to implement this with a Network Deployment license only?
Thanks in advance.
I don't think there is a feature that addresses this interesting requirement.
I can imagine a "trick":
A Timer EJB sends a message to a queue (say, one per minute).
Configure a Service Integration Bus (SIB) with High Availability and No Scalability, so that the HA Manager ensures that only one messaging engine (ME) is alive.
Create a non-reliable queue for high performance and low resource consumption.
Configure the Activation Spec to listen only to the local ME.
An MDB implements the following logic: when a message arrives, it checks whether the singleton thread is alive and, if it is not, starts the thread.
Does it make sense?
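The MDB logic in the last step ("check if the thread is alive, otherwise start it") can be sketched in Python as follows. This only models the local check inside one process; the cluster-wide guarantee in the trick above still comes from the single active messaging engine. `SingletonRunner` and `on_message` are hypothetical names, with `on_message` playing the role of the MDB's message callback.

```python
import threading
from typing import Callable, Optional


class SingletonRunner:
    """Starts the background thread on a heartbeat unless it is already running."""

    def __init__(self, target: Callable[[], None]) -> None:
        self._target = target
        self._lock = threading.Lock()
        self._thread: Optional[threading.Thread] = None

    def on_message(self) -> bool:
        """Handle one heartbeat message; returns True if a new thread was started."""
        with self._lock:  # avoid two concurrent heartbeats both starting a thread
            if self._thread is not None and self._thread.is_alive():
                return False
            self._thread = threading.Thread(target=self._target, daemon=True)
            self._thread.start()
            return True
```

The first heartbeat starts the thread, later heartbeats do nothing while it is alive, and a heartbeat that arrives after the thread has died starts it again.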