I am building a distributed app in which a set of processes (running on separate machines) are working together. Each process works on its own set of "resources" (and there is no sharing of resources among the processes). When the app starts, I need to distribute the pool of resources among the processes so that they can work independently. In order to do the resource distribution, I am considering using Apache Curator's Leader Election recipe.
Once a process is chosen as leader, it will do the resource distribution and then the remaining processes can work on their assigned resources. While the leader is doing the work, other processes have to be "blocked". What is the curator behavior for the non leader processes? Do they immediately get a failure from the leader election api or they get blocked?
Related
The docs say:
Assuming all cluster members are available, a client can connect to
any node and perform any operation. Nodes will route operations to the
queue master node transparently to clients.
What does this "route" mean exactly? Does the node I connected to redirect all the traffic to another node? It means that network utilization is doubled, doesn't it?
Should I try to connect to the master node of the queue when I'm only going to make operation on this single queue only?
The question is, can I save some resources by connecting to the home
node?
Yes you can, but it is a premature optimization that is probably not worth the effort. You can only verify this via running benchmarks.
You will probably spend more time developing code to ensure your application connects to the master node, setting up your benchmark environment and running them than you would save by this optimization.
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
We have been working on a application using RabbitMQ which is installed in local development server. Now we want to move this application to integration and QA environments then the problem is do we need to setup RabbitMQ in different environments or is there any ways to have one central location for RabbitMQ with same exchange and queue names.
You can certainly have a central RabbitMQ instance/cluster that can be shared by different applications and different teams. If you want to go down this route, I'd recommend isolating data that belongs to each team from others' using Rabbit's access control mechanisms. Virtual hosts allow users to share Rabbit server/cluster resources without stepping on each other's queues.
If, for whatever reason, you decide to stick to a single virtual host in a shared environment, I'd advise against sharing the same exchange and queue names with other teams (or even other devs!) not least because of potential for conflict between the different parties' data. I say 'potential' because not knowing the architecture of your application and queues, there may not be a possibility for such a conflict so it's up to you to decide whether there would be a problem.
Finally, if the desire to share queues and exchanges with other teams is driven by concern over setup effort, I'd suggest automating queue configuration or building it into your application's startup routines to avoid headaches.
Hope this helps!
I want to know what exactly it means when a web server describes itself as a pre-fork web server. I have a few examples such as unicorn for ruby and gunicorn for python.
More specifically, these are the questions:
What problem does this model solve?
What happens when a pre-fork web server is initially started?
How does it handle requests?
Also, a more specific question for unicorn/gunicorn:
Let's say that I have a webapp that I want to run with (g)unicorn. On initialization, the webapp will do some initialization stuff (e.g. fill in additional database entries). If I configure (g)unicorn with multiple workers, will the initialization stuff be run multiple times?
Pre-forking basically means a master creates forks which handle each request. A fork is a completely separate *nix process.
Update as per the comments below. The pre in pre-fork means that these processes are forked before a request comes in. They can however usually be increased or decreased as the load goes up and down.
Pre-forking can be used when you have libraries that are NOT thread safe. It also means issues within a request causing problems will only affect the process which they are processed by and not the entire server.
The initialisation running multiple times all depends on what you are deploying. Usually however connection pools and stuff of that nature would exist for each process.
In a threading model the master would create lighter weight threads to dispatch requests too. But if a thread causes massive issues it could have repercussions for the master process.
With tools such as Nginx, Apache 2.4's Event MPM, or gevent (which can be used with Gunicorn) these are asynchronous meaning a process can handle hundreds of requests whilst not blocking.
How does a "pre-fork worker model" work?
Master Process: There is a master process that spawns and kills workers, depending on the load and the capacity of the hardware. More incoming requests would cause the master to spawn more workers, up to a point where the "hardware limit" (e.g. all CPUs saturated) is reached, at which point queing will set in.
Workers: A worker can be understood as an instance of your application/server. So if there are 4 workers, your server is booted 4 times. It means it occupies 4 times the "Base-RAM" than only one worker would, unless you do shared memory wizardry.
Initialization: Your initialization logic needs to be stable enough to account for multiple servers. For example, if you write db entries, check if they are there already or add a setup job before your app server
Pre-fork: The "pre" in prefork means that the master always adds a bit more capacity than currently required, such that if the load goes up the system is "already ready". So it preemptively spawns some workers. For example in this apache library, you control this with the MinSpareServers property.
Requests: The requests (TCP connection handles) are being passed from the master process to the children.
What problem do pre-fork servers solve?
Multiprocessing: If you have a program that can only target one CPU core, you potentially waste some of your hardware's capacity by only spawning one server. The forked workers tackle this problem.
Stability: When one worker crashes, the master process isn't affected. It can just spawn a new worker.
Thread safety: Since it's really like your server is booted multiple times, in separate processes, you don't need to worry about threadsafety (since there are no threads). This means it's an appropriate model when you have non-threadsafe code or use non-threadsafe libs.
Speed: Since the child processes aren't forked (spawned) right when needed, but pre-emptively, the server can always respond fast.
Alternatives and Sidenotes
Container orchestration: If you're familiar with containerization and container orchestration tools such as kubernetes, you'll notice that many of the problems are solved by those as well. Kubernetes spawns multiple pods for multiprocessing, it has the same (or better) stability and things like "horizontal pod autoscalers" that also spawn and kill workers.
Threading: A server may spawn a thread for each incoming request, which allows for many requests being handled "simultaneously". This is the default for most web servers based on Java, since Java natively has good support for threads. Good support meaning the threads run truly parallel, on different cpu cores. Python's threads on the other hand cannot truly parallelize (=spread work to multiple cores) due to the GIL (Global Interpreter Lock), they only provide a means for contex switching. More on that here. That's why for python servers "pre-forkers" like gunicorn are so popular, and people coming from Java might have never heard of such a thing before.
Async / non-blocking processing: If your servers spend a lot of time "waiting", for example disk I/O, http requests to external services or database requests, then multiprocessing might not be what you want. Instead consider making your code "non-blocking", meaning that it can handle many requests concurrently. Async / await (coroutines) based systems like fastapi (asgi server) in python, Go or nodejs use this mechanism, such that even one server can handle many requests concurrently.
CPU bound tasks: If you have CPU bound tasks, the non-blocking processing mentioned above won't help much. Then you'll need some way of multiprocessing to distribute the load on your CPU cores, as the solutions mentioned above, that is: container orchestration, threading (on systems that allow true parallelization) or... pre-forked workers.
Sources
https://www.reddit.com/r/learnprogramming/comments/25vdm8/what_is_a_prefork_worker_model_for_a_server/
https://httpd.apache.org/docs/2.4/mod/prefork.html
I am new in
Apache ZooKeeper : ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
Apache Mesos : Apache Mesos is a cluster manager that simplifies the complexity of running applications on a shared pool of servers.
Apache Helix : Apache Helix is a generic cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes.
Erlang Langauge : Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability.
It sounds to me that Helix and Mesos both are useful for Clustering management System. How they are related to ZooKeeper? It'd better if someone give me a real world example for their usage.
I am curious to know How [BOINC][1] are distributing tasks to their clients? Are they using any of the above technologies? (Forget about Erlang).
I just need a brief view on it :)
Erlang was built by Ericsson, designed for use in phone systems. By design, it runs hundreds, thousands, or even 10s of thousands of small processes to handle tasks by sending information between them instead of sharing memory or state. This enables all sorts of interesting features that are great for high availability distributed systems such as:
hot code reloading. Each process is paused, it's relevant module code is swapped out, and it is resumed where it left off, so deploys can happen without restarting or causing significant interruption.
Easy distributed messaging and clustering. Sending a message to a local process or a remote one is fairly seamless in most instances.
Process-local GC. Garbage collection happens in each process independently instead of a global stop-the-world even like java, aiding in low-latency results.
Supervision trees and complex process hierarchy and monitoring/managing.
A few concrete real-world examples that makes great use of Erlang would be:
MongooseIM A highly performant and incredibly scalable, distributed XMPP / Chat server
Riak A distributed key/value store.
Mesos, on the other hand, you can sort of think of as a platform effectively for turning a datacenter of servers into a platform for teams and developers. If I, say as a company, own a datacenter with 10,000 physical servers, and I have 1,000 engineers developing hundreds of services, a good way to allow the engineers to deploy and manage services across that hardware without them needing to worry about the servers directly. It's an abstraction layer over-top of the physical servers to that allows you to share and intelligently allocate resources.
As a user of Mesos, I might say that I have Service X. It's an executable bundle that lives in location Y. Each instance of Service X needs 4 GB of RAM and 2 cores. And I need 8 instances which will be attached to a load balancer. You can specify this in configuration and deploy based on that config. Mesos will find hardware that has enough ram and CPU capacity available to handle each instance of that service and start it running in each of those locations.
It can handle a lot of other more complex topics about the orchestration of them as well, but that's probably a bit in-depth for this :)
Zookeepers most common use cases are Service Discover and configuration management. You can think of it, fundamentally, a bit like a nested key value store, where services can look at pre-defined paths to see where other services currently live.
A simple example is that I have a web service using a shared database cluster. I know a simple name for that database cluster and where the configuration for it lives in zookeeper. I can look up (or repeatedly poll) that path in zookeeper to check what the addresses of the active database hosts are. And on the other side, if I take a database node out of rotation and replace it with a new one, the config in zookeeper gets updated with the new address, and anything continually looking at it will detect this change and change where it's connected to.
A more complex use case for zookeeper is how Kafka uses it (or did at the time that I last used Kafka). Kafka has streams, and streams have many shards. Each consumer of each stream use zookeeper to save checkpoints in each shard after they have read and processed up to a certain point in the stream. That way if the consumer crashes or is restarted, it knows where to pick up in the stream.
I dont know about Meos and Earlang language. But this article might help you with Helix and Zookeeper.
This article tells us:
Zookeeper is responsible for gluing all parts together where Helix is cluster management component that registers all cluster details (cluster itself, nodes, resources).
The article is related to clustering in JBPM using helix and zookeeper.But with this you will get a basic idea on what helix and zookeeper is used for.
And from most of the articles i read online it seems like zookeeper and helix are used together.
Apache Zookeeper can be installed on a single machine or on a cluster.
It can be used to keep track of logs. It can provide various services on a distributed platform.
Storm and Kafka rely on Zookeeper.
Storm uses Zookeeper to store all state so that it can recover from an outage in any of its (distributed) component services.
Kafka queue consumers can use Zookeeper to store information on what has been consumed from the queue.
I need to run a component using Apache Camel (or Spring Integration) under WAS ND 8.0 cluster. They both run some threads on startup, and stop them on shutdown normally. No problem to supply WAS managed threadpool. But that threads must run on single cluster's node at the same time. Moreover it must be high-available i.e. switch to other node when active node falls.
Solution I found - is WAS Partitioning Facility. It requires additional Extended Deployment licenses. Is it the only way, or there is some way to implement this using Network Deployment license only?
Thanks in advance.
I think that there is not a feature that address this interesting requirement.
I can imagine a "trick":
A Timer EJB send a message on a queue (let's say 1 per minute)
Configure a Service Integration Bus (SIB) with High Availability and No Scalability, so the HA Manager ensure that only one messaging engine (ME) is alive.
Create a non-reliable queue for high performances and low resource consumption.
The Activation Spec should be configured to listen only local ME.
A MDB implement the following logic: when the message arrives, it check if the singleton thread is alive, otherwise it start the thread.
Does it make sense?