Real world example of Apache Helix, Zookeeper, Mesos and Erlang? - apache

I am new in
Apache ZooKeeper : ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
Apache Mesos : Apache Mesos is a cluster manager that simplifies the complexity of running applications on a shared pool of servers.
Apache Helix : Apache Helix is a generic cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes.
Erlang Langauge : Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability.
It sounds to me that Helix and Mesos both are useful for Clustering management System. How they are related to ZooKeeper? It'd better if someone give me a real world example for their usage.
I am curious to know How [BOINC][1] are distributing tasks to their clients? Are they using any of the above technologies? (Forget about Erlang).
I just need a brief view on it :)

Erlang was built by Ericsson, designed for use in phone systems. By design, it runs hundreds, thousands, or even 10s of thousands of small processes to handle tasks by sending information between them instead of sharing memory or state. This enables all sorts of interesting features that are great for high availability distributed systems such as:
hot code reloading. Each process is paused, it's relevant module code is swapped out, and it is resumed where it left off, so deploys can happen without restarting or causing significant interruption.
Easy distributed messaging and clustering. Sending a message to a local process or a remote one is fairly seamless in most instances.
Process-local GC. Garbage collection happens in each process independently instead of a global stop-the-world even like java, aiding in low-latency results.
Supervision trees and complex process hierarchy and monitoring/managing.
A few concrete real-world examples that makes great use of Erlang would be:
MongooseIM A highly performant and incredibly scalable, distributed XMPP / Chat server
Riak A distributed key/value store.
Mesos, on the other hand, you can sort of think of as a platform effectively for turning a datacenter of servers into a platform for teams and developers. If I, say as a company, own a datacenter with 10,000 physical servers, and I have 1,000 engineers developing hundreds of services, a good way to allow the engineers to deploy and manage services across that hardware without them needing to worry about the servers directly. It's an abstraction layer over-top of the physical servers to that allows you to share and intelligently allocate resources.
As a user of Mesos, I might say that I have Service X. It's an executable bundle that lives in location Y. Each instance of Service X needs 4 GB of RAM and 2 cores. And I need 8 instances which will be attached to a load balancer. You can specify this in configuration and deploy based on that config. Mesos will find hardware that has enough ram and CPU capacity available to handle each instance of that service and start it running in each of those locations.
It can handle a lot of other more complex topics about the orchestration of them as well, but that's probably a bit in-depth for this :)
Zookeepers most common use cases are Service Discover and configuration management. You can think of it, fundamentally, a bit like a nested key value store, where services can look at pre-defined paths to see where other services currently live.
A simple example is that I have a web service using a shared database cluster. I know a simple name for that database cluster and where the configuration for it lives in zookeeper. I can look up (or repeatedly poll) that path in zookeeper to check what the addresses of the active database hosts are. And on the other side, if I take a database node out of rotation and replace it with a new one, the config in zookeeper gets updated with the new address, and anything continually looking at it will detect this change and change where it's connected to.
A more complex use case for zookeeper is how Kafka uses it (or did at the time that I last used Kafka). Kafka has streams, and streams have many shards. Each consumer of each stream use zookeeper to save checkpoints in each shard after they have read and processed up to a certain point in the stream. That way if the consumer crashes or is restarted, it knows where to pick up in the stream.

I dont know about Meos and Earlang language. But this article might help you with Helix and Zookeeper.
This article tells us:
Zookeeper is responsible for gluing all parts together where Helix is cluster management component that registers all cluster details (cluster itself, nodes, resources).
The article is related to clustering in JBPM using helix and zookeeper.But with this you will get a basic idea on what helix and zookeeper is used for.
And from most of the articles i read online it seems like zookeeper and helix are used together.

Apache Zookeeper can be installed on a single machine or on a cluster.
It can be used to keep track of logs. It can provide various services on a distributed platform.
Storm and Kafka rely on Zookeeper.
Storm uses Zookeeper to store all state so that it can recover from an outage in any of its (distributed) component services.
Kafka queue consumers can use Zookeeper to store information on what has been consumed from the queue.

Related

why use rabbitmq or similar versus python builtin multiprocessing queue?

I have a producer of tasks and multiple workers to consume those tasks. Many places recommend rabbitmq and/or celery. However python has a builtin multiprocessing queue that can be shared on an ip/port using a manager/proxy. What would be the advantages of using something like rabbitmq instead?
RabbitMq is an enterprise level tool, typically deployed separately on out-of-process servers / VMs / Containers, and plays in the enterprise service bus space.
Rabbit has reliable messaging as an objective - e.g. messages are persisted, and nodes in the cluster can be restarted without losing messages.
Supports a large range of messaging topologies, such as Point-Point, Fan out, and Topic subscriptions
Can be scaled for volume by adding multiple nodes to a cluster
Allows for conditional routing of messages to queues using routing keys or header filters
Agnostic of client technology, i.e. Clients can be on any platform which support the AMQP protocol
Has an out of the box administration, monitoring and diagnostics UI
Has a wide range of extensions and tools, such as shovels allowing messages to be replicated across multiple RabbitMQ clusters.
I'm no Python expert, but from what I understand of the multiprocessing package, it serves as an manager for distributing work between worker processes and threads, so IMO would be regarded as a more local system concern, as opposed to 'enterprise' level.
e.g. you would need to handle persistence, i.e. so messages are not lost during a crash / restart, and would likely need to built your own administration and monitoring tools.

Service Fabric - Local Cluster - Queuing

I am in a situation where I can use Service Fabric (locally) but cannot leverage Azure Service Bus (or anything "cloud"). What would be the corollary for queuing/pub-sub? Service Fabric is allowed since it is able to run in a local container, and is "free". Other 3rd party messaging infrastructure, like RabbitMQ, are also off the table (at the moment).
I've built systems using a locally grown bus, built on MSMQ and WCF, but I don't see how to accomplish the same thing in SF. I suspect I can have SF services use a custom ICommunicationListener that exposes msmq, but that would only be available inside the cluster (the way I understand it). I can build an HTTPBridge (in SF) in front of those to make them available outside the cluster, but then I'd lose the lifetime decoupling (client being able to call a service, using queues, even if that service isn't online at the time) since the bridge itself wouldn't benefit from any of the aspects of queuing.
I have a few possibilities but all suffer from some malady that only exists because of SF, locally. Also, the same code needs to easily deploy to full Azure SF (where I can use ASB and this issue disappears) so I don't want to build two separate systems just because of where I am hosting it in some instances.
Thanks for any tips.
You can build this yourself, for example like this. This uses a BrokerService that will distribute message-data to subscribed services and actors.
You can also run a containerized queuing platform like RabbitMQ with volumes.
By running the queue system inside the cluster you won't introduce an external dependency.
The problem is not SF, The main issue with your design is that you are coupling architectural requirements to implementations. SF runs on top of VirtualMachines, in the end, the only difference is that SF put the services in those machines, using another solution you would have an Agent Deploying these services in there or doing a Manual deployment. The challenges are the same.
It is clear from the description that the requirement in your design is a need for a message queue, the concept of queues are the same does not matter if it is Service Bus, RabbitMQ or MSMQ. Each of then will have the basic foundations of queues with specifics of each implementation, some might add transactions, some might implement multiple patterns, and so on.
If you design based on specific implementation, you will couple your solution to the implementation and make your solution hard to maintain and face challenges like you described.
Solutions like NServiceBus and Masstransit reduce a lot of these coupling from your code, and if you think these are not enough, you can create your own abstraction. Then you use configurations to tied your business logic to implementations.
Despite the above advice, I would not recommend you using different
solutions per environment, because as said previously, each solution
has it's own implementations and they might not assimilate to each other, as example, you might face issues in
production because you developed against MSMQ on DEV and TEST
environments, and when deployed to Production you use ServiceBus, they
have different limitations, like message size, retention period and son
on.
If you are willing to use MSMQ, you can add MSMQ to the VMs running your cluster and connect from your services without any issue. Take a look into this SO first: How can I use MSMQ in Azure Service Fabric

Are Activemq, Redis and Apache camel a right combination?

Are Activemq, Redis and Apache camel a right combination?
Am planning for a high performant enterprise level integration solution accross multiple applications
My objective is to make the solution
a. independent of the consumers performance
b. able to trouble shoot in case of any issue
c. highly available with failover support
d. Hanlde 10k msgs per second
Here I'm planning to have
a. network of activemq brokers running in all app servers and storing the consumed messages in redis data store
b. from redis data store, application can retrieve the messages through camel end points
(camel end point is chosen to process the messages before reaching the app).
Also can ActiveMQ be removed with only Redis + Apache camel, as I see from the discussions forms that Redis does most of the ActiveMQ stuff
Could any one advise on this technology stack.
ActiveMQ and Camel works great together and scales very well - should be no problem to handle the load given proper hardware.
Are you thinking about something like this?
Message producer App -> ActiveMQ -> Camel -> Redis
Message Consumer App <- Camel [some endpoint] <- Redis
Puting ActiveMQ in between is usually a very good way to achieve HA, load balancing and making the solution elastic. Depending on your specific setup with machines etc. ActiveMQ can help in many ways to solve HA issues.
Removing ActiveMQ can a good option if your apps use some other protocol than JMS/ActiveMQ messaging, i.e. HTTP, raw tcp or similar. Can you elaborate on how the apps will communicate with Camel? ActiveMQ, by default, supports transactions, guaranteed delivery and you can live with a limited number of threads on the server, even for your heavy traffic. For other protocols, this might be a bit trickier to achieve. Without a HA layer (cluster) in ActiveMQ you need to setup Redis to handle HA in all aspects, which might be just as easy, but Redis is a bit memory hungry, so be aware of that.

Redis PUBLISH/SUBSCRIBE limits

I'm considering Redis for a section of the architecture of a new project. It will consist of a lot of clients (node.js connections) SUBSCRIBING to particular keys with one process PUBLISHING to those keys as needed.
I'm curious about the limits of the PUBLISH/SUBSCRIBE commands and how to mitigate those. An obvious limit is the amount of file descriptors open on the machine with Redis so at some point I'll need to implement Master-Slave or Consistent Hashing to multiple Redis instances.
Does anyone have any solutions about how to scale this architecture with Redis' PubSub?
Redis PubSub scales really easily since the Master/Slave replication automatically publishes to all slaves.
The easiest way is to load balance the connections to node.js with for instance HAProxy, run a Redis slave on each webserver that syncs with a single master that publishes the messages.
I can't give you exact numbers since that greatly depends on the underlying system, but this should scale extremely well. And you don't need to manage the clients and which server they connect to manually. You obviously need some way to handle session state, so you might need to do that anyway, but that's a lot easier to do in the load balancer than in your application.

WebLogic load balancing

I'm currently developing a project supported on a WebLogic clustered environment. I've successfully set up the cluster, but now I want a load-balancing solution (currently, only for testing purposes, I'm using WebLogic's HttpClusterServlet with round-robin load-balancing).
Is there any documentation that gives a clear comparison (with pros and cons) of the various ways of providing load-balancing for WebLogic?
These are the main topics I want to cover:
Performance (normal and on failover);
What failures can be detected and how fast is the failover recovery;
Transparency to failure (e.g., ability to automatically retry an idempotent request);
How well is each load-balancing solution adapted to various topologies (N-tier, clustering)
Thanks in advance for your help.
Is there any documentation that gives a clear comparison (with pros and cons) of the various ways of providing load-balancing for WebLogic?
It's not clear what kind of application you are building and what kind of technologies are involved. But...
You will find useful information in Failover and Replication in a Cluster and Load Balancing in a Cluster (also look at Cluster Implementation Procedures) but, no real comparison between the different options, at least not to my knowledge. But, the choice isn't that complex: 1. Hardware load balancers will perform better than software load balancers and 2. If you go for software load balancers, then WebLogic plugin for Apache is the recommended (by BEA) choice for production. Actually, for web apps, its pretty usual to put the static files on a web server and thus to use the Apache mod_wl plugin. See the Installing and Configuring the Apache HTTP Server Plug-In chapter.
These are the main topics I want to cover:
Performance (normal and on failover): If this question is about persistent session, WebLogic uses in memory replication by default and this works pretty well with a relatively low overhead.
What failures can be detected and how fast is the failover recovery: It is unclear which protocols you're using. But see Connection Errors and Clustering Failover.
Transparency to failure (e.g., ability to automatically retry an idempotent request): Clarifying the protocols you are using would make answering easier. If this question is about HTTP requests, then see Figure 3-1 Connection Failover.
How well is each load-balancing solution adapted to various topologies (N-tier, clustering): The question is unclear and too vague (for me). But maybe have a look at Cluster Architectures.
Oh, by the way, another nice chapter that you must read Clustering Best Practices.