Is the RavenDB subscription storage a central point of failure for NServiceBus?

I am evaluating using NServiceBus as a SOA mechanism in our product. I'm looking into using the publish/subscribe pattern and my understanding is that the subscription service will store all subscriptions.
Does that mean that if my RavenDB server goes down, my publishers lose the ability to send to subscribers? Or is there a way for a publisher to cache the subscribers it knows about, so that if RavenDB went down it would still deliver to those known subscribers?

You can run the RavenDB server as a replicated node, to avoid this being a single point of failure.

The general pattern is for an endpoint to have a master node that acts as worker and distributor, and then the master node uses a Raven installation on that same server to store its subscriptions and saga storage.
So, it is a point of failure for that one endpoint, but other endpoints in the distributed system will use the Raven installs on their own servers. Thus, the system is kept distributed and the entire system does not have a single point of failure. RavenDB enables this because it is fairly easy to install on any server.
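Purely to illustrate the caching idea from the question (this is not NServiceBus code, and whether NServiceBus caches subscriptions internally depends on the version), a minimal sketch of a subscription lookup that falls back to the last known subscriber list when the store is unreachable might look like:

```typescript
// Purely illustrative; all names are hypothetical, this is not NServiceBus code.
type Subscriber = { endpoint: string };

interface SubscriptionStore {
  // e.g. backed by a RavenDB query in the real system
  query(messageType: string): Promise<Subscriber[]>;
}

class CachingSubscriptionLookup {
  private lastKnown = new Map<string, Subscriber[]>();

  constructor(private store: SubscriptionStore) {}

  async subscribersFor(messageType: string): Promise<Subscriber[]> {
    try {
      const fresh = await this.store.query(messageType);
      this.lastKnown.set(messageType, fresh); // refresh the cache on success
      return fresh;
    } catch {
      // Store is down: publish to the last known subscriber list instead.
      return this.lastKnown.get(messageType) ?? [];
    }
  }
}
```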
Contrast this to SQL Server, which is frequently centralized, scaled up to the max, and even clustered in order to provide high availability. (Read: expensive!)

You can also run RavenDB in a Windows failover cluster where the nodes use a shared SAN for the RavenDB data files. If the active node dies, another takes over. Since the data is stored on the SAN, you shouldn't even notice the failover except for the time it takes to start the RavenDB Windows service on the new node. Check out http://ravendb.net/docs/server/administration/fmc_configuration
This is also the recommended setup for High Availability when running with Distributors. http://docs.particular.net/nservicebus/scalability-and-ha/distributor/

Related

How can I set up Redis Cluster mode or master-slave mode in PCF?

This is regarding the use case where we are trying to use Redis in PCF (Pivotal Cloud Foundry). In our use case, we will refresh the Redis cache once or twice daily with the required data, and then the API will query Redis and provide the response.
One thing of particular concern for us is that we want API queries to be served from Redis only, which means Redis must be available at all times. But whenever we refresh the Redis DB, Redis would not be able to serve the APIs since it is refreshing the keys. To avoid that, we wanted to set up Redis in cluster mode or master-slave mode, so that while one instance is being written to, another can be read from.
How can we set up Redis cluster or master-slave mode in PCF and then fulfil our requirement?
Please provide any other suggestions as well that you may have.
At the time I write this, the Redis for Pivotal Platform product does not support clustering. See Availability, in the docs here -> https://docs.pivotal.io/redis/2-3/erc.html#offerings.
All Redis for Pivotal Platform services are single VMs without clustering capabilities. This means that planned maintenance jobs (e.g., upgrades) can result in 2–10 minutes of downtime, depending on the nature of the upgrade. Unplanned downtime (e.g., VM failure) also affects the Redis service.
Redis for Pivotal Platform has been used successfully in enterprise-ready apps that can tolerate downtime. Pre-existing data is not lost during downtime with the default persistence configuration. Successful apps include those where the downtime is passively handled or where the app handles failover logic.
If you require clustered Redis, you'd need to look at a different offering. Redis Labs has some offerings that integrate with PCF, you could use a Cloud Provider's Redis offering, or you could host your own.
If the solution you use isn't integrated into PCF, you can create a user-provided service with cf cups and provide the Redis credentials to your application that way. It will function just like a Redis service instance created through the marketplace.
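For instance, assuming you created and bound a user-provided service named my-redis (the name and the ioredis client are illustrative choices), the app can pull the credentials out of the VCAP_SERVICES environment variable, where user-provided bindings appear under the "user-provided" key:

```typescript
import Redis from "ioredis";

// User-provided services appear under the "user-provided" key in VCAP_SERVICES.
const vcap = JSON.parse(process.env.VCAP_SERVICES ?? "{}");
const binding = (vcap["user-provided"] ?? []).find(
  (s: { name: string }) => s.name === "my-redis" // hypothetical service name
);
if (!binding) throw new Error("service binding 'my-redis' not found");

const redis = new Redis({
  host: binding.credentials.host,
  port: binding.credentials.port,
  password: binding.credentials.password,
});
```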

Best Redis setup for session caching

I see there are multiple modes of operation for Redis (cluster, sentinel, master-slave, etc.). I don't fully understand the implications of each, but my question is this:
If I have a web application that requires distributed session persistence, which configuration of Redis makes the most sense? The main reason I'm using Redis is to achieve some level of fault tolerance. If one of my frontend servers fails, I want the sessions to be available for other nodes to pick up the workload. If a Redis node goes down, I don't want this to affect the user experience, and I don't want to have to wake up a developer at midnight to correct the matter.
From everything I've read, Redis Sentinel is the way to go for fault tolerance.
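As a sketch of what that looks like from the application side, a node.js session store might connect through Sentinel with the ioredis client (the Sentinel hostnames and the master group name mymaster are assumptions); the client asks the Sentinels for the current master and follows it across failovers:

```typescript
import Redis from "ioredis";

const sessionStore = new Redis({
  // The client queries these Sentinels for the current master's address.
  sentinels: [
    { host: "sentinel-1", port: 26379 },
    { host: "sentinel-2", port: 26379 },
    { host: "sentinel-3", port: 26379 },
  ],
  name: "mymaster", // the master group Sentinel monitors (assumed name)
});

// Session reads/writes go to whichever node is currently the master;
// after a failover, ioredis reconnects to the newly promoted master.
async function saveSession(id: string, data: object): Promise<void> {
  await sessionStore.set(`session:${id}`, JSON.stringify(data), "EX", 3600);
}
```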

Real world example of Apache Helix, Zookeeper, Mesos and Erlang?

I am new to these technologies:
Apache ZooKeeper : ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
Apache Mesos : Apache Mesos is a cluster manager that simplifies the complexity of running applications on a shared pool of servers.
Apache Helix : Apache Helix is a generic cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes.
Erlang Language : Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability.
It sounds to me that Helix and Mesos are both useful for cluster management. How are they related to ZooKeeper? It would be better if someone could give me a real-world example of their usage.
I am also curious to know how BOINC distributes tasks to its clients. Is it using any of the above technologies? (Forget about Erlang.)
I just need a brief view on it :)
Erlang was built by Ericsson, designed for use in phone systems. By design, it runs hundreds, thousands, or even tens of thousands of small processes that handle tasks by sending information between them instead of sharing memory or state. This enables all sorts of interesting features that are great for high-availability distributed systems, such as:
Hot code reloading. Each process is paused, its relevant module code is swapped out, and it is resumed where it left off, so deploys can happen without restarting or causing significant interruption.
Easy distributed messaging and clustering. Sending a message to a local process or a remote one is fairly seamless in most instances.
Process-local GC. Garbage collection happens in each process independently instead of as a global stop-the-world event like Java's, aiding low-latency results.
Supervision trees, for complex process hierarchies and monitoring/management.
A few concrete real-world examples that make great use of Erlang:
MongooseIM, a highly performant and incredibly scalable distributed XMPP/chat server.
Riak, a distributed key/value store.
Mesos, on the other hand, you can think of as a platform for turning a datacenter of servers into a platform for teams and developers. If I, as a company, own a datacenter with 10,000 physical servers and have 1,000 engineers developing hundreds of services, Mesos gives me a good way to let the engineers deploy and manage services across that hardware without needing to worry about the servers directly. It's an abstraction layer on top of the physical servers that allows you to share and intelligently allocate resources.
As a user of Mesos, I might say that I have Service X. It's an executable bundle that lives in location Y. Each instance of Service X needs 4 GB of RAM and 2 cores. And I need 8 instances which will be attached to a load balancer. You can specify this in configuration and deploy based on that config. Mesos will find hardware that has enough ram and CPU capacity available to handle each instance of that service and start it running in each of those locations.
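For a concrete flavour of that configuration, here is roughly what such a service definition looks like in Marathon (a commonly used scheduler that runs on top of Mesos). The field names follow Marathon's JSON app format; the id and command are made up, and it's shown as a TypeScript object mirroring the JSON you would POST to Marathon:

```typescript
// Marathon-style app definition for the example above. Field names follow
// Marathon's JSON format; the id and cmd values are made up.
const serviceX = {
  id: "/service-x",
  cmd: "/opt/service-x/run", // the executable bundle at "location Y"
  cpus: 2,        // 2 cores per instance
  mem: 4096,      // 4 GB of RAM per instance, in MB
  instances: 8,   // 8 instances, to be put behind a load balancer
};

console.log(JSON.stringify(serviceX, null, 2)); // what you'd submit to Marathon
```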
It can handle a lot of other more complex topics about the orchestration of them as well, but that's probably a bit in-depth for this :)
ZooKeeper's most common use cases are service discovery and configuration management. You can think of it, fundamentally, a bit like a nested key/value store, where services can look at predefined paths to see where other services currently live.
A simple example is that I have a web service using a shared database cluster. I know a simple name for that database cluster and where its configuration lives in ZooKeeper. I can look up (or repeatedly poll) that path in ZooKeeper to check what the addresses of the active database hosts are. And on the other side, if I take a database node out of rotation and replace it with a new one, the config in ZooKeeper gets updated with the new address, and anything continually watching it will detect the change and reconnect accordingly.
A more complex use case for ZooKeeper is how Kafka uses it (or did at the time I last used Kafka). Kafka has streams, and streams have many shards. Each consumer of each stream uses ZooKeeper to save checkpoints in each shard after it has read and processed up to a certain point in the stream. That way, if the consumer crashes or is restarted, it knows where to pick up in the stream.
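A minimal sketch of the simple lookup-and-watch pattern, using the node-zookeeper-client npm package (the znode path and ZooKeeper hosts are made up; the package ships without TypeScript types, so treat this as illustrative):

```typescript
import zookeeper from "node-zookeeper-client";

const client = zookeeper.createClient("zk-1:2181,zk-2:2181,zk-3:2181");

// Look up the active database hosts and re-arm a watch so the app hears
// about changes. ZooKeeper watches fire only once, hence the recursion.
function watchDbHosts(): void {
  client.getData(
    "/services/db-cluster/hosts", // made-up znode path
    () => watchDbHosts(),
    (error: Error | null, data: Buffer) => {
      if (error) return console.error(error);
      console.log("active DB hosts:", data.toString("utf8"));
      // ...repoint the connection pool at the new addresses here...
    }
  );
}

client.once("connected", watchDbHosts);
client.connect();
```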
I don't know about Mesos and the Erlang language. But this article might help you with Helix and ZooKeeper.
This article tells us:
ZooKeeper is responsible for gluing all the parts together, while Helix is the cluster management component that registers all cluster details (the cluster itself, nodes, resources).
The article is about clustering in jBPM using Helix and ZooKeeper, but it will give you a basic idea of what Helix and ZooKeeper are used for.
From most of the articles I have read online, it seems ZooKeeper and Helix are used together.
Apache Zookeeper can be installed on a single machine or on a cluster.
It can be used to keep track of logs. It can provide various services on a distributed platform.
Storm and Kafka rely on Zookeeper.
Storm uses Zookeeper to store all state so that it can recover from an outage in any of its (distributed) component services.
Kafka queue consumers can use Zookeeper to store information on what has been consumed from the queue.

LDAP Fault-tolerance configuration (e.g SunOne)

Does anybody know how to configure fault tolerance for LDAP, e.g. SunOne LDAP?
I searched via Google without any useful result.
Thanks
Assuming that by "fault tolerance" you mean high availability (HA), it can be achieved through redundancy, and this is not peculiar to SunOne or directory server software from any other vendor.
There are different ways to solve this. It depends on the business requirements and the affordability. One method that comes to mind is to have the LDAP software installed on an HA pair. This requires hardware and OS capabilities for fail-over and it requires two servers (in a world of virtualization, "server" can mean different things [physical box, frame, LPAR, etc.]; so, I'll just leave the interpretation to the reader). When one server fails, the other server takes over and assumes the primary role in the pair. This is the fault-tolerance part. In this approach, the machine/server with the secondary role is passive (i.e., it's not serving clients) until the primary goes down. You will need to implement LDAP data replication between two servers. They can be two LDAP masters in a P2P replication topology.
Another method is to have multiple LDAP servers (i.e., masters, replicas) and cluster them using a network dispatcher (ND) software/appliance/etc., which distributes the incoming traffic to the individual servers (usually replicas) in the cluster. If you lose one replica in the cluster, the ND will not send any traffic to that replica until it comes back. The other replicas, however, will still receive load and serve the incoming traffic. That is the fault-tolerance part in this method.
The degree of availability you want will also dictate what can be done in a clustered environment. You can have a single LDAP master (to which the organization's applications make updates) and keep it out of the cluster, but pair it with another server for fail-over, so you don't lose availability for updates from the applications. This also gives you the freedom to do maintenance on the master without interrupting your applications (they do need to be written to be able to write to more than one LDAP master if the primary one is unavailable). You would have the secondary server receive replication from the primary in any case.
If the budget doesn't let you have more servers/replicas, you can put the master server in the cluster along with the replicas to help with the read traffic. Instead of an HA pair in which one of the servers is passive, you can also have two masters configured in a P2P replication topology and put them both in the cluster to help with the traffic. There are different ways to approach this method depending on the level of redundancy wanted or what can be afforded.
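Fail-over can live in the infrastructure (the HA pair or the network dispatcher described above) or in the client. Purely as an illustration of the client-side variant, and not anything SunOne-specific, here is a small node.js/TypeScript sketch using the ldapjs package that tries redundant replicas in order (hostnames are made up):

```typescript
import ldap from "ldapjs";

const replicas = ["ldap://ldap-1:389", "ldap://ldap-2:389"];

// Try each replica in order until one accepts the connection.
async function connectWithFailover(urls: string[]): Promise<ldap.Client> {
  for (const url of urls) {
    try {
      return await new Promise<ldap.Client>((resolve, reject) => {
        const client = ldap.createClient({ url, connectTimeout: 2000 });
        client.once("connect", () => resolve(client));
        client.once("error", reject);
      });
    } catch {
      // This replica is down; fall through to the next one.
    }
  }
  throw new Error("no LDAP replica reachable");
}
```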

Redis PUBLISH/SUBSCRIBE limits

I'm considering Redis for a section of the architecture of a new project. It will consist of a lot of clients (node.js connections) SUBSCRIBING to particular channels, with one process PUBLISHING to those channels as needed.
I'm curious about the limits of the PUBLISH/SUBSCRIBE commands and how to mitigate them. An obvious limit is the number of file descriptors open on the machine running Redis, so at some point I'll need to implement master-slave replication or consistent hashing across multiple Redis instances.
Does anyone have any solutions about how to scale this architecture with Redis' PubSub?
Redis PubSub scales really easily, since master/slave replication automatically propagates published messages to all slaves.
The easiest way is to load-balance the node.js connections with, for instance, HAProxy, and run a Redis slave on each web server that syncs with the single master to which the messages are published.
I can't give you exact numbers since that greatly depends on the underlying system, but this should scale extremely well. And you don't need to manage the clients and which server they connect to manually. You obviously need some way to handle session state, so you might need to do that anyway, but that's a lot easier to do in the load balancer than in your application.
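Concretely, the topology described above might look like this with the ioredis client (hostnames and the channel name are assumptions): the one process publishes to the master, and each web server's node.js process subscribes through its local slave, which receives the messages via replication:

```typescript
import Redis from "ioredis";

// Publisher process: publish to the single master.
const master = new Redis({ host: "redis-master", port: 6379 });

async function publishUpdate(payload: object): Promise<void> {
  await master.publish("updates", JSON.stringify(payload));
}

// On each web server: subscribe through the local slave. Published
// messages reach the slave via replication, so the node.js processes
// never need to know where the master is.
const localSlave = new Redis({ host: "127.0.0.1", port: 6379 });
void localSlave.subscribe("updates");
localSlave.on("message", (_channel, message) => {
  // fan the message out to the connected browser clients here
  console.log("received:", message);
});
```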