My understanding could be amiss here. As I understand it, Couchbase uses a smart client to automatically select which node to write to or read from in a cluster. What I DON'T understand is, when this data is written/read, is it also immediately written to all other nodes? If so, in the event of a node failure, how does Couchbase know to use a different node from the one that was 'marked as the master' for the current operation/key? Do you lose data in the event that one of your nodes fails?
This sentence from the Couchbase Server Manual gives me the impression that you do lose data (which would make Couchbase unsuitable for high availability requirements):
With fewer larger nodes, in case of a node failure the impact to the
application will be greater
Thank you in advance for your time :)
By default, when data is written to Couchbase, the client returns success as soon as the data has been written to one node's memory. After that, Couchbase persists it to disk and performs replication.
If you want to ensure that data has been persisted to disk, most client libraries provide functions that let you do that. With the help of those functions you can also ensure that the data has been replicated to another node. This mechanism is called observe.
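For example, with the Couchbase Python SDK 2.x the durability requirement can be passed on the write itself (as far as I recall these keyword arguments are built on observe under the hood). This is a rough sketch; the host name, credentials and bucket are made up, and the exact API depends on your SDK version:

    # Sketch: ask the SDK to wait until the write is on disk on 1 node
    # and replicated to 1 more node before returning (assumed 2.x API).
    from couchbase.cluster import Cluster, PasswordAuthenticator

    cluster = Cluster('couchbase://node1.example.com')            # hypothetical node
    cluster.authenticate(PasswordAuthenticator('user', 'pass'))   # hypothetical credentials
    bucket = cluster.open_bucket('default')

    # Raises an exception if the durability requirement cannot be met in time.
    bucket.upsert('order::42', {'status': 'paid'},
                  persist_to=1, replicate_to=1)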
When a node goes down, it should be failed over. Couchbase Server can do that automatically when an auto-failover timeout is set in the server settings. For example, if you have a 3-node cluster and the stored data has 2 replicas, you will not lose data when one node goes down. If a second node fails you still will not lose all your data - it will be available on the last node.
If a node that was master for a key goes down and is failed over, another live node becomes master for it. In your client you point to all servers in the cluster, so if it is unable to retrieve data from one node, it tries to get it from another.
Also, if you have 2 nodes at your disposal, you can install 2 separate Couchbase servers, configure XDCR (cross datacenter replication) between them, and check server availability yourself with HAProxy or something similar. That way you get a single IP to connect to (the proxy's), which will automatically route you to the live server.
Couchbase is indeed a good system for HA requirements.
Let me explain in a few sentences how it works. Suppose you have a 5-node cluster. The application, using the client API/SDK, is always aware of the topology of the cluster (and any change in the topology).
When you set/get a document in the cluster, the client API uses the same algorithm as the server to choose which node it should be written to. The client selects the node using a CRC32 hash of the key and writes to that node. Then, asynchronously, the cluster copies 1 or more replicas to the other nodes (depending on your configuration).
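Conceptually, the client-side routing looks like the sketch below: hash the key with CRC32, map it to a partition (vBucket), then look that partition up in the cluster map to find the node holding the active copy. This is only an illustration of the idea; the 1024-vBucket count, the node list and the trivial map are assumptions, not the SDK's actual code:

    import zlib

    NUM_VBUCKETS = 1024                                      # typical default (assumption)
    nodes = ['nodeA:11210', 'nodeB:11210', 'nodeC:11210']    # hypothetical cluster

    # Simplified vBucket map: vbucket index -> node holding the active copy.
    # A real client downloads this map from the cluster and refreshes it on
    # every topology change.
    vbucket_map = [i % len(nodes) for i in range(NUM_VBUCKETS)]

    def node_for_key(key: str) -> str:
        vbucket = zlib.crc32(key.encode()) % NUM_VBUCKETS
        return nodes[vbucket_map[vbucket]]

    # Every client computes the same node for the same key.
    print(node_for_key('user::1234'))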
Couchbase has only 1 active copy of a document at a time, which makes it easy to stay consistent: the application gets and sets against this active copy.
In case of failure, the server has some work to do. Once the failure is discovered (automatically or by a monitoring system), a "failover" occurs. This means that the replicas are promoted to active and it is now possible to work as before. Usually you then rebalance the cluster to distribute the data properly across the remaining nodes.
The sentence you are quoting simply says that the fewer nodes you have, the bigger the impact of a failure/rebalance will be, since the same number of requests will be routed to a smaller number of nodes. Fortunately, you do not lose data ;)
You can find some very detailed information about this way of working on the Couchbase CTO's blog:
http://damienkatz.net/2013/05/dynamo_sure_works_hard.html
Note: I am working as developer evangelist at Couchbase
I'm currently using a Redis master-slave configuration with HAProxy for WordPress to get high availability. This configuration is nice and works perfectly (I can take any server down for maintenance without downtime). The problem with this configuration is that only one Redis server gets all the traffic and the others just wait in case that server dies, so on a very high-load site this can become a problem, and adding more servers is not a solution because only one will ever be master.
With this in mind, I'm wondering whether I can just use a Redis Cluster to allow reads/writes on all nodes, but I'm not really sure it will work with my setup.
My setup is limited to three nodes most of the time, and I've read in some places that the minimal Redis Cluster setup is three nodes, but six is recommended. That is reasonable because it allows slave nodes to become masters if their master dies, so no data is lost. But what happens if the data doesn't matter? In my setup the data is just cached objects, so if something doesn't exist it is simply created again. So which is it:
The data will be lost (which I don't care about), and the other nodes will get the objects from clients again and serve them on later requests (as happens when I flush the data); or
The nodes will answer that the data doesn't exist and will refuse to cache it, because the object would have to live on another node that is dead.
Does anyone know?
Thanks!!
When a master dies, the Redis cluster goes to a down state and any command involving a key served by the failed instance will fail.
This may differ from some other distributed software because Redis Cluster is not the kind of system where every master holds all the data. In fact, the key space is horizontally partitioned and each key is served by only one master.
This is mentioned in the specification:
The key space is split into 16384 slots...
a single hash slot will be served by a single node...
The base algorithm used to map keys to hash slots is the following:
HASH_SLOT = CRC16(key) mod 16384
When you set up a cluster, you assign each node a set of slots, and each slot can only be served by one node. If a node dies, you lose the slots on that node unless you have a slave that can fail over to serve them, and any command involving keys mapped to those slots will fail.
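For illustration, you can compute the slot for a key yourself. Redis Cluster uses the XMODEM variant of CRC16, which matches Python's binascii.crc_hqx with an initial value of 0 (the example keys below are made up):

    import binascii

    def hash_slot(key: str) -> int:
        # If the key contains a hash tag {...}, only that part is hashed,
        # so related keys can be forced onto the same slot.
        start = key.find('{')
        if start != -1:
            end = key.find('}', start + 1)
            if end > start + 1:
                key = key[start + 1:end]
        # CRC16 (XMODEM) mod 16384, as in the specification quoted above.
        return binascii.crc_hqx(key.encode(), 0) % 16384

    print(hash_slot('user:1000'))                # served by whichever master owns this slot
    print(hash_slot('{user:1000}.following'))    # same slot as above, thanks to the hash tag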
I've found different ZooKeeper definitions across multiple resources. Maybe some of them are taken out of context, but please take a look at them:
A canonical example of Zookeeper usage is distributed-memory computation...
ZooKeeper is an open source Apache™ project that provides a centralized infrastructure and services that enable synchronization across a cluster.
Apache ZooKeeper is an open source file application program interface (API) that allows distributed processes in large systems to synchronize with each other so that all clients making requests receive consistent data.
I've worked with Redis and Hazelcast, so it would be easier for me to understand ZooKeeper by comparing it with them.
Could you please compare Zookeeper with in-memory-data-grids and Redis?
If distributed-memory computation, how does zookeeper differ from in-memory-data-grids?
If it's synchronization across a cluster, then how does it differ from all the other in-memory stores? In-memory data grids also provide cluster-wide locks, and Redis also has some kind of transactions.
If it's only about consistent in-memory data, then there are other alternatives. IMDGs let you achieve the same thing, don't they?
https://zookeeper.apache.org/doc/current/zookeeperOver.html
By default, Zookeeper replicates all your data to every node and lets clients watch the data for changes. Changes are sent very quickly (within a bounded amount of time) to clients. You can also create "ephemeral nodes", which are deleted within a specified time if a client disconnects. ZooKeeper is highly optimized for reads, while writes are very slow (since they generally are sent to every client as soon as the write takes place). Finally, the maximum size of a "file" (znode) in Zookeeper is 1MB, but typically they'll be single strings.
Taken together, this means that ZooKeeper is not meant to store much data, and is definitely not a cache. Instead, it's for managing heartbeats/knowing what servers are online, storing/updating configuration, and possibly message passing (though if you have large numbers of messages or high throughput demands, something like RabbitMQ will be much better for this task).
Basically, ZooKeeper (and Curator, which is built on it) helps in handling the mechanics of clustering -- heartbeats, distributing updates/configuration, distributed locks, etc.
It's not really comparable to Redis, but for the specific questions...
It doesn't support any computation, and for most data sets it won't be able to store the data with any reasonable performance.
It's replicated to all nodes in the cluster (there's nothing like Redis clustering where the data can be distributed). All messages are processed atomically in full and are sequenced, so there are no real transactions. It can be USED to implement cluster-wide locks for your services (it's very good at that, in fact), and there are a lot of locking primitives on the znodes themselves to control which nodes access them.
Sure, but ZooKeeper fills a niche. It's a tool for making distributed applications play nice with multiple instances, not for storing/sharing large amounts of data. Compared to using an IMDG for this purpose, ZooKeeper will be faster, manages heartbeats and synchronization in a predictable way (with a lot of APIs for making this part easy), and has a "push" paradigm instead of "pull", so nodes are notified very quickly of changes.
The quotation from the linked question...
A canonical example of Zookeeper usage is distributed-memory computation
... is, IMO, a bit misleading. You would use it to orchestrate the computation, not to provide the data. For example, let's say you had to process rows 1-100 of a table. You might put up 10 ZK nodes, with names like "1-10", "11-20", "21-30", etc. Client applications would be notified of this change automatically by ZK, and the first one would grab "1-10" and set an ephemeral node clients/192.168.77.66/processing/rows_1_10.
The next application would see this and go for the next group to process. The actual data to compute would be stored elsewhere (ie Redis, SQL database, etc). If the node failed partway through the computation, another node could see this (after 30-60 seconds) and pick up the job again.
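A rough sketch of that claim-a-chunk pattern with the kazoo Python client; the paths, the host name and the minimal error handling are assumptions for illustration, not a complete recipe:

    from kazoo.client import KazooClient
    from kazoo.exceptions import NodeExistsError

    zk = KazooClient(hosts='zk1.example.com:2181')   # hypothetical ensemble
    zk.start()

    def try_claim(chunk: str) -> bool:
        """Claim a chunk of work (e.g. 'rows_1_10') with an ephemeral znode.
        The claim disappears if this client dies, so another worker can
        pick the chunk up again."""
        try:
            zk.create('/jobs/claims/' + chunk, b'192.168.77.66',
                      ephemeral=True, makepath=True)
            return True
        except NodeExistsError:
            return False          # someone else is already processing this chunk

    for chunk in zk.get_children('/jobs/pending'):   # e.g. 'rows_1_10', 'rows_11_20', ...
        if try_claim(chunk):
            print('processing', chunk)               # fetch the actual rows from Redis/SQL here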
I'd say the canonical example of ZooKeeper is leader election, though. Let's say you have 3 nodes -- one is master and the other 2 are slaves. If the master goes down, a slave node must become the new leader. This type of thing is perfect for ZK.
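kazoo also ships a ready-made election recipe, so a minimal leader-election sketch (with a made-up path and contender id) looks roughly like this:

    from kazoo.client import KazooClient

    zk = KazooClient(hosts='zk1.example.com:2181')       # hypothetical ensemble
    zk.start()

    def lead():
        # Only the current leader runs this; if the process dies, ZooKeeper
        # lets one of the remaining contenders win the election instead.
        print('I am the master now')

    election = zk.Election('/app/election', 'node-3')    # second arg identifies this contender
    election.run(lead)                                   # blocks until elected, then calls lead()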
Consistency Guarantees
ZooKeeper is a high performance, scalable service. Both reads and write operations are designed to be fast, though reads are faster than writes. The reason for this is that in the case of reads, ZooKeeper can serve older data, which in turn is due to ZooKeeper's consistency guarantees:
Sequential Consistency
Updates from a client will be applied in the order that they were sent.
Atomicity
Updates either succeed or fail -- there are no partial results.
Single System Image
A client will see the same view of the service regardless of the server that it connects to.
Reliability
Once an update has been applied, it will persist from that time forward until a client overwrites the update. This guarantee has two corollaries:
If a client gets a successful return code, the update will have been applied. On some failures (communication errors, timeouts, etc.) the client will not know whether the update has been applied or not. We take steps to minimize the failures, but the guarantee is only present with successful return codes. (This is called the monotonicity condition in Paxos.)
Any updates that are seen by the client, through a read request or successful update, will never be rolled back when recovering from server failures.
Timeliness
The clients view of the system is guaranteed to be up-to-date within a certain time bound. (On the order of tens of seconds.) Either system changes will be seen by a client within this bound, or the client will detect a service outage.
Mirroring replicates data between Kafka clusters, while Replication replicates data across nodes within a single Kafka cluster.
Is there any specific use for Replication if Mirroring has already been set up?
They are used for different use cases. Let's try to clarify.
As described in the documentation,
The purpose of adding replication in Kafka is for stronger durability and higher availability. We want to guarantee that any successfully published message will not be lost and can be consumed, even when there are server failures. Such failures can be caused by machine error, program error, or more commonly, software upgrades. We have the following high-level goals:
Inside a cluster there might be network partitions (a single server fails, and so forth), therefore we want to provide replication between the nodes. Given a setup of three nodes and one cluster, if server1 fails, there are two replicas Kafka can choose from. Same cluster implies same response times (ok, it also depends on how these servers are configured, sure, but in a normal scenario they should not differ so much).
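For example, with the kafka-python client you could create a topic whose partitions are replicated to all three brokers and produce with acks='all', so a write is only acknowledged once the in-sync replicas have it. A sketch under assumptions (broker addresses and topic name are made up):

    from kafka import KafkaProducer
    from kafka.admin import KafkaAdminClient, NewTopic

    brokers = ['kafka1:9092', 'kafka2:9092', 'kafka3:9092']   # hypothetical 3-node cluster

    # Every partition of 'orders' gets 3 copies, one per broker, so losing a
    # single broker loses no acknowledged data.
    admin = KafkaAdminClient(bootstrap_servers=brokers)
    admin.create_topics([NewTopic(name='orders', num_partitions=3, replication_factor=3)])

    # acks='all' makes the leader wait for the in-sync replicas before acking.
    producer = KafkaProducer(bootstrap_servers=brokers, acks='all')
    producer.send('orders', b'order-42-paid')
    producer.flush()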
Mirroring, on the other hand, seems to be very valuable, for example, when you are migrating a data center, or when you have multiple data centers (e.g., AWS in the US and AWS in Ireland). Of course, these are just a couple of use cases. So what you do here is to give applications belonging to the same data center a faster and better way to access data - data locality in some contexts is everything.
If you have one node in each cluster, in case of failure, you might have way higher response times to go, let's say, from AWS located in Ireland to AWS in the US.
You might claim that in order to achieve data locality (services in cluster one read from Kafka in cluster one) you still need to copy the data from one cluster to the other. That's definitely true, but the advantages of mirroring can outweigh reading directly (via an SSH tunnel?) from Kafka located in another data center, for example: single connections going down, longer client connection/session times (depending on the location of the data center), and legislation (some data can be collected in a country while other data shouldn't be).
Replication is the basis of high availability. You shouldn't use Mirroring to handle high availability in a context where data locality matters. At the same time, you should not use just Replication where you need to duplicate data across data centers (I don't even know if you can without Mirroring or an SSH tunnel).
I have an application that runs a single Membase server (1.7.1.1) that I use to cache data I'd otherwise fetch from our central SQL Server DB. I have one default bucket associated with the Membase server, and follow the traditional data-fetching pattern (sketched in code after the list) of:
When specific data is requested, look up the relevant key in Membase
If data is returned, use it.
If no data is returned, fetch data from the DB
Store the newly returned data in Membase
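A minimal sketch of that cache-aside pattern, assuming a memcached-compatible client such as pylibmc and a hypothetical fetch_customer_from_sql helper for the SQL Server side:

    import json
    import pylibmc                                  # Membase speaks the memcached protocol

    mc = pylibmc.Client(['membase-a:11211'])        # made-up host

    def get_customer(customer_id):
        key = 'customer::%s' % customer_id
        cached = mc.get(key)                        # 1. look up the key in Membase
        if cached is not None:
            return json.loads(cached)               # 2. hit: use it
        row = fetch_customer_from_sql(customer_id)  # 3. miss: fetch from the DB (hypothetical helper)
        mc.set(key, json.dumps(row), time=300)      # 4. store it back with a TTL
        return row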
I am looking to add an additional server to my default cluster, and rebalance the keys. (I also have replication enabled for one additional server).
In this scenario, I am curious as to how I can use the current pattern (or modify it) to make sure that I am not getting data out of sync when one of my two servers goes down in either an auto-failover or manual failover scenario.
From my understanding, if one server goes down (call it Server A), during the period that it is down but still attached to the cluster, there will be a cache key miss (if the active key is associated to Server A, not Server B). In that case, in the data-fetching pattern above, I would get no data returned and fetch straight from SQL Server. But, when I attempt to store the data back to my Membase cluster, will it store the data in Server B and remap that key to Server B on the next fetch?
I understand that once I mark Server A as "failed over", Server B's replica key will become the active one, but I am unclear about how to handle the intermittent situation when Server A is inaccessible but not yet marked as failed over.
Any help is greatly appreciated!
That's a pretty old version. But several things to clarify.
If you are performing caching you are probably using a memcached bucket, and in this case there is no replica.
Nodes are always considered attached to the cluster until they are explicitly removed by administrative action (autofailover attempts to automate this administrative action for you by attempting to remove the node from the cluster if it's determined to be down for n amount of time).
If the server is down (but not failed over), you will not get a "Cache Miss" per se, but some other kind of connectivity error from your client. Many older memcached clients do not make this distinction and simply return a NULL, False, or similar value for any kind of failure. I suggest you use a proper Couchbase client for your application which should help differentiate between the two.
As far as Couchbase is concerned, data routing for any kind of operation remains the same. So if you were not able to reach the item on Server A because it was not available, you will encounter the same issue when attempting to store it back again. In other words, if you tried to get data from Server A and it was down, attempting to store data to Server A will fail in exactly the same way, unless the server was failed over between the last fetch and the current storage attempt -- in which case the client will determine this and route the request to the appropriate server.
In "newer" versions of Couchbase (> 2.x) there is a special get-from-replica command available for use with couchbase (or membase)-style buckets which allow you to explicitly read information from a replica node. Note that you still cannot write to such a node, though.
Your overall strategy seems very sane for a cache, except that you need to understand that if a node is unavailable, then a certain percentage of your data will be unavailable (for both reads and writes) until the node is either brought back up again or failed over. There is no way around that.
I am using Redis version 2.8.3. I want to build a Redis cluster, but in this cluster there should be multiple masters. This means I need multiple nodes that have write access and whose writes are applied to all other nodes.
I could build a cluster with one master and multiple slaves. I just configured the slaves' redis.conf files and added this:
slaveof myMasterIp myMasterPort
That's all. Then I tried to write something into the DB via the master. It was replicated to all the slaves, and I really like that.
But when I tried to write via a slave, it told me that slaves have no right to write. After that I set the slave's read-only setting in redis.conf to false. Then I could write to the DB through it.
But I realized that this write is not replicated to my master, so it is not replicated to the other slaves either.
This means I could not build an active-active cluster.
I tried to find out whether Redis has active-active cluster capability, but I could not find an exact answer.
Is it possible to build an active-active cluster with Redis?
If it is, how can I do it?
Thank you!
Redis v2.8.3 does not support multi-master setups. The real question, however, is why do you want to set one up? Put differently, what challenge/problem are you trying to solve?
It looks like the challenge you're trying to solve is how to reduce the network load (more on that below) by eliminating over-the-net reads. Since Redis isn't multi-master (yet), the only way to do it is by setting up each app server with a master and a slave (to the other master) - i.e. grand total of 4 Redis instances (and twice the RAM).
The simple scenario is when each app updates only a mutually-exclusive subset of the database's keys. In that scenario this kind of setup may actually be beneficial (at least in the short term). If, however, both apps can touch all keys or if even just one key is "shared" for writes between the apps, then you'll need to bake locking/conflict resolution/etc... logic into your apps to consolidate local master and slave differences (and that may be a bit of an overkill). In either case, however, you'll end up with too many (i.e. more than 1) Redises, which means more admin effort at the very least.
Also note that by colocating app and database on the same server you're setting yourself up for near-certain scalability failure. What will happen when you need more compute resources for your apps or Redis? How will you add yet another app server to the mix?
Which brings me back to the actual problem you are trying to solve - network load. Why exactly is that an issue? Are your apps so throughput-heavy or is the network so thin that you are willing to go to such lengths? Or maybe latency is the issue that you want to resolve? Be that as it may, I recommend that you consider a time-proven design instead, namely separating Redis from the apps and putting it on its own resources. True, the network will hit you in the face and you'll have to work around/with it (which is what everybody else does). On the other hand, you'll have more flexibility and control over your much simpler setup and that, in my book, is a huge gain.
Redis Enterprise has had this feature for quite a while, but if you are looking for an open source solution, KeyDB is a fork with active-active support (called Active Replica).
Setting it up is just a little more work than standard replication:
Both servers must have "active-replica yes" in their respective configuration files
On server B execute the command "replicaof [A address] [A port]"
Server B will drop its database and load server A's dataset
On server A execute the command "replicaof [B address] [B port]"
Server A will drop its database and load server B's dataset (including the data it just transferred in the prior step)
Both servers will now propagate writes to each other. You can test this by writing to a key on Server A and ensuring it is visible on B and vice versa.
https://github.com/JohnSully/KeyDB/wiki/KeyDB-(Redis-Fork):-Active-Replica-Support
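If you want to script steps 2 and 4 of the list above, something like the following should work with redis-py; the host names are placeholders, and "active-replica yes" still has to be set in each server's configuration file as described:

    import time
    import redis

    a = redis.Redis(host='keydb-a.example.com', port=6379)   # placeholder hosts
    b = redis.Redis(host='keydb-b.example.com', port=6379)

    # Point each server at the other (steps 2 and 4 above).
    b.execute_command('REPLICAOF', 'keydb-a.example.com', 6379)
    a.execute_command('REPLICAOF', 'keydb-b.example.com', 6379)

    # Quick check that writes propagate in both directions.
    a.set('written-on-a', '1')
    b.set('written-on-b', '1')
    time.sleep(1)                                             # replication is asynchronous
    print(b.get('written-on-a'), a.get('written-on-b'))       # both should be b'1'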