I want to restart my Scylla db cluster. But I don't want to lose any data.
Do I lose any data if I restart the nodes one after another?
No, you will not lose data if you do a rolling restart.
Scylla keeps the data replicated across multiple nodes (usually 3 or more).
Depending on your Replication Factor (RF) and Consistency Level (CL), you might see read or write operations fail during the restart. See the interactive calculator here: https://docs.scylladb.com/getting-started/consistency/#consistency-level-calculator
If "restarting a node" just involves restarting Scylla or rebooting the kernel on which it runs, then you're safe: Scylla is a distributed database, and is designed to support durability and availability even when nodes temporarily disappear from the network. When a node is temporarily down, all its data is still available for reads (from two other replicas), and also writes continue to work normally and will be eventually replicated to the down node when it finally comes up (using the "hinted handoff" and/or "repair" mechanisms).
However, "restarting a node" may mean something more destructive: replacing it with a brand-new node with empty storage, as in some cloud setups where nodes have transient storage. In that case you have to be more careful: if the node's data is lost, there are still two more replicas and the database continues to be available, but you should tell the cluster to "stream" the data which the node lost back to the node before doing this destructive restart to additional nodes. If you have RF=3 and destroy three nodes at the same time, you will surely lose data.
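For instance, with RF=3 a CL=QUORUM operation succeeds as long as 2 of the 3 replicas respond, which is exactly why a rolling restart is safe. A minimal sketch, assuming the DataStax Java driver 4.x (which is compatible with Scylla); the keyspace and table are illustrative:

    // Sketch: with RF=3 and CL=QUORUM, this write keeps succeeding while
    // any single node is down for a rolling restart (2 of 3 replicas ack).
    // Contact points / local datacenter come from application.conf here.
    import com.datastax.oss.driver.api.core.CqlSession;
    import com.datastax.oss.driver.api.core.DefaultConsistencyLevel;
    import com.datastax.oss.driver.api.core.cql.SimpleStatement;

    public class RollingRestartSafeWrite {
        public static void main(String[] args) {
            try (CqlSession session = CqlSession.builder().build()) {
                SimpleStatement write = SimpleStatement
                    .builder("INSERT INTO ks.users (id, name) VALUES (1, 'alice')")
                    .setConsistencyLevel(DefaultConsistencyLevel.QUORUM)
                    .build();
                session.execute(write);
            }
        }
    }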
We have 3 Ignite server nodes in 3 different server farms, fully replicated, with persistence enabled, and all servers are baseline nodes. It happens that if 2 server nodes fail (node or connection crash, or a slow connection), the remaining one also performs a shutdown, perhaps guessing it has been disconnected from the network.
Is it possible to make the surviving node not shut down?
Is it possible to adjust some timeout to avoid disconnections from slow networks or nodes?
I cannot find any hint in the documentation.
To avoid the problem I have to run only one server node (which is what we tried to avoid by using Ignite...).
You can try to customize StopNodeOrHaltFailureHandler with SEGMENTATION added to ignoredFailureTypes.
But in this case, if all 3 nodes are segmented and remain alive, keep in mind that the cluster may enter a split-brain state.
To decide which nodes should be used for cache operations, you can add a TopologyValidator https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/configuration/TopologyValidator.html to the cache config and, based on node attributes, decide which nodes are allowed. A sketch combining both ideas is below.
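A minimal sketch of both suggestions together, assuming Ignite 2.x APIs; the majority threshold of 2-of-3 nodes is an illustrative choice, not something prescribed above:

    // Sketch: ignore SEGMENTATION so a segmented node stays alive, and use
    // a TopologyValidator to refuse cache operations unless a majority of
    // the 3 baseline nodes is visible (guards against split-brain writes).
    import java.util.Collections;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.CacheConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.failure.FailureType;
    import org.apache.ignite.failure.StopNodeOrHaltFailureHandler;

    public class SegmentationTolerantConfig {
        public static void main(String[] args) {
            StopNodeOrHaltFailureHandler handler = new StopNodeOrHaltFailureHandler();
            handler.setIgnoredFailureTypes(
                Collections.singleton(FailureType.SEGMENTATION));

            CacheConfiguration<Integer, String> cacheCfg =
                new CacheConfiguration<>("myCache");
            // Allow cache operations only while 2 of 3 nodes are present.
            cacheCfg.setTopologyValidator(nodes -> nodes.size() >= 2);

            IgniteConfiguration cfg = new IgniteConfiguration();
            cfg.setFailureHandler(handler);
            cfg.setCacheConfiguration(cacheCfg);

            Ignition.start(cfg);
        }
    }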
According to the MarkLogic 9 documentation, a database is backed up onto all nodes in the cluster, but the backup process appears to back up only the forests that are local to each node. So for a database with 6 forests across 3 nodes, I may have backup files for 2 forests on each node.
If I have a 3-node cluster and lose one node (so that one node is now 100% unrecoverable), are all my backups now effectively useless, as they will be missing the backup files for 2 forests?
Or is ML smart enough to re-create the missing data from the dead node, via parity?
Thanks.
The answer is it depends.
Typically backups are made to some sort of network storage, so losing a node doesn't affect the backups. If for some reason the backups are stored locally to the system, then it would depend on whether you have HA enabled, and whether you are backing up the replica forests along with the primary forests.
If HA is enabled, you could lose a node and keep running, giving you time to rebuild the lost node. Alternatively, if you are backing up both the primary and replica forests in your cluster, you would have a complete data set in your backups even if you lose a node.
I am very new to Redis cache implementation.
Could you please let me know what the replication factor means?
How does it work, and what is the impact?
Thanks.
At the base of Redis replication (excluding the high availability features provided as an additional layer by Redis Cluster or Redis Sentinel) there is a very simple to use and configure leader follower (master-slave) replication: it allows replica Redis instances to be exact copies of master instances. The replica will automatically reconnect to the master every time the link breaks, and will attempt to be an exact copy of it regardless of what happens to the master.
This system works using three main mechanisms:
When master and replica instances are well-connected, the master keeps the replica updated by sending a stream of commands to the replica, in order to replicate the effects on the dataset happening in the master side due to: client writes, keys expired or evicted, any other action changing the master dataset.
When the link between the master and the replica breaks, for network issues or because a timeout is sensed in the master or the replica, the replica reconnects and attempts to proceed with a partial resynchronization: it means that it will try to just obtain the part of the stream of commands it missed during the disconnection.
When a partial resynchronization is not possible, the replica will ask for a full resynchronization. This will involve a more complex process in which the master needs to create a snapshot of all its data, send it to the replica, and then continue sending the stream of commands as the dataset changes.
Redis uses by default asynchronous replication, which being low latency and high performance, is the natural replication mode for the vast majority of Redis use cases.
Synchronous replication of certain data can be requested by the clients using the WAIT command. However, WAIT is only able to ensure that there are the specified number of acknowledged copies in the other Redis instances; it does not turn a set of Redis instances into a CP system with strong consistency: acknowledged writes can still be lost during a failover, depending on the exact configuration of the Redis persistence. However, with WAIT the probability of losing a write after a failure event is greatly reduced to certain hard-to-trigger failure modes.
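A hedged example of the WAIT pattern, assuming the Jedis client; the host, key, replica count, and timeout are illustrative:

    // Sketch: ask Redis to confirm the previous write reached at least one
    // replica before proceeding. WAIT reduces (but does not eliminate) the
    // chance of losing an acknowledged write during a failover.
    import redis.clients.jedis.Jedis;

    public class WaitForReplicas {
        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                jedis.set("balance", "100");
                // Block until 1 replica acknowledges or 100 ms passes;
                // returns the number of replicas that actually acked.
                long acked = jedis.waitReplicas(1, 100);
                if (acked < 1) {
                    System.err.println("write not yet replicated");
                }
            }
        }
    }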
RabbitMQ clusters need to have at least one disc node (you can't turn your last disc node to a ram node).
However (especially in a cloud context) nodes can die - what is supposed to happen to the cluster if the only disc node dies?
Does the cluster automatically appoint a new disc node, or does it continue working with no disc node?
Short answer: if all disc nodes die and you have at least one RAM node, you'll get a RAM-only cluster. If only one RAM node is left and it goes down and then comes back up, only durable entities will reside on it.
Long answer:
If you use clustering as described in the Clustering Guide, queues reside on only one node:
All data/state required for the operation of a RabbitMQ broker is
replicated across all nodes, for reliability and scaling, with full
ACID properties. An exception to this are message queues, which by
default reside on the node that created them, though they are visible
and reachable from all nodes. To replicate queues across nodes in a
cluster, see the documentation on high availability (note that you
will need a working cluster first).
So when a node dies (not only a disc one; this applies to RAM nodes too) you lose the queues (with their content) that reside on that node.
You can use High Availability to mirror a queue across more than one node (exactly how depends on your setup; see the detailed explanation of the ha-mode and ha-policy policy keys: all, exactly, and nodes).
With HA, if a queue has some ha-policy set and the node it resides on dies, the cluster will try to mirror that queue to other nodes, including RAM-only ones (again, it depends how ha-mode is set up; for example, if it is set to nodes and all nodes on the list die, you lose the queue).
So, after that intro: if you turn off all disc nodes and have only RAM nodes, and the queues fit in memory, everything will work normally. If the queues don't fit in memory, Flow Control memory limits apply, as explained in the clustering doc's Restarting section (at the end):
At least one disk node should be running at all times to prevent data
loss. RabbitMQ will prevent the creation of a RAM-only cluster in many
situations, but it still won't stop you from stopping and forcefully
resetting all the disc nodes, which will lead to a RAM-only cluster.
Doing this is not advisable and makes losing data very easy.
and a bit more from clustering doc:
A node can be a disk node or a RAM node. (Note: disk and disc are
used interchangeably. Configuration syntax or status messages normally
use disc.) RAM nodes keep their state only in memory (with the
exception of queue contents, which can reside on disc if the queue is
persistent or too big to fit in memory). Disk nodes keep state in
memory and on disk. As RAM nodes don't have to write to disk as much
as disk nodes, they can perform better. However, note that since the
queue data is always stored on disc, the performance improvements will
affect only resources management (e.g. adding/removing queues,
exchanges, or vhosts), but not publishing or consuming speed. Because
state is replicated across all nodes in the cluster, it is sufficient
(but not recommended) to have just one disk node within a cluster, to
store the state of the cluster safely.
So if you don't add any disc node at all, you'll get a RAM-only cluster. It may be fast in some cases, but if all nodes go down you will lose all your queues and their content forever, except durable ones, since any node dumps persistent queues and messages to disc.
But don't rely on a RAM node dumping persistent entities to disc: in certain situations it may not dump at all, or may not dump all entities (especially messages).
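As a hedged illustration of "durable entities", here is how a queue and a message are marked durable/persistent with the RabbitMQ Java client; the queue name and payload are illustrative:

    // Sketch: durability must be requested explicitly. A durable queue plus
    // persistent delivery mode is what lets messages survive a broker
    // restart (on disc nodes).
    import java.nio.charset.StandardCharsets;
    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;
    import com.rabbitmq.client.MessageProperties;

    public class DurablePublish {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost");
            try (Connection conn = factory.newConnection();
                 Channel channel = conn.createChannel()) {
                // durable = true: the queue definition survives restarts
                channel.queueDeclare("orders", true, false, false, null);
                channel.basicPublish("", "orders",
                    MessageProperties.PERSISTENT_TEXT_PLAIN, // deliveryMode = 2
                    "order-42".getBytes(StandardCharsets.UTF_8));
            }
        }
    }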
There are old mailing list threads which may shed some extra light on the situation:
Cluster with all memory nodes
Cluster Disk Node vs Ram Node explanation
My understanding could be amiss here. As I understand it, Couchbase uses a smart client to automatically select which node to write to or read from in a cluster. What I DON'T understand is, when this data is written/read, is it also immediately written to all other nodes? If so, in the event of a node failure, how does Couchbase know to use a different node from the one that was 'marked as the master' for the current operation/key? Do you lose data in the event that one of your nodes fails?
This sentence from the Couchbase Server Manual gives me the impression that you do lose data (which would make Couchbase unsuitable for high availability requirements):
With fewer larger nodes, in case of a node failure the impact to the
application will be greater
Thank you in advance for your time :)
By default, when data is written into Couchbase, the client returns success just after the data is written to one node's memory. After that, Couchbase saves it to disk and does the replication.
If you want to ensure that data is persisted to disk, most client libraries have functions that allow you to do that. With the help of those functions you can also ensure that the data is replicated to another node. This mechanism is called observe.
When one node goes down, it should be failed over. Couchbase Server can do that automatically when the auto-failover timeout is set in the server settings. For example, if you have a 3-node cluster and the stored data has 2 replicas, and one node goes down, you will not lose data. If a second node fails you will still not lose all data: it will be available on the last node.
If a node that was the master goes down and is failed over, another live node becomes the master. In your client you point to all servers in the cluster, so if it is unable to retrieve data from one node, it tries to get it from another.
Also, if you have 2 nodes at your disposal, you can install 2 separate Couchbase servers, configure XDCR (cross datacenter replication), and check server availability manually with HA proxies or something similar. That way you get only one IP to connect to (the proxy's IP), which will automatically fetch data from the live server.
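A sketch of the observe-style durability check mentioned above, assuming the Couchbase Java SDK 2.x (newer SDK generations expose durability differently); the bucket and document are illustrative:

    // Sketch: the upsert only returns success once the write has been
    // persisted on the active node and replicated to at least one replica.
    import com.couchbase.client.java.Bucket;
    import com.couchbase.client.java.Cluster;
    import com.couchbase.client.java.CouchbaseCluster;
    import com.couchbase.client.java.PersistTo;
    import com.couchbase.client.java.ReplicateTo;
    import com.couchbase.client.java.document.JsonDocument;
    import com.couchbase.client.java.document.json.JsonObject;

    public class DurableUpsert {
        public static void main(String[] args) {
            Cluster cluster = CouchbaseCluster.create("localhost");
            Bucket bucket = cluster.openBucket("default");
            JsonDocument doc = JsonDocument.create("user::1",
                JsonObject.create().put("name", "alice"));
            bucket.upsert(doc, PersistTo.MASTER, ReplicateTo.ONE);
            cluster.disconnect();
        }
    }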
Couchbase is a good system for HA requirements.
Let me explain in a few sentences how it works. Suppose you have a 5-node cluster. The application, using the client API/SDK, is always aware of the topology of the cluster (and any change in the topology).
When you set/get a document in the cluster, the client API uses the same algorithm as the server to choose which node it should be written to. The client selects the node using a CRC32 hash and writes to that node. Then, asynchronously, the cluster copies 1 or more replicas to the other nodes (depending on your configuration).
Couchbase has only 1 active copy of a document at a time, so it is easy to stay consistent. Applications get and set against this active copy.
In case of failure, the server has some work to do: once the failure is discovered (automatically or by a monitoring system), a "failover" occurs. This means that the replicas are promoted to active and it is now possible to work as before. Usually you then rebalance the nodes to balance the cluster properly.
The sentence you are commenting on simply says that the fewer nodes you have, the bigger the impact in case of failure/rebalance, since you will have to route the same number of requests to a smaller number of nodes. Hopefully you do not lose data ;)
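As a toy illustration of the CRC32-based mapping, here is a simplification (not the SDK's actual implementation, which maps keys to vBuckets and vBuckets to nodes via a cluster map):

    // Sketch: hash the key with CRC32 and map it onto one of 1024 vBuckets,
    // each owned by exactly one active node. Every client computes the same
    // vBucket, hence addresses the same active node.
    import java.nio.charset.StandardCharsets;
    import java.util.zip.CRC32;

    public class VBucketSketch {
        static final int NUM_VBUCKETS = 1024;

        static int vbucketFor(String key) {
            CRC32 crc = new CRC32();
            crc.update(key.getBytes(StandardCharsets.UTF_8));
            return (int) (crc.getValue() % NUM_VBUCKETS);
        }

        public static void main(String[] args) {
            System.out.println("user::1 -> vBucket " + vbucketFor("user::1"));
        }
    }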
You can find some very detailed information about this way of working on Couchbase CTO blog:
http://damienkatz.net/2013/05/dynamo_sure_works_hard.html
Note: I work as a developer evangelist at Couchbase.