Consider the following scenario.
There are two Hazelcast nodes. One is stopped; the other is running under quite heavy load.
Now the stopped node is brought back up. Its application starts and its Hazelcast instance joins the first one, and Hazelcast starts repartitioning the data. With two nodes, this essentially means
that each IMap entry gets copied to the new node and the two nodes are arbitrarily assigned as master/backup for each partition.
PROBLEM:
If the first node is brought down during this process and replication has not completed, part of the IMap contents and the ITopic subscriptions may be lost.
QUESTION:
How can I ensure that the repartitioning process has finished and that it is safe to shut down the first node?
(The whole setup exists to enable software updates without downtime while preserving the current application state.)
I tried using getPartitionService().addMigrationListener(...), but the listener does not seem to be tied to the migration process as a whole. Instead, I get tens to hundreds of migrationStarted()/migrationCompleted() calls, one pair per migrated chunk of the data.
1- When you gracefully shut down the first node, the shutdown process waits (blocks) until the data is safely backed up (see also the sketch below):
hazelcastInstance.getLifecycleService().shutdown();
2- If you use Hazelcast Management Center, it shows the number of ongoing migration/repartitioning operations on its home screen.
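As a sketch of how this can be checked programmatically before taking the old node down (assuming Hazelcast 3.x, where PartitionService exposes these safety checks; the setup code and variable names below are made up):

    import java.util.concurrent.TimeUnit;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.PartitionService;

    public class SafeShutdown {
        public static void main(String[] args) throws InterruptedException {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            PartitionService partitionService = hz.getPartitionService();

            // Wait until every partition owned by this member has a backup elsewhere.
            while (!partitionService.isLocalMemberSafe()) {
                TimeUnit.SECONDS.sleep(1);
            }
            // Alternatively, isClusterSafe() checks that no migrations are in
            // flight and all partitions are backed up cluster-wide.

            // Graceful shutdown: blocks until this member's data is safely handed off.
            hz.getLifecycleService().shutdown();
        }
    }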
I use Ignite.NET and run Ignite inside my .NET Core application process.
My application receives messages (about 5,000 per second), and I put or remove keys according to the messages received. The cache mode is Replicated, with the default PrimarySync write synchronization mode.
Everything is fine and I can process up to 20,000 messages per second.
But when I run another Ignite node on another machine, everything changes: processing speed drops to about 1,000 messages per second.
Perhaps this is because some operations go over the network, but I only want to put or remove keys on the local instance and have the changed keys replicated to the other nodes. The write mode is PrimarySync, which I understood to mean that Ignite puts or removes the key on the local node (since all nodes hold the same data in replicated mode, there is no need to distribute the operation to the other nodes) and then replicates the change to the other nodes asynchronously.
Where is the problem?
Is the slowdown due to network operations?
Looking at the code (I could not run it, since it requires setting up SQL Server), I can offer the following recommendations:
Use DataStreamer. Always use a streamer when adding or removing batches of data.
Try using multiple threads to load the data. Ignite APIs are thread-safe.
Maybe try CacheWriteSynchronizationMode.FullAsync.
Together these changes should result in a noticeable speedup, no matter how many nodes there are; a sketch follows below.
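The question uses Ignite.NET; for illustration, here is a minimal sketch against the Java API (the Ignite.NET IDataStreamer and CacheConfiguration mirror these calls). The cache name, key/value types and record count below are made up:

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteDataStreamer;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheMode;
    import org.apache.ignite.cache.CacheWriteSynchronizationMode;
    import org.apache.ignite.configuration.CacheConfiguration;

    public class StreamerExample {
        public static void main(String[] args) {
            try (Ignite ignite = Ignition.start()) {
                // Replicated cache that acknowledges writes without waiting for backups.
                CacheConfiguration<Integer, String> ccfg = new CacheConfiguration<>("messages");
                ccfg.setCacheMode(CacheMode.REPLICATED);
                ccfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_ASYNC);
                ignite.getOrCreateCache(ccfg);

                // The data streamer batches updates and ships them to the cluster in bulk.
                try (IgniteDataStreamer<Integer, String> streamer = ignite.dataStreamer("messages")) {
                    streamer.allowOverwrite(true); // required when existing keys may be updated or removed
                    for (int i = 0; i < 100_000; i++) {
                        streamer.addData(i, "message-" + i);
                    }
                    // removeData(key) queues a removal the same way addData queues a put.
                } // close() flushes any remaining batches
            }
        }
    }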
Here is a use case:
I have version 1 of a web app deployed.
It uses a couple of Ignite-powered distributed data structures (Maps, Sets and others), configured for replication.
I'm going to deploy v2 of this application, and once the data is replicated I'm going to shut down v1 of the app and re-route users (using nginx) to the new instance (v2).
I can see that Ignite on v1 and v2 can discover each other and automatically perform replication of data structures.
My intention: I don't want to shut down the 1st instance (v1) before all data has been replicated to the 2nd instance (v2).
The question is: how do I know that the initial replication has completed? Is there an event that is fired in such cases, or perhaps some other way to accomplish this task?
If you configure your caches to use synchronous rebalancing [1], the second node will not complete its start process until rebalancing is completed. This way you guarantee that all the data has been replicated to the second node (assuming, of course, that you are using fully replicated caches). See the configuration sketch below.
[1] https://apacheignite.readme.io/docs/rebalancing#section-rebalance-modes
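A minimal sketch of such a configuration using the Java API (the equivalent setting should also be available on the Ignite.NET CacheConfiguration); the cache name is made up:

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheMode;
    import org.apache.ignite.cache.CacheRebalanceMode;
    import org.apache.ignite.configuration.CacheConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class SyncRebalanceNode {
        public static void main(String[] args) {
            CacheConfiguration<String, Object> ccfg = new CacheConfiguration<>("appState");
            ccfg.setCacheMode(CacheMode.REPLICATED);
            // SYNC rebalance mode: Ignition.start() on a joining node does not
            // return until this cache has received all of its data.
            ccfg.setRebalanceMode(CacheRebalanceMode.SYNC);

            IgniteConfiguration cfg = new IgniteConfiguration();
            cfg.setCacheConfiguration(ccfg);

            // When this call returns on the v2 node, the replicated cache is
            // fully populated and it is safe to shut down the v1 node.
            Ignite ignite = Ignition.start(cfg);
        }
    }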
I set up a basic test topology with Petabridge Lighthouse and two simple test actors that communicate with each other. This works well so far, but there is one problem: Lighthouse (or the underlying Akka.Cluster) makes one of my nodes the leader, and when a node is not shut down gracefully (e.g. when something crashes badly, or I simply hit "Stop" in VS), the Lighthouse is no longer usable. Tons of exceptions scroll by and it must be restarted.
Is it possible to configure Akka.Cluster for .NET in such a way that the rest of the topology elects a new leader and carries on?
There are two things to point out here. One is that if you have a serious risk of your lighthouse node going down, you should probably have more than one -
the akka.cluster.seed-nodes setting can take multiple addresses; the only requirement is that all nodes, including the lighthouses, must specify them in the same order. This way, if one lighthouse goes down, another can still take over its role.
The other thing is that when a node becomes unreachable (either because its process crashed or because the network connection is unavailable), by default the Akka.NET cluster won't down that node. You need to tell it how it should behave when such a thing happens:
You can configure your own implementation of the IDowningProvider interface, which will be triggered after a certain period of node inactivity has been reached. There you can decide manually what to do. To use it, add the fully qualified type name to the following setting: akka.cluster.downing-provider = "MyNamespace.MyDowningProvider, MyAssembly". An example downing provider implementation can be seen here.
You can specify akka.cluster.auto-down-unreachable-after = 10s (or another time value) to set a timeout given to an unreachable node to come back - if it doesn't rejoin before the timeout triggers, it will be kicked out of the cluster. The only risk here is a cluster split brain: under certain circumstances a network failure between machines can split your cluster in two, and if that happens with auto-down enabled, the two halves of the cluster may consider each other dead. In that case you could end up with two separate clusters instead of one. A configuration sketch combining both points follows below.
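Putting those two settings together, here is a minimal HOCON sketch (the system name, host names, ports and timeout value below are made up; adjust them to your own deployment):

    akka {
      cluster {
        # All nodes, including the lighthouses themselves, must list the
        # seed nodes in exactly the same order.
        seed-nodes = [
          "akka.tcp://MyClusterSystem@lighthouse1:4053",
          "akka.tcp://MyClusterSystem@lighthouse2:4053"
        ]

        # Automatically remove a node that stays unreachable for 30 seconds.
        # Beware of split-brain scenarios when relying on auto-downing.
        auto-down-unreachable-after = 30s
      }
    }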
Starting from the next release (Akka.Cluster 1.3.3), a new Split Brain Resolver feature will be available. It will allow you to configure more advanced strategies for how to behave in case of network partitions and machine crashes.
I have a fairly simple Akka.NET system that tracks in-memory state, but that state contains only derived data. So any actor can, on startup, load its up-to-date state from a backend database and then start receiving messages and keep its state from there. This means I can just let actors fail and restart the process whenever I want; it will rebuild itself.
But... I would like to run across multiple nodes (mostly for the memory requirements), and I'd like to increase/decrease the number of nodes according to demand - also for releasing a new version without downtime.
What would be the most lightweight (in terms of Persistence) setup of clustering to achieve this? Can you run Clustering without Persistence?
This is not a single question, so let me answer point by point:
So I can just let actors fail and restart the process whenever I want - yes, but keep in mind that a hard reset of a process is a lot more expensive than a graceful shutdown. In a distributed system, if a node is going down, it's better for it to communicate that to the rest of the nodes beforehand than to require them to detect the dead node themselves - failure detection takes time (possibly close to a minute).
I'd like to increase/decrease the number of nodes according to demand - this is standard cluster behavior. In the case of Akka.NET, depending on which feature set you are going to use, you may sometimes need to specify an upper bound on the cluster size.
Also for releasing a new version without downtime. - most of the cluster features can be scoped to a particular set of nodes using so-called roles. Each node can have its own set of roles, which can be used to describe what services it provides and to detect whether other nodes have the required capabilities. For that reason you can use roles for things like versioning (see the configuration sketch after this list).
Can you run Clustering without Persistence? - yes, and this is the default configuration (in Akka, cluster nodes don't need to use any form of persistent backend to work).
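As a tiny illustration of the roles idea (the role names below are made up), each node simply declares its roles in its HOCON configuration, and other nodes can read them from the cluster membership information:

    # This node serves web traffic and runs application version 2.
    akka.cluster.roles = ["web", "v2"]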
My understanding could be amiss here. As I understand it, Couchbase uses a smart client to automatically select which node to write to or read from in a cluster. What I DON'T understand is, when this data is written/read, is it also immediately written to all other nodes? If so, in the event of a node failure, how does Couchbase know to use a different node from the one that was 'marked as the master' for the current operation/key? Do you lose data in the event that one of your nodes fails?
This sentence from the Couchbase Server Manual gives me the impression that you do lose data (which would make Couchbase unsuitable for high availability requirements):
With fewer larger nodes, in case of a node failure the impact to the application will be greater
Thank you in advance for your time :)
By default, when data is written into Couchbase, the client returns success as soon as the data has been written to one node's memory. After that, Couchbase persists it to disk and performs the replication.
If you want to ensure that data has been persisted to disk, most client libraries provide functions that allow you to do that. With the help of those functions you can also ensure that the data has been replicated to another node. This functionality is called observe (see the sketch below).
When a node goes down, it should be failed over. Couchbase Server can do that automatically when the auto-failover timeout is set in the server settings. For example, if you have a 3-node cluster, the stored data has 2 replicas and one node goes down, you will not lose data. If a second node fails, you still won't lose the data - it will be available on the last node.
If a node that was the master goes down and is failed over, another live node becomes the master. In your client you point to all the servers in the cluster, so if it is unable to retrieve data from one node, it tries to get it from another.
Also, if you only have 2 nodes at your disposal, you can install 2 separate Couchbase servers, configure XDCR (cross datacenter replication) between them, and check server availability yourself with HA proxies or something similar. That way you get a single IP to connect to (the proxy's IP), which will automatically serve data from the live server.
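As a rough sketch of those durability options (assuming the Couchbase Java SDK 2.x, where the observe behaviour is exposed through PersistTo/ReplicateTo arguments; the bucket name, key and content below are made up):

    import com.couchbase.client.java.Bucket;
    import com.couchbase.client.java.Cluster;
    import com.couchbase.client.java.CouchbaseCluster;
    import com.couchbase.client.java.PersistTo;
    import com.couchbase.client.java.ReplicateTo;
    import com.couchbase.client.java.document.JsonDocument;
    import com.couchbase.client.java.document.json.JsonObject;

    public class DurableWrite {
        public static void main(String[] args) {
            Cluster cluster = CouchbaseCluster.create("127.0.0.1");
            Bucket bucket = cluster.openBucket("default");

            JsonDocument doc = JsonDocument.create("user::42",
                    JsonObject.create().put("name", "Alice"));

            // Block until the write is persisted to disk on the active node
            // and replicated to at least one replica node.
            bucket.upsert(doc, PersistTo.MASTER, ReplicateTo.ONE);

            cluster.disconnect();
        }
    }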
Couchbase is in fact a good fit for HA systems.
Let me explain in a few sentences how it works. Suppose you have a 5-node cluster. The application, using the client API/SDK, is always aware of the topology of the cluster (and of any change to the topology).
When you set/get a document in the cluster, the client API uses the same algorithm as the server to choose which node the document should be written to. The client selects the node using a CRC32 hash of the key and writes to that node; a simplified sketch of this mapping is shown below. Then, asynchronously, the cluster copies 1 or more replicas to the other nodes (depending on your configuration).
Couchbase has only 1 active copy of a document at a time, so it is easy to stay consistent: the application gets and sets this active copy.
In case of failure, the server has some work to do. Once the failure is detected (automatically or by a monitoring system), a "failover" occurs. This means that the replicas are promoted to active, and it is then possible to work as before. Usually you then rebalance the cluster to distribute the data properly again.
The sentence you are quoting simply says that the fewer (and larger) nodes you have, the bigger the impact of a failure/rebalance will be, since the same number of requests has to be routed to a smaller number of remaining nodes. You do not lose data, though ;)
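To make the key-to-node mapping more concrete, here is a deliberately simplified sketch of the idea; the real client uses the vBucket map published by the cluster and slightly different hashing details, so treat this only as an illustration:

    import java.nio.charset.StandardCharsets;
    import java.util.zip.CRC32;

    public class VBucketSketch {
        // Couchbase buckets are split into a fixed number of vBuckets (1024 by default).
        private static final int NUM_VBUCKETS = 1024;

        // Stand-in for the cluster map: vBucket index -> node address.
        private static final String[] NODES = {"node1", "node2", "node3", "node4", "node5"};

        static int vBucketFor(String key) {
            CRC32 crc = new CRC32();
            crc.update(key.getBytes(StandardCharsets.UTF_8));
            // Hash the key and reduce it to a vBucket index.
            return (int) (crc.getValue() % NUM_VBUCKETS);
        }

        public static void main(String[] args) {
            String key = "user::42";
            int vBucket = vBucketFor(key);
            // Every client computes the same vBucket for the same key; here a
            // simple modulo stands in for looking the vBucket up in the cluster map.
            String node = NODES[vBucket % NODES.length];
            System.out.println(key + " -> vBucket " + vBucket + " -> " + node);
        }
    }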
You can find some very detailed information about how this works on the Couchbase CTO's blog:
http://damienkatz.net/2013/05/dynamo_sure_works_hard.html
Note: I work as a developer evangelist at Couchbase.