Apache Ignite node won't join cluster if server nodes start simultaneously

We have a problem with Ignite when we start two Ignite server nodes at exactly the same time. We are currently implementing our own discovery mechanism by extending TcpDiscoveryIpFinderAdapter. The first time the TcpDiscoveryIpFinderAdapter is called, neither Ignite server is able to find the other node (due to the nature of our discovery mechanism). Subsequent invocations do report the other node with a correct IP, yet the Ignite nodes never start talking to each other.
If we start the servers with some delay, the second server will (on the first attempt) find the other node and join the cluster successfully.
Is there a way to get the two nodes to talk to each other even after both of them initially think they are a cluster of one node?
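For context, our custom IP finder is structured roughly like the sketch below; the in-memory collection is only a placeholder for our actual lookup backend, which is what returns an incomplete address list on the first call.

    import java.net.InetSocketAddress;
    import java.util.Collection;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import org.apache.ignite.spi.IgniteSpiException;
    import org.apache.ignite.spi.discovery.tcp.ipfinder.TcpDiscoveryIpFinderAdapter;

    // Skeleton of a custom IP finder; the queue stands in for the real external discovery source.
    public class CustomIpFinder extends TcpDiscoveryIpFinderAdapter {
        private final Collection<InetSocketAddress> addrs = new ConcurrentLinkedQueue<>();

        @Override public Collection<InetSocketAddress> getRegisteredAddresses() throws IgniteSpiException {
            // On the very first call this may contain only the local node's address.
            return addrs;
        }

        @Override public void registerAddresses(Collection<InetSocketAddress> newAddrs) throws IgniteSpiException {
            addrs.addAll(newAddrs);
        }

        @Override public void unregisterAddresses(Collection<InetSocketAddress> oldAddrs) throws IgniteSpiException {
            addrs.removeAll(oldAddrs);
        }
    }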

Related

Apache Ignite connecting to different servers

I am using Apache Ignite with default configurations. I have two development servers, A and B, where each server runs the same code. I have started 3 Ignite nodes on each server: 3 nodes on A and 3 on B.
I have created an Ignite cache "ignite-bridg". Since the nodes on each server would create the cache and partition the data among themselves, and the two servers are isolated from each other, I expected nothing to happen between them.
However, I see that both servers form a single cluster and all 6 nodes get connected. This is highly problematic for me. I think this is happening because both servers are accidentally in the same multicast group.
How do I resolve this problem? I need to rectify it quickly.
By default Ignite uses the Multicast IP finder (TcpDiscoveryMulticastIpFinder) for node discovery; in your case you should use the Static IP finder (TcpDiscoveryVmIpFinder) instead. With it you can specify a different list of IP addresses for each server and form two clusters instead of one.
Here is more information regarding Static IP Finder configuration:
https://www.gridgain.com/docs/latest/developers-guide/clustering/tcp-ip-discovery#static-ip-finder
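For illustration, a minimal programmatic configuration of the Static IP finder could look like the sketch below; the addresses and ports are placeholders, and each server would list only its own nodes.

    import java.util.Arrays;
    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
    import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

    public class StaticIpFinderExample {
        public static void main(String[] args) {
            // Static IP finder: only the listed addresses are used for discovery,
            // so nodes on server A and server B can no longer find each other.
            TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
            ipFinder.setAddresses(Arrays.asList("127.0.0.1:47500..47509")); // this server's nodes only

            TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
            discoverySpi.setIpFinder(ipFinder);

            IgniteConfiguration cfg = new IgniteConfiguration();
            cfg.setDiscoverySpi(discoverySpi);

            Ignite ignite = Ignition.start(cfg);
        }
    }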

Apache Ignite Force Server Mode

We are trying to prevent our application startups from just spinning if we cannot reach the remote cluster. From what I've read, Force Server Mode states:
In this case, discovery will happen as if all the nodes in topology
were server nodes.
What I want to know is:
Does this client then permanently act as a server, which would run compute tasks and store cache data?
If the connection to the cluster does not happen at first, could a later connection to an established cluster cause consistency issues? What would be the expected behavior with a topology version mismatch? Is there potential for a split-brain scenario?
No, it's still a client node, but it behaves as a server at the discovery protocol level. For example, it can start without any server nodes running.
A client node can never cause data inconsistency, as it never stores the data. This does not depend on the forceServerMode flag.
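For illustration, a minimal sketch of a client started this way might look like the following; it assumes the forceServerMode flag on TcpDiscoverySpi (note that newer Ignite versions mark this flag as deprecated).

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;

    public class ForceServerModeClient {
        public static void main(String[] args) {
            TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
            discoverySpi.setForceServerMode(true); // client joins discovery as if it were a server

            IgniteConfiguration cfg = new IgniteConfiguration();
            cfg.setClientMode(true);               // still a client node: it does not store cache data
            cfg.setDiscoverySpi(discoverySpi);

            // Can start even if no server nodes are reachable yet.
            Ignite client = Ignition.start(cfg);
        }
    }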

Re-join cluster node after seed node got restarted

Let's imagine the following scenario. I have three nodes in my Akka cluster (nodes A, B, C). Each node is deployed to a different physical device on the network.
All of these nodes are wrapped in Topshelf Windows services.
Node A is my seed node; the others are simply 'worker' nodes with a port specified.
When I run the cluster, stop nodes (services) B or C, and then restart them, the nodes rejoin with no issues.
I'd like to ask whether it's possible to handle another scenario: when I stop the seed node (node A) while the other node services keep running, and then restart node service A, I'd like nodes B and C to rejoin the cluster and make the whole ecosystem work again.
Is such a scenario possible to implement? If yes, how should I do it?
In an Akka.NET cluster any node can serve as a seed node for others as long as it is part of the cluster. "Seeds" are just a configuration thing, so you can define a list of well-known node addresses that you know are part of the cluster.
Regarding your case, there are several solutions I can think of:
A quite common approach is to define more than one seed node in the configuration, so that a single node doesn't become a single point of failure. As long as at least one of the configured seed nodes is alive, everything should work fine. Keep in mind that the seed nodes should be defined in exactly the same order in every node's configuration (a configuration sketch follows below).
If your "worker" nodes have statically assigned endpoints, they can be used as seed nodes as well.
Since you can initialize the cluster programmatically from code, you can also use a third-party service for node discovery. You can use e.g. Consul for that; I've started a project which provides such functionality. While it's not yet published, feel free to fork it or contribute if it helps you.
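For illustration, the relevant configuration section might look roughly like the following HOCON. The system name, hosts, and ports are placeholders, and depending on the Akka.NET version the transport section may be dot-netty.tcp rather than helios.tcp.

    akka {
      actor.provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
      remote.helios.tcp {
        hostname = "node-a"   # this node's own address
        port = 4053
      }
      cluster {
        # Same list, in the same order, on nodes A, B and C.
        seed-nodes = [
          "akka.tcp://my-system@node-a:4053",
          "akka.tcp://my-system@node-b:4054"
        ]
      }
    }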

Redis cluster via HAProxy

I have a Redis Cluster that clients connect to via HAProxy with a virtual IP. The Redis Cluster has three nodes (each node sharing its server with a running Sentinel instance).
My question is: when a client gets a "MOVED" error/message from a cluster node upon sending a request, does it bypass HAProxy the second time it connects, since it has been provided with an IP:port in the MOVED message? If not, how does HAProxy know to send it to the correct node the second time?
I just need to understand how this works under the hood.
If you want to use HAProxy in front of Redis Cluster nodes, you will need to either:
Set up an HAProxy instance for each master/slave pair, and wire up something to update HAProxy when a failure happens, as well as probably intercept the topology-related commands so they report the virtual IPs rather than the IPs the nodes themselves have and report via the topology commands/responses.
Customize HAProxy to teach it how to be the cluster-aware Redis client so the actual client doesn't know about the cluster at all. This means teaching it the Redis protocol, storing the cluster's topology information, and selecting the node to query based on the key(s) being accessed by the consumer code.
With Redis Cluster the client must be able to access every node in the cluster. Of the two options above, Option 2 is the "easier" one, but at this point I wouldn't recommend either.
Conceivably you could use the VIP as a "first place to get the topology info" address, but I suspect you'd see serious issues develop, as that original IP would not be one of the ones properly reported as a node handling data. For that you could simply use round-robin DNS and avoid the problem, or use the built-in "here is a list of cluster IPs (or names?)" list in the initial connection configuration.
Your simplest, and least likely to be problematic, route is to go "full native": simply give your clients full and direct access to every node in the cluster and don't use HAProxy at all.
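As a sketch of the "full native" route, this is roughly what a cluster-aware client looks like, using the Lettuce Java client purely as an example; the node addresses and key are placeholders.

    import java.util.Arrays;
    import io.lettuce.core.RedisURI;
    import io.lettuce.core.cluster.RedisClusterClient;
    import io.lettuce.core.cluster.api.StatefulRedisClusterConnection;

    public class ClusterClientExample {
        public static void main(String[] args) {
            // The client is seeded with the cluster node addresses and follows
            // MOVED/ASK redirects itself; no proxy sits in the middle.
            RedisClusterClient client = RedisClusterClient.create(Arrays.asList(
                RedisURI.create("redis://10.0.0.1:6379"),
                RedisURI.create("redis://10.0.0.2:6379"),
                RedisURI.create("redis://10.0.0.3:6379")));

            StatefulRedisClusterConnection<String, String> conn = client.connect();
            System.out.println(conn.sync().get("some-key"));

            conn.close();
            client.shutdown();
        }
    }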

Couchbase 2.5, 2 nodes with 1 replica: when 1 node fails, the service is no longer available

We are testing Couchbase with a two-node cluster and one replica.
When we stop the service on one node, the other node does not respond until we restart the service or manually fail over the stopped node.
Is there a way to keep the service available from the good node when one node is temporarily unavailable?
If a node goes down, then in order to activate the replicas on the other node you will need to manually fail it over. If you want this to happen automatically, you can enable auto-failover, but in order to use that feature I'm pretty sure you must have at least a three-node cluster. When you want to add the failed node back, you can just re-add it to the cluster and rebalance.