Set up docker-swarm to use specific nodes as backup?

Is there a way to set up docker-swarm to only use specific nodes (workers or managers) as fail-over nodes? For instance, if one specific worker dies (or if a service on it dies), only then would the swarm use another node; until that happens, it should be as if the node wasn't in the swarm.

No, that is not possible out of the box. However, docker-swarm does have the features to build it yourself. Let's say that you have 3 worker nodes on which you want to run service A: nodes 1 and 2 will always be available, and node 3 will be the backup.
Add a label to the 3 nodes, e.g. runs=serviceA, and give the service a placement constraint on that label. This makes sure that your service only runs on those 3 nodes.
Make the 3rd node unable to schedule tasks by running docker node update --availability drain <NODE-ID>
Whenever you need that node back, run docker node update --availability active <NODE-ID>
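
Putting those steps together, a minimal sketch could look like this (the node IDs, service name and image are placeholders):

# Label the three nodes that are allowed to run service A
docker node update --label-add runs=serviceA <NODE-1-ID>
docker node update --label-add runs=serviceA <NODE-2-ID>
docker node update --label-add runs=serviceA <NODE-3-ID>

# Constrain the service to nodes carrying that label
docker service create --name serviceA --replicas 2 \
  --constraint 'node.labels.runs == serviceA' <IMAGE>

# Keep node 3 out of scheduling until it is needed
docker node update --availability drain <NODE-3-ID>

# If another node fails, bring the backup into play
docker node update --availability active <NODE-3-ID>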

Related

Redis Cluster Single Node issue

Can Redis run with cluster mode enabled but with a single instance? It seems we get the cluster status as fail when we try to deploy with a single node, because no slots are assigned to it.
https://serverfault.com/a/815813/510599
I understand we can manually add SLOTS after deployment. But I wonder if we can modify Redis source code to make this change.
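
For reference, the manual step usually amounts to assigning all 16384 hash slots to the single node after it starts. A minimal sketch with redis-cli (the host and port are placeholders):

# Assign every hash slot to the lone node so cluster_state becomes ok
redis-cli -h 127.0.0.1 -p 7000 cluster addslots $(seq 0 16383)

# Check the result
redis-cli -h 127.0.0.1 -p 7000 cluster info | grep cluster_state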

Uninstalling 2 instances on one node in a 2 node SQL Server 2012 cluster

I need some pointers on uninstalling 2 instances on a SQL Server 2012 cluster.
We have a SQL 2012 Active-Active cluster with Instance A on Node 1 and Instance B, Instance C and 3 services on Node 2. I need to uninstall Instances A and B from Node 2. The services could be removed/disabled. So I will have an Active-Active cluster with Instance C on Node 1 and no instance on Node 2.
I have never uninstalled an instance on a cluster and we don’t have any test system so I have a couple of questions.
1. When there is no instance on Node 2, does it become an Active-Passive cluster?
2. Do I have to take the node offline before the uninstall?
3. Can I get any links to uninstalling an instance? I have googled but have only come up with uninstalling nodes. I found one that talked about uninstalling an instance, but at the end of it, it uninstalled the node. https://www.mssqltips.com/sqlservertip/2172/uninstalling-a-sql-server-clustered-instance/
4. Am I right in understanding that you need the installation media to uninstall the instance? And that I cannot use the 'Uninstall a program' method for this?
5. I have a link below with a comment from Perry Whittle: https://www.sqlservercentral.com/Forums/Topic1470694-1550-1.aspx So does it mean that if I use the remove node option, it will remove only the instance BUT the node will still be there?
I had initially disabled the 2 instances on the node, thinking that would be a safer option, but that gave rise to continuous alerts on our Nagios monitoring system and Operations got errors during the monthly patching.
Thanks

Re-join cluster node after seed node got restarted

Let's imagine such a scenario. I have three nodes inside my Akka cluster (nodes A, B, C). Each node is deployed to a different physical device on the network.
All of those nodes are wrapped inside Topshelf Windows services.
Node A is my seed node; the others are simply 'worker' nodes with a port specified.
When I run the cluster and stop node (service) B or C and then restart them, the nodes rejoin with no issues.
I'd like to ask whether it's possible to handle another scenario: when I stop the seed node (node A) while the other node services keep running, and then restart node service A, I'd like nodes B and C to rejoin the cluster so the whole ecosystem works again.
Is such a scenario possible to implement? If yes, how should I do that?
In an Akka.NET cluster, any node can serve as a seed node for others as long as it's part of the cluster. "Seeds" are just a configuration thing, so you can define a list of well-known node addresses that you know are part of the cluster.
Regarding your case, there are several solutions I can think of:
A quite common approach is to define more than one seed node in the configuration, so that a single node doesn't become a single point of failure. As long as at least one of the configured seed nodes is alive, everything should work fine. Keep in mind that the seed nodes should be defined in each node's configuration in exactly the same order (see the configuration sketch after this list).
If your "worker" nodes have statically assigned endpoints, they can be used as seed nodes as well.
Since you can initialize the cluster programmatically from code, you can also use a 3rd-party service as a node discovery service. You can use e.g. Consul for that - I've started a project which provides such functionality. While it's not yet published, feel free to fork it or contribute if it helps you.
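
To illustrate the first point, here is a minimal HOCON sketch listing two seed nodes. The system name MyCluster, the host names and the ports are placeholders, and the remote transport section may be named differently (e.g. helios.tcp instead of dot-netty.tcp) depending on your Akka.NET version:

akka {
  actor.provider = cluster
  remote.dot-netty.tcp {
    hostname = "node-a"   # this node's own reachable address
    port = 4053
  }
  cluster {
    # the same list, in the same order, on every node
    seed-nodes = [
      "akka.tcp://MyCluster@node-a:4053",
      "akka.tcp://MyCluster@node-b:4053"
    ]
  }
}

With a list like this, nodes B and C can keep the cluster alive through node-b while node A is down, and node A simply rejoins via the other seed when it comes back.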

Couchbase 2.5, 2 nodes with 1 replica: if 1 node fails, the service is no longer available

We are testing Couchbase with a two-node cluster with one replica.
When we stop the service on one node, the other one does not respond until we restart the service or manually fail over the stopped node.
Is there a way to maintain the service from the good node when one node is temporarily unavailable?
If a node goes down, then in order to activate the replicas on the other node you will need to manually fail it over. If you want this to happen automatically, you can enable auto-failover, but in order to use that feature I'm pretty sure you must have at least a three-node cluster. When you want to add the failed node back, you can just re-add it to the cluster and rebalance.
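
If you want to script that manual failover instead of using the web console, Couchbase also exposes it over the REST API. A rough sketch with curl (the host, credentials and the failed node's name are placeholders, and the exact endpoints may vary between versions):

# Hard-fail the stopped node so its replica data on the surviving node becomes active
curl -u Administrator:password \
  -d 'otpNode=ns_1@192.168.0.12' \
  http://192.168.0.11:8091/controller/failOver

# Optionally enable auto-failover (subject to the three-node requirement mentioned above)
curl -u Administrator:password \
  -d 'enabled=true&timeout=30' \
  http://192.168.0.11:8091/settings/autoFailover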

High Availability of Resource Manager, Node Manager and Application Master in YARN

From reading the documentation around YARN, I couldn't find any relevant information about HA of the resource manager, node manager and application master in YARN. Are they single points of failure? If so, are there any plans to improve this?
A YARN cluster is comprised of a potentially large number of machines ("nodes"). To be part of the cluster, each node runs at least one service daemon. The service daemon's type determines the task this node plays in the cluster.
Almost all nodes run a "node manager" service daemon, which makes them "regular" YARN nodes. The node manager takes care of executing a certain part of a YARN job on this very machine, while other parts are executed on other nodes. It only makes sense to run a single node manager on each node. For a 1000-node YARN cluster, there are probably around 999 node managers running. So node managers are indeed redundantly distributed in the cluster. If one node manager fails, others are assigned to take over its tasks.
Every YARN job is an application of its own, and a dedicated application master daemon is started for the job on one of the nodes. For another application, another application master is started on a different node. The application's actual work is executed on yet other nodes in the cluster. The application master only controls the overall execution of the application. If an application master dies, the whole application has failed, but other applications will continue. The failed application has to be restarted.
The resource manager daemon runs on one dedicated YARN node, tasked only with starting applications (by starting the related application master), with collecting information about all nodes in the cluster and with assigning computing resources to applications. The resource manager currently isn't built to be HA, but this normally isn't a problem. If the resource manager dies, all applications need to be restarted.
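
To see those pieces in practice, the standard yarn CLI (assuming it is available on a cluster machine) can list the node managers registered with the resource manager and the currently running applications, each of which has its own application master:

# List all node managers registered with the resource manager
yarn node -list -all

# List running applications (one application master per application)
yarn application -list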