I want to use Redis for a particular use case, and I am not sure whether to go with Redis Cluster or with Twemproxy + Sentinel.
I know Cluster is generally considered the better choice; I am just wary of MOVED responses. When the client gets a MOVED response, it has to connect to another node, and during resharding it may be redirected yet again. Twemproxy, on the other hand, knows where the data resides, so the client never gets a MOVED response.
Twemproxy has its own problems, though: the extra hop may increase overall turnaround time; adding new nodes is difficult; and if it ejects a node, it cannot serve requests for the keys stored on that node. It also adds maintenance overhead, since I need Sentinels for my Redis instances plus an HA mechanism for Twemproxy itself.
Can anyone suggest whether I should go with Twemproxy or Cluster? I am leaning towards Twemproxy because I would avoid the back-and-forth of MOVED responses, but I am skeptical given the concerns above.
P.S. I am planning to use the Jedis client for Redis (if that helps).
First of all, I'm not familiar with Twemproxy, so I'll only talk about your concerns on Redis Cluster.
A Redis client can fetch the complete slot-to-node mapping, i.e. the location of keys, from Redis Cluster. It can cache the mapping on the client side and send each request to the right node. So most of the time it won't be redirected, i.e. it won't get a MOVED message.
However, if you add or remove a node, or reshard the data set, the client will receive a MOVED message, since it is still using the old mapping. In that case the client can update its local cache, and subsequent requests for that slot will go to the right node, i.e. no more MOVED messages.
A decent client library implements this optimization. If yours does, you don't need to worry about the MOVED penalty.
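To make the caching behaviour concrete, here is a minimal Python sketch of a client that caches the slot map and repairs it on MOVED. It is a toy simulation, not Jedis code: `FakeCluster`, `CachingClient` and `MovedError` are hypothetical names, there is no networking, and hash tags (`{...}`) are ignored. The only parts taken from Redis Cluster itself are the 16384 hash slots and the CRC16 (XMODEM) key hash.

```python
NUM_SLOTS = 16384  # Redis Cluster always has 16384 hash slots


def crc16_xmodem(data: bytes) -> int:
    """CRC16, XMODEM variant (poly 0x1021, init 0), used for key slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    # Hash tags ({...}) are deliberately ignored in this sketch.
    return crc16_xmodem(key.encode()) % NUM_SLOTS


class MovedError(Exception):
    """Stands in for a 'MOVED <slot> <node>' reply."""

    def __init__(self, slot, node):
        super().__init__(f"MOVED {slot} {node}")
        self.slot, self.node = slot, node


class FakeCluster:
    """Hypothetical in-memory stand-in for a cluster: slot -> owning node."""

    def __init__(self, owners):
        self.owners = dict(owners)

    def execute(self, node, key):
        slot = key_slot(key)
        owner = self.owners[slot]
        if owner != node:
            raise MovedError(slot, owner)
        return f"{key} served by {node}"


class CachingClient:
    def __init__(self, cluster):
        self.cluster = cluster
        self.slot_cache = dict(cluster.owners)  # fetched once at startup
        self.redirects = 0

    def get(self, key):
        slot = key_slot(key)
        try:
            return self.cluster.execute(self.slot_cache[slot], key)
        except MovedError as moved:
            self.redirects += 1
            self.slot_cache[moved.slot] = moved.node  # repair the stale entry
            return self.cluster.execute(moved.node, key)


owners = {s: ("node-a" if s < NUM_SLOTS // 2 else "node-b") for s in range(NUM_SLOTS)}
cluster = FakeCluster(owners)
client = CachingClient(cluster)

client.get("user:1")                           # cache is fresh: no redirect
cluster.owners[key_slot("user:1")] = "node-c"  # simulate a reshard
client.get("user:1")                           # stale cache: one MOVED
client.get("user:1")                           # goes straight to node-c
print(client.redirects)                        # prints 1
```

In this model the cost of a reshard is one extra round trip per affected slot, after which the client is back to direct access.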
Related
I have a cluster of backend servers on GCP, and they need to send messages to each other. All the servers need to receive every message, but I can tolerate a low error rate. I can deal with receiving the message more than once on a given server. Packet ordering doesn't matter.
I don't need much of a persistence layer. A message becomes stale within a couple of seconds after sending it.
I wired up Google Cloud PubSub and pretty quickly realized that for a given subscription, you can have any number of subscribers but only one of them is guaranteed to get the message. I considered making the subscribers all fail to ack it, but that seems like a gross hack that probably won't work well.
My server cluster is sized dynamically by an autoscaler. It spins up VM instances as needed, with dynamic hostnames and IP addresses. There is no convenient way to map the dynamic hosts to static subscriptions, but it feels like that's my only real option: Create more subscriptions than my max server pool size, and then use some sort of paxos system (runtime config, zookeeper, whatever) to allocate servers to subscriptions.
I'm starting to feel that even though my use case feels really simple ("Every server can multicast a message to every other server in my group"), it may not be a good fit for Cloud PubSub.
Should I be using GCM/FCM? Or some other technology?
Cloud Pub/Sub may or may not be a fit for you, depending on the size of your server cluster. Failing to ack the messages certainly won't work because you can't be sure each instance will get the message; it could just be redelivered to the same instance over and over again.
You could use multiple subscriptions and have each instance create a new subscription when it starts up. This only works if you don't plan to scale beyond 10,000 instances in your cluster, as that is the maximum number of subscriptions per topic allowed. The difficulty here is in cleaning up subscriptions for instances that go down. Ones that cleanly shut down could probably delete their own subscriptions, but there will always be some that don't get cleaned up. You'd need some kind of external process that can determine if the instance for each subscription is still up and running and if not, delete the subscription. You could use GCE shutdown scripts to catch this most of the time, though there will still be edge cases where deletes would have to be done manually.
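The difference between one shared subscription and one subscription per instance can be illustrated with a small in-memory model. This is only a sketch of the delivery semantics described above, not the Cloud Pub/Sub client library: `ToyTopic`, `broadcast` and `stale_subscriptions` are hypothetical names, the `sub-<instance>` naming scheme is an assumption, and the round-robin stands in for whatever load balancing the real service does.

```python
class ToyTopic:
    """Each subscription gets every message once; subscribers that share
    a subscription split that subscription's messages between them."""

    def __init__(self):
        self.subscriptions = {}  # subscription name -> list of inboxes
        self.delivered = {}      # subscription name -> messages delivered so far

    def subscribe(self, name, inbox):
        self.subscriptions.setdefault(name, []).append(inbox)
        self.delivered.setdefault(name, 0)

    def publish(self, message):
        for name, inboxes in self.subscriptions.items():
            # One delivery per subscription (round-robin among its subscribers).
            inboxes[self.delivered[name] % len(inboxes)].append(message)
            self.delivered[name] += 1


def broadcast(subscription_for, instances, messages):
    """Attach every instance via subscription_for(instance), then publish."""
    topic = ToyTopic()
    inboxes = {i: [] for i in instances}
    for i in instances:
        topic.subscribe(subscription_for(i), inboxes[i])
    for m in messages:
        topic.publish(m)
    return inboxes


def stale_subscriptions(subscription_names, live_instances):
    """Per-instance subscriptions whose instance no longer exists."""
    expected = {f"sub-{i}" for i in live_instances}
    return sorted(set(subscription_names) - expected)


instances = ["vm-1", "vm-2", "vm-3"]
messages = ["m1", "m2", "m3"]

shared = broadcast(lambda i: "shared-sub", instances, messages)
per_instance = broadcast(lambda i: f"sub-{i}", instances, messages)

print(shared["vm-1"])        # only a share of the messages
print(per_instance["vm-1"])  # every message
print(stale_subscriptions(["sub-vm-1", "sub-vm-2", "sub-vm-9"], ["vm-1", "vm-2"]))
```

The `stale_subscriptions` helper is the kind of check the external cleanup process would run: compare the subscriptions that exist against the instances that are actually alive and delete the difference.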
Ok, so what I have are 2 web servers running inside of a Windows NLB clustered environment. The servers are identical in every respect, and as you'd expect in an NLB clustered environment, everybody is hitting the cluster name and not the individual members. We also have affinity turned off on the members in the cluster.
But, what I'm trying to do is to turn on some caching for a few large files (MP3s). It's easy enough to dial up a Redis node on one particular member and hit it, everything works like you'd expect. I can pull the data from the cache and serve it up as needed.
Now, let's add the overhead of the NLB. With an NLB in play, you may not be hitting the same web server each time. You might make your first hit to member 01, and the second hit to 02. So, I'd need a way to sync between the two servers. That way it doesn't matter which cluster member you hit, you are going to get the same data.
I don't need to worry about one cache being out of date, the only thing I'm storing in there is read only data from an internal web service.
I've only got 2 servers and it looks like redis clusters need 3. So I guess that's out.
Is this the best approach? Or perhaps there is something else better?
Reasons for Redis: we want the cache to be in-memory only, with no writes to the database. I thought this would be a good fit, but I need to make sure the data is available on both servers.
It's not possible to have multi-master Redis (writing on both nodes). That said, its replication is blazing fast (check Redis's slaveof command).
But why do you need it on the same server? Access it as a service, so that every web node reads the same data. If the main server goes down, the slave can be promoted to master; note that with plain replication this promotion is not automatic and needs Redis Sentinel or a manual SLAVEOF NO ONE.
One observation: Redis does use the disk, asynchronously. If the append-only file is enabled, Redis rewrites (compacts) it in the background from time to time, depending on its size.
As explained in the StackExchange.Redis Basics documentation, you can connect to multiple Redis servers, and StackExchange.Redis will automatically determine the master/slave setup. Quoting the relevant part:
A more complicated scenario might involve a master/slave setup; for this usage, simply specify all the desired nodes that make up that logical redis tier (it will automatically identify the master):
ConnectionMultiplexer redis = ConnectionMultiplexer.Connect("server1:6379,server2:6379");
I performed a test in which I triggered a failover, such that the master would go down for a bit, causing the old slave to become the new master, and the old master to become the new slave. I noticed that in spite of this change, StackExchange.Redis keeps sending commands to the old master, causing write operations to fail.
Questions on the above:
How does StackExchange.Redis decide which endpoint to use?
How should multiple endpoints (as in the above example) be used?
I also noticed that for each connect, StackExchange.Redis opens two physical connections, one of which is some sort of subscription. What is this used for exactly? Is it used by Sentinel instances?
What should happen there is that it uses a number of things (in particular the defined replication configuration) to determine which is the master, and direct traffic at the appropriate server (respecting the "server" parameter, which defaults to "prefer master", but which always sends write operations to a master).
If a "cannot write to a readonly slave" (I can't remember the exact text) error is received, it will try to re-establish the configuration, and should switch automatically to respect this. Unfortunately, redis does not broadcast configuration changes, so the library can't detect this ahead of time.
Note that if you use the library methods to change master, it can exploit pub/sub to detect that change immediately and automatically.
Re the second connection: that would be for pub/sub; it spins this up ahead of time, as by default it attempts to listen for the library-specific configuration broadcasts.
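The "fail on a read-only replica, then reconfigure" behaviour can be sketched as a toy model. This is Python rather than C#, and it is not StackExchange.Redis code: `ToyServer`, `ToyMultiplexer` and `ReadOnlyError` are hypothetical, and retrying the failed write after reconfiguring is an assumption made for the example, not something the description above promises.

```python
class ReadOnlyError(Exception):
    """Stands in for the error a replica returns when asked to write."""


class ToyServer:
    def __init__(self, name, is_master):
        self.name, self.is_master, self.data = name, is_master, {}

    def write(self, key, value):
        if not self.is_master:
            raise ReadOnlyError(f"{self.name} is a read-only replica")
        self.data[key] = value
        return self.name


class ToyMultiplexer:
    def __init__(self, servers):
        self.servers = servers
        self.reconfigurations = 0
        self.master = self._find_master()

    def _find_master(self):
        # Ask every configured endpoint for its role and keep the master.
        return next(s for s in self.servers if s.is_master)

    def write(self, key, value):
        try:
            return self.master.write(key, value)
        except ReadOnlyError:
            # The cached topology is stale: re-resolve it once and retry.
            self.reconfigurations += 1
            self.master = self._find_master()
            return self.master.write(key, value)


server1 = ToyServer("server1:6379", is_master=True)
server2 = ToyServer("server2:6379", is_master=False)
mux = ToyMultiplexer([server1, server2])

first = mux.write("k", "v1")                        # handled by server1
server1.is_master, server2.is_master = False, True  # failover behind the client's back
second = mux.write("k", "v2")                       # error, reconfigure, then server2
print(first, second, mux.reconfigurations)
```

The point of the model is that, without a broadcast of the configuration change, the first write after a failover is what reveals the stale topology.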
My understanding could be amiss here. As I understand it, Couchbase uses a smart client to automatically select which node to write to or read from in a cluster. What I DON'T understand is, when this data is written/read, is it also immediately written to all other nodes? If so, in the event of a node failure, how does Couchbase know to use a different node from the one that was 'marked as the master' for the current operation/key? Do you lose data in the event that one of your nodes fails?
This sentence from the Couchbase Server Manual gives me the impression that you do lose data (which would make Couchbase unsuitable for high availability requirements):
With fewer larger nodes, in case of a node failure the impact to the application will be greater
Thank you in advance for your time :)
By default, when data is written to Couchbase, the client returns success as soon as the data has been written to one node's memory. Couchbase then saves it to disk and replicates it.
If you want to ensure that data has been persisted to disk, most client libraries have functions that let you do that. With the same functions you can also ensure that the data has been replicated to another node. This feature is called observe.
When a node goes down, it should be failed over. Couchbase Server can do that automatically if an auto-failover timeout is set in the server settings. For example, if you have a 3-node cluster, the stored data has 2 replicas, and one node goes down, you will not lose data. If a second node fails, you still won't lose the data: it will be available on the last node.
If the node that was the master goes down and is failed over, another live node becomes the master. Your client is configured with all the servers in the cluster, so if it is unable to retrieve data from one node, it tries another.
Also, if you have 2 nodes at your disposal, you can install 2 separate Couchbase servers, configure XDCR (cross datacenter replication) between them, and check server availability yourself with an HA proxy or something similar. That way you have only one IP to connect to (the proxy's), and the proxy automatically fetches data from whichever server is alive.
Fortunately, Couchbase is a good fit for HA systems.
Let me explain in a few sentences how it works. Suppose you have a 5-node cluster. The application, through the client API/SDK, is always aware of the topology of the cluster (and of any change in that topology).
When you set or get a document, the client API uses the same algorithm as the server to choose which node it should be written to: the client selects the node using a CRC32 hash of the key and writes to that node. The cluster then asynchronously copies 1 or more replicas to other nodes (depending on your configuration).
Couchbase has only 1 active copy of a document at a time, so it is easy to stay consistent: applications always get and set this active copy.
In case of failure, the server has some work to do. Once the failure is discovered (automatically or by a monitoring system), a "failover" occurs: the replicas are promoted to active, and it is now possible to work as before. Usually you then rebalance to spread the data evenly across the remaining nodes.
The sentence you are quoting simply says that the fewer nodes you have, the bigger the impact of a failure or rebalance, since you have to route the same number of requests to a smaller number of nodes. Fortunately, you do not lose data ;)
You can find very detailed information about how this works on the Couchbase CTO's blog:
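The key-to-node routing and the replica promotion can be sketched in a few lines of Python. This is a simplified model, not SDK code: `ToyCluster` and `vbucket_for` are hypothetical names, the 1024-vBucket count is Couchbase's usual default and is assumed here, and real SDKs post-process the CRC32 value before mapping it to a vBucket, so the numbers below will not match a real cluster.

```python
import zlib

NUM_VBUCKETS = 1024  # usual Couchbase default; an assumption in this sketch


def vbucket_for(key: str) -> int:
    # Simplified: real SDKs post-process the CRC32 before taking the modulus.
    return zlib.crc32(key.encode()) % NUM_VBUCKETS


class ToyCluster:
    def __init__(self, nodes, replicas=1):
        self.nodes = list(nodes)
        # vBucket -> ordered node list: index 0 is active, the rest are replicas.
        self.vbucket_map = {
            vb: [self.nodes[(vb + r) % len(self.nodes)] for r in range(replicas + 1)]
            for vb in range(NUM_VBUCKETS)
        }

    def active_node(self, key):
        # Client and server share the map, so both pick the same node.
        return self.vbucket_map[vbucket_for(key)][0]

    def fail_over(self, dead):
        for vb, chain in self.vbucket_map.items():
            # Dropping the dead node promotes the next replica to active.
            self.vbucket_map[vb] = [n for n in chain if n != dead]
        self.nodes.remove(dead)


cluster = ToyCluster(["n1", "n2", "n3", "n4", "n5"], replicas=1)
key = "user::42"

before = cluster.active_node(key)
replica = cluster.vbucket_map[vbucket_for(key)][1]

cluster.fail_over(before)          # the active node for this key dies
after = cluster.active_node(key)   # its replica is now the active copy
print(before != after, after == replica)
```

After the failover some vBuckets are left with no replica, which is why a rebalance normally follows.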
http://damienkatz.net/2013/05/dynamo_sure_works_hard.html
Note: I am working as developer evangelist at Couchbase
I have implemented a WCF service, and now my client wants three copies of it running independently on different machines, in a master-slave arrangement. I need a solution that enables the following behavior:
The first service to start asks the other two whether they are alive. If not, it becomes the master and is the one that is active on the network. The other two, once started, see that a master is already alive, so they become slaves and go to sleep. There needs to be some mechanism to periodically check whether the master is dead and, if so, choose the next live copy to become master (until that one dies in turn).
I think this should be some kind of architectural pattern, so I would be more than glad to get any advice.
thanks
I would suggest looking at the WCF peer channel (System.Net.PeerToPeer) to facilitate each node knowing about the other nodes. Here is a link that offers a decent introduction.
As for determining which node should be the master, the trick will be negotiating which node should be the master if two or more nodes come online at about the same time. Once the nodes become aware of each other, there needs to be some deterministic mechanism for establishing the master. For example, you could use the earliest creation time, the lowest value of the last octet of each node's IP address, or anything really. You'll just need to define some scheme that allows the nodes to negotiate this automatically.
Finally, as for checking if the master is still alive, I would suggest using the event-based mechanism described here. The master could send out periodic health-and-status events that the other nodes would register for. I'd put a try/catch/finally block at the code entry point so that if the master were to crash, it could publish one final MasterClosing event to let the slaves know it's going away. What this does not account for is a system crash, e.g., power failure, etc. To handle this, provide a timeout in the slaves so that when the timeout expires, the slaves can query the master to see if it's still there. If not, the slaves can negotiate between themselves using your deterministic algorithm about who should be the next master.
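As a rough illustration of the election rule and the heartbeat timeout, here is a small Python sketch. It is language-neutral logic rather than WCF code: the `Node` class, the "lowest ID wins" rule and the 3-second timeout are all assumptions chosen for the example, timestamps are passed in explicitly instead of using a real clock, and the step where slaves query the master before re-electing is left out.

```python
class Node:
    def __init__(self, node_id, heartbeat_timeout=3.0):
        self.node_id = node_id
        self.heartbeat_timeout = heartbeat_timeout
        self.master_id = None
        self.last_heartbeat = None

    def elect(self, alive_ids):
        """Deterministic rule: the lowest ID among live nodes wins."""
        self.master_id = min(alive_ids)
        return self.master_id

    def on_heartbeat(self, from_id, now):
        # Only heartbeats from the current master reset the timer.
        if from_id == self.master_id:
            self.last_heartbeat = now

    def master_suspected_dead(self, now):
        if self.master_id == self.node_id:
            return False  # the master does not monitor itself
        return (self.last_heartbeat is None
                or now - self.last_heartbeat > self.heartbeat_timeout)


nodes = {i: Node(i) for i in (1, 2, 3)}
for n in nodes.values():
    n.elect(nodes.keys())        # everyone independently picks node 1
for n in nodes.values():
    n.on_heartbeat(1, now=0.0)   # master's heartbeat at t=0

# Node 1 loses power after t=0, so no MasterClosing event is ever sent.
survivors = [n for n in nodes.values() if n.node_id != 1]
suspected = [n.master_suspected_dead(now=5.0) for n in survivors]
new_masters = {n.elect([s.node_id for s in survivors]) for n in survivors}
print(suspected, new_masters)
```

Because every node applies the same rule to the same list of live nodes, they reach the same answer without a separate negotiation round; the hard part in practice is agreeing on who is alive.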