I'm using Cassandra 2.1.2 and the DataStax cassandra-driver-core 2.1.2. Here is a strange problem: when a keyspace is created (or a table is created or deleted), some of my clients receive duplicated events, about 200+ times. My cluster and my clients are in different places (not on one LAN).
This causes a lot of problems. Once a client receives such an event, it refreshes the schema, fetching all schema information from system.keyspaces and so on; in the end it also calls refreshNodeListAndTokenMap. All of these operations cause data transfer, and 200+ events in one second is horrible. Does anybody know why this happens and how to prevent it?
Thanks for reading this.
When you mention "some of my clients receive duplicated events", I'm assuming you have multiple Cluster instances; is that correct? If you only have one Cluster object and you are getting multiple events, I wonder if that is a bug. The java-driver will only subscribe 1 connection (named the 'Control Connection') to schema, topology and node status changes per Cluster instance. That connection is established to one of your contact points initially (and if the connection is lost it will choose another node in the cluster).
Without understanding more about your configuration, I would consider the following:
Follow the 4 simple rules, namely only creating 1 Cluster instance per application (JVM).
If you want to prevent 1 node from being responsible for sending events to your clients, randomize your contact points so the same one isn't always primarily chosen for the Control Connection. (Note: there is a ticket, JAVA-618, so the java-driver can do that for you.) See the sketch below.
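For illustration, here is a minimal sketch of both points with the 2.1 Java driver: one shared Cluster instance per JVM, built from a shuffled list of contact points. The node names are placeholders.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class CassandraClient {
    // One Cluster (and usually one Session) per application/JVM.
    private static final Cluster CLUSTER = buildCluster();
    private static final Session SESSION = CLUSTER.connect();

    private static Cluster buildCluster() {
        // Placeholder contact points; shuffle them so the same node is not
        // always picked first for the Control Connection.
        List<String> contactPoints =
                Arrays.asList("cass-node-1", "cass-node-2", "cass-node-3");
        Collections.shuffle(contactPoints);
        Cluster.Builder builder = Cluster.builder();
        for (String cp : contactPoints) {
            builder.addContactPoint(cp);
        }
        return builder.build();
    }

    public static Session session() {
        return SESSION;
    }
}
```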
I have implemented a replicated key/value store on top of Redis. I have passive replication, in which all write and read requests are forwarded to the leader, which always returns the last value written for the key. The system uses a quorum, so it works even if some nodes are down or there is a network partition. In that case, the values on those nodes are not consistent, but this does not prevent the system from returning the most recently written value. Do I have an eventual consistency model or a strict one? Thanks
You mentioned that it is a quorum-based system with one node as the leader, and that read and write requests are always forwarded to the leader.
For the sake of simplicity, let's assume there are 5 nodes in the system and one of them is the leader; the other 4 nodes are secondaries.
Typically, quorum-based systems work on consensus protocols. So out of 5 nodes, if 3 of the nodes have the latest value, that is enough to always return the latest value.
This is how writes should work:
The leader first updates the key/value in its database.
It forwards the request to the remaining 4 nodes, which are secondaries.
The leader waits for at least 2 of the secondaries to acknowledge that they have updated the latest key/value in their databases. That means at least 3 of the 5 nodes have the latest updated value.
If the leader does not get a response from at least 2 of the secondary nodes within the specified time period (the request times out), the leader returns a failure to the client and the client needs to retry.
So a write request succeeds only if 3 out of 5 nodes have the latest value. At any point, up to 2 of the nodes may not have the latest value and can catch up later (see the sketch below).
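As a rough, library-agnostic sketch of that leader-side write path (the Secondary interface and applyLocally are hypothetical stand-ins, not part of any particular product):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

// Sketch of the leader-side write path in a 5-node quorum system.
public class QuorumLeader {

    /** Hypothetical stub for a secondary node. */
    interface Secondary {
        CompletableFuture<Void> replicate(String key, String value);
    }

    // Leader + 2 secondary acks = 3 of 5 nodes, i.e. a majority quorum.
    private static final int REQUIRED_SECONDARY_ACKS = 2;

    private final List<Secondary> secondaries; // the 4 secondaries

    QuorumLeader(List<Secondary> secondaries) {
        this.secondaries = secondaries;
    }

    /** Returns true only if the write reached a quorum before the timeout. */
    boolean write(String key, String value, long timeoutMillis) {
        applyLocally(key, value); // 1) the leader updates its own database first

        // 2) forward the write to all secondaries in parallel
        List<CompletableFuture<Void>> acks = secondaries.stream()
                .map(s -> s.replicate(key, value))
                .collect(Collectors.toList());

        // 3) wait until at least 2 secondaries acknowledge, or the deadline passes
        long deadline = System.currentTimeMillis() + timeoutMillis;
        int ackCount = 0;
        for (CompletableFuture<Void> ack : acks) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                break;
            }
            try {
                ack.get(remaining, TimeUnit.MILLISECONDS);
                ackCount++;
            } catch (Exception e) {
                // slow or failed secondary: keep waiting on the others
            }
            if (ackCount >= REQUIRED_SECONDARY_ACKS) {
                return true; // quorum reached: report success to the client
            }
        }
        return false; // 4) quorum not reached in time: the client should retry
    }

    private void applyLocally(String key, String value) {
        // placeholder for the leader's local key/value update
    }
}
```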
For reads, the leader (which always has the latest key/value) returns the response.
What happens when the leader machine is unable to serve requests due to some issue (e.g. a network error)?
Typically these systems have a leader election protocol for when the current leader is unable to serve requests.
The new leader is chosen from among the secondaries that have the latest updates. So the newly elected leader should have the latest state and should start serving read requests with the latest set of values.
Your system is strictly consistent.
I have a question about a tricky situation in an event-driven system that I want to ask for advice on. Here is the situation:
In our system, I use Redis as an in-memory cache database and Kafka as the message queue. To increase the performance of Redis, I use Lua scripting to process data and, at the same time, push events into a blocking list in Redis. Then there is a process that picks the events from that blocking list and moves them to Kafka. This process has 3 steps:
1) Read events from the Redis list
2) Produce them in a batch to Kafka
3) Delete the corresponding events from Redis
Unfortunately, if the process dies between steps 2 and 3 (i.e. after producing all events to Kafka but before deleting the corresponding events from Redis), then after the process is restarted it will produce duplicate events into Kafka, which is unacceptable. Does anyone have a solution for this problem? Thanks in advance, I really appreciate it.
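For context, the mover process described above looks roughly like the sketch below (using the Jedis and Kafka Java clients purely for illustration; hosts, list and topic names are placeholders, and producers are assumed to RPUSH events onto the list). The failure window is between steps 2 and 3.

```java
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import redis.clients.jedis.Jedis;

public class RedisToKafkaMover {

    public static void main(String[] args) throws InterruptedException {
        Jedis redis = new Jedis("redis-host", 6379);          // placeholder host
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-host:9092");    // placeholder host
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        String queue = "events";                               // placeholder list name
        while (true) {
            // 1) read a batch of the oldest events from the Redis list
            List<String> batch = redis.lrange(queue, 0, 99);
            if (batch.isEmpty()) {
                Thread.sleep(100);
                continue;
            }
            // 2) produce the batch to Kafka
            for (String event : batch) {
                producer.send(new ProducerRecord<>("events-topic", event));
            }
            producer.flush();
            // *** If the process dies here, step 3 never runs and the same
            // *** events are produced again after a restart.
            // 3) delete the events that were just produced
            redis.ltrim(queue, batch.size(), -1);
        }
    }
}
```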
Kafka consumers are prone to reprocessing events, even if the events were written exactly once. Reprocessing is almost certainly caused by rebalancing of clients. Rebalancing might be triggered by:
Modification of partitions on a topic.
Redeployment of servers and subsequent temporary unavailability of clients.
Slow message consumption and subsequent recreation of client by the broker.
In other words, if you need to be sure that messages are processed exactly once, you need to ensure that at the client. You could do so by setting a partition key that ensures related messages are consumed in a sequential fashion by the same client. This client could then maintain a database record of what it has already processed.
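A minimal sketch of that idea with the Kafka Java consumer, assuming each message carries a unique event id as its key, and using an in-memory set as a stand-in for the database record of processed ids (host, group and topic names are placeholders):

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class IdempotentConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-host:9092");   // placeholder
        props.put("group.id", "event-processors");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("events-topic"));

        // Stand-in for a durable store (e.g. a database table) of processed ids.
        Set<String> processedIds = new HashSet<>();

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                String eventId = record.key();     // the partition key / unique id
                if (processedIds.contains(eventId)) {
                    continue;                      // duplicate after a rebalance: skip
                }
                process(record.value());
                processedIds.add(eventId);         // persist this in the real system
            }
        }
    }

    private static void process(String value) {
        // application-specific handling
    }
}
```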
My app will work as follows:
I'll have a bunch of replica servers and a load balancer. The data updates will be managed outside CometD. EDIT: I still intend to notify each CometD server of those updates, if necessary, so they can respond back to clients.
The clients are only subscribing to those updates (i.e. read only), so the CometD server nodes don't need to know anything about each other's behavior.
Am I right in thinking I could have server side "client" instances on the load balancer, per client connection, where each instance listens on the same channel as its respective client and forwards any messages back to it? If so, are there any disadvantages to this approach, instead of using Oort?
Reading the docs about Oort, it seems that the nodes "know" about each other, which I don't need. Would it be better, then, for me to avoid using Oort altogether in my case? My concern is that if I ended up adding many, many nodes, the fact that they communicate with each other could mean unnecessary processing.
The description of the issue specifies that the data updates are managed outside CometD, but it does not detail how the CometD servers are notified of these data updates.
The two common solutions are A) to notify each CometD server or B) to use Oort.
In solution A) you have an event that triggers a data update, and some external application performs the data update on, say, a database. At this point the external application must notify the CometD servers that there was a data update. If the external application runs on a JVM, it can use the CometD Java client to send a message to each CometD server, notifying them of the data update; in turn, the CometD servers will notify the remote clients.
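As a minimal sketch of what that external application could do with the CometD Java client (server URLs, channel name and payload are placeholders; check the exact API of your CometD version):

```java
import java.util.HashMap;
import java.util.Map;

import org.cometd.client.BayeuxClient;
import org.cometd.client.transport.LongPollingTransport;
import org.eclipse.jetty.client.HttpClient;

public class DataUpdateNotifier {

    public static void main(String[] args) throws Exception {
        // Placeholder list of CometD servers; in solution A the external
        // application must know and notify every one of them.
        String[] cometdUrls = {
                "http://cometd1.example.com/cometd",
                "http://cometd2.example.com/cometd"
        };

        HttpClient httpClient = new HttpClient();
        httpClient.start();

        Map<String, Object> update = new HashMap<>();
        update.put("entity", "order");   // placeholder payload
        update.put("id", 42);

        for (String url : cometdUrls) {
            BayeuxClient client =
                    new BayeuxClient(url, new LongPollingTransport(null, httpClient));
            client.handshake();
            if (client.waitFor(5000, BayeuxClient.State.CONNECTED)) {
                // The CometD server broadcasts this to subscribed remote clients.
                client.getChannel("/data/updates").publish(update);
            }
            client.disconnect();
            client.waitFor(1000, BayeuxClient.State.DISCONNECTED);
        }

        httpClient.stop();
    }
}
```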
In solution B) the external application must notify just one CometD server that there was a data update; the Oort cluster will do the rest, broadcasting that message across the cluster, and then to remote clients.
Solution A) does not require the Oort cluster, but requires the external application to know every node and send a message to each of them.
Solution B) uses Oort, so the external application needs only to know one Oort node.
Oort requires a bit of additional processing because the nodes are interconnected, but depending on the case this processing may be negligible, or the complications of notifying each CometD server "manually" (as in solution A) may be greater than running Oort.
I don't understand exactly what you mean by having "server side client instances on the load balancer". Typically load balancers don't run a JVM so it is not possible to run CometD clients on them, so this sentence does not sound right.
Besides the CometD documentation, you can also look at these slides.
My understanding could be amiss here. As I understand it, Couchbase uses a smart client to automatically select which node to write to or read from in a cluster. What I DON'T understand is, when this data is written/read, is it also immediately written to all other nodes? If so, in the event of a node failure, how does Couchbase know to use a different node from the one that was 'marked as the master' for the current operation/key? Do you lose data in the event that one of your nodes fails?
This sentence from the Couchbase Server Manual gives me the impression that you do lose data (which would make Couchbase unsuitable for high availability requirements):
With fewer larger nodes, in case of a node failure the impact to the application will be greater
Thank you in advance for your time :)
By default, when data is written into Couchbase, the client returns success just after the data is written to one node's memory. After that, Couchbase saves it to disk and replicates it.
If you want to ensure that data is persisted to disk, most client libraries have functions that allow you to do that. With the help of those functions you can also ensure that the data is replicated to another node. This mechanism is called observe.
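For example, with the 1.x Couchbase Java SDK a write can be made to wait for persistence and replication (a sketch only; exact class and package names may differ between SDK versions, and the node URIs are placeholders):

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

import com.couchbase.client.CouchbaseClient;
import net.spy.memcached.PersistTo;
import net.spy.memcached.ReplicateTo;

public class DurableWrite {

    public static void main(String[] args) throws Exception {
        List<URI> nodes = Arrays.asList(
                URI.create("http://node1:8091/pools"),   // placeholder nodes
                URI.create("http://node2:8091/pools"));
        CouchbaseClient client = new CouchbaseClient(nodes, "default", "");

        // Succeeds only once the value is persisted to disk on the master
        // and replicated to at least one other node (observe under the hood).
        boolean ok = client.set("user:1234", 0, "{\"name\":\"alice\"}",
                PersistTo.MASTER, ReplicateTo.ONE).get();

        System.out.println("durable write ok: " + ok);
        client.shutdown();
    }
}
```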
When one node goes down, it should be failed over. Couchbase Server can do that automatically when the auto-failover timeout is set in the server settings. For example, if you have a 3-node cluster, the stored data has 2 replicas, and one node goes down, you will not lose data. If a second node fails, you still will not lose all data; it will be available on the last node.
If a node that was the master for some data goes down and is failed over, another live node becomes the master. In your client you point to all servers in the cluster, so if it is unable to retrieve data from one node, it tries to get it from another.
Also, if you have 2 nodes at your disposal, you can install 2 separate Couchbase servers, configure XDCR (cross datacenter replication), and check server availability manually with HA proxies or something similar. That way you get only one IP to connect to (the proxy's IP), which will automatically get data from the live server.
Couchbase is indeed a good system for HA requirements.
Let me explain in a few sentences how it works; suppose you have a 5-node cluster. The application, using the client API/SDK, is always aware of the topology of the cluster (and of any change in the topology).
When you set/get a document in the cluster, the client API uses the same algorithm as the server to choose which node it should be written to. The client selects the node using a CRC32 hash of the key and writes to that node. Then, asynchronously, the cluster copies 1 or more replicas to the other nodes (depending on your configuration).
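Conceptually, the key-to-node selection looks like the simplified sketch below: the key is hashed with CRC32 to one of 1024 vBuckets, and the cluster map (known to every smart client) says which node currently owns that vBucket. This is a simplification of the real algorithm, for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class KeyLocator {

    private static final int NUM_VBUCKETS = 1024; // Couchbase default

    /** Simplified sketch: hash the key with CRC32 and map it to a vBucket. */
    static int vbucketForKey(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % NUM_VBUCKETS);
    }

    public static void main(String[] args) {
        int vbucket = vbucketForKey("user:1234");
        // The cluster map tells the client which node is the active owner of
        // this vBucket and which nodes hold its replicas.
        System.out.println("key 'user:1234' -> vBucket " + vbucket);
    }
}
```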
Couchbase has only 1 active copy of a document at a time, so it is easy to be consistent: applications get and set this active copy.
In case of failure, the server has some work to do. Once the failure is discovered (automatically or by a monitoring system), a failover occurs. This means that the replicas are promoted to active and it is now possible to work as before. Usually you then rebalance the nodes to balance the cluster properly.
The sentence you are commenting on simply says that the fewer nodes you have, the bigger the impact in case of a failure/rebalance, since you will have to route the same number of requests to a smaller number of nodes. Fortunately, you do not lose data ;)
You can find some very detailed information about this way of working on Couchbase CTO blog:
http://damienkatz.net/2013/05/dynamo_sure_works_hard.html
Note: I work as a developer evangelist at Couchbase.
The Scenario:
We have multiple nodes distributed geographically, on each of which we want to have queues collecting messages for that location. We then want to send this collected data from every queue on every node to its corresponding queue in a central location. At the central node, we will pull out the data collected in the queues (from the other nodes), process it, and store it persistently.
Constraints:
Data is very important to us. Therefore, we have to make sure that we are not losing data in any case.
Therefore, we need persistent queues on every node so that even if the node goes down for some random reason, when we bring it up we have the collected data safe with us and we can send it to the central node where it can be processed.
Similarly, if the central node goes down, the data must remain at all the other nodes so that when the central node comes up we can send all the data to the central node for processing.
Also, the data on the central node must not get duplicated or stored again. That is, data collected on one of the nodes should be stored on the central node only once.
The data that we are collecting is very important to us and the order of data delivery to the central node is not an issue.
Our Solution
We have considered a couple of solutions, out of which I am going to list the one that we thought would be the best. A possible solution (in our opinion) is to use Redis to maintain queues everywhere, because Redis provides persistent storage. Then a daemon running on each of the geographically separated nodes reads the data from the queue and sends it to the central node. The central node, on receiving the data, sends an ACK to the node it received the data from (because the data is very important to us), and on receiving the ACK, the node deletes the data from the queue. Of course, there will be a timeout period within which the ACK must be received.
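A rough sketch of that per-node daemon loop, using Redis via Jedis as the local persistent queue (the list name is a placeholder, producers are assumed to LPUSH new events so the tail holds the oldest one, and sendToCentralAndWaitForAck is a hypothetical stand-in for however the data is shipped to the central node):

```java
import redis.clients.jedis.Jedis;

public class ForwarderDaemon {

    public static void main(String[] args) {
        Jedis redis = new Jedis("localhost", 6379);   // local persistent queue
        String queue = "collected-events";            // placeholder list name

        while (true) {
            // Peek at the oldest event without removing it yet.
            String event = redis.lindex(queue, -1);
            if (event == null) {
                sleep(100);
                continue;
            }
            // Delete only after the central node has acknowledged receipt;
            // if the ACK never arrives (timeout), the event stays queued and
            // is retried, so the central node must de-duplicate on its side.
            if (sendToCentralAndWaitForAck(event)) {
                redis.rpop(queue);
            }
        }
    }

    private static boolean sendToCentralAndWaitForAck(String event) {
        // hypothetical: network send to the central node + wait for ACK with a timeout
        return true;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```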
The Problem
The above solution (in our view) will work fine, but the issue is that we don't want to implement the whole synchronization protocol ourselves, for the simple reason that we might get it wrong. We were unable to find this particular kind of synchronization in Redis, so we are open to AMQP-based queues like RabbitMQ, or other messaging systems like ZeroMQ, etc. Again, we were not able to figure out whether we can do this with those solutions.
Do these Message Queues or any other data store provide features that can be the solution to our problem? If yes, then how?
If not, then is our solution good enough?
Can anyone suggest a better solution?
Can there be a better way to do this?
What would be the best way to make it fail safe?
The data that we are collecting is very important to us and the order of data delivery to the central node is not an issue.
You could do this with RabbitMQ by setting up the central node (or cluster of nodes) to be a consumer of messages from the other nodes, and using the message acknowledgement feature. This feature means that the central node(s) can ack delivery, so that other nodes only delete messages after the ack. See for example: http://www.rabbitmq.com/tutorials/tutorial-two-python.html
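A minimal sketch of the central node's consumer with the RabbitMQ Java client and manual acknowledgements (host and queue name are placeholders): the broker removes a message only after basicAck, i.e. only after it has been processed and stored persistently.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DefaultConsumer;
import com.rabbitmq.client.Envelope;

public class CentralConsumer {

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("central-broker");                 // placeholder host
        Connection connection = factory.newConnection();
        final Channel channel = connection.createChannel();

        // Durable queue so messages survive a broker restart.
        String queue = "collected-data";
        channel.queueDeclare(queue, true, false, false, null);

        // autoAck = false: the broker keeps the message until basicAck is called,
        // so it is deleted only after processing and persistent storage succeed.
        channel.basicConsume(queue, false, new DefaultConsumer(channel) {
            @Override
            public void handleDelivery(String consumerTag, Envelope envelope,
                                       AMQP.BasicProperties properties, byte[] body)
                    throws IOException {
                processAndStore(new String(body, StandardCharsets.UTF_8));
                channel.basicAck(envelope.getDeliveryTag(), false);
            }
        });
    }

    private static void processAndStore(String message) {
        // application-specific processing + persistent storage
    }
}
```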
If you have further questions please email the mailing list rabbitmq-discuss.