How can I check in a RemoteFilter whether the current node is primary or backup? - ignite

I have two nodes in PARTITIONED mode and I use a Continuous Query. When I put a value into the cache I see the RemoteFilter run twice (on the primary node and on the backup node). How can I check in the filter whether the current node is primary or backup?

Well, there are several methods on the Affinity API that help you detect whether a node is primary or backup for a given key. However, if the topology changes while you are checking the Affinity API, you may end up on a primary node that has just become a backup, or vice versa.
There is a way to check this deterministically, described in the IGNITE-3878 ticket. It should be available in the next release.
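As a rough sketch of the Affinity API approach (subject to the topology-change caveat above), a remote filter can have the local Ignite instance injected and ask whether this node is currently primary for the event's key. The cache name "mycache" and the key/value types below are placeholders, not taken from the question:

```java
import javax.cache.event.CacheEntryEvent;
import org.apache.ignite.Ignite;
import org.apache.ignite.cache.CacheEntryEventSerializableFilter;
import org.apache.ignite.resources.IgniteInstanceResource;

public class PrimaryOnlyFilter implements CacheEntryEventSerializableFilter<Integer, String> {
    // Ignite injects the local node's instance into the filter after it is deployed.
    @IgniteInstanceResource
    private Ignite ignite;

    @Override
    public boolean evaluate(CacheEntryEvent<? extends Integer, ? extends String> evt) {
        // Pass the event through only if the local node is currently primary for this key.
        return ignite.affinity("mycache")
                .isPrimary(ignite.cluster().localNode(), evt.getKey());
    }
}
```

Keep in mind this check is best-effort: a rebalance between the update and the check can still flip the node's role.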

Related

NIFI for heterogeneous clusters

I am new to NIFI and trying to understand its architecture.
From what I understood, the primary node in the cluster is selected internally by the system and the user has no control over it. Also, we can configure some processors to run only on the primary node (isolated processors).
My question is: if my cluster is heterogeneous and I want to run the isolated (CPU-intensive) processors on the most powerful node, is it possible to configure that node as the primary?
Currently the primary node is automatically chosen by the NiFi cluster and you cannot choose which node it is. If the primary node goes down, a new node will be elected as primary.
Generally the concept of a primary node is used to trigger source processors that you only want to execute once. For example, when using ListSFTP you would likely want to run it on the primary node only; otherwise all the nodes in your cluster are going to list the same files.

Apache Ignite Replicated Cache race conditions?

I'm quite new to Apache Ignite so please be gentle. My question is simple.
Suppose I have a replicated cache using Apache Ignite and I write key 123 to this cache. My cluster has 10 nodes.
First question is:
Does a replicated cache mean that before the "put" call returns, key 123 must be written to all 10 nodes? Or does the call return immediately and the replication happens behind the scenes?
Second question is:
Let's say key 123 is written on Node 1 and is now being replicated to all other nodes. However, a few microseconds later Node 2 tries to write key 123 with a different value. Do I now have a race condition? Or does Ignite somehow handle this situation in such a way that Node 2's attempt to write key 123 won't happen until Node 1's "put" has replicated across all nodes?
For some context, what I'm trying to build is a de-duplication system across a cluster of API machines. I was hoping that I would be able to create a hash of my API request (with only values that make the request unique) and write it to the Ignite Cache. The API request would only proceed if the cache does not already contain the unique hash (possibly created by a different API instance). Of course the cache would have an eviction policy to evict these cache keys after a few seconds because they won't be needed anymore.
A REPLICATED cache is the same as a PARTITIONED cache with an infinite number of backups and some optimizations. So it has primary partitions that are distributed across nodes according to the affinity function.
Now when you perform an update, the request goes to the primary node, and the primary node, in turn, updates all backups. The CacheConfiguration.setWriteSynchronizationMode() property controls the way in which entries are updated. By default it is PRIMARY_SYNC, which means that the thread which calls put() will wait only for the primary partition update, and backups will be updated asynchronously. If you set it to FULL_SYNC, the thread will be released only when all backups have been updated.
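For reference, a minimal configuration sketch of a replicated cache using FULL_SYNC; the cache name "myReplicatedCache" is just a placeholder:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class ReplicatedFullSyncExample {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            CacheConfiguration<String, String> cfg = new CacheConfiguration<>("myReplicatedCache");
            cfg.setCacheMode(CacheMode.REPLICATED);
            // put() will return only after the primary and all backups are updated.
            cfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);

            ignite.getOrCreateCache(cfg).put("123", "value");
        }
    }
}
```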
Answering your second question: there will not be a race condition, because all updates for a given key go through its primary node.
Further to your clarification: if a backup node hasn't been updated yet, the get() request will go to the primary node, so in PRIMARY_SYNC mode you'll never get null if the primary partition has a value.
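For the de-duplication use case described in the question, one hedged sketch (the cache name and timeout below are made up for illustration) is to rely on putIfAbsent, which is resolved atomically on the key's primary node, together with a created-entry expiry policy so the hashes disappear after a few seconds:

```java
import java.util.concurrent.TimeUnit;
import javax.cache.expiry.CreatedExpiryPolicy;
import javax.cache.expiry.Duration;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;

public class RequestDeduplicator {
    private final IgniteCache<String, Boolean> seen;

    public RequestDeduplicator(Ignite ignite) {
        // Entries expire a few seconds after creation, so the cache stays small.
        seen = ignite.<String, Boolean>getOrCreateCache("requestHashes")
                .withExpiryPolicy(new CreatedExpiryPolicy(new Duration(TimeUnit.SECONDS, 5)));
    }

    /** Returns true only for the first caller that registers this request hash. */
    public boolean firstSeen(String requestHash) {
        // putIfAbsent is atomic cluster-wide: only one node wins for a given key.
        return seen.putIfAbsent(requestHash, Boolean.TRUE);
    }
}
```

Whether the cache is PARTITIONED or REPLICATED, the atomicity of putIfAbsent is what prevents two API instances from both proceeding.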

Explicit setting of write synchronization mode FULL_SYNC needed for replicated caches?

I understand from the docs that replicated caches are implemented using partitioned caches, where every key has a primary copy and is also backed up on all other nodes in the cluster, and that when data is queried, lookups may be served from both the primary and backup copies on the node serving the query.
But I see that the default cache write synchronization mode is PRIMARY_SYNC, where the client will not wait for backups to be updated. Does that mean I have to explicitly set it to FULL_SYNC for replicated caches, since responses rely on lookups of both primary and backup copies?
The first option is to use 'FULL_SYNC' mode.
In that case, the client request will wait for the write to complete on all participating nodes (primaries and backups).
The second option that can be used here is 'PRIMARY_SYNC' combined with setting the 'CacheConfiguration#readFromBackup' flag to false (it is true by default).
Ignite will then send read requests to the primary node and get the value from there.
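A rough sketch of the second option; the cache name "repl" is a placeholder:

```java
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class PrimarySyncNoBackupReads {
    /** PRIMARY_SYNC writes, with reads always served from the primary copy. */
    public static CacheConfiguration<Integer, String> configure() {
        CacheConfiguration<Integer, String> cfg = new CacheConfiguration<>("repl");
        cfg.setCacheMode(CacheMode.REPLICATED);
        cfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.PRIMARY_SYNC); // the default
        cfg.setReadFromBackup(false); // never read a possibly stale backup copy
        return cfg;
    }
}
```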
Please see https://ignite.apache.org/releases/mobile/org/apache/ignite/configuration/CacheConfiguration.html
By the way, both options make sense for partitioned cache as well.

Passive Replication in Distributed Systems - Replacing the Primary Server

In a passive-replication-based distributed system, if the primary server fails, one of the backups is promoted to primary. However, suppose the original primary server recovers; how do we then switch the primary role back to it from the current backup?
My thinking is that if the failed primary server recovers, it must be reincorporated into the system as a secondary and brought up to date with the most recent state. To restore it as the primary server, it can either be promoted when the current primary (originally a backup) fails, or, if required, the current primary can be blocked for a while, the original primary promoted again, and the blocked server reintroduced as a backup.
I could not find an answer to this question elsewhere and this is what I feel. Please suggest any better alternatives.
It depends on what system you're looking at. Usually there's no immediate need to demote the promoted backup when the original primary server recovers; if there is, you'd need to synchronize the two and then promote the original primary.
Distributed synchronization (or consensus) is a hard problem. There's a lot of literature out there and I recommend that you read up. An example of a passively replicated system (with Leaders/Followers/Candidates) is Raft, which you could start with. A good online visualization can be found here, and the paper is here.
ZAB and Paxos are worth a read as well!

ElastiCache URL I can hit that always uses primary node

This morning I ran into an issue where the primary node in my replication group was changed. I still need to investigate why this happened.
The upshot was lots of failures in a Rails application, as it was trying to write to what had been the primary node but had become a read replica.
Is there a URL I can use that basically says "write to the primary node of this replication group, I don't care which node that is"?
Right now I am using something similar to;
name-002.aaaaa.0001.use1.cache.amazonaws.com
My "fix" for now was changing what was name-001 to name-002 but until I know the reason why the primary node was changed I have to assume this will break again.
I think I have answered my own question.
In the admin section for the replication group there is a Primary Endpoint which seems to do the job of delegating that work out.