Splunk: How to figure out replication factor

If this sounds silly to you I apologise in advance; I am new to Splunk and did a Udemy course, but I can't figure this out.
If I check my indexes.conf file on the cluster master I get repFactor=0
#
# By default none of the indexes are replicated.
#
repFactor = 0
But if I check https://:8089/services/cluster/config
I see a replication factor:
replication_factor 2
So I am confused about whether my data is getting replicated.
I have two indexes in the cluster.

I believe replication_factor determines how many replicas to keep amongst nodes in the cluster, and repFactor determines whether or not to replicate a particular index.
For repFactor, which is an index-specific setting:
The indexes.conf repFactor attribute
When you add a new index stanza, you must set the repFactor attribute to "auto". This causes the index's data to be replicated to other peers in the cluster.
Note: By default, repFactor is set to 0, which means that the index will not be replicated. For clustered indexes, you must set it to "auto".
The only valid values for repFactor are 0 and "auto".
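For example, an indexes.conf stanza for a replicated index might look roughly like this (the index name and paths are placeholders, not from the question):
[my_clustered_index]
homePath = $SPLUNK_DB/my_clustered_index/db
coldPath = $SPLUNK_DB/my_clustered_index/colddb
thawedPath = $SPLUNK_DB/my_clustered_index/thaweddb
repFactor = auto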
For replication_factor, which is a cluster setting:
Replication factor and cluster resiliency
The cluster can tolerate a failure of (replication factor - 1) peer nodes. For example, to ensure that your system can tolerate a failure of two peers, you must configure a replication factor of 3, which means that the cluster stores three identical copies of each bucket on separate nodes. With a replication factor of 3, you can be certain that all your data will be available if no more than two peer nodes in the cluster fail. With two nodes down, you still have one complete copy of data available on the remaining peers.
By increasing the replication factor, you can tolerate more peer node failures. With a replication factor of 2, you can tolerate just one node failure; with a replication factor of 3, you can tolerate two concurrent failures; and so on.
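The replication_factor reported at /services/cluster/config is set in server.conf on the cluster master (manager) node, in the [clustering] stanza; roughly something like this, with search_factor shown only for illustration:
[clustering]
mode = master
replication_factor = 2
search_factor = 2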

The repFactor setting lets you choose which indexes are replicated; by default, none are. The replication_factor setting says how many copies of each replicated bucket the cluster keeps. You need both: repFactor = auto on the index and a replication_factor of at least 2 to actually get extra copies of your data.
The Cluster Manager should confirm that. Select Settings->Indexer Clustering to see which indexes are replicated and their state.

Related

Aerospike cluster behavior in different consistency mode?

I want to understand the behavior of Aerospike in different consistency modes.
Consider an Aerospike cluster running with 3 nodes and replication factor 3.
AP mode is simple, and it says:
Aerospike will allow reads and writes in every sub-cluster.
And the maximum number of nodes which can go down is < 3 (the replication factor).
For Aerospike strong consistency it says:
Note that the only successful writes are those made on replication-factor number of nodes. Every other write is unsuccessful.
Does this really mean that no writes are allowed if available nodes < replication factor?
And then the same document says:
All writes are committed to every replica before the system returns success to the client. In case one of the replica writes fails, the master will ensure that the write is completed to the appropriate number of replicas within the cluster (or sub-cluster in case the system has been compromised).
What does "appropriate number of replicas" mean?
So if I lose one node from my 3-node cluster with strong consistency and replication factor 3, will I not be able to write data?
For Aerospike strong consistency it says:
Note that the only successful writes are those made on replication-factor number of nodes. Every other write is unsuccessful.
Does this really mean that no writes are allowed if available nodes < replication factor?
Yes, if there are fewer than replication-factor nodes, then it is impossible to meet the user-specified replication-factor.
All writes are committed to every replica before the system returns success to the client. In case one of the replica writes fails, the master will ensure that the write is completed to the appropriate number of replicas within the cluster (or sub-cluster in case the system has been compromised).
What does "appropriate number of replicas" mean?
It means replication-factor nodes must receive the write. When a node fails, a new node can be promoted to replica status until either the node returns or an operator registers a new roster (cluster membership list).
So if I lose one node from my 3-node cluster with strong consistency and replication factor 3, will I not be able to write data?
Yes, so having all nodes as replicas wouldn't be a very useful configuration. Replication-factor 3 allows up to 2 nodes to be down, but only if the remaining nodes are able to satisfy the replication-factor. So for replication-factor 3 you would probably want to run with a minimum of 5 nodes.
You are correct, with 3 nodes and RF 3, losing one node means the cluster will not be able to successfully take write transactions since it wouldn't be able to write the required number of copies (3 in this case).
Appropriate number of replicas means a number of replicas that would match the replication factor configured.

What do we mean by hash slot in Redis Cluster?

I have read redis-cluster documents but couldn't get the gist of it. Can someone help me understand it from the basics?
Redis Cluster does not use consistent hashing, but a different form of sharding where every key is conceptually part of what we call an hash slot.
Hash slots are defined by Redis so the data can be mapped to different nodes in the Redis cluster. The number of slots (16384) can be divided and distributed to different nodes.
For example, in a 3-node cluster one node can hold slots 0 to 5460, the next 5461 to 10922, and the third 10923 to 16383. The input key (or a part of it) is hashed, i.e. run against a hash function, to determine a slot number and hence the node to add the key to.
Think of it as logical shards. Redis has 16384 logical shards, and these logical shards are mapped to the available physical machines in the cluster.
Mapping may look something like:
0-1000 : Machine 1
1001-2000 : Machine 2
2001-3000 : Machine 3
...
...
When Redis gets a key, it does the following:
Calculate hash(key) % 16384 -> this finds the logical shard where the given key belongs; let's say it comes to 1500.
Look at the logical-shard-to-physical-machine mapping to identify the physical machine. From the above mapping, logical shard 1500 is served by Machine 2, so route the request to physical machine #2.
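A rough Python sketch of that flow (Redis Cluster's actual hash is CRC16, which binascii.crc_hqx matches; the slot-to-node table is an illustrative 3-node split, not anything Redis mandates):
import binascii

NUM_SLOTS = 16384                      # fixed slot count in Redis Cluster

def hash_slot(key: bytes) -> int:
    # Hash-tag rule: if the key has a non-empty {...} section, only that
    # part is hashed, so related keys can be forced into the same slot.
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]
    # binascii.crc_hqx implements the CRC16 variant Redis Cluster uses.
    return binascii.crc_hqx(key, 0) % NUM_SLOTS

# Illustrative slot-to-node table for a 3-node cluster.
RANGES = [(0, 5460, "node-1"), (5461, 10922, "node-2"), (10923, 16383, "node-3")]

def node_for(key: bytes) -> str:
    slot = hash_slot(key)
    return next(name for lo, hi, name in RANGES if lo <= slot <= hi)

print(node_for(b"foo"))                                               # routed by its slot
print(hash_slot(b"{user:1}:cart") == hash_slot(b"{user:1}:orders"))   # True, shared hash tag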
You can take "slot" in its literal meaning, just like slots in the real world.
Every key belongs to a certain slot according to some rules, and each slot belongs to a certain Redis node according to the cluster configuration.

Affinity Key in Aerospike

In the Aerospike documentation, it is mentioned that Aerospike has 4096 logical partitions and each key is hashed and eventually mapped to one of the 4096 partitions, which determines on which node the data for that key is stored.
However if we have two keys "A" and "AB" and we want to store them in the same node, is there a way?
In Redis it can be achieved by making the keys as "A" and "{A}B" that will make sure that the key "{A}B" will go to a node where "A" is hashed and stored.
In Apache Ignite, same can be done using "AffinityKey".
Does a similar idea exist in Aerospike?
Thanks
Aerospike was designed as a distributed database. Redis was designed to run on a single node, and lacks concepts such as data distribution, clustering, replication, failover, at least natively. I'm aware that you can use various application-side shenanigans to make it into an ad-hoc cluster.
Don't worry about the implementation details of Aerospike's data distribution. Those happen automatically between the client and cluster, and don't require you to do anything on the application side. Instead, think about your access patterns.
First, your Aerospike cluster will make sure the data is evenly distributed. Because work is directly proportional to data, you should make sure the nodes are homogeneous. You can then expect multi-node operations to wrap up in roughly the same amount of time on each node.
You can create a secondary index on the fields that you'll be querying often to enhance the speed of the query. Release 3.12 adds predicate filtering, allowing you to create more complex query predicates on top of the initial secondary index based filter (also see the Java client's PredExp class).
If you don't want to use secondary indexes (there are several valid reasons), you can create your own lookup using external records. In a set called country-school you can have a record for each country (keys such as 'india', 'luxembourg') whose value is a list containing the IDs of the schools in that country. You can get the list with a single get (or a batch-get if it's several records, such as india-1, india-2, ... , india-9999), then use the results to compose a batch-get operation for the schools. Batch reads return results in the order you asked, so you can get a large batch, check whether the last element is null, and if not get another batch. A rough client-side sketch follows the example records below.
('ns1', 'country-school', 'us-california') => [ 1, 2, 3, 5, 8, 11, .. ]
Similarly, you can create permutations such as country-state-city, (example, us-california-oakland) with smaller lists. This costs some extra space, but gives you faster (key-value based) retrieval without spending memory on secondary indexes.
('ns1', 'country-school', 'us-california-oakland') => [ 1, 5, 42, .. ]
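A sketch of that lookup pattern with the Aerospike Python client (the connection details, the 'schools' bin, and the 'school' set are assumptions for illustration, not from the answer):
import aerospike

# Placeholder connection details.
client = aerospike.client({'hosts': [('127.0.0.1', 3000)]}).connect()

# 1. Single get on the lookup record; its 'schools' bin holds the list of school IDs.
lookup_key = ('ns1', 'country-school', 'us-california')
_, _, bins = client.get(lookup_key)
school_ids = bins['schools']           # e.g. [1, 2, 3, 5, 8, 11, ...]

# 2. Compose a batch read for the school records themselves.
school_keys = [('ns1', 'school', sid) for sid in school_ids]
for _, meta, school_bins in client.get_many(school_keys):
    if meta is not None:               # None metadata means the record was not found
        print(school_bins)

client.close()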

Datastax consistency

We've installed DataStax on five nodes with search enabled on all five and a replication factor of 3. After adding 590 rows to a table, a select from node 1 retrieves 590 rows, but when selecting from the other nodes the count varies from 570 to 585 rows.
I tried using CONSISTENCY QUORUM in cqlsh, but nothing changed. And solr_query is not supported with CONSISTENCY QUORUM.
Is there a way to ensure all data written to Cassandra is retrieved exactly as it was written?
As LHWizard mentioned, if you use Consistency levels such that (nodes_written + nodes_read) > RF, you will ensure immediate consistency.
In your case, you can try using a CONSISTENCY ALL on your read so that all nodes are checked before returning (this will be immediately consistent even with write CL of ONE). This should actually trigger a read repair on the inconsistent nodes and the missing data will be streamed to those nodes.
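As a rough sketch with the DataStax Python driver (the contact point, keyspace, and table names are placeholders):
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(['10.0.0.1'])
session = cluster.connect('my_keyspace')

# Reading at ALL consults every replica, so with RF=3 it will also trigger
# a read repair on any replica that is missing rows.
stmt = SimpleStatement("SELECT count(*) FROM my_table",
                       consistency_level=ConsistencyLevel.ALL)
print(session.execute(stmt).one()[0])

cluster.shutdown()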
You're right that solr queries can only be read at CL ONE. If you need higher consistency requirements, you will need to raise the CL for the writes to achieve what you need.

Cassandra rebalance

I'm using Apache Cassandra 2.1.1 and when using nodetool status the Load for one of my nodes is about half the size of the other two while the Owns is almost equal on all the nodes. I am somewhat new to Cassandra and don't know if I should be worried about this or not. I have tried using repair and cleanup after restarting all the nodes, but it still appears unbalanced. I am using GossipingPropertyFileSnitch with each node configured dc=DC1 and rack=RAC1 specified in cassandra-rackdc.properties. I am also using Murmur3Partitioner with NetworkTopologyStrategy where my keyspace is defined as
CREATE KEYSPACE awl WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '2'} AND durable_writes = true;
I believe the problem to be with the awl keyspace since the size of the data/awl folder is the same size as reported by nodetool status. My output for nodetool status is below. Any help would be much appreciated.
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.1.1.152 3.56 GB 256 68.4% d42945cc-59eb-41de-9872-1fa252762797 RAC1
UN 10.1.1.153 6.8 GB 256 67.2% 065c471d-5025-4bf1-854d-52d579f2a6d3 RAC1
UN 10.1.1.154 6.31 GB 256 64.4% 46f05522-29cc-491c-ab65-334b205fc415 RAC1
I would suspect this is due to the distribution of the key values that are being inserted. They are probably not well distributed across the possible key values, so many of them are hashing to one node. Since you are using replication factor 2, the second replica is the next node in the ring, resulting in two nodes with more data than the third node.
You didn't show your table schema, so I don't know what you are using for the partition and clustering keys. You want to use key values that have a high cardinality and good distribution to avoid hot spots where a lot of inserts are hashing to one node. With a better distribution you will get better performance and more even space usage across the nodes.
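For example, a hypothetical schema with a high-cardinality partition key and a clustering key (table and column names invented for illustration):
CREATE TABLE awl.readings (
    sensor_id uuid,
    reading_time timestamp,
    value double,
    PRIMARY KEY (sensor_id, reading_time)
);
Here sensor_id spreads rows across many partitions (and therefore nodes), while reading_time orders rows within each partition.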