Does adding more replicas to AWS ElastiCache cost more? - redis

We are using a non-cluster-mode ElastiCache cache.m3.medium, which supports 0-5 replicas. I am trying to understand whether it costs more if we add more replicas, e.g. the cost of 1 primary node and 1 replica vs. 1 primary node and 2 replicas.

Each replica is its own node, and each node incurs its own instance-hour cost.
1 primary + 2 replicas = 3 instances
3 instances * $/instance-hour is the hourly cost.
https://aws.amazon.com/elasticache/pricing/
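As a rough illustration of that arithmetic, here is a minimal sketch in Python; HOURLY_RATE is a placeholder, not an actual quote, so substitute the real cache.m3.medium on-demand rate from the pricing page above:

    # HOURLY_RATE is hypothetical; look up the real cache.m3.medium rate.
    HOURLY_RATE = 0.090          # $/instance-hour (placeholder)
    HOURS_PER_MONTH = 730

    def monthly_cost(replicas, rate=HOURLY_RATE):
        nodes = 1 + replicas     # 1 primary + N replicas, each billed per hour
        return nodes * rate * HOURS_PER_MONTH

    print(monthly_cost(1))       # 1 primary + 1 replica  -> 2 billed nodes
    print(monthly_cost(2))       # 1 primary + 2 replicas -> 3 billed nodes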

Related

Redis in Multi Datacenter

We have many datacenters, but datacenter1 is the main one.
The master in datacenter1 is monitored by Sentinel, so if the master goes down one of the replicas becomes master, and all data is synced continuously.
We want one Redis replica in each datacenter that replicates all data from datacenter1 but without the ability to become master (the replicas always get data from datacenter1, and only the replica in datacenter1 may be promoted to master; the other replicas must not be promotable).
Is there a Redis config for this, or any idea?
Redis config [1] has a replica-priority parameter which should serve your purpose.
The replica priority is an integer number published by Redis in the INFO
output. It is used by Redis Sentinel in order to select a replica to promote
into a master if the master is no longer working correctly.
A replica with a low priority number is considered better for promotion, so
for instance if there are three replicas with priority 10, 100, 25 Sentinel
will pick the one with priority 10, that is the lowest.
However a special priority of 0 marks the replica as not able to perform the
role of master, so a replica with priority of 0 will never be selected by
Redis Sentinel for promotion.
By default the priority is 100.
The idea is to set a low replica-priority value on the replica(s) in datacenter1 and replica-priority 0 on the replicas in the other datacenters, so that Sentinel prefers a datacenter1 replica and never promotes the others.
[1] redis.conf file of Redis version 6.2.6: https://github.com/redis/redis/blob/6.2.6/redis.conf
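A minimal sketch of how this could be applied at runtime with redis-py (the hostnames are hypothetical; the same values can also be persisted as replica-priority lines in each node's redis.conf):

    import redis

    # Hypothetical replica endpoints; adjust to your topology.
    DC1_REPLICAS   = ["dc1-replica-1"]                    # may be promoted
    OTHER_REPLICAS = ["dc2-replica-1", "dc3-replica-1"]   # must never be promoted

    for host in DC1_REPLICAS:
        redis.Redis(host=host, port=6379).config_set("replica-priority", 100)

    for host in OTHER_REPLICAS:
        # 0 means Sentinel will never select this replica for promotion.
        redis.Redis(host=host, port=6379).config_set("replica-priority", 0)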

Is it possible for the total count in a cluster mode enabled Redis cluster to be fewer than the number of uploaded entries?

I am using a cluster-mode-enabled Redis setup with 1 primary node and 1 read replica per shard, and I am attempting to store ~13 million unique entries, but I am noticing that the "count" in each of the 5 primary nodes of my cluster is only ~1.6 million entries (~8 million total). However, when hitting the cache with each of the 13 million entries, there are definitely not 5 million cache misses; in fact, there are only ~5 cache misses per 3 million entries. There are no TTL issues here, as the TTL for all entries is set to months from now.
Is there a good explanation for the ~5 million seemingly missing entries?
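One way to sanity-check the per-shard counts is to sum DBSIZE across every primary; a minimal sketch with redis-py (the endpoints are hypothetical):

    import redis

    # Hypothetical primary endpoints, one per shard; adjust to your cluster.
    PRIMARIES = [("shard%d-primary" % i, 6379) for i in range(1, 6)]

    total = 0
    for host, port in PRIMARIES:
        count = redis.Redis(host=host, port=port).dbsize()   # keys on this shard
        print(host, count)
        total += count

    print("total keys across primaries:", total)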

Redis is not expiring keys

I am running a cron job which sets approximately 1.5 million keys in my Redis cluster in 1 minute. The keys I am setting have a length of approximately 68 bytes. My Redis cluster configuration:
Number of masters: 2.
Replication factor: 1.
Eviction policy: volatile-lru.
Maxmemory of one node: 8 GB.
The average TTL of the keys is about 300 seconds. As the keys are not expiring, the cluster is getting full after some time. How can we resolve this?
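For reference, volatile-lru only evicts keys that actually have a TTL set, so it is worth verifying that the cron writes include one; a minimal sketch with redis-py (the endpoint, key name, and payload are illustrative):

    import redis

    r = redis.Redis(host="redis-master", port=6379)   # hypothetical endpoint

    # Write a key with an explicit 300-second expiry (SET key value EX 300).
    r.set("job:12345", "payload-of-about-68-bytes", ex=300)

    # TTL should be a positive number of seconds; -1 means no expiry was set,
    # so the key never expires and is not a candidate for volatile-lru eviction.
    print(r.ttl("job:12345"))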

Impala concurrent query delay

My cluster configuration is as follows:
3-node cluster
128 GB RAM per cluster node.
Processor: 16-core, hyper-threaded, per cluster node.
All 3 nodes run a Kudu master, a Kudu tablet server (T-Server), and an Impala daemon; one of the nodes also runs the Impala Catalog Server and StateStore.
My issues are as follows:
1) I'm having a hard time figuring out dynamic resource pooling in Impala while running concurrent queries. I've tried setting mem_limit, still no luck. I've also tried static service pools, but with those I couldn't achieve the required concurrency either. Even with admission control, the required concurrency was not achieved.
I) The time taken for 1 query: 500-800ms.
II) But if 10 concurrent queries are submitted, the time taken grows to 3-6 s per query.
III) And with more than 20 concurrent queries, the time taken exceeds 10 s per query.
2) One of my cluster nodes is not taking any load after a query is submitted; I verified this from the query summary. I've tried setting NUM_NODES to 0 and 1 on the node which is not taking the load; still, the summary shows that the node takes no load.
What is the table size? How many rows are in the tables? Are the tables partitioned? It would be good to compare your configuration with the Impala benchmarks.
As mentioned above, Impala is designed to run on massively parallel processing (MPP) infrastructure. When we had a setup of 10 nodes with 80 cores (160 virtual cores) and 12 TB of SAN storage, we could get a computation time of 60 seconds with 5 concurrent users.

Aerospike - Read (with consistency level ALL) when one replica is down

TL;DR
If a replica node goes down and the new partition map is not yet available, will a read with consistency level = ALL fail?
Example:
Given this Aerospike cluster setup:
- 3 physical nodes: A, B, C
- Replicas = 2
- Read consistency level = ALL (reads consult both nodes holding the data)
And this sequence of events:
- A piece of data "DAT" is stored into two nodes, A and B
- Node B goes down.
- Immediately after B goes down, a read request ("request 1") is performed with consistency ALL.
- After ~1 second, a new partition map is generated. The cluster is now aware that B is gone.
- "DAT" now becomes replicated at node C (to preserve replicas=2).
- Another read request ("request 2") is performed with consistency ALL.
It is reasonable to say "request 2" will succeed.
Will "request 1" succeed? Will it:
a) Succeed because two reads were attempted, even if one node was down?
b) Fail because one node was down, meaning only 1 copy of "DAT" was available?
Request 1 and request 2 will both succeed. The behavior of the consistency level policies is described here: https://discuss.aerospike.com/t/understanding-consistency-level-overrides/711.
The gist for read/write consistency levels is that they only apply when there are multiple versions of a given partition within the cluster. If there is only one version of a given partition in the cluster then a read/write will only go to a single node regardless of the consistency level.
So given an Aerospike cluster of A,B,C where A is master and B is
replica for partition 1.
Assume B fails and C is now replica for partition 1. Partition 1
receives a write and the partition key is changed.
Now B is restarted and returns to the cluster. Partition 1 on B will
now be different from A and C.
A read arrives with consistency all to node A for a key on Partition
1 and there are now 2 versions of that partition in the cluster. We
will read the record from nodes A and B and return the latest
version (not fail the read).
Time lapse
Migrations are now complete, for partition 1, A is master, B is
replica, and C no longer has the partition.
A read arrives with consistency all to node A. Since there is only
one version of Partition 1, node A responds to the client without
consulting node B.
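For completeness, a minimal sketch of issuing a read with consistency level ALL using the Aerospike Python client (older client versions that expose the consistency_level read policy; the seed host, namespace, set, and key are hypothetical):

    import aerospike

    config = {"hosts": [("nodeA", 3000)]}          # hypothetical seed node
    client = aerospike.client(config).connect()

    # Consult all replicas of the record's partition whenever more than one
    # version of that partition exists in the cluster.
    read_policy = {"consistency_level": aerospike.POLICY_CONSISTENCY_ALL}

    key = ("test", "demo", "DAT")                  # namespace, set, key
    _, meta, bins = client.get(key, policy=read_policy)
    print(meta, bins)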