Redis cluster recovery downtime when master goes down [closed]

When a master goes down in a Redis cluster, Redis waits for the node timeout before promoting a slave to master, and the promotion itself may take additional time. Between the master failing and the slave being promoted, reads and writes, especially writes, will fail. How do I ensure zero downtime?

I think it's a common problem with most databases. Say you have a MongoDB replica set and the primary goes down: it takes a while for a secondary to be promoted, and you lose writes in the meantime. The same applies to a MongoDB shard, or to MySQL.
Even if Redis could provide instant failover (which it cannot), your writes could not be guaranteed unless you used AOF persistence with a disk write on every operation (appendfsync always), but that would be terribly slow and defeat the whole purpose of Redis.
One way to get closer to better guarantees for writes is to push the data to a queue such as Kafka and write to Redis (or any other datastore) asynchronously, as sketched below. But then you have introduced one more component into the stack, and you have to worry about its failover as well.
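As a rough illustration of that pattern, here is a minimal sketch assuming the kafka-python and redis client libraries, a local Kafka broker and Redis instance, and a made-up topic name "writes"; a real setup would add batching, retries, and error handling:

```python
# Sketch: buffer writes in Kafka, apply them to Redis asynchronously.
import json

from kafka import KafkaProducer, KafkaConsumer
import redis

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def write(key, value):
    # Returns once Kafka has the record (per its acks setting), so the
    # write survives a Redis failover.
    producer.send("writes", {"key": key, "value": value})

def drain_to_redis():
    # A separate consumer process applies the buffered writes to Redis;
    # if Redis is mid-failover, the messages simply wait in the topic.
    r = redis.Redis(host="localhost", port=6379)
    consumer = KafkaConsumer(
        "writes",
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )
    for msg in consumer:
        r.set(msg.value["key"], msg.value["value"])
```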
So I think we should treat Redis as a cache, not as a permanent datastore.

When designing an architecture, we need to think about the tradeoffs.
Yes, whenever the Redis master goes down there is some wait time before one of the slaves is promoted to master, and some writes may be lost in the meantime.
That's the nature of Redis.
If you have a cluster with one master and three slaves and you write to it, the write lands on the master and is then synced to the slaves, but Redis does not wait for acknowledgements from the slaves before acknowledging the write to the client. If Redis did that, it could not be this fast. (You can opt into stronger per-write guarantees with the WAIT command, as sketched below.)
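The WAIT command blocks the client until a given number of replicas have acknowledged the preceding writes; it improves durability at the cost of latency, though it still does not make failover lossless. A minimal sketch with the redis-py client against a hypothetical local master:

```python
# Sketch: trade write latency for replication safety with WAIT.
import redis

r = redis.Redis(host="localhost", port=6379)

r.set("order:42", "pending")
# Block until at least 1 replica has acknowledged the write, or 100 ms pass.
# WAIT returns the number of replicas that acknowledged; fewer than asked
# means the write may still be lost if the master fails right now.
acked = r.execute_command("WAIT", 1, 100)
if acked < 1:
    # Application-level decision: retry, log, or fall back to the source DB.
    print("write not yet replicated; durability not guaranteed")
```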
In the end, Redis is useful as cache storage, not disk storage. Whenever you hit a cache miss, you can look the data up in permanent storage such as a database, as in the cache-aside sketch below. Don't use Redis as permanent storage; it wasn't built that way.
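That pattern is usually called cache-aside. A minimal sketch, where fetch_from_db is a hypothetical stand-in for your real database query:

```python
# Sketch: cache-aside reads with Redis in front of a permanent store.
import redis

r = redis.Redis(host="localhost", port=6379)

def fetch_from_db(key):
    # Placeholder for the authoritative lookup (SQL, Mongo, ...).
    return "value-from-db"

def get(key, ttl_seconds=300):
    cached = r.get(key)
    if cached is not None:             # cache hit
        return cached
    value = fetch_from_db(key)         # cache miss: go to permanent storage
    r.set(key, value, ex=ttl_seconds)  # repopulate with an expiry
    return value
```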

Related

I/O data transfer modes and I/O address access [closed]

I've learned that there are three ways to perform an I/O transfer:
1- Programmed I/O (polling)
2- Interrupt-driven I/O
3- Direct Memory Access (DMA)
Now I need to relate this to how I/O addresses are actually accessed
(isolated I/O vs. memory-mapped I/O):
DMA
Memory mapping does not affect the direct memory access (DMA) for a device, because, by definition, DMA is a memory-to-device communication method that bypasses the CPU.
That is all the information I have.
Now, what about interrupt-driven and programmed I/O: which addressing modes are used in those cases?
Can a microcontroller support both addressing modes (isolated and memory-mapped), or only one?
Am I understanding these topics correctly, or do I have any misconceptions?
Port mapped vs memory mapped (Communication)
This is how the IO access is performed, i.e. how the CPU communicates with the device.
With port mapped IO the CPU uses special instructions (e.g. x86's in and out) to read/write from a device in a special IO address space you can't access with load/store instructions.
With memory mapped IO the CPU performs normal memory loads and stores to communicate with a device.
The latter is usually more granular and uniform when it comes to security permissions and code generation.
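To make the contrast concrete: on Linux, user space can do memory-mapped I/O by mapping physical addresses from /dev/mem, while port-mapped I/O needs the special instructions (or kernel helpers) and cannot be expressed as a plain load/store. A hedged sketch, with a made-up register base address and offsets; real addresses come from your SoC's datasheet, and this requires root:

```python
# Sketch: memory-mapped register access from Linux user space via /dev/mem.
import mmap
import os
import struct

REG_BASE = 0xFE200000   # hypothetical, page-aligned base of device registers
PAGE = mmap.PAGESIZE

fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
regs = mmap.mmap(fd, PAGE, mmap.MAP_SHARED,
                 mmap.PROT_READ | mmap.PROT_WRITE, offset=REG_BASE)

def read_reg(offset):
    # An ordinary 32-bit memory load; no special instructions needed.
    return struct.unpack_from("<I", regs, offset)[0]

def write_reg(offset, value):
    # An ordinary 32-bit memory store.
    struct.pack_into("<I", regs, offset, value)

write_reg(0x08, 0x1)        # poke a hypothetical control register
print(hex(read_reg(0x04)))  # read a hypothetical status register
```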
Polling vs Interrupt driven (Notification)
This is how notifications from the devices are received by the CPU.
With polling the CPU will repeatedly read a status register from the device and check if a completion bit (or equivalent) is set.
With interrupt driven notifications the device will raise an interrupt without the need for the CPU to do any periodic work.
Polling hogs the CPU but has lower latency for some workloads.
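A sketch of the polling pattern, with read_status() as a hypothetical stand-in for a real status-register read (such as read_reg(0x04) from the previous sketch):

```python
# Sketch: busy-wait on a device status register until a completion bit is set.
import time

DONE_BIT = 0x1  # hypothetical "transfer complete" bit in the status register

def read_status():
    # Placeholder: would really be a memory-mapped or port-mapped read.
    return DONE_BIT

def wait_for_completion(timeout_s=1.0):
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if read_status() & DONE_BIT:   # completion bit set?
            return True
        # This loop hogs the CPU; an interrupt-driven design would let the
        # CPU do other work and have the device signal it instead.
    return False

print(wait_for_completion())
```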
DMA vs non-DMA (Transfer)
This is how the data is transferred from the device to the CPU.
With DMA the device will write directly into memory.
Without DMA the CPU will have to read the data repeatedly (either with port or memory mapped IO).
All three of these dimensions are independent of each other; you can combine them however you like (e.g. port mapped, interrupt driven, DMA).
Note, however, that the nomenclature is not consistent in the literature.
Also, different devices have different interfaces that may not need all of the three dimensions (e.g. a very simple output-only GPIO pin may have a single write-only register, so it makes no sense to talk about polling or DMA in this case).

How to save memory from unpopular/cold Redis?

We have a lot of Redis instances, consuming TBs of memory across hundreds of machines.
As our business activity goes up and down, some Redis instances are simply not used that frequently any more -- they are "unpopular" or "cold". But Redis stores everything in memory, so a lot of infrequently accessed data that could live on cheap disk is occupying expensive memory.
We are exploring ways to reclaim memory from these unpopular/cold Redis instances, so as to reduce our machine usage.
We cannot delete the data, nor can we migrate to another database. Is there some way to achieve our goal?
PS: We are thinking of a Redis-compatible product that can "mix" memory and disk, i.e. store hot data in memory but cold data on disk, while using limited resources. We know Redis Labs' "Redis on Flash" (ROF) solution, but it uses RocksDB, which is very memory-unfriendly. What we want is a very memory-restrained product. Besides, ROF is not open source :(
Thanks in advance!
ElastiCache for Redis now supports data tiering. Data tiering provides a new cost-optimal option for storing data in Redis by utilizing lower-cost local NVMe SSDs in each cluster node in addition to storing data in memory. It is ideal for workloads that regularly access up to 20 percent of their overall dataset, and for applications that can tolerate additional latency when accessing data on SSD. More details about data tiering can be found here.
Your problem might be solved with an orchestrator approach: scale down when not in use, scale up when in demand.
The implementation depends heavily on your infrastructure, but a base requirement is proper monitoring of Redis instance usage (a minimal sketch of that step follows at the end of this answer).
Based on that, if you are running on Kubernetes, you can leverage pod autoscaling.
Otherwise you can implement Consul and use HAProxy to handle the shutdown/spin-up logic. A starting point for that strategy is this article.
Of course Reiner's idea of using swap is a quick win, if it works as intended!
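For the monitoring step mentioned above, a minimal sketch that flags low-traffic instances, assuming a made-up host list and threshold (INFO's instantaneous_ops_per_sec field is real):

```python
# Sketch: flag Redis instances whose traffic is low enough to scale down.
import redis

HOSTS = ["redis-a:6379", "redis-b:6379"]  # hypothetical fleet
COLD_OPS_THRESHOLD = 5                    # made-up ops/sec cut-off

def find_cold_instances():
    cold = []
    for hostport in HOSTS:
        host, port = hostport.split(":")
        info = redis.Redis(host=host, port=int(port)).info("stats")
        if info["instantaneous_ops_per_sec"] < COLD_OPS_THRESHOLD:
            cold.append(hostport)  # candidate for scale-down / swap-out
    return cold

print(find_cold_instances())
```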

Redis performance on a multi-core CPU

I am looking at Redis to provide intermediate cache storage, with a lot of computation around set operations like intersection and union.
I have looked at the Redis website and found that Redis is not designed for multi-core CPUs. My question is: why is that?
Also, if so, how can we achieve 100% utilization of CPU resources with Redis on a multi-core CPU?
I have looked at the Redis website and found that Redis is not designed for multi-core CPUs. My question is: why is that?
It is a design decision.
Redis is single-threaded with epoll/kqueue and scales indefinitely in terms of I/O concurrency. -- antirez (creator of Redis)
A reason for choosing an event-driven approach is that synchronization between threads comes at a cost at both the software level (code complexity) and the hardware level (context switching). Add to this that the bottleneck of Redis is usually the network or memory, not the CPU. On the other hand, a single-threaded architecture has its own benefits (for example, the guarantee of atomicity).
Therefore event loops seem like a good design for an efficient & scalable system like Redis.
Also, if so, how can we achieve 100% utilization of CPU resources with Redis on a multi-core CPU?
The Redis approach to scaling over multiple cores is sharding, mostly together with Twemproxy; a minimal client-side sketch follows.
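For illustration, a minimal client-side sharding sketch with redis-py and made-up ports; Twemproxy or Redis Cluster do this routing more robustly:

```python
# Sketch: spread keys over several single-threaded Redis processes so that
# together they can use several cores.
import zlib

import redis

shards = [redis.Redis(host="localhost", port=p) for p in (6379, 6380, 6381)]

def shard_for(key):
    # Stable hash -> shard index; crc32 keeps the mapping deterministic
    # across processes (unlike Python's randomized hash()).
    return shards[zlib.crc32(key.encode()) % len(shards)]

shard_for("user:1001").set("user:1001", "alice")
print(shard_for("user:1001").get("user:1001"))
```

Note that multi-key operations such as SINTER only work when all the keys involved live on the same shard, which matters for the set-heavy workload in the question.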
However, if for some reason you still want to use a multi-threaded approach, take a look at Thredis, but make sure you understand the implications of what its author did (you cannot use it as a replication master, for instance).
The Redis server is single-threaded, but you can achieve 100% utilization of CPU resources by running multiple Redis nodes (masters and/or slaves).
Read operations can be scaled using a Redis master/slave configuration with a single master: one CPU core serves the master node and all the others serve slaves.
Write operations can be scaled using a Redis multi-master cluster configuration: multiple CPU cores serve master nodes and the rest serve slaves. A sketch of the read-scaling side follows.
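A sketch of read/write splitting, with hypothetical host names; writes go to the master and reads round-robin over the slaves, each of which can sit on its own core or machine:

```python
# Sketch: route writes to the master, reads across the slaves.
import itertools

import redis

master = redis.Redis(host="redis-master", port=6379)
slaves = [redis.Redis(host=h, port=6379)
          for h in ("redis-slave-1", "redis-slave-2")]
slave_cycle = itertools.cycle(slaves)

def set_value(key, value):
    master.set(key, value)          # all writes go to the master

def get_value(key):
    # Reads may be slightly stale: replication is asynchronous.
    return next(slave_cycle).get(key)
```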
Redisson is a Redis Java client that provides full support for Redis Cluster. It works with AWS ElastiCache and Azure Redis Cache, and includes master/slave discovery and topology updates.

RabbitMQ - strange synchronization behavior

I have a simple RabbitMQ cluster with two physically identical Linux nodes (CentOS, RabbitMQ 3.1.5, Erlang R15B, 2 GB RAM, 1 CPU core). Mirroring and synchronization of nodes are turned on.
I have two problems that bother me:
In a normal situation everything is fine, but after restarting one of the nodes (via stop_app and start_app on the command line) the whole cluster becomes unavailable to producers and consumers -- I can't produce or receive messages from a queue during synchronization. Is this situation normal?
During synchronization I observed very high CPU load (almost 100%) on the slave node (the one that was restarted). I also measured the synchronization speed, and it is dramatically low: synchronizing 2 million messages takes over 3 hours. That is strange, because producing that many messages takes much less time. Is this situation normal too?
I've recently been tasked with looking into RabbitMQ at work and so have been deep in the documentation.
Yes, this is the case while synchronising. This is an extract from the RabbitMQ HA documentation here:
If a queue is set to automatically synchronise, it will synchronise whenever a new slave joins - becoming unresponsive until it has done so.
If the messages are being read from and persisted to disk (either by choice or because of memory limitations), there may be overhead there too. You can see a chart in this blog entry (the last chart before the comments) which indicates that performance changes when reading from and writing to queues with that many messages. Those charts are for older versions of RabbitMQ, but I've not seen anything more recent. One mitigation for the blocking behaviour is sketched below.
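A commonly used mitigation is to set ha-sync-mode to manual, so that a restarted mirror rejoins without blocking the queue, and then trigger synchronisation explicitly (rabbitmqctl sync_queue <name>) during a quiet period. A hedged sketch that sets such a policy through the RabbitMQ management HTTP API, with made-up host, credentials, and policy name:

```python
# Sketch: set an HA policy with manual sync via the management HTTP API.
import requests

resp = requests.put(
    "http://localhost:15672/api/policies/%2F/ha-manual-sync",  # %2F = vhost "/"
    auth=("guest", "guest"),
    json={
        "pattern": ".*",  # apply to every queue
        "definition": {"ha-mode": "all", "ha-sync-mode": "manual"},
        "apply-to": "queues",
    },
)
resp.raise_for_status()
```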
Hope this helps!

What is the purpose of distributed testing in JMeter? [closed]

What is distributed testing in JMeter?
What is its actual purpose?
I have searched and read a lot about the master/slave concept in JMeter, and I know how it can be done, but what is it for?
A single machine with a normal configuration is limited in the user load it can generate for a large test. So we use distributed load testing to generate the load from multiple machines, where, as you rightly said, a master/slave configuration is used.
For more about distributed load testing and other JMeter topics, you can refer to this link:
Distributed load testing with JMeter
Distributed testing is to be used when you reach the limits of a machine in terms of CPU, memory, or network:
It can be used within one machine (many JVMs on one machine), if you reach the limits of one reasonable JVM in terms of memory and CPU.
It can be used across many machines (one or many JVMs on one or many machines); a launch sketch follows the links below.
But before using it, make sure you really need it; read these:
http://www.dzone.com/links/r/see_how_to_make_jmeter_run_thousands_of_threads_w.html
http://jmeter.apache.org/usermanual/best-practices.html
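For concreteness, a sketch of launching a distributed run from the controller once jmeter-server is already running on each load generator; host names and file names are made up (-n is non-GUI mode, -t the test plan, -R the remote generator list, -l the results file):

```python
# Sketch: kick off a JMeter distributed run from the controller machine.
import subprocess

subprocess.run(
    [
        "jmeter",
        "-n",                      # non-GUI mode (required for serious load)
        "-t", "plan.jmx",          # hypothetical test plan
        "-R", "gen1.example.com,gen2.example.com,gen3.example.com",
        "-l", "results.jtl",       # aggregated results from all generators
    ],
    check=True,
)
```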
The main reason for distributed testing in JMeter is load distribution. Say you want to generate a load of 3000 users against an application. JMeter has no limit on the number of users it can create; the limit is our hardware (CPU). Assume that because of CPU limits we can send only 1000 requests from one computer; to send 3000 requests I therefore need 3 machines. The distributed test then gives the cumulative result for all 3000 users in a single output file.
If your test uses lightweight samplers such as the HTTP or SMTP samplers, you may not need distributed testing to generate load. But if you use CPU-heavy samplers like the WebDriver sampler, where only 10 to 15 users can run on one machine, you need to go distributed if you want more users; there is no better option than distributed testing.
It is used to generate load from a different location, or to generate more load than you could from a single computer.