fairly new to Redis and working on a Azure Cache for Redis implementation.
In the Azure documentation around Redis trouble shooting it's stated that there can be long-running commands and that the redis command documentation shows the time complexity of all commands.
I couldn't find anything around the config set maxmemory-policy command.
Is my assumption correct that setting/changing the maxmemory-policy itself is not an expensive command (unlike e.g. resharding/rebalancing a cluster)?
(I know, "expensive" does not really have a proper definition here :) )
Thanks for any hints or answers!
Yes, the config set command itself is not an expensive command. It iterates the config array which contains about 200 items, to find the config, and set it. That's all.
However, after the setting, Redis might need to free memory for evicted items periodically or for each command. That's a cost.
Related
I have a Redis cluster that is still active and not sure if it is still been used. Logged in and tried command KEYS * and found around 30k+ keys.
How to identify if it is still been used if not can process to terminate?
First of all, you shouldn't use KEYS command in production, it can jam your Redis. Use SCAN instead.
If you mean you would like to know if some queries are still being made against Redis, you can use the Monitor command. Be careful though, it reduces your throughput by about 50%, so only use when necessary.
I understand Redis AOF and RDB persistence options and have read the doc (maybe not thoroughly). What I want to ask is this: is it possible to eliminate the possibility of data loss with Redis?
Setting appendfsync to always seems to be the closest solution. However, there is stil the possibility that Redis crashes just after responding to the client with "OK" and before persisting the data to disk. There would be no way for the client to know that the data is lost, which will result in inconsistency.
As far as I'm concerned, an option to make Redis respond after fsync should resolve the issue (or maybe an additional WAITFSYNC command). Is that possible?
i think for now the safest option is to add appendonly yes in your redis config.
if you are using version 1.1 or greater one.
appendfsync always is slowest among them. if you are okay with that then sure you can use it. but if you care about your DB's performance use appendfsync everysec.
The append-only file is a fully-durable strategy for Redis. every time Redis receives a command that changes the dataset (e.g. SET) it will append it to the AOF. When you restart Redis it will re-play the AOF to rebuild the state.
details
What are the Redis capacity requirements to support 50k consumers within one consumer group to consume and process the messages in parallel? Looking for testing an infrastructure for the same scenario and need to understand considerations.
Disclaimer: I worked in a company which used Redis in a somewhat large scale (probably less consumers than your case, but our consumers were very active), however I wasn't from the infrastructure team, but I was involved in some DevOps tasks.
I don't think you will find an exact number, so I'll try to share some tips and tricks to help you:
Be sure to read the entire Redis Admin page. There's a lot of useful information there. I'll highlight some of the tips from there:
Assuming you'll set up a Linux host, edit /etc/sysctl.conf and set a high net.core.somaxconn (RabbitMQ suggests 4096). Check the documentation of tcp-backlog config in redis.conf for an explanation about this.
Assuming you'll set up a Linux host, edit /etc/sysctl.conf and set vm.overcommit_memory = 1. Read below for a detailed explanation.
Assuming you'll set up a Linux host, edit /etc/sysctl.conf and set fs.file-max. This is very important for your use case. The Open File Handles / File Descriptors Limit is essentially the maximum number of file descriptors (each client represents a file descriptor) the SO can handle. Please check the Redis documentation on this. RabbitMQ documentation also present some useful information about it.
If you edit the /etc/sysctl.conf file, run sysctl -p to reload it.
"Make sure to disable Linux kernel feature transparent huge pages, it will affect greatly both memory usage and latency in a negative way. This is accomplished with the following command: echo never > /sys/kernel/mm/transparent_hugepage/enabled." Add this command also to /etc/rc.local to make it permanent over reboot.
In my experience Redis is not very resource-hungry, so I believe you won't have issues with CPU. Memory are directly related to how much data you intend to store in it.
If you set up a server with many cores, consider using more than one Redis Server. Redis is (mostly) single-threaded and will not use all your CPU resources if you use a single instance in a multicore environment.
Redis server also warns about wrong/risky configurations on startup (sorry for the old image):
Explanation on Overcommit Memory (vm.overcommit_memory)
Setting overcommit_memory to 1 says Linux to relax and perform the fork in a more optimistic allocation fashion, and this is indeed what you want for Redis [from Redis FAQ]
There are three possible settings for vm.overcommit_memory.
0 (zero): Check if enough memory is available and, if so, allow the allocation. If there isn’t enough memory, deny the request and return an error to the application.
1 (one): Permit memory allocation in excess of physical RAM plus swap, as defined by vm.overcommit_ratio. The vm.overcommit_ratio parameter is a
percentage added to the amount of RAM when deciding how much the kernel can overcommit. For instance, a vm.overcommit_ratio of 50 and 1 GB of
RAM would mean the kernel would permit up to 1.5 GB, plus swap, of memory to be allocated before a request failed.
2 (two): The kernel’s equivalent of "all bets are off", a setting of 2 tells the kernel to always return success to an application’s request for memory. This is absolutely as weird and scary as it sounds.
In my application I am implementing a server-side cache using Redis (for mySQL database). When data change in the database, I want to completely clear the cache to invalidate the old data.
However, I would like to see some statistics about how often are different keys queried in Redis, so that I can sort of manually pre-fetch frequently queried data for them to be available immediately after clearing the cache.
Is there any way how to see these statistics directly in Redis? Or what is a common solution to this problem?
You can use the object command.
OBJECT FREQ returns the logarithmic access frequency counter of
the object stored at the specified key. This subcommand is available
when maxmemory-policy is set to an LFU policy.
https://redis.io/commands/object
redis-cli --hotkeys can do the help for redis-cli version 4.x and above.
We usually use redis for caching in the Spring‘s project. My problem is that since redis is single-threaded, then our concurrent requests become serialized requests when accessing redis. then,what is the significance of using redis?
Is it only because of "It's not very frequent that CPU becomes your bottleneck with Redis, as usually Redis is either memory or network bound.
......
using pipelining Redis running on an average Linux system can deliver even 1 million requests per second......
"?
I am learning redis, Redis document FAQ
You've basically asked two questions in one question:
What is the significance of using Redis.
Well, Redis is known to be fast because it keeps the data in memory. If you ask whether being a single-threaded application is very restrictive - well, its a product, that works like this by design, maybe it could be even more performant if it was multithreaded, it depends on actual implementation under the hood after all.
In any case, it offers much more than just a "get data in memory":
- Many primitives to work with
- Configurable persistence
- Replication of data
And much more
If the question is whether the in-memory cache will be faster (you've mentioned Spring framework, so you're at Java Land) - then yes.
In fact, Spring Cache support Guava Cache (spring 5/spring boot 2 use Caffeine for the same purpose instead) - and yes it will be faster in a head-to-head comparison with Redis. But what if you have a distributed application with many instances and one instance calculated something and put it to cache, how do you get the same information from another instance without distributing the information between the instances. Well, there are tools like Hazelcast but it's out of scope for this question, the point is that when the application is beyond basic, the tasks like cache synchronization /keeping it up-to-date becomes much less obvious.
If you can deliver 1 million operations per second.
Now this question is too vague to answer:
What is the hardware that runs Redis?
What are the network configurations? (after all Redis calls are done over the network)
How often do you persist on disk (Redis has configurations for that)
Do you use replication and split the load between many Redis servers reaching an overall much faster throughput?
What commands exactly are being running under that hood?
In any case, when it comes to benchmarking you can set up your system in the option way and use the tool offered by Redis itself:
Redis Benchmarking Chapter in Redis tutorial
The tool is called redis-benchmark you can run it with various parameters and see how fast redis really is:
Here is an example (I encourage you to read the full article in the link):
$ redis-benchmark -t set,lpush -n 100000 -q
SET: 74239.05 requests per second
LPUSH: 79239.30 requests per second
This says: Connect to redis server available on localhost, run (-n) 100000 requests in a quiet mode (-q parameter) and run only tests specific for two commands: set and lpush