I am trying to compare Redis performance against MySQL on my Windows localhost. I am a student and we are learning various things at school. I have around 1,048,580 records in my local MySQL database and I am performing various REST operations against them. I have also implemented Redis to cache the values, using Spring Boot's @Cacheable with Lettuce. It all works fine, but I don't know how to measure the performance to see that Redis is performing better than MySQL. I think the difference would be easier to see in a very large-scale company setup. Can I simulate that on my local machine? Also, how do I benchmark Redis performance locally for my academic project?
I have tried sending multiple requests in a loop to try to determine performance, but I don't see much of a difference on localhost with my records. I have also tried to understand the various Redis CLI monitoring commands, but I don't see much latency there either.
Well, it depends on how you are actually testing Redis vs MySQL. Keep in mind that MySQL uses caches internally, and if you use Hibernate it adds another level of caching on top. If you make the same GET request several times, there will not be any major difference between the Redis and MySQL results.
You should compare results by performing several different operations, like inserting/deleting/getting thousands of different values, and then the same operations for identical values, and so on.
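For example, here is a minimal sketch of such a comparison, assuming Redis on localhost:6379 and a local MySQL database named testdb with a records(id, payload) table (the database name, table, columns, and credentials are placeholders for whatever your project uses; Lettuce is already on your classpath via Spring Boot, and the MySQL JDBC driver is assumed too). It warms the cache and then times a batch of distinct reads against both stores:

import io.lettuce.core.RedisClient;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.sync.RedisCommands;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class CompareReadLatency {

    public static void main(String[] args) throws Exception {
        int n = 10_000; // number of distinct keys to read

        // --- Redis via Lettuce ---
        RedisClient client = RedisClient.create("redis://localhost:6379");
        try (StatefulRedisConnection<String, String> conn = client.connect()) {
            RedisCommands<String, String> redis = conn.sync();

            // Warm the cache first so we measure reads, not misses.
            for (int i = 0; i < n; i++) {
                redis.set("record:" + i, "value-" + i);
            }

            long start = System.nanoTime();
            for (int i = 0; i < n; i++) {
                redis.get("record:" + i);
            }
            System.out.printf("Redis: %d distinct GETs in %d ms%n",
                    n, (System.nanoTime() - start) / 1_000_000);
        }
        client.shutdown();

        // --- MySQL via JDBC (placeholder URL/credentials) ---
        try (Connection db = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/testdb", "root", "password");
             PreparedStatement ps = db.prepareStatement(
                     "SELECT payload FROM records WHERE id = ?")) {

            long start = System.nanoTime();
            for (int i = 0; i < n; i++) {
                ps.setInt(1, i);
                try (ResultSet rs = ps.executeQuery()) {
                    rs.next();
                }
            }
            System.out.printf("MySQL: %d distinct SELECTs in %d ms%n",
                    n, (System.nanoTime() - start) / 1_000_000);
        }
    }
}

Repeat the run a few times and compare warm runs; as noted above, hammering the same key only exercises MySQL's own caches, so it is the spread of distinct keys that makes the comparison meaningful.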
Related
We usually use Redis for caching in our Spring projects. My problem is that since Redis is single-threaded, our concurrent requests effectively become serialized requests when they hit Redis. So what is the significance of using Redis?
Is it only because of this, from the Redis documentation FAQ: "It's not very frequent that CPU becomes your bottleneck with Redis, as usually Redis is either memory or network bound. ... using pipelining Redis running on an average Linux system can deliver even 1 million requests per second ..."?
I am still learning Redis.
You've basically asked two questions in one question:
What is the significance of using Redis.
Well, Redis is known to be fast because it keeps the data in memory. If you are asking whether being a single-threaded application is very restrictive: it is a product that works like this by design. Maybe it could be even more performant if it were multithreaded; that depends on the actual implementation under the hood, after all.
In any case, it offers much more than just a "get data in memory":
- Many primitives to work with
- Configurable persistence
- Replication of data
And much more
If the question is whether an in-memory cache will be faster (you've mentioned the Spring framework, so you're in Java land), then yes.
In fact, Spring Cache supports Guava Cache (Spring 5 / Spring Boot 2 use Caffeine for the same purpose instead), and yes, it will be faster than Redis in a head-to-head comparison. But what if you have a distributed application with many instances, and one instance calculates something and puts it into its cache? How do you get the same information from another instance without distributing it between the instances? Well, there are tools like Hazelcast, but that's out of scope for this question. The point is that once the application grows beyond the basics, tasks like cache synchronization and keeping the cache up to date become much less obvious.
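To make the local-cache vs Redis point concrete, here is a minimal Spring Boot 2 sketch (the ProductService/findById names and the 200 ms lookup are made up for illustration; it assumes spring-boot-starter-cache plus Caffeine on the classpath). The same @Cacheable method works unchanged whether the CacheManager is a local Caffeine cache or a Redis-backed one:

import com.github.benmanes.caffeine.cache.Caffeine;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.caffeine.CaffeineCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.stereotype.Service;

import java.util.concurrent.TimeUnit;

@Configuration
@EnableCaching
class CacheConfig {

    // Local, in-process cache: fastest, but every application instance has its own copy.
    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager manager = new CaffeineCacheManager("products");
        manager.setCaffeine(Caffeine.newBuilder()
                .maximumSize(10_000)
                .expireAfterWrite(10, TimeUnit.MINUTES));
        return manager;
        // With spring-boot-starter-data-redis on the classpath you could instead
        // return a RedisCacheManager so that all instances share the same cache.
    }
}

@Service
class ProductService {

    // The first call for an id hits the (slow) data source; later calls for the
    // same id are served from whichever cache the CacheManager above provides.
    @Cacheable("products")
    public String findById(long id) {
        simulateSlowLookup();
        return "product-" + id;
    }

    private void simulateSlowLookup() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

The catch described above applies to the Caffeine variant: each instance of the application keeps its own copy of the cache, so invalidation and consistency across instances are on you, whereas a Redis-backed CacheManager gives every instance the same view.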
Whether Redis can really deliver 1 million operations per second.
Now this question is too vague to answer:
What is the hardware that runs Redis?
What are the network configurations? (after all Redis calls are done over the network)
How often do you persist to disk? (Redis has configuration options for that.)
Do you use replication and split the load between many Redis servers reaching an overall much faster throughput?
What commands exactly are being run under the hood?
In any case, when it comes to benchmarking, you can set your system up the way you want to test it and use the tool offered by Redis itself:
Redis Benchmarking Chapter in Redis tutorial
The tool is called redis-benchmark; you can run it with various parameters and see how fast Redis really is.
Here is an example (I encourage you to read the full article in the link):
$ redis-benchmark -t set,lpush -n 100000 -q
SET: 74239.05 requests per second
LPUSH: 79239.30 requests per second
This says: connect to the Redis server available on localhost, run 100000 requests (the -n parameter) in quiet mode (the -q parameter), and run only the tests for two specific commands: SET and LPUSH.
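If you would rather measure pipelining from your own Spring Boot/Lettuce setup than from the CLI, here is a minimal sketch (assuming Redis on localhost:6379; the key names and the 100,000 count are arbitrary) that disables auto-flushing so that many commands share a single network round trip:

import io.lettuce.core.LettuceFutures;
import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisFuture;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.async.RedisAsyncCommands;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class PipelineDemo {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost:6379");
        try (StatefulRedisConnection<String, String> conn = client.connect()) {
            RedisAsyncCommands<String, String> async = conn.async();

            int n = 100_000;
            async.setAutoFlushCommands(false); // queue commands locally instead of flushing one by one

            long start = System.nanoTime();
            List<RedisFuture<String>> futures = new ArrayList<>(n);
            for (int i = 0; i < n; i++) {
                futures.add(async.set("pipe:key:" + i, "value-" + i));
            }
            async.flushCommands(); // send the queued commands in large batches
            LettuceFutures.awaitAll(1, TimeUnit.MINUTES,
                    futures.toArray(new RedisFuture[0]));

            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("%d pipelined SETs in %d ms%n", n, ms);
        }
        client.shutdown();
    }
}

Comparing this against a plain synchronous loop of SETs gives a feel for how much of the per-command cost on localhost is round-trip overhead rather than Redis itself.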
I had to switch an enterprise Django 1.11 site from a corporate-hosted PostgreSQL 9.4 server to an AWS RDS Aurora-PostgreSQL 10 cluster. My initial impression was that it should be a straightforward migration, as I was not using any version-specific code.
Immediately after migration, the site started breaking down horribly. Queries that used to take milliseconds suddenly jumped to 100x the time, causing timeouts all over gunicorn threads. I also kept seeing connections being dropped from both RDS and Django.
It kept appearing as if it were some setting I needed to match between the previous and the current server, but despite engaging PostgreSQL experts and AWS support, there were no simple answers (or even complex ones). I finally had to fine-tune most queries in my Django code to bring stability back to the site.
The app has several queries that traverse foreign-key relationships, so I used a number of prefetch_related calls and similar tricks to fix the slowdown. For example, a query that used to take 0.5 seconds jumped to 80 seconds after the migration, and went back to 0.5 seconds once I added prefetch_related.
Even though the site is now stable, I am posting this in the hope that some PostgreSQL and/or Django expert sees this and recognizes this as a symptom of some wrong setting. I am not in a position to share sample queries and am not asking for query optimization. The question is: what would cause a query to become 100x slower when we move from one PostgreSQL server to another, with no change in application code?
In general, Postgres-compatible Aurora has wildly different performance characteristics than vanilla Postgres, and the configuration and tuning for the two can be very different. If you wanted performance characteristics close to your self-hosted Postgres, the easiest path forward would have been to use AWS RDS for PostgreSQL rather than RDS with Aurora PostgreSQL. There are also a number of configuration details you didn't share (VPC settings, SSL, and so on) that can affect performance between RDS and a self-hosted server.
I have read that Netflix uses EVCache, which is a wrapper over memcache, and that EVCache proves better than memcache.
In general it is said that Redis serves as a better cache than memcache, so I was trying to find comparisons of Redis and EVCache.
Does Redis scale as well as EVCache or memcache? I am assuming that EVCache's scaling is tried and tested (hence it works well for Netflix).
EVCache is a functionality-adding wrapper over memcache. It is an application that Netflix devs wrote to add the functionality they need in their cache layer while using memcache as the underlying data store. You could write your own EVCache that uses Redis as the data store.
Comparing Redis to EVCache is not the correct comparison, as they operate at two different layers.
Does redis scale as well as evcache or memcache?
Redis can scale to many hundreds of thousands of requests per second.
In general, Redis is preferred over memcache because of its many built-in data structures.
Redis is single-threaded, so once CPU usage hits 80+% it is better to run another instance instead of giving it a bigger server.
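As a small illustration of what those built-in data structures buy you, here is a minimal Lettuce sketch (the key and member names are made up) that keeps a leaderboard in a sorted set; with a plain key/value cache like memcache you would have to fetch, deserialize, sort, and rewrite the whole structure yourself:

import io.lettuce.core.RedisClient;
import io.lettuce.core.ScoredValue;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.List;

public class LeaderboardDemo {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost:6379");
        try (StatefulRedisConnection<String, String> conn = client.connect()) {
            RedisCommands<String, String> redis = conn.sync();

            // ZADD: Redis keeps the set ordered by score on the server side.
            redis.zadd("leaderboard", 120.0, "alice");
            redis.zadd("leaderboard", 95.0, "bob");
            redis.zadd("leaderboard", 150.0, "carol");

            // ZREVRANGE WITHSCORES: top players, highest score first.
            List<ScoredValue<String>> top =
                    redis.zrevrangeWithScores("leaderboard", 0, 2);
            top.forEach(entry ->
                    System.out.println(entry.getValue() + " -> " + entry.getScore()));
        }
        client.shutdown();
    }
}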
I currently have several different projects that each work with their own Redis instance (consider the example where I have 3 different ASP.NET applications, each on its own server with its own Redis server).
We've been asked to virtualize and to remove unnecessary instances, so I was wondering what happens if I have only one Redis server and all 3 ASP.NET applications point to the same Redis instance.
For the application keys I think there's no problem; I can prefix my own keys with the application name, for example "fi-agents", "ga-agents", and so on... but I was wondering what happens with the auth sessions?
As far as I've read, the prefix is used internally and can't be used by the end user to separate data... is it enough to just use a different DB?
Thanks
Generally, and unless there are truly compelling reasons, you don't want to mix different applications and their data in the same database. Yes, it does lower ops costs initially, but it can quickly deteriorate into a scaling and performance nightmare. This, I believe, is true for any database.
Specifically with Redis, technically yes: you could use a key prefix or the shared/numbered database approach. I'm not sure what you meant by "auth" sessions, but you can probably apply the same approach to them. But you really shouldn't... since Redis is a single-threaded process, you can end up in a situation where one of the apps blocks the other two. Since Redis by itself is so lightweight, just spin up dedicated servers, one per app, even on the same VM if you must. You can read more background on why you don't want to opt for the shared approach here: https://redislabs.com/blog/benchmark-shared-vs-dedicated-redis-instances
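For completeness, here is a minimal sketch of the two shared-instance techniques mentioned above, written with Java/Lettuce since the Redis commands are the same from any client (the original applications are ASP.NET; the key names and DB numbers are just examples):

import io.lettuce.core.RedisClient;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.sync.RedisCommands;

public class SharedInstanceDemo {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost:6379");
        try (StatefulRedisConnection<String, String> conn = client.connect()) {
            RedisCommands<String, String> redis = conn.sync();

            // Option 1: key prefixes - every app namespaces its own keys.
            redis.set("fi-agents:session:42", "data-for-fi");
            redis.set("ga-agents:session:42", "data-for-ga");

            // Option 2: numbered databases - SELECT a different logical DB per app.
            redis.select(1);                        // e.g. DB 1 for the "fi" app
            redis.set("session:42", "data-for-fi");
            redis.select(2);                        // e.g. DB 2 for the "ga" app
            redis.set("session:42", "data-for-ga");

            // Both options still share one Redis process, so one busy app can
            // stall the others - which is why dedicated instances are preferred.
        }
        client.shutdown();
    }
}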
I have a very large set of keys (200M) with small values (<100 bytes) to store, and I'm trying to use Redis. The problem is that I have 10 Redis DBs to split the keys over, but currently I'm on a single server with those 10 Redis DBs. By a Redis DB I mean one selected with SELECT. From my calculations it looks like I'm going to blow out memory; I think I'll need over 4TB of RAM for this case! What are my options? My calculation is based on 10,000 keys with 100-byte values taking 220MB of RAM (this is from a table I found). So simply put: (2*10^8 / 10^4) * 220MB = 4.4TB.
If my calculation looks correct, what are my options? I've read on different posts that Redis VM is no longer an option. Can I use a Redis cluster? This still appears to require too many servers to be practical. I understand I could switch to another DB, but I'd like that to be the last resort option.
Firstly, using shared databases (i.e. the SELECT command) isn't a recommended practice, since all of these databases are essentially managed by the same Redis process. It is preferable to have 10 separate Redis processes (even on the same server) in order to avoid contention (more info here).
Next, there are ways to reduce the memory footprint of your database. You could, for example, perform client-side compression (see here) or consider other optimizations such as using Hashes to keep multiple values (as described here).
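As a rough illustration of the Hash-based optimization, here is a minimal Java/Lettuce sketch (the bucket count and key format are arbitrary) that packs many small key/value pairs into a smaller number of Hashes:

import io.lettuce.core.RedisClient;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.sync.RedisCommands;

public class HashBucketDemo {
    // Number of buckets; with 200M keys this yields roughly 2,000 fields per hash.
    private static final int BUCKETS = 100_000;

    private static String bucketFor(String key) {
        // Stable, non-negative bucket index derived from the key.
        int bucket = Math.floorMod(key.hashCode(), BUCKETS);
        return "bucket:" + bucket;
    }

    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost:6379");
        try (StatefulRedisConnection<String, String> conn = client.connect()) {
            RedisCommands<String, String> redis = conn.sync();

            String key = "user:1234567";
            String value = "small-payload-under-100-bytes";

            // Instead of SET user:1234567 <value>, store it as a field of a hash:
            redis.hset(bucketFor(key), key, value);

            // Reading it back uses HGET with the same bucket computation:
            String loaded = redis.hget(bucketFor(key), key);
            System.out.println(loaded);
        }
        client.shutdown();
    }
}

Note that this only saves memory while each hash keeps Redis's compact encoding, so the hash-max-ziplist-entries setting (hash-max-listpack-entries in recent versions) typically needs to be raised to match the chosen bucket size.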
That said, a Redis server is ultimately bound by the amount of RAM that the host provides. Once you've reached that limit you'll need to shard your database and use a Redis cluster. Since you're already using multiple databases this shouldn't pose a big challenge as your code should already be compatible with that to a degree. Sharding can be done in one of three approaches: client, proxy or Redis Cluster. Client-side sharding can be implemented in your code or by the Redis client that you're using (if the client library that you're using supports that). Redis Cluster (v3) is expected to be released in the very near future and already has a stable release candidate. As for proxy-based sharding, there are several open source solutions out there, including Twitter's twemproxy, Netflix's dynomite and codis. Additional information about sharding and partitioning can be found here.
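Here is a minimal sketch of the client-side approach, assuming three Redis instances on ports 6379-6381 (the addresses and the simple modulo scheme are illustrative; a real deployment would usually use consistent hashing, the client library's built-in sharding support, or Redis Cluster):

import io.lettuce.core.RedisClient;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClientSideSharding {

    private final List<StatefulRedisConnection<String, String>> shards = new ArrayList<>();

    public ClientSideSharding(List<String> uris) {
        for (String uri : uris) {
            shards.add(RedisClient.create(uri).connect());
        }
    }

    // Every reader and writer must derive the shard from the key in the same way.
    private RedisCommands<String, String> shardFor(String key) {
        int index = Math.floorMod(key.hashCode(), shards.size());
        return shards.get(index).sync();
    }

    public void set(String key, String value) {
        shardFor(key).set(key, value);
    }

    public String get(String key) {
        return shardFor(key).get(key);
    }

    public static void main(String[] args) {
        ClientSideSharding store = new ClientSideSharding(Arrays.asList(
                "redis://localhost:6379",
                "redis://localhost:6380",
                "redis://localhost:6381"));
        store.set("user:42", "small-value");
        System.out.println(store.get("user:42")); // prints: small-value
    }
}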
Disclaimer: I work at Redis Labs. Lastly, AFAIK there's only one Redis-as-a-Service provider that already provides built-in support for clustering Redis. Redis Labs' Redis Cloud is a fully-managed service that can scale seamlessly to any required capacity. Our clusters support both the '{}' hashtag standard as well as sharding by RegEx - more about this can be found here.
You can use LMDB with Dynomite to store data beyond your memory capacity. LMDB uses both disk and memory to store data, and Dynomite makes LMDB distributed.
We have done a POC with this combo and they work nicely together.
For more information, please check out our open issue here:
https://github.com/Netflix/dynomite/issues/254