Is Redis a bottleneck in SignalR + Redis when it comes to scaling out? - redis

I'm interested in SignalR + Redis solution for implementing a server application that is scalable. And my concern is that Redis cluster is not production ready yet! So my question is:
Is Redis a bottleneck in SignalR + Redis when it comes to scaling out? If it is, is there any Linux-based solution that solves the problem?

On a single redis server you can easily handle up to 10K concurrent clients using pubsub. If you are still evaluating what to use, this should be more than you need at your current stage.
Redis cluster is supposed to be production ready by the end of the year or early 2014. You can actually download it and try it already. Lots of people are using it now and reporting the odd bug. The creator of redis is focused on making the cluster work and as of now it is very mature.
By using the proxy you could have up to 1000 nodes simultaneously, each with over 10K clients on pubsub, so 10 million concurrent users. The theoretical limit of the cluster is 16384 nodes, but a maximum of 1000 is recommended right now.
Unless you are at Facebook scale, you can probably use Redis for your use case (and even at Twitter scale, given that Twitter uses Redis intensively for storing all the timelines).
I've been asked in a comment to add some references, so here are the relevant links:
On the number of concurrent connections per redis process http://redis.io/topics/clients
On how twitter is using redis http://highscalability.com/blog/2013/7/8/the-architecture-twitter-uses-to-deal-with-150m-active-users.html
On cluster size/specs http://redis.io/topics/cluster-spec

Is Redis a bottleneck in SignalR + Redis when it comes to scaling out? If it is, is there any Linux-based solution that solves the problem?
I don't think so. Check the below article on how to scale out using Redis
http://www.asp.net/signalr/overview/performance-and-scaling/scaleout-with-redis

Related

since redis is single-threaded, then our concurrent requests become serialized requests when accessing redis. What is the significance of using redis?

We usually use Redis for caching in our Spring projects. My problem is that since Redis is single-threaded, our concurrent requests become serialized requests when accessing Redis. So what is the significance of using Redis?
Is it only because of "It's not very frequent that CPU becomes your bottleneck with Redis, as usually Redis is either memory or network bound.
......
using pipelining Redis running on an average Linux system can deliver even 1 million requests per second......
"?
I am learning Redis; the quote above is from the FAQ in the Redis documentation.
You've basically asked two questions in one question:
What is the significance of using Redis.
Well, Redis is known to be fast because it keeps the data in memory. If you are asking whether being single-threaded is very restrictive: it is a product that works this way by design. Maybe it could be even more performant if it were multithreaded, but that depends on the actual implementation under the hood.
In any case, it offers much more than just a "get data in memory":
- Many primitives to work with
- Configurable persistence
- Replication of data
And much more
If the question is whether a local in-memory cache will be faster (you've mentioned the Spring framework, so you're in Java land), then yes.
In fact, Spring Cache supports Guava Cache (Spring 5 / Spring Boot 2 use Caffeine for the same purpose instead), and it will be faster than Redis in a head-to-head comparison. But what if you have a distributed application with many instances, and one instance computes something and puts it in its cache? How do you get the same information from another instance without distributing it between the instances? There are tools like Hazelcast, but that is out of scope for this question. The point is that once the application is beyond basic, tasks like cache synchronization and keeping the cache up to date become much less obvious.
Whether you can deliver 1 million operations per second.
Now this question is too vague to answer:
What is the hardware that runs Redis?
What are the network configurations? (after all Redis calls are done over the network)
How often do you persist on disk (Redis has configurations for that)
Do you use replication and split the load between many Redis servers reaching an overall much faster throughput?
What commands exactly are being run under the hood?
In any case, when it comes to benchmarking, you can set up your system in the optimal way and use the tool offered by Redis itself:
Redis Benchmarking Chapter in Redis tutorial
The tool is called redis-benchmark. You can run it with various parameters and see how fast Redis really is:
Here is an example (I encourage you to read the full article in the link):
$ redis-benchmark -t set,lpush -n 100000 -q
SET: 74239.05 requests per second
LPUSH: 79239.30 requests per second
This says: Connect to redis server available on localhost, run (-n) 100000 requests in a quiet mode (-q parameter) and run only tests specific for two commands: set and lpush

Redis: Efficient cluster of servers for large key set

I have a very large set of keys, 200M keys, with small values, <100 bytes, to store and I'm trying to use Redis. The problem is such that I have 10 Redis DB to split the keys over, but currently I'm on a single server with those 10 Redis DB. By a Redis DB I mean using SELECT. From my calculations it looks like I'm going to blow out memory. I think I'll need over 4TB of memory for this case! What are my options? First, my calculation is based on 10000 keys with 100 byte values taking 220MB of RAM (this is from a table I found). So simply put (2*10^8 / 10^4) * 220MB = 4.4TB.
If my calculation looks correct, what are my options? I've read on different posts that Redis VM is no longer an option. Can I use a Redis cluster? This still appears to require too many servers to be practical. I understand I could switch to another DB, but I'd like that to be the last resort option.
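The estimate in the question can be reproduced with a few lines of Python. This is only a sketch of the arithmetic as stated: the 220 MB per 10,000 keys figure is the poster's number from a table, not a measurement, and the function name is made up for illustration.

```python
# Figures taken from the question; the 220 MB per 10,000 keys is the
# poster's table lookup, not a measured value.
TOTAL_KEYS = 200_000_000
SAMPLE_KEYS = 10_000
SAMPLE_MB = 220


def estimate_total_mb(total_keys, sample_keys, sample_mb):
    """Scale the memory used by a sample linearly to the full key count."""
    return total_keys / sample_keys * sample_mb


total_mb = estimate_total_mb(TOTAL_KEYS, SAMPLE_KEYS, SAMPLE_MB)
print(f"{total_mb / 1_000_000:.1f} TB")  # 1 TB taken as 10^6 MB
```

Note that these inputs imply roughly 22 KB per key, which looks high for values under 100 bytes, so it is probably worth loading a sample into a test instance and checking INFO memory before sizing hardware.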
Firstly, using shared databases (i.e. the SELECT command) isn't a recommended practice, since all of these databases are managed by the same Redis process. It is preferable to have 10 separate Redis processes (even on the same server) in order to avoid contention (more info here).
Next, there are ways to reduce the memory footprint of your database. You could, for example, perform client-side compression (see here) or consider other optimizations such as using Hashes to keep multiple values (as described here).
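As a rough sketch of the Hash-based optimization, assuming the keys are (or can be mapped to) integer IDs: many small values are grouped into one Redis Hash per bucket, so that each key becomes a (hash key, field) pair. The function name, the "object" prefix and the bucket size of 1000 are illustrative assumptions, not something prescribed by the answer.

```python
def bucket_for(prefix, item_id, bucket_size=1000):
    """Map an integer ID to a (hash_key, field) pair.

    IDs sharing the same quotient land in the same Redis Hash; the
    remainder becomes the field name inside that Hash.
    """
    hash_key = f"{prefix}:{item_id // bucket_size}"
    field = str(item_id % bucket_size)
    return hash_key, field


# Instead of `SET object:1234567 value`, the client would issue
# `HSET object:1234 567 value` (and HGET with the same pair to read).
print(bucket_for("object", 1234567))  # ('object:1234', '567')
```

The bucket size is a tuning knob: the memory saving depends on Redis keeping each Hash in its compact encoding, so buckets should stay small.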
That said, a Redis server is ultimately bound by the amount of RAM that the host provides. Once you've reached that limit you'll need to shard your database and use a Redis cluster. Since you're already using multiple databases this shouldn't pose a big challenge as your code should already be compatible with that to a degree. Sharding can be done in one of three approaches: client, proxy or Redis Cluster. Client-side sharding can be implemented in your code or by the Redis client that you're using (if the client library that you're using supports that). Redis Cluster (v3) is expected to be released in the very near future and already has a stable release candidate. As for proxy-based sharding, there are several open source solutions out there, including Twitter's twemproxy, Netflix's dynomite and codis. Additional information about sharding and partitioning can be found here.
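To illustrate the client-side option, here is a minimal sketch that maps each key to one of several Redis processes by hashing the key. The shard addresses are hypothetical, and CRC32 from the standard library is used only for simplicity (Redis Cluster itself uses CRC16 over 16384 hash slots). Real client libraries may do this differently.

```python
import zlib

# Hypothetical addresses of 10 separate Redis processes on one host.
SHARDS = [f"127.0.0.1:{6379 + i}" for i in range(10)]


def shard_for(key, shard_count):
    """Return a stable shard index for a key (same key, same shard)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def address_for(key):
    """Pick the Redis process that owns this key."""
    return SHARDS[shard_for(key, len(SHARDS))]


print(address_for("user:42"))
```

A plain modulo scheme like this remaps most keys when the number of shards changes, which is one reason proxies and Redis Cluster use fixed slots or consistent hashing instead.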
Disclaimer: I work at Redis Labs. Lastly, AFAIK there's only one Redis-as-a-Service provider that already provides built-in support for clustering Redis. Redis Labs' Redis Cloud is a fully-managed service that can scale seamlessly to any required capacity. Our clusters support both the '{}' hashtag standard as well as sharding by RegEx - more about this can be found here.
You can use LMDB with Dynomite to store data beyond your memory capacity. LMDB uses both disk and memory to store data, and Dynomite makes LMDB distributed.
We have done a POC with this combo and they work nicely together.
For more information, please check out our open issue here:
https://github.com/Netflix/dynomite/issues/254

ElastiCache Redis spikes

I'm using ElastiCache Redis and storing a small amount of data (~5-10MB) in it. Everything works perfectly for a while, and then suddenly it takes much longer than usual to respond (around 2000ms instead of 100ms). Most of what I'm doing is a simple lookup of a single entry in Redis, which is then returned to the client. I have noticed this problem only in benchmarks, not in real usage.
According to Google and StackOverflow it can be related to Redis Persistence, but I found that persistence is disabled in group options of ElastiCache.
I used redis-stat to monitor Redis, and it seems there are regular spikes in system CPU usage every n minutes.
Does anyone know what could cause this problem?

Redis configuration for production

I'm developing a project with Redis. My Redis configuration is the default one from a standard setup.
I don't know how I should configure Redis. Master-slave? Cluster?
Do you have any suggestions for a production Redis configuration?
Standard approach would be to have one master and at least one slave. Depending on your I/O requirements and number of ops/sec, you can always have multiple read-only slaves. Slaves can be read from but not written to. So you'll want to design your application to take advantage of doing round-robin requests to the slaves and writes only to the single master.
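The read/write split described above could be sketched like this. It only decides which server a request should go to; the class name and the addresses are hypothetical, and actually connecting to Redis is left to whatever client library you use.

```python
import itertools


class ReplicaRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # cycle() yields the slaves in order, forever: simple round-robin.
        self._slaves = itertools.cycle(slaves)

    def for_write(self):
        """All writes go to the single master."""
        return self.master

    def for_read(self):
        """Each read goes to the next slave in the rotation."""
        return next(self._slaves)


router = ReplicaRouter("master:6379", ["slave1:6379", "slave2:6379"])
print(router.for_write())                     # master:6379
print([router.for_read() for _ in range(3)])  # slave1, slave2, slave1
```

Keep in mind that replication is asynchronous, so a read from a slave immediately after a write may not see that write yet.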
Depending on your data storage/backup requirement, you can set fsync for append-only mode to be every second. So while this means you can lose up to one second worth of data, it's really much less than that because your slaves serve as hot backups, and they will have the data within milliseconds.
You'll at least want to do a BGSAVE every hour to get a dump.rdb produced. You can then copy this file while the server is still running, and store it in some off-site backup facility.
But if you're just using Redis as a standard memcache replacement and don't care about data, then you can ignore all of this. Much of it will be changing in Redis Cluster in the 3.0 version.
It depends on what your read/write requirements are. Could you give us more information on that?
I expect about 10,000 people to use my application at the same time. I persist member login tokens in Redis, and that is important to me: if I can't write to Redis, members can't log in to the application.
Even a single Redis instance will be enough to serve 10K users (run redis-benchmark to see the available throughput). To be safe, use a master/slave configuration with automatic promotion of the slave if the master goes down.
Since you want persistence, use RDB (maybe along with AOF); see this topic on redis.io.

Redis HA without clustering

I tried to find more info about this online, but can't seem to find a fitting answer.
Our new application uses HA load balancers on top to distribute visitors to clustered AMQP and clustered MySQL, and everything works flawlessly.
Now we have decided that we need to store our sessions in Redis, and according to everyone out there, Redis seems to be a good choice.
But what I don't understand is this: since Redis doesn't support clustering in production yet, how do people achieve HA with Redis? It's all great to set up a master-slave Redis configuration, but that means I can only write to the master. What happens if the master dies? And even with Redis Sentinel promoting slaves to master, replication from master to slave can be delayed, so a slave can reply with stale data. How do people prevent that?
To keep it short, I just don't "see" it. Please enlighten me! Thank you.
Have a look at Twemproxy. It was designed to partition data among multiple Redis masters, so there's no single point of failure; currently, it's the recommended approach to partitioning Redis based on this (scroll to bottom).
Bonus Alert: Here's an interesting article on how to use redis slaves and sentinel with twemproxy, so they all play nice.
try redis-mgr: https://github.com/idning/redis-mgr
redis+twemproxy+sentinel deploy/auto-failover/monitor/migrate/rolling-upgrade
Redis 3.x has clustering functionality in the core.
http://redis.io