How can I handle redis-cluster with Redis Stack (RedisJSON & RediSearch)?

I'm currently having a problem dealing with redis-cluster.
To create the Redis cluster, I'm using the "redis/redis-stack-server:latest" Docker image.
I am doing a query test using RediSearch, but in standalone mode the number of requests per second is not higher than other DBs, so I'm trying to improve throughput using Redis Cluster.
When I first started redis-cluster, I thought I would be able to query across nodes. However, the test results showed otherwise: I can only access the data within the same node and cannot FT.SEARCH (RediSearch) across all the nodes.
So I looked through the documentation again to find a solution. The documentation for RediSearch says to use RSCoordinator:
When Running RediSearch in a clustered database, you can span the index across shards using RSCoordinator. In this case the above does not apply.
https://redis.io/commands/ft.create/
But looking at GitHub, it seems that RSCoordinator is already built in.
https://github.com/RediSearch/RediSearch/tree/638de1dbc5e641ca4f8943732c3c468c59c5a47a
Is there a way for redis-cluster to fetch data across all nodes?
Additionally, I wonder whether the reason Redis does not achieve higher requests per second than other DBs (MongoDB, PostgreSQL, etc.) on a large dataset (about 2,400,000 records) is that Redis is single-threaded. Are there other factors?
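For now, the only workaround I can think of is to fan the query out to every master and merge the results client-side. A rough sketch of what I mean (the node addresses and index name are just placeholders, not my real setup):

```python
# A rough client-side scatter-gather sketch, NOT the built-in coordinator.
# Assumption: every master shard keeps its own local RediSearch index named
# "idx"; the node addresses and index name below are placeholders.
from redis import Redis

MASTERS = [("127.0.0.1", 7000), ("127.0.0.1", 7001), ("127.0.0.1", 7002)]

def search_all_shards(query, index="idx", limit=10):
    """Run FT.SEARCH on every master node and merge the raw replies."""
    merged = []
    for host, port in MASTERS:
        r = Redis(host=host, port=port, decode_responses=True)
        # FT.SEARCH reply: [total, key1, fields1, key2, fields2, ...]
        reply = r.execute_command("FT.SEARCH", index, query, "LIMIT", 0, limit)
        docs = reply[1:]
        merged.extend(zip(docs[0::2], docs[1::2]))
    return merged

results = search_all_shards("@title:redis")
print(len(results), "documents found across all shards")
```

This fans out one FT.SEARCH per master and loses global scoring and sorting, which is exactly what the coordinator is supposed to handle.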

Related

Redis performance in localhost

I am trying to check Redis performance against MySQL on my Windows localhost. I am a student and we are learning various things in my school. I have around 1,048,580 records in my local MySQL and I am performing various REST operations. I have also implemented Redis to store the values using Spring Boot's @Cacheable and Lettuce. It all works fine, but I don't know how to measure the performance to see that Redis is performing better than MySQL. I think it would be easier to see in a very large-scale company structure; can I simulate that locally? Also, how do I benchmark Redis performance locally for my academic project?
I have tried sending multiple requests in a loop to try to determine performance, but I don't see much of a difference on localhost with my records. I have tried understanding the various redis-cli monitoring commands but don't see much latency.
Well, it depends on how you are actually testing Redis vs MySQL. You have to keep in mind that MySQL internally uses caches, and if you use Hibernate it also does a level of caching. If you make the same GET request several times, there would not be any major difference between the Redis and MySQL results.
You should compare your results by doing several different operations, like inserting/deleting/getting thousands of different values, then the same operations for identical values, etc.
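For example, a rough way to time distinct operations against Redis from Python (the connection details and key count are just placeholders; the bundled redis-benchmark tool is another option for measuring raw Redis throughput):

```python
# A minimal timing sketch, assuming a local Redis on the default port 6379.
# It times inserting and then reading back thousands of *different* keys,
# so you are not just measuring cache hits for one repeated request.
import time
from redis import Redis

r = Redis(host="localhost", port=6379)
N = 10_000  # number of distinct records; arbitrary for the test

start = time.perf_counter()
pipe = r.pipeline()
for i in range(N):
    pipe.set(f"bench:key:{i}", f"value-{i}")
pipe.execute()
write_secs = time.perf_counter() - start

start = time.perf_counter()
pipe = r.pipeline()
for i in range(N):
    pipe.get(f"bench:key:{i}")
pipe.execute()
read_secs = time.perf_counter() - start

print(f"{N} SETs in {write_secs:.3f}s, {N} GETs in {read_secs:.3f}s")
```

Running an equivalent loop against MySQL (or against your Spring Boot REST endpoints) gives you a directly comparable number.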

Implementing Cuckoo Filter on multiple nodes in Redis

I'm trying to implement a cuckoo filter in Redis. What I have so far works fine, except that it inserts all the values on a single node even when working on a cluster.
In order to implement it on multiple nodes, I'm thinking of directing different elements to different nodes using some hash function. Is there any command or function call in Redis that allows forcing elements onto a particular node using its key or a number, or even onto a particular slot?
For reference, this is the implementation of the cuckoo filter I have so far.
As an aside, is there any existing implementation of a Cuckoo Filter or Bloom Filter on distributed nodes in Redis that I can refer to?
This page explains how Redis Cluster works and how redis-cli behaves when using it in cluster mode. Other clients handle operations in cluster mode better, but the basic functionality of redis-cli should work for simple tests.
If you check the code of other data structures (for example, hash or set) that come with Redis, you'll notice that they do not have code to deal with cluster mode. This is handled by the code in cluster.c and should be orthogonal to your implementation. Are you sure you have correctly configured the cluster and redis-cli?
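If you need to direct certain elements to the same node: Redis Cluster routes every key by its hash slot, and a {...} hash tag makes only the tagged part count for the slot calculation, so keys sharing a tag land on the same slot (and node). CLUSTER KEYSLOT shows where a key would land. A small sketch with redis-py (the node address is just a placeholder):

```python
# A small routing sketch, assuming a cluster node is reachable on port 7000
# (the address is a placeholder). Hash tags ({...}) make only the tagged part
# count for slot hashing, so keys sharing a tag land on the same node.
from redis import Redis

r = Redis(host="127.0.0.1", port=7000)

# Both keys hash on "bucket1" only, so they map to the same slot (same node).
print(r.execute_command("CLUSTER", "KEYSLOT", "cuckoo:{bucket1}:a"))
print(r.execute_command("CLUSTER", "KEYSLOT", "cuckoo:{bucket1}:b"))

# A different tag usually maps to a different slot, and often a different node.
print(r.execute_command("CLUSTER", "KEYSLOT", "cuckoo:{bucket2}:a"))
```

You can't pin a key to an arbitrary node directly, but by choosing the tag you choose the slot, and the cluster maps each slot range to a node.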

Two-directional replication of two separate Solr servers

I have read about multi-core and master-slave setups in Solr, but I am looking for complete replication between two separate Solr servers (bi-directional). Where can I find a manual for doing that?
The two or more separate Solr servers may or may not have internal replication of their own.
The primary reason I expect you'd want bi-directional replication would be to support something like a cross-datacenter situation. That is, you want to isolate queries to particular places, but keep things in sync across a high-latency link.
If you don't need this, just use SolrCloud and let it handle replication. You can shard your index and get whatever update throughput you need. Any update can go to any node, and Solr will make sure it gets written to the right places.
If you are really thinking about datacenters, Solr added some brand new data center support in 6.0, which you can read about here: https://sematext.com/blog/2016/04/20/solr-6-datacenter-replication/
However, this still assumes updates go to a single data center, with the other just following along.
Apple also did a talk about their (internal) bidirectional replication system you can watch here: https://www.youtube.com/watch?v=_Erkln5WWLw
That said, the simplest thing would just be to write the updates to both places.
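For example, a rough dual-write sketch from the application side (the server URLs and core name are just placeholders):

```python
# A rough dual-write sketch, assuming two standalone Solr servers with a core
# named "mycore"; the URLs and core name are placeholders.
import requests

SOLR_SERVERS = [
    "http://solr-a.example.com:8983/solr/mycore",
    "http://solr-b.example.com:8983/solr/mycore",
]

def index_document(doc):
    """Send the same document to every Solr server and commit."""
    for base in SOLR_SERVERS:
        resp = requests.post(
            f"{base}/update?commit=true",
            json=[doc],  # the JSON update handler accepts a list of documents
            headers={"Content-Type": "application/json"},
            timeout=10,
        )
        resp.raise_for_status()

index_document({"id": "42", "title": "bi-directional example"})
```

In practice you also have to deal with one server being temporarily unreachable (e.g. queue and retry the update), which is exactly the bookkeeping SolrCloud or CDCR does for you.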

Sync data in a Redis multi-master configuration

I'm a newbie to Redis and I was wondering if someone could help me understand whether it is the right tool.
This is my scenario:
I have many different nodes, each behaving like a master and accepting client connections to read and write a few pieces of geographical data and the timestamp of the incoming record.
Each master node could be hosted on a drone that only randomly gets in touch and can communicate with the others, according to network conditions; when this happens, they should synchronize their data according to its age (only records more recent than a specified time).
Is there any way to achieve this with Redis, or do I have to implement this feature at the application level?
I tried a master/slave configuration without success, and I was wondering if Redis Cluster can somehow meet my needs.
I googled around, but what I found didn't answer my question:
https://serverfault.com/questions/717406/redis-multi-master-replication
Using Redis Replication on different machines (multi master)
Teo, as a matter of fact, Redis doesn't have multi-master replication.
And the cluster shards its data across different instances. Say you have only two Redis instances: instance1 will accept store and retrieve requests for both instance1 and instance2 data, but it will ask for, and store in, instance2 every key that does not belong to its own shard.
This is not, I think, really what you want. You could give PostgreSQL+BDR a try, as PostgreSQL supports NoSQL-style storage and BDR provides real master-master replication (https://wiki.postgresql.org/wiki/BDR_Project), if that's really what you need.
I work with both today (and also MongoDB), each one for a different goal. Redis provides smaller overhead and memory use, fast connections and fast replication, but it won't provide multi-master (if you really need it).
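If you do handle it at the application level, one rough approach when two drones get in touch is to keep the records in a sorted set scored by timestamp and exchange only the entries newer than a cutoff. A sketch with redis-py (the key name, addresses and record format are just placeholders):

```python
# An application-level sync sketch, assuming each drone runs its own Redis and
# keeps records in a sorted set "records" scored by Unix timestamp; the key
# name, addresses and record format are placeholders.
import time
from redis import Redis

def sync_recent(source, target, max_age_secs):
    """Copy records newer than max_age_secs from source to target."""
    cutoff = time.time() - max_age_secs
    recent = source.zrangebyscore("records", cutoff, "+inf", withscores=True)
    if recent:
        # ZADD is idempotent, so re-sending records the peer already has is harmless.
        target.zadd("records", {member: score for member, score in recent})
    return len(recent)

drone_a = Redis(host="10.0.0.1", port=6379)
drone_b = Redis(host="10.0.0.2", port=6379)

# When two drones get in touch, exchange the last hour of records both ways.
sync_recent(drone_a, drone_b, max_age_secs=3600)
sync_recent(drone_b, drone_a, max_age_secs=3600)
```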

Redis: Efficient cluster of servers for large key set

I have a very large set of keys, 200M keys, with small values (<100 bytes) to store, and I'm trying to use Redis. The problem is that I have 10 Redis DBs to split the keys over, but currently I'm on a single server with those 10 Redis DBs. By a Redis DB I mean using SELECT. From my calculations it looks like I'm going to blow out memory; I think I'll need over 4TB of memory for this case! What are my options? My calculation is based on 10,000 keys with 100-byte values taking 220MB of RAM (this is from a table I found). So, simply put, (2*10^8 / 10^4) * 220MB = 4.4TB.
If my calculation looks correct, what are my options? I've read in different posts that Redis VM is no longer an option. Can I use a Redis cluster? That still appears to require too many servers to be practical. I understand I could switch to another DB, but I'd like that to be the last-resort option.
Firstly, using shared databases (i.e. the SELECT command) isn't a recommended practice, since all of these databases are essentially managed by the same Redis process. It is preferable to have 10 separate Redis processes (even on the same server) in order to avoid contention (more info here).
Next, there are ways to reduce the memory footprint of your database. You could, for example, perform client-side compression (see here) or consider other optimizations such as using Hashes to keep multiple values (as described here).
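To illustrate the Hashes idea: pack many small key/value pairs into a limited number of hash buckets so Redis can keep each hash in its compact encoding. A rough sketch (the bucket count is an arbitrary placeholder and should be tuned against hash-max-ziplist-entries):

```python
# A rough sketch of the hash-bucketing optimization, assuming string keys with
# small (<100-byte) values; the bucket count is an arbitrary placeholder and
# should be tuned together with hash-max-ziplist-entries.
import zlib
from redis import Redis

r = Redis()
NUM_BUCKETS = 10_000  # keep each hash small enough for the compact encoding

def _bucket(key):
    return f"bucket:{zlib.crc32(key.encode()) % NUM_BUCKETS}"

def bucketed_set(key, value):
    r.hset(_bucket(key), key, value)

def bucketed_get(key):
    return r.hget(_bucket(key), key)

bucketed_set("user:123456", "small-value")
print(bucketed_get("user:123456"))
```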
That said, a Redis server is ultimately bound by the amount of RAM that the host provides. Once you've reached that limit you'll need to shard your database and use a Redis cluster. Since you're already using multiple databases, this shouldn't pose a big challenge, as your code should already be compatible with that to a degree. Sharding can be done in one of three ways: client-side, proxy-based, or Redis Cluster. Client-side sharding can be implemented in your code or by the Redis client that you're using (if it supports that). Redis Cluster (v3) is expected to be released in the very near future and already has a stable release candidate. As for proxy-based sharding, there are several open-source solutions out there, including Twitter's twemproxy, Netflix's dynomite and codis. Additional information about sharding and partitioning can be found here.
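As an illustration of the client-side approach, a minimal sketch that routes each key to one of several independent Redis processes by a stable hash of the key (the ports are placeholders):

```python
# A minimal client-side sharding sketch, assuming 10 independent Redis
# processes on ports 6379-6388 (the addresses are placeholders). Each key is
# routed to one process by a stable hash of the key.
import zlib
from redis import Redis

SHARDS = [Redis(host="localhost", port=6379 + i) for i in range(10)]

def _shard_for(key):
    return SHARDS[zlib.crc32(key.encode()) % len(SHARDS)]

def sharded_set(key, value):
    _shard_for(key).set(key, value)

def sharded_get(key):
    return _shard_for(key).get(key)

sharded_set("key:12345", "small-value")
print(sharded_get("key:12345"))
```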
Disclaimer: I work at Redis Labs. Lastly, AFAIK there's only one Redis-as-a-Service provider that already provides built-in support for clustering Redis. Redis Labs' Redis Cloud is a fully-managed service that can scale seamlessly to any required capacity. Our clusters support both the '{}' hashtag standard as well as sharding by RegEx - more about this can be found here.
You can use LMDB with Dynomite to store data beyond your memory capacity. LMDB uses both disk and memory to store data, and Dynomite makes LMDB distributed.
We have done a POC with this combo and they work nicely together.
For more information, please check out our open issue here:
https://github.com/Netflix/dynomite/issues/254