Why does Redis use key-value storage?

I am studying Redis and I want to know why Redis uses a key-value model for storing data.
Also, why use Redis at all? SQL databases can also store key-value data, so why choose Redis over SQL?

SQL vs Redis:
SQL is a database technology that persists data onto disk, whereas Redis is a cache store, i.e. it stores data in RAM.
Since RAM access is faster than disk access, storing and fetching data in Redis is much faster than in SQL, but the catch is that RAM has limited capacity.
So basically, one might want to store some part of the data in Redis so as to make the application faster.
Redis is Key-Value:
Generally, some part of the data is frequently accessed whereas some is not. Again, one might want to store that frequently accessed part in Redis and the rest in the SQL database.
Redis is used to make the application faster by reducing the number of hits to the disk.
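As a minimal sketch of that idea, here is what a cache-aside lookup might look like with the Jedis client (the connection details and the loadUserNameFromSql helper are assumptions, not part of the question):

```java
import redis.clients.jedis.Jedis;

public class UserCache {
    private final Jedis jedis = new Jedis("localhost", 6379);

    // Cache-aside: try Redis (RAM) first, fall back to the SQL database (disk) on a miss.
    public String getUserName(long userId) {
        String key = "user:" + userId + ":name";
        String cached = jedis.get(key);                 // fast in-memory lookup
        if (cached != null) {
            return cached;                              // cache hit: no disk access at all
        }
        String fromSql = loadUserNameFromSql(userId);   // slow, disk-backed SQL query
        jedis.setex(key, 300, fromSql);                 // keep the hot value in RAM for 5 minutes
        return fromSql;
    }

    // Hypothetical placeholder for the actual SQL lookup.
    private String loadUserNameFromSql(long userId) {
        return "name-from-sql";
    }
}
```

Only the frequently accessed field is duplicated into Redis; everything else stays in the SQL database on disk.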

Related

Can I query the Redis database directly (persistent, not in-memory), or is data always kept in-memory and requests executed against the in-memory data?

A simple question about using Redis as a persistent database (not in-memory):
Can I directly query the Redis database from my Spring Boot application (just like with MySQL or an Oracle DB), or should data always be loaded into memory first so that requests are executed against the in-memory data?
Thanks.
When you query data from Redis, it does not load that data into memory at that point. Redis is an in-memory database, meaning it always keeps all the data in its memory, and when you send a query to Redis, it processes the query against the data that is already in memory.
Redis is an in-memory database which you can treat like any other external dependency you may have in your application. Compared with the other databases you mentioned, it does not offer the ability to use SQL to query it, so you must rely on its own commands, which are very specific.
There are some Java clients you can use to interact with Redis, including Lettuce and Jedis. The commands you send to Redis are executed against the data that Redis itself keeps in its own memory.
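For example, a minimal sketch with the Jedis client (host, port and key names are assumptions): each call is simply sent to the Redis server and answered from its in-memory data, just like querying any other external database.

```java
import redis.clients.jedis.Jedis;

public class RedisQueryExample {
    public static void main(String[] args) {
        // Connect to the Redis server (address and port are assumed here).
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.set("greeting", "hello");              // write a key
            String value = jedis.get("greeting");        // query it directly; Redis answers from its own memory
            System.out.println(value);

            jedis.hset("user:1", "name", "Alice");       // Redis-specific commands instead of SQL
            System.out.println(jedis.hgetAll("user:1")); // {name=Alice}
        }
    }
}
```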

Redis: Dump db and delete dumped key / value pairs

I have multiple servers that all store set members in a shared Redis cache. When the cache fills up, I need to persist the data to disk to free up RAM. I then plan to parse the dumped data such that I will be able to combine all of the values that belong to a given key in MongoDB.
My first plan was to have each server process attempt an sadd operation. If the request fails because Redis has reached maxmemory, I planned to query for each of my set keys, and write each to disk.
However, I am wondering if there is a way to use one of the inbuilt persistence methods in Redis to write the Redis data to disk and delete the key/value pairs after writing. If this is possible I could just parse the rdb dump and work with the data in that fashion. I'd be grateful for any help others can offer on this question.
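For reference, a rough sketch of that first plan using Jedis (the key names and the writeToDisk helper are placeholders, not working code for my setup):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.exceptions.JedisDataException;

import java.util.Set;

public class OffloadOnMaxmemory {
    private final Jedis jedis = new Jedis("localhost", 6379);

    public void addMember(String setKey, String member) {
        try {
            jedis.sadd(setKey, member);
        } catch (JedisDataException e) {
            // Redis replies with an OOM error when maxmemory is reached and nothing can be evicted.
            if (e.getMessage() != null && e.getMessage().startsWith("OOM")) {
                Set<String> members = jedis.smembers(setKey); // read the whole set
                writeToDisk(setKey, members);                 // placeholder: persist it ourselves (file/MongoDB)
                jedis.del(setKey);                            // free the RAM
                jedis.sadd(setKey, member);                   // retry the original add
            } else {
                throw e;
            }
        }
    }

    // Placeholder for writing the dumped members somewhere durable.
    private void writeToDisk(String key, Set<String> members) { /* ... */ }
}
```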
Redis' persistence is meant to be used for whatever's in RAM. Put differently, you can't persist what ain't in RAM.
To answer your question: no, you can't use persistence to "offload" data from RAM.

How can I dump a single Redis DB index?

Redis SAVE and BGSAVE commands dump the complete Redis data to a persistent file.
But is there a way to dump only one DB index?
I am using the same Redis server with multiple DB indices.
I use DB 0 as config which is edited manually and contains just a small number of keys. I wish to dump this to a file as a config snapshot (versioned) to keep track of manual changes in the prod environment.
The rest of the DBs have a large number of items, that will take too long to dump, and I don't need to back them up.
Redis' persistence scope is the entire instance, meaning all shared/numbered databases and all keys in them. Saving only a subset of these is not supported.
Instead, use two independent Redis instances and configure each to persist (or not) per your needs. The overhead of running an extra instance is a few megabytes, so it is practically negligible.
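If running a second instance really isn't an option, a workaround for the small config DB is to snapshot it yourself from a client. A minimal sketch with Jedis (assumes DB 0 holds only plain string values; file name and connection details are assumptions):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.ScanParams;
import redis.clients.jedis.ScanResult;

import java.io.FileWriter;
import java.io.IOException;

public class ConfigDbSnapshot {
    public static void main(String[] args) throws IOException {
        try (Jedis jedis = new Jedis("localhost", 6379);
             FileWriter out = new FileWriter("config-snapshot.txt")) {
            jedis.select(0);                                      // the small "config" database
            String cursor = ScanParams.SCAN_POINTER_START;        // "0"
            do {
                ScanResult<String> page = jedis.scan(cursor);     // iterate keys without blocking the server
                for (String key : page.getResult()) {
                    out.write(key + "=" + jedis.get(key) + "\n"); // only works for string values
                }
                cursor = page.getCursor();
            } while (!"0".equals(cursor));
        }
    }
}
```

The resulting text file can then be versioned to track manual changes, independently of the RDB dump.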

Redis: Efficient cluster of servers for large key set

I have a very large set of keys, 200M keys, with small values, <100 bytes, to store and I'm trying to use Redis. The problem is that I have 10 Redis DBs to split the keys over, but currently I'm on a single server with those 10 Redis DBs. By a Redis DB I mean using SELECT. From my calculations it looks like I'm going to blow out memory; I think I'll need over 4TB of memory for this case! What are my options? First, my calculation is based on 10,000 keys with 100-byte values taking 220MB of RAM (this is from a table I found). So simply put, (2*10^8 / 10^4) * 220MB = 4.4TB.
If my calculation looks correct, what are my options? I've read on different posts that Redis VM is no longer an option. Can I use a Redis cluster? This still appears to require too many servers to be practical. I understand I could switch to another DB, but I'd like that to be the last resort option.
Firstly, using shared databases (i.e. the SELECT command) isn't a recommended practice, since all of these databases are essentially managed by the same Redis process. It is preferable to have 10 separate Redis processes (even on the same server) in order to avoid contention (more info here).
Next, there are ways to reduce the memory footprint of your database. You could, for example, perform client-side compression (see here) or consider other optimizations such as using Hashes to keep multiple values (as described here).
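For instance, a minimal sketch of the hash-based optimization (the bucket scheme, bucket size and numeric IDs are assumptions): instead of 200M top-level keys, you group values into hashes of about a thousand fields each, so Redis can keep each bucket in its compact encoding and you pay the per-key overhead far less often.

```java
import redis.clients.jedis.Jedis;

public class HashBucketStore {
    private static final int BUCKET_SIZE = 1000; // keep hashes small enough for Redis' compact hash encoding

    private final Jedis jedis = new Jedis("localhost", 6379);

    // Derive a bucket key and a field name from a numeric ID (assumed key scheme).
    private String bucketFor(long id) { return "bucket:" + (id / BUCKET_SIZE); }
    private String fieldFor(long id)  { return Long.toString(id % BUCKET_SIZE); }

    public void put(long id, String value) {
        jedis.hset(bucketFor(id), fieldFor(id), value); // one hash field instead of one top-level key
    }

    public String get(long id) {
        return jedis.hget(bucketFor(id), fieldFor(id));
    }
}
```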
That said, a Redis server is ultimately bound by the amount of RAM that the host provides. Once you've reached that limit you'll need to shard your database and use a Redis cluster. Since you're already using multiple databases this shouldn't pose a big challenge as your code should already be compatible with that to a degree. Sharding can be done in one of three approaches: client, proxy or Redis Cluster. Client-side sharding can be implemented in your code or by the Redis client that you're using (if the client library that you're using supports that). Redis Cluster (v3) is expected to be released in the very near future and already has a stable release candidate. As for proxy-based sharding, there are several open source solutions out there, including Twitter's twemproxy, Netflix's dynomite and codis. Additional information about sharding and partitioning can be found here.
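To illustrate the client-side option (node list and hashing scheme are assumptions; a real setup would typically use consistent hashing), the application itself can pick the target Redis instance by hashing the key:

```java
import redis.clients.jedis.Jedis;

import java.util.Arrays;
import java.util.List;

public class ClientSideSharding {
    // One connection per shard; hosts and ports are assumptions.
    private final List<Jedis> shards = Arrays.asList(
            new Jedis("10.0.0.1", 6379),
            new Jedis("10.0.0.2", 6379),
            new Jedis("10.0.0.3", 6379));

    // Simple modulo sharding on the key's hash. Consistent hashing is preferable in practice
    // because it limits how many keys move when shards are added or removed.
    private Jedis shardFor(String key) {
        return shards.get(Math.floorMod(key.hashCode(), shards.size()));
    }

    public void set(String key, String value) { shardFor(key).set(key, value); }
    public String get(String key)             { return shardFor(key).get(key); }
}
```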
Disclaimer: I work at Redis Labs. Lastly, AFAIK there's only one Redis-as-a-Service provider that already provides built-in support for clustering Redis. Redis Labs' Redis Cloud is a fully-managed service that can scale seamlessly to any required capacity. Our clusters support both the '{}' hashtag standard as well as sharding by RegEx - more about this can be found here.
You can use LMDB with Dynomite to store data beyond your memory capacity. LMDB uses both disk and memory to store data, and Dynomite makes LMDB distributed.
We have done a POC with this combination and they work nicely together.
For more information, please check out our open issue here:
https://github.com/Netflix/dynomite/issues/254

What is a recommended scalable DB platform to use in AWS for large amounts of volatile data sets - elasticsearch, Redis or DynamoDB?

Users of our platform will have large amounts of stored data on our system. Through an application, once connected, that data will be transferred to them and no longer need to remain on our servers. There could potentially be hundreds or thousands of users connected at any given time, performing their downloads.
Here's the proposed architecture:
User management, configuration, and data download statistics will be maintained in a SQL Server database, while using either Redis or DynamoDB for the large data sets.
The reasons for choosing either Redis or DynamoDB are cost (cheaper than running another SQL Server instance) and performance. The data format will be similar to a data mart: a flat table with no joins.
Initially the queries would be simple - get all data for user X between a date range, and optionally delete.
Since we may want to add free-text search for certain fields of that data, Elasticsearch may be a better option to use from the get-go.
I want this to be auto-scaling but not sure which database would be best to use for this scenario.
Here's some great discussion on Database + Search tier from AWS ReInvent:
https://youtu.be/K7o5OlRLtvU?t=1574
I would not take Elasticsearch alone because it does not provide auto-scaling for write capacity. In fact, it's not trivial to increase the number of shards of an index. Secondly, it can only handle the JSON format, which could be an issue for you.
Redis could be a good idea because it is really fast, everything is done in RAM, and it provides keys with a limited time-to-live, which could be interesting for you. Unfortunately, if your data size exceeds the RAM capacity of your Amazon instance, you will have to shard your Redis database, and Redis does not support that; you will have to handle it in your application code. Moreover, as far as I know, Redis does not handle complex queries. You will also need to fit your data into a Redis data structure, which could be an issue for you.
DynamoDB handles auto-scaling really well, but on the other hand it is a key/value database, so it does not allow you to make queries like "get all data for user X between a date range". DynamoDB also allows you to save your data in any format.
The solution would be to use either DynamoDB or Redis, depending on the size of your data, and to use Elasticsearch to index your keys with only the metadata (user and dates). That way your index stays small, and if you lose the ability to index because Elasticsearch gets too busy, you keep the ability to save users' data.
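A minimal sketch of that split (index name, key scheme, TTL and endpoints are all assumptions; Redis holds the payload with a time-to-live, while Elasticsearch only indexes a small metadata document through its REST API):

```java
import redis.clients.jedis.Jedis;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class VolatileDataStore {
    private final Jedis jedis = new Jedis("localhost", 6379);
    private final HttpClient http = HttpClient.newHttpClient();

    public void save(String userId, String date, String payload) throws Exception {
        String key = "data:" + userId + ":" + date;

        // Store the (potentially large) payload in Redis with a TTL so it expires after download.
        jedis.setex(key, 7 * 24 * 3600, payload);

        // Index only the small metadata document in Elasticsearch so the index stays small.
        String doc = "{\"user\":\"" + userId + "\",\"date\":\"" + date + "\",\"key\":\"" + key + "\"}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/downloads/_doc/" + key))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(doc))
                .build();
        http.send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

A date-range query for user X then goes to Elasticsearch, which returns the matching keys, and the payloads are fetched (and optionally deleted) from Redis.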