I have multiple servers that all store set members in a shared Redis cache. When the cache fills up, I need to persist the data to disk to free up RAM. I then plan to parse the dumped data such that I will be able to combine all of the values that belong to a given key in MongoDB.
My first plan was to have each server process attempt an SADD operation. If the request fails because Redis has reached maxmemory, I planned to query for each of my set keys and write each to disk.
However, I am wondering if there is a way to use one of the inbuilt persistence methods in Redis to write the Redis data to disk and delete the key/value pairs after writing. If this is possible I could just parse the rdb dump and work with the data in that fashion. I'd be grateful for any help others can offer on this question.
Redis' persistence is meant to be used for whatever's in the RAM. Put differently, you can't persist what ain't in RAM.
To answer your question: no, you can't use persistence to "offload" data from RAM.
I am studying Redis and I want to know why Redis uses a key-value model for storage, and why one would use Redis when SQL databases can also store key-value data. Why not SQL? Why Redis?
SQL vs Redis:
SQL is a database technology that persists data to disk, whereas Redis is a cache store, i.e. it keeps data in RAM.
Since RAM access is faster than disk access, storing and fetching data in Redis is much faster than in SQL; the catch is that RAM has limited capacity.
So basically, one might want to store some part of the data in Redis to make the application faster.
Redis is Key-Value:
Generally, some parts of the data are accessed frequently while others are not. One might want to store the frequently accessed fields in Redis and the rest in the SQL database.
Redis is used to make the application faster by reducing the number of hits to the disk.
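To make this concrete, here is a minimal cache-aside sketch using the redis-py client; the key format, the one-hour TTL, and the fetch_user_from_sql function are illustrative assumptions, not part of any real schema:

    # A minimal cache-aside sketch (assumes the redis-py client; the key format,
    # TTL, and fetch_user_from_sql function are illustrative)
    import json
    import redis

    r = redis.Redis()

    def get_user(user_id, fetch_user_from_sql):
        key = "user:%s" % user_id
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)              # fast path: served from RAM
        user = fetch_user_from_sql(user_id)        # slow path: hit the SQL database on disk
        r.set(key, json.dumps(user), ex=3600)      # cache the result for next time
        return user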
The Redis SAVE and BGSAVE commands dump the complete dataset to a persistent file.
But is there a way to dump only one DB index?
I am using the same Redis server with multiple DB indices.
I use DB 0 as config which is edited manually and contains just a small number of keys. I wish to dump this to a file as a config snapshot (versioned) to keep track of manual changes in the prod environment.
The rest of the DBs have a large number of items, that will take too long to dump, and I don't need to back them up.
Redis' persistence scope is the entire instance, meaning all shared/numbered databases and all keys in them. Saving only a subset of these is not supported.
Instead, use two independent Redis instances and configure each to persist (or not) per your needs. The overhead of running an extra instance is a few megabytes, so it is practically negligible.
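As a sketch, such a split could look like this (two minimal redis.conf files; the ports, save thresholds, and filenames are just examples):

    # redis-config.conf: the small, manually edited config DB; snapshot aggressively
    port 6379
    save 60 1
    dbfilename config-snapshot.rdb

    # redis-data.conf: the large data sets; RDB snapshotting disabled
    port 6380
    save ""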
I have been using redis a lot lately, and really am loving it. I am mostly familiar with persistence (rdb and aof). I do have one concern. I would like to be able to selectively "archive" some of my data to disk (or cheaper storage) once it is no longer important. I don't really want to delete it because it might be valuable at some point.
All of my keys are named id_<id>_<someattribute>. So when I am done with id 4, I want to "archive" all keys that match id_4_*. I can view them quite easily from the command line, but I can't do anything with them, per se. I have quite a bit of data (very large bitmaps) associated with this data set, and frankly I can't afford the space once the id is no longer relevant or important.
If this were MySQL, I would have my different tables and would very easily just dump one to a .sql file and then drop the table. The actual .sql file isn't directly useful to me, but I could reimport the data if/when I need it. Or maybe I have two MySQL databases and want to move one table from one to the other. Are there Redis corollaries to these processes? Is there some way to make an rdb or aof file that is a subset of the data?
Any help or input on this matter would be appreciated! Thanks!
@Hoseong Hwang recently asked what I did, so I'm posting what I ended up doing.
It was really quite simple, actually. I benefited from the fact that my key space is segmented by user. All of my keys had the structure user_<USERID>_<OTHERVALUES>, and my archival needs were on a per-user basis: some users' data no longer needed to be kept in Redis.
So, I started up another instance of redis-server, on another local port (6380, say) or another machine; it makes no difference. Then I wrote a short script that called KEYS user_<USERID>_* (I understand the blocking nature of KEYS; my key space is small enough that it didn't matter, and you can use SCAN if that is an issue for you). Then, for each key, I ran MIGRATE to move it to the new redis-server instance. After they were all done, I issued a SAVE to ensure that the rdb file for that instance was up to date. Now I had an rdb containing just the content I wanted to archive, so I terminated the temporary redis-server and the memory was reclaimed.
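For reference, a rough Python version of that script with redis-py might look like this (the user id, destination host/port, and MIGRATE timeout are placeholders):

    # A rough sketch of the archival script described above (assumes the
    # redis-py client; user id, host/port, and timeout are placeholders)
    import redis

    src = redis.Redis(port=6379)                 # the main instance
    DEST_HOST, DEST_PORT = "127.0.0.1", 6380     # the temporary archive instance
    USER_ID = "4"

    for key in src.keys("user_%s_*" % USER_ID):  # or scan_iter() to avoid blocking
        # MIGRATE atomically copies the key to the destination and removes it here
        src.execute_command("MIGRATE", DEST_HOST, DEST_PORT, key, 0, 5000)

    # Make sure the archive instance writes its rdb file before it is shut down
    redis.Redis(port=DEST_PORT).save()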
Now, keep that rdb file somewhere for cheap, safe keeping. And if you ever needed it again, doing the reverse of my process above to get those keys back into your main redis-server would be fairly straightforward.
Instead of trying to extract data from a live Redis instance for archiving purposes, my suggestion would be to extract the data from a dump file.
Run a bgsave command to generate a dump, and then use redis-rdb-tools to extract the keys you are interested in - you can easily get the result as a json file.
See https://github.com/sripathikrishnan/redis-rdb-tools
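For example, pulling just the keys for id 4 out of a dump into JSON would look something like this (the key regex and file paths are placeholders):

    rdb --command json --key "id_4_.*" /var/redis/6379/dump.rdb > id_4_archive.json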
You can keep the JSON data in flat files, or store it in a relational database or a document store if you need it to be indexed for retrieval purposes.
A few suggestions for you...
I would like to be able to selectively "archive" some of my data to
disk (or cheaper storage) once it is no longer important. I don't
really want to delete it because it might be valuable at some point.
If such data is that valuable, use a traditional database for storage. Despite redis supporting snapshotting to disk and AOF logs, you should view it as mostly volatile storage. The primary use case for redis is reducing latency, not persistence of valuable data.
So when I am done with id 4, I want to "archive" all keys that
match id_4_*
What constitutes "done"? You need to ask yourself this question. Does it mean the data can fall out of redis after one day? If so, just use TTLs and expiration to let redis remove the object from memory. If you need it again, fall back to the database and pull the object back into redis. The first client will take the hit of pulling from the db, but subsequent requests will be cached. If "done" means something not tied to a specific duration, then you'll have to remove items from redis manually to conserve memory space.
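For instance, if "done" simply means a day has passed (the key name and value here are placeholders):

    # Expire the key automatically after 24 hours
    SET id_4_bitmap <serialized value> EX 86400
    # Or attach a TTL to a key that already exists
    EXPIRE id_4_bitmap 86400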
If this were mysql, I would have my different tables and would very
easily just dump it to a .sql file and then drop the table. The actual
.sql file isn't directly useful to me, but I could reimport the data
if/when I need it.
We do the same at my firm. Important data is imported into redis from the rdbms via on-demand jobs. We don't drop tables; we just selectively import data from the database into redis. Nothing wrong with that.
Is there some way to make an rdb or aof file that is a subset of the
data?
I don't believe there is a way to do selective archiving; it's either all or none.
IMO, spend more time playing with redis. I highly recommend leveraging out-of-box features instead of reinventing and/or over-engineering solutions to suit your needs.
Hope that helps!
We have a big shopping and product-dealing system. We faced a lot of problems with MySQL, so after some R&D we planned to use Redis and started integrating it into our system.
The following data previously hit the database directly and has now been moved to Redis:
User shopping cart details
Affiliate click-tracking records
Product-dealing user data
Other site stats
I am not only storing the data in Redis; I have also written cron jobs that move the Redis data into MySQL at set intervals. This is the main point where I am facing issues.
I am looking for solutions to the following points:
Is there any other way to dump big data from Redis to MySQL?
If Redis fails, our data is stored in a file, so is it possible to store that data directly in the MySQL database?
Does Redis have any trigger system, like a queue, that I could use to avoid the cron jobs?
Is there any other way to dump big data from Redis to MySQL?
Redis has the ability (using bgsave) to generate a dump of the data in a non-blocking and consistent way.
https://github.com/sripathikrishnan/redis-rdb-tools
You could use Sripathi Krishnan's well-known package to parse a redis dump file (RDB) in Python, and populate the MySQL instance offline. Or you can convert the Redis dump to JSON format, and write scripts in any language you want to populate MySQL.
This solution is only interesting if you want to copy the complete data of the Redis instance into MySQL.
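As a rough sketch of that offline import, assuming the dump was first converted to JSON with redis-rdb-tools and that a staging table exists (the connection details and table name are hypothetical):

    # Offline import of a redis-rdb-tools JSON dump into MySQL (assumes MySQL
    # Connector/Python; the staging table and credentials are hypothetical)
    import json
    import mysql.connector

    with open("dump.json") as f:
        databases = json.load(f)   # rdb-tools emits a list with one object per Redis DB

    conn = mysql.connector.connect(user="app", password="secret", database="shop")
    cur = conn.cursor()
    for key, value in databases[0].items():
        cur.execute("INSERT INTO redis_import (redis_key, payload) VALUES (%s, %s)",
                    (key, json.dumps(value)))
    conn.commit()
    conn.close()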
Does Redis have any trigger system, like a queue, that I could use to avoid the cron jobs?
Redis has no trigger concept, but nothing prevents you from pushing events into Redis queues each time something must be copied to MySQL. For instance, instead of:
# Add an item to a user shopping cart
RPUSH user:<id>:cart <item>
you could execute:
# Add an item to a user shopping cart
MULTI
RPUSH user:<id>:cart <item>
RPUSH cart_to_mysql <id>:<item>
EXEC
The MULTI/EXEC block makes it atomic and consistent. Then you just have to write a little daemon waiting on items of the cart_to_mysql queue (using BLPOP commands). For each dequeued item, the daemon has to fetch the relevant data from Redis, and populate the MySQL instance.
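A bare-bones version of such a daemon in Python could look like this (redis-py and MySQL Connector/Python are assumed; the cart table and credentials are hypothetical):

    # A bare-bones queue-draining daemon (assumes redis-py and MySQL
    # Connector/Python; the cart table and credentials are hypothetical)
    import redis
    import mysql.connector

    r = redis.Redis()
    db = mysql.connector.connect(user="app", password="secret", database="shop")
    cur = db.cursor()

    while True:
        _, raw = r.blpop("cart_to_mysql")   # blocks until an item is pushed
        user_id, item = raw.decode().split(":", 1)
        cur.execute("INSERT INTO cart (user_id, item) VALUES (%s, %s)",
                    (user_id, item))
        db.commit()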
If Redis fails, our data is stored in a file, so is it possible to store that data directly in the MySQL database?
I'm not sure I understand the question here. But if you use the above solution, the latency between Redis updates and MySQL updates will be quite limited, so if Redis fails, you will only lose the very last operations (contrary to a solution based on cron jobs). It is of course not possible to have 100% consistency in the propagation of the data, though.
Is Redis a memory-only store like memcached, or does it write the data to disk? If it does write to disk, how often is the disk written to?
Redis persistence is described in detail here:
http://redis.io/topics/persistence
By default, redis performs snapshotting:
By default Redis saves snapshots of the dataset on disk, in a binary file called dump.rdb. You can configure Redis to have it save the dataset every N seconds if there are at least M changes in the dataset, or you can manually call the SAVE or BGSAVE commands.
For example, this configuration will make Redis automatically dump the dataset to disk every 60 seconds if at least 1000 keys changed:

    save 60 1000
Another good reference is this link to the author's blog, where he tries to explain how redis persistence works:
http://antirez.com/post/redis-persistence-demystified.html
Redis holds all data in memory. If the size of an application's data is too large for that, then Redis is not an appropriate solution.
However, Redis also offers two ways to make the data persistent:
1) Snapshots at predefined intervals, which may also depend on the number of changes. Any changes between these intervals will be lost in a power failure or crash.
2) Writing a kind of change log at every data change. You can fine-tune how often this is physically written to disk, but if you choose to always write immediately (which will cost you some performance), then there will be no data loss caused by the in-memory nature of Redis.
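In redis.conf, these two options correspond to directives along these lines (a sketch, not a complete configuration):

    # 1) Snapshotting: dump to disk every 60 seconds if at least 1000 keys changed
    save 60 1000

    # 2) Append-only file: log every change; the fsync policy trades speed for safety
    appendonly yes
    appendfsync always   # alternatives: everysec (the default), no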