Does redis have documents and collections? - redis

I am new to redis. I am coming from mongodb. I have read on the data types redis supports. I was wondering does redis group its data in documents and does it have collections?
I have seen that you can have different databases in the same redis server here where you can select database 3 instead of the default 0.

does redis group its data in documents
No, Redis is not a document DB, meaning you can't have searches and aggregations like you did with mongo, redis offers alternatives sometimes though...
does it have collections
Redis offers various data structures (including lists, sets, hashes, etc) that can be queried. So given you have a set of elements you can query whether the element is in the set, add the element to the set, etc.
I have seen that you can have different databases in the same redis server here where you can select database 3 instead of the default 0.
You can think about databases in redis as "pre-defined" schemas/databases in mongo db So yes, you can say like "I would like to work with DB Number 3 and not with DB number 0. In mongo and RDBMS you connect to schema/database by name instead
All-in-all I think you shouldn't try to think about Redis as an alternative to Mongo, its another tool designed for other use cases. Make sure you understand the laws of data modeling in Redis and the fact that Redis is more "Memory" oriented, its not a persistent DB like Mongo, Postgres, etc.

Related

I want to know why Redis use key-value for storing?

I am studying Redis and I want to know why Redis use key-value for storing?
and why use Redis there is SQL which is key-value then why not SQL why Redis
SQL vs Redis:
SQL is a database technology which persists data onto disk whereas Redis a cache store i.e. it stores data on RAM.
Since RAM access is faster than DISK access, storing and fetching data in Redis is much faster than SQL but the catch is RAM has limited storage.
So basically, one might want to store some part of data into Redis so as to make the application faster.
Redis is Key-Value:
Generally, some part of the data is frequently accessed whereas some is not. One again might want to store this particular field into Redis and rest into SQL database.
Redis is used so as to make the application faster by reducing the number of hits to the DISK.

Redis: Efficient cluster of servers for large key set

I have a very large set of keys, 200M keys, with small values, <100 bytes, to store and I'm trying to use Redis. The problem is such that I have 10 Redis DB to split the keys over, but currently I'm on a single server with those 10 Redis DB. By a Redis DB I mean using SELECT. From my calculations it looks like I'm going to blow out memory. I think I'll need over 4TB of memory for this case! What are my options? First, my calculation is based on 10000 keys with 100 byte values taking 220MB of RAM (this is from a table I found). So simply put (2*10^8 / 10^4) * 220MB = 4.4TB.
If my calculation looks correct, what are my options? I've read on different posts that Redis VM is no longer an option. Can I use a Redis cluster? This still appears to require too many servers to be practical. I understand I could switch to another DB, but I'd like that to be the last resort option.
Firstly, using shared databases (i.e. the SELECT command) isn't a recommended practice since all of these databases are essentially managed by the same Redis process. It is preferable having 10 separate Redis processes (even on the same server) in order to avoid contention (more info here).
Next, there are ways to reduce the memory footprint of your database. You could, for example, perform client-side compression (see here) or consider other optimizations such as using Hashes to keep multiple values (as described here).
That said, a Redis server is ultimately bound by the amount of RAM that the host provides. Once you've reached that limit you'll need to shard your database and use a Redis cluster. Since you're already using multiple databases this shouldn't pose a big challenge as your code should already be compatible with that to a degree. Sharding can be done in one of three approaches: client, proxy or Redis Cluster. Client-side sharding can be implemented in your code or by the Redis client that you're using (if the client library that you're using supports that). Redis Cluster (v3) is expected to be released in the very near future and already has a stable release candidate. As for proxy-based sharding, there are several open source solutions out there, including Twitter's twemproxy, Netflix's dynomite and codis. Additional information about sharding and partitioning can be found here.
Disclaimer: I work at Redis Labs. Lastly, AFAIK there's only one Redis-as-a-Service provider that already provides built-in support for clustering Redis. Redis Labs' Redis Cloud is a fully-managed service that can scale seamlessly to any required capacity. Our clusters support both the '{}' hashtag standard as well as sharding by RegEx - more about this can be found here.
You can use LMDB with Dynomite to store data beyond your memory capacity. LMDB uses both disk and memory to store data. Dynomite make LMDB to be distributed.
We have done a POC with this combo and they work nicely together.
For more information, please check out our open issue here:
https://github.com/Netflix/dynomite/issues/254

What is a recommended scalable DB platform to use in AWS for large amounts of volatile data sets - elasticsearch, Redis or DynamoDB?

Users of our platform will have large amounts of stored data on our system. Through an application, once connected, that data will be transferred to them and no longer need to remain on our servers. There could potentially be hundreds or thousands of users connected at any given time, performing their downloads.
Here's the proposed architecture:
User management, configuration, and data download statistics will be maintained in a SQL Server database, while using either Redis or DynamoDB for the large data sets.
The reason for choosing either Redis or DynamoDB is based on cost - cheaper than running another SQL Server instance, and performance. The data format will be similar to a datamart - flat table with no joins.
Initially the queries would be simple - get all data for user X between a date range, and optionally delete.
Since we may want to add free text searching for certain fields of that data using elasticsearch may be a better option to use from the get-go.
I want this to be auto-scaling but not sure which database would be best to use for this scenario.
Here's some great discussion on Database + Search tier from AWS ReInvent:
https://youtu.be/K7o5OlRLtvU?t=1574
I would not take Elastic-search alone because it does not provide auto-scaling for writing capacity. In fact, it's not trivial to augment the number of shard of an index. Secondly it can only handle the JSON format, which could be an issue for you.
Redis could be a good idea because it is really fast, everything is done in RAM, and it provides keys with a limited time-to-live which could be interesting for you. Unfortunately, if your data size exceeds the capacity in RAM of your amazon instance you will have to shard your Redis database. And Redis does not support it, you will have to deal it on your application code. Moreover, as far as I know Redis does not handle complex queries. You will also need to save your data in a Redis data structure which could be an issue for you
DynamoDB handles auto-scaling really well but on the other hand it is a key/value database so it does not allow you to make queries like "get all data for user X between a date range". DynamoDB also allows you to save your data in any format.
The solution will be to use either DynamoDB or either Redis depending of the size of your datas, and to use ElasticSearch in order to index your key with only the meta-data (user and dates). Like that your index will be small, and if you lost the ability to index because of ElasticSearch get too buzy, you keep the ability to save user's datas.

Redis full text search : reverse indexing or sunspot?

I have 3,5 millions records (readonly) actually stored in a MySQL DB that I would want to pull out to Redis for performance reasons. Actually, I've managed to store things like this into Redis :
1 {"type":"Country","slug":"albania","name_fr":"Albanie","name_en":"Albania"}
2 {"type":"Country","slug":"armenia","name_fr":"Arménie","name_en":"Armenia"}
...
The key I use here is the legacy MySQL id, so with some Ruby glue, I can break as less things as possible in this existing app (and this is a serious concern here).
Now the problem is when I need to perform a search on the keyword "Armenia", inside the value part. Seems like there's only two ways out :
Either I multiplicate Redis index :
id => JSON values (as shown above)
slug => id (reverse indexing based on the slug, that could do the basic search trick)
finally, another huge index specifically for autocomplete, as shown in this post : http://oldblog.antirez.com/post/autocomplete-with-redis.html
Either I use sunspot or some full text search engine (unfortunatly, I actually use ThinkingSphinx which is too much tied to MySQL :-(
So, what would you do ? Do you think the MySQL to Redis move of a single table is even a good idea ? I'm afraid of the Memory footprint those gigantic Redis key/values could take on a 16GB RAM Server.
Any feedback on a similar Redis usage ?
Before I start with a real answer, I wanted to mention that I don't see a good reason for you to be using Redis here. Based on what types of use cases it sounds like you're trying to do, it sounds like something like elasticsearch would be more appropriate for you.
That said, if you just want to be able to search for a few different fields within your JSON, you've got two options:
Auxiliary index that points field_key -> list_of_ids (in your case, "Armenia" -> 1).
Use Lua on top of Redis with JSON encoding and decoding to get at what you want. This is way more flexible and space efficient, but will be slower as your table grows.
Again, I don't think either is appropriate for you because it doesn't sound like Redis is going to be a good choice for you, but if you must, those should work.
Here's my take on Redis.
Basically I think of it as an in-memory cache that can be configured to only store the least recently used data (LRU). Which is the role I made it to play in my use case, the logic of which may be applicable to helping you think about your use case.
I'm currently using Redis to cache results for a search engine based on some complex queries (slow), backed by data in another DB (similar to your case). So Redis serves as a cache storage for answering queries. All queries either get served the data in Redis or the DB if it's a cache-miss in Redis. So, note that Redis is not replacing the DB, but merely being an extension via cache in my case.
This fit my specific use case, because the addition of Redis was supposed to assist future scalability. The idea is that repeated access of recent data (in my case, if a user does a repeated query) can be served by Redis, and take some load off of the DB.
Basically my Redis schema ended up looking somewhat like the duplication of your index you outlined above. I used sets and sortedSets to create "batches / sets" of redis-keys, each of which pointed to specific query results stored under a particular redis-key. And in the DB, I still had the complete data set and an index.
If your data set fits on RAM, you could do the "table dump" into Redis, and get rid of the need for MySQL. I could see this working, as long as you plan for persistent Redis storage and plan for the possible growth of your data, if this "table" will grow in the future.
So depending on your actual use case and how you see Redis fitting into your stack, and the load your DB serves, don't rule out the possibility of having to do both of the options you outlined above (which happend in my case).
Hope this helps!
Redis does provide Full Text Search with RediSearch.
Redisearch implements a search engine on top of Redis. This also enables more advanced features, like exact phrase matching, auto suggestions and numeric filtering for text queries, that are not possible or efficient with traditional Redis search approaches.

Is this a good use-case for Redis on a ServiceStack REST API?

I'm creating a mobile app and it requires a API service backend to get/put information for each user. I'll be developing the web service on ServiceStack, but was wondering about the storage. I love the idea of a fast in-memory caching system like Redis, but I have a few questions:
I created a sample schema of what my data store should look like. Does this seems like it's a good case for using Redis as opposed to a MySQL DB or something like that?
schema http://www.miles3.com/uploads/redis.png
How difficult is the setup for persisting the Redis store to disk or is it kind of built-in when you do writes to the store? (I'm a newbie on this NoSQL stuff)
I currently have my setup on AWS using a Linux micro instance (because it's free for a year). I know many factors go into this answer, but in general will this be enough for my web service and Redis? Since Redis is in-memory will that be enough? I guess if my mobile app skyrockets (hey, we can dream right?) then I'll start hitting the ceiling of the instance.
What to think about when desigining a NoSQL Redis application
1) To develop correctly in Redis you should be thinking more about how you would structure the relationships in your C# program i.e. with the C# collection classes rather than a Relational Model meant for an RDBMS. The better mindset would be to think more about data storage like a Document database rather than RDBMS tables. Essentially everything gets blobbed in Redis via a key (index) so you just need to work out what your primary entities are (i.e. aggregate roots)
which would get kept in its own 'key namespace' or whether it's non-primary entity, i.e. simply metadata which should just get persisted with its parent entity.
Examples of Redis as a primary Data Store
Here is a good article that walks through creating a simple blogging application using Redis:
http://www.servicestack.net/docs/redis-client/designing-nosql-database
You can also look at the source code of RedisStackOverflow for another real world example using Redis.
Basically you would need to store and fetch the items of each type separately.
var redisUsers = redis.As<User>();
var user = redisUsers.GetById(1);
var userIsWatching = redisUsers.GetRelatedEntities<Watching>(user.Id);
The way you store relationship between entities is making use of Redis's Sets, e.g: you can store the Users/Watchers relationship conceptually with:
SET["ids:User>Watcher:{UserId}"] = [{watcherId1},{watcherId2},...]
Redis is schema-less and idempotent
Storing ids into redis sets is idempotent i.e. you can add watcherId1 to the same set multiple times and it will only ever have one occurrence of it. This is nice because it means you don't ever need to check the existence of the relationship and can freely keep adding related ids like they've never existed.
Related: writing or reading to a Redis collection (e.g. List) that does not exist is the same as writing to an empty collection, i.e. A list gets created on-the-fly when you add an item to a list whilst accessing a non-existent list will simply return 0 results. This is a friction-free and productivity win since you don't have to define your schemas up front in order to use them. Although should you need to Redis provides the EXISTS operation to determine whether a key exists or a TYPE operation so you can determine its type.
Create your relationships/indexes on your writes
One thing to remember is because there are no implicit indexes in Redis, you will generally need to setup your indexes/relationships needed for reading yourself during your writes. Basically you need to think about all your query requirements up front and ensure you set up the necessary relationships at write time. The above RedisStackOverflow source code is a good example that shows this.
Note: the ServiceStack.Redis C# provider assumes you have a unique field called Id that is its primary key. You can configure it to use a different field with the ModelConfig.Id() config mapping.
Redis Persistance
2) Redis supports 2 types persistence modes out-of-the-box RDB and Append Only File (AOF). RDB writes routine snapshots whilst the Append Only File acts like a transaction journal recording all the changes in-between snapshots - I recommend adding both until your comfortable with what each does and what your application needs. You can read all Redis persistence at http://redis.io/topics/persistence.
Note Redis also supports trivial replication you can read more about at: http://redis.io/topics/replication
Redis loves RAM
3) Since Redis operates predominantly in memory the most important resource is that you have enough RAM to hold your entire dataset in memory + a buffer for when it snapshots to disk. Redis is very efficient so even a small AWS instance will be able to handle a lot of load - what you want to look for is having enough RAM.
Visualizing your data with the Redis Admin UI
Finally if you're using the ServiceStack C# Redis Client I recommend installing the Redis Admin UI which provides a nice visual view of your entities. You can see a live demo of it at:
http://servicestack.net/RedisAdminUI/AjaxClient/