Is this a good use-case for Redis on a ServiceStack REST API? - redis

I'm creating a mobile app and it requires an API service backend to get/put information for each user. I'll be developing the web service on ServiceStack, but was wondering about the storage. I love the idea of a fast in-memory caching system like Redis, but I have a few questions:
I created a sample schema of what my data store should look like. Does this seem like a good case for using Redis as opposed to a MySQL DB or something like that?
schema http://www.miles3.com/uploads/redis.png
How difficult is the setup for persisting the Redis store to disk or is it kind of built-in when you do writes to the store? (I'm a newbie on this NoSQL stuff)
I currently have my setup on AWS using a Linux micro instance (because it's free for a year). I know many factors go into this answer, but in general will this be enough for my web service and Redis? Since Redis is in-memory will that be enough? I guess if my mobile app skyrockets (hey, we can dream right?) then I'll start hitting the ceiling of the instance.

What to think about when designing a NoSQL Redis application
1) To develop correctly in Redis you should be thinking more about how you would structure the relationships in your C# program, i.e. with the C# collection classes, rather than a Relational Model meant for an RDBMS. The better mindset is to think about data storage like a Document database rather than RDBMS tables. Essentially everything gets blobbed in Redis via a key (index), so you just need to work out which are your primary entities (i.e. aggregate roots), which get kept in their own 'key namespace', and which are non-primary entities, i.e. simply metadata that should just get persisted with its parent entity.
Examples of Redis as a primary Data Store
Here is a good article that walks through creating a simple blogging application using Redis:
http://www.servicestack.net/docs/redis-client/designing-nosql-database
You can also look at the source code of RedisStackOverflow for another real world example using Redis.
Basically you would need to store and fetch the items of each type separately.
var redisUsers = redis.As<User>();
var user = redisUsers.GetById(1);
var userIsWatching = redisUsers.GetRelatedEntities<Watching>(user.Id);
The way you store relationship between entities is making use of Redis's Sets, e.g: you can store the Users/Watchers relationship conceptually with:
SET["ids:User>Watcher:{UserId}"] = [{watcherId1},{watcherId2},...]
Redis is schema-less and idempotent
Storing ids into redis sets is idempotent i.e. you can add watcherId1 to the same set multiple times and it will only ever have one occurrence of it. This is nice because it means you don't ever need to check the existence of the relationship and can freely keep adding related ids like they've never existed.
Related: writing to or reading from a Redis collection (e.g. a List) that does not exist is the same as working with an empty collection, i.e. a list gets created on-the-fly when you add an item to it, whilst accessing a non-existent list will simply return 0 results. This is friction-free and a productivity win, since you don't have to define your schemas up front in order to use them. Should you need to, Redis provides the EXISTS operation to determine whether a key exists and the TYPE operation to determine its type.
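As a rough illustration of the two properties above, here is a minimal Python sketch. FakeRedisSets is a hypothetical in-memory stand-in written for this example (it is not the ServiceStack client or any real Redis library); it only mimics how SADD, SMEMBERS and EXISTS behave so the snippet runs without a server.

```python
class FakeRedisSets:
    """Hypothetical in-memory stand-in mimicking Redis SADD/SMEMBERS/EXISTS."""

    def __init__(self):
        self._sets = {}

    def sadd(self, key, member):
        # Like SADD: creates the set on first write, returns 1 if the member
        # was newly added and 0 if it was already present.
        members = self._sets.setdefault(key, set())
        if member in members:
            return 0
        members.add(member)
        return 1

    def smembers(self, key):
        # Like SMEMBERS: a missing key reads as an empty set, not an error.
        return set(self._sets.get(key, set()))

    def exists(self, key):
        return key in self._sets


store = FakeRedisSets()
key = "ids:User>Watcher:1"          # same key convention as in the answer

first = store.sadd(key, 42)          # relationship created
second = store.sadd(key, 42)         # repeated add is a no-op
watchers = store.smembers(key)
missing = store.smembers("ids:User>Watcher:999")  # never written
```

Adding the same watcher id twice leaves a single entry, and reading a key that was never written simply yields an empty result.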
Create your relationships/indexes on your writes
One thing to remember is that because there are no implicit indexes in Redis, you will generally need to set up the indexes/relationships needed for reading yourself during your writes. Basically you need to think about all your query requirements up front and ensure you set up the necessary relationships at write time. The above RedisStackOverflow source code is a good example that shows this.
Note: the ServiceStack.Redis C# provider assumes you have a unique field called Id that is its primary key. You can configure it to use a different field with the ModelConfig.Id() config mapping.
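To make the "index on write" idea concrete, here is a small Python sketch using plain dictionaries in place of Redis. The QuestionStore class, the field names and the key patterns ("urn:question:{id}", "ids:Tag>Question:{tag}") are assumptions made up for this illustration and are not taken from RedisStackOverflow.

```python
class QuestionStore:
    """Hypothetical in-memory model: entities by key, plus sets used as indexes."""

    def __init__(self):
        self.entities = {}   # "urn:question:{id}" -> question dict
        self.indexes = {}    # index key -> set of question ids

    def _index(self, index_key, entity_id):
        self.indexes.setdefault(index_key, set()).add(entity_id)

    def store_question(self, question):
        # 1. Store the entity itself under its own key.
        self.entities[f"urn:question:{question['Id']}"] = question
        # 2. Maintain every index needed by later reads, at write time.
        self._index(f"ids:User>Question:{question['UserId']}", question["Id"])
        for tag in question["Tags"]:
            self._index(f"ids:Tag>Question:{tag}", question["Id"])

    def questions_by_tag(self, tag):
        # Reads only follow indexes that the write path already maintained.
        ids = sorted(self.indexes.get(f"ids:Tag>Question:{tag}", set()))
        return [self.entities[f"urn:question:{i}"] for i in ids]


store = QuestionStore()
store.store_question({"Id": 1, "UserId": 7, "Tags": ["redis", "nosql"]})
store.store_question({"Id": 2, "UserId": 7, "Tags": ["redis"]})
redis_questions = store.questions_by_tag("redis")
```

If a query such as "questions by tag" is not anticipated when writing, there is no index to read from later, which is why the query requirements need to be known up front.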
Redis Persistence
2) Redis supports two persistence modes out of the box: RDB and Append Only File (AOF). RDB writes routine snapshots, whilst the Append Only File acts like a transaction journal, recording all the changes in between snapshots. I recommend enabling both until you're comfortable with what each does and what your application needs. You can read all about Redis persistence at http://redis.io/topics/persistence.
Note: Redis also supports trivial replication, which you can read more about at: http://redis.io/topics/replication
Redis loves RAM
3) Since Redis operates predominantly in memory the most important resource is that you have enough RAM to hold your entire dataset in memory + a buffer for when it snapshots to disk. Redis is very efficient so even a small AWS instance will be able to handle a lot of load - what you want to look for is having enough RAM.
Visualizing your data with the Redis Admin UI
Finally if you're using the ServiceStack C# Redis Client I recommend installing the Redis Admin UI which provides a nice visual view of your entities. You can see a live demo of it at:
http://servicestack.net/RedisAdminUI/AjaxClient/

Related

Using Redis as main database in production

I have used Redis before in many projects, usually as a temporary cache, or to share small pieces of data between different subsystems. You get the idea. But now I'm building a service that mainly revolves around the concept of a key-value store and needs super-fast data retrieval, so I'm using Redis. Because of this, all my data would be on Redis (it's not just a temporary cache or something of that sort). So I'm wondering, is this OK or should I take backups to another more "stable" database like MongoDB or MySQL?
Note that I'm not talking about the usual periodic backups. I would do that regardless of what database I end up using.

Cons of using MemoryCache as a temporary copy of DB table

I have a site where you can list your car for sale. There is a list and a map with filtering on car types and other car specifications. My idea was to cache the cars table and filter on that when a user is searching for a car on the website. Currently, each time the user zooms in/out on the map, an HTTP request is made that queries the database, and that can be slow and heavy on the server.
As an experiment with 1 000 items, I have cached the map data (trimmed data with only basic info) and it's working fine. I was thinking of basically keeping a copy of the cars table, with all needed joins applied, in Memory Cache and using that instead of querying the DB on every request for both the list and the map. I would have a Cron Job run every 5 minutes (data can change, but updates don't have to be immediate) to update Memory Cache with the latest cars data from the DB.
What would be the cons of this approach in the long term, for example when storing 100 000 records? Besides the server needing more RAM, would there be any concerns about the scalability or usability of this approach? Would it be better to use Redis instead?
I do have a "search as you type" service in place now, but I don't really need that functionality as the filtering is pretty exact. I added it more as a caching server, but I think I would be better off just using Memory Cache until there is a real need for that kind of service.
Thank you
Since memory isn’t infinite, we need to limit the number of items stored in the In-Memory cache.
MemoryCache VS Redis
MemoryCache
MemoryCache is embedded in the process, hence can only be used as a plain key-value store from that process.
Redis
Redis is a remote data structure server. It is certainly slower than just storing the data in local memory.
I conclude that MemoryCache runs inside the web server process of the current application, so it is limited by the resources of that web server. On the same hardware it will of course be very fast. I think its main disadvantage is that the stored data cannot be shared with other applications.
If Redis is used, reads go through a separate server and are not as fast as MemoryCache, but Redis offers high reliability and high scalability.
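The point about limiting the number of cached items, together with the 5-minute refresh mentioned in the question, can be sketched as a size-bounded in-process cache with expiry. This is a Python illustration only, not the .NET MemoryCache API; BoundedCache, its parameters and the loader callback are names invented for this example.

```python
import time
from collections import OrderedDict


class BoundedCache:
    """Hypothetical in-process cache: least-recently-used eviction plus a TTL."""

    def __init__(self, max_items, ttl_seconds, clock=time.monotonic):
        self._max_items = max_items
        self._ttl = ttl_seconds
        self._clock = clock              # injectable so expiry can be tested
        self._items = OrderedDict()      # key -> (expires_at, value)

    def __len__(self):
        return len(self._items)

    def get(self, key, loader):
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            self._items.move_to_end(key)       # mark as recently used
            return entry[1]
        value = loader(key)                    # miss or expired: hit the DB
        self._items[key] = (now + self._ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)    # evict least recently used
        return value


def load_cars_from_db(region):
    # Placeholder for the real database query (assumption for this sketch).
    return [f"car-in-{region}"]


cars_cache = BoundedCache(max_items=1000, ttl_seconds=300)
cars = cars_cache.get("north", load_cars_from_db)
```

Each web server process would hold its own copy of such a cache, which is the sharing limitation described above; a Redis-backed cache trades some latency for a single copy visible to all processes.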
Related Post:
1. How to update redis after updating database?
2. how to keep caching up to date
3. How can MySQL update data in real time in redis cache?

Realtime queries in deepstream "cache" layer?

I see that by using the RethinkDB connector one can achieve realtime querying capabilities by subscribing to specifically named lists. I assume that this is not actually the fastest solution, as the query probably updates only after changes to records are written to the database. Is there any recommended approach to achieve realtime querying capabilities deepstream-side?
There are some favourable properties like:
Number of unique queries is small compared to number of records or even number of connected clients
All manipulation of records that are subject to querying is done via RPC.
I can imagine multiple ways how to do that:
Imitate the rethinkdb connector approach. But for that I am missing a list.listen() method. With that I would be able to create a backend process creating a list on-demand and on each RPC CRUD operation on records update all currently active lists=queries.
Reimplement basic list functionality in records and use the above approach with now existing .listen()
Use .listen() in events?
Or do we have list.listen() and I just missed it? Or is there a more elegant way to do it?
Great question - generally lists are a client-side concept, implemented on top of records. Listen notifies you about clients subscribing to records, not necessarily changing them - change notifications arrive via mylist.subscribe(data => {}) or myRecord.subscribe(data => {}).
The tricky bit is the very limited querying capability of caches. Redis has a basic concept of secondary indices that can be searched for ranges and intersection, memcached and co are to my knowledge pure key-value stores, searchable only by ID - as a result the actual querying would make most sense on the database layer where your data will usually arrive in significantly less than 200ms.
The RethinkDB search provider offers support for RethinkDB's built-in realtime querying capabilities. Alternatively you could use MongoDB and trail its operations log, or use Postgres and deepstream's built-in subscribe feature for change notifications.

Redis: Efficient cluster of servers for large key set

I have a very large set of keys, 200M keys, with small values, <100 bytes, to store and I'm trying to use Redis. The problem is such that I have 10 Redis DB to split the keys over, but currently I'm on a single server with those 10 Redis DB. By a Redis DB I mean using SELECT. From my calculations it looks like I'm going to blow out memory. I think I'll need over 4TB of memory for this case! What are my options? First, my calculation is based on 10000 keys with 100 byte values taking 220MB of RAM (this is from a table I found). So simply put (2*10^8 / 10^4) * 220MB = 4.4TB.
If my calculation looks correct, what are my options? I've read on different posts that Redis VM is no longer an option. Can I use a Redis cluster? This still appears to require too many servers to be practical. I understand I could switch to another DB, but I'd like that to be the last resort option.
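For reference, the estimate in the question can be reproduced in a few lines of Python. The inputs are exactly the figures quoted above (decimal units, 1 TB = 10^6 MB); the per-key figure is simply what those inputs imply and is not a measured value.

```python
TOTAL_KEYS = 2 * 10**8     # 200M keys, from the question
SAMPLE_KEYS = 10**4        # sample size quoted from the table
SAMPLE_MB = 220            # memory quoted for that sample

batches = TOTAL_KEYS // SAMPLE_KEYS          # number of 10k-key batches
total_mb = batches * SAMPLE_MB               # total memory in MB
total_tb = total_mb / 10**6                  # total memory in TB (decimal)

# Memory per key implied by the sample figure, in bytes.
implied_bytes_per_key = SAMPLE_MB * 10**6 / SAMPLE_KEYS

print(batches, total_mb, total_tb, implied_bytes_per_key)
```

The arithmetic gives 4.4 TB, matching the question. Note that the sample figure implies roughly 22 KB per key for values under 100 bytes, so it is worth re-checking the source table (or measuring a sample on a real instance) before sizing hardware from it.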
Firstly, using shared databases (i.e. the SELECT command) isn't a recommended practice since all of these databases are essentially managed by the same Redis process. It is preferable having 10 separate Redis processes (even on the same server) in order to avoid contention (more info here).
Next, there are ways to reduce the memory footprint of your database. You could, for example, perform client-side compression (see here) or consider other optimizations such as using Hashes to keep multiple values (as described here).
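One way to read the "use Hashes to keep multiple values" suggestion is to bucket many small keys into fewer hash keys. The Python sketch below only shows the key-to-bucket mapping and uses a dictionary as a stand-in for HSET/HGET; the "object:" prefix, the bucket size of 100 and the function names are assumptions for illustration.

```python
def bucket_for(key_id, bucket_size=100):
    """Map a numeric id to (hash_key, field) so each hash holds bucket_size fields."""
    return f"object:{key_id // bucket_size}", str(key_id % bucket_size)


class FakeHashes:
    """Hypothetical in-memory stand-in mimicking Redis HSET/HGET."""

    def __init__(self):
        self._hashes = {}

    def hset(self, hash_key, field, value):
        self._hashes.setdefault(hash_key, {})[field] = value

    def hget(self, hash_key, field):
        return self._hashes.get(hash_key, {}).get(field)

    def key_count(self):
        return len(self._hashes)


store = FakeHashes()
for key_id in range(1000):
    hash_key, field = bucket_for(key_id)
    store.hset(hash_key, field, f"value-{key_id}")

hash_key, field = bucket_for(1234567)   # -> hash "object:12345", field "67"
```

Here 1000 logical keys end up in 10 top-level keys. Whether this actually saves memory depends on the Redis configuration for small hashes, so it should be measured on real data.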
That said, a Redis server is ultimately bound by the amount of RAM that the host provides. Once you've reached that limit you'll need to shard your database and use a Redis cluster. Since you're already using multiple databases this shouldn't pose a big challenge as your code should already be compatible with that to a degree. Sharding can be done in one of three approaches: client, proxy or Redis Cluster. Client-side sharding can be implemented in your code or by the Redis client that you're using (if the client library that you're using supports that). Redis Cluster (v3) is expected to be released in the very near future and already has a stable release candidate. As for proxy-based sharding, there are several open source solutions out there, including Twitter's twemproxy, Netflix's dynomite and codis. Additional information about sharding and partitioning can be found here.
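A minimal sketch of the client-side option, in Python: hash each key and pick one of N servers. The host names are hypothetical, and the plain CRC32-modulo scheme is an assumption chosen for brevity; it is not how Redis Cluster or any of the proxies named above assign keys.

```python
import zlib

# Hypothetical shard addresses, one Redis process per entry.
SHARDS = [f"redis-{i}.example.internal:6379" for i in range(10)]


def shard_index(key, shard_count):
    """Deterministically map a key to a shard number in [0, shard_count)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def shard_for(key):
    return SHARDS[shard_index(key, len(SHARDS))]


target = shard_for("user:1001:profile")   # the same key always maps to the same shard
```

The weakness of plain modulo sharding is that changing the number of shards remaps most keys. Redis Cluster avoids this by mapping keys to a fixed number of hash slots and moving slots between nodes.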
Disclaimer: I work at Redis Labs. Lastly, AFAIK there's only one Redis-as-a-Service provider that already provides built-in support for clustering Redis. Redis Labs' Redis Cloud is a fully-managed service that can scale seamlessly to any required capacity. Our clusters support both the '{}' hashtag standard as well as sharding by RegEx - more about this can be found here.
You can use LMDB with Dynomite to store data beyond your memory capacity. LMDB uses both disk and memory to store data, and Dynomite makes LMDB distributed.
We have done a POC with this combo and they work nicely together.
For more information, please check out our open issue here:
https://github.com/Netflix/dynomite/issues/254

Using data from multiple redis databases in one command

At my current project I actively use redis for various purposes. There are 2 redis databases for current application:
The first one contains absolutely temporary data: how many users are online, who are online, various admin's counters. This db is cleared before the application starts by start-up script.
The second database is used for persistent data like user's ratings, user's friends, etc.
Everything seems to be correct and everybody is happy.
However, when I started implementing new functionality in my application, I discovered that I need to intersect a set of a user's friends with the set of online users. These sets are stored in different redis databases, and I haven't found any way to do this in redis, except changing the application architecture and moving all keys into one namespace (database).
Is there actually any way to perform some command in redis using data from multiple databases? Or maybe my use case of redis is wrong and I have to perform a fix of system architecture?
There is not. There is a command that makes it easy to move keys to another DB:
http://redis.io/commands/move
If you move all keys to one DB, make sure you don't have any key clashes! You could suffix or prefix the keys from the temp DB to make absolutely sure. MOVE will do nothing if the key already exists in the target DB, so make sure you act on a '0' reply.
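The consolidation described above can be sketched in Python as follows. FakeRedis is a hypothetical in-memory model that only imitates the reply semantics of MOVE (1 when moved, 0 when the key is missing or already exists in the target), plus RENAME and SINTER; the "tmp:" prefix and key names are assumptions for this example.

```python
class FakeRedis:
    """Hypothetical in-memory model of numbered Redis databases."""

    def __init__(self, db_count=2):
        self.dbs = [dict() for _ in range(db_count)]

    def rename(self, db, old_key, new_key):
        self.dbs[db][new_key] = self.dbs[db].pop(old_key)

    def move(self, key, src, dst):
        # Mirrors MOVE: does nothing and replies 0 on a missing key or a clash.
        if key not in self.dbs[src] or key in self.dbs[dst]:
            return 0
        self.dbs[dst][key] = self.dbs[src].pop(key)
        return 1

    def sinter(self, db, key_a, key_b):
        return self.dbs[db].get(key_a, set()) & self.dbs[db].get(key_b, set())


def consolidate(redis, src, dst, prefix):
    """Prefix every key in src, move it to dst, and report keys that did not move."""
    not_moved = []
    for key in list(redis.dbs[src]):
        new_key = prefix + key
        redis.rename(src, key, new_key)
        if redis.move(new_key, src, dst) == 0:   # act on the '0' reply
            not_moved.append(new_key)
    return not_moved


r = FakeRedis()
r.dbs[0]["online"] = {1, 2, 3}        # temporary data
r.dbs[1]["friends:7"] = {2, 3, 9}     # persistent data

not_moved = consolidate(r, src=0, dst=1, prefix="tmp:")
online_friends = r.sinter(1, "friends:7", "tmp:online")
```

Once both sets live in the same database, the intersection is a single operation, and the prefix keeps the temporary keys easy to identify and clear at start-up.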
Using multiple DBs is definitely not a good idea:
A Quote from Salvatore Sanfilippo (the creator of redis):
I understand how this can be useful, but unfortunately I consider
Redis multiple database errors my worst decision in Redis design at
all... without any kind of real gain, it makes the internals a lot
more complex. The reality is that databases don't scale well for a
number of reason, like active expire of keys and VM. If the DB
selection can be performed with a string I can see this feature being
used as a scalable O(1) dictionary layer, that instead it is not.
With DB numbers, with a default of a few DBs, we are communication
better what this feature is and how can be used I think. I hope that
at some point we can drop the multiple DBs support at all, but I think
it is probably too late as there is a number of people relying on this
feature for their work.
https://groups.google.com/forum/#!msg/redis-db/vS5wX8X4Cjg/8ounBXitG4sJ