Making sure (distributed) caches only store the latest value in a distributed system - redis

Let's say I want to use a built-in solution such as Redis or Memcached to cache database rows (as an example), to avoid recurrent costly trips to the database.
For the sake of the argument, let's assume I have a TABLE(id, x, y) and that I want to cache all rows so I never have to read directly from the database.
Questions:
1. Consider the following case: NodeA tries to update a given row's field x while NodeB tries to update y, and then both simultaneously try to update the cached row. If each node "manually" updates only the field it just changed in the cached row, then under the typical last-write-wins rule one of the two fields is going to be discarded, which is catastrophic. This makes me think I need to always fill the cache with a full row read from the database.
2. But this by itself won't necessarily help me. If NodeA writes to x and reads the entire row, and then NodeB writes to y and reads the entire row, then if NodeB writes to the cache before NodeA, NodeB's changes will be overwritten! This makes me believe I need to somehow version the rows both in the DB and in the cache. Is this the case? Memcached seems to have a compare-and-set primitive, but I see no such thing in Redis.
3. Even if 1. and 2. are not an issue, I still need to guarantee that my write/read has read-after-write consistency, otherwise what I'm reading and intending to put in the cache may not be the most up-to-date version. If that's the case, how can I make sure of it? By requiring w + r > n?
This seems to be a very common use case, so I'd guess it's pretty much a solved problem. What can I try in order to resolve this?

Key-value stores such as Redis support advanced data structures, such as hashes (HASH).
If you're doing partial updates to cached entities (only a subset of the fields is updated at a time), and given that your goal is to avoid time-consuming database reads, simply store the table row as a hash of field/value pairs (using HSET) and then use HGETALL for reading.
Redis operations are atomic by nature, so that should solve your problems, if I understood them correctly.
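For example, a minimal sketch with redis-py (the "row:42" key layout is just an illustration):

```python
import redis

r = redis.Redis()

# Cache the row TABLE(id, x, y) as a Redis hash, e.g. under "row:42".
r.hset("row:42", mapping={"x": 1, "y": 2})

# NodeA updates only x, NodeB updates only y; each HSET touches just its own
# field, so neither write can clobber the other node's change.
r.hset("row:42", "x", 10)   # NodeA
r.hset("row:42", "y", 20)   # NodeB

print(r.hgetall("row:42"))  # {b'x': b'10', b'y': b'20'}
```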
On a side note, if you're caching an entire entity yet doing partial updates, you should consider a simpler caching approach, such as read-through (making cache validity a reader-only concern).
Unlike database accesses, Redis cache accesses from different locations, unless somehow serialized, always have the potential of arriving out of order in a distributed system, since the execution environment (network, threading) can introduce arbitrary delays.
Doing read-through caching ensures the data is always refreshed after the most recent write, without the need to synchronize anything else.
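A minimal sketch of that reader-populates idea (strictly speaking a cache-aside variant), assuming a hypothetical "row:{id}" key layout and caller-supplied database helpers:

```python
import redis

r = redis.Redis()

def get_row(row_id, load_row_from_db):
    # Only readers ever fill the cache: on a miss, read the full row from the
    # database and store it; writers never put values into the cache.
    key = "row:%s" % row_id
    cached = r.hgetall(key)
    if cached:
        return cached
    row = load_row_from_db(row_id)   # e.g. returns {"x": 1, "y": 2}
    r.hset(key, mapping=row)
    return row

def update_row(row_id, fields, write_row_to_db):
    # Writers only invalidate; the next reader repopulates from the database,
    # so a stale partial update can never overwrite a newer cached row.
    write_row_to_db(row_id, fields)
    r.delete("row:%s" % row_id)
```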

This is how Facebook solved the issue with Memcached: http://nil.csail.mit.edu/6.824/2020/papers/memcache-faq.txt.
The idea is to use the concept of a lease: when a request for a cached value is received and there is no data for that key, a lease token (a 64-bit id) is returned.
When the web server fetches the data from the database, it can then store the data in the cache together with that token. Every time an invalidation request is issued for a key, a new lease token is created, so if a put is attempted with an old token, the put is rejected.
As far as I understand, it's not really possible to (easily) replicate this behavior in Redis without resorting to Lua scripts.
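If a Lua script is acceptable, a rough emulation could look like the sketch below. The "lease:{key}" naming, the token format and the single-outstanding-lease behavior are my own assumptions; Facebook's real design also rate-limits how often leases are handed out.

```python
import uuid
import redis

r = redis.Redis()

# Accept the put only if the caller still holds the current lease token.
PUT_IF_LEASED = r.register_script("""
if redis.call('GET', KEYS[2]) == ARGV[1] then
    redis.call('SET', KEYS[1], ARGV[2])
    redis.call('DEL', KEYS[2])
    return 1
end
return 0
""")

def read(key):
    value = r.get(key)
    if value is not None:
        return value, None
    # Cache miss: hand out a lease token that must be presented when filling the cache.
    token = uuid.uuid4().hex
    r.set("lease:" + key, token, ex=10)   # short TTL so abandoned leases expire
    return None, token

def fill(key, value, token):
    # Returns 1 if the put was accepted, 0 if the lease was rotated in the meantime.
    return PUT_IF_LEASED(keys=[key, "lease:" + key], args=[token, value])

def invalidate(key):
    # Rotating the lease token makes any in-flight put of stale data get rejected.
    r.delete(key)
    r.set("lease:" + key, uuid.uuid4().hex, ex=10)
```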

Related

Auto Syncing for Keys in Apache Geode

I have an Apache Geode setup, connected with an external Postgres datasource. I have a scenario where I define an expiration time for a key. Let's say after time T the key is going to expire. Is there a way so that keys which are about to expire can make a call to the external datasource and update the value in case it has changed? I want a kind of automatic syncing for my keys in Apache Geode. Is there any interface which I can implement to get the desired behavior?
I am not sure I fully understand your question. Are you saying that the values in the cache may possibly be more recent than what is currently stored in the database?
Whether you are using Look-Aside Caching, Inline Caching, or even Near Caching, Apache Geode combined with Spring would take care of ensuring the cache and database are kept in sync, to some extent depending on the caching pattern.
With Look-Aside Caching, if used properly, the database (i.e. primary System of Record (SOR), e.g. Postgres in your case) should always be the most current. (Look-Aside) Caching is secondary.
With Synchronous Inline Caching (using a CacheLoader/CacheWriter combination for Read/Write-Through) and in particular, with emphasis on the CacheWriter, during updates (e.g. Region.put(key, value) cache operations), the DB is written to first, before the entry is stored (or overwritten) in the cache. If the DB write fails, then the cache entry is not written or updated. This is true each time a value for a key is updated. If the key has not been updated recently, then the database should still reflect the most recent value. Once again, the database should always be the most current.
With Asynchronous Inline Caching (using AEQ + Listener, for Write-Behind), the updates for a cache entry are queued and asynchronously written to the DB. If an entry is updated, then Geode can guarantee that the value is eventually written to the underlying DB regardless of whether the key expires at some time later or not. You can persist and replay the queue in case of system failures, conflate events, and so on. In this case, the cache and DB are eventually consistent and it is assumed that you are aware of this, and this is acceptable for your application use case.
Of course, all of these caching patterns and scenarios I described above assume nothing else is modifying the SOR/database. If another external application or process is also modifying the database, separate from your Geode-based application, then it would be possible for Geode to become out-of-sync with the database and you would need to take steps to identify this situation. This is rather an issue for reads, not writes. Of course, you further need to make sure that stale cache entries do not subsequently overwrite the database on an update. This is easy enough to handle with optimistic locking. You could even trigger a cache entry remove on a DB update failure to have the cache refreshed on read.
Anyway, all of this is to say that if you applied one of the caching patterns above correctly, the value in the cache should already be reflected in the DB (or eventually will be, in the asynchronous write-behind caching case), even if the entry eventually expires.
Make sense?

Is the data always available with a Rename in Redis?

When I run a rename command, I think it does something like this:
1. Use the new name for the new data
2. Remove the reference to the old name
3. Remove the old data (this can take some time if it's large)
For clients accessing this data, is there ever a time when any of these happen?
1. The key does not exist
2. The data is not in a good state
3. Redis hangs during access
What steps are performed during a Redis rename command?
Since Redis executes commands on a single thread, the rename is atomic, so the answers to 1 and 2 are no. The "removing old data" step only applies if the destination key already points to a large structure that needs to be deleted (Redis will clobber it). The original data object will not be copied; only hash table entries pointing to it might be moved around, and since rehashing in Redis is incremental, this should essentially be constant time.
Redis will always "hang" on slow commands due to the single-threaded command execution. So for 3, the answer can always be yes depending on what you're doing, but in this case only if the rename triggers a significantly large implicit delete.
Edit: as of Redis 4.0 you can actually specify the config option lazyfree-lazy-server-del yes (default is no) and the server will actually delete asynchronously for side-effect deletes such as this. In other words, instead of delete blocking, the object will be queued for background deletion. This would effectively make RENAME constant time. See sample cfg: https://raw.githubusercontent.com/antirez/redis/4.0/redis.conf
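For instance, with redis-py the option could be flipped at runtime like this (a sketch; it assumes your Redis 4.0+ server allows changing this option via CONFIG SET, otherwise put it in redis.conf):

```python
import redis

r = redis.Redis()

# Redis 4.0+: free objects displaced by side effects (such as RENAME clobbering
# an existing destination key) asynchronously instead of blocking the event loop.
r.config_set("lazyfree-lazy-server-del", "yes")

r.set("dst", "some very large value")
r.set("src", "fresh value")
r.rename("src", "dst")   # atomic for clients; the old "dst" object is freed lazily
```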

Safely setting keys with StackExchange.Redis while allowing deletes

I am trying to use Redis as a cache that sits in front of an SQL database. At a high level I want to implement these operations:
Read value from Redis, if it's not there then generate the value via querying SQL, and push it in to Redis so we don't have to compute that again.
Write value to Redis, because we just made some change to our SQL database and we know that we might have already cached it and it's now invalid.
Delete value, because we know the value in Redis is now stale, we suspect nobody will want it, but it's too much work to recompute now. We're OK letting the next client who does operation #1 compute it again.
My challenge is understanding how to implement #1 and #3 if I attempt to do it with StackExchange.Redis. If I naively implement #1 with a simple read of the key and a push, it's entirely possible that between me computing the value from SQL and pushing it in, any number of other SQL operations may have happened and also tried to push their values into Redis via #2 or #3. For example, consider this ordering:
Client #1 wants to do operation #1 [Read] from above. It tries to read the key, sees it's not there.
Client #1 calls to SQL database to generate the value.
Client #2 does something to SQL and then does operation #2 [Write] above. It pushes some newly computed value into Redis.
Client #3 comes along, does some other operation in SQL, and wants to do operation #3 [Delete] to Redis, knowing that if there's something cached there, it's no longer valid.
Client #1 pushes its (now stale) value to Redis.
So how do I implement my operation #1? Redis offers a WATCH primitive that makes this fairly easy to do against the bare metal, where I would be able to observe that other things happened on the key from Client #1, but it's not supported by StackExchange.Redis because of how it multiplexes commands. Its conditional operations aren't quite sufficient here, since if I try saying "push only if the key doesn't exist", that doesn't prevent the race I explained above. Is there a pattern/best practice that is used here? This seems like a fairly common pattern that people would want to implement.
One idea I do have is I can use a separate key that gets incremented each time I do some operation on the main key and then can use StackExchange.Redis' conditional operations that way, but that seems kludgy.
This looks like a question about the right cache invalidation strategy rather than a question about Redis. Why I think so: Redis WATCH/MULTI is a kind of optimistic locking strategy, and this kind of locking is not suitable for most caching cases, where the expensive DB read query is exactly the problem the cache is meant to solve. In your operation #3 description you write:
It's too much work to recompute now. We're OK letting the next client who does operation #1 compute it again.
So we can continue with the read-update case as the update strategy. Here are some more questions before we continue:
What happens when 2 clients start to perform operation #1? Both of them may fail to find the value in Redis, both perform the SQL query, and then both write the result to Redis. So we need a guarantee that only one client updates the cache.
How can we be sure of the right sequence of writes (operation #3)?
Why not optimistic locking
Optimistic concurrency control assumes that multiple transactions can frequently complete without interfering with each other. While running, transactions use data resources without acquiring locks on those resources. Before committing, each transaction verifies that no other transaction has modified the data it has read. If the check reveals conflicting modifications, the committing transaction rolls back and can be restarted.
You can read about the OCC transaction phases on Wikipedia, but in a few words:
If there is no conflict, you update your data. If there is a conflict, you resolve it, typically by aborting the transaction and restarting it if you still need to update the data.
Redis WATCH/MULTI is a kind of optimistic locking, so it can't help you here - you do not know whether your cache key was modified by someone else before you started to work with it.
What works?
Every time you hear somebody talk about locking, a few words later you will hear about compromises: performance, and consistency vs. availability. The last pair is the most important.
In most high-load systems, availability is the winner. What does that mean for caching? Usually something like this:
Each cache key holds some metadata alongside the value - state, version and lifetime. The lifetime here is not the Redis TTL - usually, if your key should stay in the cache for time X, the lifetime stored in the metadata is X + Y, where Y is some extra time to guarantee the update process can complete.
You never delete a key directly - you just update its state or lifetime.
Each time your application reads data from the cache it should make a decision: if the data has state "valid", use it. If the data has state "invalid", try to update it or use the obsolete data.
How to update on read (the important part is this "hand-made" mix of optimistic and pessimistic locking; a sketch follows after the invalidation steps below):
Try to take a pessimistic lock (in Redis, e.g. SET with the NX and EX options).
If that failed, return the obsolete data (remember, we still need availability).
If it succeeded, perform the SQL query and write the result into the cache.
Read the version from Redis again and compare it with the version read previously.
If the version is the same, mark the state as "valid".
Release the lock.
How to invalidate (your operations #2, #3):
Increment the cache version and set the state to "invalid".
Update the lifetime/TTL if needed.
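A minimal sketch of the scheme above with redis-py. The "cache:*"/"lock:*" key names, the metadata field names and the load_from_sql callback are my own assumptions, and the lock release is simplified (a production version would check-and-delete atomically, e.g. with a small Lua script):

```python
import uuid
import redis

r = redis.Redis()
LOCK_TTL = 30   # seconds the refresh lock is held at most

def read(key, load_from_sql):
    entry = r.hgetall("cache:" + key)
    if entry and entry.get(b"state") == b"valid":
        return entry[b"value"]

    # Entry is missing or invalid: try to take the pessimistic refresh lock.
    token = uuid.uuid4().hex
    if not r.set("lock:" + key, token, nx=True, ex=LOCK_TTL):
        # Somebody else is already refreshing: serve the obsolete value (availability wins).
        return entry.get(b"value") if entry else None

    try:
        version_before = int(entry.get(b"version", 0)) if entry else 0
        value = load_from_sql(key)                       # the expensive DB query
        r.hset("cache:" + key, "value", value)
        # Re-read the version: if an invalidation bumped it meanwhile, leave the state "invalid".
        version_after = int(r.hget("cache:" + key, "version") or 0)
        if version_after == version_before:
            r.hset("cache:" + key, "state", "valid")
        return value
    finally:
        if r.get("lock:" + key) == token.encode():       # simplified, non-atomic release
            r.delete("lock:" + key)

def invalidate(key):
    # Operations #2 / #3 from the question: bump the version and mark the entry invalid.
    pipe = r.pipeline()
    pipe.hincrby("cache:" + key, "version", 1)
    pipe.hset("cache:" + key, "state", "invalid")
    pipe.execute()
```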
Why so difficult
We can always get and return a value from the cache and rarely have a real cache miss. So we avoid the cache invalidation cascade hell where many processes try to update one key at the same time.
We still have ordered key updates.
Only one process at a time can update a key.
I have a queue!
Sorry, you had not said so before - otherwise I would not have written all of the above. If you have a queue, everything becomes simpler (see the sketch below):
Each modification operation should push a job to the queue.
Only the async worker should execute the SQL and update the key.
You still need the "state" (valid/invalid) on the cache key to separate the application logic from the cache.
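A small sketch of that queue-based variant, using a plain Redis list as the queue (the queue name, job format and "cache:*" keys are assumptions):

```python
import json
import redis

r = redis.Redis()
QUEUE = "cache-update-jobs"

def enqueue_update(key):
    # Every modification pushes a job instead of writing the cache itself;
    # readers keep serving the old value while the entry is marked invalid.
    r.hset("cache:" + key, "state", "invalid")
    r.lpush(QUEUE, json.dumps({"key": key}))

def worker(load_from_sql):
    # The single async worker is the only writer of cache values, so updates stay ordered.
    while True:
        _, raw = r.brpop(QUEUE)              # blocks until a job arrives
        key = json.loads(raw)["key"]
        value = load_from_sql(key)
        r.hset("cache:" + key, mapping={"value": value, "state": "valid"})
```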
Is this the answer?
Actually, yes and no at the same time. This is one of the possible solutions. Cache invalidation is a much more complex problem with many possible solutions - some of them simple, others complex. In most cases it depends on the real business requirements of the concrete application.

HA Database configuration that avoids split-brain issues?

I am looking for a (SQL/RDB) database setup that works something like this:
I will have 3+ databases in an active/active/active configuration
prior to doing any insert, the database will communicate with at least a majority of the others, such that they all either insert at the same time or roll back (transaction)
this way I can write and read from any of the databases, and always get the same results (as long as the field wasn't updated very recently)
note: this is for a use case that will be very read-heavy and have few writes (and delay on the writes is an OK situation)
does anything like this exist? I see all sorts of solutions with database HA configurations, but most of them suggest writing to a primary node or having a passive backup
alternatively I could setup a custom application, and have each application talk to exactly 1 database, and achieve a similar result, but I was hoping something similar would already exist
So my question is: does something like this exist? If not, are there any technical/architectural reasons why not?
P.S. - I will NOT be using a SAN where all databases can store/access the same data
edit: more clarifications as far as what I am looking for:
1. I have no database picked out yet, but I am more familiar with MySQL / SQL Server / Oracle, so I would have a minor inclination towards one of those
2. If a majority of the nodes are down (or a single node can't communicate with the collective), then I expect all writes from that node to fail, and accept that it may provide old/outdated information
failure / recover scenario expectations:
1. A node goes down: it will query and get updates from the other nodes when it comes back up
2. A node loses connection with the collective: it will provide potentially old data to read request, and refuse any writes
3. A node is in disagreement with the data stored in the others: majority rule
4. Majority rule does not work: go with whoever has the latest data (although this really shouldn't happen)
5. The entries are insert/update/read only, i.e. there will be no deletes (except manually ofc), so I shouldn't need to worry about an update after a delete, however in that case I would choose to delete the record and ignore the update
6. Any other scenarios I missed?
update: the closest thing I see to what I am looking for seems to be using a quorum + 2 DBs, however I would prefer if I could have 3 DBs instead, such that I can query any of them at any time (to further distribute the reads, and also to keep another copy of the data)
You need to define "very recently". In most environments with replication for inserts, all the databases will have the same data within a few seconds of an insert (and a few seconds seems pessimistic).
An alternative approach is a "read-from-one/write-to-all" approach. In this case, reads are spread through the system. Writes are then sent to all nodes by the application (or a common layer that the application uses).
Remember, though, that the issue with database replication is not how it works when it works. The issue is how it recovers when it fails and even how failures are identified. You need to decide what happens when nodes go down, how they recover lost transactions, how you decide that nodes are really synchronized. I would suggest that you peruse the documentation of the database that you are actually using and understand the replication mechanisms provided by that platform.

Redis and Object Versioning

How are people coping with changes to redis object schemas - adding or removing properties from objects?
Sharing from my own experience (a one-year-old project with thousands of user requests per second).
Usually, there were three scenarios for me:
1. Add new information to existing structures (like an "email" field for a user)
2. Remove or change existing values in existing structures (like changing the format of some field)
3. Drop stuff from the database
For 1 I follow a simple strategy: degrade gracefully, e.g. if a user doesn't have an email record, treat it as an empty email. This has worked every time.
For 2 and 3 it depends on whether the data can be changed/calculated/fixed before the release or only after. I run a job on the database that does all the work for me; for a few million keys it takes considerable time (minutes). If that job can only be run after I release the new code, then degrading gracefully helps a lot: I simply release the code and then run the job.
PS: If you touch a lot of keys in Redis then it is very important to use pipelining (http://redis.io/topics/pipelining). It saves a lot of time.
Take a list of all affected keys or records (i.e. the ones you want to fix in any way) and fetch them in a pipeline.
Do whatever you want with them. If possible, queue the writing operations into a pipeline too.
Send the queued operations to Redis.
It is also very important to maintain indexes of your structures. I keep sets with ids, and then simply iterate over SMEMBERS(set_with_ids).
That is much, much better than iterating with the KEYS command.
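A short sketch of such a backfill job with redis-py (the "users" index set, the "user:{id}" hash layout and the new "email" field are hypothetical):

```python
import redis

r = redis.Redis()

# The index set "users" holds every user id; each user lives in a hash at "user:{id}".
user_ids = [uid.decode() for uid in r.smembers("users")]   # far cheaper than KEYS

# Read the field we want to backfill for all users in one round trip.
pipe = r.pipeline()
for uid in user_ids:
    pipe.hget("user:%s" % uid, "email")
emails = pipe.execute()

# Queue the fixes into a pipeline as well and send them in one go.
pipe = r.pipeline()
for uid, email in zip(user_ids, emails):
    if email is None:                      # degrade-gracefully default for scenario 1
        pipe.hset("user:%s" % uid, "email", "")
pipe.execute()
```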
For extremely simple versioning, you could use different database numbers. This could be quite limiting in cases where almost everything is the same between two versions but it's also a very clean way to do it if it will work for you.