Possibility of high transaction failure rate using WATCH, MULTI, EXEC in Redis

I will be storing key-value pairs in Redis, but the number of keys will be just 4. Since multiple processes will be updating the values in parallel, I plan to use Redis transactions with the WATCH, MULTI and EXEC commands.
My algorithm is something like this:
WATCH key
GET key
MULTI
SET key new_val
EXEC
My main concern is that, since WATCH uses optimistic locking, when I have many processes (far more than the number of keys, which is only 4) trying to update values, the transaction failure rate will be very high.
Is this correct? Is there a way to prevent this?

It is unclear why you'd need a transaction here unless you're doing something in your application with the reply for GET key. I'll therefore assume that you are using the value for something meaningful, otherwise, you can drop the transaction semantics and just call SET key new_val.
Optimistic locking is mainly intended for cases where there is low contention for the resources. Since the use case that you're describing is clearly the opposite, it would probably result in high failure rates. This isn't to say that Redis and your application will not work, but it does mean there is a potential for a lot of wasted effort.
I would advise that you consider switching if possible to using Redis' server-side Lua scripts. These are blocking, atomic, and let you read and meaningfully manipulate the data in Redis programmatically. See the EVAL command for details.
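To see why contention hurts optimistic locking, here is a minimal in-memory sketch in plain Python (no Redis involved). `VersionedStore` and `simulate_contention` are hypothetical names invented for this illustration; the version check stands in for what WATCH/EXEC does, and the lock-step rounds are a simplifying assumption about how the processes are scheduled.

```python
class VersionedStore:
    """In-memory stand-in for a WATCHed key: every successful write bumps a version."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def read(self, key):
        return self._data.get(key, (0, 0))

    def compare_and_set(self, key, expected_version, new_value):
        _, version = self.read(key)
        if version != expected_version:
            return False  # someone else wrote first, like EXEC aborting
        self._data[key] = (new_value, version + 1)
        return True


def simulate_contention(num_workers, key="k"):
    """Each worker increments `key` once; all pending workers read in the same round."""
    store = VersionedStore()
    pending = num_workers
    attempts = 0
    while pending:
        # Every pending worker takes the same snapshot (the WATCH + GET step).
        snapshots = [store.read(key) for _ in range(pending)]
        for value, version in snapshots:
            attempts += 1
            # Only the first commit of the round sees an unchanged version.
            if store.compare_and_set(key, version, value + 1):
                pending -= 1
    return attempts, store.read(key)[0]
```

Under these assumptions, `simulate_contention(4)` returns `(10, 4)`: ten attempts for four successful updates. The wasted attempts grow quickly with the number of workers per key, which is the effort an atomic server-side script avoids.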

Related

Redis: Using lua and concurrent transactions

Two issues
Do lua scripts really solve all cases for redis transactions?
What are best practices for asynchronous transactions from one client?
Let me explain, first issue
Redis transactions are limited, with an inability to unwatch specific keys, and all keys being unwatched upon exec; we are limited to a single ongoing transaction on a given client.
I've seen threads where many redis users claim that lua scripts are all they need. Even the redis official docs state they may remove transactions in favour of lua scripts. However, there are cases where this is insufficient, such as the most standard case: using redis as a cache.
Let's say we want to cache some data from a persistent data store, in redis. Here's a quick process:
Check cache -> miss
Load data from database
Store in redis
However, what if, between step 2 (loading data), and step 3 (storing in redis) the data is updated by another client?
The data stored in redis would be stale. So... we use a redis transaction right? We watch the key before loading from db, and if the key is updated somewhere else before storage, storage would fail. Great! However, within an atomic lua script, we cannot load data from an external database, so lua cannot be used here. Hopefully I'm simply missing something, or there is something wrong with our process.
Moving on to the 2nd issue (asynchronous transactions)
Let's say we have a socket.io cluster which processes various messages, and requests for a game, for high speed communication between server and client. This cluster is written in node.js with appropriate use of promises and asynchronous concepts.
Say two requests hit a server in our cluster, which require data to be loaded and cached in redis. Using our transaction from above, multiple keys could be watched, and multiple multi->exec transactions would run in overlapping order on one redis connection. Once the first exec is run, all watched keys will be unwatched, even if the other transaction is still running. This may allow the second transaction to succeed when it should have failed.
These overlaps could happen in totally separate requests happening on the same server, or even sometimes in the same request if multiple data types need to load at the same time.
What is best practice here? Do we need to create a separate redis connection for every individual transaction? It seems like we would lose a lot of speed, and we would see many connections created from just one server if this is the case.
As an alternative we could use redlock / mutex locking instead of redis transactions, but this is slow by comparison.
Any help appreciated!

I have received the following, after my query was escalated to redis engineers:
Hi Jeremy,
Your method using multiple backend connections would be the expected way to handle the problem. We do not see anything wrong with multiple backend connections, each using an optimistic Redis transaction (WATCH/MULTI/EXEC) - there is no chance that the “second transaction will succeed where it should have failed”.
Using LUA is not a good fit for this problem.
Best Regards,
The Redis Labs Team
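The difference between one shared connection and one connection per transaction can be illustrated with a small in-memory model. This is only a sketch: `MiniRedis`, `Connection`, and `stale_write_succeeds` are hypothetical names, and the model keeps just the two WATCH rules that matter here (EXEC aborts if a watched key changed, and EXEC always clears every watch on its connection).

```python
class MiniRedis:
    """Toy server: stores values and a per-key modification counter."""

    def __init__(self):
        self.data = {}
        self.versions = {}

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1


class Connection:
    """Toy connection: watches are tracked per connection, as in Redis."""

    def __init__(self, server):
        self.server = server
        self.watched = {}  # key -> version seen at WATCH time

    def watch(self, key):
        self.watched[key] = self.server.versions.get(key, 0)

    def exec_set(self, key, value):
        """Models MULTI / SET key value / EXEC."""
        dirty = any(
            self.server.versions.get(k, 0) != seen
            for k, seen in self.watched.items()
        )
        self.watched.clear()  # EXEC unwatches everything on this connection
        if dirty:
            return False
        self.server.set(key, value)
        return True


def stale_write_succeeds(shared_connection):
    server = MiniRedis()
    conn_a = Connection(server)
    conn_b = conn_a if shared_connection else Connection(server)
    other_client = Connection(server)

    conn_a.watch("user:1")                    # request A starts loading
    conn_b.watch("user:2")                    # request B starts loading
    conn_a.exec_set("user:1", "a-from-db")    # A commits, clearing its connection's watches
    other_client.exec_set("user:2", "fresh")  # another client updates B's key
    return conn_b.exec_set("user:2", "b-stale")  # B tries to store stale data
```

In this model `stale_write_succeeds(True)` is `True` (the stale write slips through on the shared connection), while `stale_write_succeeds(False)` is `False`, which matches the advice to give each optimistic transaction its own connection.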

Safely setting keys with StackExchange.Redis while allowing deletes

I am trying to use Redis as a cache that sits in front of an SQL database. At a high level I want to implement these operations:
1. Read value from Redis; if it's not there, generate the value by querying SQL and push it into Redis so we don't have to compute it again.
2. Write value to Redis, because we just made some change to our SQL database and we know that we might have already cached it and it's now invalid.
3. Delete value, because we know the value in Redis is now stale, we suspect nobody will want it, but it's too much work to recompute now. We're OK letting the next client who does operation #1 compute it again.
My challenge is understanding how to implement #1 and #3, if I attempt to do it with StackExchange.Redis. If I naively implement #1 with a simple read of the key and push, it's entirely possible that between me computing the value from SQL and pushing it in that any number of other SQL operations may have happened and also tried to push their values into Redis via #2 or #3. For example, consider this ordering:
Client #1 wants to do operation #1 [Read] from above. It tries to read the key, sees it's not there.
Client #1 calls to SQL database to generate the value.
Client #2 does something to SQL and then does operation #2 [Write] above. It pushes some newly computed value into Redis.
Client #3 comes along, does some other operation in SQL, and wants to do operation #3 [Delete] to Redis knowing that if there's something cached there, it's no longer valid.
Client #1 pushes its (now stale) value to Redis.
So how do I implement my operation #1? Redis offers a WATCH primitive that makes this fairly easy to do against the bare metal, where I would be able to observe that other things happened on the key from Client #1, but it's not supported by StackExchange.Redis because of how it multiplexes commands. Its conditional operations aren't quite sufficient here, since if I try saying "push only if key doesn't exist", that doesn't prevent the race I explained above. Is there a pattern/best practice that is used here? This seems like a fairly common pattern that people would want to implement.
One idea I do have is I can use a separate key that gets incremented each time I do some operation on the main key and then can use StackExchange.Redis' conditional operations that way, but that seems kludgy.

This looks like a question about the right cache invalidation strategy rather than a question about Redis. Here is why I think so: Redis WATCH/MULTI is a kind of optimistic locking strategy, and this kind of locking is not suitable for most caching cases, where the slow DB read query is the very problem the cache is there to solve. In your description of operation #3 you write:
It's too much work to recompute now. We're OK letting the next client who does operation #1 compute it again.
So we can continue with update-on-read as the update strategy. Here are some more questions before we continue:
What happens when two clients start to perform operation #1? Both of them may fail to find the value in Redis, perform the SQL query, and then both write it to Redis. So should we have guarantees that just one client updates the cache?
How can we be sure of the right sequence of writes (operation #3)?
Why not optimistic locking
Optimistic concurrency control assumes that multiple transactions can frequently complete without interfering with each other. While running, transactions use data resources without acquiring locks on those resources. Before committing, each transaction verifies that no other transaction has modified the data it has read. If the check reveals conflicting modifications, the committing transaction rolls back and can be restarted.
You can read about the phases of OCC transactions on Wikipedia, but in a few words:
If there is no conflict, you update your data. If there is a conflict, you resolve it, typically by aborting the transaction and restarting it if you still need to update the data.
Redis WATCH/MULTI is a kind of optimistic locking, so it can't help you: you do not know that your cache key was modified until you try to work with it.
What works?
Each time you hear somebody talk about locking, after a few words you hear about compromises, performance, and consistency vs. availability. The last pair is the most important.
In most high-load systems availability is the winner. What does this mean for caching? Usually the following:
Each cache key holds some metadata about the value: state, version, and lifetime. The last one is not the Redis TTL. Usually, if your key should be in the cache for X time, the lifetime in the metadata is X + Y, where Y is some extra time to guarantee the update process.
You never delete a key directly; you just update its state or lifetime.
Each time your application reads data from the cache it should make a decision: if the data has state "valid", use it. If the data has state "invalid", try to update it or use the obsolete data.
How to update on read (the important part is this "hand-made" mix of optimistic and pessimistic locking):
Try to set a pessimistic lock (in Redis, with SET using the NX and EX options, so that the lock is only taken if it is free and also expires).
If that fails, return the obsolete data (remember, we still need availability).
If it succeeds, perform the SQL query and write the result into the cache.
Read the version from Redis again and compare it with the version read previously.
If the version is the same, mark the state as "valid".
Release lock.
How to invalidate (your operations #2, #3):
Increment cache version and set state "invalid".
Update the lifetime/TTL if needed.
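The update-on-read and invalidate steps above can be sketched with an in-memory stand-in. This is an illustration under assumptions, not a Redis implementation: `VersionedCache` is a hypothetical class, a plain Python set plays the role of the lock key, and the lifetime metadata is left out to keep the sketch short.

```python
class VersionedCache:
    """In-memory stand-in for cache entries that carry state and version metadata."""

    def __init__(self):
        self.entries = {}    # key -> {"value", "state", "version"}
        self.locked = set()  # keys being refreshed; stands in for a lock key in Redis

    def _entry(self, key):
        return self.entries.setdefault(
            key, {"value": None, "state": "invalid", "version": 0}
        )

    def invalidate(self, key):
        """Operations #2/#3: never delete, just bump the version and flag the entry."""
        entry = self._entry(key)
        entry["version"] += 1
        entry["state"] = "invalid"

    def read(self, key, load_from_db):
        entry = self._entry(key)
        if entry["state"] == "valid":
            return entry["value"]
        if key in self.locked:
            return entry["value"]  # someone else is refreshing: serve obsolete data
        self.locked.add(key)       # pessimistic lock
        try:
            version_before = entry["version"]
            entry["value"] = load_from_db()
            # Optimistic check: only trust the value if nobody invalidated meanwhile.
            if entry["version"] == version_before:
                entry["state"] = "valid"
            return entry["value"]
        finally:
            self.locked.discard(key)  # release the lock even if the load raised
```

If an invalidation arrives while `load_from_db` is running, the freshly loaded value is still returned to the caller, but the entry stays "invalid", so the next reader refreshes it again instead of trusting possibly stale data.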
Why so difficult
We can always get and return a value from the cache and rarely have a cache miss. So we avoid the cache invalidation cascade hell where many processes try to update one key.
We still have ordered key updates.
Only one process at a time can update a key.
I have a queue!
Sorry, you did not say so before, or I would not have written all of this. If you have a queue, everything becomes simpler:
Each modification operation should push a job to the queue.
Only an async worker should execute SQL and update the key.
You still need to use "state" (valid/invalid) for the cache key to separate application logic from the cache.
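A rough sketch of the queue variant, using Python's standard `queue` module and a plain dict in place of Redis. The function names `request_refresh` and `drain_jobs` are made up for this example, and a real worker would run asynchronously rather than being drained on demand.

```python
from queue import Empty, Queue


def request_refresh(jobs, cache, key):
    """Called by any code that modifies the database: flag the entry, enqueue a job."""
    entry = cache.setdefault(key, {"value": None, "state": "invalid"})
    entry["state"] = "invalid"
    jobs.put(key)


def drain_jobs(jobs, cache, load_from_db):
    """The single worker: the only place that runs the query and writes the key."""
    processed = 0
    while True:
        try:
            key = jobs.get_nowait()
        except Empty:
            return processed
        cache[key] = {"value": load_from_db(key), "state": "valid"}
        processed += 1
```

Because only the worker writes values, updates to a key happen in queue order, and readers only need to look at the "state" field to decide whether to use the cached value.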
Is this the answer?
Actually, yes and no at the same time. This is one of the possible solutions. Cache invalidation is a very complex problem with many possible solutions: some simple, others complex. In most cases it depends on the real business requirements of the concrete application.

Acquiring Locks when updating a Redis key/value

I'm using AcquireLock method from ServiceStack Redis when updating and getting the key/value like this:
public virtual void Set(string key, T entity)
{
    using (var client = ClientManager.GetClient())
    {
        using (client.AcquireLock(key + ":locked", DefaultLockingTimeout, DefaultLockExpire))
        {
            client.Set(key, entity);
        }
    }
}
I've extended the AcquireLock method to accept an extra parameter for the expiration of the lock key. So I'm wondering whether I need AcquireLock at all. My class uses AcquireLock in every operation, like Get<>, GetAll<>, ExpireAt, SetAll<>, etc.
But this approach doesn't work every time. For example, if the operation inside the lock throws an exception, then the key remains locked. For this situation I've added the DefaultLockExpire parameter to the AcquireLock method to expire the "locked" key.
Is there a better solution? And when do we need to acquire locks, like "lock" blocks in multi-threaded programming?

As The Real Bill's answer says, you don't need locks for Redis itself. What the ServiceStack client offers in terms of locking is not for Redis, but for your application. In a C# application, you can lock things locally with lock(obj) so that something cannot happen concurrently (only one thread can access the locked section at a time), but that only works if you have one webserver. If you want to prevent something from happening concurrently across servers, you need a locking mechanism living outside of the webserver. Redis is a good fit for this.
We have a case where it is checked if a customer has a shopping cart already and if not, create it. Between checking and creating it, there's a time where another request could have also found out that cart doesn't exist and might also proceed to create one. That's a classical case for locking but a simple lock wouldn't work here as the request may have arrived from an entirely different web-server. So for this, we use the ServiceStack Redis client (with some abstraction) to lock using Redis and only allow one request at a time to enter the "create a cart" section.
So to answer your actual question: no, you don't need a lock for getting/setting values to Redis.
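For the application-level case (the "create a cart" section), the shape of such a lock can be sketched as follows. This is an in-memory model in Python, not the ServiceStack API: `LockStore`, `locked`, and `get_or_create_cart` are hypothetical names, and `try_acquire` only imitates the behaviour of a Redis `SET key token NX` with an expiry.

```python
import time
import uuid
from contextlib import contextmanager


class LockStore:
    """In-memory stand-in for lock keys set with NX and an expiry."""

    def __init__(self, clock=time.monotonic):
        self._locks = {}  # key -> (token, expires_at)
        self._clock = clock

    def try_acquire(self, key, ttl_seconds):
        now = self._clock()
        holder = self._locks.get(key)
        if holder and holder[1] > now:
            return None  # still held and not yet expired
        token = uuid.uuid4().hex
        self._locks[key] = (token, now + ttl_seconds)
        return token

    def release(self, key, token):
        holder = self._locks.get(key)
        if holder and holder[0] == token:  # only the owner may release
            del self._locks[key]


@contextmanager
def locked(store, key, ttl_seconds=5.0):
    token = store.try_acquire(key, ttl_seconds)
    if token is None:
        raise TimeoutError(f"lock {key!r} is held by someone else")
    try:
        yield
    finally:
        store.release(key, token)  # runs even if the body raises


def get_or_create_cart(store, carts, customer_id):
    """Check-then-create, guarded so only one request at a time enters the section."""
    with locked(store, f"cart:{customer_id}:lock"):
        if customer_id not in carts:
            carts[customer_id] = []
        return carts[customer_id]
```

The `finally` clause covers the case where the guarded code throws, and the expiry covers the case where the process dies before it can release the lock at all.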

I wouldn't use locks for get/set operations. Redis will do those actions atomically, so there is no chance of it getting "changed underneath you" when setting or getting. I've built systems where hundreds of clients are updating/operating on values concurrently and never needed a lock to do those operations (especially an expire).
I don't know how Service Stack redis implements the locking it has so I can't say why it is failing. However, I'm not sure I'd trust it given there is no true locking needed on the Redis side for data operations. Redis is single-threaded so locking there doesn't make sense.
If you are doing complex operations where you get a value, operate on things based on it, then update it after a while and can't have the value change in the meantime, I'd recommend reading and grokking http://redis.io/topics/transactions to see whether what you want is what Redis is good for, whether your code needs to be refactored to eliminate the problem, or at the least to find a better way to do it.
For example, SETNX may be the route you need to get what you want, but without details I can't say it will work.

As @JulianR says, the locking in ServiceStack.Redis is only for application-level distributed locks (i.e. to replace using a DB or an empty .lock file on a distributed file system), and it only works against other ServiceStack.Redis clients in other processes using the same key/API to acquire the lock.
You would never need to do this for normal Redis operations since they're all atomic. If you want to ensure that a combination of Redis operations happens atomically, then you would combine them within a Redis transaction, or alternatively you can execute them within a server-side Lua script. Both allow atomic execution of batch operations.

Locking and Redis

We have 75 (and growing) servers that need to share data via Redis. All 75 servers would ideally want to write to two fields in Redis with INCRBYFLOAT operations. We anticipate eventually having potentially millions of daily write operations and billions of daily reads on these two fields. This data must be persistent.
We're concerned that Redis locking might cause write operations to be repeatedly retried with many simultaneous attempts to increment the same field.
Questions:
Is multiple, simultaneous INCRBYFLOAT on a single field a bad idea under a very heavy load?
Should we have an external process "summarize" separate fields and write the two fields instead? (this introduces another failure point)
Will reads on those two fields block while writing?

Redis does not lock. Also, it is single threaded; so there are no race conditions. Reads or Writes do not block.
You can run millions of INCRBYFLOAT on the same key without any problems. No need for external processes. Reading those fields does not pose any problems.
That said, "Millions of updates to two keys" sounds strange. If you can explain your use case, perhaps there might be a better way to handle it within Redis.

Since Redis is single threaded, you will probably want to use master-slave replication to separate writes from reads, since yes, writes will generally block reads.
Alternatively you can consider using Apache Zookeeper for this, it provides reliable cluster coordination without single points of failure (like single Redis instance).

Recover from SQL batch-abort errors inside a transaction? Alternative?

I'm looking for a way to continue execution of a transaction despite errors while inserting low-priority data. It seems like real nested transactions could be a solution, but they aren't supported by SQL Server 2005/2008. Another solution would be to have logic to decide whether an error is critical or not, but it would seem that's not possible either.
Here's more detail on my scenario:
Data is periodically inserted in the database using ADO.NET/C#, and while some of it is vital, some could also be missing without problems. When the inserts are done, some computations are made on the data (both vital and non-vital). This whole process is inside a transaction so everything remains in sync.
Currently, transaction save points are used, and partial rollbacks are made on exceptions which occur during non-vital inserts. However, this doesn't work for "batch-abort" errors, which automatically roll back the entire transaction. I understand some errors are critical, but things like failed casts are considered by SQL Server to be batch-abort errors. (Info on batch errors) I'm trying to prevent these errors from bringing down the whole insert if they occur on low-priority data.
If what I'm describing isn't possible, I'm willing to consider any alternative way to achieve data integrity but allow the failure of the non-vital inserts.
Thanks for your help.

Unfortunately, can't be done as you describe (full support for nested transactions would be key here). Couple things I can think of that have been used to get around this in the past:
Best option would probably be to separate the commands into important/non-important commands that could be executed distinctly, naturally this would require that they not be order-dependent on each other
Could also use a messaging based approach (see Service Broker) where you would execute the primary commands inline and push the non-primary commands onto a queue for execution later/separately. The push to the queue would be transactional within the batch, but the execution of the command when you pop off the queue would be separate. This too would require they not be order-dependent on each other.
If order-dependent, you could use the messaging approach for everything, which would ensure order and could have separate messages per operation, then grouping them together (via conversation groups) would allow you to pull them off the queue in order as well and use separate transactions for each 'type' of operation (i.e. primary vs. non-primary). This would require some special coding on your part if all the grouped messages must be a single autonomous operation, but could be done.
I hesitate to even mention this option, because it is a terrible option, but for full disclosure I suppose you could consider it at your discretion if you think it fits (but it is definitely not an architecture that would apply to almost any scenario). You could use xp_cmdshell to call out to the command line and execute sqlcmd/osql for the non-critical tasks - this sqlcmd execution would be in a separate transaction from the module you are executing from, and simply ignoring the xp_cmdshell failure should allow the primary batch to continue.
Those are some ideas...

Can you do your import into a temporary location, using transactions only for the important parts? Once the temp location is loaded, having absorbed any non-critical errors, you can copy the data into its final destination in a single transaction. It depends on the nature of the work you are doing, but it is potentially a viable option.
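The staging idea can be sketched as follows. Note the assumptions: this uses Python's built-in `sqlite3` rather than SQL Server and ADO.NET, the "temporary location" is just an in-memory list, and the table names `vital_data` and `optional_data` are invented for the example.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory database with one vital and one optional table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vital_data(amount REAL NOT NULL)")
    conn.execute("CREATE TABLE optional_data(amount REAL NOT NULL)")
    return conn


def load_with_staging(conn, vital_rows, optional_rows):
    """Absorb non-critical errors while staging, then commit everything at once."""
    staged = []
    for raw in optional_rows:
        try:
            staged.append((float(raw),))  # a failed cast is absorbed here
        except (TypeError, ValueError):
            continue  # skip the bad non-vital row; nothing has touched the DB yet
    with conn:  # one transaction: commits on success, rolls back on any error
        conn.executemany(
            "INSERT INTO vital_data(amount) VALUES (?)",
            [(value,) for value in vital_rows],
        )
        conn.executemany("INSERT INTO optional_data(amount) VALUES (?)", staged)
    return len(staged)
```

Because bad non-vital rows are filtered out before the transaction starts, they can no longer abort the batch that carries the vital data.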
