Why does Redis SETNX succeed for two clients?

I am studying locking in Redis and found this official document:
https://redis.io/commands/setnx#handling-deadlocks
My understanding is that SETNX is an atomic operation, so ONLY ONE client can execute it successfully (get 1) while the others fail (get 0).
However, the Handling deadlocks section of that document says:
1) C1 and C2 read lock.foo to check the timestamp, because they both received 0 after executing SETNX, as the lock is still held by C3 that crashed after holding the lock.
2) C1 sends DEL lock.foo
3) C1 sends SETNX lock.foo and it succeeds
4) C2 sends DEL lock.foo
5) C2 sends SETNX lock.foo and it succeeds
ERROR: both C1 and C2 acquired the lock because of the race condition.
My question is: why can both 3) and 5) succeed? I thought only one of C1 or C2 could succeed, not both.
Please correct my understanding, thank you.

SETNX only sets the key if it doesn't already exist. So, indeed, if multiple clients try to SETNX at the same time, only one will succeed.
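But in this case (see steps 2 and 4), the clients are deleting the key before calling SETNX. Since the key no longer exists, there is nothing to prevent SETNX from succeeding. In particular, C2's DEL in step 4 removes the lock that C1 just acquired in step 3, so C2's SETNX in step 5 finds no key and succeeds as well.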
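To make the interleaving concrete, here is a minimal sketch that replays the five steps against a tiny in-memory stand-in. FakeRedis is a hypothetical helper written for this illustration, not a real client; it only mimics the return values of SETNX and DEL.

```python
class FakeRedis:
    """Tiny in-memory stand-in for the two commands used in the question."""

    def __init__(self):
        self.data = {}

    def setnx(self, key, value):
        # Set only if the key is absent; return 1 on success, 0 otherwise.
        if key in self.data:
            return 0
        self.data[key] = value
        return 1

    def delete(self, key):
        # Remove the key; return the number of keys removed.
        return 1 if self.data.pop(key, None) is not None else 0


r = FakeRedis()
r.setnx("lock.foo", "C3-expired-timestamp")  # C3 took the lock, then crashed

# Step 1: C1 and C2 both fail to acquire and both see an expired timestamp.
c1_first = r.setnx("lock.foo", "C1")
c2_first = r.setnx("lock.foo", "C2")

r.delete("lock.foo")                   # step 2: C1 sends DEL
c1_second = r.setnx("lock.foo", "C1")  # step 3: succeeds, the key was absent
r.delete("lock.foo")                   # step 4: C2 deletes C1's fresh lock
c2_second = r.setnx("lock.foo", "C2")  # step 5: succeeds, the key is absent again

print(c1_first, c2_first, c1_second, c2_second)  # 0 0 1 1
```

Each SETNX is atomic on its own; the problem is that the DEL in between is unconditional, so it cannot tell an expired lock from a freshly acquired one.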

Related

Understanding transactions and WATCH

I am reading https://redis.io/docs/manual/transactions/, and I am not sure I fully understand the importance of WATCH in a Redis transaction. From https://redis.io/docs/manual/transactions/#usage, I am led to believe EXEC is atomic:
> MULTI
OK
> INCR foo
QUEUED
> INCR bar
QUEUED
> EXEC
1) (integer) 1
2) (integer) 1
But between MULTI and EXEC, the values at keys foo and bar could have been changed (e.g. in response to a request from another client). Is that why WATCH is useful? To error out the transaction if e.g. foo or bar changed after MULTI and before EXEC?
Do I have it correct? Anything I am missing?
I am led to believe EXEC is atomic
YES
Is that why WATCH is useful? To error out the transaction if e.g. foo or bar changed after MULTI and before EXEC?
Not exactly. Redis fails the transaction if any key the client WATCHed before the transaction has been changed (after WATCH and before EXEC). The watched key might not even be used in the transaction.
MULTI-EXEC ensures that the commands between them run atomically, while WATCH provides check-and-set (CAS) behavior.
With MULTI-EXEC, Redis guarantees that INCR foo and INCR bar run atomically; without it, Redis might run other clients' commands between INCR foo and INCR bar (and those commands may or may not change foo or bar). If you watch some key (foo, bar, or any other key) before the transaction, and the watched key has been modified since the WATCH command, the transaction fails.
UPDATE
The doc you referred to has a good example of using WATCH: implementing INCR with GET and SET. To implement INCR, you need to 1) GET the old counter from Redis, 2) increment it on the client side, and 3) SET the new value in Redis. For this to work, you must ensure that no other client changes the counter after you run step 1; otherwise you might write an incorrect value, so the transaction should fail. For example, two clients both get the old counter, say 1, and increment it locally; client 1 updates it to 2, then client 2 updates it to 2 again (the counter should be 3). In this case, WATCH is helpful.
WATCH mykey
val = GET mykey
val = val + 1
MULTI
SET mykey $val
EXEC
If some other client updates the counter after we watch it, the transaction fails.
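Here is a rough sketch of that check-and-set cycle, including the retry a client would normally do when EXEC aborts. It runs against a hypothetical in-memory model (FakeRedis, watch, exec_set and incr_with_watch are names made up for this illustration), so it shows the idea rather than how the server implements it.

```python
class FakeRedis:
    """In-memory model of WATCH plus a SET-only MULTI/EXEC (illustration only)."""

    def __init__(self):
        self.data = {}
        self.version = {}  # bumped on every write to a key

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value
        self.version[key] = self.version.get(key, 0) + 1

    def watch(self, key):
        # Remember the version seen at WATCH time.
        return (key, self.version.get(key, 0))

    def exec_set(self, watched, key, value):
        # EXEC: abort (return None) if the watched key changed since WATCH.
        watched_key, seen = watched
        if self.version.get(watched_key, 0) != seen:
            return None
        self.set(key, value)
        return ["OK"]


def incr_with_watch(r, key, interfere=None):
    """Retry the WATCH / GET / MULTI / SET / EXEC cycle until EXEC succeeds."""
    attempts = 0
    while True:
        attempts += 1
        token = r.watch(key)
        val = int(r.get(key) or 0) + 1
        if interfere is not None:
            interfere()       # another client writes between WATCH and EXEC
            interfere = None  # only disturb the first attempt
        if r.exec_set(token, key, str(val)) is not None:
            return val, attempts


r = FakeRedis()
r.set("mykey", "1")
result = incr_with_watch(r, "mykey", interfere=lambda: r.set("mykey", "2"))
print(result)  # (3, 2): the first attempt aborted, the second one saw 2 and wrote 3
```

Without the version check, the first attempt would have written 2 over the other client's 2 and one increment would have been lost.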
When are watched keys no longer watched?
EXEC is called.
UNWATCH is called.
DISCARD is called.
The connection is closed.

Redis streams is returning an empty array

I created a new Redis stream using the following command.
XGROUP CREATE A mygroup $ MKSTREAM
I added the below mentioned data
xadd A * X 1
xadd A * X 2
xadd A * X 3
xadd A * X 4
I am reading the data using the following command.
XREADGROUP GROUP mygroup Alice COUNT 1 STREAMS A 0
It's returning an empty array:
1) 1) "A"
   2) (empty array)
I am using Redis version 6.2.1. Kindly help me to debug the error.
When you use the XREADGROUP command to read messages, you should specify > as the ID instead of 0.
Reference from the doc:
The special > ID, which means that the consumer want to receive only messages that were never delivered to any other consumer. It just means, give me new messages.
Any other ID, that is, 0 or any other valid ID or incomplete ID (just the millisecond time part), will have the effect of returning entries that are pending for the consumer sending the command with IDs greater than the one provided. So basically if the ID is not >, then the command will just let the client access its pending entries: messages delivered to it, but not yet acknowledged. Note that in this case, both BLOCK and NOACK are ignored.
If the ID is not >, you can only read pending messages. In your case there are no pending messages, since you have not consumed anything yet.
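The difference between the two IDs can be sketched with a toy model of a single stream and a single consumer group. FakeGroup is a hypothetical class for illustration only; it treats every ID other than > like 0 and ignores acknowledgements, blocking and ID comparison.

```python
class FakeGroup:
    """Toy model of one stream with one consumer group (illustration only)."""

    def __init__(self):
        self.entries = []     # (id, fields) in insertion order
        self.next_index = 0   # position of the first never-delivered entry
        self.pending = {}     # consumer name -> list of (id, fields)

    def xadd(self, fields):
        entry_id = f"{len(self.entries) + 1}-0"
        self.entries.append((entry_id, fields))
        return entry_id

    def xreadgroup(self, consumer, count, start_id):
        if start_id == ">":
            # New messages: deliver never-delivered entries and mark them pending.
            batch = self.entries[self.next_index:self.next_index + count]
            self.next_index += len(batch)
            self.pending.setdefault(consumer, []).extend(batch)
            return batch
        # Any other ID: only this consumer's pending (unacknowledged) entries.
        return self.pending.get(consumer, [])[:count]


g = FakeGroup()
for x in range(1, 5):
    g.xadd({"X": str(x)})

first = g.xreadgroup("Alice", 1, "0")   # nothing delivered yet, so nothing pending
second = g.xreadgroup("Alice", 1, ">")  # delivers the first new entry
third = g.xreadgroup("Alice", 1, "0")   # now that entry is pending for Alice
print(first, second, third)
```

The first call mirrors the command from the question: the entries exist, but none of them has been delivered to Alice yet, so her pending list is empty.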

How to get pending items with minIdleTime greater than some value?

With Redis streams we can have pending items that some consumers have not finished.
I can find such items using the xpending command.
Suppose we have two pending items:
1) 1) "1-0"
   2) "local-dev"
   3) (integer) 9599
   4) (integer) 1
2) 1) "2-0"
   2) "local-dev"
   3) (integer) 9599
   4) (integer) 1
The problem is that xpending only lets us filter by ID. I have a couple of service nodes (A, B) that each run a zombie check: XPENDING mystream test_group - 5 1
Each of them receives the "1-0" item and calls xclaim; only one of them (for example A) becomes the owner and starts processing the item. But when B runs xpending again to get new items, it receives "1-0" again because it hasn't been processed yet (A is still working on it), so it looks like my whole queue is blocked.
Is there a way to avoid this and process pending items concurrently?
You want to look at the documentation, in particular the section Recovering from permanent failures.
The way this is normally used is:
You allow the same consumer to consume its own messages from the PEL after recovering.
You only XCLAIM from another consumer once a reasonably long time has elapsed, which suggests the original consumer has failed permanently.
You use the delivery count to detect poison pills or dead letters. If a message has been retried many times, it may be better to report it to an admin for analysis.
So normally all you need is to look at the oldest entry in the PEL of other consumers for the permanent failure recovery logic, and you consume entries one by one.
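As a rough sketch of the "only claim when enough time has elapsed" rule, the rows returned by the extended form of XPENDING (ID, consumer, idle time in ms, delivery count) can be filtered on the client side. The function name, the threshold and the sample rows below are assumptions made for this example. As far as I know, XCLAIM resets the idle time by default, so an item that A has just claimed drops out of the result and B moves on to the next one.

```python
def stale_entries(pending_rows, min_idle_ms, my_consumer):
    """Pick entries owned by other consumers that have been idle long enough."""
    return [
        entry_id
        for entry_id, consumer, idle_ms, deliveries in pending_rows
        if consumer != my_consumer and idle_ms >= min_idle_ms
    ]


rows = [
    ("1-0", "A", 120, 2),           # A just claimed it and is working on it
    ("2-0", "local-dev", 9599, 1),  # owner has been silent for about 9.6 s
]
print(stale_entries(rows, min_idle_ms=5000, my_consumer="B"))  # ['2-0']
```

If I remember correctly, Redis 6.2 also added an IDLE option to XPENDING that applies the same filter on the server side, which may be simpler if your server version supports it.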

using .net StackExchange.Redis with "wait" isn't working as expected

I am doing a R/W test with a Redis cluster (servers): 1 master + 2 slaves. The following is the key WRITE code:
var trans = redisDatabase.CreateTransaction();
Task<bool> setResult = trans.StringSetAsync(key, serializedValue, TimeSpan.FromSeconds(10));
Task<RedisResult> waitResult = trans.ExecuteAsync("wait", 3, 10000);
trans.Execute();
trans.WaitAll(setResult, waitResult);
using the following as the connection string:
[server1 ip]:6379,[server2 ip]:6379,[server3 ip]:6379,ssl=False,abortConnect=False
running 100 threads which do 1000 loops of the following steps:
generate a GUID as key and random as value of 1024 bytes
writing the key (using the above code)
retrieve the key using "var stringValue = redisDatabase.StringGet(key, CommandFlags.PreferSlave);"
compare the two values and print an error if they differ.
Running this test a few times generates several errors. I am trying to understand why, as the "wait" operation (with 10 seconds!) should have guaranteed that the write reached all slaves before returning.
Any idea?
WAIT isn't supported by SE.Redis, as explained by its prolific author in the question titled: Stackexchange.redis lacks the "WAIT" support
What about improving consistency guarantees by adding in some "check, write, read" iterations?
1) SET a new key-value pair (master node).
2) Read it (set CommandFlags to DemandReplica).
3) Not there yet? Wait and try X times.
4.a) Not there yet? SET again, then go back to (3) or give up.
4.b) There? You're "done".
It won't be perfect, but it should reduce the probability of losing a SET.
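The loop above might look roughly like this. It is written in Python for brevity rather than C#, and write and read_replica are hypothetical callables standing in for the SE.Redis calls (StringSet on the master, StringGet with a replica flag); the attempt counts and delay are arbitrary.

```python
import time


def write_until_visible(write, read_replica, key, value,
                        read_attempts=5, write_attempts=3, delay_s=0.05):
    """Write, then poll a replica read until the value shows up or we give up."""
    for _ in range(write_attempts):
        write(key, value)                # step 1: SET on the master
        for _ in range(read_attempts):   # steps 2-3: read, wait, retry
            if read_replica(key) == value:
                return True              # 4.b: the replica has the value
            time.sleep(delay_s)
        # 4.a: still missing, so fall through and SET again
    return False                         # gave up after all write attempts


# Usage with stand-ins: the "replica" only catches up on the third read.
store = {}
reads = {"n": 0}


def fake_write(key, value):
    store[key] = value


def lagging_read(key):
    reads["n"] += 1
    return store.get(key) if reads["n"] >= 3 else None


ok = write_until_visible(fake_write, lagging_read, "k", "v", delay_s=0)
print(ok, reads["n"])  # True 3
```

Note that this only confirms that the replica you happened to read from has caught up, not that every replica has.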

Is this redis lua script that deals with key expire race conditions a pure function?

I've been playing around with Redis to keep track of the rate limit of an external API in a distributed system. I've decided to create a key for each route where a limit is present. The value of the key is how many requests I can still make until the limit resets. The reset is implemented by setting the TTL of the key to the time when the limit will reset.
For that I wrote the following lua script:
if redis.call("EXISTS", KEYS[1]) == 1 then
    local remaining = redis.call("DECR", KEYS[1])
    if remaining < 0 then
        local pttl = redis.call("PTTL", KEYS[1])
        if pttl > 0 then
            --[[
            -- We would exceed the limit if we were to do a call now, so let's send back that a limit exists (1)
            -- Also let's send back how much we would have exceeded the ratelimit if we were to ignore it (remaining)
            -- and how long we need to wait in ms until we can try again (pttl)
            ]]
            return {1, remaining, pttl}
        elseif pttl == -1 then
            -- The key expired the instant after we checked that it existed, so delete it and say there is no ratelimit
            redis.call("DEL", KEYS[1])
            return {0}
        elseif pttl == -2 then
            -- The key expired the instant after we decreased it by one. So let's just send back that there is no limit
            return {0}
        end
    else
        -- Great we have a ratelimit, but we did not exceed it yet.
        return {1, remaining}
    end
else
    return {0}
end
Since a watched key can expire in the middle of a MULTI transaction without aborting it, I assume the same is the case for Lua scripts. Therefore I put in the cases for when the TTL is -1 or -2.
After I wrote that script I looked a bit more in depth at the eval command page and found out that a lua script has to be a pure function.
In there it says
The script must always evaluates the same Redis write commands with
the same arguments given the same input data set. Operations performed
by the script cannot depend on any hidden (non-explicit) information
or state that may change as script execution proceeds or between
different executions of the script, nor can it depend on any external
input from I/O devices.
With this description I'm not sure if my function is a pure function or not.
After Itamar's answer I wanted to confirm that for myself, so I wrote a little Lua script to test it. The script creates a key with a 10 ms TTL and checks the TTL until it is less than 0:
redis.call("SET", KEYS[1], "someVal", "PX", 10)
local tmp = redis.call("PTTL", KEYS[1])
while tmp >= 0 do
    tmp = redis.call("PTTL", KEYS[1])
    redis.log(redis.LOG_WARNING, "PTTL:" .. tmp)
end
return 0
When I ran this script it never terminated. It just kept spamming my logs until I killed the Redis server. However, time doesn't stand still while the script runs; instead the TTL just stops decreasing once it reaches 0.
So the key ages, it just never expires.
Since a watched key can expire in the middle of a MULTI transaction without aborting it, I assume the same is the case for Lua scripts. Therefore I put in the cases for when the TTL is -1 or -2.
AFAIR that isn't the case w/ Lua scripts - time kinda stops (in terms of TTL at least) when the script's running.
With this description I'm not sure if my function is a pure function or not.
Your script's great (without actually trying to understand what it does), don't worry :)