Transactions and the WATCH statement in Redis

Could you please explain the following example from "The Little Redis Book"?
With the code above, we wouldn't be able to implement our own incr
command, since all the commands are executed together once exec is called. From
code, we can't do:
redis.multi()
current = redis.get('powerlevel')
redis.set('powerlevel', current + 1)
redis.exec()
That isn't how Redis transactions work. But, if we add a watch to
powerlevel, we can do:
redis.watch('powerlevel')
current = redis.get('powerlevel')
redis.multi()
redis.set('powerlevel', current + 1)
redis.exec()
If another client changes the value of powerlevel after we've called
watch on it, our transaction will fail. If no client changes the
value, the set will work. We can execute this code in a loop until it
works.
Why can't we execute the increment inside a transaction, which can't be interrupted by other commands? Why do we need to retry in a loop and wait until nobody changes the value before the transaction starts?

There are several questions here.
1) Why can't we execute the increment inside a transaction that can't be interrupted by other commands?
Please note first that Redis "transactions" are completely different from what most people think transactions are in a classical DBMS.
# Does not work
redis.multi()
current = redis.get('powerlevel')
redis.set('powerlevel', current + 1)
redis.exec()
You need to understand what is executed on the server side (in Redis), and what is executed on the client side (in your script). In the above code, the GET and SET commands will be executed on the Redis side, but the assignment to current and the calculation of current + 1 are supposed to be executed on the client side.
To guarantee atomicity, a MULTI/EXEC block delays the execution of Redis commands until the exec. So the client will only pile up the GET and SET commands in memory, and execute them in one shot, atomically, at the end. Of course, the attempt to assign current to the result of GET, and the incrementation, will occur well before that. Actually, the redis.get method will only return the string "QUEUED" to signal that the command has been delayed, so the incrementation will not work.
In MULTI/EXEC blocks you can only use commands whose parameters can be fully known before the beginning of the block. You may want to read the documentation for more information.
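To make the queueing behavior concrete, here is a minimal redis-py sketch (redis-py exposes MULTI/EXEC as a transactional pipeline; powerlevel is the key from the question, everything else is illustrative):
import redis

r = redis.Redis()
r.set('powerlevel', 10)

pipe = r.pipeline(transaction=True)  # MULTI: commands are queued, not run
pipe.get('powerlevel')               # returns the pipeline itself, not a value
# current + 1 cannot be computed here: the GET result does not
# exist on the client side yet.
results = pipe.execute()             # EXEC: queued commands run atomically
print(results)                       # [b'10'] - only now is the value available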
2) Why do we need to retry in a loop and wait until nobody changes the value before the transaction starts?
This is an example of the optimistic concurrency pattern.
If we used no WATCH/MULTI/EXEC, we would have a potential race condition:
# Initial arbitrary value
powerlevel = 10
session A: GET powerlevel -> 10
session B: GET powerlevel -> 10
session A: current = 10 + 1
session B: current = 10 + 1
session A: SET powerlevel 11
session B: SET powerlevel 11
# In the end we have 11 instead of 12 -> wrong
Now let's add a WATCH/MULTI/EXEC block. With a WATCH clause, the commands between MULTI and EXEC are executed only if the value has not changed.
# Initial arbitrary value
powerlevel = 10
session A: WATCH powerlevel
session B: WATCH powerlevel
session A: GET powerlevel -> 10
session B: GET powerlevel -> 10
session A: current = 10 + 1
session B: current = 10 + 1
session A: MULTI
session B: MULTI
session A: SET powerlevel 11 -> QUEUED
session B: SET powerlevel 11 -> QUEUED
session A: EXEC -> success! powerlevel is now 11
session B: EXEC -> failure, because powerlevel has changed and was watched
# In the end, we have 11, and session B knows it has to attempt the transaction again
# Hopefully, it will work fine this time.
So you do not have to iterate to wait until nobody changes the value; rather, you attempt the operation again and again until Redis is sure the values are consistent and signals success.
In most cases, if the "transactions" are fast enough and the probability of contention is low, the updates are very efficient. Now, if there is contention, some extra operations will have to be done for some "transactions" (due to the iteration and retries). But the data will always be consistent and no locking is required.
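In redis-py, that retry loop is usually written with a pipeline and WatchError, roughly like this (a minimal sketch of the pattern described above; powerlevel is the key from the question):
import redis
from redis.exceptions import WatchError

r = redis.Redis()

def increment_powerlevel():
    with r.pipeline() as pipe:
        while True:
            try:
                pipe.watch('powerlevel')           # WATCH before reading
                current = int(pipe.get('powerlevel') or 0)
                pipe.multi()                       # start queueing commands
                pipe.set('powerlevel', current + 1)
                pipe.execute()                     # EXEC: fails if powerlevel changed
                return current + 1
            except WatchError:
                continue                           # someone else won the race; retry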

Understanding transactions and WATCH

I am reading https://redis.io/docs/manual/transactions/, and I am not sure I fully understand the importance of WATCH in a Redis transaction. From https://redis.io/docs/manual/transactions/#usage, I am led to believe EXEC is atomic:
> MULTI
OK
> INCR foo
QUEUED
> INCR bar
QUEUED
> EXEC
1) (integer) 1
2) (integer) 1
But between MULTI and EXEC, the values at keys foo and bar could have been changed (e.g. in response to a request from another client). Is that why WATCH is useful? To error out the transaction if e.g. foo or bar changed after MULTI and before EXEC?
Do I have it correct? Anything I am missing?
I am led to believe EXEC is atomic
YES
Is that why WATCH is useful? To error out the transaction if e.g. foo or bar changed after MULTI and before EXEC?
NOT exactly. Redis fails the transaction if some key the client WATCHed before the transaction has been changed (after the WATCH and before EXEC). The watched key might not even be used in the transaction.
MULTI-EXEC ensures the commands between them run atomically, while WATCH provides check-and-set (CAS) behavior.
With MULTI-EXEC, Redis ensures that INCR foo and INCR bar run atomically; otherwise, Redis might run other commands between INCR foo and INCR bar (and those commands might or might not change foo or bar). If you watch some key (either foo, or bar, or even any other key) before the transaction, and the watched key has been modified since the WATCH command, the transaction fails.
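To illustrate that point, here is a small redis-py sketch (the key names are made up for the example): the transaction touches only counter, yet it is aborted because the watched, unrelated key config changed in between:
import redis
from redis.exceptions import WatchError

r = redis.Redis()
other_client = redis.Redis()

with r.pipeline() as pipe:
    pipe.watch('config')                   # watch a key the transaction never touches
    other_client.set('config', 'changed')  # another client modifies the watched key
    pipe.multi()
    pipe.incr('counter')                   # the transaction itself only uses 'counter'
    try:
        pipe.execute()
    except WatchError:
        print("aborted: a watched key changed, even though it is not in the transaction")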
UPDATE
The doc you referred to has a good example of using WATCH: implementing INCR with GET and SET. To implement INCR, you need to 1) GET the old counter from Redis, 2) increment it on the client side, and 3) SET the new value in Redis. To make this work, you need to ensure that no other client updates the counter after you run step 1; otherwise you might write an incorrect value, and the transaction should fail. For example, two clients both GET the old counter, say 1, and increment it locally; client 1 updates it to 2, then client 2 updates it to 2 again, while the counter should be 3. In this case, WATCH is helpful.
WATCH mykey
val = GET mykey
val = val + 1
MULTI
SET mykey $val
EXEC
If some other client updates the counter after we watch it, the transaction fails.
When are watched keys no longer watched?
EXEC is called.
UNWATCH is called.
DISCARD is called.
The connection is closed.

Using .NET StackExchange.Redis with WAIT isn't working as expected

I'm doing an R/W test with a Redis cluster (servers): 1 master + 2 slaves. The following is the key WRITE code:
var trans = redisDatabase.CreateTransaction();
Task<bool> setResult = trans.StringSetAsync(key, serializedValue, TimeSpan.FromSeconds(10));
Task<RedisResult> waitResult = trans.ExecuteAsync("wait", 3, 10000);
trans.Execute();
trans.WaitAll(setResult, waitResult);
using the following as the connection string:
[server1 ip]:6379,[server2 ip]:6379,[server3 ip]:6379,ssl=False,abortConnect=False
Running 100 threads, each doing 1000 loops of the following steps:
generate a GUID as the key and 1024 random bytes as the value
write the key (using the above code)
retrieve the key using var stringValue = redisDatabase.StringGet(key, CommandFlags.PreferSlave);
compare the two values and print an error if they differ
Running this test a few times generates several errors. I'm trying to understand why, since the WAIT (with a 10-second timeout!) should have guaranteed the write reached all slaves before returning.
Any idea?
WAIT isn't supported by SE.Redis, as explained by its prolific author in "Stackexchange.redis lacks the 'WAIT' support".
What about improving the consistency guarantees by adding some "check, write, read" iterations?
1. SET a new key-value pair (on the master node).
2. Read it (with CommandFlags set to DemandReplica).
3. Not there yet? Wait and try X times.
4. a) Still not there? SET again and go back to (3), or give up.
   b) There? You're "done".
It won't be perfect, but it should reduce the probability of losing a SET.
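For illustration, the same loop as a rough redis-py sketch (host names, retry counts, and delays are made-up assumptions; redis-py addresses the master and a replica as two separate connections rather than via CommandFlags):
import time
import redis

master = redis.Redis(host='master-host', port=6379)    # assumed addresses
replica = redis.Redis(host='replica-host', port=6379)

def set_with_read_check(key, value, retries=5, delay=0.05):
    # note: value should be bytes so the comparison below matches
    for _ in range(retries):
        master.set(key, value, ex=10)          # step 1: write to the master
        for _ in range(retries):
            if replica.get(key) == value:      # step 2: read it back from a replica
                return True                    # step 4.b: it's there, we're "done"
            time.sleep(delay)                  # step 3: wait and try again
    return False                               # step 4.a: give up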

Flink: How to process rest of finite stream with combination of countWindowAll()

// assume the following logic
val source = listOf(1,2,3,4,5,6,7,8,9,10,11,12) // total: 12 elements
val env = StreamExecutionEnvironment.createLocalEnvironment(1)
val input = env.fromCollection(source)
.countWindowAll(5)
.aggregate(...) // pack them to List<Int> for bulk upload to DB
.addSink(...) // sends bulk
When I execute it, only the first 10 elements are processed; the remaining 2 are thrown away - Flink shuts down without processing them.
The only workaround I see - since I fully control the source data - is to push some well-known IGNORABLE_VALUES into the source collection to fill up the last window, and then ignore them in the sink... but I think there must be a far more professional way to do this in Flink.
You have a finite stream of 12 elements and a window that triggers for every 5 elements. The first window gets 5 elements and triggers, then the next 5 arrive and it triggers again, but when the last 2 arrive the job knows that no more are coming. Since there aren't 5 elements in the window, the trigger doesn't fire, so nothing is done with them.

What is the race condition for Redis INCR Rate Limiter 2?

I have read the INCR documentation here, but I could not understand why Rate limiter 2 has a race condition.
In addition, what does the documentation mean by "the key will be leaked until we'll see the same IP address again"?
Can anyone help explain? Thank you very much!
You are talking about the following code, which has two problems in a multi-threaded environment.
1.  FUNCTION LIMIT_API_CALL(ip):
2.      current = GET(ip)
3.      IF current != NULL AND current > 10 THEN
4.          ERROR "too many requests per second"
5.      ELSE
6.          value = INCR(ip)
7.          IF value == 1 THEN
8.              EXPIRE(ip, 1)
9.          END
10.         PERFORM_API_CALL()
11. END
the key will be leaked until we'll see the same IP address again
If the client dies (e.g. the client process is killed, or the machine goes down) before executing LINE 8, then the key ip never gets an expiration set. If we never see this IP again, the key will persist in the Redis database forever: it is leaked.
Rate limiter 2 has a race condition
Suppose the key ip doesn't exist in the database, and more than 10 clients, say 20, execute LINE 2 simultaneously. All of them will get a NULL current, so all of them will go into the ELSE branch. Finally, all these clients will execute LINE 10, and the API will be called more than 10 times.
This solution fails because there's a time window between the GET at LINE 2 and the INCR at LINE 6: the check and the increment are not atomic.
A Correct Solution
value = INCR(ip)
IF value == 1 THEN
EXPIRE(ip, 1)
END
IF value <= 10 THEN
return true
ELSE
return false
END
Wrap the above code into a Lua script to ensure it runs atomically. If this script returns true, perform the API call. Otherwise, do nothing.
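For instance, with redis-py the wrapping could look like the following sketch (register_script loads the script and calls EVALSHA under the hood; the key layout, one counter key per IP, follows the pseudocode above):
import redis

r = redis.Redis()

# The script body mirrors the pseudocode; Lua runs atomically in Redis,
# so no other command can interleave between the INCR and the EXPIRE.
rate_limit = r.register_script("""
local value = redis.call('INCR', KEYS[1])
if value == 1 then
    redis.call('EXPIRE', KEYS[1], 1)
end
return value <= 10
""")

def allow_request(ip):
    # Lua true maps to 1, Lua false maps to nil (None in redis-py)
    return bool(rate_limit(keys=[ip]))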

Scalable delayed task execution with Redis

I need to design a Redis-driven scalable task scheduling system.
Requirements:
Multiple worker processes.
Many tasks, but long periods of idleness are possible.
Reasonable timing precision.
Minimal resource waste when idle.
Should use synchronous Redis API.
Should work for Redis 2.4 (i.e. no features from upcoming 2.6).
Should not use other means of RPC than Redis.
Pseudo-API: schedule_task(timestamp, task_data). Timestamp is in integer seconds.
Basic idea:
Listen for upcoming tasks on a list.
Put tasks into buckets, one per timestamp.
Sleep until the closest timestamp.
If a new task appears with a timestamp less than the closest one, wake up.
Process all upcoming tasks with timestamp ≤ now, in batches (assuming that task execution is fast).
Make sure that no concurrent worker processes the same tasks. At the same time, make sure that no tasks are lost if we crash while processing them.
So far I can't figure out how to fit this into Redis primitives...
Any clues?
Note that there is a similar old question: Delayed execution / scheduling with Redis? In this new question I introduce more details (most importantly, many workers). So far I was not able to figure out how to apply the old answers here; thus, a new question.
Here's another solution that builds on a couple of others [1]. It uses the Redis WATCH command to remove the race condition without needing Lua, which requires Redis 2.6.
The basic scheme is:
Use a Redis zset for scheduled tasks and Redis lists (queues) for ready-to-run tasks.
Have a dispatcher poll the zset and move tasks that are ready to run into the Redis queues. You may want more than one dispatcher for redundancy, but you probably don't need or want many.
Have as many workers as you want doing blocking pops on the Redis queues.
I haven't tested it :-)
The foo job creator would do:
import json
import time
import redis

r = redis.Redis()
SCHEDULED_ZSET_KEY = 'scheduled_tasks'  # any agreed-upon key name

def schedule_task(queue, data, delay_secs):
    # This calculation for run_at isn't great - it won't deal well with daylight
    # savings changes, leap seconds, and other time anomalies. Improvements
    # welcome :-)
    run_at = time.time() + delay_secs
    # Serialize the task so it can be stored as a zset member, scored by run_at.
    task = json.dumps({'queue': queue, 'data': data})
    r.zadd(SCHEDULED_ZSET_KEY, {task: run_at})

schedule_task('foo_queue', foo_data, 60)
The dispatcher(s) would look like:
from redis.exceptions import WatchError

with r.pipeline() as pipe:
    while working:
        try:
            pipe.watch(SCHEDULED_ZSET_KEY)
            results = pipe.zrangebyscore(
                SCHEDULED_ZSET_KEY, 0, time.time(), start=0, num=1)
            if not results:
                pipe.unwatch()
                time.sleep(1)
                continue
            task = json.loads(results[0])
            pipe.multi()
            pipe.rpush(task['queue'], task['data'])
            pipe.zrem(SCHEDULED_ZSET_KEY, results[0])
            pipe.execute()
        except WatchError:
            # another dispatcher touched the zset first; just retry
            continue
The foo worker would look like:
while working:
    # blpop returns a (key, value) tuple, or None on timeout
    popped = r.blpop('foo_queue', POP_TIMEOUT)
    if popped:
        foo(popped[1])
[1] This solution is based on not_a_golfer's answer, the one at http://www.saltycrane.com/blog/2011/11/unique-python-redis-based-queue-delay/, and the Redis docs for transactions.
You didn't specify the language you're using. You have at least 3 alternatives for doing this without writing a single line of code, at least in Python.
Celery has an optional redis broker.
http://celeryproject.org/
resque is an extremely popular Redis-backed task queue (written in Ruby).
https://github.com/defunkt/resque
RQ is a simple, small Redis-based queue that aims to "take the good stuff from celery and resque" and be much simpler to work with.
http://python-rq.org/
You can at least look at their design if you can't use them.
But to answer your question - what you want can be done with redis. I've actually written more or less that in the past.
EDIT:
As for modeling what you want on redis, this is what I would do:
queuing a task with a timestamp will be done directly by the client - you put the task in a sorted set with the timestamp as the score and the task as the value (see ZADD).
A central dispatcher wakes every N seconds, checks the earliest timestamps in this set, and if there are tasks ready for execution, pushes them to a "to be executed NOW" list. This can be done with ZREVRANGEBYSCORE on the "waiting" sorted set, getting all items with timestamp <= now, so you get all the ready items at once. Pushing is done with RPUSH.
workers use BLPOP on the "to be executed NOW" list, wake when there is something to work on, and do their thing. This is safe since redis is single threaded, and no 2 workers will ever take the same task.
once finished, the workers put the result back in a response queue, which is checked by the dispatcher or another thread. You can add a "pending" bucket to avoid failures or something like that.
so the code will look something like this (this is just pseudo code):
client:
    ZADD "new_tasks" <TIMESTAMP> <TASK_INFO>
dispatcher:
    while working:
        tasks = ZREVRANGEBYSCORE "new_tasks" <NOW> 0  # only take tasks with timestamp <= now
        for task in tasks:
            # do the delete and queue as a transaction
            MULTI
            RPUSH "to_be_executed" task
            ZREM "new_tasks" task
            EXEC
        sleep(1)
I didn't add the response queue handling, but it's more or less like the worker:
worker:
    while working:
        task = BLPOP "to_be_executed" <TIMEOUT>
        if task:
            response = work_on_task(task)
            RPUSH "results" response
EDIT: stateless atomic dispatcher:
while working:
    # atomically pop the earliest task (lowest timestamp)
    MULTI
    ZRANGE "new_tasks" 0 0
    ZREMRANGEBYRANK "new_tasks" 0 0
    task = EXEC
    # this is the only risky place - you can solve it by using Lua internally in 2.6
    SADD "tmp" task
    if task.timestamp <= now:
        MULTI
        RPUSH "to_be_executed" task
        SREM "tmp" task
        EXEC
    else:
        MULTI
        ZADD "new_tasks" task.timestamp task
        SREM "tmp" task
        EXEC
    sleep(RESOLUTION)
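For reference, the atomic pop translates to redis-py roughly like this (a minimal sketch; key names come from the pseudocode above). The window between popping the task and requeuing it is exactly the risky place noted in the comment; on Redis 2.6+ the whole loop body can be moved into a Lua script instead:
import time
import redis

r = redis.Redis()
RESOLUTION = 1

while True:
    pipe = r.pipeline(transaction=True)              # MULTI ... EXEC
    pipe.zrange('new_tasks', 0, 0, withscores=True)  # earliest task and its timestamp
    pipe.zremrangebyrank('new_tasks', 0, 0)
    popped, _ = pipe.execute()
    if popped:
        task, run_at = popped[0]
        if run_at <= time.time():
            r.rpush('to_be_executed', task)          # due: hand it to the workers
        else:
            r.zadd('new_tasks', {task: run_at})      # not due yet: put it back
    time.sleep(RESOLUTION)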
If you're looking for a ready-made solution in Java, Redisson is right for you. It allows you to schedule and execute tasks (with cron-expression support) in a distributed way on Redisson nodes, using the familiar ScheduledExecutorService API, backed by a Redis queue.
Here is an example. First, define a task using the java.lang.Runnable interface. Each task can access the Redis instance via the injected RedissonClient object.
public class RunnableTask implements Runnable {

    @RInject
    private RedissonClient redissonClient;

    @Override
    public void run() {
        RMap<String, Integer> map = redissonClient.getMap("myMap");
        long result = 0;
        for (Integer value : map.values()) {
            result += value;
        }
        redissonClient.getTopic("myMapTopic").publish(result);
    }
}
Now it's ready to be submitted to the ScheduledExecutorService:
RScheduledExecutorService executorService = redisson.getExecutorService("myExecutor");
ScheduledFuture<?> future = executorService.schedule(new RunnableTask(), 10, TimeUnit.MINUTES);
future.get();
// or cancel it
future.cancel(true);
Examples with cron expressions:
executorService.schedule(new RunnableTask(), CronSchedule.of("10 0/5 * * * ?"));
executorService.schedule(new RunnableTask(), CronSchedule.dailyAtHourAndMinute(10, 5));
executorService.schedule(new RunnableTask(), CronSchedule.weeklyOnDayAndHourAndMinute(12, 4, Calendar.MONDAY, Calendar.FRIDAY));
All tasks are executed on Redisson nodes.
A combined approach seems plausible:
No new task timestamp may be less than the current time (clamp if it is less). Assuming reliable NTP sync.
All tasks go to bucket lists at keys suffixed with the task timestamp.
Additionally, all task timestamps go to a dedicated zset (both key and score are the timestamp itself).
New tasks are accepted from clients via a separate Redis list.
Loop: fetch the oldest N expired timestamps via zrangebyscore ... limit.
BLPOP with a timeout on the new-tasks list and the lists for the fetched timestamps.
If we got an old task, process it. If a new one, add it to its bucket and to the zset.
Check if processed buckets are empty. If so, delete the list and the entry from the zset. Probably do not check very recently expired buckets, to safeguard against time synchronization issues. End loop.
Critique? Comments? Alternatives?
Lua
I made something similar to what's been suggested here, but optimized the sleep duration to be more precise. This solution is good if you have few inserts into the delayed task queue. Here's how I did it with a Lua script:
local laterChannel = KEYS[1]
local nowChannel = KEYS[2]
local currentTime = tonumber(KEYS[3])

local first = redis.call("zrange", laterChannel, 0, 0, "WITHSCORES")
if #first ~= 2 then
    return "2147483647"
end

local execTime = tonumber(first[2])
local event = first[1]

if currentTime >= execTime then
    redis.call("zrem", laterChannel, event)
    redis.call("rpush", nowChannel, event)
    return "0"
else
    return tostring(execTime - currentTime)
end
It uses two "channels". laterChannel is a ZSET and nowChannel is a LIST. Whenever it's time to execute a task, the event is moved from the ZSET to the LIST. The Lua script responds with how many milliseconds the dispatcher should sleep until the next poll. If the ZSET is empty, sleep forever. If it's time to execute something, do not sleep (i.e. poll again immediately). Otherwise, sleep until it's time to execute the next task.
So what if something is added while the dispatcher is sleeping?
This solution works in conjunction with keyspace events. You basically subscribe to the key of laterChannel, and whenever there is an add event, you wake up all the dispatchers so they can poll again.
Then you have another dispatcher that uses a blocking left pop on nowChannel. This means:
You can run the dispatcher on multiple instances (i.e. it scales)
The polling is atomic, so you won't have any race conditions or double events
The task is executed by whichever instance is free
There are ways to optimize this even further. For example, instead of returning "0", you could fetch the next item from the zset and directly return the correct amount of time to sleep.
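For example, a dispatcher loop around the script could look like this in redis-py (a rough sketch; LUA_SOURCE stands for the Lua script above, scores are assumed to be millisecond timestamps, and the 60-second cap is just a safety net in case a wakeup event is missed):
import time
import redis

r = redis.Redis()
poll = r.register_script(LUA_SOURCE)  # LUA_SOURCE holds the Lua script shown above

while True:
    # The script moves at most one due event to nowChannel and returns
    # how many milliseconds to sleep until the next task is due.
    sleep_ms = int(poll(keys=['laterChannel', 'nowChannel', str(int(time.time() * 1000))]))
    if sleep_ms > 0:
        time.sleep(min(sleep_ms, 60000) / 1000.0)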
Expiration
If you cannot use Lua scripts, you can use keyspace events on expired keys.
Subscribe to the channel and receive the event when Redis evicts the key. Then grab a lock: the first instance to do so moves the event to a list (the "execute now" channel). Then you don't have to worry about sleeping and polling; Redis will tell you when it's time to execute something.
execute_later(timestamp, eventId, event) {
    SET "lock:" + eventId event    // keep the payload on a key that does not expire
    SET eventId ""                 // shadow key whose expiry triggers the notification
    EXPIREAT eventId timestamp
}

subscribeToEvictions(eventId) {
    var event = GET "lock:" + eventId
    var deletedCount = DEL "lock:" + eventId
    if (deletedCount == 1) {
        // we won the lock; move the event to the "execute now" list
    }
}
This however has its own downsides. For example, if you have many nodes, all of them will receive the event and try to grab the lock. But I still think it's overall fewer requests than anything suggested here.
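For completeness, a minimal redis-py sketch of that subscriber (it assumes keyspace notifications for expired events are enabled, database 0, and the payload/lock layout from the pseudocode above; key and list names are illustrative):
import redis

r = redis.Redis()
# Requires: CONFIG SET notify-keyspace-events Ex  (or the same in redis.conf)
pubsub = r.pubsub()
pubsub.psubscribe('__keyevent@0__:expired')

for message in pubsub.listen():
    if message['type'] != 'pmessage':
        continue
    event_id = message['data'].decode()      # the key that just expired
    payload = r.get('lock:' + event_id)      # read the payload before racing
    if payload is not None and r.delete('lock:' + event_id) == 1:
        # Only the instance that wins the DEL moves the event to the list.
        r.rpush('execute_now', payload)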