Order of processing different Redis Streams messages

Given that I have a service P (Producer) and a service C (Consumer), my P service needs to:
Create an object X
Create an object Y (dependent on X)
Create an object Z (dependent on Y)
Notify C about X, Y, and Z (via Redis Streams)
C needs to use data from Z, Y, and X to do some local data persistence
Updates to Y are fairly common, but updates to X are rare
From C's perspective, is there a way to guarantee that it has all the info it needs for successful persistence?
I know that services like Kafka and Redis Streams are not generally built for this stuff, but how does one overcome this?
Idea 1:
Send X, Y, and Z in that particular order to the same consumer group. But if we scale the number of workers to anything above 1, we run into the same ordering problem again.
Idea 2:
Instead of sending X and Y separately to C, I can send a compound object Z which has Y and X embedded. But that really seems like overkill, doesn't it?
Is there any obvious way to handle object dependencies?
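For reference, Idea 2 boils down to publishing a single self-contained entry. A minimal redis-py sketch, where the stream name "objects" and the field layout are just assumptions:

import json
import redis

r = redis.Redis()

# Idea 2: one compound message carries everything C needs, so ordering
# between separate X/Y/Z messages no longer matters for the consumer.
z = {"id": "z1", "y": {"id": "y1", "x": {"id": "x1"}}}  # Z embeds Y, Y embeds X
r.xadd("objects", {"payload": json.dumps(z)})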

I believe Idea 2 is the better solution, because keeping the whole message in one data structure is a good idea.
Alternatively, you can also try using multiple keys.
For example:
On service P:

import time
import redis

r = redis.Redis()
now_timestamp = int(time.time())  # let's say it is 1515151551
r.sadd('not_processed_timestamps', now_timestamp)
r.set(f'X_{now_timestamp}', INFO_OF_X)
r.set(f'Y_{now_timestamp}', INFO_OF_Y)
r.set(f'Z_{now_timestamp}', INFO_OF_Z)

On service C, in a new thread:

def blocking_get(key):  # Redis has no blocking GET, so poll until the key appears
    while (value := r.get(key)) is None:
        time.sleep(0.05)
    return value

ts = r.spop('not_processed_timestamps').decode()  # let's say it is 1515151551
info_x = blocking_get(f'X_{ts}')
info_y = blocking_get(f'Y_{ts}')
info_z = blocking_get(f'Z_{ts}')
# process the rest

This is a good question covered in this note about Redis Streams.
We could say that schematically the following is true:
If you use 1 stream -> 1 consumer, you are processing messages in order.
If you use N streams with N consumers, so that only a given consumer hits a subset of the N streams, you can scale the above model of 1 stream -> 1 consumer.
If you use 1 stream -> N consumers, you are load balancing to N consumers; however, in that case, messages about the same logical item may be consumed out of order, because a given consumer may process message 3 faster than another consumer is processing message 4.
So basically Kafka partitions are more similar to using N different Redis keys, while Redis consumer groups are a server-side load balancing system of messages from a given stream to N different consumers.
So if you want to process them in order and use more than one consumer you will need to deal with that yourself.
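To make that concrete, here is a sketch of the "N Redis keys as partitions" idea in Python with redis-py; the stream prefix, field names, and partition count are all assumptions:

import zlib
import redis

r = redis.Redis()
NUM_STREAMS = 4  # assumed number of "partitions"

def stream_for(item_id):
    # The same logical item always hashes to the same stream, so a single
    # consumer per stream sees that item's messages in order (Kafka-style).
    return f"events:{zlib.crc32(item_id.encode()) % NUM_STREAMS}"

# Producer: X, Y, Z for one logical item all land on one stream, in order.
for obj_type in ("X", "Y", "Z"):
    r.xadd(stream_for("item-42"), {"type": obj_type, "payload": "..."})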

Related

Are there alternatives to Redis when buffering is undesired and unacceptable?

I want to stream real-time sensor data (webcam, laser point cloud, etc.) from one robot to multiple observers.
In this use case, only the newest data is useful. For example, when a new point-cloud frame arrives, the older ones become useless.
Redis has nice publisher/subscriber support, but it buffers messages, according to (Redis Pubsub and Message Queueing).
So are there better alternatives? Something like ROS's publishers/subscribers, which have a message queue size parameter.
/**
* The subscribe() call is how you tell ROS that you want to receive messages
* on a given topic.
*
* The second parameter to the subscribe() function is the size of the message
* queue. If messages are arriving faster than they are being processed, this
* is the number of messages that will be buffered up before beginning to throw
* away the oldest ones.
*/
ros::Subscriber sub = n.subscribe("chatter", 1000, chatterCallback);
Maybe you can use the Redis list data structure for your purpose, like a queue. The list data structure in Redis is implemented as a linked list, so adding a new item is O(1). Whenever your robot produces data, it can put it in a list with the LPUSH command, and when you want to get the latest item from the list, use LRANGE "key-name" 0 0. This command retrieves the most recently pushed item. Also, if you don't want data to accumulate in the queue, you can run LTRIM before LRANGE to keep only the latest records. For example, LTRIM "key-name" 0 9 keeps the latest 10 elements. The trim size should be set according to your observers' processing speeds. ref: https://redis.io/docs/data-types/lists/
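As a rough redis-py sketch of that pattern (the key name "pointcloud" and the keep-10 window are just examples):

import redis

r = redis.Redis()

def publish(frame):
    # Producer (robot): push the newest frame, then trim so stale
    # frames never accumulate; indices 0..9 are the 10 newest items.
    pipe = r.pipeline()
    pipe.lpush("pointcloud", frame)
    pipe.ltrim("pointcloud", 0, 9)
    pipe.execute()

def latest():
    # Observer: index 0 is always the most recently pushed frame.
    res = r.lrange("pointcloud", 0, 0)
    return res[0] if res else None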

Block until Complete Count

I am using Redis Streams and need to block my clients until there are at least, say, X messages in the stream, returning once the count reaches X.
Is there any way to achieve this?
E.g., XREADGROUP GROUP G1 C2 COUNT 10 BLOCK 0 STREAMS L > should block until all 10 messages have arrived in the stream key.
COUNT specifies the maximum number of elements to return per stream, if there are any. If the streams are empty (and the BLOCK option was used), the consumer blocks. Once a single message arrives, the block is released along with that message. So no, you can't block until COUNT messages have arrived, but you can group messages at the application level.
See XREAD for more details.
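A hedged sketch of that application-level grouping with redis-py, reusing the stream/group/consumer names from the question:

import redis

r = redis.Redis()

def read_batch(x, stream="L", group="G1", consumer="C2"):
    # Keep calling XREADGROUP until we have gathered x entries in total;
    # each call blocks until at least one new entry arrives.
    batch = []
    while len(batch) < x:
        resp = r.xreadgroup(group, consumer, {stream: ">"},
                            count=x - len(batch), block=0)
        for _name, messages in resp:
            batch.extend(messages)
    return batch

entries = read_batch(10)  # returns only once 10 messages have arrived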

In Redis, how can I guarantee fetching N number items from a list in multi-clients environment?

Assume that there is a key K in Redis that is holding a list of values.
Many producer clients are adding elements to this list, one by one using LPUSH or RPUSH.
On the other hand, another set of consumer clients are popping elements from the list, though with a restriction: consumers will only attempt to pop N items if the list contains at least N items. This ensures that each consumer holds exactly N items once the popping process finishes.
If the list contains fewer than N items, consumers shouldn't attempt to pop from the list at all, because they wouldn't end up with N items.
If there is only 1 consumer client, it can simply run LLEN to check whether the list contains at least N items, and then pop N items using LPOP/RPOP.
However, if there are many consumer clients, there can be a race condition: after reading LLEN >= N, they can simultaneously pop items from the list. We might then end up in a state where each consumer pops fewer than N elements and no items are left in the list.
Using a separate locking system seems to be one way to tackle this issue, but I was curious if this type of operation can be done only using Redis commands, such as Multi/Exec/Watch etc.
I checked the MULTI/EXEC approach, and it seems transactions do not support rollback. Also, all commands executed within a MULTI/EXEC transaction return 'QUEUED', so I won't be able to know whether the N LPOP calls I execute in the transaction will all return elements or not.
So all you need is an atomic way to check the list length and pop conditionally.
This is what Lua scripts are for; see the EVAL command.
Here is a Lua script to get you started:
local n = tonumber(ARGV[1])
local len = redis.call('LLEN', KEYS[1])
if len >= n then
    local res = {}
    for i = 1, n do
        res[i] = redis.call('LPOP', KEYS[1])
    end
    return res
else
    return false
end
Use as
EVAL "local n = tonumber(ARGV[1]) \n local len = redis.call('LLEN', KEYS[1]) \n if len >= n then \n local res = {} \n for i = 1, n do \n res[i] = redis.call('LPOP', KEYS[1]) \n end \n return res \n else \n return false \n end" 1 list 3
This pops exactly ARGV[1] elements (the number after the key name) from the list, but only if the list contains at least that many elements.
Lua scripts run atomically, so there is no race condition between reading clients.
As the OP pointed out in the comments, there is a risk of data loss, e.g. due to a power failure between the LPOPs and the script's return. You can use RPOPLPUSH instead of LPOP, storing the elements on a temporary list. You then also need some tracking, deleting, and recovery logic. Note that your client could also die, leaving some elements unprocessed.
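A rough sketch of that RPOPLPUSH variant in Python; the per-consumer processing-list name and the recovery convention are assumptions about how you might do the tracking:

import redis

r = redis.Redis()
PROCESSING = "processing:consumer-1"  # one private list per consumer

def pop_reliably(n, src="list"):
    if r.llen(src) < n:  # note: still racy without a script; see above
        return None
    # RPOPLPUSH atomically moves each element to the processing list,
    # so a crash here leaves the elements recoverable rather than lost.
    return [r.rpoplpush(src, PROCESSING) for _ in range(n)]

def acknowledge(items):
    # After successful processing, delete the elements from the processing
    # list; on restart, anything still left there is re-processed.
    for item in items:
        r.lrem(PROCESSING, 1, item)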
You may want to take a look at Redis Streams. This data structure is ideal for distributing load among many clients. When used with consumer groups, it maintains a pending entries list (PEL) that acts as that temporary list.
Clients then do a XACK to remove elements from the PEL once processed. Then you are also protected from client failures.
Redis Streams are very useful to solve the complex problem you are trying to solve. You may want to do the free course about this.
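For illustration, a minimal consumer-group loop with redis-py showing the PEL-plus-XACK flow; the stream, group, and consumer names are made up:

import redis

r = redis.Redis()

try:
    r.xgroup_create("jobs", "g1", id="0", mkstream=True)
except redis.ResponseError:
    pass  # the group already exists

# Entries read via XREADGROUP sit in the pending entries list (PEL)
# until this consumer acknowledges them.
for _name, messages in r.xreadgroup("g1", "consumer-1", {"jobs": ">"},
                                    count=3, block=5000):
    for msg_id, fields in messages:
        # ... process fields ...
        r.xack("jobs", "g1", msg_id)  # removes the entry from the PEL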
You could use a prefetcher.
Instead of each consumer greedily picking an item from the queue, which leads to the problem of 'water water everywhere, but not a drop to drink', you could have a prefetcher that builds a packet of size 6. When the prefetcher has a full packet, it can place the packet in a separate packet queue (another Redis key holding a list of packets) and pop the items from the main queue in a single transaction. Essentially, what you wrote:
If there is only 1 consumer client, it can simply run LLEN to check whether the list contains at least N items, and then pop N items using LPOP/RPOP.
If the prefetcher doesn't have a full packet, it does nothing and keeps waiting for the main queue size to reach 6.
On the consumer side, consumers just query the prefetched-packets queue, pop the top packet, and go. They always get 1 pre-built packet (of 6 items). If no packets are available, they wait.
On the producer side, no changes are required. They can keep inserting into the main queue.
BTW, there can be more than one prefetcher task running concurrently and they can synchronize access to the main queue between themselves.
Implementing a scalable prefetcher
The prefetcher implementation can be described with a buffet-table analogy. Think of the main queue as a restaurant buffet table where guests pick up their food and leave. Etiquette demands that guests follow a queue system and wait for their turn. Prefetchers follow something analogous. Here's the algorithm:
Algorithm Prefetch
Begin
    while true
        check = main queue has 6 items or more   // this is a queue read, no locks required
        if (check == true)
            obtain an exclusive lock on the main queue
            if lock successful
                begin a transaction
                    create a packet and fill it with the top 6 items
                    from the queue after popping them
                    add the packet to the prefetch queue
                    if packet added to prefetch queue successfully
                        commit the transaction
                    else
                        rollback the transaction
                    end if
                release the lock
            else
                // someone else has the exclusive lock, we should just wait
                sleep for xx milliseconds
            end if
        end if
    end while
End
I am just showing an infinite polling loop here for simplicity, but this could also be implemented in a pub/sub fashion using Redis keyspace notifications: the prefetcher just waits for a notification that the main-queue key has received an LPUSH, and then executes the logic inside the while-loop body above.
There are other ways you could do this. But this should give you some ideas.
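For concreteness, here is a minimal Python sketch of the locking variant of the algorithm above; the key names, the 5-second lock expiry, and the JSON packet encoding are all assumptions, and (as discussed earlier) a crash between the pops and the push can still lose items:

import json
import redis

r = redis.Redis()
PACKET_SIZE = 6

def prefetch_once():
    if r.llen("main_queue") < PACKET_SIZE:
        return False  # plain queue read, no lock required
    # Exclusive lock with an expiry, so a dead prefetcher cannot block others.
    if not r.set("prefetch_lock", "1", nx=True, ex=5):
        return False  # someone else holds the lock, just wait
    try:
        if r.llen("main_queue") < PACKET_SIZE:  # re-check under the lock
            return False
        items = [r.lpop("main_queue") for _ in range(PACKET_SIZE)]
        r.rpush("packet_queue", json.dumps([i.decode() for i in items]))
        return True
    finally:
        r.delete("prefetch_lock")

def next_packet():
    # Consumers block on the packet queue and always receive a full packet.
    _key, packet = r.blpop("packet_queue")
    return json.loads(packet)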

How does TensorFlow sync tensors which share a buffer between different step-ids

I have a problem in my contrib implementation for distributed TensorFlow. Without bothering you with irrelevant details: the solution applies a certain message protocol in order to perform RDMA writes directly from/to source/destination tensors, saving memory copies on the CPU.
Let's say I have 2 sides, A and B, and A wants to receive a tensor from B.
The protocol is as follows:
A sends a REQUEST message to B.
B looks up the tensor locally (BaseRendezvousMgr::RecvLocalAsync) and sends a META-DATA response to A.
A uses the meta-data to allocate the destination tensor, and sends an ACK to B containing the destination address.
B receives the ACK, and performs a remote DMA write to the destination address.
Between the REQUEST and the ACK, B keeps the local tensor alive (and Ref() > 0), by saving it in a local map (the REQUEST copies the tensor to the local map, the ACK pops it from the map).
To validate my solution, I added a checksum calculation at each step. Occasionally, I see the checksum change between the REQUEST and the ACK. This happens when I run a PS (parameter server) with two workers:
Line 1 is REQUEST to worker 0.
Line 2 is ACK to worker 0.
Line 3 is REQUEST to worker 1.
Line 4 is ACK to worker 1.
The last value on each line is the checksum. The error happens about 50% of the time, and I always see it on line 4.
I also saw that the problematic tensor has a shared buffer for all step-ids (this is a given. I can't control it). So it is very likely that some other thread changed the tensor's content between lines 3 and 4, which is something I want to prevent.
So the question is how? What prevented the content from changing between lines 1 and 2, and 2 and 3? To emphasize, the time elapsed between lines 3 and 4 is less than 0.04 seconds, while the time elapsed between 2 and 3 is almost 2.5 seconds.
Thanks for your help. Code will be posted if required.
Are you using tf.Variable for the shared buffer? If so, constructing it with tfe.Variable (to get reasonable read-write semantics) or tf.get_variable(..., use_resource=True) should make the synchronization issues go away.
Otherwise this is hard to understand without knowing more about the generating graph.
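For reference, a minimal TF 1.x-style snippet of the suggested fix; the variable name and shape are placeholders:

import tensorflow as tf

# Construct the shared buffer as a resource variable so that reads and
# writes get well-defined ordering semantics (the shape is hypothetical).
shared_buf = tf.get_variable("shared_buf", shape=[1024], dtype=tf.float32,
                             use_resource=True)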

Is there a good way to pop members from the Redis Sorted Set?

Is there a good way to pop members from the Redis Sorted Set, just like the LPOP API of the List?
What I figured out for popping members from the Sorted Set is using ZRANGE + ZREM; however, that is not thread-safe, and it needs a distributed lock when multiple threads access the set at the same time from different hosts.
Please kindly suggest if there is a better way to pop members from the Sorted Set.
In Redis 5.0 or above, you can use [B]ZPOP{MIN|MAX} key [count] for this scenario.
The MIN version takes the item(s) with the lowest scores; MAX takes the item(s) with the highest scores. count defaults to 1, and the B-prefixed variants block until data is available.
ZPOPMIN
ZPOPMAX
BZPOPMIN
BZPOPMAX
Before Redis 5.0, you can write a Lua script to do the job: wrap these two commands in a single script. Redis ensures that a Lua script runs atomically.
local key = KEYS[1]
local result = redis.call('ZRANGE', key, 0, 0)
local member = result[1]
if member then
    redis.call('ZREM', key, member)
    return member
else
    return nil
end
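A possible way to run that script from redis-py; the key name "myzset" is just an example:

import redis

r = redis.Redis()

# register_script wraps the script so it is sent once and then invoked
# by its SHA1 (EVALSHA), falling back to EVAL when needed.
zpop_min_member = r.register_script("""
local result = redis.call('ZRANGE', KEYS[1], 0, 0)
local member = result[1]
if member then
    redis.call('ZREM', KEYS[1], member)
    return member
else
    return nil
end
""")

member = zpop_min_member(keys=["myzset"])  # lowest-scored member, or None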