Reading the documentation, it doesn't look like Set POP (SPOP) is atomic, whereas LPOP, RPOP, etc. are. This is also what I'm seeing with my code, where I have two clients using Lettuce and reactive streams listening and calling SPOP. When something is pushed, both clients get the same value that was just pushed. I was really hoping to avoid that, because I need a set to keep my values unique, and I was hoping SPOP would behave like LPOP.
I have a pub/sub client that pushes to this set, and multiple instances will have multiple pub/sub clients, which is why I use a set: to prevent extra work from being done.
I can either make sure the list only contains unique items, or I can make SPOP atomic. How should I go about doing this?
There is someone else who explains why SPOP isn't atomic:
https://medium.com/@stockholmux/redis-spop-culture-800cf306cbe6
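For the "make sure the list only contains unique items" option mentioned above, a minimal sketch in Python with redis-py might look like the following; the Lua script and the key names (jobs:members, jobs:queue) are purely illustrative, not part of the original question.

import redis

r = redis.Redis()

# Push a value onto the list only if the companion set did not already
# contain it; SADD and RPUSH run atomically inside the Lua script.
unique_push = r.register_script("""
if redis.call('SADD', KEYS[1], ARGV[1]) == 1 then
    redis.call('RPUSH', KEYS[2], ARGV[1])
    return 1
end
return 0
""")

unique_push(keys=["jobs:members", "jobs:queue"], args=["value-123"])

# Consumers can then LPOP as usual; removing the value from the set
# allows the same value to be queued again later.
value = r.lpop("jobs:queue")
if value is not None:
    r.srem("jobs:members", value)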
Can anyone explain in which cases I need to create multiple queues (one user -> one queue name), and when to use one queue name for all clients with different routing keys (one user -> one routing key), and why?
A user should not be able to read messages intended for another user.
I'm using the direct exchange type.
First off, I am going to assume that when you say "user" you are interchangeably referring to a consumer or a producer. They aren't the same thing, so I would read up on that here in RabbitMQ's simplest explanation. Walking through that tutorial will definitely help solidify your understanding of Rabbit overall, which is always good.
In any case, I would recommend doing this:
Create multiple queues, each one linked to a single consumer. The reason for doing this instead of using a single queue with multiple consumers is discussed here, but if you don't want a bunch of programmer jargon, it pretty much says that a single queue is super slow because only one message can be consumed at a time from the queue.
Also, there is a built-in "default exchange" that you can use instead of setting up another direct exchange, so it sounds like you're putting effort into something you might not need. Obviously I'm not sure what you are doing, but I would take that into consideration... hope this helps!
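A rough sketch of that setup in Python with pika, assuming a local broker; the queue name user-42 is just a placeholder. Each consumer declares its own queue, and the producer publishes to the default exchange with the queue name as the routing key, so no extra exchange or binding is needed.

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Each consumer gets its own queue; with the default ("") exchange the
# routing key is simply the queue name.
channel.queue_declare(queue='user-42')

# Producer side: publish directly to that user's queue.
channel.basic_publish(exchange='', routing_key='user-42', body='hello user 42')

# Consumer side: only this user's consumer reads from 'user-42'.
def handle(ch, method, properties, body):
    print("got:", body)

channel.basic_consume(queue='user-42', on_message_callback=handle, auto_ack=True)
channel.start_consuming()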
I'm working on creating a DB with Redis.
One of my requirements is that all the clients in the system will be able to listen to set events and get information about both the key and the value that changed.
I know that a published value may be big (up to 512 MB), but I know that in my system the size of a value will not be more than 100 chars.
I have 3 possible solutions and I wonder which one will be better, or whether I should consider other solutions:
1) After each SET operation, the client will also publish it (PUB/SUB).
2) Edit the setGenericCommand function to publish the value as well, and use keyspace binding.
3) After the client receives a keyspace notification, it will get the value with a GET operation.
I would like to understand which approach will be better.
Thank you!
So, first and foremost, remember that Pub/Sub is at-most-once delivery. If you really need to process every change in the client, you should consider a more resilient way to do so.
That said, assuming you're OK with Pub/Sub's promises, 1 is the simplest and I'd go with that. At most, I'd provide the clients with a Lua wrapper that combines the SET and PUBLISH commands (a sketch follows below). This, of course, removes the need to actually listen to keyspace notifications, as you're basically implementing them yourself.
2 means hacking Redis, which is great, but it means you'll have to maintain your own fork, which is meh.
3 is also simple enough, but with 1 you get away with a single round trip instead of 2.
Another approach (4) is to write a custom module, but IMO that's too complex for this need. Go with 1 and Lua, and may the force be with you.
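One possible shape for that Lua wrapper, sketched here with Python/redis-py; the key and channel names are only examples.

import redis

r = redis.Redis()

# SET the value and PUBLISH it in one atomic server-side step.
set_and_publish = r.register_script("""
redis.call('SET', KEYS[1], ARGV[1])
return redis.call('PUBLISH', ARGV[2], ARGV[1])
""")

# Returns the number of subscribers that received the message.
set_and_publish(keys=["sensor:temperature"], args=["21.5", "updates:sensor:temperature"])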
I'm implementing Redis keyspace notifications in my application, which has 10 instances in our production environment.
My pub/sub listener listens for expired events in map1 and decrements in map2 based on that.
This works fine on my local machine. My issue is that when I deploy my application with multiple instances, all instances will read the expired event and all will decrement the key, whereas I want only one instance to decrement.
Is there any way to achieve this?
Your listeners will have to coordinate the decrement somehow. You can do that with some sort of locking, but a simpler way perhaps would be to embed a notion of version/timestamp into this logic. Here's what I had in mind.
What if you include a timestamp in your "map2"? An expired event has its own timestamp, so you can have the listeners check-and-set against that (tip: I'd use Lua for the CAS). This will prevent race-like conditions and multiple decrements in one go.
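A loose sketch of that check-and-set, done with a Lua script via Python/redis-py. Instead of comparing literal timestamps it claims each expired key exactly once (SADD returns 1 only for the first listener), which is one way to read the suggestion; all key and field names here are invented.

import redis

r = redis.Redis()

# Only the listener whose SADD succeeds performs the decrement; the claim
# and the HINCRBY happen atomically inside the script. In a real system the
# "handled" set would need trimming or a TTL of its own.
claim_and_decrement = r.register_script("""
if redis.call('SADD', KEYS[1], ARGV[1]) == 1 then
    redis.call('HINCRBY', KEYS[2], ARGV[2], -1)
    return 1
end
return 0
""")

# Called from each instance's expired-event handler with the expired key name.
claim_and_decrement(keys=["handled:expired", "map2"], args=["session:abc", "active_sessions"])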
Note: Redis Pub/Sub is amazing, but note that your current solution does not ensure the decrement in "map2" in case a message is lost. In the very near future, Redis will offer the Stream data type, which is much more suitable for that type of job. Specifically, the Stream consumer groups functionality is, IMO, just what you need here for replacing keyspace notifications.
I'm trying to implement tagging using Redis. This is how it looks:
mykey (my item)
mykey:tags (a set with the tags associated to that item)
tags:tag1 (a set with references to all items tagged with "tag1")
...
I'm planning on using Redis keyspace notifications to prevent expired keys from staying in my tag sets forever (even though every item in the cache has a default TTL set, I don't like to keep stale data around).
These are the options I'm considering:
1) Subscribe to all "expired" events.
psubscribe '__keyevent@*__:expired'
Pros:
Only 1 subscriber.
Cons:
Since not all items contain tags, I will have to check for mykey:tags and, if it exists, get the tags and remove the item from each tag set.
The contention on this method will increase with the number of keys in the store.
2) Subscribe to all events for those keys containing tags only.
psubscribe '__keyspace@*__:mykey'
Pros:
Subscriptions will be created for those items with tags only.
Cons:
There must be overhead associated with each subscriber.
The number of subscribers can grow pretty fast depending on the number of tagged items in the store.
Questions:
Which option should I implement? Should I be concerned about the number of subscribers on 2), or is the contention on 1) a bigger deal? I couldn't find any recommendations about this subject.
The end game is to implement this on Redis Cluster. Does this add any extra concern to the implementation?
Update 1:
This is a generic implementation for tagging on top of our cache. I'm not sure at this point how we will end up using it. This is more like a PoC I'm working on. Here are some numbers to answer the questions in the comments:
Volume: We have tens of millions of unique visitors per day. Not all items stored in the cache for each visitor have tags, though, and this changes constantly.
Tags: Tags are managed. There are currently a couple dozen tags. We are considering supporting free-text tags in the future.
I haven't tested either of the two approaches I'm suggesting here. I was hoping that one of the options would be so bad that it was not even an option :)
Update 2:
After some trial and error and some more research, I discarded 2). There is a limit on the number of Redis clients as well as on the output buffers, which makes this option a no-go. You can find more information here and here.
I tried 1) and it works just fine. I even set the expiration of the keys 5 ms apart from each other and the code handles it properly. This can be an alternative to go with.
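For reference, a bare-bones version of that subscriber in Python with redis-py; it assumes notify-keyspace-events includes expired events, and only the mykey:tags / tags:<tag> naming comes from the question.

import redis

r = redis.Redis()
p = r.pubsub()
p.psubscribe('__keyevent@*__:expired')

for message in p.listen():
    if message['type'] != 'pmessage':
        continue
    expired_key = message['data'].decode()
    tags_key = expired_key + ':tags'
    # Most expired keys have no tags, so this is usually a no-op.
    for tag in r.smembers(tags_key):
        r.srem(b'tags:' + tag, expired_key)
    r.delete(tags_key)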
Another option is the one suggested by @thepirat000. I'm marking this answer as the accepted one, but I'm also adding a little tweak to his suggestion: I don't want to do maintenance on the tags on every tag operation; instead, I can randomly determine when to do it. This is a good enough approach which uses neither pub/sub nor keyspace notifications.
There will probably be too much overhead in using keyspace notifications for this.
Why don't you do the clean-up as a scheduled or recurring task, or even when the keys are retrieved by tag?
I've worked on something similar in CachingFramework.Redis, where the cleanup is optionally run when retrieving the keys related to a tag. Also, the tag set TTL is the MAX(TTL) of the keys it contains.
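A sketch of that lazy cleanup with Python/redis-py; get_items_by_tag and the key names are hypothetical, following the question's scheme rather than any specific library API.

import redis

r = redis.Redis()

def get_items_by_tag(tag):
    tag_key = 'tags:' + tag
    live = []
    for item in r.smembers(tag_key):
        if r.exists(item):
            live.append(item)
        else:
            # The item's TTL ran out; drop the stale reference lazily.
            r.srem(tag_key, item)
    return live

print(get_items_by_tag('tag1'))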
Question
I want to pass data between applications in a publish-subscribe manner. Data may be produced at a much higher rate than it is consumed, and messages may get lost, which is not a problem. Imagine a fast sensor and a slow sensor-data processor. For that, I use Redis pub/sub and wrote a class which acts as a subscriber, receives every message and puts it into a buffer. The buffer is overwritten when a new message comes in, or nullified when the message is requested by the "real" function. So when I ask this class, I either immediately get a response (a hint that my function is slower than the incoming data) or I have to wait (a hint that my function is faster than the data).
This works pretty well for the case where data comes in fast. But for data which comes in relatively seldom, say every five seconds, it does not: imagine my consumer is launched slightly after the producer; the first message is lost and my consumer needs to wait nearly five seconds until it can start working.
I think I have to solve this with Redis tools. Instead of pub/sub, I could simply use the GET/SET methods, thus putting the cache functionality into Redis directly. But then my consumer would have to poll the database instead of the event magic I have at the moment. Keys could look like "key:timestamp", and my consumer would now have to get key:* and compare the timestamps permanently, which I think would cause a lot of load. There is no natural opportunity to sleep, since although I don't care about dropped messages (there is nothing I can do about them), I do care about delay.
Does someone use Redis for a similar thing and could give me a hint about clever use of Redis tools and data structures?
edit
Ideally, my program flow would look like this:
start the program
retrieve key from Redis
tell Redis, "hey, notify me on changes of key".
launch something asynchronously, with a callback for new messages.
While writing this, an idea came up: the publisher not only publishes the message on topic key, but also does SET key message. This way, an application could initially GET the value and then subscribe.
Good idea or not really?
What I did after I got the answer below (the accepted one)
Keyspace notifications are really what I need here. Redis acts as the primary source of information, and my client subscribes to keyspace notifications, which notify subscribers about events affecting specific keys. In the asynchronous part of my client, I subscribe to notifications about my key of interest. Those notifications set a key_has_updates flag. When I need the value, I get it from Redis and clear the flag. With the flag cleared, I know that there is no new value for that key on the server. Without keyspace notifications, this would have been the part where I needed to poll the server. The advantage is that I can use all sorts of data structures, not only the pub/sub mechanism, and a slow joiner which misses the first event is always able to get the initial value, which with pub/sub would have been lost.
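A stripped-down version of this flow in Python with redis-py, assuming keyspace notifications are enabled on the server (notify-keyspace-events "KA" or similar); the key name and the flag are illustrative.

import redis
import threading

r = redis.Redis()
key_has_updates = threading.Event()

def watch_key():
    p = r.pubsub()
    p.psubscribe('__keyspace@0__:sensor:temperature')
    for message in p.listen():
        if message['type'] == 'pmessage':
            key_has_updates.set()   # remember that the key changed

threading.Thread(target=watch_key, daemon=True).start()

# A slow joiner can always read the current value first...
value = r.get('sensor:temperature')

# ...and later only touches Redis again when the flag says something changed.
if key_has_updates.is_set():
    key_has_updates.clear()
    value = r.get('sensor:temperature')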
One idea is to push the data to a list (LPUSH) and trim it (LTRIM), so it doesn't grow forever if there are no consumers. On the other end, the consumer would grab items from that list and process them. You can also use keyspace notifications, and be alerted each time an item is added to that queue.
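Roughly, with Python/redis-py; the list name and length are arbitrary.

import redis

r = redis.Redis()

# Producer: push the newest reading and cap the list so it cannot grow
# unbounded when nobody is consuming.
def publish(reading):
    pipe = r.pipeline()
    pipe.lpush('sensor:readings', reading)
    pipe.ltrim('sensor:readings', 0, 99)   # keep only the 100 newest items
    pipe.execute()

# Consumer: block for up to 5 seconds waiting for the next item.
item = r.blpop('sensor:readings', timeout=5)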
I pass data between applications using two native Redis commands: rpush and blpop.
"blpop blocks the connection when there are no elements to pop from any of the given lists".
Data is passed in JSON format between applications, using a list as a queue.
The application that wants to send data (acting as the publisher) does an rpush onto a list.
The application that wants to receive data (acting as the subscriber) does a blpop on the same list.
The code could look like this (in Perl).
Sender (we assume a hash reference is passed):
use Redis;
use JSON;

# Encode the hash in JSON format
my $json_text = encode_json($hash_ref);
# Connect to Redis and push onto the list
my $r = Redis->new(server => "127.0.0.1:6379");
$r->rpush("shared_queue", $json_text);
$r->quit;
Receiver (in an infinite loop):
use Redis;
use JSON;

while (1) {
    my $r = Redis->new(server => "127.0.0.1:6379");
    # blpop returns the list name and the popped element
    my @elem = $r->blpop("shared_queue", 0);
    # Decode the JSON element back into a hash reference
    my $hash_ref = decode_json($elem[1]);
    # ... do some stuff with $hash_ref ...
}
I find this approach very useful for several reasons:
The elements are stored in a list, so temporarily disabling the receiver causes no information loss. When the receiver restarts, it can process all the items in the list.
A high send rate can be handled with multiple instances of the receiver.
Multiple senders can send data to a single list. In this case a data collector can easily be implemented.
A receiver process that acts as a daemon can be monitored with specific tools (e.g. pm2).
Since Redis 5, there is a new data type called "Streams", which is an append-only data structure. Redis Streams can be used as a reliable message queue with both point-to-point and multicast communication, using the consumer group concept: Redis_Streams_MQ
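A small consumer-group sketch with Python/redis-py, for Redis 5 or later; the stream, group, and consumer names are just examples.

import redis

r = redis.Redis()

# Create the consumer group once (mkstream creates the stream if missing).
try:
    r.xgroup_create('shared_stream', 'workers', id='0', mkstream=True)
except redis.exceptions.ResponseError:
    pass  # group already exists

# Producer side: append an entry to the stream.
r.xadd('shared_stream', {'payload': '{"temperature": 21.5}'})

# Consumer side: each message goes to exactly one consumer in the group,
# and stays pending until it is acknowledged.
entries = r.xreadgroup('workers', 'worker-1', {'shared_stream': '>'}, count=10, block=5000)
for stream, messages in entries:
    for msg_id, fields in messages:
        # ... process fields ...
        r.xack('shared_stream', 'workers', msg_id)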