How should I choose between pub/sub and blocking operations in Redis?
Redis provides blocking operations such as BLPOP, which blocks until an element can be popped from the list. Why shouldn't I use this to achieve the functionality of pub/sub? Pub/sub lets you define channels, which are a higher-level construct than basic lists. If my use case is simple, without multiple channels, can I go with the basic blocking operations?
There is an important difference between using lists with blocking operations and the pub/sub facilities.
A list with blocking operations can easily be used as a queue, while pub/sub channels do not involve any queuing. The only buffers involved in pub/sub are related to communication (i.e. socket management). This means that when a message is published, it is transmitted to the subscribers as soon as possible and is never kept in Redis. As a consequence, if subscribers stop listening on the Redis socket, the items are lost for those subscribers.
Another important difference is that the pub/sub mechanism can multicast items: when items are published, they are sent to all the subscribers. By contrast, if multiple daemons dequeue from a Redis list using blocking operations, an item pushed to the list will be dequeued by one and only one daemon.
Blocking lists (i.e. queues) and pub/sub channels are really complementary facilities.
If you do not need to multicast items, you should prefer lists with blocking operations, since they are much more reliable.
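To make the two behaviours concrete, here is a minimal in-memory sketch. It does not talk to Redis at all: `ListQueue` and `Channel` are hypothetical stand-ins that only mimic the semantics described above (a list keeps items until one consumer pops them, a channel stores nothing and fans out to whoever is subscribed at that moment).

```python
from collections import deque


class ListQueue:
    """In-memory stand-in for a Redis list used as a queue (RPUSH + LPOP/BLPOP)."""

    def __init__(self):
        self._items = deque()

    def push(self, item):
        # Items wait here until some consumer pops them.
        self._items.append(item)

    def pop(self):
        # Exactly one caller gets each item; None means the list is empty.
        return self._items.popleft() if self._items else None


class Channel:
    """In-memory stand-in for a pub/sub channel: no storage at all."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, item):
        # Delivered to whoever is subscribed right now, then forgotten.
        for callback in self._subscribers:
            callback(item)
        return len(self._subscribers)  # number of receivers


queue = ListQueue()
queue.push("job-1")                    # no consumer is running yet
worker_a, worker_b = queue.pop(), queue.pop()
# worker_a holds "job-1", worker_b holds None: one item, one consumer.

channel = Channel()
missed = channel.publish("event-1")    # nobody subscribed: the event is lost
inbox_a, inbox_b = [], []
channel.subscribe(inbox_a.append)
channel.subscribe(inbox_b.append)
channel.publish("event-2")             # both subscribers receive a copy
```

The pushed job survives until a worker asks for it, while the first published event is gone by the time anyone subscribes.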
Related
Similar to this question, we have FIFO queues and the messages must be processed in order. We want competing consumers from different machines for redundancy and performance reasons, but only one consumer on one machine should handle a message for a given queue at a time.
I tried setting the prefetch count to 1, but I believe this will only work if used with a single machine. Is this possible by default with RabbitMQ or do we need to implement our own lock?
Given a single queue with multiple consumers, there is no way to block one of the consumers; all of them receive messages in round-robin fashion.
EDIT
See https://www.rabbitmq.com/consumers.html#single-active-consumer
/EDIT
You could look at this plugin, https://github.com/rabbitmq/rabbitmq-consistent-hash-exchange, to distribute the load across different queues.
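The idea behind spreading load over several queues while keeping order can be sketched as follows. This is only an illustration under stated assumptions: it uses a plain hash-modulo rule rather than the consistent hashing the plugin implements, and the queue names and `pick_queue` helper are hypothetical. Messages that share a routing key always land in the same queue, so a single consumer per queue sees them in order.

```python
import zlib

# Hypothetical queue names; each would have exactly one consumer.
QUEUES = ["orders.0", "orders.1", "orders.2"]


def pick_queue(routing_key, queues=QUEUES):
    """Map a routing key to one queue, always the same one for the same key."""
    # crc32 is stable across processes and machines, unlike Python's
    # built-in hash() for strings, which is randomized per process.
    return queues[zlib.crc32(routing_key.encode("utf-8")) % len(queues)]


first = pick_queue("customer-42")
second = pick_queue("customer-42")   # same key, same queue as `first`
```

Note that with hash-modulo, changing the number of queues remaps most keys, which is one of the problems consistent hashing is designed to reduce.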
I tried setting the prefetch count to 1
prefetch=1 means that the consumers take one message at a time.
do we need to implement our own lock
Yes, if you want a single consumer per queue and need to keep other consumers away.
EDIT
There are also exclusive queues (https://www.rabbitmq.com/queues.html#exclusive-queues), but note:
Exclusive queues are deleted when their declaring connection is closed or gone (e.g. due to underlying TCP connection loss). They, therefore, are only suitable for client-specific transient state.
Provided that both the subscribed client and the server publishing the message retain the connection, is Redis guaranteed to always deliver the published message to the subscribed client eventually, even in situations where the client and/or server are massively stressed? Or should I plan for the possibility that Redis might occasionally drop messages as things get "hot"?
Redis provides absolutely no delivery guarantee for publish-and-subscribe traffic. The mechanism is based only on sockets and event loops; there is no queue involved (even in memory). If a subscriber is not listening while a publication occurs, the event is lost for this subscriber.
It is possible to implement some guaranteed delivery mechanisms on top of Redis, but not with the publish-and-subscribe API. The list data type in Redis can be used as a queue, and as the foundation of more advanced queuing systems, but it does not provide multicast capabilities (so no publish-and-subscribe).
AFAIK, there is no obvious way to easily implement publish-and-subscribe and guaranteed delivery at the same time with Redis.
Redis does not provide guaranteed delivery using its Pub/Sub mechanism. Moreover, if a subscriber is not actively listening on a channel, it will not receive the messages published during that time.
I previously wrote a detailed article that describes how one can use Redis lists in combination with BLPOP to implement reliable multicast pub/sub delivery:
http://blog.radiant3.ca/2013/01/03/reliable-delivery-message-queues-with-redis/
For the record, here's the high-level strategy:
When each consumer starts up and gets ready to consume messages, it registers by adding itself to a Set representing all consumers registered on a queue.
When a producer publishes a message on a queue, it:
Saves the content of the message in a Redis key
Iterates over the set of consumers registered on the queue, and pushes the message ID in a List for each of the registered consumers
Each consumer continuously looks out for a new entry in its consumer-specific list and when one comes in, removes the entry (using a BLPOP operation), handles the message and moves on to the next message.
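The strategy above can be sketched roughly as follows. This is not the code from the article or from redisq: `FakeRedis` is a hypothetical in-memory stand-in, the key names are made up, and the comments name the Redis command each step would presumably map to. A real consumer would block in BLPOP instead of getting `None` back.

```python
import itertools
from collections import defaultdict, deque


class FakeRedis:
    """Tiny in-memory stand-in for the Redis data types the strategy uses."""

    def __init__(self):
        self.strings = {}                 # plain keys
        self.sets = defaultdict(set)      # Set of registered consumers
        self.lists = defaultdict(deque)   # one List per consumer


store = FakeRedis()
_message_ids = itertools.count(1)         # a real setup might use INCR


def register(queue, consumer):
    store.sets[f"{queue}:consumers"].add(consumer)             # SADD


def publish(queue, payload):
    msg_id = next(_message_ids)
    store.strings[f"{queue}:msg:{msg_id}"] = payload           # SET
    for consumer in store.sets[f"{queue}:consumers"]:          # SMEMBERS
        store.lists[f"{queue}:{consumer}"].append(msg_id)      # RPUSH
    return msg_id


def consume_one(queue, consumer):
    pending = store.lists[f"{queue}:{consumer}"]
    if not pending:
        return None                       # BLPOP would block here instead
    msg_id = pending.popleft()                                 # BLPOP
    return store.strings[f"{queue}:msg:{msg_id}"]              # GET


register("orders", "billing")
register("orders", "shipping")
publish("orders", "order 17 created")

billing_got = consume_one("orders", "billing")
shipping_got = consume_one("orders", "shipping")
billing_again = consume_one("orders", "billing")   # nothing left for billing
```

Each registered consumer gets its own copy of the message ID, so the message is multicast, and the ID stays in that consumer's list until it is popped, so a consumer that is briefly away does not miss it.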
I have also made a Java implementation of these principles available open-source:
https://github.com/davidmarquis/redisq
These principles have been used to process about 1,000 messages per second from a single Redis instance and two instances of the consumer application, each instance consuming messages with 5 threads.
I need a way to publish messages to an unknown number of subscribers. The messages should be durable/persisted and categorized into three priorities (high, medium and low). One of the subscribers can only handle a limited load, and some messages are just more important: high-priority messages should be processed first, and so on.
How do I do that with Rebus? I guess I need three queues per subscriber?
Where can I find a publish/subscribe example with durable queues and MSMQ?
First, some info: Rebus likes to work with durable queues, durable messaging, and guaranteed delivery. In fact, unless you actively do stuff to opt out, that's the way everything works. So if you manage to make pub/sub work with Rebus, it's durable :)
Publishing by definition works with an "unknown number of subscribers" - at least that's a bus concern, and not an application concern.
In reality, subscribers initiate pub/sub conversation by issuing a SubscriptionMessage (which can be seen as a subscription request), which is then followed by the publisher publishing some number of events (which can be seen as "subscription replies"). The "bus part" of the publisher keeps track of who subscribed to any given event type.
So far, so good.
Regarding priorities, there's no out-of-the-box way to achieve that with Rebus. One way to ensure a maximum latency on certain message types is, as you're suggesting, by making separate endpoints whose input queues will not be clogged by low priority messages.
But there is some stuff around how Rebus is configured that strongly suggests having only one single input queue in each process, so that would probably imply that you should create separate processes that subscribe to those high priority message types.
I know that MSMQ supports some kind of priority on messages, so I guess it could be supported by having MsmqMessageQueue understand certain headers (similar to how express delivery and time-to-be-received are implemented - see here) - pull requests are happily accepted and strongly encouraged :)
I have found this image is very similar to my business model. I need to split messages across several queues. For some heavy work, I can add more worker threads. But for work that is not very heavy, I can let a single consumer subscribe to the messages. How do I do that in RabbitMQ? In their documentation, I only found the single-queue, multiple-consumer model.
You can add multiple workers to a queue
There can be multiple queues bound to an exchange.
In RabbitMQ, the producer always sends the message to an exchange. So, in your case, I think a single exchange is enough. If you want to load balance on the consumer side, you have the two options above.
You can also read my article:
https://techietweak.wordpress.com/2015/08/14/rabbitmq-a-cloud-based-message-oriented-middleware/
RabbitMQ has a very flexible model, which enables a wide variety of routing scenarios to take place.
I need to split message to some queue. for some heavy work. I can add more worker thread for them.
Yes, this is supported via a direct exchange. Publish a message using a routing key that is the same as the name of the queue. For convenience, let's say you use the fully-qualified object name (e.g. MyApp.Objects.DataTypeOne). All you need to do is subscribe multiple consuming processes to this queue, and RabbitMQ will load-balance using a round-robin approach.
But for some no much heavy work. I can let single consumer to subscribe their message.
Yes, you can do this also. Same process as in the paragraph above. Just don't attach multiple consuming processes.
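The difference between the two cases can be modelled in a few lines. This is an in-memory illustration only, not RabbitMQ client code: `dispatch` and the worker names are hypothetical, and the sketch ignores acknowledgements and prefetch. It simply hands each message to the next consumer in turn, which is the round-robin behaviour described above.

```python
from itertools import cycle


def dispatch(messages, consumers):
    """Deliver each message to exactly one consumer, taking turns."""
    delivered = {name: [] for name in consumers}
    turn = cycle(consumers)
    for message in messages:
        delivered[next(turn)].append(message)
    return delivered


# Heavy work: several consuming processes attached to the same queue.
heavy = dispatch(["h1", "h2", "h3", "h4"], ["worker-1", "worker-2"])
# heavy == {"worker-1": ["h1", "h3"], "worker-2": ["h2", "h4"]}

# Light work: a single consuming process receives everything, in order.
light = dispatch(["l1", "l2"], ["worker-3"])
# light == {"worker-3": ["l1", "l2"]}
```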
I have found this image is very similar to my business model.
The diagram isn't very useful, because it lacks information about the type of messages being published. In that sense, it is only an interconnect diagram. The interesting lines are the ones connecting the queues to the exchange, as that is what you specify within RabbitMQ via Queue Bindings. You can also bind exchanges to one another, but that's a bit further than we probably need to go.
Everything else on the diagram is fully under your control as the user of the RabbitMQ/AMQP system. You can create an arbitrary number of publishers and have an arbitrary number of consuming processes each consuming from an arbitrary number of queues. There are no hard and fast limits, though there are some practical aspects you probably will want to think about to ensure your system is maintainable.
In a web application, if I need to write an event to a queue, I would make a connection to redis to write the event.
Now if I want another backend process (say a daemon or cron job) to process or react to the publishing of the event in Redis, do I need a persistent connection?
I'm a little confused about how this pub/sub process works in a web application.
Basically in Redis there are two different messaging models:
Fire and Forget / One to Many: Pub/Sub. At the time a message is PUBLISH-ed all the subscribers will receive it, but this message is then lost forever. If a client was not subscribed there is no way it can get it back.
Persisting Queues / One to One: Lists, possibly used with blocking commands such as BLPOP. With lists you have a producer pushing into a list, and one or many consumers waiting for elements, but one message will reach only one of the waiting clients. With lists you have persistence, and messages will wait for a client to pop them instead of disappearing. So even if no one is listening there is a backlog (as big as your available memory, or you can limit the backlog using LTRIM).
I hope this is clear. I suggest studying the following commands to understand more about Redis and messaging semantics:
LPUSH/RPUSH, RPOP/LPOP, BRPOP/BLPOP
PUBLISH, SUBSCRIBE, PSUBSCRIBE
Documentation for these commands is available at redis.io.
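As an illustration of limiting the backlog with LTRIM, here is a small in-memory model. It uses a plain Python list instead of a Redis connection, and `MAX_BACKLOG`, `ltrim` and `produce` are hypothetical names; the comments show the Redis command each step imitates. The producer pushes at the head and trims after every push, and the consumer pops from the tail to get the oldest surviving item.

```python
MAX_BACKLOG = 3   # hypothetical limit on how many items are kept
backlog = []      # index 0 is the head of the list, as in Redis


def ltrim(items, start, stop):
    # Like LTRIM, keep only indexes start..stop inclusive
    # (non-negative indexes only in this sketch).
    items[:] = items[start:stop + 1]


def produce(item):
    backlog.insert(0, item)              # LPUSH key item
    ltrim(backlog, 0, MAX_BACKLOG - 1)   # LTRIM key 0 MAX_BACKLOG-1


for n in range(1, 6):
    produce(f"event-{n}")

# Only the three newest events survive; the two oldest were trimmed away.
oldest_kept = backlog.pop()              # RPOP key
```

The trade-off is that trimming silently drops the oldest unprocessed messages, so the cap should be chosen with that in mind.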
I'm not totally sure, but I believe that yes, pub/sub requires a persistent connection.
For an alternative I would take a peek at resque and how it handles that. Instead of using pub/sub it simply adds an item to a list in redis, and then whatever daemon or cron job you have can use the lpop command to get the first one.
Sorry for only giving a pseudo answer and then a plug.