TTL can be set on queues, so they will expire after a period of time if they are not used. Is there a similar option for exchanges?
I'm trying to build a social application where each exchange represents a user. Each time someone wants to send a message to this user, they send the message to the user's exchange. If the number of users becomes large, say 20 million, there would be 20 million exchanges in the system. I'm afraid that many exchanges would degrade the system. Instead, I want to keep exchanges only for online users.
By the way the messages are only valuable if the user is online and I don't want to store messages for later delivery.
Having a separate exchange for every user would indeed be overkill. Try a different approach.
Use a single direct exchange.
When a client comes online it creates a new exclusive, auto-delete queue and consumes from it.
The client also binds the single exchange to its queue using the name of the user as the routing key.
Producers publish messages to the single exchange with the name of the user as the routing key of the message.
This will automatically
only keep queues for online users and
discard messages for offline users.
Edit: If a user shall be able to use multiple clients, that's possible using the above approach.
Every client creates a new exclusive, auto-delete queue and consumes from it as above.
It binds the single exchange to this queue as above.
Note that it is possible to have multiple bindings from an exchange using identical routing keys. Every client has its own queue and its own binding, even if the routing key on this binding is the same routing key as on another binding created by a different client.
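To make the routing behaviour above concrete, here is a minimal in-memory sketch. It is not a RabbitMQ client: DirectExchange and Client are hypothetical stand-ins that only mimic how a direct exchange copies a message into every queue bound with a matching key and drops it when nothing is bound.

```python
from collections import defaultdict, deque


class DirectExchange:
    """Toy in-memory stand-in for a direct exchange (not a real broker client)."""

    def __init__(self):
        # routing key -> list of queues bound with that key
        self.bindings = defaultdict(list)

    def bind(self, queue, routing_key):
        self.bindings[routing_key].append(queue)

    def unbind(self, queue, routing_key):
        # Compare by identity: two empty deques are equal but are different queues.
        remaining = [q for q in self.bindings[routing_key] if q is not queue]
        if remaining:
            self.bindings[routing_key] = remaining
        else:
            self.bindings.pop(routing_key, None)

    def publish(self, routing_key, body):
        """Copy the message into every queue bound with this key.

        Returns the number of queues reached; 0 means the message was dropped.
        """
        queues = self.bindings.get(routing_key, [])
        for queue in queues:
            queue.append(body)
        return len(queues)


class Client:
    """A connected client: owns one private queue, bound under the user name."""

    def __init__(self, exchange, user):
        self.exchange = exchange
        self.user = user
        self.queue = deque()  # plays the role of the exclusive, auto-delete queue
        exchange.bind(self.queue, user)

    def disconnect(self):
        # Mimics auto-delete: the queue and its binding vanish with the client.
        self.exchange.unbind(self.queue, self.user)


exchange = DirectExchange()
phone = Client(exchange, "alice")
laptop = Client(exchange, "alice")  # second client, same routing key

delivered = exchange.publish("alice", "hi")   # reaches both of alice's queues
dropped = exchange.publish("bob", "hello?")   # bob is offline: nothing bound
```

With a real broker the same roles would be played by the exclusive queue, the binding and the publish call; the sketch only shows why no per-user exchange is needed.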
Related
We are using a microservice architecture to implement Web-APIs in nodejs. Every service exposes HTTP endpoints so the app / website can interact with it. To synchronize the different databases, we are currently using RabbitMQ. A microservice can publish a message on a fanout exchange and every subscribed microservice receives the message.
There are two problems with this architecture.
What if we want to add a second instance of a microservice (for load balancing purposes etc.)? If the second instance subscribed to the same fanout exchange, the messages would be consumed twice.
Either acknowledgments do not work with fanout exchanges, or I'm doing something wrong. When I publish a message on a fanout exchange without subscribers, the message disappears immediately without being acked.
This leads me to my question: is RabbitMQ a good choice for microservice synchronization, or should we change our architecture? Here is a short example of how I would like it to work:
The user creates a new account
The auth-mc inserts the user in its database and publishes a 'user.created' event
a1-mc, a2-mc (same mc, just loadbalanced) and b1-mc are subscribers of the exchange. Either a1 or a2, as well as b1 receive the event and insert the user in their respective database
The event is only removed from each microservice's queue after it has been acknowledged
This way I can be sure, that every microservice (loadbalanced or not) receives the message one time. Can such a pattern even be implemented using RabbitMQ?
EDIT: Also looking for good literature about microservices if there are any suggestions.
Use a topic exchange instead of fanout for this purpose, and bind one named queue per microservice (not per instance) to it. Instances of the same microservice, such as a1 and a2, consume from the same queue, so only one of them receives each message instead of all of them. You can route messages to different consumers based on the routing key. For instance, if three different queues are bound to the exchange with the same routing key, the message is duplicated into each queue, and the consumers of the different microservices read their copy separately and do what they need. A message is not dropped from a queue until it is acknowledged, but it is good practice to publish messages with a TTL. Acknowledgments apply to deliveries from a queue, so a message published to an exchange with no bound queue is simply discarded.
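A small in-memory sketch of that setup may help. It is not RabbitMQ code: Broker is a hypothetical toy, and the queue names service-a and service-b are assumptions. It shows one queue per service shared by its instances, one copy of each event per queue, and removal only on acknowledgment.

```python
from collections import deque


class Broker:
    """Toy in-memory model: one exchange, one named queue per service."""

    def __init__(self):
        self.queues = {}    # queue name -> deque of ready messages
        self.unacked = {}   # delivery tag -> (queue name, message)
        self._next_tag = 0

    def declare_queue(self, name):
        # Idempotent: a second instance declaring the same queue changes nothing.
        self.queues.setdefault(name, deque())

    def publish(self, message):
        # Every bound queue gets its own copy; with no queues it is dropped.
        for queue in self.queues.values():
            queue.append(message)

    def get(self, queue_name):
        """Hand the next ready message to whichever instance asks first."""
        queue = self.queues[queue_name]
        if not queue:
            return None
        self._next_tag += 1
        message = queue.popleft()
        self.unacked[self._next_tag] = (queue_name, message)
        return self._next_tag, message

    def ack(self, tag):
        del self.unacked[tag]  # only now is the message really gone

    def nack(self, tag):
        queue_name, message = self.unacked.pop(tag)
        self.queues[queue_name].appendleft(message)  # requeue for another instance


broker = Broker()
broker.declare_queue("service-a")  # shared by a1-mc and a2-mc
broker.declare_queue("service-a")  # a2 declares the same queue again
broker.declare_queue("service-b")  # used by b1-mc

broker.publish("user.created")
a1 = broker.get("service-a")  # a1 receives the event
a2 = broker.get("service-a")  # a2 finds the queue empty
b1 = broker.get("service-b")  # b1 receives its own copy
```

If a1 fails before acknowledging, calling nack puts the event back so that a2 can pick it up.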
I'm building a basic event based message system for a couple of services.
For my user service, I'm going to use a user topic exchange which will have routing keys like user.event.created, user.event.updated and user.event.deleted.
My logs service will consume user.event.* keys so I can log all events, whereas my email service will only listen for user.event.created as I'll only send out email on creation.
Now say I created a posts service, I want the logs service to consume events from here as well. Is it ok for me to bind both exchanges to the single logs.process queue?
Is there a better way of achieving this?
As long as each of the consumer threads has its own connection, it's fine. So one thread consumes from the topic exchange, another from the direct one, and so on.
As for the better part, I don't know - would require some more details.
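For reference, this is roughly how topic bindings like the ones in the question resolve. The matcher is a plain-Python sketch of the usual topic rules (* matches exactly one word, # matches zero or more words), not broker code. The posts exchange, the post.event.* pattern and the email.process queue are assumptions made up for the example.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if a topic binding pattern matches a routing key."""
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i: int, j: int) -> bool:
        if i == len(p):
            return j == len(k)
        if p[i] == "#":
            # '#' may swallow zero or more of the remaining words.
            return any(match(i + 1, rest) for rest in range(j, len(k) + 1))
        if j == len(k):
            return False
        if p[i] == "*" or p[i] == k[j]:
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


# (exchange, binding pattern, queue). One queue can be bound to several exchanges.
BINDINGS = [
    ("user", "user.event.*", "logs.process"),
    ("user", "user.event.created", "email.process"),  # hypothetical queue name
    ("posts", "post.event.*", "logs.process"),        # hypothetical exchange
]


def route(exchange: str, routing_key: str) -> list:
    """List the queues that would receive a message published to an exchange."""
    return sorted({
        queue
        for bound_exchange, pattern, queue in BINDINGS
        if bound_exchange == exchange and topic_matches(pattern, routing_key)
    })
```

Here route("posts", "post.event.created") and route("user", "user.event.deleted") both end up in logs.process only, which is the single-queue arrangement asked about.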
The underlying use case
It is a typical pubsub use case: consider that we have M news sources and N subscribers who subscribe to the news sources they want and who want to get news updates. However, we want these updates to land in mongodb, essentially maintaining the most recent 'k' updates (which can be indexed, searched, etc.). We want to design for M to scale up to a million publishers and N to scale to a few million.
Subscribers' updates are finally received and stored in more than one hosts and their native mongodbs.
Modeling in rabbitmq
Rabbitmq will be used to persist the mappings (who subscribes to which news source).
I have setup a pubsub system in this way: We create publisher exchanges (each mapping to one news source) and of type 'fanout'.
For modelling subscribers, there are two options.
In the first option, there is one queue for each subscriber, bound to the relevant publisher exchanges. The client process opens connections to all these subscriber queues and receives the updates (and persists them to mongodb). Note that in this option, when the client is restarted, it has to manage the list of all subscribers and open connections to all the subscriber queues it is responsible for.
In the second option, we want to remove the overhead of explicitly opening a connection to each user queue upon startup. Instead, we want to listen to only one queue, representing all subscribers whose updates go to this client host.
To achieve this, we first create one exchange for each subscriber and bind it to the publisher exchange(s) that it follows. We create a single queue for each client and bind the subscriber exchange to this queue (type=direct) if the subscriber belongs to that client.
Once the client receives the update message, it should come to know which subscriber exchange it came from. Only then we can add it to mongodb for relevant subscriber. Presumably the subscriber exchange should add this information as a new header on the message.
As per the rabbitmq docs, I believe there is no way to achieve this (or, more specifically, to get a 'delivery path' property from the delivered message, from which we could get this information).
My questions:
Is it possible to add a new header to message as it passes through exchange?
If this is not possible, then can we achieve it through custom exchange and relevant plugin? Any plugin that I can readily use for this purpose?
I am curious as to why rabbitmq is not providing delivery path property as an optional configuration?
Is there any other way I can achieve the same? (See pubsubhubbub note below)
PubSubHubBub
The use case is very similar to what pubsubhubbub protocol provides for. And there is rabbitmq plugin too called rabbithub. However, our system will be a closed system, and I believe that the webhook approach of the protocol is going to be too much of overhead compared to listening on single queue (and from performance perspective.)
The producer (RMQ Client) of the message should add all the required headers (including the originator's identity) before producing (publishing) it on RMQ. These headers are used for routing.
If, while in transit, the message (including headers) needs to be transformed (e.g. adding new headers), it needs to be sent to the transformer (another RMQ Client). This transformer will essentially become the new publisher.
The actual consumer should receive its intended messages (for which it has subscribed to) through single queue. The routing of all its subscribed messages should be arranged on the RMQ Exchange.
Managing the last 'K' updates should neither be the responsibility of the producer nor the consumer. So, it should be done in the transformer. Producers' messages should be routed to this transformer (for storage) before further re-routing to exchange(s) from where consumers consume.
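A rough sketch of such a transformer, written as plain Python without any broker: the message layout (a dict with headers and body), the header names source and subscriber, and the in-memory deque standing in for the mongodb store are all assumptions for illustration.

```python
from collections import defaultdict, deque

K = 3  # hypothetical number of recent updates to keep per subscriber


def make_message(source_id, body):
    """Producer side: attach the originator's identity before publishing."""
    return {"headers": {"source": source_id}, "body": body}


class Transformer:
    """Consumes producer messages, tags them per subscriber and republishes."""

    def __init__(self, subscriptions, k=K):
        self.subscriptions = subscriptions  # source id -> set of subscriber ids
        # Bounded history per subscriber; a stand-in for the real store.
        self.recent = defaultdict(lambda: deque(maxlen=k))

    def handle(self, message):
        """Return one tagged copy per subscriber of the message's source."""
        source = message["headers"]["source"]
        outgoing = []
        for subscriber in sorted(self.subscriptions.get(source, ())):
            tagged = {
                "headers": {**message["headers"], "subscriber": subscriber},
                "body": message["body"],
            }
            self.recent[subscriber].append(tagged)  # oldest entry falls out at k
            outgoing.append(tagged)
        return outgoing  # these would be republished towards the client queues
```

Because the transformer is the new publisher, the consumer can read the subscriber header directly instead of needing a delivery path from the broker.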
I have some Camel routes with MINA sockets and Jetty websockets. I am able to broadcast a message to all the clients connected to the websocket, but how do I send a message to a specific endpoint? How do I maintain a list of all connected clients, with a client ID as reference, so I can route to a specific client? Is that possible? Will I be able to specify a dynamic client in the to URI?
Or maybe I am thinking about this wrong and I need to create topics on ActiveMQ and have the clients subscribe to them. Would that mean that I create a topic for every websocket client and route the message to the right topic?
Am I at least on the right track here? Are there any examples you can point to? Google was not helpful.
The approach you take depends on how sensitive the client information is. The downside of a single topic with selectors is that anyone can subscribe to the topic without a selector and see all the information for everyone - not usually something that you want to do.
A better scheme is to use a message distribution mechanism (a set of Camel routes) that acts as an intermediary between the websocket clients and the system producing the messages. This mechanism is responsible for distributing messages from a single destination to client-specific destinations. I have worked on a couple of banking web front-ends that used a similar scheme.
In order for this to work you first generate for each user a distinct token/UUID; this is presented to the user when the session is established (usually through some sort of profile query/message).
It's essential that the UUID can be worked out as a hash of the clientId rather than being stored in a DB, as it will be used all the time and you want to make sure this is worked out quickly.
The user then uses that information to connect to specific topics that use that UUID as a suffix. For example two users subscribing to an orderConfirmation topic would each subscribe to their own version of that topic:
clientA -> orderConfirmation.71jqsd87162iuhw78162wd7168
clientB -> orderConfirmation.76232hdwe7r23j92irjh291e0d
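One way to derive such a token without a DB lookup is a keyed hash of the clientId. This is only a sketch: using HMAC-SHA256 and a server-side SECRET_KEY is my assumption (a plain unkeyed hash could be guessed by anyone who knows a clientId), and the function names are made up.

```python
import hashlib
import hmac

# Hypothetical secret known only to the distribution mechanism.
SECRET_KEY = b"change-me"


def client_token(client_id: str) -> str:
    """Derive a stable, hard-to-guess token from a client id."""
    digest = hmac.new(SECRET_KEY, client_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def client_topic(base_topic: str, client_id: str) -> str:
    """Build the client-specific topic name, e.g. orderConfirmation.<token>."""
    return f"{base_topic}.{client_token(client_id)}"
```

The same call gives the same topic name every time, so both the session setup and the distribution routes can compute it on the fly.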
To keep track of "presence", your clients would need to periodically send a heartbeat message containing their clientId to a well-known topic that your distribution mechanism listens on. Clients should not be able to subscribe to this topic for reads (see ActiveMQ Security). The message distribution mechanism needs to keep in memory a data structure that contains the clientId and the time a heartbeat was last seen.
When a message is received by the distribution mechanism, it checks whether the clientID for which it received the message has a "live/present" session, determines the UUID for the client, and broadcasts the message on the appropriate topic.
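The presence bookkeeping could look something like the sketch below. The 30 second timeout, the class and function names, and the topic_for callback (whatever maps a clientId to its topic) are assumptions; the clock is injectable only to make the behaviour easy to check.

```python
import time


class PresenceTracker:
    """Remembers when each client last sent a heartbeat."""

    def __init__(self, timeout_seconds=30.0, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self.last_seen = {}  # clientId -> time of last heartbeat

    def heartbeat(self, client_id):
        self.last_seen[client_id] = self.clock()

    def is_present(self, client_id):
        seen = self.last_seen.get(client_id)
        return seen is not None and self.clock() - seen <= self.timeout

    def expire(self):
        """Forget clients whose heartbeat is too old; return their ids."""
        now = self.clock()
        gone = [c for c, seen in self.last_seen.items() if now - seen > self.timeout]
        for client_id in gone:
            del self.last_seen[client_id]
        return gone


def distribute(tracker, client_id, message, topic_for):
    """Return (topic, message) for a present client, or None to discard."""
    if not tracker.is_present(client_id):
        return None
    return topic_for(client_id), message
```

In the real routes the returned pair would be sent to the broker; here it is just returned so the decision is visible.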
Over time this will create a large number of topics on your broker that you don't want hanging around when the user has gone away. You can configure ActiveMQ to delete these if they have been inactive for some time.
You definitely do not want to create a separate endpoint for each client.
Topic and a subscription with selector is an elegant way to resolve it.
I would say the best one.
You need a single topic, which every client subscribes to with a selector that looks like clientId in ('${myClientId}', 'EVERYONE'). Now when you want to publish a message to a specific client, you set the clientId property to the id of this client. If you want to broadcast, you set it to 'EVERYONE'.
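As a sketch of what that selector does, here is the string a client would subscribe with, plus a plain-Python equivalent of the filtering the broker would apply. The function names are made up, and accepts only imitates the selector; it is not broker code.

```python
BROADCAST = "EVERYONE"


def selector_for(client_id: str) -> str:
    """Build the selector string a client would subscribe with."""
    escaped = client_id.replace("'", "''")  # double any quote inside the literal
    return f"clientId IN ('{escaped}', '{BROADCAST}')"


def accepts(subscriber_id: str, message_properties: dict) -> bool:
    """Imitate the selector: keep messages for this client or for everyone."""
    return message_properties.get("clientId") in (subscriber_id, BROADCAST)
```

A message without a clientId property matches no subscriber, so the producer must always set it.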
I hope I understand the problem right...
I have to implement this scenario:
An external application publishes a message to RabbitMQ.
This message has a client_id property. We can place this ID in the routing key, a message header, or some other property.
I have to implement sharding in the exchange routing logic: the message should be delivered to a specific queue based on the client_id range.
Is it possible to implement this with the standard exchanges?
If not, which exchange should I take as the base?
How can I change the client_id ranges dynamically?
Take a look at the rabbitmq plugin. It's included in the RabbitMQ distribution from v3.6.0 onwards.
Just have your producer put enough info into the routing key that causes the message to go into the right queue on the other side of the Exchange.
So for example, create two queues called 1 and 2 and bind them with routing keys matching the names. Then have your producer decide which routing key to use when producing the event message. Customers with names starting with letters a-m go to 1, n-z go to 2, you get the idea. It pushes the sharding to the producer but that might be OK for your application.
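On the producer side that decision can be a tiny function. This is a sketch of the a-m / n-z example only; rejecting names that start with anything else is my own choice, since the answer does not say what should happen to them.

```python
def routing_key_for(customer_name: str) -> str:
    """Pick the routing key (and thus the queue) from the first letter."""
    first = customer_name.strip()[:1].lower()
    if "a" <= first <= "m":
        return "1"
    if "n" <= first <= "z":
        return "2"
    # Assumption: names outside a-z are treated as an error.
    raise ValueError(f"no shard defined for customer name {customer_name!r}")
```

The producer would then publish with routing_key_for(name) as the routing key, and the two bindings do the rest.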
AMQP doesn't have any explicit implementation of sharding, but its architecture should help you to do that.
Spreading messages across several queues is just a routing problem in RabbitMQ (and part of the AMQP specification). With routing, you can attach heterogeneous consumers to handle specific messages routed via the same exchange. The producer should therefore publish with a specific key so that the message is consumed by a specific queue/consumer.
You can decide on static sharding: perhaps you have 10 queues with one consumer per queue. You could implement a simple hashing function such that the key is CLIENT_ID % 10 (note that plain modulo is not consistent hashing: changing the number of queues remaps most clients).
Other, non-static solutions could also be proposed, and you can try to build them on top of this architecture.
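Both variants can be sketched on the producer side in a few lines. The queue naming (shard-N) and the RangeSharder class are hypothetical; the range table is kept in memory here, so changing ranges dynamically would additionally require sharing it between producers somehow.

```python
import bisect

NUM_QUEUES = 10


def modulo_shard(client_id: int) -> str:
    """Static sharding: routing key derived from CLIENT_ID % 10."""
    return f"shard-{client_id % NUM_QUEUES}"


class RangeSharder:
    """Range-based sharding with boundaries that can be changed at runtime."""

    def __init__(self, upper_bounds, queues):
        # upper_bounds[i] is the exclusive upper client id for queues[i];
        # the last queue takes every id at or above the last bound.
        if len(queues) != len(upper_bounds) + 1:
            raise ValueError("need exactly one more queue than boundaries")
        if list(upper_bounds) != sorted(set(upper_bounds)):
            raise ValueError("boundaries must be strictly increasing")
        self.upper_bounds = list(upper_bounds)
        self.queues = list(queues)

    def queue_for(self, client_id: int) -> str:
        return self.queues[bisect.bisect_right(self.upper_bounds, client_id)]

    def split(self, boundary: int, new_queue: str) -> None:
        """Add a boundary; ids from it up to the next bound move to new_queue."""
        if boundary in self.upper_bounds:
            raise ValueError("boundary already exists")
        index = bisect.bisect_right(self.upper_bounds, boundary)
        self.upper_bounds.insert(index, boundary)
        self.queues.insert(index + 1, new_queue)
```

For example, RangeSharder([1000, 2000], ["q0", "q1", "q2"]) sends ids below 1000 to q0, and a later split(1500, "q1b") moves ids 1500 to 1999 from q1 to q1b without touching the other ranges.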