Can you decide which actors take which keys when using consistent hashing? - akka.net

I've experimented a little with the Akka.NET consistent hashing router. It seems to me that although you can specify which key to use for the hashing, it is the router that decides how to allocate the keys across actors.
I would have liked to do something like Actor A takes messages of type A, Actor B takes messages of type B, etc. Is this at all possible with the consistent hashing router?

No, this is not possible with the existing routers.
You can subscribe your actors to particular message types using the EventStream (Context.System.EventStream.Subscribe(Self, typeof(MyMessage));) and publish messages by calling system.EventStream.Publish(new MyMessage());. A published message is then sent to all subscribers of that type. The limitation of this approach is that it works only within the scope of a single ActorSystem.
For distributed publish/subscribe scenarios you can use the Akka.Cluster.Tools plugin, which offers this option. Remember, however, that in this case the subscription key is a string instead of a message type.
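The idea of subscribing by message type can be illustrated with a small language-neutral toy model. This is only a sketch in Python, not the Akka.NET API: ToyEventStream, MessageA and MessageB are hypothetical names, the handlers are plain callables rather than actors, and matching is done on the exact type of the message (an assumption made to keep the sketch short).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class MessageA:
    text: str


@dataclass
class MessageB:
    text: str


class ToyEventStream:
    """In-process stand-in for a type-keyed event bus (not the Akka.NET API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # message type -> handlers

    def subscribe(self, handler, message_type):
        self._subscribers[message_type].append(handler)

    def publish(self, message):
        # Deliver to every handler subscribed to the exact type of the message.
        for handler in self._subscribers[type(message)]:
            handler(message)


received_a, received_b = [], []
stream = ToyEventStream()
stream.subscribe(received_a.append, MessageA)  # "actor A" takes MessageA
stream.subscribe(received_b.append, MessageB)  # "actor B" takes MessageB

stream.publish(MessageA("first"))
stream.publish(MessageB("second"))
stream.publish(MessageA("third"))
```

After the three publishes, received_a holds only the two MessageA instances and received_b only the MessageB instance, which is the "Actor A takes messages of type A" behaviour asked about in the question.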

Related

How to quickly subscribe to a relevant subset of a large set of routing keys?

I have the feeling I am not understanding something fundamental in AMQP/RabbitMQ, since I cannot find much help on this specific detail.
Let's assume I have a system made up of several components sending each other messages via a RabbitMQ broker. The messages can have routing keys of the form XXX.YYY. Let's further assume XXX and YYY are numbers between 000 and 999. That means there are a total of 1,000,000 different possible routing keys.
Now, not every component in my system is interested in every message. Let's say there is a component that wants all the messages in which XXX is between 300 and 500 and YYY is between 600 and 900. That means the component wants to process messages referring to 200*300 = 60,000 different routing keys. Also, the component might be restarted at any point in time and needs to be able to start processing the messages quickly after restart.
Furthermore, the routing keys the component is interested in might change at runtime.
There are several ways to approach this that I can think of:
Use topic exchanges and subscribe to each routing key. If I do this using one connection and one channel, it is awfully slow. My understanding is that bindings are created sequentially for each channel and thus creating 60,000 bindings takes a while. Adding and removing bindings is trivial, though. Would it be feasible to create more channels so that bindings can be created in parallel?
Use topic exchanges and wildcards, and discard the messages you're not interested in on the client. We could subscribe to *.* and receive messages for all 1,000,000 routing keys => much more load on the client. Or subscribe to XXX.* for all 200 relevant values of XXX and receive messages for 200,000 routing keys. Is this a generally applied pattern?
Use headers exchanges and set x-match to any. This feels a little hacky and it seems headers exchanges are not widely used. You also have to deal with the maximum size of the header when defining a binding. Do people do this? You only need a handful of bindings though, so re-creating the bindings after a restart is very fast. Updating the set of topics we're interested in is also not a problem: Just re-create everything.
So, I guess my question is: What's the best practice for subscribing to a large number of topics very quickly (<5s) while still being able to change routing keys dynamically at run-time?
Would it be feasible to split the component which needs the messages and the subscription into two components? One component would only be responsible for keeping the subscriptions up-to-date (this would use exchange-to-exchange bindings) and the other component would receive every message from the downstream exchange.
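To make the trade-off between the first two approaches concrete, here is a rough sketch in plain Python (no broker involved) that counts the bindings each approach needs and shows the client-side filter the wildcard approach requires. The helper names are hypothetical, and the upper bounds of the ranges are treated as exclusive, which is an assumption chosen to reproduce the 200*300 = 60,000 figure from the question.

```python
# Ranges taken from the question; exclusive upper bounds are an assumption
# that reproduces its 200 * 300 = 60,000 figure.
XXX_WANTED = range(300, 500)
YYY_WANTED = range(600, 900)


def exact_bindings():
    """First approach: one binding per routing key of interest."""
    return [f"{x:03d}.{y:03d}" for x in XXX_WANTED for y in YYY_WANTED]


def wildcard_bindings():
    """Second approach: one 'XXX.*' binding per relevant first word."""
    return [f"{x:03d}.*" for x in XXX_WANTED]


def is_wanted(routing_key: str) -> bool:
    """Client-side filter needed with wildcards to drop unwanted YYY values."""
    xxx, yyy = routing_key.split(".")
    return int(xxx) in XXX_WANTED and int(yyy) in YYY_WANTED


bindings_exact = len(exact_bindings())        # bindings to create, first approach
bindings_wildcard = len(wildcard_bindings())  # bindings to create, second approach
keys_delivered = bindings_wildcard * 1000     # routing keys received with XXX.*
print(bindings_exact, bindings_wildcard, keys_delivered)
```

With these numbers the wildcard approach needs 300 times fewer bindings, at the cost of the client receiving and discarding messages for 140,000 routing keys it does not care about.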

What is the difference between correlation id and delivery tag

I've searched for a good explanation for the difference between these two,
but didn't really find one.
What I know till now is that:
correlation id is a string (Guid which was converted to string), and delivery tag is an int.
correlation id is unique for each message, and delivery tag is unique only in
the channel (The channel is the scope).
That's fine, but what is the difference in their purposes? Why do we need two identifiers for a message?
The two identifiers exist at two different conceptual layers of communication, and have different properties that are useful in each case. While a protocol could probably be designed that had one identifier serving both purposes, keeping them separate makes both implementations simpler.
Delivery tags
Part of the AMQP communication layer, built into RabbitMQ itself.
Example use: a consumer process can acknowledge that a message has been processed and can be permanently discarded on the broker (RabbitMQ server).
Automatically assigned within the open channel for every message delivered.
Must be unique within that channel in order for the protocol to function correctly. Does not need to be unique across different channels, so an incrementing integer is enough and is simple to implement.
The same message may be delivered at different times with different delivery tags, or even exist on multiple queues and be delivered simultaneously to different consumers.
Correlation IDs
Part of the logic of the application that is using RabbitMQ, not the broker itself.
Example use: using a matching correlation ID and "reply to" on two separate messages, which the application wants to treat as a request and a response in an RPC pattern.
Needs to be manually added when the message is first created, and is optional.
Not guaranteed to be unique by the protocol, which just sees it as an arbitrary string. It is up to the application to generate it in a way that is sufficiently unlikely to collide for its use case, such as an appropriate form of UUID.
Will stay the same every time a message is delivered, no matter how many times it is forwarded or duplicated into multiple queues.
Correlation ID is generally used in the context of RabbitMQ when I want synchronous-style behavior: a message is sent, and in response the receiver sends a reply to the queue named in the reply-to property, carrying the same correlation ID. The common pattern that uses this in RabbitMQ is the RPC call, which resembles synchronous messaging.
Delivery Tag, however, identifies a delivery of the message on a given channel and generally comes into scope when the acknowledged delivery model is being followed.
The two have completely different purposes, and neither is a message identifier as such.
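The difference can be sketched with a toy model in plain Python. This is not a RabbitMQ client: ToyChannel, Message and Delivery are hypothetical names, and the only behaviour modelled is that each channel numbers its own deliveries while the correlation ID travels with the message.

```python
import uuid
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Message:
    body: str
    correlation_id: str  # set by the application, travels with the message


@dataclass(frozen=True)
class Delivery:
    delivery_tag: int  # assigned by the channel at delivery time
    message: Message


class ToyChannel:
    """Stand-in for a channel: hands out delivery tags starting at 1."""

    def __init__(self):
        self._tags = count(1)
        self.unacked = {}

    def deliver(self, message: Message) -> Delivery:
        delivery = Delivery(next(self._tags), message)
        self.unacked[delivery.delivery_tag] = message
        return delivery

    def ack(self, delivery_tag: int) -> None:
        # An acknowledgement refers to a delivery, not to the message identity.
        del self.unacked[delivery_tag]


request = Message("get price", correlation_id=str(uuid.uuid4()))
channel_one, channel_two = ToyChannel(), ToyChannel()

first = channel_one.deliver(Message("other work", correlation_id=str(uuid.uuid4())))
second = channel_one.deliver(request)       # second delivery on channel one
redelivered = channel_two.deliver(request)  # first delivery on channel two

channel_one.ack(second.delivery_tag)
```

The same request ends up with delivery tag 2 on one channel and 1 on the other, while its correlation ID is identical in both deliveries. The ack only makes sense on the channel that issued the tag.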

What to use: multiple queue names or multiple routing keys and when?

Can anyone explain in which cases I need to create multiple queues (one user -> one queue name), and when one queue name for all clients with different routing keys (one user -> one routing key) and why?
A user should not be able to read messages intended for another user.
I'm using direct exchange type.
First off, I am going to assume that when you say "user" you are referring interchangeably to a consumer or a producer. They aren't the same thing, so I would read up on that here in RabbitMQ's simplest explanation. Walking through that tutorial will definitely help solidify your understanding of rabbit a bit more overall too, which is always good.
In any case, I would recommend doing this:
Create multiple queues, each one linked to a single consumer. The reason for doing this instead of sharing a single queue is discussed here, but if you don't want a bunch of programmer jargon, it pretty much says that a single queue is super slow because only one message can be consumed from it at a time.
Also, there is a built-in "default exchange" that you can use instead of setting up another direct exchange, which it sounds like you're putting effort into and might not need. Obviously I'm not sure what you are doing, but I would take that into consideration. Hope this helps!
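A toy model in plain Python may help show why one queue per user fits the requirement that a user must not read another user's messages. ToyDirectExchange and the routing keys are hypothetical, and queues are modelled as plain lists. The sketch only assumes what a direct exchange does: it copies a message into every queue bound with exactly the published routing key.

```python
from collections import defaultdict


class ToyDirectExchange:
    """Stand-in for a direct exchange: exact routing-key match only."""

    def __init__(self):
        self._bindings = defaultdict(list)  # routing key -> bound queues

    def bind(self, queue: list, routing_key: str) -> None:
        self._bindings[routing_key].append(queue)

    def publish(self, routing_key: str, body: str) -> None:
        for queue in self._bindings.get(routing_key, []):
            queue.append(body)


# One queue per user: each queue only ever holds that user's messages.
exchange = ToyDirectExchange()
alice_queue, bob_queue = [], []
exchange.bind(alice_queue, "user.alice")
exchange.bind(bob_queue, "user.bob")
exchange.publish("user.alice", "hello alice")
exchange.publish("user.bob", "hello bob")

# One shared queue bound with both keys: the messages are mixed together.
shared = ToyDirectExchange()
shared_queue = []
shared.bind(shared_queue, "user.alice")
shared.bind(shared_queue, "user.bob")
shared.publish("user.alice", "hello alice")
shared.publish("user.bob", "hello bob")
```

In the shared case the routing key has already done its job by the time a message sits in the queue, so whichever consumer reads from shared_queue sees messages for both users.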

Routing Key logic

For Orange Live Objects, I want to "filter" the messages coming from a certain "profile" and send them to an MQTT queue.
For messages with another profile, I would like to send them to a different MQTT queue.
It seems that I can use the Routing Key logic for this, although all the examples are based on DevEUI (a single sensor) and not on a sensor type (which makes much more sense, as you would want to decode your messages per sensor type instead of per sensor).
Has anyone already tried whether the Routing Key can select at the "profile" level?
For now it's not possible, but with the new version of Live Objects announced for next July, it will be possible to build groups of LoRa devices and route the corresponding messages to different queues.

How to make topic exchanges expandable

So we will have a topic exchange that looks something like
{class}.{genus}
So we have some consumers that bind with the topic
mammal.*
(or bird.*, etc.)
Now suppose later on we want to include species information so the topic exchange now looks like this:
{class}.{genus}.{species}
Now the old consumers are broken :(
However they could have bound as
mammal.*.#
And been able to listen to whatever future information is added. However, this is something my team came up with on our own which leads me to ask:
Is this good practice?
Are there tradeoffs to this I should be aware of?
Is there an alternate way to have a producer be able to add information without breaking existing consumers, without publishing to multiple exchanges?
Typically, if you need maximum control over queue delivery and want to do the logic in rabbit, then you should consider headers exchanges.
Usually when we code up the publish we know exactly which queue it needs to go to, so whether you want to use a routing key or a boolean to do this might not make much difference depending on your application.
This brings up another design consideration to be aware of: whether you want routing logic in rabbit. Some people prefer to just use simple routing keys and either direct or topic exchanges, focusing on flexible consumers. It's going to be hard to guess at what is best for your application, obviously.
Keep in mind that your consumers will be subscribed, often statically, to the queue(s) that the exchange delivers to. Also, mammal.# is almost the same as mammal.*.#; the only difference is that # can match zero words, so mammal.# also matches the bare key mammal (see: ref)
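The behaviour of the patterns discussed here can be checked with a small matcher. This is a minimal sketch in Python that assumes the usual AMQP 0-9-1 topic semantics (* matches exactly one word, # matches zero or more words). topic_matches is a hypothetical helper, not part of any client library, and the routing keys are made-up examples.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if a topic pattern matches a dot-separated routing key.

    '*' matches exactly one word, '#' matches zero or more words.
    """
    p = pattern.split(".")
    k = routing_key.split(".") if routing_key else []

    def match(i: int, j: int) -> bool:
        if i == len(p):
            return j == len(k)
        if p[i] == "#":
            # '#' either absorbs nothing, or absorbs one word and stays active.
            return match(i + 1, j) or (j < len(k) and match(i, j + 1))
        if j < len(k) and (p[i] == "*" or p[i] == k[j]):
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


old_key = "mammal.felis"        # {class}.{genus}
new_key = "mammal.felis.catus"  # {class}.{genus}.{species}

for pattern in ("mammal.*", "mammal.*.#", "mammal.#"):
    print(pattern, topic_matches(pattern, old_key), topic_matches(pattern, new_key))
```

Under these semantics mammal.* stops matching once the species word is added, while mammal.*.# and mammal.# keep matching. The one key on which the last two patterns disagree is the bare mammal, which only mammal.# matches.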