Akka.NET cluster sharding: Unable to register coordinator

I am trying to set up Akka.NET cluster sharding by creating a simple project.
Project layout:
Actors - a class library that defines one actor and its message. It is referenced by the other projects.
Inbound - starts the ShardRegion and is the only node participating in cluster sharding; it should also be the one hosting the coordinator.
MessageProducer - hosts only a ShardRegion proxy, used to send messages to the ProcessorActor.
Lighthouse - the seed node.
The uploaded images show that the coordinator singleton is not initialized and that messages sent through the ShardRegion proxy are not delivered.
Based on the Petabridge blog post petabridge.com/blog/cluster-sharding-technical-overview-akkadotnet/, I have excluded Lighthouse from participating in cluster sharding (by setting akka.cluster.sharding.role) so that the coordinator is not created on it.
I am not sure what I am missing to get this to work.

This was already answered on Gitter, but here's the tl;dr:
The shard region proxy needs to share the same role as the corresponding shard region. Otherwise the proxy may not be able to find the shard coordinator, and therefore cannot resolve the initial location of the shard it wants to send a message to.
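For illustration, a minimal sketch of starting the region and the proxy with a matching role (the role, type name, and extractor name are hypothetical):

    using Akka.Actor;
    using Akka.Cluster.Sharding;

    // Both nodes declare the same role in their HOCON config, e.g.:
    //   akka.cluster.roles = ["sharding"]
    //   akka.cluster.sharding.role = "sharding"
    var sharding = ClusterSharding.Get(system);

    // On the Inbound node, which hosts the entities:
    var region = sharding.Start(
        typeName: "processor",
        entityProps: Props.Create<ProcessorActor>(),
        settings: ClusterShardingSettings.Create(system).WithRole("sharding"),
        messageExtractor: new ProcessorMessageExtractor());

    // On the MessageProducer node: the proxy must name the SAME role,
    // otherwise it cannot locate the shard coordinator.
    var proxy = sharding.StartProxy(
        typeName: "processor",
        role: "sharding",
        messageExtractor: new ProcessorMessageExtractor());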
The IMessageExtractor.GetMessage method is used to extract the actual message that will be sent to the sharded actor. In the example, the message extractor extracted a string property from the enveloping message, yet the receiving actor had its Receive handler set up for the envelope, not for a string.
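A sketch of that pitfall, assuming the current IMessageExtractor shape (where EntityMessage plays the unwrapping role referred to as GetMessage above) and a hypothetical Envelope type:

    using System;
    using Akka.Cluster.Sharding;

    // Hypothetical envelope wrapping the payload with an entity id.
    public sealed class Envelope
    {
        public Envelope(string entityId, string payload)
        {
            EntityId = entityId;
            Payload = payload;
        }

        public string EntityId { get; }
        public string Payload { get; }
    }

    public sealed class ProcessorMessageExtractor : IMessageExtractor
    {
        public string EntityId(object message) => ((Envelope)message).EntityId;

        // Whatever is returned here is what the entity actor receives.
        // Returning the string payload means the actor needs Receive<string>();
        // returning the envelope unchanged means it needs Receive<Envelope>().
        public object EntityMessage(object message) => ((Envelope)message).Payload;

        public string ShardId(object message) =>
            // Simplistic: spread entities over 10 shards.
            (Math.Abs(((Envelope)message).EntityId.GetHashCode()) % 10).ToString();
    }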

Related

How Akka.NET handles system faults during message processing

Suppose that one of the cluster nodes received a message and one of its actors started to process it. Somewhere in the middle, the node died for some reason. What will happen to the message: will it be processed by another available node, or will it be lost?
By default Akka (like every other actor model framework) offers at-most-once delivery. This means that messages are sent to actors with best-effort guarantees: if they don't reach the target, they won't be redelivered. It also means that if a message reached the target but the processing associated with it was interrupted before finishing, it won't be retried.
That being said, there are numerous ways to add redelivery between actors, with various guarantees.
The simplest and least reliable is to use the Ask pattern in combination with e.g. the Polly library. This, however, won't help if the node on which the sender lives dies, simply because the messages are still stored only in memory.
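A minimal sketch of that pattern, assuming a hypothetical Ack reply type and using Polly for the retries:

    using System;
    using System.Threading.Tasks;
    using Akka.Actor;
    using Polly;

    // Hypothetical reply confirming the receiver finished processing.
    public sealed class Ack { }

    public static class ReliableSend
    {
        // Retries exist only in this process's memory: if the sender's
        // node dies, the retry state dies with it.
        public static Task<Ack> SendWithRetries(IActorRef target, object message) =>
            Policy
                .Handle<AskTimeoutException>()
                .WaitAndRetryAsync(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)))
                .ExecuteAsync(() => target.Ask<Ack>(message, TimeSpan.FromSeconds(5)));
    }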
The more reliable pattern is to put some event log/queue in front of your cluster (e.g. Azure Service Bus, RabbitMQ, or Kafka). In this approach clients send requests via the bus/queue, while the first actor in the processing pipeline is responsible for picking them up. If some actor or node in the pipeline dies, the whole pipeline for that message is retried.
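A sketch of that approach with the .NET RabbitMQ client, reusing the hypothetical Ack reply from above (the queue name and timeout are made up):

    using System;
    using System.Text;
    using Akka.Actor;
    using RabbitMQ.Client;
    using RabbitMQ.Client.Events;

    public static class QueueFrontedPipeline
    {
        // entryActor is the first actor in the processing pipeline.
        public static void Consume(IModel channel, IActorRef entryActor)
        {
            var consumer = new EventingBasicConsumer(channel);
            consumer.Received += async (_, ea) =>
            {
                var text = Encoding.UTF8.GetString(ea.Body.ToArray());
                try
                {
                    // Ack only after the pipeline confirms completion; if
                    // this node dies mid-processing, the broker redelivers
                    // the message to another consumer.
                    await entryActor.Ask<Ack>(text, TimeSpan.FromSeconds(30));
                    channel.BasicAck(ea.DeliveryTag, multiple: false);
                }
                catch (AskTimeoutException)
                {
                    // Requeue so the broker can redeliver.
                    channel.BasicNack(ea.DeliveryTag, multiple: false, requeue: true);
                }
            };
            channel.BasicConsume(queue: "work", autoAck: false, consumer: consumer);
        }
    }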
Another idea is to use the at-least-once delivery found in the Akka.Persistence module. It allows you to use the event-sourcing capabilities of persistent actors to persist messages. However, IMO it requires a bit of experience with Akka.
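Roughly, an at-least-once sender might look like this (the message and event types are hypothetical; the receiver is expected to reply with Confirm):

    using Akka.Actor;
    using Akka.Persistence;

    // Hypothetical messages and events.
    public sealed record Work(long DeliveryId, string Text);
    public sealed record Confirm(long DeliveryId);
    public sealed record MsgSent(string Text);
    public sealed record MsgConfirmed(long DeliveryId);

    public class ReliableSender : AtLeastOnceDeliveryReceiveActor
    {
        private readonly ActorPath _destination;

        public override string PersistenceId => "reliable-sender-1";

        public ReliableSender(ActorPath destination)
        {
            _destination = destination;

            // Persist the intent first, then deliver; Deliver keeps
            // redelivering until ConfirmDelivery is called.
            Command<string>(text => Persist(new MsgSent(text), e =>
                Deliver(_destination, deliveryId => new Work(deliveryId, e.Text))));

            Command<Confirm>(c => Persist(new MsgConfirmed(c.DeliveryId), e =>
                ConfirmDelivery(e.DeliveryId)));

            // On recovery, unconfirmed deliveries resume automatically
            // once the journal has been replayed.
            Recover<MsgSent>(e => Deliver(_destination, id => new Work(id, e.Text)));
            Recover<MsgConfirmed>(e => ConfirmDelivery(e.DeliveryId));
        }
    }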
All of these approaches offer at-least-once delivery guarantees, which means the same message may be sent to its destination more than once. This also means that your processing logic needs to account for it, either through idempotent behavior or by recognizing and removing duplicates on the receiver side.
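A receiver-side deduplication sketch, continuing the hypothetical types from the previous example:

    using System.Collections.Generic;
    using Akka.Actor;

    public class DeduplicatingReceiver : ReceiveActor
    {
        // For illustration only: in production this set must be bounded
        // (e.g. time-windowed) or persisted, or it will grow forever.
        private readonly HashSet<long> _processed = new HashSet<long>();

        public DeduplicatingReceiver()
        {
            Receive<Work>(work =>
            {
                if (_processed.Add(work.DeliveryId))
                {
                    // First time we see this id: do the real work here.
                }
                // Confirm in both cases so the sender stops redelivering.
                Sender.Tell(new Confirm(work.DeliveryId));
            });
        }
    }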

How to successfully set up a simple cluster singleton in Akka.NET

I was running into a problem attempting to set up a cluster singleton within an Akka.NET cluster, where more than one instance of the singleton was starting up and running within my cluster. The cluster consists of Lighthouse (the seed node) and some number of instances of the main cluster node, which hosts the cluster shards as well as this singleton.
In order to reproduce the problem I set up an example solution on GitHub, but unfortunately I'm having a different problem there, as I always get "Singleton not available" messages and my singleton never receives a message. This is sort of the opposite of the problem I was getting originally, but nonetheless I would like to sort out a working example of a cluster singleton.
    [DEBUG][8/22/2016 3:06:18 PM][Thread 0015][[akka://singletontest/user/my-singleton-proxy#1237572454]] Singleton not available, buffering message type [System.String]
In the Lighthouse process I see the following messages:
    Akka.Remote.EndpointWriter: Dropping message [Akka.Actor.ActorSelectionMessage] for non-local recipient [[akka.tcp://sync@127.0.0.1:4053/]] arriving at [akka.tcp://sync@127.0.0.1:4053] inbound addresses [akka.tcp://singletontest@127.0.0.1:4053]
Potentially related: https://github.com/akkadotnet/akka.net/issues/1960
It appears that the only bit missing was that the actor system name specified in the actor path for my seed node did not match the actor system name used in both the Lighthouse and Cluster Node processes. After ensuring that it matches in all three places, the cluster now behaves as expected.
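A minimal sketch of the fix (addresses taken from the logs above): the name passed to ActorSystem.Create must match the system name inside every seed-node address, on Lighthouse and on the cluster nodes alike.

    using Akka.Actor;
    using Akka.Configuration;

    var config = ConfigurationFactory.ParseString(@"
        akka.cluster.seed-nodes = [""akka.tcp://sync@127.0.0.1:4053""]
    ");

    // Creating the system as "singletontest" while the seed-node path
    // says "sync" reproduces the dropped-message errors above.
    var system = ActorSystem.Create("sync", config);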
https://github.com/jpierson/x-akka-cluster-singleton/commit/77ae63209042841c144f69d4cd70e9925b68a79a
Special thanks to Chris G. Stevens for his assistance.

Retrieve name of RabbitMQ Queue where message was consumed from

I am using a SimpleMessageListenerContainer that is attached to multiple queues and configured with a ChannelAwareMessageListener. Is it possible to determine which queue a message was consumed from, in particular if the message was routed to the queue from an exchange?
It looks like if a message is sent directly to a queue, MessageProperties#getReceivedRoutingKey will contain the queue name, but if the message is routed to the queue via an exchange, it contains the routing key that was used.
I'm looking for a mechanism that would extract this information correctly regardless of how the message was delivered to the queue, or a mechanism to enrich the message with a header containing this information on the RabbitMQ side.
I had a similar issue where I wanted to add the queue name to the slf4j MDC context.
The only solution I have found is to subclass SimpleMessageListenerContainer and set a ThreadLocal variable holding the queue name, or in my case the MDC context (which is basically ThreadLocals).
Because SimpleMessageListenerContainer still doesn't know exactly which queue a message came from (you can bind multiple queues to a container), you will have to allow only a single queue per container, which in my opinion is what you should do regardless.
In my company's own code base we have a magical SimpleMessageListenerContainerFactory that creates custom SimpleMessageListenerContainers based on routing annotations (think Spring MVC's @RequestMapping for AMQP). If there is interest, perhaps we can expedite open-sourcing it.

How to get delivery path in rabbitmq to become message property?

The underlying use case
It is a typical pub/sub use case: consider that we have M news sources, and N subscribers who subscribe to the desired news sources and want to get news updates. However, we want these updates to land in MongoDB, essentially maintaining the most recent k updates (which can be indexed, searched, etc.). We want to design for M to scale up to a million publishers and N to scale to a few million subscribers.
Subscribers' updates are ultimately received and stored on more than one host, each with its own local MongoDB.
Modeling in RabbitMQ
RabbitMQ will be used to persist the mappings (who subscribes to which news source).
I have set up a pub/sub system this way: we create one publisher exchange per news source, of type 'fanout'.
For modelling subscribers, there are two options.
In the first option, there is one queue per subscriber, bound to the relevant publisher exchanges, and the client process opens connections to all of these subscriber queues and receives the updates (persisting them to MongoDB). Note that in this option, when the client is restarted, it has to manage the list of all subscribers and open connections to all the subscriber queues it is responsible for.
In the second option, we want to remove the overhead of having to explicitly connect to each user queue on startup. Instead, we want to listen on only one queue, representing all subscribers who will send updates to this client host.
To achieve this, we first create one exchange per subscriber and bind it to the publisher exchange(s) that the subscriber follows. We keep a single queue per client, and bind the subscriber exchange to this queue (type=direct) if the subscriber belongs to that client.
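For concreteness, the topology described above could be declared roughly like this with the .NET RabbitMQ client (all names are hypothetical; the type=direct variant mentioned above works the same way with a routing key):

    using RabbitMQ.Client;

    // One fanout exchange per news source, one exchange per subscriber,
    // one queue per client host; channel is an open IModel.
    channel.ExchangeDeclare("news.reuters", ExchangeType.Fanout, durable: true);
    channel.ExchangeDeclare("subscriber.alice", ExchangeType.Fanout, durable: true);
    channel.QueueDeclare("client.host-1", durable: true, exclusive: false, autoDelete: false);

    // alice follows reuters: an exchange-to-exchange binding.
    channel.ExchangeBind(destination: "subscriber.alice", source: "news.reuters", routingKey: "");

    // alice's updates land in the queue of the client host serving her.
    channel.QueueBind(queue: "client.host-1", exchange: "subscriber.alice", routingKey: "");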
Once the client receives an update message, it needs to know which subscriber exchange it came from; only then can it store the update in MongoDB for the relevant subscriber. Presumably the subscriber exchange should add this information as a new header on the message.
As per the RabbitMQ docs, I believe there is no way to achieve this (or, more specifically, no way to get a 'delivery path' property from the delivered message, from which we could derive this information).
My questions:
Is it possible to add a new header to a message as it passes through an exchange?
If this is not possible, can we achieve it with a custom exchange and a suitable plugin? Is there any plugin I can readily use for this purpose?
I am curious as to why RabbitMQ does not provide the delivery path as an optional message property.
Is there any other way I can achieve the same? (See the PubSubHubbub note below.)
PubSubHubBub
The use case is very similar to what the PubSubHubbub protocol provides for, and there is a RabbitMQ plugin for it called rabbithub. However, our system will be a closed system, and I believe the protocol's webhook approach would be too much overhead compared to listening on a single queue (also from a performance perspective).
The producer (an RMQ client) should add all the required headers (including the originator's identity) before publishing the message to RabbitMQ. These headers are used for routing.
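A hedged sketch with the .NET RabbitMQ client (the header name and exchange are hypothetical):

    using System.Collections.Generic;
    using System.Text;
    using RabbitMQ.Client;

    // channel is an open IModel.
    var props = channel.CreateBasicProperties();
    props.Persistent = true;
    props.Headers = new Dictionary<string, object>
    {
        // Exchanges won't add this in transit, so the producer stamps
        // the originator's identity itself.
        ["x-originator"] = "news.reuters"
    };

    channel.BasicPublish(
        exchange: "news.reuters",
        routingKey: "",
        basicProperties: props,
        body: Encoding.UTF8.GetBytes("breaking news ..."));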
If, while in transit, the message (including its headers) needs to be transformed (e.g. by adding new headers), it needs to be sent to a transformer (another RMQ client). This transformer effectively becomes the new publisher.
The actual consumer should receive its intended messages (those it has subscribed to) through a single queue. The routing of all its subscribed messages should be arranged on the RabbitMQ exchanges.
Managing the last k updates should be the responsibility of neither the producer nor the consumer, so it should be done in the transformer. Producers' messages should be routed to this transformer (for storage) before being re-routed to the exchange(s) from which consumers consume.

JMS message received at only one server

I'm having a problem with a JEE6 application running in a clustered WebSphere Application Server 8 environment.
A search index (built with Lucene) is used for quick search in the UI; it must be re-indexed after new data arrives in the corresponding DB layer. To achieve this we send a JMS message to the application, upon which the search index is refreshed.
The problem is that the message only arrives at one of the cluster members, so only that member's search index is up to date; on the other servers it remains outdated.
How can I achieve that the search index gets updated at all cluster members?
Can I receive the message somehow on all servers?
Or is there a better way to do this?
I found a possible solution:
Generally, a JMS message delivered via a queue goes to only one of the cluster members. A possible way to get the information to all of the cluster members is an EJB timer: creating a non-persistent timer should invoke the callback method on all of the cluster members, which is a convenient way to recreate the local search index on each of them.
It is important that it be a non-persistent EJB timer, because persistent timers are synchronized across the cluster and are executed on only one of the cluster members.