Akka.NET with persistence dropping messages when CPU is under high pressure? - akka.net

I am performance testing my PoC. My actor does not receive all the messages that are sent to it, and throughput is very low. I sent around 150k messages to my app, which drove the processor to 100% utilization. After I stop sending requests, about 2/3 of the messages have not been delivered to the actor. Here are some simple metrics from App Insights:
To confirm this, the number of events persisted in MongoDB is almost the same as the number of messages my actor received.
Secondly, message-processing performance is very disappointing: I get around 300 messages per second.
I know Akka.NET message delivery is at-most-once by default, but I don't get any error saying that messages were dropped.
Here is code:
Cluster shard registration:
services.AddSingleton<ValueCoordinatorProvider>(provider =>
{
var shardRegion = ClusterSharding.Get(_actorSystem).Start(
typeName: "values-actor",
entityProps: _actorSystem.DI().Props<ValueActor>(),
settings: ClusterShardingSettings.Create(_actorSystem),
messageExtractor: new ValueShardMsgRouter());
return () => shardRegion;
});
Controller:
[ApiController]
[Route("api/[controller]")]
public class ValueController : ControllerBase
{
private readonly IActorRef _valueCoordinator;
public ValueController(ValueCoordinatorProvider valueCoordinatorProvider)
{
_valueCoordinator = valueCoordinatorProvider();
}
[HttpPost]
public Task<IActionResult> PostAsync(Message message)
{
_valueCoordinator.Tell(message);
return Task.FromResult((IActionResult)Ok());
}
}
Actor:
public class ValueActor : ReceivePersistentActor
{
public override string PersistenceId { get; }
private decimal _currentValue;
public ValueActor()
{
PersistenceId = Context.Self.Path.Name;
Command<Message>(Handle);
}
private void Handle(Message message)
{
Context.IncrementMessagesReceived();
var accepted = new ValueAccepted(message.ValueId, message.Value);
Persist(accepted, valueAccepted =>
{
_currentValue = valueAccepted.BidValue;
});
}
}
Message router.
public sealed class ValueShardMsgRouter : HashCodeMessageExtractor
{
public const int DefaultShardCount = 1_000_000_000;
public ValueShardMsgRouter() : this(DefaultShardCount)
{
}
public ValueShardMsgRouter(int maxNumberOfShards) : base(maxNumberOfShards)
{
}
public override string EntityId(object message)
{
return message switch
{
IWithValueId valueMsg => valueMsg.ValueId,
_ => null
};
}
}
akka.conf
akka {
stdout-loglevel = ERROR
loglevel = ERROR
actor {
debug {
unhandled = on
}
provider = cluster
serializers {
hyperion = "Akka.Serialization.HyperionSerializer, Akka.Serialization.Hyperion"
}
serialization-bindings {
"System.Object" = hyperion
}
deployment {
/valuesRouter {
router = consistent-hashing-group
routees.paths = ["/values"]
cluster {
enabled = on
}
}
}
}
remote {
dot-netty.tcp {
hostname = "desktop-j45ou76"
port = 5054
}
}
cluster {
seed-nodes = ["akka.tcp://valuessystem@desktop-j45ou76:5054"]
}
persistence {
journal {
plugin = "akka.persistence.journal.mongodb"
mongodb {
class = "Akka.Persistence.MongoDb.Journal.MongoDbJournal, Akka.Persistence.MongoDb"
connection-string = "mongodb://localhost:27017/akkanet"
auto-initialize = off
plugin-dispatcher = "akka.actor.default-dispatcher"
collection = "EventJournal"
metadata-collection = "Metadata"
legacy-serialization = off
}
}
snapshot-store {
plugin = "akka.persistence.snapshot-store.mongodb"
mongodb {
class = "Akka.Persistence.MongoDb.Snapshot.MongoDbSnapshotStore, Akka.Persistence.MongoDb"
connection-string = "mongodb://localhost:27017/akkanet"
auto-initialize = off
plugin-dispatcher = "akka.actor.default-dispatcher"
collection = "SnapshotStore"
legacy-serialization = off
}
}
}
}

So there are two issues going on here: actor performance and missing messages.
It's not clear from your writeup, but I'm going to make an assumption: 100% of these messages are going to a single actor.
Actor Performance
The end-to-end throughput of a single actor depends on:
1. The amount of work it takes to route the message to the actor (i.e. through the sharding system, hierarchy, over the network, etc.);
2. The amount of time it takes the actor to process a single message, as this determines the rate at which a mailbox can be emptied; and
3. Any flow control that affects which messages can be processed when - i.e. if an actor uses stashing and behavior switching, the amount of time an actor spends stashing messages while waiting for its state to change will have a cumulative impact on the end-to-end processing time for all stashed messages.
You will have poor performance due to item 3 on this list. The design that you are implementing calls Persist and blocks the actor from doing any additional processing until the message is successfully persisted. All other messages sent to the actor are stashed internally until the previous one is successfully persisted.
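A rough back-of-the-envelope model of why this caps throughput, sketched in Java purely for illustration. It assumes each Persist call costs one full journal round trip and that nothing overlaps with it; the class name and the latency figures are hypothetical, not measurements from this system:

```java
// Hypothetical model, not Akka.NET code: estimates the throughput ceiling of a
// single actor that waits for each journal write before handling the next message.
final class PersistThroughputModel {
    // With strictly sequential writes, at most one message completes per round trip.
    static double maxMessagesPerSecond(double journalRoundTripMillis) {
        if (journalRoundTripMillis <= 0) {
            throw new IllegalArgumentException("round trip must be positive");
        }
        return 1000.0 / journalRoundTripMillis;
    }

    public static void main(String[] args) {
        // Assumed latencies in milliseconds; substitute values measured against your own journal.
        for (double latency : new double[] {1.0, 4.0, 10.0}) {
            System.out.printf("%.1f ms per write -> at most %.0f msg/s%n",
                    latency, maxMessagesPerSecond(latency));
        }
    }
}
```

Under these assumptions, a journal round trip of a little over 3 ms would be enough to explain a ceiling near 300 messages per second for one actor.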
Akka.Persistence offers four options for persisting messages from the point of view of a single actor:
Persist - highest consistency (no other messages can be processed until persistence is confirmed), lowest performance;
PersistAsync - lower consistency, much higher performance. Doesn't wait for the message to be persisted before processing the next message in the mailbox. Allows multiple messages from a single persistent actor to be processed concurrently in-flight - the order in which those events are persisted will be preserved (because they're sent to the internal Akka.Persistence journal IActorRef in that order) but the actor will continue to process additional messages before the persisted ones are confirmed. This means you probably have to modify your actor's in-memory state before you call PersistAsync and not after the fact.
PersistAll - high consistency, but batches multiple persistent events at once. Same ordering and control flow semantics as Persist - but you're just persisting an array of messages together.
PersistAllAsync - highest performance. Same semantics as PersistAsync but it's an atomic batch of messages in an array being persisted together.
To get an idea of how the performance characteristics of Akka.Persistence change with each of these methods, take a look at the detailed benchmark data the Akka.NET organization has put together around Akka.Persistence.Linq2Db, the new high performance RDBMS Akka.Persistence library: https://github.com/akkadotnet/Akka.Persistence.Linq2Db#performance - it's the difference between 15,000 per second and 250 per second on SQL; the write performance is likely even higher in a system like MongoDB.
One of the key properties of Akka.Persistence is that it intentionally routes all of the persistence commands through a set of centralized "journal" and "snapshot" actors on each node in a cluster - so messages from multiple persistent actors can be batched together across a small number of concurrent database connections. There are many users running hundreds of thousands of persistent actors simultaneously - if each actor had their own unique connection to the database it would melt even the most robustly vertically scaled database instances on Earth. This connection pooling / sharing is why the individual persistent actors rely on flow control.
You'll see similar performance using any persistent actor framework (i.e. Orleans, Service Fabric) because they all employ a similar design for the same reasons Akka.NET does.
To improve your performance, you will need to either batch received messages together and persist them in a group with PersistAll (think of this as de-bouncing) or use asynchronous persistence semantics using PersistAsync.
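The de-bouncing idea can be sketched independently of Akka.NET. The Java helper below is a hypothetical illustration (the class `EventBatcher` and its methods are not part of any Akka API): it buffers events and returns them as one group once a size threshold is reached, which is the point where an actor would issue a single batched write.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Akka.NET: buffers events and hands them back
// as one batch once a size threshold is reached, so the caller can persist the
// whole group with a single batched write.
final class EventBatcher<T> {
    private final int maxBatchSize;
    private final List<T> pending = new ArrayList<>();

    EventBatcher(int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be at least 1");
        }
        this.maxBatchSize = maxBatchSize;
    }

    // Adds an event; returns a full batch when the threshold is hit, otherwise an empty list.
    List<T> add(T event) {
        pending.add(event);
        return pending.size() >= maxBatchSize ? flush() : new ArrayList<>();
    }

    // Returns whatever is buffered and clears the buffer. Calling this from a
    // timer keeps a partially filled batch from waiting indefinitely.
    List<T> flush() {
        List<T> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;
    }
}
```

The trade-off in a design like this is latency for throughput: a larger threshold means fewer journal round trips but a longer wait before any single event is durable.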
You'll also see better aggregate performance if you spread your workload out across many concurrent actors with different entity ids - that way you can benefit from actor concurrency and parallelism.
Missing Messages
There could be any number of reasons why this might occur - most often it's going to be the result of:
Actors being terminated (not the same as restarting) and dumping all of their messages into the DeadLetter collection;
Network disruptions resulting in dropped connections - this can happen when nodes are sitting at 100% CPU - messages that are queued for delivery at the time can be dropped; and
The Akka.Persistence journal receiving timeouts back from the database will result in persistent actors terminating themselves due to loss of consistency.
You should look for the following in your logs:
DeadLetter warnings / counts
OpenCircuitBreakerExceptions coming from Akka.Persistence
You'll usually see both of those appear together - I suspect that's what is happening to your system. The other possibility could be Akka.Remote throwing DisassociationExceptions, which I would also look for.
You can fix the Akka.Remote issues by changing the heartbeat values for the Akka.Cluster failure-detector in configuration https://getakka.net/articles/configuration/akka.cluster.html:
akka.cluster.failure-detector {
# FQCN of the failure detector implementation.
# It must implement akka.remote.FailureDetector and have
# a public constructor with a com.typesafe.config.Config and
# akka.actor.EventStream parameter.
implementation-class = "Akka.Remote.PhiAccrualFailureDetector, Akka.Remote"
# How often keep-alive heartbeat messages should be sent to each connection.
heartbeat-interval = 1 s
# Defines the failure detector threshold.
# A low threshold is prone to generate many wrong suspicions but ensures
# a quick detection in the event of a real crash. Conversely, a high
# threshold generates fewer mistakes but needs more time to detect
# actual crashes.
threshold = 8.0
# Number of the samples of inter-heartbeat arrival times to adaptively
# calculate the failure timeout for connections.
max-sample-size = 1000
# Minimum standard deviation to use for the normal distribution in
# AccrualFailureDetector. Too low standard deviation might result in
# too much sensitivity for sudden, but normal, deviations in heartbeat
# inter arrival times.
min-std-deviation = 100 ms
# Number of potentially lost/delayed heartbeats that will be
# accepted before considering it to be an anomaly.
# This margin is important to be able to survive sudden, occasional,
# pauses in heartbeat arrivals, due to for example garbage collect or
# network drop.
acceptable-heartbeat-pause = 3 s
# Number of member nodes that each member will send heartbeat messages to,
# i.e. each node will be monitored by this number of other nodes.
monitored-by-nr-of-members = 9
# After the heartbeat request has been sent the first failure detection
# will start after this period, even though no heartbeat message has
# been received.
expected-response-after = 1 s
}
If needed, bump the acceptable-heartbeat-pause = 3 s value to something larger, such as 10 s, 20 s, or 30 s.
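As a minimal sketch, an override in your own akka.conf might look like the fragment below. The 10 s value is an assumed starting point, not a recommendation from the Akka.NET documentation; a longer pause also means real node failures take longer to detect.

```hocon
akka.cluster.failure-detector {
  # Assumed starting value; raise it further only if disassociations continue under load.
  acceptable-heartbeat-pause = 10 s
}
```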
Sharding Configuration
One last thing I want to point out with your code: the shard count is way too high. You should have roughly 10 shards per node. Reduce it to something reasonable.
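To make the sizing rule concrete, here is a small Java sketch. It assumes a plain hash-modulo mapping from entity id to shard, which is similar in spirit to a hash-code based extractor but not necessarily the exact hash function Akka.NET uses; `ShardSizing` and its methods are hypothetical names.

```java
// Hypothetical sketch, not Akka.NET code: sizes the shard count from the planned
// cluster size and maps entity ids onto shards with a hash-modulo scheme.
final class ShardSizing {
    // Rule of thumb from the answer: roughly ten shards per node.
    static int shardCount(int plannedNodes) {
        if (plannedNodes < 1) {
            throw new IllegalArgumentException("plannedNodes must be at least 1");
        }
        return plannedNodes * 10;
    }

    // floorMod keeps the result non-negative even when hashCode() is negative.
    static int shardFor(String entityId, int numberOfShards) {
        return Math.floorMod(entityId.hashCode(), numberOfShards);
    }

    public static void main(String[] args) {
        int shards = shardCount(3); // assumed three-node cluster
        System.out.println("shards: " + shards);
        System.out.println("value-42 -> shard " + shardFor("value-42", shards));
    }
}
```

With a shard count in the tens rather than a billion, each shard holds many entities and the shard coordinator has far fewer shards to allocate and rebalance.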

Related

Kafka Parallel Consumer is not splitting work between different processes

I am using Confluent's parallel-consumer in order to achieve fast writes into different data stores. I implemented my code and everything worked just fine locally with Docker.
Once I started several hosts with several consumers (with the same group id) I noticed that only one of the nodes (processes) is really consuming data. The topic I am reading from has 24 partitions, and I have 3 different nodes, I expected that kafka will split the work between them.
Here are parts of my code:
fun buildConsumer(config: KafkaConsumerConfig): KafkaConsumer<String, JsonObject> {
val props = Properties()
props[ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG] = config.kafkaBootstrapServers
props[ConsumerConfig.AUTO_OFFSET_RESET_CONFIG] = "earliest"
props[ConsumerConfig.GROUP_ID_CONFIG] = "myGroup"
// Auto commit must be false in parallel consumer
props[ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG] = false
props[ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG] = StringDeserializer::class.java.name
props[ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG] = JsonObjectDeSerializer::class.java.name
val consumer = KafkaConsumer<String, JsonObject>(props)
return consumer
}
private fun createReactParallelConsumer(): ReactorProcessor<String, JsonObject> {
val options = ParallelConsumerOptions.builder<String, JsonObject>()
.ordering(ParallelConsumerOptions.ProcessingOrder.KEY)
.maxConcurrency(10)
.batchSize(1)
.consumer(buildConsumer(kafkaConsumerConfig))
.build()
return ReactorProcessor(options)
}
And my main code:
pConsumer = createReactParallelConsumer()
pConsumer.subscribe(UniLists.of(kafkaConsumerConfig.kafkaTopic))
pConsumer.react { context ->
batchProcessor.processBatch(context)
}
Would appreciate any advice
We hit an issue that was fixed in version 0.5.2.4: https://github.com/confluentinc/parallel-consumer/issues/409
The parallel consumer kept old unfinished offsets. Because our consumer was slow (for many different reasons), we reached the end of the retention period (with the earliest strategy), so every time we restarted the consumer it scanned all those incompatible offsets, which it never truncated (that was the bug). The fix was simply upgrading from 0.5.2.3 to 0.5.2.4.

Infinispan clustered lock performance does not improve with more nodes?

I have a piece of code that is essentially executing the following with Infinispan in embedded mode, using version 13.0.0 of the -core and -clustered-lock modules:
@Inject
lateinit var lockManager: ClusteredLockManager
private fun getLock(lockName: String): ClusteredLock {
lockManager.defineLock(lockName)
return lockManager.get(lockName)
}
fun createSession(sessionId: String) {
tryLockCounter.increment()
logger.debugf("Trying to start session %s. trying to acquire lock", sessionId)
Future.fromCompletionStage(getLock(sessionId).lock()).map {
acquiredLockCounter.increment()
logger.debugf("Starting session %s. Got lock", sessionId)
}.onFailure {
logger.errorf(it, "Failed to start session %s", sessionId)
}
}
I take this piece of code and deploy it to kubernetes. I then run it in six pods distributed over six nodes in the same region. The code exposes createSession with random Guids through an API. This API is called and creates sessions in chunks of 500, using a k8s service in front of the pods which means the load gets balanced over the pods. I notice that the execution time to acquire a lock grows linearly with the amount of sessions. In the beginning it's around 10ms, when there's about 20_000 sessions it takes about 100ms and the trend continues in a stable fashion.
I then take the same code and run it, but this time with twelve pods on twelve nodes. To my surprise I see that the performance characteristics are almost identical to when I had six pods. I've been digging in to the code but still haven't figured out why this is, I'm wondering if there's a good reason why infinispan here doesn't seem to perform better with more nodes?
For completeness the configuration of the locks are as follows:
val global = GlobalConfigurationBuilder.defaultClusteredBuilder()
global.addModule(ClusteredLockManagerConfigurationBuilder::class.java)
.reliability(Reliability.AVAILABLE)
.numOwner(1)
and looking at the code the clustered locks is using DIST_SYNC which should spread out the load of the cache onto the different nodes.
UPDATE:
The two counters in the code above are simply micrometer counters. It is through them and prometheus that I can see how the lock creation starts to slow down.
It's correctly observed that there's one lock created per session id; this is by design. Our use case is that we want to ensure that a session is running in at least one place. Without going too deep into detail, this can be achieved by ensuring that at least two pods are trying to acquire the same lock. The Infinispan library is great in that it tells us directly when the lock holder dies without any additional chattiness between pods, which means that we have a "cheap" way of ensuring that execution of the session continues when one pod is removed.
After digging deeper into the code I found the following in CacheNotifierImpl in the core library:
private CompletionStage<Void> doNotifyModified(K key, V value, Metadata metadata, V previousValue,
Metadata previousMetadata, boolean pre, InvocationContext ctx, FlagAffectedCommand command) {
if (clusteringDependentLogic.running().commitType(command, ctx, extractSegment(command, key), false).isLocal()
&& (command == null || !command.hasAnyFlag(FlagBitSets.PUT_FOR_STATE_TRANSFER))) {
EventImpl<K, V> e = EventImpl.createEvent(cache.wired(), CACHE_ENTRY_MODIFIED);
boolean isLocalNodePrimaryOwner = isLocalNodePrimaryOwner(key);
Object batchIdentifier = ctx.isInTxScope() ? null : Thread.currentThread();
try {
AggregateCompletionStage<Void> aggregateCompletionStage = null;
for (CacheEntryListenerInvocation<K, V> listener : cacheEntryModifiedListeners) {
// Need a wrapper per invocation since converter could modify the entry in it
configureEvent(listener, e, key, value, metadata, pre, ctx, command, previousValue, previousMetadata);
aggregateCompletionStage = composeStageIfNeeded(aggregateCompletionStage,
listener.invoke(new EventWrapper<>(key, e), isLocalNodePrimaryOwner));
}
The lock library uses a clustered listener on the entry-modified event, and this listener uses a filter so that it is only notified when the key for its lock is modified. It seems to me the core library still has to check this condition on every registered listener, which of course becomes a very big list as the number of sessions grows. I suspect this is the reason, and if it is, it would be really awesome if the core library supported a kind of key filter so that it could use a hashmap for these listeners instead of going through a whole list of all listeners.
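The difference between the two dispatch strategies can be illustrated with a small self-contained Java sketch. This is a hypothetical model, not Infinispan code: listeners are reduced to the key they filter on, and the methods only count how much work each strategy does per modification.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not Infinispan code: compares evaluating every
// registered listener's key filter with looking listeners up by key.
final class ListenerDispatchSketch {
    private final List<String> listenerKeys = new ArrayList<>(); // one entry per listener
    private final Map<String, List<String>> listenersByKey = new HashMap<>();
    int invocations = 0; // how many listeners were actually notified

    void register(String key) {
        listenerKeys.add(key);
        listenersByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(key);
    }

    // Linear dispatch: the filter is evaluated for every listener.
    // Returns the number of filter checks performed.
    int filterChecksLinear(String modifiedKey) {
        int checks = 0;
        for (String key : listenerKeys) {
            checks++;
            if (key.equals(modifiedKey)) {
                invocations++; // only matching listeners are invoked
            }
        }
        return checks;
    }

    // Keyed dispatch: only listeners registered for this key are visited.
    // Returns the number of listeners visited.
    int listenersVisitedKeyed(String modifiedKey) {
        List<String> matching = listenersByKey.getOrDefault(modifiedKey, Collections.emptyList());
        invocations += matching.size();
        return matching.size();
    }
}
```

In this model the linear variant does work proportional to the total number of locks on every modification, which would match the linear growth in acquisition time described above, while the keyed variant stays constant per key.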
I believe you are creating a clustered lock per session id. Is this what you need? What is the acquiredLockCounter? We are about to deprecate the "lock" method in favour of "tryLock" with a timeout, since the lock method will block forever if the clustered lock is never acquired. Do you ever unlock the clustered lock in another piece of code? If you could share a complete reproducer, it would be very helpful for us. Thanks!

Increment negative number displayed for Queue Size field in the Member Clients Table

After enabling subscription conflation in my regions, I saw a growing negative number (-XXXXX) in the Queue Size field of the Member Clients table on the GemFire Pulse website. Is there any reason a negative number would appear in the Queue Size field?
GemFire Version : 9.8.6
Number of Regions : 1
1 Client Application updating regions every 0.5 seconds (Caching Proxy)
1 Client Application reading data from regions (Caching Proxy - Register interest for all keys)
1 Locators and 1 Cache Server in same virtual machine
Queue Size. The size of the queue used by server to send events in case of a subscription enabled client or a client that has continuous queries running on the server. [https://gemfire.docs.pivotal.io/910/geode/developing/events/tune_client_message_tracking_timeout.html].
Additional Discovery
Pulse Website (Negative Number in Queue Size)
JConsole (showClientQueueDetail)
(numVoidRemovals (4486)
@ClientCacheApplication(locators = {
@ClientCacheApplication.Locator(host = "192.168.208.20", port = 10311) }, name = "Reading-Testing", subscriptionEnabled = true)
@EnableEntityDefinedRegions(basePackageClasses = Person.class, clientRegionShortcut = ClientRegionShortcut.CACHING_PROXY, poolName = "SecondPool")
@EnableGemfireRepositories(basePackageClasses = PersonRepository.class)
@EnablePdx
@Import({ GemfireCommonPool.class })
public class PersonDataAccess {
....
}
@Configuration
public class GemfireCommonPool {
@Bean("SecondPool")
public Pool init() {
PoolFactory poolFactory = PoolManager.createFactory();
poolFactory.setPingInterval(8000);
poolFactory.setRetryAttempts(-1);
poolFactory.setMaxConnections(-1);
poolFactory.setReadTimeout(30000);
poolFactory.addLocator("192.168.208.20", 10311);
poolFactory.setSubscriptionEnabled(true);
return poolFactory.create("SecondPool");
}
}
Additional Discovery 2
When I remove the poolName field in @EnableEntityDefinedRegions, the Pulse website no longer displays a negative number for the queue size. However, showClientQueueDetail still displays a negative number for the queue size.
Is it my coding error or a conflation issue?
Thank you so much.

Ridiculously slow simultaneous publish/consume rate with RabbitMQ

I'm evaluating RabbitMQ and while the general impression (of AMQP as such, and also RabbitMQ) is positive, I'm not very impressed by the result.
I'm attempting to publish and consume messages simultaneously and have achieved very poor message rates. I have a durable direct exchange, which is bound to a durable queue and I publish persistent messages to that exchange. The average size of the message body is about 1000 bytes.
My publishing happens roughly as follows:
AMQP.BasicProperties.Builder bldr = new AMQP.BasicProperties.Builder();
ConnectionFactory factory = new ConnectionFactory();
factory.setUsername("guest");
factory.setPassword("guest");
factory.setVirtualHost("/");
factory.setHost("my-host");
factory.setPort(5672);
Connection conn = null;
Channel channel = null;
ObjectMapper mapper = new ObjectMapper(); //com.fasterxml.jackson.databind.ObjectMapper
try {
conn = factory.newConnection();
channel = conn.createChannel();
channel.confirmSelect();
} catch (IOException e) {}
for(Message m : messageList) { //the size of messageList happens to be 9945
try {
channel.basicPublish("exchange", "", bldr.deliveryMode(2).contentType("application/json").build(), mapper.writeValueAsBytes(m));
} catch (Exception e) {}
}
try {
channel.waitForConfirms();
channel.close();
conn.close();
} catch (Exception e1) {}
And consuming messages from the bound queue happens as so:
AMQP.BasicProperties.Builder bldr = new AMQP.BasicProperties.Builder();
ConnectionFactory factory = new ConnectionFactory();
factory.setUsername("guest");
factory.setPassword("guest");
factory.setVirtualHost("/");
factory.setHost("my-host");
factory.setPort(5672);
Connection conn = null;
Channel channel = null;
try {
conn = factory.newConnection();
channel = conn.createChannel();
channel.basicQos(100);
while (true) {
GetResponse r = channel.basicGet("rawDataQueue", false);
if(r!=null)
channel.basicAck(r.getEnvelope().getDeliveryTag(), false);
}
} catch (IOException e) {}
The problem is that when the message publisher (or several of them) and consumer (or several of them) run simultaneously then the publisher(s) appear to run at full throttle and the RabbitMQ management web interface shows a publishing rate of, say, ~2...3K messages per second, but a consumption rate of 0.5...3 per consumer. When the publisher(s) finish then I get a consumption rate of, say, 300...600 messages per consumer. When not setting the QOS prefetch value for the Java client, then a little less, when setting it to 100 or 250, then a bit more.
When experimenting with throttling the consumers somewhat, I have managed to achieve simultaneous numbers like ~400 published and ~50 consumed messages per second which is marginally better but only marginally.
A RabbitMQ blog entry claims that queues are fastest when they're empty, which may well be true, but slowing the consumption rate to a crawl when a few thousand persistent messages are sitting in the queue is still rather unacceptable.
Higher QOS prefetching values may help a bit but are IMHO not a solution as such.
What, if anything, can be done to achieve reasonable throughput rates (2 consumed messages per consumer per second is not reasonable in any circumstance)? This is only a simple one direct exchange - one binding - one queue situation, should I expect more performance degradation with more complicated configurations? When searching around the internet there have also been suggestions to drop durability, but I'm afraid in my case that is not an option. I'd be very happy if somebody would point out that I'm stupid and that there is an evident and straightforward solution of some kind :)
Have you tried with the autoAck option? That should improve your performance. It is much faster than getting the messages one by one and ack'ing them. Increasing the prefetch count should make it even better too.
Also, what is the size of the messages you are sending and consuming including headers? Are you experiencing any flow-control in the broker?
Another question, are you creating a connection and channel every time you send/get a message? If so, that's wrong. You should be creating a connection once, and use a channel per thread (probably in a thread-local fashion) to send and receive messages. You can have multiple channels per connection. There is no official documentation about this, but if you read articles and forums this seems to be the best performance practice.
Last thing, have you considered using the basicConsume instead of basicGet? It should also make it faster.
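A rough model of why polling with basicGet is slow compared with push delivery and a prefetch window, written as a self-contained Java sketch. It is a hypothetical upper-bound calculation, not RabbitMQ client code: it ignores processing time, disk persistence, and broker-side limits, and the round-trip figure is an assumption.

```java
// Hypothetical model, not RabbitMQ client code: upper bounds on consumer throughput
// when every fetch waits for a network round trip versus when the broker may keep
// up to `prefetch` unacknowledged messages in flight.
final class ConsumerThroughputModel {
    // Polling one message at a time: at best one message per round trip.
    static double pollingCeiling(double roundTripMillis) {
        return 1000.0 / roundTripMillis;
    }

    // Push delivery with a prefetch window: at best `prefetch` messages per round trip.
    static double pushCeiling(double roundTripMillis, int prefetch) {
        return prefetch * 1000.0 / roundTripMillis;
    }

    public static void main(String[] args) {
        double roundTripMillis = 2.0; // assumed network round trip to the broker
        System.out.printf("polling: at most %.0f msg/s%n", pollingCeiling(roundTripMillis));
        System.out.printf("push, prefetch 100: at most %.0f msg/s%n",
                pushCeiling(roundTripMillis, 100));
    }
}
```

Real rates with durable queues and persistent messages will sit well below the push ceiling, but the model shows why the per-message round trip dominates when polling.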
Based on my experience, I have been able to run a cluster sending and consuming at rates around 20000 messages per second with non-persistent messages. I guess that if you are using durable and persistent messages the performance would decrease a little, but not 10x.
If sleep is used, the operating system may defer your process to the next time slot, which can cause a significant drop in performance.

How to write a transactional, multi-threaded WCF service consuming MSMQ

I have a WCF service that posts messages to a private, non-transactional MSMQ queue. I have another WCF service (multi-threaded) that processes the MSMQ messages and inserts them in the database.
My issue is with sequencing. I want the messages to be processed in a certain order. For example, MSG-A needs to go to the database before MSG-B is inserted. My current solution for that is very crude and expensive from the database perspective.
I read the message; if it's MSG-B and there is no MSG-A in the database, I throw it back on the message queue, and I keep doing that until MSG-A is inserted in the database. But this is a very expensive operation as it involves a table scan (SELECT statement).
The messages are always posted to the queue in sequence.
Short of making my WCF Queue Processing service Single threaded (By setting the service behavior attribute InstanceContextMode to Single), can someone suggest a better solution?
Thanks
Dan
Instead of immediately pushing messages to the DB after taking them out of the queue, keep a list of pending messages in memory. When you get an A or B, check to see if the matching one is in the list. If so, submit them both (in the right order) to the database, and remove the matching one from the list. Otherwise, just add the new message to that list.
If checking for a match is too expensive a task to serialize - I assume you are multithreading for a reason - then you could have another thread process the list. The existing multiple threads read, immediately submit most messages to the DB, but put the As and Bs aside in the (threadsafe) list. The background thread scavenges through that list finding matching As and Bs and when it finds them it submits them in the right order (and removes them from the list).
The bottom line is - since you're removing items from the queue with multiple threads, you're going to have to serialize somewhere, in order to ensure ordering. The trick is to minimize the number of times and length of time you spend locked up in serial code.
There might also be something you could do at the database level, with triggers or something, to reorder the entries when it detects this situation. I'm afraid I don't know enough about DB programming to help there.
UPDATE: Assuming the messages contain some id that lets you associate a message 'A' with the correct associated message 'B', the following code will make sure A goes in the database before B. Note that it does not make sure they are adjacent records in the database - there could be other messages between A and B. Also, if for some reason you get an A or B without ever receiving the matching message of the other type, this code will leak memory since it hangs onto the unmatched message forever.
(You could extract those two 'lock'ed blocks into a single subroutine, but I'm leaving it like this for clarity with respect to A and B.)
static private object dictionaryLock = new object();
static private Dictionary<int, MyMessage> receivedA =
new Dictionary<int, MyMessage>();
static private Dictionary<int, MyMessage> receivedB =
new Dictionary<int, MyMessage>();
public void MessageHandler(MyMessage message)
{
MyMessage matchingMessage = null;
if (IsA(message))
{
InsertIntoDB(message);
lock (dictionaryLock)
{
if (receivedB.TryGetValue(message.id, out matchingMessage))
{
receivedB.Remove(message.id);
}
else
{
receivedA.Add(message.id, message);
}
}
if (matchingMessage != null)
{
InsertIntoDB(matchingMessage);
}
}
else if (IsB(message))
{
lock (dictionaryLock)
{
if (receivedA.TryGetValue(message.id, out matchingMessage))
{
receivedA.Remove(message.id);
}
else
{
receivedB.Add(message.id, message);
}
}
if (matchingMessage != null)
{
InsertIntoDB(message);
}
}
else
{
// not A or B, do whatever
}
}
If you're the only client of those queues, you could very easily add a timestamp as a message header (see IDesign sample) and save the Sent On field (kinda like an Outlook message) in the database as well. You could then process them in the order they were sent (basically you move the sorting logic to the time of consumption).
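The sort-at-consumption idea can be sketched in a few lines of Java. This is a hypothetical illustration, not WCF or MSMQ code: `TimedMessage` and its `sentOnMillis` field stand in for whatever header the sender attaches.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not WCF/MSMQ code: orders a buffered batch of messages by
// the timestamp the sender attached, so processing follows the original send order.
final class SentOrderSketch {
    static final class TimedMessage {
        final long sentOnMillis; // assumed header value set by the sender
        final String body;

        TimedMessage(long sentOnMillis, String body) {
            this.sentOnMillis = sentOnMillis;
            this.body = body;
        }
    }

    // Returns a new list sorted by send time. The sort is stable, so messages
    // with equal timestamps keep their arrival order.
    static List<TimedMessage> inSendOrder(List<TimedMessage> received) {
        List<TimedMessage> sorted = new ArrayList<>(received);
        sorted.sort(Comparator.comparingLong(m -> m.sentOnMillis));
        return sorted;
    }
}
```

This only orders messages that are already buffered or stored; it does not by itself stop a later message from being processed before an earlier one has arrived.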
Hope this helps,
Adrian