What is causing EventStore to throw ConcurrencyException so easily? - nservicebus

Using JOliver EventStore 3.0, and just getting started with simple samples.
I have a simple pub/sub CQRS implementation using NServiceBus. A client sends commands on the bus, a domain server recieves and processes the commands and stores events to the eventstore, which are then published on the bus by the eventstore's dispatcher. a read-model server then subscribes to those events to update the read-model. Nothing fancy, pretty much by-the-book.
It is working, but just in simple tests I am getting lots of concurrency exceptions (intermittantly) on the domain server when the event is stored to the EventStore. It properly retries, but sometimes it hits the 5 retry limit and the command ends up on the error queue.
Where could I start investigating to see what is causing the concurrency exception? I remove the dispatcher and just focus on storing events and it has the same issue.
I'm using RavenDB for persistence of my EventStore. I'm not doing anything fancy, just this:
using (var stream = eventStore.OpenStream(entityId, 0, int.MaxValue))
{
stream.Add(new EventMessage { Body = myEvent });
stream.CommitChanges(Guid.NewGuid());
}
The stack trace for the exception looks like this:
2012-03-17 18:34:01,166 [Worker.14] WARN
NServiceBus.Unicast.UnicastBus [(null)] <(null)> -
EmployeeCommandHandler failed handling message.
EventStore.ConcurrencyException: Exception of type
'EventStore.ConcurrencyException' was thrown. at
EventStore.OptimisticPipelineHook.PreCommit(Commit attempt) in
c:\Code\public\EventStore\src\proj\EventStore.Core\OptimisticPipelineHook.cs:line
55 at EventStore.OptimisticEventStore.Commit(Commit attempt) in
c:\Code\public\EventStore\src\proj\EventStore.Core\OptimisticEventStore.cs:line
90 at EventStore.OptimisticEventStream.PersistChanges(Guid
commitId) in
c:\Code\public\EventStore\src\proj\EventStore.Core\OptimisticEventStream.cs:line
168 at EventStore.OptimisticEventStream.CommitChanges(Guid
commitId) in
c:\Code\public\EventStore\src\proj\EventStore.Core\OptimisticEventStream.cs:line
149 at CQRSTest3.Domain.Extensions.StoreEvent(IStoreEvents
eventStore, Guid entityId, Object evt) in
C:\dev\test\CQRSTest3\CQRSTest3.Domain\Extensions.cs:line 13 at
CQRSTest3.Domain.ComandHandlers.EmployeeCommandHandler.Handle(ChangeEmployeeSalary
message) in
C:\dev\test\CQRSTest3\CQRSTest3.Domain\ComandHandlers\Emplo
yeeCommandHandler.cs:line 55

I figured it out. Had to dig through source code to find it though. I wish this was better documented! Here's my new eventstore wireup:
EventStore = Wireup.Init()
.UsingRavenPersistence("RavenDB")
.ConsistentQueries()
.InitializeStorageEngine()
.Build();
I had to add .ConsistentQueries() in order for the raven persistence provider to internally use WaitForNonStaleResults on the queries eventstore was making to raven.
Basically when I add a new event, and then try to add another before raven has caught up with indexing, the stream revision was not up to date. The second event would step on the first one.

Related

Kafka Streams write an event back to the input topic

in my kafka streams app, I need to re-try processing a message whenever a particular type of exception is thrown in the processing logic.
Rather than wrapping my logic in the RetryTemplate (am using springboot), am considering just simply writing the message back into the input topic, my assumption is that this message will be added to the back of the log in the appropriate partition and it will eventually be re-processed.
Am aware that this would mess up the ordering and am okay with that.
My question is, would kafka streams have an issue when it encounters a message that was supposedly already processed in the past (am assuming kafka streams has a way of marking the messages it has processed especially when exactly is enabled)?
Here is an example of the code am considering for this solution.
val branches = streamsBuilder.stream(inputTopicName)
.mapValues { it -> myServiceObject.executeSomeLogic(it) }
.branch(
{ _, value -> value is successfulResult() }, // success
{ _, error -> error is errorResult() }, // exception was thrown
)
branches[0].to(outputTopicName)
branches[1].to(inputTopicName) //write them back to input as a way of retrying

Kafka streams: groupByKey and reduce not triggering action exactly once when error occurs in stream

I have a simple Kafka streams scenario where I am doing a groupyByKey then reduce and then an action. There could be duplicate events in the source topic hence the groupyByKey and reduce
The action could error and in that case, I need the streams app to reprocess that event. In the example below I'm always throwing an error to demonstrate the point.
It is very important that the action only ever happens once and at least once.
The problem I'm finding is that when the streams app reprocesses the event, the reduce function is being called and as it returns null the action doesn't get recalled.
As only one event is produced to the source topic TOPIC_NAME I would expect the reduce to not have any values and skip down to the mapValues.
val topologyBuilder = StreamsBuilder()
topologyBuilder.stream(
TOPIC_NAME,
Consumed.with(Serdes.String(), EventSerde())
)
.groupByKey(Grouped.with(Serdes.String(), EventSerde()))
.reduce { current, _ ->
println("reduce hit")
null
}
.mapValues { _, v ->
println(Id: "${v.correlationId}")
throw Exception("simulate error")
}
To cause the issue I run the streams app twice. This is the output:
First run
Id: 90e6aefb-8763-4861-8d82-1304a6b5654e
11:10:52.320 [test-app-dcea4eb1-a58f-4a30-905f-46dad446b31e-StreamThread-1] ERROR org.apache.kafka.streams.KafkaStreams - stream-client [test-app-dcea4eb1-a58f-4a30-905f-46dad446b31e] All stream threads have died. The instance will be in error state and should be closed.
Second run
reduce hit
As you can see the .mapValues doesn't get called on the second run even though it errored on the first run causing the streams app to reprocess the same event again.
Is it possible to be able to have a streams app re-process an event with a reduced step where it's treating the event like it's never seen before? - Or is there a better approach to how I'm doing this?
I was missing a property setting for the streams app.
props["processing.guarantee"]= "exactly_once"
By setting this, it will guarantee that any state created from the point of picking up the event will rollback in case of a exception being thrown and the streams app crashing.
The problem was that the streams app would pick up the event again to re-process but the reducer step had state which has persisted. By enabling the exactly_once setting it ensures that the reducer state is also rolled back.
It now successfully re-processes the event as if it had never seen it before

Camunda - Intermedia message event cannot correlate to a single execution

I created a small application (Spring Boot and camunda) to process an order process. The Order-Service receives the new order via Rest and calls the Start Event of the BPMN Order workflow. The order process contains two asynchronous JMS calls (Customer check and Warehouse Stock check). If both checks return the order process should continue.
The Start event is called within a Spring Rest Controller:
ProcessInstance processInstance =
runtimeService.startProcessInstanceByKey("orderService", String.valueOf(order.getId()));
The Send Task (e.g. the customer check) sends the JMS message into a asynchronous queue.
The answer of this service is catched by a another Spring component which then trys to send an intermediate message:
runtimeService.createMessageCorrelation("msgReceiveCheckCustomerCredibility")
.processInstanceBusinessKey(response.getOrder().getBpmnBusinessKey())
.setVariable("resultOrderCheckCustomterCredibility", response)
.correlate();
I deactivated the warehouse service to see if the order process waits for the arrival of the second call, but instead I get this exception:
1115 06:33:08.564 WARN [o.c.b.e.jobexecutor] ENGINE-14006 Exception while executing job 67d2cc24-0769-11ea-933a-d89ef3425300:
org.springframework.messaging.MessageHandlingException: nested exception is org.camunda.bpm.engine.MismatchingMessageCorrelationException: ENGINE-13031 Cannot correlate a message with name 'msgReceiveCheckCustomerCredibility' to a single execution. 4 executions match the correlation keys: CorrelationSet [businessKey=1, processInstanceId=null, processDefinitionId=null, correlationKeys=null, localCorrelationKeys=null, tenantId=null, isTenantIdSet=false]
This is my process. I cannot see a way to post my bpmn file :-(
What can't it not correlate with the message name and the business key? The JMS queues are empty, there are other messages with the same businessKey waiting.
Thanks!
Just to narrow the problem: Do a runtimeService eventSubscription query before you try to correlate and check what subscriptions are actually waiting .. maybe you have a duplicate message name? Maybe you (accidentally) have another instance of the same process running? Once you identified the subscriptions, you could just notify the execution directly without using the correlation builder ...

Suppressing NServicebus Transaction to write errors to database

I'm using NServiceBus to handle some calculation messages. I have a new requirement to handle calculation errors by writing them the same database. I'm using NHibernate as my DAL which auto enlists to the NServiceBus transaction and provides rollback in case of exceptions, which is working really well. However if I write this particular error to the database, it is also rolled back which is a problem.
I knew this would be a problem, but I thought I could just wrap the call in a new transaction with the TransactionScopeOption = Suppress. However the error data is still rolled back. I believe that's because it was using the existing session with has already enlisted in the NServiceBus transaction.
Next I tried opening a new session from the existing SessionFactory within the suppression transaction scope. However the first call to the database to retrieve or save data using this new session blocks and then times out.
InnerException: System.Data.SqlClient.SqlException
Message=Timeout expired. The timeout period elapsed prior to completion of the >operation or the server is not responding.
Finally I tried creating a new SessionFactory using it to open a new session within the suppression transaction scope. However again it blocks and times out.
I feel like I'm missing something obvious here, and would greatly appreciate any suggestions on this probably common task.
As Adam suggests in the comments, in most cases it is preferred to let the entire message fail processing, giving the built-in Retry mechanism a chance to get it right, and eventually going to the error queue. Then another process can monitor the error queue and do any required notification, including logging to a database.
However, there are some use cases where the entire message is not a failure, i.e. on the whole, it "succeeds" (whatever the business-dependent definition of that is) but there is some small part that is in error. For example, a financial calculation in which the processing "succeeds" but some human element of the data is "in error". In this case I would suggest catching that exception and sending a new message which, when processed by another endpoint, will log the information to your database.
I could see another case where you want the entire message to fail, but you want the fact that it was attempted noted somehow. This may be closest to what you are describing. In this case, create a new TransactionScope with TransactionScopeOption = Suppress, and then (again) send a new message inside that scope. That message will be sent whether or not your full message transaction rolls back.
You are correct that your transaction is rolling back because the NHibernate session is opened while the transaction is in force. Trying to open a new session inside the suppressed transaction can cause a problem with locking. That's why, most of the time, sending a new message asynchronously is part of the solution in these cases, but how you do it is dependent upon your specific business requirements.
I know I'm late to the party, but as an alternative suggestion, you coudl simply raise another separate log message, which NSB handles independently, for example:
public void Handle(DebitAccountMessage message)
{
var account = this.dbcontext.GetById(message.Id);
if (account.Balance <= 0)
{
// log request - new handler
this.Bus.Send(new DebitAccountLogMessage
{
originalMessage = message,
account = account,
timeStamp = DateTime.UtcNow
});
// throw error - NSB will handle
throw new DebitException("Not enough funds");
}
}
public void Handle(DebitAccountLogMessage message)
{
var messageString = message.originalMessage.Dump();
var accountString = message.account.Dump(DumpOptions.SuppressSecurityTokens);
this.Logger.Log(message.UniqueId, string.Format("{0}, {1}", messageString, accountString);
}

Nhibernate + Spring.Net + Transaction + too many threads

I am using Sping.Net 1.3.1 and Nhibernate 3.0.
I use Spring's Transaction Interceptor in order to create my Transactions.
I mark my Transactional methods with the Transaction Attribute.
My server gets something like 20 - 25 requests per second, each request is
handled on a new thread, using parallel's Task.
I run a stress test in order to verify my server capability of handling the calls.
when i run only two or three calls ion a time, every thing works great, but
when I run 5 -10 calls simultanly I got an exception from Spring.
The exception is:
Spring.Transaction.TransactionSystemException was unhandled by user code
Message=Could not commit Hibernate transaction
Source=Spring.Data.NHibernate30
StackTrace:
at Spring.Data.NHibernate.HibernateTransactionManager.DoCommit(DefaultTransactionStatus status) in c:\_svn\spring-net\tags\spring-net-1.3.1\src\Spring\Spring.Data.NHibernate\Data\NHibernate\HibernateTransactionManager.cs:line 568
at Spring.Transaction.Support.AbstractPlatformTransactionManager.ProcessCommit(DefaultTransactionStatus status)
InnerException: NHibernate.TransactionException
Message=Transaction not connected, or was disconnected
Source=NHibernate
StackTrace:
at NHibernate.Transaction.AdoTransaction.CheckNotZombied() in d:\CSharp\NH\nhibernate\src\NHibernate\Transaction\AdoTransaction.cs:line 408
at NHibernate.Transaction.AdoTransaction.Commit() in d:\CSharp\NH\nhibernate\src\NHibernate\Transaction\AdoTransaction.cs:line 181
at Spring.Data.NHibernate.HibernateTransactionManager.DoCommit(DefaultTransactionStatus status) in c:\_svn\spring-net\tags\spring-net-1.3.1\src\Spring\Spring.Data.NHibernate\Data\NHibernate\HibernateTransactionManager.cs:line 556
InnerException:
Thank you very much,
Or Chubook.
I am sure you have already discovered this answer by now. When you share a NHibernate session among multiple threads, you will run into this concurrency problem. Each thread must have their own session in scope in order to avoid a transaction disconnect state.