Pause a sub-process in BPMN

I've recently started at a new business, and some of the processes are becoming a bit of a challenge to map out. Quite frequently we have a process that needs to go "on hold" when an event, which can occur at any point, is triggered. The problem I'm having is how to map out the "restart" of the process from where it left off, since it can effectively pause/unpause at any point.
Here's what I currently have:
Process Example
Basically, I need "Something Happened 2" to not fully interrupt the sub-process; it just needs to put it on "hold". The actual situation is essentially that a customer can make a complaint while we handle their overdue bill, so we put the process on hold wherever it was until we resolve the complaint, and then restart the process.
I'm not entirely sure of the best approach to documenting this and couldn't find anything clear in the documentation, since a non-interrupting event seems to let the rest of the process continue forward in parallel.
Any help would be majorly appreciated.

If you really want to restart the whole sub-process from the beginning, then you could frontload an exclusive gateway. Once the complaint is dealt with, you can direct the sequence flow to that gateway, which would restart the sub-process. See below for an example (I have simplified your diagram a bit).

Related

How to prevent NServiceBus from dropping outgoing messages when a handler fails

I'm new to NServiceBus, so maybe I'm asking something pretty silly here, but is there a way to keep NServiceBus from discarding messages that are sent in response to a message whose handler fails?
Let me explain with a simple example.
Suppose I have an OrderPaidEvent that has a handler that does the following:
1. Look up the customer
2. Start a DB transaction
3. Update the customer to a good customer
4. Send a CustomerUpgradedToGoodCustomerEvent message
5. Commit the DB transaction
Fairly straightforward, all is well in the world. Now a few months later someone else figures that an email would be nice when an order is paid and thus adds another handler to the OrderPaidEvent to send an email.
Unfortunately, now whenever the mail server has an issue, this second handler will fail, and that failure prevents the original CustomerUpgradedToGoodCustomerEvent message from being sent (step 4). But because the DB transaction was already committed (step 5), the customer has already been upgraded to a good customer in the database.
This means that even if the OrderPaidEvent handler is retried, the customer no longer changes, and thus the CustomerUpgradedToGoodCustomerEvent message is never sent. Worse yet, this all stems from a code change that has nothing to do with the original message handler, and it will thus be difficult to detect.
This seems like a massive flaw and since I'm new to this I'm certain there's something I'm doing wrong, but I can't seem to figure out what it is.
Any help from you fine people would be great.
Thanks in advance.
How about breaking down your procedural code into separate handlers?
That way, each logical operation will either be done or not done depending on the successful completion of its own granular task.
If you add a Saga to the mix then you can make business decisions based on the completed steps in your Saga.
Also, maybe read more about transactions and NServiceBus here.
First of all, I would send out the CustomerUpgradedToGoodCustomerEvent after the commit. At that point you are sure that the change actually took place.
And in response to your question: you could handle the email in some 'SendEmail' command that is raised after the DB commit and before the event is published. If that command fails, it will not hurt the handling of the OrderPaid event. When the mail server is up again, the command can be retried and handled normally.
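To make the ordering concrete, here is a minimal, compilable sketch; the Transaction/Database/Bus types are hypothetical stand-ins I'm inventing purely for illustration (NServiceBus itself is a .NET library with its own API):

#include <iostream>
#include <string>

// Hypothetical stand-ins for the persistence layer and the message bus.
struct Transaction {
    void commit() { std::cout << "DB transaction committed\n"; }
};

struct Database {
    Transaction begin() { return Transaction(); }
    void upgrade_to_good_customer(const std::string& id) {
        std::cout << "customer " << id << " upgraded\n";
    }
};

struct Bus {
    void send_command(const std::string& name) {
        std::cout << "command sent: " << name << "\n";
    }
    void publish_event(const std::string& name) {
        std::cout << "event published: " << name << "\n";
    }
};

// Handler for OrderPaidEvent with the ordering suggested above:
// commit first, then the email as its own retryable command, then
// publish the event.
void handle_order_paid(Database& db, Bus& bus, const std::string& customer_id) {
    Transaction tx = db.begin();
    db.upgrade_to_good_customer(customer_id);
    tx.commit();

    bus.send_command("SendEmail");  // a mail outage fails only this message
    bus.publish_event("CustomerUpgradedToGoodCustomerEvent");
}

int main() {
    Database db;
    Bus bus;
    handle_order_paid(db, bus, "42");
    return 0;
}

The point is simply the sequence: the state change is committed before anything is published, and the email is its own retryable message rather than a step inside the OrderPaidEvent handler.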

Understanding Eventual Consistency, BacklogItem and Tasks example from Vaughn Vernon

I'm struggling to understand how to implement eventual consistency with the BacklogItem and Tasks example from Vaughn Vernon. The requirement, as I understand it so far, is (considering the case where he splits BacklogItem and Task into separate aggregate roots):
A BacklogItem can contain one or more tasks. When the remaining hours of all the tasks of a BacklogItem are 0, the status of the BacklogItem should change to "DONE".
I'm aware of the rule that says you should not update two aggregate roots in the same transaction, and that you should accomplish this with eventual consistency.
Once a Domain Service updates the remaining hours of a Task, a TaskRemainingHoursUpdated event should be published to a DomainEventPublisher, which lives in the same thread as the executing code. And here is where I'm at a loss, with the following questions:
I suppose that there should be a subscriber (also living in the same thread, I guess) that reacts to TaskRemainingHoursUpdated events. At which point in your Desktop/Web application do you perform this subscription to the Bus? At the very initialization of your app? In the application code? Is there any reasoning for placing domain subscribers in a specific place?
Should that subscriber (in the same thread) call a BacklogItem repository and perform the update? (But that would be a violation of the rule of not updating two aggregates in the same transaction, since this would happen synchronously, right?)
If you want to achieve eventual consistency to fulfil the previously mentioned rule, do I really need a Message Broker like RabbitMQ even though both BacklogItem and Task live inside the same Bounded Context?
If I use this message broker, should I have a background thread or something that just consumes events from a RabbitMQ queue and then dispatches each event to update the BacklogItem?
I'd appreciate if someone can shed some clear light over this since it is quite complex to picture in its completeness.
So to start with, you need to recognize that, if the BacklogItem is the authority for whether or not it is "Done", then it needs to have all of the information to compute that for itself.
So somewhere within the BacklogItem is data that is tracking which Tasks it knows about, and the known state of those tasks. In other words, the BacklogItem has a stale copy of information about the task.
That's the "eventually consistent" bit; we're trying to arrange the system so that the cached copy of the data in the BacklogItem boundary includes the new changes to the task state.
That in turn means we need to send a command to the BacklogItem advising it of the changes to the task.
From the point of view of the backlog item, we don't really care where the command comes from. We could, for example, make it a manual process "After you complete the task, click this button here to inform the backlog item".
But for the sanity of our users, we're more likely to arrange an event handler to be running: when you see the output from the task, forward it to the corresponding backlog item.
At which point in your Desktop/Web application do you perform this subscription to the Bus? At the very initialization of your app?
That seems pretty reasonable.
Should that subscriber (in the same thread) call a BacklogItem repository and perform the update? (But that would be a violation of the rule of not updating two aggregates in the same transaction, since this would happen synchronously, right?)
Same thread and same transaction are not necessarily coincident. It can all be coordinated in the same thread; but it probably makes more sense to let the consequences happen in the background. At their core, events and commands are just messages - write the message, put it into an inbox, and let the next thread worry about processing.
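Here is a minimal sketch of that inbox idea in C++, assuming a single process and an in-memory queue (all names are illustrative, not from Vernon's book): the Task side writes the message, and a background thread applies it to the BacklogItem later, in its own unit of work.

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// A minimal in-process "inbox": producers enqueue event messages, and a
// background thread consumes them later, in its own unit of work.
class Inbox {
public:
    void put(std::string message) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(message));
        }
        cv_.notify_one();
    }
    std::string take() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty(); });
        std::string m = std::move(queue_.front());
        queue_.pop();
        return m;
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
};

int main() {
    Inbox inbox;

    // Background consumer: updates the BacklogItem in a *separate*
    // transaction, eventually, not inside the Task's transaction.
    std::thread consumer([&inbox] {
        std::string event = inbox.take();
        std::cout << "applying " << event << " to BacklogItem\n";
    });

    // The Task's transaction commits, then the event goes into the inbox.
    inbox.put("TaskRemainingHoursUpdated");

    consumer.join();
    return 0;
}

In a real system the inbox would be a durable table or queue so the update survives a crash, but the shape is the same: publishing and processing happen in different transactions.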
If you want to achieve eventual consistency to fulfil the previously mentioned rule, do I really need a Message Broker like RabbitMQ even though both BacklogItem and Task live inside the same Bounded Context?
No; the mechanics of the plumbing matter not at all.

Updating workflow - new functionality not being used

I'm busy playing around with various things, and am making changes a fair bit for educational purposes.
However, now any changes I make are not being picked up, and the old behaviour still happens. In this case, I had an email watcher set up to write a file to our domain controller and send an SMS.
I changed it to do something different, but no number of stops and restarts helps - it continues to perform the first action.
Pointers welcome.
You can try using the "Stop All" option on the "Run Now" screen. This will stop all of the workflow instances.
However, if the workflow is set to "always on", it will start up again automatically after a few minutes.
It is best to disable "always on", and then set it back to "always on" afterwards.
Hope this helps

blocked requests in io_service

I have implemented a client/server program using the boost::asio library.
In my implementation, there are times when io_service.run() blocks indefinitely. When I pass another request to the io_service, the blocked call resumes and executes normally.
Is there any way to see what the pending requests inside the io_service queue are?
I have not used a work object to keep the run call from returning!
There is no official way to query the io_service for all of its pending requests. However, there are a few techniques for debugging the problem:
Boost 1.47 introduced handler tracking. Simply define BOOST_ASIO_ENABLE_HANDLER_TRACKING and Boost.Asio will write debug output, including timestamps, an identifier, and the operation type, to the standard error stream.
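For example, a minimal program with tracking enabled (the deadline_timer here is just a stand-in for whatever asynchronous operation you suspect):

// Must be defined before the first Boost.Asio header is included
// (or passed on the command line as -DBOOST_ASIO_ENABLE_HANDLER_TRACKING).
#define BOOST_ASIO_ENABLE_HANDLER_TRACKING
#include <boost/asio.hpp>

int main() {
    boost::asio::io_service io_service;
    boost::asio::deadline_timer timer(io_service, boost::posix_time::seconds(1));
    // Each asynchronous operation's lifetime is now logged to stderr.
    timer.async_wait([](const boost::system::error_code&) {});
    io_service.run();
    return 0;
}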
Attach a debugger and dig through the layers to find and examine the operation queues. This answer covers both enabling handler tracking and using a debugger to examine an operation queue for the epoll_reactor.
Finally, if you believe it is a bug, then it may be worth updating to the latest version or checking the revision history for relevant changes. Regardless, describing the problem in more detail may allow others to help identify the source of the problem and potential solutions.
Now I spent a few hours reading and experimenting (I need more boost::asio functionality for work as well), and it turns out: kind of.
But it is not as straightforward or readable as one might hope.
Under the hood (well, under the outermost hood), io_service has a bunch of other services registered, which do the work that the async_ operations of their respective areas require.
These are the "Services" described in the reference.
Now, sadly, the services stay registered whether there is work to do or not. For example, if your io_service has a UDP socket, it will still have all the corresponding services, even if the socket itself is inactive.
But you can ask your io_service which services it has. Let's say you want to know whether your io_service, called m_io_service, has a UDP datagram_socket_service. Then you can call something like:
if (boost::asio::has_service<boost::asio::datagram_socket_service<boost::asio::ip::udp> >(m_io_service))
{
    // A UDP datagram socket service is registered with m_io_service.
}
That does not help a lot, because it will be true no matter whether the socket is active or not. But once you know that you have that service, you can get a reference to it using use_service instead of has_service, with the same elegant amount of <>.
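For example, sticking with the Boost 1.54-era API from the question (these service classes were deprecated and removed in later Boost versions):

typedef boost::asio::datagram_socket_service<boost::asio::ip::udp> udp_service;

if (boost::asio::has_service<udp_service>(m_io_service))
{
    // use_service returns a reference to the service object registered
    // with m_io_service (it would add one if none existed).
    udp_service& service = boost::asio::use_service<udp_service>(m_io_service);
    // The per-socket queries (is_open(), local_endpoint(), ...) take a
    // socket's implementation_type as their first argument.
    (void)service;
}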
And now you can inspect the service to see what it is up to. Sadly, it will not tell you the names of the outstanding handlers (probably partly because it does not know them), but if it is a socket, you can get its implementation_type and with that check whether it currently is_open, or find its local_endpoint as well as its remote_endpoint.
In the case of a deadline_timer_service you can, among other things, find out when it expires_at.
See the reference for more information on what each service is and is not willing to tell you:
http://www.boost.org/doc/libs/1_54_0/doc/html/boost_asio/reference.html
This information should then hopefully allow you to determine which async_ operation did not return.
And if not, at the very least you can cancel any unexpectedly active services.

Cocoa Distributed Objects, Long Polling, launchd and "Not Responding" in Activity Monitor

Scenario:
I have Distributed Objects-based IPC between a Mac application and a launchd daemon (written with Foundation classes). Since I had issues before with asynchronous messaging (e.g. I have a registerClient: method on the server's root object, and whenever there's an event the server's root object notifies / calls a method on the client's proxy object), I switched to long-polling, which means that the client "harvests" lists of events / notifications from the daemon. This "harvest" is done through a server object method call, which returns an NSArray instance.
It works pretty well, except that for a few seconds at a time the server object's process (launched through launchd) gets labeled red with the "(Not responding)" tag beside it (inside Activity Monitor). Like I said, it works well functionally, but we just want to get rid of this "Not responding" label.
How can I prevent this "Not responding" tag?
FYI, I have built launchd-based processes before, and this is the first time I've done long-polling. Also, I tried NSSocketPortNameServer-based connections and NSSocketPort-based ones; they didn't have this problem. Locking wasn't an issue either, because the only locks used were NSConditions, and we logged and debugged the program; the only locking "issue" is in the harvesting part, which actually works functionally. Also, the client process is written in PyObjC while the server process is written in Objective-C.
Thanks in advance.
Sample the process to find out what it's doing or waiting on.
Peter's correct in the approach, though you may be able to figure it out through simple inspection. "Not responding" means that you're not processing events on your event queue for at least 5 seconds (used to be 2 seconds, but they upped it in 10.4). For a UI process, this would create a spinning wait cursor, but for a non-UI process, you're not seeing the effects as easily.
If this is a run-loop-based program, you're probably doing something with a blocking (synchronous) operation that should be done through the run loop with a callback (async). Alternatively, you need a second thread to process your blocking operations so your main thread can continue to respond to events.
My problem was actually the call for getting a process's PID using the signature FNDR... That part caused the "Not responding" label, and it was never the locks or the long-polling. Sorry about that, guys, but thank God I already found the answer.