Aging of objects in GigaSpaces

I'm fairly new to GigaSpaces. I am using a polling container to fetch events from a space and then dispatch these over an HTTPS connection. If the server endpoint for the connection becomes unavailable, I need to update the state of the event objects to 'blocked' and re-queue them in the space for later retries (for which I have a separate polling container that specifically looks for the blocked events).
What I'm struggling with is finding a good way to ensure the blocked event polling container does not over-rotate on the blocked events (that is, read the events, discover that the endpoint is still blocked, write them back to the space and then immediately re-read them).
Is there a way I could build in a delay before re-reading the events from the space? Options might include:
Setting/updating a timestamp on the object before writing it back, and then comparing it with the current time within the polling process. (For this, I expect I would have to use a SQLQuery that includes SYSDATE as the EventTemplate; but would I then have to query SYSDATE out of the space every time I want to update the object, rather than using System.currentTimeMillis() or equivalent, in order to ensure I am comparing apples to apples?)
Applying a configuration setting of some kind on the blocked event polling container or listener that causes it to only poll periodically.

You can use both approaches:
docs.gigaspaces.com/xap97/polling-container.html#dynamic-template-definition
docs.gigaspaces.com/sbp/dynamic-polling-container-templates-using-triggeroperationhandler.html
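For example, with the dynamic template approach the listener can recompute its template before every poll, so it only matches blocked events whose retry time has passed. A minimal Java sketch, assuming the OpenSpaces annotations and a hypothetical BlockedEvent class with 'status' and 'nextRetryTime' properties:

import org.openspaces.events.EventDriven;
import org.openspaces.events.DynamicEventTemplate;
import org.openspaces.events.adapter.SpaceDataEvent;
import org.openspaces.events.polling.Polling;
import com.j_spaces.core.client.SQLQuery;

@EventDriven @Polling
public class BlockedEventListener {

    // Invoked before each poll, so the cutoff moves forward with the clock.
    @DynamicEventTemplate
    public SQLQuery<BlockedEvent> blockedEventsDueForRetry() {
        SQLQuery<BlockedEvent> query = new SQLQuery<BlockedEvent>(
                BlockedEvent.class, "status = ? AND nextRetryTime <= ?");
        query.setParameters("blocked", System.currentTimeMillis());
        return query;
    }

    @SpaceDataEvent
    public BlockedEvent onBlockedEvent(BlockedEvent event) {
        // Retry the HTTPS dispatch here; on failure, push nextRetryTime
        // into the future and return the event so it is written back
        // to the space for the next retry cycle.
        return event;
    }
}

Because both the nextRetryTime you stamp on the object and the cutoff in the template come from System.currentTimeMillis(), you are comparing apples to apples without querying SYSDATE out of the space, as long as the clocks of the client machines are reasonably in sync.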
In the future, for GigaSpaces related questions, please use:
ask.gigaspaces.org/questions/
Thanks,
Ester.

Related

How to make a Saga handler Reentrant

I have a task that can be started by the user, that could take hours to run, and where there's a reasonable chance that the user will start the task multiple times during a run.
I've broken the processing of the task up into smaller batches, but the way the data looks, it's very difficult to tell what's still to be processed. I batch it using messages that each process a bite-sized chunk of the data.
I have thought of using a Saga to control access to starting this process, with a Saga property called Processing that I set at the start of the handler and then unset at the end of the handler. The handler does some work and sends the messages to process the data. I check the value at the start of the handler, and if it's set, then just return.
I'm using Azure storage for Saga storage, if it makes a difference for the next bit. I'm also using NSB 6.
I have a few questions though:
Is this the correct approach to re-entrancy with NSB?
When is a change to Saga data persisted? (and is it different depending on the transport?)
Following on from the above, if I set a Saga value in a handler, wait a while and then reset it to its original value will it change the persistent storage at all?
Seems to be cross-posted in the Particular Software Google group:
https://groups.google.com/forum/#!topic/particularsoftware/p-qD5merxZQ
Sagas are very often used for such patterns. The saga instance would track progress and guard that the (sub)tasks aren't invoked multiple times, but it could also take action if the expected task(s) didn't complete or ran over time.
The saga instance data is stored after processing the message and not when updating any of the saga data properties. The logic you described would not work.
The correct way would be having a saga that orchestrates your process and having regular handlers that do the actual work.
In the saga handler method that creates the saga, check whether the saga was already created or already has the 'busy' status; if it does not have this status, send a message to do the work. This guards that the task is only initiated once, and after that the saga is stored.
The handler can now do the actual task; when it completes, it can 'Reply' back to the saga.
When the saga receives the reply, it can start any follow-up task or raise an event, and it can also 'complete'.
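NServiceBus itself is .NET, but the guard is easy to see in a framework-agnostic Java sketch (MessageBus and DoWork are stand-ins here; in a real saga the 'busy' flag lives in the saga data, which the framework persists only after the handler finishes):

public class TaskSaga {

    interface MessageBus { void send(Object message); }
    static class DoWork { }

    private boolean busy; // saga data, persisted after the handler completes

    // Handler for the message that starts (or re-enters) the saga.
    public void handleStart(MessageBus bus) {
        if (busy) {
            return; // task already initiated once; ignore the duplicate
        }
        busy = true;
        bus.send(new DoWork()); // a regular handler does the actual work
    }

    // Handler for the worker's Reply.
    public void handleWorkCompleted() {
        busy = false; // or mark the saga as complete
    }
}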
Optimistic concurrency control and batched sends
If two messages are received that create/update the same saga instance, only the first writer wins; the other will fail because of optimistic concurrency control.
However, if these messages are processed sequentially rather than in parallel, optimistic concurrency won't catch the duplicate, and both will be processed unless the saga checks whether the saga instance is already initialized.
The following sample demonstrates this: https://github.com/ramonsmits/docs.particular.net/tree/azure-storage-saga-optimistic-concurrency-control/samples/azure/storage-persistence/ASP_1
The client sends two identical message bodies. The saga is launched and only 1 message succeeds due to optimistic concurrency control.
Due to retries, the second copy will eventually be processed too, but the saga checks the saga data for a field that it knows would normally be initialized by a message that 'starts' the saga. If that field is already initialized, it assumes the message was already processed and just returns.
It also demonstrates batched sends. Messages are not actually sent until all handlers/sagas have completed.
Saga design
The following video might help you with designing your sagas and understand the various patterns:
Integration Patterns with NServiceBus: https://www.youtube.com/watch?v=BK8JPp8prXc
Keep in mind that Azure Storage isn't transactional and does not provide locking, it is only atomic. Any work you do within a handler or saga can potentially be invoked more than once and if you use non-transactional resources then make sure that logic is idempotent.
So after a lot of testing, I don't believe that this is the right approach.
As Archer says, you can manipulate the saga data properties as much as you like, they are only saved at the end of the handler.
So if the saga receives two simultaneous messages the check for Processing will pass both times and I'll have two processes running (and in my case processing the same data twice).
The saga within a saga faces a similar problem too.
What I believe will work (and has done during my PoC testing) is using a database unique index to help out. I'm using Entity Framework and Azure SQL, so database access is not contained within the handler's transaction (this is the important difference between the database and the saga data). The database will also operate across all instances of the endpoint and generally seems like a good solution.
The table that I'm using has each of the columns that make up the saga 'id', and there is a unique index on them.
At the beginning of the handler I retrieve a row from the database. If there is a row, the handler returns (in my case this is okay; in others you could throw an exception to get the handler to run again). Otherwise, the first thing the handler does (before any work, although I'm not 100% sure that it matters) is write a row to the table. If the write fails (probably because the unique constraint was violated), the exception puts the message back on the queue. It doesn't really matter why the database write fails, as NSB will handle it.
Then the handler does the work, and finally removes the row.
Of course there is a chance that something happens during processing of the work, so I'm also using a timestamp and another process to reset it if it's busy for too long. (still need to define 'too long' though :) )
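A sketch of that guard with plain JDBC (the poster is on Entity Framework and Azure SQL; the saga_locks table and column names here are made up, with a unique/primary key on the id column and a started_at timestamp for the 'busy too long' sweep):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.Instant;

public class UniqueIndexGuard {

    // Returns true if this handler won the row; a unique-constraint
    // violation means another handler is already processing this id.
    public boolean tryAcquire(Connection db, String sagaId) throws SQLException {
        try (PreparedStatement insert = db.prepareStatement(
                "INSERT INTO saga_locks (saga_id, started_at) VALUES (?, ?)")) {
            insert.setString(1, sagaId);
            insert.setTimestamp(2, Timestamp.from(Instant.now()));
            insert.executeUpdate();
            return true;
        } catch (SQLException e) {
            // Real code should inspect the vendor error code to distinguish
            // a duplicate key from other failures; rethrowing instead would
            // let NSB put the message back on the queue and retry.
            return false;
        }
    }

    // Called once the work is done.
    public void release(Connection db, String sagaId) throws SQLException {
        try (PreparedStatement delete = db.prepareStatement(
                "DELETE FROM saga_locks WHERE saga_id = ?")) {
            delete.setString(1, sagaId);
            delete.executeUpdate();
        }
    }
}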
Maybe this can help someone with a similar problem.

Network activity indicator and asynchronous sockets

I have an app which continuously reads status updates from a server connection.
All is working well with a stream delegate to handle all the reading and writing asynchronously.
There's no part of the app that is "waiting" for a specific response from the server, it is just continuously handling status updates as they sporadically arrive from the socket. There are no requests on the client side that are waiting for responses.
I'm wondering what the best practice would be for the network activity indicator in this case.
I could turn it on in the stream event handler, and off before we leave the handler, but that would be a very short time (just enough for a non-blocking read or write to occur). Trying this, I only see the faintest flicker of the indicator; it needs to be on longer than just during the event handler.
What about turning it on in the stream delegate, and setting a timer to turn it off a short time later? (This would ensure it's on long enough to be seen, rather than the short time spent in the stream delegate.)
Note: I've tried this last idea: turning on the network activity indicator whenever there's stream activity, and noting the NSDate; then in a timer (fired every 1 second), if the time passed is > 0.5 seconds, I turn off the indicator. This seems to give a reasonable indication of network activity.
Any better recommendations?
If the network activity is continuous then it sounds like it might be somewhat annoying to the user, especially if it's turning on and off all the time.
Perhaps better would be to test for lack-of-response up to a certain timeout value and then display an alert view to the user if you aren't getting any response from the server. Even that could be optional if you can provide feedback (like "Last update: 5 mins ago") to the user instead.

Suspending and modifying NServiceBus saga timeouts

I have a saga which represents a long-running work assignment process of a "Person" to a "Case". Several events may kick it off, and at the end of the process we have an assignment confirmation, at which point the saga completes and the Person is assigned to the Case. I would like to have a timeout for this saga so that we don't wait indefinitely for confirmation - definitely a valid business use case. No difficulties there - fairly vanilla.
The twist is that this assignment process can be blocked if someone puts the Case on hold. I have an event I can subscribe to so my assignment saga knows the Case is on hold, but unless I adjust the timeout or suspend it in some way, the assignment saga will likely time out before the Case hold is released. It doesn't make business sense to do this, so I basically want to stop the timeout clock until some other event comes in.
This same issue was mentioned here a couple years ago. Is this still not possible or are there new features in v3.x that would allow it? If not, is it a planned feature?
Thanks!
Why not remove the timeout altogether for that instance when your case is put on hold? Your saga maintains the state of the case and the calculated time when the case would have been due, which could have been set when you created the first timeout. When the case is reactivated, simply calculate the difference between the reactivation time and the saved "deadline", and create a new timeout for that instance with the difference. You may also want to take into account the time the case was on hold and set a new deadline, which you would save back to the instance state.
I don't think there is a way to tap directly into the timer and put the timeout message "on hold".
I would have that logic inside the timeout handler on the saga. Check if the case is on hold and request another timeout without ending the saga.
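Framework aside, the bookkeeping both answers describe is simple (plain Java; the field names are made up):

import java.time.Duration;
import java.time.Instant;

public class AssignmentDeadline {

    private Instant deadline;   // stored in saga state with the first timeout
    private Duration remaining; // stored when the case goes on hold

    // Case put on hold: remember how much of the clock was left.
    public void onHold(Instant now) {
        remaining = Duration.between(now, deadline);
    }

    // Case released: recalculate the deadline and request a new timeout
    // for the remaining duration, saving the new deadline back to state.
    public Duration onReleased(Instant now) {
        deadline = now.plus(remaining);
        return remaining;
    }
}

Note that the originally requested timeout message will still fire; it cannot be cancelled. That is why the check belongs in the timeout handler: if the saga state says the case is on hold or the saved deadline has moved, just request another timeout rather than acting on the stale one.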

How to know when a set of RabbitMQ tasks are complete?

I am using RabbitMQ to have worker processes encode video files. I would like to know when all of the files are complete - that is, when all of the worker processes have finished.
The only way I can think to do this is by using a database. When a video finishes encoding:
UPDATE videos SET status = 'complete' WHERE filename = 'foo.wmv'
-- etc etc etc as each worker finishes --
And then to check whether or not all of the videos have been encoded:
SELECT count(*) FROM videos WHERE status != 'complete'
But if I'm going to do this, then I feel like I am losing the benefit of RabbitMQ as a mechanism for multiple distributed worker processes, since I still have to manually maintain a database queue.
Is there a standard mechanism for RabbitMQ dependencies? That is, a way to say "wait for these 5 tasks to finish, and once they are done, then kick off a new task?"
I don't want to have a parent process add these tasks to a queue and then "wait" for each of them to return a "completed" status. Then I have to maintain a separate process for each group of videos, at which point I've lost the advantage of decoupled worker processes as compared to a single ThreadPool concept.
Am I asking for something which is impossible? Or, are there standard widely-adopted solutions to manage the overall state of tasks in a queue that I have missed?
Edit: after searching, I found this similar question: Getting result of a long running task with RabbitMQ
Are there any particular thoughts that people have about this?
Use a "response" queue. I don't know any specifics about RabbitMQ, so this is general:
Have your parent process send out requests and keep track of how many it sent
Make the parent process also wait on a specific response queue (that the children know about)
Whenever a child finishes something (or can't finish for some reason), send a message to the response queue
Whenever numSent == numResponded, you're done
Something to keep in mind is a timeout -- What happens if a child process dies? You have to do slightly more work, but basically:
With every sent message, include some sort of ID, and add that ID and the current time to a hash table.
For every response, remove that ID from the hash table
Periodically walk the hash table and remove anything that has timed out
This is called the Request Reply Pattern.
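A parent-side sketch of that pattern with the RabbitMQ Java client (queue names are illustrative, and the timeout sweep from the second list is omitted for brevity):

import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;

public class ParentProcess {

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {

            channel.queueDeclare("encode_tasks", true, false, false, null);
            String replyQueue = channel.queueDeclare().getQueue(); // private reply queue

            String[] videos = {"a.wmv", "b.wmv", "c.wmv", "d.wmv", "e.wmv"};
            CountDownLatch pending = new CountDownLatch(videos.length); // numSent

            // Each worker sends one message here per finished (or failed) video.
            DeliverCallback onReply = (tag, delivery) ->
                    pending.countDown(); // numResponded++
            channel.basicConsume(replyQueue, true, onReply, tag -> { });

            AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
                    .replyTo(replyQueue).build();
            for (String video : videos) {
                channel.basicPublish("", "encode_tasks", props,
                        video.getBytes(StandardCharsets.UTF_8));
            }

            pending.await(); // numSent == numResponded: all done
            System.out.println("All videos encoded; kick off the next task.");
        }
    }
}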
Based on Brendan's extremely helpful answer, which should be accepted, I knocked up this quick diagram, which may be helpful to some.
I have implemented a workflow where the workflow state machine is implemented as a series of queues. A worker receives a message on one queue, processes the work, and then publishes the same message onto another queue. Then another type of worker process picks up that message, etc.
In your case, it sounds like you need to implement one of the patterns from Enterprise Integration Patterns (a free online book) and have a simple worker that collects messages until a set of work is done, and then publishes a single message to a queue representing the next step in the workflow.
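For the queue-per-state variant, each worker boils down to "consume from my queue, do the work, publish the message to the next queue" (again a sketch; queue names are made up):

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;

public class EncodeWorker {

    public static void main(String[] args) throws Exception {
        Connection conn = new ConnectionFactory().newConnection();
        Channel channel = conn.createChannel();
        channel.queueDeclare("to_encode", true, false, false, null);
        channel.queueDeclare("to_publish", true, false, false, null);

        DeliverCallback onTask = (tag, delivery) -> {
            byte[] job = delivery.getBody();
            // ... encode the video referenced by 'job' ...
            // Forwarding the message is the workflow state transition.
            channel.basicPublish("", "to_publish", null, job);
            channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
        };
        channel.basicConsume("to_encode", false, onTask, tag -> { });
    }
}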

How do I coalesce processing of related events in NServiceBus?

I have a situation where I have a service subscribing to event messages and performing some work when they arrive. There is a certain class of events which can arrive in short bursts of many events which reference the same underlying data. I would like to be able to defer processing of related events for a short period of time, so that I only do the calculation once for each batch of related events, rather than in response to each individual event. Is there some kind of pattern I can follow which will allow me to collect related events for a period of time and then process them all at once? I was thinking a saga + timeout might be able to achieve this, but not sure if this is an appropriate use for that.
Thanks!
Yes, a saga could be the way to go; however, consider the performance of the saga persistence (NHibernate over a DB in the current version, RavenDB in the next version) against your fault-tolerance needs (if a machine crashes, would it be acceptable to lose some messages?).
No easy answers, I'm afraid.
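Framework aside, the saga + timeout idea is a per-key debounce: the first event for a given piece of data starts the clock, later related events just accumulate, and when the quiet period expires the calculation runs once for the whole batch. An in-process Java sketch of that shape (a durable saga buys you the same behavior surviving restarts):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class EventCoalescer<E> {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final Map<String, List<E>> buffers = new ConcurrentHashMap<>();
    private final Consumer<List<E>> processBatch;
    private final long quietMillis;

    public EventCoalescer(long quietMillis, Consumer<List<E>> processBatch) {
        this.quietMillis = quietMillis;
        this.processBatch = processBatch;
    }

    // First event for a key "starts the saga" and requests the timeout;
    // subsequent events for the same key just accumulate.
    public void onEvent(String key, E event) {
        buffers.compute(key, (k, batch) -> {
            if (batch == null) {
                batch = new ArrayList<>();
                scheduler.schedule(() -> flush(k), quietMillis, TimeUnit.MILLISECONDS);
            }
            batch.add(event);
            return batch;
        });
    }

    // The "timeout handler": do the calculation once for the whole batch.
    private void flush(String key) {
        List<E> batch = buffers.remove(key);
        if (batch != null) {
            processBatch.accept(batch);
        }
    }
}

With something like new EventCoalescer<String>(5000, batch -> recalculate(batch)) (recalculate being your own method), a burst of related events triggers at most one recalculation per key per five-second quiet window.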