Apache Ignite's Continuous Queries event handler group & sequencing

We are trying to use the Continuous Query feature of Ignite, but we are facing an issue handling the events. Below is our problem statement.
We have defined a Continuous Query with a remote filter for a cache and shared the filter definition with the Thick Client.
We are running multiple replicas of the "Thin Client" in a Kubernetes cluster.
The problem is that each instance of the "Thin Client" running in the k8s cluster has registered the remote filter, so every instance receives the event and tries to process the data in parallel. This results in duplicate processing, or even instances overwriting each other's data in my store.
Is there any way to form a consumer group and ensure that only one instance of the "Thin Client" receives the notification and processes the data?
My Thick Client and Thin Clients are written in .NET.
I couldn't find any details in the Ignite documentation:
https://ignite.apache.org/docs/latest/key-value-api/continuous-queries

Here each thin client is starting its own continuous query, so by design each thin client receives its own copy of every event to consume. If you want to route an event to a specific client, you would need to start only one continuous query and distribute its events to your app instances as you see fit.
Take a look at Ignite messaging to see whether it fits your use case.
Also check out the distributed Queue/Set, which have unique-delivery guarantees.
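For illustration, here is a minimal Java sketch combining the two suggestions: one designated process owns the continuous query and pushes events into a distributed IgniteQueue, whose take() semantics deliver each element to exactly one consumer. The question's clients are .NET, but Ignite.NET exposes the same ContinuousQuery and queue concepts; the cache name "myCache" and queue name "cq-events" are assumptions for the example.

```java
import javax.cache.event.CacheEntryEvent;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.IgniteQueue;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.ContinuousQuery;
import org.apache.ignite.configuration.CollectionConfiguration;

public class SingleListenerFanout {
    public static void main(String[] args) throws Exception {
        Ignite ignite = Ignition.start();

        // Distributed queue: take() hands each element to exactly one consumer.
        IgniteQueue<String> events =
            ignite.queue("cq-events", 0 /* unbounded */, new CollectionConfiguration());

        IgniteCache<Integer, String> cache = ignite.getOrCreateCache("myCache");

        // Only ONE process in the whole deployment runs this continuous query...
        ContinuousQuery<Integer, String> qry = new ContinuousQuery<>();
        qry.setLocalListener(evts -> {
            for (CacheEntryEvent<? extends Integer, ? extends String> e : evts)
                events.add(e.getKey() + "=" + e.getValue()); // enqueue for the workers
        });
        cache.query(qry); // keep the returned cursor open for the query's lifetime

        // ...while every worker replica just consumes from the queue:
        while (true) {
            String event = events.take(); // blocks; delivered to exactly one worker
            process(event);
        }
    }

    private static void process(String event) { /* application logic */ }
}
```

The design point is that the cluster sees a single continuous-query listener, while any number of worker replicas can scale out behind the queue without duplicating work.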

Related

VM or Flow-Ref for eCommerce system

I am developing a Mule application where I have to take orders from one system (System-1) and send them to another system (System-2) over SOAP, which actually takes care of creating orders, invoices, etc.; the response from System-2 is routed back to System-1 with a success or failure status. What approach would be best: is a VM endpoint the best choice for referencing the other flow, or a flow reference? The volume could be around 100 orders per hour. Also, in both cases, what would be the ideal worker size?
If System-2 is to be called as a SOAP web service, you should be using the Web Service Consumer component or an HTTP Requester component.
Let's look at the difference between flow-ref and VM.
Both flow-ref and VM are used to call another flow within the same Mule application.
The major difference is that VM uses an in-memory queue to reach the other flow, and it creates a transport barrier: flow variables and inbound properties won't be propagated across it. Use the VM component only if creating a transport barrier is actually necessary.
If creating a transport barrier is not a requirement, a flow reference is recommended, as in the sketch below.
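For illustration, a minimal Mule 3 XML fragment contrasting the two options (namespace declarations omitted; flow names, paths, and config refs are assumptions, not part of the original question):

```xml
<flow name="receiveOrderFlow">
    <http:listener config-ref="HTTP_Listener_Configuration" path="/orders"/>
    <!-- Same thread, no transport barrier: flow variables survive the call -->
    <flow-ref name="sendToSystem2Flow"/>
</flow>

<!--
  VM alternative: replace the flow-ref above with
    <vm:outbound-endpoint path="orders" exchange-pattern="request-response"/>
  and start sendToSystem2Flow with
    <vm:inbound-endpoint path="orders" exchange-pattern="request-response"/>.
  Flow variables and inbound properties are lost across this barrier.
-->

<flow name="sendToSystem2Flow">
    <!-- Web Service Consumer invokes System-2's SOAP operation -->
    <ws:consumer config-ref="System2_WS_Config" operation="createOrder"/>
</flow>
```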

Controlling Gemfire cache updates in the background

I will be implementing a Java program that acts as a GemFire client. The program will continuously process records that it receives on its port from a remote program. Each record will be processed using the static data cached with my program. The cache may get updated behind the scenes when it is changed on the GemFire server. Processing one record may take a few seconds, so I run the risk of processing half the record with static data that was prevalent before the change and the rest of the record with static data that has taken effect after the change. Is there a way I can tell GemFire not to apply updates to the local client cache until I am done processing the ongoing record?
Regards,
Yash
Consider this approach: use a Continuous Query ("SELECT *") instead of interest registration. A CQ does not update the client region the way a subscription does. Make your client region LOCAL. After receiving the CQ event on the client, execute your long-running process and then put the value you received from the CQ into your client region. Decoupling client and server in this way allows your client to run long-running processes against a stable local cache (see the sketch after the quoted documentation below).
Alternatively, if you absolutely must have the client cache proxied with the server, then keep the interest registration AND register a CQ. Ignore the event callback from the subscription, but drive your long-running process from the event callback of the CQ.
The following is from page 683 at http://gemfire.docs.pivotal.io/pdf/pivotal-gemfire-ug.pdf
CQs do not update the client region. This is in contrast to other server-to-client messaging like the updates sent to satisfy interest registration and responses to get requests from the client's Pool.
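For illustration, a minimal Java sketch of the first approach, shown with current Apache Geode package names (older GemFire releases use the equivalent com.gemstone.gemfire packages); the region name /staticData and the locator address are assumptions:

```java
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;
import org.apache.geode.cache.query.CqAttributesFactory;
import org.apache.geode.cache.query.CqEvent;
import org.apache.geode.cache.query.CqQuery;
import org.apache.geode.cache.util.CqListenerAdapter;

public class StaticDataClient {
    public static void main(String[] args) throws Exception {
        ClientCache cache = new ClientCacheFactory()
            .addPoolLocator("localhost", 10334)  // hypothetical locator
            .setPoolSubscriptionEnabled(true)    // required for CQs
            .create();

        // LOCAL region: the server never pushes updates into it directly.
        Region<Object, Object> staticData = cache
            .<Object, Object>createClientRegionFactory(ClientRegionShortcut.LOCAL)
            .create("staticData");

        CqAttributesFactory caf = new CqAttributesFactory();
        caf.addCqListener(new CqListenerAdapter() {
            @Override
            public void onEvent(CqEvent e) {
                // The long-running work sees a stable LOCAL region; only
                // afterwards do we apply the new server-side value ourselves.
                processPendingRecords();
                staticData.put(e.getKey(), e.getNewValue());
            }
        });

        CqQuery cq = cache.getQueryService()
            .newCq("staticDataCq", "SELECT * FROM /staticData", caf.create());
        cq.execute();
    }

    private static void processPendingRecords() { /* application logic */ }
}
```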

NServiceBus Pub/Subscribe using SQLServer transport - can the subscriber scale out?

We are using the latest version of NServiceBus, 4.4 I believe.
We are looking to implement NServiceBus, and this part of the system uses SQL Server as a transport. We want to publish/subscribe, which is fine, but how would it work when scaling out the subscribers?
I have done a PoC where I ran the receiving endpoint of a SQL Server transport multiple times, and when a message came in, the first instance of the running receiver got the message and processed it, with the result that the other processes did NOT process it, which is correct.
In a pub/subscribe architecture using SQL Server, would this same method of running multiple instances of the subscriber work? Since we are using a common queue (SQL Server), will it just sort itself out and not process the message multiple times?
When using SQL Server persistence, the subscribers for your events and messages are held in the Subscription table within the NServiceBus database, so you can check which endpoints are subscribing to which messages or events by viewing the contents of that table.
It's worth noting that you can only publish message classes with NServiceBus that implement the IEvent interface (unless you make use of unobtrusive mode).
When you publish a message or event using bus.Publish, all subscribers to that type will receive it, as long as the individual endpoint names are different.
More information is available in Particular Software's documentation.

JMS message received at only one server

I'm having a problem with a JEE6 application running in a clustered environment on WebSphere Application Server 8.
A search index (using Lucene) is used for quick search in the UI; it must be re-indexed after new data arrives in the corresponding DB layer. To achieve this we send a JMS message to the application, upon which the search index is refreshed.
The problem is that the message arrives at only one of the cluster members, so only there is the search index up to date. On the other servers it remains outdated.
How can I achieve that the search index gets updated at all cluster members?
Can I receive the message somehow on all servers?
Or is there a better way to do this?
I found a possible solution:
Generally, a JMS message delivered via a queue goes to only one of the cluster members. A possible way to get the information to all of the cluster members is an EJB timer: a non-persistent timer calls its callback method on every cluster member, which makes it a convenient way to recreate the local search index everywhere (see the sketch below).
It is important that the timer be non-persistent, because persistent timers are synchronized across the cluster and are executed on only one of the cluster members.
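A minimal JEE6 sketch of that idea; the SearchIndexer bean and its isStale()/rebuild() methods are hypothetical placeholders for the application's Lucene wiring. Because the timer is automatic and non-persistent, each cluster member creates its own copy at startup, so the callback fires on every JVM:

```java
import javax.ejb.EJB;
import javax.ejb.Schedule;
import javax.ejb.Singleton;
import javax.ejb.Startup;

// Hypothetical local bean wrapping the Lucene index (application-provided).
interface SearchIndexer {
    boolean isStale();  // e.g. compare the index version against the DB layer
    void rebuild();     // recreate the local search index
}

@Singleton
@Startup
public class IndexRefreshTimer {

    @EJB
    private SearchIndexer indexer;

    // persistent = false: each cluster member creates its own timer at startup,
    // so this callback runs on EVERY member (a persistent timer would run on one).
    @Schedule(minute = "*/5", hour = "*", persistent = false)
    public void refreshIfStale() {
        if (indexer.isStale()) {  // hypothetical staleness check
            indexer.rebuild();
        }
    }
}
```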

Long running workflow in asp.net mvc

I'm developing an intranet site using asp.net mvc4 to manage some of our data. One important feature of this site is to trigger import/export jobs. These jobs can take anywhere between 5 minutes to 1 hour. Users of the site need to be able to determine whether a job is currently running as well as the status of prior jobs. Many jobs will often include warning messages concerning duplicate data and these warnings need to be visible on the site.
My plan is to implement these long running processes as a WCF Workflow Service that the asp.net site will interact with. I've got much of the business logic implemented via activities and have tested it using a simple console application. I should note I'm using a correlation handle in order to partition the service based on specific "Projects" on the site.
My problem is how to query the status of an active job (if one exists) as well as the warning messages of previous jobs. I suspect the best way to do this would be to use the AppFabric tracking service and have my ASP.NET site query a SQL monitoring store and report back on the current status. After setting up AppFabric and adding custom tracking messages, I ran into a few issues. The first is that I cannot figure out how to filter out workflow instances that were not using the correct correlation handle, as I'd like to show only workflows for a specific project. The other is that the tracking database can lag quite a bit, which makes it hard to determine whether a workflow is currently running.
Another possible solution could be to have the workflow explicitly update a database with its current status and any error messages. I'm leaning towards this solution but could use some expert advice.
TL;DR: I need to know the best way to query the execution status and any warning messages of a WCF Workflow service.
As you want to query workflow status and messages even after the workflow has finished, I would start by creating a table that maps the correlation values a client sends to the related workflow ID. I would create a custom activity to do that and drop it right after the Receive that creates the workflow.
Next I would create a regular WCF service that the client app uses to query the status. This WCF service can query the WF persistence store to see whether a given workflow is still running; if so, the active bookmarks column will tell you which SOAP messages the workflow is currently waiting for.
As far as messages go, you can either use the AppFabric tracking infrastructure to store and retrieve them, or you can create a custom activity and store them in your own database. It really depends on whether you are also interested in the standard WF tracking records that are generated.
Update on checking for running workflow instances:
There are several downsides to adding an IsRunning message to your workflow. For one, you would need to make sure one branch keeps looping and waiting for the message but stops as soon as the other, real workflow branch is done. That is certainly possible, but it complicates the workflow and is a possible source of errors, and as it is not part of the business problem it really has no place in the workflow as far as I am concerned. It also means that you have to load a workflow from disk and persist it back just to tell you that it is there; if it has finished, you will need to wait for a fault to indicate there was no workflow instance, which usually means a timeout exception after, by default, 60 seconds. Add throttling to that, and your request might be queued because there are too many other workflow instances or SOAP requests being processed, so a timeout might mean that a workflow instance exists but is unreachable due to system constraints.
Instead I would opt for the simple thing and check whether the record in the instance store is still available. The additional info from the active bookmarks column will tell you what the workflow is waiting on, information I have used in the past to dynamically update the UI by enabling/disabling UI elements.