Prevent subscribers from reading certain samples temporarily - data-distribution-service

We have a situation where there are two modules, one with a publisher and the other with a subscriber. The publisher is going to publish some samples using key attributes. Is it possible for the publisher to prevent the subscriber from reading certain samples? This case would arise when the module with the publisher is currently updating the sample, which it does not want anybody else to read until it is done. Something like a mutex.
We are planning on using Opensplice DDS but please give your inputs even if they are not specific to Opensplice.
Thanks.

RTI Connext DDS provides a way to coordinate writes (described in the documentation as a "coherent write"; see Section 6.3.10 and the PRESENTATION QoS):
myPublisher->begin_coherent_changes();
// DataWriters belonging to this Publisher do their writes; the data is
// captured at the publisher but not yet delivered to subscribers.
myPublisher->end_coherent_changes();  // all coherent writes now leave together
Regards,
rip

If I understand your question properly, then there is no native DDS mechanism to achieve what you are looking for. You wrote:
This case would arise when the module with the publisher is currently updating the sample, which it does not want anybody else to read till it is done. Something like a mutex.
There is no such thing as a "global mutex" in DDS.
However, I suspect you can achieve your goal by adding some information to the data-model and adjusting your application logic. For example, you could add an enumeration field to your data; let's say you add a field called status that can take one of the values CALCULATING or READY.
On the publisher side, instead of "taking the mutex", your application could publish a sample with the status value set to CALCULATING. When the calculation is finished, the new sample can be written with the value of status set to READY.
On the subscriber side, you could use a QueryCondition with status = 'READY' as its expression. Read or take actions should only be done through the QueryCondition, using read_w_condition() or take_w_condition(). Whenever the status is not equal to READY, the subscribing side will not see any samples. This approach takes advantage of the fact that newer samples overwrite older ones, assuming that your history depth is set to the default value of 1.
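As a rough illustration, here is a minimal sketch of the subscriber side against the classic DDS C++ API; the type MyData, its key field id, and the generated MyDataDataReader/MyDataSeq names are hypothetical and would come from your own IDL (reader is assumed to be a MyDataDataReader_var):

// Assumed IDL (illustrative): enum Status { CALCULATING, READY };
//                             struct MyData { long id; Status status; /* payload */ };  // id is the key
DDS::StringSeq no_params;  // the query expression uses no %n parameters
DDS::QueryCondition_var ready_cond =
    reader->create_querycondition(DDS::ANY_SAMPLE_STATE,
                                  DDS::ANY_VIEW_STATE,
                                  DDS::ANY_INSTANCE_STATE,
                                  "status = 'READY'",
                                  no_params);

MyDataSeq samples;
DDS::SampleInfoSeq infos;
DDS::ReturnCode_t rc =
    reader->read_w_condition(samples, infos, DDS::LENGTH_UNLIMITED, ready_cond);
if (rc == DDS::RETCODE_OK) {
    // Only samples whose status field equals READY are visible here.
}
reader->return_loan(samples, infos);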
If this results in the behaviour that you are looking for, then there are two remaining disadvantages to this approach. First, the application logic gets somewhat polluted by the use of the status field and the QueryCondition. This could easily be hidden behind an abstraction layer, though; it would even be possible to hide it behind some lock/unlock-like interface. The second disadvantage is the extra sample going over the wire when setting the status field to CALCULATING. But extra communication cannot be avoided anyway if you want to implement distributed mutex-like functionality, and it is only an issue if your samples are pretty big and/or high-frequency. In that case, you might have to resort to a dedicated, small Topic for the single purpose of simulating the locking mechanism.

The PRESENTATION QoS is not specific to RTI Connext DDS; it is part of the OMG DDS specification. That said, the ability to write coherent changes across multiple DataWriters/Topics (as opposed to using a single DataWriter) is part of one of the optional profiles (the object model profile), so not all DDS implementations necessarily support it.
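For completeness, here is a minimal sketch of enabling coherent access through the standard PRESENTATION QoS in the classic DDS C++ API; the participant variable is assumed to exist, and matching DataReaders need a compatible PRESENTATION setting on their Subscriber:

DDS::PublisherQos pub_qos;
participant->get_default_publisher_qos(pub_qos);
// TOPIC scope keeps coherent changes within each individual DataWriter; GROUP scope
// spans all DataWriters in the Publisher and requires the optional object model profile.
pub_qos.presentation.access_scope    = DDS::TOPIC_PRESENTATION_QOS;
pub_qos.presentation.coherent_access = true;

DDS::Publisher_var publisher =
    participant->create_publisher(pub_qos, NULL, DDS::STATUS_MASK_NONE);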
Gerardo

Related

Need help in selecting the right design pattern

We are in the lead business. We capture leads and pass them on to clients based on some rules. Integration with each client varies in nature, e.g. the nature of the API, and in some cases data mapping is also required. We perform the following steps in order to route leads to the client:
Select the client
Check if any client-specific mapping (master data) is required
Send the lead to the nearest available dealer (optional step)
Call the client API to send the lead
Update the push status of the lead in the database
Note that some of the steps can be optional.
Which design pattern would be suitable to solve this problem? The motive is to simplify integration with each client.
You'll want to isolate (and preferably externalize) the aspects that differ between clients, like the data mapping and API, and generalize as much as possible. One possible force to consider is how easily new clients and their APIs can be accommodated in the future.
I assume you have a lot of clients, and a database or other persistent mechanism that holds this client list, so data-driven routing logic that maps leads to clients shouldn't be a problem. The application itself should be as "dumb" as possible.
Data mapping is often easily described with meta-data, and is also easily data-driven. Mapping meta-data is client-specific, so it could easily be kept in your database associated with each client, in XML or some other format. If the transformations needed to make leads conform to specific APIs are very complex, the logic could be isolated through the use of a strategy pattern, with the specific strategy selected according to the target client. If an extremely large number of clients and APIs needs to be accommodated, I'd bend over backwards to make the API data-driven as well. If you have just a few client types (say, fewer than 20), I'd employ some distributed asynchronicity: have the application publish the lead and client info to a topic corresponding to the client type, have subscribed external processors specific to each client type do their thing and publish the results to a single results queue, and have a consumer listening to that queue update the database.
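To illustrate the strategy idea mentioned above, here is a minimal C++ sketch; the Lead type, the client names, and the registration mechanism are hypothetical stand-ins for your own model:

#include <map>
#include <memory>
#include <string>

struct Lead { std::string name; std::string phone; };  // illustrative only

// One strategy per client: encapsulates that client's field mapping and API call.
class ClientDeliveryStrategy {
public:
    virtual ~ClientDeliveryStrategy() = default;
    virtual void send(const Lead& lead) = 0;
};

class AcmeDelivery : public ClientDeliveryStrategy {
public:
    void send(const Lead& lead) override {
        // Acme-specific mapping and API call would go here.
    }
};

// The routing code stays "dumb": it just looks up the strategy for the target client.
class LeadRouter {
public:
    void register_client(const std::string& client,
                         std::unique_ptr<ClientDeliveryStrategy> strategy) {
        strategies_[client] = std::move(strategy);
    }
    void route(const std::string& client, const Lead& lead) {
        strategies_.at(client)->send(lead);
    }
private:
    std::map<std::string, std::unique_ptr<ClientDeliveryStrategy>> strategies_;
};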
I will divide your problem statement into three parts mentioned below:
1) Integration of API with different clients.
2) Perform some steps in order to route leads to the client.
3) Update push status of the lead to database.
Design patterns involved in the above three parts:
1) Integration of API with different clients - Integration with each client varies in nature, e.g. the nature of the API. It seems you have incompatible types of interfaces, so you should design this section using the "Adapter Design Pattern" (see the sketch after this list).
2) Perform some steps in order to route leads to the client - You have different steps of execution, and the next step depends on the previous steps. So you should design this section using the "State Design Pattern".
3) Update push status of the lead to database - This statement shows that you want to be notified whenever the push status of a lead changes so that the information can be updated in the database. So you should design this section using the "Observer Design Pattern".
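As a rough C++ sketch of the Adapter idea for point 1 (the Lead type and the third-party SDK names are hypothetical):

#include <string>

struct Lead { std::string name; std::string phone; };  // illustrative only

// The single interface your routing code talks to.
class LeadTarget {
public:
    virtual ~LeadTarget() = default;
    virtual void push(const Lead& lead) = 0;
};

// A third-party client SDK with an incompatible interface.
class AcmeCrmApi {
public:
    void createProspect(const std::string& fullName, const std::string& phoneNumber) { /* ... */ }
};

// The adapter translates your Lead into the calls the client's API expects.
class AcmeAdapter : public LeadTarget {
public:
    void push(const Lead& lead) override {
        api_.createProspect(lead.name, lead.phone);
    }
private:
    AcmeCrmApi api_;
};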
Sounds like this falls in the workflow realm.
If you're on Amazon Web Services, there's SWF; otherwise, there are a lot of workflow solutions out there for your favorite programming language.

Is there a RabbitMQ pattern for a client election

Is there a way to have a pub/sub queue in RabbitMQ in which any of the subscribers could vote, and all would have to give a thumbs up (or, more importantly, any could give a thumbs down) before processing continues?
I am not sure what to call this, so it is very hard to research.
I am trying to make subscribers that can be added later and have the ability to veto a process, without knowing about them up front.
{edit below}
I am essentially trying to build a distributed set of services that could filter in very specific use cases as they are discovered. I am trying to do this so I do not have to take down my service and release a new version every time one of these new use cases is discovered.
Here is a contrived example but it gets the point across:
Let's say I want to calculate whether a number is prime.
I would like to have a service that has general, easy rules (is it divisible by 2? is it divisible by 3?).
But let's say that we are getting into very large numbers, and I find a new algorithm that is faster for specific cases.
I would like to be able to add this service, have it subscribe to the feed and be able to trigger "NotPrime", and have all other subscribers abandon their work as a result of the veto.
In a monolithic world I would look at some sort of plug-in framework and have that implement the filters. That seems like mixing strategies in a bad way if I were to do the same within a microservice.

Duplicate detection with NServiceBus on Azure Service Bus

I'm using NServiceBus as an abstraction layer for Azure Service Bus (in case we move away from Azure). I find that when working with multiple subscribers (which subscribe to the same events) the number of duplicate messages increases. I know Azure Service Bus (ASB) has a way of detecting these duplicates, and I can see that the feature is configurable through NServiceBus (according to the documentation). However, I can only find a sample of achieving duplicate detection by means of a configuration section. What I need is a sample of how to achieve this in code.
Thanks
Suraj
You can specify configuration using a code-based approach as well. NServiceBus has two contracts that can help with that: IConfigurationSource and IProvideConfiguration<T>. Here's an example of how you can take a configuration file section (UnicastBusConfig) and specify values via code.
Specific to what you've asked, implementing IProvideConfiguration<AzureServiceBusQueueConfig> will allow you to configure the ASB transport, specifying duplicate detection and such.
The observation that the number of duplicates increases as you add subscribers feels like a symptom, not the problem. That is probably a different question, not related to the configuration. Having said that, I'd look into it prior to enabling the native de-duplication. While you can specify RequiresDuplicateDetection and DuplicateDetectionHistoryTimeWindow, be aware that ASB performs duplicate detection on the ID property only. Also, it is better to build your handlers to be idempotent, rather than relying on the native de-duplication.

In DDS can the write/publisher cache be read

This question is regarding a roadblock I am currently facing in DDS. I am able to read the Subscriber/Reader cache using the QueryFilter provided by the respective implementations. But I now want to read the Publisher/Writer cache, and I am not able to do that.
The use case is that I am publishing a list of objects and do not want to maintain a list myself locally, since DDS is already doing it. At the subscriber I am able to get object instances, like I said earlier, using the QueryFilter. But is there any way to do so with the publisher? I wanted to avoid creating a subscriber at the publisher end, or maintaining the list locally as well as in the GDS.
I am programming in C++ and using OpenSplice, but please do answer even if it is for some other implementation.
There is no DDS standard API available for reading the cache on the DataWriter side. As far as I know, none of the DDS implementations offers anything like that.
The use case is I am publishing a list of objects and do not want to
maintain a list myself locally, since DDS is already doing it [in the Publisher/Writer cache].
Well, as a user, you can not be sure what is in the cache on the DataWriter side. The DDS specification does not exactly specify what is in that cache and it does not exist as such in the API.
The purpose of the cache on the DataWriter side is to store data in order to support the quality of service as requested. For a best-effort DataWriter, the cache might not even exist, or it might contain minimal information about the key-values published. For a reliable DataWriter, the cache might contain samples that are in the process of being delivered reliably, but they might be removed after they have been delivered. For a reliable, non-volatile DataWriter, the cache might contain all samples that need to be available for late-joining readers.
I say might because it all depends on how the product is implemented.
The only cache-related method on the DataWriter side is lookup_instance().
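A minimal sketch of what that looks like in the classic DDS C++ API; the type MyData, its key field id, and the generated MyDataDataWriter are hypothetical and come from your own IDL:

// writer is assumed to be a MyDataDataWriter_var for a type keyed on "id".
MyData key_sample;
key_sample.id = 42;

// Returns a valid handle only if this DataWriter has registered or written the
// instance with key id = 42; the instance's historical sample data is not accessible.
DDS::InstanceHandle_t handle = writer->lookup_instance(key_sample);
if (handle != DDS::HANDLE_NIL) {
    // The writer knows about this instance.
}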
I wanted to avoid creating a subscriber at the publisher end or
maintain the list locally as well as in GDS.
Creating a DataReader at the publisher end seems to do exactly what you need. Why do you want to avoid that?

Building a reliable service in WCF

I am currently designing a service (wsHttp) which should be used to return sensitive data. As soon as a client asks for this data, I get it from the database, compile a list, then delete the data from the database and return the list.
My concern is that if something happens on the way back to the client (network issues, ...), I will have already deleted the data from the database, but the client will never get it.
What out-of-the-box solution do I have here?
This is an inherent problem in distributed computing. There is no easy solution. The question is how important it is to recover from such errors.
For example, if one deletes some records but the client gets disconnected, the next time the client connects he will see those records as deleted. Even if he tries to delete them again (the data stayed in the UI), this will do no harm.
Banks transferring money have an error-resolution mechanism where they match the transactions that happened between them in a second process; conflicts are dealt with manually.
Some systems such as NServiceBus rely on MSMQ for storing messages and on eventual consistency, where a message destined for a client will eventually arrive whenever he is connected again.
There is no out-of-the-box solution for this. You would need to implement some form of user/automated confirmation that the data had been received, and only delete once this confirmation was returned.
Ed
There is an easy solution. But it doesn't come in a box.
Protocols like WS-ReliableMessaging (or equally TCP/IP) give you a layer of reliability under your messaging, but all bets are off once that layer offloads the message to the layer above.
So reliability can only be fully addressed at the absolute highest layer - the application layer, not by any lower layer down the communication stack. This makes it a first class business concern, not a purely technical concern.
The problem can be solved with a slight change to the process of deleting your sensitive data.
Instead of deleting it immediately, flag it for deletion. Then, build into the business processes that drive your service the assertion that the client must acknowledge receipt of the sensitive data. Then, when you get the acknowledgement back you can safely delete the data flagged for deletion, knowing that it has been received.
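As a language-neutral illustration of that flow (sketched here in C++ with an in-memory map standing in for the database; all names are hypothetical):

#include <iterator>
#include <map>
#include <string>
#include <vector>

struct SensitiveRecord { int id; std::string payload; bool pending_delete = false; };

class SensitiveStore {
public:
    // Step 1: hand the data to the caller and flag it, but do not delete it yet.
    std::vector<SensitiveRecord> fetch_and_flag() {
        std::vector<SensitiveRecord> result;
        for (auto& entry : records_) {
            entry.second.pending_delete = true;
            result.push_back(entry.second);
        }
        return result;
    }

    // Step 2: delete the flagged data only after the client has acknowledged receipt.
    void acknowledge_receipt() {
        for (auto it = records_.begin(); it != records_.end(); ) {
            it = it->second.pending_delete ? records_.erase(it) : std::next(it);
        }
    }

private:
    std::map<int, SensitiveRecord> records_;
};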
I recently wrote a blog post reasoning that reliability is a first class business concern that cannot be offloaded to a lower layer.