In DDS can the write/publisher cache be read - data-distribution-service

This question is regarding a roadblock I am currently facing in DDS. I am able to read the Subscriber/Reader Cache using the QueryFilter provided by the respective implementations. But I now want to read the Publisher/Writer Cache and I am not able to do that .
The use case is I am publishing a list of objects and do not want to maintain a list myself locally, since DDS is already doing it. At the Subscriber I am able to get object instances, like I said earlier, using the QueryFilter. But is there any way to do so with the Publisher? I wanted to avoid creating a subscriber at the publisher end or maintain the list locally as well as in GDS.
I am programming in C++ and using OpenSplice, but please do answer even if it is for some other implementation.

There is no DDS standard API available for reading the cache on the DataWriter side. As far as I know, none of the DDS implementations offers anything like that.
The use case is I am publishing a list of objects and do not want to
maintain a list myself locally, since DDS is already doing it [in the Publisher/Writer cache].
Well, as a user, you can not be sure what is in the cache on the DataWriter side. The DDS specification does not exactly specify what is in that cache and it does not exist as such in the API.
The purpose of the cache on the DataWriter side is to store data in order to support the quality of service as requested. For a best-effort DataWriter, the cache might not even exist, or contain minimal information about the key-values published. For a reliable DataWriter, the cache might contain samples that are in the process of being delivered reliably, but they might be removed after they have been delivered. For a reliable, non-volatile DataWriter, the cache might contain all samples that need to be available for lat-joining readers.
I say might because it all depends on how the product is implemented.
The only cache-related method on the DataWriter side is lookup_instance().
I wanted to avoid creating a subscriber at the publisher end or
maintain the list locally as well as in GDS.
Creating a DataReader at the publisher end seems to do exactly what you need. Why do you want to avoid that?

Related

Need suggestions: Send multiple images to backend, perform upload operation in backend, send response

I need some best practice guidelines for a backend service in a scenario like this one:
UI sends multiple images for uploading to the backend service
Backend service receives all of the images and processes upload to storage one by one
There can be failure in 1 or multiple image upload
My question is how do I send the response towards UI if my backend service is unable to upload 1 or more file(s).
One way can be to send failed and successful image link together in a JSON response body. So the UI knows about the failure and handles it in its own way.
Another way can be to send only the successfully uploaded images' link which is the best case scenario.
Any suggestions will be welcomed with some reference links.
Use an Orchestrator - something specific that can coordinate multiple actions and provide a meaningful result back to the caller.
This might be as simple as a component sitting in the UI that orchestrates calls to the backend. The UI component and the backend service might be designed as parts of a cohesive solution, or the UI component might simply act as a type of client/proxy/facade to some random backend service.
UI calls the orchestrator with references to all the images it needs uploading.
The orchestrator works through the items, uploading each as you prefer (sequentially or in parallel, etc). For each file, handle errors however you prefer - e.g. try once and die gracefully on failure; put errors into a queue or some other mechanism for retry (how many times is up to you); etc.
Based on rules internal to the orchestrator, return status to the caller.
For potentially long-running processes (like file uploads) make sure the call to the orchestrator is asynchronous.
Rather than only returning "complete" result at the end, the orchestrator might provide a simple status back, allowing callers to get some idea of where processing is at. For example, you might have a call-back (from the orchestrator to it's caller) that simply emits very simple statuses like: processing, failed and complete. A more complex solution would be for the orchestrator to return more specific info like %complete and detailed error info.
Have a look at how the big cloud providers do complex file uploads by reading their documentation and studying their API's.
I need some best practice guidelines for a backend service
In no particular order:
Keep it as simple as possible - generally, the fewer moving parts the better. E.g. pay attention to the Single Responsibility Principle (SRP).
Clean up after yourself. If the upload service generates any data - make sure you have a clean-up process so you don't end up with mountains of un-needed data lying around, especially stuff like image files. If you design an upload solution that maintains state (which is independent of what happens to the images once they are uploaded) then you'll be storing data which probably won't be needed once the images are all processed.
Think about support - not just developer debugging but also operational support. Getting your solution into production is not the end result, it's just the beginning.
If designing this solution across teams (e.g. frontend and backend teams) make sure both teams are involved in the design. If the backend team can't provide a solution that works for the frontend team then it's not going to end well.
Think about the likely error scenarios and how can you handle them.
This isn't really just a question of best practice, as there are multiple ways you could implement it, more than one of which could be valid. This is actually an architecture and design question, with more than one valid answer, hence I don't think it fits as a Stack Overflow question and you will not get references to any one correct approach.
That said, by way of an answer I will outline what I think you need. At a very high level, and not necessarily in this order but taking these factors into account, I would:
Design the UI process flow. For example, you may decide that the user process will have several stages:
User selects first image for upload;
User selects each subsequent image for upload;
User presses some kind of "Go" button after selecting all images;
System now uploads the batch, and user receives a response confirming success or otherwise;
User has option to click through to detailed success/error details.
Design the required success/error reports
Design the data needed to support the overall functionality
Provide one or more APIs giving the upload function and the report function(s) the CRUD access they need to this data
If you hit any specific technical issues at any stage, then please post a new questions accordingly as you go.
As to the point you mentioned, how to send the UI response, there is more than one valid way but I would return a basic success/falure response initially, containing only minimal details such as number of successes, and return more details in further messages in response to user actions (such as clicking through to detailed success/error details), at which point I would retrieve the requested error details from the database.
As I said at the start of my answer, I don't think your question can be answered just in terms of best practices, as it's a whole architecture and design question, but I hope my answer helps you along this path.

Interoperability in DDS

I am new to DDS domain and need to have the below understanding.
how to publish common topics between two vendors to achieve interoperability in DDS?
The Scenario is :
Suppose there are two vendor products V1 and V2. V1 has a publisher which publishes on topic T1. V2 wants to subscribe for this topic.How will the Subscriber(V2) know that there exists a Topic T1?
I have a similar doubt on Domain level.how will a subscriber know to which domain it has to participate in?
I am using OpenDDS.
Thanks
Interoperability between vendors is possible, and regularly tested/demonstrated by the main vendors.
You will need to configure your DDS implementation to use RTPS ( I think RTPS2 currently), rather than any proprietary transport that vendors may use. This might be enabled by default.
In terms of which domain to participate in, you programmatically create a domain participant in a particular domain (which domain it connects to might be controlled by a config file) and all further entities (publishers, subscribers, etc) that you create then belong to that domain participant and therefore operate in that domain
To build on #rcs's answer a bit... the actual amount of work you have to do can depend on DDS implementations (OpenDDS, RTI, Prismtech...) because they'll have different defaults. If you use the same on both ends, then your configuration becomes a lot simpler since defaults should line up for things like domain and RTPS.
You will need to make sure the following match:
Domain ID
Domain Partion
Transport (I recommend RTPS, FWIW version difference between 2.1 and 2.2 hasn't mattered in my experience)
TCP or UDP
Discovery port and data port - this will matter more or less depending which implementations you use and whether or not you're using the same one on both ends of the connection, if use using the same, they'll have the same defaults.
Make sure the topic one publishes matches the topic the other subscribes to, this will apply to the Topic and the Type (see more here)
Serialization of the data
Discovery (unicast vs. multicast, make sure
whatever setup you choose is valid, ex: both devices are in the same
multicast group)
QoS settings will need to line up, though I think defaults will likely work (read more here)
Get the Shapes demo working between the machines you're working on first, this does some basic sanity checking to know that it is possible with the given configuration and network setup. Every vendor/implementation that I've seen has a shapes demo to run, for example, here is RTI's.
That's all I can think of right now, hope that helps. I have found DDS documentation to be really good, especially if you know when you can (and when you can't) use any vendor's documentation's answer for your implementation (ex: answer found on RTI's doc or forum and whether or not it works for your OpenDDS application). Often the solutions are similar, but you'll find RTI supports the most and RTI + Prismtech have some of the best documentation.
The DDS RTPS protocol exchanges discovery information so that different applications participating in the same domain (!) know who is out there, and what they are offering/requesting. You need to make sure that the two applications are using the same domain ID (specified on the domain participant). Also, as some implementations allow for different transport options, make sure to use RTPS (sometimes called DDSI) networking.
The RTPS specification contains a mapping from domain ID to port numbers, so if applications from different vendors use the same ID it should just work. Implementations might however override portnumbers with configuration.
To maximize the chance that the applications communicate properly, ensure they use the same IDL datamodel. Vendors have different approaches to type evolution / mapping types that don't exactly match, and not all of them implement the XTypes specification (yet).
Also, as some implementations are stricter than others, ensure that you stay within bounds of the specification. This means that a topic name should only contain alphanumerical characters (I sometimes see ':' to indicate scoping, that is not allowed).
Things that will definitely not work between vendors is TRANSIENT/PERSISTENT durability or communication over TCP, as both have not been standardized yet. TRANSIENT_LOCAL should work. The difference between TRANSIENT_LOCAL and TRANSIENT is that with TRANSIENT_LOCAL, data is no longer aligned after a publisher (writer) leaves the system, whereas with TRANSIENT that data will still be available.
Also note that for API-level interoperability between vendors, your best chance is to use the new isocpp API, since that one has been implemented pretty consistently across the vendor implementations I've seen.
Hope that helps!

Prevent subscribers from reading certain samples temporarily

We have a situation where there are 2 modules, with one having a publisher and the other subscriber. The publisher is going to publish some samples using key attributes. Is it possible for the publisher to prevent the subscriber from reading certain samples? This case would arise when the module with the publisher is currently updating the sample, which it does not want anybody else to read till it is done. Something like a mutex.
We are planning on using Opensplice DDS but please give your inputs even if they are not specific to Opensplice.
Thanks.
RTI Connext DDS supplies an option to coordinate writes (in the documentation as "coherent write", see Section 6.3.10, and the PRESENTATION QoS.
myPublisher->begin_coherent_changes();
// (writers in that publisher do their writes) /* data captured at publisher */
myPublisher->end_coherent_changes(); /* all writes now leave */
Regards,
rip
If I understand your question properly, then there is no native DDS mechanism to achieve what you are looking for. You wrote:
This case would arise when the module with the publisher is currently updating the sample, which it does not want anybody else to read till it is done. Something like a mutex.
There is no such thing as a "global mutex" in DDS.
However, I suspect you can achieve your goal by adding some information to the data-model and adjust your application logics. For example, you could add an enumeration field to your data; let's say you add a field called status and it can take one of the values CALCULATING or READY.
On the publisher side, in stead of "taking a the mutex", your application could publish a sample with the status value set to CALCULATING. When the calculation is finished, the new sample can be written with the value of status set to READY.
On the subscriber side, you could use a QueryCondition with status=READY as its expression. Read or take actions should only be done through the QueryCondition, using read_w_condition() or take_w_condition(). Whenever the status is not equal to READY, the subscribing side will not see any samples. This approach takes advantage of the mechanism that newer samples overwrite older ones, assuming that your history depth is set to the default value of 1.
If this results in the behaviour that you are looking for, then there are two remaining disadvantages to this approach. First, the application logics get somewhat polluted by the use of the status field and the QueryCondition. This could easily be hidden by an abstraction layer though. It would even be possible to hide it behind some lock/unlock-like interface. The second disadvantage is due to the extra sample going over the wire when setting the status field to CALCULATING. But extra communications can not be avoided anyway if you want to implement a distributed mutex-like functionality. Only if your samples are pretty big and/or high-frequent, this is an issue. In that case, you might have to resort to a dedicated, small Topic for the single purpose of simulating the locking mechanism.
The PRESENTATION Qos is not specific RTI Connext DDS. It is part of the OMG DDS specification. That said the ability to write coherent changes on multiple DataWriters/Topics (as opposed to using a single DataWriter) is part of one of the optional profiles (object model profile), so no all DDS implementations necessariiy support it.
Gerardo

Is Message Queuing the right strategy for a high-bandwidth data feed?

I have a huge network of data-collection servers which generate a large volume of real-time data.
In the past I've provided partners with the ability to get this data in near-real-time using HTTP GET's. But for many reasons I'm eager to ditch this.
So yeah... I'm eager to build out a new distribution system and I was thinking that a Message Queuing System was the way to go.
I need to be able to distribute data from my sources to a number of different partners. Some partners receive all of it, others just get a portion. And, if a partner gets disconnected, they need to be able to reconnect and not miss any data. (Although, for the sake of disk and memory I'd like their queued messages to expire after hour or so)
Lastly I need the system to be able to handle tens of thousands of enqueue's per minute.
Do you think Message Queuing is an appropriate scheme?
I was looking at using RabbitMQ. Is it difficult to maintain?
Thanks Very Much!
-Z
I cannot tell you if it is the right strategy in your specific case, but message products are indeed used in high message rate systems every day.
Much of the investment world uses various products, both commercial (Tibco) and Open source (ZeroMQ) to name just two, to handle market data from exchanges and other sources. These are likely at least as active as your data sensors are.
The publish/subscribe model, where some receivers want some messages and some receivers want all, along with late-join or other so-called guaranteed messaging are indeed standard features on most of these products.
So do go ahead and investigate products, I have not used RabbitMQ myself, so cannot comment on it specifically, however with a minimal abstraction layer, you should be able to insulate yourself from too many platform specific calls, and therefore allow you to swap message-bus implementers if the need arises. (You may even want to build such a shim as part of a proof-of-concept to test out more than one product for your specific purpose. You get experience in multiple products, flesh out the facade layer, and get up to speed on the products)
Good Luck

NServiceBus Dynamic End Points

Is it possible to create end points dynamically at runtime. E.g. Send a message to a known endpoint with details of a new endpoint so that a network node can learn of new nodes on the fly.
NServiceBus does not support this out of the box, but if you really really want it (and you are sure that it is the right way to go), you are free to implement your own message routing and send messages explicitly to an endpoint with bus.Send(endpoint, message).
In a project I am currently involved with, we do this with great success, because it allows us to seamlessly sign services in and out of the system while it is running, resulting in zero downtime during upgrades.
It took a bit of work to get it working though, so I would only recommend this if you are certain that your requirements demand it.