Interoperability in DDS

I am new to the DDS domain and need help understanding the following:
How do you publish common topics between two vendors to achieve interoperability in DDS?
The scenario is:
Suppose there are two vendor products, V1 and V2. V1 has a publisher which publishes on topic T1, and V2 wants to subscribe to this topic. How will the subscriber (V2) know that a topic T1 exists?
I have a similar doubt at the domain level: how will a subscriber know which domain it has to participate in?
I am using OpenDDS.
Thanks

Interoperability between vendors is possible, and regularly tested/demonstrated by the main vendors.
You will need to configure your DDS implementation to use RTPS (I think RTPS 2 currently), rather than any proprietary transport the vendors may use. This might be enabled by default.
In terms of which domain to participate in: you programmatically create a domain participant in a particular domain (which domain it connects to might be controlled by a config file), and all further entities you create (publishers, subscribers, etc.) then belong to that domain participant and therefore operate in that domain.
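As a minimal sketch of that, assuming OpenDDS (which the question mentions) and the classic C++ DCPS API, creating a participant could look like this; the domain ID 42 is just an example, and both vendors' applications must agree on it:

```cpp
#include <dds/DCPS/Service_Participant.h>
#include <dds/DCPS/Marked_Default_Qos.h>

int main(int argc, char* argv[]) {
  // Initialize the participant factory (picks up -DCPS* options / config file,
  // which is where the RTPS transport and discovery can be configured).
  DDS::DomainParticipantFactory_var dpf = TheParticipantFactoryWithArgs(argc, argv);

  // Join domain 42; the other vendor's application must use the same domain ID.
  DDS::DomainParticipant_var participant =
      dpf->create_participant(42,                       // domain ID
                              PARTICIPANT_QOS_DEFAULT,  // default participant QoS
                              0,                        // no listener
                              OpenDDS::DCPS::DEFAULT_STATUS_MASK);

  // ... create topics, publishers, subscribers, readers/writers from here ...

  participant->delete_contained_entities();
  dpf->delete_participant(participant);
  TheServiceParticipant->shutdown();
  return 0;
}
```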

To build on rcs's answer a bit... the actual amount of work you have to do can depend on the DDS implementations involved (OpenDDS, RTI, Prismtech...) because they'll have different defaults. If you use the same implementation on both ends, your configuration becomes a lot simpler, since defaults should line up for things like domain and RTPS.
You will need to make sure the following match:
Domain ID
Domain Partition
Transport (I recommend RTPS; FWIW, the version difference between 2.1 and 2.2 hasn't mattered in my experience)
TCP or UDP
Discovery port and data port - this will matter more or less depending on which implementations you use and whether or not you're using the same one on both ends of the connection; if you're using the same one, they'll have the same defaults.
Make sure the topic one side publishes matches the topic the other subscribes to; this applies to both the Topic name and the Type (see the sketch after this list)
Serialization of the data
Discovery (unicast vs. multicast; make sure whatever setup you choose is valid, e.g. both devices are in the same multicast group)
QoS settings will need to line up, though I think defaults will likely work (read more here)
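To make the topic/type point above concrete, here is roughly what the publisher side looks like with the classic DCPS C++ API; `Messenger::Message` and the topic name `T1` are placeholders standing in for the question's scenario:

```cpp
// Vendor V1 (publisher side): register the IDL type and create the topic.
Messenger::MessageTypeSupport_var ts = new Messenger::MessageTypeSupportImpl();
ts->register_type(participant, "");          // "" => use the default type name

CORBA::String_var type_name = ts->get_type_name();
DDS::Topic_var topic =
    participant->create_topic("T1",          // topic name: must match on both ends
                              type_name,     // type name: must match on both ends
                              TOPIC_QOS_DEFAULT,
                              0,
                              OpenDDS::DCPS::DEFAULT_STATUS_MASK);

// Vendor V2 (subscriber side) must create a topic with the same name ("T1")
// and an identical type definition (same IDL); otherwise the reader and
// writer will not be matched during discovery.
```

Discovery is what answers the original question: once both participants are in the same domain and speak RTPS, the built-in discovery traffic advertises which topics and types each side offers or requests, so the subscriber finds T1 without any extra lookup code.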
Get the Shapes demo working between the machines you're working on first; this gives you a basic sanity check that communication is possible with the given configuration and network setup. Every vendor/implementation that I've seen ships a Shapes demo to run; for example, here is RTI's.
That's all I can think of right now, hope that helps. I have found DDS documentation to be really good, especially if you know when you can (and when you can't) apply another vendor's documentation to your implementation (e.g. whether an answer found in RTI's docs or forum works for your OpenDDS application). Often the solutions are similar, but you'll find RTI supports the most features, and RTI + Prismtech have some of the best documentation.

The DDS RTPS protocol exchanges discovery information so that different applications participating in the same domain (!) know who is out there, and what they are offering/requesting. You need to make sure that the two applications are using the same domain ID (specified on the domain participant). Also, as some implementations allow for different transport options, make sure to use RTPS (sometimes called DDSI) networking.
The RTPS specification contains a mapping from domain ID to port numbers, so if applications from different vendors use the same ID it should just work. Implementations might, however, override port numbers with configuration.
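For reference, the RTPS v2 default port mapping works out as follows (PB=7400, DG=250, PG=2, d0=0, d1=10, d2=1, d3=11 are the defaults from the specification; implementations may let you override them):

```cpp
#include <iostream>

int main() {
  // RTPS v2 default port-mapping parameters.
  const int PB = 7400, DG = 250, PG = 2;
  const int d0 = 0, d1 = 10, d2 = 1, d3 = 11;

  const int domain_id = 0;       // must match in both applications
  const int participant_id = 0;  // first participant on this host

  const int spdp_multicast = PB + DG * domain_id + d0;                       // discovery, multicast
  const int user_multicast = PB + DG * domain_id + d2;                       // user data, multicast
  const int spdp_unicast   = PB + DG * domain_id + d1 + PG * participant_id; // discovery, unicast
  const int user_unicast   = PB + DG * domain_id + d3 + PG * participant_id; // user data, unicast

  // For domain 0, participant 0 this prints: 7400 7401 7410 7411
  std::cout << spdp_multicast << ' ' << user_multicast << ' '
            << spdp_unicast << ' ' << user_unicast << '\n';
  return 0;
}
```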
To maximize the chance that the applications communicate properly, ensure they use the same IDL data model. Vendors have different approaches to type evolution and to mapping types that don't exactly match, and not all of them implement the XTypes specification (yet).
Also, as some implementations are stricter than others, ensure that you stay within the bounds of the specification. This means that a topic name should only contain alphanumerical characters (I sometimes see ':' used to indicate scoping; that is not allowed).
Things that will definitely not work between vendors are TRANSIENT/PERSISTENT durability and communication over TCP, as neither has been standardized yet. TRANSIENT_LOCAL should work. The difference between TRANSIENT_LOCAL and TRANSIENT is that with TRANSIENT_LOCAL, data is no longer aligned after a publisher (writer) leaves the system, whereas with TRANSIENT that data will still be available.
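If you do depend on durability, requesting TRANSIENT_LOCAL explicitly keeps you on the portable side. A sketch with the classic DCPS C++ API, assuming `publisher` and `topic` already exist (the matching reader must also request TRANSIENT_LOCAL to actually receive the historical samples):

```cpp
DDS::DataWriterQos writer_qos;
publisher->get_default_datawriter_qos(writer_qos);

// TRANSIENT_LOCAL: late-joining readers receive historical samples as long as
// the writer is still alive -- portable, unlike TRANSIENT/PERSISTENT, which
// depend on durability services that are not standardized between vendors.
writer_qos.durability.kind  = DDS::TRANSIENT_LOCAL_DURABILITY_QOS;
writer_qos.reliability.kind = DDS::RELIABLE_RELIABILITY_QOS;   // durability needs reliable delivery
writer_qos.history.kind     = DDS::KEEP_LAST_HISTORY_QOS;
writer_qos.history.depth    = 10;                              // samples retained per instance

DDS::DataWriter_var writer =
    publisher->create_datawriter(topic, writer_qos, 0, OpenDDS::DCPS::DEFAULT_STATUS_MASK);
```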
Also note that for API-level interoperability between vendors, your best chance is to use the new isocpp API, since that one has been implemented pretty consistently across the vendor implementations I've seen.
Hope that helps!

Separate or Merge Kafka Consumer and API services together

After recently reading about event-based architecture, I wanted to change my architecture into one making use of such strengths.
I have two services that expose an API (crud, graphql), each based around a different entity and using a different database.
However, now whenever someone deletes a certain type of row in service A, I need to delete a coupled row in service B.
So I added Kafka to my design, and whenever I delete the entity in service A, it publishes a notification message into Kafka.
In service B I am currently consuming the same topic, so whenever a new message is received, the service also handles the deletion of the matching entity; it already has access to that table because the same service already exposes the CRUD API to users.
What I'm not sure about is whether putting the Kafka consumer and the API together in the same service is a good design. It contradicts the single-responsibility principle in microservices, and if there is an issue in one part of the service, it will likely affect the other.
However, creating a new service will also cause me issues - I will have two different services accessing the same table, and I will have to make sure I always maintain them together whenever making changes to the table or database.
What is the best practice in a situation such as this? Is it inevitable for different services to have data coupling, or is it not so bad to use the same service for two similar usages?
There is nothing wrong with using Kafka... You could, however, do the same with point-to-point service communication (JSON-RPC / gRPC).
The real problem you seem to be asking about is dual-writes or race-conditions leading to data inconsistency.
While you could use a single consumer group and one topic-partition to preserve order and locking across consumers interested in those events, that does not lock out other consumer-groups from interacting with the database to perform the same action. Therefore, Kafka itself won't help with this problem.
You'll need external, distributed locks (e.g. Zookeeper can be used here) that fence off your database clients while you are performing actions against it.
To the original question: Kafka Connect offers an API and is also a Producer and Consumer client (and would be recommended for database interactions). So are Confluent Schema Registry, ksqlDB, etc.
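To make the service B side concrete, a minimal consumer sketch using librdkafka's C++ API is below; the broker address, topic name, group ID and the delete handler are all hypothetical placeholders for your setup:

```cpp
#include <librdkafka/rdkafkacpp.h>
#include <iostream>
#include <memory>
#include <string>

int main() {
  std::string errstr;

  // Consumer configuration; every value here is a placeholder.
  std::unique_ptr<RdKafka::Conf> conf(RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL));
  conf->set("bootstrap.servers", "localhost:9092", errstr);
  conf->set("group.id", "service-b-deleters", errstr);   // one group => each event handled once
  conf->set("auto.offset.reset", "earliest", errstr);

  std::unique_ptr<RdKafka::KafkaConsumer> consumer(
      RdKafka::KafkaConsumer::create(conf.get(), errstr));
  consumer->subscribe({"entity-a-deleted"});

  while (true) {
    std::unique_ptr<RdKafka::Message> msg(consumer->consume(1000 /* ms */));
    if (msg->err() != RdKafka::ERR_NO_ERROR) continue;    // timeouts, partition EOF, etc.

    std::string entity_id(static_cast<const char*>(msg->payload()), msg->len());
    // delete_coupled_row(entity_id);  // hypothetical: remove the matching row in service B
    std::cout << "delete event received for entity " << entity_id << '\n';
  }
}
```

Note that this only illustrates the plumbing; it does not address the dual-write/locking concern described above.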
I believe that the consumer in your service B would not be considered "a service", or part of "the service", in the sense that it is not called as part of the code that services requests. Yet it does provide functionality that is required for the domain function of your microservice. So yes, I would consider the consumer part of the microservice in terms of team/domain responsibility.
There may be different opinions on whether the consumer code should share the same code base/repo as the "service" code. Some people believe it is better to limit the repo scope to a single "executable", while others believe it is beneficial to keep the domain scope and have everything in a single repo. I probably belong to the latter group but do not have a very strong opinion on it. I would argue it is more important to have central documentation / a wiki for the domain that points to the repos involved, etc.

Two-way encryption/authentication between servers and clients

To be honest I don't know if this is the appropriate title since I am completely new to this area, but I will try my best to explain below.
The scenario can be modeled as a group of functionally identical servers and a group of functionally identical clients. Assume each client knows the endpoints of all the servers (possibly from a broker or some kind of name service), and randomly chooses one to talk to.
Problem 1: The client and the server first need to authenticate themselves to each other (i.e. the client must show the server that it's a valid client, vice versa).
Problem 2: After that, the client and server talk to each other over some kind of encryption.
For Problem 1, I don't know what the best solution is. For Problem 2, I'm thinking about letting each client create a private key and give the corresponding public key to the server it talks to right after authentication, so that no one else can decrypt its messages; and letting all servers share a private key and distribute the corresponding public key to all clients, so that the external world (including the clients) can't decrypt what the clients send to the servers.
These are probably very naive approaches though, so I'd really appreciate any help & thoughts on the problems. Thank you.
I asked a similar question about half a year ago here and was redirected to Information Security.
After reading through my answer and rethinking your question, if you still have questions this broad, I suggest asking there. Stack Overflow, from what I know, is more about programming (and thus security in programming) than security concepts. Either way, you will probably have to migrate there at some point during your project.
To begin with, you need to seriously consider what needs protecting in your system. Like here (check Gilles' comment and others), one of the first and most important things to do is to think over what security measures you have to take. You just mentioned authentication and encryption, but there are many more things that are important, like data integrity. Check the wiki page on security measures. After knowing more about these, you can choose which (if any) encryption algorithms, hash functions and other primitives you need.
For example, forgetting about data integrity means forgetting about hashing, which is the most popular security measure I encounter on SO. By applying encryption, you 'merely' can expect no one else to be able to read the message. But you cannot be sure it reaches the destination unchanged (if at all), whether because of interceptors or signal losses. I assume you need to be sure.
A typical architecture I am aware of assumes asymmetric encryption for exchanging a secret session key, and then communicating using symmetric encryption with that key. This is because public-key infrastructure (PKI) assumes that the key of one of the sides is publicly known, making communication easier, but certainly slower (e.g. due to key length: RSA [asymmetric] starts at 512 bits, and a typical key length now is 2048 bits, compared to the weakest, but still secure, AES [symmetric], whose key lengths start at 128 bits). The problem is, as you stated, that the server and user are not authenticated to each other, so the server does not really know whether the party sending the data really is who it claims to be. Also, the data could have been changed in transit.
To prevent that, you need a so-called 'key exchange algorithm', such as one of the Diffie-Hellman schemes (so DH might be the 'raw' answer to both of your problems).
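To illustrate the idea only (never use numbers this small in practice), here is the textbook Diffie-Hellman exchange worked through with toy values; real systems use 2048-bit-plus primes or elliptic-curve groups from a vetted library:

```cpp
#include <cstdint>
#include <iostream>

// Modular exponentiation (base^exp mod m), small-number toy version.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t m) {
  std::uint64_t result = 1;
  base %= m;
  while (exp > 0) {
    if (exp & 1) result = (result * base) % m;
    base = (base * base) % m;
    exp >>= 1;
  }
  return result;
}

int main() {
  // Public parameters, agreed in the open (toy-sized on purpose).
  const std::uint64_t p = 23, g = 5;

  const std::uint64_t a = 6;    // client's secret exponent (never transmitted)
  const std::uint64_t b = 15;   // server's secret exponent (never transmitted)

  const std::uint64_t A = pow_mod(g, a, p);   // client sends A = 8
  const std::uint64_t B = pow_mod(g, b, p);   // server sends B = 19

  // Both sides derive the same shared secret (2) without ever sending it.
  std::cout << pow_mod(B, a, p) << " == " << pow_mod(A, b, p) << '\n';
}
```

Keep in mind that plain DH on its own does not authenticate the peers; in practice it is combined with certificates or another authentication step, which is what the protocols below handle for you.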
Taking all of the above into consideration, you might want to use one (or more) of the popular protocols and/or services to define your architecture. Popular ones are SSH, SSL/TLS and IPsec. Read about them, define which services you need, and check whether they are provided by one of the protocols above and whether you are willing to use it. If not, you can always design your own using raw crypto algorithms and digests (hashes).
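If you go the TLS route, mutual (two-way) TLS covers both problems at once: each side presents a certificate (authentication), and the channel is encrypted. A server-side OpenSSL sketch requiring client certificates; all file names are placeholders:

```cpp
#include <openssl/ssl.h>

// Build a server-side TLS context that requires client certificates (mutual TLS).
SSL_CTX* make_server_ctx() {
  SSL_CTX* ctx = SSL_CTX_new(TLS_server_method());

  // The server's own certificate chain and private key (placeholder paths).
  SSL_CTX_use_certificate_chain_file(ctx, "server-cert.pem");
  SSL_CTX_use_PrivateKey_file(ctx, "server-key.pem", SSL_FILETYPE_PEM);

  // CA certificate used to validate the certificates presented by clients.
  SSL_CTX_load_verify_locations(ctx, "client-ca.pem", nullptr);

  // Refuse the handshake unless the client presents a valid certificate.
  SSL_CTX_set_verify(ctx, SSL_VERIFY_PEER | SSL_VERIFY_FAIL_IF_NO_PEER_CERT, nullptr);
  return ctx;
}
```

The clients do the mirror image: they load their own certificate/key and verify the server's certificate against the issuing CA, which gives you the two-way authentication from Problem 1 and the encrypted channel from Problem 2.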

Does WebRTC allow one-to-many (multicast) connections?

I've read a lot about WebRTC, but there's one question that still remains. I hope you can help me with that:
Does WebRTC allow me to create a one-to-many connection? I don't mean "being able to have multiple connections to different computers"; I really mean having one connection that multicasts its data to multiple endpoints without the need to "upload" the data once for each endpoint. Will it be possible to send one single packet to the web that, when it reaches the web, magically splits itself into multiple packets with different targets?
I hope you get what I'm looking for :)
Until now, I've only seen one-to-one connections, or solutions that have one connection to a central server that does the multicast for them (which usually results in twice the ping).
But to me, one-to-one connections don't seem to be really useful (due to the low upload bandwidth of clients), and solutions with a central server are also possible without WebRTC (using WebSockets), so the only real use case for WebRTC would be one-to-many connections.
So.. is this something that will be possible in the future? Or is it already possible today?
Three things:
IP multicast in the Internet is not possible at the moment (multicast addresses are not routed by ISPs)
WebRTC fits many use cases beyond one-to-many communication, just have a look at this document: https://datatracker.ietf.org/doc/html/draft-ietf-rtcweb-use-cases-and-requirements-06
WebRTC connections between browsers are always encrypted (using SRTP for A/V data and DTLS for generic data) and the encryption parameters (session keys etc.) are negotiated for every connection separately. How would you do that in a multicast environment (think of it as a distribution tree)?
So no, WebRTC cannot be used with IP multicast.
I would answer "it doesn't, for now", because as a programmer I can tell you there are a number of ways browser developers could make it work if we (users) insist on its importance. But how? Since there's encryption, they could allow sharing the session's encryption keys with a group of 'registered' (multicast) receivers. And how would those keys be shared? Well, the web was created for sharing: the most obvious way is through web-server mediation and a JavaScript WebRTC API function to load the user keys. Since multicast is most often used for efficient video distribution, you would have an RTP/SRTP video server, and the web server can coexist on the same machine. If this were ever extended to web browsers, the "server" role could simply be played by the browser that created the multicast stream (the sender); the clients would need to know which one that is.
Again: in December 2013, this is still not possible. Multicast is allowed on the Internet only in:
some experimental WAN networks
some ISP networks that carry internet + video
LANs (when enabled at the switch level; cheap switches transmit it to all ports). But you might be an ISP, a researcher, or a LAN user, so it may still be relevant to you.

Dynamic server discovery list

I'd like to create a web service that an application server can contact to add itself to a list of servers implementing the application. Clients could then contact the service to get a list of servers. Something similar to how Minecraft's heartbeats work for adding your server to the main server list.
I could implement it myself pretty easily, but I'm hoping someone has already created something like this.
Advanced features would be useful. Things like:
Allowing a client to perform queries on application-specific properties like the number of users currently connected to the server
Distributing the server list across more than one machine
Timing out a server's entry in the list if it hasn't sent a heartbeat within some amount of time
Does anyone know of a service like this? I know there are open protocols and servers for doing local-LAN service discovery, but this would be a WAN service.
The protocols I could find that had any relevance to your intended application are these:
XRDS (eXtensible Resource Descriptor Sequence).
XMPP Service Discovery protocol.
The XRDS documentation is obtuse, but you may be able to push service descriptions in XML format. The service type specification might be generic, but I get a headache from trying to decipher committee-speak.
The XMPP Service Discovery protocol (part of the protocol Formerly Known As Jabber) also looked promising, but it seems that even though you could push your service description, they expect it to be one of the services mentioned on this list. Extending it would make it nonstandard.
Finally, I found something called seap (SErvice Announcement Protocol). It's old, it's rickety, the source may be proprietary, it's written in C and Perl, and it's a kludge, but it seems to do what you want, kind of.
It seems like pushing a service announcement pulse is such an application-specific and trivial problem, that almost nobody has considered solving the general case.
My advice? Read the protocols and sources mentioned above for inspiration (I'd start with seap), and then write, implement, and publish a generic (probably xml-based) protocol yourself. All the existing ones seem to be either application-specific, incomprehensible, or a kludge.
Basically, you can write it yourself, though I am not aware of a publicly available one (I wrote one over 10 years ago, but for a company).
database (TableCols: auto-counter, svr_name, svr_ip, check_in_time, any-other-data)
code to receive heartbeats (http://&lt;your-app.com&gt;?svr_name=XYZ&svr_ip=P.Q.R.S)
code to list out servers within certain check_in_time
code to do some housecleaning once a while (eg: purge old records)
To send a heartbeat out, you only need to make an HTTP call: on Linux use wget with crontab, on Windows use wget.exe with Task Scheduler.
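For illustration, a minimal heartbeat sender using libcurl is below; the URL and query parameters are placeholders following the scheme above, and cron or Task Scheduler would run the resulting binary periodically:

```cpp
#include <curl/curl.h>
#include <iostream>

int main() {
  curl_global_init(CURL_GLOBAL_DEFAULT);
  CURL* curl = curl_easy_init();
  if (!curl) return 1;

  // Hypothetical registration endpoint, mirroring the query-string scheme above.
  curl_easy_setopt(curl, CURLOPT_URL,
                   "http://example.com/heartbeat?svr_name=XYZ&svr_ip=203.0.113.7");
  curl_easy_setopt(curl, CURLOPT_TIMEOUT, 10L);  // don't hang if the registry is down

  CURLcode res = curl_easy_perform(curl);
  if (res != CURLE_OK)
    std::cerr << "heartbeat failed: " << curl_easy_strerror(res) << '\n';

  curl_easy_cleanup(curl);
  curl_global_cleanup();
  return res == CURLE_OK ? 0 : 1;
}
```

A crontab entry such as `*/5 * * * * /usr/local/bin/heartbeat` would then report in every five minutes, and the list service can time out entries whose check_in_time is too old.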
It is application specific, so even if you wrote one yourself, others can't use it without modifying the source code.

Verify WCF interface is the same between client and server applications

We've got a Windows service that is connected to various client applications via a duplex WCF channel. The client and server applications are installed on different machines, in different locations, potentially at widely different times, and by different people. In addition, the client can be pointed at a different machine running the same Windows service at startup.
Going forward, we know that the interface between the client and the server applications will likely evolve. The application in the field will be administered by local IT personnel, and we have no real control over which version of either of these applications will be installed when/where or which will be connecting to the other. Since these are installed at various physical locations and by different people, there's a high likelihood that either the client or server application will be out of date compared to the other.
Since we can't control what versions of the applications in the field are trying to connect to each other, I'd like to be able to verify that the contracts between the client application and the server application are compatible.
Some things I'm looking for (may not be able to realistically get them all):
I don't think I care if the server's interface is newer or older, as long as the server's interface is a super-set of the client's
I want to use something other than an "interface version number". Any developer-kept version number will eventually be forgotten about or missed.
I'd like to use a computed interface comparison if that's possible
How can I do this? Any ideas on how to go about this would be greatly appreciated.
Seems like this is a case of designing your service for versioning. WCF has very good versioning capabilities and extension points. Here are a couple of good MSDN articles on versioning the service contract and more specifically the data contracts. For backward and "forward" compatible versioning look at this article on using the IExtensibleDataObject interface.
If the server's endpoint has metadata publishing enabled, you can programmatically inspect an endpoint's interface by using the MetadataResolver class. This class lets you retrieve the metadata from the server endpoint, and in your case, you would be interested in the ContractDescription which contains the list of all operations. You could then compare the list of operations to your client proxy's endpoint operations.
Of course, the comparison of the lists of operations would still need to be implemented; you could simply compare the operation names and fail if one of the client's operations is not found among the server's operations. This would not necessarily cover all incompatibilities, e.g. request/response schema changes.
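The comparison itself is just a superset check over operation names. A language-agnostic sketch (written in C++ here purely for illustration; the operation names are hypothetical) could look like this:

```cpp
#include <algorithm>
#include <set>
#include <string>

// True when every operation the client expects exists on the server,
// i.e. the server's contract is a superset of the client's.
bool contracts_compatible(const std::set<std::string>& server_ops,
                          const std::set<std::string>& client_ops) {
  // std::set iterates in sorted order, which std::includes requires.
  return std::includes(server_ops.begin(), server_ops.end(),
                       client_ops.begin(), client_ops.end());
}

// Example with hypothetical operation names:
//   server_ops = {"GetStatus", "Submit", "Cancel"}
//   client_ops = {"GetStatus", "Submit"}     -> compatible
//   client_ops = {"GetStatus", "SubmitV2"}   -> incompatible
```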
I have not tried implementing any of this by the way, so it's more of a theoretical view of your problem. If you don't want to fiddle with the framework, you could implement a custom operation that would return the list of operation names. This would be of minimal effort but is less standards-compliant.