Sending data between server and client in Twisted

I'm trying to transport data between a server and a client implemented with Twisted. As far as I know, using self.transport.write([data]) will work only if data is a string. Is there any other way I can send an object of another type? Thank you!

Sockets carry bytes. That's the only kind of thing they carry. Any two endpoints of a TCP connection can only convey bytes to each other.
Bytes aren't the most useful data structure for every form of communication. So on top of this byte transport, we invent schemes for formatting and interpreting bytes. These are protocols.
Twisted represents protocols as classes, almost always subclasses of twisted.internet.protocol.Protocol, which implement a particular scheme.
These classes have methods for turning something which isn't pure bytes into something which is pure bytes. For example, twisted.protocols.basic.NetstringReceiver is an implementation of the netstring protocol. It turns a particular number of bytes into bytes which represent both the number of bytes and the bytes themselves. This is a rather subtle protocol, since it's not instantly obvious that the byte count is information that needs to be conveyed as well.
These classes also interpret bytes received from the network, in their dataReceived method, according to the protocol they implement, and turn the resulting information into something more structured. NetstringReceiver uses the length information to accept exactly the right number of bytes from the network and then deliver them to its stringReceived callback as a single Python str instance.
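For illustration, here is a minimal sketch of a netstring-based echo server (the class name and port number are invented for this example; in current Twisted on Python 3, the payload delivered to stringReceived is bytes):

from twisted.internet import protocol, reactor
from twisted.protocols.basic import NetstringReceiver

class Echo(NetstringReceiver):
    def stringReceived(self, data):
        # Called with exactly one complete, decoded netstring payload.
        self.sendString(data)  # re-frames the payload as a netstring and writes it out

factory = protocol.ServerFactory()
factory.protocol = Echo
reactor.listenTCP(8007, factory)
reactor.run()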
Other protocols do more than NetstringReceiver. For example, twisted.protocols.ftp includes an implementation of the FTP protocol. FTP is a protocol geared towards passing file listings and files over a socket (or several sockets, actually). twisted.mail.pop3 implements POP3, a protocol for transferring email over sockets.
There are lots and lots of different protocols because there are lots and lots of different things you might want to do. Depending on exactly what you're trying to do, there are probably different ways to convert to and from bytes to make things easier or faster or more robust (and so on). So there's no single protocol that is ideal for the general case. That includes the case of "sending an object", since objects can take many different forms, and there may be many different reasons you want to send them, and many different ways you might want to handle things like mutation of an object you'd previously sent, and so on.
You probably want to spend a little time thinking about what kind of communication you need. This should suggest certain things about the protocol you'll select to do the communication.
For example, if you want to be able to call methods on Python objects that exist on the other side of a connection, then Twisted Spread might be interesting.
If you want something cross-language instead, and only need to convey simple types, like integers, strings, and lists, then XML-RPC (Twisted How-To) might be a better fit.
If you need a protocol that's more space efficient than XML-RPC and supports serialization of more complicated types, then AMP might be more appropriate.
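As a taste of AMP, here is a minimal sketch of a command and its responder, adapted from the examples in the Twisted documentation (the Sum command and its fields are illustrative):

from twisted.protocols import amp

class Sum(amp.Command):
    arguments = [(b'a', amp.Integer()),
                 (b'b', amp.Integer())]
    response = [(b'total', amp.Integer())]

class Math(amp.AMP):
    @Sum.responder
    def sum(self, a, b):
        # AMP handles turning the structured arguments and this
        # structured response into bytes on the wire.
        return {'total': a + b}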
And the list goes on. :)

Related

Serialization of data for protocol implementation

I am to implement a communication protocol. The data structures used in the protocol are defined byte-by-byte for each message:
bytes 1-2 -> STX bytes
byte 3 -> message type
bytes 4-5 -> size of payload
bytes 6-... -> payload bytes (unsigned bytes)
bytes ...-...+1 -> checksum over bytes 3-...
byte ...+2 -> end byte
The example above has a variable payload size, but some messages are also fixed-size.
I have checked a serialization library, namely "Protocol Buffers", for this purpose, but I concluded that protobuf is not compliant, since its variable-length (varint) encoding changes the serialized layout.
Similar libraries exist (FlatBuffers, Cap'n Proto), but I am not sure whether they can be used for this purpose.
So, is there a framework to define the interface structures and generate appropriate code (data structures + parser + serializer, with support for multiple languages if possible) for the defined interface?
Or what is the best approach you would suggest for this purpose?
Defining the messages used in a protocol by defining what each byte means is, well, old-fashioned. Having said that, an awful lot of current protocols are defined that way.
If you can, it's better to start off with a schema for the protocol (e.g. a .proto file for Google Protocol Buffers, or an .asn file for ASN.1; there are many to choose from), defining the messages you want to exchange, and then use your chosen serialisation technology's tools (e.g. protoc for GPB, asn1c for ASN.1, etc.) to generate code.
That schema would be defining the content of the "payload" bytes in your example, and you'd leave it up to GPB or whatever to work out how to convey message type, size and length for you. Different serialisation technologies have different capabilities in this regard. In GPB you'd use a oneof structure to incorporate all the different types of content you want to send into a single structure, but GPB doesn't demarcate between different messages on the wire (you have to add that yourself, perhaps by sending messages using ZeroMQ). Several ASN.1 wire formats do demarcate between different messages, saving you the bother (useful on a raw stream connection). XML is also demarcated. I don't think Cap'n Proto does demarcation.
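To make the demarcation point concrete: if you went the GPB route over a raw stream connection, a common hand-rolled fix is to length-prefix each serialized message. A minimal Python sketch, assuming a 4-byte big-endian length prefix and a blocking stream that returns the full requested count:

import struct

def frame(payload: bytes) -> bytes:
    # Prefix each message with its length so the receiver can find
    # message boundaries on an otherwise undemarcated byte stream.
    return struct.pack('>I', len(payload)) + payload

def read_frame(stream) -> bytes:
    (size,) = struct.unpack('>I', stream.read(4))
    return stream.read(size)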
If you're stuck with the protocol as defined byte-by-byte, exactly as you've shown, it's going to be difficult to find a serialisation technology that usefully matches. You'd likely be condemned to writing all the code yourself.
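For what it's worth, that hand-written code need not be large. Here is a Python sketch for the layout in the question; the STX/end values and the checksum algorithm are assumptions, since the question doesn't specify them:

import struct

STX = b'\x02\x02'  # assumed start-of-message bytes
END = 0x03         # assumed end byte

def checksum(data: bytes) -> int:
    # Assumed algorithm: 16-bit additive checksum over bytes 3-...
    return sum(data) & 0xFFFF

def serialize(msg_type: int, payload: bytes) -> bytes:
    body = struct.pack('>BH', msg_type, len(payload)) + payload
    return STX + body + struct.pack('>HB', checksum(body), END)

def parse(frame: bytes):
    if frame[:2] != STX or frame[-1] != END:
        raise ValueError('bad framing')
    msg_type, size = struct.unpack('>BH', frame[2:5])
    payload = frame[5:5 + size]
    (expected,) = struct.unpack('>H', frame[5 + size:7 + size])
    if checksum(frame[2:5 + size]) != expected:
        raise ValueError('bad checksum')
    return msg_type, payload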

What would be the best way of reusing UVM sequences for different environments

Let's say I have a DUT (e.g. an L2 cache) with an AXI bus on its master port, and I have created a class AXI_transfer extended from sequence_item, 100 sequences of interesting test scenarios, and a UVM driver. Now, the bus protocol of the DUT has changed from AXI to AHB. The testbench components that need to be modified are the sequence_item and the driver (because they are protocol dependent). Now, I don't want to redevelop the sequences for AHB, because they are transaction-level scenarios. Instead, I'd like to reuse all my sequences tied to AXI_transfer items. What would be the best methodology?
My idea is that I define a base_transfer extended from sequence_item and extend AXI_transfer and AHB_transfer from this base_transfer. Also, I modify all my sequences to be parameterized with this base_transfer type. Now, in my UVM test, I can do
base_transfer::type_id::set_type_override(AXI_transfer::get_type());
if I need to use AXI_transfer or
base_transfer::type_id::set_type_override(AHB_transfer::get_type());
if I need to use AHB_transfer. For the driver, I need to develop two drivers -- one for AXI and the other for AHB.
Do you think this would work? If not what other methods are recommended?
In general, I believe you seek a LAYERED approach. Your upper layer simply sends and receives abstract traffic. It's up to the lower layer to handle details of the protocol.
This is exactly the approach used by the UVM register layer's uvm_reg_adapter. See something like this: http://cluelogic.com/2012/10/uvm-tutorial-for-candy-lovers-register-abstraction
In practice, what is causing you to have a hundred different sequences for an interface? The things that are causing you to create additional sequences are quite likely the kind of things that will cause difficulty translating between protocols. Your AXI/ACE will certainly use memory barriers, but how are you going to create them on an AHB interface, for instance?

Deciding extent of coupling

I have a component which exposes an API with some 10 functions in all. I can think of two ways to achieve it:
Expose all of this functionality as separate functions.
Expose only one function which takes XML as input. Based on the request_Type specified and the parameters passed in the XML, I internally call one of the respective functions.
Q1. Will the second design be more loosely coupled than the first?
I always read about how I should try to make my components loosely coupled; should I really go to this extent to achieve loose coupling?
Q2. Which one of these would be a better design in terms of OOP and why?
Edit:
If I am exposing this API over D-Bus for others to use, will type checking still be a consideration when comparing the two approaches? From what I understand, type checking is done at compile time, but when this function is exposed over some IPC, does the issue of type checking still come into the picture?
The two alternatives you propose do not differ in the (obviously quite large) number of "functions" you want to offer from your API. However, the second seems to have many disadvantages: you lose any strong type checking, it becomes much harder to document the functionality, etc. (The only advantage I see is that you don't need to change your API when you add functionality. But the disadvantage is that users will not be able to discover API changes, like deleted functions, until run time.)
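To make that disadvantage concrete, here is an illustrative sketch (all names invented) of the same capability exposed both ways:

from xml.etree import ElementTree

class Thermostat:
    # Alternative 1: separate entry points. The signature documents
    # itself, and a wrong call fails loudly and early.
    def set_target(self, celsius: float) -> None: ...

class ThermostatXml:
    # Alternative 2: one stringly-typed entry point. Every call looks
    # the same, and a wrong request_Type or parameter fails only at run time.
    def handle(self, request_xml: str) -> None:
        root = ElementTree.fromstring(request_xml)
        if root.get('request_Type') != 'set_target':
            raise ValueError('unknown request_Type')
        celsius = float(root.get('celsius'))  # missing/bad attribute blows up here
        ...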
What is more related to this question is the Single Responsibility Principle (http://en.wikipedia.org/wiki/Single_responsibility_principle). As you are talking about OOP, you should not expose your ten or so functions within one class, but split them among different classes, each with a single responsibility. Defining good "responsibilities" and roles requires some practice, but following some basic guidelines will help you get started quickly. See Are there any rules for OOP? for a good starting point.
Reply to the question edit
I haven't used D-Bus, so this might be totally wrong. But from a quick look at the tutorial I read:
Each object supports one or more interfaces. Think of an interface as a named group of methods and signals, just as it is in GLib or Qt or Java. Interfaces define the type of an object instance.
DBus identifies interfaces with a simple namespaced string, something like org.freedesktop.Introspectable. Most bindings will map these interface names directly to the appropriate programming language construct, for example to Java interfaces or C++ pure virtual classes.
As far as I understand, D-Bus has the concept of different objects which provide interfaces consisting of several methods. This means (to me) that my answer above still applies. The "D-Bus native" way of specifying your API would be to exhibit interfaces, and I don't see any reason why good OOP design guidelines shouldn't be valid here. As D-Bus seems to map these even to native language constructs, this is even more likely.
Of course, nobody keeps you from building your own API description language in XML. However, things like that are a kind of abuse of the underlying technique. You should have good reasons for doing such things.

Why do we use serialization?

Why do we need to use serialization?
If we want to send an object or a piece of data through a network, we can use streams of bytes. If we want to save some data to disk, we can again use binary mode along with byte streams and save it.
So what's the advantage of using serialization?
Technically, at the low level, your serialized object will also end up as a stream of bytes on your cable or in your filesystem...
So you can also think of serialization as a standardized and already available way of converting your objects to a stream of bytes. Storing/transferring objects is a very common requirement, and there is little point in reinventing this wheel in every application.
As others have mentioned, you also know that this object->stream_of_bytes implementation is quite robust, tested, and generally architecture-independent.
This does not mean it is the only acceptable way to save or transfer an object: in some cases, you'll have to implement your own methods, for example to avoid saving unnecessary/private members (for example for security or performance reasons). But if you are in a simple case, you can make your life easier by using the serialization/deserialization of your framework, language or VM instead of having to implement it by yourself.
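For example, in Python the built-in pickle module gives you that ready-made object->bytes conversion (other stacks have equivalents, e.g. java.io.Serializable or NSCoding); the record value here is just an example:

import pickle

record = {'id': 42, 'tags': ['a', 'b']}

data = pickle.dumps(record)    # object -> stream of bytes
restored = pickle.loads(data)  # bytes  -> an equivalent object
assert restored == record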
Hope this helps.
Quoting from the book Designing Data-Intensive Applications:
Programs usually work with data in (at least) two different representations:
In memory, data is kept in objects, structs, lists, arrays, hash tables, trees, and so on. These data structures are optimized for efficient access and manipulation by the CPU (typically using pointers).
When you want to write data to a file or send it over the network, you have to encode it as some kind of self-contained sequence of bytes (for example, a JSON document). Since a pointer wouldn't make sense to any other process, this sequence-of-bytes representation looks quite different from the data structures that are normally used in memory.
Thus, we need some kind of translation between the two representations. The translation from the in-memory representation to a byte sequence is called encoding (also known as serialization or marshalling), and the reverse is called decoding (parsing, deserialization, unmarshalling).
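A minimal Python illustration of that translation, using JSON as the self-contained byte representation (the values are invented):

import json

in_memory = {'user': 'ada', 'scores': [3, 1, 4]}  # pointer-based, in-process structure
encoded = json.dumps(in_memory).encode('utf-8')   # encoding (serialization)
decoded = json.loads(encoded)                     # decoding (deserialization)
assert decoded == in_memory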
Among other reasons: to be compatible across architectures. An integer doesn't have the same number of bytes on one architecture as on another, and sometimes differs from one compiler to another.
Plus, what you're talking about is still serialization: binary serialization. You're putting all the bytes of your object together in order to store them and be able to reconstruct the object later. This is serializing.
More info on Wikipedia.
Serialization is the process of converting an object into a stream so that it can be saved in a physical file (like XML) or in a database. The main purpose of serialization in C# is to persist an object and save it in any specified storage medium: a stream, a physical file, or a database.
In general, serialization is a method to persist an object's state, but I suggest you read this wiki page; it is pretty detailed and correct, in my opinion:
http://en.wikipedia.org/wiki/Serialization
In serialization, the point is not turning an object into bits and bytes; objects ARE bits and bytes already. Serialization is the process of making the object's "state" persistent. Notice the word "state", which means the values of the instance variables of the entire object graph (the target object and all the objects it references, directly or indirectly) WITHOUT methods and other extra runtime stuff stuck to them (plus, of course, a little more info that the JVM needs to restore these objects, such as their class types).
So this is the main reason it is necessary: storing the whole bytes of objects would be expensive and, for all intents and purposes, unnecessary.

Using NSStringFromSelector to send a method over a network

I'm currently building a client-client approach to some simulation in Objective-C with two computers (mac1 and mac2).
I have a class Client, and each computer has an instance of Client on it (client1, client2). I expect that both clients will be synchronized: they will both be equal apart from memory locations.
When a user presses a key on mac1, I want both client1 and client2 to receive a given method from class Client (so that they stay synchronized, i.e. they are the same apart from their memory locations on each mac).
To that end, my current idea is to make two methods:
- (void)sendSelector:(Client *)toClient, ...;
- (void)receiveSelector:(Client *)fromClient, ...;
sendSelector: uses NSStringFromSelector() to transform the method's selector into an NSString and sends it over the network (let's not worry about sending strings over the network for now).
On the other hand, receiveSelector: uses NSSelectorFromString() to transform an NSString back into a selector.
My first question/issue is: to what extent is this approach "standard" on networking with objective-c?
My second question:
And the method's arguments? Is there any way of "packing" a given class instance and sending it over the network? I understand the pointer problem when packing, but every instance in my program has a unique identity, so that should be no problem, since both clients will know how to retrieve the object from its identity.
Thanks for your help
Let me address your second question first:
And the method's arguments? Is there any way of "packing" a given class instance and sending it over the network?
Many Cocoa classes implement/adopt the NSCoding protocol. This means they support some default implementation for serializing to a byte stream, which you could then send over the network. You would be well advised to use the NSCoding approach unless it's fundamentally not suited to your needs for some reason (i.e. use the highest level of abstraction that gets the job done).
Now for the more philosophical side of your first question; I'll rephrase your question as "is it a good approach to use serialized method invocations as a means of communication between two clients over a network?"
First, you should know that Objective-C has a not-often-used-any-more, but reasonably complete, implementation for handling remote invocations between machines with a high level of abstraction. It was called Distributed Objects. Apple appears to be shoving it under the rug to some degree (with good reason -- keep reading), but I was able to find an old cached copy of the Distributed Objects Programming Topics guide. You may find it informative. AFAIK, all the underpinnings of Distributed Objects still ship in the Objective-C runtime/frameworks, so if you wanted to use it, if only to prototype, you probably could.
I can't speculate as to the exact reasons that you can't seem to find this document on developer.apple.com these days, but I think it's fair to say that, in general, you don't want to be using a remote invocation approach like this in production, or over insecure network channels (for instance: over the Internet.) It's a huge potential attack vector. Just think of it: If I can modify, or spoof, your network messages, I can induce your client application to call arbitrary selectors with arbitrary arguments. It's not hard to see how this could go very wrong.
At a high level, let me recommend coming up with some sort of protocol for your application, with some arbitrary wire format (another person mentioned JSON -- It's got a lot of support these days -- but using NSCoding will probably bootstrap you the quickest), and when your client receives such a message, it should read the message as data and make a decision about what action to take, without actually deriving at runtime what is, in effect, code from the message itself.
From a "getting things done" perspective, I like to share a maxim I learned a while ago: "Make it work; Make it work right; Make it work fast. In that order."
For prototyping, maybe you don't care about security. Maybe when you're just trying to "make it work" you use Distributed Objects, or maybe you roll your own remote invocation protocol, as it appears you've been thinking of doing. Just remember: you really need to "make it work right" before releasing it into the wild, or those decisions you made for prototyping expedience could cost you dearly. The best approach here will be to create a class or group of classes that abstracts away the network protocol and wire format from the rest of your code, so you can swap out networking implementations later without having to touch all your code.
One more suggestion: I read in your initial question a desire to 'keep an object (or perhaps an object graph) in sync across multiple clients.' This is a complex topic, but you may wish to employ a "Command Pattern" (see the Gang of Four book, or any number of other treatments in the wild.) Taking such an approach may also inherently bring structure to your networking protocol. In other words, once you've broken down all your model mutation operations into "commands" maybe your protocol is as simple as serializing those commands using NSCoding and shipping them over the wire to the other client and executing them again there.
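To sketch that idea (in Python rather than Objective-C, purely for brevity; all names are invented): each model mutation becomes a small, serializable command, and the receiver dispatches on a fixed, known set of operations instead of executing selectors taken from the wire:

import json

def encode_move(object_id: str, dx: float, dy: float) -> bytes:
    # One "command" per model mutation; the wire format is plain data.
    return json.dumps({'op': 'move', 'id': object_id,
                       'dx': dx, 'dy': dy}).encode('utf-8')

def apply_command(world, data: bytes) -> None:
    msg = json.loads(data)
    if msg['op'] == 'move':  # dispatch on a known, fixed set of operations
        world.move(msg['id'], msg['dx'], msg['dy'])
    else:
        raise ValueError('unknown command')  # never execute wire content as code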
Hopefully this helps, or at least gives you some starting points and things to consider.
These days it would seem that the most standard way is to package everything up as JSON.