Protocol buffer serialize into packets - serialization

I need to send serialized data by satellite, which involves sending data as packets that have a maximum size of 2kB each.
Is there a built-in/easy way to serialize data with protobuf into packets limited to X size? And then rebuild them on the other end?

Protobuf always serializes a message into one byte array which is as long as is needed to represent the data.
Your best bet is to split the bytes into chunks an a lower layer, then re-assemble at the other end.
In all likelihood, these packets are not reliably delivered, and so you will also need a mechanism for acknowledging packets, re-transmitting dropped packets, congestion control, etc. These are all things that TCP normally does for you. If I were you, I would look for an existing TCP implementation -- or something like it -- that can sit on top of your satellite link.

Related

What is the relation between Serialization and streaming?

Always when I find some articles or videos are talking about stream they're necessairly talking about serialization?
what is the relation between those? or to be specific,
Could we say that the data stream always needs serialization or could we find some data stream without serialization?
Firstly, it useful to have a reminder of serial vs parallel communication: if we take a simple example of transmitting a byte, in the parallel case all 8 bits are sent at the same time and in the serial case the 8 bits are sent one by one and the byte built again on the receiving side.
For your video domain example, If you imagine a frame of a video as being a large collection of bytes, lets say 720 by 1280 pixels and each pixel is represented by a byte, then we need 921,600 bytes to represent the frame.
If you are streaming the video you need to send each frame (plus overhead which we'll ignore here for simplicity) from the server to the client device, hence you need to send the 921,600 bytes for each frame.
If you had a very (very!) large parallel connections that could transmit 921,600 bytes in parallel between the server and the client in a single communication then this would be easy to understand.
However, this is almost always not the case, even for much smaller data structures, so serialisation is the name generally given to the process of taking the 921,600 bytes and breaking them down into the size which you can transmit - and that size is often one bit at a time.
Generally a video will be broken down into packets and the packets transmitted to the client. The packets themselves are just collections of bytes also and if the connection allows only a single bit of information to be transmitted at a time, then the packet needs to be broken down and sent 'serially' one bit at a time.
To complicate things, as is commonly the case in computer science and communications, the terms can mean different things in different contexts.
For example you may see it mentioned that you can either stream or 'serialise an object' in some client server communication. What this generally means is that you can either send the raw data 'stream' and let the client be responsible for how to interpret it, or you can use a framework or underlying mechanism which will take an object, convert it into a format that can be transmitted serially, and then reconstruct it on the other end and give it to the client. In fact the actually communication is serial in both cases (if it is using a serial communication channel) so the terms are being used in a different way here.

Missing line in "Follow UDP Stream" in wireshark

I am streaming raw UDP packets (rf data) from GNU Radio to Octave (or any other program). The data consists of 390625 4 byte floats per second. This is 1562500 bytes per second. When GNU Radio streams UDP, there is no header or sequence number in the UDP data, it's just raw floats. Because this is localhost to localhost, I am able to use a very large MTU.
Attached is a screenshot of Wireshark after right clicking and doing "Follow UDP Stream". There is a "blank" part of the Hex Dump at 0x6F38C8. I don't understand what this means? (I know that UDP does not provide reliable delivery and packets can be dropped and arrive out of order at any moment). Any help would be great!
The blank part is simply used as a barrier to differentiate between 2 UDP packets, solely for your convenience.
If you track down that exact data in the normal wireshark window you'll notice that the data before the blank part belongs to a certain UDP packet and that the data after the blank part belongs to the subsequent UDP packet.
The HEX numbers on the left specify the offset of the first byte of the line from the beginning of the stream in that direction (i.e. in the half blank line, the byte 0x08 has 0x6F38C8 bytes sent before it in that direction). Since the half blank line has just 8 bytes, the offset of the next line is 0x6F38C8 + 0x8 = 0x6F38D0. This is another indicator that the blank part isn't used as a filler for certain data that couldn't be represented due to some obscure reason.
Note that it would be impossible for wireshark to know whether data is lost and represent it to you in any manner, since UDP is unreliable (that is, unless the underlying protocol maintains certain counters by itself, and is parsed by wireshark).

UDP Client and Server Buffer Agreement

Hi I am writing a program that will send a file from client to server using UDP socket using different packet sizes for example 512B, 1KB and 2KB and i don't want use fixed buffer size in the receiver(server).I need some codes in Java that will allow both server and client to agree upon a packet size before transfer start. Many thanks
Don't you forget that UDP packets may be fragmented, duplicated and lost? There is a whole bunch of things to take care of, starting with lost packet retransmissions.
I hate to give a "don't do this" kind of answers, but for this one, just use TCP. And if you want some user-level "packets", you can have them with TCP also (prefix each one with its length, that's enough).

UDP Packet size and fragements

Let's say I am trying to send data using udp socket. If the data is big then I think the data is going to be divided into several packets and sent to the destination.
At the destination, if there is more than one incoming packets then how to I combined those separated packets into the original packet? Do I need to have a data structure that save all the incoming udp based on the sender ? Thanks in advance..
If you are simply sending the data in one datagram, using a single send() call, then the fragmentation and reassembly will be done for you, by the transport layer. All you need to do is supply a large enough buffer to recv(), and if all the fragments have arrived, then they will be reassembled and presented to you as a single datagram.
Basically, this is the service that UDP provides you (where a "datagram" is a single block of data sent by a single send() call):
The datagram may not arrive at all;
The datagram may arrive out-of-order with respect to other datagrams;
The datagram may arrive more than once;
If the datagram does arrive, it will be complete and correct1.
However, if you are performing the division of the data into several UDP datagrams yourself, at the application layer, then you will of course be responsible for reassembling it too.
1. Correct with the probability implied by the UDP checksum, anyway.
You should use TCP for this. TCP is for structured data that needs to arrive in a certain order without being dropped.
On the other hand, UDP is used when the packet becomes irrelevant after ~500 ms. This is used in games, telephony, and so on.
If your problem requires UDP, then you need to handle any lost, duplicate, or out-of-order packets yourself, or at least write code that is resilient to that possibility.
http://en.wikipedia.org/wiki/User_Datagram_Protocol
If you can't afford lost packets, then TCP is probably a better option than UDP, since it provides that guarantee out of the box.

Guarantee TCP Packet Size

We use an embedded device to send packets from a serial port over a serial-to-Ethernet converter to a server. One manufacturer we use, Moxa, will always send the packets in the same manner which they are constructed. Meaning, if we construct a packet size of 255, it will always send the packet in a 255 length. The other manufacturer, Tibbo, if we send the packet size 255, it will break the packet up if it is greater than 128. This is the answer I received from the Tibbo engineers at the time:
"From what I understand and what the
engineers said, even if the other
devices provide you with the right
packet size now does not guarantee
that when implemented in other
networks the same will happen. This
is the reason why we feel that packet
size based data transfer through TCP
is not reliable as it was not the way
TCP was designed to be used."
I understand that this may not be how TCP was designed to be used, but if I create a packet of 255 bytes and TCP allows it, then how is this outside of how TCP works? I understand that at some point the packet may get broken up but if the server is expecting a certain packet size and Moxa's offering does not have the same problem as the Tibbo device.
So, is it possible to guarantee a reasonable TCP packet size?
No. TCP is not a packet protocol, it is a stream protocol. It guarantees that the bytes you send will all arrive, and in the right order, but nothing else. In particular, TCP does not give you any kind of message or packet boundaries. If you want such things, they need to be implemented at a higher level by your protocol.