OMNeT++ 5G video stream: send a full encoded frame or fragment it into several packets to stay within MTU? - udp

I want to simulate a simple client/server application streaming 2-3 4K 25 fps videos within a 5G network using the OMNeT++ stack. After capturing the incoming video flows with OpenCV and encoding them with the H.264 codec, I have roughly 20 kilobytes per encoded frame, stored as bytes (uint8_t). Capturing for each flow happens in a separate thread, of course.
Now I want to send it to some clients over 5G using UDP. In almost every open-source implementation of video streaming, the transmission step is presented very simply:
uint8_t* buffer; // encoded frame
int size;        // buffer size
send(clientSocket, buffer, size, 0); // send to client
and on the client side a loop with an appropriate recv pulls the data. The same basically happens in every OMNeT++ simulation.
Of course a UDP packet with a payload of around 20 KB will be fragmented on its way to fit the good old 1500-byte IPv4 MTU at the backhaul part of a standard 5G architecture. So here comes my question: would I gain anything if I reduced my maximum UDP payload to, say, 1280 bytes to avoid IP fragmentation and stay within the IPv6 minimum MTU?
I'm afraid that if I just cluelessly send the encoded frame as in the code above, some fragments may be lost and my decoder (H.264 as well) on the client side can fail to decode the frame. However, the same could happen if I send my own 1280-byte packets... so the question is pretty general, even though it happens in an OMNeT++ simulation with real video files: is there any advantage in controlling the packet size before sending, or can you just cluelessly send any UDP datagram under 64 KB and chill?
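To make the loss behaviour explicit, here is a minimal sketch (plain C++, outside any OMNeT++/INET specifics) of what application-level fragmentation could look like. The 8-byte fragment header (frame id, fragment index, fragment count) is a made-up layout for illustration, not a standard packetization scheme such as the RTP H.264 payload format of RFC 6184:

#include <algorithm>
#include <cstdint>
#include <cstring>
#include <vector>

struct FragmentHeader {            // 8 bytes; hypothetical layout, host byte order for brevity
    uint32_t frameId;              // which encoded frame this fragment belongs to
    uint16_t fragmentIndex;        // 0 .. fragmentCount - 1
    uint16_t fragmentCount;        // total number of fragments in this frame
};

constexpr std::size_t kMaxPayload = 1280;                          // target UDP payload size
constexpr std::size_t kMaxData    = kMaxPayload - sizeof(FragmentHeader);

// Split one encoded frame into datagrams of at most kMaxPayload bytes;
// each returned vector is one UDP payload to pass to send()/sendto().
std::vector<std::vector<uint8_t>> fragmentFrame(uint32_t frameId,
                                                const uint8_t* buffer,
                                                std::size_t size) {
    const std::size_t count = (size + kMaxData - 1) / kMaxData;
    std::vector<std::vector<uint8_t>> datagrams;
    datagrams.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t offset = i * kMaxData;
        const std::size_t chunk  = std::min(kMaxData, size - offset);
        const FragmentHeader hdr{frameId,
                                 static_cast<uint16_t>(i),
                                 static_cast<uint16_t>(count)};
        std::vector<uint8_t> dgram(sizeof(hdr) + chunk);
        std::memcpy(dgram.data(), &hdr, sizeof(hdr));
        std::memcpy(dgram.data() + sizeof(hdr), buffer + offset, chunk);
        datagrams.push_back(std::move(dgram));
    }
    return datagrams;
}

The receiver reassembles fragments per frameId and simply drops the frame if any fragment is missing, which is essentially what RTP-style packetization gives you for free.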

Related

RTP Packet maximum size?

I'm trying to figure out what the maximum size of an RTP packet is. I know that the minimum header size is 12 bytes, but I can't find anything about the payload.
Is it possible that the maximum size of the RTP packet is the same as the maximum UDP payload size? I mean, that I have only one RTP packet with a huge payload. Is this possible and, in that case, is there any recommended size for the RTP packet to avoid doing this?
For example, I'm encapsulating MP3 frames in RTP. Do I make an RTP packet with 1 MP3 frame, 2, or how many?
I hope you understand my question :)
Is it possible that the maximum size of the RTP packet is the same as the maximum UDP payload size?
The RTP standard does not set a maximum size so you're free to do this.
(Jumbo packets often have issues of their own with transport, but that's generally to do with the lower layer protocols playing up.)
Is it possible that the maximum size of the RTP packet is the same as the maximum UDP payload size? I mean, that I have only one RTP packet with a huge payload. Is this possible and, in that case, is there any recommended size for the RTP packet to avoid doing this?
Yes, you could create a 1446-byte payload and put it in a 12-byte RTP packet (1458 bytes) on a network with an MTU of 1500 bytes.
By the time you include an 8-byte UDP header + 20-byte IP header + 14-byte Ethernet header, you have 42 bytes of overhead, which takes you to 1500 bytes.
In practice, if you're transporting this over the internet, where the traffic gets encapsulated or carried across varied transport layers, you'd probably want to keep it below 1400 bytes to be on the safe side.
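For reference, here is that budget spelled out (a sketch; strictly speaking the 14-byte Ethernet header is carried outside the 1500-byte IP MTU, so the hard ceiling is a little higher than the conservative figure above):

// Strict per-packet budget on a 1500-byte path with plain IPv4 and no RTP extras.
// The 14-byte Ethernet header is not counted against the IP MTU and is not subtracted here.
constexpr int kMtu           = 1500; // IPv4 path MTU
constexpr int kIpv4Header    = 20;   // no IP options
constexpr int kUdpHeader     = 8;
constexpr int kRtpHeader     = 12;   // fixed header, no CSRCs or extensions

constexpr int kMaxRtpPayload = kMtu - kIpv4Header - kUdpHeader - kRtpHeader; // 1460 bytes
static_assert(kMaxRtpPayload == 1460, "strict ceiling without fragmentation");
// Staying below ~1400 bytes, as suggested above, leaves headroom for IPv6 (40-byte
// header), tunnels, VLAN tags, and IP options encountered along the path.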
For example, I'm encapsulating MP3 frames in RTP. Do I make an RTP packet with 1 MP3 frame, 2, or how many?
RTP has a one-to-one mapping from one unique source to one RTP stream (unless the streams are mixed/muxed together into one stream), so each of the MP3 sources would be put into its own RTP stream, each with its own unique Synchronization Source Identifier (SSRC) to differentiate between the streams.
For more info on RTP there's RFC 3550 itself, there's a great book by Colin Perkins called "RTP: Audio and Video for the Internet", I've written a fair bit about RTP on my blog, and I've also created a Python library for creating RTP packets.

Calculate OPUS packet length

I have a simple application where I send Opus packets from one client to another, say from A to B.
A reads one packet from an Opus file and sends it to B.
Then after 20 ms or 30 ms it reads one more packet and sends it to B, and so on.
Until now I was using RTP over UDP, so on the receiving side at B, when I receive a packet, I receive the complete packet. After receiving the complete packet I write it to a new file.
This works fine.
Now I am planning to support RTP over TCP.
A will read a complete packet from the Opus file and send it to B.
When the packet is received at B, it may arrive as a single chunk or as multiple chunks (TCP behaviour). My requirement is that I should buffer the data until I receive the complete packet. Once I receive the complete packet, I will write it to a file.
Now my question is: how do I determine the length of the Opus packet at B while receiving, so that I can buffer it?
I do not want to use libopus etc. if I can somehow avoid it. Can I find out the length of the packet from the received data by any other means?
TCP is a stream protocol, so you have two primary choices: add a length word (16 bits is enough) before each Opus packet (read the length, then read the packet, dealing with buffering, i.e. waiting until enough bytes have arrived for each read), or pad every Opus packet to a fixed size. Opus doesn't use fixed-size packets; they depend on the content and on the bitrate and quality settings.
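A minimal sketch of the first option, assuming a 16-bit big-endian length prefix in front of each Opus packet (ad hoc framing chosen for illustration; RFC 4571 describes the same idea for RTP over connection-oriented transports):

#include <cstddef>
#include <cstdint>
#include <vector>

// Sender side: prepend the Opus packet length in network byte order (big-endian),
// then write the whole buffer to the TCP socket.
std::vector<uint8_t> frameOpusPacket(const uint8_t* pkt, uint16_t len) {
    std::vector<uint8_t> out;
    out.reserve(2u + len);
    out.push_back(static_cast<uint8_t>(len >> 8));    // high byte first
    out.push_back(static_cast<uint8_t>(len & 0xFF));
    out.insert(out.end(), pkt, pkt + len);
    return out;
}

// Receiver side: feed every chunk read from the socket into onBytes() and pull
// complete Opus packets out with nextPacket() once enough bytes have accumulated.
class OpusDeframer {
public:
    void onBytes(const uint8_t* data, std::size_t n) {
        buf_.insert(buf_.end(), data, data + n);
    }
    bool nextPacket(std::vector<uint8_t>& pkt) {
        if (buf_.size() < 2) return false;                        // length prefix incomplete
        const std::size_t len = (static_cast<std::size_t>(buf_[0]) << 8) | buf_[1];
        if (buf_.size() < 2 + len) return false;                  // packet body incomplete
        pkt.assign(buf_.begin() + 2, buf_.begin() + 2 + static_cast<std::ptrdiff_t>(len));
        buf_.erase(buf_.begin(), buf_.begin() + 2 + static_cast<std::ptrdiff_t>(len));
        return true;
    }
private:
    std::vector<uint8_t> buf_;  // bytes received so far, possibly spanning packet boundaries
};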

WebRTC lowest possible latency

I have a simple UDP streaming protocol that takes raw H.264 video frames and sends them instantly from the server side to the client side.
Using this protocol I can get latency close to the network RTT (no packet resending, and I don't care about packet loss), so if I have 20 ms latency from server to client, I can get a video frame from encoder output to the client side (ready to be decoded) in, let's say, 30 ms.
My question is:
Is WebRTC (over UDP) capable of going down to these kinds of latencies?
Not taking into account encoding and decoding times, what is the lowest latency I can possibly get with WebRTC at the protocol layer?
I don't know whether these kinds of latencies will require me to develop my own protocol further, or whether I should go with something more generic like WebRTC for my video server development so that it is instantly supported by every web browser.
WebRTC can have the same low latency as regular SIP/RTP stacks.
WebRTC stack vendors do their best to reduce delay.
On the recording and sending side there is no delay: the stack sends the packets immediately once they have been received from the recording device and compressed with the selected codec. Some codecs (and some codec settings) might introduce a delay here to enable features such as FEC.
Regarding the receiver side:
In optimal circumstances the stack should not delay the playback of the packets, so they can be displayed as soon as they arrive.
However, in sub-optimal circumstances (with network delays or packet loss) the stack will introduce a jitter buffer. The lower the network quality, the longer the jitter buffer.
So, to achieve the lowest delay, you might have to do the following:
choose a codec with the smallest processing time
remove FEC and disable any other settings which might cause additional delays
remove the jitter buffer (most WebRTC stacks don't have a setting for this, so you might have to modify the code yourself, but it is an easy modification because you just need to deactivate a part of the code)
WebRTC uses RTP as the underlying media transport, which adds only a small header at the beginning of the payload compared to plain UDP. This means it should be on par with what you achieve with plain UDP. RTP is heavily used in latency-critical environments such as real-time audio and video (it's the media transport in SIP, H.323, and XMPP), so you can expect the latency to be sufficient for this purpose.

The most reliable and efficient UDP packet size?

Would sending lots of small packets by UDP take more resources (CPU, compression by zlib, etc.)? I read here that sending one big packet of ~65 kB by UDP would probably fail, so I thought that sending lots of smaller packets would succeed more often, but then comes the computational overhead of using more processing power (or at least that's what I'm assuming). The question is basically this: what is the best scenario for getting the maximum number of packets through successfully while keeping computation to a minimum? Is there a specific size that works most of the time? I'm using Erlang for the server and ENet for the client (written in C++). I'm also using zlib compression, and I send the same packets to every client (broadcasting is the term, I guess).
The maximum size of UDP payload that, most of the time, will not cause IP fragmentation is:
MTU of the host handling the PDU (in most cases 1500 bytes)
- size of the IP header (20 bytes)
- size of the UDP header (8 bytes)
1500 MTU - 20 IP header - 8 UDP header = 1472 bytes
@EJP talked about 534 bytes, but I would correct that to 508. This is the number of bytes that FOR SURE will not cause fragmentation, because the minimum MTU size that a host can set is 576 and the IP header can be at most 60 bytes (508 = 576 MTU - 60 IP - 8 UDP).
By the way, I'd try to go with 1472 bytes, because 1500 is a standard-enough value.
Use 1492 instead of 1500 for calculation if you're passing through a PPPoE connection.
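Putting those three figures together (a small sketch of the arithmetic, nothing more):

// UDP payload = MTU - IP header - UDP header, for the three cases mentioned above.
constexpr int udpPayload(int mtu, int ipHeader) { return mtu - ipHeader - 8; }

constexpr int kEthernetPath = udpPayload(1500, 20); // 1472 bytes: typical Ethernet MTU
constexpr int kPppoePath    = udpPayload(1492, 20); // 1464 bytes: behind a PPPoE link
constexpr int kGuaranteed   = udpPayload(576,  60); // 508 bytes: minimum MTU, maximum IPv4 header
static_assert(kEthernetPath == 1472 && kPppoePath == 1464 && kGuaranteed == 508,
              "payload budgets");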
Would sending lots of small packets by UDP take more resources?
Yes, it would, definitely! I just did an experiment with a streaming app. The app sends 2000 frames of data each second, precisely timed. The data payload for each frame is 24 bytes. I used UDP with sendto() to send this data to a listener app on another node.
What I found was interesting. This level of activity took my sending CPU to its knees! I went from having about 64% free CPU time, to having about 5%! That was disastrous for my application, so I had to fix that. I decided to experiment with variations.
First, I simply commented out the sendto() call, to see what the packet assembly overhead looked like. About a 1% hit on CPU time. Not bad. OK... must be the sendto() call!
Then, I did a quick fakeout test... I called the sendto() API only once in every 10 iterations, but I padded the data record to 10 times its previous length, to simulate the effect of assembling a collection of smaller records into a larger one, sent less often. The results were quite satisfactory: 7% CPU hit, as compared to 59% previously. It would seem that, at least on my *NIX-like system, the operation of sending a packet is costly just in the overhead of making the call.
Just in case anyone doubts whether the test was working properly, I verified all the results with Wireshark observation of the actual UDP transmissions to confirm all was working as it should.
Conclusion: it uses MUCH less CPU time to send larger packets less often than to send the same amount of data in the form of smaller packets sent more frequently. Admittedly, I do not know what happens if UDP starts fragmenting your overly large UDP datagram... I mean, I don't know how much CPU overhead that adds. I will try to find out (I'd like to know myself) and update this answer.
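A rough sketch of the batching idea described above (the record size, batch factor, and class name are illustrative, not taken from the original experiment; the point is one sendto() per N records instead of one per record):

// Coalesce many small records into one UDP datagram to amortize the per-sendto() syscall cost.
#include <sys/socket.h>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kRecordSize = 24;   // one small data record (as in the experiment above)
constexpr std::size_t kBatch      = 10;   // send once per 10 records

class UdpBatcher {
public:
    UdpBatcher(int sock, const sockaddr* dst, socklen_t dstLen)
        : sock_(sock), dst_(dst), dstLen_(dstLen) { buf_.reserve(kBatch * kRecordSize); }

    void push(const uint8_t record[kRecordSize]) {
        buf_.insert(buf_.end(), record, record + kRecordSize);
        if (buf_.size() >= kBatch * kRecordSize) flush();
    }

    void flush() {                         // one syscall for the whole batch
        if (buf_.empty()) return;
        sendto(sock_, buf_.data(), buf_.size(), 0, dst_, dstLen_);
        buf_.clear();
    }

private:
    int sock_;
    const sockaddr* dst_;
    socklen_t dstLen_;
    std::vector<uint8_t> buf_;
};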
534 bytes. That is required to be transmitted without fragmentation. It can still be lost altogether of course. The overheads due to retransmission of lost packets and the network overheads themselves are several orders of magnitude more significant than any CPU cost.
You're probably using the wrong protocol. UDP is almost always a poor choice for data you care about transmitting. You wind up layering sequencing, retry, and integrity logic atop it, and then you have TCP.

WebRTC Overhead

I want to know how much overhead WebRTC produces when sending data over data channels.
I know that WebSockets have 2-14 bytes of overhead per frame. Does WebRTC use more overhead? I cannot find any useful information on the web. It's clear to me that data channels cannot be used for now. How much overhead is used with MediaStreams?
Thanks
At the application layer, you can think of the DataChannel as sending and receiving over SCTP. In the PPID (Payload Protocol Identifier) field of the SCTP DATA chunk, the DataChannel sets the value 51 to indicate that it's sending UTF-8 (string) data and 53 for binary data.
Yes, you are right. RTCDataChannel uses SCTP over DTLS over UDP. DTLS is used for security. However, SCTP has problems traversing most NAT/firewall setups, so to overcome that, SCTP is tunneled through UDP. The overall overhead to send data is therefore the overhead of:
SCTP + DTLS + UDP + IP
and that is:
28 bytes + 20-40 bytes + 8 bytes + 20-40 bytes
So the overhead is roughly 120 bytes. The maximum size of an SCTP packet that a WebRTC client can send is 1280 bytes, so at most you can send roughly 1160 bytes of data per SCTP packet.
WebRTC uses RTP to send its media. RTP runs over UDP.
Besides the usual IP and UDP headers, there are two additional headers:
The RTP header itself starts at 12 bytes and can grow from there, depending on what gets used.
The payload header - the header that is used for each data packet of the specific codec being used. This one depends on the codec itself.
RTP is designed to have as little overhead as possible over its payload due to the basic reasoning that you want to achieve better media quality, which means dedicating as many bits as possible to the media itself.
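For reference, the 12-byte fixed part of the RTP header defined in RFC 3550 looks roughly like this (a sketch using bit fields, whose exact in-memory layout is compiler-dependent; real code usually serializes the fields manually to control byte order):

#include <cstdint>

struct RtpFixedHeader {
    uint8_t  csrcCount : 4;   // CC: number of CSRC identifiers that follow
    uint8_t  extension : 1;   // X: header extension present
    uint8_t  padding   : 1;   // P: payload is padded
    uint8_t  version   : 2;   // V: always 2
    uint8_t  payloadType : 7; // PT: codec-specific payload type
    uint8_t  marker      : 1; // M: e.g. last packet of a video frame
    uint16_t sequenceNumber;  // increments per packet, used to detect loss/reordering
    uint32_t timestamp;       // media clock (e.g. 90 kHz for video)
    uint32_t ssrc;            // synchronization source identifier
};
static_assert(sizeof(RtpFixedHeader) == 12, "fixed RTP header is 12 bytes");

Everything beyond this (CSRC entries, header extensions, and the codec-specific payload header mentioned above) comes on top of these 12 bytes.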
Here's a screenshot of 2 peer.js instances (babylon.js front end) sending exactly 3 bytes every 16 ms (~60 packets per second).
The profiler shows 30,000 bits/second:
30,000 bits / 8 bits per byte / 60 packets per second = 62.5 bytes per packet, so after subtracting the 3 bytes I'm sending, the overhead is ~59.5 bytes per packet according to the profiler.
I'm not sure whether something is not being counted on the incoming side, because it only profiles half of that, 15,000 bits/second.