I am receiving RTP packets (video data) via UDP.
The packets carry H264 that I need to decode. Unfortunately, most of them hold fragmented data, and because some sequence numbers never arrive (packets are lost), I cannot reconstruct the H264 stream properly.
Any idea on how to reduce data loss so I can decode at least a couple of frames?
There is not much one can say: lost data is lost, as the adjective suggests. You can't get it back. In almost any case you can still feed the remaining NALs into the decoder and render the video. You will see artifacts introduced by the missing NALs, but that's life.
In order to reduce data loss you will need to change your transmission protocol. Interleaved RTP in RTSP could be a good choice, as it is based on a similar technology stack.
Changing to TCP will obviously only help if you have enough bandwidth to transmit the video.
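To make "feed the remaining NALs into the decoder" concrete, here is a minimal sketch (TypeScript, all names illustrative). It assumes you already parse the RTP header per RFC 3550 and that the payload is H264 with FU-A fragmentation per RFC 6184; STAP-A aggregation and packet reordering are ignored for brevity. Fragment runs containing a sequence-number gap are dropped so only complete NAL units reach the decoder:

```typescript
class FuaFilter {
  private lastSeq: number | null = null;
  private fragments: Uint8Array[] = [];  // pieces of the NAL being rebuilt

  // seq: 16-bit RTP sequence number; payload: RTP payload bytes.
  // Returns a complete NAL unit (without Annex B start code) or null.
  push(seq: number, payload: Uint8Array): Uint8Array | null {
    const gap = this.lastSeq !== null && ((this.lastSeq + 1) & 0xffff) !== seq;
    this.lastSeq = seq;
    if (gap) this.fragments = [];        // abort any partially rebuilt NAL

    const nalType = payload[0] & 0x1f;
    if (nalType !== 28) return payload;  // single NAL unit packet: complete as-is

    const fuHeader = payload[1];
    const isStart = (fuHeader & 0x80) !== 0;
    const isEnd = (fuHeader & 0x40) !== 0;

    if (isStart) {
      // Rebuild the original NAL header from FU indicator + FU header bits.
      const nalHeader = (payload[0] & 0xe0) | (fuHeader & 0x1f);
      this.fragments = [Uint8Array.of(nalHeader), payload.subarray(2)];
    } else if (this.fragments.length > 0) {
      this.fragments.push(payload.subarray(2));
    } else {
      return null;                       // middle/end fragment whose start was lost
    }

    if (isEnd) {
      const total = this.fragments.reduce((n, f) => n + f.length, 0);
      const nal = new Uint8Array(total);
      let offset = 0;
      for (const f of this.fragments) { nal.set(f, offset); offset += f.length; }
      this.fragments = [];
      return nal;
    }
    return null;
  }
}
```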
If you have control over the H264 encoder, enable its error resilience tools (see http://www.slideshare.net/coldfire7/error-resiliency-and-concealment-in-h264-presentation), which make your video more robust against transmission errors. That way your RTP over UDP becomes more resistant to packet loss.
We would like to be able to play music in another tab (say YouTube, Spotify, Soundcloud, etc) and then stream that over a WebRTC connection to other peers.
We are doing this through the screenshare and it's mostly working, but the music will sometimes cut in and out for the listeners, giving it a choppy sound. In other words, it sounds smooth to the person sending it (i.e., sharing it from the originating URL), but it sounds choppy to the people on the receiving side of the WebRTC connection.
Any thoughts on what might be causing this? Is this a buffering issue? If so, is it more likely buffering on the sending or the receiving side?
Thanks so much for any help!
WebRTC favors low latency over quality, with the goal of ensuring you can have normal speech communication. To do this, a lot of things happen to your audio:
Playback rate is constantly changed. If playback gets behind, the rate speeds up. If it's too far ahead, it slows down.
There is a very small buffer, creating more opportunities for the playback buffer to run dry.
If packets are lost, the audio for their time is simply discarded... skipped over. Playback isn't likely to buffer a bit and then continue.
When audio is lost, a bit of a trail-off is synthesized. This is fine for speech, but sounds bad for music.
On the media capture end, there are also audio "enhancements" designed for dealing with bad webcam microphones, which can sometimes be applied to other media streams if configured incorrectly (see the sketch after this list). These include:
Echo cancellation
Noise reduction
Automatic gain control
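If you control capture via getUserMedia, here is a minimal sketch for turning those off. These are standard MediaTrackConstraints, though browser support for each varies:

```typescript
// Explicitly disable the speech-oriented processing when capturing
// audio that should stay music-quality.
async function captureRawAudio(): Promise<MediaStream> {
  return navigator.mediaDevices.getUserMedia({
    audio: {
      echoCancellation: false,
      noiseSuppression: false,
      autoGainControl: false,
    },
  });
}
```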
Finally, it's usually the case that audio bitrates are quite low by default. You'll usually have to munge the SDP if you want high-quality stereo audio.
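A sketch of the usual munge for Opus, applied to the SDP string before handing it to setLocalDescription/setRemoteDescription. The fmtp parameters stereo, sprop-stereo, and maxaveragebitrate come from RFC 7587; treat this string surgery as illustrative, not robust SDP parsing:

```typescript
// Find the payload type the SDP maps to Opus, then extend its fmtp line to
// request stereo and a higher average bitrate (510 kb/s is Opus's maximum).
function enableStereoOpus(sdp: string): string {
  const m = sdp.match(/a=rtpmap:(\d+) opus\/48000/i);
  if (!m) return sdp;
  const pt = m[1];
  return sdp.replace(
    new RegExp(`a=fmtp:${pt} `),
    `a=fmtp:${pt} stereo=1;sprop-stereo=1;maxaveragebitrate=510000;`,
  );
}
```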
All this to say, WebRTC might not be the right choice for you if you are concerned with quality. I often resort to the MediaRecorder API.
The closest thing I came across is this question on SO, but that is just for basic understanding.
My question is: when Media Source Extensions (MSE) are used and the media is fetched from a remote endpoint, for example through AJAX, the Fetch API, or even a WebSocket, the media is sent over TCP.
That will handle packet loss and sequencing, so a protocol like RTP with RTCP is not used. Is that correct?
But this will result in delay so it cannot be truly used for real-time communication. Yes?
There is no security/encryption requirement for MSE like in WebRTC (DTLS/SRTP). Yes?
One cannot, for example, mix a remote audio source from MSE with an audio MediaStreamTrack from an RTCPeerConnection, as they do not share any common parameter like a CNAME (RTCP) and are not part of the same MediaStream. In other words, the worlds of MSE and WebRTC cannot mix unless synchronization is not important. Correct?
That will handle packet loss and sequencing, so a protocol like RTP with RTCP is not used. Is that correct?
AJAX and Fetch are just JavaScript APIs for making HTTP requests. WebSocket is just an API and a protocol that is negotiated via an initial HTTP request. HTTP uses TCP, and TCP takes care of ensuring packets arrive, and arrive in order. So, yes, you won't need to worry about packet loss and such, but not because of MSE.
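For context, a minimal MSE sketch (the URL and codec string are placeholders, and the file would need to be a fragmented MP4 for MSE to accept it). The bytes arrive via fetch over HTTP/TCP, already lossless and in order, before MSE ever sees them:

```typescript
const video = document.querySelector('video') as HTMLVideoElement;
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', async () => {
  // Codec string must match the actual media; this one is just an example.
  const sb = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E, mp4a.40.2"');
  const data = await fetch('/media/fragmented.mp4').then(r => r.arrayBuffer());
  sb.addEventListener('updateend', () => mediaSource.endOfStream());
  sb.appendBuffer(data);
});
```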
But this will result in delay so it cannot be truly used for real-time communication. Yes?
That depends entirely on your goals. It's a myth that TCP isn't fast, or that TCP increases general latency for every packet. What is true is that the initial 3-way handshake takes a few round trips. It's also true that if a packet does actually get dropped, the application sees latency as suddenly sharply increased until the packet is requested again and sent again.
If your goals are something like a telephony application where the loss of a packet or two is meaningless overall, then UDP is more appropriate. (In voice communications, we talk slowly enough that if a few milliseconds of sound go missing, we can still decipher what was being said. Our spoken language is robust enough that if entire words get garbled or are silent, we can figure out the gist from context.) It's also important that immediate continuity be kept for voice communications. The tradeoff is that staying realtime matters more than accuracy for any particular instant/packet.
However, if you're doing something like a one-way stream, you might choose a protocol that runs over TCP. In this case, it may be important to be as realtime as possible, but more important that the audio/video doesn't glitch out. Consider the Super Bowl, or some other large sporting event. It's a live event, and it's important that it stays realtime. However, if the viewer's time reference is only 3-5 seconds delayed from live, it's still "live" enough. The viewer would be far angrier if the video glitched out and they missed something happening in the game than if they were just a few seconds behind. Since it's one-way streaming and there is no communication feedback loop, trading extreme low latency for reliability and quality makes sense.
There is no security/encryption requirement for MSE like in WebRTC (DTLS/SRTP). Yes?
MSE doesn't know or care how you get your data.
One cannot, for example, mix a remote audio source from MSE with an audio MediaStreamTrack from an RTCPeerConnection, as they do not share any common parameter like a CNAME (RTCP) and are not part of the same MediaStream. In other words, the worlds of MSE and WebRTC cannot mix unless synchronization is not important. Correct?
Mix, where? Synchronization, where? No matter what you do, if you have streams coming from different places... or even different devices without sync/gen lock, they're out of sync. However, if you can define a point of reference where you consider things "synchronized", then it's all good. You could, for example, have independent streams going into a server and the server uses its current timestamps to set everything up and distribute together via WebRTC.
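As a purely hypothetical sketch of that "server as the point of reference" idea (all names invented):

```typescript
interface StampedChunk {
  sourceId: string;     // which incoming stream this came from
  serverTimeMs: number; // arrival time on the server's own clock
  data: Uint8Array;
}

// Stamp every chunk on arrival with the server clock; chunks from different
// sources are then comparable on a single timeline, regardless of each
// source's own capture clock.
function stampOnArrival(sourceId: string, data: Uint8Array): StampedChunk {
  return { sourceId, serverTimeMs: Date.now(), data };
}
```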
How you do this, or what you do, depends on the specifics of your application.
Is WebRTC a good fit for broadcasting audio streams in relaying scenarios? What I mean by relaying is avoiding having the broadcasting peer send to every listening peer itself, and instead having listening peers also forward the audio data to a number of other listeners, limited by parameters such as network quality, etc. (I am sorry if I use the term "relay" wrongly.)
The number of listeners can range anywhere from tens (10^1) up to millions (10^6). Is it known how broadcast quality and connectivity behave at each order of magnitude?
I would appreciate side notes on the same questions for video streams as well.
I am working with an Arducam OV2640 to capture images and transmit them from one microcontroller to another. I am getting inconsistent images. Occasionally they turn out OK, but a large portion of the time they produce a 'bogus Huffman table' error.
I am familiar with Huffman tables, which has me guessing that I am losing some bytes in transmission, either from the camera to the microcontroller or over the wireless link between the microcontrollers I am using.
The only thing that has me confused is that I tested several thousand packets over the communication link between the two micros and measured a BER of zero (16-byte packets with a 16-bit CRC, with packet rejection and retransmission if errors occur).
The image is also fine when I transmit it from the camera micro to my computer through UART.
Is the camera occasionally having issues? I have seen it mentioned as a problem but have no idea how this might be resolved.
I know that VoIP uses UDP for the transport layer, which doesn't ensure ordered delivery. Whenever I use a VoIP phone, I sometimes experience lost or garbled sentences. However, I never hear an older sentence arriving after a newer one. How does VoIP manage to do this?
Thanks in advance,
Pavan.
This depends on the implementation. RTP combined with SIP is a common protocol set, and RTP packets carry timestamps. The RTP receiver usually has something called a jitter buffer, which delays playback a little (in the ~100 ms range) and manages a list of already-received packets (with a capacity of a few packets). Packets that arrived slightly out of order can be inserted into the middle of this list, so playback order can be restored.
Regardless, hearing audio in reversed order would be very unlikely. Each packet holds only about 20 ms of audio, so even if some naive implementation ignored timestamps and/or couldn't restore order, you wouldn't hear reordered sentences but rather serious audio distortion. The compression used may also matter, as the codec may not be able to recover quickly if it receives packets in the wrong order.
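As a toy sketch of that jitter-buffer idea (all names illustrative; real implementations adapt the delay to measured jitter and handle 16-bit sequence wraparound, which this ignores):

```typescript
interface AudioPacket {
  seq: number;          // RTP sequence number (wraparound ignored here)
  payload: Uint8Array;  // ~20 ms of encoded audio
}

class JitterBuffer {
  private pending: AudioPacket[] = [];
  private primed = false;

  // Called whenever a packet arrives from the network, in any order.
  insert(pkt: AudioPacket): void {
    // Keep the short list sorted by sequence number; a late packet that
    // arrives within the buffering window slots into its proper place.
    const i = this.pending.findIndex(p => p.seq > pkt.seq);
    if (i === -1) this.pending.push(pkt);
    else this.pending.splice(i, 0, pkt);
    // Only start playback once ~100 ms (5 x 20 ms packets) is buffered.
    if (this.pending.length >= 5) this.primed = true;
  }

  // Called by the playback clock every 20 ms. A missing packet's slot is
  // concealed (silence/synthesis), never played late, so audio can degrade
  // but sentences never come out reordered.
  pop(): AudioPacket | null {
    if (!this.primed) return null;
    return this.pending.shift() ?? null;
  }
}
```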