Is WebRTC a good fit for broadcasting audio streams in relaying scenarios? By relaying I mean that the broadcasting peer does not seed every listening peer itself; instead, listener peers also seed the audio data to a number of other listeners, with that number limited by parameters such as network quality. (I am sorry if I am using the term relay wrongly.)
The number of listeners can range from 10 to 10^6 (one to six orders of magnitude). Is it known what broadcast quality and connectivity would look like at each order of magnitude?
I would appreciate side notes on the same questions for video streams as well.
Related
The closest I have come across is this question on SO, but that is just for basic understanding.
My question is: when Media Source Extensions (MSE) is used and the media is fetched from a remote endpoint, for example through AJAX, the Fetch API or even a WebSocket, the media is sent over TCP.
That will handle packet loss and sequencing so a protocol like RTP with RTCP is not used. Is that correct?
But this will result in delay so it cannot be truly used for real-time communication. Yes?
There is no security/encryption requirement for MSE like in WebRTC (DTLS/SRTP). Yes?
One cannot, for example, mix a remote audio source from MSE with an audio MediaStreamTrack from an RTCPeerConnection, as they neither share a common parameter like CNAME (RTCP) nor belong to the same MediaStream. In other words, the worlds of MSE and WebRTC cannot mix unless synchronization is not important. Correct?
That will handle packet loss and sequencing so a protocol like RTP with RTCP is not used. Is that correct?
AJAX and Fetch are just JavaScript APIs for making HTTP requests. WebSocket is an API and a protocol that is upgraded from an initial HTTP request. HTTP uses TCP. TCP takes care of ensuring that packets arrive and that they arrive in order. So, yes, you won't need to worry about packet loss and such, but not because of MSE.
But this will result in delay so it cannot be truly used for real-time communication. Yes?
That depends entirely on your goals. It's a myth that TCP isn't fast, or that TCP increases general latency for every packet. What is true is that the initial 3-way handshake costs a round trip before any data can flow. It's also true that if a packet does actually get dropped, everything behind it is held back until the missing packet is retransmitted, so the application sees latency suddenly increase sharply.
If your goals are something like a telephony application where the loss of a packet or two is meaningless overall, then UDP is more appropriate. (In voice communications, we talk slowly enough that if a few milliseconds of sound go missing, we can still decipher what was being said. Our spoken language is robust enough that if entire words get garbled or are silent, we can figure out the gist of what was being said from context.) It's also important that immediate continuity be kept for voice communications. The tradeoff is that staying realtime matters more than accuracy at any particular instant/packet.
However, if you're doing something like a one-way stream, you might choose a protocol over TCP. In this case, it may be important to be as realtime as possible, but more important that the audio/video don't glitch out. Consider the Super Bowl, or some other large sporting event. It's a live event and important that it stays realtime. However, if the time reference for the viewer is only 3-5 seconds delayed from live, it's still "live" enough for the viewer. The viewer would be far more angry if the video glitched out and they missed something happening in the game than if they were just a few seconds behind. Since it's one-way streaming and there is no communication feedback loop, trading extreme low latency for reliability and quality makes sense.
There is no security/encryption requirement for MSE like in WebRTC (DTLS/SRTP). Yes?
MSE doesn't know or care how you get your data.
One cannot, for example, mix a remote audio source from MSE with an audio MediaStreamTrack from an RTCPeerConnection, as they neither share a common parameter like CNAME (RTCP) nor belong to the same MediaStream. In other words, the worlds of MSE and WebRTC cannot mix unless synchronization is not important. Correct?
Mix, where? Synchronization, where? No matter what you do, if you have streams coming from different places... or even different devices without sync/gen lock, they're out of sync. However, if you can define a point of reference where you consider things "synchronized", then it's all good. You could, for example, have independent streams going into a server and the server uses its current timestamps to set everything up and distribute together via WebRTC.
How you do this, or what you do, depends on the specifics of your application.
I was asked in an interview recently to design a file upload feature. After the initial discussion, the interviewer asked if I could design it for multiple threads. My thought was that, as the network bandwidth is limited and the internet is reached through a serial data connection, the network bottleneck will kick in long before the CPU bottleneck, so a multi-threaded implementation would bring only a limited performance improvement. But the interviewer was hell-bent on the multi-thread approach. What are the arguments in favor of a multi-thread upload approach? (I recently came to know that AWS has a library which permits uploads on multiple threads. So there should be some advantages I am unaware of.)
A TCP connection can be limited in rate even on a high-speed network because of the bandwidth-delay product.
A high bandwidth-delay product is an important problem case in the design of protocols such as Transmission Control Protocol (TCP) in respect of TCP tuning, because the protocol can only achieve optimum throughput if a sender sends a sufficiently large quantity of data before being required to stop and wait until a confirming message is received from the receiver, acknowledging successful receipt of that data. If the quantity of data sent is insufficient compared with the bandwidth-delay product, then the link is not being kept busy and the protocol is operating below peak efficiency for the link.
One easy way to work around TCP limitations on connections with large bandwidth-delay products is to run multiple streams in parallel.
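To put rough numbers on this, here is a small sketch of the arithmetic. The figures (a 64 KiB window, 100 ms round-trip time, 100 Mbit/s link) and the function names are assumptions picked for illustration; real stacks use window scaling and congestion control, so actual limits will differ.

```python
import math


def max_single_stream_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound for one connection: at most one window of data per round trip."""
    return window_bytes * 8 / rtt_seconds


def bandwidth_delay_product_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep the link busy."""
    return link_bps * rtt_seconds / 8


def streams_to_fill_link(link_bps: float, window_bytes: int, rtt_seconds: float) -> int:
    """How many parallel window-limited streams are needed to saturate the link."""
    return math.ceil(link_bps / max_single_stream_bps(window_bytes, rtt_seconds))


if __name__ == "__main__":
    # Assumed example values, not measurements.
    window = 64 * 1024      # 64 KiB window, i.e. no window scaling
    rtt = 0.100             # 100 ms round-trip time
    link = 100e6            # 100 Mbit/s link

    print(max_single_stream_bps(window, rtt) / 1e6, "Mbit/s per stream")
    print(bandwidth_delay_product_bytes(link, rtt), "bytes in flight needed")
    print(streams_to_fill_link(link, window, rtt), "parallel streams")
```

Under these assumptions a single stream tops out at roughly 5 Mbit/s on a 100 Mbit/s link, which is why splitting an upload into parts sent over parallel connections can help even though the CPU is nowhere near busy.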
I know that VoIP uses UDP for the transport layer, which doesn't ensure ordered delivery. Whenever I use a VoIP phone, I sometimes experience lost sentences and blurred sentences. However, I never hear an older sentence arriving after a newer one. How does VoIP manage to do this?
Thanks in advance,
Pavan.
This depends on the implementation. RTP combined with SIP is a commonly used protocol set, and RTP packets carry sequence numbers and timestamps. An RTP receiver usually has something called a jitter buffer, which delays playback a little (in the ~100 ms range) and manages a short list of already received packets (with a capacity of a few packets). Packets that arrive slightly out of order can be inserted into the middle of this list, so playback order can be restored.
Regardless of this, hearing audio in reversed order would be very unlikely. Each packet holds only about 20 ms of audio, so even if a naive implementation ignored timestamps or could not restore the order, you would not hear this as reordered sentences but rather as serious audio distortion. The compression used may also matter, as a codec may not be able to recover quickly if it receives packets in the wrong order.
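The jitter buffer idea described above can be sketched roughly as follows. This is a toy illustration rather than how any particular VoIP stack does it: the class name, the `depth` parameter and the `play_order` helper are made up for the example, and 16-bit sequence number wraparound is deliberately ignored.

```python
import heapq


class JitterBuffer:
    """Toy jitter buffer: hold a few packets, release them in sequence order."""

    def __init__(self, depth: int = 2):
        self.depth = depth          # packets held back before playback starts
        self._heap = []             # min-heap ordered by sequence number
        self._last_played = None    # highest sequence number already played

    def push(self, seq: int, payload: bytes) -> None:
        # A packet older than what was already played is too late: drop it.
        # This is why an old sentence is never heard after a newer one.
        if self._last_played is not None and seq <= self._last_played:
            return
        heapq.heappush(self._heap, (seq, payload))

    def pop_ready(self):
        """Return packets that may be played now, lowest sequence number first."""
        ready = []
        while len(self._heap) > self.depth:
            seq, payload = heapq.heappop(self._heap)
            if self._last_played is not None and seq <= self._last_played:
                continue  # duplicate of something already played
            self._last_played = seq
            ready.append((seq, payload))
        return ready


def play_order(arrivals, depth: int = 2):
    """Feed sequence numbers in arrival order, return them in playback order."""
    buf = JitterBuffer(depth)
    played = []
    for seq in arrivals:
        buf.push(seq, b"")
        played.extend(seq for seq, _ in buf.pop_ready())
    return played


if __name__ == "__main__":
    # Packet 4 overtakes packet 3 on the network; playback is still in order.
    print(play_order([1, 2, 4, 3, 5, 6, 7, 8]))
```

The cost of the reordering window is the added playback delay: a deeper buffer tolerates more reordering but makes the conversation feel laggier.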
The only possible reason that I could think of is the low overhead, i.e. a fixed header size of only 2 bytes minimum, leading to a small packet size. Are there other factors in the design of the protocol?
EDIT: I am sorry, I made a mental typo (?). As @Shashi pointed out, I actually meant high latency, low bandwidth.
MQTT is designed for devices with a small memory footprint, low network bandwidth, etc. Devices such as sensors, energy meters and pacemakers are ideal use cases for MQTT. Low latency means high speed. For low latency you require a different protocol, like Reliable Multicast running over Gigabit Ethernet or InfiniBand networks.
One of the key factors is that the TCP connection an MQTT client establishes is reused all the time. That means you don't have to establish a new connection for every message, as is the case with classic HTTP. Also, as you already suspected, the very small packet size is key here: typical MQTT messages don't add much overhead on top of the raw TCP packet.
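To make the overhead concrete, here is a rough sketch that builds a QoS 0 PUBLISH packet by hand, following the MQTT 3.1.1 packet layout as I understand it (one byte of type and flags, a variable-length "remaining length", then the topic and payload). The function names, topic and payload are invented for the example; a real client would use an MQTT library instead.

```python
def encode_remaining_length(n: int) -> bytes:
    """MQTT variable-length integer: 7 bits per byte, high bit means 'more follows'."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80  # continuation bit
        out.append(digit)
        if n == 0:
            return bytes(out)


def publish_qos0(topic: str, payload: bytes) -> bytes:
    """Build a PUBLISH packet with QoS 0, no DUP, no RETAIN (first byte 0x30)."""
    topic_bytes = topic.encode("utf-8")
    variable_header = len(topic_bytes).to_bytes(2, "big") + topic_bytes
    remaining = len(variable_header) + len(payload)
    fixed_header = bytes([0x30]) + encode_remaining_length(remaining)
    return fixed_header + variable_header + payload


if __name__ == "__main__":
    # Hypothetical sensor reading on a hypothetical topic.
    packet = publish_qos0("t/1", b"21.5")
    print(len(packet), packet.hex())
```

For this assumed 4-byte reading on a 3-byte topic, the whole packet is 11 bytes: 2 bytes of fixed header, 2 bytes of topic length, the topic, and the payload. Short topic names therefore matter almost as much as the small header.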
To save more bandwidth on unreliable networks, the persistent session feature of MQTT allows clients to subscribe only once; on reconnect, the subscriptions are retained for the client. For subscribing clients this can drastically reduce the overhead, as the subscription message is only sent once.
Another reason, it seems, is the Last Will and Testament feature, which is useful to have on high-latency, low-bandwidth and unreliable networks.
I am receiving RTP packets via UDP (video data).
The RTP packets carry H264 that I need to decode. Unfortunately, most of them hold fragmented data. Because some RTP sequence numbers are missing, I cannot reconstruct the H264 stream properly.
Any idea on how to reduce data loss in order to be able to decode at least a couple of frames?
There is not much one can say. Lost data is lost as the adjective suggests. You can't get it back. In almost any case you can still feed the remaining NALs into the decoder and render the video. You will see artifacts that are introduced by the missing NALs but that's life.
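As a rough illustration of feeding only the remaining NALs to the decoder, the sketch below reassembles FU-A fragments (the fragmentation unit format from the H.264 RTP payload specification, RFC 6184) and abandons any NAL that has a sequence gap in the middle. It assumes the RTP header has already been stripped, that packets are handed over sorted by sequence number, and it ignores aggregation packets such as STAP-A; `reassemble_nals` is a name made up for this example.

```python
def reassemble_nals(packets):
    """packets: iterable of (sequence_number, rtp_payload) in sequence order.

    Returns the complete NAL units; fragmented NALs with a lost piece are dropped.
    """
    nals = []
    in_progress = None   # bytearray of the fragmented NAL being rebuilt, or None
    prev_seq = None
    for seq, payload in packets:
        # RTP sequence numbers are 16 bits wide and wrap around.
        gap = prev_seq is not None and seq != (prev_seq + 1) & 0xFFFF
        prev_seq = seq
        if gap:
            in_progress = None          # a fragment was lost: abandon this NAL
        if not payload:
            continue
        nal_type = payload[0] & 0x1F
        if 1 <= nal_type <= 23:         # single NAL unit packet: pass through
            in_progress = None
            nals.append(bytes(payload))
        elif nal_type == 28 and len(payload) >= 2:   # FU-A fragment
            fu_header = payload[1]
            is_start = bool(fu_header & 0x80)
            is_end = bool(fu_header & 0x40)
            if is_start:
                # Rebuild the NAL header from the F/NRI bits and the original type.
                in_progress = bytearray([(payload[0] & 0xE0) | (fu_header & 0x1F)])
            elif in_progress is None:
                continue                # middle or end piece without its start
            in_progress += payload[2:]
            if is_end:
                nals.append(bytes(in_progress))
                in_progress = None
    return nals


if __name__ == "__main__":
    packets = [
        (10, bytes([0x67, 0xAA])),              # single NAL (type 7)
        (11, bytes([0x7C, 0x85, 0x01])),        # FU-A start, type 5
        (12, bytes([0x7C, 0x05, 0x02])),        # FU-A middle
        (13, bytes([0x7C, 0x45, 0x03])),        # FU-A end
        (14, bytes([0x7C, 0x81, 0x04])),        # FU-A start, type 1
        # packet 15 was lost on the network
        (16, bytes([0x7C, 0x41, 0x06])),        # FU-A end, but its NAL is incomplete
    ]
    for nal in reassemble_nals(packets):
        print(nal.hex())
```

Dropping an incomplete NAL outright is only one possible policy; some decoders cope with truncated slices, in which case passing the partial data along may give fewer artifacts.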
Lost data is lost.
In order to reduce data loss you will need to change your transmission protocol. RTP interleaved in RTSP (carried over the RTSP TCP connection) could be a good choice, as it is based on a similar technology stack.
Changing to TCP will obviously only help if you have enough bandwidth to transmit the video.
If you have control over the H264 encoder, enable its error resilience tools (http://www.slideshare.net/coldfire7/error-resiliency-and-concealment-in-h264-presentation), which make your video more robust against transmission errors, so that your RTP over UDP stream becomes more resistant to packet loss.