Media Source Extension Javascript API vis-a-vis WebRTC. Some questions - webrtc

The closest I came across this is this question on SO but that is just for basic understanding.
My question is: when Media Source Extension (MSE) is used where the media source is fetched from a remote end point, for example, through AJAX or fetch API or even websocket, the media is sent over TCP.
That will handle packet loss and sequencing so protocol like RTP with RTCP is not used. Is that correct?
But this will result in delay so it cannot be truly used for real-time communication. Yes?
There is no security/encryption requirement for MSE like in WebRTC (DTLS/SRTP). Yes?
One cannot, for example, mix a remote audio source from MSE with an audio mediaStreamTrack from a RTCPeerConnection as they do not have any common param like CNAME (RTCP) or are part of the same mediastream). In other words, the world of MSE and WebRTC cannot mix unless synchronization is not important. Correct?

That will handle packet loss and sequencing so protocol like RTP with RTCP is not used. Is that correct?
AJAX and Fetch are just JavaScript APIs for making HTTP requests. Web Socket is just an API and protocol extended from an initial HTTP request. HTTP uses TCP. TCP takes care of ensuring packets arrive and arrive in-order. So, yes, you won't need to worry about packet loss and such, but not because of MSE.
But this will result in delay so it cannot be truly used for real-time communication. Yes?
That depends entirely on your goals. It's a myth that TCP isn't fast, or that TCP increases general latency for every packet. What is true is that the initial 3-way handshake takes a few round trips. It's also true that if a packet does actually get dropped, the application sees latency as suddenly sharply increased until the packet is requested again and sent again.
If your goals are something like a telephony application where the loss of a packet or two is meaningless overall, then UDP is more appropriate. (In voice communications, we talk slow enough that if a few milliseconds of sound go missing, we can still decipher what was being said. Our spoken language is robust enough that if entire words get garbled or are silent, we can figure out the gist of what was being said from context.) It's also important that immediate continuity be kept for voice communications. The tradeoff is that realtime-ness is better than accuracy at any particular instant/packet.
However, if you're doing something, say a one-way stream, you might choose a protocol over TCP. In this case, it may be important to be as realtime as possible, but more important that the audio/video don't glitch out. Consider the Super Bowl, or some other large sporting event. It's a live event and important that it stays realtime. However, if the time reference for the viewer is only 3-5 seconds delayed from live, it's still "live" enough for the viewer. The viewer would be far more angry if the video glitched out and they missed something happening in the game, rather than if they were just behind a few seconds. Since it's one-way streaming and there is no communication feedback loop, the tradeoff for reliability and quality over extreme low latency makes sense.
There is no security/encryption requirement for MSE like in WebRTC (DTLS/SRTP). Yes?
MSE doesn't know or care how you get your data.
One cannot, for example, mix a remote audio source from MSE with an audio mediaStreamTrack from a RTCPeerConnection as they do not have any common param like CNAME (RTCP) or are part of the same mediastream). In other words, the world of MSE and WebRTC cannot mix unless synchronization is not important. Correct?
Mix, where? Synchronization, where? No matter what you do, if you have streams coming from different places... or even different devices without sync/gen lock, they're out of sync. However, if you can define a point of reference where you consider things "synchronized", then it's all good. You could, for example, have independent streams going into a server and the server uses its current timestamps to set everything up and distribute together via WebRTC.
How you do this, or what you do, depends on the specifics of your application.

Related

Is WebRTC a good fit for broad casting audio streams

Is WebRTC a good fit for broad casting audio streams in relaying cases? What I mean by relaying is avoiding a broadcasting peer seed to every listening peer itself and making the listener peers also seed the audio data to a number of other listeners that may be limited by some parameters of network quality etc. (I am sorry if I use the term relay wrongly.)
Listeners can be 1th to 6th order of magnitude of 10. Is it known and what would be the situation in every magnitude step about the quality of broadcasting and connectivity?
I would appreciate side notes about same questions for video streams, as well.

Reliable transport protocol on top of UDP

UDP has one good feature - it is connectionless. But it has many bad features - packets can be lost, arrive multiple times, there is no packet sequence - packet 2 can arrive faster than 1. How to keep good and remove bad?. Is there any good implementations that provide reliable transport protocol on top of udp so that we are still conectionless but without mentioned problems. One example of what can be done with it is mosh.
What you describe as bad isn't really bad depending on the context.
For example UDP is used a lot in realtime streaming, delivery confirmation and resending is useless in this context.
That being said there are e few implementations that you might want to look at:
ENet (http://enet.bespin.org/)
RUDP (https://en.wikipedia.org/wiki/Reliable_User_Datagram_Protocol)
UDT (https://en.wikipedia.org/wiki/UDP-based_Data_Transfer_Protocol)
I work in embedded context:
CoAP (https://en.wikipedia.org/wiki/Constrained_Application_Protocol) also implements a lot of these features, so its worth a look.
What is your reason for not choosing TCP?

How VoIP handles datagrams that arrive late?

I know that VoIP uses UDP for the transport layer which doesn't ensure ordered delivery. Whenever I use VoIP phone, some times I experience lost sentences and blurred sentences. However, I never hear an older sentence arriving after a new sentence. How does VoIP manage to do this?
Thanks in advance,
Pavan.
This depends on implementation. RTP combined with SIP might be common protocol set used and RTP packets have timestamps. RTP packet receiver has usually something called jitter buffer that is delaying playback a little (~100ms range) and is managing list of already received packets (capacity of few packets). Packets that arrived slightly out of order can be inserted in the middle of this list thus playback order can be restored.
Regardless of this hearing audio in reversed order would be very unlikely. Each packet holds only about 20 ms of audio so even if some dumb implementation would ignore timestamps and/or wouldn't be able to restore order you wouldn't hear this as sentence reorder but rather serious audio distortions. Used compression may also be important as codec may not be able to recover quickly if it receives packets in wrong order.

Chip to chip communication protocol over SPI

I'm trying to design an efficient communication protocol between a micro-controller on one side and an ARM processor on a multi-core TI chip on the other side through SPI.
The requirements for the needed protocol:
1 - Multi-session with queuing support, as I have multiple sending/receiving threads, so it will be more than one application using this communication protocol and I need the protocol to handle queuing these requests (I will keep holding the buffer if the transmission is queue but I just need the protocol to manage scheduling the queues).
2 - Works over SPI as an underlying protocol.
3 - Simple error checking.
In this thread: "Simple serial point-to-point communication protocol", PPP was a recommended option, however I see PPP does only part of the job.
I also found Light weight IP (LwIP) project featuring PPP over serial (which I assume that I can use it over SPI), so I thought about the possibility of utilizing any of the upper layers protocols like TCP/UDP to do the rest of the required jobs. Fortunately, I found TI including LwIP as part of their ethernet SW in the starterware package, which I assume to ease porting at least on the TI chip side.
So, my questions are:
1 - Is it valid to use LwIP for this communication scheme? Won't this introduce much overhead due to IP headers which are not necessary for a point to point (on the chip level) communication and kill the throughput?
2 - Will the TCP or any similar protocol residing in LwIP handle the queuing of transmission requests, for example if I request transmission through a socket while the communication channel is busy transmitting/receiving request for another socket (session) of another thread, will this be managed by the protocol stack? If so, which protocol layer manages it?
3 - Is their a more efficient protocol stack than LwIP, that meets the above requirements?
Update 1: More points to consider
1 - SPI is the only available option, I use it with available GPIOs to indicate to the master when the slave has data to send.
2 - The current implemented (non-standard) protocol uses DMA with SPI, and a message format of《STX_MsgID_length_payload_ETX》with a fixed message fragments length, however the main drawback of the current scheme is that the master waits for a response on the message (not fragment) before sending another one, which kills the throughput and does not utilise the full duplex nature of SPI.
3- An improvement to this point was to use a kind of mailbox for receiving fragments, so a long message can be interrupted by a higher priority one so that fragments of a single message can arrive non sequentially, but the problem is that this design lead to complicating things especially that I don't have much available resources for many buffers to use the mailbox approach on the controller (master) side. So I thought that it's like I'm re-inventing the wheel by designing a protocol stack for a simple point to point link which may not be efficient.
4- What kind of higher level protocols can be normally used above SPI to establish multiple sessions and solve the queuing/scheduling of messages?
Update 2: Another useful thread "A good serial communications protocol/stack for embedded devices?"
Update 3: I had a look at Modbus protocol, it seems to specify the application layer then directly the data link layer for serial line communication, which sounds to skip the unnecessary overhead of network oriented protocols layers.
Do you think this will be a better option than LwIP for the intended purpose? Also, is there a widely used open source implementation like LwIP but for Modbus?
I think that perhaps you are expecting too much of the humble SPI.
An SPI link is little more a pair of shift registers one in each node. The master selects a single node to connect to its SPI shift register. As it shifts in its data, the slave simultaneously shifts data out. Data is not exchanged unless the master explicitly clocks the data out. Efficient protocols on SPI involve the slave having something useful to output while the master inputs. This may be difficult to arrange, so you usually need a means of indicating null data.
PPP is useful when establishing a connection between two arbitrary endpoints, when the endpoints are fixed and known a priori, PPP would serve no purpose other than to complicate things unnecessarily.
SPI is not a very sophisticated nor flexible interface and probably unsuited to heavyweight general purpose protocols such as TCP/IP. Since "addressing" on SPI is performed by physical chip-select, the addressing inherent in such protocols is meaningless.
Flow control is also a problem with SPI. The master has no way of determining that the slave has copied the data from SPI the shift register before pushing more data. If your slave SPI supports DMA you would be wise to use it.
Either way I suggest that you develop something specific to your purpose. Since SPI is not a network as such, you only need a means to address threads on the selected node. This could be as simple as STX<thread ID><length><payload>ETX.
Added 27 September 2013 in response to comments
Generally SPI as its names suggests is used to connect to peripheral devices, and in that context the protocol is defined by the peripheral. EEPROMS for example typically use a common or at least compatible command interface across vendors, and SD/MMC card SPI interface uses a standardised command test and protocol.
Between two microcontrollers, I would imagine that most implementations are proprietary and application specific. Open protocols are designed for generic interoperability and to achieve that might impose significant unnecessary overhead for a closed system, unless perhaps the nodes were running a system that already had a network stack built in.
I would suggest that if you do want to use a generic network stack that you should abstract the SPI with device drivers at each end that give the SPI a standard I/O stream interface (open(), close(), read(), write() etc.), then you can use the higher-level PPP and TCP/IP protocols (although PPP can probably be avoided since the connection is permanent). However that would only be attractive if both nodes already supported these protocols (running Linux for example), otherwise it will be significant effort and code for little benefit, and would certainly not be "efficient".
I assume you dont really want or have room for a full ip (lwip) stack on the microcontroller? This just sounds like a lot of overkill. Why not just roll your own simple packet structure to move the data items you need to move. Depending on how spi is supported on both sides you may or may not be able to use it to define the frame for your data, if not a simple start pattern, length and a trailing checksum and maybe tail pattern would suffice for finding packet boundaries in the stream (no different than a serial/uart solution). You can even use the PPP solution for that with a start pattern and I think end pattern with the payload using a two byte pattern whenever the start pattern happens to show up in the data. I dont remember all the details now.
Whatever your frame is then add a packet type and your handshakes, or if the data is going to just be microcontroller to arm then you dont even need to do that.
To get back to your direct question. Yes, I think that an ip stack (lwip or other) will introduce a lot of overhead. both bandwidth and more important the amount of code needed to support that stack will chew up rom/ram on both sides. If you ultimately need to present this data in an ip fashion (a website hosted by the embedded system) then somewhere in the path you need an ip stack, etc.
I cant imagine that lwip manages your queues for you. I assume you would need to do that yourself. the various queues might want to talk to a single driver that deals with the single spi bus (assuming there is a single spi bus with multiple chip selects). It also depends on how you are using the spi interface, if you are allowing the arm to talk to multiple microcontrollers and the packets of data are broken up into a little bit from this controller a little from that controller so that nobody has to wait to long before they get a few more bytes of data. Or will a complete frame have to move from one microcontroller before moving onto the next gpio interrupt to pull that guys data? The long and short of it is I would assume you have to manage the shared resource just like you would in any other situation where you have multiple users of a shared resource (rtos, full blown operating system, etc). I dont remember lwip that well at all but with a full blown berkeley sockets application interface the user could write separate applications where each application only cared about one TCP or UDP port and the libraries and drivers managed separating those packets out to each application as well as all of the rules for the IP stack.
If you are not already doing experiments with moving data over the spi interface(s) I would start with simple experiments first just to get the feel for how well it is or isnt going to work, the sizes of transfers you can do reliably per spi transction, etc. Your solution may naturally just fall out of those experiments.

Moving from Qt to Boost - Networking

I's using Qt to build Udp and Tcp servers/clients. Now in recent project, Qt is not allowed and i'm supposed to use Boost. Having gone through boost::asio, i feel confused since there are so many features and usages are very different from Qt. For Udp (multicast,broadcast etc) i'm more or less clear now. But for Tcp i'm having problems understanding:
(have to do all async)
TCP is stream oriented so there is no way to know the number of bytes that have arrived. So we use asio::ip::tcp::socket::async_read_some(). However in Qt we have readAll() for a socket which can be made to read all data on readyRead() signal and return a QByteArray, the size of which we needn't specify as it can grow. How do i achieve this in Boost? readyRead() signal i think would map roughly to async_read_some(), what about the rest (readAll() and "growable" buffer/array?
disconnected() signal is emitted in Qt if the TCP connection disconnects (server/client off, lan cable unplugged etc). What would be the equivalent check in Boost? Do i need a timer to periodically send and expect heartbeat messages myself?
Use some overload of async_read free function with the appropriate completion-condition.
As you see there, this function accepts asio::streambuf, which is growing automatically.
Yes, the only portable & reliable way is to send some heartbeat. I don't know how Qt does that, but try cutting the network cable and see whether you get any notification from Qt...