I have a peer-to-peer videoconference app using simpleWebRTC and signalmaster for signaling. When more than 4 users connect, the stress it causes on the network and the TURN server becomes too great, so I was wondering: is it possible to implement an MCU in this case? What would it take to do it?
For more than 4 participants, full mesh (connecting each participant with everyone else peer-to-peer) is impractical. If there are n participants, then each of them needs to have n - 1 outgoing and n - 1 incoming video streams, which quickly saturates the bandwidth, especially on mobile.
An SFU, for example Janus, forwards packets between call participants. The advantage of the SFU for group calls is that each participant needs to push their video stream only once - to the SFU - which then forwards it to everyone else. There are still n - 1 incoming streams for each participant though.
An MCU is capable of combining multiple video streams into one, so each participant ends up with 1 outgoing video stream and 1 incoming composite video stream. To produce a composite video stream out of n - 1 individual ones, an MCU needs to re-encode video in realtime, which makes it a CPU hog.
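To make the trade-off concrete, here is a tiny back-of-the-envelope sketch in TypeScript of how many streams each participant handles under the three topologies:

```typescript
// Streams each participant must handle under the three topologies.
function streamsPerParticipant(n: number) {
  return {
    mesh: { up: n - 1, down: n - 1 }, // a stream to and from every other peer
    sfu:  { up: 1,     down: n - 1 }, // one upload to the SFU, n - 1 forwarded downloads
    mcu:  { up: 1,     down: 1 },     // one upload, one composite download
  };
}

for (const n of [4, 6, 10]) {
  const t = streamsPerParticipant(n);
  console.log(
    `${n} participants: mesh ${t.mesh.up}up/${t.mesh.down}down, ` +
      `SFU ${t.sfu.up}up/${t.sfu.down}down, MCU ${t.mcu.up}up/${t.mcu.down}down`
  );
}
```

At, say, 1 Mbps per stream, a 10-person mesh call already asks every participant for 9 Mbps up and 9 Mbps down, which is where things fall apart on mobile.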
I would suggest giving Janus (SFU) a try first and seeing how that works for you.
We're experimenting with a Freeswitch-based multiparty video conferencing solution (Zoom-like). The users connect via WebRTC (Verto clients) and the streams are all muxed and displayed on the canvas (mod_conference in mux mode). It works OK, but we notice high media latency on the mixed output, and this makes it very difficult to have a real-time dialogue.

This is not load related: even with only 1 caller watching himself on the canvas (the muxed conference output), it takes almost 1 second to see a local move reflected on the screen (e.g. if I raise my hand I see it happening on the screen after almost 1 second). This is obviously the round-trip delay, but after discarding the intrinsic network latency (measured to be about 100 ms round trip) there seem to be around 800-900 ms of added latency. There's no TURN relaying involved. It seems this is being introduced along the buffering/transcoding/muxing pipeline.

Any suggestions on what to try to reduce the latency? What sort of latency should we expect? What's your experience, has anyone deployed Freeswitch video conferencing with acceptable latency for bidirectional, real-time conversations? Ultimately I'm trying to understand whether Freeswitch can be used for multiparty real-time video conversation or whether I should give up and look for something else. Thanks!
Does anyone know how to stream HTML5 camera output to other users?
If that's possible, should I use sockets and stream images to the users, or some other technology?
Is there any video tutorial where I can take a look at this?
Many thanks.
The two most common approaches now are most likely:
Stream from the source to a server, and allow users to connect to the server to stream to their devices, typically using some form of Adaptive Bit Rate (ABR) streaming protocol. ABR basically creates multiple bit-rate versions of your content and chunks them, so the client can choose the next chunk from the best bit rate for the device and current network conditions.
Stream peer to peer, or via a conferencing hub, using WebRTC
In general, the latter is more focused on real time: any delay should stay below the threshold that would interfere with audio and video conferencing, usually less than 200 ms for audio, for example. To achieve this it may have to sacrifice quality at times, especially video quality.
There are some good WebRTC samples available online (here at the time of writing): https://webrtc.github.io/samples/
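For the WebRTC route, capturing the HTML5 camera starts with getUserMedia. A minimal sketch in browser TypeScript, assuming a `<video id="preview">` element exists on the page (signaling and peer connections are left out):

```typescript
// Minimal camera capture: the resulting MediaStream is what you would
// later attach to an RTCPeerConnection (WebRTC) or send to a server.
async function startCamera(): Promise<MediaStream> {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: true,
    audio: true,
  });
  // Local preview; assumes a <video id="preview" autoplay muted> element.
  const preview = document.getElementById("preview") as HTMLVideoElement;
  preview.srcObject = stream;
  return stream;
}

startCamera().catch((err) => console.error("getUserMedia failed:", err));
```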
I have a simple UDP streaming protocol that takes raw H.264 video frames and sends them instantly from the server side to the client side.
Using this protocol I can get close to network RTT latency (no packet resending, and I don't care about packet loss), so if I have 20 ms latency from server to client, a video frame can go from encoder output to being ready to decode on the client side in, let's say, 30 ms.
My question is:
Is WebRTC (over UDP) capable of going down to this kind of latency?
Not taking into account encoding and decoding times, what is the lowest latency I can get with WebRTC at the protocol layer?
I don't know whether this kind of latency will require me to develop my own protocol further, or whether I can go with something more generic like WebRTC for my video server development so that it is instantly supported by every web browser.
WebRTC can have the same low latency as regular SIP/RTP stacks.
WebRTC stack vendors do their best to reduce delay.
For recording and sending out there is no added delay: the stack sends the packets immediately once they are received from the recording device and compressed with the selected codec. Some codecs (and some codec settings) might introduce some delay here to enable features such as FEC.
Regarding the receiver side:
In optimal circumstances the stack should not delay the playback of the packets, so they can be displayed as soon as they arrive.
However, in sub-optimal circumstances (with network delays or packet loss) the stack will introduce a jitter buffer. The lower the network quality, the longer the jitter buffer.
So, to achieve the lowest delay, you might have to do the following:
choose a codec with the smallest processing time (see the sketch after this list)
remove FEC and disable any other settings which might cause additional delays
remove the jitter buffer (most WebRTC stacks don't have a setting for this, so you might have to modify the code yourself, but it is an easy modification because you just need to deactivate a part of the code)
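As a rough browser-side illustration of the first two points, you can steer codec selection and drop the FEC/RED payloads with RTCRtpTransceiver.setCodecPreferences where the browser supports it (the preference for H.264 below is only an example, not a recommendation). The jitter buffer itself is not exposed through the browser API, so that part really does require changes inside a native stack.

```typescript
// Sketch: prefer one codec and drop FEC/RED payloads for video transceivers.
function preferLowLatencyVideo(pc: RTCPeerConnection) {
  const caps = RTCRtpSender.getCapabilities("video");
  if (!caps) return;

  const isFec = (c: RTCRtpCodecCapability) =>
    /(ulpfec|flexfec|red)/i.test(c.mimeType);

  // Put the preferred codec first and leave FEC/RED entries out entirely.
  const preferred = caps.codecs.filter((c) => /h264/i.test(c.mimeType));
  const rest = caps.codecs.filter((c) => !/h264/i.test(c.mimeType) && !isFec(c));

  for (const transceiver of pc.getTransceivers()) {
    if (transceiver.sender.track?.kind === "video") {
      transceiver.setCodecPreferences([...preferred, ...rest]);
    }
  }
}
```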
WebRTC uses RTP as the underlying media transport, which adds only a small header at the beginning of the payload compared to plain UDP. This means it should be on par with what you achieve with plain UDP. RTP is heavily used in latency-critical environments like real-time audio and video (it's the media transport in SIP, H.323 and XMPP), so you can expect the latency to be sufficient for this purpose.
I'm currently working on a network protocol which includes a client-to-client system with auto-discovery of clients on the current local network.
Right now, I'm periodically broadcasting over 255.255.255.255, and if a client doesn't emit anything for 30 seconds I consider it dead (then offline). The goal is to keep an up-to-date list of running clients. It's working well over UDP, but UDP does not ensure that the packets have been successfully delivered. So when it comes to the WiFi parts of the network, I sometimes get "false positives" of dead clients. Currently I've reduced the time between 2 broadcasts to work around the issue (still not working well), but I don't find this clean.
Is there anything I can do to keep a list of "online" clients without this risk of false positives?
To minimize false positives due to dropped packets, you should alter the logic of your heartbeat protocol a little.
Rather than relying on a single broadcast packet per N seconds, you can send a burst of 3 or more packets immediately one after the other every N seconds. This is the approach the ping and traceroute tools follow. With this method you significantly decrease the probability of a lost announcement from a peer.
Furthermore, you can specify a certain number of lost announcements that your application can afford. Also, in order to minimize the possibility of packet loss over the wireless network, try to keep the broadcast UDP packet as small as possible.
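A rough sketch of the burst idea in Node.js follows; the port, payload format and timings are arbitrary assumptions, not values from your protocol:

```typescript
import dgram from "node:dgram";

// Send a burst of identical heartbeat packets every PERIOD_MS.
// Port, payload and burst size are illustrative assumptions.
const PORT = 41234;
const BURST = 3;
const PERIOD_MS = 10_000;

const socket = dgram.createSocket("udp4");

socket.bind(() => {
  socket.setBroadcast(true);

  setInterval(() => {
    const msg = Buffer.from(JSON.stringify({ id: "client-1", ts: Date.now() }));
    for (let i = 0; i < BURST; i++) {
      // Losing one or two of these still leaves the peer marked alive.
      socket.send(msg, PORT, "255.255.255.255");
    }
  }, PERIOD_MS);
});
```

On the receiving side you would then mark a peer dead only after several consecutive missed periods, not after a single missing packet.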
You can turn this around, so the server broadcasts a "ServerIsUp" message and every client can then register with the server. When a client goes offline it unregisters; otherwise you can consider it alive.
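A minimal sketch of that inversion, client side, again with Node's dgram (the port and message strings are made up for the example):

```typescript
import dgram from "node:dgram";

// Client side: wait for the server's "ServerIsUp" broadcast, then register.
// Port and message format are illustrative assumptions.
const PORT = 41235;
const socket = dgram.createSocket("udp4");

socket.on("message", (msg, rinfo) => {
  if (msg.toString() === "ServerIsUp") {
    // Register with the server that announced itself.
    socket.send(Buffer.from("register:client-1"), rinfo.port, rinfo.address);
  }
});

socket.bind(PORT);
```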
I would like to build an audio-only conference system based on WebRTC for a game, but I must avoid lag.
For example, I could use https://github.com/muaz-khan/WebRTC-Experiment/tree/master/RTCMultiConnection
How many clients can be connected at the same time? And what about bandwidth?
I think that upload will become a limit as the number of connected clients increases.
Could I make a tree of clients, so each client has only 2 or 3 connections?
The server can be in any language because I will use vert.x
Regards
There are two scenarios:
1: Peer-to-Peer
In this model, the maximum number of peer connections per page in Chromium is 256.
2: Peer-to-Server
In this model, you can use a media server to relay the stream to an unlimited number of peers.
In the 1st model, you can face bandwidth/CPU usage issues.
In the 2nd model, all such things are handled by the media server.
If you're planning to set up peer-to-peer video conferencing, you'll use the mesh model. It is suggested to limit the conference to 5 users only; otherwise you'll face issues like audio loss, echo and, obviously, huge bandwidth/CPU usage!
In broadcasting p2p scenarios, you can relay, i.e. forward, remote streams to take the burden off a single peer.
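Forwarding a remote stream in the browser essentially means taking the tracks received on one RTCPeerConnection and adding them to another. A rough sketch, with signaling omitted and `downstreamPc` standing in for a hypothetical connection to the next client in your tree:

```typescript
// Relay a stream received from an upstream peer to a downstream peer.
// Signaling (offer/answer exchange) is omitted; downstreamPc is assumed
// to be another RTCPeerConnection you have set up to the next client.
function relayRemoteStream(
  upstreamPc: RTCPeerConnection,
  downstreamPc: RTCPeerConnection
) {
  upstreamPc.ontrack = (event) => {
    const [stream] = event.streams;
    for (const track of stream.getTracks()) {
      downstreamPc.addTrack(track, stream);
    }
  };
}
```

Keep in mind that every relay hop adds its own latency, so a deep tree works against the low-lag requirement.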