What happens in a multi-party WebRTC connection when some users use STUN and/or TURN?

In my case, I have a WebRTC-based web app that supports multi-party video chat and has STUN and TURN servers configured. The connections are set up as a mesh (peer to peer). What happens when some of the users involved in the video chat need to establish their connection via TURN? Do all of the users start to use TURN? What if I'm the user that's behind a NAT? Does that mean that all connections established with me use TURN?

Each connection in the mesh is an independent peer-to-peer connection, so if one connection uses TURN, it doesn't affect the other peer-to-peer connections.
A user behind a NAT may not need TURN at all in some cases: it all depends on the type of NAT.
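
To make the per-connection point concrete, here is a minimal sketch that classifies each connection of a mesh separately. It assumes you already have the local and remote candidate lines of each connection's selected pair (in a browser these would come from the connection's stats); the names candidateType, isRelayed and the sample peers and addresses are made up for illustration.

```typescript
// ICE candidate types as they appear after "typ" in a candidate line.
type CandidateType = "host" | "srflx" | "prflx" | "relay";

// Extract the candidate type from a candidate line, if present.
function candidateType(candidateLine: string): CandidateType | undefined {
  const match = /\btyp (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? (match[1] as CandidateType) : undefined;
}

// Hypothetical record of the candidate pair one connection ended up using.
interface SelectedPair {
  local: string;
  remote: string;
}

// A connection goes through TURN if either end of its pair is a relay candidate.
function isRelayed(pair: SelectedPair): boolean {
  return candidateType(pair.local) === "relay" || candidateType(pair.remote) === "relay";
}

// Assumed sample data: my two connections in a three-party mesh.
const mesh: Array<{ peer: string; pair: SelectedPair }> = [
  {
    peer: "alice",
    pair: {
      local: "candidate:1 1 udp 1677729535 203.0.113.5 46154 typ srflx raddr 10.0.0.2 rport 46154",
      remote: "candidate:2 1 udp 2122260223 192.0.2.10 54321 typ host",
    },
  },
  {
    peer: "bob",
    pair: {
      local: "candidate:3 1 udp 41885439 198.51.100.7 61000 typ relay raddr 203.0.113.5 rport 46154",
      remote: "candidate:4 1 udp 1677729535 192.0.2.77 40000 typ srflx raddr 10.1.1.3 rport 40000",
    },
  },
];

// Only the connection to bob is relayed; the one to alice stays direct.
const relayedPeers = mesh.filter((entry) => isRelayed(entry.pair)).map((entry) => entry.peer);
console.log(relayedPeers);
```

Each entry is evaluated on its own, which mirrors how ICE runs independently for every peer connection in the mesh.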

Related

Understanding SFU's, TURN servers in WebRTC

If I am building a WebRTC app and using a Selective Forwarding Unit media server, does this mean that I will have no need for STUN / TURN servers?
From what I understand, STUN servers are used for clients to discover their public IP / port, and TURN servers are used to relay data between clients when they are unable to connect directly to each other via STUN.
My question is, if I deploy my SFU media server with a public address, does this eliminate the need for STUN and TURN servers? Since data will always be relayed through the SFU and the clients / peers will never actually talk to each other directly?
However, I noticed that the installation guide for Kurento (a popular media server with SFU functionality) contains a section about configuring STUN or TURN servers. Why would STUN or TURN servers be necessary?
You should still use a TURN server when running an SFU. To understand why, it helps to dive into ICE a little. All SFUs work a little differently, but the following is true for most.
For each PeerConnection, the SFU listens on a random UDP (and sometimes TCP) port.
This IP/port combination is given to each peer, which then attempts to contact the SFU.
The SFU then checks whether the incoming packets contain a valid hash (keyed with the ICE password, ice-pwd). This ensures that no attacker can connect to this port.
A TURN server works as follows:
It provides a single allocation port that peers can connect to. You can use UDP, DTLS, TCP or TLS. You need a valid username/password.
Once authenticated, you send packets via this connection and the TURN server relays them for you.
The TURN server then listens on a random port so that others can send traffic back to the peer.
So a TURN server offers a few nice things that an SFU doesn't:
You only have to listen on a single public port. If you are communicating with a service that is not on the internet, you can have your clients connect only to the allocation port.
You can also make your service available via UDP, DTLS, TCP and TLS. Most ICE implementations only support UDP.
These two factors are really important in government/hospital situations, where networks may only allow TLS traffic over port 443. There, a TURN server is your only solution (you run your allocation on TLS 443).
So you need to design your system around your needs. But IMO you should always run a well-configured TURN server in real-world environments.
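
As a rough illustration of the TLS-on-443 setup, the sketch below builds an ICE server list that offers TURN over UDP, TCP and TLS on port 443. The host turn.example.com, the credentials and the helper buildIceServers are assumptions; IceServerConfig is a local interface that only mirrors the shape of the browser's RTCIceServer dictionary, and port 3478 is simply the default STUN/TURN port.

```typescript
// Local interface mirroring the shape of the browser's RTCIceServer dictionary.
interface IceServerConfig {
  urls: string[];
  username?: string;
  credential?: string;
}

// Build a server list for one TURN host (hypothetical helper).
function buildIceServers(host: string, username: string, credential: string): IceServerConfig[] {
  return [
    // STUN for discovering the public address.
    { urls: [`stun:${host}:3478`] },
    {
      urls: [
        `turn:${host}:3478?transport=udp`, // plain UDP allocation
        `turn:${host}:3478?transport=tcp`, // TCP fallback
        `turns:${host}:443?transport=tcp`, // TLS on 443 for locked-down networks
      ],
      username,
      credential,
    },
  ];
}

// Placeholder host and credentials, not real ones.
const iceServers = buildIceServers("turn.example.com", "demo-user", "demo-secret");
console.log(JSON.stringify(iceServers, null, 2));
```

In a browser this list would be passed as the iceServers field of the RTCPeerConnection configuration. Whether the TURN server actually listens on these ports depends on how it is configured.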

Difference between STUN/TURN(coTURN) servers and Signaling servers (written with socket.io/websocket) in WebRTC?

I am building a video teaching site. I did some research and have a good understanding of most of it, except for one thing. When a user wants to connect to another user P2P, I need a signaling server to get their public IPs and connect them. STUN does that job, and TURN relays the media if the peers cannot connect directly. If I write a signaling server with WebSocket to exchange the SDP messages and have ICE working, do I still need coTURN installed? What exactly is the job of each of them?
What confuses me is the division of work between my simple WebSocket signaling server (based on what I saw in different tutorials) and the coTURN server I'll install, and how to connect them to the media server I'll install.
A second question: is there a way to use P2P when there are only two or three participants, and get the media servers involved if there are more than that, so that I don't use up too much of the participants' bandwidth?
The signaling server is required to exchange messages between peers (SDP packets) until they have established a P2P connection.
A STUN server is there to help a peer discover information about its public IP and to open up firewall ports. The main problem this is solving is that a lot of devices are behind NAT routers within small private networks; NAT basically allows outgoing requests and their response, but blocks any other "unsolicited" incoming requests. You therefore have a Catch-22 scenario when both peers are behind a NAT router and could make an outgoing request, but have nowhere to send it to since the opposite peer doesn't expose anything to make a request to. STUN servers act as a temporary middleman to make requests to, which opens a port on the NAT device to allow the response to come back, which means there's now a known open port the other peer can use. It's a form of hole-punching.
A TURN server is a relay in a publicly accessible location, in case a P2P connection is impossible. There are still cases where hole-punching is unsuccessful, e.g. due to more restrictive firewalls. In those cases the two peers simply cannot talk 1-on-1 directly, and all their traffic is relayed through a TURN server. That's a 3rd party server that both peers can connect to unrestrictedly and that simply forwards data from one peer to the other. One popular implementation of a TURN server is coturn.
Yes, basically all those functions could be fulfilled by a single server, but they’re deliberately separated. The WebRTC specification has absolutely nothing to say about signaling servers, since the signaling mechanism is specific to each application and could take many different forms. TURN is very bandwidth-intensive and usually has to be delegated to a larger server farm if you’re hoping to scale at all, so it is impractical to mix in with either of the other two functions. So you end up with three separate components.
Regarding multi-peer connections: yes, you can set up a P2P group chat just fine. However, each peer will need to be connected to every other peer, so the number of connections and bandwidth per peer increases with each new peer. That’s probably going to work okay for 3 or 4 peers, but beyond that you may start to run into bandwidth and CPU limits of individual peers, especially if you’re doing decent quality video streaming.
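
The growth described above can be put into numbers with a small sketch. The function meshLoad is made up for illustration, and the 1500 kbps per video stream is purely an assumed figure; real bitrates depend on resolution and codec.

```typescript
interface MeshLoad {
  totalConnections: number; // connections in the whole mesh
  connectionsPerPeer: number; // connections each peer maintains
  uplinkKbpsPerPeer: number; // upload needed by each peer
}

// In a full mesh every peer connects to every other peer.
function meshLoad(peers: number, kbpsPerStream: number): MeshLoad {
  const connectionsPerPeer = Math.max(peers - 1, 0);
  return {
    totalConnections: (peers * connectionsPerPeer) / 2,
    connectionsPerPeer,
    // Each peer uploads its own stream once per connection.
    uplinkKbpsPerPeer: connectionsPerPeer * kbpsPerStream,
  };
}

const fourPeers = meshLoad(4, 1500); // 6 connections, 3 per peer, 4500 kbps up
const eightPeers = meshLoad(8, 1500); // 28 connections, 7 per peer, 10500 kbps up
console.log(fourPeers, eightPeers);
```

Under that assumed bitrate, going from 4 to 8 peers more than doubles each participant's upload, which is where a media server starts to look attractive.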

Does exchanging SDP insecurely jeopardize the security of a peer connection?

I have a problem. I've developed a web app that uses WebRTC for one-to-one video calls in the browser, with a signalling server on Node.js (listening e.g. on port 8181).
Now I would like to implement a MITM attack. I was thinking that Peer_1 should create two RTC peer connections, one to the second peer (Peer_2) and one to the MITM. The same goes for the second peer.
I was also thinking that the signalling server needs to listen on another port for each RTC peer connection received from the two peers (e.g. 8282 for Peer_1 and 8383 for Peer_2).
Am I right? I think so because the signalling server is implemented for one-to-one communication.
In this way, the signalling server on port 8181 allows end-to-end communication between Peer_1 and Peer_2, port 8282 carries the signalling path between Peer_1 and the MITM, and port 8383 the one between the MITM and Peer_2.
Am I right or not? Thanks for the support.
Man in the middle refers to interception during transmission, which WebRTC itself is secured against using DTLS and key exchange, so the weak point is usually the signaling server chosen by the application.
What you describe, however, sounds more like a man on both ends. You have to trust the service (the server) to guarantee whom you're being connected to. If that server is compromised, or either client is compromised (say, by injection), then there's no guarantee whom you're talking to, since a client can easily forward a transmission to another party.
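
One way to see why the signaling path matters: the SDP carries an a=fingerprint line identifying the DTLS certificate, so whoever can rewrite the SDP can swap that value. The sketch below pulls the fingerprint out of an SDP string so it could be compared against a value obtained over some other trusted channel. The helper names, the sample SDP and the shortened fingerprint are all made up for illustration; this is not a complete defence.

```typescript
interface Fingerprint {
  algorithm: string;
  value: string;
}

// Extract the first DTLS fingerprint attribute from an SDP blob.
function dtlsFingerprint(sdp: string): Fingerprint | undefined {
  const match = /^a=fingerprint:(\S+) (\S+)\s*$/m.exec(sdp);
  const algorithm = match ? match[1] : undefined;
  const value = match ? match[2] : undefined;
  if (!algorithm || !value) return undefined;
  return { algorithm: algorithm.toLowerCase(), value: value.toUpperCase() };
}

// Compare the fingerprint in the SDP with one learned out of band.
function matchesExpected(sdp: string, expectedValue: string): boolean {
  const found = dtlsFingerprint(sdp);
  return found !== undefined && found.value === expectedValue.toUpperCase();
}

// Made-up SDP fragment; a real sha-256 fingerprint has 32 byte pairs.
const sampleSdp = [
  "v=0",
  "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "a=fingerprint:sha-256 AB:CD:EF:01:23:45:67:89",
].join("\r\n");

console.log(dtlsFingerprint(sampleSdp));
console.log(matchesExpected(sampleSdp, "ab:cd:ef:01:23:45:67:89"));
```

If the fingerprint received through signaling differs from the one the other party reports through a channel you trust, the SDP was altered somewhere along the way.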

Connect to specific user from STUN server in WEB RTC

I'm trying to build a peer-to-peer video conference using Google's STUN server.
I can connect to anyone through the STUN server randomly, because STUN gives multiple random addresses and I connect with them.
But is there any way to connect to a specific peer through the STUN server, for a login-based or room-based system?
I want to achieve something like https://apprtc.appspot.com/
You need to design your signalling method (this is up to the application developer), which is independent of STUN.
WebRTC does not specify the mechanism for signalling. Signalling is the method whereby users discover each other and establish that a call (media streams between two peers) is going to take place.
The 'discovery' process could involve a registration-based system (e.g. using a SIP proxy) or a room-based one, where two users have access to a 'room' (by knowing the credentials or through some other means of authentication). Once two peers have found each other, their browsers then need to share and negotiate network topology and media capabilities to ensure that the streams can reach the intended destination and can be encoded/decoded properly.
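
A room-based discovery step can be sketched as a small in-memory registry on the signaling side. Everything here (RoomRegistry, the room and peer ids) is hypothetical; it only shows how a signaling server could decide which specific peers should exchange offers, and it leaves out authentication and the actual transport.

```typescript
// Hypothetical in-memory registry kept by the signaling server.
class RoomRegistry {
  private rooms: Record<string, string[]> = {};

  // Add a peer to a room and return the peers already in it,
  // i.e. the ones the newcomer should start negotiating with.
  join(roomId: string, peerId: string): string[] {
    const members = this.rooms[roomId] || [];
    const others = members.filter((id) => id !== peerId);
    this.rooms[roomId] = others.concat(peerId);
    return others;
  }

  // Remove a peer when it disconnects.
  leave(roomId: string, peerId: string): void {
    const members = this.rooms[roomId] || [];
    this.rooms[roomId] = members.filter((id) => id !== peerId);
  }
}

const registry = new RoomRegistry();
const aliceSees = registry.join("math-101", "alice"); // first in the room, nobody to call
const bobSees = registry.join("math-101", "bob"); // should negotiate with alice
const carolSees = registry.join("physics-7", "carol"); // different room, no overlap
console.log(aliceSees, bobSees, carolSees);
```

STUN plays no part in this step: it only comes into play once the two chosen peers start gathering ICE candidates.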

Does WebRTC allow actual peer-to-peer communication?

Is the signaling server used only the first time to establish a connection between 2 peers or is it also used to send and receive data-streams between the peers?
According to the W3C proposal:
An RTCPeerConnection allows two users to communicate directly, browser to browser. Communications are coordinated via a signaling channel which is provided by unspecified means, but generally by a script in the page via the server, e.g. using XMLHttpRequest.
So the server is only used for signaling, not for data transmission. But signaling is not limited to establishing the first connection. The signaling channel is also used for transmitting error messages, metadata such as codecs and codec settings, network data, and keys for secure transmission.
This depends on the network configuration.
If at least one of the peers is not behind a NAT firewall, the peer that is directly on the internet acts as the server, and the signalling server is no longer used after the connection is established.
If both peers are behind a NAT appliance, under certain circumstances it might still be possible to negotiate a client-server connection between the peers, and the data is again sent directly between the two peers.
If both peers are behind a NAT firewall that is locked down, all the traffic between the peers passes through a relay server.
Notice also that in the first two cases, a STUN server is used to establish the connection. If the full data is relayed through a server, that server is a TURN server, not the signalling server.
Look at the good explanation in the article and video on html5rocks. They claim only about 14% of all connections need TURN, which seems a really low number to me (it corresponds to only about 37% of all clients being behind a locked-down NAT router).
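
The relation between the 14% and 37% figures can be checked with a couple of lines, under the assumption (taken from the cases listed above) that a connection only needs TURN when both peers are behind a locked-down NAT, and that the two peers are independent. The function names are made up for illustration.

```typescript
// Share of connections needing TURN if a fraction `lockedShare` of clients
// is behind a locked-down NAT and both ends must be locked down.
function relayedShare(lockedShare: number): number {
  return lockedShare * lockedShare;
}

// Inverse: share of locked-down clients implied by a given TURN share.
function lockedDownShare(turnShare: number): number {
  return Math.sqrt(turnShare);
}

const impliedLockedDownPercent = Math.round(lockedDownShare(0.14) * 100); // 37
const impliedTurnPercent = Math.round(relayedShare(0.37) * 100); // 14
console.log(impliedLockedDownPercent, impliedTurnPercent);
```

So 37% of clients being locked down gives roughly 0.37 × 0.37 ≈ 0.14 of all pairings needing a relay, which is where the two numbers meet.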