Do both endpoints in WebRTC need STUN/TURN config/credentials - webrtc

Working in WebRTC, it would seem like only the offering client would need to provide STUN and TURN locations and credentials that would be encased in the offer and then used by the receiving client(s). Is that the case? If not, why not?

No, clients on both ends need to provide some sort of STUN/TURN configuration. Note that these configurations need not be the same.
Recall that STUN and TURN just provide the tools to get around NAT. In other words, they give a peer a way to figure out how it can be reached publicly. They do that by generating ICE candidates, which we send through signalling. As long as we can generate at least one valid ICE candidate and tell our peer about it, we can establish a connection.
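To make the "both ends, but not necessarily the same" point concrete, here is a minimal sketch of two independent configurations. The interfaces are local stand-ins for the browser's ICE server shape, and every host name and credential below is made up for illustration.

```typescript
// Local stand-in for the shape of a browser ICE server entry.
interface IceServer {
  urls: string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServer[];
}

// Hypothetical servers and credentials, not real endpoints.
const offererConfig: PeerConfig = {
  iceServers: [
    { urls: ["stun:stun.offerer.example:3478"] },
    { urls: ["turn:turn.offerer.example:3478"], username: "alice", credential: "alice-secret" },
  ],
};

// The answering side gathers its own candidates from its own servers.
const answererConfig: PeerConfig = {
  iceServers: [{ urls: ["stun:stun.answerer.example:3478"] }],
};

// Report which kinds of server (URL schemes) a configuration makes available.
function serverKinds(config: PeerConfig): string[] {
  const kinds: string[] = [];
  for (const server of config.iceServers) {
    for (const url of server.urls) {
      const scheme = url.split(":")[0];
      if (kinds.indexOf(scheme) === -1) {
        kinds.push(scheme);
      }
    }
  }
  return kinds.sort();
}
```

In a browser, each side would pass its own object to `new RTCPeerConnection(...)`. Neither object is ever sent to the other peer.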
Both ends need to provide configuration because otherwise one of the peers would have no way to tell which IP address belongs to the other. Even though the answering peer has your ICE candidates (so it knows how to reach you), if ICE candidates are generated only by the offering side, then the offering side has no way to securely tell that an incoming connection attempt is actually coming from the peer it sent the offer to (although it most likely would be).
And for the bounty question "I would like to know if it's possible to connect peers over TURN when only one peer has the required TURN credentials.", the answer is also no.
To understand why, you need to understand what a TURN server is for: in case you can't establish a direct connection due to firewalls, incompatibilities, etc., it generates "fake" ICE candidates for you to send to your peer. They are "fake" because, in reality, these candidates represent your TURN server. Your TURN server then relays the data sent by your peer to you. At this point, the connection is no longer considered peer-to-peer.
That said, your peer doesn't even know about your TURN server; it sees your TURN-generated candidates like any other candidate. Your peer still has to gather ICE candidates of its own to send to you, and it can't use your TURN server to do that because you never provided its credentials anywhere in the workflow.
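To illustrate that a relay candidate looks like any other candidate to the remote side, here is a small sketch that parses candidate lines. The three sample lines are invented (documentation-range addresses), and the parser only handles the common leading fields.

```typescript
interface ParsedCandidate {
  transport: string;
  address: string;
  port: number;
  type: string; // "host", "srflx" (learned via STUN) or "relay" (allocated on TURN)
}

// Layout: candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
function parseCandidate(line: string): ParsedCandidate {
  const fields = line.trim().split(/\s+/);
  if (fields.length < 8 || fields[6] !== "typ") {
    throw new Error("Unrecognised candidate line: " + line);
  }
  return {
    transport: fields[2],
    address: fields[4],
    port: Number(fields[5]),
    type: fields[7],
  };
}

// Made-up examples of what one peer might send through signalling.
const sampleCandidates: string[] = [
  "candidate:1 1 udp 2130706431 192.168.1.10 54321 typ host",
  "candidate:2 1 udp 1694498815 203.0.113.5 61000 typ srflx raddr 192.168.1.10 rport 54321",
  "candidate:3 1 udp 41885439 198.51.100.7 49152 typ relay raddr 203.0.113.5 rport 61000",
];

const parsedSamples = sampleCandidates.map((line) => parseCandidate(line));
```

The relay entry carries only an address and port on the TURN server. It contains no TURN URL, username or password, so the receiving peer cannot reuse that server to gather candidates of its own.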
I recommend you read the book WebRTC For the Curious if you are interested in this stuff. It's very comprehensive.

Related

WebRTC Peer to Peer without ICE / STUN / TURN but with SSH

Let's say I have access to a server via ssh and my own laptop. However, I wish to use WebRTC to transmit data via a peer connection for its ultra-low latency over UDP. It's also crucial that I have NO communication with a third server.
What I would like to do, is use ssh to get the signalling information for the two machines (offers and answers) to then create a webRTC channel. This way I can say that no information ever went to a third server (including the signalling part).
I understand the requirement is a bit silly, but it's due to how we network things currently, and the important part is the super low latency webRTC provides.
Or is this the wrong tool for the job perhaps? Open to other suggestions :)

Are stun/turn servers expected to be in the webrtc offer?

When creating a WebRTC peer connection, I passed in a list of STUN/TURN ICE servers (no trickle ICE). But when establishing the connection I only saw local IP addresses in the ICE candidates in the offer that was created. The same was true for the answer received. The connection worked sometimes. Was that expected? Where could I check whether that was wrong?
It turned out that TURN server URLs do exist in the SDP offer and answer. The issue in my case was that I added the ICE servers after creating the peer connection, so they were not added to the offer.
No, the STUN/TURN ICE URLs will not be present in the offer/answer SDP.
These servers are meant to be used by a peer connection to generate ICE candidates. The peer you would like to connect to has nothing to do with how you gather your candidates, so it does not care about your STUN/TURN configuration. It only cares about your actual ICE candidates and the metadata of the media you're about to send.
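One way to check this yourself is to look at the candidate lines in the SDP. The sketch below uses an abbreviated, made-up SDP (not a capture from a real session) and two hypothetical helper functions.

```typescript
// Abbreviated, invented SDP for illustration only.
const sampleSdp: string = [
  "v=0",
  "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=application 9 UDP/DTLS/SCTP webrtc-datachannel",
  "c=IN IP4 0.0.0.0",
  "a=ice-ufrag:EsAw",
  "a=ice-pwd:P2uYro0UCOQ4zxjKXaWCBui1",
  "a=candidate:1 1 udp 2130706431 192.168.1.10 54321 typ host",
  "a=candidate:3 1 udp 41885439 198.51.100.7 49152 typ relay raddr 203.0.113.5 rport 61000",
  "",
].join("\r\n");

// Return only the ICE candidate attribute lines of an SDP blob.
function candidateLines(sdp: string): string[] {
  return sdp.split(/\r?\n/).filter((line) => line.indexOf("a=candidate:") === 0);
}

// True if the text contains something that looks like a stun:/turn: server URL.
function mentionsIceServerUrl(text: string): boolean {
  return /\b(stun|turn)s?:/.test(text);
}
```

If the only candidates you see are `typ host`, the STUN/TURN servers were probably not used during gathering. In a browser the usual cause is that the servers were not part of the configuration passed to the `RTCPeerConnection` constructor before the offer was created.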
If you are interested enough, I recommend you take a look at WebRTC for the curious, it covers this topic in detail yet it is quite comprehensible.

why does WebRTC require both browsers to generate connection info?

So I am looking into building a game using WebRTC, mostly just to learn how to use WebRTC more than anything. What I envisioned in my head was one browser (let's call it Alice) wants to start a game. They figure out their connection information and then send that info to another browser (Bob) who they want to join their game. I like the idea of a link similar to a Discord invite.
What I had imagined, was that this was all that was required. Bob's browser knows where Alice is, and Alice is expecting a connection from someone who knows about their connection information (their SDP). Instead what is required is that Bob needs to generate his own connection information (his SDP) and then hand that back to Alice somehow. (For reference, here is an implementation of a "serverless" WebRTC client, which requires both parties to pass their connection info to the other person https://github.com/lesmana/webrtc-without-signaling-server)
Because there are two required messages, telling users to do this manually is very much a pain, and gets increasingly difficult with more users (e.g. Alice, Bob and Charlie want to connect). For this reason we have "signaling servers" which handle this handshaking.
My question is: why is all of this necessary? Is it for security? Couldn't you consider a browser secure enough if its SDP info included a generated hash that only those it expects (like Bob) have access to?
Don't confuse connection info (ice candidates) with SDP.
What are ICE Candidates and how do the peer connection choose between them?
If you are asking specifically about web browsers, then yes, you have to collect connection info (which has nothing to do with SDP) from each browser. This is because browsers do not listen on a specific, well-known port that is also open in firewalls. So it's not as if one browser could just connect to another one using a well-known endpoint (IP:port).
The idea is that a STUN server helps drill a hole in both firewalls and thus makes a direct connection between the browsers possible. Read the STUN spec to see how this is done.
However, if one peer is a browser and the other peer is your own application that listens on a specific port (a WebRTC gateway or media server), then you don't need to collect connection info (ICE candidates) from the browser. Nobody needs it, and STUN/TURN servers are not involved. The browser always connects to your application. You can hardcode an ICE candidate in your web page that contains the endpoint exposed by your application.
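As a rough sketch of what such a hardcoded candidate could look like, the helper below builds a host candidate string for a fixed endpoint. The address and port are invented, and the priority uses the recommended formula from the ICE specification (RFC 8445) with the commonly used host type preference of 126.

```typescript
// priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)
function icePriority(typePreference: number, localPreference: number, componentId: number): number {
  return 16777216 * typePreference + 256 * localPreference + (256 - componentId);
}

// Build a UDP host candidate for an application listening on a known address and port.
function buildHostCandidate(address: string, port: number): string {
  const foundation = 1; // arbitrary, since there is only one hardcoded candidate
  const componentId = 1; // single component (RTP and RTCP multiplexed, or a data channel)
  const priority = icePriority(126, 65535, componentId);
  return (
    "candidate:" + foundation + " " + componentId + " udp " + priority +
    " " + address + " " + port + " typ host"
  );
}

// Hypothetical public endpoint of your own gateway.
const gatewayCandidate: string = buildHostCandidate("203.0.113.20", 40000);
```

In the page, a string like this would typically be handed to `addIceCandidate` on the peer connection (together with the matching `sdpMid` or `sdpMLineIndex`) after the remote description is set. The exact values depend on the SDP your application produces.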
You always have to exchange SDPs between two peers, because they carry codecs information and other info about media stream, that another peer needs to know about. Browsers need to agree that they can decode the incoming stream, for example.

Difference between STUN/TURN(coTURN) servers and Signaling servers (written with socket.io/websocket) in WebRTC?

I am building a video teaching site. I did some research and have a good understanding of most of it, except for this one thing. When a user wants to connect to another user P2P, I need a signaling server to get their public IPs and get them connected. Now STUN is doing that job, and TURN will relay the media if the peers cannot connect. If I write a signaling server with WebSocket to communicate the SDP messages and have ICE working, do I need coTURN installed? What exactly will be the job of each of them?
Where exactly I am confused is the work of my simply written WebSocket Signaling server (from what I saw in different tutorials) and the work of the coTURN server I'll install. And how to connect them with the media server I'll install.
A second question: is there a way to use P2P when there are only two or three participants, and get the media servers involved if there are more than that, so that I don't use up the participants' bandwidth too much?
The signaling server is required to exchange messages between peers (SDP packets) until they have established a P2P connection.
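Stripped of the transport, a signaling server is little more than a message router. The following in-memory sketch shows that idea only; the class, the message shape and the peer names are all hypothetical, and a real server would deliver the messages over WebSocket or similar.

```typescript
type SignalKind = "offer" | "answer" | "candidate";

interface SignalMessage {
  from: string;
  to: string;
  kind: SignalKind;
  payload: string; // SDP text or a candidate line, opaque to the router
}

type SignalHandler = (msg: SignalMessage) => void;

class SignalingRoom {
  private handlers: { [peerId: string]: SignalHandler } = {};

  // Register a peer and the callback that receives messages addressed to it.
  join(peerId: string, onMessage: SignalHandler): void {
    this.handlers[peerId] = onMessage;
  }

  // Forward a message to its addressee; returns false if that peer is not connected.
  send(msg: SignalMessage): boolean {
    const handler = this.handlers[msg.to];
    if (!handler) {
      return false;
    }
    handler(msg);
    return true;
  }
}

// Usage: Alice's offer is routed to Bob; nobody named Carol has joined.
const room = new SignalingRoom();
const bobInbox: SignalMessage[] = [];
room.join("bob", (msg) => {
  bobInbox.push(msg);
});
const deliveredToBob = room.send({ from: "alice", to: "bob", kind: "offer", payload: "<offer SDP>" });
const deliveredToCarol = room.send({ from: "alice", to: "carol", kind: "offer", payload: "<offer SDP>" });
```

Note that the router never inspects the payload and never touches media. Once the peers are connected it has nothing left to do for that session.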
A STUN server is there to help a peer discover information about its public IP and to open up firewall ports. The main problem this is solving is that a lot of devices are behind NAT routers within small private networks; NAT basically allows outgoing requests and their response, but blocks any other "unsolicited" incoming requests. You therefore have a Catch-22 scenario when both peers are behind a NAT router and could make an outgoing request, but have nowhere to send it to since the opposite peer doesn't expose anything to make a request to. STUN servers act as a temporary middleman to make requests to, which opens a port on the NAT device to allow the response to come back, which means there's now a known open port the other peer can use. It's a form of hole-punching.
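The Catch-22 and the way STUN breaks it can be shown with a toy model. This is a deliberately simplified sketch: it assumes a NAT that forwards any inbound packet once a port mapping exists, whereas stricter NATs also filter by remote address. All names and addresses are invented.

```typescript
interface Endpoint {
  ip: string;
  port: number;
}

class ToyNat {
  private nextPort = 40000;
  // public port -> internal endpoint that opened it
  private mappings: { [publicPort: number]: Endpoint } = {};

  constructor(private publicIp: string) {}

  // Outgoing packet: create (or reuse) a mapping and return the public source address.
  outbound(internal: Endpoint): Endpoint {
    for (const key of Object.keys(this.mappings)) {
      const mapped = this.mappings[Number(key)];
      if (mapped.ip === internal.ip && mapped.port === internal.port) {
        return { ip: this.publicIp, port: Number(key) };
      }
    }
    const publicPort = this.nextPort++;
    this.mappings[publicPort] = internal;
    return { ip: this.publicIp, port: publicPort };
  }

  // Incoming packet: delivered only if some earlier outgoing packet opened that port.
  inbound(publicPort: number): Endpoint | null {
    return this.mappings[publicPort] || null;
  }
}

// A STUN server simply reports back the source address it observed.
function stunBindingResponse(observedSource: Endpoint): Endpoint {
  return observedSource;
}

const nat = new ToyNat("203.0.113.5");
const laptop: Endpoint = { ip: "192.168.1.10", port: 54321 };

const beforeStun = nat.inbound(40000); // unsolicited packet: dropped
const reflexive = stunBindingResponse(nat.outbound(laptop)); // laptop asks STUN "who am I?"
const afterStun = nat.inbound(reflexive.port); // the mapping now exists
```

The address in `reflexive` is what ends up in a server-reflexive ICE candidate and gets sent to the other peer through signaling.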
A TURN server is a relay in a publicly accessible location, in case a P2P connection is impossible. There are still cases where hole-punching is unsuccessful, e.g. due to more restrictive firewalls. In those cases the two peers simply cannot talk 1-on-1 directly, and all their traffic is relayed through a TURN server. That's a 3rd party server that both peers can connect to unrestrictedly and that simply forwards data from one peer to the other. One popular implementation of a TURN server is coturn.
Yes, basically all those functions could be fulfilled by a single server, but they’re deliberately separated. The WebRTC specification has absolutely nothing to say about signaling servers, since the signaling mechanism is very unique to each application and could take many different forms. TURN is very bandwidth intensive and must usually be delegated to a larger server farm if you’re hoping to scale at all, so is impractical to mix in with any of the other two functions. So you end up with three separate components.
Regarding multi-peer connections: yes, you can set up a P2P group chat just fine. However, each peer will need to be connected to every other peer, so the number of connections and bandwidth per peer increases with each new peer. That’s probably going to work okay for 3 or 4 peers, but beyond that you may start to run into bandwidth and CPU limits of individual peers, especially if you’re doing decent quality video streaming.
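The growth is easy to put into numbers. In a full mesh of n peers, each peer holds n - 1 connections and there are n(n - 1)/2 connections overall. The sketch below also estimates upload per peer; the bitrate you pass in is your own assumption, not a WebRTC constant.

```typescript
interface MeshStats {
  connectionsPerPeer: number;
  totalConnections: number;
  uploadKbpsPerPeer: number;
}

// Full-mesh cost for a given number of peers and an assumed per-stream upload bitrate.
function meshStats(peers: number, uplinkKbpsPerStream: number): MeshStats {
  const connectionsPerPeer = Math.max(peers - 1, 0);
  return {
    connectionsPerPeer: connectionsPerPeer,
    totalConnections: (peers * connectionsPerPeer) / 2,
    uploadKbpsPerPeer: connectionsPerPeer * uplinkKbpsPerStream,
  };
}

// Example: 4 peers, each sending an assumed 1500 kbps video stream to every other peer.
const fourPeerMesh = meshStats(4, 1500);
```

With those assumed numbers each of the four peers uploads 4500 kbps across 3 connections, which is about where home uplinks start to struggle.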

How coturn works?

I was reading the TURN RFC (RFC 5766). Figure 1 there shows the following entities:
Turn server
Turn client
Peer A
Peer B
My understanding is that Peer A and Peer B are the WebRTC clients who want to communicate with each other from behind a symmetric NAT. They use the TURN server to communicate.
But I don't know what the TURN client is for here. I also did not understand the purpose of the Send indication and the Data indication. The RFC says the data is transferred using the allocated relayed address, but that address does not appear to be used yet, and still the data transfer succeeds. I don't understand how it succeeds. Please explain the flow of the coturn 4.5.0.3 TURN server.