Every time I set up WebRTC video call clients, it never works unless I specify a TURN server. No matter how many STUN servers I supply, it always falls back onto TURN. It could be that the people I have tested with all coincidentally happened to be behind symmetric NAT. The only time it doesn't fall back to TURN is when I test locally on my own network. Are STUN servers just infrequently or rarely used? Or are they used more often and my experience just happens to be anomalous?
STUN servers get used very sparingly, during session setup, to help WebRTC endpoints behind NATs discover their public IP addresses. STUN services put a very small load on their server machines. They're similar to the "what's my ip?" websites on the internet.
TURN servers, when needed, relay the media data from endpoint to endpoint. All the video, audio, and media streams go up to a TURN server and then back down to a recipient. The TURN server load is higher. TURN service is only needed when endpoints cannot reach each other via direct peer-to-peer connections.
STUN isn't a substitute for TURN.
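To check which path a call actually took, you can inspect the ICE candidates the browser gathers (in the `onicecandidate` handler, or in `chrome://webrtc-internals`). The `typ` field of a candidate tells you whether it is direct (`host`), STUN-derived (`srflx`), or TURN-relayed (`relay`). A minimal sketch of reading that field from a raw candidate string (the example candidate lines below are made up for illustration):

```javascript
// Extract the candidate type ("host", "srflx", "prflx", or "relay")
// from a raw ICE candidate line, per the ICE SDP candidate grammar.
function candidateType(candidateLine) {
  const match = /\btyp\s+(\S+)/.exec(candidateLine);
  return match ? match[1] : null;
}

// Example candidate strings shaped like what a browser emits:
const host = "candidate:1 1 udp 2122260223 192.168.0.101 54321 typ host";
const srflx = "candidate:2 1 udp 1686052607 203.0.113.7 54321 typ srflx raddr 192.168.0.101 rport 54321";
const relay = "candidate:3 1 udp 41885439 198.51.100.4 3478 typ relay raddr 203.0.113.7 rport 54321";

console.log(candidateType(host));  // "host"
console.log(candidateType(srflx)); // "srflx" — STUN discovery worked
console.log(candidateType(relay)); // "relay" — traffic goes through TURN
```

If the selected candidate pair is always `relay`, then STUN responses are arriving but not producing a usable path, which is exactly what symmetric NAT (or blocked UDP) looks like.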
If I am building a WebRTC app and using a Selective Forwarding Unit media server, does this mean that I will have no need for STUN / TURN servers?
From what I understand, STUN servers are used for clients to discover their public IP / port, and TURN servers are used to relay data between clients when they are unable to connect directly to each other via STUN.
My question is, if I deploy my SFU media server with a public address, does this eliminate the need for STUN and TURN servers? Since data will always be relayed through the SFU and the clients / peers will never actually talk to each other directly?
However, I noticed that the installation guide for Kurento (a popular media server with SFU functionality) contains a section about configuring STUN or TURN servers. Why would STUN or TURN servers be necessary?
You should still use a TURN server when running an SFU. Diving into ICE a little will help you understand why. All SFUs work a little differently, but this is true for most.
For each PeerConnection, the SFU will listen on a random UDP (and sometimes TCP) port.
This IP/port combination is given to each peer, who then attempts to contact the SFU.
The SFU then checks whether incoming packets contain a valid hash (an HMAC keyed by the ICE password, ice-pwd). This ensures no attacker can connect to this port.
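That hash check is STUN's MESSAGE-INTEGRITY attribute: an HMAC-SHA1 over the message, keyed by the ice-pwd exchanged in the SDP. A simplified sketch (a real STUN implementation also adjusts the message length field and verifies FINGERPRINT and USERNAME before this step):

```javascript
const crypto = require("crypto");

// Simplified sketch of the STUN MESSAGE-INTEGRITY check an SFU performs:
// HMAC-SHA1 over the message bytes, keyed by the ICE password (ice-pwd).
// Real STUN processing also fixes up the length field and checks
// FINGERPRINT/USERNAME first; this only shows the core idea.
function messageIntegrity(messageBytes, icePwd) {
  return crypto.createHmac("sha1", icePwd).update(messageBytes).digest();
}

function isFromKnownPeer(messageBytes, receivedMac, icePwd) {
  const expected = messageIntegrity(messageBytes, icePwd);
  return (
    expected.length === receivedMac.length &&
    crypto.timingSafeEqual(expected, receivedMac)
  );
}

// A packet signed with the wrong password is rejected:
const msg = Buffer.from("example STUN message body");
const goodMac = messageIntegrity(msg, "correct-ice-pwd");
console.log(isFromKnownPeer(msg, goodMac, "correct-ice-pwd")); // true
console.log(isFromKnownPeer(msg, goodMac, "attacker-guess"));  // false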
A TURN server works differently:
It provides a single allocation port that peers can connect to, over UDP, DTLS, TCP, or TLS. You need a valid username/password.
Once authenticated, you send packets via this connection and the TURN server relays them for you.
The TURN server will then listen on a random port so that others can send traffic back to the peer.
So a TURN server has a few nice properties that an SFU doesn't:
You only have to listen on a single public port. If the service you are communicating with isn't on the internet, you can have your clients connect only to the allocation.
You can also make your service available via UDP, DTLS, TCP and TLS. Most ICE implementations only support UDP.
These two factors are really important in government/hospital situations. Some networks only allow TLS traffic over port 443, so a TURN server is your only solution (you run your allocation on TLS port 443).
So you need to design your system around your needs. But IMO you should always run a well-configured TURN server in real-world environments.
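When you do deploy a TURN server, you can verify it actually works by forcing the browser to use only relay candidates via `iceTransportPolicy: "relay"`. A sketch of the configuration (the server URL and credentials are placeholders, not a real service):

```javascript
// RTCPeerConnection configuration that forces all traffic through TURN.
// The URL and credentials below are placeholders — substitute your own.
const config = {
  iceServers: [
    {
      urls: ["turns:turn.example.com:443?transport=tcp"],
      username: "demo-user",       // placeholder credential
      credential: "demo-password", // placeholder credential
    },
  ],
  // "relay" discards host and srflx candidates, so the call only
  // connects if the TURN server is reachable and correctly configured.
  iceTransportPolicy: "relay",
};

console.log(config.iceTransportPolicy); // "relay"
```

In a browser you would pass this to `new RTCPeerConnection(config)`; if calls still connect under this policy, your TURN path (including TLS on 443 for locked-down networks) is healthy.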
I'm passing a handful of STUN and TURN servers for my WebRTC application (built on top of mediasoup). When I do this, I get a message in the console telling me: "Using more than two STUN/TURN servers slows down discovery"
I can cut down the servers to 2... but... why does more hurt? Wouldn't I want the most options available to make a connection?
Because of how ICE works: initially the browser looks at which IP-based network interfaces it can find on your machine.
Then it takes the number of network interfaces and combines it with every single STUN server and every single TURN server the service provides. Next it needs to send a request to every one of these combinations and keep track of the responses. In the case of STUN it's just a single response; in the case of TURN it is usually multiple round trips. So the more STUN and TURN servers you have, the longer it takes to get answers from all of them.
Now if STUN server #1 reports your external IP address as A, with port 1, most likely STUN server #2 is also going to tell you it sees A as your external IP address, the only difference being that it reports port 2. In almost all cases all the STUN servers are going to report back the exact same external IP A. But reporting the same external IP address over and over again to the ICE agent on the other end does not increase the chances of establishing the connection at all, since its requests are all going to hit the same router/NAT/firewall.
With TURN servers one can argue that more TURN servers could help more, as each TURN server should give the browser a different IP address for relaying. But it would be highly unusual if a given browser were able to reach one TURN server but not another.
In the end all of these servers result in the browser emitting more ICE candidates, and so more ICE candidates need to get sent to the other browser or ICE agent. As a result the ICE checking table, which tries all possible pairings of local and remote ICE candidates, grows bigger with each ICE candidate. And so it produces a lot more network traffic, while barely increasing the chances of establishing a connection.
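The growth is roughly multiplicative. A back-of-the-envelope sketch (real agents are more nuanced, e.g. they also gather TCP candidates and prune redundant ones):

```javascript
// Rough count of ICE candidates one side gathers: one host candidate per
// interface, one srflx per (interface × STUN server), and one relay per
// (interface × TURN server). Simplified — real agents also gather TCP
// candidates and prune duplicates.
function localCandidates(interfaces, stunServers, turnServers) {
  return interfaces * (1 + stunServers + turnServers);
}

// The ICE check list pairs every local candidate with every remote one.
function candidatePairs(local, remote) {
  return local * remote;
}

const small = localCandidates(2, 2, 0); // 2 ifaces, 2 STUN → 6 candidates
const big = localCandidates(2, 4, 2);   // 2 ifaces, 4 STUN, 2 TURN → 14

console.log(candidatePairs(small, small)); // 36 pairs to check
console.log(candidatePairs(big, big));     // 196 pairs to check
```

Tripling the server list here more than quintuples the number of connectivity checks, which is the slowdown the console warning is about.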
I developed a P2P WebRTC application in which users can video call from a web application to a mobile application. The web application will run on kiosks, and the kiosks will be behind a firewall. Is a TURN server required, and will all of the data be relayed by the TURN server?
It doesn't matter where your application is hosted; what matters is whether the clients on the ends of their calls are behind firewalls or other restrictions that prevent P2P WebRTC connections. In that case, the data will have to be relayed through TURN servers.
So, no, not all data will be relayed through TURN servers; it depends on the clients' internet environment. But generally TURN servers are a requirement for production applications, since there will be a significant number of situations (about 20% to 30% on average, based on my own experience) where a TURN server is needed for WebRTC to connect.
All kiosks will be behind the firewall, so if all traffic goes through the TURN server that means high hardware costs. All of the calls will be from kiosks. 10-20% of the traffic passing through would be fine, but at 100% the cost will increase a lot.
I am building this video teaching site and did some research and got a good understanding, except for this one thing. So when a user wants to connect to another user, P2P, I need a signaling server to get their public IP and get them connected. Now STUN is doing that job and TURN will relay the media if the peers cannot connect. Now if I write a signaling server with WebSocket to communicate the SDP messages and have ICE working, do I need coTURN installed? What exactly will be the job of each of them?
Where exactly I am confused is the division of work between my simply written WebSocket signaling server (from what I saw in different tutorials) and the coTURN server I'll install, and how to connect them with the media server I'll install.
A second question: is there a way to use P2P when there are only two or three participants, and get the media servers involved if there are more than that, so that I don't use up the participants' bandwidth too much?
The signaling server is required to exchange messages between peers (SDP packets) until they have established a P2P connection.
A STUN server is there to help a peer discover information about its public IP and to open up firewall ports. The main problem this is solving is that a lot of devices are behind NAT routers within small private networks; NAT basically allows outgoing requests and their response, but blocks any other "unsolicited" incoming requests. You therefore have a Catch-22 scenario when both peers are behind a NAT router and could make an outgoing request, but have nowhere to send it to since the opposite peer doesn't expose anything to make a request to. STUN servers act as a temporary middleman to make requests to, which opens a port on the NAT device to allow the response to come back, which means there's now a known open port the other peer can use. It's a form of hole-punching.
A TURN server is a relay in a publicly accessible location, in case a P2P connection is impossible. There are still cases where hole-punching is unsuccessful, e.g. due to more restrictive firewalls. In those cases the two peers simply cannot talk 1-on-1 directly, and all their traffic is relayed through a TURN server. That's a 3rd party server that both peers can connect to unrestrictedly and that simply forwards data from one peer to the other. One popular implementation of a TURN server is coturn.
Yes, basically all those functions could be fulfilled by a single server, but they're deliberately separated. The WebRTC specification has absolutely nothing to say about signaling servers, since the signaling mechanism is very unique to each application and could take many different forms. TURN is very bandwidth intensive and must usually be delegated to a larger server farm if you're hoping to scale at all, so it is impractical to mix in with either of the other two functions. So you end up with three separate components.
Regarding multi-peer connections: yes, you can set up a P2P group chat just fine. However, each peer will need to be connected to every other peer, so the number of connections and bandwidth per peer increases with each new peer. That’s probably going to work okay for 3 or 4 peers, but beyond that you may start to run into bandwidth and CPU limits of individual peers, especially if you’re doing decent quality video streaming.
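The mesh arithmetic above can be sketched directly:

```javascript
// In a full-mesh call, every peer connects to every other peer:
// n·(n-1)/2 connections overall, and each peer uploads its own
// stream n-1 times (once per remote peer).
function meshConnections(n) {
  return (n * (n - 1)) / 2;
}
function uplinksPerPeer(n) {
  return n - 1;
}

for (const n of [3, 4, 8]) {
  console.log(`${n} peers: ${meshConnections(n)} connections, ` +
              `${uplinksPerPeer(n)} uploads per peer`);
}
// 3 peers: 3 connections, 2 uploads per peer
// 4 peers: 6 connections, 3 uploads per peer
// 8 peers: 28 connections, 7 uploads per peer
```

At 8 peers each participant is encoding and uploading its video 7 times, which is where an SFU (one upload per peer, fan-out on the server) starts to pay for itself.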
WebRTC signalling is driving me crazy. My use-case is quite simple: a bidirectional audio intercom between a kiosk and a control room webapp. Both computers are on the same network. Neither has internet access, and all machines have known static IPs.
Everything I read wants me to use STUN/TURN/ICE servers. The acronyms alone are endless, contributing to my migraine, but if this were a standard application I'd just open a port, tell the other client about it (I can do this via the webapp if I need to) and have the other client connect.
Can I do this with WebRTC? Without running a dozen signalling servers?
For the sake of examples, how would you connect a browser running on 192.168.0.101 to one running on 192.168.0.102?
STUN/TURN is different from signaling.
STUN/TURN in WebRTC are used to gather ICE candidates. Signaling is used to transmit between these two PCs the session description (offer and answer).
You can use a free STUN server (like stun.l.google.com or stun.services.mozilla.org). There are also free TURN servers, but not many (these are resource-expensive). One is numb.viagenie.ca.
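A typical server list passed to the peer connection looks like the sketch below; the TURN entry's host and credentials are placeholders you would replace with your own provider's details:

```javascript
// Typical RTCPeerConnection server list. STUN entries need no
// credentials; TURN entries do. The TURN URL and credentials here
// are placeholders, not a real service.
const iceServers = [
  { urls: "stun:stun.l.google.com:19302" },
  {
    urls: "turn:turn.example.com:3478",
    username: "user",     // placeholder
    credential: "secret", // placeholder
  },
];

console.log(iceServers.length); // 2 — a short list also keeps ICE gathering fast
```

In the browser this goes into `new RTCPeerConnection({ iceServers })`.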
Now, there's no standard signaling server, because these are custom and can be done in many ways. Here's an article that I wrote. I ended up using STOMP on the client side and Spring on the server side.
I guess you can tamper with the SDP and inject the ICE candidates statically, but you'll still need to exchange the SDP (and that's dynamically generated each session) between these two PCs somehow. Even so, taking into account that the configuration will not change, I guess you could exchange it once (through the means of copy-paste :) ), store it somewhere and use it every time.
If your end-points have static IPs then you can ignore STUN, TURN and ICE, which are just power-tools to drill holes in firewalls. Most people aren't that lucky.
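On a private network with known static IPs, host candidates alone are enough, so the server list can simply be left empty (a sketch of that configuration):

```javascript
// On a LAN with static IPs, host candidates suffice: no STUN or TURN
// entries at all. ICE still runs, but only over direct local candidates.
const lanConfig = { iceServers: [] };

// In the browser you would then do: new RTCPeerConnection(lanConfig)
// and exchange the SDP offer/answer over your own channel
// (a WebSocket server on the LAN, or even copy-paste).
console.log(lanConfig.iceServers.length); // 0
```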
Due to how WebRTC is structured, end-points do need a way to exchange call setup information (SDP) like media ports and key information ahead of time. How you get that information from A to B and back to A, is entirely up to you ("signaling server" is just a fancy word for this), but most people use something like a web socket server, the tic-tac-toe of client-initiated communication.
I think the simplest way to make this work on a private network without an internet connection is to install a basic web socket server on one of the machines.
As an example I recommend the very simple https://github.com/emannion/webrtc-web-socket which worked on my private network without an internet connection.
Follow the instructions to install the web socket server on e.g. 192.168.0.101, then have both end-points connect to 192.168.0.101:1337 with Chrome or Firefox. Share the camera on both ends in the basic demo web UI, hit Connect, and you should be good to go.
If you need to do this entirely without any server, then this answer to a related question at least highlights the information you'd need to send across (in a cut'n'paste demo).