When creating a WebRTC PeerConnection can I skip the ICE (STUN/TURN) discovery process? - webrtc

In my setup, I have a custom server in the cloud handling audio and video, so I don't need (and don't want) the whole "where am I and what's my private and public address etc." discovery process. Essentially I want the SDP offer and don't care about the IP address/port; that offer goes to the server, the server chooses codecs and gets the SRTP key and replies with an SDP answer to the browser which contains a public address, the codec choice and its key. Ideally the browser starts sending media to the server and the server simply sends "peer" media back from whence it came (which would tunnel back through any UDP-friendly NAT devices).
I know this is technically possible because I already do this with Win32/OSX desktop clients... the question is, is this possible with WebRTC and RTCPeerConnection? I've tried a few configuration types, e.g. {} and { "iceServers": [] } but it still seems to go through discovery gyrations. Are there perhaps other ways to shortcut the process? Thanks!
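For reference, this is roughly what I've been trying; a minimal sketch showing that even with an empty iceServers list, the browser still gathers host candidates and goes through the gathering process (run in an async context):

```javascript
// Sketch: no STUN/TURN configured -- the browser still runs full ICE,
// it just has only host (local interface) candidates to work with.
const pc = new RTCPeerConnection({ iceServers: [] });
pc.addTransceiver('audio');

pc.onicecandidate = (event) => {
  // With no STUN/TURN, every candidate printed here is of type "host".
  if (event.candidate) console.log(event.candidate.candidate);
};

const offer = await pc.createOffer();
await pc.setLocalDescription(offer); // kicks off candidate gathering
```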

No, you cannot skip the process: the WebRTC implementation mandates ICE and its STUN connectivity checks to address security concerns (a browser should never send media to an address that hasn't proven it wants to receive it). So the current Chrome implementation will insist that the STUN checks against the IP/ports negotiated in the ICE candidates succeed before media flows.
True, there are many applications working without this requirement, but at some point we have to move to better and more secure implementations. That day is now...

No, you can't skip it in WebRTC browsers, but WebRTC devices (here, your gateway) can simplify the process by implementing only ICE Lite.
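For context: an ICE-Lite endpoint never initiates connectivity checks, it only answers them, and it advertises this with the a=ice-lite attribute in its SDP. A small sketch of detecting that on the browser side (the function name is hypothetical):

```javascript
// Sketch: detect whether the remote gateway is an ICE-Lite endpoint.
// An ICE-Lite implementation carries "a=ice-lite" in its SDP answer.
async function applyAnswer(pc, answerSdp) {
  const isIceLite = answerSdp.includes('a=ice-lite');
  console.log(isIceLite ? 'Gateway is ICE Lite' : 'Gateway runs full ICE');
  await pc.setRemoteDescription({ type: 'answer', sdp: answerSdp });
}
```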

Related

How to do signaling in WebRTC without using WebSocket or http or mail

As far as I have searched, all WebRTC handshakes are done through some signaling channel (HTTP, WebSocket, etc.), even through mail or WhatsApp.
But I expect to connect without using any of them. Is there any way to achieve this?
If yes, please give me a brief solution. Thank you!
WebRTC (as implemented in browsers) requires signalling. You can choose a different medium for signalling such as a piece of paper or calling someone on the phone, but it needs signalling.
If you want to be able to initiate unsolicited connections you need to use TCP or UDP directly which (if we ignore NAT and firewalls) don't have that restriction.
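To illustrate how low-tech the signalling medium can be, here's a hypothetical sketch where the SDP is relayed by literally copying and pasting it between two browsers (run in an async context):

```javascript
// Sketch: "signalling" by copy-paste. Run on the offering side, then
// hand the printed SDP to the answering side by any means you like.
const pc = new RTCPeerConnection();
pc.addTransceiver('audio');

// Wait for ICE gathering to finish so the SDP contains all candidates
// (avoids needing a second round-trip for trickled candidates).
pc.onicegatheringstatechange = () => {
  if (pc.iceGatheringState === 'complete') {
    console.log('Copy this offer to the other peer:');
    console.log(JSON.stringify(pc.localDescription));
  }
};

const offer = await pc.createOffer();
await pc.setLocalDescription(offer);

// On the other side you would paste it into something like:
// await pc.setRemoteDescription(JSON.parse(pastedOffer));
```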

Difference between STUN/TURN(coTURN) servers and Signaling servers (written with socket.io/websocket) in WebRTC?

I am building this video teaching site and did some research and got a good understanding, except for this one thing. So when a user wants to connect to another user, P2P, I need a signaling server to get their public IPs so they can connect. Now STUN is doing that job, and TURN will relay the media if the peers cannot connect. Now if I write a signaling server with WebSocket to communicate the SDP messages and have ICE working, do I need coTURN installed? What will be the job of each of them, particularly?
Where exactly I am confused is the role of my simply written WebSocket signaling server (from what I saw in different tutorials) versus the coTURN server I'll install, and how to connect them with the media server I'll install.
A second question: is there a way to use P2P when there are only two/three participants, and get the media servers involved if there are more than that, so that I don't use up the participants' bandwidth too much?
The signaling server is required to exchange messages between peers (SDP packets) until they have established a P2P connection.
A STUN server is there to help a peer discover information about its public IP and to open up firewall ports. The main problem this is solving is that a lot of devices are behind NAT routers within small private networks; NAT basically allows outgoing requests and their response, but blocks any other "unsolicited" incoming requests. You therefore have a Catch-22 scenario when both peers are behind a NAT router and could make an outgoing request, but have nowhere to send it to since the opposite peer doesn't expose anything to make a request to. STUN servers act as a temporary middleman to make requests to, which opens a port on the NAT device to allow the response to come back, which means there's now a known open port the other peer can use. It's a form of hole-punching.
A TURN server is a relay in a publicly accessible location, in case a P2P connection is impossible. There are still cases where hole-punching is unsuccessful, e.g. due to more restrictive firewalls. In those cases the two peers simply cannot talk 1-on-1 directly, and all their traffic is relayed through a TURN server. That's a 3rd party server that both peers can connect to unrestrictedly and that simply forwards data from one peer to the other. One popular implementation of a TURN server is coturn.
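In browser code, STUN and TURN are simply entries in the RTCPeerConnection configuration; a sketch, where the hostnames and credentials are placeholders for your own deployment (e.g. your coturn instance):

```javascript
// Sketch: placeholder STUN/TURN configuration.
// Replace the URLs and credentials with your own servers (e.g. coturn).
const pc = new RTCPeerConnection({
  iceServers: [
    { urls: 'stun:stun.example.org:3478' },
    {
      urls: 'turn:turn.example.org:3478',
      username: 'user',      // e.g. a coturn long-term credential
      credential: 'secret',
    },
  ],
});
```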
Yes, basically all those functions could be fulfilled by a single server, but they're deliberately separated. The WebRTC specification has absolutely nothing to say about signaling servers, since the signaling mechanism is unique to each application and could take many different forms. TURN is very bandwidth-intensive and must usually be delegated to a larger server farm if you're hoping to scale at all, so it's impractical to mix in with either of the other two functions. So you end up with three separate components.
Regarding multi-peer connections: yes, you can set up a P2P group chat just fine. However, each peer will need to be connected to every other peer, so the number of connections and bandwidth per peer increases with each new peer. That’s probably going to work okay for 3 or 4 peers, but beyond that you may start to run into bandwidth and CPU limits of individual peers, especially if you’re doing decent quality video streaming.
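For the multi-peer (mesh) case, a common pattern is simply one RTCPeerConnection per remote participant; a hypothetical sketch, where sendSignal stands in for your own signaling channel:

```javascript
// Sketch: full-mesh bookkeeping -- one RTCPeerConnection per remote peer.
// `sendSignal` is a placeholder for your own signaling transport.
const peers = new Map(); // peerId -> RTCPeerConnection

function getOrCreatePeer(peerId, localStream, sendSignal) {
  if (peers.has(peerId)) return peers.get(peerId);

  const pc = new RTCPeerConnection();
  localStream.getTracks().forEach((t) => pc.addTrack(t, localStream));
  pc.onicecandidate = (e) => {
    if (e.candidate) sendSignal(peerId, { candidate: e.candidate });
  };
  peers.set(peerId, pc);
  return pc;
}
// Note: with N participants each peer uploads its media N-1 times,
// which is why mesh stops scaling at around 4 participants.
```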

With WebRTC, is it possible to connect successfully every time without a TURN server?

These days, I'm really into webRTC technology, and I've been studying webRTC. But, I'm faced with a problem.
I understand that WebRTC uses the ICE framework, which has TURN and STUN servers for relaying and signaling. But as this article said, WebRTC doesn't need a TURN server.
So I'm really curious whether it is possible to connect successfully every time without a TURN server?
If it is, please tell me the way, and if it isn't, how often do peers use the TURN server on average?
Thank you.
(PS: Azar (one of the biggest apps using WebRTC) also says on their website that they don't use a TURN server.)
Yes it's possible to connect without a TURN server. Every time? Yes. Everyone? No. Because firewalls.
The Holy Grail of WebRTC is a direct client-to-client network connection without going through an intermediary server (a relay).
TURN is an intermediary server. It's used as a fallback when peers are behind symmetric NATs.
Negotiating this is the purpose of ICE. There are articles written on how, but in short, "ICE agents" (browsers) collaborate on both ends, communicating through your JS signaling channel, to poke holes from inside the firewall on each end to connect up.
This related answer suggests TURN usage is ~20%.
STUN is not a relay, but merely a mirror server for agents to learn their own external IPs.
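You can actually observe which path was negotiated; a sketch using getStats() to inspect the succeeded candidate pair, where a local candidateType of "relay" means the call is going through TURN (vs. "host" or "srflx" via STUN):

```javascript
// Sketch: inspect the negotiated ICE path after the connection is up.
async function logSelectedPath(pc) {
  const stats = await pc.getStats();
  stats.forEach((report) => {
    if (report.type === 'candidate-pair' && report.state === 'succeeded') {
      const local = stats.get(report.localCandidateId);
      if (local) {
        // "host" = direct, "srflx" = hole-punched via STUN, "relay" = TURN
        console.log('local candidate type:', local.candidateType);
      }
    }
  });
}
```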

Can I simplify WebRTC signalling for computers on the same private network?

WebRTC signalling is driving me crazy. My use-case is quite simple: a bidirectional audio intercom between a kiosk and a control room webapp. Both computers are on the same network. Neither has internet access; all machines have known static IPs.
Everything I read wants me to use STUN/TURN/ICE servers. The acronyms here are endless, contributing to my migraine, but if this were a standard application, I'd just open a port, tell the other client about it (I can do this via the webapp if I need to) and have the other connect.
Can I do this with WebRTC? Without running a dozen signalling servers?
For the sake of examples, how would you connect a browser running on 192.168.0.101 to one running on 192.168.0.102?
STUN/TURN is different from signaling.
STUN/TURN in WebRTC are used to gather ICE candidates. Signaling is used to transmit the session description (offer and answer) between the two PCs.
You can use a free STUN server (like stun.l.google.com or stun.services.mozilla.org). There are also free TURN servers, but not many (they are resource-expensive). One is numb.viagenie.ca.
There's no standard signaling server, because signaling is application-specific and can be done in many ways. Here's an article that I wrote. I ended up using Stomp on the client side and Spring on the server side.
I guess you can tamper with the SDP and inject the ICE candidates statically, but you'll still need to exchange the SDP (which is dynamically generated each session) between these two PCs somehow. Although, given that the configuration will not change, I guess you could exchange it once (by means of copy-paste :) ), store it somewhere and reuse it every time.
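Since both machines have known static IPs, injecting a candidate by hand is at least conceivable; a hypothetical sketch (the foundation, priority and port in the candidate string are placeholders, and the remote description must already be set):

```javascript
// Sketch: hand-built host candidate for a machine with a known static IP.
// The string follows the a=candidate grammar:
// foundation, component, transport, priority, address, port, type.
await pc.addIceCandidate({
  candidate: 'candidate:1 1 udp 2122260223 192.168.0.102 50000 typ host',
  sdpMid: '0',
  sdpMLineIndex: 0,
});
```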
If your end-points have static IPs then you can ignore STUN, TURN and ICE, which are just power-tools to drill holes in firewalls. Most people aren't that lucky.
Due to how WebRTC is structured, end-points do need a way to exchange call setup information (SDP) like media ports and key information ahead of time. How you get that information from A to B and back to A, is entirely up to you ("signaling server" is just a fancy word for this), but most people use something like a web socket server, the tic-tac-toe of client-initiated communication.
I think the simplest way to make this work on a private network without an internet connection is to install a basic web socket server on one of the machines.
As an example I recommend the very simple https://github.com/emannion/webrtc-web-socket which worked on my private network without an internet connection.
Follow the instructions to install the web socket server on e.g. 192.168.0.101, then have both end-points connect to 192.168.0.101:1337 with Chrome or Firefox. Share the camera on both ends in the basic demo web UI, hit Connect, and you should be good to go.
If you need to do this entirely without any server, then this answer to a related question at least highlights the information you'd need to send across (in a cut'n'paste demo).
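If you'd rather not depend on that repo, a minimal relay is only a few lines of Node.js; a sketch assuming the ws npm package, which just forwards every message (offers, answers, candidates) to the other connected browser:

```javascript
// Sketch: minimal signaling relay using the "ws" npm package.
// Every message received is forwarded to all other connected clients.
const WebSocket = require('ws');
const wss = new WebSocket.Server({ port: 1337 });

wss.on('connection', (ws) => {
  ws.on('message', (message) => {
    wss.clients.forEach((client) => {
      if (client !== ws && client.readyState === WebSocket.OPEN) {
        client.send(message.toString());
      }
    });
  });
});
```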

WebRTC HowTo PeerConnection via LAN with 2 Browsers

For a few days now I've been trying to build a basic WebRTC video chat. I've got some demos running locally, even via LAN. But now I want to build one on my own, starting from the very basics, without all the overhead that many demos come with.
But I still don't get a complete peer connection.
E.g. this example seems to be broken, because I can't call createSignalingChannel(): w3.org/TR/webrtc/#simple-example
Some other examples (https://webrtc-experiment.appspot.com/) want me to link their scripts, but I won't do this, because I want to understand the magic of the peer connection and how to get a handshake between 2 browsers.
I also explored examples with the Google App Engine but thats not what I want.
I want to do it in really simple JS and HTML, just the minimum of what is necessary.
Here is my code:
https://github.com/mexx91/basicVideoRTC EDIT: Should work now
So what will I have to add to get a handshake and peer connection, so that the browsers can send e.g. their mediaStreams to each other?
Thanks a lot!
createSignalingChannel() is only pseudo-code to illustrate the existence of a separate channel. For the initial connection handling, you need a separate message channel.
You can achieve that with hosted services like Pusher, Brightcontext or PubNub, or you can host your own backend with open-source projects like socket.io or SignalR.
Then you just need to send the offers, answers and iceCandidates through your separate channel.
List of Realtime Services: http://www.leggetter.co.uk/real-time-web-technologies-guide
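Whatever channel you pick, the glue code is the same; a sketch, where `signaling` is a placeholder for your chosen transport (Pusher, socket.io, etc.):

```javascript
// Sketch: wiring an RTCPeerConnection to an abstract signaling channel.
// `signaling.send` / `signaling.onmessage` stand in for your transport.
const pc = new RTCPeerConnection();

// Send each discovered ICE candidate to the other browser.
pc.onicecandidate = (e) => {
  if (e.candidate) signaling.send({ candidate: e.candidate });
};

// Handle whatever arrives from the other browser.
signaling.onmessage = async (msg) => {
  if (msg.offer) {
    await pc.setRemoteDescription(msg.offer);
    const answer = await pc.createAnswer();
    await pc.setLocalDescription(answer);
    signaling.send({ answer });
  } else if (msg.answer) {
    await pc.setRemoteDescription(msg.answer);
  } else if (msg.candidate) {
    await pc.addIceCandidate(msg.candidate);
  }
};
```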
Imagine a video conferencing web app, which users A and B originally access from some web server. Suppose that web app supports presence, so the web server knows who's currently online. Imagine the UI allows A to try and place a video call to B. Via, say, XMLHttpRequest(), A's browser informs the server that this is wanted, and B's JavaScript pops up something saying that A wants to call B. No WebRTC has happened at all yet. But at this stage, A can indirectly communicate with B by sending messages via the server, e.g. using XMLHttpRequest. In WebRTC parlance, this is the "signalling channel".

So A and B can both interact with their ICE agents to discover candidate addresses and SDP descriptions, and send these to each other, via the server, over this signalling channel. E.g. the web app on A calls a WebRTC API to get its ICE candidates, and packages these up as it sees fit to send to B. B's browser receives this message from the server (e.g. over a WebSocket or long poll) and hence it can unpack it and format it as needed to hand to the ICE agent on B, using the RTCPeerConnection object. Similarly, the SDP offer/answer can be sent between the two apps and passed through into the ICE agents in the browsers, to agree on media formats etc.

At that stage, media connections can get set up by the browsers: media streams are added to the RTCPeerConnection initially (which aren't communicating yet, but which have attributes that can be queried to describe the codecs etc.), and when the API is asked to create an SDP description, it does that using these attributes, but adjusts the IP address and port based on how the ICE agent on each local browser has figured out what addresses can reach that local browser/port (NAT traversal).
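Put as code, the sequence described above for the calling side (A) looks roughly like this; sendToB is a placeholder for the XMLHttpRequest/WebSocket path through the web server:

```javascript
// Sketch: the caller's (A's) side of the offer/answer dance.
// `sendToB` relays messages via the web server (the signalling channel).
async function callB(localStream, sendToB) {
  const pc = new RTCPeerConnection();

  // 1. Add local media; this shapes the SDP the browser will generate.
  localStream.getTracks().forEach((t) => pc.addTrack(t, localStream));

  // 2. Trickle ICE candidates to B as the agent discovers them.
  pc.onicecandidate = (e) => {
    if (e.candidate) sendToB({ candidate: e.candidate });
  };

  // 3. Create and send the SDP offer.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  sendToB({ offer });

  return pc; // B's answer is applied later with pc.setRemoteDescription()
}
```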