What are ICE Candidates and how do the peer connection choose between them? - webrtc

I newly wrote a simple chat application, but I didn't really understand the background of ICE Candidates.
When the peer create a connection they get ICE Candidates and they exchange them and set
them finally to the peerconnection.
So my question is, where do the ICE Candidates come from and how are they used and are they all really used ?
I have noticed that my colleague got less candidates when he executes the application on his machine, what could be the reason for different amount of Candidates ?

the answer from #Ichigo is correct, but it is a litte bit bigger. Every ICE contains 'a node' of your network, until it has reached the outside. By this you send these ICE's to the other peer, so they know through what connection points they can reach you.
See it as a large building: one is in the building, and needs to tell the other (who is not familiar) how to walk through it. Same here, if I have a lot of network devices, the incoming connection somehow needs to find the right way to my computer.
By providing all nodes, the RTC connection finds the shortest route itself. So when you would connect to the computer next to you, which is connected to the same router/switch/whatever, it uses all ICE's and determine the shortest, and that is directly through that point. That your collegue got less ICE candidates has to do with the ammount of devices it has to go through.
Please note that every network adapter inside your computer which has an IP adress (I have a vEthernet switch from hyper-v) it also creates an ICE for it.

ICE stands for Interactive Connectivity Establishment , its a techniques used in NAT( network address translator ) for establishing communication for VOIP, peer-peer, instant-messaging, and other kind of interactive media.
Typically ice candidate provides the information about the ipaddress and port from where the data is going to be exchanged.
It's format is something like follows
a=candidate:1 1 UDP 2130706431 192.168.1.102 1816 typ host
here UDP specifies the protocol to be used, the typ host specifies which type of ice candidates it is, host means the candidates is generated within the firewall.
If you use wireshark to monitor the traffic then you can see the ports that are used for data transfer are same as the one present in ice-candidates.
Another type is relay , which denotes this candidates can be used when communication is to be done outside the firewall.
It may contain more information depending on browser you are using.
Many time i have seen 8-12 ice-candidates are generated by browser.

Ichigo has a good answer, but doesn't emphasise how each candidate is used. I think MarijnS95's answer is plain wrong:
Every ICE contains 'a node' of your network, until it has reached the outside
By providing all nodes, the RTC connection finds the shortest route itself.
First, he means ICE candidate, but that part is fine. Maybe I'm misinterpreting him, but by saying 'until it has reached the outside', he makes it seem like a client (the initiating peer) is the inner most layer of an onion, and suggests the ICE candidate helps you peel the layers until you get to the 'internet', where can get to the responding peer, perhaps peeling another onion to get to it. This is just not true. If an initiating peer fails to reach a responding peer through the transport address, it discards this candidate and will try a different candidate. It does not store any nodes anywhere in the candidate. The ICE candidates are generated before any communication with the responding peer. An ice candidate does not help you peel the proverbial NAT onion. Also regarding the second quote I made from his answer, he makes it seem like ICE is used in a shortest path algorithm, where 'shortest' does not show up in the ICE RFC at all.
From RFC8445 terminology list:
ICE allows the agents to discover enough information
about their topologies to potentially find one or more paths by which
they can establish a data session.
The purpose of ICE is to discover which pairs of addresses will work. The way that ICE does this is to systematically try all possible pairs (in a carefully sorted order) until it finds one or more that work.
Candidate, Candidate Information: A transport address that is a
potential point of contact for receipt of data. Candidates also
have properties -- their type (server reflexive, relayed, or
host), priority, foundation, and base.
Transport Address: The combination of an IP address and the
transport protocol (such as UDP or TCP) port.
So there you have it, (ICE) Candidate was defined (an IP address and port that could potentially be an address that receives data, which might not work), and the selection process was explained (the first transport address pair that works). Note, it is not a list of nodes or onion peels.
Different users may have different ice candidates because of the process of "gathering candidates". There are different types of candidates, and some are obtained from the local interface. If you have an extra virtual interface on your device, then an extra ICE will be generated (I did not test this!). If you want to know how ICE candidates are 'gathered', read the 2.1. Gathering Candidates

Related

Trickle ICE and two peers on the same internal network

I noticed that if SDP sets icepolicy to trickle (a=ice-options:trickle) and the two peers are on the same internal network, the ICE agents do not generate Server Reflexive candidates and in fact no attempt is made to get server reflexive candidates. That appears to be the logical right decision but is the very definition of trickle ice not intended to gather ALL candidates, even if it is obvious that the first one will ultimately get nominated/selected?
Chrome stops gathering candidates when it find a writeable candidate pair. That is somewhat understandable, since it would otherwise gather a relay candidate (which consumes resources) and then immediately deallocate it.

Will ICE negotiations between peers behind two symmetric NAT's result in requiring two TURN servers?

I read RFC6577 and RFC8445 but I feel like there is a bit of a disconnect between how TURN can be used versus how ICE actually utilizes the relay candidates.
The TURN RFC describes the use of one single TURN server to ferry data between a client and a peer. The transport address on the TURN server accepts data flow from a client via TURN messages, whereas the relayed transport address accepts data flow from peer(s) via UDP. This sounds great - one TURN server and bidirectional data flow.
However in reading about ICE, I feel like this never happens. Both caller and callee independently allocate on potentially two TURN servers, and then send their respective relayed transport addresses to each other. More like an I can be reached via this relayed transport address sort of thing. Connectivity checks then occur and thus, two TURN servers end up being used here where data only flows in one direction through the relayed transport address of each participants allocated TURN server.
Is this true?
From the TURN RFC, it says the following:
The client can arrange for the server to relay packets to and from
certain other hosts (called peers) and can control aspects of how the
relaying is done. The client does this by obtaining an IP address and
port on the server, called the relayed transport address. When a peer
sends a packet to the relayed transport address, the server relays the
packet to the client. When the client sends a data packet to the
server, the server relays it to the appropriate peer using the relayed
transport address as the source.
However, I can't see a scenario whereby through ICE negotiations, data would ever flow through the transport address from the client to the peer. Both the caller and the callee independently allocate on a TURN server and send relayed transport addresses to each other to be reached on.
Basically, TURN can do bidirectional data flow, but with ICE between two symmetric NAT's, it wont. Is this correct?
Its a bit complicated.... bear with me. Reading just the TURN RFC isn't enough, you need context from RFC 5245 on ICE too.
The following scenario is the baseline case:
client A allocates a relay address 8.8.8.8:43739, sends it to client B
client B sends a UDP packet to 8.8.8.8:43739
TURN servers wraps the packet in a stun message, sends it to client A
Now as you say, typically client B will also allocate its own relay address and send it to A. Why isn't that used all the time (or half the time)? The priorities of the candidates are equal after all.
However the candidate pair priority which determines which pair to pick includes a factor which acts as a tie-breaker:
pair priority = 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D?1:0)
Where G>D?1:0 is an expression whose value is 1 if G is greater than
D, and 0 otherwise.
This means the pair where the callers (assuming its the controlling agent) relay address is used has a higher priority than the pair with the callees relay address.
Additionally, there is another candidate in the game here for the port client B uses to send to port 8.8.8.8:43739. This will typically be from one of the local candidates and the TURN server sees (and puts into the data indication) the public (post-nat) ip of client B. On client A this will show up as a remote srflx candidate -- which has a higher priority than a relayed candidate and will therefore be used.
Now if B is behind a symmetric NAT (I think) the TURN server will see a different port from client B than anything for which client A has added a permission. This will typically mean the TURN server will drop the packet and that pair won't work.
If client A is not behind a symmetric NAT, the baseline process will be repeated in the other direction. Slightly less priority but its the same in terms of latency so users won't notice.
If both clients are (and now we are finally at the case you're asking about) are behind symmetric NAT, neither will work and a relay-relay pair will be used. This is fairly rare (<1% probably) and the latency impact is typically insignificant even when both clients are on different TURN servers.

WebRTC - TURN and ICE functions

I'm trying to understand a concept of WebRTC. As I found in some descriptions (for example here http://www.innoarchitech.com/content/images/2015/02/webrtc-complete-diagram.png), there is such a way of making a connection:
Call STUN, to get your IP:port address.
Get some channel from TURN - with that channel you can send info to other peer.
Send to other peer ICE candidates.
Accept ICE candidates with other peers- start a call.
The question is, what do we need ICE candidates for? We know our IP, we can send it to TURN therefore to other peer, and on TURN we have a nice connection with other peer- so we don't have to scary about NATs. Why except that we are sending ICE candidates (why many?), and why we need to use them?
We have 3 main concepts here:
ICE
TURN
STUN
The ICE negotiation is not that simple...
To execute ICE, UAs have to identify all address candidates, transport addresses. Transport addresses are a combination of IP address and port for a particular transport protocol. There are three types of candidates:
Host Candidate – transport address associated with a UA’s local interface
Relayed Candidate – transport address associated with a TURN server (can only be obtained from a TURN server)
Server Reflexive Candidate – translated address on the public side of the NAT (obtained from either a STUN server or a TURN server)
After UA1 has gathered all of its candidates, it arranges them in order of priority from highest to lowest and sends them to UA2 in attributes in an SDP offer message. UA2 performs the same candidate gathering and sends a SDP response with it’s list of candidates. Each UA takes the two lists of candidates and pairs them up to make candidate pairs. Each UA gathers these into check lists and schedules connectivity checks, STUN request/response transaction, to see which pairs work. Figure 3 shows the components of the candidate pairs that make up the UA check list.
ICE assigns one of the agents as the “Controlling Agent” and the other as the “Controlled Agent”. The controlling agent used the valid candidate pairs to nominate a pair to use for the media. There are two nomination methods that can be used:
Regular Nomination The checks continue until there is at least one valid candidate pair. The controlling agent picks from the valid pairs and sends a second STUN request on that pair with a flag to tell the peer that this is the one that is nominated for use.
Aggressive Nomination The nomination flag is sent with every STUN request, once the first check succeeds ICE processing for that media stream is finished and a second STUN request is not needed.
Each candidate pair in the check list has a state associated with it. The state is assigned by the UA once the check list has been computed. There are five possible states:
Frozen This pair can only be checked after being put in the waiting state. To enter the waiting state some other check must succeed first.
Waiting As soon as this is the highest priority pair in the check list a check will be performed.
In-Progress A check has been sent for this pair and the transaction is in progress
Succeeded Successful result from pair check.
Failed Failed result from pair check.
The link below includes more information and diagrams of the ICE flow.
Reference:
http://www.vocal.com/networking/ice-interactive-connectivity-establishment/
RFC https://www.rfc-editor.org/rfc/rfc5245#page-9
TURN is typically used only as a fallback when a direct peer-to-peer connection cannot be established. The latter is the hard part, and is what ICE is for.
Always using TURN, is an option, but a bit of an edge-case.

How to Validate pair in the ICE protocol?

Related WebRTC, ICE protocol gives the which pair of addresses will work for direct media transfer between the pairs.
Let A and B are two endpoints
To choose which address will work for direct communication between A and B, Person A first gather candidates, encode candidate attribute, encode the SDP offer message, and send it to another endpoint.
When B get offer message from A,then person B gather candidates, encode the SDP answer message with its own list of candidates and send it to person A.
At this end of this process, each agent has a complete list of local candidates and Remote candidates. Its pairs them up, resulting in CANDIDATE PAIRS. To see, which pair work, each agent performs the connectivity checks using STUN req/resp.
How many connectivity checks are performed, to nominate valid candidate pair?
What are the remaining ICE connectivity checks are performed regarding webRTC call?
To develop ICE module for webRTC call, I have to follow each step in RFC5245 or any thing else?
How many connectivity checks are performed, to nominate valid
candidate pair?
The number of candidate pairs are the number of connectivity checks done by each side.
What are the remaining ICE connectivity checks are performed regarding
webRTC call?
There are no extra ICE connectivity checks for webRTC.
To develop ICE module for webRTC call, I have to follow each step in
RFC5245 or any thing else?
You have to implement or use existing implementation of DTLS protocol, RFC5763 and RFC5764. DTLS implementation can be found on OpenSSL library.
All these seems a lot of work but if you use openssl then its easy enough.

Why does ICE needs both-ways signaling?

To establish WebRTC connections the ICE protocol is used with a signaling server which must send messages in both directions. I wonder why after the initiator sent its offer and candidates to the other participant, the participant needed to send back its answer and candidates using the signaling channel in the other direction. Cannot the participant open the connection to the initiator using candidates from both sides and send back its answer using the open connection?
I started reading ICE RFC and the only relevant part I found is in section 5.2 where the initiator must take the controlling role and nominates candidate pairs. But it does not explain why the other could not initiate connection.
To give some background, I am trying to build a webapp for which I want users to establish WebRTC connections without using a signaling server. I thought of having the app to generate a URL including the offer and candidates and providing this URL to other participants through other medium like instant messaging. The issue I got is that the participant need to send back its answer and candidates using the same medium, which is not practical. In the end I will go for a signalling server but I wonder the technical reason.
Yes, you can do that if caller is behind public IP or Full Cone NAT(in this case, router connection mapping needs not to be timed out).
You can able full fill above conditions rarely.
What's the problem with other NAT types?
For example , PRC(port restricted cone) NAT won't allow you to receive a packet from a IP:Port , if you didn't send any packet to that IP:Port before. So callee will never able to send you a packet.
So if callee sends her candidates list to you . you can send some dummy data(with low TTL) to her IP:Port to fool your PRC NAT (now it allow incoming packets from callee's IP:Port as it sends a packet to that IP:Port before).
To know more about different types of NAT:
https://en.wikipedia.org/wiki/Network_address_translation
http://think-like-a-computer.com/2011/09/16/types-of-nat/