Connect to a specific user via a STUN server in WebRTC

I'm trying to achieve a peer-to-peer video conference using Google's STUN server.
I can connect to a random peer through the STUN server, because STUN hands out addresses that let any two peers connect.
But is there any way to connect to a specific peer via the STUN server, for a login-based or room-based system?
I want to achieve something like https://apprtc.appspot.com/

You need to design your signalling method (this is up to the application developer), which is independent of STUN.
WebRTC does not specify the mechanism for signalling. Signalling is the method whereby users discover each other and establish that a call (media streams between two peers) is going to take place.
The 'discovery' process could involve a registration-based system (e.g. using a SIP proxy) or a room-based one, where two users have access to a 'room' (by knowing its credentials or some other means of authentication). Once two peers have found each other, their browsers need to share and negotiate network topology and media capabilities to ensure that the streams can reach the intended destination and can be encoded/decoded properly.
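Signalling can be as small as a WebSocket relay that groups peers by room. Purely as an illustrative sketch (Node.js with the ws package; the room concept and message shapes here are invented for the example, not part of any WebRTC API):

```js
// Minimal room-based signalling relay -- a sketch, not production code.
// Assumes Node.js with the 'ws' package installed (npm install ws).
const { WebSocketServer } = require('ws');

const wss = new WebSocketServer({ port: 8080 });
const rooms = new Map(); // roomId -> Set of sockets

wss.on('connection', (socket) => {
  let roomId = null;

  socket.on('message', (data) => {
    const msg = JSON.parse(data);
    if (msg.type === 'join') {          // peer asks to join a named room
      roomId = msg.room;
      if (!rooms.has(roomId)) rooms.set(roomId, new Set());
      rooms.get(roomId).add(socket);
      return;
    }
    // Relay SDP offers/answers and ICE candidates to the other peer(s) in the room.
    for (const peer of rooms.get(roomId) ?? []) {
      if (peer !== socket && peer.readyState === 1 /* OPEN */) {
        peer.send(JSON.stringify(msg));
      }
    }
  });

  socket.on('close', () => rooms.get(roomId)?.delete(socket));
});
```

With something like this in place, "connecting to a specific peer" is just a matter of both peers joining the same room before the offer/answer exchange begins; STUN never enters into it.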

Related

Difference between STUN/TURN(coTURN) servers and Signaling servers (written with socket.io/websocket) in WebRTC?

I am building a video teaching site and did some research and got a good understanding of everything except this. When a user wants to connect to another user, P2P, I need a signaling server to exchange their public IPs and get them connected. Now STUN does that job, and TURN will relay the media if the peers cannot connect directly. If I write a signaling server with WebSocket to exchange the SDP messages and have ICE working, do I still need coTURN installed? What exactly is the job of each of them?
Where I am confused is the division of work between my simply written WebSocket signaling server (from what I saw in different tutorials) and the coTURN server I'll install, and how to connect them with the media server I'll install.
A second question: is there a way to use P2P when there are only two or three participants, and get the media servers involved when there are more than that, so that I don't use up the participants' bandwidth too much?
The signaling server is required to exchange messages between peers (SDP offers and answers, plus ICE candidates) until they have established a P2P connection.
A STUN server is there to help a peer discover information about its public IP and to open up firewall ports. The main problem this is solving is that a lot of devices are behind NAT routers within small private networks; NAT basically allows outgoing requests and their response, but blocks any other "unsolicited" incoming requests. You therefore have a Catch-22 scenario when both peers are behind a NAT router and could make an outgoing request, but have nowhere to send it to since the opposite peer doesn't expose anything to make a request to. STUN servers act as a temporary middleman to make requests to, which opens a port on the NAT device to allow the response to come back, which means there's now a known open port the other peer can use. It's a form of hole-punching.
A TURN server is a relay in a publicly accessible location, in case a P2P connection is impossible. There are still cases where hole-punching is unsuccessful, e.g. due to more restrictive firewalls. In those cases the two peers simply cannot talk 1-on-1 directly, and all their traffic is relayed through a TURN server. That's a 3rd party server that both peers can connect to unrestrictedly and that simply forwards data from one peer to the other. One popular implementation of a TURN server is coturn.
Yes, basically all those functions could be fulfilled by a single server, but they’re deliberately separated. The WebRTC specification has absolutely nothing to say about signaling servers, since the signaling mechanism is unique to each application and can take many different forms. TURN is very bandwidth-intensive and must usually be delegated to a larger server farm if you’re hoping to scale at all, so it is impractical to mix with either of the other two functions. You therefore end up with three separate components.
Regarding multi-peer connections: yes, you can set up a P2P group chat just fine. However, each peer will need to be connected to every other peer, so the number of connections and bandwidth per peer increases with each new peer. That’s probably going to work okay for 3 or 4 peers, but beyond that you may start to run into bandwidth and CPU limits of individual peers, especially if you’re doing decent quality video streaming.
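On the client, all three components show up in a handful of lines: STUN and TURN are just entries in the RTCPeerConnection configuration, while the signaling server carries everything else out-of-band. A sketch (the TURN URL and credentials are placeholders, and sendViaSignaling is a hypothetical helper for whatever signaling transport you chose):

```js
// STUN/TURN are plain configuration on RTCPeerConnection.
// The TURN URL and credentials below are placeholders, not real servers.
const pc = new RTCPeerConnection({
  iceServers: [
    { urls: 'stun:stun.l.google.com:19302' }, // public STUN: discovery/hole-punching
    {
      urls: 'turn:turn.example.com:3478',     // e.g. your coturn instance: relay fallback
      username: 'demo',
      credential: 'secret',
    },
  ],
});

// Everything else travels over your signaling channel (WebSocket, etc.).
pc.onicecandidate = ({ candidate }) => {
  if (candidate) sendViaSignaling({ type: 'candidate', candidate }); // hypothetical helper
};
```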

Kurento: Client-side TURN configuration

As per my understanding from my previous question (Kurento: STUN/TURN):
The TURN server configured in webrtcendpoint.conf.ini is used only for exchanging ICE candidates. Also, we can specify only one TURN server in webrtcendpoint.conf.ini; what I have observed is that if I provide two or more TURN servers in webrtcendpoint.conf.ini, the Kurento Media Server service fails to start. Is this correct?
Also, is there any way to verify which STUN/TURN server is being used at the Kurento Media Server and between two peers?
The STUN/TURN servers which we provide in conferenceroom.js will be used for the actual media flow/media pipeline between peers. Is this correct?
If we provide multiple TURN servers in conferenceroom.js, will the TURN server with the nearest/fastest response time be selected for media flow between the two peers, the same way we get response times from this link?
Also, what I have observed is that if the TURN servers provided in webrtcendpoint.conf.ini and conferenceroom.js are different, then we are not able to see remote participants' video, but if both TURN servers are the same then I am able to see remote participants' video. Is this correct?
Edit 1:
In the groupcall sample example we have onExistingParticipants() and onNewParticipants(), where we can define iceServers in receiveVideo() and onExistingParticipants(). So what will happen if we specify TURN server t1 in kurentoUtils.WebRtcPeer.WebRtcPeerSendOnly() and TURN server t2 in kurentoUtils.WebRtcPeer.WebRtcPeerRecvOnly()? Will these two TURN servers communicate with each other as a relay chain?
The TURN server configured in webrtcendpoint.conf.ini is used only for exchanging ICE candidates. Also, we can specify only one TURN server in webrtcendpoint.conf.ini; what I have observed is that if I provide two or more TURN servers in webrtcendpoint.conf.ini, the Kurento Media Server service fails to start. Is this correct?
It is used for gathering candidates and, if needed, as a video relay. Your KMS probably won't need the relay, as you manage the location where it's deployed. If you can do with STUN only, which is the desired way, then the relay server won't be used.
Only one server can be configured.
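For reference, the single-server shape that Kurento's WebRtcEndpoint config file expects looks roughly like this (all addresses and credentials below are placeholders):

```ini
; /etc/kurento/modules/kurento/WebRtcEndpoint.conf.ini (placeholder values)
; Only a single STUN server and a single TURN URL are supported.
stunServerAddress=198.51.100.10
stunServerPort=3478
turnURL=user:password@198.51.100.20:3478
```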
Also, is there any way to verify which STUN/TURN server is being used at the Kurento Media Server and between two peers?
Yes, the WebRtcEndpoint has methods for this:
getStunServerPort()
getStunServerAddress()
getTurnUrl()
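From the Node.js kurento-client those getters can be called like this (a sketch; it assumes a webRtcEndpoint object you created earlier in your pipeline code):

```js
// Query which STUN/TURN settings a WebRtcEndpoint actually picked up.
// 'webRtcEndpoint' is assumed to exist from your pipeline setup code.
webRtcEndpoint.getStunServerAddress((error, address) => {
  if (error) return console.error(error);
  console.log('STUN address in use:', address);
});
webRtcEndpoint.getStunServerPort((error, port) => {
  if (error) return console.error(error);
  console.log('STUN port in use:', port);
});
webRtcEndpoint.getTurnUrl((error, url) => {
  if (error) return console.error(error);
  console.log('TURN URL in use:', url);
});
```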
The STUN/TURN servers which we provide in conferenceroom.js will be used for the actual media flow/media pipeline between peers. Is this correct?
It will be used to gather candidates in your client. Also, if your client is behind a NAT and needs to use a relay server, it will use the one configured in conferenceroom.js. Keep in mind that the media path might not be symmetric: while media going from client->KMS might not use a relay server, media going from KMS->client might, due to the network conditions at your client's location.
If we provide multiple TURN servers in conferenceroom.js, will the TURN server with the nearest/fastest response time be selected for media flow between the two peers, the same way we get response times from this link?
Yes, candidates are probed and the best one is chosen.
Also, what I have observed is that if the TURN servers provided in webrtcendpoint.conf.ini and conferenceroom.js are different, then we are not able to see remote participants' video, but if both TURN servers are the same then I am able to see remote participants' video. Is this correct?
This shouldn't be the case, unless one TURN server is working and the other is not.
EDIT
TURN servers will not exchange media between them. They will be used, if needed, to act as a relay with the other peer. The process is:
Each peer gathers candidates: host, srflx (STUN) and relay (TURN). Note that if the TURN servers are different, the relay candidates will also be different.
Candidates get sent to the other peer.
Each candidate is probed individually, and the best one is chosen.
Since all media goes through KMS, it is the KMS that sends media to the relay server. Keep in mind that KMS is always in between peers. It would be
kms->t2->client
client->t1->kms
Even if the connection was browser to browser, the TURN servers would not communicate directly, as they would act as relay for the media sent from one peer to the other. Here it would be
client1->t2->client2
client2->t1->client1
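You can also watch this selection from the browser side: every gathered candidate reports whether it is host, srflx or relay, which quickly shows whether your TURN server's relay candidates are in play at all. A minimal sketch, assuming an existing RTCPeerConnection named pc:

```js
// Log the type of every ICE candidate the browser gathers.
pc.onicecandidate = ({ candidate }) => {
  if (!candidate) return; // a null candidate signals the end of gathering
  // candidate.type is 'host', 'srflx', 'prflx' or 'relay'
  console.log(candidate.type, candidate.candidate);
};
```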

Can I simplify WebRTC signalling for computers on the same private network?

WebRTC signalling is driving me crazy. My use-case is quite simple: a bidirectional audio intercom between a kiosk and a control room webapp. Both computers are on the same network. Neither has internet access, and all machines have known static IPs.
Everything I read wants me to use STUN/TURN/ICE servers. The acronyms are endless, contributing to my migraine, but if this were a standard application I'd just open a port, tell the other client about it (I can do this via the webapp if I need to) and have the other connect.
Can I do this with WebRTC? Without running a dozen signalling servers?
For the sake of examples, how would you connect a browser running on 192.168.0.101 to one running on 192.168.0.102?
STUN/TURN is different from signaling.
STUN/TURN in WebRTC are used to gather ICE candidates. Signaling is used to transmit between these two PCs the session description (offer and answer).
You can use a free STUN server (like stun.l.google.com or stun.services.mozilla.org). There are also free TURN servers, but not too many (these are resource-expensive); one is numb.viagenie.ca.
There's no standard signaling server, because signaling is application-specific and can be done in many ways. Here's an article that I wrote; I ended up using Stomp on the client side and Spring on the server side.
I guess you could tamper with the SDP and inject the ICE candidates statically, but you'll still need to exchange the SDP (which is dynamically generated each session) between these two PCs somehow. Even so, taking into account that the configuration will not change, I guess you could exchange it once (by means of copy-paste :) ), store it somewhere and use it every time.
If your end-points have static IPs then you can ignore STUN, TURN and ICE, which are just power-tools to drill holes in firewalls. Most people aren't that lucky.
Due to how WebRTC is structured, end-points do need a way to exchange call setup information (SDP) like media ports and key information ahead of time. How you get that information from A to B and back to A is entirely up to you ("signaling server" is just a fancy word for this), but most people use something like a web socket server, the tic-tac-toe of client-initiated communication.
I think the simplest way to make this work on a private network without an internet connection is to install a basic web socket server on one of the machines.
As an example I recommend the very simple https://github.com/emannion/webrtc-web-socket which worked on my private network without an internet connection.
Follow the instructions to install the web socket server on e.g. 192.168.0.101, then have both end-points connect to 192.168.0.101:1337 with Chrome or Firefox. Share the camera on both ends in the basic demo web UI, hit Connect, and you should be good to go.
If you need to do this entirely without any server, then this answer to a related question at least highlights the information you'd need to send across (in a cut'n'paste demo).
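For completeness, here is roughly what the offering side of such a copy-paste exchange looks like: generate the offer, wait until ICE gathering finishes so the host candidates end up in the SDP, then hand the text to the other machine by whatever means you like. A sketch with error handling omitted:

```js
// Manual 'copy-paste signalling' sketch for two machines on one LAN.
// No STUN/TURN servers configured at all.
const pc = new RTCPeerConnection({ iceServers: [] });

async function makeOffer(stream) {
  stream.getTracks().forEach((t) => pc.addTrack(t, stream));
  await pc.setLocalDescription(await pc.createOffer());
  // Wait until ICE gathering completes so the host candidates are in the SDP.
  await new Promise((resolve) => {
    if (pc.iceGatheringState === 'complete') return resolve();
    pc.addEventListener('icegatheringstatechange', () => {
      if (pc.iceGatheringState === 'complete') resolve();
    });
  });
  return pc.localDescription.sdp; // copy this text to the other machine
}

// On the other machine:
//   await pc.setRemoteDescription({ type: 'offer', sdp: pastedOffer });
//   const answer = await pc.createAnswer();
// ...then paste the answer SDP back and apply it with setRemoteDescription.
```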

GameKit/Peer-to-peer over internet

For an iOS app I am developing, I want multiple phones to connect to each other and be able to voice chat between those devices.
I have it working when both devices are on the same network. This was quite simple, and most of the stuff I want to do is possible.
But now I am adding internet support, which is quite a hassle. I'll first try to explain how I want to match the devices, using a small webservice I set up.
Server
Start a new GameKit session, with session-mode GKSessionModePeer
Find the "Peer ID" of the server on the session I just created
Create a new CFSocketRef on a free port and keep it ready to accept connections
Send Peer ID and Port number to my webservice, running on an external server.
WebService
Webservice receives the information and stores it together with an ID and the IP address of the client in a database.
Send ID back to Server, which displays the ID
Client
When the user chooses to use the "Online" feature of GameKit to search for games, I ask the user for an ID (where the user should input the ID the server receives).
Client connects to the webservice supplying the ID. The webservice returns the information about the session (IP, PORT, Peer ID) of the server.
The client tries to connect to the IP address with the port information, and sets up an input and output stream with the server.
This does not work, of course, because my network does not allow incoming connections on a random port (from an external network).
But now the question is, how do I solve this? I want to be able to set up a peer to peer connection between 2 devices, those devices could be on the same network, but also on separate networks.
Is there a framework, example or anything showing how to do this? I want to be able to send data from device to device, without sending it to a server first.
I'm not aware of any frameworks that do this. I do however have a lot of experience with p2p networking across multiple networks.
One important rule I learned: when communicating between networks, don't create a direct connection unless necessary. There are just too many factors that can (will?) cause issues, such as firewalls, NATs, etc.
Sure, you can try the direct connection first. You can try to connect to the given IP addresses*, but in most cases it will fail. Even when using UPnP and NAT-PMP, you'll find that in a lot of cases (more than half?) you won't be able to accept incoming connections at all.
So make sure to have a backup plan. Make a network layer abstraction that doesn't only listen(), but also connects to a server. That way, when you can't connect to the IPs* of the client, you simply setup a connection via the server and the network abstraction takes care of it all.
Let me reiterate the above: don't rely on incoming connections only, always have a backup plan.
* I write IPs because clients can have multiple local/remote IPs. Always iterate over all these IPs when connecting. Example: my phone has 2 local IPv4 addresses (10.0.0.172 and 10.8.0.2), and an IPv6 address ([2001:x:x::6]). Of these three addresses, only the IPv6 address is publicly reachable, and the two IPv4 addresses are on different subnets so whether you can connect to them depends on the subnet that the other client is on. Always try to connect to both, and fall back to a server-proxied connection when it fails.
** I mentioned IPv6, yes. Let's not forget that IPv6 is not limited by NATs, unlike IPv4, and this means that you're far more likely to get a good connection via IPv6 than IPv4, if supported.
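In code, that advice boils down to a loop with a relay fallback. A Node.js sketch of the idea (the addresses, ports and relay are illustrative, not a real API):

```js
// Sketch: try each known address for a peer, then fall back to a relay server.
const net = require('net');

function tryConnect(host, port, timeoutMs = 3000) {
  return new Promise((resolve, reject) => {
    const socket = net.connect({ host, port });
    const timer = setTimeout(() => socket.destroy(new Error('timeout')), timeoutMs);
    socket.once('connect', () => { clearTimeout(timer); resolve(socket); });
    socket.once('error', (err) => { clearTimeout(timer); reject(err); });
  });
}

async function connectToPeer(addresses, port, relay) {
  for (const host of addresses) {            // iterate over ALL known IPs
    try { return await tryConnect(host, port); } catch { /* try the next one */ }
  }
  return tryConnect(relay.host, relay.port); // backup plan: server-proxied
}
```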

Replicate logmein.com behavior for smart devices

I have several smart devices that run Windows CE5 with our application written in .NETCF 3.5. The smart devices are connected to the internet with integrated GPRS modems. My clients would like a remote support option, but VNC and similar tools don't seem to be able to do the job. I found several issues getting VNC to work. First, it has severe performance issues when run on the smart device. Second, the internet provider has a firewall that blocks all incoming requests unless they originate from the smart device itself, so I cannot initiate a remote desktop session with a smart device.
We could get our own APN however they are too expensive and the monthly cost is too great for the amount of smart devices we have deployed. It's more economical for us if we could add development costs to the initial product cost because our customers dislike high monthly costs and rather pay a large sum up front instead. A remote support solution would also allow us to minimize our onsite support.
That's why we more or less decided to roll our own remote desktop solution. We have code for capturing images on the smart device and only get the data that has changed since the last cycle. What we need is to make a communication solution like logmein.com (doesn't support WinCE5) where the smart devices connect to a server from which we then can stream the data to our support personnel's clients. Basically the smart device initiates a connection to our server and start delivering screen data when the server requests it. A support client connects to the server and gets a list of available streams and then select one to listen in on.
Any suggestions for how to do this, considering we have to build the solution in .NETCF 3.5 on the smart devices? We have limited communication experience beyond simple SOAP web services.
Since you're asking for a suggestion, I'll suggest this:
Don't reinvent. Reuse whatever you can. You can perform tunneling with SSH, so make an SSH connection (say, a port of PuTTY or plink, inside a loop) out via GPRS on your smart device; forward remote ports to local ports, bound to the SSH server's local address (127.0.0.1 (sshd):4567 => localhost (smart_device_01):4567). Your clients connect to your SSH server and access the assigned port for each device.
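As a concrete illustration of that reverse-tunnel idea (the host name and ports are placeholders):

```
# On the smart device (looping on failure, e.g. via a plink/PuTTY port):
# expose this device's port 4567 as port 4567 on the SSH server.
ssh -N -R 4567:localhost:4567 tunnel@ssh.example.com

# Support staff then connect to port 4567 on the SSH server to reach the device.
```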
With that said, that's probably not the answer you're looking for. Below - the answer you're probably looking for.
Based on my analysis of how LogMeIn works, you'll want to make an HTTPS or TLS server where your smart devices will push data. Let's call it your tunnel server.
You'll probably want to spawn a new thread that repeatedly attempts to make connections to the tunnel server (outbound connections from smart device to the server, per your specified requirement). With a protocol like BEEP/BXXP, you can encapsulate and multiplex message-oriented or stream-oriented sessions. Wrap BXXP/BEEP into TLS, and tunnel through to your tunnel server. BEEP lets you multiplex streams onto one connection -- if you want the full capabilities of an in-house LogMeIn solution, you'll want to use something like this.
Once a connection is established, make a new BEEP session. With the new session, tell the tunnel server your system identification information (device name, device authentication signature). Write heartbeat data (timestamp periodically) into this new session.
Set up a callback (or another thread) which interfaces to your BEEP control session. Watch for a message requesting service. When such a request comes in, spawn the required threads to copy data from your custom remote-display protocol and push this data back through the same channel.
This sets the basic premise for your Smart Device's program. You can add functionality to this as you desire, say, to match what LMI's IT Reach subscription provides (remote registry, secure tunneled Telnet, remote filesystem, remote printing, remote sound... you get the idea).
I'll make some assumptions that you know how to properly secure all this stuff for authentication and authorization for your clients (Is user foo allowed to access smart device bar?).
On your tunnel server, start a server socket (listening for inbound connections, or from the perspective of smart devices, smart device outbound connections) that demultiplexes connections and sessions. Once a connection is opened, fire up BEEP and register a callback / start a thread to wait for the authentication/heartbeat session. Perform the required checks for AAA to smart devices -- are these devices allowed, are they known, how much does it cost, etc. Your tunnel server forwards data on behalf of your smart devices. For each BEEP session, attach a name (device name) to the BEEP session after the AAA procedures succeed; on failure, close the connection and let the AAA mechanism know (to block attackers). Your tunnel server should also set up what's required for interacting with the frontend -- that is, it should have the code to interact with BEEP to demultiplex the stream for your remote display data.
On your frontend server (can be the same box as the tunnel server), install the routine for AAA -- check if the user is known, if the user is allowed, how much the user should be charged, etc. Once all the checks are passed, make a secured connection from the frontend server to tunnel server. Get the device names that the tunnel server knows that the user is allowed to access. At this point, you should be able to get a "plaintext" stream, based on the device name, from the tunnel server. Forward this stream back to the user (via TLS, for example, or again via BEEP over TLS), or send the required configuration for your remote display client to connect to your tunnel server with the required parameters to access the remote display protocol's stream.