GameKit/Peer-to-peer over internet - objective-c

For an iOS app I am developing, I want multiple phone to connect to each other and be able to voice chat between those devices.
I have it working when both devices are on the same network. This was quite simple and most of the stuff I want to do, is possible.
But now I am adding internet support, which is quite a hassle. I'll first try to explain how I want to match the devices, using a small webservice I set up.
Server
Start a new GameKit session, with session-mode GKSessionModePeer
Find the "Peer ID" of the server on the session I just created
Create a new CFSocketRef on an free port and keep it ready to accept connections
Send Peer ID and Port number to my webservice, running on an external server.
WebService
Webservice receives the information and stores it together with an ID and the IP address of the client in a database.
Send ID back to Server, which displays the ID
Client
When the user chooses to use the "Online" feature of GameKit to search for games, I ask the user for an ID (where the user should input the ID the server receives).
Client connects to the webservice supplying the ID. The webservice returns the information about the session (IP, PORT, Peer ID) of the server.
The user tries to connect to the IP address, with the port information and set up an input and output stream with the server.
This does not work ofcourse, because my network does not allow incoming connections and a random port (from an external network).
But now the question is, how do I solve this? I want to be able to set up a peer to peer connection between 2 devices, those devices could be on the same network, but also on separate networks.
Is there a framework, example or anything showing how to do this? I want to be able to send data from device to device, without sending it to a server first.

I'm not aware of any frameworks that do this. I do however have a lot of experience with p2p networking across multiple networks.
One important rule I learned: when communicating between networks, don't create a direct connection unless necessary. There are just too many factors that can (will?) cause issues, such as firewalls, NATs, etc.
Sure, you can let the connection try first. You can try to connect to the given IP addresses*, but in most cases it will fail. Even when using UPnP and NAT-PMP, you'll find that in a lot of cases (more than half?) you won't be able to accept incoming connections at all.
So make sure to have a backup plan. Make a network layer abstraction that doesn't only listen(), but also connects to a server. That way, when you can't connect to the IPs* of the client, you simply setup a connection via the server and the network abstraction takes care of it all.
Let me reiterate the above: don't rely on incoming connections only, always have a backup plan.
* I write IPs because clients can have multiple local/remote IPs. Always iterate over all these IPs when connecting. Example: my phone has 2 local IPv4 addresses (10.0.0.172 and 10.8.0.2), and an IPv6 address ([2001:x:x::6]). Of these three addresses, only the IPv6 address is publicly reachable, and the two IPv4 addresses are on different subnets so whether you can connect to them depends on the subnet that the other client is on. Always try to connect to both, and fall back to a server-proxied connection when it fails.
** I mentioned IPv6, yes. Let's not forget that IPv6 is not limited by NATs, unlike IPv4, and this means that you're far more likely to get a good connection via IPv6 than IPv4, if supported.

Related

Is it possible to successfully negotiate a WebRTC connection between peers who are offering different TURN servers that require credentials?

We use a provider of global TURN servers (Xirsys). When establishing a connection between peers, each peer first identifies the closest TURN server to their location, then fetches credentials for that server. The peers then exchange ICE candidates, including their respective TURN server URLs.
If those peers are in different regions, they will propose different TURN servers. According to the accepted answer to this question: TURN-Server for RTCConfiguration the respective TURN servers will connect to each other to relay streams from Peer1 <> TURN1 <> TURN2 <> Peer2. However, I have been unable to get this to work. Forcing TURN in the clients (i.e. no direct p2p connections), and attempting to establish a peerConnection using a TURN server in e.g. the United States to one in Brazil, negotiation always fails.
Is this because the servers require credentials that are not passed in the ICE candidates? Or perhaps it's a Xirsys-specific problem? Or should it actually work fine and we're doing something else wrong?
No it's not going to be because of the credentials. They are used between the client and its TURN server. The connection between the TURN server and remote end point doesn't use any authentication.
In fact each TURN server should be blissfully unaware that the remote party is even another TURN server. As far as they are concerned they forward packets to the remote end point just the same no matter whether it's a browser, another TURN server or some other application.
So, while working through two TURN servers is possible, it's definitely not easy. The reason is that the first TURN server will generate an allocation with a given port. The second TURN server will need to send data to this port. However, how does the first TURN server know where to send that data? The second TURN server will not yet have an allocation!
Typically, WebRTC applications use a singular TURN server. If you want to use two, it means having control of the allocation generation and massaging of the SDP.

Difference between STUN/TURN(coTURN) servers and Signaling servers (written with socket.io/websocket) in WebRTC?

I am building this video teaching site and did some research and got a good understanding but except for this thing. So when a user want's to connect to another user, P2P, I need signaling server to get their public IP to get them connected. Now STUN is doing that job and TURN will relay the media if the peers cannot connect. Now if I write signaling server with WebSocket to communicate the SDP messages and have ICE working, do I need coTURN installed? What will be the job of the job of them particularly?
Where exactly I am confused is the work of my simply written WebSocket Signaling server (from what I saw in different tutorials) and the work of the coTURN server I'll install. And how to connect them with the media server I'll install.
A second question, is there a way to use P2P when there is only two/three participants and get the media servers involved is there is more than that so that I don't use up the participant's bandwidth too much?
The signaling server is required to exchange messages between peers (SDP packets) until they have established a P2P connection.
A STUN server is there to help a peer discover information about its public IP and to open up firewall ports. The main problem this is solving is that a lot of devices are behind NAT routers within small private networks; NAT basically allows outgoing requests and their response, but blocks any other "unsolicited" incoming requests. You therefore have a Catch-22 scenario when both peers are behind a NAT router and could make an outgoing request, but have nowhere to send it to since the opposite peer doesn't expose anything to make a request to. STUN servers act as a temporary middleman to make requests to, which opens a port on the NAT device to allow the response to come back, which means there's now a known open port the other peer can use. It's a form of hole-punching.
A TURN server is a relay in a publicly accessible location, in case a P2P connection is impossible. There are still cases where hole-punching is unsuccessful, e.g. due to more restrictive firewalls. In those cases the two peers simply cannot talk 1-on-1 directly, and all their traffic is relayed through a TURN server. That's a 3rd party server that both peers can connect to unrestrictedly and that simply forwards data from one peer to the other. One popular implementation of a TURN server is coturn.
Yes, basically all those functions could be fulfilled by a single server, but they’re deliberately separated. The WebRTC specification has absolutely nothing to say about signaling servers, since the signaling mechanism is very unique to each application and could take many different forms. TURN is very bandwidth intensive and must usually be delegated to a larger server farm if you’re hoping to scale at all, so is impractical to mix in with any of the other two functions. So you end up with three separate components.
Regarding multi-peer connections: yes, you can set up a P2P group chat just fine. However, each peer will need to be connected to every other peer, so the number of connections and bandwidth per peer increases with each new peer. That’s probably going to work okay for 3 or 4 peers, but beyond that you may start to run into bandwidth and CPU limits of individual peers, especially if you’re doing decent quality video streaming.

Connect to specific user from STUN server in WEB RTC

I'm trying to achieve peer to peer video conference using google stun server.
I can connect anyone by stun server randomly.Because stun gives multiple and random addresses and connect with it.
But is there any way to connect specific peer by stun server for a login based system or room based system?
I want to achive something like - https://apprtc.appspot.com/
You need to design your signalling method (this is up to the application developer), which is independent of STUN.
WebRTC does not specify the mechanism for signalling. Signalling is the method whereby users discover each other and establish that a call (media streams between two peers) is going to take place.
The 'discovery' process could involve a registration-based system (eg using SIP proxy) or room based where two users have access to a 'room' (by knowing the credentials or some means of authentication). Once two peers have found each other, their browsers then need to share and negotiate network topology and media capabilities to ensure that the streams can reach the intended destination and can be encoded/decoded properly.

PeerConnection based on local IP's

What I want is, basically, to create a connection between two different computers on same local network. But i want to do this by computers' local IP's. (like 192.168.2.23 etc)
This must be a totally local connection. no TURN or STUN Servers. I am not sure if this is possible. Because there are not much documentation/example/information about WebRTC.
So, how can I create a connection from my computer to another one just passing its local IP as parameter?
Update: To be more clear; imagine there is an html page contains some code that activates my camera and audio services. and another -almost same- page is open in other computer. Waiting a connection request... And there is a textbox in my page to type an IP belongs to other computer on my local network. type 192.168.2.xx and bingo! i have connection between me and other computer.
I want this process as IP based, because there may be more than 2 devices on the network. And all of them are possible devices to create connection. So i need to reach them by their IP's.
Any example code or explanation would be great! even if it tells that this is not possible.
Thanks
Peer discovery is a vital part in any WebRTC application. It's an expensive term for saying: "Hi, I'm computer 4 and I want to talk to you!".
See it as calling a friend over the phone. You need to dial his number first.
This part is not defined in the WebRTC standards. You need to implement this logic in your application. Once you know who you want to call, you need a way of exchanging vital information. This is called signaling, like flo850 put in his answer.
Signaling is needed before any peer-to-peer connection can be set up.
To come up with an idea for your use case of 7 devices in a LAN.
If you have these devices connected to for example a WebSockets server and are in the same channel.
The WebSockets server can be written to route messages to specific receivers.
Devices connected to the channel often are identified with some kind of ID, imagine you use the device's IP.
When you want to talk to computer 4 with IP 192.168.0.4 you send the exchange messages (signaling) on the channel to the receiver with ID, the IP of the device you want to connect with.
How to send the signaling (offer, answer) is described here with example code.
Hope this helps
Users usually sit behind NATs; that's why ICE concept implemented in WebRTC.
If both users are sitting behind same NAT; you can skip ICE servers by passing "NULL" parameter value over "RTCPeerConnection" constructor:
var peer = new [webkit|moz]RTCPeerConnection ( null );
Now, browser will use "host" candidates, also known as "local" candidates.
you still need a signaling server. During the ICE candidate search, your clients will exchange their local ip through this signaling server

Replicate logmein.com behavior for smart devices

I have several smart devices that run Windows CE5 with our application written in .NETCF 3.5. The smart devices are connected to the internet with integrated GPRS modems. My clients would like a remote support option but VNC and similar tools doesn't seem to be able to do the job. I found several issues with VNC to get it to work. First it has severe performance issues when ran on the smart device. The second issue is that the internet provider has a firewall that blocks all incoming requests if they didn't originate from the smart device itself. Therefore I cannot initiate a remote desktop session with the smart devices since the request didn't originate from the smart device.
We could get our own APN however they are too expensive and the monthly cost is too great for the amount of smart devices we have deployed. It's more economical for us if we could add development costs to the initial product cost because our customers dislike high monthly costs and rather pay a large sum up front instead. A remote support solution would also allow us to minimize our onsite support.
That's why we more or less decided to roll our own remote desktop solution. We have code for capturing images on the smart device and only get the data that has changed since the last cycle. What we need is to make a communication solution like logmein.com (doesn't support WinCE5) where the smart devices connect to a server from which we then can stream the data to our support personnel's clients. Basically the smart device initiates a connection to our server and start delivering screen data when the server requests it. A support client connects to the server and gets a list of available streams and then select one to listen in on.
Any suggestions for how to do it considering we have to do the solution in .NETCF 3.5 on the smart devices? We have limited communication experience beyond simple soap web-services.
Since you're asking for a suggestion, I'll suggest this:
Don't reinvent. Reuse whatever you can. You can perform tunneling with SSH, so make an SSH connection (say, a port of PuTTY or plink, inside a loop) out via GPRS on your smart device; forward remote ports to local ports, bound to the SSH server's local address (127.0.0.1 (sshd):4567 => localhost (smart_device_01):4567). Your clients connect to your SSH server and access the assigned port for each device.
With that said, that's probably not the answer you're looking for. Below - the answer you're probably looking for.
Based on my analysis of how LogMeIn works, you'll want to make an HTTPS or TLS server where your smart devices will push data. Let's call it your tunnel server.
You'll probably want to spawn a new thread that repeatedly attempts to make connections to the tunnel server (outbound connections from smart device to the server, per your specified requirement). With a protocol like BEEP/BXXP, you can encapsulate and multiplex message-oriented or stream-oriented sessions. Wrap BXXP/BEEP into TLS, and tunnel through to your tunnel server. BEEP lets you multiplex streams onto one connection -- if you want the full capabilities of an in-house LogMeIn solution, you'll want to use something like this.
Once a connection is established, make a new BEEP session. With the new session, tell the tunnel server your system identification information (device name, device authentication signature). Write heartbeat data (timestamp periodically) into this new session.
Set up a callback (or another thread) which interfaces to your BEEP control session. Watch for a message requesting service. When such a request comes in, spawn the required threads to copy data from your custom remote-display protocol and push this data back through the same channel.
This sets the basic premise for your Smart Device's program. You can add functionality to this as you desire, say, to match what LMI's IT Reach subscription provides (remote registry, secure tunneled Telnet, remote filesystem, remote printing, remote sound... you get the idea)
I'll make some assumptions that you know how to properly secure all this stuff for authentication and authorization for your clients (Is user foo allowed to access smart device bar?).
On your tunnel server, start a server socket (listening for inbound connections, or from the perspective of smart devices, smart device outbound connections) that demultiplexes connections and sessions. Once a connection is opened, fire up BEEP and register a callback / start a thread to wait for the authentication/heartbeat session. Perform the required checks for AAA to smart devices -- are these devices allowed, are they known, how much does it cost, etc. Your tunnel server forwards data on behalf of your smart devices. For each BEEP session, attach a name (device name) to the BEEP session after the AAA procedures succeed; on failure, close the connection and let the AAA mechanism know (to block attackers). Your tunnel server should also set up what's required for interacting with the frontend -- that is, it should have the code to interact with BEEP to demultiplex the stream for your remote display data.
On your frontend server (can be the same box as the tunnel server), install the routine for AAA -- check if the user is known, if the user is allowed, how much the user should be charged, etc. Once all the checks are passed, make a secured connection from the frontend server to tunnel server. Get the device names that the tunnel server knows that the user is allowed to access. At this point, you should be able to get a "plaintext" stream, based on the device name, from the tunnel server. Forward this stream back to the user (via TLS, for example, or again via BEEP over TLS), or send the required configuration for your remote display client to connect to your tunnel server with the required parameters to access the remote display protocol's stream.