We have a TURN fleet with a presence in the AWS Regions US-EAST-1 and US-WEST-2.
A client in Ukraine is trying to connect to a peer in the United Kingdom. Both the client and the peer are behind symmetric NATs.
The client does not have access to TURN credentials.
The peer does have access to TURN credentials.
The result is:
client: host, server reflexive
peer: host, server reflexive, relay
Our TURN fleet creates permissions on IP addresses only and ignores port variations.
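To make the IP-only permission behaviour concrete, here is a minimal sketch of such a check. It is a toy model rather than the fleet's actual code: the `Allocation` class and its method names are hypothetical. RFC 5766 itself defines a permission in terms of the peer's IP address, so ignoring the port is expected behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class Allocation:
    """Toy model of one TURN allocation's permission table (hypothetical)."""
    permitted_ips: set[str] = field(default_factory=set)

    def create_permission(self, peer_ip: str, peer_port: int) -> None:
        # Only the IP address is stored; the port is deliberately dropped.
        self.permitted_ips.add(peer_ip)

    def is_permitted(self, src_ip: str, src_port: int) -> bool:
        # Any source port on a permitted IP passes the check.
        return src_ip in self.permitted_ips


alloc = Allocation()
alloc.create_permission("1.2.3.4", 9976)
print(alloc.is_permitted("1.2.3.4", 9976))   # True
print(alloc.is_permitted("1.2.3.4", 40000))  # True: the port is ignored
print(alloc.is_permitted("5.6.7.8", 9976))   # False: no permission for this IP
```

With a symmetric NAT the peer's source port changes per destination, which is why a port-agnostic check matters here.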
Success case: Client initiates a call through US-EAST-1.
1. The client gathers its host and server reflexive candidates and sends them to the peer.
2. The peer receives the client's candidates, gathers its own host, server reflexive, and relay candidates, and sends them to the client.
3. The client sends a STUN Binding request to the relay transport address.
4. The TURN relay sees the request, attaches an XOR-PEER-ADDRESS to it, and forwards it to the client via a DATA indication.
5. The client sees the request, generates a response, and sends it to the relay via a SEND indication.
6. The relay accepts and forwards the message to the peer.
7. The peer unwraps the XOR-PEER-ADDRESS, determines it to be a peer reflexive candidate, and sets the active pair as local: prflx | remote: relay.
8. Media flows, everyone is happy.
Failure case: Client initiates a call through US-WEST-2.
Steps 1 to 6 occur as they did above.
The peer never receives the response from the TURN server containing the peer reflexive candidate in the XOR-PEER-ADDRESS.
Since it is a symmetric NAT, the call fails.
This only happens to a particular peer, as we see thousands of prflx / relay calls happening in US-WEST-2 even at this very moment. I don't have any reason to believe that the TURN server failed to send the response.
The outbound security group rules are fully permissive (for IPv4 and IPv6) and are identical:
0.0.0.0/0
::/0
The config files in both regions are identical. As far as my exhaustive search has taken me, the only difference between the two calls is the AWS Region that the calls are made through. I could be mistaken, but I've combed through every little detail I could think of.
I've even tested US-EAST-1 and US-WEST-2 myself through a symmetric NAT and was able to establish a prflx-relay call.
Has anyone encountered this before?
Binding Request (peer -> TURN)
1094 5.262769 192.168.1.202 TURN_IP_ADDRESS STUN
170 Binding Request user: 0kZJvRw4xieMyZWtgrfSvGh6EzfhRzxP:5Vi8
TURN to Client DATA indication:
3942 8.729590 TURN_IP_ADDRESS 192.168.1.202 STUN
150 Binding Success Response XOR-MAPPED-ADDRESS: 1.2.3.4:9976 user: ttsEVBZnkpBftWENErHh+3HkiEj3xoU2:8/DZ
Client Receives message:
3272 7.757723 TURN_IP_ADDRESS 192.168.88.108 STUN
250 Data Indication XOR-PEER-ADDRESS: 1.2.3.4:9976
Client Creates a permission:
CreatePermission Request XOR-PEER-ADDRESS: 1.2.3.4:9976 with nonce realm: amazonaws.com user: { USERNAME }
Client responds:
3358 7.930904 192.168.88.108 TURN_IP_ADDRESS STUN
186 Send Indication XOR-PEER-ADDRESS: 1.2.3.4:9976
TURN to Peer response:
nothing received by the peer
Client continues to receive DATA indications of:
3272 7.757723 TURN_IP_ADDRESS 192.168.88.108 STUN
250 Data Indication XOR-PEER-ADDRESS: 1.2.3.4:9976
and continues to respond with:
3358 7.930904 192.168.88.108 TURN_IP_ADDRESS STUN
186 Send Indication XOR-PEER-ADDRESS: 1.2.3.4:9976
Eventually the call times out.
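When comparing captures from the two regions, it can help to decode XOR-PEER-ADDRESS independently of Wireshark. The following is a minimal sketch that handles only IPv4 and assumes a well-formed message. The function name is hypothetical, and the sample message is synthetic, built from the 1.2.3.4:9976 placeholder in the trace above.

```python
import socket
import struct

MAGIC_COOKIE = 0x2112A442
DATA_INDICATION = 0x0017
XOR_PEER_ADDRESS = 0x0012


def parse_xor_peer_address(message: bytes):
    """Return (ip, port) from the first IPv4 XOR-PEER-ADDRESS, or None."""
    _msg_type, msg_len, cookie = struct.unpack("!HHI", message[:8])
    if cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN/TURN message")
    offset, end = 20, 20 + msg_len  # attributes follow the 20-byte header
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_PEER_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are 32-bit aligned
    return None


# Synthetic Data indication carrying XOR-PEER-ADDRESS 1.2.3.4:9976.
xport = 9976 ^ (MAGIC_COOKIE >> 16)
xaddr = struct.unpack("!I", socket.inet_aton("1.2.3.4"))[0] ^ MAGIC_COOKIE
attr = struct.pack("!HHBBHI", XOR_PEER_ADDRESS, 8, 0, 0x01, xport, xaddr)
sample = struct.pack("!HHI12s", DATA_INDICATION, len(attr), MAGIC_COOKIE, bytes(12)) + attr
print(parse_xor_peer_address(sample))  # ('1.2.3.4', 9976)
```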
Edit:
I think, based on the answer below, that the answer is "client and server basically only communicate on one port, 3478 (or equivalent)".
rfc 5766 : Issue when Both devices support TURN
==========================.
I have been reading several sources on TURN, including the RFC.
I get the whole premise:
Client creates allocation on TURN server
Client sends data to Peer through TURN that relays via the relayed transport address
Same way around from peer --> Server --> client
Most resources focus on setting up the server and what ports need to be configured.
The point I am unclear about is on the client side:
After the allocation is done and the client can start sending data, do they send that data to the relayed transport address that the Server allocated? Or do they send it to the standard TURN port e.g. 3478, and then the server takes care of looking up the allocation for this client and send it through the relayed address to the peer?
Example:
Client address 192.6.12.123:45677 (let's assume it's the NAT)
TURN server listens on 34.45.34.123:3478
TURN server has done an allocation for client on 34.45.34.123:50678
So when the client wants to send application data to a peer, does it send to port 3478 or port 50678?
My assumption (based also on some Wireshark captures I tried) is that the client always sends everything to port 3478 and the server takes care of sending it via the relayed address.
My assumption (based also on some wireshark captures I tried) is that the client always sends everything on port 3478
The client picks a random local port (e.g. 45677), and traffic sent from that port goes to port 3478 on the server (or 5349 if using TLS). The server forwards it from its allocated port (50678) to whatever remote port the other client established during ICE negotiation.
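As an illustration, here is a rough sketch of how a client could wrap application data in a TURN Send indication. It assumes an allocation and a permission already exist and skips Allocate and authentication entirely. The helper names and the peer address 203.0.113.7:40000 are hypothetical, and the server address is the one from the example above.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
SEND_INDICATION = 0x0016
XOR_PEER_ADDRESS = 0x0012
DATA = 0x0013


def _attr(attr_type: int, value: bytes) -> bytes:
    """Encode one STUN attribute, padded to a 32-bit boundary."""
    padding = b"\x00" * (-len(value) % 4)
    return struct.pack("!HH", attr_type, len(value)) + value + padding


def build_send_indication(peer_ip: str, peer_port: int, payload: bytes) -> bytes:
    """Wrap payload for an IPv4 peer; the peer address travels inside the message."""
    xport = peer_port ^ (MAGIC_COOKIE >> 16)
    xaddr = struct.unpack("!I", socket.inet_aton(peer_ip))[0] ^ MAGIC_COOKIE
    peer = struct.pack("!BBHI", 0, 0x01, xport, xaddr)
    attrs = _attr(XOR_PEER_ADDRESS, peer) + _attr(DATA, payload)
    header = struct.pack("!HHI12s", SEND_INDICATION, len(attrs), MAGIC_COOKIE, os.urandom(12))
    return header + attrs


TURN_SERVER = ("34.45.34.123", 3478)  # listening port, not the relayed port 50678
message = build_send_indication("203.0.113.7", 40000, b"hello")
# sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# sock.sendto(message, TURN_SERVER)  # the server then relays the payload from port 50678
```

The destination of the UDP datagram is always the server's listening port; the peer's address is carried as an attribute, and the relayed port never appears on the client side of the connection.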
I am using the ActiveMQ Artemis Broker and publishing to it through a client application.
Behavior observed:
When my client uses IPv4, a TLS handshake is established and data is published as expected, with no problems.
When my client uses IPv6, I see frequent reconnections between the client and the server (broker), and no data is published.
Details:
When using IPv6, the client does a 3-way handshake and attempts to send data. It also receives a Server Hello and sends application data.
But the connection terminates and the client reconnects. This loop keeps occurring.
The client library, network infrastructure, and broker are all completely the same when using IPv4 and IPv6.
The client logs say:
Idle network reply timeout.
The broker logs show an incoming connection request and also a CONNACK for it from the broker, e.g.:
MQTT(): IN << CONNECT protocol=(MQTT, 4), hasPassword=false, isCleanSession=false, keepAliveTimeSeconds=60, clientIdentifier=b_001, hasUserName=false, isWillFlag=false
MQTT(): OUT >> CONNACK connectReturnCode=0, sessionPresent=true
What Wireshark (tcpdump) shows:
Before every reconnection (the 3-way handshake) I see this:
Id  Src                 Dest
1   Broker (App Data)   Client
2   Broker (App Data)   Client
3   Client (ACK)        Broker
4   Client (ACK)        Broker
5   Broker (FIN,ACK)    Client
6   Client (FIN,ACK)    Broker
7   Broker (ACK)        Client
8   Client (SYN)        Broker
9   Broker (SYN/ACK)    Client
10  Client (ACK)        Broker
Then the TLS handshake (Client Hello, Server Hello, Change Cipher Spec) takes place and the above repeats.
Based on packets 5, 6, and 7, I have concluded that the connection is being terminated by the broker (server). The client acknowledges the termination and then attempts to reconnect, since it runs an infinite loop of reconnecting and publishing.
I am doing network-level analysis, and using Wireshark, for the first time, so I'm not sure my analysis is right.
I have also hit a wall: I'm not sure why the reconnection occurs only when the device uses IPv6. I also don't see any RST indicating termination of the connection.
The broker is also sending a CONNACK (per the broker logs), but still no data is sent, just reconnection attempts, and I'm not sure why.
Also, I see a few:
Out-of-Order TCP (when src is broker)
Spurious Re-transmission
DUP ACK (src is client)
Not sure if this is important.
Any pointers on what is going on?
The issue was caused by a load balancer (LB) setting: its default connection timeout of 30 seconds was shorter than the connection timeout set by the client.
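A quick sanity check for this class of problem can be sketched as below. It rests on two assumptions that the answer does not confirm: that the LB "connection timeout" is an idle timeout, and that the relevant client-side value is the MQTT keep-alive (60 seconds in the CONNECT shown in the broker log). The function name is hypothetical.

```python
def idle_timeout_too_short(keep_alive_s: int, lb_idle_timeout_s: int) -> bool:
    """True if an idle connection can be dropped before the next keep-alive is sent."""
    return keep_alive_s >= lb_idle_timeout_s


print(idle_timeout_too_short(60, 30))  # True: LB may drop the link before a keep-alive
print(idle_timeout_too_short(20, 30))  # False: keep-alives arrive within the timeout
```

Under those assumptions, either raising the LB timeout above the keep-alive or lowering the keep-alive below the LB timeout avoids the reconnect loop.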
I have installed a TURN server, and everything on the server side is working fine. There are no errors in the log file, only a warning stating:
0: WARNING: I cannot support STUN CHANGE_REQUEST functionality because only one IP address is provided
But the TURN server is running on the server.
Here is what is shown when I check lsof -i :3478:
turnserve 999 root 15u IPv4 446811411 0t0 TCP domain.com:stun (LISTEN)
turnserve 999 root 23u IPv4 446811417 0t0 TCP domain:stun (LISTEN)
turnserve 999 root 24u IPv4 446810998 0t0 UDP domain.com:stun
turnserve 999 root 25u IPv4 446810999 0t0 UDP domain.com:stun
When I check STUN in Trickle ICE, it throws these errors:
The server stun:xxx.xxx.xxx.xxx:3478 returned an error with code=701:
STUN server address is incompatible.
The server stun:xxx.xxx.xxx.xxx:3478 returned an error with code=701:
STUN allocate request timed out.
What is going wrong here?
Thank you
I think that 701 error is a more generic connectivity error that Trickle ICE uses to indicate it didn't get a binding response back. Run stunclient your.stun.ip.address with the command line tools at www.stunprotocol.org to see if your STUN service is accessible from the outside world.
STUN technically requires being hosted on a device with two IP addresses and two ports. It's typically a command line parameter to specify which IP addresses the server should listen on. But most server implementations can operate on a host with a single IP address.
The second IP address and port on the server are used for STUN client filtering tests to detect what type of NAT is in effect. The client sends a binding request to the server's primary IP and port, but with a change-request attribute asking the server to respond from the alternate IP address or port. More often than not, this binding request with a change-request attribute fails, since NATs will not forward traffic from the other IP/port.
The filtering test is useful for logging what type of NAT the client is behind, so that failed connections can be debugged and success/failure metrics can be correlated with NAT type.
Since most ICE implementations exchange all available address candidates (local, mapped, and relay), the filtering test isn't very useful for connectivity establishment.
I'm surprised Trickle ICE is giving you an error. I didn't think WebRTC ever used the change-request attribute. I just did a Wireshark trace of a Trickle ICE session to stunserver.stunprotocol.org, and I don't see the WebRTC client setting the change-request attribute in either of the two binding requests it makes.
More details in RFC 5780 Section 3.2
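If the stunclient tool is not at hand, a plain binding test (without any change-request attribute) can be approximated in a few lines. This is a minimal sketch under stated assumptions: UDP over IPv4 only, a single attempt with no retransmission, and no decoding of the mapped address. The function names are hypothetical.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101


def build_binding_request() -> tuple[bytes, bytes]:
    """Return (request, transaction_id) for an attribute-less Binding request."""
    txn_id = os.urandom(12)
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, txn_id), txn_id


def is_binding_success(response: bytes, txn_id: bytes) -> bool:
    """Check message type, magic cookie, and transaction ID of a response."""
    if len(response) < 20:
        return False
    msg_type, _length, cookie, rx_txn = struct.unpack("!HHI12s", response[:20])
    return msg_type == BINDING_SUCCESS and cookie == MAGIC_COOKIE and rx_txn == txn_id


def stun_reachable(host: str, port: int = 3478, timeout: float = 2.0) -> bool:
    """Send one Binding request over UDP and report whether a success came back."""
    request, txn_id = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(request, (host, port))
            response, _addr = sock.recvfrom(2048)
        except OSError:  # covers timeouts and name-resolution failures
            return False
    return is_binding_success(response, txn_id)


# Example (requires network access): stun_reachable("your.stun.ip.address")
```

Run it from a machine outside the server's network; a False result from there, combined with True from the server itself, would point at a firewall or security-group issue rather than at the STUN service.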
On macOS, I just do this:
> brew install stuntman
When it is done:
> stunclient stunserver.stunprotocol.org
Binding test: success
Local address: 198.18.0.1:54898
Mapped address: 210.0.158.130:56750
To specify the port, do it like this:
> stunclient stunserver.stunprotocol.org 3478
Binding test: success
Local address: 198.18.0.1:63061
Mapped address: 210.0.158.130:37126
Have fun!
We have created an IPsec tunnel using strongSwan as follows:
server (eth interface - 13.13.7.13) --> client (eth interface - 13.13.7.18)
When the IKEv2 phase 1 and phase 2 message exchanges happen, the source and destination IPs are the IP addresses assigned to the eth interfaces (confirmed via Wireshark), and the ISAKMP message exchange completes successfully.
1) When I started transmitting data via SCP between the client and the server, I noticed ESP and SSH packets.
The ESP packets show only a sequence number and no encrypted payload, while the SSH packets show an encrypted payload. But per the IPsec protocol, the data should be encrypted inside ESP itself. Why is there no payload info in the ESP packets?
FYI: I noticed continuous ESP packets after the ISAKMP exchange (negotiation and authentication done).
SSH and ESP Packets look like below:
**SSH Protocol**
SSH Version 2 (encryption:chacha20-poly1305@openssh.com mac:<implicit> compression:none)
Packet Length (encrypted): e78d1cd9
Encrypted Packet: 9679398c167c33ca6c1eecc4879e59d417b39545c80b0e40...
MAC: 27b594b6290dcdf3a09fd2fb84884cd7
**ESP Protocol**
Encapsulating Security Payload
ESP SPI: 0xc86cb75d (3362568029)
ESP Sequence: 19
The ESP payload is normally not visible in Wireshark directly. You need to enable the ESP preferences in Wireshark by providing the tunnel SPI, endpoint IPs, encryption/authentication keys, etc., as mentioned in the wiki.
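This matches the ESP packet layout: the SPI and the sequence number are the only cleartext fields at the start of the packet, and everything after them is ciphertext until keys are supplied. A minimal sketch of reading those two fields follows; the function name is hypothetical, and the sample bytes are synthetic, built from the SPI and sequence number shown in the question.

```python
import struct


def parse_esp_header(esp: bytes) -> tuple[int, int]:
    """Return (spi, sequence) from the 8 cleartext bytes that start an ESP packet."""
    if len(esp) < 8:
        raise ValueError("ESP header needs at least 8 bytes")
    return struct.unpack("!II", esp[:8])


packet = struct.pack("!II", 0xC86CB75D, 19) + b"\x00" * 16  # placeholder ciphertext
spi, seq = parse_esp_header(packet)
print(hex(spi), seq)  # 0xc86cb75d 19
```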
In a typical case, I have two endpoints, A and B, and a TURN server, S. A initiated a call and sent its host and relay candidates to B in the SDP. B answered the call and sent only a host candidate in the SDP.
Let's say A's candidates are:
host: 192.168.1.150:5555
relay: 192.168.1.100:7890
B's host candidate is
host: 192.168.1.151:5690
Say the TURN server address is:
192.168.1.100:3478
Now I am about to start ICE connectivity checks from A towards B.
First I tried a connectivity check from A's host candidate to B's host candidate. It timed out, which is expected.
Next I am about to try a connectivity check from A's relayed candidate to B's host candidate. My question is: when A sends the connectivity check (a STUN Binding request), which transport address does it send it to?
The possible cases are:
1) A sends from its host transport address to the TURN server at 192.168.1.100:3478
2) A sends from its host transport address to A's relay candidate at 192.168.1.100:7890
Which of these is correct per the ICE standard?
A will send from the random local UDP port it previously used when allocating the relay candidate on the TURN server, to 192.168.1.100:3478. This will usually be a Send indication containing the ICE Binding request and specifying B's host candidate as the destination. The TURN server will then send it from port 7890 to B's host candidate.
In your case, it's likely that ICE will not succeed. A will send from its host transport address to the TURN server at 192.168.1.100:3478, and the server will then try to forward the packet from port 7890 as raw data (not TURN-encapsulated). But since the peer's host address is behind a NAT, the packet will not reach B. B will also try to send packets to A's relay candidate, but the server will not allow them: A set a permission for B's 192.168.1.151, whereas the server sees B's public address, which has no permission.