Missing UDP fragments when monitoring traffic with tcpdump - udp

I'm on a local LAN with only 8 connected computers using a netgear 24 port gigabit switch, network load is really low and send/receive buffers on all involved nodes(running slackware 11) have been set to 16mb. I'm also running tcpdump on each node to monitor the traffic.
A sending node sends a 10044byte large UDP packet which more often than not (3/4 times) does not end up in the receiving side application, in these cases I notice(using tcpdump) that the first x fragments are missing and only the last 3 (all with offsets > 0 and in order) are caught by tcpdump. The fragmented UDP package can therefore not be reassembled and is most likely thrown away.
I find the missing fragments strange since I have also tried a simple load test bursting out 10000 UDP messages of the same size, the receiving application sends a response and all tests so far gives 100% responses back.
Any clues or hints?

Update!
After resuming the testing of the above mentioned software I found a repeatable way of recreating the error.
Using windump on the sending windows machine, and tcpdump on the receiving machine, after having left the application idle for some time(~5 minutes), I tried sending the udp message but only end up with a single fragment caught by windump and tcpdump, the 3 remaining fragments are lost. Sending the same message one more time works fine and booth windump and tcpdump catches all 4 fragments and the application on the receiving side gets the message. The pattern is repeatable.
Started searching and found the following information, but to me, still not a clear answer.
http://www.eggheadcafe.com/software/aspnet/32856705/first-udp-message-to-a-sp.aspx
Re examining the logs I now notice the ARP request/reply being sent, which matches one of the ideas given in the link above.
NOTE! I filter windump on the sending side using: "dst host receivernode"
Capture from windump: first failed udp message, should be 4 fragments long
14:52:45.342266 arp who-has receivernode tell sendernode
14:52:45.342599 IP sendernode> receivernode : udp
Capture from windump: second udp message, exactly the same contents, all 4 fragments caught
14:52:54.132383 IP sendernode.10104 > receivernode .10113: UDP, length 6019
14:52:54.132397 IP sendernode> receivernode : udp
14:52:54.132406 IP sendernode> receivernode : udp
14:52:54.132414 IP sendernode> receivernode : udp
14:52:54.132422 IP sendernode> receivernode : udp
14:52:56.142421 arp reply sendernode is-at 00:11:11:XX:XX:fd (oui unknown)
Anyone who has a good idea about whats happening? please elaborate!

Related

tcpreplay traffic not being seen in localhost with netcat

I have a pcap file that I modified with tcprewrite to set source and destination IP = 127.0.0.1, while the port numbers are different. I also set both mac addresses to 00:00:00:00:00:00 as I understand that comms over localhost ignore MAC. I made sure checksum was fixed.
When I run tcpreplay -i lo test-lo.pcap in one shell, and tcpdump -i lo -p udp port 50001 in another, I see the traffic. Yet, when I try to view the traffic with netcat -l -u 50001, it sees nothing. Wireshark is capturing the traffic correctly.
Side note: I'm seeing the following warning when running tcpreplay on localhost:
Warning: Unsupported physical layer type 0x0304 on lo. Maybe it works, maybe it won't. See tickets #123/318 That seems worrisome.
I'm asking because my own UDP listener code is also having the same problem as netcat and thought that maybe I'm missing something. Why would traffic be seen by tcpdump and wireshark, and not by netcat?
I'm asking because my own UDP listener code is also having the same problem as netcat and thought that maybe I'm missing something. Why would traffic be seen by tcpdump and wireshark, and not by netcat?
Look at this image of the kernel packet flow from wikipedia:
As you can see, there are different places along the path where packets can be accessed. Wireshark uses libpcap, which uses an AF_PACKET socket to see packets. Your UDP listener, like netcat, uses regular user-space sockets. Let's highlight both on this image. Wireshark obtains packets via the red path, netcat via the purple one:
As you can see, there is a whole sequence of steps packets have to go through in the kernel to get to a local process socket. These steps include bridging, routing, filtering etc. Which step drops your packets? I don't know. You can try tweaking the packets and maybe you'll get lucky.
If you want a more systematic approach, use a tool like dropwatch. It hooks into the kernel and shows you counters of where the kernel drops packets.

UDP hole punching between two clients on one machine

I do UDP hole punching using the following method: I have a lobby server L, and two clients A and B behind a (shared) NAT.
Now, A and B are running on the same machine. They both send a datagram to server L.
Server L tells both A and B the IP+PORT of the other.
Note that the IPs of A and B that the server sees are identical, but the ports are different, as expected.
Then A and B send a datagram to each other, using the server provided addr+port.
Yet, their datagrams to each other never arrive.
My question: does UDP hole punching work if both clients are on the same machine? What if they are just on the same LAN, behind the same NAT?
NOTE: I tried to lower the strictness of my router, but Archer C7 does not seem to have a selection for Cone/Symmetric/Asymmetric unfortunately. I did switch off Stateful Packet Inspection.
UPDATE: When I try sending punch datagrams, I do see this come by over the network device:
ICMP dest unrch (port)
UPDATE: stunclient output:
$ stunclient --mode full stunserver.stunprotocol.org
Binding test: success
Local address: 10.0.1.2:49703
Mapped address: 209.161.250.218:49703
Behavior test: success
Nat behavior: Endpoint Independent Mapping
Filtering test: success
Nat filtering: Address and Port Dependent Filtering

UDP Packet loss at Sender

I have an application that sends logs to a log server residing on a different machine. These logs are sent as UDP packets. My problem is that i see UDP log loss. I have a running sequence number in the UDP packets which is printed along with the log message. I saw gaps in the sequence numbers at the log server. At first i suspected my log server to be the culprit but looks like the problem is at the sender machine dropping packets. The environment is:
Both sender and receiver running on HP G8 servers with CentOS 6.4 (2.6.32-358.el6.i686). Here are the things that I have already tried or observed:
The SndbufErrors and RcvbufErrors are 0 both on sender and receiver (/proc/net/snmp)
The ifconfig does not show any errors
The switch to which these machines are connected does not report any drops/erros
Running tcpdump on the sender side (with increased buffer for tcpdump so that the tcpdump reports 0 packets dropped) reveals that the packets missed by the receiver are infact not sent (or at least those packets does not show up on the captupred pcaps)
The maximum rate of the packet's sent is around 30K per second (the rate is not constant. normally it stays at very low around 500 per second with occasional spikes. The packets are lost during these spikes). The interface is 100Mbps link.
The size of the packet varies from around 80 bytes to 300 bytes.
The ring buffer on the NICs are:
Ring parameters for eth0:
Pre-set maximums:
RX: 2047
RX Mini: 0
RX Jumbo: 0
TX: 511
Current hardware settings:
RX: 200
RX Mini: 0
RX Jumbo: 0
TX: 511
NIC driver info:
driver: tg3
version: 3.124
firmware-version: 5719-v1.38 NCSI v1.2.37.0
bus-info: 0000:03:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
The NIC stats also shows no errors (throug ethtool -S eth0)
If i run my log server on the same machine as sender, there is no log loss.
I am clueless as where these packets are getting dropped (stack/NIC/NW...). I know UDP is un-reliable and packets can be dropped due to multiple reasons. But I am not still not able to find that reason as to why these packets are being dropped in my case at the sender side. On a different sidenote my application has also many TCP connections (around 8-10) with traffic on each connection.

tunneling using SSH

I'm tunneling all of my internet traffic through a remote computer hosting Debian using sshd. But my internet connection becomes so slow (something around 5 to 10 kbps!). Can be anything wrong with the default configuration to cause this problem?
Thanks in advance,
Tunneling TCP within another TCP stream can sometimes work -- but when things go wrong, they go wrong very quickly.
Consider what happens when the "real world" loses one of your TCP packets: after a certain amount of not getting an ACK packet back in response to new data packets, the sending side realizes a packet has gone missing and re-sends the data.
If that packet happens to be a TCP packet whose payload is another TCP packet, then you have two TCP stacks that are upset about their missing packet. The tunneled TCP layer will re-send packets and the outer TCP layer will also resend packets. This causes a giant pileup of duplicate packets that will eventually be delivered and must be dropped on the floor -- because the outer TCP reliably delivered the packet, eventually.
I believe you would be much better served by a more dedicated tunneling method such as GRE tunnels or IPSec.
Yes, tunelling traffic over tcp connection is not a good idea. See http://sites.inka.de/bigred/devel/tcp-tcp.html

UDP port 0.0.0.0

I have a system that is running on windows.
I have in that system a process that waits for another process on the same machine for a udp message. The message itself is not important (garbage), but the important thing is that I got the event of the message itself.
The problem is that it seems that I get from another local program a UDP message and I don't know from where. I added information about the sender in the recieved UDP message. I see that I get message from valid local port but also from the addres 0.0.0.0 .
I can't understand the 0.0.0.0 . Does anyone has an idea ?
A computer without an assigned IP address could send such packet, even across the network - see e.g. a similar mechanism in DHCP, where the DHCP discovery packet is sent with source address of 0.0.0.0
On a local computer, could this be that the packet is sent (and received) on an interface that is up but without an IP address?
Also, this can mean "broadcast" - if this article on e2 is correct, it is a deprecated method of making a broadcast packet, but apparently it was never removed.
Because it is a udp message and using async type, when reading messages that arrive from the other program I can't know when stop reading, when I get reading the message and I get 0.0.0.0 it means I read everything from the UDP buffer from OS.