I'm currently working a network protocol which includes a client-to-client system with auto-discovering of clients on the current local network.
Right now, I'm periodically broadsting over 255.255.255.255 and if a client doesn't emit for 30 seconds I consider it dead (then offline). The goal is to keep an up-to-date list of clients runing. It's working well using UDP, but UDP does not ensure that the packets have been sucessfully delivered. So when it comes to the WiFi parts of the network, I sometimes have "false postivives" of dead clients. Currently I've reduced the time between 2 broadcasts to solve the issue (still not working well), but I don't find this clean.
Is there anything I can do to keep a list of "online" clients without this risk of "false positives" ?
To minimize the false positives, due to dropped packets you should alter a little bit the logic of your heartbeat protocol.
Rather than relying on a single packet broadcast per N seconds, you can send a burst 3 or more packets immediately one after the other every N seconds. This is an approach that ping and traceroute tools follow. With this method you decrease significantly the probability of a lost announcement from a peer.
Furthermore, you can specify a certain number of lost announcements that your application can afford. Also, in order to minimize the possibility of packet loss using wireless network, try to minimize as much as possible the size of the broadcast UDP packet.
You can turn this over, so you will broadcast "ServerIsUp" message
and every client than can register on server. When client is going offline it will unregister, otherwise you can consider it alive.
Related
I have an embedded device (source) which is sending out a stream of (audio) data in chunks of 20 ms (= about 330 bytes) by means of a UDP packets. The network volume is thus fairly low at about 16kBps (practically somewhat more due to UDP/IP overhead). The device is running the lwIP stack (v1.3.2) and connects to a WiFi network using a WiFi solution from H&D Wireless (HDG104, WiFi G-mode). The destination (sink) is a Windows Vista PC which is also connected to the WiFi network using a USB WiFi dongle (WiFi G-mode). A program is running on the PC which allows me to monitor the amount of dropped packets. I am also running Wireshark to analyze the network traffic directly. No other clients are actively sending data over the network at this point.
When I send the data using broadcast or multicast many packets are dropped, sometimes upto 15%. However, when I switch to using UDP unicast, the amount of packets dropped is negligible (< 2%).
Using UDP I expect packets to be dropped (which is OK in my Audio application), but why do I see such a big difference in performance between Broadcast/Multicast and unicast?
My router is a WRT54GS (FW v7.50.2) and the PC (sink) is using a trendnet TEW-648UB network adapter, running in WiFi G-mode.
This looks like it is a well known WiFi issue:
Quoted from http://www.wi-fiplanet.com/tutorials/article.php/3433451
The 802.11 (Wi-Fi) standards specify support for multicasting as part of asynchronous services. An 802.11 client station, such as a wireless laptop or PDA (not an access point), begins a multicast delivery by sending multicast packets in 802.11 unicast data frames directed to only the access point. The access point responds with an 802.11 acknowledgement frame sent to the source station if no errors are found in the data frame.
If the client sending the frame doesnt receive an acknowledgement, then the client will retransmit the frame. With multicasting, the leg of the data path from the wireless client to the access point includes transmission error recovery. The 802.11 protocols ensure reliability between stations in both infrastructure and ad hoc configurations when using unicast data frame transmissions.
After receiving the unicast data frame from the client, the access point transmits the data (that the originating client wants to multicast) as a multicast frame, which contains a group address as the destination for the intended recipients. Each of the destination stations can receive the frame; however, they do not respond with acknowledgements. As a result, multicasting doesnt ensure a complete, reliable flow of data.
The lack of acknowledgments with multicasting means that some of the data your application is sending may not make it to all of the destinations, and theres no indication of a successful reception. This may be okay, though, for some applications, especially ones where its okay to have gaps in data. For instance, the continual streaming of telemetry from a control valve monitor can likely miss status updates from time-to-time.
This article has more information:
http://hal.archives-ouvertes.fr/docs/00/08/44/57/PDF/RR-5947.pdf
One very interesting side-effect of the multicast implementation (at the WiFi MAC layer) is that as long as your receivers are wired, you will not experience any issues (due to the acknowledgement on the receiver side, which is really a unicast connection). However, with WiFi receivers (as in my case), packet loss is enormous and completely unacceptable for audio.
Multicast does not have ack packets and so there is no retransmission of lost packets. This makes perfect sense as there are many receivers and it's not like they can all reply at the same time (the air is shared like coax Ethernet). If they were all to send acks in sequence using some backoff scheme it would eat all your bandwidth.
UDP streaming with packet loss is a well known challenge and is usually solved using some type of forward error correction. Recently a class known as fountain codes, such as Raptor-Q, shows promise for packet loss problem in particular when there are several unreliable sources for the data at the same time. (example: multiple wifi access points covering an area)
I am creating a server on a ST Cortex M3 device. I am using the lwip API and FreeRTOS. All is working, but the response time is way off. I am currently using lwip 1.3.2 and FreeRTOS 7.3.
A single client connects to the server and must have some time-critical data sent frequently. These packets are on the order of 6 or so bytes. Other times, I am sending upwards of 20K.
The problem I am having is that these smaller packets seem to be taking forever to be sent. I assume this is because lwip is waiting for more data to be enqueued to make more efficient transmissions. I cannot wait around for 2 or 3 seconds for the data to be sent; the client is expecting the data nominally in a few micro-seconds or milli-seconds.
I have tried using lwip_send and lwip_write. (I understand that one is the same as the other with a flag passed at the end. Just had to try...) I have tried setting TCP_NODELAY on the socket to no avail. I tried to set SO_SNDLOWAT to '1', but this always returned -1, so I do not think it is supported.
I do not want to redo all of my code using TCP RAW. Is there a way to invoke the tcp_output() function outside of TCP RAW mode? Is there any way to speed things up or is this just how slow lwip TCP with small packets is?
Any and all suggestions are welcome. Thanks.
--EDIT--
I would also like to add that once I am ready to transmit, I make sure that my TX task in FreeRTOS is at the highest priority. There are no other tasks running up to the point at which I call lwip_send/write.
I'm fairly experienced with bare metal lwIP on xilinx and lwip does not wait to send things out. It will pump packets out as fast as your interrupts are acknowledged based on the ethernet hardware. I've been using UDP only. What is coming to mind though, is your problem might be on the receive end. If you are doing TCP, maybe those small packets are coming out late because you are having receive issues. What you need to do is find in the code the lowest level point at which ethernet is transmit, put a general purpose output toggle on that. Then also put a general purpose output toggle on when a ethernet packet is received. Look at the signals on a scope. If it confirms your hypothesis, then move the output toggles around to narrow down the issue. Wash, rinse and repeat until you are down to where the issue its. It's crude and time consuming, but oftentimes this brute force approach solves many "impossible" embedded software problems, due to pure determination. Good luck!
On Redhat Linux, I have a multicast listener listening to a very busy multicast data source. It runs perfectly by itself, no packet losses. However, once I start the second instance of the same application with the exactly same settings (same src/dst IP address, sock buffer size, user buffer size, etc.) I started to see very frequent packet losses from both instances. And they lost exact the same packets. If I stop the one of the instances, the remaining one returns to normal without any packet loss.
Initially, I though it is the CPU/kernel load issue, maybe it could not get the packets out of buffer quickly enough. So I did another test. I still keep one instance of the application running. But then started a totally different multicast listener on the same computer but use the second NIC card and listen to a different but even busier multicast source. Both applications run fine without any packet loss.
So it looks like one NIC card is not powerful enough to support two multicast applications, even though they listen to exact the same thing. The possible cause to the packet loss problem might be that, in this scenario, the NIC card driver needs to copy the incoming data to two sock buffers, and this extra copy task is too much for the ether card to handle so it drops packets. Any deeper analysis on this issue and any possible solutions?
Thanks
You are basically finding out that the kernel is inefficient at fan-out of multicast packets. Worst case scenario the code is for every incoming packet allocating two new buffers, the SKB object and packet payload, and copying the NIC buffer twice.
Pick the best case scenario, for every incoming packet a new SKB is allocated but the packet payload is shared between the two sockets with reference counting. Now imagine what happens when two applications, each on their own core and on separate sockets. Every reference to the packet payload is going to cause the memory bus to stall whilst both core caches have to flush and reload, and above that each application is having to kernel context switch back and forth to pass the socket payload. The result is terrible performance.
You aren't the first to encounter such a problem and many vendors have created solutions to it. The basic design is to limit the incoming data to one thread on one core on one socket, then have that thread distribute the data to all other interested threads, preferably using user space code building upon shared memory and lockless data structures.
Examples are TIBCO's Rendezvous and 29 West's Ultra Messaging showing a 660ns IPC bus:
http://www.globenewswire.com/newsroom/news.html?d=194703
I am developing an application that sends data per UDP using AsyncUDPSocket class to another client on Mac and Windows. It is very important that packets arrive instantly.
The problem is that every approx. 1000 packets I get a delay for about 2 seconds when receiving Packets. A delay of 100-200 ms would be OK, but 2 seconds produce bad user experience.
I have the UDP communication in a separate Thread, so it is little affected by user interaction with UI and such. I have already tried sending Packets faster, slower, different Packet sizes: the delay stays there. Tried using TCP instead of UDP - same result :(
It does not seem to happen on Windows Cliets.
Maybe there is some system buffer in MacOS that needs to be flushed every time it hast N packets or N bytes of data???
Has anyone an idea how can I prevent the delay from happening?
There are a lot of things that can slow down a network program temporarily, it's hard to know where to start. Have you tried this on multiple networks? Both wireless and ethernet networks? What kind of switch do you have? Does this happen on different OS X computers, or just on one? Can you reproduce the delay with a simpler command line program? Are you using garbage collection? (Grasping at straws here...)
Just out of curiosity, I tested the roundtrip time on UDP echo packets sent from my Mac to another computer on the same LAN. Out of over 60,000 UDP packets with a 1,000 byte payload, none of them took longer than 32 ms, the mean round trip was 0.6 ms, and the sample deviation was 0.21.
(I'm also curious what you need such low latency for.)
I'm attempting to send multichannel audio over WiFi from one server to multiple client computers using UDP broadcast on a private network.
I'm using software called Pure Data, with a UDP broadcast tool called netsend~ and netreceive~. The code is here:
http://www.remu.fr/sound-delta/netsend~/
To cut a long story short, I'm able to achieve sending 9 channels to one client computer in a point-to-point network, but when I try to do broadcast to 2 clients (haven't yet tried more), I get no sound. I can compress the audio and send 4 channels compressed (about 10% size of uncompressed) over UDP broadcast to 2 clients successfully. Or I can send 1 channel over UDP broadcast to 2 clients, with some glitches.
The WiFi router is a Linksys WRT300N. All computers are running Windows XP. The IP addresses are 192.168.1.x, with subnet mask 255.255.255.0 and the subnet broadcast address: 192.168.1.255.
I'm curious - what happens to UDP broadcast packets in the router?
If I have a subnet mask of 255.255.255.0, then does the router make 254 packets for every packet sent ot the broadcast address?
My WiFi bandwidth is at least 100Mbps, but I can't seem to send audio of more than around 10Mbps over UDP broadcast to multiple clients.
What's stopping me from sending audio up to the bandwidth limit of the WiFi?
any suggestions for socket code modifications, network setups, router setups, subnet modifications... all very much appreciated!
thanks
Nick
Your problem is caused by the access point's rate control algorithm. With unicast the access point tracks what data rate every particular receiver can reliably receive at, and sends about that rate. With multicast the access point does not know which receivers are interested in the data, so simple access points send the data at the slowest possible rate (1Mb/s). Better implemented access points may send the data at the rate that the slowest connected client is using, and the best access points use IGMP snooping to see who's receiving each IP multicast stream, and they will choose the slowest rate out of the receivers for that stream.
The simplest solution is to not use multicast when you have a small number of WiFi receivers.
Are all parties connected via WiFi or is the sender using a
wired connection to the Access Point? Broadcast data will
be transmitted as unicast data from a station to an access
point and the access point will then retransmit the data
as broadcast/multicast traffic so it will use twice the
on-air bandwidth compared to when the sender sits on the
wired side of the AP.
When sending a unicast frame the AP will wait for an ACK
from the receiving station and it will retransmit the
frame until the ACK arrives (or it times out). Broadcast/multicast
frames are not ACKed and therefore not retransmitted.
If you have a busy/noisy radio environment this will
cause the likelyhood of dropped packets to increase,
potentially a lot, for multicast traffic compared to unicast
traffic. In an audio application this could certainly be audible.
Also, IIRC, broadcast/multicast traffic does not use the
RTS/CTS procedure for reserving the media which exarbates
the dropped packets problem.
It could actually be the case that multiple unicast streams
work better than a single multicast stream under less-than-ideal
radio conditions given that the aggregated bandwidth is
high enough.
If you can I would suggest that you use wireshark to sniff
the WiFi traffic and take a look at the destination address
in the 802.11 header. Then you can verify if the packets
are actually broadcast or not over the air.
Your design is failing due to a common misconception with WiFi speeds. With 802.11n the number 300mb/s is the link speed, not the actual bandwidth available for user data or even the IP layer. The effective bandwidth is closer to 40mb/s best case, have a look at the FAQ on SmallNetBuilder.com that discusses this in further detail.
http://www.smallnetbuilder.com/wireless/wireless-basics/31083-smallnetbuilders-wireless-faq-the-essentials
I'm curious - what happens to UDP broadcast packets in the router? If I have a subnet mask of 255.255.255.0, then does the router make 254 packets for every packet sent ot the broadcast address?
No the "router" doesn't make 254 individual packets. Furthermore, I suspect the protocol leverages "multicast" addresses rather than using a "broadcast" address.
Since broadcast/multicast traffic can easily be misused, there are many networking equipment that limit/block by default such traffic. Of course, some essential protocols (e.g. ARP, DHCP) rely on broadcast/multicast addresses to function and won't be blocked by default.
Hence, it might be a good thing to check for multicast/broadcast control knobs on your router.