Netty loses udp packets at the beginning of the communication

Netty loses udp packets at the beginning of the communication - udp

I have a strange problem with Netty 4.0.27. I'm building a system which have to receive large number of incoming udp packet (all packet is 28 bytes long) and save them to a file. I implemented it and I also developed a simulator which send packets to the server.
I experienced that Netty server has a huge packet loss at the beginning of the communication, but after few thousand packet, the server received all packets successfully. Unfortunately, I can't afford this amount of packet loss (for example it loses 5-6 percent of first 200k packet).
I think, Netty adapts to high traffic, but I found nothing in the documentation. I also tried to use ChannelOption's SO_RCVBUF option but i had the same problem.

Related

How to set packet timeouts in Boost ASIO

I have an API server using Boost ASIO on Windows Server 2008R2 that works with one browser but not another. The browser that doesn't work gets issued an RST from the server. I presume that the time waiting for an ACK is the culprit since the time waiting for ACK response is almost twice the response time (2.5 seconds) of the browser that works (before sending an RST).
In researching this issue is mainly done with Wireshark and examining the packet timing and RST/ACK replies.
The question is, is there a way to set the timeout in ASIO/Winsock to wait a little longer for the ACK packet and before issuing an RST?

Fragmented UDP packet loss?

We have an application doing udp broadcast.
The packet size is mostly higher than the mtu so they will be fragmented.
tcpdump says the packets are all being received but the application doesn't get them all.
The whole stuff isn't happening at all if the mtu is set larger so there isn't fragmentation. (this is our workaround right now - but Germans don't like workarounds)
So it looks like fragmentation is the problem.
But I am not able to understand why and where the packets get lost.
The app developers say they can see the loss of the packets right at the socket they are picking them up. So their application isn't losing the packets.
My questions are:
Where is tcpdump monitoring on linux the device?
Are the packets there already reassembled or is this done later?
How can I debug this issue further?

tcpdump uses libpcap which gets copies of packets very early in the Linux network stack. IP fragment reassembly in the Linux network stack would happen after libpcap (and therefore after tcpdump). Save the pcap and view with Wireshark; it will have better analysis features and will help you find any missing IP fragments (if there are any).

netTcp async callbacks executing very slowly over the internet?

I have a duplex voice and video chatting service hosted in IIS 7 with netTcpBinding. The consumer of the service is a Silverlight client. The client invokes a method on the service: SendVoiceAsync(byte[] buffer) 10 times a second (every 100ms), and the service should call the client back also 10 times a second (in resposnse to every call). I've tested my service rigorously in a LAN and it works very well, for every 100 calls sending the voice/video buffer to the service, the service calls the other clients back 100 times with received voice buffers in that time period. However when I consume the service over the internet it become horrifically slow, lags terribly and often gives me a timeout error. I've noticed that over HTTP the callbacks are being received at a rate of about 1/10th of each call to the server, so for every 100 calls to the service, the server calls the clients back 10 times, when this number should be 100 (as it is in LAN) or something very close to it.
So what could be causing the service to become so laggy over HTTP? I've followed the general guidelines on configuring netTcpBinding for optimised performance, and while it seems to pay dividends in LAN its terrible over the internet. It feels as if something is blocking the client from sending their replies to the service, though I've disabled all firewalls, and forwarded ports 4502-4535 and port 80 on which the website hosting the service resides to the server computer. If it helps, my service has ConcurrencyMode set to Multiple, and it's InstanceContextMode set to Single. Also my service operations are all one way and not request-reply.
Thanks for any help.

The internet is a much more noisy and difficult network than LAN: packets might get lost, re-routed via different routers/switch and the latency is generally pretty bad.
That's why TCP exists, it is an ordered, reliable protocol so every packet is acknowledged by the receiver and re-sent if it didn't make it.
The problem with that is it does not try to be fast, it tries to get all of the data sent across and in the order it was sent.
So I'm not surprised that your setup works in LAN, as LAN Round Trip Times (RTT) are usually about 5 ~ 80 ms, but fails over the internet where an RTT of 250 ms is normal.
You can try to send your data less often, and switch to UDP which is non-ordered and unreliable but faster than TCP. I think UDP is standard for voice/video over the internet as you can compensate for the lost packets with a minor degradation of the voice/video quality.Online games suffer from the same issue, for example the original Quake was unplayable over the internet.

Why is SNMP usually run over UDP and not TCP/IP?

This morning, there were big problems at work because an SNMP trap didn't "go through" because SNMP is run over UDP. I remember from the networking class in college that UDP isn't guaranteed delivery like TCP/IP. And Wikipedia says that SNMP can be run over TCP/IP, but UDP is more common.
I get that some of the advantages of UDP over TCP/IP are speed, broadcasting, and multicasting. But it seems to me that guaranteed delivery is more important for network monitoring than broadcasting ability. Particularly when there are serious high-security needs. One of my coworkers told me that UDP packets are the first to be dropped when traffic gets heavy. That is yet another reason to prefer TCP/IP over UDP for network monitoring (IMO).
So why does SNMP use UDP? I can't figure it out and can't find a good reason on Google either.

UDP is actually expected to work better than TCP in lossy networks (or congested networks). TCP is far better at transferring large quantities of data, but when the network fails it's more likely that UDP will get through. (in fact, I recently did a study testing this and it found that SNMP over UDP succeeded far better than SNMP over TCP in lossy networks when the UDP timeout was set properly). Generally, TCP starts behaving poorly at about 5% packet loss and becomes completely useless at 33% (ish) and UDP will still succeed (eventually).
So the right thing to do, as always, is pick the right tool for the right job. If you're doing routine monitoring of lots of data, you might consider TCP. But be prepared to fall back to UDP for fixing problems. Most stacks these days can actually use both TCP and UDP.
As for sending TRAPs, yes TRAPs are unreliable because they're not acknowledged. However, SNMP INFORMs are an acknowledged version of a SNMP TRAP. Thus if you want to know that the notification receiver got the message, please use INFORMs. Note that TCP does not solve this problem as it only provides layer 3 level notification that the message was received. There is no assurance that the notification receiver actually got it. SNMP INFORMs do application level acknowledgement and are much more trustworthy than assuming a TCP ack indicates they got it.

If systems sent SNMP traps via TCP they could block waiting for the packets to be ACKed if there was a problem getting the traffic to the receiver. If a lot of traps were generated, it could use up the available sockets on the system and the system would lock up. With UDP that is not an issue because it is stateless. A similar problem took out BitBucket in January although it was syslog protocol rather than SNMP--basically, they were inadvertently using syslog over TCP due to a configuration error, the syslog server went down, and all of the servers locked up waiting for the syslog server to ACK their packets. If SNMP traps were sent over TCP, a similar problem could occur.
http://blog.bitbucket.org/2012/01/12/follow-up-on-our-downtime-last-week/

Check out O'Reilly's writings on SNMP: https://library.oreilly.com/book/9780596008406/essential-snmp/18.xhtml
One advantage of using UDP for SNMP traps is that you can direct UDP to a broadcast address, and then field them with multiple management stations on that subnet.

The use of traps with SNMP is considered unreliable. You really should not be relying on traps.
SNMP was designed to be used as a request/response protocol. The protocol details are simple (hence the name, "simple network management protocol"). And UDP is a very simple transport. Try implementing TCP on your basic agent - it's considerably more complex than a simple agent coded using UDP.
SNMP get/getnext operations have a retry mechanism - if a response is not received within timeout then the same request is sent up to a maximum number of tries.

Usually, when you're doing SNMP, you're on a company network, you're not doing this over the long haul. UDP can be more efficient. Let's look at (a gross oversimplification of) the conversation via TCP, then via UDP...
TCP version:
client sends SYN to server
server sends SYN/ACK to client
client sends ACK to server - socket is now established
client sends DATA to server
server sends ACK to client
server sends RESPONSE to client
client sends ACK to server
client sends FIN to server
server sends FIN/ACK to client
client sends ACK to server - socket is torn down
UDP version:
client sends request to server
server sends response to client
generally, the UDP version succeeds since it's on the same subnet, or not far away (i.e. on the company network).
However, if there is a problem with either the initial request or the response, it's up to the app to decide. A. can we get by with a missed packet? if so, who cares, just move on. B. do we need to make sure the message is sent? simple, just redo the whole thing... client sends request to server, server sends response to client. The application can provide a number just in case the recipient of the message receives both messages, he knows it's really the same message being sent again.
This same technique is why DNS is done over UDP. It's much lighter weight and generally it works the first time because you are supposed to be near your DNS resolver.

Where the datagrams are if a client does not listen to a UDP port?

Suppose a client sends a number of datagrams to a server through my application. If my application on the server side stops working and cannot receive any datagrams, but the client still continues to send more data grams to the server through UDP protocol, where are those datagrams going? Will they stay in the server's OS data buffer (or something?)
I ask this question because I want to know that if a client send 1000 datagrams (1K each) to a PC over the internet, will those 1000 datagrams go through the internet (consuming the bandwidth) even if no one is listening to those data?
If the answer is Yes, how should I stop this happening? I mean if a server stops functioning, how should I use UDP to get to know the fact and stops any further sending?
Thanks

I ask this question because I want to know that if a client send 1000 datagrams (1K each) to a PC over the internet, will those 1000 datagrams go through the internet (consuming the bandwidth) even if no one is listening to those data?
Yes
If the answer is Yes, how should I stop this happening? I mean if a server stops functioning, how should I use UDP to get to know the fact and stops any further sending?
You need a protocol level control loop i.e. you need to implement a protocol to take care of this situation. UDP isn't connection-oriented so it is up to the "application" that uses UDP to account for this failure-mode.

UDP itself do not provide facilities to determine if message is successfully received by a client or not. You need you TCP to establish reliable connection and after it sends data over UDP.

The lowest overhead solution would be a keep-alive type thing like jdupont suggested. You can also change to use tcp, which provides this facility for you.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas