UDP hole punching logic puzzle - udp

I am trying to solve a logical puzzle in my UDP hole punching implementation.
The puzzle is the following: "Can I guarantee that two clients I am trying to connect will come to the same conclusion (hole punched/hole not punched) within a reasonable time (ideally no more than a few seconds after they were given each other's IP addresses)?"
With UDP hole punching, both clients have to start sending "punch" packets at around the same time. Some of these initial packets will be lost because of the NAT/firewall; this is expected. Then at some point client #1's message gets through. Even in this optimistic scenario, that does not mean that the message client #2 sent also gets through. The clients then have to reply with "ack" messages to confirm that the connection has been established. This is where the timing issue also comes into play: one of the clients receives an ack before the timeout, while the other does not, and they come to different conclusions.
I also tried to make the logic more complex: keeping track of how many acks each client sent, giving each newly sent ack a unique number, plus keeping track of how many different acks a given client received. Still, one client reaching the "success" condition does not mean that the other client comes to the same conclusion in a limited time, given that UDP packets can get lost. And from a usability point of view, I cannot keep the user waiting forever; I have to present the result ideally within 2 seconds after the server connects the two clients.
To be more concrete, I can have a loop:
while (within_timeout)
{
    // check for new data
    // reply with acks if received
    if (acks_sent >= 3 && acks_received >= 3) break;
}
success = (acks_sent >= 3 && acks_received >= 3);
Client #1 in our case knows that it sent at least 3 acks and received at least 3 acks, so it leaves the loop. Client #2 knows that it sent at least 3 acks (because client #1 received that many), but it may not have received all the acks sent by client #1 before its own timeout ends.
The server can also designate each client as "master" or "slave", with the master having the final word. Still, even then the master has to tell the slave its decision and expect an ack, which may not arrive within a reasonable timeout.
It may be that there is no 100% solution. Is there a solution that approaches 100%?
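The disagreement described above can be reproduced with a small simulation. This is only a simplified sketch of the loop in the question, not a hole punching implementation: the function name run_clients, the tick-based timing, and the drops set (which packets get lost) are all assumptions made for illustration.

```python
def run_clients(drops, max_ticks=10, needed=3):
    """Simulate two clients exchanging acks once per tick.

    drops is a set of (tick, sender) pairs whose packet is lost.
    Returns (client1_success, client2_success).
    """
    sent = {1: 0, 2: 0}
    received = {1: 0, 2: 0}
    done = {1: False, 2: False}
    for tick in range(max_ticks):
        for sender, receiver in ((1, 2), (2, 1)):
            if done[sender]:
                continue  # a client that left its loop no longer sends
            sent[sender] += 1
            if (tick, sender) not in drops and not done[receiver]:
                received[receiver] += 1
        for client in (1, 2):
            # Same exit condition as the loop in the question.
            if sent[client] >= needed and received[client] >= needed:
                done[client] = True
    return done[1], done[2]


# No loss: both clients agree on success.
print(run_clients(set()))
# Client 1's third ack is lost: client 1 succeeds and stops sending,
# so client 2 can never collect its third ack and times out.
print(run_clients({(2, 1)}))
```

In this model, whichever client exits first stops sending, so a single lost packet at the wrong moment is enough to leave the other side short of its threshold.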

Related

What happens to client message if Server does not exist in UDP Socket programming?

I ran only client.java. When I filled in the form and pressed the send button, it hung and I could not do anything.
Is there any explanation for this?
TLDR; the User Datagram Protocol (UDP) is "fire-and-forget".
Unreliable – When a UDP message is sent, it cannot be known if it will reach its destination; it could get lost along the way. There is no concept of acknowledgment, retransmission, or timeout.
So if a UDP message is sent and nobody listens then the packet is just dropped. (UDP packets can also be silently dropped due to other network issues/congestion.)
While there could be a prior-error such as resolving the IP for the server (eg. an invalid hostname) or attempting to use an invalid IP, once the UDP packet is out the door, it's out the door and is considered "successfully sent".
Now, if a program is waiting on a response that never comes (ie. the server is down or packet was "otherwise lost") then that could be .. problematic.
That is, this code which requires a UDP response message to continue would "hang":
sendUDPToServerThatNeverResponds();
// There is no guarantee the server will get the UDP message,
// much less that it will send a reply or the reply will get back
// to the client..
waitForUDPReplyFromServerThatWillNeverCome();
Since UDP has no reliability guarantee or retry mechanism, this must be handled in code. For example, in the above maybe the code would wait for 1 second and retry sending a packet, and after 5 seconds of no responses it would report an error to the client.
reply = null;
// Resend the request on every attempt; a lost request is never answered.
for (i = 0; i < 5 && !reply; i++) {
    sendUDPToServerThatMayOrMayNotRespond();
    reply = waitForUDPReplyForOneSecond();
}
if (reply)
    doSomethingAwesome();
else
    showErrorToUser();
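As a runnable illustration of the same idea, here is a minimal Python sketch using the standard socket module. The function name request_with_retries and its default values are assumptions for this example, and it does not deal with late replies from earlier attempts.

```python
import socket


def request_with_retries(addr, payload, attempts=5, timeout=1.0):
    """Send payload to addr over UDP, resending until a reply arrives.

    Returns the reply bytes, or None if every attempt timed out.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)  # recvfrom() raises socket.timeout after this
        for _ in range(attempts):
            sock.sendto(payload, addr)
            try:
                data, _peer = sock.recvfrom(2048)
                return data
            except socket.timeout:
                continue  # no reply within the timeout: resend
    return None
```

A caller would then branch on the result, for example showing an error to the user when request_with_retries(("192.0.2.1", 9999), b"hello") returns None (the address here is a placeholder).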
Of course, "just using TCP" can often make these sorts of tasks simpler due to the stream and reliability characteristics that the Transmission Control Protocol (TCP) provides. For example, the pseudo-code above is not very robust as the client must also be prepared to handle latent/slow UDP packet arrival from previous requests.
(Also, given the current "screenshot", the code might be as flawed as while(true) {} - make sure to provide an SSCCE and relevant code with questions.)

RabbitMQ security design to declare queues from server (and use from client)

I have a test app (my first with RabbitMQ) which runs on partially trusted clients (in that I don't want them creating queues on their own), so I will look into the security permissions of the queues and credentials that the clients connect with.
For messaging there are mostly one-way broadcasts from server to clients, and sometimes a query from server to a specific client (over which the replies will be sent on a replyTo queue which is dedicated to that client on which the server listens for responses).
I currently have a receive function on the server which looks out for "Announce" broadcast from clients:
agentAnnounceListener.Received += (model, ea) =>
{
    var body = ea.Body;
    var props = ea.BasicProperties;
    var message = Encoding.UTF8.GetString(body);
    Console.WriteLine(
        "[{0}] from: {1}. body: {2}",
        DateTimeOffset.FromUnixTimeMilliseconds(ea.BasicProperties.Timestamp.UnixTime).Date,
        props.ReplyTo,
        message);
    // create return replyTo queue, snipped in next code section
};
I am looking to create the replyTo queue in the above receive handler:
var result = channel.QueueDeclare(
    queue: ea.BasicProperties.ReplyTo,
    durable: false,
    exclusive: false,
    autoDelete: false,
    arguments: null);
Alternatively, I could store the received announcements in a database, and on a regular timer run through this list and declare a queue for each on every pass.
In both scenarios this newly created queue would then be used at a future point by the server to send queries to the client.
My questions are:
1) Is it better to create a reply queue on the server when receiving the message from the client, or, if I do it externally (on a timer), are there any performance issues with declaring queues that already exist (there could be thousands of endpoints)?
2) If a client starts to misbehave, is there any way that it can be booted (in the receive function I can look up how many messages per minute it sends and boot it if certain criteria are met)? Are there any other filters that can be defined prior to receive in the pipeline to kick clients who are sending too many messages?
3) In the above example my messages continuously come in on each run (the same old messages). How do I clear them out?
I think preventing clients from creating queues just complicates the design without much security benefit.
You are allowing clients to create messages. In RabbitMQ, it's not very easy to stop clients from flooding your server with messages.
If you want to rate-limit your clients, RabbitMQ may not be the best choice. It does rate-limiting automatically when the server starts to struggle with processing all the messages, but you can't set a strict per-client rate limit on the server using an out-of-the-box solution. Also, clients are normally allowed to create queues.
Approach 1 - Web App
Maybe you should try to use web application instead:
Clients authenticate with your server
To Announce, clients send a POST request to a certain endpoint, ie /api/announce, maybe providing some credentials that allow them to do so
To receive incoming messages, GET /api/messages
To acknowledge processed message: POST /api/acknowledge
When client acknowledges receipt, you delete your message from database.
With this design, you can write custom logic to rate-limit or ban clients that misbehave and you have full control of your server
Approach 2 - RabbitMQ Management API
If you still want to use RabbitMQ, you can potentially achieve what you want by using RabbitMQ Management API
You'll need to write an app that will query RabbitMQ Management API on timer basis and:
Get all the current connections, and check message rate for each of them.
If message rate exceed your threshold, close connection or revoke user's permissions using /api/permissions/vhost/user endpoint.
In my opinion, web app may be easier if you don't need all the queueing functionality like worker queues or complicated routing that you can get out of the box with RabbitMQ.
Here are some general architecture/reliability ideas for your scenario. Responses to your 3 specific questions are at the end.
General Architecture Ideas
I'm not sure that the declare-response-queues-on-server approach yields performance/stability benefits; you'd have to benchmark that. I think the simplest topology to achieve what you want is the following:
Each client, when it connects, declares an exclusive and/or autodelete anonymous queue. If the clients' network connectivity is so sketchy that holding open a direct connection is undesirable, do something similar to Alex's proposed "Web App" above, and have clients hit an endpoint that declares an exclusive/autodelete queue on their behalf, and closes the connection (automatically deleting the queue upon consumer departure) when a client doesn't get in touch regularly enough. This should only be done if you can't tune the RabbitMQ heartbeats from the clients to work in the face of network unreliability, or if you can prove that you need queue-creation rate limiting inside the web app layer.
Each client's queue is bound to a broadcast topic exchange, which the server uses to communicate broadcast messages (wildcarded routing key) or specifically targeted messages (routing key that only matches one client's queue name).
When the server needs to get a reply back from the clients, you could either have the server declare the response queue before sending the "response-needed" message, and encode the response queue in the message (basically what you're doing now), or you could build semantics in your clients in which they stop consuming from their broadcast queue for a fixed amount of time before attempting an exclusive (mutex) consume again, publish their responses to their own queue, and ensure that the server consumes those responses within the allotted time, before closing the server consume and restoring normal broadcast semantics. That second approach is much more complicated and likely not worth it, though.
Preventing Clients Overwhelming RabbitMQ
Things that can reduce the server load and help prevent clients DoSing your server with RMQ operations include:
Setting appropriate, low max-length thresholds on all the queues, so the amount of messages stored by the server will never exceed a certain multiple of the number of clients.
Setting per-queue expirations, or per-message expirations, to make sure that stale messages do not accumulate.
Rate-limiting specific RabbitMQ operations is quite tricky, but you can rate-limit at the TCP level (using e.g. HAProxy or other router/proxy stacks), to ensure that your clients don't send too much data, or open too many connections, at a time. In my experience (just one data point; if in doubt, benchmark!) RabbitMQ cares less about the count of messages ingested per time than it does the data volume and largest possible per-message size ingested. Lots of small messages are usually OK; a few huge ones can cause latency spikes. Otherwise, rate-limiting the bytes at the TCP layer will probably allow you to scale such a system very far before you have to re-assess.
Specific Answers
In light of the above, my answers to your specific questions would be:
Q: Should you create reply queues on the server in response to received messages?
A: Yes, probably. If you're worried about the queue-creation rate that happens as a result of that, you can rate-limit per server instance. It looks like you're using Node, so you should be able to use one of the existing solutions for that platform to have a single queue-creation rate limiter per node server instance, which, unless you have many thousands of servers (not clients), should allow you to reach a very, very large scale before re-assessing.
Q: Are there performance implications to declaring queues based on client actions? Or re-declaring queues?
A: Benchmark and see! Re-declares are probably OK; if you rate-limit properly you may not need to worry about this at all. In my experience, floods of queue-declare events can cause latency to go up a bit, but don't break the server. But that's just my experience! Everyone's scenario/deployment is different, so there's no substitute for benchmarking. In this case, you'd fire up a publisher/consumer with a steady stream of messages, tracking e.g. publish/confirm latency or message-received latency, rabbitmq server load/resource usage, etc. While some number of publish/consume pairs were running, declare a lot of queues in high parallel and see what happens to your metrics. Also in my experience, the redeclaration of queues (idempotent) doesn't cause much if any noticeable load spikes. More important to watch is the rate of establishing new connections/channels. You can also rate-limit queue creations very effectively on a per-server basis (see my answer to the first question), so I think if you implement that correctly you won't need to worry about this for a long time. Whether RabbitMQ's performance suffers as a function of the number of queues that exist (as opposed to declaration rate) would be another thing to benchmark though.
Q: Can you kick clients based on misbehavior? Message rates?
A: Yes, though it's a bit tricky to set up, this can be done in an at least somewhat elegant way. You have two options:
Option one: what you proposed: keep track of message rates on your server, as you're doing, and "kick" clients based on that. This has coordination problems if you have more than one server, and requires writing code that lives in your message-receive loops, and doesn't trip until RabbitMQ actually delivers the messages to your server's consumers. Those are all significant drawbacks.
Option two: use max-length, and dead letter exchanges to build a "kick bad clients" agent. The length limits on RabbitMQ queues tell the queue system "if more messages than X are in the queue, drop them or send them to the dead letter exchange (if one is configured)". Dead-letter exchanges allow you to send messages that are greater than the length (or meet other conditions) to a specific queue/exchange. Here's how you can combine those to detect clients that publish messages too quickly (faster than your server can consume them) and kick clients:
Each client declares its main $clientID_to_server queue with a max-length of some number, say X, that should never build up in the queue unless the client is "outrunning" the server. That queue has a dead-letter topic exchange of ratelimit or some constant name.
Each client also declares/owns a queue called $clientID_overwhelm, with a max-length of 1. That queue is bound to the ratelimit exchange with a routing key of $clientID_to_server. This means that when messages are published to the $clientID_to_server queue at too great a rate for the server to keep up, the messages will be routed to $clientID_overwhelm, but only one will be kept around (so you don't fill up RabbitMQ, and only ever store X+1 messages per client).
You start a simple agent/service which discovers (e.g. via the RabbitMQ Management API) all connected client IDs, and consumes (using just one connection) from all of their *_overwhelm queues. Whenever it receives a message on that connection, it gets the client ID from the routing key of that message, and then kicks that client (either by doing something out-of-band in your app; deleting that client's $clientID_to_server and $clientID_overwhelm queues, thus forcing an error the next time the client tries to do anything; or closing that client's connection to RabbitMQ via the /connections endpoint in the RabbitMQ management API--this is pretty intrusive and should only be done if you really need to). This service should be pretty easy to write, since it doesn't need to coordinate state with any other parts of your system besides RabbitMQ. You'll lose some messages from misbehaving clients with this solution, though: if you need to keep them all, remove the max-length limit on the overwhelm queue (and run the risk of filling up RabbitMQ).
Using that approach, you can detect spamming clients as they happen according to RabbitMQ, not just as they happen according to your server. You could extend it by also adding a per-message TTL to messages sent by the clients, and triggering the dead-letter-kick behavior if messages sit in the queue for more than a certain amount of time--this would change the pseudo-rate-limiting from "when the server consumer gets behind by message count" to "when the server consumer gets behind by message delivery timestamp".
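To make the queue setup for option two more concrete, the optional queue arguments could be assembled as below. The keys x-max-length, x-dead-letter-exchange and x-message-ttl are RabbitMQ's documented optional queue arguments; the function name client_queue_arguments and the default values are assumptions for illustration only, and the dictionaries would still have to be passed to whatever queue-declare call your client library provides.

```python
def client_queue_arguments(max_backlog=100, ratelimit_exchange="ratelimit",
                           message_ttl_ms=None):
    """Build queue arguments for the two per-client queues.

    Returns (main_queue_args, overwhelm_queue_args).
    """
    main_queue = {
        # At most max_backlog messages are kept; overflow is dead-lettered.
        "x-max-length": max_backlog,
        # Overflowing (or expired) messages are republished to this exchange.
        "x-dead-letter-exchange": ratelimit_exchange,
    }
    if message_ttl_ms is not None:
        # Optional: also dead-letter messages that wait too long.
        main_queue["x-message-ttl"] = message_ttl_ms
    # Keep a single "this client is too fast" marker per client.
    overwhelm_queue = {"x-max-length": 1}
    return main_queue, overwhelm_queue


main_args, overwhelm_args = client_queue_arguments(max_backlog=50)
print(main_args)
print(overwhelm_args)
```

With these settings each client stores at most max_backlog + 1 messages on the broker, which matches the "X+1 messages per client" bound described above.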
Q: Why do messages get redelivered on each run, and how do I get rid of them?
A: Use acknowledgements or noack (but probably acknowledgements). Getting a message in "receive" just pulls it into your consumer, but doesn't pop it from the queue. It's like a database transaction: to finally pop it you have to acknowledge it after you receive it. Alternatively, you could start your consumer in "noack" mode, which will cause the receive behavior to work the way you assumed it would. However, be warned, noack mode imposes a big tradeoff: since RabbitMQ is delivering messages to your consumer out-of-band (basically: even if your server is locked up or sleeping, if it has issued a consume, rabbit is pushing messages to it), if you consume in noack mode those messages are permanently removed from RabbitMQ when it pushes them to the server, so if the server crashes or shuts down before draining its "local queue" with any messages pending-receive, those messages will be lost forever. Be careful with this if it's important that you don't lose messages.

Losing data with UDP over WiFi when multicasting

I'm currently working on a network protocol which includes a client-to-client system with auto-discovery of clients on the current local network.
Right now, I'm periodically broadcasting to 255.255.255.255, and if a client doesn't emit for 30 seconds I consider it dead (and therefore offline). The goal is to keep an up-to-date list of running clients. It's working well using UDP, but UDP does not ensure that the packets have been successfully delivered. So when it comes to the WiFi parts of the network, I sometimes have "false positives" of dead clients. Currently I've reduced the time between 2 broadcasts to work around the issue (it is still not working well), but I don't find this clean.
Is there anything I can do to keep a list of "online" clients without this risk of "false positives" ?
To minimize the false positives due to dropped packets, you should slightly alter the logic of your heartbeat protocol.
Rather than relying on a single packet broadcast every N seconds, you can send a burst of 3 or more packets immediately one after the other every N seconds. This is an approach that the ping and traceroute tools follow. With this method you significantly decrease the probability of a lost announcement from a peer.
Furthermore, you can specify a certain number of lost announcements that your application can tolerate before declaring a peer dead. Also, in order to minimize the possibility of packet loss on a wireless network, keep the broadcast UDP packet as small as possible.
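The "tolerated number of lost announcements" idea can be sketched on the receiving side as follows. The class name PeerTracker, its parameters, and the use of caller-supplied timestamps are assumptions for this example; a real implementation would feed it from the UDP receive loop.

```python
class PeerTracker:
    """Track which peers are online, tolerating some lost announcements."""

    def __init__(self, interval, allowed_misses=3):
        self.interval = interval              # seconds between announcements
        self.allowed_misses = allowed_misses  # announcements we may miss
        self.last_seen = {}                   # peer id -> last timestamp

    def heard_from(self, peer, now):
        """Record an announcement (any packet of a burst counts)."""
        self.last_seen[peer] = now

    def online(self, now):
        """Return peers heard from within the tolerated silence window."""
        limit = self.interval * self.allowed_misses
        return sorted(p for p, t in self.last_seen.items() if now - t <= limit)


tracker = PeerTracker(interval=10, allowed_misses=3)
tracker.heard_from("a", now=0)
tracker.heard_from("b", now=25)
print(tracker.online(now=29))  # both peers are within 30 seconds
print(tracker.online(now=40))  # "a" has been silent for 40 seconds
```

A peer is only dropped after allowed_misses consecutive intervals of silence, so a single lost broadcast no longer produces a false "dead" report.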
You can turn this around: the server broadcasts a "ServerIsUp" message, and every client can then register with the server. When a client is going offline it unregisters; otherwise you can consider it alive.

How to send Socket Messages in Series with Obj-c

I am currently using CocoaAsyncSocket to send UDP Socket messages to a server. Occasionally I need to enforce that messages arrive in a specific order. Basically my code structure is similar to below.
NSMutableArray *msgs = @[@0, @1, @2].mutableCopy;
-(void)sendMessages:(NSString *)str{
    // blackbox function that converts to nsdata and sends to socket server
}
Normally, I don't care about the order so I am just blindly sending individual messages. For very specific commands this doesn't work. I have an example in java that spawns a new thread and sends the messages after a 0.2 second time span. I was hoping to find a more elegant solution in Objective-C. Does anybody have any suggestions for an approach?
Guaranteeing a specific packet arrival order for UDP is exactly like doing the same for the postal system.
If you send two letters from country A to country B, there isn't really a way of telling which one will arrive first. Heck, one of them (or maybe even both) might even be lost and won't arrive at all. Sending the second letter 0.2 days after the first one increases the chances of "correct" ordering, but guarantees nothing.
The only way of maintaining order is adding sequence numbers to packets and buffering them on the receiving end. Then, once the relevant packets have arrived and have been ordered by sequence number you deliver them to processing. Note that this means that you'll also need a retransmission mechanism for lost packets, so if packets 1 and 3 arrive but 2 doesn't, the sender knows to send the missing packet before moving on. This is what TCP does.
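The receiving side of such a scheme can be sketched as a small reorder buffer. This is an illustrative Python sketch rather than CocoaAsyncSocket code; the class name ReorderBuffer and its methods are hypothetical, and the retransmission request itself is left out.

```python
class ReorderBuffer:
    """Deliver packets in sequence-number order, buffering early arrivals."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq  # next sequence number to deliver
        self.pending = {}          # seq -> payload, received out of order

    def push(self, seq, payload):
        """Store one packet; return the payloads that are now deliverable."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate or already delivered
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready

    def missing(self):
        """Sequence numbers that block delivery and need retransmission."""
        if not self.pending:
            return []
        return [s for s in range(self.next_seq, max(self.pending))
                if s not in self.pending]


buf = ReorderBuffer()
print(buf.push(0, "a"))  # delivered immediately
print(buf.push(2, "c"))  # held back: packet 1 has not arrived
print(buf.missing())     # what to ask the sender for
print(buf.push(1, "b"))  # releases packets 1 and 2 in order
```

The sender would need matching logic that stamps each message with an increasing sequence number and resends whatever missing() reports.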

Why does the TLS heartbeat extension allow user supplied data?

The heartbeat protocol requires the other end to reply with the same data that was sent to it, to know that the other end is alive. Wouldn't sending a certain fixed message be simpler? Is it to prevent some kind of attack?
At least the size of the packet seems to be relevant, because according to RFC 6520, section 5.1, the heartbeat message will be used with DTLS (i.e. TLS over datagram transports such as UDP) for PMTU discovery, in which case it needs messages of different sizes. Apart from that it might be simply modelled after ICMP ping, where you can also specify the payload content for no reason.
Just like with ICMP Ping, the idea is to ensure you can match up a "pong" heartbeat response you received with whichever "ping" heartbeat request you made. Some packets may get lost or arrive out of order and if you send the requests fast enough and all the response contents are the same, there's no way to tell which of your requests were answered.
One might think, "WHO CARES? I just got a response; therefore, the other side is alive and well, ready to do my bidding :D!" But what if the response was actually for a heartbeat request 10 minutes ago (an extreme case, maybe due to the server being overloaded)? If you just sent another heartbeat request a few seconds ago and the expected responses are the same for all (a "fixed message"), then you would have no way to tell the difference.
A timely response is important in determining the health of the connection. From RFC6520 page 3:
... after a number of retransmissions without
receiving a corresponding HeartbeatResponse message having the
expected payload, the DTLS connection SHOULD be terminated.
By allowing the requester to specify the return payload (and assuming the requester always generates a unique payload), the requester can match up a heartbeat response to a particular heartbeat request made, and therefore be able to calculate the round-trip time, expiring the connection if appropriate.
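The matching described above can be sketched in a few lines. This is not the API of any TLS library; the class name HeartbeatTracker, the 16-byte random payload, and the caller-supplied timestamps are assumptions chosen for illustration.

```python
import os


class HeartbeatTracker:
    """Match heartbeat responses to requests via unique payloads."""

    def __init__(self):
        self.in_flight = {}  # payload -> time the request was sent

    def new_request(self, now):
        """Create a unique payload for a new heartbeat request."""
        payload = os.urandom(16)
        self.in_flight[payload] = now
        return payload

    def on_response(self, payload, now):
        """Return the round-trip time, or None for an unknown payload."""
        sent_at = self.in_flight.pop(payload, None)
        if sent_at is None:
            return None  # not ours, or already answered
        return now - sent_at


tracker = HeartbeatTracker()
old = tracker.new_request(now=0.0)
recent = tracker.new_request(now=600.0)
print(tracker.on_response(old, now=600.5))     # stale reply: RTT of 600.5 s
print(tracker.on_response(recent, now=600.5))  # fresh reply: RTT of 0.5 s
```

With a fixed message, the two responses in this example would be indistinguishable, and the ten-minute-old reply could be mistaken for a timely one.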
This of course only makes much sense if you are using TLS over a non-reliable protocol like UDP instead of TCP.
So why allow the requester to specify the length of the payload? Couldn't it be inferred?
See this excellent answer: https://security.stackexchange.com/a/55608/44094
... seems to be part of an attempt at genericity and coherence. In the SSL/TLS standard, all messages follow regular encoding rules, using a specific presentation language. No part of the protocol "infers" length from the record length.
One gain of not inferring length from the outer structure is that it makes it much easier to include optional extensions afterwards. This was done with ClientHello messages, for instance.
In short, YES, it could've been, but for consistency with existing format and for future proofing, the size is spec'd out so that other data can follow the same message.