I have a server that, at the moment, creates a new thread for each client that connects securely. If I use a thread pool, I will only be able to serve a finite number of clients at once. However, this means I cannot be listening on ports for all clients.
My idea is to have the client send a UDP packet with some ID linked to their connection so that they can re-establish the connection, rather than tie up a thread for 10-60 seconds (the server will keep the SSLSockets in memory). Is that a good way to solve the problem? I don't see any security vulnerabilities.
The server is Java and the client is C++, not that it affects the question.
Your question doesn't make sense. If the client wants to reconnect it should just open a new socket. You are positing at least one extra thread to listen to the UDP port and then ... what? It still has to use the thread pool to handle that client, if that is your self-imposed constraint, or else start a new thread, in which case you may as well not have had the thread pool constraint in the first place.
However this means I cannot be listening on ports for all clients.
No it doesn't. It just means that some clients will get delayed service while the thread pool is full, and a very few clients will get connection failures if the backlog queue fills up. It doesn't impair your ability to listen for clients at all.
What if the only port you have is, say, TCP/443 (HTTPS)? What if UDP is firewalled (very much possible)? In other words, you should NOT introduce UDP into this picture.
Even in a thread-pool scenario, you can still tell the difference between multiple clients that connected to the same server port.
The typical solution for this is to create the set of sockets you are going to watch at once (in one thread) - in C/C++ this is typically done using select()/poll()/epoll(), and in Java you can use java.nio.
This way, if any clients have something to say to you as a server, your select loop will notice it immediately, serve those clients, and go back to select(), which consumes very little (effectively zero) CPU while idle.
Here is an example of how to do a select loop in C, and a similar example in Java.
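For concreteness, here is a minimal, untested sketch of such a select() loop for a plain (non-SSL) TCP server in C; the port number and buffer size are arbitrary placeholders:

    #include <unistd.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(void)
    {
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = { .sin_family = AF_INET,
                                    .sin_port = htons(5000),
                                    .sin_addr.s_addr = htonl(INADDR_ANY) };
        bind(listener, (struct sockaddr *)&addr, sizeof addr);
        listen(listener, 16);

        fd_set master;
        FD_ZERO(&master);
        FD_SET(listener, &master);
        int maxfd = listener;

        for (;;) {
            fd_set readable = master;              /* select() modifies its argument */
            if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
                break;

            for (int fd = 0; fd <= maxfd; fd++) {
                if (!FD_ISSET(fd, &readable))
                    continue;
                if (fd == listener) {              /* new client connecting */
                    int client = accept(listener, NULL, NULL);
                    if (client >= 0) {
                        FD_SET(client, &master);
                        if (client > maxfd) maxfd = client;
                    }
                } else {                           /* data (or EOF) from an existing client */
                    char buf[4096];
                    ssize_t n = recv(fd, buf, sizeof buf, 0);
                    if (n <= 0) {                  /* client went away */
                        close(fd);
                        FD_CLR(fd, &master);
                    } else {
                        /* hand buf/n to whatever processes this client */
                    }
                }
            }
        }
        return 0;
    }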
[edit]
Seems my question was asked nearly 10 years ago here...
Emulating accept() for UDP (timing-issue in setting up demultiplexed UDP sockets)
...with no clean and scalable solution. I think this could be solved handily by supporting listen() and accept() for UDP, just as connect() already is.
[/edit]
In a followup to this question...
Can you bind() and connect() both ends of a UDP connection
...is there any mechanism to simultaneously bind() and connect()?
The reason I ask is that a multi-threaded UDP server may wish to move a new "session" to its own descriptor for scalability purposes. The intent is to prevent the listener descriptor from becoming a bottleneck, similar to the rationale behind SO_REUSEPORT.
However, a bind() call with a new descriptor will take over the port from the listener descriptor until the connect() call is made. That provides a window of opportunity, albeit brief, for ingress datagrams to get delivered to the new descriptor's queue.
This window is also a problem for UDP servers wanting to employ DTLS. It's recoverable if the clients retry, but not having to would be preferable.
connect() on UDP does not provide connection demultiplexing.
connect() does two things:
Sets a default address for transmit functions that don't accept a destination address (send(), write(), etc)
Sets a filter on incoming datagrams.
It's important to note that the incoming filter simply discards datagrams that do not match. It does not forward them elsewhere. If there are multiple UDP sockets bound to the same address, some OSes will pick one (maybe random, maybe last created) for each datagram (demultiplexing is totally broken) and some will deliver all datagrams to all of them (demultiplexing succeeds but is incredibly inefficient). Both of these are "the wrong thing". Even an OS that lets you pick between the two behaviors via a socket option is still doing things differently from the way you wanted. The time between bind() and connect() is just the smallest piece of this puzzle of unwanted behavior.
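For reference, this is roughly what connect() buys you on a UDP socket; a minimal sketch, with 192.0.2.1:9000 as a made-up peer:

    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int connected_udp_example(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        struct sockaddr_in peer;
        memset(&peer, 0, sizeof peer);
        peer.sin_family = AF_INET;
        peer.sin_port = htons(9000);
        inet_pton(AF_INET, "192.0.2.1", &peer.sin_addr);

        /* 1) sets the default destination, so plain send()/write() work */
        connect(fd, (struct sockaddr *)&peer, sizeof peer);
        send(fd, "hello", 5, 0);

        /* 2) installs an incoming filter: datagrams from any other
           source address/port are silently dropped, not redirected */
        char buf[1500];
        ssize_t n = recv(fd, buf, sizeof buf, 0);   /* only 192.0.2.1:9000 gets through */

        close(fd);
        return (int)n;
    }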
To handle UDP with multiple peers, use a single socket in connectionless mode. To have multiple threads processing received packets in parallel, you can either
call recvfrom on multiple threads which process the data, as sketched below (this works because datagram sockets preserve message boundaries; you'd never do this with a stream socket such as TCP), or
call recvfrom on a single thread, which doesn't do any processing, just queues the message to the thread responsible for processing it.
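Here is a rough sketch of the first option, with several threads all blocking in recvfrom() on one shared, unconnected socket; the port and worker count are arbitrary assumptions:

    #include <pthread.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    #define NWORKERS 4

    static void *worker(void *arg)
    {
        int fd = *(int *)arg;
        char buf[1500];
        struct sockaddr_in from;
        socklen_t fromlen;

        for (;;) {
            fromlen = sizeof from;
            /* each recvfrom() returns exactly one whole datagram,
               so concurrent readers never see partial messages */
            ssize_t n = recvfrom(fd, buf, sizeof buf, 0,
                                 (struct sockaddr *)&from, &fromlen);
            if (n < 0)
                break;
            /* process the datagram from 'from' here */
        }
        return NULL;
    }

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in addr = { .sin_family = AF_INET,
                                    .sin_port = htons(9000),
                                    .sin_addr.s_addr = htonl(INADDR_ANY) };
        bind(fd, (struct sockaddr *)&addr, sizeof addr);

        pthread_t tid[NWORKERS];
        for (int i = 0; i < NWORKERS; i++)
            pthread_create(&tid[i], NULL, worker, &fd);
        for (int i = 0; i < NWORKERS; i++)
            pthread_join(tid[i], NULL);
        return 0;
    }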
Even if you had an OS that gave you an option for dispatching incoming UDP based on designated peer addresses (connection emulation), doing that dispatching inside the OS is still not going to be any more efficient than doing it in the server application, and a user-space dispatcher tuned for your traffic patterns is probably going to perform substantially better than a one-size-fits-all dispatcher provided by the OS.
For example, a DNS (or DHCP) server is going to transact with a lot of different hosts, nearly all of them using port 53 (67-68) at the remote end, so hashing on the remote port would be useless; you need to hash on the host. Conversely, a cache server supporting a web application server cluster is going to transact with a handful of hosts and a large number of different ports. Here hashing on the remote port will be better.
Do the connection association yourself, don't use socket connection emulation.
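To make the "tuned for your traffic patterns" point concrete, here is a trivial, hypothetical pair of hash functions a user-space dispatcher might choose between, depending on whether the remote hosts or the remote ports carry the entropy:

    #include <stdint.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    /* DNS/DHCP-style traffic: many hosts, (almost) one remote port,
       so spread work across queues by remote address */
    static unsigned hash_by_host(const struct sockaddr_in *peer, unsigned nqueues)
    {
        uint32_t h = ntohl(peer->sin_addr.s_addr);
        h ^= h >> 16;
        return h % nqueues;
    }

    /* cache-server-style traffic: few hosts, many remote ports,
       so the port number is what distinguishes peers */
    static unsigned hash_by_port(const struct sockaddr_in *peer, unsigned nqueues)
    {
        return ntohs(peer->sin_port) % nqueues;
    }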
The issue you described is one I encountered some time ago while implementing a TCP-like listen/accept mechanism for UDP.
In my case the solution (which turned out to be bad, as I will describe later) was to create one UDP socket to receive any incoming datagrams, and when one arrived, to connect this particular socket to the sender (via recvfrom() with MSG_PEEK followed by connect()) and hand it to a new thread. Moreover, a new, unconnected UDP socket was created for subsequent incoming datagrams. This way the new thread (with its dedicated socket) called recv() on the socket and handled only this particular channel from then on, while the main socket kept waiting for new datagrams coming from other peers.
Everything worked well until the incoming datagram rate got higher. The problem was that while the main socket was transitioning to the connected state, it buffered not just one but several more datagrams (coming from many peers), and so the thread created to handle one particular sender ended up reading datagrams not intended for it.
I could not find a clean solution (e.g. creating a new connected socket instead of connecting the main one, and injecting the datagram received on the main socket into the new socket's receive buffer for a later recv()). Eventually, I ended up with N threads, each having one "listening" socket (using SO_REUSEPORT), with datagram scattering done at the OS level.
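A minimal sketch of that final arrangement (assuming Linux 3.9+ for SO_REUSEPORT; the port and thread count are arbitrary):

    #include <pthread.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    #define NTHREADS 4

    static void *listener(void *arg)
    {
        (void)arg;
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        int one = 1;
        /* every thread binds its own socket to the same port;
           the kernel scatters incoming datagrams across them */
        setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof one);

        struct sockaddr_in addr = { .sin_family = AF_INET,
                                    .sin_port = htons(9000),
                                    .sin_addr.s_addr = htonl(INADDR_ANY) };
        bind(fd, (struct sockaddr *)&addr, sizeof addr);

        char buf[1500];
        struct sockaddr_in from;
        socklen_t fromlen;
        for (;;) {
            fromlen = sizeof from;
            ssize_t n = recvfrom(fd, buf, sizeof buf, 0,
                                 (struct sockaddr *)&from, &fromlen);
            if (n < 0)
                break;
            /* handle this peer's datagram here */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[NTHREADS];
        for (int i = 0; i < NTHREADS; i++)
            pthread_create(&tid[i], NULL, listener, NULL);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(tid[i], NULL);
        return 0;
    }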
I have a server application that uses Microsoft's I/O Completion Port (IOCP) mechanism to manage asynchronous network socket communication. In general, this IOCP approach has performed very well in my environment. However, I have encountered an edge case scenario for which I am seeking guidance:
For the purposes of testing, my server application is streaming data (let's say ~400 KB/sec) over a gigabit LAN to a single client. All is well...until I disconnect the client's Ethernet cable from the LAN. Disconnecting the cable in this manner prevents the server from immediately detecting that the client has disappeared (i.e. the client's TCP stack does not send notification of the connection's termination to the server).
Meanwhile, the server continues to make WSASend calls to the client...and being that these calls are asynchronous, they appear to "succeed" (i.e. the data is buffered by the OS in the outbound queue for the socket).
While this is all happening, I have 16 threads blocked on GetQueuedCompletionStatus, waiting to retrieve completion packets from the port as they become available. Prior to disconnecting the client's cable, there was a constant stream of completion packets. Now, everything (as expected) seems to have come to a halt...for about 32 seconds. After 32 seconds, IOCP springs back into action returning FALSE with a non-null lpOverlapped value. GetLastError returns 121 (The semaphore timeout period has expired.) I can only assume that error 121 is an artifact of WSASend finally timing out after the TCP stack determined the client was gone?
I'm fine with the network stack taking 32 seconds to figure out my client is gone. The problem is that while the system is making this determination, my IOCP is paralyzed. For example, WSAAccept events that post to the same IOCP are not handled by any of the 16 threads blocked on GetQueuedCompletionStatus until the failed completion packet (indicating error 121) is received.
My initial plan to work around this involved using WSAWaitForMultipleEvents immediately after calling WSASend. If the socket event wasn't signaled within some timeout (e.g. 3 seconds), then I would terminate the socket connection and move on (in hopes of preventing the extensive blocking effect on my IOCP). Unfortunately, WSAWaitForMultipleEvents never seems to hit the timeout (so maybe asynchronous sockets are signaled by virtue of being asynchronous? Or does copying the data to the TCP queue qualify as a signal?).
I'm still trying to sort this all out, but was hoping someone had some insights as to how to prevent the IOCP hang.
Other details: My server application is running on Win7 with 8 cores; IOCP is configured to use at most 8 concurrent threads; my thread pool has 16 threads. Plenty of RAM, processor and bandwidth.
Thanks in advance for your suggestions and advice.
It's usual for the WSASend() completions to stall in this situation. You won't get them until the TCP stack times out its resend attempts and completes all of the outstanding sends in error. This doesn't block any other operations. I expect you are either testing incorrectly or have a bug in your code.
Note that your 'fix' is flawed. You could see this 'delayed send completion' situation at any point during a normal connection if the sender is sending faster than the consumer can consume. See this article on TCP flow control and async writes. A better plan is to keep a counter of the outstanding writes (per connection) that you want to allow, stop sending when that limit is reached, and resume when the counter drops below a 'low water mark' threshold value.
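A sketch of that counter idea in plain C, with the actual WSASend()/completion calls left as comments and the 32/16 limits as made-up values you would tune (locking omitted for brevity):

    /* Per-connection bookkeeping for outstanding asynchronous writes.
       Stop submitting new sends at HIGH_WATER, resume at LOW_WATER. */

    #define HIGH_WATER 32
    #define LOW_WATER  16

    struct connection {
        int outstanding_sends;   /* sends posted but not yet completed */
        int send_paused;         /* application-level flow-control flag */
    };

    /* Call before posting a send; returns 0 if the caller should queue
       the data instead of posting another WSASend(). */
    static int can_post_send(struct connection *c)
    {
        if (c->outstanding_sends >= HIGH_WATER)
            c->send_paused = 1;
        return !c->send_paused;
    }

    static void on_send_posted(struct connection *c)
    {
        c->outstanding_sends++;          /* a WSASend() was just issued */
    }

    /* Call from the completion handler, i.e. after GetQueuedCompletionStatus
       returns a send completion (successful or failed). */
    static void on_send_completed(struct connection *c)
    {
        c->outstanding_sends--;
        if (c->send_paused && c->outstanding_sends <= LOW_WATER) {
            c->send_paused = 0;
            /* resume draining this connection's pending-data queue here */
        }
    }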
Note that if you've pulled out the machine's network cable, how do you expect any other operations to complete? Reads will just sit there and only fail once a write has failed, and AcceptEx will simply sit there and wait for the condition to rectify itself.
I have a situation where I have to handle multiple live UDP streams on the server.
As I see it, I have two options:
Single Socket:
1) Listen on a single port on the server, receive the data from all clients on that port, and create a thread for each client to process the data until the client stops sending.
Here only one port is used to receive the data, and a number of threads are used to process it.
Multiple Sockets:
2) The client requests an open port from the server to send the data to, the application sends that port to the client and opens a new thread listening on the port to receive and process the data. Here each client has a unique port to send its data to.
I already implemented a way to know which packet is coming from which client in UDP.
I have 1000+ clients and I am receiving 60 KB of data per second.
Are there any performance issues with the above methods, or is there a more efficient way to handle this type of task in C?
Thanks,
Raghu
With that many clients, having one thread per client is very inefficient since lots and lots of context switches must be performed.
Also, the number of ports you can open per IP is limited (a port is a 16-bit number).
Therefore "Single Socket" will be far more efficient. But you can also use "Multipe Sokets" with just a single thread using the asynchronous API. If you can identify the client using the package's payload, then there is no need to have a port per client.
Using techniques as hinted at in:
http://msdn.microsoft.com/en-us/library/system.servicemodel.servicecontractattribute.callbackcontract.aspx
I am implementing a ServerPush setup for my API to get real-time notifications of events from a server (no polling). Basically, the Server has a RegisterMe() and UnregisterMe() method, and the client has a callback method called Announcement(string message) that, through the CallbackContract mechanisms in WCF, the server can call. This seems to work well.
Unfortunately, in this setup, if the Server were to crash or is otherwise unavailable, the Client won't know since it is only listening for messages. Silence on the line could mean no Announcements or it could mean that the server is not available.
Since my goal is to reduce polling rather than to achieve immediacy, I don't mind adding a void Ping() method on the Server alongside RegisterMe() and UnregisterMe() that merely exists to test connectivity to the server. Periodically calling this method would, I believe, ensure that we're still connected (and also that no Announcements have been dropped by the transport, since this is TCP).
But is the Ping() method necessary or is this connectivity test otherwise available as part of WCF by default - like serverProxy.IsStillConnected() or something. As I understand it, the channel's State would only return Faulted or Closed AFTER a failed Ping(), but not instead of it.
2) From a broader perspective, is this callback approach solid? This is not for http or ajax - the number of connected clients will be few (tens of clients, max). Are there serious problems with this approach? As this seems to be a mild risk, how can I prevent a slow/malicious client from blocking the server by not processing its callback queue fast enough? Is there a kind of timeout specific to the callback that I can set without affecting other operations?
Your approach sounds reasonable, here are some links that may or may not help (they are not quite exactly related):
Detecting Client Death in WCF Duplex Contracts
http://tomasz.janczuk.org/2009/08/performance-of-http-polling-duplex.html
Having some health check built into your application protocol makes sense.
If you are worried about malicious clients, then add authorization.
The second link I shared above has a sample pub/sub server, you might be able to use this code. A couple things to watch out for -- consider pushing notifications via async calls or on a separate thread. And set the sendTimeout on the tcp binding.
HTH
I wrote a WCF application and encountered a similar problem. My server checked that clients had not 'pulled the plug' by periodically sending a ping to them. The actual send method (asynchronous, since it was a server) had a timeout of 30 seconds. The client simply checked that it received the data every 30 seconds, while the server would catch an exception if the timeout was reached.
Authorisation was required to connect to the server (using the built-in feature of WCF that forces the connecting party to call a particular method first), so from a malicious-client perspective you could easily add code to check and ban their account if they do something suspicious, while disconnecting users who do not authenticate.
As the server I wrote was asynchronous, there wasn't really any way to block it. I guess that addresses your last point, as the asynchronous send method fires off the ping (and any other data being sent) and returns immediately. In the SendEnd method it would catch the timeout exception (sometimes several for the same client) and disconnect them, without any blocking or freezing of the server.
Hope that helps.
You could use a publisher / subscriber service similar to the one suggested by Juval:
http://msdn.microsoft.com/en-us/magazine/cc163537.aspx
This would allow you to persist the subscribers if losing the server is a typical scenario. The publish method in this example also calls each subscriber on a separate thread, so a few dead subscribers will not block the others...
I am developing a client/server application with net tcp binding and I need to be notified if my connection to server goes down.
From the server side, if a client disconnects, I can detect it instantly with the CommunicationObject.Faulted event (with reliable sessions off). However, from the client side, it seems I have no way to know if the server goes down; the same event doesn't fire. By the way, I am setting receiveTimeout to infinite. Some people suggested a heartbeat or ping function to check whether the server is alive, but I think at the WCF level such methodologies have a big impact. After all, it's not a simple packet you send; it's a whole WCF request. What should I do?
There seems to be a common misconception that, in order to find out on the client side whether a WCF session is still alive, one has to implement some kind of custom ping or heartbeat operation on the service. However, the WCF framework, when configured correctly, already does this for you in the background.
The trick is to set the ReliableSession.InactivityTimeout to a period that is short enough. For instance, if you set it to 30 seconds, then the ICommunicationObject.Faulted event will be raised on the client proxy between 30 (minimum) and approximately 45 (maximum) seconds after a service breakdown. The exact delay depends on the rhythm of the WCF-internal session keep-alive timer and the specific time of the breakdown.
Of course, this can only work for reliable-session capable bindings, combined with the right session properties (ServiceContractAttribute.SessionMode, ServiceBehaviorAttribute.InstanceContextMode, OperationContractAttribute.IsInitiating, and OperationContractAttribute.IsTerminating).