boost::asio timeouts example - writing data is expensive

boost::asio provides an example of how to use the library to implement asynchronous timeouts: the client sends periodic heartbeat messages to the server, which echoes each heartbeat back to the client. Failure to respond within N seconds causes a disconnect. See boost_asio/example/timeouts/server.cpp.
The pattern outlined in these examples would be a good starting point for part of a project I will be working on shortly, but for one wrinkle:
In addition to heartbeats, both client and server need to send messages to each other.
The timeouts example pushes heartbeat echo messages onto a queue, and a subsequent timeout fires an asynchronous handler that actually writes the data to the socket.
Introducing data for the socket to write cannot be done on the thread running io_service, because that thread is blocked in run(). run_one() doesn't help: you still block until there is a handler to run, and you introduce the complexity of managing work for the io_service.
In asio, asynchronous handlers - socket writes among them - are called on the thread running io_service.
Therefore, to introduce messages at arbitrary times, data to be sent is pushed onto a queue from a thread other than the io_service thread, which implies protecting the queue and notification timer with a mutex. That means two mutex acquisitions per message: one to push the data onto the queue, and one in the handler that dequeues the data for writing to the socket.
This is actually a more general question than asio timeouts alone: is there a pattern, when the io_service thread is blocked in run(), by which data can be asynchronously written to the socket without taking two mutexes per message?

The following things could be of interest: boost::asio strands are a mechanism for synchronising handlers. You only need them, though, if you are calling io_service::run from multiple threads, AFAIK.
Also useful is the io_service::post method, which allows you to execute code on the thread that has invoked io_service::run.

Related

ActiveMQ CMS: Can messages be lost between creating a consumer and setting a listener?

Setting up a CMS consumer with a listener involves two separate calls: first, acquiring a consumer:
cms::MessageConsumer* cms::Session::createConsumer( const cms::Destination* );
and then, setting a listener on the consumer:
void cms::MessageConsumer::setMessageListener( cms::MessageListener* );
Could messages be lost if the implementation subscribes to the destination (and receives messages from the broker/router) before the listener is activated? Or are such messages queued internally and delivered to the listener upon activation?
Why isn't there an API call to create the consumer with a listener as a construction argument? (Is it because the JMS spec doesn't have it?)
(Addendum: this is probably a flaw in the API itself. A more logical order would be to instantiate a consumer from a session, and have a cms::Consumer::subscribe( cms::Destination*, cms::MessageListener* ) method in the API.)
I don't think the API is flawed necessarily. Obviously it could have been designed a different way, but I believe the solution to your alleged problem comes from the start method on the Connection object (inherited via Startable). The documentation for Connection states:
A CMS client typically creates a connection, one or more sessions, and a number of message producers and consumers. When a connection is created, it is in stopped mode. That means that no messages are being delivered.
It is typical to leave the connection in stopped mode until setup is complete (that is, until all message consumers have been created). At that point, the client calls the connection's start method, and messages begin arriving at the connection's consumers. This setup convention minimizes any client confusion that may result from asynchronous message delivery while the client is still in the process of setting itself up.
A connection can be started immediately, and the setup can be done afterwards. Clients that do this must be prepared to handle asynchronous message delivery while they are still in the process of setting up.
This is the same pattern that JMS follows.
In any case I don't think there's any risk of message loss regardless of when you invoke start(). If the consumer is using an auto-acknowledge mode then messages should only be automatically acknowledged once they are delivered synchronously via one of the receive methods or asynchronously through the listener's onMessage. To do otherwise would be a bug in my estimation. I've worked with JMS for the last 10 years on various implementations and I've never seen any kind of condition where messages were lost related to this.
If you want to add consumers after you've already invoked start() you could certainly call stop() first, but I don't see any problem with simply adding them on the fly.

How to prevent an I/O Completion Port from blocking when completion packets are available?

I have a server application that uses Microsoft's I/O Completion Port (IOCP) mechanism to manage asynchronous network socket communication. In general, this IOCP approach has performed very well in my environment. However, I have encountered an edge case scenario for which I am seeking guidance:
For the purposes of testing, my server application is streaming data (let's say ~400 KB/sec) over a gigabit LAN to a single client. All is well...until I disconnect the client's Ethernet cable from the LAN. Disconnecting the cable in this manner prevents the server from immediately detecting that the client has disappeared (i.e. the client's TCP network stack does not send notification of the connection's termination to the server).
Meanwhile, the server continues to make WSASend calls to the client...and being that these calls are asynchronous, they appear to "succeed" (i.e. the data is buffered by the OS in the outbound queue for the socket).
While this is all happening, I have 16 threads blocked on GetQueuedCompletionStatus, waiting to retrieve completion packets from the port as they become available. Prior to disconnecting the client's cable, there was a constant stream of completion packets. Now, everything (as expected) seems to have come to a halt...for about 32 seconds. After 32 seconds, IOCP springs back into action returning FALSE with a non-null lpOverlapped value. GetLastError returns 121 (The semaphore timeout period has expired.) I can only assume that error 121 is an artifact of WSASend finally timing out after the TCP stack determined the client was gone?
I'm fine with the network stack taking 32 seconds to figure out my client is gone. The problem is that while the system is making this determination, my IOCP is paralyzed. For example, WSAAccept events that post to the same IOCP are not handled by any of the 16 threads blocked on GetQueuedCompletionStatus until the failed completion packet (indicating error 121) is received.
My initial plan to work around this involved using WSAWaitForMultipleEvents immediately after calling WSASend. If the socket event wasn't signaled within a short interval (e.g. 3 seconds), I would terminate the socket connection and move on (in hopes of preventing the extensive blocking effect on my IOCP). Unfortunately, WSAWaitForMultipleEvents never seems to hit the timeout (so maybe asynchronous sockets are signaled by virtue of being asynchronous? Or does copying the data to the TCP queue qualify as a signal?).
I'm still trying to sort this all out, but was hoping someone had some insights as to how to prevent the IOCP hang.
Other details: My server application is running on Win7 with 8 cores; IOCP is configured to use at most 8 concurrent threads; my thread pool has 16 threads. Plenty of RAM, processor and bandwidth.
Thanks in advance for your suggestions and advice.
It's usual for the WSASend() completions to stall in this situation. You won't get them until the TCP stack times out its resend attempts and completes all of the outstanding sends in error. This doesn't block any other operations. I expect you are either testing incorrectly or have a bug in your code.
Note that your 'fix' is flawed. You could see this 'delayed send completion' situation at any point during a normal connection if the sender is sending faster than the consumer can consume. See this article on TCP flow control and async writes. A better plan is to keep a counter of the outstanding writes (per connection) that you want to allow, stop sending when that counter reaches its limit, and resume when it drops below a 'low water mark' threshold value.
Note that if you've pulled the network cable out of the machine, how do you expect any other operations to complete? Reads will just sit there and only fail once a write has failed, and AcceptEx will simply sit there and wait for the condition to rectify itself.

How to tell for a particular request that all available worker threads are BUSY

I have a high-rate UDP server using Netty (3.6.6-Final) but notice that the back-end servers can take 1 to 10 seconds to respond - I have no control over those, so cannot improve latency there.
What happens is that all handler worker threads are busy waiting for responses, so any new request must wait to be processed, and over time responses arrive very late. Is it possible to discover, for a given request, that the thread pool is exhausted, so as to intercept the request early and issue a "server busy" response?
I would use an ExecutionHandler configured with an appropriate ThreadPoolExecutor, with a max thread count and a bounded task queue. By choosing different RejectedExecutionHandler policies, you can either catch the RejectedExecutionException to answer with a "server busy", or use a "caller runs" policy, in which case the I/O worker thread will execute the task and create push-back (but that is what you wanted to avoid).
Either way, an execution handler with a limited capacity is the way forward.

using multiple io_service objects

I have an application that listens for and processes messages from both Internet sockets and Unix domain sockets. Now I need to add SSL to the Internet sockets. I was using a single io_service object for all the sockets in the application; it seems I now need separate io_service objects for network sockets and Unix domain sockets. I don't have any threads in my application, and I use async_send, async_receive, and async_accept to process data and connections. Please point me to any examples using multiple io_service objects with async handlers.
The question has a degree of uncertainty as to whether multiple io_service objects are actually required. I could not locate anything in the reference documentation, or in the overview for SSL and UNIX domain sockets, that mandates separate io_service objects. Regardless, here are a few options:
Single io_service:
Try to use a single io_service.
If you do not have a direct handle to the io_service object, but you have a handle to a Boost.Asio I/O object, such as a socket, then a handle to the associated io_service object can be obtained by calling socket.get_io_service().
Use a thread per io_service:
If multiple io_service objects are required, then dedicate a thread to each io_service. This approach is used in Boost.Asio's HTTP Server 2 example.
boost::asio::io_service service1;
boost::asio::io_service service2;
boost::thread_group threads;
threads.create_thread(boost::bind(&boost::asio::io_service::run, &service1));
service2.run();
threads.join_all();
One consequence of this approach is that it may require thread-safety guarantees to be made by the application. For example, if service1 and service2 both have completion handlers that invoke message_processor.process(), then message_processor.process() needs either to be thread-safe or to be called in a thread-safe manner.
Poll io_service:
io_service provides non-blocking alternatives to run(). Whereas io_service::run() will block until all work has finished, io_service::poll() will run handlers that are ready to run and will not block. This allows a single thread to execute the event loop on multiple io_service objects:
while (!service1.stopped() &&
       !service2.stopped())
{
  std::size_t ran = 0;
  ran += service1.poll();
  ran += service2.poll();

  // If no handlers ran, then sleep.
  if (0 == ran)
  {
    boost::this_thread::sleep_for(boost::chrono::seconds(1));
  }
}
To prevent a tight busy-loop when there are no ready-to-run handlers, it may be worth adding a sleep, as above. Be aware that this sleep may introduce latency in the overall handling of events.
Transfer handlers to a single io_service:
One interesting approach is to use a strand to transfer completion handlers to a single io_service. This allows for a thread per io_service, while preventing the need to have the application make thread-safety guarantees, as all completion handlers will post through a single service, whose event loop is only being processed by a single thread.
boost::asio::io_service service1;
boost::asio::io_service service2;
// strand2 will be used by service2 to post handlers to service1.
boost::asio::strand strand2(service1);
boost::asio::io_service::work work2(service2);
socket.async_read_some(buffer, strand2.wrap(read_some_handler));
boost::thread_group threads;
threads.create_thread(boost::bind(&boost::asio::io_service::run, &service1));
service2.run();
threads.join_all();
This approach does have some consequences:
It requires handlers that are intended to be run by the main io_service to be wrapped via strand::wrap().
The asynchronous chain now runs through two io_service objects, creating an additional level of complexity. It is important to account for the case where the secondary io_service no longer has work, causing its run() to return.
It is common for an asynchronous chain to occur within a single io_service. In that case the service never runs out of work, as each completion handler posts additional work onto the io_service.
| .------------------------------------------.
V V |
read_some_handler() |
{ |
socket.async_read_some(..., read_some_handler) --'
}
On the other hand, when a strand is used to transfer work to another io_service, the wrapped handler is invoked within service2, causing it to post the completion handler into service1. If the wrapped handler was the only work in service2, then service2 no longer has work, causing service2.run() to return.
service1 service2
====================================================
.----------------- wrapped(read_some_handler)
| .
V .
read_some_handler NO WORK
| .
| .
'----------------> wrapped(read_some_handler)
To account for this, the example code uses an io_service::work for service2 so that run() remains blocked until explicitly told to stop().
Looks like you are writing a server and not a client. Don't know if this helps, but I am using ASIO to communicate with 6 servers from my client. It uses TCP/IP with SSL/TLS. You can find a link to the code here
You should be able to use just one io_service object with multiple socket objects. But, if you decide that you really want to have multiple io_service objects, then it should be fairly easy to do so. In my class, the io_service object is static. So, just remove the static keyword along with the logic in the constructor that only creates one instance of the io_service object. Depending on the number of connections expected for your server, you would probably be better off using a thread pool dedicated to handling socket I/O rather than creating a thread for each new socket connection.

Application that uses own queues as holders of long-term process operations

I want to build a long-running process handler and use NServiceBus for it.
The role of NServiceBus is to hold the operations of that process (some kind of batch process).
The problem is that I have more than one type of long-running process, and each of them must run in parallel, so pushing all messages into one queue is not what I should do, I think.
Logic is:
1) Receive an order of a long-term process,
2) Divide it into N operations,
3) "Pack" each operation into a message and push it onto the queue,
4) According to the type of message, a particular handler will handle it and perform the operation it holds.
I can't put all of the operations in one queue because my application must also handle other messages that require a fast response. If the queue were full of operations, those messages would wait a long time to be processed.
So, does anyone know how to solve that problem ?
You should properly set the number of worker threads in the queue configuration settings of the long-running process endpoint.
If you are using MSMQ, check out this, and especially the tag <MsmqTransportConfig ErrorQueue="error" NumberOfWorkerThreads="1" MaxRetries="5"/>
Each idle worker thread pulls a message from the queue even while another thread is still processing a different message. In this way you should achieve the parallel processing requirement you described in your scenario.