How to handle asynchronous errors in Go? - error-handling

I am working on my first real Go project, a messaging API. I use channels to pass messages and other data between user goroutines and library goroutines that use a thread-unsafe, event-based C protocol library. For details, see https://github.com/apache/qpid-proton/blob/master/proton-c/bindings/go/README.md
My question is in 2 related parts:
1. What are common idioms for handling errors across channels?
If the goroutine at one end blows up, how do I ensure the other end unblocks, gets an error value, and doesn't get blocked again later?
For readers: I can close the channel, but that carries no error info. I could pass a struct { data, error }, or use a second channel. Pros & cons? Other ideas?
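A minimal sketch of the struct { data, error } option, with made-up names (Result, results) purely for illustration: the producer reports its failure in-band and then closes the channel, so the reader can't block on it again.
package main

import (
    "errors"
    "fmt"
)

// Result bundles a payload with the error that ended the stream, if any.
type Result struct {
    Data string
    Err  error
}

func main() {
    results := make(chan Result)
    // Producer: send data, report the failure in-band, then close.
    go func() {
        defer close(results)
        results <- Result{Data: "hello"}
        results <- Result{Err: errors.New("connection lost")}
    }()
    // Consumer: an error value ends the stream; a close without one is a clean shutdown.
    for r := range results {
        if r.Err != nil {
            fmt.Println("stream ended:", r.Err)
            return
        }
        fmt.Println("got:", r.Data)
    }
}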
For writers: I can't close the channel without risking a panic, so I guess I need a second channel. Is this idiomatic?
select {
case sendChan <- data:
    sentOk()
case err := <-errChan:
    oops(err)
}
I also can't write after close, so I need to store the error somewhere and check it before trying to write. Any other approaches?
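Here is one way the writer side might look; Sender, dataChan, and errChan are names I've made up for illustration, and the worker is assumed to send on (not close) errChan:
// Sender wraps the two channels and remembers the first error it sees,
// so later Send calls fail fast instead of blocking or panicking.
type Sender struct {
    dataChan chan<- []byte
    errChan  <-chan error
    err      error // checked before every write
}

func (s *Sender) Send(data []byte) error {
    if s.err != nil {
        return s.err // never touch the channels again after a failure
    }
    select {
    case s.dataChan <- data:
        return nil
    case err := <-s.errChan:
        s.err = err // remember it so future Sends return immediately
        return err
    }
}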
2. Exposing channels in APIs.
I need channels to pass error info: should I make those channels public fields or hide them in methods?
There is a tradeoff, and I don't have the experience to evaluate it:
Exposing channels lets users select directly, but it requires them to correctly implement the error handling patterns (check for errors before writing, select on the error channel as well as the write). This seems complex and error-prone, but maybe that's because I'm not seasoned in Go.
Hiding channels in a method simplifies and enforces correct use of the library. But now an async user must create their own goroutine and channel(s). They may just duplicate what the library already does, which is silly. Also, there is an extra goroutine and channel on the path. Maybe that's not a big deal, but the data channel is the critical path for my library, and I think it has to be hidden along with the error channel.
I could do both: expose the channels for power users and provide a simple method wrapper for people with simple needs. That's more to support but worth it if neither alone can fit all cases.
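A rough sketch of that "both" shape, with invented names (Connection, In, Err, Message) rather than the real qpid-proton API: the exported channels are there for power users, and Receive wraps them for everyone else.
// Message stands in for the library's real message type.
type Message []byte

// Connection exports its channels and also offers a simple blocking method.
type Connection struct {
    In  <-chan Message // incoming messages; closed on shutdown
    Err <-chan error   // the library sends the shutdown error here (or just closes it) before closing In
}

// Receive enforces the correct pattern so simple users can't get it wrong.
func (c *Connection) Receive() (Message, error) {
    m, ok := <-c.In
    if !ok {
        // Data channel closed; a receive on Err yields the reason, or nil for a clean close.
        return nil, <-c.Err
    }
    return m, nil
}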
The standard net.Conn uses blocking methods, not channels, and I wrote goroutines to pump data to my C event-loop channel, so I know it can be done, but I did not find it trivial. net.Conn wraps system calls, not channels, underneath, so "exposing the channels" is not an option. Do any of the standard libraries export channels with error handling? (time.After doesn't count; there are no errors.)
Thanks a lot!
Alan

Your question is a bit on the broad side but I'll try to give some guidance based on my experience writing highly concurrent code...
Personally, I think making the channel a property of the object, initialized in a nice helpful NewMyObject() *MyObject constructor, is a good design pattern. It means code using the object doesn't have to do boilerplate setup every time it wants to call some asynchronous method the type offers.
For readers: I can close the channel, but that carries no error info. I could pass a struct { data, error }, or use a second channel. Pros & cons? Other ideas?
Let the reader signal the workers to return by closing an abort channel. The reader itself should simply use the data, ok := <-fromChannel form and move on with execution once the data or error channel has been closed. This should prevent 'send on closed channel' panics from the workers, since they close their own channel and return. When err != nil, the reader knows to move on.
For writers: I can't close the channel without risking a panic, so I guess I need a second channel. Is this idiomatic?
Yes. Sadly, I was quite pissed off with the uni-directional behavior of channels and thought it should be abstracted. Regardless, it's not. In my code I would not define this on the object that does the asynchronous work. The paradigm I prefer is the closing signal (since a send on a channel is not one-to-many; only one goroutine will receive it). Instead, I allocate the abort channel in the calling code, and if things need to shut down, you close the abort channel and all the goroutines doing asynchronous work that are listening on that channel do their cleanup and return. You should also use a sync.WaitGroup so you can wait for the goroutines to return before moving on.
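A compressed sketch of that pattern (the caller owns the abort channel and the sync.WaitGroup; produce() just stands in for whatever the worker actually does):
func worker(abort <-chan struct{}, out chan<- []byte, wg *sync.WaitGroup) {
    defer wg.Done()
    for {
        data, err := produce()
        if err != nil {
            return // clean up and return; the caller notices via the WaitGroup
        }
        select {
        case out <- data:
        case <-abort: // closing abort wakes every listening goroutine at once
            return
        }
    }
}

// In the calling code:
//     abort := make(chan struct{})
//     var wg sync.WaitGroup
//     wg.Add(1)
//     go worker(abort, out, &wg)
//     ...
//     close(abort) // one-to-many shutdown signal
//     wg.Wait()    // don't move on until every goroutine has returned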
So my basic summary:
1) Let the caller of asynchronous methods signal that it's time to stop, not the other way around; a WaitGroup is better used to coordinate their returns.
2) Use a sync.WaitGroup in the calling code to know when your goroutines are finished so you can move on.
3) Allocate your error channel in the calling code and take advantage of the one-to-many signal produced by closing a channel; if you send on a channel you allocate in the caller, only a single receiver will get it, and if you put one on each instance you have to iterate a collection of instances to send to each one.
4) If you have a type that provides async methods that do work in the background, set up the channels to read from in its initializer, document the async methods saying where to listen for data, and provide an example of a non-blocking select that passes an abort channel into the async method and listens on the method's data and error channels. If you need to kill a single goroutine, you can do that by closing one of the channels it owns rather than killing them all by closing the caller's abort channel.
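For point 4, that documented example might look something like this (NewMyObject, FetchAsync, Data, and Errs are all hypothetical names; handle and the log call are placeholders):
obj := NewMyObject() // the initializer sets up obj.Data and obj.Errs
abort := make(chan struct{})
obj.FetchAsync(abort) // starts background work; the goroutine listens on abort

for {
    select {
    case d := <-obj.Data:
        handle(d)
    case err := <-obj.Errs:
        log.Println("async work failed:", err)
        close(abort) // tell the remaining goroutines to wind down
        return
    case <-abort:
        return // someone else decided to shut down
    }
}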
Hopefully that all makes sense.

Related

asio - the design reason why async_write_some may not transmit all of the data

From the user's point of view, the "may not transmit all of the data" property is troublesome: it may cause the handler to be called more than once.
The free function async_write ensures the handler is called only once, but it requires the caller to call it sequentially, or the written data will be interleaved. For network applications, this is worse than the handler being called more than once.
If the user wants the handler called only once and the data written correctly, the user needs to do something.
What I want to ask is: why doesn't asio just make socket::async_write_some transmit all the data?
What I want to ask is: why doesn't asio just make socket::async_write_some transmit all the data?
As opposed to async_write, socket::async_write_some is a lower-level method.
The OS network stack is designed with send buffers and receive buffers. These buffers are necessarily limited to some amount of memory. When you send a lot of data over a socket, the receiving side can be slower than the sender and/or there can be network speed issues.
This is the reason socket send buffers are limited, and as a result the system's syscalls like write or writev must be able to notify the user program that the system cannot accept a chunk of data right now. With a socket in async mode it's even more critical. So socket syscalls cannot work in an async manner without signaling the program to hold on.
So async_write_some, as a mid-level wrapper around writev, is required to support partial writes. On the other hand, async_write is a composed operation and can call async_write_some many times in order to send the buffers until the operation is complete or has failed. It calls the completion handler only once, not for each chunk of data passed to the network stack.
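The same split exists outside asio. As a rough sketch of the idea in Go (purely an analogy, not asio's API): the low-level call may send only part of the buffer, and the convenience call is just a loop over it that reports back once.
// writeSome stands for any low-level partial write: it may send fewer bytes
// than asked when the OS send buffer is full, like async_write_some.
type writeSome func(p []byte) (int, error)

// writeAll plays the role of the composed operation (async_write): it keeps
// calling the partial write and reports exactly once, on completion or failure.
func writeAll(w writeSome, p []byte) error {
    for len(p) > 0 {
        n, err := w(p)
        if err != nil {
            return err
        }
        p = p[n:]
    }
    return nil
}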
If the user wants the handler called only once and the data written correctly, the user needs to do something.
Nothing special: just use async_write, not socket::async_write_some.

Twisted - success (or failure) callback for LineReceiver sendLine

I'm still trying to master Twisted while in the midst of finishing an application that uses it.
My question is:
My application uses LineReceiver.sendLine to send messages from a Twisted TCP server.
I would like to know if the sendLine succeeded.
I gather that I need to somehow add a success (and error?) callback to sendLine but I don't know how to do this.
Thanks for any pointers / examples
You need to define "succeeded" in order to come up with an answer to this.
All sendLine does immediately (probably) is add some bytes to a send buffer. In some sense, as long as it doesn't raise an exception (e.g., MemoryError because your line is too long, or TypeError because your line was the number 3 instead of an actual line), it has succeeded.
That's not a very useful kind of success, though. Unfortunately, the useful kind of success is more like "the bytes were added to the send buffer, the send buffer was flushed to the socket, the peer received the bytes, and the receiving application acted on the data in a persistent way".
Nothing in LineReceiver can tell you that all those things happened. The standard solution is to add some kind of acknowledgement to your protocol: when the receiving application has acted on the data, it sends back some bytes that tell the original sender the message has been handled.
You won't get LineReceiver.sendLine to help you much here because all it really knows how to do is send some bytes in a particular format. You need a more complex protocol to handle acknowledgements.
Fortunately, Twisted comes with a few. twisted.protocols.amp is one: it offers remote method calls (complete with responses) as a basic feature. I find that AMP is suitable for a wide range of applications so it's often safe to recommend for new development. It largely supersedes the older twisted.spread (aka "PB") which also provides both remote method calls and remote object references (and is therefore more complex - in my experience, more complex than most applications need). There are also some options that are a bit more standard: for example, Twisted Web includes an HTTP implementation (HTTP, as you may know, is good at request/response style interaction).

blocked requests in io_service

I have implemented a client-server program using the boost::asio library.
In my implementation, there are times when io_service.run() blocks indefinitely. When I pass another request to the io_service, the blocked call begins to execute normally.
Is there any way to see what the pending requests inside the io_service queue are?
I have not used a work object to block the run call!
There are no official ways to query into the io_service to find all pending requests. However, there are a few techniques to debug the problem:
Boost 1.47 introduced handler tracking. Simply define BOOST_ASIO_ENABLE_HANDLER_TRACKING and Boost.Asio will write debug output, including timestamps, an identifier, and the operation type, to the standard error stream.
Attach a debugger and dig through the layers to find and examine the operation queues. This answer covers both understanding handler tracking and using a debugger to examine an operation queue for the epoll_reactor.
Finally, if you believe it is a bug, then it may be worth updating to the latest version or checking the revision history for relevant changes. Regardless, describing the problem in more detail may allow others to help identify the source of the problem and potential solutions.
Now, I spent a few hours reading and experimenting (I need more boost::asio functionality for work as well) and it turns out: kind of.
But it is not as straightforward or readable as one might hope.
Under the hood (well, under the outermost hood), io_service has a bunch of other services registered, which do the work that the async_ operations of their respective areas require.
These are the "Services" described in the reference.
Now, sadly, the services stay registered whether there is work to do or not. For example, if your io_service has a udp socket, it will still have all the corresponding services, even if the socket itself is inactive.
But you can ask your io_service which services it has. Let's say you want to know whether your io_service, called m_io_service, has a udp datagram_socket_service. Then you can call something like:
if (boost::asio::has_service<boost::asio::datagram_socket_service<boost::asio::ip::udp> >(m_io_service))
{
    // Whatever
}
That does not help a lot, because it will be true regardless of whether the socket is active or not. But once you know you have that service, you can get a reference to it using use_service instead of has_service, with the same elegant amount of <>.
And now you can inspect the service to see what it is up to. Sadly, it will not tell you what the outstanding handlers' names are (probably partly because it does not know them), but if it is a socket service, you can get its implementation_type and with that check whether it currently is_open, or find the local_endpoint as well as the remote_endpoint.
In the case of a deadline_timer_service you can, among other things, find out when it expires_at.
See the reference for more information on what each service is and is not willing to tell you.
http://www.boost.org/doc/libs/1_54_0/doc/html/boost_asio/reference.html
This information should then hopefully allow you to determine which async_ operation did not return.
And if not, at the very least you can cancel any unexpectedly active services.

UDP empty buffer ReceiveAsync

I used SocketClient.cs from this thread and a very similar one from MSDN.
Can somebody tell me why the buffer is empty after packets are received?
I have a host application on Windows 8, and I send a packet with some information from the phone. The host then replies to me with a new packet, but the 'Receive' method receives empty information; the buffer is empty. How can I fix that?
If you are not reacting to the Completed event of the SAEA object, no data has been received. If you are, then you received an empty message or your buffer size was 0. This is what the docs are telling you.
I had a look at the code in your link and found that it is using a ManualResetEvent with the SendToAsync method. I don't know why it is doing this but it may be one cause, depending on the timeout specified.
I guess not everyone reads through the docs for the SAEA object, but you have to think of it as a thread synchronization object. It is sent to a thread, does its work there, and signals when it has finished; that's just it. Maybe this is the issue with the code in your linked post: the thread that should receive the signal from the SAEA object is busy until the Reset method is called. If so, no event from the SAEA object that is working in another thread is getting through.
Also note that SendToAsync may return false immediately if the result is available at the time of the call, in which case you can examine the result right away. So you would safely call it like:
if (!_socket.SendToAsync(myEventArgs))
    ProcessResult(myEventArgs);
So the basic idea is: If you use the SocketAsyncEventArgs, don't use threading. The Async socket methods try to make the threading transparent to the user, and you would just add a threading layer on top of this. This is likely to get you in trouble.

WCF: DuplexSessionChannel, Asynchronous Operations and an Exception

Today I have a WCF question, though it probably pertains to other networking models in .NET as well.
I have a WCF service that exposes a Send(Message) OperationContract, which is OneWay = true. Now this service has a callback channel to return Messages to the client.
Anyway, I am trying (successfully) to call this Send method from my client asynchronously. On a DuplexSessionChannel I am calling BeginSend(Message, OnSendComplete, null), and I have an OnSendComplete(IAsyncResult) method that calls EndSend(asyncResult) on the DuplexSessionChannel.
The service has a CallbackContract and uses the same BeginSend()/EndSend() pattern for sending back to the client, which is done on the callback channel I get with OperationContext.Current.GetCallbackChannel.
The client on its DuplexSessionChannel calls BeginReceive()/EndReceive() when receiving messages back from the Services callback channel.
Even though things are working, I don't understand what the End<Operation>() methods actually do, and this is what I need to have explained to me.
I ask because I am getting an occasional exception in a call to EndSend() on the Service (sending back to the client) complaining that a collection has been modified (I know what this exception means, but not why it is happening or where exactly...). I am using PollingDuplexHttpBinding with a Silverlight client.
I am not a WCF expert, but don't hold back on the details; I need the knowledge. I have seen this sort of Begin/End pattern before around other async operations during my career, but never really understood what was going on.
Thanks in advance.
It sounds like your question is just about the Begin/End APM (async programming model). Briefly, the APM takes a sync method like
R Foo(A a); // R is some result type, A is some argument type
and breaks it into async BeginFoo and EndFoo methods. The main advantage happens when the operation is doing some truly asynchronous system operation (e.g. talking to the network) that may be long-running (at least compared to other functions; e.g. talking to the network may take hundreds of milliseconds or more). This pattern gives you a way to tell the system to start the operation, and then call you back when the result of the operation is ready. The advantage to the pattern is you don't have to have a managed thread blocked while this call is pending (which means e.g. that you can have thousands of pending network reads/writes without needing thousands of threads, hurray, threads are expensive).
So given that, 'BeginFoo' is how you say 'start the method with these arguments', and then when you get called back (as notification that the result is ready), 'EndFoo' is how you get the result. In the general case, if 'Foo' might throw a particular exception, then this exception might come out of either the 'Begin' call or the 'End' call and you have to be prepared to handle it in both places.
In the case of something like Send() (which maybe returns void? I forget) it's a little annoying/weird because since it's one-way you kinda just want to 'fire-and-forget'. But exceptions can still happen (e.g. I tried to send but someone unplugged my network cable), and so this may yield exceptions... and given the Begin/End APM, such an exception might come out of the EndSend call. In effect, the exception is a kind of 'result' of calling Send, and so you calling EndSend provides a way for the system to throw an exception at you to say something went wrong after you called BeginSend.