Netty - `Future.operationComplete` never called when running in SSL mode in idle timeout handler

We have implemented the following channelIdle handler:
public void channelIdle(ChannelHandlerContext ctx, IdleStateEvent e) throws Exception {
    Response response = business.getResponse();
    final Channel channel = e.getChannel();
    ChannelFuture channelFuture = Channels.write(
            channel,
            ChannelBuffers.wrappedBuffer(response.getXML().getBytes()));
    if (response.shouldDisconnect()) { // returns true, and the listener _is_ added
        channelFuture.addListener(new ChannelFutureListener() {
            @Override
            public void operationComplete(ChannelFuture future) throws Exception {
                channel.close(); // never gets called :(
            }
        });
    }
}
When running in non-SSL mode this works as expected.
However, when running with SSL enabled, the operationComplete method never gets called. We have verified this several times on various machines: the idle timeout fires many times, but operationComplete is never called, and we don't see any exceptions being thrown.
I've tried tracing through the code to see where operationComplete should get called but it is complex and I've not quite figured it out.
There is a call to future = succeededFuture(channel); in SslHandler.wrap() but I don't know if that means anything. The future returned from wrap is never used elsewhere in the SslHandler code.
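For completeness, the same write-then-close intent can be expressed with Netty 3's built-in close listener; this is just a shorter equivalent sketch and presumably shows the same behaviour:
ChannelFuture channelFuture = Channels.write(
        channel,
        ChannelBuffers.wrappedBuffer(response.getXML().getBytes()));
if (response.shouldDisconnect()) {
    // built-in listener that closes the channel once the write completes
    channelFuture.addListener(ChannelFutureListener.CLOSE);
}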

This sounds like a bug. Would it be possible to write a simple test case that shows the problem and open an issue at our GitHub issue tracker [1]? Be sure to explain whether it happens all the time or only sometimes.
[1] https://github.com/netty/netty/issues
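If it helps, a cut-down pipeline wiring along the following lines could serve as the skeleton of such a test case. This is a sketch only; the SSLContext setup, the timer lifecycle, and BusinessIdleHandler (the handler holding the channelIdle() shown above) are assumptions, not taken from your code.
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.ChannelPipelineFactory;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.handler.ssl.SslHandler;
import org.jboss.netty.handler.timeout.IdleStateHandler;
import org.jboss.netty.util.HashedWheelTimer;
import org.jboss.netty.util.Timer;

public class IdleSslTestPipelineFactory implements ChannelPipelineFactory {

    private final Timer timer = new HashedWheelTimer();
    private final SSLContext sslContext; // assumed to be initialised elsewhere
    private final boolean useSsl;        // toggle to compare SSL vs. non-SSL behaviour

    public IdleSslTestPipelineFactory(SSLContext sslContext, boolean useSsl) {
        this.sslContext = sslContext;
        this.useSsl = useSsl;
    }

    public ChannelPipeline getPipeline() throws Exception {
        ChannelPipeline pipeline = Channels.pipeline();
        if (useSsl) {
            SSLEngine engine = sslContext.createSSLEngine();
            engine.setUseClientMode(false);
            pipeline.addLast("ssl", new SslHandler(engine));
        }
        // fire an all-idle event after 5 seconds of inactivity
        pipeline.addLast("idle", new IdleStateHandler(timer, 0, 0, 5));
        // hypothetical handler containing the channelIdle() implementation from the question
        pipeline.addLast("handler", new BusinessIdleHandler());
        return pipeline;
    }
}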

Related

Apache HTTPClient5 - How to Prevent Connection/Stream Refused

Problem Statement
Context
I'm a Software Engineer in Test running order permutations of Restaurant Menu Items to confirm that they succeed order placement with the POS.
In short, this POSTs a JSON payload to an endpoint, which then validates the order with a POS to determine success/fail/other.
The POS, and therefore the Transactions per Second (TPS), may vary, but each back end uses the same core handling.
This can be as high as ~22,000 permutations per item, each of easily manageable JSON size, that need to be handled as quickly as possible.
The network can vary wildly depending upon the restaurant and/or region one is testing, e.g. some have much higher latency than others.
Therefore, the HTTP client should be able to intelligently negotiate the same content and endpoint regardless of this.
Direct Problem
I'm using Apache's HTTP Client 5 with PoolingAsyncClientConnectionManager to execute both the GET for the menu contents and the POST to check whether the order succeeds.
This works out of the box, but sometimes loses connections with Stream Refused, specifically:
org.apache.hc.core5.http2.H2StreamResetException: Stream refused
No individual tuning that I can find seems to work across all network contexts with variable latency.
Following the stack trace seems to indicate that the stream had already closed, so the client needs a way either to keep it open or to avoid executing against an already-closed connection:
if (connState == ConnectionHandshake.GRACEFUL_SHUTDOWN) {
    throw new H2StreamResetException(H2Error.PROTOCOL_ERROR, "Stream refused");
}
Some Attempts to Fix the Problem
Tried to use search engines to find answers, but there are few hits for HTTPClient5.
Tried to use the official documentation, but it is sparse.
Tried changing max connections per route to a reduced number, shifting inactivity validations, or setting a connection time to live.
The inactivity checks may fix the POST, but stall the GET for some transactions.
Tuning for one region/restaurant may work for one and then break for another, with only the network as the variable.
PoolingAsyncClientConnectionManagerBuilder builder = PoolingAsyncClientConnectionManagerBuilder
        .create()
        .setTlsStrategy(getTlsStrategy())
        .setMaxConnPerRoute(12)
        .setMaxConnTotal(12)
        .setValidateAfterInactivity(TimeValue.ofMilliseconds(1000))
        .setConnectionTimeToLive(TimeValue.ofMinutes(2))
        .build();
Tried shifting to a custom RequestConfig with different timeouts:
private HttpClientContext getHttpClientContext() {
    RequestConfig requestConfig = RequestConfig.custom()
            .setConnectTimeout(Timeout.of(10, TimeUnit.SECONDS))
            .setResponseTimeout(Timeout.of(10, TimeUnit.SECONDS))
            .build();
    HttpClientContext httpContext = HttpClientContext.create();
    httpContext.setRequestConfig(requestConfig);
    return httpContext;
}
Initial Code Segments for Analysis
(In addition to the above segments w/ change attempts)
Wrapper handling to init and get response
public SimpleHttpResponse getFullResponse(String url, PoolingAsyncClientConnectionManager manager, SimpleHttpRequest req) {
    try (CloseableHttpAsyncClient httpclient = getHTTPClientInstance(manager)) {
        httpclient.start();
        CountDownLatch latch = new CountDownLatch(1);
        long startTime = System.currentTimeMillis();
        Future<SimpleHttpResponse> future = getHTTPResponse(url, httpclient, latch, startTime, req);
        latch.await();
        return future.get();
    } catch (IOException | InterruptedException | ExecutionException e) {
        e.printStackTrace();
        return new SimpleHttpResponse(999, CommonUtils.getExceptionAsMap(e).toString());
    }
}
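For context, a call into this wrapper might look like the following. The URL, payload variable, and the use of SimpleRequestBuilder (which assumes HttpClient 5.1+) are illustrative placeholders rather than the actual test code.
// illustrative only: build a JSON POST and pass it through the wrapper above
SimpleHttpRequest order = SimpleRequestBuilder.post("https://pos.example.invalid/orders")
        .setBody(orderJson, ContentType.APPLICATION_JSON)
        .build();
SimpleHttpResponse response = getFullResponse("https://pos.example.invalid/orders", manager, order);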
With actual handler and probing code
private Future<SimpleHttpResponse> getHTTPResponse(String url, CloseableHttpAsyncClient httpclient, CountDownLatch latch, long startTime, SimpleHttpRequest req) {
    return httpclient.execute(req, getHttpContext(), new FutureCallback<SimpleHttpResponse>() {
        @Override
        public void completed(SimpleHttpResponse response) {
            latch.countDown();
            logger.info("[{}][{}ms] - {}", response.getCode(), getTotalTime(startTime), url);
        }

        @Override
        public void failed(Exception e) {
            latch.countDown();
            logger.error("[{}ms] - {} - {}", getTotalTime(startTime), url, e);
        }

        @Override
        public void cancelled() {
            latch.countDown();
            logger.error("[{}ms] - request cancelled for {}", getTotalTime(startTime), url);
        }
    });
}
Direct Question
Is there a way to configure the client such that it can handle these variances on its own, without explicitly modifying the configuration for each endpoint context?
Fixed with a Combination of the Below to Assure the Connection is Live/Ready
(Or at least the result is stable)
Forcing HTTP 1
HttpAsyncClients.custom()
        .setConnectionManager(manager)
        .setRetryStrategy(getRetryStrategy())
        .setVersionPolicy(HttpVersionPolicy.FORCE_HTTP_1)
        .setConnectionManagerShared(true);
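The getRetryStrategy() referenced above isn't shown in the post; one plausible shape, an assumption rather than the exact strategy used, is to extend DefaultHttpRequestRetryStrategy so that refused H2 streams are also retried:
// Assumes HttpClient 5.x; relevant classes:
//   org.apache.hc.client5.http.HttpRequestRetryStrategy
//   org.apache.hc.client5.http.impl.DefaultHttpRequestRetryStrategy
//   org.apache.hc.core5.http2.H2StreamResetException
private HttpRequestRetryStrategy getRetryStrategy() {
    // retry up to 3 times, 1 second apart, and additionally treat a refused H2 stream as retriable
    return new DefaultHttpRequestRetryStrategy(3, TimeValue.ofSeconds(1)) {
        @Override
        public boolean retryRequest(HttpRequest request, IOException exception,
                                    int execCount, HttpContext context) {
            if (exception instanceof H2StreamResetException && execCount <= 3) {
                return true; // "Stream refused" surfaces as an IOException subclass
            }
            return super.retryRequest(request, exception, execCount, context);
        }
    };
}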
Setting Effective Headers for POST
Specifically the close header
req.setHeader("Connection", "close, TE");
Note: the inactivity check helps, but refusals still sometimes occur without this.
Setting Inactivity Checks by Type
Set POSTs to validate immediately after inactivity
Note: Using 1000 for both caused a high drop rate for some systems
PoolingAsyncClientConnectionManagerBuilder
        .create()
        .setValidateAfterInactivity(TimeValue.ofMilliseconds(0))
Set GET to validate after 1s
PoolingAsyncClientConnectionManagerBuilder
        .create()
        .setValidateAfterInactivity(TimeValue.ofMilliseconds(1000))
Given the Error Context
Tracing the connection problem through the stack trace to AbstractH2StreamMultiplexer shows ConnectionHandshake.GRACEFUL_SHUTDOWN as the trigger of the stream refusal:
if (connState == ConnectionHandshake.GRACEFUL_SHUTDOWN) {
    throw new H2StreamResetException(H2Error.PROTOCOL_ERROR, "Stream refused");
}
Which corresponds to:
connState = streamMap.isEmpty() ? ConnectionHandshake.SHUTDOWN : ConnectionHandshake.GRACEFUL_SHUTDOWN;
Reasoning
If I'm understanding correctly:
The connections were being closed, intentionally or not.
However, they were not being confirmed ready before executing again.
This caused failures because the stream was no longer viable.
Therefore the fix works because (it seems):
Forcing HTTP/1 allows a single context to be managed, whereas HttpVersionPolicy NEGOTIATE/FORCE_HTTP_2 had greater or equivalent failures across the spectrum of regions/menus.
It assures that all connections are valid before use.
POSTs are always closed due to the close header, which is unavailable in HTTP/2.
Therefore:
GET is checked for validity with reasonable periodicity.
POST is checked every time, and since it is forcibly closed, it is re-acquired before execution.
This leaves no room for unexpected closures, and it removes the possibility that the client was incorrectly switching to HTTP/2.
Will accept this until a better answer comes along, as this is stable but sub-optimal.
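Putting those pieces together, the stable configuration looks roughly like the sketch below. The split into two managers, the variable names, and req are illustrative; the values are the ones quoted above.
// Two managers so GET and POST can use different inactivity validation.
PoolingAsyncClientConnectionManager getManager = PoolingAsyncClientConnectionManagerBuilder
        .create()
        .setValidateAfterInactivity(TimeValue.ofMilliseconds(1000)) // GET: revalidate after 1s idle
        .build();

PoolingAsyncClientConnectionManager postManager = PoolingAsyncClientConnectionManagerBuilder
        .create()
        .setValidateAfterInactivity(TimeValue.ofMilliseconds(0))    // POST: revalidate every time
        .build();

// Force HTTP/1.1 so the "Connection: close" header applies and H2 stream refusals are avoided.
CloseableHttpAsyncClient postClient = HttpAsyncClients.custom()
        .setConnectionManager(postManager)
        .setVersionPolicy(HttpVersionPolicy.FORCE_HTTP_1)
        .build();

// Each POST also carries the close header so its connection is re-acquired before the next use.
req.setHeader("Connection", "close, TE");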

How to solve this InvalidCastException while applying ClientCertificate to WCF DataService Client?

I have used the examples in this article to add client certificate authentication to my WCF Data Service. I had to change the example slightly because I am using WCF DataService 5.6 in which the SendingRequest event has been deprecated and replaced by SendingRequest2.
Basically this meant changing the following event handler:
private void OnSendingRequest_AddCertificate(object sender, SendingRequestEventArgs args)
{
    if (null != ClientCertificate)
    {
        ((HttpWebRequest)args.Request).ClientCertificates.Add(ClientCertificate);
    }
}
To:
private void OnSendingRequest_AddCertificate(object sender, SendingRequest2EventArgs args)
{
    if (null != ClientCertificate)
    {
        ((HttpWebRequestMessage)args.RequestMessage).HttpWebRequest.ClientCertificates.Add(ClientCertificate);
    }
}
This seems to work. However now I get the following InvalidCastException on some actions:
System.InvalidCastException: Unable to cast object of type
'System.Data.Services.Client.InternalODataRequestMessage' to type
'System.Data.Services.Client.HttpWebRequestMessage'.
I haven't been able to identify with 100% accuracy which actions these are, but it seems to happen consistently on the SaveChanges method (see the stack trace below):
at MyNamespace.MyContainer.OnSendingRequest_AddCertificate(Object sender, SendingRequest2EventArgs args)
at System.Data.Services.Client.ODataRequestMessageWrapper.FireSendingRequest2(Descriptor descriptor)
at System.Data.Services.Client.BatchSaveResult.GenerateBatchRequest()
at System.Data.Services.Client.BatchSaveResult.BatchRequest()
at System.Data.Services.Client.DataServiceContext.SaveChanges(SaveChangesOptions options)
I came to the modification from SendingRequest to SendingRequest2 by trial and error, so I wonder if I overlooked something there. Or is this completely unrelated and should I just add an && args.RequestMessage is HttpWebRequestMessage to the if statement in the handler?
Seems that you are sending a batch request. A batch request contains several internal requests, which are InternalODataRequestMessage. SendingRequest2 will apply the OnSendingRequest action to both the $batch request and its internal requests.
You can try the following code:
private void OnSendingRequest_AddCertificate(object sender, SendingRequest2EventArgs args)
{
    if (null != ClientCertificate && !args.IsBatchPart)
    {
        ((HttpWebRequestMessage)args.RequestMessage).HttpWebRequest.ClientCertificates.Add(ClientCertificate);
    }
}
args.IsBatchPart returns true if this event is fired for a request within a batch; otherwise it returns false.
It seems the problem occurs when I perform a batch operation.
I tried to dig through InternalODataRequestMessage to see if I could add client certificates to it somehow using reflection and the DataServices source. I found that the instance of InternalODataRequestMessage has a private member requestMessage of type ODataBatchOperationRequestMessage. Looking at the source code, I couldn't find any way to add a certificate.
What I did notice is that I can actually still use the deprecated SendingRequest event just like before. So that's what I did and everything seems fine.
I feel like there should be a way to use a client certificate without using deprecated methods. So if someone has an answer that shows that, I'll accept that.

WP7: Unable to catch FaultException in asynchronous calls to WCF service

I am currently developing a Windows Phone 7 App that calls a WCF web service which I also control. The service offers an operation that returns the current user's account information when given a user's login name and password:
[ServiceContract]
public interface IWindowsPhoneService
{
    [OperationContract]
    [FaultContract(typeof(AuthenticationFault))]
    WsAccountInfo GetAccountInfo(string iamLogin, string password);
}
Of course, there is always the possibility of an authentication failure and I want to convey that information to the WP7 app. I could simply return null in that case, but I would like to convey the reason why the authentication failed (i.e. login unknown, wrong password, account blocked, ...).
This is my implementation of the above operation (for testing purposes, all it does is throwing an exception):
public WsAccountInfo GetAccountInfo(string iamLogin, string password)
{
    AuthenticationFault fault = new AuthenticationFault();
    throw new FaultException<AuthenticationFault>(fault);
}
Now, if I call this operation in my WP7 app, like this:
Global.Proxy.GetAccountInfoCompleted += new EventHandler<RemoteService.GetAccountInfoCompletedEventArgs>(Proxy_GetAccountInfoCompleted);
Global.Proxy.GetAccountInfoAsync(txbLogin.Text, txbPassword.Password);

void Proxy_GetAccountInfoCompleted(object sender, RemoteService.GetAccountInfoCompletedEventArgs e)
{
    if (e.Error != null)
    {
        MessageBox.Show(e.Error.Message);
        return;
    }
}
The debugger breaks in Reference.cs, saying that FaultException`1 was unhandled, here:
public PhoneApp.RemoteService.WsAccountInfo EndGetAccountInfo(System.IAsyncResult result) {
    object[] _args = new object[0];
    PhoneApp.RemoteService.WsAccountInfo _result = ((PhoneApp.RemoteService.WsAccountInfo)(base.EndInvoke("GetAccountInfo", _args, result)));
    return _result;
}
BEGIN UPDATE 1
When pressing F5, the exception bubbles to:
public PhoneApp.RemoteService.WsAccountInfo Result {
    get {
        base.RaiseExceptionIfNecessary(); // <-- here
        return ((PhoneApp.RemoteService.WsAccountInfo)(this.results[0]));
    }
}
and then to:
private void Application_UnhandledException(object sender, ApplicationUnhandledExceptionEventArgs e)
{
    if (System.Diagnostics.Debugger.IsAttached)
    {
        // An unhandled exception has occurred; break into the debugger
        System.Diagnostics.Debugger.Break();
    }
}
After that, the app terminates (with or without the debugger).
END UPDATE 1
Now, I would love to catch the exception in my code, but I am never given the chance, since my Completed handler is never reached.
Based on similar questions on this site, I have already tried the following:
Re-add the service reference --> no change
Re-create a really simple WCF service from scratch --> same problem
Start the app without the debugger to keep the app from breaking into the debugger --> well, it doesn't break, but the exception is not caught either, the app simply exits
Tell VS 2010 not to break on FaultExceptions (Debug > Options) --> does not have any effect
Wrap every line in my app in try { ... } catch (FaultException) {} or even catch (Exception) --> never called.
BEGIN UPDATE 2
What I actually would like to achieve is one of the following:
ideally, reach GetAccountInfoCompleted(...) and be able to retrieve the exception via the GetAccountInfoCompletedEventArgs.Error property, or
be able to catch the exception via a try/catch clause
END UPDATE 2
I would be grateful for any advice that would help me resolve this issue.
The framework seems to read your WsAccountInfo.Result property.
This rethrows the exception on the client side.
But you should be the first to read this property.
I don't know your AuthenticationFault class: does it have a DataContractAttribute, and is it a known type like the example at
http://msdn.microsoft.com/en-us/library/system.servicemodel.faultcontractattribute.aspx ?
I believe I had the same problem. I resolved it by extending the proxy class and calling the private Begin.../End... methods within the Client object rather than using the public auto-generated methods on the Client object.
For more details, please see:
http://cbailiss.wordpress.com/2014/02/09/wcf-on-windows-phone-unable-to-catch-faultexception/

Memory leak using WCF GetCallbackChannel over named pipe

We have a simple WPF application that connects to a service running on the local machine. We use a named pipe for the connection and then register a callback so that later the service can send updates to the client.
The problem is that with each call of the callback we get a build-up of memory in the client application.
This is how the client connects to the service.
const string url = "net.pipe://localhost/radal";
_channelFactory = new DuplexChannelFactory<IRadalService>(this, new NetNamedPipeBinding(),url);
and then in a threadpool thread we loop doing the following until we are connected
var service = _channelFactory.CreateChannel();
service.Register();
service.Register looks like this on the server side
public void Register()
{
    _callback = OperationContext.Current.GetCallbackChannel<IRadalCallback>();
    OperationContext.Current.Channel.Faulted += (sender, args) => Dispose();
    OperationContext.Current.Channel.Closed += (sender, args) => Dispose();
}
This callback is stored and when new data arrives we invoke the following on the server side.
void Sensors_OnSensorReading(object sender, SensorReadingEventArgs e)
{
    _callback.OnReadingReceived(e.SensorId, e.Count);
}
Where the parameters are an int and a double. On the client this is handled as follows.
public void OnReadingReceived(int sensorId, double count)
{
    _events.Publish(new SensorReadingEvent(sensorId, count));
}
But we have found that commenting out _events.Publish(...) makes no difference to the memory usage. Does anyone see any logical reason why this might be leaking memory? We have used a profiler to track the problem to this point, but cannot find what type of object is building up.
Well, I can partially answer this now. The problem is partially caused by us trying to be clever: opening the connection on another thread and then passing it back to the main GUI thread. The solution was to not use a thread but instead use a dispatcher timer. It does have the downside that the initial data load is now on the GUI thread, but we are not loading all that much anyway.
However, this was not the entire solution (actually we don't have an entire solution). Once we moved over to a better profiler, we found that the objects building up were timeout handlers, so we disabled that feature. That's OK for us as we are always running against localhost, but I can imagine it would be an issue for people working with remote services.

WCF Proxy Client taking time to create, any cache or singleton solution for it

We have more than a dozen WCF services being called using TCP binding. There are a lot of calls to the same WCF service at various places in the code.
AdminServiceClient client = FactoryS.AdminServiceClient(); // this takes significant time
client.GetSomeThing(param1);
client.Close();
I want to cache the client or produce it from a singleton so that I can save some time. Is it possible?
Thanks
Yes, this is possible. You can make the proxy object visible to the entire application, or wrap it in a singleton class for neatness (my preferred option). However, if you are going to reuse a proxy for a service, you will have to handle channel faults.
First create your singleton class / cache / global variable that holds an instance of the proxy (or proxies) that you want to reuse.
When you create the proxy, you need to subscribe to the Faulted event on the inner channel
proxyInstance.InnerChannel.Faulted += new EventHandler(ProxyFaulted);
and then put some reconnect code inside the ProxyFaulted event handler. The Faulted event will fire if the service drops, or the connection times out because it was idle. The faulted event will only fire if you have reliableSession enabled on your binding in the config file (if unspecified this defaults to enabled on the netTcpBinding).
Edit: If you don't want to keep your proxy channel open all the time, you will have to test the state of the channel before every time you use it, and recreate the proxy if it is faulted. Once the channel has faulted there is no option but to create a new one.
Edit2: The only real difference in load between keeping the channel open and closing it every time is a keep-alive packet being sent to the service and acknowledged every so often (which is what is behind your channel fault event). With 100 users I don't think this will be a problem.
The other option is to put your proxy creation inside a using block where it will be closed / disposed at the end of the block (which is considered bad practice). Closing the channel after a call may result in your application hanging because the service is not yet finished processing. In fact, even if your call to the service was async or the service contract for the method was one-way, the channel close code will block until the service is finished.
Here is a simple singleton class that should have the bare bones of what you need:
public static class SingletonProxy
{
    private static readonly object syncRoot = new object();
    private static CupidClientServiceClient proxyInstance = null;

    public static CupidClientServiceClient ProxyInstance
    {
        get
        {
            if (proxyInstance == null)
            {
                AttemptToConnect();
            }
            return proxyInstance;
        }
    }

    private static void ProxyChannelFaulted(object sender, EventArgs e)
    {
        bool connected = false;
        while (!connected)
        {
            // you may want to put timer code around this, or
            // other code to limit the number of retries if
            // the connection keeps failing
            connected = AttemptToConnect();
        }
    }

    public static bool AttemptToConnect()
    {
        // this whole process needs to be thread safe;
        // lock on a dedicated object rather than the (possibly null) proxy instance
        lock (syncRoot)
        {
            try
            {
                if (proxyInstance != null)
                {
                    // deregister the event handler from the old instance
                    proxyInstance.InnerChannel.Faulted -= ProxyChannelFaulted;
                }
                // (re)create the instance
                proxyInstance = new CupidClientServiceClient();
                // always open the connection
                proxyInstance.Open();
                // add the event handler for the new instance after Open(),
                // so that a failing open doesn't keep firing the Faulted event
                proxyInstance.InnerChannel.Faulted += ProxyChannelFaulted;
                return true;
            }
            catch (EndpointNotFoundException)
            {
                // do something here (log, show user message etc.)
                return false;
            }
            catch (TimeoutException)
            {
                // do something here (log, show user message etc.)
                return false;
            }
        }
    }
}
I hope that helps :)
In my experience, creating/closing the channel on a per call basis adds very little overhead. Take a look at this Stackoverflow question. It's not a Singleton question per se, but related to your issue. Typically you don't want to leave the channel open once you're finished with it.
I would encourage you to use a reusable ChannelFactory implementation if you're not already and see if you still are having performance problems.