Which timeout should I set for an external service?

This service is a remote session pool. I need to ask it for a session in order to work with other services. In most cases the pool has a session available, so I get a response within 15 ms. Sometimes, however, it has to create a session on demand, which can take up to 800 ms.
I have two options in mind to handle this situation:
1. Set a 15 ms timeout and implement a retry policy with exponential backoff up to 800 ms. The service will create the required session whether or not I am still connected to it.
2. Set an 800 ms timeout and stay connected to the service until a session is available for me.
In both cases, there's no guarantee that I will have a session after 800ms.
So the question is: what are the pros and cons of each option?

1. Set a 15 ms timeout and implement a retry policy with exponential backoff up to 800 ms. The service will create the required session whether or not I am still connected to it.
Pros
You detect immediately that no session is available, instead of waiting almost a second to find out.
The client decides whether to request the session again or take another path, which gives you more flexibility for different use cases.
You can distinguish the undesired event of waiting more than 15 ms for a session, and report it each time the fallback strategy kicks in. This is useful for detecting abnormal session pool behaviour.
Cons
The code is more complex because of the fallback behaviour.
Multiple parameters because of different timeouts.
2. Set an 800 ms timeout and stay connected to the service until a session is available for me.
Pros
Simple and straightforward implementation.
Simple parametrization.
Cons
You can't observe session creation delays in the session pool. That matters for tracing and diagnostics: this simple approach could hide session pool problems.
Not flexible enough for clients with different use cases.
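Option 1 above can be sketched roughly as follows in Python. This is only an illustration under assumptions: `request_session` is a hypothetical callable standing in for the real pool client, `SessionUnavailable` is a made-up exception for "no session within the attempt timeout", and the `sleep` and `clock` parameters exist only so the logic can be exercised without real waiting.

```python
import time
from typing import Callable, Optional


class SessionUnavailable(Exception):
    """Raised by the (hypothetical) pool client when no session arrives in time."""


def acquire_session(
    request_session: Callable[[float], str],
    attempt_timeout: float = 0.015,   # 15 ms per attempt, from the question
    total_budget: float = 0.800,      # give up after roughly 800 ms overall
    first_backoff: float = 0.015,     # assumed starting delay between attempts
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Try the fast path first; on timeout, back off exponentially.

    Returns the session, or None once the overall budget is exhausted.
    Each failed attempt is a natural place to log or count the slow path.
    """
    deadline = clock() + total_budget
    backoff = first_backoff
    while True:
        try:
            return request_session(attempt_timeout)
        except SessionUnavailable:
            remaining = deadline - clock()
            if remaining <= 0:
                return None
            # Never sleep past the overall deadline.
            sleep(min(backoff, remaining))
            backoff *= 2
```

Option 2 would correspond to a single call with an 800 ms timeout and no loop, which is simpler but gives no hook for noticing that the slow path was taken.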
-
I think the deciding factor is whether you need a solution that just works for this one use case, or whether this approach will be reused by different clients and use cases.
PS: If you need a solution for different clients, it may be worth creating a richer protocol, like:
// just takes a session if available, no more than 15 ms delay expected
get_session(...) : session
// if not available, creates one
get_session_or_create(...) : session
available_sessions(...) : int
// between 0 and 1, the proportion of available sessions
availability(...) : double
...
It's up to the client how to use it.
Also, oversize the timeout parameters by a safe percentage, depending on the variance of the session creation delay.
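To make the difference between the operations listed above concrete, here is a toy in-memory version in Python. It is a sketch, not a real pool: the `SessionPool` class, the `release` method and the string session ids are assumptions added for illustration, and `availability` is interpreted here as the share of existing sessions that are idle.

```python
from typing import List, Optional


class SessionPool:
    """Toy in-memory pool mirroring the operations listed above."""

    def __init__(self, preallocated: int) -> None:
        self._idle: List[str] = [f"session-{i}" for i in range(preallocated)]
        self._created = preallocated

    def get_session(self) -> Optional[str]:
        # Fast path only: hand out an idle session, or report that none is free.
        return self._idle.pop() if self._idle else None

    def get_session_or_create(self) -> str:
        # Slow path allowed: create on demand (the expensive case in a real pool).
        session = self.get_session()
        if session is None:
            session = f"session-{self._created}"
            self._created += 1
        return session

    def release(self, session: str) -> None:
        # Hypothetical helper: return a session to the idle set.
        self._idle.append(session)

    def available_sessions(self) -> int:
        return len(self._idle)

    def availability(self) -> float:
        # Between 0 and 1: idle sessions divided by all sessions created so far.
        return len(self._idle) / self._created if self._created else 0.0
```

A client that cares about latency would call `get_session` and fall back to something else on `None`, while a client that only cares about eventually getting a session would call `get_session_or_create` with a longer timeout.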

Related

Azure IoT Hub and reliable direct methods

We are working with a scenario where a device can be "unlocked", and we want to be certain that the unlock state is properly propagated to the server.
For the moment we are using Direct Methods, but there are concerns about what happens when the call times out. As we understand it, if the server times out but the device successfully responds (getting an MQTT PUBACK from the IoT Hub), then we have an inconsistency where the device is "unlocked" but the server thinks it failed. This is a state we want to avoid, and it's important that the device and server are in sync.
Are there any good patterns for solving this?
In my opinion, you don't need to worry about this issue. Direct methods represent a request-reply interaction with a device, similar to an HTTP call in that they succeed or fail immediately (after a user-specified timeout). This approach is useful for scenarios where the immediate course of action depends on whether the device was able to respond. Direct methods are synchronous and either succeed or fail after the timeout period (default: 30 seconds, settable up to 3600 seconds). However, there is no guarantee on ordering or any concurrency semantics for method calls.
We are going to rethink how we see the timeout and embrace uncertainty in this case. If the call times out, we will treat the result as inconclusive and wait for the next telemetry data from the device to settle it.
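The "timeout means inconclusive" idea can be sketched as a small state model in Python. Nothing here uses the Azure SDK: `UnlockStatus`, `status_from_method_call` and `reconcile` are hypothetical names, and the booleans stand in for whatever the real direct method result and telemetry payload contain.

```python
from enum import Enum


class UnlockStatus(Enum):
    CONFIRMED = "confirmed"
    FAILED = "failed"
    INCONCLUSIVE = "inconclusive"


def status_from_method_call(timed_out: bool, device_reported_success: bool) -> UnlockStatus:
    """Map the outcome of a direct method call to a server-side status."""
    if timed_out:
        # The device may or may not have unlocked, so do not record a failure.
        return UnlockStatus.INCONCLUSIVE
    return UnlockStatus.CONFIRMED if device_reported_success else UnlockStatus.FAILED


def reconcile(status: UnlockStatus, telemetry_unlocked: bool) -> UnlockStatus:
    """Resolve an inconclusive status using the next telemetry message."""
    if status is not UnlockStatus.INCONCLUSIVE:
        return status
    return UnlockStatus.CONFIRMED if telemetry_unlocked else UnlockStatus.FAILED
```

The point of the third state is that the server never records "failed" on a timeout alone; only a definite reply or a later telemetry message settles it.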

In what scenarios is a reliable session recommended?

In a few words, if I am not wrong, a session is used when I want to ensure that packets are delivered in order, and using sessions requires a reliable connection.
But my doubt is: what kind of applications need that? My case is a simple application in which a client requests data from a service, the service reads the data from a database and sends the results back to the client. The client can also request to add, modify or delete data in the database. In this case, do I need a reliable connection and sessions or not?
Thanks.
A session presumes that you want to retain some data for some period of time. As far as a session is concerned, that period is the client's lifecycle: when the client opens the proxy, both the service instance and the session are created, and when the client closes the proxy, the service instance and the session are terminated. There is one exception in which closing the proxy does not take effect right away, and it occurs when you invoke a one-way operation. The service keeps working for as long as the operation is running, even though it has already been told to dispose of the instance.
Based on the information provided, I assume the best choice would be PerCall. You do not store any data between calls, and every call can be treated independently. Additionally, consider setting ConcurrencyMode to Multiple so that service instances can be created simultaneously.
Personally, I find sessions useful with MSMQ, whenever I want a specific number of messages to be wrapped into a single queue message. If an error occurs, regardless of which message caused it, the whole queue message is rolled back.

How do I correctly configure a WCF NetTcp Duplex Reliable Session?

Please excuse the obvious self-Q/A, but this information is widely misunderstood and almost always incorrectly answered, so I wanted to place it here for people searching for a definitive answer to this problem.
Even so, there's still some information I haven't been able to nail down. I will put this towards the end of the question (skip to that if you are not interested in the preamble).
How do I correctly configure a WCF NetTcp Duplex Reliable Session?
There are many questions and answers regarding this topic, and nearly all of them suggest setting inactivityTimeout="Infinite" in your configuration. This doesn't really seem to work correctly, particularly for the case of NetTcp (It may work correctly for WSDualHttp Bindings, but I have never used those).
A number of other issues are often associated with this, including: the channel not faulting after the client or server unexpectedly disconnects, the channel disconnecting after 10 minutes, the channel randomly disconnecting, the channel throwing an exception when trying to open, and being unable to configure Metadata on the same endpoint.
Please note: There are two concepts that are important below. Infrastructure messages are internal to the way WCF communicates, and are used by the framework to keep things running smoothly. Operation messages are messages that occur because your app has done something, like send a message across the wire. Infrastructure messages are largely invisible to your app (but they still occur in the background) while operation messages are the result of an action your app has taken.
Information I have figured out through hard-won trial and error:
Infinite does not appear to be a valid configuration setting in all situations (and certainly, the Visual Studio validation schema doesn't know about it).
There are two special configuration converters, called InfiniteIntConverter and InfiniteTimeSpanConverter, which will sometimes convert the value Infinite to either int.MaxValue or TimeSpan.MaxValue. I haven't yet figured out in which situations this is valid, as sometimes it works and sometimes it doesn't. What's more, it appears that some libraries will allow Infinite in the config while others will not, so you can succeed in one part of a configuration but fail in another.
You must configure BOTH inactivityTimeout and receiveTimeout, on both the client and the server. While these values do not HAVE to be the same, they probably should be, as differing values are likely to cause confusion. (Technically, you can leave inactivityTimeout at its default value if you want, but you should be aware of its value and what it does.)
inactivityTimeout should NEVER be set to a large value, much less Infinite or TimeSpan.MaxValue.
inactivityTimeout has two functions (and this is not widely understood). The first function defines the maximum amount of time that can elapse on a channel without receiving any "infrastructure" or "operation" messages. The second function defines the time period in which infrastructure messages are sent (half the time specified). If no infrastructure or operation messages have been received during the timeout period, the connection is aborted.
receiveTimeout specifies the maximum amount of time that can elapse between operation messages only. This value can be set to a large value, such as TimeSpan.MaxValue (particularly if your channel runs internally over a trusted network or over a vpn). This value is what defines how long the reliable session will "stay alive" if there is no activity between client and server (other than infrastructure messages). ie, your client does not call any methods of the interface, and your server does not call back into the client.
Setting a short inactivityTimeout and a large receiveTimeout keeps your reliable session "tacked up" even when there is no operational activity between your client and server. The short inactivity timeout (I like to keep the default 10 minutes or less) sends infrastructure "ping" messages to keep the TCP connection alive, while the long receive timeout keeps the reliable session active. At the same time, you still get a reasonable timeout in case of disconnection.
If you set inactivityTimeout to a large value, then the reliable session will not be reliable, as it has no way to keep the TCP connection alive, nor does it have any way to verify the integrity of the connection. It won't know that a user has disconnected unexpectedly until you try to send a message to that client and find out the connection is no longer there. This is why many people who use Infinite for this setting resort to creating a "Ping" method in their service, which is completely unnecessary if you've configured these settings correctly.
If you set inactivityTimeout to a value larger than receiveTimeout, then it will likewise be unreliable, as you will still be governed by the receiveTimeout for operation messages. That is, if you forget to set receiveTimeout and leave it at the default 10 minutes, then the connection will be aborted once the user has been idle for 10 minutes.
When the client or server unexpectedly disconnects (app crashes, network failure, someone trips over the power cord, etc.), the other side may not notice right away. I have attached ChannelFaulted event handlers in various test situations, and sometimes the connection is faulted right away, while other times it doesn't seem to fault at all. What I have discovered through trial and error is that when it doesn't seem to fault, it will actually fault after the inactivityTimeout expires on that end (so if it's set to 10 minutes, then after 10 minutes it will call the ChannelFaulted event handler).
I have not yet figured out why in some situations it notices the disconnection right away, and others it waits for the timer to expire. In both cases, I notice internal first chance communication exceptions thrown and handled by the framework, and there are calls to Abort the connection... but somehow the call to the event handler gets lost and it must wait for the timeout. My suspicion is this is somehow thread related.
When trying to configure Metadata to work across the NetTcp channel, I have had sporadic results. Sometimes it works, sometimes it doesn't. I've read many reports that Metadata doesn't work over NetTcp and that you have to use an Http channel for the Metadata, but I have in fact had it work on several occasions using the net.tcp:// url to generate the proxy. Then I would change something, recompile, and it would no longer work. Changing things back, it still wouldn't work. So it was very confusing what magic incantation was necessary to make Metadata function over net.tcp, shared with the endpoint on the same port (obviously with a different address).
When configuring both a NetTcp and a Metadata endpoint on the same service, and specifying non-default settings for connection parameters like listenBacklog and maxConnections, you also need to make sure the Metadata endpoint uses the same settings. This typically means you have to define a custom binding, since these settings are not available from the standard tcp mex binding. This includes setting listenBacklog and maxPendingConnections on tcpTransport, and groupName and maxOutboundConnectionsPerEndpoint on connectionPoolSettings.
The default value of the Ordered setting of ReliableSession is True. Ordering adds a lot more overhead than turning it off. If you don't need ordered messages, I would suggest turning it off (this needs to be set on both sides).
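The interaction between the two timers described in this list can be illustrated with a toy model in Python. This is not WCF code and says nothing about WCF internals: it only encodes the rules as stated above (inactivityTimeout counts any message, receiveTimeout counts operation messages only). The function name, the `until` horizon and the message tuples are assumptions made for the sketch.

```python
from typing import List, Optional, Tuple

Message = Tuple[float, str]  # (time in seconds, "infra" or "op")


def first_abort(
    messages: List[Message],
    inactivity_timeout: float,
    receive_timeout: float,
    until: float,
) -> Optional[Tuple[float, str]]:
    """Return (time, reason) of the first timer to fire before `until`, else None."""
    last_any = 0.0  # last infrastructure or operation message received
    last_op = 0.0   # last operation message received
    events = sorted(m for m in messages if m[0] <= until) + [(until, "end")]
    for t, kind in events:
        inactivity_deadline = last_any + inactivity_timeout
        receive_deadline = last_op + receive_timeout
        deadline = min(inactivity_deadline, receive_deadline)
        if t > deadline:
            if inactivity_deadline <= receive_deadline:
                return deadline, "inactivityTimeout"
            return deadline, "receiveTimeout"
        if kind == "end":
            break
        last_any = t
        if kind == "op":
            last_op = t
    return None


# Keep-alives every 5 minutes (half of a 10 minute inactivityTimeout), no app traffic.
keepalives = [(300.0 * i, "infra") for i in range(1, 13)]
```

With these keep-alives and a 10 minute receiveTimeout, the model aborts at 600 seconds because no operation message arrived; with a very large receiveTimeout, the session survives the whole hour; and if the keep-alives stop, the short inactivityTimeout is what detects it.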
-
Configuration I still need to understand:
How do I correctly configure the shared net.tcp Metadata endpoint? (I will add an example when I get a chance.) Currently, I'm specifying an HTTP GET URL to bypass the problem. It is inconsistent: sometimes it works and sometimes it does not. I kept getting the error 'The URI Prefix is not recognized' when generating the proxy in Visual Studio.
Why does WCF sometimes Fault the channel immediately upon disconnect, and sometimes waits for inactivityTimeout to expire? What controls/causes one vs the other behavior?

WCF Server Push connectivity test. Ping()?

Using techniques as hinted at in:
http://msdn.microsoft.com/en-us/library/system.servicemodel.servicecontractattribute.callbackcontract.aspx
I am implementing a ServerPush setup for my API to get realtime notifications from a server of events (no polling). Basically, the Server has a RegisterMe() and UnregisterMe() method and the client has a callback method called Announcement(string message) that, through the CallbackContract mechanisms in WCF, the server can call. This seems to work well.
Unfortunately, in this setup, if the Server were to crash or is otherwise unavailable, the Client won't know since it is only listening for messages. Silence on the line could mean no Announcements or it could mean that the server is not available.
Since my goal is to reduce polling rather than to achieve immediacy, I don't mind adding a void Ping() method on the Server alongside RegisterMe() and UnregisterMe() that exists merely to test connectivity to the server. Periodically calling this method would, I believe, ensure that we're still connected (and also that no Announcements have been dropped by the transport, since this is TCP).
1) But is the Ping() method necessary, or is this connectivity test already available as part of WCF by default, like serverProxy.IsStillConnected() or something? As I understand it, the channel's State would only return Faulted or Closed AFTER a failed Ping(), but not instead of it.
2) From a broader perspective, is this callback approach solid? This is not for http or ajax: the number of connected clients will be small (tens of clients, max). Are there serious problems with this approach? As this seems to be a mild risk, how can I prevent a slow or malicious client from blocking the server by not processing its callback queue fast enough? Is there a kind of timeout specific to the callback that I can set without affecting other operations?
Your approach sounds reasonable, here are some links that may or may not help (they are not quite exactly related):
Detecting Client Death in WCF Duplex Contracts
http://tomasz.janczuk.org/2009/08/performance-of-http-polling-duplex.html
Having some health check built into your application protocol makes sense.
If you are worried about malicious clients, then add authorization.
The second link I shared above has a sample pub/sub server, you might be able to use this code. A couple things to watch out for -- consider pushing notifications via async calls or on a separate thread. And set the sendTimeout on the tcp binding.
HTH
I wrote a WCF application and encountered a similar problem. My server checked that clients had not had their 'plug pulled' by periodically sending a ping to them. The actual send method (it was asynchronous, this being a server) had a timeout of 30 seconds. The client simply checked that it received the data every 30 seconds, while the server would catch an exception if the timeout was reached.
Authorisation was required to connect to the server (using the built-in WCF feature that forces the connecting party to call a particular method first), so from a malicious client perspective you could easily add code to check and ban their account if they do something suspicious, while disconnecting users who do not authenticate.
As the server I wrote was asynchronous, there wasn't any way to really block it. I guess that addresses your last point, as the asynchronous send method fires off the ping (and any other sending of data) and returns immediately. In the SendEnd method it would catch the timeout exception (sometimes multiple for the client) and disconnect them, without any blocking or freezing of the server.
Hope that helps.
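The pattern described in this answer (ping every client asynchronously with a send timeout, and drop the ones that time out, without blocking the server) can be sketched in Python with asyncio. This is an analogy, not WCF code: `ping_all` and the `clients` mapping of id to ping coroutine are hypothetical names for the illustration.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

PingFn = Callable[[], Awaitable[None]]


async def ping_all(clients: Dict[str, PingFn], timeout: float) -> List[str]:
    """Ping every client concurrently; return the ids that failed or timed out."""

    async def ping_one(ping: PingFn) -> bool:
        try:
            # A slow client only delays its own task, never the others.
            await asyncio.wait_for(ping(), timeout)
            return True
        except (asyncio.TimeoutError, ConnectionError):
            return False

    ids = list(clients)
    results = await asyncio.gather(*(ping_one(clients[i]) for i in ids))
    return [i for i, ok in zip(ids, results) if not ok]
```

The caller would then unregister the returned ids, which mirrors catching the timeout exception in the send completion and disconnecting that client.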
You could use a publisher / subscriber service similar to the one suggested by Juval:
http://msdn.microsoft.com/en-us/magazine/cc163537.aspx
This would allow you to persist the subscribers if losing the server is a typical scenario. The publish method in this example also calls each subscriber on a separate thread, so a few dead subscribers will not block the others...

WCF Design questions

I am designing a WCF service.
I am using netTCP binding.
The Service could be called from multi-threaded clients.
The multi-threaded clients are not sharing the proxy.
1. WCF Service design question.
The client has to send these 2 values in every call: UserID and SourceSystemID. This will help the Service identify the user and the system they belong to.
Instead of passing these 2 values in every call, I decided to have them cached with the Service for the duration of call from the client.
I decided to have a parameterized constructor for the Service and store these values in the ChannelContext as explained in this article.
http://www.danrigsby.com/blog/index.php/2008/09/21/using-icontextchannel-extensions-to-store-custom-data/
Initially I wanted to go with storing the values in the Session and have a method for initialization and termination. But there I found that I need to manually clean up the session in each case. When I am storing values in the channel context, I don’t have to clean it up every time and when the channel closes the values stored are already destroyed.
Can somebody please make sure that I am correct in my assumption?
2. Should I use SessionMode?
For my contract, I tried both with [ServiceContract(SessionMode = SessionMode.Required)] and without this attribute.
Irrespective of my choice, I am always finding a value for : System.ServiceModel.OperationContext.Current.SessionId
How can this be explained?
When I say SessionMode.Required, does my InstanceContextMode automatically change to PerSession?
3. InstanceContextMode to be used?
My service is stateless except that I am storing some values in the Channel Context as mentioned in (1).
Should I use PerCall or PerSession as InstanceContextMode?
netTcp always has a transport-level session going, so that's why you always have a SessionId. So basically, no matter what you choose, with netTcp you've got a session-ful connection right from the transport level on up.
As for InstanceContextMode - as long as you don't need anything else from a session except the SessionId - no reliable messaging etc. - then I'd typically pick Per-Call - it's more scalable, it typically performs better, it gives you less "glue" to worry about and less bits and pieces that you need to manage.
I would use an explicitly required session only if you need to turn on reliable messaging or something else that absolutely requires a WCF session. If you don't - then it's just unnecessary overhead, in my opinion.
Setting SessionMode to SessionMode.Required will enforce using bindings which support sessions, like NetTcpBinding, WSHttpBinding, etc. In fact, if you try using a non-session-enabled binding, the runtime will throw an exception when you try to open the host.
Setting InstanceContextMode to PerSession means that only one instance of the service will be created per session, and that instance will serve all the requests coming from that session.
Having SessionId set by the runtime means that you might have a transport session, a reliable session or a security session. Having those does not necessarily mean you have an application session, that is, a single service object serving the requests per proxy. In other words, you might switch off the application session by setting InstanceContextMode=PerCall, forcing the creation of a new service object for every call, while maintaining a transport session due to using netTcpBinding, or a reliable or security session.
Think of the application session that is configured by InstanceContextMode and SessionMode as a higher-level session, relying on a lower-level session (security, transport or reliable). An application session cannot actually be established without having one of the other sessions in place, hence the requirement on the binding.
It is getting a bit long already, but for simple values I would recommend that you pass those values every time instead of creating an application session. That will ensure the service objects have a short lifetime and no unnecessary resources are kept alive on the server. It makes a lot of sense when more clients or proxies are talking to your service. And you could always cache the values in the clients, or even pass them as custom headers if you want.