FYI: This will be my first real foray into Async/Await; for too long I've been settling for the familiar territory of BackgroundWorker. It's time to move on.
I wish to build a WCF service, self-hosted in a Windows service running on a remote machine in the same LAN, that does this:
Accepts a request for a single .ZIP archive
Creates the archive and packages several files
Returns the archive as its response to the request
I have to support archives as large as 10GB. Needless to say, this scenario isn't covered by basic WCF designs; we must take additional steps to meet the requirement. We must eliminate timeouts while the archive is building and memory errors while it's being sent. Both of these occur under basic WCF designs, depending on the size of the file returned.
My plan is to proceed using task-based asynchronous WCF calls and streaming mode.
I have two concerns:
Is this the proper approach to the problem?
Microsoft has done a nice job at abstracting all of this, but what of the underlying protocols? What goes on 'under the hood?' Does the server keep the connection alive while the archive is building (could be several minutes) or instead does it close the connection and initiate a new one once the operation is complete, thereby requiring me to properly route the request through the client machine firewall?
For #2, clearly I'm hoping for the former (keep-alive). But after some searching I'm not easily finding an answer. Perhaps you know.
You need streaming for big payloads. That is the right approach. This has nothing at all to do with asynchronous IO. The two are independent. The client cannot even tell that the server is async internally.
I'll add my standard answers for whether to use async IO or not:
https://stackoverflow.com/a/25087273/122718 Why does the EF 6 tutorial use asychronous calls?
https://stackoverflow.com/a/12796711/122718 Should we switch to use async I/O by default?
Each request runs over a single connection that is kept alive. This goes for both streaming big amounts of data as well as big initial delays. Not sure why you are concerned about routing. Does your router kill such connections? That's a problem.
Regarding keep alive, there is nothing going over the wire to do that. TCP sessions can stay open indefinitely without any kind of wire traffic.
Related
I read there are web servers their behaviors are called blocking whereas Node.js's is said non-blocking.
Would a blocking web server get hung up to the sense it needs restarting, if many http clients send requests at most in parallel?
As a complement, I don't say that it needs restarting while it potentially works fine again after the flood of parallel requests have stopped.
And I currently don't understand how request buffers and overflows work for web servers.
Although technically it could be possible to make a single-thread, single-process blocking server that can only handle 1 request at a time, it doesn't really practically make sense. Concurrency is kind of important.
The three main paradigms for parallelism (that I know of) are:
Multi-process/forking
Threading
Using an event loop/reactor pattern
Node falls in the third category, and also a bit in the second category depending on how you look at it.
Most languages can look at a socket and read from it, and immediately move on if there was nothing to read. Therefore most languages can have this non-blocking behavior.
I have a situation where I have two programs (one exe and one dll loaded into the process space of another third-party exe) communicating requests with each other using a local machine wcf service (using net named pipe binding). There's a third host exe that starts hosting the service. It all works great (so far anyways... I'm still learning), but I got to thinking about what would happen if the channel faults or the service times out. What would be the best practice for checking and handling faults as well as keep the channel alive?
In my case it will be up to the user to keep the applications open or close them and we do have those users who tend to keep them open overnight, over the weekend, etc... It seems to me this could open the possibility of a fault or loss of service and I don't have a clue how to recover. Any help would be greatly appreciated.
Firstly, why would you keep the channel alive indefinitely?
Imagine you are connecting to a database from which you want to read over the course of one day. Would you create the database connection in the morning and then close it in the evening?
It is relatively cheap to construct a channel in WCF for each call, unless you know you are going to be making multiple calls within a few seconds of each other, in which case you should reuse the channel.
EDIT
This post explains how to do it. It's pretty complicated and it may be easier to just set a huge timeout value for the binding in code (as suggested at the end of the post):
Do WCF Callbacks TimeOut
EDIT
There's tons of stuff on google about this: http://bit.ly/10ZPWE2
I have a web tier that forwards calls onto an application tier. The web tier uses a shared, cached channel to do so. The application tier services in question are stateless and have concurrency enabled.
But they are not being called concurrently.
If I alter the web tier to create a new channel on every call, then I do get concurrent calls onto the application tier. But I want to avoid that cost since it is functionally unnecessary for my scenario. I have no session state, and nor do I need to re-authenticate the caller each time. I understand that the creation of the channel factory is far more expensive than than the creation of the channels, but it is still a cost I'd like to avoid if possible.
I found this article on MSDN that states:
While channels and clients created by
the channels are thread-safe, they
might not support writing more than
one message to the wire concurrently.
If you are sending large messages,
particularly if streaming, the send
operation might block waiting for
another send to complete.
Firstly, I'm not sending large messages (just a lot of small ones since I'm doing load testing) but am still seeing the blocking behavior. Secondly, this is rather open-ended and unhelpful documentation. It says they "might not" support writing more than one message but doesn't explain the scenarios under which they would support concurrent messages.
Can anyone shed some light on this?
Addendum: I am also considering creating a pool of channels that the web server uses to fulfill requests. But again, I see no reason why my existing approach should block and I'd rather avoid the complexity if possible.
After much ado, this all came down to the fact that I wasn't calling Open explicitly on the channel before using it. Apparently an implicit Open can preclude concurrency in some scenarios.
You can cache the WCF proxy, but still create a channel for each service call - this will ensure concurrency, is not very expensive in comparison to creating a channel from scratch, and re-authentication for each call will not be necessary. This is explained on Wenlong Dong's blog - "Performance Improvement for WCF Client Proxy Creation in .NET 3.5 and Best Practices" (a much better source of WCF information and guidance than MSDN).
Just for completeness: Here is a blog entry explaining the observed behavior of request serialization when not opening the channel explicitly:
http://blogs.msdn.com/b/wenlong/archive/2007/10/26/best-practice-always-open-wcf-client-proxy-explicitly-when-it-is-shared.aspx
I'm currently playing around a little with WCF, during this I stepped on a question where I'm not sure if I'm on the right track.
Let's assume a simple setup that looks like this: client -> service1 -> service2.
The communication is tcp-based.
So where I'm not sure is, if it makes sense that the service1 caches the client proxy for service2. So I might get a multi-threaded access to that proxy, and I have to deal with it.
I'd like to take advantage of the tcp session to get better performance, but I'm not sure if this "architecture" is supported by WCF/network/whatever at all. The problem I see is that all the communication goes over the same channel, if I'm not using locks or another sync.
I guess the better idea is to cache the proxy in a threadstatic variable.
But before I do that, I wanted to confirm that it's really not a good idea to have only one proxy instance.
tia
Martin
If you don't know that you have a performance problem, then why worry about caching? You're opening yourself to the risk of improperly implementing multithreading code, and without any clear, measurable benefit.
Have you measured performance yet, or profiled the application to see where it's spending its time? If not, then when you do, you may well find that the overhead of multiple TCP sessions is not where your performance problems lie. You may wish you had the time to optimize some other part of your application, but you will have spent that time optimizing something that didn't need to be optimized.
I am already using such a structure. I have one service that collaborates with some other services and realise the implementation. Of course, in my case the client calls some one-way method of the first service. I am getting very good benifit. Of course, I also have configured it to limit the number of concurrent calls in some of the cases.
Yes, that architecture is supported by WCF. I deal with applications every day that use similar structures, using NetTCPBinding.
The biggest thing to worry about is the ConcurrencyMode of the various services involved, and making sure that they do not block unnecessarily. It is very easy to get into a scenario where you will be guaranteed timeouts, or at the least have poor performance due to multiple, synchronous calls across service boundaries. Even OneWay calls are not guaranteed to immediately return.
careful with threadstatic, .net changes the thread so the variable can get null.
For session...perhaps you could use session enabled calls:
http://msdn.microsoft.com/en-us/library/ms733040.aspx
But i would not recomend using if you do not have any performance issue. I would use the normal way, or if service 1 is just for forwarding you could use that functionality easily with 4.0:
http://www.sdn.nl/SDN/Artikelen/tabid/58/view/View/ArticleID/2979/Whats-New-in-WCF-40.aspx
Regards
Firstly, make sure you know about the behaviour of ThreadStatic in ASP.NET applications:
http://piers7.blogspot.com/2005/11/threadstatic-callcontext-and_02.html
The same thread that started your request may not be the same thread that finishes it. Basically the only safe way of storing Thread local storage in ASP.NET applications is inside HttpContext. The next obvious approach would be to creat a wrapper client to manage your WCF client proxy and ensure each IO request is thread safe using locks.
Although my personal preference would be to use a pool of proxy clients. Whenever you need one pop it off the pool queue and when you're finished with it put it back on.
We can use polling to find out about updates from some source, for example, clients connected to a webserver. WCF provides a nifty feature in the way of Duplex contracts, in which, I can maintain a connection to a client, and make invocations on that connection at will.
Some peeps in the office were discussing the merits of both solutions, and I wanted to get feedback on when each strategy is best used.
I would use an event-based mechanism instead of polling. In WCF, you can do this easily by following the Publish-Subscribe framework that Juval Lowy provides at his website, IDesign.net.
Depends partly on how many users you have.
Say you have 1,000,000 users you will have problems maintaining that many sessions.
But if your system can respond to 1000 poll requests a second then each client can poll every 1000 seconds.
I think Shiraz nailed this one, but I wanted to say two more things.
I've had trouble with Duplex
contracts. You have to have all of
your ducks in a row with regards to
the callback channel... you have to
check it to make sure it's open,
etc. The IDesign.net stuff would be
a minimum amount of plumbing code
you'll have to include.
If it makes sense for your solution
(this is only appropriate in certain
situations), the MSMQ binding allows
a client to send data to a service
in an async manner (like Duplex),
but the service isn't "polling" for
messages... it gets notified when
one enters the queue through some
under-the-covers plumbing.
This sort of forces you to turn the
communication around (client becomes
server, server becomes client), but
if the majority of the communication
is one-way, this would provide a lot
of benefits. The other advantage
here is obviously the queued
communication - the server can be
down and not miss any messages...
it'll pick 'em up when it comes back
online.
Something to think about.