Would a blocking web server get hung up to the sense it needs restarting, if many http clients send requests at most in parallel? - blocking

I read there are web servers their behaviors are called blocking whereas Node.js's is said non-blocking.
Would a blocking web server get hung up to the sense it needs restarting, if many http clients send requests at most in parallel?
As a complement, I don't say that it needs restarting while it potentially works fine again after the flood of parallel requests have stopped.
And I currently don't understand how request buffers and overflows work for web servers.

Although technically it could be possible to make a single-thread, single-process blocking server that can only handle 1 request at a time, it doesn't really practically make sense. Concurrency is kind of important.
The three main paradigms for parallelism (that I know of) are:
Multi-process/forking
Threading
Using an event loop/reactor pattern
Node falls in the third category, and also a bit in the second category depending on how you look at it.
Most languages can look at a socket and read from it, and immediately move on if there was nothing to read. Therefore most languages can have this non-blocking behavior.

Related

Async WCF and Protocol Behaviors

FYI: This will be my first real foray into Async/Await; for too long I've been settling for the familiar territory of BackgroundWorker. It's time to move on.
I wish to build a WCF service, self-hosted in a Windows service running on a remote machine in the same LAN, that does this:
Accepts a request for a single .ZIP archive
Creates the archive and packages several files
Returns the archive as its response to the request
I have to support archives as large as 10GB. Needless to say, this scenario isn't covered by basic WCF designs; we must take additional steps to meet the requirement. We must eliminate timeouts while the archive is building and memory errors while it's being sent. Both of these occur under basic WCF designs, depending on the size of the file returned.
My plan is to proceed using task-based asynchronous WCF calls and streaming mode.
I have two concerns:
Is this the proper approach to the problem?
Microsoft has done a nice job at abstracting all of this, but what of the underlying protocols? What goes on 'under the hood?' Does the server keep the connection alive while the archive is building (could be several minutes) or instead does it close the connection and initiate a new one once the operation is complete, thereby requiring me to properly route the request through the client machine firewall?
For #2, clearly I'm hoping for the former (keep-alive). But after some searching I'm not easily finding an answer. Perhaps you know.
You need streaming for big payloads. That is the right approach. This has nothing at all to do with asynchronous IO. The two are independent. The client cannot even tell that the server is async internally.
I'll add my standard answers for whether to use async IO or not:
https://stackoverflow.com/a/25087273/122718 Why does the EF 6 tutorial use asychronous calls?
https://stackoverflow.com/a/12796711/122718 Should we switch to use async I/O by default?
Each request runs over a single connection that is kept alive. This goes for both streaming big amounts of data as well as big initial delays. Not sure why you are concerned about routing. Does your router kill such connections? That's a problem.
Regarding keep alive, there is nothing going over the wire to do that. TCP sessions can stay open indefinitely without any kind of wire traffic.

Are WCF callbacks an anti-pattern?

As far as I understand it one of the things servers are very good at is managing large numbers of incoming connections, allocating those connections resources as they become available and then disposing of those resources once they have serviced the incoming requests so that they can service more requests.
By allowing a client to register a callback with a server, the client essentially takes an element of control away from that server because it now allocates itself a proportion of that server's resources for some unspecified amount of time that is, to a certain extent, beyond the control of the server (of course the server can kick the client off at any time to claim those resources back, but that all starts to get kind of complicated).
The more I think about it the more I feel like WCF callbacks are basically an anti-pattern and that polling should be preferred except in exceptional cases.
Is that right and in which case what are those exceptional cases?

Long polling blocking multiple windows?

Long polling has solved 99% of my problems. There is now just one other problem. Imagine a penny auction site, where people bid. On the frontpage, there are several Auctions.
If the user opens three of these auctions, and because javascript is not multithreaded, how would you get the other pages to ever load? Won't they always get bogged down and not load because they are waiting for long polling to end? In practice, I've experienced this and I can't think of a way around it. Any ideas?
There are two ways that javascript gets around some of this.
While javascript is single threaded conceptually, it does its io in separate threads using completion handlers. This means other pieces of javascript can be running while you are waiting for your network request to complete.
Javascript for each page (or even each frame in each page) is isolated from Javascript on the other pages/frames. This means that each copy of javascript can be running in its own thread.
A bigger issue for you is likely to be that browsers often limit the number of concurrent connections to a given site, and it sounds like you want to make many concurrent connections to the same site. In this case you will get a lock up.
If you control both the sever and client, you will need to combined the multiple long-poll request from the client into a single long-poll request to the server.

ZMQ device queue does not load balance properly

I know that ZMQ offers all of the flexibility to do your own load-balancing. However I would expect the out-of-the-box broker, about 4 lines of code using the line
zmq_device (ZMQ_QUEUE, frontend, backend);
to load balance quite well as the documentation says it does load balance.
ZMQ_QUEUE creates a shared queue that collects requests from a set of clients, and distributes these fairly among a set of services. Requests are fair-queued from frontend connections and load-balanced between backend connections. Replies automatically return to the client that made the original request.
I have an army of back-end services and yet find that often my front-end clients have to wait several seconds for something that takes < 1/10 of a second in a 1:1 setting (there are same # of client and service machines). I suspect that ZMQ is not load-balancing properly out of the box - it's sending too many requests to the same service even though it doesn't have bandwidth, etc.
I think this is partly because the services are multithreaded in a way that lets them take up to 10 concurrent requests yet it slows down greatly at near the 10th request even though it can still accept them. Random distribution would be ideal. Is there an out-of-the-box way to do this or can it be done in a few lines of code, or do I have to write my own broker from scratch?
Fwiw issue was the workers were taking on work when they didn't have room for it, issue was not in ZMQ layer per se.

Concurrent access to WCF client proxy

I'm currently playing around a little with WCF, during this I stepped on a question where I'm not sure if I'm on the right track.
Let's assume a simple setup that looks like this: client -> service1 -> service2.
The communication is tcp-based.
So where I'm not sure is, if it makes sense that the service1 caches the client proxy for service2. So I might get a multi-threaded access to that proxy, and I have to deal with it.
I'd like to take advantage of the tcp session to get better performance, but I'm not sure if this "architecture" is supported by WCF/network/whatever at all. The problem I see is that all the communication goes over the same channel, if I'm not using locks or another sync.
I guess the better idea is to cache the proxy in a threadstatic variable.
But before I do that, I wanted to confirm that it's really not a good idea to have only one proxy instance.
tia
Martin
If you don't know that you have a performance problem, then why worry about caching? You're opening yourself to the risk of improperly implementing multithreading code, and without any clear, measurable benefit.
Have you measured performance yet, or profiled the application to see where it's spending its time? If not, then when you do, you may well find that the overhead of multiple TCP sessions is not where your performance problems lie. You may wish you had the time to optimize some other part of your application, but you will have spent that time optimizing something that didn't need to be optimized.
I am already using such a structure. I have one service that collaborates with some other services and realise the implementation. Of course, in my case the client calls some one-way method of the first service. I am getting very good benifit. Of course, I also have configured it to limit the number of concurrent calls in some of the cases.
Yes, that architecture is supported by WCF. I deal with applications every day that use similar structures, using NetTCPBinding.
The biggest thing to worry about is the ConcurrencyMode of the various services involved, and making sure that they do not block unnecessarily. It is very easy to get into a scenario where you will be guaranteed timeouts, or at the least have poor performance due to multiple, synchronous calls across service boundaries. Even OneWay calls are not guaranteed to immediately return.
careful with threadstatic, .net changes the thread so the variable can get null.
For session...perhaps you could use session enabled calls:
http://msdn.microsoft.com/en-us/library/ms733040.aspx
But i would not recomend using if you do not have any performance issue. I would use the normal way, or if service 1 is just for forwarding you could use that functionality easily with 4.0:
http://www.sdn.nl/SDN/Artikelen/tabid/58/view/View/ArticleID/2979/Whats-New-in-WCF-40.aspx
Regards
Firstly, make sure you know about the behaviour of ThreadStatic in ASP.NET applications:
http://piers7.blogspot.com/2005/11/threadstatic-callcontext-and_02.html
The same thread that started your request may not be the same thread that finishes it. Basically the only safe way of storing Thread local storage in ASP.NET applications is inside HttpContext. The next obvious approach would be to creat a wrapper client to manage your WCF client proxy and ensure each IO request is thread safe using locks.
Although my personal preference would be to use a pool of proxy clients. Whenever you need one pop it off the pool queue and when you're finished with it put it back on.