What are some good "load balancing issues" to know?

Hey there guys, I am a recent grad, and looking at a couple of jobs I am applying for, I see that I need to know things like runtime complexity (straightforward enough), caching (memcached!), and load balancing issues (no idea on this!!).
So, what kind of load balancing issues and solutions should I try to learn about, or at least be vaguely familiar with, for .NET or Java jobs?
Googling around gives me things like network load balancing, but wouldn't that usually not be administered by a software developer?

One thing I can think of is session management. By default, whenever you get a session ID, that session ID points to some in-memory data on the server. However, when you use load balancing, there are multiple servers. What happens when data is stored in the session on machine 1, but for the next request the user is directed to machine 2? The session data would appear to be lost, because machine 2 never saw it.
So you'll have to make sure that either the user gets back to the same machine for every subsequent request ('sticky connection'), or you do not use in-proc session state but out-of-proc session state, where session data is stored in, for example, a database.
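To make the two options concrete, here is a minimal sketch in Python rather than .NET or Java. The server names, the class names and the dictionary standing in for a database are all assumptions made for illustration, not any framework's API.

```python
import hashlib

SERVERS = ["machine1", "machine2", "machine3"]  # hypothetical server names


def pick_server(session_id, servers=SERVERS):
    """Sticky routing: the same session ID always maps to the same server."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


class InProcServer:
    """Keeps session data in its own memory, like in-proc session state."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}

    def handle(self, session_id, key, value=None):
        data = self.sessions.setdefault(session_id, {})
        if value is not None:
            data[key] = value
        return data.get(key)


class SharedStoreServer:
    """Keeps session data in a store shared by all servers (out-of-proc)."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # a dict standing in for a database or state server

    def handle(self, session_id, key, value=None):
        data = self.store.setdefault(session_id, {})
        if value is not None:
            data[key] = value
        return data.get(key)


if __name__ == "__main__":
    a, b = InProcServer("machine1"), InProcServer("machine2")
    a.handle("abc", "cart", "book")
    print(b.handle("abc", "cart"))  # None: machine2 never saw the data

    store = {}
    c, d = SharedStoreServer("machine1", store), SharedStoreServer("machine2", store)
    c.handle("abc", "cart", "book")
    print(d.handle("abc", "cart"))  # book: any server can read the shared store
```

With in-proc state you would route every request through something like pick_server so the user keeps landing on the same machine; with the shared store, any machine can serve any request.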

There is a concept of load distribution, where requests are sprayed across a number of servers (usually with session affinity). Here there is no feedback on how busy any particular server may be; we just rely on statistical sharing of the load. You could view the WebSphere HTTP plugin in WAS ND as doing this. It actually works pretty well, even for substantial web sites.
Load balancing tries to be cleverer than that: some feedback on the relative load of the servers determines where new requests go (even then, session affinity tends to be treated as a higher priority than balancing load). The WebSphere On Demand Router that was originally delivered in XD does this. If you read this article you will see the kind of algorithms used.
You can also achieve balancing with network spraying devices. They can consult "agents" running on the servers, which give feedback to the sprayer as a basis for deciding where requests should go. Hence even this hardware-based approach can have a software element. See Dynamic Feedback Protocol.
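The difference between spraying and feedback-based balancing can be sketched roughly as follows. This is an illustration in Python with made-up class names, not how the WebSphere plugin, the On Demand Router or Dynamic Feedback Protocol are actually implemented; the load report is assumed to be a plain mapping from server name to current load.

```python
import itertools


class RoundRobinSprayer:
    """Load distribution: rotate through servers, ignoring how busy they are."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self, load_report=None):
        return next(self._cycle)


class FeedbackBalancer:
    """Load balancing: use reported load to choose the least busy server."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._affinity = {}  # session ID -> server already chosen for it

    def pick(self, load_report):
        # load_report maps server name -> current load, as an agent might report it
        return min(self._servers, key=lambda s: load_report[s])

    def route(self, session_id, load_report):
        # Session affinity takes priority over balancing the load.
        if session_id not in self._affinity:
            self._affinity[session_id] = self.pick(load_report)
        return self._affinity[session_id]
```

A new session goes to whichever server currently reports the lowest load, and later requests in that session stay on that server even if the load picture changes.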

network combinatorics, max-flow min-cut theorems and their use

Related

Async WCF and Protocol Behaviors

FYI: This will be my first real foray into Async/Await; for too long I've been settling for the familiar territory of BackgroundWorker. It's time to move on.
I wish to build a WCF service, self-hosted in a Windows service running on a remote machine in the same LAN, that does this:
1) Accepts a request for a single .ZIP archive
2) Creates the archive and packages several files
3) Returns the archive as its response to the request
I have to support archives as large as 10GB. Needless to say, this scenario isn't covered by basic WCF designs; we must take additional steps to meet the requirement. We must eliminate timeouts while the archive is building and memory errors while it's being sent. Both of these occur under basic WCF designs, depending on the size of the file returned.
My plan is to proceed using task-based asynchronous WCF calls and streaming mode.
I have two concerns:
1) Is this the proper approach to the problem?
2) Microsoft has done a nice job of abstracting all of this, but what of the underlying protocols? What goes on 'under the hood'? Does the server keep the connection alive while the archive is building (which could take several minutes), or does it instead close the connection and initiate a new one once the operation is complete, thereby requiring me to properly route the request through the client machine's firewall?
For #2, clearly I'm hoping for the former (keep-alive). But after some searching I'm not easily finding an answer. Perhaps you know.
You need streaming for big payloads. That is the right approach. This has nothing at all to do with asynchronous IO. The two are independent. The client cannot even tell that the server is async internally.
I'll add my standard answers for whether to use async IO or not:
https://stackoverflow.com/a/25087273/122718 Why does the EF 6 tutorial use asychronous calls?
https://stackoverflow.com/a/12796711/122718 Should we switch to use async I/O by default?
Each request runs over a single connection that is kept alive. This goes both for streaming large amounts of data and for long initial delays. Not sure why you are concerned about routing. Does your router kill such connections? That would be a problem.
Regarding keep alive, there is nothing going over the wire to do that. TCP sessions can stay open indefinitely without any kind of wire traffic.
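To illustrate why streaming avoids the memory errors, here is a language-neutral sketch of the idea in Python. It is not WCF code; the function names and the 64 KB chunk size are assumptions chosen for the example.

```python
import io


def stream_chunks(source, chunk_size=64 * 1024):
    """Yield the payload piece by piece instead of loading it all at once."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk


def copy_stream(source, sink, chunk_size=64 * 1024):
    """Copy source to sink; only about one chunk is held in memory at a time."""
    total = 0
    for chunk in stream_chunks(source, chunk_size):
        sink.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    payload = io.BytesIO(b"x" * 200_000)  # stands in for a large archive
    received = io.BytesIO()
    print(copy_stream(payload, received))  # 200000
```

Whether the producer of the stream is written synchronously or with async I/O does not change what the receiving side sees, which is the point made above about the two being independent.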

ZMQ device queue does not load balance properly

I know that ZMQ offers all of the flexibility to do your own load balancing. However, I would expect the out-of-the-box broker, which is about 4 lines of code built around the line
zmq_device (ZMQ_QUEUE, frontend, backend);
to load balance quite well, since the documentation says it does load balance:
ZMQ_QUEUE creates a shared queue that collects requests from a set of clients, and distributes these fairly among a set of services. Requests are fair-queued from frontend connections and load-balanced between backend connections. Replies automatically return to the client that made the original request.
I have an army of back-end services and yet find that my front-end clients often have to wait several seconds for something that takes less than 1/10 of a second in a 1:1 setting (there are the same number of client and service machines). I suspect that ZMQ is not load balancing properly out of the box: it's sending too many requests to the same service even though that service doesn't have the bandwidth, etc.
I think this is partly because the services are multithreaded in a way that lets them take up to 10 concurrent requests, yet each one slows down greatly near the 10th request even though it can still accept them. Random distribution would be ideal. Is there an out-of-the-box way to do this, or can it be done in a few lines of code, or do I have to write my own broker from scratch?
FWIW, the issue was that the workers were taking on work when they didn't have room for it; the issue was not in the ZMQ layer per se.
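The fix described above, workers only taking work when they have room, can be sketched like this. It is plain Python standing in for the idea, not ZMQ code; the Worker and Broker classes and the capacity numbers are hypothetical.

```python
from collections import deque


class Worker:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity  # how many requests it handles well at once
        self.in_flight = 0

    def has_room(self):
        return self.in_flight < self.capacity

    def start(self, request):
        self.in_flight += 1

    def finish(self):
        self.in_flight -= 1


class Broker:
    """Hands a request only to a worker that reports free capacity."""

    def __init__(self, workers):
        self.workers = workers
        self.pending = deque()

    def submit(self, request):
        self.pending.append(request)
        return self.dispatch()

    def dispatch(self):
        assigned = []
        while self.pending:
            ready = [w for w in self.workers if w.has_room()]
            if not ready:
                break  # keep requests queued rather than overloading a worker
            worker = min(ready, key=lambda w: w.in_flight)
            worker.start(self.pending.popleft())
            assigned.append(worker.name)
        return assigned
```

Setting capacity below the point where a service starts to slow down (for example, well under the 10 concurrent requests mentioned in the question) keeps extra requests waiting in the queue instead of piling onto a busy worker.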

WCF Session Instancing Mode Hosting Issue

I am facing a situation regarding hosting WCF in Session instancing mode. I am encapsulating the actual situation and proposing an example to replicate it, as below.
The service to be hosted is "MyService". I am using a Windows service to host it, with an HTTP endpoint.
It will need to support 500 concurrent sessions. (Singleton and PerCall cannot be used because the contract is workflow based: Login, Function1, Function2, Logout.)
I have 4 servers, each with the hardware capability to support 200 concurrent sessions.
So I configured the service on one server as a router (ServiceHost S = new ServiceHost(RouterService)) with a hosting path such as "http://myserver/MyService". I have set up a simple load balancing mechanism and applied the router table to redirect incoming requests to the other three servers where the actual service copies are hosted ("http://myserver/MyService1", "http://myserver/MyService2", "http://myserver/MyService3").
It is still not working. As soon as hits go above 200, communication errors start. I suppose this is because when 500 concurrent calls are made, the router (capability 200) is also required to stay connected to the client along with the actual service server (in Session call mode). Is my thinking correct?
My questions are:
1) Is my approach correct, or flawed in concept? Should I ask the hardware team to set up NLB?
2) Should we redesign the contract specifically to ensure that the requests can somehow be made PerCall?
3) Someone suggested that such systems should be hosted in the cloud (Windows Azure). I will need to look at the costs involved, but is that correct?
4) What are the best practices involved in hosting WCF to handle session-based calls?
I understand that my question is complex and there would not be one "correct" answer, but any help and insight will be really appreciated.
Thanks
Your "Should I ask the hardware team to set up NLB" and the "Sticky IP cluster" suggested by Shiraz are the closest one can get to hosting the given scenario.
The thing is that WCF sessions are transport based, hence we cannot store these "sessions" on a state server or database as in traditional ASP.NET.
WCF 4.0 has come up with new bindings such as NetTcpContextBinding, BasicHttpContextBinding and WSHttpContextBinding, which could help with context re-creation in a cross-machine environment. But I have no production implementation knowledge from which to provide an example.
This article should help you to know more...
There are three separate but connected issues here:
1) Your design requires that you maintain state between calls.
2) You are dependent upon getting to the same server each time (since you store state in memory).
3) You have a limit of 200 connections per server.
A solution where you are dependent on coming back to the same server will not work (well) on Windows Azure.
You could implement a Sticky IP cluster. That would solve most of your problems, but it would not guarantee that no more than 200 connections are on one server. For the most part this would be OK.
You could store the state in AppFabric Cache; then it would not matter which server you returned to.
You could redesign your system so that all state is stored in the database.
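The numbers in the question can be checked with a little arithmetic. The sketch below is in Python with hypothetical function names, and it assumes, as the question suspects, that a router in session mode has to hold one connection per client session.

```python
def router_topology(sessions, servers, per_server_limit):
    """One server acts as router; the remaining servers host the service.

    Assumption: the router holds one connection per client session.
    """
    backends = servers - 1
    return {
        "router_connections": sessions,
        "router_ok": sessions <= per_server_limit,
        "backend_capacity": backends * per_server_limit,
        "backends_ok": sessions <= backends * per_server_limit,
    }


def nlb_topology(sessions, servers, per_server_limit):
    """An external balancer spreads sessions; every server hosts the service."""
    capacity = servers * per_server_limit
    # Ceiling division: sessions per server if the spread is perfectly even.
    even_share = -(-sessions // servers)
    return {
        "total_capacity": capacity,
        "capacity_ok": sessions <= capacity,
        "even_share": even_share,
        "even_share_ok": even_share <= per_server_limit,
    }


if __name__ == "__main__":
    print(router_topology(500, 4, 200))
    print(nlb_topology(500, 4, 200))
```

Under that assumption the three back-end servers have room (600 against 500 sessions), but the router itself would need 500 connections against a limit of 200. With NLB across all four servers an even spread would be 125 sessions each, although sticky IP does not guarantee an even spread.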

How does a load-balanced server work?

Thanks for taking time to read my questions.
I have some basic doubts about load-balanced servers.
I assume that one application is hosted on two servers, and when one server is heavily loaded the load balancer hands responsibility for a particular request to the other server.
This is how I assume the load balancer works.
1) What manages and monitors the load and does all the transferring of requests?
2) How are static variables handled during processing? For example, I have a variable called 'totalNumberOfClick', which is incremented whenever we hit the page.
3) If a GET request is handled by one server, its POST should also be handled by that server, right? For example, a user requests a page for editing; the ASP.NET runtime creates a set of view state (which holds control IDs and their values) that is maintained on the server and the client side. When we hit the post button, the server validates the view state and then carries on with the rest of the processing.
If the POST gets transferred to another server, how does the runtime allow it to be processed?
If you are using the load balancing built into Windows, then there are several options for how the load is distributed. The servers keep in communication with each other and organise the load between themselves.
The most scalable option is to evenly balance the requests across all of the servers. This means that each request could end up being processed by a different server so a common practice is to use "sticky sessions". These are tied to the user's IP address, and make sure that all requests from the same user go to the same server.
There is no way to share static variables across multiple servers, so you will need to store the value in a database or on another server.
If you use an out-of-process host for session state (such as StateServer or SQL Server), then you can process any request on any server. View state allows the server to recreate most of the data that generated the page.
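The point about static variables can be seen in a small simulation. This is Python rather than ASP.NET; the WebServer class is hypothetical and a dictionary stands in for the database.

```python
class WebServer:
    """Each instance stands in for one machine with its own static variable."""

    def __init__(self, shared_store=None):
        self.local_clicks = 0             # like a static field: one per process
        self.shared_store = shared_store  # a dict standing in for a database

    def hit_local(self):
        self.local_clicks += 1
        return self.local_clicks

    def hit_shared(self):
        # A real database would need an atomic increment here.
        current = self.shared_store.get("totalNumberOfClick", 0)
        self.shared_store["totalNumberOfClick"] = current + 1
        return self.shared_store["totalNumberOfClick"]


if __name__ == "__main__":
    store = {}
    s1, s2 = WebServer(store), WebServer(store)
    for server in (s1, s2, s1):  # three page hits spread over two servers
        server.hit_local()
        server.hit_shared()
    print(s1.local_clicks, s2.local_clicks)  # 2 1
    print(store["totalNumberOfClick"])       # 3
```

Each server only counts the hits it handled itself, while the shared store sees all of them.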
I have some answers for you.
When it comes to web applications, load balancers need to provide what is called session stickiness. That means that once a server is elected to serve a client's request, all subsequent requests will be directed to the same node as long as the session is active. Of course, this is not necessary if your web application does not rely on any state that has to be preserved (i.e. it is stateless or sessionless).
I think this can answer your third and maybe even your second question.
Your first question is about how load balancers work internally. Since I am not an expert in that, I can only guess that the load balancer each client talks to measures ping response times to derive an estimate of the load on each server. Maybe more sophisticated techniques could be used.

Using BOSH/similar technique for existing application/system

We have an existing system which connects to the back end via HTTP (Apache/SSL) and polls the server for new messages. Needless to say, we have scalability issues.
I'm researching how to remove this polling and have come across BOSH/XMPP, but I'm not sure how we should adopt the BOSH technique (using long-lived HTTP connections).
I've seen there are a few libraries available, but the entire thing seems bloated since we do not need buddy lists etc. and simply want to notify the clients of available messages.
The client is written in C/C++ and works across most OSes, so that is an important factor. The server is in Java.
Does BOSH result in a huge number of httpd processes? Since it has to keep all the clients connected, what would be the limit on that? We are also planning to move to a 64-bit JVM/Apache; what would be the maximum number of clients in that case?
Any hints?
I would note that BOSH is separate from XMPP, so there's no "buddy lists" involved. XMPP-over-BOSH is what you're thinking of there.
Take a look at collecta.com and associated blog posts (probably by Jack Moffitt) about how they use BOSH (and also XMPP) to deliver real-time information to large numbers of users.
As for the scaling issues with Apache, I don't know — presumably each connection is using few resources, so you can increase the number of connections per Apache process. But you could also check out some of the connection manager technologies (like punjab) mentioned on the BOSH page above.
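For what it's worth, the long-lived request idea that BOSH builds on can be sketched in a few lines. This is not the BOSH protocol itself, just the "hold the request until there is something to say" pattern, written in Python with a hypothetical Mailbox class in place of a real HTTP server.

```python
import queue
import threading


class Mailbox:
    """Per-client message queue on the server side."""

    def __init__(self):
        self._messages = queue.Queue()

    def publish(self, message):
        self._messages.put(message)

    def long_poll(self, timeout):
        """Hold the request open until a message arrives or the timeout passes."""
        try:
            return self._messages.get(timeout=timeout)
        except queue.Empty:
            return None  # nothing to report; the client simply asks again


if __name__ == "__main__":
    box = Mailbox()
    # Simulate a message arriving while the client's request is being held.
    threading.Timer(0.1, box.publish, args=("new message",)).start()
    print(box.long_poll(timeout=5.0))  # new message
```

Compared with frequent polling, the client makes one request per message (or per timeout), at the cost of the server keeping one request open per connected client.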