Using Load Balancers To Terminate TLS For TCP/IP Connections

I am writing a TCP/IP server that handles persistent connections. I'll be using TLS to secure the communication and have a question about how to do this:
Currently I have a load balancer (AWS ELB) in front of a single server. In order for the load balancer to do the TLS termination for the duration of the connection it must hold on to the connection and forward the plain text to the application behind it.
client ---tls---> Load Balancer ---plain text---> App Server
This works great. Yay! My concern is that I'll need a load balancer in front of every app server because, presumably, the number of connections the load balancer can handle is the same as the number of connections the app server can handle (assuming the same OS and NIC). This means that if I had 1 load balancer and 2 app servers, I could wind up in a situation where the load balancer is at full capacity and each app server is at half capacity. In order to avoid this problem I'd have to create a 1 to 1 relationship between the load balancers and app servers.
I'd prefer the app server not to have to do the TLS termination because, well, why reinvent the wheel? Are there better methods than a 1-to-1 relationship between the load balancer and the app server to avoid the capacity issue mentioned above?

There are two probable flaws in your presumption.
The first is the assumption that your application server will experience the same amount of load for a given number of connections as the load balancer. Unless your application server is extremely well written, it seems reasonable that it would run out of CPU or memory, or encounter other scaling issues, before it reached the theoretical maximum of ~64K concurrent connections possible between a pair of IPv4 addresses (a limit imposed by the 16-bit TCP port range). If that's really true, then great -- well done.
The second issue is that a single load balancer from ELB is not necessarily a single machine. A single ELB launches a hidden virtual machine in each availability zone where you've attached the ELB to a subnet, regardless of the number of instances attached, and the number of ELB nodes scales up automatically as load increases. (If I remember right, I've seen as many as 8 nodes running at the same time -- for a single ELB.) Presumably the class of those ELB instances could change, too, but that's not a facet that's well documented. There's no charge for these machines, as they are included in the ELB price, so as they scale up, the monthly cost for the ELB doesn't change... but provisioning qty = 1 ELB does not mean you get only 1 ELB node.
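For example, resolving the ELB's DNS name returns one address per active ELB node, so you can watch the node count grow under load (the name and addresses below are made up):

$ dig +short my-elb-1234567890.us-east-1.elb.amazonaws.com
203.0.113.12
203.0.113.47
198.51.100.201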

Related

Load balancer confusion (load balancer mechanism)

Hi, I'm a little confused about the load balancer concept.
I've read some articles about load balancing in nginx, and from what I understand, the load balancer spreads requests across multiple servers!
But I thought that if one server is down, another one takes over (not all servers running simultaneously).
Another thing: when requests are spread between servers, what happens to stateful data like sessions and in-memory databases like Redis?
I think I'm confused and have misunderstood the load balancer mechanism.
and from what I understand, the load balancer spreads requests across multiple servers! But I thought that if one server is down, another one takes over (not all servers running simultaneously)
As the name suggests, the goal of a load balancer (LB) is to balance the load. Per the Wikipedia definition, for example:
In computing, load balancing is the process of distributing a set of tasks over a set of resources (computing units), with the aim of making their overall processing more efficient. Load balancing can optimize the response time and avoid unevenly overloading some compute nodes while other compute nodes are left idle.
To perform this task, the load balancer obviously needs some monitoring over the resources, including liveness checks (so it can take failing servers/nodes out of rotation). Ideally an LB should work with stateless services (i.e. a request could be routed to any of the servers able to handle that request type), but that is not always the case, for multiple reasons: for example, with non-distributed sessions in ASP.NET, requests have to be routed to the server that handled the previous request in the session, which can be achieved with so-called sticky sessions/cookies, as sketched below.
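For illustration only, cookie-based stickiness in an HAProxy backend might look roughly like this (server names and addresses are made up):

backend app
    balance roundrobin
    # insert a SERVERID cookie so a given client keeps hitting the same server
    cookie SERVERID insert indirect nocache
    server app1 10.0.0.11:80 check cookie app1
    server app2 10.0.0.12:80 check cookie app2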
Another thing: when requests are spread between servers, what happens to stateful data like sessions and in-memory databases like Redis?
It is not very clear what the question is here. As I mentioned before, ideally you want stateless services that use one or more shared datastores to handle requests, so that whichever server/node a request lands on, it can load all the data needed to handle it.
So, in short: when a request reaches the LB, it selects one of the servers based on some algorithm (round robin, resource based, sharding, response-time based, etc.) and sends the request to that server. Depending on the approach used, sequential requests from the same source can therefore hit different nodes/servers, which is basically one of the ways to horizontally scale your application. A minimal example follows.
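For instance, with nginx the selection algorithm is chosen per upstream block; a minimal sketch (the host names are placeholders) using least-connections instead of the default round robin:

upstream myapp {
    least_conn;
    server app1.example.com;
    server app2.example.com;
}

server {
    listen 80;
    location / {
        # requests are proxied to whichever upstream server is selected
        proxy_pass http://myapp;
    }
}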
I actually found my answer in the nginx docs.
The short answer is the ip-hash mechanism.
From the nginx docs:
Please note that with round-robin or least-connected load balancing, each subsequent client’s request can be potentially distributed to a different server. There is no guarantee that the same client will be always directed to the same server.
If there is the need to tie a client to a particular application server — in other words, make the client’s session “sticky” or “persistent” in terms of always trying to select a particular server — the ip-hash load balancing mechanism can be used.
With ip-hash, the client’s IP address is used as a hashing key to determine what server in a server group should be selected for the client’s requests. This method ensures that the requests from the same client will always be directed to the same server except when this server is unavailable.
To configure ip-hash load balancing, just add the ip_hash directive to the server (upstream) group configuration:
upstream myapp1 {
    ip_hash;
    server srv1.example.com;
    server srv2.example.com;
    server srv3.example.com;
}
http://nginx.org/en/docs/http/load_balancing.html

Why is mod_proxy_protocol or ELB causing high apache worker count?

We have a legacy cluster of servers running Apache 2.4 that runs our application and sits behind an ELB. This ELB has two listeners, one HTTP and one HTTPS, which terminates at the ELB and sends regular HTTP traffic to the instances behind it. This ELB also has pre-open turned off (it was causing a busy worker buildup). Under normal load we have 1-3 busy workers per instance.
We have a new cluster of servers we are trying to migrate to, behind a new ELB. The purpose of this migration is to allow for SNI, serving TLS traffic to thousands of domains. As such, this cluster uses mod_proxy_protocol, with the proxy protocol enabled at the ELB level. For the purposes of testing, we've been weighting traffic at the DNS (Route 53) level to send 30% of our traffic to the new load balancer. Under even this small load we see 5-10 busy workers, and that grows as traffic does.
As a further test we took one of these new instances, disabled proxy_protocol, and moved it from the new ELB to the old ELB; the worker count dropped to average levels, 1-3 busy workers. This seems to indicate that there is an issue either with the ELB (differences between HTTP and TCP handling?) or with mod_proxy_protocol.
My question: why do we have twice the busy Apache workers when using the proxy protocol and the new ELB? I would think that since TCP listeners are dumb and don't do any processing on the traffic, they would be faster and, as a result, tie up less worker time than HTTP listeners, which actively 'modify' the traffic going through them.
Any guidance to help us diagnose this issue is appreciated.
The difference is simple and significant:
An ELB in HTTP mode takes care of holding the idle keep-alive connections from browsers without holding open corresponding connections to the instance. There's no necessary correlation between browser connections and back-end connections -- a backend connection can be reused.
In TCP mode, it's 1:1. It has to be, because the ELB can't reuse a back-end connection for a different browser connection on the front end -- it's not interpreting what's going down the pipe. That's always true for TCP, but if the reason isn't intuitive, it should be particularly obvious with the proxy protocol enabled. The PROXY "header" is not in fact a "header" in the usual sense -- it's a preamble. It can only be sent at the very beginning of a connection, identifying the source address and port. The connection persists until the browser or server closes it, or it times out. It's 1:1.
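For illustration, a version 1 proxy protocol preamble is a single text line (terminated by CR LF) sent before any application data; with made-up addresses it looks like this:

PROXY TCP4 203.0.113.10 10.0.1.25 51234 443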
This is not likely to be viable with Apache.
Back to HTTP mode, for a minute.
This ELB also has pre-open turned off (it was causing a busy worker buildup).
I don't know how you did that -- I've never seen it documented, so I assume this must have been through a support request.
This seems like a case of solving entirely the wrong problem. Instead of having a number of connections that seems to you to be artificially high, all you've really accomplished is keeping the number of connections artificially low -- ultimately, you're probably actually impairing your performance and ability to scale. Those spare connections are for the purpose of handling bursts of demand. If your instance is too small to handle them, then I would suggest that the real problem is just that: your instance is too small.
Another approach -- which is exactly the solution I use for my dreaded legacy Apache-based applications (one of which has a single Apache server sitting behind a total of about 15 to 20 different ELBs -- necessary because each ELB is offloading SSL using a certificate provided by one of the old platform's customers) -- is HAProxy between the ELBs and Apache. HAProxy can handle literally hundreds of connections and millions of requests per day on tiny instances (I'm talking tiny -- t2.nano and t2.micro), and it has no problem keeping the connections alive from all of the ELBs yet closing the Apache connection after each request... so it's optimizing things in both directions.
And of course, you can also use HAProxy with a TCP balancer and the proxy protocol -- the author of HAProxy was also the creator of the proxy protocol standard. You can also just run it on the instances with Apache rather than on separate instances. It's lightweight in memory and CPU and doesn't fork. I'm not affiliated with the project, other than having submitted occasional bug reports during the development of the Lua integration.
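A minimal sketch of that second variant (HAProxy on the same instance as Apache, accepting the proxy protocol from a TCP-mode ELB, and closing the Apache-side connection after each request), assuming a made-up listen port:

frontend from_elb
    # the TCP-mode ELB connects here with the proxy protocol enabled
    bind *:8080 accept-proxy
    mode http
    # keep ELB/client connections alive, close the Apache side per request
    option http-server-close
    # pass the real client IP (recovered from the PROXY preamble) to Apache
    option forwardfor
    default_backend apache

backend apache
    mode http
    server local 127.0.0.1:80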

How to scale out haproxy node itself?

I have one application whose architecture is as below:
users ----> HAProxy load balancer (TCP connections) ----> Application server 1
                                                     ----> Application server 2
                                                     ----> Application server 3
Now I am able to scale the application servers out and in. But due to the high number of TCP connections (around 10,000), my HAProxy load balancer node's RAM fills up, so I am not able to accept any more TCP connections. I therefore need to add one more node for HAProxy itself. So my question is: how do I scale out the HAProxy node itself, or is there another solution for this?
I am deploying the application on AWS. Your solution is highly appreciated.
Thanks,
Mayank
Use AWS Route53 and create CNAMEs that point to your haproxy instances.
For example:
Create a CNAME haproxy.example.com pointing to haproxy instance1.
Create a second CNAME haproxy.example.com pointing to haproxy instance2.
Both of your CNAMEs should use some sort of routing policy. The simplest is probably weighted routing with equal weights, which effectively round-robins over your records. When you look up haproxy.example.com you will get the addresses of both instances, and the order of IPs returned will change from request to request. This will distribute load roughly evenly between your two instances.
There are many more options depending on your needs: http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy.html
Overall this is pretty easy to configure.
A few additional points: you can also set up health checks and route traffic to the remaining healthy instance(s) if needed. If you're running at capacity with two instances, you might want to add a few more to be able to cope with an instance failing.
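As a rough illustration (the record names and instance DNS names are placeholders), a Route 53 change batch for two equal-weighted CNAMEs, applied with the aws route53 change-resource-record-sets command, might look like this:

{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "haproxy.example.com",
        "Type": "CNAME",
        "SetIdentifier": "haproxy-1",
        "Weight": 50,
        "TTL": 60,
        "ResourceRecords": [{ "Value": "haproxy1.example.com" }]
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "haproxy.example.com",
        "Type": "CNAME",
        "SetIdentifier": "haproxy-2",
        "Weight": 50,
        "TTL": 60,
        "ResourceRecords": [{ "Value": "haproxy2.example.com" }]
      }
    }
  ]
}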

Build an app stack that has no single point of failure and is fault tolerant

Considering a three-tier app (web server, app server and database).
[Apache Web server -> Tomcat app server -> database]
How to build an app stack (leaving out the database part) that has no single point of failure and is fault tolerant?
IMHO, this is quite an open-ended question. How specific is a single point of failure - single app server, single physical server, single data centre, network?
A starting point would be to run your Tomcat and Apache servers in clusters. Alternatively, you can run separate instances, fronted with a load balancer such as HAProxy - except, to avoid a single point of failure, you will need redundancy on the load balancer as well. I recently worked on a project where we had two instances of a load balancer, fronted with a virtual IP (VIP). The load balancers communicated with two different app server instances, using a round-robin approach. Clients connected to the VIP in order to use the application, and they were completely oblivious to the fact that there were multiple servers behind it.
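A minimal sketch of that kind of HAProxy round-robin setup (the VIP, ports and server addresses are made up; the VIP itself would be held by something like keepalived on both load balancer instances):

frontend www
    # 192.0.2.100 is the virtual IP that clients connect to
    bind 192.0.2.100:80
    mode http
    default_backend tomcats

backend tomcats
    mode http
    balance roundrobin
    # simple HTTP health check so a dead Tomcat is taken out of rotation
    option httpchk GET /
    server app1 10.0.0.11:8080 check
    server app2 10.0.0.12:8080 check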
As an additional comment, you may also want to look at space-based architecture - https://en.wikipedia.org/wiki/Space-based_architecture.

Glassfish failover without load balancer

I have a GlassFish v2u2 cluster with two instances and I want to fail over between them. Every document that I read on this subject says that I should use a load balancer in front of GlassFish, like Apache httpd. In this scenario failover works, but I again have a single point of failure.
Is Glassfish able to do that fail-over without a load balancer in front?
The way we solved this is that we have two IP addresses which both respond to the URL. The DNS provider (DNS Made Easy) will round-robin between the two. Setting the timeout low will ensure that if one server fails, the other will answer. When one server stops responding, DNS Made Easy will only return the other host as the server to respond to this URL. You will have to trust the DNS provider, but you can buy service with extremely high availability of the DNS lookup.
As for high availability, you can have a cluster setup which allows for session replication, so that the user won't lose more than potentially the one request which fails.
Hmm.. JBoss can do failover without a load balancer according to the docs (http://docs.jboss.org/jbossas/jboss4guide/r4/html/cluster.chapt.html) Chapter 16.1.2.1. Client-side interceptor.
As far as I know, the GlassFish cluster provides in-memory session replication between nodes. If I use Sun's GlassFish Enterprise Application Server I can use HADB, which promises 99.999% availability.
No, you can't do it at the application level.
Your options are:
Round-robin DNS - expose both your servers to the internet and let the client do the load-balancing - this is quite attractive as it will definitely enable fail-over.
Use a different layer 3 load balancing system - such as "Windows network load balancing", "Linux Network Load balancing" or the one I wrote called "Fluffy Linux cluster"
Use a separate load-balancer that has a failover hot spare (see the sketch at the end of this answer)
In any of these cases you still need to ensure that your database, session data, etc. are available and in sync between the members of your cluster, which in practice is much harder.
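As an illustration of the "failover hot spare" option, a pair of load balancers could share a virtual IP with keepalived; a minimal sketch for the primary node (the interface name, router ID and VIP are placeholders), with the standby using state BACKUP and a lower priority:

vrrp_instance VI_1 {
    state MASTER          # the hot spare uses BACKUP
    interface eth0
    virtual_router_id 51
    priority 100          # the hot spare uses a lower priority, e.g. 90
    advert_int 1
    virtual_ipaddress {
        192.0.2.100       # the VIP that clients (or DNS) point at
    }
}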