SSL and Load Balancing

What effect does SSL have on the way load balancing works? I know that you need to use sticky sessions if you have chosen not to store your session info in the DB or out of process, but how does that affect SSL?

Just to clarify, SSL/TLS sessions have nothing to do with HTTP sessions. (Some implementations may use the SSL/TLS session ID as a basis for maintaining HTTP sessions, but this is a bad design, as SSL/TLS may change sessions completely independently of what HTTP is doing.)
In terms of load balancing, you get a couple of options:
Use a load-balancer that is your SSL/TLS endpoint. In this case, the load balancing is done at the HTTP level: the client connects to the load-balancer, and the load-balancer unwraps the SSL/TLS connection to pass on the HTTP content (then in the clear) to its workers.
Use a load-balancer at the TCP/IP level, which redirects the entire TCP connection directly to a worker node. In this case, each worker node would have to have the certificate and private key (which isn't necessarily a problem if they're administered consistently). Using this technique, the load balancer doesn't do any HTTP processing at all (since it doesn't look within the SSL/TLS connection): on the one hand this reduces the processing done by the load-balancer itself; on the other hand, it prevents you from dispatching to a particular worker node based on, for example, the URL structure. Both methods have their advantages and disadvantages.
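As an illustration of the second option, here is a minimal, hypothetical Java sketch of such a TCP-level pass-through (the hostnames, ports and round-robin choice are assumptions for the example): the balancer blindly copies bytes, TLS handshake included, to whichever worker it picked, and each worker terminates TLS itself.

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: a TCP-level balancer that never inspects the encrypted stream.
public class TcpPassthroughBalancer {
    private static final List<String> WORKERS =
            List.of("worker1.example.com", "worker2.example.com");   // placeholder workers
    private static final AtomicInteger next = new AtomicInteger();

    public static void main(String[] args) throws IOException {
        try (ServerSocket listener = new ServerSocket(8443)) {        // placeholder front-end port
            while (true) {
                Socket client = listener.accept();
                String host = WORKERS.get(Math.floorMod(next.getAndIncrement(), WORKERS.size()));
                Socket worker = new Socket(host, 443);                 // each worker holds the cert/key
                pipe(client, worker);                                  // client -> worker bytes
                pipe(worker, client);                                  // worker -> client bytes
            }
        }
    }

    private static void pipe(Socket from, Socket to) {
        new Thread(() -> {
            try (InputStream in = from.getInputStream(); OutputStream out = to.getOutputStream()) {
                in.transferTo(out);                                    // blind byte copy, no HTTP parsing
            } catch (IOException ignored) {
                // one side closed the connection
            }
        }).start();
    }
}

In practice you would of course use a real balancer (nginx, HAProxy, an ELB in TCP mode, etc.); the point is only that nothing inside the byte stream is interpreted, which is why URL-based routing is impossible in this mode.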

Related

Load balancer confusion (load balancer mechanism)

Hi, I'm a little confused about the load balancer concept.
I've read some articles about load balancing in nginx, and from what I understand, the load balancer spreads requests across multiple servers.
But I thought that if one server is down, another one comes up and takes over (not all servers running together at the same time).
Another thing: when requests are spread between servers, what happens to data like sessions and in-memory databases like Redis?
I think I'm confused and have misunderstood the load balancer mechanism.
from what I understand, the load balancer spreads requests across multiple servers. But I thought that if one server is down, another one comes up and takes over (not all servers running together at the same time)
As the name suggests, the goal of a load balancer (LB) is to balance the load. Per the Wikipedia definition, for example:
In computing, load balancing is the process of distributing a set of tasks over a set of resources (computing units), with the aim of making their overall processing more efficient. Load balancing can optimize the response time and avoid unevenly overloading some compute nodes while other compute nodes are left idle.
To perform this task, the load balancer obviously needs some monitoring of the resources, including liveness checks (so it can take failing servers/nodes out of rotation). Ideally the LB should work with stateless services (i.e. a request can be routed to any of the servers able to handle that request type), but that is not always the case, for multiple reasons. For example, in ASP.NET with a non-distributed session, requests have to be routed to the server that handled the previous request in that session, which is achieved with a so-called sticky session/cookie.
another thing: when requests are spread between servers, what happens to data like sessions and in-memory databases like Redis?
It is not entirely clear what the question is here. As I mentioned before, ideally you want stateless services that use some shared datastore(s) to handle requests, so that whichever server/node a request reaches, it can load all the data needed to handle it.
In short: when a request reaches the LB, it selects one of the servers based on some algorithm (round robin, resource based, sharding, response time based, etc.) and sends the request to that server. So, depending on the approach used, sequential requests from the same source can in theory hit different nodes/servers (this is basically one of the ways to horizontally scale your application).
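To make that selection step concrete, here is a toy Java sketch (backend names and the hashing choice are made up for illustration) contrasting plain round-robin with an IP-hash that keeps a given client pinned to one backend:

import java.util.List;

// Toy illustration of two selection strategies a balancer might use.
public class Selection {
    private static final List<String> BACKENDS =
            List.of("srv1.example.com", "srv2.example.com", "srv3.example.com");
    private static int counter = 0;

    static String roundRobin() {
        // successive calls walk through the backends in order
        return BACKENDS.get(Math.floorMod(counter++, BACKENDS.size()));
    }

    static String ipHash(String clientIp) {
        // the same client IP always maps to the same backend (until the backend list changes)
        return BACKENDS.get(Math.floorMod(clientIp.hashCode(), BACKENDS.size()));
    }

    public static void main(String[] args) {
        System.out.println(roundRobin());           // srv1.example.com
        System.out.println(roundRobin());           // srv2.example.com
        System.out.println(ipHash("203.0.113.7"));  // always the same backend for this IP
    }
}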
I actually found my answer in the nginx docs page.
The short answer is the ip-hash mechanism.
In the nginx docs' words:
Please note that with round-robin or least-connected load balancing, each subsequent client’s request can be potentially distributed to a different server. There is no guarantee that the same client will be always directed to the same server.
If there is the need to tie a client to a particular application server — in other words, make the client’s session “sticky” or “persistent” in terms of always trying to select a particular server — the ip-hash load balancing mechanism can be used.
With ip-hash, the client’s IP address is used as a hashing key to determine what server in a server group should be selected for the client’s requests. This method ensures that the requests from the same client will always be directed to the same server except when this server is unavailable.
To configure ip-hash load balancing, just add the ip_hash directive to the server (upstream) group configuration:
upstream myapp1 {
    ip_hash;
    server srv1.example.com;
    server srv2.example.com;
    server srv3.example.com;
}
http://nginx.org/en/docs/http/load_balancing.html

Why is mod_proxy_protocol or ELB causing high apache worker count?

We have a legacy cluster of servers running Apache 2.4 that serve our application, sitting behind an ELB. This ELB has two listeners, one HTTP and one HTTPS; the HTTPS listener terminates SSL at the ELB and sends regular HTTP traffic to the instances behind it. This ELB also has pre-open turned off (it was causing a busy worker buildup). Under normal load we have 1-3 busy workers per instance.
We have a new cluster of servers we are trying to migrate to behind a new ELB. The purpose of this migration is to allow for SNI – serving TLS traffic to thousands of domains. As such this cluster uses mod_proxy_protocol which has been enabled at the ELB level. For the purposes of testing we’ve been weighting traffic at the DNS (Route 53) level to send 30% of our traffic to the new load balancer. Under even this small load we see 5 – 10 busy workers and that grows as traffic does.
As a further test we took one of these new instances, disabled proxy_protocol, and moved it from the new ELB to the old ELB; the worker count then drops back to average levels of 1-3 busy workers. This seems to indicate that there is an issue either with the ELB (differences between HTTP and TCP handling?) or with mod_proxy_protocol.
My question: why is it that we have twice the busy Apache workers when using the proxy protocol and the new ELB? I would think that since TCP listeners are dumb and don't do any processing on the traffic, they would be faster and as a result tie up less worker time than HTTP listeners, which actively 'modify' the traffic going through them.
Any guidance to help us diagnose this issue is appreciated.
The difference is simple and significant:
An ELB in HTTP mode takes care of holding the idle keep-alive connections from browsers without holding open corresponding connections to the instance. There's no necessary correlation between browser connections and back-end connections -- a backend connection can be reused.
In TCP mode, it's 1:1. It has to be, because the ELB can't reuse a back-end connection for a different browser connection on the front end -- it's not interpreting what's going down the pipe. That's always true for TCP, but if the reason isn't intuitive, it should be particularly obvious with the proxy protocol enabled. The PROXY "header" is not in fact a "header" in the usual sense -- it's a preamble. It can only be sent at the very beginning of a connection, identifying the source address and port. The connection persists until the browser or server closes it, or it times out. It's 1:1.
This is not likely to be viable with Apache.
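For concreteness, here is a hypothetical Java snippet showing roughly what that PROXY protocol v1 preamble looks like on the wire (addresses and ports are example values): a single ASCII line written once, at the very start of the backend connection, after which the raw client byte stream is copied verbatim.

import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Illustrative only: writing a PROXY v1 preamble before forwarding the stream.
public class ProxyPreambleDemo {
    public static void main(String[] args) throws Exception {
        try (Socket backend = new Socket("backend.example.com", 443)) {   // placeholder backend
            // original client 203.0.113.7:56324 connected to the balancer at 192.0.2.10:443
            String preamble = "PROXY TCP4 203.0.113.7 192.0.2.10 56324 443\r\n";
            OutputStream out = backend.getOutputStream();
            out.write(preamble.getBytes(StandardCharsets.US_ASCII));
            // ...then copy the client's bytes for the lifetime of the connection.
        }
    }
}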
Back to HTTP mode, for a minute.
This ELB also has pre-open turned off (it was causing a busy worker buildup).
I don't know how you did that -- I've never seen it documented, so I assume this must have been through a support request.
This seems like a case of solving entirely the wrong problem. Instead of having a number of connections that seems to you to be artificially high, all you've really accomplished is keeping the number of connections artificially low -- ultimately, you're probably actually impairing your performance and ability to scale. Those spare connections are for the purpose of handling bursts of demand. If your instance is too small to handle them, then I would suggest that the real problem is just that: your instance is too small.
Another approach -- which is exactly the solution I use for my dreaded legacy Apache-based applications (one of which has a single Apache server sitting behind a total of about 15 to 20 different ELBs -- necessary because each ELB is offloading SSL using a certificate provided by one of the old platform's customers) -- is HAProxy between the ELBs and Apache. HAProxy can handle literally hundreds of connections and millions of requests per day on tiny instances (I'm talking tiny -- t2.nano and t2.micro), and it has no problem keeping the connections alive from all of the ELBs yet closing the Apache connection after each request... so it's optimizing things in both directions.
And of course, you can also use HAProxy with a TCP balancer and the proxy protocol -- the author of HAProxy was also the creator of the proxy protocol standard. You can also just run it on the instances with Apache rather than on separate instances. It's lightweight in memory and CPU and doesn't fork. I'm not affiliated with the project, other than having submitted occasional bug reports during the development of the Lua integration.

Scaling out the zuul proxy

We have scaled out all services in our system by having more than one instance of them registered in the Eureka service registry.
They are also proxied by a Zuul server in front.
My question is how we can ensure the scalability of the Zuul proxy when it is accessed by clients.
One solution I can think of is having multiple instances of the proxy registered in the Eureka registry. But if that is done, how do we decide which of the instances should be exposed to the clients?
We faced the same issue in our application, having multiple instances of multiple types of microservices on our backend, all registered with Eureka. The problem is that we also had multiple security gateways configured (based on the architecture described in this excellent tutorial: https://spring.io/guides/tutorials/spring-security-and-angular-js/).
Eventually we decided to use a hardware HTTP load balancer that calls our security gateways in round-robin fashion (our solution is on-prem).
We use Redis with the @EnableRedisHttpSession annotation to have the Spring session synced across all the servers, so the HTTP load balancer does not have to deal with sticky sessions or stateful considerations. It just round-robins across the security gateways. It doesn't matter whether the load balancer hits SG1, SG2 or SG3; they all share the same session information coming from Redis (which is also configured for fail-over with Redis Sentinel).
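For reference, a minimal sketch of that setup with Spring Session (the class name and Redis host are placeholders; Sentinel-based failover would be configured on the connection factory in a real deployment):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisStandaloneConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;
import org.springframework.session.data.redis.config.annotation.web.http.EnableRedisHttpSession;

// Stores the HTTP session in Redis so any gateway instance can serve any request.
@Configuration
@EnableRedisHttpSession
public class SessionConfig {

    @Bean
    public LettuceConnectionFactory redisConnectionFactory() {
        // every gateway points at the same Redis, so no sticky sessions are needed
        return new LettuceConnectionFactory(
                new RedisStandaloneConfiguration("redis.internal.example", 6379));
    }
}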

service discovery, load balancing and connection pooling approach

There are two approaches that can be used for service interaction in an SOA for large systems deployed on a cloud like AWS:
Have each service cluster behind an internal ELB. The client maintains a connection pool to the corresponding ELB, and the ELB does round-robin balancing.
Go with a service discovery approach like Netflix Eureka.
Currently we are using the first approach, where each service cluster is behind an internal ELB and clients communicate via the ELBs, so each client instance has to maintain only one pool, i.e. with the ELB endpoint.
I have the following doubts regarding the second approach.
Is there a benefit in moving to a service discovery and smart-client architecture, where the service client knows all service instances (via a Eureka service or equivalent) and does client-side load balancing?
In the above case, how does connection pooling work? Currently each client instance has to maintain exactly one connection pool, i.e. with the corresponding service's ELB. But with a rich client, each client will have all the service instance endpoints to communicate with directly. Making a connection on each request will not be efficient, and having so many connection pools (one for each service instance) per client seems like overkill.
I need input/suggestions on the above two questions.
First question.
Yes, there is. First, you can do better failure recovery; for example, retrying failed requests against another node without showing any errors to the client. Next, you can do better balancing than the ELB offers. You can also automatically add/remove nodes to/from the cluster without altering the ELB configuration, which is very useful if your nodes have health checks. More importantly, a software balancer can do this fast.
Second question.
Have a connection pool per node, i.e.:
[api method in client code] -> [software balancer] -> [node connection pool] -> [node connection] -> [use this connection to make request]
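A rough Java sketch of that chain using the JDK HTTP client (node addresses, which would normally come from Eureka or a similar registry, and the simple round-robin picker are assumptions for the example):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// One pooled HttpClient per node; a trivial round-robin stands in for the software balancer.
public class PerNodeClient {
    private final List<String> nodes;                                  // e.g. "http://10.0.0.5:8080"
    private final Map<String, HttpClient> pools = new ConcurrentHashMap<>();
    private final AtomicInteger next = new AtomicInteger();

    public PerNodeClient(List<String> nodes) {
        this.nodes = nodes;
    }

    public String get(String path) throws Exception {
        String node = nodes.get(Math.floorMod(next.getAndIncrement(), nodes.size()));
        // each HttpClient keeps its own connection pool, so connections to a node are reused
        HttpClient client = pools.computeIfAbsent(node, n -> HttpClient.newHttpClient());
        HttpRequest request = HttpRequest.newBuilder(URI.create(node + path)).build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}

The important part is that the pool lives per node rather than per request, so moving to a rich client does not mean reconnecting on every call.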
