Scaling out the zuul proxy - reverse-proxy

We have scaled out all sevices in our system by having more than one instance of them registered in eureka service registry.
Also, they are also proxied by a zuul server in the front.
My question is how can we ensure the scalability of zuul proxy when accessed from clients.
One solution i can think of is having multiple instances of the proxy registered in eureka registry. But if that is done how do we decide on which of the instances would be exposed to the clients.

We faced the same issue in our application, having multiple instances of multiple types of micro-service-type applications on our backend. All servers registered with Eureka. The problem is that we also had multiple security gateways configured (based on the architecture described in this excellent tutorial: https://spring.io/guides/tutorials/spring-security-and-angular-js/).
Eventually we decided to use a hardware http load balancer that calls our security gateways in a round-robin approach (our solution is on-prem).
We use Redis with #EnableHttpRedisSession annotation to have the spring session synced across all the servers, so the http load balancer does not have to deal with sticky sessions or stateful considerations. It just does a round-robin to all the security gateways. It doesn't matter if the load balancer hits SG1, SG2 or SG3, they all share the same session information coming from Redis (which is also configured for fail-over with Redis Sentinel).

Related

How Can I use Apache to load balance Marklogic Cluster

Hi I am new to Marklogic and Apache. I have been provided task to use apache as loadbalancer for our Marklogic cluster of 3 machines. Marklogic cluster is currently running on Linux servers.
How can we achieve this? Any information regarding this would be helpful.
You could use mod_proxy_balancer. How you configure it depends what MarkLogic client you would like to use. If you would like to use the Java Client API, please follow the second example here to allow apache to generate stickiness cookies. If you would like to use XCC, please configure it to use the ML-Server-generated or backend-generated "SessionID" cookie.
The difference here is that XCC uses sessions whereas the Java Client API builds on the REST API which is stateless, so there are no sessions. However, even in the Java Client API when you use multi-request transactions, that imposes state for the duration of that transaction so the load balancer needs a way to route requests during that transaction to the correct node in the MarkLogic cluster. The stickiness cookie will be resent by the Java Client API with every request that uses a Transaction so the load balancer can maintain that stickiness for requests related to that transaction.
As always, do some testing of your configuration to make sure you got it right. Properly configuring apache plugins is an advanced skill. Since you are new to apache, your best hope of ensuring you got it right is checking with an HTTP monitoring tool like WireShark to look at the HTTP traffic from your application to MarkLogic Server to make sure things are going to the correct node in the cluster as expected.
Note that even with the client APIs (Java, Node.js) its not always obvious or explicit at the language API layer what might cause a session to be created. Explicitly creating multi statement transactions definately will, but other operations may do so as well. If you are using the same connection for UI (browser) and API (REST or XCC) then the browser app is likely to be doing things that create session state.
The safest, but least flexable configuration is "TCP Session Affinity". If they are supported they will eliminate most concerns related to load balancing. Cookie Session Affinity relies on guarenteeing that the load balencer uses the correct cookie. Not all code is equal. I have had cases where it the load balancer didn't always use the cookie provided. Changing the configuration to "Load Balancer provided Cookie Affinity" fixed that.
None of this is needed if all your communications are stateless at the TCP layer, the HTTP layer and the app layer. The later cannot be inferred by the server.
Another conern is if your app or middle tier is co-resident with other apps or the same app connecting to the same load balancer and port. That can be difficult to make sure there are no 'crossed wires' . When ML gets a request it associates its identity with the client IP and port. Even without load balencers, most modern HTTP and TCP client libraries implement socket caching. A great perfomrance win, but a hidden source of subtle random severe errors if the library or app are sharing "cookie jars" (not uncomnon). A TCP and Cookie Jar cache used by different application contexts can end up sending state information from one unrelated app in the same process to another. Mostly this is in middle tier app servers that may simply pass on requests from the first tier without domain knowledge, presuming that relying on the low level TCP libraries to "do the right thing" ... They are doing the right thing -- for the use case the library programmers had in mind -- don't assume that your case is the one the library authors assumed. The symptoms tend to be very rare but catastrophic problems with transaction failures and possibly data corruption
and security problems (at an application layer) because the server cannot tell the difference between 2 connections from the same middle tier.
Sometimes a better strategy is to load balance between the first tier and the middle tier, and directly connect from the middle tier to MarkLogic.
Especially if caching is done at the load balancer. Its more common for caching to be useful between the middle tier and the client then the middle tier and the server. This is also more analogous to the classic 3 tier architecture used with RDBMS's .. where load balancing is between the client and business logic tiers not between business logic and database.

Distributed Rabbitmq within a spring-cloud environment

I am trying to setup a distributed system based on current spring-cloud release (meaning mostly Netflix OSS) using the following components
1 or more cloud config servers
1 or more Eureka servers
1 or more services using Eureka and Config Server clients
The setup above is easy enough to get going however once you start looking into setting up so that configuration changes in the cloud Config servers automatically trigger changes in the values of the actual clients, things start getting more complicated.
It is my understanding that for such a feature to work one should introduce spring-cloud-bus clients to the services which in turn will use, currently the only supported implementation, rabbitmq servers (the actual rabbitmq binaries and not some spring-boot app like eureka or Config servers) to allow change events in the Config server to be propagated to the clients automatically.
It sounds counterintuitive to setup such a system and have to hardcode addresses to rabbitmq servers in the clients (even if one will be keeping the amount of rabbitmq servers more or less static).
How is one supposed to register rabbitmq server instances in the Eureka service discovery server(s) to allow for clients to find them without having to have any knowledge about their location prior to startup?
I cannot seem to find any documentation on how this is done given that rabbitmq is not a spring-cloud component. In fact very little documentation seems to exist regarding on how the rabbitmq + eureka + spring-cloud-bus should be setup together.
I know that I am on a VERY old question, even though I think it worth a comment for people who read this in the future.
Most of the cloud services, lets take AWS as an example, have an Elastic IP solution - so you can configure IPs for RabbitMQ servers, and the IPs always belong to the RabbitMQ, no matter whether the instances change. You can re-attach the Elastic IP to different instances.
It works nearly the same with Elastic Load Balancer, which keeps its IP, so you could configure your microservices to a specific IP using Spring Cloud Config Server - and scale the RabbitMQ instances without a need to worry about configuration change.

What is the conceptual difference between Service Discovery tools and Load Balancers that check node health?

Recently several service discovery tools have become popular/"mainstream", and I’m wondering under what primary use cases one should employ them instead of traditional load balancers.
With LBs, you cluster a bunch of nodes behind the balancer, and then clients make requests to the balancer, who then (typically) round robins those requests to all the nodes in the cluster.
With service discovery (Consul, ZK, etc.), you let a centralized “consensus” service determine what nodes for particular service are healthy, and your app connects to the nodes that the service deems as being healthy. So while service discovery and load balancing are two separate concepts, service discovery gives you load balancing as a convenient side effect.
But, if the load balancer (say HAProxy or nginx) has monitoring and health checks built into it, then you pretty much get service discovery as a side effect of load balancing! Meaning, if my LB knows not to forward a request to an unhealthy node in its cluster, then that’s functionally equivalent to a consensus server telling my app not to connect to an unhealty node.
So to me, service discovery tools feel like the “6-in-one,half-dozen-in-the-other” equivalent to load balancers. Am I missing something here? If someone had an application architecture entirely predicated on load balanced microservices, what is the benefit (or not) to switching over to a service discovery-based model?
Load balancers typically need the endpoints of the resources it balances the traffic load. With the growth of microservices and container based applications, runtime created dynamic containers (docker containers) are ephemeral and doesnt have static end points. These container endpoints are ephemeral and they change as they are evicted and created for scaling or other reasons. Service discovery tools like Consul are used to store the endpoints info of dynamically created containers (docker containers). Tools like consul-registrator running on container hosts registers container end points in service discovery tools like consul. Tools like Consul-template will listen for changes to containers end points in consul and update the load balancer (nginx) for sending the traffic to. Thus both Service Discovery Tools like Consul and Load Balancing tools like Nginx co-exist to provide runtime service discovery and load balancing capability respectively.
Follow up: what are the benefits of ephemeral nodes (ones that come and go, live and die) vs. "permanent" nodes like traditional VMs?
[DDG]: Things that come quickly to my mind: Ephemeral nodes like docker containers are suited for stateless services like APIs etc. (There is traction for persistent containers using external volumes - volume drivers etc)
Speed: Spinning up or destroying ephemeral containers (docker containers from image) takes less than 500 milliseconds as opposed to minutes in standing up traditional VMs
Elastic Infrastructure: In the age of cloud we want to scale out and in according to users demand which implies there will be be containers of ephemeral in nature (cant hold on to IPs etc). Think of a markerting campaign for a week for which we expect 200% increase in traffic TPS, quickly scale with containers and then post campaign, destroy it.
Resource Utilization: Data Center or Cloud is now one big computer (compute cluster) and containers pack the compute cluster for max resource utilization and during weak demand destroy the infrastructure for lower bill/resource usage.
Much of this is possible because of lose coupling with ephemeral containers and runtime discovery using service discovery tool like consul. Traditional VMs and tight binding of IPs can stifle this capability.
Note that the two are not necessarily mutually exclusive. It is possible, for example, that you might still direct clients to a load balancer (which might perform other roles such as throttling) but have the load balancer use a service registry to locate instances.
Also worth pointing out that service discovery enables client-side load balancing i.e. the client can invoke the service directly without the extra hop through the load balancer. My understanding is that this was one of the reasons that Netflix developed Eureka, to avoid inter-service calls having to go out and back through the external ELB for which they would have had to pay. Client-side load balancing also provides a means for the client to influence the load-balancing decision based on its own perspective of service availability.
If you look at the tools from a completely different perspective, namely ITSM/ITIL, load balancing becomes "just that", whereas service discovery is a part of keeping your CMDB up to date, and ajour with all your services, and their interconnectivity, for better visibility of impact, in case of downtime, and an overview of areas that may need supplementing, in case of High availability applications.
Furthermore, service-discovery only gives you a picture as of the last scan, and not near-real-time (of course dependent on which scanning interval you have set), whereas load balancing will keep an up-to-date picture of your application's health.

Load Balancing with the WSHttp Binding: Do not use reliable sessions? WHY?

We have WCF service X: deployed on server A and Server B, host address:
http://127.0.0.1:8777/ServiceX/
And we load balance the two servers. We accesss the service via http://myappserver/ServiceX
We need to use per-session mode, and we set [reliable sessions] as true:
We don't find any issue till now based on testing. But the below linked MSDN article says that Do not use reliable sessions for Load Balancing with the WSHttp Binding? Please can someone give more details? Thanks a lot.
WCF Load Balancing http://msdn.microsoft.com/en-us/library/ms730128.aspx
Reliable Messaging means all your messages from your established client reach the same endpoint behind any intermediaries like routers and load balancers.
Load balancing means your calls will be distributed across all nodes as the load balancer sees fit.
Those two goals are mutually exclusive. You can have one or the other, not both.
I have not had time to try this myself yet, but I found this old blog entry (https://blogs.msdn.microsoft.com/drnick/2007/07/13/sticky-sessions/):
This division according to groups would allow a feature like reliable messaging to work because the same server would be used to process all of the messages in the reliable session. The feature that this division method represents is typically called “sticky sessions” or some other phrase for affinitization in the load balancer.
Given that you mention that your firewall supports sticky sessions, I suspect/hope you will be fine.

Glassfish failover without load balancer

I have a Glassfish v2u2 cluster with two instances and I want to to fail-over between them. Every document that I read on this subject says that I should use a load balancer in front of Glassfish, like Apache httpd. In this scenario failover works, but I again have a single point of failure.
Is Glassfish able to do that fail-over without a load balancer in front?
The we solved this is that we have two IP addresses which both respond to the URL. The DNS provider (DNS Made Easy) will round robin between the two. Setting the timeout low will ensure that if one server fails the other will answer. When one server stops responding, DNS Made Easy will only send the other host as the server to respond to this URL. You will have to trust the DNS provider, but you can buy service with extremely high availability of the DNS lookup
As for high availability, you can have cluster setup which allows for session replication so that the user won't loose more than potentially one request which fails.
Hmm.. JBoss can do failover without a load balancer according to the docs (http://docs.jboss.org/jbossas/jboss4guide/r4/html/cluster.chapt.html) Chapter 16.1.2.1. Client-side interceptor.
As far as I know glassfish the cluster provides in-memory session replication between nodes. If I use Suns Glassfish Enterprise Application Server I can use HADB which promisses 99.999% of availability.
No, you can't do it at the application level.
Your options are:
Round-robin DNS - expose both your servers to the internet and let the client do the load-balancing - this is quite attractive as it will definitely enable fail-over.
Use a different layer 3 load balancing system - such as "Windows network load balancing" , "Linux Network Load balancing" or the one I wrote called "Fluffy Linux cluster"
Use a separate load-balancer that has a failover hot spare
In any of these cases you still need to ensure that your database and session data etc, are available and in sync between the members of your cluster, which in practice is much harder.