I have 2 EC2 instances running the Glassfish app server (both running the exact same deployed application), and these run behind an Elastic Load Balancer. This is working great, but I am worried about caching inconsistencies with the EJBs.
Situation:
The client sends a request to the ELB, which forwards it to EC2 instance 1. Some EJB object, let's call it EJB1, gets cached.
Client --> ELB --> EC2-1 (EJB1 cached)
A short time later, the client sends another request, but this time it is forwarded to EC2 instance 2. EJB1 now gets cached on that instance as well.
Client --> ELB --> EC2-2 (EJB1 cached)
A short time later, the client sends another request and is forwarded back to EC2 instance 1. EJB1 is still in that instance's cache but is no longer up to date, causing a cache inconsistency.
Client --> ELB --> EC2-1 (EJB1 cached)
Unfortunately I haven't actually been able to observe this issue yet, but I feel it's a real possibility. Other than turning EJB caching off, what is the proper way to prevent this from happening?
Thanks.
Amazon provides "session stickiness" support at the load balancer. This means, assuming the user has cookies enabled, they are sent to the same EC2 instance for the lifetime of their session.
Until you implement a distributed level 2 cache, your instances are not going to share a cache.
Here are a couple of links for level 2 caching solutions.
(open source) http://www.terracotta.org/ehcache/
(lots of cash) http://www.oracle.com/technetwork/middleware/coherence/overview/index.html
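For a sense of what a shared level 2 cache buys you, here is a minimal sketch against the Ehcache 2.x API. The cache name, key, and value are made up for the example, and in a real clustered setup the cache would be wired to Terracotta through ehcache.xml rather than used standalone like this:

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class SharedCacheSketch {
    public static void main(String[] args) {
        // Reads ehcache.xml from the classpath; in a clustered setup that file
        // points the cache at the Terracotta server array shared by all instances.
        CacheManager manager = CacheManager.getInstance();
        Cache cache = manager.getCache("ejbEntities"); // hypothetical cache name

        // A write through the shared cache on one instance...
        cache.put(new Element("customer:42", "latest entity state"));

        // ...is what the other instances see on their next read.
        Element hit = cache.get("customer:42");
        System.out.println(hit != null ? hit.getObjectValue() : "cache miss");

        manager.shutdown();
    }
}

With a cache like this between your instances, the stale-read scenario from the question goes away, because both EC2 instances read and invalidate the same shared entries.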
Hi, I'm a little confused about the load balancer concept.
I've read some articles about load balancing in nginx, and from what I understand, the load balancer spreads requests across multiple servers!
But I thought that if one server is down, another one takes over (not all servers running simultaneously).
Another thing: when requests are spread between servers, what happens to stateful data like sessions and in-memory databases like Redis?
I think I'm confused and have misunderstood the load balancer mechanism.
From what I understand, the load balancer spreads requests across multiple servers! But I thought that if one server is down, another one takes over (not all servers running simultaneously).
As the name suggests, the goal of a load balancer (LB) is to balance the load. Per the Wikipedia definition, for example:
In computing, load balancing is the process of distributing a set of tasks over a set of resources (computing units), with the aim of making their overall processing more efficient. Load balancing can optimize the response time and avoid unevenly overloading some compute nodes while other compute nodes are left idle.
To perform this task the load balancer obviously needs some monitoring over the resources, including liveness checks (so it can take failing servers/nodes out of rotation). Ideally an LB should work with stateless services (i.e. a request can be routed to any server capable of handling that request type), but that is not always the case, for multiple reasons. For example, in ASP.NET with a non-distributed session, requests have to be routed to the server which handled the previous request from the same session, which can be handled with a so-called sticky session/cookie.
Another thing: when requests are spread between servers, what happens to stateful data like sessions and in-memory databases like Redis?
It is not entirely clear what the question is here. As I mentioned before, ideally you want stateless services which use some shared datastore(s) to handle the requests, so that whichever server/node a request reaches, it can load all the data needed to handle it.
So, in short: when a request comes to the LB, it selects one of the servers based on some algorithm (round robin, resource based, sharding, response time based, etc.) and sends the request to that server. Depending on the approach used, sequential requests from the same source can hit different nodes/servers; this is basically one of the ways to scale your application horizontally. A small sketch of the round-robin case follows below.
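To make the round-robin case concrete, here is a minimal sketch of the selection logic in Java. The server names are placeholders, and a real balancer would also skip nodes that fail their liveness checks:

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobinBalancer(List<String> servers) {
        this.servers = servers;
    }

    public String pick() {
        // Math.floorMod keeps the index valid even after the counter overflows.
        return servers.get(Math.floorMod(next.getAndIncrement(), servers.size()));
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb = new RoundRobinBalancer(
                List.of("srv1.example.com", "srv2.example.com", "srv3.example.com"));
        for (int i = 0; i < 6; i++) {
            System.out.println("request " + i + " -> " + lb.pick());
        }
    }
}

Running it shows exactly the behavior described above: consecutive requests cycle through srv1, srv2, srv3 and back again.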
I actually found my answer in the nginx docs page.
The short answer is the ip-hash mechanism.
In the words of the nginx docs:
Please note that with round-robin or least-connected load balancing, each subsequent client’s request can be potentially distributed to a different server. There is no guarantee that the same client will be always directed to the same server.
If there is the need to tie a client to a particular application server — in other words, make the client’s session “sticky” or “persistent” in terms of always trying to select a particular server — the ip-hash load balancing mechanism can be used.
With ip-hash, the client’s IP address is used as a hashing key to determine what server in a server group should be selected for the client’s requests. This method ensures that the requests from the same client will always be directed to the same server except when this server is unavailable.
To configure ip-hash load balancing, just add the ip_hash directive to the server (upstream) group configuration:
upstream myapp1 {
    ip_hash;
    server srv1.example.com;
    server srv2.example.com;
    server srv3.example.com;
}
http://nginx.org/en/docs/http/load_balancing.html
It would be a great help if you could clarify this, please.
When I'm using a load balancer and I bind a client to a server, say with appsession or any other means, and that server goes down, the load balancer redirects the client to another server, and in doing so the whole session is lost. So do I have to write my application in such a way that it stores session data externally, so that it can be shared?
Also, how good is a load balancer when a transaction fails halfway because the server goes unresponsive?
Please let me know, thanks.
There is a difference between the two concepts: session stickiness and session replication.
Session stickiness gives you an assurance that once a request from a client reaches a healthy server, subsequent requests from the same client will be handled by that server. When that server goes down, the stickiness is lost and new requests go to a different healthy server. Session stickiness is usually offered by the load balancer, and your application servers generally do not need to do anything.
Session replication gives you the ability to recover the session when a server goes down. In that case the stickiness is lost, but the new server can recover the previous session from an external session storage, which you will have to implement; a minimal sketch of the idea follows below.
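For illustration, here is a minimal sketch of that external-storage idea. The SessionStore interface and the in-memory map standing in for a shared store (Redis, a database, etc.) are made up for the example:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

interface SessionStore {
    void save(String sessionId, Map<String, String> attributes);
    Map<String, String> load(String sessionId);
}

// Stand-in for a shared store that all servers can reach.
class InMemorySessionStore implements SessionStore {
    private final Map<String, Map<String, String>> backing = new ConcurrentHashMap<>();

    public void save(String sessionId, Map<String, String> attributes) {
        backing.put(sessionId, new ConcurrentHashMap<>(attributes));
    }

    public Map<String, String> load(String sessionId) {
        return backing.getOrDefault(sessionId, Map.of());
    }
}

public class FailoverSketch {
    public static void main(String[] args) {
        SessionStore shared = new InMemorySessionStore();
        // Server A handles the first request and persists the session externally.
        shared.save("sess-123", Map.of("user", "alice", "cart", "3 items"));
        // Server A dies; server B gets the next request and recovers the session.
        System.out.println("recovered on server B: " + shared.load("sess-123"));
    }
}

The point is only the shape: as long as save and load go against storage outside the application server, losing stickiness no longer means losing the session.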
I am using the Azure Container Service with the Kubernetes orchestrator and have an app deployed on a cluster with 3 nodes. It has 5 replicas. How can I verify load balancing in action, e.g. I want to be able to see that every time I hit the external IP I am potentially routed to a different node. Thanks.
The simplest solution is to connect (over SSH, for example) to all 3 nodes and run WinDump there. If everything is working properly, you will be able to see what happens on every node.
Also, here is the Microsoft documentation for testing a load balancer:
https://learn.microsoft.com/en-us/azure/virtual-machines/windows/tutorial-load-balancer#test-load-balancer
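As a simpler client-side check, here is a sketch that hits the service's external IP in a loop and prints each response. It only demonstrates the distribution if your app puts something instance-specific (the pod or host name, say) in its response body; the IP below is a placeholder:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class HitLoadBalancer {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://203.0.113.10/")).build();
        for (int i = 0; i < 10; i++) {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("hit " + i + ": " + response.body().strip());
        }
    }
}

If the replies name different pods across the 10 hits, the service is balancing as expected.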
The default load balancer available to your Windows Azure Web and Worker roles is a software load balancer and not very configurable; however, it does work in a round-robin fashion. If you want to test this behavior, this is what you need to do:
Create two (or more) instances of your service with RDP access enabled so you can RDP to both instances.
RDP to both instances and run NETMON or any other network monitoring solution on them.
Now access your Windows Azure web application from your desktop. You need to understand that when a network connection is made from your desktop, the connection is kept alive based on network settings (60 seconds by default), so you need to wait until the default timeout has passed before accessing your Windows Azure web application again.
When you access your Windows Azure web application again, you can verify that the second request went to the next instance. Be sure to wait past the connection timeout, otherwise your request will keep being handled by the same instance.
Note: If you don't want to use RDP, you can also create a test ASP.NET page and write some code that shows which specific instance served the page. The best way to do this is to read the instance ID, as below:
string instanceId = RoleEnvironment.CurrentRoleInstance.Id; // Id is a string, not an int
If you want more control over Windows Azure load balancing, I would suggest using the Windows Azure Traffic Manager, which will help you route traffic to your site via a round-robin, performance, or failover scenario. More info on using Traffic Manager is in this article.
There are two approaches that can be used for service interaction when you have an SOA for large systems deployed on a cloud like AWS:
1. Have each service cluster behind an internal ELB. The client makes a connection pool to the corresponding ELB, and the ELB does round-robin balancing.
2. Go with a service discovery approach like Netflix Eureka.
Currently we are using the 1st approach, where each service cluster is behind an internal ELB and clients communicate via the ELBs, so each client instance has to maintain only one pool, i.e. with the ELB endpoint.
I have the following doubts regarding the 2nd approach.
Is there a benefit in moving to service discovery and a smart-client architecture, where the service client knows all service instances (via a Eureka service or equivalent) and does the load balancing internally?
In that case, how does connection pooling work? Currently each client instance has to maintain exactly one connection pool, i.e. with the corresponding service's ELB. But with a rich client, each client will have all the service instance endpoints to communicate with directly. Making a new connection on each request will not be efficient, and having so many connection pools (one for each service instance) for each client is overkill, I guess.
Need inputs/suggestions on the above two questions.
First question.
Yes, there is. First, you can do better failure recovery, for example retrying failed requests on another node without showing any errors to the client. Next, you can do better balancing than the ELB offers. Next, you can automatically add/remove nodes to/from the cluster without altering the ELB configuration; this is very useful if your nodes have health checks. More importantly, a software balancer can do this fast.
Second question.
Have a connection pool per node, i.e.:
[api method in client code] -> [software balancer] -> [node connection pool] -> [node connection] -> [use this connection to make request]
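Here is a minimal sketch of that shape, with one lazily created pool per node. The node list is fixed for brevity; with Eureka or an equivalent registry the client would refresh it periodically:

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for a real pool that would lease and reuse open connections.
class ConnectionPool {
    private final String node;
    ConnectionPool(String node) { this.node = node; }
    String request(String payload) { return "response from " + node + " for " + payload; }
}

public class SmartClient {
    private final List<String> nodes;
    private final Map<String, ConnectionPool> pools = new ConcurrentHashMap<>();
    private final AtomicInteger next = new AtomicInteger();

    SmartClient(List<String> nodes) { this.nodes = nodes; }

    String call(String payload) {
        // Client-side round robin over the known nodes...
        String node = nodes.get(Math.floorMod(next.getAndIncrement(), nodes.size()));
        // ...then reuse (or lazily create) that node's dedicated pool.
        return pools.computeIfAbsent(node, ConnectionPool::new).request(payload);
    }

    public static void main(String[] args) {
        SmartClient client = new SmartClient(List.of("10.0.0.1:8080", "10.0.0.2:8080"));
        System.out.println(client.call("GET /users/1"));
        System.out.println(client.call("GET /users/2"));
    }
}

The pools map holds one pool per node, not per request, so the pool count stays bounded by the cluster size rather than the request rate.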
I have a Glassfish v2u2 cluster with two instances and I want to fail over between them. Every document I read on this subject says that I should use a load balancer in front of Glassfish, like Apache httpd. In this scenario failover works, but then I again have a single point of failure.
Is Glassfish able to do failover without a load balancer in front?
The way we solved this is that we have two IP addresses which both respond to the URL. The DNS provider (DNS Made Easy) will round-robin between the two. Setting the timeout low ensures that if one server fails, the other will answer. When one server stops responding, DNS Made Easy will only send the other host as the server to respond to this URL. You will have to trust the DNS provider, but you can buy a service with extremely high availability of the DNS lookup.
As for high availability, you can have a cluster setup which allows for session replication, so that the user won't lose more than potentially the one request that fails.
Hmm... JBoss can do failover without a load balancer according to the docs (http://docs.jboss.org/jbossas/jboss4guide/r4/html/cluster.chapt.html), Chapter 16.1.2.1, "Client-side interceptor".
As far as I know, the Glassfish cluster provides in-memory session replication between nodes. If I use Sun's Glassfish Enterprise Application Server, I can use HADB, which promises 99.999% availability.
No, you can't do it at the application level.
Your options are:
Round-robin DNS - expose both your servers to the internet and let the client do the load balancing. This is quite attractive as it will definitely enable failover (a small client sketch follows at the end of this answer).
Use a different layer-3 load balancing system, such as "Windows Network Load Balancing", "Linux Network Load Balancing", or the one I wrote, called "Fluffy Linux cluster".
Use a separate load-balancer that has a failover hot spare
In any of these cases you still need to ensure that your database, session data, etc. are available and in sync between the members of your cluster, which in practice is much harder.
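To illustrate the round-robin DNS option from the list above, here is a sketch of a client that resolves all A records for a name and fails over to the next address when one does not answer; the hostname is a placeholder:

import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

public class DnsFailoverClient {
    public static void main(String[] args) throws Exception {
        // Round-robin DNS returns every A record; the order rotates per lookup.
        InetAddress[] addresses = InetAddress.getAllByName("app.example.com");
        for (InetAddress address : addresses) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(address, 80), 2000); // 2 s timeout
                System.out.println("connected to " + address.getHostAddress());
                return; // success (a real client would keep this connection open)
            } catch (Exception e) {
                System.out.println("failed " + address.getHostAddress() + ", trying next");
            }
        }
        System.out.println("all servers down");
    }
}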