Tomcat API : how to improve performance when client are closing connection after each request? - api

I have a simple Tomcat API and my goal is to manage the higher number of req /sec.
My problem is the following :
Scenario 1: When the client is using some persistent connections I manage to reach around 20000 req/sec using a single instance of the API. The server is loaded and the CPU of the server is almost fully used.
Scenario 2: When the client is closing connections after each request, the API only manages 600 req/sec and the server resources are not used at all. So I guess there is a bottleneck either on the global number of connections, either on the number of connections the server is able to manage per second.
What I want to know is if there is a configuration (on tomcat or on the server) that I can change to improve performance during scenario 2.
If not, which kind of resources is limiting? Can I address the problem by deploying many 1 CPU servers?
What I have looked to for the moment :
The number of thread and connection in Tomcat config :
I have adjusted theses number from default to 200 threads and 2000 connections, I don't see any effect during scenario2.
Ulimit is set to unlimited
JVM is configured as follow : JAVA_OPTS: -Xmx8g

It was better if you provide more information about your deployment but generally there are some works that can help you to achieve better performance.
First of all you should measure the cost of each request and optimize it as much as you can. For example, if your API with each request, execute a query on the local database and this query is consuming a lot of CPU usages, you should optimize your query.By doing this your server can tolerate more request before its cpu becomes 100%.
Note some tools like JProbe can help you for optimizing your API.
secondly, monitor your resources during the test and find which one of them becomes fully used. You should check Network Connection, Disk, Memory and CPU loads during the test and identify weakness of your resources. Track thread blocks and deadlocks as they are important to performance.
You can scale-up your server resources based on this information or decide to implement distributed architecture or add a load-balancer to your solution or add a caching strategy for you project.
In your Tomcat configuration there some settings which can improve your performance such as :
Configuring connectors
set maxThreads to a high enough value
set acceptCount to a high enough value
Configuring cache
set cacheMaxSize attribute to the appropriate value.
Configuring content compression
turning content compression on and using GZIP compression

Related

How to better utilize local cache with load balancing strategies?

I have an Authentication service where I need to cache some user information for better performance. I chose to use local cache because Authentication service probably will be called on each request so I want it to be super fast. Compared to remote cache options local cache is a lot faster (local cache access is below 1ms while remote cache access is around 25ms).
The problem is I can not cache as much information as a distributed cache without running out of memory (talking about millions of users). I can either leave it as it is and when local cache reaches the memory limit it would evict some other data but that would be bad optimization of the cache. Or I can use some kind of load balancer strategy where users will be redirected to same Authentication service instances based on their IP address or other criteria thus the cache hits will be a lot higher.
It kind of defeats the purpose of having stateless services however I think I can slightly compromise from this principle in network layer if I want both consistency and availability. And as for Authentication both are crucial for full security (user info always has to be up-to-date and available).
What kind of load balancing techniques out there for solving this kind of problem? Can there be other solutions?
Note: Even though this question is specific to Authentication I think many other services that are frequently accesses and requires speed can benefit a lot from using local caches.
So - to answer the question here - load balancers can handle this with their hashing algorithms.
I'm using Azure a lot so I'm giving Azure Load Balancer as an example:
Configuring the distribution mode
Load balancing algorithm
From the docs:
Hash-based distribution mode
The default distribution mode for Azure
Load Balancer is a five-tuple hash.
The tuple is composed of the:
Source IP
Source port
Destination IP
Destination port
Protocol type
The hash is used to map traffic to the available servers. The
algorithm provides stickiness only within a transport session. Packets
that are in the same session are directed to the same datacenter IP
behind the load-balanced endpoint. When the client starts a new
session from the same source IP, the source port changes and causes
the traffic to go to a different datacenter endpoint.

JMeter gives “The target server failed to respond ” Error

We use Jmeter to do performance testing. I gave 200 threads(200 users). and we have two servers. like sever A, Server B. i tested indivisibly for 200 Users, it works. and we load balancing server Like server C. So request goes to ether server A Or Server B. But if configure my same jmx script(200 thread) with Server C. it gives error below (but it works for 50 users-- no error).
org.apache.http.NoHttpResponseException: The target server failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:95)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:61)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
at org.apache.jmeter.protocol.http.sampler.MeasuringConnectionManager$MeasuredConnection.receiveResponseHeader(MeasuringConnectionManager.java:201)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
If the issue can be reproduced only on higher loads - it's definitely a server (or load balancer) issue so congratulations on finding the first bottleneck.
Now you can investigate the reason and suggest the fixes, the next steps could be:
Inspect application under test / load balancer logs - you can find a clue there
Inspect application under test / load balancer / database / any other middleware configuration. in the majority of cases default configuration is good for development and debugging but you will need to perform some performance tuning before running a prod-like load test
Collect main health metrics on the application under test side (CPU, RAM, Network, Disk, Swap usage, etc.). It might be the case your application simply lacks hardware resources. You can use built-in tools of the operating system(s) or an APM tool or JMeter PerfMon Plugin
Re-run your test with a profiling tool telemetry enabled on the application under test side. This will give you an overview with regards to where the application spends the most time, which are the "heaviest" functions or functions called most frequently so you would know what to optimise.
Make sure that the load balancer equally (or according to the other algorithm) distributes the requests between the backend servers. It might be the case you're hitting only one server, if this is - consider adding the DNS Cache Manager to your Test Plan and re-run your test to see if it helps.

Apache HTTP load balancing based on URL pattern

I have a Apache web server in front of 2 tomcats which are connected to the same MySQL backend database.
I need to load balance the incoming requests between two tomcats based on a URL parameter named "projectid". For example all even project ids may be served with tomcat 1 and odd requests with tomcat 2.
This is required because the user may start jobs in a project of tomcat 1 which tomcat 2 won't be aware of and these jobs are currently not stored in the database.
Is there a way to achieve this using mod-proxy-load-balancing?
I'm not aware of such a load algorithm being already present. However, keep in mind that the most common loadbalancing outcome (especially when you have server-side state as you obviously have) is a sticky session: You're only balancing the initial request. After that, all requests are typically directed to the same server.
I typically recommend against distributing the session data as it adds some commonly unnecessary performance hit onto each request, negating the improved performance that you can get with clustering. This is subject to be changed in actual installations though and just a first rule of thumb.
You might be able to create your own loadbalancing algorithm with mod-proxy-load-balancing (you'll need to configure the algorithm in the config file), but I believe your time is better spent fixing your implementation, or implement business specific logic to check all cluster machines for running jobs.

Weblogic Performance Tunning

We have a problem with Weblogic 10.3.2. We install a standard domain with default parameters. In this domain we only have a managed server and only running a web application on this managed server.
After installation we face performance problems. Sometimes user waits 1-2 minutes for application responses. (Forexample user clicks a button and it takes 1-2 minutes to perform GUI refresh. Its not a complicated task.)
To overcome these performance problems define parameters like;
configuraion->server start->arguments
-Xms4g -Xmx6g -Dweblogic.threadpool.MinPoolSize=100 -Dweblogic.threadpool.MaxPoolSize=500
And also we change the datasource connection pool parameters of the application in the weblogic side as below.
Initial Capacity:50
Maximum Capacity:250
Capacity Increment: 10
Statement Cache Type: LRU
Statement Cache Size: 50
We run Weblogic on 32 GB RAM servers with 16 CPUs. %25 resource of the server machine is dedicated for the Weblogic. But we still have performance problem.
Our target is servicing 300-400 concurrent users avoiding 1-2 minutes waiting time for each application request.
Defining a work manager can solve performance issue?
My datasource or managed bean definition is incorrect?
Can anyone help me?
Thanks for your replies

Apache and the c10k

How is Apache in respect to handling the c10k problem under normal conditions ?
Say while running very small scripts with little data, or do I need to scale out if I use Apache?
In the background heavy lifting is done by a few servers running specialized software that processes the requests but I'd like to use Apache as a front. Is this a viable plan?
I consider Apache to be more of an origin server - running something like mod_php or mod_perl to generate the content and being smart about routing to the appropriate system.
If you are getting thousands of concurrent hits to the front of your site, with a mix of types of data (static and dynamic) being returned, you may find it useful to put a more optimised system in front of it though.
The classic post-optimisation problem with Apache isn't generating the dynamic content (or at least, that can be optimised for early in the process), but simply waiting for a slow client to be able to receive the bytes that are being sent. It can therefore be a significant advantage to put a reverse proxy, in the form of Squid or Nginx, in front of the servers to take over the 'spoon-feeding' of the slow network clients, while allowing the content production to happen at full speed, and at local network speeds - 100Mb/sec or even gigabit speeds - if it even has to traverse a network at all.
I'm assuming you've probably seen this data, but if not, it might give you some idea.
Guys, imagine that you are running web server with 10K connections (simultaneous). How could it be?
You've got many many connections per second
Dynamic content
Are you sure that your CPU can handle that many PHP sessions for example? I guess no, so why are you thinking about C10K problem? :D
Static content - small files
And still soo many connections? On single server? Probably you've got problems with networking/throughput too or you are future competitor of Google. Use lighttpd which addresses C10K problem and is stable - fly light. Using Apache for only static files for large sites is obvious.
Your clients are downloading large files for a large time - static content
ISO images, archives etc
If you are doing it via web server - FTP may be more appropriate.
Video streaming
Use lighttpd or specialized software. And still... What about other resources?
I am using Linux Virtual Server as load balancer in front of apache servers (with specific patches for LVS-NAT) and I am happy :) This string is an answer you want to hear.