I am doing a load test on a WebLogic server.
When I fire a relatively high load at the server, the active thread count stops at 40-50 and never increases again.
When I fire a relatively low load at the server, the active thread count keeps increasing until it reaches the limit.
The Oracle docs say that as throughput increases, the active thread count will increase.
If the load is high, how can the throughput increase when the active thread count never increases?
We are trying to build a low latency system using Java.
As part of fine-tuning, we do an initial warm-up of the application during startup so that hot methods go past the default compile threshold of 10000. We can see that live processing latency immediately after the warm-up looks better, and it stays good while the system is continuously processing events. But when the system goes back to an idle state, with no events coming in for 10 minutes, the latency starts degrading.
I initially thought the JVM might be evicting the generated code from the code cache. I set a larger code cache size and enabled code cache events, and I noticed that the code cache does not even reach 50% of its capacity.
Any idea what could be the issue? Basically, the initial warm-up during startup goes in vain once the system goes idle for a short duration.
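For context, the kind of JVM flags involved here might look like the following. This is only a sketch with illustrative values (and app.jar as a placeholder), not our actual production settings:

# Illustrative HotSpot flags (values are assumptions):
#   -XX:CompileThreshold=10000      the default C2 invocation threshold referred to above
#   -XX:ReservedCodeCacheSize=256m  the enlarged code cache
#   -XX:+PrintCompilation           logs JIT compiles, including "made not entrant" deoptimizations
#   -XX:+PrintCodeCache             prints code cache occupancy when the JVM exits
java -XX:CompileThreshold=10000 -XX:ReservedCodeCacheSize=256m -XX:+PrintCompilation -XX:+PrintCodeCache -jar app.jar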
I configured the capacity scheduler and schedule jobs into specific queues. However, I see that at times jobs in some queues complete quickly while other queues have jobs waiting on previous ones to complete. This creates a scenario where half of my capacity is idle and the other half is busy with jobs waiting to get resources.
Is there any config I can tweak to maximize my utilization? I want to route waiting jobs to other queues where resources are available. Attached is a screenshot -
This seems like an issue with the Capacity Scheduler. I switched to the Fair Scheduler and definitely see a huge improvement in cluster utilization: ~75%, way better than the ~40% I was getting with the Capacity Scheduler.
The reason is that when multiple users submit jobs to the same queue, the queue can consume up to its maximum resources, but a single user can't consume more than the queue's configured capacity even though the maximum capacity is greater than that.
So if you set yarn.scheduler.capacity.root.QUEUE-1.capacity: 20 in capacity-scheduler.xml, one user can't take more than 20% of the resources for queue QUEUE-1 even when your cluster has free resources.
By default this user-limit-factor is set to 1. So if you set it to 2, a single user's jobs can use up to 40% of the resources, provided the queue's maximum capacity is greater than or equal to 40%.
yarn.scheduler.capacity.root.QUEUE-1.user-limit-factor: 2
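In capacity-scheduler.xml this would look roughly like the following (the queue name and values are just the placeholders from the example above):

<!-- Sketch of capacity-scheduler.xml entries; queue name and values are illustrative -->
<property>
  <name>yarn.scheduler.capacity.root.QUEUE-1.capacity</name>
  <value>20</value>   <!-- guaranteed share: 20% of the cluster -->
</property>
<property>
  <name>yarn.scheduler.capacity.root.QUEUE-1.user-limit-factor</name>
  <value>2</value>    <!-- a single user may use up to 2 x capacity = 40% -->
</property>

After editing the file, the queues can usually be refreshed with yarn rmadmin -refreshQueues instead of restarting the ResourceManager.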
Please follow this blog
For this test, I have a simple Java servlet that reads data in and calculates the CRC32 for it. When making serial requests of 512MB each, I get about 600MB/sec. That makes sense since I can't use all 24 cores available to me to calculate a CRC. The program driving this I/O is sitting on the local box to eliminate the possibility of networking issues. I am running Tomcat 8.0.24.0 on FreeBSD using OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode).
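For context, the servlet is essentially of the following shape; this is a minimal sketch (class name and buffer size are illustrative, not the exact code under test):

// Minimal sketch of the servlet described above: stream the request body and fold it into a CRC32.
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class Crc32Servlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[64 * 1024];
        try (InputStream in = req.getInputStream()) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                crc.update(buffer, 0, read); // accumulate the checksum over the request body
            }
        }
        resp.setContentType("text/plain");
        resp.getWriter().println(Long.toHexString(crc.getValue()));
    }
}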
Next, I attempt the same test with 6 concurrent requests, expecting that the performance per request might be lower than 600MB/sec, but that the aggregate performance across all 6 requests would be significantly higher.
What I see is the CPU has some idle time at ALL times (so it doesn't appear that I'm CPU-bound). I also see that all processing threads in Tomcat are running concurrently as anticipated. However, it looks like I'm only getting around 800MB/sec in aggregate. The threads in Tomcat spend most of their time waiting to read from the socket, as shown below.
I would appreciate any thoughts on how to improve Tomcat throughput / why so much time is spent waiting for more data (which I assume is what's going on below).
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch(NioEndpoint.java:1386)
at org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitReadLatch(NioEndpoint.java:1388)
at org.apache.tomcat.util.net.NioBlockingSelector.read(NioBlockingSelector.java:185)
at org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:251)
at org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:232)
at org.apache.coyote.http11.InternalNioInputBuffer.fill(InternalNioInputBuffer.java:133)
at org.apache.coyote.http11.InternalNioInputBuffer$SocketInputBuffer.doRead(InternalNioInputBuffer.java:177)
at org.apache.coyote.http11.filters.IdentityInputFilter.doRead(IdentityInputFilter.java:110)
at org.apache.coyote.http11.AbstractInputBuffer.doRead(AbstractInputBuffer.java:416)
at org.apache.coyote.Request.doRead(Request.java:469)
at org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:342)
at org.apache.tomcat.util.buf.ByteChunk.substract(ByteChunk.java:395)
at org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:367)
at org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:190)
...
During peak hours of our service, I notice the CPU load goes up to 100%. However, when I SSH into the machine and use top or htop, I don't ever see the CPU usage go above 25%. This instance is a dedicated load balancer running HAProxy.
Here is a screenshot of the Dashboard: https://gyazo.com/010715208f81ec97c1bc9b78123fe4d4
Here is a screenshot of top in the instance: https://gyazo.com/afa7098331f7a3f66c018041b9be686d
During these peak times, I noticed some latency, and it is not caused by the database or my other server instances, as I checked their loads and they were far below the threshold. I was wondering if Compute Engine is throttling the instance because it mistakenly thinks it is at 100% CPU usage?
Has anyone had a similar experience?
I am trying to reduce memory usage by Apache on the server.
My current MaxConnectionsPerChild is 10000.
According to the following recommendation, MaxConnectionsPerChild should be reduced to 1000:
http://www.lophost.com/tutorials/how-to-reduce-high-memory-usage-by-apache-httpd-on-a-cpanel-server/
What is the recommended maximum value for MaxConnectionsPerChild in the Apache configuration?
The only time this directive affects anything is when your Apache workers are leaking memory, i.e. memory is allocated (via malloc() or whatever) and never freed. That is the result of design/implementation flaws in Apache or its modules.
This directive is somewhat of a hack, really - but if some module loaded into Apache leaks, say, 8 bytes on every request, then after a lot of requests you'll run out of memory. The quick fix is to kill the process every MaxConnectionsPerChild requests and start a new one.
So this setting will only affect your memory usage if, with MaxConnectionsPerChild set to zero, you see memory gradually increase over the span of lots of requests.
The default is 0 (which means no limit on connections per child), so unless you have a memory leak I'm not aware of any need to change this setting - I agree with Hut8.
Sharing here FYI from the Apache 2.4 Performance Tuning page:
Related to process creation is process death induced by the MaxConnectionsPerChild setting. By default this is 0, which means that there is no limit to the number of connections handled per child. If your configuration currently has this set to some very low number, such as 30, you may want to bump this up significantly. If you are running SunOS or an old version of Solaris, limit this to 10000 or so because of memory leaks.
And from the Apache 2.4 docs on MaxConnectionsPerChild:
Setting MaxConnectionsPerChild to a non-zero value limits the amount of memory that process can consume by (accidental) memory leakage.
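For illustration, the directive goes in the MPM section of the configuration; the module and value below are just an example, not a recommendation:

<IfModule mpm_prefork_module>
    # Recycle each child process after this many connections (0 = never recycle).
    MaxConnectionsPerChild 10000
</IfModule>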