How to take a heap dump for an ActiveMQ pod running inside Amazon MQ - activemq

I have three ActiveMQ 5.16.3 pods (xlarge) running inside Amazon MQ. I am encountering memory issues: the pods are consuming a lot of memory even though the traffic is only comparable to a large instance type, yet they still hit 60-70% heap usage at times.
To debug the issue I need to take a heap dump from Amazon MQ. Any idea how to do that?
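For context on what a heap dump request looks like at the JVM level: on a broker where you have JMX/MBean-server access (which may or may not be the case for the managed Amazon MQ service), the HotSpotDiagnostic MXBean can trigger one. A minimal sketch, with the JMX URL and output path as placeholders:

    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServerConnection;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;
    import com.sun.management.HotSpotDiagnosticMXBean;

    public class HeapDumpViaJmx {
        public static void main(String[] args) throws Exception {
            // Placeholder endpoint; a self-managed ActiveMQ broker exposes JMX
            // when started with the usual com.sun.management.jmxremote options.
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://broker-host:1099/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                HotSpotDiagnosticMXBean diag = ManagementFactory.newPlatformMXBeanProxy(
                        conn, "com.sun.management:type=HotSpotDiagnostic",
                        HotSpotDiagnosticMXBean.class);
                // The .hprof file is written on the broker's filesystem, not locally.
                diag.dumpHeap("/tmp/broker-heap.hprof", true);
            }
        }
    }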

Related

How to debug an AWS Fargate task running out of memory?

I'm running a task on Fargate with CPU set to 2048 and memory set to 8192. After running for some time, the task is stopped with the error
container was stopped as it ran out of memory.
The thing is that the task does not fail every time. If I run the same task 10 times, it fails 5 times and works 5 times. However, if I take an EC2 machine with 2 vCPUs and 4 GB memory and run the same container, it runs successfully (in fact, the memory usage on the EC2 instance is very low).
Can somebody please guide me on how to figure out the memory issue while running a Fargate task?
Thanks
The way to start would be enabling memory metrics from Container Insights for your Fargate tasks and then correlating the memory usage graph with your application logs.
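If you use the AWS SDK for Java v2, Container Insights can be switched on per ECS cluster with an UpdateClusterSettings call; a rough sketch, where the cluster name is a placeholder:

    import software.amazon.awssdk.services.ecs.EcsClient;
    import software.amazon.awssdk.services.ecs.model.ClusterSetting;
    import software.amazon.awssdk.services.ecs.model.SettingName;
    import software.amazon.awssdk.services.ecs.model.UpdateClusterSettingsRequest;

    public class EnableContainerInsights {
        public static void main(String[] args) {
            // Cluster name is a placeholder; this enables the Container Insights
            // memory metrics for tasks running in the cluster.
            try (EcsClient ecs = EcsClient.create()) {
                ecs.updateClusterSettings(UpdateClusterSettingsRequest.builder()
                        .cluster("my-fargate-cluster")
                        .settings(ClusterSetting.builder()
                                .name(SettingName.CONTAINER_INSIGHTS)
                                .value("enabled")
                                .build())
                        .build());
            }
        }
    }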
The difference between running on EC2 vs. Fargate could be due to the fact that when you run a container on ECS Fargate, it runs on AWS's internal EC2 instances. A noisy-neighbour situation could arise there, although the chances are pretty low.

activemq tuning for 20000 threads

I have an ActiveMQ broker to which 20000+ servers connect at the same time over the STOMP port to publish and consume messages. The ActiveMQ server has 8 CPUs and 32 GB memory, and I have set the JVM max heap with -Xmx16384m. Still, when all the servers are connected, the broker gets overloaded: virtual memory usage is about 21 GB and CPU utilization sometimes reaches about 500%.
I am not sure whether the JVM really uses that much or whether some other process on the ActiveMQ host is consuming it. I have tried many tunings with no improvement.
Maybe you should reconsider the architecture. If you really need that many servers, you may want to try a non-blocking messaging bus such as ActiveMQ Artemis. I don't know for sure how many STOMP clients it will support under your setup, but it's worth a try. Keeping that many clients as separate threads has a huge memory footprint, and I think Artemis handles such cases better, though I'm not sure about STOMP specifically.
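To see why the footprint can be dominated by connection threads rather than the heap, here is a back-of-the-envelope estimate; the 1 MB default thread stack size (-Xss) and a thread-per-connection transport are assumptions about this setup, not measurements from it:

    // Rough estimate of native memory reserved for connection-handling threads.
    public class StackMemoryEstimate {
        public static void main(String[] args) {
            long connections = 20_000;      // assumed: one thread per STOMP connection
            long stackBytes = 1L << 20;     // assumed: ~1 MB default -Xss per thread
            double gb = connections * (double) stackBytes / (1L << 30);
            System.out.printf("~%.1f GB reserved for thread stacks alone%n", gb);
        }
    }

Even if only part of that is ever committed, it lives outside the 16 GB heap, which is why tuning -Xmx alone does not bound the process footprint.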

java.lang.OutOfMemoryError: Java heap space on ignite cluster

We are migrating a web application from an ad hoc in-memory cache solution to an Apache Ignite cluster, where the JBoss that runs the webapp works as a client node and two external VMs work as Ignite server nodes.
When testing performance with one client node and one server node, everything goes fine. But when testing with one client node and two server nodes in a cluster, the server nodes crash with an OutOfMemoryError.
The virtual machines of both nodes are started with -server -Xms1024M -Xmx1024M -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+UseTLAB -XX:NewSize=128m -XX:MaxNewSize=128m -XX:MaxTenuringThreshold=0 -XX:SurvivorRatio=1024 -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=60 -XX:MaxGCPauseMillis=1000 -XX:InitiatingHeapOccupancyPercent=50 -XX:+UseCompressedOops -XX:ParallelGCThreads=8 -XX:ConcGCThreads=8 -XX:+DisableExplicitGC
Any idea why a two-node cluster fails when a single-node one works perfectly running the same test?
I don't know if it's relevant, but the test consists of 10 parallel HTTP requests launched against the JBoss server, each of which starts a process that writes several entries into the cache.
Communication between nodes adds some overhead, so apparently 1 GB is not enough for the data plus Ignite itself. Generally 1 GB is not enough; I would recommend allocating at least 2 GB, better 4 GB, per node.
In the end the problem wasn't the amount of memory required by the two nodes, but the synchronization between them. My test cache was running with PRIMARY_SYNC, but write/read cycles were faster than the replication across the cluster and ended up in an inconsistent read that provoked an infinite loop writing endless values to the cluster.
Changing to FULL_SYNC fixed the problem.
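For reference, a minimal sketch of that setting with Ignite's Java API; the cache name and types are placeholders, not the poster's actual configuration:

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheWriteSynchronizationMode;
    import org.apache.ignite.configuration.CacheConfiguration;

    public class FullSyncCacheExample {
        public static void main(String[] args) {
            CacheConfiguration<Integer, String> cfg = new CacheConfiguration<>("testCache");
            // Wait for all copies to be updated before a write completes, instead of
            // acknowledging after only the primary copy (PRIMARY_SYNC).
            cfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);

            try (Ignite ignite = Ignition.start()) {
                ignite.getOrCreateCache(cfg).put(1, "value");
            }
        }
    }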

jax-rs in weblogic high memory

We are currently using JAX-RS 2.0 (Jersey) on WebLogic for hosting RESTful web services. We are observing very high heap utilization in the benchmarks, and it keeps increasing over time. Even after the benchmark is over, the allocated heap memory is not released, even after I hit Perform GC in JConsole. When I analyze the heap dump with MAT, I see that ~99% of the heap is consumed by oracle.j2ee.ws.server.jaxrs.dms.monitoring.internal.DmsApplication. I un-targeted DMS from the managed server, but the behavior is the same.
A little analysis of the dominator tree in the heap dump shows that every request is being tracked by the listener: weblogic.jaxrs.monitoring.JaxRsRequestEventListener is mapped to oracle.j2ee.ws.server.jaxrs.dms.monitoring.DmsApplicationEventListener.
Am I understanding this correctly? Does JAX-RS (Jersey) map to the DMS request event listener internally? How can this be configured correctly so we don't face this memory issue?
I think you need to look at your diagnostic module in WebLogic. Look at watches & notifications.

Using redis with logstash

I'm wondering what the pros and cons are of using Redis as a broker in this infrastructure.
At the moment, all my agents send to a central NXLog server which proxies the requests to Logstash --> ES.
What would I gain by putting a Redis server between my NXLog collector and Logstash? To me it seems pointless, as NXLog already has good memory and disk buffers in case Logstash is down.
What would I gain?
Thank you
Under heavy load, calling ES (HTTP) directly can be dangerous, and you can have problems if ES breaks down.
Redis can handle more (much more) write requests and forward them asynchronously to ES (HTTP).
I started using Redis because I felt that it would separate the input and the filter part, at least during periods in which I change the configuration a lot.
As you know, if you change the Logstash configuration you have to restart it, and all clients (in my case via syslog) are doomed to reconnect to the Logstash daemon once it is back in business.
By putting an indexer in front which holds the relatively static input configuration and pushes everything to Redis, I am able to restart Logstash without causing hiccups throughout the datacenter.
I did encounter some issues, because our developers hadn't found time (yet) to reduce the amount of useless logs sent to syslog, thus overflowing the server. Before we had Logstash they overflowed the disk space for logs - a more general issue though... :)
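As a rough sketch of that pattern: Logstash's redis input is commonly pointed at a Redis list, and any lightweight shipper can push JSON events onto it. The host, port and list key below (and the use of the Jedis client) are illustrative assumptions, not the poster's setup:

    import redis.clients.jedis.Jedis;

    public class RedisLogShipper {
        public static void main(String[] args) {
            // Placeholders: the Logstash redis input would be configured with the
            // same key and data_type => "list" on the other side of the queue.
            try (Jedis jedis = new Jedis("redis.example.internal", 6379)) {
                String event = "{\"message\":\"user login\",\"host\":\"web-01\"}";
                // RPUSH appends the event; Logstash pops from the other end, so
                // Redis buffers the stream while Logstash is down or restarting.
                jedis.rpush("logstash", event);
            }
        }
    }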
When used with Logstash, Redis acts as a message queue. You can have multiple writers and multiple readers.
Using Redis (or any other queueing service) allows you to scale Logstash horizontally by adding more servers to the 'cluster'. This will not matter for small operations but can be extremely useful for larger installations.
When using Logstash with Redis, you can configure Redis to keep all the log entries in memory only, which makes it behave like an in-memory queue (like memcached).
You may reach a point where the volume of logs sent cannot be processed by Logstash, and that can bring your system down on a regular basis (observed in our environment).
If you feel Redis is an overhead for your disk, you can configure it to keep all the logs in memory until they are processed by Logstash.
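One way to do that at runtime is to disable RDB snapshots and the append-only file; a sketch with the Jedis client, where the connection details are placeholders (the equivalent settings can also be made permanent in redis.conf):

    import redis.clients.jedis.Jedis;

    public class DisableRedisPersistence {
        public static void main(String[] args) {
            // Placeholders for host/port. An empty "save" value disables RDB
            // snapshots and "appendonly no" disables the AOF log, so queued log
            // entries live only in memory until Logstash consumes them.
            try (Jedis jedis = new Jedis("redis.example.internal", 6379)) {
                jedis.configSet("save", "");
                jedis.configSet("appendonly", "no");
            }
        }
    }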
As we built our ELK infrastructure, we originally had a lot of problems with the Logstash indexer (reading from Redis). Redis would back up and eventually die. I believe this was because, in the hope of not losing log files, Redis was configured to persist the cache to disk once in a while. When the queue got "too large" (but still within available disk space), Redis would die, taking all of the cached entries with it.
If this is the best Redis can do, I wouldn't recommend it.
Fortunately, we were able to resolve the issues with the indexer, which typically kept the Redis queue empty. We set our monitoring to alert quickly when the queue did back up, and that was a good sign that the indexer was unhappy again.
Hope that helps.