jProfiler long monitoring sessions

I'm using jProfiler to monitor the activity and performance of a server in production. I've configured it for remote monitoring and it works fine, but for long monitoring sessions (I would like to do 24-hour monitoring) the amount of memory used by jProfiler keeps growing too much.
I would like to know if there is any better way to do 24h monitoring.
Thanks in advance.

Related

Google Compute Engine VM constantly crashes

On a Compute Engine VM in us-west1-b, I run 16 vCPUs at close to 99% usage. After a few hours, the VM crashes on its own. This is not a one-time incident, and I have to restart the VM manually each time.
There are a few instances of CPU usage suddenly dropping to around 30%, then bouncing back to 99%.
There are no logs for the VM at the time of the crash. Is there any other way to get the error logs?
How do I prevent VMs from crashing?
[CPU usage graph]
This could be your process manager deciding that your processes have run out of resources. You might want to look into kernel tuning, where you can increase the limits on the number of active processes on your VM/OS and the resources they may use, or you can try a bigger machine with more physical resources. In short, your machine is falling short on resources, so in order to keep the OS up the process manager shuts down processes. SSH is one of those processes, and once you reset the machine everything comes back to normal.
How the process manager/kernel decides to kill a process varies. It could simply be that a process has stayed up long enough to consume too many resources. Also note that the OS images you use to create a VM on GCP are custom-hardened by Google to limit the malicious capabilities of processes running on such machines.
One of the best ways to tackle this is:
increase the resources of your VM
then go back to your code and find out whether something is leaking processes or memory
if all else fails, you might want to do some kernel tuning to make sure your processes have higher priority than other system processes. This is risky, though, since you could end up creating a zombie VM.
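If you want to confirm that it is the kernel's OOM killer (rather than your own code) ending processes, the kill records usually land in the kernel log. Below is a minimal sketch, assuming a Linux guest where dmesg is readable (some hardened images require sudo); the exact message wording varies between kernel versions:

```python
# Scan the kernel ring buffer for OOM-killer activity (run on the VM itself).
import subprocess

def find_oom_kills():
    # "dmesg" prints the kernel ring buffer; it may need root on hardened images.
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return [
        line for line in result.stdout.splitlines()
        if "out of memory" in line.lower() or "oom-kill" in line.lower()
    ]

if __name__ == "__main__":
    for line in find_oom_kills():
        print(line)
```

The same records usually also survive in /var/log/kern.log or /var/log/messages, and the VM's serial console output in the GCP console can show them even when SSH is already unreachable.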

RabbitMQ delayed exchange plugin loads and resources

We are using RabbitMQ (3.6.6) to send analyses (millions of them) to different analyzers. These are very quick, and we were planning to use the delayed-message plugin to schedule monitoring checks on the analyzed elements.
We were thinking about the rabbitmq-delayed-message-exchange plugin; we have already run some tests and need some clarification.
Currently:
We are scheduling millions of messages
Delays range from a few minutes to 24 hours
As previously said, these are tests, so we are using a machine with one core and 4 GB of RAM, which also has other apps running on it.
What happened with the high memory watermark set to 2.0 GB:
RabbitMQ eventually (after a day or so) starts consuming 100% of the single core and stops responding to the management interface and rabbitmqctl. This goes on for at least 18 hours (we always end up killing it, deleting the delayed exchange's Mnesia file on disk - about 100-200 MB - and restarting).
What happened with the high memory watermark set to 3.6 GB:
RabbitMQ was killed by the kernel because of high memory usage (4 GB of physical memory) about a week after running like this.
The Mnesia file for the delayed exchange is about 1.5 GB.
RabbitMQ cannot start anymore, giving the trace below (we assume that, because it was terminated with a KILL, the delayed messages somehow ended up corrupted):
{could_not_start,rabbit,
rabbitmq-server[12889]: {{case_clause,{timeout,['rabbit_delayed_messagerabbit@rabbitNode']}},
rabbitmq-server[12889]: [{rabbit_boot_steps,'-run_step/2-lc$^1/1-1-',1,
And right now we are asking ourselves: are we in a little over our heads using the delayed exchange plugin for this volume of information? If we are, then end of the problem, we rethink and start over; but if not, what would be an appropriate hardware and/or configuration setup?
RabbitMQ delayed exchange plugin is not properly designed to store millions of messages.
It is also documented on the plugin page:
"Current design of this plugin doesn't really fit scenarios with a high number of delayed messages (e.g. 100s of thousands or millions). See #72 for details."
Read also here: https://github.com/rabbitmq/rabbitmq-delayed-message-exchange/issues/72
This plugin is often used as if RabbitMQ was a database. It is not.
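For reference, here is a minimal sketch of how the delayed exchange is typically declared and published to, using the Python pika client; the exchange, queue, routing key and payload below are made-up placeholders. Each delayed message carries its own x-delay header and is held in the plugin's Mnesia store until the delay expires, which is why millions of long delays accumulate on disk and in memory:

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# The plugin adds the custom "x-delayed-message" exchange type; the real
# routing behaviour is chosen via the "x-delayed-type" argument.
channel.exchange_declare(
    exchange="monitoring.delayed",
    exchange_type="x-delayed-message",
    durable=True,
    arguments={"x-delayed-type": "direct"},
)

channel.queue_declare(queue="monitoring.checks", durable=True)
channel.queue_bind(
    queue="monitoring.checks",
    exchange="monitoring.delayed",
    routing_key="check",
)

# The per-message "x-delay" header is in milliseconds; the message sits in
# the plugin's store until it expires and is only then routed to the queue.
channel.basic_publish(
    exchange="monitoring.delayed",
    routing_key="check",
    body=b'{"element_id": 42}',
    properties=pika.BasicProperties(
        delivery_mode=2,  # persistent
        headers={"x-delay": 24 * 60 * 60 * 1000},  # 24 hours
    ),
)

connection.close()
```

Nothing in the publishing code changes with scale; the cost is entirely in the plugin's message store, which is the limitation the linked issue describes.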

Is WAMP publish/subscribe battery efficient?

I am writing a client-side desktop app that will need to receive updates from a server. These updates would be few and far between (possibly one a week), but I would like them to be received as quickly as possible.
Is it hard on the battery to "subscribe" to the topic that will provide the updates through WAMP and let the app run in the background continuously? Or would it be more efficient to periodically poll the server using a REST-based API?
WAMP requires a persistent connection, so you have to deal with the battery drain that comes with it. The only way to find out how much this costs is to test it on the system you'll be running the app on. Then you can weigh the actual trade-offs against a polling solution.
Subscribing itself has no implications for energy consumption. However, there are implications in keeping a connection open for that long for so few updates. I think you should reconsider your use of WAMP as the communication protocol.
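For a sense of what the subscription side looks like, here is a minimal sketch using Autobahn|Python as the WAMP client; the router URL, realm and topic name are placeholders. The subscription itself is a single registration message, so the battery cost both answers point at comes from the WebSocket connection that has to stay open underneath it:

```python
# Minimal Autobahn|Python (asyncio) subscriber; URL, realm and topic are placeholders.
from autobahn.asyncio.wamp import ApplicationSession, ApplicationRunner

class UpdateListener(ApplicationSession):
    async def onJoin(self, details):
        def on_update(payload):
            # Called whenever the server publishes to the topic.
            print("update received:", payload)

        # Registers interest with the router; the underlying WebSocket
        # stays open for as long as the app is running.
        await self.subscribe(on_update, "com.example.app.updates")

if __name__ == "__main__":
    runner = ApplicationRunner("ws://updates.example.com/ws", "realm1")
    runner.run(UpdateListener)  # blocks, keeping the persistent connection alive
```

By contrast, a periodic REST poll only wakes the network stack at the polling interval, which is why polling can come out ahead when updates arrive roughly once a week.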

For shipping logs from app server, which to use Logstash forwarder, FLume or Fluentd?

logstash-forwarder is light, but from logstash-forwarder to Logstash there is latency over the network (if I am running logstash-forwarder on one machine and sending logs to Logstash on another machine).
Flume/Flume NG: CPU utilisation is high for the same amount of data (for example, around 20 percent for 2 MB).
Fluentd: doesn't use Java, it's based on CRuby, but its CPU utilisation also peaks at around 30 percent.
For our use case we do not want to add significant load to the production boxes just to forward logs, and if I use Logstash I will be introducing a new single point of failure, so I am unsure which one to choose.
Interesting performance statistics.
From my experience, logstash-forwarder is fairly lightweight, and its encryption/compression is very helpful. This might indeed cause some latency. Is that an important factor for you? I would guess the latency is under 2-3 seconds, and in many log-management use cases real time is not a strong requirement.
At the end of the day, all these agents need to collect data from apps/files, package it and ship it over the network. That takes some cycles, but in most cases it is 2%-4% of the resources a normal server has.
Have a look at rsyslog, which has many configuration options for how often it batches and forwards logs. You can run it in a Docker container and limit its resources more strictly, with rsyslog or any of the other agents (https://hub.docker.com/r/logzio/logzio-rsyslog-shipper/).
Another option is to post logs directly from your app server with bulk HTTP POSTs by writing your own code. This is something most open-source stacks like ELK can ingest, and it is something we recommend at http://logz.io.
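As a rough illustration of that last option, the sketch below batches events into a single Elasticsearch _bulk request with the Python requests package. The endpoint, index name and event fields are placeholders, and older Elasticsearch versions also expect a _type in the action line, so adjust it to your cluster:

```python
import json
import requests

ES_BULK_URL = "http://elk.example.com:9200/_bulk"  # placeholder endpoint

def ship_logs(events):
    """Send a batch of log events in one Elasticsearch bulk request."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": "app-logs"}}))  # action line
        lines.append(json.dumps(event))                              # document line
    body = "\n".join(lines) + "\n"  # the bulk API requires a trailing newline
    resp = requests.post(
        ES_BULK_URL,
        data=body,
        headers={"Content-Type": "application/x-ndjson"},
        timeout=10,
    )
    resp.raise_for_status()

ship_logs([
    {"@timestamp": "2016-01-01T00:00:00Z", "level": "INFO", "message": "started"},
    {"@timestamp": "2016-01-01T00:00:01Z", "level": "WARN", "message": "slow query"},
])
```

Batching like this keeps the per-event overhead on the app server down to a JSON dump plus an occasional HTTP round trip.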

Known concurrency issues in WebLogic 10?

Recently, our production WebLogic has been taking too much time to process queues. Besides investigating the queues, DB queries and other things, I thought I would look into any known memory and concurrency issues in WebLogic.
Does anyone know of any?
Summary of the problem:
We had two queues and 8-9 clusters. One of the queues was down for some reason, and the other queue started to pile up; WebLogic took forever to process it. DB I/O increased, and so did CPU consumption.
We had a similar production issue recently.
Check whether Flow Control is set at the connection factory level. With this setting, WebLogic can throttle message producers when it sees that the queue is being overloaded.
WebLogic's checklist of things to do when you have a large message backlog is also worth comparing against your own scenario.
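If it helps, flow control can also be set from a WLST (Jython) script rather than the admin console. Treat the sketch below as illustrative only: the admin URL, credentials, JMS module name (MyJmsModule) and connection factory name (MyCF) are placeholders, and the exact MBean path depends on how your JMS resources are laid out, so verify it with ls() or the console before running anything:

```python
# WLST sketch: enable flow control on a JMS connection factory.
# All names and credentials below are placeholders.
connect('weblogic', 'welcome1', 't3://adminhost:7001')

edit()
startEdit()

# Descriptor path for a connection factory inside a JMS system module;
# adjust it to match your own module and factory names.
cd('/JMSSystemResources/MyJmsModule/JMSResource/MyJmsModule'
   '/ConnectionFactories/MyCF/FlowControlParams/MyCF')

cmo.setFlowControlEnabled(true)  # let WebLogic throttle producers
cmo.setFlowMaximum(500)          # max messages/sec once throttling kicks in
cmo.setFlowMinimum(50)           # floor the producer is slowed down to
cmo.setFlowInterval(60)          # seconds to adjust between maximum and minimum
cmo.setFlowSteps(10)             # number of throttling steps

save()
activate()
disconnect()
```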