My EC2 instance is consuming 100% CPU after installing RabbitMQ.
Also, at times two instances of erl.exe are visible in Task Manager.
I have googled this issue but could not find resolution for it.
I have just started with RabbitMQ and do not know much about what to check. Please let me know what I should look for or how to debug this.
Any assistance will be of great help.
Memory details :
Our Carbon daemon (in Graphite) takes up no more than 9% CPU on a 2-core machine. However, our Graphite webapp has shot the HTTPD usage high recently to about 95%. Out of this, we have noted that the process "wsgi:graphite" is taking up as much as 93% CPU.
Has anyone come across this problem? What is the solution? We have a lot of monitoring scripts querying Graphite via the Graphite URL/Render API. This will of course increase Graphite's HTTPD activity, but we haven't made any drastic changes.
I would appreciate your comments.
I'm using WebSphere Liberty Profile v8.5.5.0 and Worklight 6.2.
The full version of my WL and runtime is:
Server version: 6.2.0.00.20140922-2259
Project WAR version: 6.2.0.00.20140922-2259
I've noticed that sometimes I have trouble getting into the worklightconsole: the server takes too long to respond, and most of the time the request just times out.
Regarding the JVM heap, it's at 60-70% of the total heap, most likely around 1.5 GB.
In the FFDC logs, I sometimes get an error along the lines of:
FFDC Incident has been created: "javax.naming.ServiceUnavailableException: ldap.example.com:389; socket closed; remaining name 'o=example' com.ibm.ws.wim.adapter.ldap.LdapConnection 1670" at ffdc.log
My LDAP is connected to this WebSphere server via VPN, and I know that WebSphere has historically had trouble dealing with LDAP.
However I don't see any more errors on the logs; the machine eventually recovers and is able to work correctly, but for some time is 'down'.
If I enable tracing, the verbosity overwhelms the machine and I can't even start the worklightconsole, nor continue to work with Worklight (for example, calling an adapter from an application).
One more thing: it seems that this happens more frequently after updating existing application versions or adapters. Does this ring a bell with anyone?
If I request a restart when the machine is sluggish, stopping WebSphere takes quite some time, but it eventually stops normally, and when I start it again everything is fine right off the bat.
Before asking for a PMR, I would like to know if there is something else I could do to troubleshoot this problem.
Thanks in advance.
My initial "smell" of the problem is that sometimes your VPN connection with LDAP is very slow or your LDAP server is taking too long to respond.
My suggestion is that you try WAIT (wait.ibm.com), a non-invasive, easy-to-use diagnostic tool, to investigate further. If you find that the call to LDAP is hanging, then I suggest you try tuning the Liberty LDAP cache; this should help.
I want to monitor JVM performance in my production environment. I have installed only the JRE, not the JDK, so I can't use jstat, jconsole, etc. to monitor JVM performance.
Can somebody please help me understand how I can monitor JVM performance in this scenario?
Is there any way to achieve this?
(Please note that I don't want to monitor it remotely through JMX or something else. I would like to install a local agent on each machine which will send the metrics to a server at 1-minute intervals.)
Thanks,
KS
If you manage to get JMX up and running on your VM (from the comment), you can then use jmxterm or jmxfetch to push these JMX metrics into a metrics system (like graphite or Datadog).
If you have enough patience and time to write, you can probably have a look at JVMTI. You can write your code in C/C++ and run it along your Java Process and you can gather information about the JVM without affecting it.
Another simple, naive approach is to start your VM with a javaagent written in Java, but JVMTI is even better than that. The most crucial difference between a javaagent and a JVMTI agent comes from how they are loaded. Java agents are loaded inside the heap and are governed by the same JVM, whereas JVMTI agents are not governed by the JVM's rules and are thus not affected by JVM internals such as the GC or runtime error handling.
You can even give Java Mission Control a try if you're using JDK7 or above :)
Jolokia is a java agent you can use to expose JMX as http. Run jmx2graphite and get those metrics into Graphite. The link includes instructions on Graphite installation (10 minutes)
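To make the Jolokia-to-Graphite idea a bit more concrete, here is a minimal sketch (not jmx2graphite's actual code) of what a local agent could do once a minute: parse a Jolokia "read" response and turn it into Graphite plaintext lines. The sample response, the function name jolokia_to_graphite and the metric prefix jvm.host1.heap are all made up for illustration.

```python
import json
import time

# Hypothetical sample of a Jolokia "read" response for the
# java.lang:type=Memory MBean, attribute HeapMemoryUsage.
SAMPLE_RESPONSE = """
{
  "request": {"mbean": "java.lang:type=Memory",
              "attribute": "HeapMemoryUsage", "type": "read"},
  "value": {"init": 268435456, "committed": 514850816,
            "max": 1908932608, "used": 123456789},
  "timestamp": 1420070400,
  "status": 200
}
"""


def jolokia_to_graphite(response_text, prefix):
    """Turn one Jolokia read response into Graphite plaintext lines.

    Each line has the form "<metric path> <value> <unix timestamp>".
    `prefix` is an illustrative metric path chosen by the caller.
    """
    doc = json.loads(response_text)
    if doc.get("status") != 200:
        raise ValueError("Jolokia reported status %r" % doc.get("status"))
    # Fall back to the local clock if the response carries no timestamp.
    ts = int(doc.get("timestamp", time.time()))
    value = doc["value"]
    # Composite attributes arrive as a dict, scalar ones as a plain number.
    metrics = value if isinstance(value, dict) else {"value": value}
    return ["%s.%s %s %d" % (prefix, name, metrics[name], ts)
            for name in sorted(metrics)]


if __name__ == "__main__":
    for line in jolokia_to_graphite(SAMPLE_RESPONSE, "jvm.host1.heap"):
        print(line)
```

In a real agent you would fetch the response from the local Jolokia endpoint instead of using a canned string, and write the resulting lines (newline-terminated) to Carbon's plaintext listener; the host, port and polling loop are left out here because they depend on your setup.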
I'm running ActiveMQ (a very recent version) on Linux Mint 15 using Oracle Java 1.7. I've only enabled a single transport, "mqtt+nio+ssl". It boots up fine, SSL is all working, easy!
However, when I make a (mqtt) connection from the same host (different java process), the activemq process starts to consume a whole core. It keeps the core at 100% until I stop it (it stops normally). This sounds like abnormal behaviour to me, but when I turned on debug logging I got nothing that seemed to suggest massive CPU consumption.
Has anyone else seen or resolved this problem?
Can anyone suggest how I should go about analyzing this problem?
Many Thanks!
Obviously this is some sort of bug in ActiveMQ. There's been a lot of work done on the MQTT and AMQP side for the upcoming release of v5.9.0. You can download snapshot builds or the release candidate of 5.9 and test whether it still does this. If it does, you need to create an issue in the Jira tracker so the team can work on it, preferably with a test case to reproduce it.
Many times, I get:
-Frozen, load goes to 5.0. Can't use my box.
-Just doesn't work.
Follow these steps:
1. rabbitmq-plugins enable rabbitmq_management
2. service rabbitmq-server restart
3. Browse to http://rabbitmq-server-ip:15672
4. Log in with
username: guest
password: guest
Don't forget to change your password later.
As sheki notes, rabbitmqctl is your first port of call for diagnostics, and a good base to build monitoring on, but because it is a manual command-line tool it's not suitable for actual monitoring on its own.
I've found DataDog very good to monitor both the MQ details, plus the host platform in parallel. e.g. you can watch the queue levels and set alerts on queues backing-up, while also watching the CPU/memory/IO inflicted by these queue levels. It really helps to get ratios of resource usage, and the alerts are good. Having a uniform platform for both infrastructure and application level monitoring is surprisingly rare, but speeds up diagnoses of production issues hugely.
NewRelic is similar and also has a RabbitMQ plugin. Although I've not used this plugin specifically, I've used NR for years and found it invaluable in diagnosing operational issues.
AppDynamics is another example. Similarly this allows you to drill down into your app from a high-level dashboard, and visually navigate from problems to causes. It's especially good with visualising the network of a distributed application across various services/servers. I've used this, for example, to find complex problems in .NET applications and SQL Server clusters using 3rd party Web Services (e.g. latency and its consequences to your app over chatty protocols). These things are very difficult to diagnose, especially for developers who are limited to checking their code. Diagnosing operational issues requires a much broader picture.
I gave up trying to even install and configure Nagios. I know it's the 'best' but it's the best of an old breed of self-configured beasts which we don't have time to manage. I didn't even get it going... and eventually turned to the more 'modern' cloud approach. Once you get over the trust factor, it's pretty liberating.
I'm using these APM platforms together* to aggregate data from:
Windows O/S level Event Logs/Services
Linux O/S level
AWS console level
RDS, EC2
Apache
MySQL
App integrations / custom NR plugins I've written
RabbitMQ
*NewRelic can feed into Datadog! So if you are already using NR you don't need to install DD on those hosts as well.
Being able to view all these levels together gives you a view on the publishers, middleware, MQ servers, workers and front-end app - all in one dashboard.
I would highly recommend an approach like this, because just looking at one server alone leads you to a lot of head-scratching. Seeing an entire stack in one customisable dashboard is just so illuminating it takes most of the guesswork out of it.
Worried about installing these things? I found New Relic to be especially light-weight and unobtrusive. AppDynamics seemed to stress the host a bit more, but mostly that's because you had to run the visualisation tools on the host! (this may have changed). DataDog seems performant, but creates a lot of control panels/icons on the target host (perhaps just a visual impression).
This is an answer to a four-year-old question. It probably wasn't available in 2011, but in 2015 these once 'startup' style APM services cost just tens or hundreds of dollars a month for an unbelievably rich enterprise-level solution.
There are a bunch of RabbitMQ monitoring plugins available for different monitoring systems like Nagios, Zabbix, etc.
Look at http://www.rabbitmq.com/how.html#management
Using rabbitmqctl is the most straightforward way to check the status of the node.
$ rabbitmqctl status
This should tell you the status of the RabbitMQ node.
If you have PRTG (or any probe system with a HTTP sensor check), you can check the server status described at the following page:
https://blog.cdemi.io/monitoring-rabbitmq-in-prtg/
In particular, you have to:
Enable Management Plugin
The rabbitmq-management plugin provides an HTTP-based API for management and monitoring of your RabbitMQ server, along with a browser-based UI and a command line tool, rabbitmqadmin. The management plugin is included in the RabbitMQ distribution. To enable it, we need to run: rabbitmq-plugins enable rabbitmq_management on the RabbitMQ nodes. For more details on the Management plugin refer to RabbitMQ Documentation.
The web UI is located at: http://server-name:15672/ The HTTP API and its documentation are both located at: http://server-name:15672/api/
Once done, you can check the overview of your server with the API:
http://server-name:15672/api/overview
This returns a JSON document with details about the server, active connections, queues, etc.
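As a rough illustration of what a probe could do with that JSON, here is a minimal sketch that pulls a few numbers out of an overview document and flags a backlog. The sample document is a hypothetical, heavily trimmed response; the field names (queue_totals, object_totals, messages_ready) are what I'd expect from the management API, so check them against your own server's output. The function name summarize_overview and the max_ready threshold are invented for the example.

```python
import json

# Hypothetical, heavily trimmed sample of an /api/overview response.
SAMPLE_OVERVIEW = """
{
  "rabbitmq_version": "3.5.0",
  "node": "rabbit@server-name",
  "queue_totals": {"messages": 120, "messages_ready": 100,
                   "messages_unacknowledged": 20},
  "object_totals": {"connections": 4, "channels": 6, "queues": 3,
                    "consumers": 2, "exchanges": 8}
}
"""


def summarize_overview(text, max_ready=1000):
    """Extract a few health indicators from an overview JSON document.

    `max_ready` is an arbitrary illustrative threshold: the summary marks
    a backlog when more messages than this are waiting for delivery.
    """
    doc = json.loads(text)
    # Use .get() with defaults so a missing section does not crash the probe.
    queue_totals = doc.get("queue_totals", {})
    object_totals = doc.get("object_totals", {})
    ready = queue_totals.get("messages_ready", 0)
    return {
        "node": doc.get("node"),
        "connections": object_totals.get("connections", 0),
        "queues": object_totals.get("queues", 0),
        "messages_ready": ready,
        "backlog": ready > max_ready,
    }


if __name__ == "__main__":
    print(summarize_overview(SAMPLE_OVERVIEW, max_ready=50))
```

The parsing is kept separate from fetching on purpose: you can feed it the body returned by whatever HTTP sensor or client you already use (the API requires the management user's credentials), and test the thresholds offline.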
This command will help you: service rabbitmq-server status
Or try service rabbitmq-server stop and service rabbitmq-server start, then service rabbitmq-server status.