We are currently trying to use Prometheus/Grafana to monitor several RabbitMQ instances deployed in multiple Docker containers.
My question is quite simple: what is the difference between using the RabbitMQ Prometheus exporter and the Prometheus plugin for RabbitMQ?
Does the exporter scrape different or more information than the plugin?
Is there an overhead when using the plugin compared to the exporter?
Is it just a question of RabbitMQ's version?
What is the added value of using one of the two options?
So basically, which approach is better, or can they be used in combination?
I have not tried out the plugin, but as far as I have read, it exports the same metrics as the exporter. The plugin has the advantage that it does not add complexity. With the exporter:
You need to host the RabbitMQ exporter (which is not much effort, but you still need to make sure it runs, is updated from time to time, ...)
You need an account for the exporter that can query the metrics, which is a security risk. The credentials might get stolen or the exporter might get compromised, and an attacker would then have access to your RabbitMQ cluster.
Since there may be a network hop between your RabbitMQ cluster and the exporter, the exporter may be unable to reach the cluster in situations where the plugin could still produce metrics.
These are not big issues; we have used the exporter for years and never had a problem with it. Still, if we were starting from scratch, we would give the plugin a try.
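If you want to check for yourself whether the two options expose the same metrics, one approach is to save the /metrics output of each and compare the metric names. The following is only a rough sketch: it assumes both outputs use the Prometheus text exposition format, the function names are made up for this example, and the sample lines are invented rather than real RabbitMQ output.

```python
import re

# A metric name in the text exposition format starts the line and may be
# followed by a {label="..."} block and then the sample value.
_METRIC_NAME = re.compile(r"^[a-zA-Z_:][a-zA-Z0-9_:]*")


def metric_names(exposition_text):
    """Return the set of metric names found in a /metrics response body."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _METRIC_NAME.match(line)
        if match:
            names.add(match.group(0))
    return names


def compare(plugin_text, exporter_text):
    """Group metric names by which of the two sources exposes them."""
    plugin = metric_names(plugin_text)
    exporter = metric_names(exporter_text)
    return {
        "only_plugin": sorted(plugin - exporter),
        "only_exporter": sorted(exporter - plugin),
        "both": sorted(plugin & exporter),
    }


if __name__ == "__main__":
    # Invented sample data, only to show the shape of the input.
    plugin_sample = 'example_connections 4\nexample_queue_messages{queue="a"} 10\n'
    exporter_sample = '# TYPE example_up gauge\nexample_up 1\nexample_connections 4\n'
    print(compare(plugin_sample, exporter_sample))
```

In practice you would fetch both bodies with curl or a small HTTP client and pass them to compare(); the host names and ports depend on how your containers are set up.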
We are running one of our services in a newly created Kubernetes cluster. Because of that, we have now switched it from the previous "in-memory" cache to a Redis cache.
Preliminary tests on our application, which exposes an API, show that we experience timeouts from our application to the Redis cache. I have no idea why, and the issue pops up very irregularly.
So I'm thinking maybe the reason for these timeouts is actually network related. Is it a good idea to add affinity so we always run the Redis cache on the same nodes as the application to prevent network issues?
The issues have not arisen during "very high load" situations so it's concerning me a bit.
This is an opinion question so I'll answer in an opinionated way:
Like you mentioned, I would try to put the Redis and application pods on the same node; that would rule out physical network issues. You can accomplish that with Kubernetes pod affinity. You can also try nodeSelector, which lets you pin your Redis and application pods to a specific node.
Another way to do this is to taint your nodes where you want to run your workloads and then add a toleration to the Redis and your application pods.
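As a rough illustration of the pod affinity option, here is a small Python sketch that builds the affinity section you would merge into the Redis pod spec. The helper name and the app=my-api label are made up for the example; the field names follow the Kubernetes pod spec, and kubernetes.io/hostname is the standard per-node label, so using it as the topology key means "same node".

```python
import json


def colocate_with(label_key, label_value, required=True):
    """Build an 'affinity' fragment that schedules a pod onto a node
    already running pods carrying the given label."""
    term = {
        "labelSelector": {"matchLabels": {label_key: label_value}},
        # One topology domain per node, so matching pods share a node.
        "topologyKey": "kubernetes.io/hostname",
    }
    if required:
        # Hard rule: the pod stays Pending if no matching node exists.
        pod_affinity = {
            "requiredDuringSchedulingIgnoredDuringExecution": [term]
        }
    else:
        # Soft rule: the scheduler prefers co-location but may fall back.
        pod_affinity = {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {"weight": 100, "podAffinityTerm": term}
            ]
        }
    return {"affinity": {"podAffinity": pod_affinity}}


if __name__ == "__main__":
    # Hypothetical label on the application pods.
    print(json.dumps(colocate_with("app", "my-api"), indent=2))
```

The printed JSON can be placed under spec.template.spec of the Redis Deployment or StatefulSet. I would start with the soft (preferred) variant, since a hard rule can leave Redis unschedulable when the application's node is full.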
Hope it helps!
We recently started using Prometheus in our production environment. Previously we had only 30-40 nodes per service, and those servers did not change very often, so we just listed them in prometheus.yml. Now the list has become too long to keep in one file and changes much more frequently than before. So my question is: should I use file_sd_config to move the server lists out of the yml file and update those files separately, or use Consul for service discovery (which seems much easier for handling changes)?
I have installed a 3-node Consul cluster in the data center. As far as I can see, if I switch to Consul to solve this problem, I also need to install a Consul client on each server (node) and define its service info. Is that correct? Or does anyone have good advice?
Thanks
I totally advocate the use of a service discovery system. It may be a bit hard to deploy at first, but it will surely be worth it in the future.
That said, Prometheus comes with a lot of service discovery integrations. It's possible that you don't need a Consul cluster. If your servers are with a cloud provider like AWS, GCP, Azure, OpenStack, etc., Prometheus is able to autodiscover the instances.
If you stay with Consul, the answer is yes, the agent must be running on every node. You can also register services and nodes via the API, but it's easier to deploy the agent.
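If you go with file_sd_config in the meantime, the target files are easy to generate from whatever inventory you already have. Below is a minimal sketch, assuming a plain dict of service name to host:port list as input; the function names and the "service" label are my own choices, not anything Prometheus requires. The output is the JSON shape file_sd expects: a list of groups, each with "targets" and optional "labels".

```python
import json
import os
import tempfile


def build_target_groups(services):
    """Turn {service_name: [host:port, ...]} into file_sd target groups."""
    groups = []
    for service, hosts in sorted(services.items()):
        groups.append({
            "targets": sorted(hosts),
            # "service" is an arbitrary label name chosen for this example.
            "labels": {"service": service},
        })
    return groups


def write_targets_file(path, services):
    """Write the target file atomically so Prometheus never reads a
    half-written file when it picks up the change."""
    data = json.dumps(build_target_groups(services), indent=2)
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so os.replace is a rename;
    # its .tmp suffix keeps it out of a "*.json" files pattern.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(data)
    os.replace(tmp_path, path)


if __name__ == "__main__":
    # Invented inventory, only to show the output shape.
    inventory = {"web": ["10.0.0.1:9100", "10.0.0.2:9100"]}
    print(json.dumps(build_target_groups(inventory), indent=2))
```

Point the files list of your file_sd_configs entry at the generated file (or a glob over a directory of them). Prometheus re-reads these files when they change, so no reload is needed after regenerating them.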
We want to run a large online application on k8s. The scale is roughly 500,000 daily active users.
The application inside k8s needs a messaging feature (Pub/Sub), and there are these options:
Kafka
RabbitMQ
Redis
Kafka
It needs ZooKeeper, and it is best run directly on an OS because it depends on disk I/O. So if it is installed into the k8s cluster, how should that be done? Will the performance be worse?
And if we keep Kafka outside of the k8s cluster and connect to it from the application inside the cluster, how is the performance? They are in different layers, so won't it be slow?
RabbitMQ
It's slower than Kafka, but for an application with 500,000 daily active users, is it good enough? If so, maybe it's a good choice.
Redis
It's another option, maybe the simplest one. But from what I have read online, it sometimes loses messages. If true, that's terrible.
So the most important question is: is running Kafka (with ZooKeeper) on k8s a good idea in this use case or not?
Yes, running Kafka on Kubernetes is great. Check out this example: https://github.com/Yolean/kubernetes-kafka. It includes ZooKeeper and Kafka as StatefulSets.
PS. Running any of the services in your question on Kubernetes will be pleasant. You can Google the name of the service and "kubernetes" and find example manifests. Many examples here: https://github.com/kubernetes/charts.
For Kafka, you can find some suggestions here. Kubernetes 1.7+ supports local persistent volumes, which may be good for a Kafka deployment.
You can also take a look at the following project:
https://github.com/EnMasseProject/barnabas
It's about running Kafka on Kubernetes and OpenShift as well. It supports deploying with StatefulSets backed by persistent volumes, or purely in memory (for development or testing purposes). It also supports deploying Kafka Connect and exposing Prometheus metrics.
Another simple configuration of Kafka/Zookeeper on Kubernetes in DigitalOcean with external access:
https://github.com/StanislavKo/k8s_digitalocean_kafka
You can connect to Kafka from outside of AWS/DO/GCE using the regular binary protocol. The connection is PLAINTEXT or SASL_PLAINTEXT (user/password).
The Kafka cluster is a StatefulSet, so you can scale it easily.
I am using RabbitMQ as a message queue, along with the statistics graphs provided in the management portal.
I want to clear the graphs between my tests, but I don't know how to do it. I tried clearing the RabbitMQ logs, but no luck!
This would be my first time using ActiveMQ (instead of the out-of-the-box OpenMQ in GF), and I am trying to determine which approach is better in terms of scaling and maintaining an ActiveMQ environment. We do have experience in setting up and maintaining Glassfish clusters and deploying applications to them. But we are weighing which approach is better, as we don't want to go down a rabbit hole that we can't get out of: building environments around one approach, only to find at the end that the infrastructure we had set up won't scale.
Has anybody tried using both approaches? Even if anybody implemented one of the approaches with Glassfish, telling us their experience (gains and pains) would be very helpful and appreciated.
For 99% of cases, it's usually better to deploy a standalone broker - this way you're treating your messaging as just another layer of the infrastructure, much like a database. When a broker is standalone, you can set it up as highly available, upgrade it at will without modifying your applications (a broker can be upgraded without upgrading the client libraries), and can scale it out as appropriate later on if you need to (most projects don't).
I have seen people deploy brokers as embedded, with a convoluted network of brokers to get all the boxes in a cluster talking to each other. This usually ends in tears and a return to a separate master-slave pair of brokers, which is all they needed all along.