Specifying RabbitMQ messaging strategy on memory or disc

I am new to RabbitMQ and am wondering about the message-saving strategy. By default, RabbitMQ keeps message queues in memory, which gives high performance. But my messages are important and should be saved to disc, because the server may go down at any time, and that approach is slower.
Which approach is preferable? What is your real-world experience?

There is a whole lot regarding persistence here.
You can make queues durable and publish messages as persistent; that way messages are saved to disk. Of course, only until they are acknowledged!
You didn't say what your use case is or what you need this for, but bear in mind that RabbitMQ is not a database.
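For concreteness, here is a minimal sketch with the Python pika client (the broker address and queue name are just placeholders, not from the question): the queue is declared durable and the message is published with delivery_mode=2 so the broker writes it to disk.

```python
# Sketch: a durable queue plus a persistent message, using pika.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# Durable queue: the queue definition survives a broker restart.
channel.queue_declare(queue="task_queue", durable=True)

# Persistent message: delivery_mode=2 asks the broker to write it to disk.
channel.basic_publish(
    exchange="",
    routing_key="task_queue",
    body=b"important payload",
    properties=pika.BasicProperties(delivery_mode=2),
)
connection.close()
```

Note that both parts are needed: a persistent message in a non-durable queue is still lost when the broker restarts, because the queue itself disappears.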

Related

RabbitMQ HA with Durable features

Background
I have a RabbitMQ cluster that has been running for more than a year without any problems. Lately, I have found that the CPU of the machine sometimes hits 100%. I'm investigating ways to increase the throughput of the cluster to serve more customers.
The cluster architecture is that we have HA enabled (exactly 1 replica) and durable messages (for all the queues). As I understand it, the durable feature is the most expensive one in terms of performance, so I am trying to understand whether I really need it.
Question
In my experience, the cluster has run for more than a year without problems, so I assume the chance of a failure is very low. Even so, I want to create another layer of protection, just in case...
If I have two servers holding the same data but not writing it to disk (durable OFF), isn't that safe enough for 99.99% of cases? The two servers are in different regions, so the chance that both of them go down is very low. I'm wondering whether saving to disk is actually helpful here, or just a waste.
Is there a rule of thumb for the performance improvement gained by disabling the durable feature, in percent?
Thank you!
The influence of durability on performance
For reliable delivery, RabbitMQ uses the publisher confirmation mechanism. Every time the publisher publishes a message to the RabbitMQ server, the server responds with a basic.ack RPC to acknowledge the message. For routable messages, the basic.ack is sent when the message has been accepted by all the queues. For persistent messages routed to durable queues, this means persisting to disk. For mirrored queues, this means that all mirrors have accepted the message. So, as you mentioned, disk I/O may become the performance bottleneck.
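As a rough sketch of what confirms look like from a Python publisher using pika (broker address and queue name are illustrative): once the channel is in confirm mode, a publish only returns after the broker has acknowledged the message, which for a persistent message on a durable queue includes the disk write.

```python
# Sketch: publisher confirms with pika; names are illustrative.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="orders", durable=True)

# Put the channel into confirm mode: basic_publish now blocks until the
# broker acks, and raises if the message is returned as unroutable.
channel.confirm_delivery()

try:
    channel.basic_publish(
        exchange="",
        routing_key="orders",
        body=b'{"id": 42}',
        properties=pika.BasicProperties(delivery_mode=2),
        mandatory=True,
    )
    print("message confirmed by the broker")
except pika.exceptions.UnroutableError:
    print("message could not be routed to any queue")
connection.close()
```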
Is it overhead to have both durable and mirrored queues?
It depends on how you weigh performance against HA. Imagine you declare a non-durable mirrored queue and both the master and the slave go down: your messages will be lost. So whether it is overhead depends on how important message safety is.
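For reference, classic queue mirroring is configured through a policy rather than on the queue declaration itself. Here is a hedged sketch using the management HTTP API via Python requests (this assumes the management plugin is enabled on its default port 15672 with the default guest credentials; the policy name and queue-name pattern are made up for illustration):

```python
# Sketch: apply an "exactly 2 mirrors" policy to queues whose names start with "ha.".
import json
import requests

policy = {
    "pattern": "^ha\\.",
    "definition": {"ha-mode": "exactly", "ha-params": 2, "ha-sync-mode": "automatic"},
    "apply-to": "queues",
}
resp = requests.put(
    "http://localhost:15672/api/policies/%2F/ha-two-mirrors",  # %2F is the default vhost "/"
    auth=("guest", "guest"),
    headers={"content-type": "application/json"},
    data=json.dumps(policy),
)
resp.raise_for_status()
```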
Is the performance bottleneck mainly caused by durability?
As discussed, if you declare non-durable queues, throughput may increase, but durability may not be the main cause of the low performance. You said the CPU usage sometimes reaches 100%, which suggests very little I/O waiting; the high load may instead come from many connections and high throughput. To determine how to increase throughput, you can use a benchmarking tool to find the bottleneck.
These pages may be useful:
https://www.cloudamqp.com/blog/2016-01-25-identify-and-protect-against-high-cpu-and-memory-usage.html
https://www.cloudamqp.com/blog/2018-01-08-part2-rabbitmq-best-practice-for-high-performance.html

Redis PUB/SUB and high availability

Currently I'm working on a distributed test execution and reporting system. I'm planning to use Redis PUB/SUB as a message queue and message distribution system.
I'm new to Redis, so I'm trying to read as many docs as I can and play around with it. One of the most important topics is high availability. As I said, I'm not an expert, but I'm aware of the possible options - using Sentinel, replication, clustering, etc.
What's not clear to me is how the Pub/Sub feature and the HA options relate to each other. What's the best practice for building a reliable messaging system with Redis? By reliable I mean that if my Redis message broker is down, there should be some kind of backup node (a slave?) that can take over this role.
Is there a purely server-side solution? Or do I need to create a smart wrapper around the Redis client to handle this? Will a Sentinel-driven setup help me?
Doing pub/sub in Redis with failover means thinking about additional factors on the client side. A key piece to understand is that subscriptions are per-connection. If you are subscribed to a channel on a node and it fails, you will need to handle reconnecting and resubscribing. Because subscriptions are done at the connection level, they are not something which can be replicated.
Regarding the details of how it works and what you can expect to see, along with ways around it, see a post I made earlier this year at https://objectrocket.com/blog/how-to/reliable-pubsub-and-blocking-commands-during-redis-failovers
You can lower the risk surface by subscribing to slaves and publishing to the master, but you would then need non-promotable slaves to subscribe to, and you still need to handle losing a slave - there is just as much chance of losing a given slave as of losing the master.
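To make the reconnect/resubscribe handling concrete, here is a rough client-side sketch with redis-py and a Sentinel-managed setup (the Sentinel addresses, the service name "mymaster", and the channel name are placeholders, not from the question):

```python
# Sketch: subscribe via Sentinel and resubscribe after a failover.
import redis
from redis.sentinel import Sentinel

sentinel = Sentinel([("sentinel-1", 26379), ("sentinel-2", 26379)], socket_timeout=1.0)

def listen_forever(channel="test-results"):
    while True:
        try:
            # Ask Sentinel for the current master and open a fresh connection.
            master = sentinel.master_for("mymaster")
            pubsub = master.pubsub()
            pubsub.subscribe(channel)  # the subscription lives on this connection only
            for message in pubsub.listen():
                if message["type"] == "message":
                    print("got:", message["data"])
        except redis.exceptions.ConnectionError:
            # The connection (and with it the subscription) is gone after a failover;
            # loop around, rediscover the master, and subscribe again.
            continue

listen_forever()
```

Keep in mind that anything published while the client is between connections is simply lost; Redis pub/sub has no replay.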
IMO, PUB/SUB is not a good choice here; maybe Disque (from antirez, the author of Redis) fits better:
Disque, an in-memory, distributed job queue

RabbitMQ vs NoSQL?

I was just wondering why you would use something like RabbitMQ instead of a persistent store, especially a document store like MongoDB? Aren't they kind of the same? What's the benefit of something like RabbitMQ over a database?
Would anyone who used something like RabbitMQ elaborate on the benefits?
RabbitMQ is message broker software, i.e. a queue, and not a NoSQL database!
While the trend goes towards storing more and more data in scaled-up queues, as well as processing data in real time and thus obliterating the need for additional data storage, queues are not to be confused with databases:
Most queues don't persist data indefinitely.
The data in queues is not available on demand through queries; it is accessed via an automatically triggered consumer mechanism (see the sketch after this list).
The architectural intention behind queues differs tremendously from that of databases. Their purpose in a system's architecture is not data storage, but system integration and data distribution. For more information on queue architecture, please check this article from the Kafka guys.
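As a small illustration of that consumer mechanism with the Python pika client (the broker address and queue name are made up): instead of querying for data, you register a callback and the broker pushes each message to it as it arrives; once acknowledged, the message is gone from the queue.

```python
# Sketch: data is pushed to a registered callback, not queried on demand.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="events", durable=True)

def handle(ch, method, properties, body):
    print("received:", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)  # acked messages leave the queue

channel.basic_consume(queue="events", on_message_callback=handle)
channel.start_consuming()  # blocks; the broker drives delivery
```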

Need advice on suitable message queue for Storm spout

I'm developing a prototype Lambda system and my data is streaming in via Flume to HDFS. I also need to get the data into Storm. Flume is a push system and Storm is more pull, so I don't believe it's wise to try to connect a spout to Flume directly; rather, I think there should be a message queue between the two. Again, this is a prototype, so I'm looking for best practices, not perfection. I'm thinking of putting an AMQP-compliant queue as a Flume sink and then pulling the messages from a spout.
Is this a good approach? If so, I want to use a message queue that has relatively robust support in both the Flume world (as a sink) and the Storm world (as a spout). If I go AMQP then I assume that gives me the option to use whatever AMQP-compliant queue I want to use, correct? Thanks.
If you're going to use AMQP, I'd recommend sticking to the finalized 1.0 version of the AMQP spec. Otherwise, you're going to feel some pain when you try to upgrade to it from previous versions.
Your approach makes a lot of sense, but for us the AMQP compliance issue looked a little less important. I will try to explain why.
We are using Kafka to get data into Storm. The main reasons are performance and usability. AMQP-compliant queues do not seem to be designed for holding information for a considerable time, while with Kafka this is just a matter of configuration. This lets us keep messages for a long time and easily "play back" messages (since the position we consume from is always controlled by the consumer, we can consume the same messages again and again without setting up an entire system for that purpose). Also, Kafka's performance is incomparable to anything I have seen.
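A rough illustration of that replay property with the kafka-python client (the topic name, partition, and bootstrap server are placeholders): because the consumer owns its offset, it can rewind and read the same messages again without any extra infrastructure.

```python
# Sketch: the consumer controls its own offset, so old messages can be replayed.
from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(bootstrap_servers="localhost:9092", enable_auto_commit=False)
partition = TopicPartition("clickstream", 0)
consumer.assign([partition])

consumer.seek_to_beginning(partition)  # rewind: "play back" from the earliest retained offset
for record in consumer:
    print(record.offset, record.value)
    if record.offset >= 100:           # stop after replaying a slice
        break
consumer.close()
```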
Storm has a very useful KafkaSpout; the main things to pay attention to are:
Error reporting - there is room for improvement there; the messages are not as clear as one would hope.
It depends on ZooKeeper (which is already there if you have Storm), and a path has to be created manually for it.
Depending on the Storm version, pay attention to the Kafka version in use. It is documented, but it can easily be missed and cause unclear problems.
You can have the data streamed to a broker topic first; then Flume and the Storm spout can both consume from that topic. Flume has a JMS source that makes it easy to consume from the message broker, and there is a Storm JMS spout to get the messages into Storm.

Why NServiceBus ForwardReceivedMessagesTo, and what are the performance implications of using it?

What is the intended usage of ForwardReceivedMessagesTo?
I read somewhere that it is there to support auditing. Is there any harm in using it as a solution to ensure that messages have been processed, and to reprocess them if not? Let's say a message was sent to queue_A#server_A and also forwarded to q_All#server_All, and before the message was handled, machine_A died irrecoverably. In such a case, I could have a handler pick up messages from q_All#server_All and check against a database table whether the message has been processed; if not, reprocess (publish or send) the message or save it in a database table.
Also, what are the performance implications of using ForwardReceivedMessagesTo? How is it different from journalling?
Yes, I am trying to avoid MSMQ clustering.
The feature is there to support auditing. If your machine dies during processing, the messages will back up at the sending machine and will continue to flow after the machine recovers. This means you must size the disk on the sending machine appropriately. You could leverage auditing to accomplish this, and the overhead would be minimal: the implication is the time it takes to complete the distributed transaction to the other machine where your audit queue lives, which should be very small.