Let's say I have 200 events which are going to be placed in multiple queues (or not) and I was thinking of binding each queue to a topic exchange with 200 unique keys. Am i going to see bottleneck in performance by adding 200 unique bindings between one queue and one exchange?
if yes, do I have an alternative?
Thanks in advance
In general, it is less likely (like snow on July 4th) that routing will be the most resources consuming part. For further reading on routing please refer to Very fast and scalable topic routing – part 1 and Very fast and scalable topic routing – part 2.
As to particular case it depends on resources available to RabbitMQ server(s), messages flow, bindings, bindings key complexity, etc. Anyway, it is always better to run some load tests first to figure out bottlenecks, but again, it is less likely that routing will be the cause of significant performance degradation.
Related
Hypothetical (but simpler) scenario:
I have many orders in my system.
I have external triggers that affect those orders (e.g. webhooks). They may occur in parallel, and are handled by different instances in my cluster.
In the scope of a single order, I would like to make sure that those events are processed in sequential order to avoid race conditions, version conflicts etc.
Events for different orders can (and should) be processed in parallel
I'm currently toying with the idea of leveraging RabbitMQ with a setup similar to this:
use a queue for each order (create on the fly)
if an event occurs, put it in that queue
Those queues would be short-lived, so I wouldn't end up with millions of them, but it should scale anyway (let's say lower one-digit thousands if the project grows substantially). Question is whether that's an absolute anti-pattern as far as RabbitMQ (or similar) systems goes, or if there's better solutions to ensure sequential execution anyway.
Thanks!
In my opinion creating ephemeral queues might not be a great idea as there will considerable overhead of creating and delete queues. The focus should be on message consumption. I can think of following solutions:
You can limit the number of queues by building publishing strategy like all orders with orderId divisible by 2 goes to queue-1, divisible by 3 goes to queue-2 and so forth. That will give you parallel throughput as well as finite number queues but there is some additional publisher logic you have to handle
The same logic can be transferred to consumer side by using single pub-sub style queue and then onus lies on consumer to filter unwanted orderIds
If you are happy to explore other technologies, you can look into Kafka as well where you can use orderId as partitionKey and use multiple partitions to gain parallel throughput.
I am binding a queue to a topic exchange with multiple binding keys, and have the following related questions
Is there a maximum number of binding keys that can be used to bind a single queue to an exchange?
Is it considered bad practice to use a lot of binding keys for a single queue (say around 50 binding keys)?
I see from this answer here that "For topic routing, performance decreases as the number of bindings increase". I am curious to understand a bit more specifically about how drastically performance can decrease, and generally how big the number of bindings can get before it starts affecting performance.
I am reading "paxos" on wiki, and it reads:
"Rounds fail when multiple Proposers send conflicting Prepare messages, or when the Proposer does not receive a Quorum of responses (Promise or Accepted). In these cases, another round must be started with a higher proposal number."
But I don't understand how the proposer tells the difference between its proposal not being approved and it just takes more time for the message to transmit?
One of the tricky parts to understanding Paxos is that the original paper and most others, including the wiki, do not describe a full protocol capable of real-world use. They only focus on the algorithmic necessities. For example, they say that a proposer must choose a number "n" higher than any previously used number. But they say nothing about how to actually go about doing that, the kinds of failures that can happen, or how to resolve the situation if two proposers simultaneously try to use the same proposal number (as in both choosing n=2). That actually completely breaks the protocol and would lead to incorrect results but I'm not sure I've ever seen that specifically called out. I guess it's just supposed to be "obvious".
Specifically to your question, there's no perfect way to tell the difference using the raw algorithm. Practical implementations typically go the extra mile by sending a Nack message to the Proposer rather than just silently ignoring it. There are plenty of other tricks that can be used but all of them, including the nacks, come with varying downsides. Which approach is best generally depends on both the kind of application employing Paxos and the environment it's intended to run in.
If you're interested, I put together a much longer-winded description of Paxos that includes many of issues practical implementations must address in addition to the core components. It covers this issue along with several others.
Specific to your question it isn't possible for a proposer to distinguish between lost messages, delayed messages, crashed acceptors or stalled acceptors. In each case you get no response. Typically an implementation will timeout on getting less than a quorum response and resend the proposal on the assumption messages were dropped or acceptors are rebooting.
Often implementations add "nack" messages as negative acknowledgement as an optimisation to speed up recovery. The proposer only gets "nack" responses from nodes that are reachable that have accepted a higher promise. The ”nack” can show both the highest promise and also the highest instance known to be fixed. How this helps will be outlined below.
I wrote an implementation of Paxos called TRex with some of these techniques sticking as closely as possible to the description of the algorithm in the paper Paxos Made Simple. I wrote up a description of the practical considerations of timeouts and nacks on a blog post.
One of the interesting techniques it uses is for a timed out node to make the first proposal with a very low number. This will always get "nack" messages. Why? Consider a three node cluster where one network link breaks between a stable proposer and one other node. The other node will timeout and issue a prepare. If it issues a high prepare it will get a promise from the third node. This will interrupt the stable leader. You then have symmetry where the two nodes that cannot message one another can fight with the leadership swapping with no forward progress.
To avoid this a timed out node can start with a low prepare. It can then look at the "nack" messages to learn from the third node that there is a leader who is making progress. It will see this as the highest instance known to be fixed in the nack will be greater than the local value. The timed out node can then not issue a high prepare and instead ask the third node to send it the latest fixed and accepted values. With that enhancement a timed out node can now distinguish between a stable proposer crashing or the connection failing. Such ”nack” based techniques don't affect the correctness of the implementation they are only an optimisation to ensure fast failover and forward progress.
I have tried to find an answer but couldn't find anything solid.
Let's say i have system that split to 3 modules.(Just an example)
Users
Products
Orders
I can create 3 Exchanges, 1 for each module.(Assuming they are all the same type)
Or i can create 1 Exchange for everyone.
What will be the differences(Beside the logical separation between modules)
Are there any best practices related to performance in such case?
And another one.. There is any meaning of splitting channel in nodeJS?
Nodejs is single threaded but the I/O calls is working on the OS threads
Thanks for your help.. I would love to get clarification for that and if there is any official reference it will be great
EDIT: I am trying to achieve really good performance and low latency streaming which is cruical for my business.
I worked with rabbit before and i know him pretty(Worked in Java) well but when i started to read i wasn't sure what the exchange benefits? Why we need to create new exchange if we have the default(Performance wise) and if the channels have any meaning in nodejs application cluster where every node is single process
Greetings,
I'm evaluating some components for a multi-data center distributed system. We're going to be using message queues (via either RabbitMQ or Qpid) so agents can make asynchronous requests to other agents without worrying about addressing, routing, load balancing or retransmission.
In many cases, the agents will be interacting with components that were not designed for highly concurrent access, so locking and cross-agent coordination will be needed to avoid race conditions. Also, we'd like the system to automatically respond to agent or data center failures.
With the above use cases in mind, ZooKeeper seemed like it might be a good fit. But I'm wondering if trying to use both ZK and message queuing is overkill. It seems like what Zookeeper does could be accomplished by my own cluster manager using AMQP messaging, but that would be hard to get really right. On the other hand, I've seen some examples where ZooKeeper was used to implement message queuing, but I think RabbitMQ/Qpid are a more natural fit for that.
Has anyone out there used a combination like this?
Thanks in advance,
-Chris
Coming into this late, but maybe it will be of some use. The primary consideration should be the performance characteristics of your system. ZooKeeper, like you said, is more than capable of implementing a task distribution system using a distributed queue, but zk currently, is more optimized for reads than it is for writes (this only comes into play in the 1000's of ops per second range). If your throughput needs are less than this, then using just zk to implement your system would reduce number of runtime components and make it simpler. Of course, you should always run your performance tests before deciding.
Distributed coordination is really hard to get right, so I would definitely recommend using zookeeper for that and not rolling your own.
Not quite sure what ZooKeeper exactly is, but I guess that using a component from Apache (if it does fit your needs well) is preferred before managing such things as distributed synchronization and group services at your own. You could of course hire a team of developers especially for that purpose, but that doesn't guarantee you a better implementation.
I guess, that it would be anyways implemented as a separate component, cuz other way could bring much complexity and decelerate the workflow; so the preference of ZooKeeper or anything similar is kind of obvious (to me).
And surely, unless you're in the global optimization phase of your project workflow, I guess it would be better to use RabbitMQ or such (I would even stress that, cuz implementations (especially commercial) of the AMQP would be more reliable than everything that you'd come up with).
So I would go for both, carefully chosing the appropriate thirdparty products, but using as much of them as it is needed. And that's just my opinion; thanks for reading :)