How to handle long asynchronous requests with pyramid and celery? - rabbitmq

I'm setting up a web service with pyramid. A typical request for a view will be very long, about 15 min to finish. So my idea was to queue jobs with celery and a rabbitmq broker.
I would like to know what would be the best way to ensure that bad things cannot happen.
Specifically I would like to prevent the task queue from overflow for example.
A first measure will be to define quotas per IP, limiting the number of requests a given IP can submit per hour.
However I cannot predict the number of involved IPs, so this cannot solve everything.
I have read that it's not possible to limit the queue size with celery/rabbitmq. I was thinking of retrieving the queue size before pushing a new item into it but I'm not sure if it's a good idea.
I'm not familiar with good practices in messaging/job scheduling. Is there a recommended way to handle this kind of problem?

RabbitMQ has flow control built into the QoS. If RabbitMQ cannot handle the publishing rate it will adjust the TCP window size to slow down the publishers. In the event of too many messages being sent to the server it will also overflow to disk. This will allow your consumer to be a bit more naive although if you restart the connection on error and flood the connection you can cause problems.
I've always decided to spend more time making sure the publishers/consumers could work with multiple queue servers instead of trying to make them more intelligent about a single queue server. The benefit is that if you are really overloading a single server you can just add another one (or another pair if using RabbitMQ HA). There is a useful video from PyCon about Messaging at Scale using Celery and RabbitMQ that should be of use.
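The quota idea from the question, combined with a cap on the backlog, can be sketched as a small admission check that runs before a job is queued. This is a minimal sketch, not a Celery or Pyramid API: the class name `AdmissionGate`, its parameters, and the limits are hypothetical, and `pending_jobs` is assumed to be a backlog figure you obtain yourself (for instance from the broker's management interface). Because the backlog is read before the publish, the check is only approximate under concurrency.

```python
import time
from collections import defaultdict, deque


class AdmissionGate:
    """Decides whether a new job may be queued (all limits are illustrative)."""

    def __init__(self, max_per_ip_per_hour=5, max_pending=100, clock=time.monotonic):
        self.max_per_ip_per_hour = max_per_ip_per_hour
        self.max_pending = max_pending
        self.clock = clock
        self._history = defaultdict(deque)  # ip -> timestamps of accepted jobs

    def allow(self, ip, pending_jobs):
        """Return True if `ip` may submit now; `pending_jobs` is the current backlog."""
        now = self.clock()
        recent = self._history[ip]
        # Forget submissions older than one hour (sliding window).
        while recent and now - recent[0] >= 3600:
            recent.popleft()
        if pending_jobs >= self.max_pending:
            return False  # global backlog cap, independent of how many IPs exist
        if len(recent) >= self.max_per_ip_per_hour:
            return False  # per-IP quota exhausted
        recent.append(now)
        return True
```

A view would call `gate.allow(client_ip, backlog)` and return an error response instead of queuing the task when it yields False.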

Related

RabbitMQ security design to declare queues from server (and use from client)

I have a test app (my first with RabbitMQ) which runs on partially trusted clients (in that I don't want them creating queues on their own), so I will look into the security permissions of the queues and the credentials that the clients connect with.
For messaging there are mostly one-way broadcasts from server to clients, and sometimes a query from server to a specific client (over which the replies will be sent on a replyTo queue which is dedicated to that client on which the server listens for responses).
I currently have a receive function on the server which looks out for "Announce" broadcast from clients:
agentAnnounceListener.Received += (model, ea) =>
{
    var body = ea.Body;
    var props = ea.BasicProperties;
    var message = Encoding.UTF8.GetString(body);
    Console.WriteLine(
        "[{0}] from: {1}. body: {2}",
        DateTimeOffset.FromUnixTimeMilliseconds(ea.BasicProperties.Timestamp.UnixTime).Date,
        props.ReplyTo,
        message);
    // create return replyTo queue, snipped in next code section
};
I am looking to create the return to topic in the above receive handler:
var result = channel.QueueDeclare(
    queue: ea.BasicProperties.ReplyTo,
    durable: false,
    exclusive: false,
    autoDelete: false,
    arguments: null);
Alternatively, I could store the received announcements in a database, and on a regular timer run through this list and declare a queue for each on every pass.
In both scenarios this newly created queue would then be used at a future point by the server to send queries to the client.
My questions are please:
1) Is it better to create a reply queue on the server when receiving the message from the client, or if I do it externally (on a timer) are there any performance issues with declaring queues that already exist (there could be thousands of endpoints)?
2) If a client starts to misbehave, is there any way that it can be booted (in the receive function I can look up how many messages per minute it sends and boot it if certain criteria are met)? Are there any other filters that can be defined earlier in the pipeline, before receive, to kick clients who are sending too many messages?
3) In the above example, notice that my messages continuously come in on each run (the same old messages). How do I clear them out, please?
I think preventing clients from creating queues just complicates the design without much security benefit.
You are allowing clients to create messages. In RabbitMQ, it's not very easy to stop clients from flooding your server with messages.
If you want to rate-limit your clients, RabbitMQ may not be the best choice. It rate-limits automatically when the server starts to struggle with processing all the messages, but you can't set a strict per-client rate limit on the server with an out-of-the-box solution. Also, clients are normally allowed to create queues.
Approach 1 - Web App
Maybe you should try to use web application instead:
Clients authenticate with your server
To Announce, clients send a POST request to a certain endpoint, ie /api/announce, maybe providing some credentials that allow them to do so
To receive incoming messages, GET /api/messages
To acknowledge processed message: POST /api/acknowledge
When client acknowledges receipt, you delete your message from database.
With this design, you can write custom logic to rate-limit or ban clients that misbehave and you have full control of your server
Approach 2 - RabbitMQ Management API
If you still want to use RabbitMQ, you can potentially achieve what you want by using RabbitMQ Management API
You'll need to write an app that will query RabbitMQ Management API on timer basis and:
Get all the current connections, and check message rate for each of them.
If message rate exceed your threshold, close connection or revoke user's permissions using /api/permissions/vhost/user endpoint.
In my opinion, web app may be easier if you don't need all the queueing functionality like worker queues or complicated routing that you can get out of the box with RabbitMQ.
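The polling app from Approach 2 boils down to a filter over the connection list. The sketch below assumes the list has already been fetched and decoded from the Management API's connection listing; it uses the inbound byte rate rather than a message rate, and the field name `recv_oct_details.rate` is an assumption to verify against your RabbitMQ version. The function name and threshold are hypothetical.

```python
def connections_over_limit(connections, max_bytes_per_sec):
    """Return names of connections whose inbound byte rate exceeds the limit.

    `connections` is assumed to be the decoded JSON list from the management
    API's connection listing; the `recv_oct_details.rate` field name is an
    assumption to check against your RabbitMQ version.
    """
    offenders = []
    for conn in connections:
        rate = conn.get("recv_oct_details", {}).get("rate", 0.0)
        if rate > max_bytes_per_sec:
            offenders.append(conn["name"])
    return offenders


# Made-up sample data standing in for a real API response.
sample = [
    {"name": "10.0.0.5:51234 -> 10.0.0.1:5672", "user": "agent-a",
     "recv_oct_details": {"rate": 1200.0}},
    {"name": "10.0.0.6:40022 -> 10.0.0.1:5672", "user": "agent-b",
     "recv_oct_details": {"rate": 950000.0}},
]
print(connections_over_limit(sample, max_bytes_per_sec=100_000))
```

The returned names would then be fed to whatever action you choose, such as closing the connection or revoking the user's permissions.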
Here are some general architecture/reliability ideas for your scenario. Responses to your 3 specific questions are at the end.
General Architecture Ideas
I'm not sure that the declare-response-queues-on-server approach yields performance/stability benefits; you'd have to benchmark that. I think the simplest topology to achieve what you want is the following:
Each client, when it connects, declares an exclusive and/or autodelete anonymous queue. If the clients' network connectivity is so sketchy that holding open a direct connection is undesirable, do something similar to Alex's proposed "Web App" above, and have clients hit an endpoint that declares an exclusive/autodelete queue on their behalf, and closes the connection (automatically deleting the queue upon consumer departure) when a client doesn't get in touch regularly enough. This should only be done if you can't tune the RabbitMQ heartbeats from the clients to work in the face of network unreliability, or if you can prove that you need queue-creation rate limiting inside the web app layer.
Each client's queue is bound to a broadcast topic exchange, which the server uses to communicate broadcast messages (wildcarded routing key) or specifically targeted messages (routing key that only matches one client's queue name).
When the server needs to get a reply back from the clients, you could either have the server declare the response queue before sending the "response-needed" message, and encode the response queue in the message (basically what you're doing now), or you could build semantics in your clients in which they stop consuming from their broadcast queue for a fixed amount of time before attempting an exclusive (mutex) consume again, publish their responses to their own queue, and ensure that the server consumes those responses within the allotted time, before closing the server consume and restoring normal broadcast semantics. That second approach is much more complicated and likely not worth it, though.
Preventing Clients Overwhelming RabbitMQ
Things that can reduce the server load and help prevent clients DoSing your server with RMQ operations include:
Setting appropriate, low max-length thresholds on all the queues, so the amount of messages stored by the server will never exceed a certain multiple of the number of clients.
Setting per-queue expirations, or per-message expirations, to make sure that stale messages do not accumulate.
Rate-limiting specific RabbitMQ operations is quite tricky, but you can rate-limit at the TCP level (using e.g. HAProxy or other router/proxy stacks), to ensure that your clients don't send too much data, or open too many connections, at a time. In my experience (just one data point; if in doubt, benchmark!) RabbitMQ cares less about the count of messages ingested per unit of time than it does about the data volume and the largest possible per-message size ingested. Lots of small messages are usually OK; a few huge ones can cause latency spikes. Otherwise, rate-limiting the bytes at the TCP layer will probably allow you to scale such a system very far before you have to re-assess.
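The length and expiry limits from the list above are set through optional queue arguments at declaration time. As a rough sketch, the arguments table might be built like this; `x-max-length`, `x-message-ttl`, and `x-expires` are RabbitMQ's queue argument names, while the helper name and the numbers are purely illustrative.

```python
def limited_queue_arguments(max_messages, message_ttl_ms, idle_expiry_ms):
    """Build the optional-arguments table for a queue declaration."""
    return {
        "x-max-length": max_messages,     # cap on messages held in the queue
        "x-message-ttl": message_ttl_ms,  # per-queue message expiry, in milliseconds
        "x-expires": idle_expiry_ms,      # delete the queue after it sits unused this long
    }


# Illustrative values: 100 messages, 1 minute message TTL, 1 hour idle expiry.
print(limited_queue_arguments(100, 60_000, 3_600_000))
```

The resulting dictionary is what a client library would pass as the `arguments` parameter of its queue-declare call.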
Specific Answers
In light of the above, my answers to your specific questions would be:
Q: Should you create reply queues on the server in response to received messages?
A: Yes, probably. If you're worried about the queue-creation rate that happens as a result of that, you can rate-limit per server instance. It looks like you're using Node, so you should be able to use one of the existing solutions for that platform to have a single queue-creation rate limiter per node server instance, which, unless you have many thousands of servers (not clients), should allow you to reach a very, very large scale before re-assessing.
Q: Are there performance implications to declaring queues based on client actions? Or re-declaring queues?
A: Benchmark and see! Re-declares are probably OK; if you rate-limit properly you may not need to worry about this at all. In my experience, floods of queue-declare events can cause latency to go up a bit, but don't break the server. But that's just my experience! Everyone's scenario/deployment is different, so there's no substitute for benchmarking. In this case, you'd fire up a publisher/consumer with a steady stream of messages, tracking e.g. publish/confirm latency or message-received latency, rabbitmq server load/resource usage, etc. While some number of publish/consume pairs were running, declare a lot of queues in high parallel and see what happens to your metrics. Also in my experience, the redeclaration of queues (idempotent) doesn't cause much if any noticeable load spikes. More important to watch is the rate of establishing new connections/channels. You can also rate-limit queue creations very effectively on a per-server basis (see my answer to the first question), so I think if you implement that correctly you won't need to worry about this for a long time. Whether RabbitMQ's performance suffers as a function of the number of queues that exist (as opposed to declaration rate) would be another thing to benchmark though.
Q: Can you kick clients based on misbehavior? Message rates?
A: Yes, though it's a bit tricky to set up, this can be done in an at least somewhat elegant way. You have two options:
Option one: what you proposed: keep track of message rates on your server, as you're doing, and "kick" clients based on that. This has coordination problems if you have more than one server, and requires writing code that lives in your message-receive loops, and doesn't trip until RabbitMQ actually delivers the messages to your server's consumers. Those are all significant drawbacks.
Option two: use max-length, and dead letter exchanges to build a "kick bad clients" agent. The length limits on RabbitMQ queues tell the queue system "if more messages than X are in the queue, drop them or send them to the dead letter exchange (if one is configured)". Dead-letter exchanges allow you to send messages that are greater than the length (or meet other conditions) to a specific queue/exchange. Here's how you can combine those to detect clients that publish messages too quickly (faster than your server can consume them) and kick clients:
Each client declares its main $clientID_to_server queue with a max-length of some number, say X, that should never build up in the queue unless the client is "outrunning" the server. That queue has a dead-letter topic exchange of ratelimit or some constant name.
Each client also declares/owns a queue called $clientID_overwhelm, with a max-length of 1. That queue is bound to the ratelimit exchange with a routing key of $clientID_to_server. This means that when messages are published to the $clientID_to_server queue at too great a rate for the server to keep up, the messages will be routed to $clientID_overwhelm, but only one will be kept around (so you don't fill up RabbitMQ, and only ever store X+1 messages per client).
You start a simple agent/service which discovers (e.g. via the RabbitMQ Management API) all connected client IDs, and consumes (using just one connection) from all of their *_overwhelm queues. Whenever it receives a message on that connection, it gets the client ID from the routing key of that message, and then kicks that client (either by doing something out-of-band in your app; deleting that client's $clientID_to_server and $clientID_overwhelm queues, thus forcing an error the next time the client tries to do anything; or closing that client's connection to RabbitMQ via the /connections endpoint in the RabbitMQ management API--this is pretty intrusive and should only be done if you really need to). This service should be pretty easy to write, since it doesn't need to coordinate state with any other parts of your system besides RabbitMQ. You'll lose some messages from misbehaving clients with this solution, though: if you need to keep them all, remove the max-length limit on the overwhelm queue (and run the risk of filling up RabbitMQ).
Using that approach, you can detect spamming clients as they happen according to RabbitMQ, not just as they happen according to your server. You could extend it by also adding a per-message TTL to messages sent by the clients, and triggering the dead-letter-kick behavior if messages sit in the queue for more than a certain amount of time--this would change the pseudo-rate-limiting from "when the server consumer gets behind by message count" to "when the server consumer gets behind by message delivery timestamp".
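To make the max-length plus dead-letter mechanism concrete, here is a toy in-memory model of it. It is a simulation only and talks to no broker: `BoundedQueue`, `make_client`, and `clients_to_kick` are hypothetical names, the model assumes overflow evicts the oldest message from the head of the queue, and it assumes messages carry the `<clientID>_to_server` routing key described above.

```python
from collections import deque


class BoundedQueue:
    """In-memory stand-in for a queue with max-length and optional dead-lettering."""

    def __init__(self, name, max_length, dead_letter=None):
        self.name = name
        self.max_length = max_length
        self.dead_letter = dead_letter  # another BoundedQueue, or None
        self.messages = deque()

    def publish(self, routing_key, body):
        self.messages.append((routing_key, body))
        # Over the limit: evict from the head, dead-lettering if configured.
        while len(self.messages) > self.max_length:
            evicted = self.messages.popleft()
            if self.dead_letter is not None:
                self.dead_letter.publish(*evicted)


def make_client(client_id, max_length):
    """Create the main queue and its one-slot overwhelm queue for a client."""
    overwhelm = BoundedQueue(f"{client_id}_overwhelm", max_length=1)
    main = BoundedQueue(f"{client_id}_to_server", max_length, dead_letter=overwhelm)
    return main, overwhelm


def clients_to_kick(overwhelm_queues):
    """Agent step: any message in an overwhelm queue marks its client for kicking."""
    kicked = set()
    for q in overwhelm_queues:
        while q.messages:
            routing_key, _body = q.messages.popleft()
            kicked.add(routing_key[: -len("_to_server")])
    return kicked
```

A client that publishes more than X unconsumed messages ends up with exactly one message in its overwhelm queue, which is all the agent needs to identify it.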
Q: Why do messages get redelivered on each run, and how do I get rid of them?
A: Use acknowledgements or noack (but probably acknowledgements). Getting a message in "receive" just pulls it into your consumer, but doesn't pop it from the queue. It's like a database transaction: to finally pop it you have to acknowledge it after you receive it. Alternatively, you could start your consumer in "noack" mode, which will cause the receive behavior to work the way you assumed it would. However, be warned, noack mode imposes a big tradeoff: since RabbitMQ is delivering messages to your consumer out-of-band (basically: even if your server is locked up or sleeping, if it has issued a consume, rabbit is pushing messages to it), if you consume in noack mode those messages are permanently removed from RabbitMQ when it pushes them to the server, so if the server crashes or shuts down before draining its "local queue" with any messages pending-receive, those messages will be lost forever. Be careful with this if it's important that you don't lose messages.
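The difference between the two modes can be illustrated with a toy model of the broker's bookkeeping. This is a simplified simulation under assumed semantics, not client-library code; `ToyBroker` and its methods are hypothetical names.

```python
from collections import deque


class ToyBroker:
    """Toy model of delivery with and without acknowledgements."""

    def __init__(self, messages):
        self.ready = deque(messages)
        self.unacked = {}       # delivery tag -> message
        self._next_tag = 1

    def deliver(self, auto_ack=False):
        message = self.ready.popleft()
        if auto_ack:
            return None, message         # gone from the broker immediately
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = message      # held until the consumer acks
        return tag, message

    def ack(self, tag):
        del self.unacked[tag]

    def consumer_disconnected(self):
        # Unacknowledged messages go back to the queue for redelivery.
        self.ready.extendleft(reversed(list(self.unacked.values())))
        self.unacked.clear()
```

In this model, a consumer that receives without acking sees the same messages again after it reconnects, which matches the behavior described in the question; acking removes them for good, and noack removes them at delivery time whether or not they were processed.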

Large RabbitMQ message in Slow network

I am using RabbitMQ with Spring AMQP
large message (>100MB, 102400KB)
small bandwidth (<512Kbps)
low heartbeat interval (10 seconds)
single broker
It will take >= 200*8 seconds to consume the message, which is more than my heartbeat interval. From https://stackoverflow.com/a/42363685/418439
If the message transfer time between nodes (60seconds?) > heartbeat time between nodes, it will cause the cluster to disconnect and the loose the message
Will I also face the disconnection issue even if I am using a single broker?
Do the heartbeat and the consumer use the same thread, so that while the consumer is consuming it is not possible to perform a heartbeat?
If so, what can I do to consume the message without increasing the heartbeat interval or reducing my message size?
Update:
I have received another answer and comments after I posted my own answer. Thanks for the feedback. Just to clarify, I do not use AMQP for file transfer. Actually the data is in JSON message, some are simple and small but some contain complex information, include some free hand drawing. Besides saving the data at Data Center, we also save a copy of message at branch level via AMQP, for case connectivity to Data Center is not available.
So, the real questions here are a bit more fundamental, and those are: (1) is it appropriate to perform a large file transfer via AMQP, and (2) what purpose does the heartbeat serve?
Heartbeats
First off, let's address the heartbeat question. As the RabbitMQ documentation clearly states, the purpose of the heartbeat is "to ensure that the application layer promptly finds out about disrupted connections."
The reason for this is simple. In an ordinary AMQP usage, there may be several seconds, even minutes between the arrival of successive messages. Without data being exchanged across a TCP session, many firewalls and other networking equipment automatically close ports to lower exposure to the enterprise network. Heartbeats further help mitigate a fundamental weakness in TCP, which is the difficulty of detecting a dropped connection. Networks experience failure, and TCP is not always able to detect that on its own.
So, the bottom line here is that, while you're transferring a large message, the connection is active and the heartbeat function serves no useful purpose, and can cause you trouble. It's best to turn it off in such cases.
AMQP For Moving Large Files?
The second issue, and I believe the more important question, is how large files should be dealt with. To answer this, let's first consider what a message queue does: sending messages -- small bits of data which communicate something to another computer system. The operative word here is small. Messages typically contain one of four things: 1. commands (go do something), 2. events (something happened), 3. requests (give me some data), and 4. responses (here is your data). A full discussion of these is beyond the scope of this answer, but suffice it to say that each of these can generally be composed of a small message of less than 100kB.
Indeed, the AMQP protocol, which underlies RabbitMQ, is a fairly chatty protocol. It requires large messages be divided into multiple segments of no more than 131kB. This can add a significant amount of overhead to a large file transfer, especially when compared to other file transfer mechanisms (FTP, for instance). Secondly, the message has to be fully processed by the broker before it is made available in a queue, and it ties up valuable resources on the broker while this is being done. For one, the whole message must fit into RAM on the broker due to its architecture. This solution may work for one client and one broker, but it will break quickly when scaling out is attempted.
Finally, compression is often desirable when transferring files - HTTP supports gzip compression automatically. AMQP does not. It is quite common in message-oriented applications to send a message containing a resource locator (e.g. URL) pointing to the larger data file, which is then accessed via appropriate means.
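The resource-locator approach mentioned above can be sketched as follows. This is a minimal sketch that uses a local directory as a stand-in for shared storage; the function names and the message fields (`location`, `sha256`, `size`) are hypothetical choices, not part of AMQP.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def store_payload(payload: bytes, directory: Path) -> dict:
    """Write the payload to shared storage and return a small reference message."""
    digest = hashlib.sha256(payload).hexdigest()
    path = directory / f"{digest}.bin"
    path.write_bytes(payload)
    return {"location": path.as_uri(), "sha256": digest, "size": len(payload)}


def load_payload(reference: dict, directory: Path) -> bytes:
    """Fetch the payload named by a reference message and verify its checksum."""
    data = (directory / f"{reference['sha256']}.bin").read_bytes()
    if hashlib.sha256(data).hexdigest() != reference["sha256"]:
        raise ValueError("payload does not match the checksum in the message")
    return data


with tempfile.TemporaryDirectory() as tmp:
    big = b"x" * 1_000_000
    reference = store_payload(big, Path(tmp))
    wire_message = json.dumps(reference)   # only this small JSON goes through the broker
    assert load_payload(json.loads(wire_message), Path(tmp)) == big
    print(len(big), "byte payload referenced by a", len(wire_message), "byte message")
```

The message on the queue stays a few hundred bytes regardless of payload size, so heartbeats and broker memory are no longer affected by the transfer.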
The moral of the story
As the adage goes: "to the man with a hammer, everything looks like a nail." AMQP is not a hammer- it's a precision scalpel. It has a very specific purpose, and narrow applicability within that purpose. Using it for something other than its intended purpose will lead to stability and reliability problems in whatever it is you are designing, and overall dissatisfaction with your end product.
Will I also face the disconnection issue even if I am using a single broker?
Yes
Do the heartbeat and the consumer use the same thread, so that while the consumer is consuming it is not possible to perform a heartbeat?
I can't confirm the threading, but from what I observe, while the Java RabbitMQ consumer is consuming a message it does not perform heartbeat acknowledgement. If the time to consume is longer than 3 x the heartbeat timeout (due to a large message and/or low bandwidth), the MQ server will close the AMQP connection.
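The numbers from the question can be checked against that observation with a little arithmetic. The factor of 3 is taken from the observation above and is treated here as an adjustable assumption; the function names are hypothetical.

```python
def transfer_seconds(message_bytes, bandwidth_bits_per_sec):
    """Best-case time to move a message over a link, ignoring protocol overhead."""
    return message_bytes * 8 / bandwidth_bits_per_sec


def outlives_heartbeat(message_bytes, bandwidth_bits_per_sec,
                       heartbeat_sec, missed_factor=3):
    """True if the transfer takes longer than missed_factor heartbeat intervals."""
    needed = transfer_seconds(message_bytes, bandwidth_bits_per_sec)
    return needed > missed_factor * heartbeat_sec


size = 100 * 1024 * 1024   # 100 MB, as in the question
link = 512 * 1024          # 512 Kbps
print(transfer_seconds(size, link))          # 1600.0, i.e. 200*8 seconds
print(outlives_heartbeat(size, link, 10))    # True: 1600 s is far beyond 3 x 10 s
```

With these figures the heartbeat interval would need to be on the order of several hundred seconds before the transfer fits inside the window.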
If so, what can I do to consume the message without increasing the heartbeat interval or reducing my message size?
I resolved my issue by increasing the heartbeat interval. No further code change is required.

How to load balancing ActiveMQ with persistent message

I have a middleware based on Apache Camel which does a transaction like this:
from("amq:job-input")
    .to("inOut:businessInvoker-one") // Into business processor
    .to("inOut:businessInvoker-two")
    .to("amq:job-out");
Currently it works perfectly. But I can't scale it up, let's say from 100 TPS to 500 TPS. I already
Raised the concurrent consumers settings and used empty businessProcessor
Configured JAVA_XMX and PERMGEN
to speed up the transaction.
According to the ActiveMQ web console, many messages are waiting to be processed in the 500 TPS scenario. I guess one solution is to scale ActiveMQ up, so I want to use multiple brokers in a cluster.
According to http://fuse.fusesource.org/mq/docs/mq-fabric.html (Section "Topologies"), configuring ActiveMQ in clustering mode is suitable for non-persistent message. IMHO, it is true that it's not suitable, because all running brokers use the same store file. But, what about separating the store file? Now it's possible right?
Could anybody explain this? If it's not possible, what is the best way to load balance persistent message?
Thanks
You can share the load of persistent messages by creating 2 master/slave pairs. The master and slave share their state either through a database or a shared filesystem, so you need to duplicate that setup.
Create 2 master/slave pairs, and configure so-called "network connectors" between the 2 pairs. This will double your performance without risk of losing messages.
See http://activemq.apache.org/networks-of-brokers.html
This answer relates to a version of the question before the Camel details were added.
It is not immediately clear what exactly it is that you want to load balance and why. Messages across consumers? Producers across brokers? What sort of concern are you trying to address?
In general you should avoid using networks of brokers unless you are trying to address some sort of geographical use case, have too many connections for a single broker to handle, or if a single broker (which could be a pair of brokers configured in HA) is not giving you the throughput that you require (in 90% of cases it will).
In a broker network, each node has its own store and passes messages around by way of a mechanism called store-and-forward. Have a read of Understanding broker networks for an explanation of how this works.
ActiveMQ already works as a kind of load balancer by distributing messages evenly in a round-robin fashion among the subscribers on a queue. So if you have 2 subscribers on a queue, and send it a stream of messages A, B, C, D, one subscriber will receive A & C, while the other receives B & D.
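The round-robin distribution described above can be modelled in a few lines. This is a toy illustration of the dispatch order only, with hypothetical subscriber names; it does not account for prefetch or consumers that process at different speeds.

```python
from itertools import cycle


def round_robin(messages, subscribers):
    """Assign each message to the next subscriber in turn."""
    received = {name: [] for name in subscribers}
    for message, name in zip(messages, cycle(subscribers)):
        received[name].append(message)
    return received


print(round_robin(["A", "B", "C", "D"], ["sub1", "sub2"]))
# {'sub1': ['A', 'C'], 'sub2': ['B', 'D']}
```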
If you want to take this a step further and group related messages on a queue so that they are processed consistently by only one subscriber, you should consider Message Groups.
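The idea behind Message Groups is that every message carrying the same group identifier (the `JMSXGroupID` header) goes to the same subscriber. The sketch below is a simplified model of that stickiness, assigning each new group to the next subscriber in turn; it is not ActiveMQ's actual assignment algorithm, and the function and subscriber names are hypothetical.

```python
from itertools import cycle


def group_dispatch(messages, subscribers):
    """Route (group_id, body) pairs so that each group sticks to one subscriber."""
    owner = {}                          # group_id -> subscriber that owns it
    next_subscriber = cycle(subscribers)
    received = {name: [] for name in subscribers}
    for group_id, body in messages:
        if group_id not in owner:
            owner[group_id] = next(next_subscriber)  # first message picks the owner
        received[owner[group_id]].append(body)
    return received


stream = [("order-1", "A"), ("order-2", "B"), ("order-1", "C"), ("order-1", "D")]
print(group_dispatch(stream, ["sub1", "sub2"]))
# {'sub1': ['A', 'C', 'D'], 'sub2': ['B']}
```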
Adding consumers might help up to a point (depending on the number of cores/CPUs your server has). Adding threads beyond the point where your "Camel server" is utilizing all available CPU for the business processing makes no sense and can be counterproductive.
Adding more ActiveMQ machines is probably needed. You can use an ActiveMQ "network" to communicate between instances that have separate persistence files. It should be straightforward to add more brokers and put them into a network.
Make sure you performance test along the road to make sure what kind of load the broker can handle and what load the camel processor can handle (if at different machines).
When you do persistent messaging - you likely also want transactions. Make sure you are using them.
If all running brokers use the same store file or tx-supported database for persistence, then only the first broker to start will be active, while others are in standby mode until the first one loses its lock.
If you want to load-balance your persistence, there are two approaches you could try:
Configure several brokers in network-bridge mode, then send messages to any one of them and consume messages from more than one of them. This load-balances both the brokers and the persistence stores.
Override the persistenceAdapter and use database-sharding middleware (such as tddl: https://github.com/alibaba/tb_tddl) to store the messages by partition.
Your first step is to increase the number of workers that are processing from ActiveMQ. The way to do this is to add the ?concurrentConsumers=10 attribute to the starting URI. The default behaviour is that only one thread consumes from that endpoint, leading to a pile up of messages in ActiveMQ. Adding more brokers won't help.
Secondly what you appear to be doing could benefit from a Staged Event-Driven Architecture (SEDA). In a SEDA, processing is broken down into a number of stages which can have different numbers of consumer on them to even out throughput. Your threads consuming from ActiveMQ only do one step of the process, hand off the Exchange to the next phase and go back to pulling messages from the input queue.
Your route can therefore be rewritten as 2 smaller routes:
from("activemq:input?concurrentConsumers=10").id("FirstPhase")
    .process(businessInvokerOne)
    .to("seda:invokeSecondProcess");
from("seda:invokeSecondProcess?concurrentConsumers=20").id("SecondPhase")
    .process(businessInvokerTwo)
    .to("activemq:output");
The two stages can have different numbers of concurrent consumers so that the rate of message consumption from the input queue matches the rate of output. This is useful if one of the invokers is much slower than another.
The seda: endpoint can be replaced with another intermediate activemq: endpoint if you want message persistence.
Finally to increase throughput, you can focus on making the processing itself faster, by profiling the invokers themselves and optimising that code.
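Independently of Camel, the staged idea can be illustrated with plain threads and in-memory queues: each stage has its own inbox and its own number of workers. This is a minimal sketch of the pattern in Python rather than the Camel implementation; the function names and worker counts are hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel telling a worker to exit


def run_stage(inbox, outbox, handler, workers):
    """Start `workers` threads that apply `handler` to items taken from `inbox`."""
    def loop():
        while True:
            item = inbox.get()
            if item is STOP:
                break
            outbox.put(handler(item))
    threads = [threading.Thread(target=loop) for _ in range(workers)]
    for t in threads:
        t.start()
    return threads


def run_pipeline(jobs, first, second, first_workers=2, second_workers=4):
    """Push jobs through two stages with independent worker counts."""
    stage1_in, stage2_in, done = queue.Queue(), queue.Queue(), queue.Queue()
    t1 = run_stage(stage1_in, stage2_in, first, first_workers)
    t2 = run_stage(stage2_in, done, second, second_workers)
    for job in jobs:
        stage1_in.put(job)
    # Shut down stage by stage so no item is left behind.
    for _ in t1:
        stage1_in.put(STOP)
    for t in t1:
        t.join()
    for _ in t2:
        stage2_in.put(STOP)
    for t in t2:
        t.join()
    results = []
    while not done.empty():
        results.append(done.get())
    return sorted(results)  # completion order is not deterministic, so sort


print(run_pipeline(range(5), lambda x: x + 1, lambda x: x * 2))
# [2, 4, 6, 8, 10]
```

Giving the slower stage more workers is what evens out throughput; the in-memory queue between the stages plays the role of the `seda:` endpoint and, like it, is not persistent.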

ActiveMQ: Reject connections from producers when persistent store fills

I would like to configure my ActiveMQ producers to failover (I'm using the Stomp protocol) when a broker reaches a configured limit. I want to allow consumers to continue consumption from the overloaded broker, unabated.
Reading ActiveMQ docs, it looks like I can configure ActiveMQ to do one of a few things when a broker reaches its limits (memory or disk):
Slow down messages using producerFlowControl="true" (by blocking the send)
Throw exceptions when using sendFailIfNoSpace="true"
Neither of the above, in which case..I'm not sure what happens? Reverts to TCP flow control?
It doesn't look like any of these things are designed to trigger a producer failover. A producer will failover when it fails to connect but not, as far as I can tell, when it fails to send (due to producer flow control, for example).
So, is it possible for me to configure a broker to refuse connections when it reaches its limits? Or is my best bet to detect the slowdown on the producer side, and to manually reconfigure my producers to use a different broker at that time?
Thanks!
Your best bet is to use sendFailIfNoSpace, or better sendFailIfNoSpaceAfterTimeout. This will throw an exception up to your client, which can then attempt to resend the message to another broker at the application level (though you can encapsulate this logic over the top of your Stomp library, and use this facade from your code). Though if your ActiveMQ setup is correctly wired, your load both in terms of production and consumption should be more or less evenly distributed across your brokers, so this feature may not buy you a great deal.
You would probably get a better result if you concentrated on fast consumption of the messages, and increased the storage limits to smooth out peaks in load.
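The application-level resend described above could be wrapped in a small facade along these lines. This is a sketch under assumptions: `BrokerFull` stands in for whatever error your Stomp library surfaces when the broker rejects a send, and each `send` callable is a hypothetical wrapper around one broker connection.

```python
class BrokerFull(Exception):
    """Stand-in for the error a client sees when the broker rejects a send."""


def send_with_failover(message, senders):
    """Try each (name, send) pair in order; return the name of the broker that accepted."""
    errors = []
    for name, send in senders:
        try:
            send(message)
            return name
        except BrokerFull as exc:
            errors.append((name, exc))  # remember the failure and try the next broker
    raise RuntimeError(f"all brokers rejected the message: {errors}")
```

Because only the producer path goes through the facade, consumers stay connected to the overloaded broker and keep draining it.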

How can I tell a WAS service polling an MSMQ to wait when busy?

I'm working on a system which amongst other things, runs payroll, a heavy load process. It is likely that soon, there may be so many requests to run payroll at peak times that the batch servers will be overwhelmed.
I'm looking to put together a proof of concept to cope with this by using MSMQ (probably replacing it with a commercial solution like NServiceBus later). I am using this example as a basis. I can see how to set up the bindings and stick it together, but I still need a way to tell the subscribers hosted by WAS to only process the 'run heavy payroll process' message if they are not busy. Otherwise the messages on the queue will get picked up straight away and we have the same problem as before.
Can I set up the subscribing service to say, "I'm busy, I can't take the message, leave it on the queue"? Does the queue need to be transactional?
If you're using WCF then there's no way to conditionally activate the channel thereby leaving the messages on the queue for later.
A better solution is to host the message receiver in a completely different process, for example as a windows service. These can then be enabled/disabled according to your service window requirement.
You also get the additional benefit of being able to very easily scale out the message receivers to handle greater loads (by hosting more instances of your receiver).
One way to do this is to have 2 queues: your polling always checks the high-priority queue first, and only if there are no items in that queue does it take an item from the other.
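The two-queue polling order can be sketched with in-memory queues standing in for the MSMQ queues. The function name and queue contents are hypothetical.

```python
import queue


def next_message(high, low):
    """Return the next message, preferring `high`; None if both queues are empty."""
    for source in (high, low):
        try:
            return source.get_nowait()
        except queue.Empty:
            continue
    return None


high, low = queue.Queue(), queue.Queue()
low.put("run heavy payroll process")
high.put("small urgent job")
print(next_message(high, low))  # small urgent job
print(next_message(high, low))  # run heavy payroll process
print(next_message(high, low))  # None
```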