I'm not quite sure how to go about finding the answer to this type of question. In Erlang, each process has a message queue. There is the possibility that a message sent to a process may not match any pattern of that specific process. If that's the case, the process leaves the message in the queue and goes on to check the other messages. My question is:
Doesn't this create a minor memory leak?
Hypothetically, a process could keep receiving messages it can't match and grow and grow, eventually causing an issue. How does Erlang deal with this situation? I know Erlang has an implementation of timeouts, but if these messages didn't have a timeout, would this cause a problem? Is there any sort of default garbage collection?
Doesn't this create a minor memory leak?
Yes, a selective receive can cause a process to run out of memory.
How does Erlang deal with this situation?

It doesn't, automatically: unmatched messages sit in the mailbox until something consumes them. The usual defense is a catch-all clause at the end of the receive:
receive
    Pattern1 -> do_something();       % first clause
    Pattern2 -> do_something_else();  % second clause
    _Other -> ok                      % catch-all: discards anything unmatched
end
Or, after some time passes, flush the mailbox entirely:
myflush() ->
    receive
        _Any -> myflush()
    after 0 ->  % if at any time there are no messages in the mailbox, this executes
        ok
    end.
Related
Our system has a bunch of consumers that use Rabbit to consume messages for long-running tasks. Currently we ack at the end of processing, so that if the consumer crashes, the message gets requeued. What we want is for a consumer to work on only one message at a time, with no prefetch, so that another consumer can work on the next message; and if a crash occurs, we do not requeue, but we'll have our own monitor that will decide whether we need to re-run on a larger EC2 instance or whatever. It looks like we can get CLOSE to this by acking at the start of processing with a prefetch of 1, but that still leaves 1 message in the queue that could have been handled by another consumer. Apparently setting prefetch to 0 makes no sense according to the Rabbit devs (I don't understand why), so another option would be to still ack only on completion, so that a prefetch doesn't occur, but somehow DON'T requeue on crash.
If we are swimming upstream, so to speak, then I know we'll have to come up with another plan. But I don't understand why the desire for a consumer to work on only one thing at a time (and not prefetch the next item of work), and to not requeue on crash, is so odd.
Consider using one of the RabbitTemplate receive() or receiveAndConvert() methods instead; that's a better model for this type of workload: fetching records as needed instead of having them pushed into your app.
Recently, I have encountered a confusing problem: my RabbitMQ always blocks because many messages don't get acked, but I don't know exactly which messages cause the blocking. So I want to ask: is there any way to find which messages cause the block?
If a message is not acknowledged (rejected) with requeue it goes back to its original position in the queue (or closer to the head of the queue).
This means that your next dequeue would return the same message which may be problematic. If this is your case you could requeue the message and ack the original in order to "move" the problematic message to the back of the queue.
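A broker-free sketch of that "move to the back" trick, with a deque standing in for the RabbitMQ queue (in real code you would publish a copy of the message to the same queue and then basic.ack the original delivery):

```python
from collections import deque

def move_to_back(queue):
    msg = queue.popleft()   # the problematic message at the head
    queue.append(msg)       # the re-published copy lands at the tail

queue = deque(["bad", "good-1", "good-2"])
move_to_back(queue)
# the next dequeue now returns "good-1" instead of the same "bad" message
```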
You may also have an additional problem of thrashing your queue when you have only "problematic" messages remaining.
Perhaps this may shed some light: Where does a BasicReject with requeue actually go?
I have observed RabbitMQ "stuck" with unacked messages. The queue shows a consumer which no longer exists, and I assume what's happening is that RabbitMQ is continuing to deliver messages to that consumer. They show as an ever-increasing count of unacked messages. I'm doing this in PHP with php-amqplib.
I can produce the problem by killing the consumer process (control-C on command line).
I tried specifying a heartbeat of 3 seconds and tried keep-alive both true and false. With heartbeat, the consumer will eventually fail:
Exception fwrite(): send of 573 bytes failed with errno=32 Broken pipe
PhpAmqpLib\Wire\IO\StreamIO->error_handler(8, 'fwrite(): send ...',
php-amqplib/PhpAmqpLib/Wire/IO/StreamIO.php(281): fwrite(Resource id #176, '\x01\x00\x01\x00\x00\x00\x15\x00<\x00(\x00\x00\fb...', 8192)
Issue #374 might relate: https://github.com/php-amqplib/php-amqplib/issues/374
The consumer is consuming from multiple queues, but I believe that shouldn't matter.
The problem I'm trying to solve is that RabbitMQ continues to think that a consumer exists when it doesn't, with the result that RabbitMQ delivers those messages nowhere, and they go unacknowledged. I'm looking for a way to get rid of that spurious connection so that those messages can be re-delivered to a live consumer. I think that's what heartbeat is for, but I haven't gotten it to work.
The first and most important thing to do in this case is to just "print" your message content and return true to the consumer, without running your real processing code. If you can "consume" the messages that way, the problem isn't in RabbitMQ but in your processing: probably you spend too much time before acknowledging the message, and RabbitMQ closes the connection.
I'm not saying that it's your case; I'm just trying to help debug the problem.
In my case I changed my approach to this problem: each message carried many product IDs (in my case), and acking took a long time because processing them hit the database. I resized my messages, and it worked well after that.
You can also change the approach, for example by creating additional queues to spread these messages out. I can't be sure, but 90% of the time this is the problem.
You can read more about Detecting Dead TCP Connections with Heartbeats here
When I set up manual ack with RabbitMQ, how can I know whether the ack succeeded? If there is an exception before basic.ack when I have long operations to perform, the message will be sent to another consumer. How can I avoid that?
How can I avoid that?
You can't.
At some point it will happen and your code needs to deal with this scenario gracefully. This is typically done with idempotence in your message processing.
That is, you allow the message to be processed more than once (because it will happen), but you only make the underlying change to the system once.
A common / simple way of handling this is to have an ID associated with each message. Before processing the message, check to see if that ID is marked as complete in your database. If it's not, then process the message. When the message is processed, you update a database with that ID. That way, when (not if) you run into the scenario where a message is processed twice, you won't actually do the processing / system changes twice.
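That check-then-process flow can be sketched in a few lines of Python (hypothetical names; the `processed` set stands in for the database table of completed message IDs):

```python
processed = set()  # stand-in for a DB table of completed message IDs
changes = []       # the underlying system state we must not change twice

def handle(message_id):
    if message_id in processed:
        return "skipped"           # redelivery: already done, just ack
    changes.append(message_id)     # the real side effect, applied once
    processed.add(message_id)      # in production, do the change and the
                                   # mark in one transaction
    return "done"

first = handle("msg-42")    # "done"
second = handle("msg-42")   # redelivered: "skipped", no second change
```

Note the ordering: the ID is marked complete only after the change succeeds, so a crash in between causes a retry rather than a lost update.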
RabbitMQ ticks all the boxes for the project I am planning, save one. I would have different workers listening on a queue and it is important that they process the newest messages (i.e., latest sequence number) first (LIFO).
My application is such that newer messages pretty much obsolete older messages. If you have workers to spare you could still process the older messages but it is important the newer ones are done first.
After trawling the various forums and such, I can only see one solution, which is that for a client to process a message it should first:
consume all messages
re-order them according to the sequence number
re-submit to the queue
consume the first message
Ugly, and problematic if the client dies halfway through. But maybe somebody here has a better solution.
My research is based (in part) on:
http://groups.google.com/group/rabbitmq-discuss/browse_thread/thread/e79e77d86bc7a3b8?fwc=1
http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2010-July/007934.html
http://groups.google.com/group/rabbitmq-discuss/browse_thread/thread/e40d1069dcebe2cc
http://old.nabble.com/Priority-Queue-implementation-and-performance-td29946348.html
Note: the expected traffic of messages will roughly be in the range of 1 msg/hour for some queues and 100/minute for others. So nothing stellar.
Since there is no reply I guess I did my homework rather well ;)
Anyway, after discussing the requirements with the other stakeholders it was decided I can drop the LIFO requirement for now. We can worry about that when it comes to it.
A solution that we will probably end up adopting is for the worker to open a second queue that the master can use to let the worker know what jobs to ignore + provide additional control/monitoring information (which it looks like we will need anyway).
RabbitMQ's implementation of the AMQP 1.0 spec may also help here.
So I will mark this question as answered for now. Somebody else is still free to add or improve.
One possibility might be to use basic.get in a loop and wait for the response basic-ok.message-count to become zero (throwing away all other messages):
while (<get ok> = <call basic.get>) {
    if (<get ok>.message-count == 0) {
        // Now <get ok> is the most recent message on this queue
        break;
    } else if (<is get-empty>) {
        // Someone else got it
    }
}
Of course, you'd have to set up the message routing patterns on the broker such that one consumer throwing away messages doesn't mess with another. Try to avoid requeueing messages, as they will requeue at the top of the stack, making them look like the most recent.
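The drain loop above can be modeled without a broker (a deque stands in for the queue; an empty deque plays the role of get-empty, i.e. someone else consumed the messages):

```python
from collections import deque

def get_latest(queue):
    """Keep fetching and discarding messages until message-count reaches
    zero; the last message fetched is the most recent one on the queue."""
    latest = None
    while queue:                  # basic-ok.message-count still > 0
        latest = queue.popleft()  # fetch, throwing away older messages
    return latest                 # None models get-empty

q = deque(["old-1", "old-2", "newest"])
latest = get_latest(q)   # "newest"; the older messages were discarded
```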