There is an exclusive queue. The application crashed and did not close its connections correctly.
When the app restarts, it tries to declare new queues, but they already exist. I tried deleting them from the web admin and rabbitmqadmin, but it doesn't work.
Is it possible to delete an exclusive crashed queue without deleting the virtual host?
Yes, you can try this:
rabbitmqctl eval '{ok, Q} = rabbit_amqqueue:lookup(rabbit_misc:r(<<"/">>, queue, <<"springCloudHystrixStream.anonymous.rq_ghYeMR7mbYJ8PzMyF8Q">>)), rabbit_amqqueue:delete_crashed(Q).'
Here / is your virtual host, and springCloudHystrixStream.anonymous.rq_ghYeMR7mbYJ8PzMyF8Q is your exclusive crashed queue.
I am using ServiceStack 5.0.2 and Redis 3.2.100 on Windows.
I have several nodes with an active Pub/Sub subscription and a few publishes per second.
I noticed that if the Redis service restarts while there is no physical network connection (so one of the clients cannot reach the Redis service), that client stops receiving any messages after the network recovers. Let's call it a "zombie subscriber": it thinks it is still operational but never actually receives a message; the client believes it still has a connection, while the same connection on the server is closed.
The problem is that no exception is thrown in RedisSubscription.SubscribeToChannels, so I am not able to detect the issue in order to resubscribe.
I have also analyzed RedisPubSubServer and I think I have found the problem: in the described case RedisPubSubServer tries to restart (sends the stop command CTRL), but the "zombie subscriber" does not receive it, so no resubscription is made.
Can I change the node name from the RabbitMQ Management Console for a specific queue? I tried, but I think it was created when I started my app. Can I change it afterwards? My queue is on node RabbitMQ1 and my connection is on node RabbitMQ2, so I cannot read messages from that queue. Maybe I can change my connection node?
The node name is not just a label; it is where the queue is physically located. In fact, by default queues are not distributed/mirrored but are created on the node the application connected to, as you correctly guessed.
However, you can make your queue mirrored using policies, so you can consume messages from both servers.
https://www.rabbitmq.com/ha.html
You can change the policy for the queues by using the rabbitmqctl command or from the management console, admin -> policies.
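For example, a policy that mirrors every queue whose name starts with myqueue across all nodes could look like the following (the ha-all policy name and the ^myqueue pattern are placeholders for illustration):
rabbitmqctl set_policy ha-all "^myqueue" '{"ha-mode":"all"}'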
You need to synchronize the queue in order to clone the old messages to the mirror queue with:
rabbitmqctl sync_queue <queue_name>
Newly published messages will end up in both copies of the queue and can be consumed from either node (the same message won't be consumed from both).
I wish to run an experiment in which the publisher loses connection to the broker, enqueues messages in its own local queue, and then, when it regains connectivity, sends all its queued messages to the broker. How can I do this, since if I close the connection I can no longer send (it raises an exception)? A trick I can think of is to use a network of two brokers and simulate the above by breaking the connection between the two brokers. Is there an API call that I can use to do this?
This is very much like Facebook Messenger or WhatsApp acting as a publisher: queuing our outgoing messages while we are offline and sending them once we are connected.
There are plenty of ways to break the connection in order to test; here is a non-comprehensive list:
Make a script that sets/unsets a firewall rule in your environment to block the connection port (see the example after this list)
If you are working with VMs, you can suspend/resume the one running ActiveMQ; you can even automate it with tools like Vagrant (vagrant suspend, then vagrant up)
Tweak the connection manually by accessing the ActiveMQ JMX interface
Develop an ActiveMQ plugin able to drop connections on demand (or maybe one already exists?)
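For the firewall approach, a minimal sketch on Linux could be the following (61616 is the default ActiveMQ OpenWire port; adjust it to your setup):
# block the broker port to simulate a network outage
iptables -A OUTPUT -p tcp --dport 61616 -j DROP
# restore connectivity by removing the rule again
iptables -D OUTPUT -p tcp --dport 61616 -j DROP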
Now, in order to get the behavior you want, there are two options:
1) Make sure your connection uses the failover transport so it can be re-established, and store your messages on disk before sending them with your producer (see the example URI below).
2) Produce to a local broker embedded in your app, and connect that broker to the remote broker.
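For option 1, the failover transport is configured in the broker URL; a minimal sketch (the broker host and port are placeholders) could be:
failover:(tcp://remote-broker:61616)?initialReconnectDelay=100
With such a URL the ActiveMQ client keeps retrying the connection in the background instead of failing immediately.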
We had a network partition and RabbitMQ ended up in "split brain".
After the cluster recovered, I have a queue that I can't delete. In the management interface the queue is just listed with "?", and I'm unable to delete it from the management interface or from the command line.
I have tried to remove the node "sh-mq-cl1a-04" from the cluster, but the queue remains in the cluster.
I had a similar issue where I couldn't delete some queues, and the solution listed here worked for me: https://community.pivotal.io/s/article/Queue-cant-be-deleted-or-purged-in-RabbitMQ
I ssh'd onto one of the nodes in my cluster (the one where the queue is hosted is probably best), sudo'd as root, and then ran this command:
rabbitmqctl eval '{ok, Q} = rabbit_amqqueue:lookup(rabbit_misc:r(<<"VHOST">>, queue, <<"QUEUE">>)), rabbit_amqqueue:delete_crashed(Q).'
You'll need to replace VHOST with your virtual host name and QUEUE with your queue name (which I realize might be tricky to figure out in your situation).
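If you're not sure of the exact queue name, listing the queues with their state can help spot the crashed one (assuming a reasonably recent rabbitmqctl; replace VHOST as above):
rabbitmqctl list_queues -p VHOST name state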
Pre-version 3, the recommendation was to run a timeout manager as a standalone process on your cluster, beside the distributor (as detailed here: http://support.nservicebus.com/customer/portal/articles/965131-deploying-nservicebus-in-a-windows-failover-cluster).
After the inclusion of the timeout manager as a satellite assembly, what is the correct way to use it when scaling out with the distributor?
Should each worker of Service A run with timeout manager enabled or should only the distributor process for service A be configured to run a timeout manager for service A?
If each worker runs it, do they share the same Raven instance for storing the timeouts? (And if so, how do you make sure that two or more workers don't pick up the same expired timeout at the same time?)
Allow me to answer this clearly myself.
After a lot of digging and with help from Andreas Öhlund on the NSB team (http://tech.groups.yahoo.com/group/nservicebus/message/17758), the correct answer to this question is:
As Udi Dahan mentioned, by design ONLY the distributor/master node should run a timeout manager in a scale-out scenario.
Unfortunately in early versions of NServiceBus 3 this is not implemented as designed.
You have the following 3 issues:
1) Running with the Distributor profile does NOT start a timeout manager.
Workaround:
Start the timeout manager on the distributor yourself by including this code on the distributor:
class DistributorProfileHandler : IHandleProfile<Distributor>
{
    public void ProfileActivated()
    {
        Configure.Instance.RunTimeoutManager();
    }
}
If you run the Master profile this is not an issue as a timeout manager is started on the master node for you automatically.
2) Workers running with the Worker profile DO each start a local timeout manager.
This is not as designed and messes up the polling against the timeout store and dispatching of timeouts. All workers poll the timeout store with "give me the imminent timeouts for MASTERNODE". Notice they ask for timeouts of MASTERNODE, not for W1, W2 etc. So several workers can end up fetching the same timeouts from the timeout store concurrently, leading to conflicts against Raven when deleting the timeouts from it.
The dispatching always happens through the LOCAL .timeouts/.timeoutsdispatcher queues, while it SHOULD go through the queues of the timeout manager on the MasterNode/Distributor.
Workaround (you'll need to do both):
a) Disable the timeout manager on the workers. Include this code on your workers:
class WorkerProfileHandler : IHandleProfile<Worker>
{
    public void ProfileActivated()
    {
        Configure.Instance.DisableTimeoutManager();
    }
}
b) Reroute NServiceBus on the workers to use the .timeouts queue on the MasterNode/Distributor.
If you don't do this, any call to RequestTimeout or Defer on the worker will die with an exception saying that you have forgotten to configure a timeout manager. Include this in your worker config:
<UnicastBusConfig TimeoutManagerAddress="{endpointname}.Timeouts#{masternode}" />
3) Erroneous "Ready" messages back to the distributor.
Because the timeout manager dispatches the messages directly to the workers' input queues without removing an entry from the available workers in the distributor storage queue, the workers send erroneous "Ready" messages back to the distributor after handling a timeout. This happens even if you have fixed 1 and 2, and it makes no difference whether the timeout was fetched from a local timeout manager on the worker or from one running on the distributor/MasterNode. The consequence is a build-up of an extra entry in the storage queue on the distributor for each timeout handled by a worker.
Workaround:
Use NServiceBus 3.3.15 or later.
In version 3+ we created the concept of a master node which hosts inside it all the satellites like the distributor, timeout manager, gateway, etc.
The master node is very simple to run - you just pass a /master flag to the NServiceBus.Host.exe process and it runs everything for you. So, from a deployment perspective, where you used to deploy the distributor, now you deploy the master node.
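So running the endpoint as a master node is simply (using the /master flag mentioned above; the executable name comes from the standard NServiceBus host package):
NServiceBus.Host.exe /master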