Distributor and worker end point queue in same machine - nservicebus

I am using NServiceBus 3.2.2.0 and trying to test a distributor and a worker on the same machine.
I noticed that the distributor creates the following queues:
EndPointQueue
EndPointQueue.distributor.control
EndPointQueue.distributor.storage
EndPointQueue.retries
EndPointQueue.timeouts
And the worker creates a new queue named something like:
EndPointQueue.5eb1d8d2-8274-45cf-b639-7f2276b56c0c
Is there a way to specify the worker's endpoint queue name, instead of having the worker create a queue by appending a random string to the endpoint queue name?

Since it doesn't really make sense to run a worker on the same machine as the master (distributor), NServiceBus assumes that you're doing this for test purposes only and generates this kind of queue name.
In a true distributed scenario where the worker is running on its own box, it will have the same queue name as the master. The whole idea is that you shouldn't have to make any code or config changes to go from a single machine to a scaled-out deployment.

Ignite : Persist until server stops

We are using Ignite's distributed data structure, IgniteQueue. The server details are as follows:
Server 1: Initializes the queue and continuously runs.
Server 2: Producer. Produces contents to the queue. Started now and then
Server 3: Consumer. Consumes contents from the queue. Started now and then
Issue: When there is a time gap of 10 minutes between the producer and the consumer, the data in the queue is lost.
Could you please provide the correct configuration (eviction) that keeps the contents of the queue until Server 1 is stopped?
Ultimately there shouldn't be any data loss.
There is no eviction for queues. By default there are also no backups, so most likely when you start and stop servers you cause rebalancing and, eventually, the loss of some entries. I suggest the following:
Start the consumer and producer as clients rather than servers. The server topology that holds the data should always be as stable as possible.
Use CollectionConfiguration#setBackups to configure one or more backups for the underlying cache used for the queue. This helps preserve the state even if one of the servers fails.
Done as per Valentin Kulichenko's suggestion, as below:
Server 1: Initializes the queue and continuously runs.
Client 1: Producer. Produces contents to the queue. Started now and then
Client 2: Consumer. Consumes contents from the queue. Started now and then
Code to start an Ignite client:
import org.apache.ignite.Ignition
Ignition.setClientMode(true)
val ignite = Ignition.start()

celery multiple workers but one queue

I am new to Celery and Redis.
I started my Redis server with redis-server.
Celery was run using this command:
celery -A proj worker
There is no other configuration. However, I realised that when I have a long-running job in Celery, it does not process another task from the queue until the long-running task has completed. My understanding is that since I have 8 cores on my CPU, I should be able to process 8 tasks concurrently, since the default value of -c is the number of cores?
Am I missing something here?
Your problem is a classic one; everyone who has had long-running tasks has run into it.
The root cause is that Celery tries to optimize your execution flow by reserving some tasks for each worker in advance. If one of those tasks is long-running, the others reserved behind it are blocked until it finishes. This reservation is known as the 'prefetch count', and the default exists because Celery is set up for short tasks out of the box.
Another related setting is 'late ack'. By default, a worker takes a task from the queue and immediately sends an acknowledgement, after which the broker removes the task from the queue. But this means the worker is free to prefetch more messages. With 'late ack' enabled, the worker sends the acknowledgement only after the task has completed.
That is the short version. You may read more about prefetch and late ack.
As for the solution - just use these settings (celery 4.x):
task_acks_late = True
worker_prefetch_multiplier = 1
or for previous versions (2.x - 3.x):
CELERY_ACKS_LATE = True
CELERYD_PREFETCH_MULTIPLIER = 1
Also, starting the worker with the -Ofair option has a similar effect.
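To see why the settings above matter, here is a minimal toy simulation. It is not Celery's actual scheduler: the function name simulate and the parameter reserve (how many tasks a worker may hold at once, running plus waiting) are made up for illustration. It only shows how a short task reserved behind a long one gets stuck while another worker sits idle.

```python
import heapq
from collections import deque


def simulate(durations, workers, reserve):
    """Toy model: return {task_index: finish_time}.

    `reserve` is the maximum number of tasks one worker may hold at a time
    (the running task plus tasks waiting in its local buffer).
    """
    broker = deque(enumerate(durations))        # tasks still on the broker
    local = [deque() for _ in range(workers)]   # per-worker reserved tasks
    busy = [False] * workers
    events = []                                 # heap of (finish_time, worker, task)
    finished = {}
    now = 0

    def dispatch():
        # The broker hands out tasks round-robin while any worker has a free slot.
        handed = True
        while broker and handed:
            handed = False
            for w in range(workers):
                if broker and len(local[w]) + busy[w] < reserve:
                    local[w].append(broker.popleft())
                    handed = True
        # Idle workers start the next task from their own buffer only.
        for w in range(workers):
            if not busy[w] and local[w]:
                task, duration = local[w].popleft()
                busy[w] = True
                heapq.heappush(events, (now + duration, w, task))

    dispatch()
    while events:
        now, w, task = heapq.heappop(events)
        finished[task] = now
        busy[w] = False
        dispatch()
    return finished


# One long task (10 time units) followed by three short ones, two workers.
eager = simulate([10, 1, 1, 1], workers=2, reserve=2)  # tasks reserved ahead
fair = simulate([10, 1, 1, 1], workers=2, reserve=1)   # one task at a time
print(eager[2])  # 11: task 2 waited behind the long task on worker 0
print(fair[2])   # 2: task 2 was picked up by the free worker
```

In this model, reserve=1 plays the role of task_acks_late = True together with worker_prefetch_multiplier = 1: a worker only receives a new task once it is free to run it.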

Could someone help me to understand difference between static queue and dynamic queue?

While working with message queues, I came across the terms static queue and dynamic queue.
Can anyone tell me the difference?
A static queue is one that is defined ahead of time and the queue definition persists in the environment.
A dynamic queue is created on demand. Of these there are two varieties in IBM MQ. A temporary dynamic queue is created on demand and is deleted when the program that created it disconnects. A permanent dynamic queue is one that is created on demand but persists in the environment after the program which created it disconnects.
For example, a temporary dynamic queue is useful for catching replies in a request/reply scenario. The queue exists only so long as the application making requests is connected. When the program disconnects, the queue goes away so there is no need for the administrator to manually clean it up.
A permanent dynamic queue is useful for things like durable subscriptions. When a subscription is created, the queue needs to be unique and the overhead of having to define it ahead of time is excessive. So we let the application create it dynamically but also let the queue hang around when the program is offline in order to collect publications. Normally, the application deletes the queue when it is no longer needed so that the administrator doesn't need to.
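The lifecycle differences described above can be sketched with a small in-memory model. This is only an illustration, not the IBM MQ API: the class QueueManager and its methods open_dynamic and disconnect are hypothetical names.

```python
class QueueManager:
    """Toy in-memory model of static and dynamic queues (not the IBM MQ API)."""

    def __init__(self, static_queues):
        # Static queues are defined ahead of time and always persist.
        self.queues = {name: "static" for name in static_queues}
        self._temp_owner = {}   # temporary dynamic queue name -> owning client
        self._counter = 0

    def open_dynamic(self, client, prefix, permanent=False):
        """Create a queue on demand with a generated, unique name."""
        self._counter += 1
        name = f"{prefix}.{self._counter:04d}"
        self.queues[name] = "permanent-dynamic" if permanent else "temporary-dynamic"
        if not permanent:
            self._temp_owner[name] = client
        return name

    def disconnect(self, client):
        """Delete the temporary dynamic queues created by this client."""
        for name, owner in list(self._temp_owner.items()):
            if owner == client:
                del self._temp_owner[name]
                del self.queues[name]


qm = QueueManager(static_queues=["ORDERS"])
reply_q = qm.open_dynamic("app1", "REPLY")               # temporary dynamic
sub_q = qm.open_dynamic("app1", "SUB", permanent=True)   # permanent dynamic
qm.disconnect("app1")
print(sorted(qm.queues))  # ['ORDERS', 'SUB.0002']
```

After the client disconnects, only the reply queue disappears; the static queue and the permanent dynamic queue remain until something deletes them explicitly.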

How load balancer works in RabbitMQ

I am new to RabbitMQ, so please excuse me for trivial questions:
1) In case of clustering in RabbitMQ, if a node fails, load shift to another node (without stopping the other nodes). Similarly, we can also add new fresh nodes to the existing cluster without stopping existing nodes in cluster. Is that correct?
2) Assume that we start with a single rabbitMQ node, and create 100 queues on it. Now producers started sending message at faster rate. To handle this load, we add more nodes and make a cluster. But queues exist on first node only. How does load balanced among nodes now? And if we need to add more queues, on which node we should add them? Or can we add them using load balancer.
Thanks In Advance
1) In case of clustering in RabbitMQ, if a node fails, load shift to another node (without stopping the other nodes). Similarly, we can also add new fresh nodes to the existing cluster without stopping existing nodes in cluster. Is that correct?
If the node on which the queue was created fails, RabbitMQ will elect a new master for that queue in the cluster, as long as mirroring is enabled for the queue. Clustering provides HA based on a policy that you can define.
2) Assume that we start with a single rabbitMQ node, and create 100 queues on it. Now producers started sending message at faster rate. To handle this load, we add more nodes and make a cluster. But queues exist on first node only. How does load balanced among nodes now?
The load is not balanced. The distributed cluster provides HA and not load balancing. Your requests will be redirected to the node in the cluster on which the queue resides.
And if we need to add more queues, on which node we should add them? Or can we add them using load balancer.
That depends on your use case. Some folks use a round robin and create queues on separate nodes.
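As a rough sketch of the round-robin idea, the helper below only plans which node each new queue would be created on. The function name and the node names are assumptions for illustration; actually creating a queue on a given node would still require a client to connect to that node and declare it there.

```python
from itertools import cycle


def assign_queues_round_robin(queue_names, nodes):
    """Map each queue name to a node, cycling through the nodes in order."""
    node_cycle = cycle(nodes)
    return {queue: next(node_cycle) for queue in queue_names}


# Hypothetical node names; substitute the nodes of your own cluster.
placement = assign_queues_round_robin(
    [f"queue-{i}" for i in range(6)],
    ["rabbit@node1", "rabbit@node2", "rabbit@node3"],
)
for queue, node in placement.items():
    print(queue, "->", node)
```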
In summary
For HA use mirroring in the cluster.
To balance load across nodes, use a LB to distribute across Queues.
If you'd like to load balance the queue itself take a look at Federated Queues. They allow you to fetch messages on a downstream queue from an upstream queue.
Let me try to answer your questions; these are issues that most developers run into.
Question 1) In case of clustering in RabbitMQ, if a node fails, load shift to another node (without stopping the other nodes). Similarly, we can also add new fresh nodes to the existing cluster without stopping existing nodes in cluster. Is that correct?
Answer: Absolutely correct (if RabbitMQ is running on a single host), but RabbitMQ queues behave differently in a cluster. By default, a queue lives on only one node in the cluster. However, RabbitMQ 2.6.0 introduced a built-in active-active redundancy option for queues: mirrored queues. Declaring a mirrored queue is just like declaring a normal queue, except that you pass an extra argument called x-ha-policy. Set to all, it tells Rabbit that you want the queue to be mirrored across all nodes in the cluster. This means that if a new node is added to the cluster after the queue is declared, it’ll automatically begin hosting a slave copy of the queue.
Question 2) Assume that we start with a single rabbitMQ node, and create 100 queues on it. Now producers started sending message at faster rate. To handle this load, we add more nodes and make a cluster. But queues exist on first node only. How does load balanced among nodes now? And if we need to add more queues, on which node we should add them? Or can we add them using load balancer.
This question has multiple sub-questions.
How does load-balanced among nodes now?
Set to all, x-ha-policy tells Rabbit that you want the queue to be mirrored across all nodes in the cluster. This means that if a new node is added to the cluster after the queue is declared, it’ll automatically begin hosting a slave copy of the queue.
on which node we should add them?
See the answer above.
can we add them using load balancer?
No, but yes (you would have to call the RabbitMQ API from within the LB, which is not a best-practice approach). A load balancer is used to build a resilient messaging infrastructure: your cluster nodes are the servers behind the load balancer, and your producers and consumers are the customers.

Temporary queue made in Celery

I am using Celery with RabbitMQ. Lately, I have noticed that a large number of temporary queues are being created.
So I experimented and found that when a task fails (that is, a task raises an Exception), a temporary queue with a random name (like c76861943b0a4f3aaa6a99a6db06952c) is created and the queue remains.
Some properties of the temporary queue as found in rabbitmqadmin are as follows -
auto_delete : True
consumers : 0
durable : False
messages : 1
messages_ready : 1
One such temporary queue is created every time a task fails (that is, raises an Exception). How can I avoid this situation? In my production environment a large number of such queues are created.
It sounds like you're using amqp as the results backend. From the docs, here are the pitfalls of using that particular setup:
Every new task creates a new queue on the server, with thousands of
tasks the broker may be overloaded with queues and this will affect
performance in negative ways. If you’re using RabbitMQ then each
queue will be a separate Erlang process, so if you’re planning to
keep many results simultaneously you may have to increase the Erlang
process limit, and the maximum number of file descriptors your OS
allows
Old results will not be cleaned automatically, so you must make
sure to consume the results or else the number of queues will
eventually go out of control. If you’re running RabbitMQ 2.1.1 or
higher you can take advantage of the x-expires argument to queues,
which will expire queues after a certain time limit after they are
unused. The queue expiry can be set (in seconds) by the
CELERY_AMQP_TASK_RESULT_EXPIRES setting (not enabled by default).
From what I've read in the changelog, this is no longer the default backend in versions >=2.3.0 because users were getting bit in the rear end by this behavior. I'd suggest changing the results backend if this is not the functionality you need.
Well, Philip is right there. The following is a description of how I solved it. It is a configuration in celeryconfig.py.
I am still using CELERY_BACKEND = "amqp" as Philip had said. But in addition to that, I am now using CELERY_IGNORE_RESULT = True. This configuration will ensure that the extra queues are not formed for every task.
I was already using this configuration, but an extra queue was still created whenever a task failed. Then I noticed that I was using another setting that needed to be removed: CELERY_STORE_ERRORS_EVEN_IF_IGNORED = True. With that setting, results were not stored for successful tasks but were still stored for errors (tasks that failed), hence one extra queue for every failed task.
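Put together, the relevant part of celeryconfig.py might look like the sketch below. It uses the old-style (Celery 3.x and earlier) setting names; the answer writes the backend setting as CELERY_BACKEND, and CELERY_RESULT_BACKEND is assumed here as the equivalent name.

```python
# celeryconfig.py (sketch, old-style setting names)

# Assumption: CELERY_RESULT_BACKEND stands in for the CELERY_BACKEND
# setting mentioned in the answer.
CELERY_RESULT_BACKEND = "amqp"

# Do not store task results, so no per-task result queue is needed.
CELERY_IGNORE_RESULT = True

# Leave this off (or remove it); when True, failed tasks still store
# their result and each failure leaves a result queue behind.
CELERY_STORE_ERRORS_EVEN_IF_IGNORED = False
```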
The CELERY_TASK_RESULT_EXPIRES dictates the time to live of the temp queues. The default is 1 day. You can modify this value.
This happens because the Celery workers' remote control is enabled (it is enabled by default).
You can disable it by setting the CELERY_ENABLE_REMOTE_CONTROL setting to False
However, note that you will lose the ability to do things like add_consumer, cancel_consumer etc using the celery command
The amqp backend creates a new queue for each task. If you want to avoid this, you can use the rpc backend, which keeps results in a single queue.
In your config, set
CELERY_RESULT_BACKEND = 'rpc'
CELERY_RESULT_PERSISTENT = True
You can read more about this on celery docs.