Designing a distributed system with Celery and RabbitMQ

Consider the following high-level design:
I have a server (call it rabbitMQ) that creates some tasks and places them in a message queue.
Some servers (A, B, C, D, ...) 'see' that there are tasks on the queue, and they each take a task to process on their end.
My questions:
Is this explanation correct in terms of distributed system design? And
if so, where does Celery come into play? Is it on the server A/B/C side, so each has its own Celery?
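A minimal sketch of how the pieces map to code, assuming Python and a hypothetical tasks.py module: RabbitMQ itself only stores and forwards messages; the Celery library runs both on the machine that produces tasks and on every worker machine.

```python
# tasks.py -- shared by the producer and every worker (A, B, C, ...)
from celery import Celery

# RabbitMQ is only the broker: it stores and delivers the task messages
app = Celery('tasks', broker='amqp://guest:guest@rabbit-host//')

@app.task
def process(item):
    # whatever work servers A/B/C should perform (placeholder)
    return item * 2

# on the producer:          process.delay(42)  puts a message on the queue
# on each worker machine:   celery -A tasks worker
```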

Related

Using both, request-reply and pub-sub for microservices communication

We are planning to introduce both pub-sub and request-reply communication models to our microservices architecture. Both communication models are needed.
One of the solutions could be RabbitMQ, as it can provide both models and offers HA, clustering and other interesting features.
RabbitMQ's request-reply model requires using queues, both for input and for output messages. Only one service can read from the input queue, and this increases coupling.
Is there any other recommended solution for using both request-reply and pub-sub communication models in the same system?
Could a service mesh be a better option?
It has to be supported by Node.js, Python and .NET Core.
Thank you for your help.
There are multiple communication systems that support the pub-sub and request-reply models with HA:
1. Kafka
Kafka relies heavily on the filesystem for storing and caching messages. All data is immediately written to a persistent log on the filesystem without necessarily flushing to disk. In effect this just means that it is transferred into the kernel’s pagecache.
Kafka is designed with failure in mind. At some point in time, network, machine, or storage resources fail. When a broker goes offline, one of the replicas becomes the new leader for the partition. When the broker comes back online, it has no leader partitions. Kafka keeps track of which machine is configured to be the preferred leader. Once the original broker is back up and in a good state, Kafka restores the information it missed in the interim and makes it the partition leader once more.
See :
https://kafka.apache.org/
https://docs.cloudera.com/documentation/kafka/latest/topics/kafka_ha.html
https://docs.confluent.io/4.1.2/installation/docker/docs/tutorials/clustered-deployment.html
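For illustration, a minimal pub-sub round trip with the kafka-python client (the broker address, topic and group names are assumptions):

```python
from kafka import KafkaProducer, KafkaConsumer

# publisher: messages are appended to the partition log on the broker
producer = KafkaProducer(bootstrap_servers='localhost:9092')
producer.send('events', b'hello')
producer.flush()

# subscriber: each consumer group gets its own copy of the stream
consumer = KafkaConsumer('events',
                         bootstrap_servers='localhost:9092',
                         group_id='reporting',
                         auto_offset_reset='earliest')
for msg in consumer:
    print(msg.value)  # b'hello'
```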
2. Redis
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes with radius queries and streams. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.
See :
https://redis.io/
https://redislabs.com/redis-enterprise/technology/highly-available-redis/
https://redis.io/topics/sentinel
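For illustration, Redis pub/sub with the redis-py client (the channel name is an assumption; note that pub/sub messages are fire-and-forget and are not persisted):

```python
import redis

r = redis.Redis(host='localhost', port=6379)

# subscriber (typically a separate process)
p = r.pubsub()
p.subscribe('notifications')

# publisher
r.publish('notifications', 'build finished')

# drain messages; the first event is the 'subscribe' confirmation
for message in p.listen():
    if message['type'] == 'message':
        print(message['data'])  # b'build finished'
```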
3. ZeroMQ
ZeroMQ (also known as ØMQ, 0MQ, or zmq) looks like an embeddable networking library but acts like a concurrency framework. It gives you sockets that carry atomic messages across various transports like in-process, inter-process, TCP, and multicast. You can connect sockets N-to-N with patterns like fan-out, pub-sub, task distribution, and request-reply. It's fast enough to be the fabric for clustered products. Its asynchronous I/O model gives you scalable multicore applications, built as asynchronous message-processing tasks. It has a score of language APIs and runs on most operating systems.
See :
https://zeromq.org/
http://zguide.zeromq.org/pdf-c:chapter3
http://zguide.zeromq.org/pdf-c:chapter4
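For illustration, the basic request-reply pair with pyzmq, both ends in one script for brevity (the port number is arbitrary):

```python
import zmq

ctx = zmq.Context()

# server side: a REP socket answers one request at a time
server = ctx.socket(zmq.REP)
server.bind('tcp://*:5555')

# client side (normally a separate process): a REQ socket
client = ctx.socket(zmq.REQ)
client.connect('tcp://localhost:5555')

client.send(b'ping')
print(server.recv())   # b'ping'
server.send(b'pong')
print(client.recv())   # b'pong'
```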
4. RabbitMQ
RabbitMQ is lightweight and easy to deploy on premises and in the cloud. It supports multiple messaging protocols. RabbitMQ can be deployed in distributed and federated configurations to meet high-scale, high-availability requirements.
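For illustration, a minimal work queue with the pika client (the queue name is an assumption); the explicit ack is what makes redelivery on worker failure possible:

```python
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
ch = conn.channel()
ch.queue_declare(queue='tasks', durable=True)

# producer side: persist the message so it survives a broker restart
ch.basic_publish(exchange='', routing_key='tasks', body=b'job payload',
                 properties=pika.BasicProperties(delivery_mode=2))

# consumer side: ack only after processing, so unfinished jobs are redelivered
def on_message(channel, method, properties, body):
    print(body)
    channel.basic_ack(delivery_tag=method.delivery_tag)

ch.basic_consume(queue='tasks', on_message_callback=on_message)
ch.start_consuming()
```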
My preference would be a REST API for the request-reply pattern. This is especially applicable for internal microservices where you are in control of the communication mechanism. I don't understand the comment about them not being scalable: if you define them properly, you can scale the number of service instances up and down based on demand. Be it Kafka, RabbitMQ, or any other broker, I don't think they are developed with request-reply as the primary use case. And don't forget that whatever broker you use, what is A->B->C in REST becomes A->broker->B->broker->C->broker->A, and the broker needs to do its housekeeping.
Then for pub-sub I would use Kafka, as it is a unified model that can support pub-sub as well as point-to-point.
But if you still want to use a broker for request-reply, I would check Kafka, as it can scale massively via partitions, and a lot of near-real-time streaming applications are built on it. So it could get close to the latency requirements of the request-reply pattern. But then I would want a framework on top of that to associate requests with replies, so I would consider using Spring Kafka to achieve that.
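The request/reply association that Spring Kafka's ReplyingKafkaTemplate provides in the Java world boils down to carrying a correlation id on each message and filtering replies on it. A rough, naive sketch of the same idea in Python with kafka-python (the topic names and header scheme are assumptions, not a ready-made framework):

```python
import uuid
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(bootstrap_servers='localhost:9092')
replies = KafkaConsumer('service.replies', bootstrap_servers='localhost:9092')

def call(payload):
    # tag the request so the matching reply can be recognised
    corr_id = str(uuid.uuid4()).encode()
    producer.send('service.requests', payload,
                  headers=[('correlation_id', corr_id)])
    producer.flush()
    # block until the reply carrying our correlation id arrives
    for msg in replies:
        if ('correlation_id', corr_id) in (msg.headers or []):
            return msg.value
```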

Why does celery need a message broker?

As Celery is a job queue/task queue, the name suggests that it can maintain its tasks and process them on its own. Why, then, does it need a message broker like RabbitMQ or Redis?
Celery is a Distributed Task Queue, which means the system can reside across multiple computers (or containers) in multiple locations, with a single centralised bus.
The basic architecture is as follows:
workers - processes that take jobs (data) from the bus (task queue) and process them
*a worker can put its result back onto the bus for further processing by a different worker (creating a processing flow)
bus - the task queue; basically a DB that stores the jobs as messages so the workers can retrieve them.
It's important that the bus is concurrent and non-blocking, so that when one process takes or puts a job from/on the bus, it doesn't block other workers from getting/putting their jobs.
RabbitMQ, Redis, ActiveMQ, Kafka and the like are good candidates for this sort of behaviour.
The bus has an API which lets you submit jobs for workers and retrieve them (among more complex features).
Most buses implement an ack/fail feature: workers can ack that their job is done, and if a job is not acked (or a failure is reported) the message can be served again to another worker, which might process it successfully this time, so no data is lost. (This depends highly on the failover logic and on the context of the data as input to a task; see the sketch after this list.)
Celery includes a scheduler (beat) that periodically puts specific jobs on the bus, thus creating periodic tasks.
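A minimal sketch of these two features in Celery, assuming a hypothetical module jobs.py (transform is a placeholder for real work):

```python
from celery import Celery

app = Celery('jobs', broker='amqp://guest@localhost//')

# acks_late: the message is acknowledged only after the task finishes,
# so a crashed worker's job is redelivered to another worker
@app.task(acks_late=True)
def process(data):
    return transform(data)  # placeholder

# beat: put this job on the bus every 10 minutes
# (run with `celery -A jobs beat` alongside the workers)
app.conf.beat_schedule = {
    'periodic-process': {
        'task': 'jobs.process',
        'schedule': 600.0,
        'args': ('payload',),
    },
}
```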
Let's work with a scraping example: you want to scrape the world, but China only allows traffic from its own region, and so do Europe and the USA.
So you can build workers and place them all over the world.
You can use only one bus; let's say it's located in the USA. All other workers know this bus and can connect to it, so by placing a specific job ('scrape China') on the bus located in the US, a process in China can work on it - hence, distributed.
Of course, workers increase the throughput of the system only through parallelism, unrelated to their geographic location, and that is the common case for using an event-driven architecture (i.e. a central bus, consumers and producers).
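Concretely, that geographic routing can be expressed with Celery's queue routing; a sketch assuming hypothetical task and queue names:

```python
from celery import Celery

app = Celery('scraper', broker='amqp://guest@bus-in-usa//')

# every region-specific job goes to its own queue on the single bus
app.conf.task_routes = {
    'scraper.scrape_china':  {'queue': 'china'},
    'scraper.scrape_europe': {'queue': 'europe'},
}

@app.task
def scrape_china(url):
    ...  # placeholder

# the worker running in China consumes only its own queue:
#   celery -A scraper worker -Q china
```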
I suggest reading the official docs; it's pretty straightforward.

Is it possible to define priorities for Celery workers consuming from the same queue?

I have two machines on my network running Celery workers that process tasks from a common queue (the messaging back-end is RabbitMQ).
One machine is much more powerful and processes the tasks faster (which is important). If there is only one task in the queue, I always want it to run on this machine. If the queue is full, I want the less powerful machine to start accepting tasks as well.
Is there a recommended, elegant way to do this? Or do I have to set up two queues ("fast" and "slow") and implement some kind of router that sends tasks to the "slow" queue only when the "fast" queue is full?
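To frame the "router" idea from the question: with RabbitMQ you can inspect a queue's depth via a passive declare and pick the target queue at publish time. A rough sketch with pika (the threshold and queue names are assumptions; both queues must already exist):

```python
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
ch = conn.channel()

def submit(body, threshold=10):
    # passive declare returns the current depth without modifying the queue
    depth = ch.queue_declare(queue='fast', passive=True).method.message_count
    queue = 'fast' if depth < threshold else 'slow'
    ch.basic_publish(exchange='', routing_key=queue, body=body)
```

The fast machine's worker would then consume 'fast' (and possibly 'slow' as well), while the slow machine consumes only 'slow'.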

Redis PUB/SUB and high availability

Currently I'm working on a distributed test execution and reporting system. I'm planning to use Redis PUB/SUB as a message queue and message distribution system.
I'm new to Redis, so I'm trying to read as many docs as I can and play around with it. One of the most important topics is high availability. As I said, I'm not an expert, but I'm aware of the possible options - using Sentinel, replication, clustering, etc.
What's not clear to me is how the Pub/Sub feature and the HA options relate to each other. What's the best practice for building a reliable messaging system with Redis? By reliable I mean that if my Redis message broker goes down, there should be some kind of backup node (a slave?) able to take over its role.
Is there a purely server-side solution? Or do I need to create a smart wrapper around the Redis client to handle this? Will a Sentinel-driven setup help me?
Doing pub/sub in Redis with failover means thinking about additional factors on the client side. A key piece to understand is that subscriptions are per-connection. If you are subscribed to a channel on a node and it fails, you will need to handle reconnecting and resubscribing. Because subscriptions are done at the connection level, they are not something which can be replicated.
Regarding the details as to how it works and what you can expect to see, along with ways around it see a post I made earlier this year at https://objectrocket.com/blog/how-to/reliable-pubsub-and-blocking-commands-during-redis-failovers
You can lower the risk surface by subscribing on slaves and publishing to the master, but you would then need non-promotable slaves to subscribe to, and you still need to handle losing a slave - there is just as much chance of losing a given slave as of losing the master.
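A minimal sketch of that reconnect-and-resubscribe loop with redis-py (handle is a placeholder for your message processing):

```python
import time
import redis

def listen_forever(channel, host='localhost'):
    while True:
        try:
            r = redis.Redis(host=host)
            p = r.pubsub()
            p.subscribe(channel)       # must re-subscribe on every new connection
            for message in p.listen():
                if message['type'] == 'message':
                    handle(message['data'])  # placeholder
        except redis.ConnectionError:
            time.sleep(1)              # node failed over; retry and resubscribe
```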
IMO, PUB/SUB is not a good choice here; maybe Disque (which comes from antirez, the author of Redis) fits better:
Disque, an in-memory, distributed job queue

Celery, zmq, message passing approach for a distributed system

I need to implement a system which does the following:
1. Transfer data to a remote place.
2. Once the data gets transferred fully, start a computation on the remote server.
3. Once the computation is done, fetch the resulting computed data back to the source.
4. A web interface to track/edit the progress of each task.
I am thinking of using:
1. Ruby on Rails for 4)
2. Celery as the distributed solution.
3. ZMQ to pass messages to the RoR app and between the different "categories" of Celery workers described below.
To decouple these components from each other, I'm considering having three sets of Celery workers, each belonging to a separate category:
A. 'Sync' workers,
B. 'Render' workers, and
C. 'Fetch' workers.
I want to use the ZMQ pub-sub or broadcast model to pass messages around between these sets of workers and the web app so that they can be synchronised properly. For example, B) should only kick in once A) is done, and C) should follow B).
Does this approach sound reasonable, or could it be done better using perhaps just ZMQ or Celery alone? Should I instead be using a Celery result backend like Redis or AMQP?
The reasons I want to use Celery are, of course, data persistence as well as a web interface to monitor the workers.
I'm obviously relatively new to celery, zmq and distributed computation in general so any advice would be welcome.
Thanks all.
I have done something similar at work, but it was all done using RabbitMQ and Celery. The way I would approach this is to have a Celery worker running on the remote server and one on the local host. Have each worker use its own unique queue and fire off a chain, something like:
chain(sync.s(file), compute.s(), sync_back.s()).delay() - and have the two sync tasks go to the localhost queue and the compute task go into the remote host queue.
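A sketch of that setup, assuming a hypothetical module pipeline.py and queue names local/remote:

```python
from celery import Celery, chain

app = Celery('pipeline', broker='amqp://guest@localhost//')

# each host's worker consumes only its own queue:
#   local host:  celery -A pipeline worker -Q local
#   remote host: celery -A pipeline worker -Q remote
app.conf.task_routes = {
    'pipeline.sync':      {'queue': 'local'},
    'pipeline.compute':   {'queue': 'remote'},
    'pipeline.sync_back': {'queue': 'local'},
}

@app.task
def sync(path):         # 1) transfer data out (placeholder)
    return path

@app.task
def compute(path):      # 2) run the remote computation (placeholder)
    return path

@app.task
def sync_back(result):  # 3) fetch results back (placeholder)
    return result

# each task's return value is passed as the first argument of the next
chain(sync.s('/data/input'), compute.s(), sync_back.s()).delay()
```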