Celery, ZeroMQ, message-passing approach for a distributed system - Redis

I need to implement a system which does the following:
1. Transfer data to a remote place.
2. Once the data is fully transferred, start a computation on the remote server.
3. Once the computation is done, fetch the resulting computed data back to the source.
4. A web interface to track/edit the progress of each task.
I am thinking of using:
1. Ruby on Rails for the web interface (4).
2. Celery as the distributed solution.
3. ZeroMQ to pass messages to the RoR app and between the different "categories" of Celery workers described below.
To decouple these components from each other, I'm considering having 3 sets of Celery workers, each belonging to a separate category:
A. 'Sync' workers,
B. 'Render' workers, and
C. 'Fetch' workers.
I want to use ZeroMQ's pub/sub or broadcast model to pass messages between these sets of workers and the web app so that they can be synchronised properly. For example, B) should only kick in once A) is done, and C) should follow B).
Does this approach sound reasonable, or could it be done better using just ZeroMQ or Celery alone? Instead of these, should I be using a Celery backend like Redis or AMQP?
The reasons I want to use Celery are, of course, data persistence as well as a web interface to monitor the workers.
I'm obviously relatively new to Celery, ZeroMQ and distributed computation in general, so any advice would be welcome.
Thanks all.

I have done something similar for work, but it was all done using RabbitMQ and Celery. The way I would approach this is to have a Celery worker running on the remote server and one on the local host. Give each worker its own unique queue and fire off a chain, something like:

    chain(sync.s(file), compute.s(), sync_back.s()).delay()

Have the 2 sync tasks go to the localhost queue and the compute task go into the remote host queue.
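A minimal sketch of that layout, assuming a RabbitMQ broker and made-up task bodies; the per-queue routing is done with .set(queue=...) on each signature:

    # tasks.py - sketch only; broker URL, file paths and task bodies are assumptions
    from celery import Celery, chain

    app = Celery('pipeline', broker='amqp://guest@localhost//')

    @app.task
    def sync(file_path):
        # push the file to the remote server (rsync/scp/etc.); return its remote path
        return file_path

    @app.task
    def compute(remote_path):
        # run the computation on the remote host; return where the result lives
        return remote_path

    @app.task
    def sync_back(result_path):
        # pull the computed result back to the source machine
        return result_path

    # the two sync tasks go to the queue consumed on the local host,
    # the compute task goes to the queue consumed on the remote host
    chain(
        sync.s('/data/input.bin').set(queue='localhost'),
        compute.s().set(queue='remotehost'),
        sync_back.s().set(queue='localhost'),
    ).delay()

Each machine then runs a worker bound to its own queue, e.g. celery -A tasks worker -Q localhost on the source machine and celery -A tasks worker -Q remotehost on the remote server.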

Related

Multiple service instances using Hangfire (shared tasks/objects), is it possible?

I need to run multiple instances of the same service, with the same database, for redundancy reasons.
I found some questions about "Hangfire multiple instances", but for a different purpose than mine: usually about running multiple instances for different tasks on the same database, or similar to this.
I need to know if there are problems of concurrency when 2 or more instances of Hangfire use the same database (we want to use MongoDB), and if this is the solution to make the service resilient.
The goal is to have an instance that takes care of all the jobs when another instance goes down.
Any suggestion covering this scenario is welcome.
In our environment, we have a replica set used by about 10 Hangfire servers. If there are multiple Hangfire servers servicing the same queue, they will share the load, and whichever Hangfire server checks the queue first picks up the job and continues. If you remove all but 1 server, the jobs will continue (as long as there are enough workers; otherwise they will remain queued until a worker is available).
To answer your question: yes, you can have 2 or more Hangfire servers using the same MongoDB. MongoDB provides multi-threading support, so it's safe to have various servers accessing the same database backend. If you have two servers, both will be active, and if one instance goes offline, the other instance (based on queues) will continue to process the jobs in the queue.
Keep in mind, Hangfire servers process the jobs in specific queues. If both servers are part of the same queue, then you are load-balancing the jobs between the two servers. If they are part of different queues, then you have the scenario where each Hangfire instance processes different jobs (because they are part of different queues).
Read about configuring Job Queues here

Can a single CPU core work with multiple clients using Distributed TensorFlow?

In Distributed TensorFlow, we can run multiple clients working with workers in a parameter-server architecture, which is known as "Between-Graph Replication". According to the documentation:
"Between-graph replication. In this approach, there is a separate client for each /job:worker task, typically in the same process as the worker task."
It says the client and worker are typically in the same process. However, if they are not in the same process, can the number of clients differ from the number of workers? Also, can multiple clients share and run on the same CPU core?
Clients are the Python programs that define a graph and initialize a session in order to run a computation. If you start these programs, the created processes represent the servers in the distributed architecture.
Now it is possible to write programs that do not create a graph and do not run a session, but rather just call the server.join() method with the appropriate job name and task index. This way you could theoretically have a single client defining the whole graph and starting a session with its corresponding server.target; then within this session, parts of the graph are automatically going to be sent to the other processes/servers, and they will do the computations (as long as you have set which server/task is going to do what). This setup describes the in-graph replication architecture.
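As a rough illustration, here is what such a "server-only" process looks like with the TF 1.x tf.train API; the host names, ports and task index are made up:

    # worker process that hosts a server but defines no graph of its own
    import tensorflow as tf

    cluster = tf.train.ClusterSpec({
        "ps": ["localhost:2222"],
        "worker": ["localhost:2223", "localhost:2224"],
    })

    # this process is /job:worker/task:0
    server = tf.train.Server(cluster, job_name="worker", task_index=0)
    server.join()  # block forever and execute graph fragments sent by a client

A single client elsewhere can then build the whole graph, open a tf.Session(server.target) against its own server, and let TensorFlow ship the relevant subgraphs to these processes - the in-graph replication setup described above.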
So, it is basically possible to start several servers/processes on the same machine that has only a single CPU, but you are not going to gain much parallelism, because context switching between multiple running processes is going to slow you down. So unless the servers are doing some unrelated work, you should avoid this kind of setup.
Between-graph replication just means that every worker is going to have its own client and run its own session.

RabbitMQ - parallel queue

We use RabbitMQ as a queuing system for our client's 3rd party accounts application. There are a few reasons but one is that we can control the speed at which data goes into the application. Sometimes a massive queue will build up and this works really well.
However we want to use RabbitMQ for another application which we'd like to be separate and be more real-time.
Would a separate exchange/queue work best?
Do I need a separate console app?
If there are 100,000 messages queued up for the accounts app, I'd like the other app to keep processing straight away.
If you want to handle more applications, one solution is to use RabbitMQ virtual hosts; this way you have different environments, and you can also use different users/passwords for access.
In general, the best way to scale is to scale the queues; if you need to handle a high throughput, you can create a cluster and spread the traffic between the nodes.
You should avoid having one giant queue! More queues, more scale!
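For example, the second (real-time) application could publish to its own virtual host and queue; a sketch with pika, where the vhost name, credentials and queue name are assumptions:

    # publisher for the real-time app, isolated in its own virtual host
    import pika

    params = pika.ConnectionParameters(
        host='localhost',
        virtual_host='realtime',  # separate from the accounts app's vhost
        credentials=pika.PlainCredentials('realtime_user', 'secret'),
    )
    connection = pika.BlockingConnection(params)
    channel = connection.channel()
    channel.queue_declare(queue='realtime_jobs', durable=True)
    channel.basic_publish(exchange='', routing_key='realtime_jobs', body=b'payload')
    connection.close()

A consumer on that vhost never sees the 100,000 queued accounts messages, so it can stay real-time regardless of the backlog on the other application.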

Multiple clients load distribution with redis

We are using Redis as a queue for asynchronous processing of jobs. One application pushes jobs to Redis (LPUSH), and another application reads the Redis queue (BLPOP) and processes them. We wanted to scale the processing application, so we ran two different instances on 2 different machines to process the jobs from the queue, but we observed that one instance takes 70% of the load from the queue while the other processes only a meagre amount. Is there any well-defined strategy or configuration for using multiple clients with Redis and proper load sharing? Or do we have to maintain separate queues for the two instances and push the requests in a round-robin manner?
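For reference, a minimal sketch of the setup described, using redis-py; the queue name and the process() handler are placeholders:

    import redis

    r = redis.Redis(host='localhost', port=6379)

    # producer application
    r.lpush('jobs', '{"id": 1, "type": "report"}')

    # consumer application (each processing instance runs this loop)
    while True:
        _, payload = r.blpop('jobs')  # blocks until a job is available
        process(payload)              # hypothetical job handler

Redis simply hands each popped job to one blocked client; how evenly the work spreads then depends on how quickly each instance finishes its current job and blocks again, which can produce the skew described above.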

Simple queue with Celery and RabbitMQ

I'm trying to implement a simple queue that performs one task at a time. Offloading tasks off the main thread using Celery and setting concurrency=1 in the Celery config works fine, but I might want to use more concurrent workers for other tasks.
Is there a way to tell Celery or RabbitMQ to not use multiple concurrent workers for a task (except by forcing concurrency=1)? I can't find anything in the documentation but maybe these tools are not designed for a linear queue?
Thanks!
I think what you need is a separate queue for each type of task. Create separate workers that consume from each queue, with concurrency set to 1.
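A sketch of that idea, assuming a task module named app.py and a queue called 'linear':

    # app.py - route the one-at-a-time task type to its own queue
    from celery import Celery

    app = Celery('app', broker='amqp://guest@localhost//')
    app.conf.task_routes = {
        'app.linear_task': {'queue': 'linear'},  # everything else stays on the default queue
    }

    @app.task
    def linear_task(item):
        # work that must never run concurrently
        return item

Then run one worker with celery -A app worker -Q linear -c 1 for the linear queue, and a second worker with a higher -c value against the default queue for everything else.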