In distributed TensorFlow, is it possible to share the same queue across different workers? - tensorflow

In TensorFlow, I want to have a filename queue shared across different workers on different machines, such that each machine gets a subset of the files to train on. I searched a lot, and it seems that only variables can be put on a PS task to be shared. Does anyone have an example? Thanks.

It is possible to share the same queue across workers, by setting the optional shared_name argument when creating the queue. Just as with tf.Variable objects, you can place the queue on any device that can be accessed from different workers. For example:
with tf.device("/job:ps/task:0"):  # Place queue on parameter server.
    q = tf.FIFOQueue(..., shared_name="shared_queue")
A few notes:
The value for shared_name must be unique to the particular queue that you are sharing. Unfortunately, the Python API does not currently use scoping or automatic name uniqification to make this easier, so you will have to ensure this manually.
You do not need to place the queue on a parameter server. One possible configuration would be to set up an additional "input job" (e.g. "/job:input") containing a set of tasks that perform pre-processing, and export a shared queue for the workers to use.
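The effect of a shared queue, namely that each filename is dequeued by exactly one worker, can be sketched without TensorFlow. The example below is only an analogy: it uses Python's standard queue.Queue and threads as stand-ins for the shared queue and the workers, and the function name run_workers and the file names are made up for illustration.

```python
import queue
import threading


def run_workers(filenames, num_workers):
    """Let num_workers threads drain one shared queue; return what each one took."""
    shared = queue.Queue()
    for name in filenames:
        shared.put(name)

    taken = [[] for _ in range(num_workers)]

    def worker(index):
        # Each worker keeps dequeuing until the shared queue is empty.
        while True:
            try:
                name = shared.get_nowait()
            except queue.Empty:
                return
            taken[index].append(name)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return taken


# Hypothetical file names, used only for this illustration.
files = ["part-%03d.tfrecord" % i for i in range(10)]
per_worker = run_workers(files, num_workers=3)
```

Every file ends up in exactly one worker's list, but which worker gets which file is not deterministic. The same holds for workers dequeuing from one shared queue.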

Related

Can single CPU core work with multiple clients using Distributed Tensorflow?

In Distributed Tensorflow, we could run multiple clients working with workers in Parameter-Server architecture, which is known as "Between-Graph Replication". According to the documentation,
Between-graph replication. In this approach, there is a separate
client for each /job:worker task, typically in the same process as the
worker task.
It says the client and worker are typically in the same process. However, if they are not in the same process, can the number of clients differ from the number of workers? Also, can multiple clients share and run on the same CPU core?
Clients are the python programs that define a graph and initialize a session in order to run computation. If you start these programs, the created processes represent the servers in the distributed architecture.
Now it is possible to write programs that do not create a graph and do not run session, but rather just call the server.join() method with the appropriate job name and task index. This way you could theoretically have a single client defining the whole graph and start a session with its corresponding server.target; then within this session, parts of the graph are automatically going to be sent to the other processes/servers and they will do the computations (as long as you have set which server/task is going to do what). This setup describes the in-graph replication architecture.
So, it is basically possible to start several servers/processes on the same machine, that has only a single CPU, but you are not going to gain much parallelism, because context switching between multiple running processes is going to slow you down. So unless the servers are doing some unrelated work, you should rather avoid this kind of setup.
Between-graph just means that every worker is going to have its own client and run its own session respectively.

What is the default strategy for device placement in Tensorflow?

I am trying to set up distributed training. Right now I have one parameter server and two workers. If I add another parameter server how will Tensorflow split up the parameters between the two servers? Is it done randomly or do I need to manually specify it?
They get placed round-robin on available ps tasks, see device_setter_test.py
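As an illustration of what round-robin placement means here, the following is a minimal pure-Python sketch. It is not TensorFlow's implementation; the function name assign_round_robin and the variable names are hypothetical, and it assumes variables are assigned in the order they are created.

```python
from itertools import cycle


def assign_round_robin(variable_names, num_ps_tasks):
    """Map each variable to a ps task, cycling through the tasks in order."""
    devices = cycle(["/job:ps/task:%d" % i for i in range(num_ps_tasks)])
    return {name: next(devices) for name in variable_names}


# Five hypothetical variables spread over two parameter servers.
placement = assign_round_robin(["w1", "b1", "w2", "b2", "w3"], num_ps_tasks=2)
```

With two ps tasks, w1, w2 and w3 land on task 0 and b1 and b2 on task 1. Note that this balances the number of variables per task, not their sizes.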

configure parallel async event queue on replicated region in Gemfire

I'm trying to configure Gemfire/Geode in order to have an async event queue with parallel=true on a replicated region. However, I'm getting the following exception at startup:
com.gemstone.gemfire.internal.cache.wan.AsyncEventQueueConfigurationException: Parallel Async Event Queue myQueue can not be used with replicated region /myRegion
This (i.e. to prevent parallel queues on replicated regions) seems to be a design decision, but I can't understand why it is the case.
I have read all the documentation I've been able to find (primarily http://gemfire.docs.pivotal.io/docs-gemfire/latest/reference/book_intro.html and related docs) and searched for any kind of reference to this exception on the internet, but I didn't find any clear explanation of why I can't have an event listener on each member hosting a replicated region.
My conclusion is that I must be missing some fundamental concept about replicated regions and/or parallel queues, but since I can't find the appropriate documentation on my own, I'm asking for an explanation and/or pointers to the right resources to read.
Thanks in advance.
EDIT : Let me put the question into context.
I have an external system sending data to my application using REST services, which are load balanced between nodes in order to maximize performance. Each of the nodes hosts the same regions (let's say 3, named A B and C). The data travels through all those regions (A to B to C) and is processed along the way. This means that region A hosts data that has just been received, region B data that has been partially processed and region C hosts data whose processing is complete.
I am using event listeners to process data and move it from region to region, and in case of the listener for region C, to export it to another external system.
All the listeners must (and I repeat, must) be transactional.
I also need horizontal scalability (i.e. adding nodes on the fly to increase throughput) and the maximum amount of data replication that can possibly be achieved.
Moreover, I want to run all of the nodes with the same gemfire configuration.
I have already tried to use partitioned regions, but they do not fit my needs for a bunch of reasons that I won't explain here for the sake of brevity (just trust me, it is not currently possible).
So I thought that having all the nodes host the replicated regions could be the way, but I need all of them to be able to process events independently and perform region synchronization afterwards in an active/active scenario. It is my understanding that this requires event queues to be parallel, but it does not seem possible (by design).
So the (updated) question(s) are :
Is this scenario even possible? And if it is, how can I achieve it?
Any explanation and/or documentation, example, resource or anything else is more than welcome.
Again, thanks in advance.
An AsyncEventQueue is used to write data that arrives in GemFire to some other data store. You would ideally want to do this only once. Since the content of a replicated region is the same on all members of the system, you only need an async event listener on one member, hence parallel=true is not supported.
For partitioned regions, if only one member hosted the AsyncEventQueue, then every single put to the partitioned region would also be routed through that member. This introduces a single point of contention in the system. The solution to this problem was the introduction of parallel AsyncEventQueues, so that events on each member are only queued up locally in that member.
GemFire also supports CacheListeners, which are invoked on each member even for replicated regions, however, they are synchronous. You can introduce a thread pool in your CacheListener to get the same functionality.
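The thread-pool hand-off can be sketched as follows. GemFire's CacheListener is a Java interface; this Python version only shows the pattern, and the class name AsyncListener, the method name after_create and the event dictionaries are hypothetical stand-ins, not GemFire API.

```python
from concurrent.futures import ThreadPoolExecutor


class AsyncListener:
    """Synchronous callback that hands the real work to a thread pool."""

    def __init__(self, handler, max_workers=4):
        self._handler = handler
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def after_create(self, event):
        # Called synchronously by the cache; returns as soon as the work is queued.
        return self._pool.submit(self._handler, event)

    def close(self):
        # Wait for all queued events to be processed before shutting down.
        self._pool.shutdown(wait=True)


processed = []
listener = AsyncListener(processed.append)
for key in range(5):
    listener.after_create({"key": key})
listener.close()
```

The handler runs on a pool thread rather than on the thread that delivered the event, so anything tied to the calling thread is not carried over automatically, and events may be handled out of order.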

Celery, zmq, message passing approach for a distributed system

I need to implement a system which does the following:
1) Transfer data to a remote place.
2) Once the data gets transferred fully, start a computation on the remote server.
3) Once the computation is done, fetch the resulting computed data back to the source.
4) A web interface to track/edit the progress of each task.
I am thinking of using:
1. Ruby on Rails for 4)
2. Celery as the distributed solution.
3. Zmq to pass messages across to RoR app and in between the different "categories" of workers within celery described below.
To decouple these components from each other, I'm considering having 3 sets of celery workers, each belonging to a separate category :-
A. 'Sync' workers,
B. 'Render' workers, and
C. 'Fetch' workers.
I wanna use zmq pub sub or broadcast model to pass messages around between these sets of workers and the web app so that they can be synchronised properly. For example B) should only kick in once A) is done. And C) should follow B).
Does this approach sound reasonable, or can it be done better using perhaps just zmq or celery alone? Should I instead be using a celery backend like redis or amqp?
The reasons I wanna use celery are of course data persistence as well as a web interface to monitor the workers.
I'm obviously relatively new to celery, zmq and distributed computation in general so any advice would be welcome.
Thanks all.
I have done something similar for work, but it was all done using rabbitmq and celery. The way I would approach this is to have one celery worker running on the remote server and another on the local host. Give each worker its own unique queue and fire off a chain, something like
chain(sync.s(file), compute.s(), sync_back.s()).delay()
with the two sync tasks going to the localhost queue and the compute task going to the remote host queue.
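One way to send each task to the intended queue is a routing table in the Celery configuration. The fragment below is a sketch only: it assumes the three tasks live in a hypothetical module named tasks and that the queues are called local and remote, none of which comes from the answer. It uses the old-style uppercase setting name CELERY_ROUTES; newer Celery versions call this setting task_routes.

```python
# celeryconfig.py (sketch; module and queue names are hypothetical)
CELERY_ROUTES = {
    "tasks.sync": {"queue": "local"},       # runs on the local host
    "tasks.compute": {"queue": "remote"},   # runs on the remote server
    "tasks.sync_back": {"queue": "local"},  # runs on the local host
}
```

Each worker would then be started so that it consumes only its own queue, for example with the worker's -Q option.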

Temporary queue made in Celery

I am using Celery with RabbitMQ. Lately, I have noticed that a large number of temporary queues are getting made.
So, I experimented and found that when a task fails (that is, a task raises an Exception), a temporary queue with a random name (like c76861943b0a4f3aaa6a99a6db06952c) is formed and the queue remains.
Some properties of the temporary queue as found in rabbitmqadmin are as follows -
auto_delete : True
consumers : 0
durable : False
messages : 1
messages_ready : 1
One such temporary queue is made every time a task fails (that is, raises an Exception). How can I avoid this situation? In my production environment a large number of such queues get formed.
It sounds like you're using amqp as the results backend. From the docs, here are the pitfalls of using that particular setup:
Every new task creates a new queue on the server, with thousands of
tasks the broker may be overloaded with queues and this will affect
performance in negative ways. If you’re using RabbitMQ then each
queue will be a separate Erlang process, so if you’re planning to
keep many results simultaneously you may have to increase the Erlang
process limit, and the maximum number of file descriptors your OS
allows
Old results will not be cleaned automatically, so you must make
sure to consume the results or else the number of queues will
eventually go out of control. If you’re running RabbitMQ 2.1.1 or
higher you can take advantage of the x-expires argument to queues,
which will expire queues after a certain time limit after they are
unused. The queue expiry can be set (in seconds) by the
CELERY_AMQP_TASK_RESULT_EXPIRES setting (not enabled by default).
From what I've read in the changelog, this is no longer the default backend in versions >=2.3.0 because users were getting bit in the rear end by this behavior. I'd suggest changing the results backend if this is not the functionality you need.
Well, Philip is right there. The following is a description of how I solved it. It is a configuration in celeryconfig.py.
I am still using CELERY_BACKEND = "amqp" as Philip had said. But in addition to that, I am now using CELERY_IGNORE_RESULT = True. This configuration will ensure that the extra queues are not formed for every task.
I was already using this configuration, but the extra queue was still formed whenever a task failed. Then I noticed that I was using another setting which needed to be removed: CELERY_STORE_ERRORS_EVEN_IF_IGNORED = True. With it, results were not stored for successful tasks but were still stored for errors (tasks which failed), hence one extra queue for each task that failed.
The CELERY_TASK_RESULT_EXPIRES dictates the time to live of the temp queues. The default is 1 day. You can modify this value.
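Put together, the settings discussed in this answer might look like the fragment below. It is a sketch, not the answerer's actual file: the expiry value is only illustrative, and setting CELERY_STORE_ERRORS_EVEN_IF_IGNORED to False is assumed here to be equivalent to removing it.

```python
# celeryconfig.py (sketch of the settings discussed above)

# Do not store task results, so no result queue is created per task.
CELERY_IGNORE_RESULT = True

# The answer removes this setting; False is written out here for
# explicitness (assumed equivalent to leaving it unset).
CELERY_STORE_ERRORS_EVEN_IF_IGNORED = False

# Time to live of stored results, in seconds. Illustrative value;
# the answer states the default is 1 day.
CELERY_TASK_RESULT_EXPIRES = 3600
```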
The reason this is happening is because celery workers remote control is enabled (it is enabled by default).
You can disable it by setting the CELERY_ENABLE_REMOTE_CONTROL setting to False
However, note that you will lose the ability to do things like add_consumer, cancel_consumer etc using the celery command
The amqp backend creates a new queue for each task. If you want to avoid that, you can use the rpc backend, which keeps results in a single queue.
In your config, set
CELERY_RESULT_BACKEND = 'rpc'
CELERY_RESULT_PERSISTENT = True
You can read more about this on celery docs.