Using shared variables across sessions in tensorflow

I want to train a model and at the same time use the results of the model for further actions. The training can be done in the background, but I need the prediction model to be available all the time.
I've got an idea of how to do this, but I'm not sure it is possible in tensorflow. I'm thinking of creating separate threads/processes for prediction and training. There would be a different session running in each process, and they would share the same variables. That way, the training model can update the variables in its own time, and the prediction model can use the latest weights for better predictions.
Is there any way to share variables across sessions, or is there some better way to do this? I've heard that running multiple sessions in tensorflow is discouraged.

Are training and prediction on the same machine? If so, you can share one session between the "predict" and "train" threads: tf.Session().run() calls are thread-safe. Here is a working example:
import tensorflow as tf
import numpy as np
import time
import threading

N = 128
input = tf.placeholder(tf.float32, shape=(None, N))
labels = tf.greater_equal(tf.reduce_sum(input, axis=-1, keepdims=True), 0)
l1size = 1024
fc1 = tf.contrib.layers.fully_connected(input, l1size)
l2size = 128
fc2 = tf.contrib.layers.fully_connected(fc1, l2size)
predictions = tf.contrib.layers.fully_connected(fc2, 1,
                                                activation_fn=tf.nn.sigmoid)
loss = tf.losses.mean_squared_error(labels, predictions)
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)

session = tf.Session()
session.run(tf.global_variables_initializer())

keep_going = True

def predict_thread(session):
    test_data = np.random.randn(10, N)
    while keep_going:
        current_loss = session.run(loss, feed_dict={input: test_data})
        print("Current loss: %f" % current_loss)
        time.sleep(1.)

def train_thread(session):
    train_data = np.random.randn(1024, N)
    while keep_going:
        session.run(train_op, feed_dict={input: train_data})

t1 = threading.Thread(target=train_thread, args=(session,))
t2 = threading.Thread(target=predict_thread, args=(session,))
t1.start()
t2.start()
time.sleep(10)
keep_going = False
t1.join()
t2.join()
You can also save/restore your model from time to time if training and prediction happen on different machines.
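If you go the save/restore route, a minimal sketch with tf.train.Saver might look like the following (the checkpoint path and the step variable are made-up assumptions, not from the example above):

saver = tf.train.Saver()

# In the training process: checkpoint periodically
# (the path and global_step are hypothetical).
saver.save(session, '/tmp/model/ckpt', global_step=step)

# In the prediction process: reload the newest weights before predicting.
ckpt = tf.train.latest_checkpoint('/tmp/model')
if ckpt is not None:
    saver.restore(session, ckpt)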

Related

Tensorflow 2.x custom data generator with multiprocessing

I just upgraded to tensorflow 2.3.
I want to make my own data generator for training.
With tensorflow 1.x, I did this:
def get_data_generator(test_flag):
    item_list = load_item_list(test_flag)
    print('data loaded')
    while True:
        X = []
        Y = []
        for _ in range(BATCH_SIZE):
            x, y = get_random_augmented_sample(item_list)
            X.append(x)
            Y.append(y)
        yield np.asarray(X), np.asarray(Y)

data_generator_train = get_data_generator(False)
data_generator_test = get_data_generator(True)
model.fit_generator(data_generator_train, validation_data=data_generator_test,
                    epochs=10000, verbose=2,
                    use_multiprocessing=True,
                    workers=8,
                    validation_steps=100,
                    steps_per_epoch=500)
This code worked fine with tensorflow 1.x: 8 processes were created in the system, the processor and video card were fully loaded, and "data loaded" was printed 8 times.
With tensorflow 2.3 I got this warning:
WARNING:tensorflow:multiprocessing can interact badly with TensorFlow, causing nondeterministic deadlocks. For high performance data pipelines tf.data is recommended.
"data loaded" was printed only once (it should have been printed 8 times), and the GPU is not fully utilized. There is also a memory leak every epoch, so training stops after several epochs. The use_multiprocessing flag did not help.
How can I make a generator/iterator in tensorflow (keras) 2.x that is easily parallelized across multiple CPU processes? Deadlocks and data order are not important.
With a tf.data pipeline, there are several spots where you can parallelize. Depending on how your data are stored and read, you can parallelize reading. You can also parallelize augmentation, and you can prefetch data as you train, so your GPU (or other hardware) is never hungry for data.
In the code below, I have demonstrated how you can parallelize augmentation and add prefetching.
import numpy as np
import tensorflow as tf

x_shape = (32, 32, 3)
y_shape = ()  # A single item (not an array).
classes = 10

# This is tf.data.experimental.AUTOTUNE in older tensorflow.
AUTOTUNE = tf.data.AUTOTUNE

def generator_fn(n_samples):
    """Return a function that takes no arguments and returns a generator."""
    def generator():
        for i in range(n_samples):
            # Synthesize an image and a class label.
            x = np.random.random_sample(x_shape).astype(np.float32)
            y = np.random.randint(0, classes, size=y_shape, dtype=np.int32)
            yield x, y
    return generator

def augment(x, y):
    return x * tf.random.normal(shape=x_shape), y

samples = 10
batch_size = 5
epochs = 2

# Create the dataset.
gen = generator_fn(n_samples=samples)
dataset = tf.data.Dataset.from_generator(
    generator=gen,
    output_types=(np.float32, np.int32),
    output_shapes=(x_shape, y_shape)
)

# Parallelize the augmentation.
dataset = dataset.map(
    augment,
    num_parallel_calls=AUTOTUNE,
    # Order does not matter.
    deterministic=False
)

dataset = dataset.batch(batch_size, drop_remainder=True)

# Prefetch some batches.
dataset = dataset.prefetch(AUTOTUNE)

# Prepare the model.
model = tf.keras.applications.VGG16(weights=None, input_shape=x_shape, classes=classes)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Train. Do not specify batch size because the dataset takes care of that.
model.fit(dataset, epochs=epochs)
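Reading can be parallelized in the same pipeline. Here is a hedged sketch, assuming your samples are stored as TFRecord shards (the file pattern and parse_fn below are hypothetical placeholders):

# Interleave reads from several files concurrently
# (AUTOTUNE is defined as above; parse_fn is a hypothetical record parser).
files = tf.data.Dataset.list_files("/data/train-*.tfrecord")
dataset = files.interleave(
    tf.data.TFRecordDataset,
    cycle_length=4,               # read 4 files concurrently
    num_parallel_calls=AUTOTUNE,
    deterministic=False           # order does not matter here either
)
dataset = dataset.map(parse_fn, num_parallel_calls=AUTOTUNE)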

A Tensorflow training agnostic to Eager and Graph modes

I spend some of my time coding novel (I wish) RNN cells in Tensorflow.
To prototype, I use eager mode (easier to debug).
In order to train, I migrate the code to a graph (runs faster).
I am looking for wrapper code or an example that can run the forward pass and training in a way that is as agnostic as possible to the mode it runs in, eager or graph. I have in mind a set of functions/classes into which a particular neural network/optimizer/dataset can be inserted, and which could run in both modes with minimal changes between the two. Ideally it would also be compatible with many types of NNs/optimizers/data instances.
I am quite sure that many have had this idea.
I wonder if something like this is feasible given the current eager/graph integration in TF.
Yes. I have been wondering the same. In the Tensorflow documentation you can see:
The same code written for eager execution will also build a graph during graph execution. Do this by simply running the same code in a new Python session where eager execution is not enabled.
But this is hard to achieve, mostly because working with graphs means dealing with placeholders, which cannot be used in eager mode. I tried to get rid of placeholders by using object-oriented layers and the Dataset API. This is the closest I could get to fully compatible code:
m = 128  # num_examples
n = 5    # num_features
epochs = 2
batch_size = 32
steps_per_epoch = m // batch_size

dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random_uniform([m, n], dtype=tf.float32),
     tf.random_uniform([m, 1], dtype=tf.float32)))
dataset = dataset.repeat(epochs)
dataset = dataset.batch(batch_size)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_dim=n),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1)
])

def train_eagerly(model, dataset):
    optimizer = tf.train.AdamOptimizer()
    iterator = dataset.make_one_shot_iterator()
    print('Training eagerly...')
    for epoch in range(epochs):
        print('Epoch', epoch)
        progbar = tf.keras.utils.Progbar(target=steps_per_epoch,
                                         stateful_metrics=['loss'])
        for step in range(steps_per_epoch):
            with tf.GradientTape() as tape:
                features, labels = iterator.get_next()
                predictions = model(features, training=True)
                loss_value = tf.losses.mean_squared_error(labels, predictions)
            grads = tape.gradient(loss_value, model.variables)
            optimizer.apply_gradients(zip(grads, model.variables))
            progbar.add(1, values=[('loss', loss_value.numpy())])

def train_graph(model, dataset):
    optimizer = tf.train.AdamOptimizer()
    iterator = dataset.make_initializable_iterator()
    print('Training graph...')
    with tf.Session() as sess:
        sess.run(iterator.initializer)
        sess.run(tf.global_variables_initializer())
        for epoch in range(epochs):
            print('Epoch', epoch)
            progbar = tf.keras.utils.Progbar(target=steps_per_epoch,
                                             stateful_metrics=['loss'])
            for step in range(steps_per_epoch):
                with tf.GradientTape() as tape:
                    features, labels = sess.run(iterator.get_next())
                    predictions = model(features, training=True)
                    loss_value = tf.losses.mean_squared_error(labels, predictions)
                grads = tape.gradient(loss_value, model.variables)
                optimizer.apply_gradients(zip(grads, model.variables))
                progbar.add(1, values=[('loss', loss_value.eval())])
As you can see, the main difference is that I use a one_shot_iterator during eager training (of course, during graph training I have to run operations within a session).
I tried to do the same using optimizer.minimize instead of applying the gradients myself, but I could not come up with code that worked in both eager and graph modes.
Also, I'm sure this becomes much harder with less simple models, like the one you are working with.
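One small piece that does carry over cleanly is the dispatch itself; a sketch, assuming the two training functions above:

def train(model, dataset):
    # tf.executing_eagerly() reports which mode the program is running in,
    # so a single entry point can pick the right loop.
    if tf.executing_eagerly():
        train_eagerly(model, dataset)
    else:
        train_graph(model, dataset)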

Linear regression model on tensorflow can't learn bias

I am trying to train a linear regression model in Tensorflow using some generated data. The model seems to learn the slope of the line, but is unable to learn the bias.
I have tried changing the number of epochs, the weight (slope), and the bias, but every time the bias learnt by the model comes out to be zero. I don't know where I am going wrong, and some help would be appreciated.
Here is the code.
import numpy as np
import tensorflow as tf

# assume the linear model to be Y = W*X + b
X = tf.placeholder(tf.float32, [None, 1])
Y = tf.placeholder(tf.float32, [None, 1])

# the weight and bias
W = tf.Variable(tf.zeros([1, 1]))
b = tf.Variable(tf.zeros([1]))

# the model
prediction = tf.matmul(X, W) + b

# the cost function
cost = tf.reduce_mean(tf.square(Y - prediction))

# use gradient descent
learning_rate = 0.000001
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

steps = 1000
epochs = 10
Verbose = False

# In the end, the model should learn these values
test_w = 3
bias = 10

for _ in xrange(epochs):
    for i in xrange(steps):
        # make fake data for the model, feeding one example at a time
        # (stochastic gradient descent, because we only use one example at a time)
        x_temp = np.array([[i]])
        y_temp = np.array([[test_w * i + bias]])
        # train the model using the data
        feed_dict = {X: x_temp, Y: y_temp}
        sess.run(train_step, feed_dict=feed_dict)
        if Verbose and i % 100 == 0:
            print("Iteration No: %d" % i)
            print("W = %f" % sess.run(W))
            print("b = %f" % sess.run(b))

print("Finally:")
print("W = %f" % sess.run(W))
print("b = %f" % sess.run(b))
# These values should be close to the values we used to generate the data
https://github.com/HarshdeepGupta/tensorflow_notebooks/blob/master/Linear%20Regression.ipynb
Outputs are in the last line of code.
The model needs to learn test_w and bias (in the linked notebook, these are in the 3rd cell, after the first comment), which are set to 3 and 10 respectively.
The model correctly learns the weight(slope), but is unable to learn the bias. Where is the error?
The main problem is that you are feeding just one sample at a time to the model. This makes your optimizer very unstable, which is why you have to use such a small learning rate. I suggest feeding more samples at each step.
If you insist on feeding one sample at a time, maybe you should consider using an optimizer with momentum, like tf.train.AdamOptimizer(learning_rate). That way you can increase the learning rate and reach convergence.
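A minimal sketch of the batched alternative (the batch size, learning rate, input range, and step count below are made-up values, not from the question):

import numpy as np
import tensorflow as tf

X = tf.placeholder(tf.float32, [None, 1])
Y = tf.placeholder(tf.float32, [None, 1])
W = tf.Variable(tf.zeros([1, 1]))
b = tf.Variable(tf.zeros([1]))
prediction = tf.matmul(X, W) + b
cost = tf.reduce_mean(tf.square(Y - prediction))

# A momentum-based optimizer with a much larger learning rate.
train_step = tf.train.AdamOptimizer(0.1).minimize(cost)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for _ in range(5000):
    # Feed a batch of 64 samples per step instead of a single one.
    x_batch = np.random.uniform(0, 10, size=(64, 1))
    y_batch = 3 * x_batch + 10
    sess.run(train_step, feed_dict={X: x_batch, Y: y_batch})
print(sess.run(W), sess.run(b))  # should approach 3 and 10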

Multiprocessing with GPU in keras

I need to compute multiple deep models in parallel and average their results. My job hangs forever after the computation on GPU 0 finishes.
def model_train(self, params):
    from nn_arch import nn_models
    X, y, gpu_no = params
    print("GPU NO ", gpu_no)
    with tf.device('/gpu:' + str(gpu_no)):
        model1 = nn_models.lenet5()
        early_callback = CustomCallback()
        model1.fit(X, y, batch_size=256, validation_split=0.2,
                   callbacks=[early_callback], verbose=1, epochs=1)
    return model1
And my main method is below. In this case I have 2 GPUs:
def main(self, X_train, y_train, X_test, y_test):
    random_buckets = self.get_random()
    X = [X_train[random_buckets[k]] for k in sorted(random_buckets)]
    y = [y_train[random_buckets[j]] for j in sorted(random_buckets)]
    params = zip(X, y, [0, 1])
    models = pool1.map(self.model_train, params)
How do I train multiple models in parallel with Keras (a data-parallel approach)?
Before compiling the model in Keras, add this line:
model = make_parallel(model, 2)
where 2 is the number of GPUs available.
The make_parallel function is available in this file. Just import the file in your code and your code will be executed on multiple GPUs.
https://github.com/kuza55/keras-extras/blob/master/utils/multi_gpu.py
make_parallel is a simple function that:
- instantiates a copy of your model on each of the N GPUs you specify
- splits your batch into N evenly sized smaller batches
- passes each smaller batch into the corresponding model copy
- concatenates the outputs of the models
Please refer to multi-GPU TensorFlow tutorials as a reference.
https://github.com/tensorflow/tensorflow/blob/r0.7/tensorflow/models/image/cifar10/cifar10_multi_gpu_train.py
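As an aside, in TF 2.x the same data-parallel pattern is built in as tf.distribute.MirroredStrategy. A sketch, where get_model(), X_train, and y_train are hypothetical stand-ins for your own model builder and data:

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()  # replicates across all visible GPUs
with strategy.scope():
    model = get_model()  # hypothetical: build the Keras model as usual
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# fit() splits each batch across the replicas automatically.
model.fit(X_train, y_train, batch_size=256, epochs=10)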

Tensorflow (tf-slim) Model with is_training True and False

I would like to run a given model both on the train set (is_training=True) and on the validation set (is_training=False), specifically with regard to how dropout is applied. Right now the prebuilt models expose an is_training parameter that is passed to the dropout layer when building the network. The issue is that if I call the method twice with different values of is_training, I will get two different networks that do not share weights (I think?). How do I get the two networks to share the same weights, so that I can run the network I have trained on the validation set?
Following up on your comment, I wrote a solution that uses Overfeat in train and test modes. (I couldn't test it, so could you check whether it works?)
First some imports and parameters:
import tensorflow as tf
slim = tf.contrib.slim
overfeat = tf.contrib.slim.nets.overfeat
batch_size = 32
inputs = tf.placeholder(tf.float32, [batch_size, 231, 231, 3])
dropout_keep_prob = 0.5
num_classes = 1000
In train mode, we pass a normal scope to the function overfeat:
scope = 'overfeat'
is_training = True
output = overfeat.overfeat(inputs, num_classes, is_training,
                           dropout_keep_prob, scope=scope)
Then in test mode, we create the same scope but with reuse=True.
scope = tf.VariableScope(reuse=True, name='overfeat')
is_training = False
output = overfeat.overfeat(inputs, num_classes, is_training,
                           dropout_keep_prob, scope=scope)
You can just use a placeholder for is_training:
isTraining = tf.placeholder(tf.bool)

# create nn
net = ...
net = slim.dropout(net,
                   keep_prob=0.5,
                   is_training=isTraining)
net = ...

# training
sess.run([net], feed_dict={isTraining: True})

# testing
sess.run([net], feed_dict={isTraining: False})
It depends on the case; the solutions are different.
My first option would be to use a different process to do the evaluation. You only need to check that there is a new checkpoint, and load those weights into the evaluation network (with is_training=False):
checkpoint = tf.train.latest_checkpoint(self.checkpoints_path)
# wait until a new checkpoint is available
while self.latest_checkpoint == checkpoint:
    time.sleep(30)  # sleep 30 seconds waiting for a new checkpoint
    checkpoint = tf.train.latest_checkpoint(self.checkpoints_path)
logging.info('Restoring model from {}'.format(checkpoint))
self.saver.restore(session, checkpoint)
self.latest_checkpoint = checkpoint
The second option is to unload the graph after every epoch and create a new evaluation graph. This solution wastes a lot of time loading and unloading graphs.
The third option is to share the weights. But feeding these networks with queues or the dataset API can lead to issues, so you have to be very careful. I only use this for Siamese networks.
with tf.variable_scope('the_scope') as scope:
    your_model(is_training=True)
    scope.reuse_variables()
    your_model(is_training=False)