How to run the example code of TensorFlow in distributed mode? - tensorflow

I'm new to TensorFlow and am trying to run it in distributed mode. I have found the official document at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/how_tos/distributed/index.md, but it is missing the loss function.
Can anyone help to complete that so that I can run with your code?

It not only lacks the loss function, it also lacks the model to train and thus the loss to minimize.
This file is just a template file that you have to complete in order to train your model in distributed mode.
So, when in the template file you find the comment
# Build model...
It means that you have to define a model to train (e.g. a convolutional neural network, a simple perceptron, ...).
Something like the MNIST model that you can find in the tutorial: https://www.tensorflow.org/versions/r0.9/tutorials/mnist/beginners/index.html
Your model ends with a loss function to minimize.
Following the MNIST example, the loss is:
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
loss = cross_entropy
Once you have defined the model to train and the loss to minimize, you have filled in the missing pieces of the template and can now start to train your model in distributed mode.
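For concreteness, here is a minimal sketch of what the filled-in part of the template could look like with the MNIST softmax model (FLAGS.task_index and cluster come from the surrounding template file; the placeholders and hyperparameters are assumptions for illustration, not the official example):
with tf.device(tf.train.replica_device_setter(
        worker_device="/job:worker/task:%d" % FLAGS.task_index,
        cluster=cluster)):

    # Build model...
    x = tf.placeholder(tf.float32, [None, 784])   # flattened MNIST images
    y_ = tf.placeholder(tf.float32, [None, 10])   # one-hot labels
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    y = tf.nn.softmax(tf.matmul(x, W) + b)

    # ...and the loss to minimize
    loss = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

    global_step = tf.Variable(0, trainable=False)
    train_op = tf.train.GradientDescentOptimizer(0.5).minimize(
        loss, global_step=global_step)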

Related

Implementing custom loss that makes use of training batch data with Keras

I am trying to implement a GAN called the SimGAN proposed by Apple researchers. The SimGAN is used to refine labelled synthetic images so that they look more like the unlabelled real images.
The link to the paper can be found on arXiv here.
In the paper, the loss function of the combined model, which comprises the generator and the discriminator, has a self-regularization component in the form of an L1 loss that penalizes too great a difference between the synthetic images and the images after refinement. In other words, the refinement should not be too drastic.
I would like to know how I can implement this self-regularization loss in Keras. Here is what I tried:
def self_regularization_loss(refined_images, syn_images):
    def l1loss(y_true, y_pred):
        return keras.metrics.mean_absolute_error(refined_images, syn_images)
    return l1loss
However, I do not think I can compile the model in the way below as the batches of refined and synthetic images change during training time.
model.compile(loss=[self_regularization_loss(current_batch_of_refined, current_batch_of_synthetic),
                    local_adversarial_loss],
              optimizer=opt)
What is the way to implement this loss?
Try using the tf.function decorator and tf.GradientTape():
@tf.function
def train_step(model, batch):
    with tf.GradientTape() as tape:
        refined_images, syn_images = batch
        loss = self_regularization_loss(model, refined_images, syn_images)
    gradients = tape.gradient(loss, model.trainable_variables)
    # optimizer is assumed to be defined elsewhere, e.g. tf.keras.optimizers.Adam()
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
Your training loop can then look something like:
for image_batch in dataset:
    train_step(model, image_batch)
Here it is assumed that model is of type tf.keras.Model. More details on the Model class can be found here. Note that model is also passed to self_regularization_loss. In this function your model receives both images as inputs and then gives you the respective output. Then you calculate your loss.
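For illustration, a minimal sketch of what self_regularization_loss could look like when it also receives the model (the exact forward pass and the absence of a weighting factor are assumptions, not taken from the paper):
def self_regularization_loss(model, refined_images, syn_images):
    # the model (the refiner) receives the synthetic batch and produces its
    # output; the L1 term keeps that output close to the synthetic input
    # (refined_images is kept in the signature to match train_step above)
    model_output = model(syn_images, training=True)
    return tf.reduce_mean(tf.abs(model_output - syn_images))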

Freezing BERT layers after importing via TF-hub and training them?

I will describe my intention here. I want to import the pretrained BERT model via the tf-hub function hub.Module(bert_url, trainable = True) and utilize it for a text classification task. I plan to use a large corpus to fine-tune the weights of BERT as well as a few dense layers whose inputs are the BERT outputs. I would then like to freeze the layers of BERT and train only the dense layers following BERT. How can I do this efficiently?
You mention Hub's TF1 API hub.Module, so I suppose you are writing TF1 code and using the TF1-compatible Hub assets google/bert/..., such as https://tfhub.dev/google/bert_cased_L-12_H-768_A-12/1
Are you going to have separate run of your program for the two phases of training? If so, maybe you can just drop trainable=True from the hub.Module call in the second run. This doesn't affect variable names, so you can restore the training result from the first run, including BERT's adjusted weights. (To be clear: the pre-trained weights shipped with the hub.Module are only used for initialization at the very start of training; restoring a checkpoint overrides them.)
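A minimal sketch of the two runs (the checkpoint path and the dense head are assumptions for illustration):
import tensorflow as tf
import tensorflow_hub as hub

BERT_URL = "https://tfhub.dev/google/bert_cased_L-12_H-768_A-12/1"

# Run 1: fine-tune BERT together with the dense head.
bert = hub.Module(BERT_URL, trainable=True)
# ... build the dense layers on top of the BERT outputs, train, then:
# saver.save(sess, "run1.ckpt")

# Run 2: same graph, but without trainable=True, so BERT's variables are no
# longer in TRAINABLE_VARIABLES and only the dense head receives updates.
# bert = hub.Module(BERT_URL)
# saver.restore(sess, "run1.ckpt")   # restores the fine-tuned BERT weights
# ... continue training the dense layers only ...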

Keras: Custom loss function with training data not directly related to model

I am trying to convert my CNN written with tensorflow layers to use the keras API in tensorflow (I am using the keras API provided by TF 1.x), and am having issues writing a custom loss function to train the model.
According to this guide, when defining a loss function it expects the arguments (y_true, y_pred)
https://www.tensorflow.org/guide/keras/train_and_evaluate#custom_losses
def basic_loss_function(y_true, y_pred):
    return ...
However, in every example I have seen, y_true is somehow directly related to the model (in the simple case it is the output of the network). In my problem, this is not the case. How do I implement this if my loss function depends on some training data that is unrelated to the tensors of the model?
To be concrete, here is my problem:
I am trying to learn an image embedding trained on pairs of images. My training data includes image pairs and annotations of matching points between the image pairs (image coordinates). The input feature is only the image pairs, and the network is trained in a siamese configuration.
I am able to implement this successfully with tensorflow layers and train it successfully with tensorflow estimators.
My current implementation builds a tf.data Dataset from a large database of TFRecords, where the features are a dictionary containing the images and arrays of matching points. Before, I could easily feed these arrays of image coordinates to the loss function, but here it is unclear how to do so.
There is a hack I often use: calculate the loss within the model, by means of Lambda layers. (This is useful when the loss is independent of the true data, for instance, and the model doesn't really have an output to be compared.)
In a functional API model:
def loss_calc(x):
    loss_input_1, loss_input_2 = x  # arbitrary inputs; you choose them
                                    # according to what you gave to the Lambda layer

    # here you use some external data that doesn't relate to the samples
    externalData = K.constant(external_numpy_data)

    # calculate and return the loss tensor from the inputs and externalData
    return the_loss
Use the outputs of the model itself (the tensor(s) that are used in your loss) as the Lambda layer's inputs:
loss = Lambda(loss_calc)([model_output_1, model_output_2])
Create the model so that it outputs the loss instead of the usual outputs:
model = Model(inputs, loss)
Create a dummy keras loss function for compilation:
def dummy_loss(y_true, y_pred):
    return y_pred  # where y_pred is the loss itself, the output of the model above
model.compile(loss = dummy_loss, ....)
For training, use any dummy target array correctly sized with respect to the number of samples; it will be ignored:
model.fit(your_inputs, np.zeros((number_of_samples,)), ...)
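Putting the pieces together, a minimal self-contained sketch (the layer sizes, the toy distance-based loss and the random data are all made up for illustration; they are not your embedding model):
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Lambda
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K

external_numpy_data = np.random.rand(10).astype("float32")  # stand-in for the real external data

def loss_calc(x):
    out_a, out_b = x
    external = K.constant(external_numpy_data)
    # toy per-sample loss: squared distance between the two branches,
    # scaled by the mean of the external data
    return K.mean(K.square(out_a - out_b), axis=-1, keepdims=True) * K.mean(external)

inputs = Input(shape=(32,))
out_a = Dense(10)(inputs)
out_b = Dense(10)(inputs)
loss = Lambda(loss_calc)([out_a, out_b])

model = Model(inputs, loss)

def dummy_loss(y_true, y_pred):
    return y_pred

model.compile(loss=dummy_loss, optimizer="adam")
model.fit(np.random.rand(8, 32), np.zeros((8, 1)), epochs=1)  # dummy targets, ignored by dummy_loss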
Another way of doing it is to use a custom training loop.
This is much more work, though.
Although you're using TF1, you can still turn eager execution on at the very beginning of your code and do stuff like it's done in TF2. (tf.enable_eager_execution())
Follow the tutorial for custom training loops: https://www.tensorflow.org/tutorials/customization/custom_training_walkthrough
Here, you calculate the gradients yourself, for whatever result you want, which means you don't need to follow Keras's standard training conventions.
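A minimal sketch of such a loop in TF1 with eager execution enabled (match_loss and the structure of the batches are assumptions; they stand in for your own loss function and data pipeline):
import tensorflow as tf
tf.enable_eager_execution()   # must be called before any other TF ops

optimizer = tf.train.AdamOptimizer(1e-4)

def train_step(model, image_a, image_b, match_points):
    with tf.GradientTape() as tape:
        emb_a = model(image_a)
        emb_b = model(image_b)
        # match_loss is your own function of the two embeddings and the
        # annotated matching coordinates; it does not have to fit the
        # (y_true, y_pred) signature at all
        loss = match_loss(emb_a, emb_b, match_points)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss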
Finally, you can use the approach you suggested of model.add_loss.
In this case, you calculate the loss exactly the same way as in the first approach above, and pass this loss tensor to add_loss.
You can probably compile a model with loss=None then (not sure), because you're going to use other losses, not the standard one.
In this case, your model's output will probably be None too, and you should fit with y=None.
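A minimal sketch of the add_loss variant, reusing the toy loss_calc from the sketch above (whether loss=None and y=None are accepted depends on the Keras version, as noted):
inputs = Input(shape=(32,))
out_a = Dense(10)(inputs)
out_b = Dense(10)(inputs)

model = Model(inputs, [out_a, out_b])
# register the loss tensor directly on the model; a scalar is expected here
model.add_loss(K.mean(loss_calc([out_a, out_b])))

model.compile(loss=None, optimizer="adam")
model.fit(np.random.rand(8, 32), y=None, epochs=1)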

tf.Estimator.predict() issue when using a Tensorflow Hub module as the basis of a custom tf.Estimator

I am trying to create a custom tf.Estimator. In the model_fn passed to the tf.Estimator, I am importing the Inception_V3 module from TensorFlow Hub.
Problem: After fine-tuning the model (using tf.Estimator.train), the results obtained using tf.Estimator.predict are not as good as expected based on tf.Estimator.evaluate. (This is for a regression problem.)
I am new to Tensorflow and Tensorflow Hub, so I could be making lots of rookie mistakes.
When I run tf.Estimator.evaluate() on my validation data, the reported loss is in the same ball park as the loss after tf.Estimator.train() was used to train the model. The problem comes in when I try to use tf.Estimator.predict() on the same validation data.
tf.Estimator.predict() returns predictions which I then use to calculate the same loss metric (mean_squared_error) which is computed by tf.Estimator.evaluate(). I am using the same set of data to feed to the predict function as the evaluate function. But I do not get the same result for the mean_squared_error -- not remotely close! (The mse I calculate from predict is much worse.)
Here is what I have done (edited out some details)...
Define a model_fn with Tensorflow Hub module. Then call the tf.Estimator functions to train, evaluate and predict.
def my_model_fun(features, labels, mode, params):
    # Load InceptionV3 Module from Tensorflow Hub
    iv3_module = hub.Module("https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1",
                            trainable=True, tags={'train'})
    # Gather the variables for fine-tuning
    var_list = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='CustomeLayer')
    var_list.extend(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='module/InceptionV3/Mixed_5b'))

    predictions = {"the_prediction": final_output}

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

    # Define loss, optimizer, and evaluation metrics
    loss = tf.losses.mean_squared_error(labels=labels, predictions=final_output)
    optimizer = tf.train.AdadeltaOptimizer(learning_rate=learn_rate).minimize(
        loss, var_list=var_list, global_step=tf.train.get_global_step())
    rms_error = tf.metrics.root_mean_squared_error(labels=labels, predictions=predictions["the_prediction"])
    eval_metric_ops = {"rms_error": rms_error}

    if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=optimizer)

    if mode == tf.estimator.ModeKeys.EVAL:
        tf.summary.scalar('rms_error', rms_error)
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)
iv3_estimator = tf.estimator.Estimator(model_fn=iv3_model_fn)
iv3_estimator.train(input_fn=train_input_fn, steps=TRAIN_STEPS)
iv3_estimator.evaluate(input_fn=val_input_fn)
ii = 0
totalSqErr = 0
for ans in iv3_estimator.predict(input_fn=test_input_fn):
    sqErr = np.square(label[ii] - ans['the_prediction'][0])
    totalSqErr += sqErr
    ii += 1
mse = totalSqErr / ii
I expect that the mse loss reported by tf.Estimator.evaluate() should be the same as when I calculate the mse from the known labels and the output of tf.Estimator.predict().
Do I need to import the Tensorflow Hub model differently when I use predict (e.g. use trainable=False in the call to hub.Module())?
Are the weights obtained from training being used when tf.Estimator.evaluate() runs, but not when tf.Estimator.predict() runs?
other?
There are a few things that seem to be missing from the code snippet. How is final_output computed from iv3_module? Also, mean squared error is an unusual choice of loss function for a classification problem; the common approach is to pass the image features from the module into a linear output layer with scores for each class ("logits") and a softmax cross-entropy loss. For an explanation of these terms, you can review online tutorials like https://developers.google.com/machine-learning/crash-course/ (all the way to multi-class neural nets).
Regarding TF-Hub technicalities:
The variables of a Hub module are automatically added to the GLOBAL_VARIABLES and TRAINABLE_VARIABLES collections (if trainable=True, as you already do). No manual extension of those collections should be needed.
hub.Module(..., tags=...) should be set to {"train"} for mode==TRAIN and set to None or the empty set otherwise.
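A minimal sketch of how this could look inside model_fn, combined with the logits/softmax head suggested above (features["image"] and NUM_CLASSES are assumptions for illustration):
is_training = (mode == tf.estimator.ModeKeys.TRAIN)
iv3_module = hub.Module(
    "https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1",
    trainable=True,                               # fine-tuning, as in the question
    tags={"train"} if is_training else None)      # graph variant depends on the mode
image_features = iv3_module(features["image"])    # features["image"] is an assumed input key
logits = tf.layers.dense(image_features, units=NUM_CLASSES)
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)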
In general, it's useful to get a solution working end-to-end for your problem without fine-tuning as a baseline, and then add fine-tuning.

Can I retrain an old model with new data using TensorFlow?

I am new to TensorFlow and I am just trying to see if my idea is even possible.
I have trained a model with a multi-class classifier. Now I can classify a sentence given as input, but I would like to change the result of the CNN, for example to improve the classification score or to change the predicted class.
I want to try to train just a single sentence with its class on an already trained model; is this possible?
If I understand your question correctly, you are trying to reload a previously trained model either to run it through further iterations, test it on a new sentence, or fine tune the model a bit. If this is the case, yes you can do this. Look into saving and restoring models (https://www.tensorflow.org/api_guides/python/state_ops#Saving_and_Restoring_Variables).
To give you a rough outline, when you initially train your model, after setting up the network architecture, set up a saver:
trainable_var = tf.trainable_variables()
sess = tf.Session()
saver = tf.train.Saver()
sess.run(tf.global_variables_initializer())
# Run/train your model until some completion criteria is reached
#....
#....
saver.save(sess, 'model.ckpt')
Now, to reload your model:
sess = tf.Session()
saver = tf.train.import_meta_graph('model.ckpt.meta')
saver.restore(sess, 'model.ckpt')
# Note: if you have already defined all variables before restoring the model, import_meta_graph is not necessary
This will give you access to all the trained variables and you can now feed in whatever new sentence you have. Hope this helps.
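If you then want to nudge the trained model with a single new sentence, a hedged sketch could look like this (the tensor/op names 'input:0', 'labels:0' and 'train_op' are hypothetical and depend on how the original graph was built):
graph = tf.get_default_graph()
x = graph.get_tensor_by_name('input:0')
y = graph.get_tensor_by_name('labels:0')
train_op = graph.get_operation_by_name('train_op')

# one extra training step on the single new example, then save the updated weights
sess.run(train_op, feed_dict={x: [new_sentence_features], y: [new_sentence_label]})
saver.save(sess, 'model_updated.ckpt')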