How can I implement the concept of Scheduled Sampling while training an LSTM Encoder-Decoder with teacher forcing in Keras?
The source paper is here: https://arxiv.org/abs/1506.03099
From this post, I see that pure TensorFlow supports this training helper:
scheduled sampling in Tensorflow
https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/ScheduledOutputTrainingHelper
Is there a way to do this in Keras, maybe using a custom backend function?
The tf.contrib API was removed in TensorFlow 2.x; some tf.contrib modules were moved to TensorFlow Addons.
Use tfa.seq2seq.ScheduledOutputTrainingSampler
tfa.seq2seq.ScheduledOutputTrainingSampler(
sampling_probability: tfa.types.TensorLike,
time_major: bool = False,
seed: Optional[int] = None,
next_inputs_fn: Optional[Callable] = None
)
For more information on the library, see here.
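A minimal sketch of how the sampler plugs into a tfa.seq2seq decoder (the cell size, output dimension, and the variable-based annealing below are my own illustrative assumptions, not from the API docs): with probability sampling_probability the decoder feeds its own previous output back in, instead of the ground-truth input, which is the scheduled-sampling idea from the paper.

import tensorflow as tf
import tensorflow_addons as tfa

output_dim = 16  # outputs are fed back as inputs, so both share this size
cell = tf.keras.layers.LSTMCell(256)
output_layer = tf.keras.layers.Dense(output_dim)

# Start fully teacher-forced; anneal this variable upward during training
# (e.g. from a Keras callback) to follow the schedule from the paper.
sampling_probability = tf.Variable(0.0, trainable=False)

sampler = tfa.seq2seq.ScheduledOutputTrainingSampler(
    sampling_probability=sampling_probability)
decoder = tfa.seq2seq.BasicDecoder(cell, sampler, output_layer=output_layer)

During training you would call the decoder on the ground-truth decoder inputs (with the encoder's final state as initial_state) and increase sampling_probability between epochs, e.g. from a custom callback.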
Related
I am new to TensorFlow and I want to use tf.contrib.legacy_seq2seq, specifically embedding_rnn_seq2seq(), but I can't figure out how to use it (or whether there is an equivalent method) in TensorFlow 2.
I know that in TensorFlow 2, TensorFlow removed contrib and according to this document
tf.contrib.legacy_seq2seq has been deleted and replaced with tf.seq2seq in TensorFlow 2, but I can't find embedding_rnn_seq2seq() in the tf.seq2seq documentation I have seen.
The reason I want to use it is I am trying to implement something similar to what is done with embedding_rnn_seq2seq() in this article. So is there an equivalent in tensorflow 2, or is there a different way to achieve the same goal?
According to https://docs.w3cub.com/tensorflow~python/tf/contrib/legacy_seq2seq/embedding_rnn_seq2seq , contrib.legacy_seq2seq.embedding_rnn_seq2seq creates an embedding of an argument that you pass, encoder_inputs (of shape num_encoder_symbols x input_size). It then runs an RNN to encode the embedded encoder_inputs into a state vector. Next it embeds another argument you pass, decoder_inputs (of shape num_decoder_symbols x input_size), and runs an RNN decoder, initialized with the last encoder state, on the embedded decoder_inputs.
contrib was a community-maintained part of TensorFlow, and seq2seq was part of it. In TensorFlow 2 it was removed.
You could use TensorFlow Addons, which contains community-made add-ons, including (I believe) seq2seq.
You can import TensorFlow Addons via
import tensorflow_addons as tfa
Or you could just use a TensorFlow version that still has seq2seq (1.15 is the last 1.x release).
There are also building blocks such as bidirectional recurrent neural networks and dynamic RNNs that you can combine to achieve much the same thing as seq2seq.
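If you only need the behavior described above (embed the inputs, encode them into a state vector, then decode from that state), a plain tf.keras model can stand in for embedding_rnn_seq2seq. A minimal sketch, with all sizes illustrative:

import tensorflow as tf
from tensorflow.keras import layers, Model

num_encoder_symbols, num_decoder_symbols = 8000, 8000
embed_dim, units = 128, 256

# Encoder: embed token ids, run an RNN, keep its final state.
enc_in = layers.Input(shape=(None,), dtype='int32')
enc_emb = layers.Embedding(num_encoder_symbols, embed_dim)(enc_in)
_, state_h, state_c = layers.LSTM(units, return_state=True)(enc_emb)

# Decoder: embed target ids, initialize the RNN with the encoder state.
dec_in = layers.Input(shape=(None,), dtype='int32')
dec_emb = layers.Embedding(num_decoder_symbols, embed_dim)(dec_in)
dec_out = layers.LSTM(units, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])
logits = layers.Dense(num_decoder_symbols)(dec_out)

model = Model([enc_in, dec_in], logits)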
In tensorflow 1.x this can be done using a graph and a session, which is quite tedious.
Is there an easier way to manually assign pretrained weights to a specific convolution in tensorflow 2.x?
If you are working with Keras inside TensorFlow 2.x, every layer has a method called set_weights that you can use to substitute its weights with (or assign new ones from) NumPy arrays.
Say, for example, that you are doing knowledge distillation. Then you could assign the teacher's weights to the student with:
conv.set_weights(teacher.convx.get_weights())
where conv is a particular layer of the student and convx is the corresponding layer of the teacher.
You can check the documentation for more details:
Documentation - set_weights()
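To make the distillation example above concrete, here is a small self-contained sketch (layer names and shapes are made up for illustration):

import numpy as np
import tensorflow as tf

def make_model(name):
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 3, input_shape=(32, 32, 3), name='conv1'),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10),
    ], name=name)

teacher, student = make_model('teacher'), make_model('student')

# Copy the teacher's conv kernel and bias into the student.
student.get_layer('conv1').set_weights(teacher.get_layer('conv1').get_weights())

# Or assign arbitrary pretrained values from NumPy arrays directly
# (shapes must match the layer's: kernel (3, 3, 3, 16), bias (16,)).
kernel = np.random.randn(3, 3, 3, 16).astype('float32')
bias = np.zeros(16, dtype='float32')
student.get_layer('conv1').set_weights([kernel, bias])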
I am using tf.data.Dataset to create my dataset and am training a CNN with Keras. I need to apply masks to the images, and the mask depends on the shape of the image; there are no predefined pixel coordinates.
When looking for an answer on the internet, I found that there are 2 ways of accessing shapes of images in TensorFlow (in training time):
Using eager execution (which is not enabled by default in my case; I'm using TF 1.12.0)
Using a session
I do not want to use eager execution because it slows down training, and I cannot use a session because I train and test the CNN with Keras (I feed the data to model.fit() using iterators over a tf.data.Dataset).
As a consequence, I have no way of knowing the shapes of images, and thus cannot access specific pixels for data augmentation.
I wrote a function using OpenCV (cv2) that applies the masks. Is there a way to integrate it with the TensorFlow data pipeline?
EDIT: I found a solution. I used tf.py_func to wrap the Python functions.
You can use map to transform the elements of your dataset. You can then use tf.py_function to wrap your cv2 function in a TF op that executes eagerly. In TensorFlow 1.x you would use tf.py_func, though its behavior is a bit different; see the tf.py_function documentation for more info.
So, in TF-2.x it will look something like:
def cv2_func(image, label):
    # your cv2-based masking code goes here; return the transformed pair
    return image, label

def tf_cv2_func(image, label):
    [image, label] = tf.py_function(cv2_func, [image, label], [tf.float32, tf.float64])
    return image, label

train_ds = train_ds.shuffle(BUFFER_SIZE).map(tf_cv2_func).batch(BATCH_SIZE)
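One caveat worth adding (my observation, not part of the answer above): tensors returned by tf.py_function lose their static shape, which can confuse downstream layers, so it often helps to restore whatever shape you know:

def tf_cv2_func(image, label):
    [image, label] = tf.py_function(cv2_func, [image, label], [tf.float32, tf.float64])
    # py_function drops static shape info; restore what you know.
    image.set_shape([None, None, 3])  # e.g. variable-size RGB images
    label.set_shape([])
    return image, label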
NOTE: Since you need image augmentation, here is some information on various image-augmentation libraries. This does not show how to add an OpenCV function into your tf.data pipeline, but if your requirements are standard enough, you may be able to use one of these:
tf.keras.preprocessing.image.ImageDataGenerator
imgaug
albumentations
Data Augmentation in Python

Package: albumentations
library: external
url: Python albumentations library

Package: imgaug :star:
library: external
url: Python imgaug library

Package: tf.keras.preprocessing.image.ImageDataGenerator
library: built into TensorFlow/Keras
url: Python - TensorFlow ImageDataGenerator library

Examples

Example(s)/use of albumentations.
url: Example use-cases of Albumentations

Example(s)/use of imgaug.
url: Data Augmentation for Deep Learning

:star::page_facing_up::heavy_check_mark: Fantastic article
url: Data Augmentation techniques in Python

Example(s)/use of tf.keras.preprocessing.image.ImageDataGenerator.
url: Official example use-case of tf.keras - ImageDataGenerator
url: Building powerful image classification models using very little data
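As a taste of one of these, a minimal albumentations pipeline looks like this (the transform choices and parameters are illustrative; wiring it into tf.data would again go through tf.py_function as shown earlier):

import albumentations as A
import numpy as np

transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
])

image = np.zeros((256, 256, 3), dtype=np.uint8)  # stand-in image
augmented = transform(image=image)['image']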
Is it possible to define a graph in native TensorFlow and then convert this graph to a Keras model?
My intention is simply combining (for me) the best of the two worlds.
I really like the Keras model API for prototyping and new experiments, i.e. using the awesome multi_gpu_model(model, gpus=4) for training with multiple GPUs, saving/loading weights or whole models with one-liners, and all the convenience functions like .fit(), .predict(), and others.
However, I prefer to define my model in native TensorFlow. Context managers in TF are awesome and, in my opinion, it is much easier to implement stuff like GANs with them:
with tf.variable_scope("Generator"):
# define some layers
with tf.variable_scope("Discriminator"):
# define some layers
# model losses
G_train_op = ...AdamOptimizer(...)
.minimize(gloss,
var_list=tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES,
scope="Generator")
D_train_op = ...AdamOptimizer(...)
.minimize(dloss,
var_list=tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES,
scope="Discriminator")
Another bonus is structuring the graph this way. In TensorBoard, debugging complicated native Keras models is hell, since they are not structured at all. With heavy use of variable scopes in native TF you can "disentangle" the graph and look at a very structured version of a complicated model for debugging.
This also lets me set up custom loss functions directly, and I do not have to freeze anything in every training iteration, since TF will only update the weights in the correct scope, which is (at least in my opinion) far easier than the Keras solution of looping over all the existing layers and setting .trainable = False (sketched below).
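For comparison, the Keras pattern referred to above looks roughly like this (a sketch with hypothetical stand-in models):

from keras.layers import Dense, Input
from keras.models import Model, Sequential

# Hypothetical stand-ins for a real generator / discriminator.
generator = Sequential([Dense(784, activation='tanh', input_shape=(100,))])
discriminator = Sequential([Dense(1, activation='sigmoid', input_shape=(784,))])
discriminator.compile(optimizer='adam', loss='binary_crossentropy')

# Freeze the discriminator inside the combined model: training `combined`
# then updates only the generator's weights.
discriminator.trainable = False
z = Input(shape=(100,))
combined = Model(z, discriminator(generator(z)))
combined.compile(optimizer='adam', loss='binary_crossentropy')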
TL;DR:
Long story short: I like the direct access to everything in TF, but most of the time a simple Keras model is sufficient for training, inference, ... later on. The model API is much easier and more convenient in Keras.
Hence, I would prefer to set up a graph in native TF and convert it to Keras for training, evaluation, and so on. Is there any way to do this?
I don't think it is possible to create a generic automated converter for an arbitrary TF graph that comes up with a meaningful set of layers, proper naming, etc., simply because graphs are more flexible than a sequence of Keras layers.
However, you can wrap your model with the Lambda layer. Build your model inside a function, wrap it with Lambda and you have it in Keras:
import tensorflow as tf
from keras.layers import Lambda
from keras.models import Sequential

num_classes = 10  # example value

def model_fn(x):
    layer_1 = tf.layers.dense(x, 100)
    layer_2 = tf.layers.dense(layer_1, 100)
    out_layer = tf.layers.dense(layer_2, num_classes)
    return out_layer

model = Sequential()
model.add(Lambda(model_fn, input_shape=(64,)))  # input_shape is illustrative
That is essentially what happens when you use multi_gpu_model: you end up with three layers: Input, the wrapped model, and Output.
Keras Apologetics
However, the integration between TensorFlow and Keras can be much tighter and more meaningful. See this tutorial for use cases.
For instance, variable scopes can be used pretty much like in TensorFlow:
x = tf.placeholder(tf.float32, shape=(None, 20, 64))
with tf.name_scope('block1'):
y = LSTM(32, name='mylstm')(x)
The same for manual device placement:
with tf.device('/gpu:0'):
x = tf.placeholder(tf.float32, shape=(None, 20, 64))
y = LSTM(32)(x) # all ops / variables in the LSTM layer will live on GPU:0
Custom losses are discussed here: Keras: clean implementation for multiple outputs and custom loss functions?
This is how my model defined in Keras looks in TensorBoard: [TensorBoard graph screenshot]
So, Keras is indeed only a simplified frontend to TensorFlow, and you can mix the two quite flexibly. I would recommend inspecting the source code of the Keras model zoo for clever solutions and patterns that allow you to build complex models using the clean Keras API.
You can insert TensorFlow code directly into your Keras model or training pipeline! Since mid-2017, Keras has been fully adopted into and integrated with TensorFlow. This article goes into more detail.
This means that your TensorFlow model is already a Keras model and vice versa. You can develop in Keras and switch to TensorFlow whenever you need to. TensorFlow code will work with Keras APIs, including Keras APIs for training, inference and saving your model.
How do I update moving mean and moving variance in keras BatchNormalization?
I found this in the TensorFlow documentation, but I don't know where to put train_op or how to use it with Keras models:
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
train_op = optimizer.minimize( loss )
No posts I found say what to do with train_op, or whether it can be used with model.compile.
You do not need to manually update the moving mean and variance if you are using the BatchNormalization layer. Keras takes care of updating these parameters during training and keeps them fixed during testing (via the model.predict and model.evaluate functions, and likewise with model.fit_generator and friends).
Keras also keeps track of the learning phase, so different codepaths run during training and validation/testing.
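A minimal sketch to make this concrete (layer sizes are illustrative); note that nothing extra is needed around fit():

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, input_shape=(10,)),
    tf.keras.layers.BatchNormalization(name='bn'),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')
# model.fit(...) updates the BN moving statistics automatically;
# model.predict(...) / model.evaluate(...) use them frozen.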
If you just need to update the weights of an existing model with some new values, you can do the following:
w = model.get_layer('batchnorm_layer_name').get_weights()
# Weight order: [gamma, beta, moving_mean, moving_variance]
for j in range(len(w[0])):
    gamma = w[0][j]
    beta = w[1][j]
    run_mean = w[2][j]
    run_var = w[3][j]
    w[2][j] = new_run_mean_value1  # new moving mean for unit j
    w[3][j] = new_run_var_value2   # new moving variance for unit j
model.get_layer('batchnorm_layer_name').set_weights(w)
There are two interpretations of the question: the first assumes that the goal is to use the high-level training API; that question was answered by Matias Valdenegro.
The second, as discussed in the comments, is whether it is possible to use batch normalization with the standard TensorFlow optimizer, as discussed here: Keras as a simplified interface to TensorFlow, in the section "Collecting trainable weights and state updates". As mentioned there, the update ops are accessible via layer.updates, not via tf.GraphKeys.UPDATE_OPS. In fact, if you have a Keras model in TensorFlow, you can optimize it with a standard TensorFlow optimizer and batch normalization like this:
update_ops = model.updates
with tf.control_dependencies(update_ops):
train_op = optimizer.minimize( loss )
and then use a TensorFlow session to fetch the train_op. To distinguish the training and evaluation modes of the batch normalization layer, you need to feed in the learning-phase state of the Keras engine (see "Different behaviors during training and testing" on the same tutorial page linked above). This would work, for example, like this:
...
# train
lo, _ = tf_sess.run(fetches=[loss, train_op],
                    feed_dict={tf_batch_data: bd,
                               tf_batch_labels: bl,
                               tensorflow.keras.backend.learning_phase(): 1})
...
# eval
lo, = tf_sess.run(fetches=[loss],
                  feed_dict={tf_batch_data: bd,
                             tf_batch_labels: bl,
                             tensorflow.keras.backend.learning_phase(): 0})
I tried this in TensorFlow 1.12 and it works with models containing batch normalization. Given my existing TensorFlow code, and with TensorFlow 2.0 approaching, I was tempted to use this approach myself. However, since it is not mentioned in the TensorFlow documentation, I am not sure it will be supported in the long term, so I finally decided against it and invested a little more effort in changing the code to use the high-level API.