Implementation of BERT in keras with TF_HUB - tensorflow

I am trying to implement the Google BERT model in tensorflow-keras using TensorFlow Hub. For this I designed a custom Keras layer, BertLayer. The problem is that when I compile the Keras model, it keeps raising:
AttributeError: 'Bertlayer' object has no attribute '_keras_style'
I don't know what I am doing wrong or what the _keras_style attribute is. Please help me find the error in the code.
This is the github link to the full code: https://github.com/PradyumnaGupta/BERT/blob/master/Untitled21.ipynb
class BertLayer(tf.layers.Layer):
    def __init__(self, n_fine_tune_layers=10, **kwargs):
        self.n_fine_tune_layers = n_fine_tune_layers
        self.trainable = True
        self.output_size = 768
        super(BertLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.bert = hub.Module(
            bert_path,
            trainable=self.trainable,
            name="{}_module".format(self.name)
        )
        trainable_vars = self.bert.variables
        # Remove unused layers
        trainable_vars = [var for var in trainable_vars if not "/cls/" in var.name]
        # Select how many layers to fine tune
        trainable_vars = trainable_vars[-self.n_fine_tune_layers :]
        # Add to trainable weights
        for var in trainable_vars:
            self._trainable_weights.append(var)
        for var in self.bert.variables:
            if var not in self._trainable_weights:
                self._non_trainable_weights.append(var)
        super(BertLayer, self).build(input_shape)

    def call(self, inputs):
        inputs = [K.cast(x, dtype="int32") for x in inputs]
        input_ids, input_mask, segment_ids = inputs
        bert_inputs = dict(
            input_ids=input_ids, input_mask=input_mask, segment_ids=segment_ids
        )
        result = self.bert(inputs=bert_inputs, signature="tokens", as_dict=True)[
            "pooled_output"
        ]
        return result

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.output_size)

TensorFlow 1.x is a bit misleading here. It actually has two base classes called Layer. One is tf.layers.Layer, the one you are using; it is intended for implementing shortcut wrappers over regular TF operations. The other, from tensorflow.keras.layers import Layer, is for Keras-style models and sequences.
Judging by your error, you are using keras/models for the training that follows.
You should probably start by deriving your layer from keras.layers.Layer instead of tf.layers.Layer.
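To illustrate the suggestion, here is a minimal sketch of a custom layer derived from tf.keras.layers.Layer. The layer name ScaleLayer and its single weight are hypothetical stand-ins, not the BERT layer from the question; only the choice of base class is the point.

```python
import tensorflow as tf


class ScaleLayer(tf.keras.layers.Layer):
    """Hypothetical toy layer: multiplies its input by one trainable scalar."""

    def build(self, input_shape):
        # add_weight registers the variable with Keras so it is tracked and trained.
        self.scale = self.add_weight(
            name="scale", shape=(1,), initializer="ones", trainable=True
        )

    def call(self, inputs):
        return inputs * self.scale


# Because the layer derives from the Keras base class, it can be used
# inside a Keras model and compiled like any built-in layer.
inputs = tf.keras.Input(shape=(4,))
outputs = ScaleLayer()(inputs)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="sgd", loss="mse")
```

The same pattern (weights created in build, computation in call) should carry over to a layer that wraps a hub module, but that part is not shown here.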

Related

How to create a keras layer with a custom gradient *and learnable parameters* in TF2.0?

This is a similar question to: How to create a keras layer with a custom gradient in TF2.0?
Only, I would like to introduce a learnable parameter into the custom layer that I am training.
Here's a toy example of my current approach:
# Method for calculation custom gradient
@tf.custom_gradient
def scaler(x, s):
    def grad(upstream):
        dy_dx = s
        dy_ds = x
        return dy_dx, dy_ds
    return x * s, grad

# Keras Layer with trainable parameter
class TestLayer(tf.keras.layers.Layer):
    def build(self, input_shape):
        self.scale = self.add_weight("scale",
                                     shape=[1,],
                                     initializer=tf.keras.initializers.Constant(value=2.0),
                                     trainable=True)
    def call(self, inputs):
        return scaler(inputs, self.scale)

# Creates Keras Model that uses the layer
def Model():
    x_in = tf.keras.layers.Input(shape=(1,))
    x_out = TestLayer()(x_in)
    return tf.keras.Model(inputs=x_in, outputs=x_out, name="fp8_test")

# Create toy dataset, want to learn `scale` such to satisfy 5 = 2 * scale (i.e, `scale` should learn ~2.5)
def Dataset():
    inps = tf.ones(shape=(10**5,)) * 2 # inputs
    expected = tf.ones(shape=(10**5,)) * 5 # targets
    data_in = tf.data.Dataset.from_tensors(inps)
    data_exp = tf.data.Dataset.from_tensors(expected)
    dataset = tf.data.Dataset.zip((data_in, data_exp))
    return dataset

model = Model()
model.summary()
dataset = Dataset()
# Use `MSE` loss and `SGD` optimizer
model.compile(
    loss=tf.keras.losses.MSE,
    optimizer=tf.keras.optimizers.SGD(),
)
model.fit(dataset, epochs=100)
This is failing with the following shape related error in the optimizer:
ValueError: Shapes must be equal rank, but are 1 and 2 for '{{node SGD/SGD/update/ResourceApplyGradientDescent}} = ResourceApplyGradientDescent[T=DT_FLOAT, use_locking=true](fp8_test/test_layer_1/ReadVariableOp/resource, SGD/Identity, SGD/IdentityN)' with input shapes: [], [], [100000,1].
I've been staring at the docs for a while, I'm a bit stumped as to why this isn't working, I would really appreciate any input on how to fix this toy example.
Thanks in advance.
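One likely cause, judging from the shapes in the error (a gradient of shape [100000,1] being applied to the scale variable): grad returns x as the gradient for s, so that gradient has the shape of the input rather than the shape of s, and upstream is not used at all. The following is a minimal sketch under that assumption. It applies the chain rule and sums the per-element contributions down to the shape of s. The toy tensors at the bottom are made up for illustration.

```python
import tensorflow as tf


@tf.custom_gradient
def scaler(x, s):
    def grad(upstream):
        # Chain rule: multiply the local derivatives by the upstream gradient.
        dy_dx = upstream * s
        # s is shared by every element of x, so its gradient is the sum of all
        # per-element contributions, reshaped to the shape of s.
        dy_ds = tf.reshape(tf.reduce_sum(upstream * x), tf.shape(s))
        return dy_dx, dy_ds

    return x * s, grad


# Toy check with plain tensors (illustrative values only).
x = tf.constant([[2.0], [2.0], [2.0]])
s = tf.constant([3.0])
with tf.GradientTape() as tape:
    tape.watch([x, s])
    y = scaler(x, s)
dx, ds = tape.gradient(y, [x, s])
```

With this change, scaler(inputs, self.scale) in TestLayer.call should hand the optimizer a gradient with the same shape as scale, although the dataset construction in the question may still need a separate look.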

How to save and reload a Subclassed model in TF 2.6.0 / Python 3.9.7 without performance drop?

This looks like the million-dollar question. I have the model below, built by subclassing Model in Keras.
The model trains fine and performs well, but I cannot find a way to save and restore it without incurring a significant performance loss.
I track AUC on ROC curves for anomaly detection, and the ROC curve after loading the model is worse than before, using exactly the same validation data set.
I suspect the problem comes from the BatchNormalization layers, but I could be wrong.
I've tried several options:
This works but leads to a performance drop:
model.save() / tf.keras.models.load_model()
This works but also leads to a performance drop:
model.save_weights() / model.load_weights()
This does not work, and I get the following error:
tf.saved_model.save() / tf.saved_model.load()
AttributeError: '_UserObject' object has no attribute 'predict'
This does not work either, as subclassed models do not support JSON export:
model.to_json()
Here is the model:
class Deep_Seq2Seq_Detector(Model):
    def __init__(self, flight_len, param_len, hidden_state=16):
        super(Deep_Seq2Seq_Detector, self).__init__()
        self.input_dim = (None, flight_len, param_len)
        self._name_ = "LSTM"
        self.units = hidden_state
        self.regularizer0 = tf.keras.Sequential([
            layers.BatchNormalization()
        ])
        self.encoder1 = layers.LSTM(self.units,
                                    return_state=False,
                                    return_sequences=True,
                                    #activation="tanh",
                                    name='encoder1',
                                    input_shape=self.input_dim)#,
                                    #kernel_regularizer= tf.keras.regularizers.l1(),
                                    #)
        self.regularizer1 = tf.keras.Sequential([
            layers.BatchNormalization(),
            layers.Activation("tanh")
        ])
        self.encoder2 = layers.LSTM(self.units,
                                    return_state=False,
                                    return_sequences=True,
                                    #activation="tanh",
                                    name='encoder2')#,
                                    #kernel_regularizer= tf.keras.regularizers.l1()
                                    #) # input_shape=(None, self.input_dim[1],self.units),
        self.regularizer2 = tf.keras.Sequential([
            layers.BatchNormalization(),
            layers.Activation("tanh")
        ])
        self.encoder3 = layers.LSTM(self.units,
                                    return_state=True,
                                    return_sequences=False,
                                    activation="tanh",
                                    name='encoder3')#,
                                    #kernel_regularizer= tf.keras.regularizers.l1(),
                                    #) # input_shape=(None, self.input_dim[1],self.units),
        self.repeat = layers.RepeatVector(self.input_dim[1])
        self.decoder = layers.LSTM(self.units,
                                   return_sequences=True,
                                   activation="tanh",
                                   name="decoder",
                                   input_shape=(self.input_dim[1],self.units))
        self.dense = layers.TimeDistributed(layers.Dense(self.input_dim[2]))

    @tf.function
    def call(self, x):
        # Encoder
        x0 = self.regularizer0(x)
        x1 = self.encoder1(x0)
        x11 = self.regularizer1(x1)
        x2 = self.encoder2(x11)
        x22 = self.regularizer2(x2)
        output, hs, cs = self.encoder3(x22)
        # see https://www.tensorflow.org/guide/keras/rnn
        encoded_state = [hs, cs]
        repeated_vec = self.repeat(output)
        # Decoder
        decoded = self.decoder(repeated_vec, initial_state=encoded_state)
        output_decoder = self.dense(decoded)
        return output_decoder
I've seen GitHub threads, but no straight answer:
https://github.com/keras-team/keras/issues/4875
Did anyone find a solution? Do I have to use the Functional or Sequential API instead?
It seems the problem was coming from the Subclassing API.
I reconstructed the exact same model using the Functional API, and now model.save / model.load yields similar results.
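The rebuilt model itself is not shown, so the following is only a rough sketch of what a Functional API version of the same layer stack might look like. The function name build_detector and the sizes used at the bottom are assumptions made for illustration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers


def build_detector(flight_len, param_len, hidden_state=16):
    """Functional-API version of the layer stack in Deep_Seq2Seq_Detector."""
    inputs = tf.keras.Input(shape=(flight_len, param_len))
    # Encoder
    x = layers.BatchNormalization()(inputs)
    x = layers.LSTM(hidden_state, return_sequences=True, name="encoder1")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("tanh")(x)
    x = layers.LSTM(hidden_state, return_sequences=True, name="encoder2")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("tanh")(x)
    output, hs, cs = layers.LSTM(hidden_state, return_state=True, name="encoder3")(x)
    # Decoder, initialised with the final encoder state
    x = layers.RepeatVector(flight_len)(output)
    x = layers.LSTM(hidden_state, return_sequences=True, name="decoder")(
        x, initial_state=[hs, cs]
    )
    outputs = layers.TimeDistributed(layers.Dense(param_len))(x)
    return tf.keras.Model(inputs, outputs, name="LSTM")


model = build_detector(flight_len=8, param_len=3)

# Copy the weights into a freshly built model, which mimics a
# save_weights / load_weights round trip without touching the disk.
restored = build_detector(flight_len=8, param_len=3)
restored.set_weights(model.get_weights())

batch = np.random.rand(2, 8, 3).astype("float32")
before = model(batch, training=False).numpy()
after = restored(batch, training=False).numpy()
same = np.allclose(before, after, atol=1e-6)
```

Comparing the outputs before and after the round trip in inference mode (training=False) is one way to check that nothing was lost. This sketch does not explain why the subclassed version degraded.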

TensorFlow 2.3: load model from ModelCheckPoint callback with both custom layers and model

I have written custom code to build a UNet architecture. To do so, I first subclassed tf.keras.layers.Layer to define an encoder convolutional block composed of a Conv3D layer, a BatchNormalization layer and an Activation layer. Similarly, I defined a decoder inverse convolutional block composed of a Conv3DTranspose layer, a BatchNormalization layer, an Activation layer and a Concatenate layer. Finally, I subclassed tf.keras.Model to define the full model, composed of 4 encoding blocks and 4 decoding blocks.
To checkpoint the model while training I used the tf.keras.callbacks.ModelCheckpoint callback. However, when I try to load the model back (it is in fact still training) with tf.keras.models.load_model(), I receive the following error: ValueError: No model found in config file.
Here is the full code for the model definition, building and fitting:
import tensorflow as tf

# Encoder block
class ConvBlock(tf.keras.layers.Layer):
    def __init__(self, n_filters, conv_size, conv_stride, **kwargs):
        super(ConvBlock, self).__init__(**kwargs)
        self.conv3D = tf.keras.layers.Conv3D(
            filters=n_filters,
            kernel_size=conv_size,
            strides=conv_stride,
            padding="same",
        )
        self.batch_norm = tf.keras.layers.BatchNormalization()
        self.relu = tf.keras.layers.Activation("relu")

    def call(self, inputs, training=None):
        h = self.conv3D(inputs)
        if training:
            h = self.batch_norm(h)
        h = self.relu(h)
        return h

# Decoder block
class InvConvBlock(tf.keras.layers.Layer):
    def __init__(self, n_filters, conv_size, conv_stride, activation, **kwargs):
        super(InvConvBlock, self).__init__(**kwargs)
        self.conv3D_T = tf.keras.layers.Conv3DTranspose(
            filters=n_filters,
            kernel_size=conv_size,
            strides=conv_stride,
            padding="same",
        )
        self.batch_norm = tf.keras.layers.BatchNormalization()
        self.activ = tf.keras.layers.Activation(activation)
        self.concat = tf.keras.layers.Concatenate(axis=-1)

    def call(self, inputs, feat_concat=None, training=None):
        h = self.conv3D_T(inputs)
        if training:
            h = self.batch_norm(h)
        h = self.activ(h)
        if feat_concat is not None:
            h = self.concat([h, feat_concat])
        return h

class UNet(tf.keras.Model):
    def __init__(self, n_filters, e_size, e_stride, d_size, d_stride, **kwargs):
        super(UNet, self).__init__(**kwargs)
        # Encoder
        self.conv_block_1 = ConvBlock(n_filters, e_size, e_stride)
        self.conv_block_2 = ConvBlock(n_filters * 2, e_size, e_stride)
        self.conv_block_3 = ConvBlock(n_filters * 4, e_size, (1, 1, 1))
        self.conv_block_4 = ConvBlock(n_filters * 8, e_size, (1, 1, 1))
        # Decoder
        self.inv_conv_block_1 = InvConvBlock(n_filters * 4, d_size, (1, 1, 1), "relu")
        self.inv_conv_block_2 = InvConvBlock(n_filters * 2, d_size, (1, 1, 1), "relu")
        self.inv_conv_block_3 = InvConvBlock(n_filters, d_size, d_stride, "relu")
        self.inv_conv_block_4 = InvConvBlock(1, d_size, d_stride, "sigmoid")

    def call(self, inputs, **kwargs):
        h1 = self.conv_block_1(inputs, **kwargs)
        h2 = self.conv_block_2(h1, **kwargs)
        h3 = self.conv_block_3(h2, **kwargs)
        h = self.conv_block_4(h3, **kwargs)
        h = self.inv_conv_block_1(h, feat_concat=h3, **kwargs)
        h = self.inv_conv_block_2(h, feat_concat=h2, **kwargs)
        h = self.inv_conv_block_3(h, feat_concat=h1, **kwargs)
        h = self.inv_conv_block_4(h, **kwargs)
        return h

model = UNet(
    n_filters,
    e_size,
    e_stride,
    d_size,
    d_stride,
)
model.build((None, *input_shape, 1))
loss = tf.keras.losses.BinaryCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(learning_rate)
metrics = [tf.keras.metrics.Precision(), tf.keras.metrics.Recall()]
model.compile(
    loss=loss,
    optimizer=optimizer,
    metrics=metrics,
)
CP_callback = tf.keras.callbacks.ModelCheckpoint(
    f"{checkpoint_dir}/model.h5", save_freq='epoch', monitor="loss"
)
model.fit(
    data,
    epochs=opts.epochs,
    callbacks=[CP_callback],
)
To load the model I used the following code on another python console:
import tensorflow as tf
model = tf.keras.models.load_model(f'{checkpoint_dir}/model.h5')
but here I receive the above mentioned error. What am I missing? Or what am I doing wrong?
Thank you in advance for your help.
This is because you don't define the get_config method in your custom layers. For this, check this existing answer on SO.
Otherwise, you can save only the trained weights (not the full model) and load them as follows. In that case you don't need to define get_config, although defining it is still good practice. Here is a workaround for your problem:
# callback
tf.keras.callbacks.ModelCheckpoint('model.h5',
                                   monitor='val_loss',
                                   verbose=1,
                                   save_best_only=True,
                                   mode='min',
                                   save_weights_only=True) # <---- only save weight

# train
model = UNet(
    n_filters,
    e_size,
    e_stride,
    d_size,
    d_stride,
)
model.compile(...)
model.fit(...)

# inference
model = UNet(
    n_filters,
    e_size,
    e_stride,
    d_size,
    d_stride,
)
model.build((None, *input_shape, 1))
model.load_weights('model.h5')
For more details, see the documentation on Serialization and saving and also the Colab demonstration by François Chollet. We've also written an article about model subclassing and custom training in TF 2.x; its Save and Load section (at the bottom) demonstrates several strategies. Hope that helps.
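For reference, the get_config method mentioned at the start of this answer might look like the sketch below for a block like ConvBlock. It is a minimal illustration that assumes the constructor arguments are all that is needed to rebuild the layer, and it has not been checked against the notebook in the question. Note one deliberate difference from the question's code: training is passed through to BatchNormalization instead of skipping the layer at inference time.

```python
import tensorflow as tf


class ConvBlock(tf.keras.layers.Layer):
    def __init__(self, n_filters, conv_size, conv_stride, **kwargs):
        super().__init__(**kwargs)
        # Keep the constructor arguments so get_config can return them.
        self.n_filters = n_filters
        self.conv_size = conv_size
        self.conv_stride = conv_stride
        self.conv3D = tf.keras.layers.Conv3D(
            filters=n_filters, kernel_size=conv_size, strides=conv_stride, padding="same"
        )
        self.batch_norm = tf.keras.layers.BatchNormalization()
        self.relu = tf.keras.layers.Activation("relu")

    def call(self, inputs, training=None):
        h = self.conv3D(inputs)
        h = self.batch_norm(h, training=training)
        return self.relu(h)

    def get_config(self):
        # Extend the base config (name, dtype, ...) with the constructor arguments.
        config = super().get_config()
        config.update(
            {
                "n_filters": self.n_filters,
                "conv_size": self.conv_size,
                "conv_stride": self.conv_stride,
            }
        )
        return config


block = ConvBlock(n_filters=4, conv_size=3, conv_stride=1, name="enc_1")
# from_config rebuilds an equivalent, untrained layer from the config dict.
clone = ConvBlock.from_config(block.get_config())
out = block(tf.zeros((1, 4, 4, 4, 1)))
```

This only shows the layer-level config round trip. It does not by itself guarantee that the whole subclassed model can be saved to and restored from HDF5.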
Update
I've run your public Colab notebook. Unfortunately, I am facing the same issue. It's a bit weird, and currently I don't have an exact answer for saving the entire model in the ModelCheckpoint callback with custom layers, even if we define the get_config() method.
However, there is another workaround that may come in handy. As we know, there are two major formats for saving tf models: SavedModel and HDF5. The workaround is to choose the SavedModel format, which is the recommended one and safe to use.
The key difference between HDF5 and SavedModel is that HDF5 uses object configs to save the model architecture, while SavedModel saves the execution graph. Thus, SavedModels are able to save custom objects like subclassed models and custom layers without requiring the original code.
Now, as for your requirements, you are saving the entire model at the best loss or val_loss during training. For that, we can define a custom callback to save the model at the lowest val_loss (or whatever you want), as follows:
class SaveModelH5(tf.keras.callbacks.Callback):
    def on_train_begin(self, logs=None):
        self.val_loss = []

    def on_epoch_end(self, epoch, logs=None):
        current_val_loss = logs.get("val_loss")
        self.val_loss.append(logs.get("val_loss"))
        if current_val_loss <= min(self.val_loss):
            print('Find lowest val_loss. Saving entire model.')
            self.model.save('unet', save_format='tf') # < ----- Here

save_model = SaveModelH5()
unet.fit(.., callbacks=save_model)
Using
model.save('any_name', save_format='tf')
creates an any_name working directory containing assets, saved_model.pb, and variables. The model architecture and training configuration, including the optimizer, losses, and metrics, are stored in saved_model.pb. The weights are saved in the variables directory.
When saving the model and its layers, the SavedModel format stores the class name, call function, losses, and weights (and the config, if implemented). The call function defines the computation graph of the model/layer. In the absence of the model/layer config, the call function is used to create a model that behaves like the original model and can be trained, evaluated, and used for inference. When we need to reload the saved model, we can do as follows:
new_unet = tf.keras.models.load_model("unet", compile=False)
Colab.

Keras changing input shape for functional API - Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor

I have trained my functional model in Keras with images of dimensions 120x120, and now I would like to use this trained model with input images of a different shape. The common answer is to use None in the input shape; however, in my case it throws:
Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor
My training model is:
input_i = Input(shape = (120, 120, 1))
model = Model(input_i, all_together(input_i))
Once it is trained, I try to build a new model and load the weights from the trained one:
input_img = Input(shape = (None, None, 1))
model = Model(input_img, all_together(input_img))  # <--- error
....
loading weights and so on
Do you have some recommendation on how to avoid this behaviour?
Numpy:1.16.4
Keras:2.3.1
Tensorflow: 2.0.0
Something in your model does not support a variable dimension.
A Flatten layer, for instance. You need to use only layers that support variable dimensions. A GlobalMaxPooling2D or a GlobalAveragePooling2D can replace the flattening.
A PReLU layer is another example, since by default its alpha weights take the full input shape. You can build your own PReLU whose alpha is shared across all but the last axis:
class MyPrelu(Layer):
    def __init__(self, **kwargs):
        super(MyPrelu, self).__init__(**kwargs)
        self.alpha_initializer = initializers.get('zeros')
        self.alpha_regularizer = regularizers.get(None)
        self.alpha_constraint = constraints.get(None)

    def build(self, input_shape):
        param_shape = tuple(1 for i in range(len(input_shape)-1)) + input_shape[-1:]
        self.alpha = self.add_weight(shape=param_shape,
                                     name='alpha',
                                     initializer=self.alpha_initializer,
                                     regularizer=self.alpha_regularizer,
                                     constraint=self.alpha_constraint)
        self.built = True

    def call(self, inputs, mask=None):
        pos = K.relu(inputs)
        neg = -self.alpha * K.relu(-inputs)
        return pos + neg

    def compute_output_shape(self, input_shape):
        return input_shape

    #if you want to have those initializers and other parameters, check the source code and add this:
    def get_config(.....):
        ....
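If the custom layer is only needed so that alpha does not depend on the spatial size, the built-in PReLU with shared_axes may be an alternative. The model below is an assumption made for illustration, since the layers inside all_together are not shown. It only demonstrates that a model built from size-agnostic layers accepts images of different sizes.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(None, None, 1))
x = layers.Conv2D(8, 3, padding="same")(inputs)
# Sharing alpha over height and width gives it shape (1, 1, channels),
# so it no longer depends on the image size.
x = layers.PReLU(shared_axes=[1, 2])(x)
# Global pooling replaces Flatten, which needs fixed spatial dimensions.
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)

# The same model runs on two different image sizes.
small = model(np.zeros((2, 64, 64, 1), dtype="float32"))
large = model(np.zeros((2, 120, 120, 1), dtype="float32"))
```

Weights trained with a fixed 120x120 input could then be loaded into such a model, provided every layer's weight shapes are independent of the image size.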

How to access recursive layers of custom layer in tensorflow keras

Following the TensorFlow Keras example, I can create a custom layer that contains Linear layers nested inside it:
class MLPBlock(layers.Layer):
    def __init__(self):
        super(MLPBlock, self).__init__()
        self.linear_1 = Linear(32)
        self.linear_2 = Linear(32)
        self.linear_3 = Linear(1)

    def call(self, inputs):
        x = self.linear_1(inputs)
        x = tf.nn.relu(x)
        x = self.linear_2(x)
        x = tf.nn.relu(x)
        return self.linear_3(x)
How do I access all the component layers of a custom layer? I want to access the weights and biases of all the component layers.
For example:
MLPBlock (parent layer):
    linear_1
    linear_2
    linear_3
I have looked into the TensorFlow Keras API guide for r1.14, https://www.tensorflow.org/guide/keras,
but could not find any way to do this.
I assume that you are following this tutorial. Based on that, here is how you can access the weights:
class MLPBlock(tf.keras.Model):
    def __init__(self):
        super(MLPBlock, self).__init__()
        self.linear_1 = tf.keras.layers.Dense(32)
        self.linear_2 = tf.keras.layers.Dense(32)
        self.linear_3 = tf.keras.layers.Dense(1)

    def call(self, inputs):
        x = self.linear_1(inputs)
        x = tf.nn.relu(x)
        x = self.linear_2(x)
        x = tf.nn.relu(x)
        return self.linear_3(x)

mlp_block = MLPBlock()
y = mlp_block(tf.ones(shape=(3, 64)))
for layer in mlp_block.layers:
    weights, biases = layer.get_weights()
Please note that I slightly modified the example so that you can access the layer's weights and biases. Namely, what I did is instead of subclassing the class with tf.keras.layers.Layer, I subclassed with tf.keras.Model so that the stack of layers can be treated as a model, and then you can access the layers of that model. Then, instead of using the custom Linear layer, I used the tf.keras.layers.Dense for simplicity, however, using the custom layer should not make a difference.
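If the block has to stay a tf.keras.layers.Layer, the sublayers can still be reached through the attributes they were assigned to, and the parent's weights property collects the variables of all tracked sublayers. A minimal sketch, again using Dense in place of the tutorial's Linear layer:

```python
import tensorflow as tf


class MLPBlock(tf.keras.layers.Layer):
    def __init__(self):
        super().__init__()
        self.linear_1 = tf.keras.layers.Dense(32)
        self.linear_2 = tf.keras.layers.Dense(32)
        self.linear_3 = tf.keras.layers.Dense(1)

    def call(self, inputs):
        x = tf.nn.relu(self.linear_1(inputs))
        x = tf.nn.relu(self.linear_2(x))
        return self.linear_3(x)


mlp_block = MLPBlock()
_ = mlp_block(tf.ones(shape=(3, 64)))  # the first call creates the weights

# Access each sublayer through its attribute name.
shapes = {}
for name in ("linear_1", "linear_2", "linear_3"):
    kernel, bias = getattr(mlp_block, name).get_weights()
    shapes[name] = (kernel.shape, bias.shape)

# The parent layer also exposes every nested variable in one flat list.
n_variables = len(mlp_block.weights)
```

Here shapes maps each sublayer name to its kernel and bias shapes, and n_variables counts one kernel and one bias per Dense layer.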