tensorflow2: keras: model.fit() callbacks and eager mode - tensorflow

I am running TensorFlow 2.1 with the Keras API, using the following coding style:
model = tf.keras.Sequential()
...
model.fit(..., callbacks=callbacks)
Now I would like to save the value of an intermediate layer's output tensor as an image summary (as a sample of what is happening at the n-th training step). To do this, I've implemented my own callback class. I've also studied how keras.callbacks.TensorBoard is implemented, since it can save layer weights as image summaries.
I do the following in my on_epoch_end:
tensor = self.model.get_layer(layer_name).output
with context.eager_mode():
    with ops.init_scope():
        tensor = tf.keras.backend.get_value(tensor)
    tf.summary.image(layer_name, tensor, step=step, max_outputs=1)
Unfortunately, I am still getting an error related to eager/graph modes:
    tensor = tf.keras.backend.get_value(tensor)
  File "/home/matwey/lab/venv/lib/python3.6/site-packages/tensorflow_core/python/keras/backend.py", line 3241, in get_value
    return x.numpy()
AttributeError: 'Tensor' object has no attribute 'numpy'
Unfortunately, there is little to no documentation on how to correctly combine Keras callbacks and tf.summary.image. How can I overcome this issue?
Update: tf_nightly-2.2.0.dev20200427 shows the same behaviour.
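A likely explanation for the error, offered here as an assumption rather than something confirmed in the post: `get_layer(layer_name).output` is a symbolic tensor that describes the layer's output but holds no data, so there is nothing for `get_value` to return regardless of the eager/graph context. One commonly used workaround is to build a small sub-model that ends at the layer of interest and run it on a fixed sample batch inside the callback. The sketch below illustrates that idea only; the class name `LayerImageSummary`, the toy model, the layer name `"conv"`, and the random data are all hypothetical.

```python
import tempfile

import numpy as np
import tensorflow as tf


class LayerImageSummary(tf.keras.callbacks.Callback):
    """Hypothetical callback: logs one intermediate activation map per epoch."""

    def __init__(self, layer_name, sample_batch, log_dir):
        super().__init__()
        self.layer_name = layer_name
        self.sample_batch = sample_batch
        self.writer = tf.summary.create_file_writer(log_dir)
        self.probe = None
        self.last_image_shape = None

    def on_train_begin(self, logs=None):
        # Sub-model sharing the trained weights, ending at the chosen layer.
        self.probe = tf.keras.Model(
            self.model.inputs, self.model.get_layer(self.layer_name).output)

    def on_epoch_end(self, epoch, logs=None):
        # Running the sub-model on real data yields a concrete (eager) tensor.
        activations = self.probe(self.sample_batch, training=False)
        # tf.summary.image expects [batch, height, width, channels] with
        # 1, 3 or 4 channels, so keep only the first feature map.
        image = activations[..., :1]
        # Rescale to [0, 1] so the float image is displayed sensibly.
        lo, hi = tf.reduce_min(image), tf.reduce_max(image)
        image = (image - lo) / tf.maximum(hi - lo, 1e-8)
        with self.writer.as_default():
            tf.summary.image(self.layer_name, image, step=epoch, max_outputs=1)
        self.last_image_shape = tuple(image.shape)


# Toy model and random data, used only to exercise the callback.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8, 8, 1)),
    tf.keras.layers.Conv2D(4, 3, padding="same", activation="relu", name="conv"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")

x = np.random.rand(16, 8, 8, 1).astype("float32")
y = np.random.rand(16, 1).astype("float32")

cb = LayerImageSummary("conv", x[:1], tempfile.mkdtemp())
model.fit(x, y, epochs=1, batch_size=8, verbose=0, callbacks=[cb])
```

The sample batch is fixed at construction time so that successive summaries show the same input and differ only through the updated weights.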

Related

"ValueError: Your Layer or Model is in an invalid state." after upgrading to tensorflow federated 0.17.0 from 0.16.1

I am running into an error after upgrading to TFF 0.17.0. The same code works perfectly in TFF 0.16.1. Training works fine in both versions; however, when I try to copy weights from the FL state to the model in order to evaluate it on the test dataset, I get the following error:
  File "fl/main_fl.py", line 166, in keras_evaluate
    loss, accuracy = self.model.evaluate(test_dataset, verbose=0)
  File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_v1.py", line 905, in evaluate
    self._assert_built_as_v1()
  File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer_v1.py", line 852, in _assert_built_as_v1
    (type(self),))
ValueError: Your Layer or Model is in an invalid state. This can happen for the following cases:
1. You might be interleaving estimator/non-estimator models or interleaving models/layers made in tf.compat.v1.Graph.as_default() with models/layers created outside of it. Converting a model to an estimator (via model_to_estimator) invalidates all models/layers made before the conversion (even if they were not the model converted to an estimator). Similarly, making a layer or a model inside a a tf.compat.v1.Graph invalidates all layers/models you previously made outside of the graph.
2. You might be using a custom keras layer implementation with custom __init__ which didn't call super().__init__. Please check the implementation of <class 'tensorflow.python.keras.engine.functional.Functional'> and its bases.
Below is my keras_evaluate method:
def keras_evaluate(self, test_dataset, mode='test', step=0):
    self.state.model.assign_weights_to(self.model)
    loss, accuracy = self.model.evaluate(test_dataset, verbose=0)
    print('Mode={}, Loss={}, Accuracy={}'.format(mode, loss, accuracy))
self.state is the state returned by tff.learning.build_federated_averaging_process, i.e. tff.templates.IterativeProcess; test_dataset is of type tf.data.Dataset; and self.model is of type tf.keras.Model, i.e. a Keras functional model. I have one custom layer, but it does have a super() call, so point 2 in the error message is misleading to me.
Any help will be appreciated.

Issues with saving and loading a TensorFlow model which uses a Hugging Face transformer model as its first layer

Hi, I am having serious problems saving and loading a TensorFlow model that combines Hugging Face transformers with some custom layers to do classification. I am using the latest Hugging Face transformers TensorFlow/Keras version. The idea is to extract features using DistilBERT and then run the features through a CNN to do classification and extraction. Everything works as far as getting the correct classifications.
The problem is in saving the model once trained and then loading it again.
I am using tf.keras with TensorFlow version 2.2.
The following code defines the model, trains it, evaluates it, and then saves and loads it:
bert_config = DistilBertConfig(dropout=0.2, attention_dropout=0.2, output_hidden_states=False)
bert_config.output_hidden_states = False
transformer_model = TFDistilBertModel.from_pretrained(DISTIL_BERT, config=bert_config)
input_ids_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='input_token', dtype='int32')
input_masks_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='masked_token', dtype='int32')
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
x = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(50, return_sequences=True, dropout=0.1,
                         recurrent_dropout=0, recurrent_activation="sigmoid",
                         unroll=False, use_bias=True, activation="tanh"))(embedding_layer)
x = tf.keras.layers.GlobalMaxPool1D()(x)
outputs = []
# lots of code here to define the dense layers to generate the outputs
# .....
# .....
model = Model(inputs=[input_ids_in, input_masks_in], outputs=outputs)
for model_layer in model.layers[:3]:
    logger.info(f"Setting layer {model_layer.name} to not trainable")
    model_layer.trainable = False
rms_optimizer = RMSprop(learning_rate=0.001)
model.compile(loss=SigmoidFocalCrossEntropy(), optimizer=rms_optimizer)
# the code to fit the model (which works)
# then code to evaluate the model (which also works)
# finally saving the model. This too works.
tf.keras.models.save_model(model, save_url, overwrite=True, include_optimizer=True, save_format="tf")
However, when I try to load the saved model using the following:
tf.keras.models.load_model(
    path, custom_objects={"Addons>SigmoidFocalCrossEntropy": SigmoidFocalCrossEntropy})
I get the following load error:
ValueError: The two structures don't have the same nested structure.
First structure: type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')
Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}
More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')" is not
Entire first structure:
.
Entire second structure:
{'input_ids': .}
I believe the issue is that the TFDistilBertModel layer can be called with a dictionary input from DistilBertTokenizer.encode(), and it happens to be the first layer. So on load, the model compiler expects that to be the input signature of the model's call. However, the inputs defined for the model are two tensors of shape (None, 128).
So how do I tell the load function or the save function to assume the correct signatures?
I solved the issue.
The issue was that the object transformer_model in the above code is itself not a layer. So if we want to embed it inside another Keras model, we should use the internal Keras layer that is wrapped in the model.
So changing the line
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
to
embedding_layer = transformer_model.distilbert([input_ids_in, input_masks_in])[0]
makes everything work. Hope this helps someone else. It took a long time to debug through the tf.keras code to figure this one out, although in hindsight it is obvious. :)
I ran into the same problem, coincidentally, just yesterday. My solution is very similar to yours. I suspected that the problem was due to how tf.keras handles custom models, so the idea was to use the layers of the custom model inside my own model. This has the advantage of not referring to the layer explicitly by name (in my case, this makes it easy to build more generic models using different pretrained encoders):
sent_encoder = getattr(transformers, self.model_name).from_pretrained(self.shortcut_weights).layers[0]
I haven't explored all the Hugging Face models, but the few that I tested seem to be custom models with only one custom layer.
Your solution also works like a charm; in fact, both solutions are the same if "distilbert" refers to ".layers[0]".

Can't build model with DenseFeature input layer, get "'DenseFeatures' object has no attribute 'shape'"

I'm trying to build a Keras model using a DenseFeatures layer as the input; the input comes as a dict of Tensors. TF is insisting that I use model.build() to build the model before optimization, but I can't build it due to DenseFeatures not having an input shape. I get the error
AttributeError: 'DenseFeatures' object has no attribute 'shape'
How can I resolve this? Here's my code:
input_layer = tf.keras.layers.DenseFeatures(params.columns)
predictions = Dense(1, input_dim=len(params.columns), activation='softmax')(input_layer)
model = Sequential([input_layer, predictions])
model.build()
ETA: some further information for insight: I'm not actually fitting the model with this code; rather, I'm creating an EstimatorSpec to use with a Sagemaker model (thus it seems like this will probably require a bit of weird footwork between two different paradigms.)

Using patch from larger image as input dim to Keras CNN gives error 'Tensor' object has no attribute '_keras_history'

I am trying to create a CNN with keras to process 20x20 patches from a larger image of 600x600.
When I attempt to run the code below, I receive the error AttributeError: 'Tensor' object has no attribute '_keras_history'
The code below is only intended to look at the first 20x20 patch out of a total of 900; I am trying to get this working before attempting to loop through the entire input image.
I don't understand why it returns the error, as each layer is generated with a Keras layer and I haven't applied any other operations to the tensor.
I am using TensorFlow 1.3 and Keras 2.0.6.
nb_filters=16
input_image=Input(shape=(600,600,3))
Input_1R=Reshape((900,20,20,3))(input_image)
conv1=Convolution2D(nb_filters,(5,5),activation='relu',padding='valid')(Input_1R[:,0])
conv4=Convolution2D(1,(6,6),activation='hard_sigmoid',padding='same')(conv1)
dense6=Dense(1)(conv4)
output_dense=dense6
model = Model(inputs=input_image, outputs=output_dense)
The error occurs because the slicing operation Input_1R[:,0] is not performed in a Keras layer.
You can wrap it into a Lambda layer:
sliced = Lambda(lambda x: x[:, 0])(Input_1R)
conv1 = Convolution2D(nb_filters, (5,5), activation='relu', padding='valid')(sliced)
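For reference, the question's model with the Lambda fix applied can be sketched end to end as follows. This is a minimal sketch that assumes a current tf.keras installation rather than the TensorFlow 1.3 / Keras 2.0.6 combination from the question, and it uses Conv2D, the current name for Convolution2D. The zero-filled test input is hypothetical.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

nb_filters = 16

input_image = layers.Input(shape=(600, 600, 3))
# Same Reshape as in the question: (batch, 900, 20, 20, 3).
reshaped = layers.Reshape((900, 20, 20, 3))(input_image)
# The slice is performed inside a Lambda layer, so its result is a proper
# Keras tensor that later layers and Model() can trace back to the input.
sliced = layers.Lambda(lambda x: x[:, 0])(reshaped)
conv1 = layers.Conv2D(nb_filters, (5, 5), activation="relu", padding="valid")(sliced)
conv4 = layers.Conv2D(1, (6, 6), activation="hard_sigmoid", padding="same")(conv1)
output_dense = layers.Dense(1)(conv4)

model = tf.keras.Model(inputs=input_image, outputs=output_dense)

# Run one all-zero image through the model to confirm the graph is connected.
result = model(np.zeros((1, 600, 600, 3), dtype="float32"))
result_shape = tuple(result.shape)
```

Note that Reshape only regroups the pixels in their flattened order, so slice 0 here is not a spatial 20x20 block of the image; extracting true patches would need a different operation and is outside the scope of this sketch.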

Tensorflow: How can I assign numpy pre-trained weights to subsections of graph?

This is a simple thing which I just couldn't figure out how to do.
I converted a pre-trained VGG caffe model to tensorflow using the github code from https://github.com/ethereon/caffe-tensorflow and saved it to vgg16.npy...
I then load the network into my default session (sess) as "net" using:
images = tf.placeholder(tf.float32, [1, 224, 224, 3])
net = VGGNet_xavier({'data': images, 'label' : 1})
with tf.Session() as sess:
    net.load("vgg16.npy", sess)
After net.load, I get a graph with a list of tensors. I can access individual tensors per layer using net.layers['conv1_1']... to get weights and biases for the first VGG convolutional layer, etc.
Now suppose that I make another graph that has as its first layer "h_conv1_b":
W_conv1_b = weight_variable([3,3,3,64])
b_conv1_b = bias_variable([64])
h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
My question is: how do I assign the pre-trained weights from net.layers['conv1_1'] to h_conv1_b? (Both are now tensors.)
I suggest you take a detailed look at network.py from https://github.com/ethereon/caffe-tensorflow, especially the function load(). It will help you understand what happens when you call net.load(weight_path, session).
FYI, a variable in TensorFlow can be assigned the contents of a numpy array using var.assign(np_array), which must be executed in the session. Here is the solution to your question:
with tf.Session() as sess:
    W_conv1_b = weight_variable([3,3,3,64])
    sess.run(W_conv1_b.assign(net.layers['conv1_1'].weights))
    b_conv1_b = bias_variable([64])
    sess.run(b_conv1_b.assign(net.layers['conv1_1'].biases))
    h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
I would like to point out the following:
var.assign(data), where 'data' is a numpy array and 'var' is a TensorFlow variable, should be executed in the same session in which you go on to run your network, whether for inference or training.
By default, 'var' should be created with the same shape as 'data'. Therefore, if you can obtain 'data' before creating 'var', I suggest creating 'var' with a matching shape, e.g. var = tf.Variable(tf.zeros(data.shape)). Otherwise, you need to create 'var' with validate_shape=False, i.e. var = tf.Variable(initial_value, validate_shape=False), which leaves the variable's shape unconstrained. Detailed explanations can be found in the TensorFlow API docs.
I have extended the same repo, caffe-tensorflow, to support Theano, so that I can load models converted from Caffe in Theano. I am therefore reasonably familiar with this repo's code. Please feel free to contact me if you have any further questions.
You can get variable values using the eval method of the tf.Variable objects from the first network and load those values into the variables of the second network using the load method (also a method of tf.Variable).
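The two suggestions above (reading values with eval, then copying them with assign inside one session) can be sketched together as follows. This is a minimal sketch, assuming a TensorFlow 2.x installation where the TF1-style API is reached through tf.compat.v1. The random arrays and the src_w / src_b variables are hypothetical stand-ins for the parameters that net.load() would read from vgg16.npy.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF1-style graph/session API, as used in the question

# Hypothetical stand-ins for the pre-trained conv1_1 parameters.
pretrained_w = np.random.rand(3, 3, 3, 64).astype(np.float32)
pretrained_b = np.random.rand(64).astype(np.float32)

graph = tf.Graph()
with graph.as_default():
    # "First network": variables that already hold the pre-trained values.
    src_w = tf1.Variable(pretrained_w, name="conv1_1_weights")
    src_b = tf1.Variable(pretrained_b, name="conv1_1_biases")

    # "Second network": fresh variables created with matching shapes.
    W_conv1_b = tf1.Variable(tf.zeros(pretrained_w.shape), name="W_conv1_b")
    b_conv1_b = tf1.Variable(tf.zeros(pretrained_b.shape), name="b_conv1_b")

    with tf1.Session(graph=graph) as sess:
        sess.run(tf1.global_variables_initializer())
        # Read the source values as numpy arrays; eval needs a session.
        w_value = src_w.eval(session=sess)
        b_value = src_b.eval(session=sess)
        # Copy them into the second network within the same session.
        sess.run([W_conv1_b.assign(w_value), b_conv1_b.assign(b_value)])
        copied_w, copied_b = sess.run([W_conv1_b, b_conv1_b])
```

Any further inference or training on the second network would need to run in this same session, since the assigned values live in the session's variable state.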