How tensorflow2 gets intermediate layer output - tensorflow2.0

I want to use tensorflow2 to achieve cnn extraction of image features and then output to SVM for classification. What are the good ways?
I have checked the documentation of tensorflow2 and there is no good explanation. Who can guide me?

Thank you for your clarifying answers above. I have written answers to similar questions before. But you can extract intermediate-layer outputs from a tf.keras model by constructing an auxiliary model using the so called functional model API. "Function model API" uses tf.keras.Model(). When you call the function, you specify the arguments inputs and outputs. You can include the output of the intermediate layer in the outputs argument. See the simple code example below:
import tensorflow as tf
# This is the original model.
model = tf.keras.Sequential([
tf.keras.layers.Flatten(input_shape=[28, 28, 1]),
tf.keras.layers.Dense(100, activation="relu"),
tf.keras.layers.Dense(10, activation="softmax")])
# Make an auxiliary model that exposes the output from the intermediate layer
# of interest, which is the first Dense layer in this case.
aux_model = tf.keras.Model(inputs=model.inputs,
outputs=model.outputs + [model.layers[1].output])
# Access both the final and intermediate output of the original model
# by calling `aux_model.predict()`.
final_output, intermediate_layer_output = aux_model.predict(some_input)

Related

How to use sample weights with tensorflow datasets?

I have been training a unet model for multiclass semantic segmentation in python using Tensorflow and Tensorflow Datasets.
I've noticed that one of my classes seems to be underrepresented in training. After doing some research, I found out about sample weights and thought it might be a good solution to my problem, but I've been having trouble deciphering the documentation on how to use it or find examples of it being used.
Could someone help explain how sample weights come into play with datasets for training or point me to an example where it is being implemented? Or even what type of input the model.fit function is expecting would be helpful.
From the documentation of tf.keras model.fit():
sample_weight
[...] This argument is not supported when x is a dataset, generator, or keras.utils.Sequence instance, instead provide the sample_weights as the third element of x.
What is meant by that? This is demonstrated for the Dataset case in one of the official documentation turorials:
sample_weight = np.ones(shape=(len(y_train),))
sample_weight[y_train == 5] = 2.0
# Create a Dataset that includes sample weights
# (3rd element in the return tuple).
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train, sample_weight))
# Shuffle and slice the dataset.
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)
model = get_compiled_model()
model.fit(train_dataset, epochs=1)
See the link for a full-fledged example.

Isues with saving and loading tensorflow model which uses hugging face transformer model as its first layer

Hi I am having some serious problems saving and loading a tensorflow model which is combination of hugging face transformers + some custom layers to do classfication. I am using the latest Huggingface transformers tensorflow keras version. The idea is to extract features using distilbert and then run the features through CNN to do classification and extraction. I have got everything to work as far as getting the correct classifications.
The problem is in saving the model once trained and then loading the model again.
I am using tensorflow keras and tensorflow version 2.2
Following is the code to design the model, train it, evaluate it and then save and load it
bert_config = DistilBertConfig(dropout=0.2, attention_dropout=0.2, output_hidden_states=False)
bert_config.output_hidden_states = False
transformer_model = TFDistilBertModel.from_pretrained(DISTIL_BERT, config=bert_config)
input_ids_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='input_token', dtype='int32')
input_masks_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='masked_token', dtype='int32')
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
x = tf.keras.layers.Bidirectional(
tf.keras.layers.LSTM(50, return_sequences=True, dropout=0.1,
recurrent_dropout=0, recurrent_activation="sigmoid",
unroll=False, use_bias=True, activation="tanh"))(embedding_layer)
x = tf.keras.layers.GlobalMaxPool1D()(x)
outputs = []
# lots of code here to define the dense layers to generate the outputs
# .....
# .....
model = Model(inputs=[input_ids_in, input_masks_in], outputs=outputs)
for model_layer in model.layers[:3]:
logger.info(f"Setting layer {model_layer.name} to not trainable")
model_layer.trainable = False
rms_optimizer = RMSprop(learning_rate=0.001)
model.compile(loss=SigmoidFocalCrossEntropy(), optimizer=rms_optimizer)
# the code to fit the model (which works)
# then code to evaluate the model (which also works)
# finally saving the model. This too works.
tf.keras.models.save_model(model, save_url, overwrite=True, include_optimizer=True, save_format="tf")
However, when I try to load the saved model using the following
tf.keras.models.load_model(
path, custom_objects={"Addons>SigmoidFocalCrossEntropy": SigmoidFocalCrossEntropy})
I get the following load error
ValueError: The two structures don't have the same nested structure.
First structure: type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')
Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}
More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')" is not
Entire first structure:
.
Entire second structure:
{'input_ids': .}
I believe the issue is because TFDistilBertModel layer can be called using a dictionary input from DistilBertTokenizer.encode() and that happens to be the first layer. So the model compiler on load expects that to be the input signature to the call model. However, the inputs defined to the model are two tensors of shape (None, 128)
So how do I tell the load function or the save function to assume the correct signatures?
I solved the issue.
The issue was the object transformer_model in the above code is itself not a layer. So if we want to embed it inside another keras layer we should use the internal keras layer that is wrapped in the model
So changing the line
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in[0]
to
embedding_layer = transformer_model.distilbert([input_ids_in, input_masks_in])[0]
makes everything work. Hope this helps someone else. Took a long time to debug through tf.keras code to figure this one out although in hindsight it is obvious. :)
I suffered the same problem, casually, yesterday. My solution is very similar to yours, I supposed that the problem was due to how tensorflow keras processes custom models so, the idea was to use the layers of the custom model inside my model. This has the advantage of not calling explicitly the layer by its name (in my case, it is useful for easy building more generic models using different pretrained encoders):
sent_encoder = getattr(transformers, self.model_name).from_pretrained(self.shortcut_weights).layers[0]
I don't explored all the models of HuggingFace, but a few that I tested seem to be a custom model with only one custom layer.
Your solution also works like a charm, in fact, both solutions are the same if "distilbert" references to ".layers[0]".

What is the easiest way to run a part of a model?

I'm dealing with Keras functional API.
Specifically for my experiments, I'm using Keras resnet50 model obtained with:
model = resnet50.ResNet50(weights='imagenet')
Obviously, to get the final output of the network we need to feed a value to the placeholder input_1.
My question is, can I somehow start inferencing this graph from the relu layer which is depicted at the bottom of the picture below, provided that I feed a value of the appropriate dimensions into it?
I tried to achieve this with Keras functions. Something like:
self.inp = model.input
self.outputs = [layer.output for layer in model.layers]
self.functor = K.function([self.inp, K.learning_phase()], [self.outputs[6], self.outputs[17]])
But this approach will not work, because again to inference any output I need to feed value into tensor.
Is recreating graph from scratch my best option here?
Thanks
If I got you right, you can just specify input and output nodes
base_model = tf.keras.applications.ResNet50(weights='imagenet')
inference_model = tf.keras.Model(inputs=base_model.input, outputs=base_model.get_layer('any_layer_name').output)
You can set the output to any layer name

Obtaining Logits of the output from deeplab model

I'm using a pre-trained deeplab model (from here) to obtain segmentations for an input image. I'm able to obtain the sematic labels (i.e. SemanticPredictions) which is argmax applied to logits (link).
I was wondering if there is an easy way to obtain the logits before argmax? I was hoping to find the output tensor name and simply pass it into my tfsession
as in the following:
tf_session.run(
self.OUTPUT_TENSOR_NAME,
feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(input_image)]})
But I have not been able to locate such tensor name in the code that reveals the logits, or softmax outputs.
For a model trained from MobileNet_V2 setting self.OUTPUT_TENSOR_NAME = 'ResizeBilinear_2:0' retrieves the logits before the argmax is performed.
I suspect this is the same for xception, but have not verified it.
I arrived at this answer by loading my model in tensorflow. Then, printing the name of all layers in the loaded graph. Finally, I took the name of the final output layer before the last 'ArgMax' layer and ran some inferencing using that.
Here is a link to a stackoverflow question on printing the names of the layers in a graph. I found the answer by Ted to be most helpful.
By the way, the output layers of DeeplabV3 models does not apply SoftMax. So you cannot simply take the raw value of the elements of output vectors as a confidence.

How to use TensorBoard and summary operations with the tf.layers module

I have followed the TensorFlow Layers tutorial to create a CNN for MNIST digit classification using TensorFlow's tf.layers module. Now I'm trying to learn how to use TensorBoard from TensorBoard: Visualizing Learning. Perhaps this tutorial hasn't been updated recently, because it says its example code is a modification of that tutorial's and links to it, but the code is completely different: it manually defines a single-hidden-layer fully-connected network.
The TensorBoard tutorial shows how to use tf.summary to attach summaries to a layer by creating operations on the layer's weights tensor, which is directly accessible because we manually defined the layer, and attaching tf.summary objects to those operations. To do this if I'm using tf.layers and its tutorial code, I believe I'd have to:
Modify the Layers tutorial's example code to use the non-functional interface (Conv2D instead of conv2d and Dense instead of dense) to create the layers
Use the layer objects' trainable_weights() functions to get the weight tensors and attach tf.summary objects to those
Is that the best way to use TensorBoard with tf.layers, or is there a way that's more directly compatible with tf.layers and the functional interface? If so, is there an updated official TensorBoard tutorial? It would be nice if the documentation and tutorials were more unified.
You should be able to use the output of your tf.layers call to get the activations. Taking the first convolutional layer of the linked layers tutorial:
# Convolutional Layer #1
conv1 = tf.layers.conv2d(
inputs=input_layer,
filters=32,
kernel_size=[5, 5],
padding="same",
activation=tf.nn.relu)
You could do:
tensor_name = conv1.op.name
tf.summary.histogram(tensor_name + '/activation', conv1)
Not sure if this is the best way, but I believe it is the most direct way of doing what you want.
Hope this helps!
You can use something like this
with tf.name_scope('dense2'):
preds = tf.layers.dense(inputs=dense1,units = 12,
activation=tf.nn.sigmoid, name="dense2")
d2_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'dense2')
tf.summary.histogram("weights", d2_vars[0])
tf.summary.histogram("biases", d2_vars[1])
tf.summary.histogram("activations", preds)
Another choice is to use tf.layers.Dense instead of tf.layers.dense (difference between d and D).
The paradigm for Dense is :
x = tf.placeholder(shape=[None, 100])
dlayer = tf.layers.Dense(hidden_unit)
y = dlayer(x)
With dlayer as intermediate, you're able to do:
k = dlayer.kernel
b = dlayer.bias
k_and_b = dlayer.weights
Note that you won't get the dlayer.kernel until you apply y = dlayer(x).
Things are similar for other layers such as convolution layer. Check them with any available auto-completion.