Trouble understanding some concepts in creating custom callbacks in Keras / TensorFlow

import keras
import numpy as np

class ActivationLogger(keras.callbacks.Callback):

    def set_model(self, model):
        self.model = model  # inform the callback of what model it will be called on
        layer_outputs = [layer.output for layer in model.layers]
        # A model that returns the activations of every layer of the original model
        self.activations_model = keras.models.Model(model.input, layer_outputs)

    def on_epoch_end(self, epoch, logs=None):
        if self.validation_data is None:
            raise RuntimeError("Requires validation_data.")
        validation_sample = self.validation_data[0][0:1]
        # Computes the activations of every layer for the first validation sample
        activations = self.activations_model.predict(validation_sample)
        f = open('activations_at_epoch_' + str(epoch) + '.npz', 'wb')
        np.savez(f, activations)
        f.close()
While I was reading this code to create custom callbacks, I couldn't understand a few lines of it. I know what callbacks are. What I understood from the above code is that we inherit from the base class keras.callbacks.Callback, and in the set_model function we inform the callback of which model it will be called on. What I can't understand is the line below: why does keras.models.Model take model.input?
self.activations_model = keras.models.Model(model.input, layer_outputs)
and the line activations = self.activations_model.predict(validation_sample).
The remaining lines just save the NumPy arrays to disk. Also, once the callback is created, is it called on every epoch?
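For context, a callback like this is registered through fit; Keras calls set_model for you and then invokes on_epoch_end once at the end of every epoch. A minimal sketch of the hookup (the data, model, and epoch count below are placeholders, and whether self.validation_data is populated automatically depends on the Keras version):
callback = ActivationLogger()
model.fit(x_train, y_train,                  # placeholder training data
          epochs=10,
          batch_size=128,
          validation_data=(x_val, y_val),    # placeholder validation data
          callbacks=[callback])              # on_epoch_end runs after every epoch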

Let's say I have a simple model:
model = Sequential()
model.add(Dense(32, input_shape=(784,), activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(4, activation='softmax'))
cb = ActivationLogger()
cb.set_model(model)
Now let me go through the set_model function line by line.
First line:
self.model = model
self.model.summary() gives:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 32) 25120
_________________________________________________________________
dense_1 (Dense) (None, 16) 528
_________________________________________________________________
dropout (Dropout) (None, 16) 0
_________________________________________________________________
dense_2 (Dense) (None, 4) 68
=================================================================
Total params: 25,716
Trainable params: 25,716
Non-trainable params: 0
Second line:
layer_outputs = [layer.output for layer in model.layers]
print(layer_outputs) = [<tf.Tensor 'dense/Relu:0' shape=(None, 32) dtype=float32>, <tf.Tensor 'dense_1/Relu:0' shape=(None, 16) dtype=float32>, <tf.Tensor 'dropout/cond/Identity:0' shape=(None, 16) dtype=float32>, <tf.Tensor 'dense_2/Softmax:0' shape=(None, 4) dtype=float32>]
layer_outputs contains the output tensors of every layer in the model.
And the third line:
self.activations_model = keras.models.Model(model.input,layer_outputs)
This line creates a new model whose input corresponds to the original model's input (model.input gives the input tensor of a model; you can also check a model's output tensor using model.output),
so self.activations_model is a model with a single input (of shape (784,) in this case) and an output at every layer.
When you feed any input through this model, it gives you a list of outputs, one per layer.
Normally the output would be a single NumPy array of shape (None, 4) (taking the main Sequential model),
but self.activations_model gives you a list of NumPy arrays. So in the line
activations = self.activations_model.predict(validation_sample)
activations just contains the predictions of self.activations_model, which are nothing but a list of NumPy arrays:
[(None, 32) (output of the first layer), (None, 16) (output of the second), (None, 16) (dropout layer), (None, 4) (final layer)]
I would suggest you read about the Keras Functional API, which is used to build models with multiple inputs and outputs.
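As a minimal illustration of that idea (the layer sizes and variable names below are placeholders, not the original model), a functional Model built with a list of outputs returns one NumPy array per output from predict:
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

inp = keras.Input(shape=(784,))
h1 = layers.Dense(32, activation='relu')(inp)
h2 = layers.Dense(16, activation='relu')(h1)
out = layers.Dense(4, activation='softmax')(h2)

probe = keras.Model(inp, [h1, h2, out])   # one input, one output per layer
sample = np.random.rand(1, 784).astype('float32')
activations = probe.predict(sample)
print([a.shape for a in activations])     # [(1, 32), (1, 16), (1, 4)]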

Related

Add Augmentation Layers Before keras.applications.EfficientNetB0 and Retain Layer Names

I have a trained EfficientNetB0-based model with saved weights in a H5 format.
I want to add some preprocessing layers before the model, load the weights, and retrain it.
If I create a model like this:
inp = tf.keras.layers.Input(shape=[224,224,3])
noise = tf.keras.layers.GaussianNoise(stddev=10.)(inp)
feature_extractor = tf.keras.applications.EfficientNetB0(include_top=False, pooling="max")
features = feature_extractor(noise)
output1 = tf.keras.layers.Dense(100, activation="sigmoid")(features)
output2 = tf.keras.layers.Dense(10, activation="softmax")(output1)
model = tf.keras.models.Model(inp, [output1, output2])
I get this summary:
Layer (type) Output Shape Param #
=================================================================
input_27 (InputLayer) [(None, 224, 224, 3)] 0
_________________________________________________________________
gaussian_noise_13 (GaussianN (None, 224, 224, 3) 0
_________________________________________________________________
efficientnetb0 (Functional) (None, 1280) 4049571
_________________________________________________________________
dense (Dense) (None, 100) 128100
_________________________________________________________________
dense_1 (Dense) (None, 10) 1010
and I lose access to intermediate layers. I can't use the tf.keras.Sequential approach because my model has two outputs.
I want to retain the layer names inside EfficientNetB0 so that I can reload my weights. How do I do that?
So it looks like for the toy example I created above the answer is:
inp = tf.keras.layers.Input(shape=[224,224,3])
noise = tf.keras.layers.GaussianNoise(stddev=10.)(inp)
feature_extractor = tf.keras.applications.EfficientNetB0(input_tensor=noise, include_top=False, pooling="max")
output1 = tf.keras.layers.Dense(100, activation="sigmoid")(feature_extractor.output)
output2 = tf.keras.layers.Dense(10, activation="softmax")(output1)
model = tf.keras.models.Model(inp, [output1, output2])
However, I'm actually working with a custom model class that doesn't have that argument in the constructor...
Without the input_tensor argument is there another way to do this?

Why does TensorFlow add a dimension to my input & output?

Here is my code:
from tensorflow.keras import layers
import tensorflow as tf
from tensorflow import keras
TFDataType = tf.float16
XTrain = tf.cast(tf.ones((10,10)), dtype=TFDataType)
YTrain = tf.cast(tf.ones((10,10)), dtype=TFDataType)
model = tf.keras.models.Sequential()
model.add(layers.Dense(1, dtype=TFDataType, input_shape=(10, 10)))
model.add(layers.Dense(1, dtype=TFDataType, input_shape=(10, 10)))
print(model.summary())
I am feeding it a 2-dimensional matrix. But when I look at the model summary, I see:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 10, 1) 11
_________________________________________________________________
dense_1 (Dense) (None, 10, 2) 4
=================================================================
Total params: 15
Trainable params: 15
Non-trainable params: 0
_________________________________________________________________
Why is the model asking for a 3-dimensional (None, 10, 1) array?
How do I pass an array that meets the dimensionality of (None, 10, 1)?
I cannot call numpy.ones(None, 10, 1), and I cannot reshape the array with -1 in the first dimension.
In your first layer, input_shape=(10, 10) declares the shape of a single sample; Keras then adds the extra leading dimension (None) to account for the batch size of the data. Note that you only need input_shape for the FIRST layer in your model, so remove input_shape=(10, 10) from your second layer.
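A minimal sketch of the two ways to make the data and the declared shape agree (assuming the intent is samples with 10 features each; the layer sizes and variable names are placeholders):
import tensorflow as tf
from tensorflow.keras import layers

# Option 1: each sample is a flat vector of 10 features.
# The batch axis is added automatically, so the first Dense outputs (None, 1).
x_flat = tf.ones((10, 10), dtype=tf.float16)       # 10 samples, 10 features each
model_flat = tf.keras.models.Sequential([
    layers.Dense(1, dtype=tf.float16, input_shape=(10,)),
    layers.Dense(1, dtype=tf.float16),
])

# Option 2: each sample really is a 10x10 matrix.
# Then the data itself needs a separate batch axis: (batch, 10, 10).
x_matrix = tf.ones((5, 10, 10), dtype=tf.float16)  # 5 samples of shape (10, 10)
model_matrix = tf.keras.models.Sequential([
    layers.Dense(1, dtype=tf.float16, input_shape=(10, 10)),
    layers.Dense(1, dtype=tf.float16),
])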

Dimension of output in Dense layer Keras

I have the following sample model:
from tensorflow.keras import models
from tensorflow.keras import layers
sample_model = models.Sequential()
sample_model.add(layers.Dense(32, input_shape=(4,)))
sample_model.add(layers.Dense(16, input_shape = (44,)))
sample_model.compile(loss="binary_crossentropy",
optimizer="adam", metrics = ["accuracy"])
Input for the model:
sam_x = np.random.rand(10,4)
sam_y = np.array([0,1,1,0,1,0,0,1,0,1,])
sample_model.fit(sam_x,sam_y)
The confusion is that fit should have thrown a shape-mismatch error, since the expected input shape for the 2nd Dense layer is given as (None, 44) while the output of the 1st Dense layer (which is the input of the 2nd Dense layer) will be of shape (None, 32). But it ran successfully.
I don't understand why there was no error. Any clarification would be helpful.
The input_shape keyword argument has an effect only on the first layer of a Sequential. The shape of the input of the other layers will be derived from their previous layer.
That behaviour is hinted at in the docs of tf.keras.layers.InputLayer:
When using InputLayer with Keras Sequential model, it can be skipped by moving the input_shape parameter to the first layer after the InputLayer.
And in the Sequential Model guide.
The behaviour can be confirmed by looking at the source of the Sequential.add method:
if not self._layers:
    if isinstance(layer, input_layer.InputLayer):
        # Case where the user passes an Input or InputLayer layer via `add`.
        set_inputs = True
    else:
        batch_shape, dtype = training_utils.get_input_shape_and_dtype(layer)
        if batch_shape:
            # Instantiate an input layer.
            x = input_layer.Input(
                batch_shape=batch_shape, dtype=dtype, name=layer.name + '_input')
            # This will build the current layer
            # and create the node connecting the current layer
            # to the input layer we just created.
            layer(x)
            set_inputs = True
If there are no layers in the model yet, then an Input will be added to the model with the shape derived from the first layer of the model. This is done only if no layer is present yet in the model.
That shape is either fully known (if input_shape has been passed to the first layer of the model) or will be fully known once the model is built (for example, with a call to model.build(input_shape)).
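A minimal sketch of that second case (the sizes here are placeholders): with no input_shape on the first layer, the input shape only becomes known, and the weights are only created, once the model is built.
from tensorflow.keras import layers, models

m = models.Sequential()
m.add(layers.Dense(32))           # no input_shape: the input size is still unknown
m.add(layers.Dense(16))
m.build(input_shape=(None, 4))    # now the shape is fully known and weights are created
m.summary()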
The thing is, after taking the input shape of the model from the first layer, Keras won't check or deal with any other input_shape declared inside that same model. For example, if you write your model the following way:
sample_model.add(layers.Dense(32, input_shape=(4,)))
sample_model.add(layers.Dense(16, input_shape = (44,)))
sample_model.add(layers.Dense(8, input_shape = (32,)))
Keras will always take the first declared input shape and discard the rest. So, if you start your first layer with input_shape=(44,), you need to pass input with exactly that number of features to your model, such as:
sam_x = np.random.rand(10,44)
sam_y = np.array([0,1,1,0,1,0,0,1,0,1,])
sample_model.fit(sam_x,sam_y)
Additionally, if you look at the Functional API, unlike with the Sequential model you must create and define a standalone Input layer that specifies the shape of the input data. It's not learnable but simply a specification layer, a kind of gateway for the input data into the model. That means even if we define input_shape inside the other layers, it will all be discarded. For example:
inputs = keras.Input(shape=(4,))
dense = layers.Dense(64, input_shape=(8,))  # input_shape is discarded
x = dense(inputs)
x = layers.Dense(64, input_shape=(16,))(x)  # input_shape is discarded
outputs = layers.Dense(10)(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="mnist_model")
Here is a more complex example with Conv2D and MNIST.
encoder_input = keras.Input(shape=(28, 28, 1),)
x = layers.Conv2D(16, 3, activation="relu", input_shape=[32,32,3])(encoder_input)
x = layers.Conv2D(32, 3, activation="relu", input_shape=[64,64,3])(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation="relu", input_shape=[224,321,3])(x)
x = layers.Conv2D(16, 3, activation="relu", input_shape=[420,32,3])(x)
x = layers.GlobalMaxPooling2D()(x)
out = layers.Dense(10, activation='softmax')(x)
encoder = keras.Model(encoder_input, out, name="encoder")
encoder.summary()
Model: "encoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_15 (InputLayer) [(None, 28, 28, 1)] 0
_________________________________________________________________
conv2d_8 (Conv2D) (None, 26, 26, 16) 160
_________________________________________________________________
conv2d_9 (Conv2D) (None, 24, 24, 32) 4640
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 8, 8, 32) 0
_________________________________________________________________
conv2d_10 (Conv2D) (None, 6, 6, 32) 9248
_________________________________________________________________
conv2d_11 (Conv2D) (None, 4, 4, 16) 4624
_________________________________________________________________
global_max_pooling2d_2 (Glob (None, 16) 0
_________________________________________________________________
dense_56 (Dense) (None, 10) 170
=================================================================
Total params: 18,842
Trainable params: 18,842
Non-trainable params: 0
import tensorflow as tf

def pre_process(image, label):
    return ((image / 256)[..., None].astype('float32'),
            tf.keras.utils.to_categorical(label, num_classes=10))

(x, y), (_, _) = tf.keras.datasets.mnist.load_data('mnist')
x, y = pre_process(x, y)

encoder.compile(
    loss=tf.keras.losses.CategoricalCrossentropy(),
    metrics=tf.keras.metrics.CategoricalAccuracy(),
    optimizer=tf.keras.optimizers.Adam())
encoder.fit(x, y, batch_size=256)
4s 14ms/step - loss: 1.4303 - categorical_accuracy: 0.5279
I think Keras will create (or prepare to create) an additional Input layer - but as the second Dense layer is added using model.add(), it is automatically connected to the layer before it, and thus the extra Input layer stays unconnected and is not part of the model.
(I agree that it would be nice if Keras hinted at unconnected layers. I have sometimes created unconnected layers when using the Functional API and changing the inputs; Keras doesn't remind me that I have skipped several layers, I just wondered why the summary() was so short...)

Tensorflow keras Sequential .add is different than inline definition?

Keras is giving different results when I define my model via the declarative method instead of the functional method. The two models appear to be equivalent, but using the ".add()" syntax works while using the declarative syntax gives errors -- it's a different error each time, but usually something like:
A target array with shape (10, 1) was passed for an output of shape (None, 16) while using as loss `mean_squared_error`. This loss expects targets to have the same shape as the output.
There seems to be something going on with auto-conversion of input shapes, but I can't tell what. Does anyone know what I'm doing wrong? Why aren't these two models exactly equivalent?
import tensorflow as tf
import tensorflow.keras
import numpy as np
x = np.arange(10).reshape((-1,1,1))
y = np.arange(10)
#This model works fine
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(32, input_shape=(1, 1), return_sequences = True))
model.add(tf.keras.layers.LSTM(16))
model.add(tf.keras.layers.Dense(1))
model.add(tf.keras.layers.Activation('linear'))
#This model fails. But shouldn't this be equivalent to the above?
model2 = tf.keras.Sequential(
{
tf.keras.layers.LSTM(32, input_shape=(1, 1), return_sequences = True),
tf.keras.layers.LSTM(16),
tf.keras.layers.Dense(1),
tf.keras.layers.Activation('linear')
})
#This works
model.compile(loss='mean_squared_error', optimizer='adagrad')
model.fit(x, y, epochs=1, batch_size=1, verbose=2)
#But this doesn't! Why not? The error is different each time, but usually
#something about the input size being wrong
model2.compile(loss='mean_squared_error', optimizer='adagrad')
model2.fit(x, y, epochs=1, batch_size=1, verbose=2)
Why aren't those two models equivalent? Why does one handle the input size correctly but the other doesn't? The second model fails with a different error each time (once in a while it even works), so I thought maybe there's some interaction with the first model? But I've tried commenting out the first model and that doesn't help. So why doesn't the second one work?
UPDATE: Here are the model.summary() outputs for the first and second model. They do seem different, but I don't understand why.
For model.summary():
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (None, 1, 32) 4352
_________________________________________________________________
lstm_1 (LSTM) (None, 16) 3136
_________________________________________________________________
dense (Dense) (None, 1) 17
_________________________________________________________________
activation (Activation) (None, 1) 0
=================================================================
Total params: 7,505
Trainable params: 7,505
Non-trainable params: 0
For model2.summary():
model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_2 (LSTM) (None, 1, 32) 4352
_________________________________________________________________
activation_1 (Activation) (None, 1, 32) 0
_________________________________________________________________
lstm_3 (LSTM) (None, 16) 3136
_________________________________________________________________
dense_1 (Dense) (None, 1) 17
=================================================================
Total params: 7,505
Trainable params: 7,505
Non-trainable params: 0
When you create the model with the inline declaration, you put the layers in curly braces {}, which makes them a set, which is inherently unordered. Change the curly braces to square brackets [] to put them in an ordered list. This will make sure that the layers are in the correct order in your model.
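For reference, the corrected inline definition from the question simply swaps the braces for brackets:
model2 = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(1, 1), return_sequences=True),
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dense(1),
    tf.keras.layers.Activation('linear')
])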

How to save a custom trained model without the fully connected layers, just like MobileNetV2 include_top=False

I want to save my trained model to .h5 without the last two layers, so that I can do transfer learning with my custom model in the future, just like MobileNetV2 with include_top=False. Can someone help me? Thanks!
base_model = tf.keras.applications.mobilenet_v2.MobileNetV2(
    alpha=1.0,
    input_shape=IMG_SHAPE,
    include_top=False,
    weights='imagenet')
model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(255, activation=tf.nn.softmax)
])
The trained model looks like this:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
mobilenetv2_1.00_224 (Model) (None, 2, 2, 1280) 2257984
_________________________________________________________________
global_average_pooling2d (Gl (None, 1280) 0
_________________________________________________________________
dense (Dense) (None, 205) 262605
=================================================================
Total params: 2,520,589
Trainable params: 2,486,477
Non-trainable params: 34,112
_________________________________________________________________
When I try to use it for transfer learning
keras_model = loadModel(keras_model_path)
keras_model.summary()
input = keras_model.input
hidden = tf.keras.layers.GlobalMaxPooling2D()(keras_model.layers[-3].output)
out = tf.keras.layers.Dense(128, activation=tf.nn.softmax)(hidden)
model2 = tf.keras.Model(input, out)
model2.summary()
an error occurs
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_1:0", shape=(?, 64, 64, 3), dtype=float32) at layer "input_1". The following previous layers were accessed without issue: []
i want to save my trained model to .h5 without last two layers,
Why don't you save the full model with model.save(), and when you reload it for transfer learning just remove the layers using:
model.layers.pop()
You can also remove the layers before saving the model, but I wouldn't do that.
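A minimal sketch of that approach, assuming the full model was saved with model.save() to a hypothetical path and reloaded as the same plain Sequential; the new head mirrors the transfer-learning code from the question:
import tensorflow as tf

keras_model = tf.keras.models.load_model('my_trained_model.h5')  # hypothetical path

# Drop the last two layers (the pooling and the Dense classifier head)
# and stack a new head on what is left, mirroring the original Sequential.
kept_layers = keras_model.layers[:-2]
model2 = tf.keras.Sequential(kept_layers + [
    tf.keras.layers.GlobalMaxPooling2D(),
    tf.keras.layers.Dense(128, activation=tf.nn.softmax),
])
model2.summary()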