When I look at the .summary() of my model with my custom layers, I get the following output:
Model: "functional_29"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_118 (InputNode) [(None, 1)] 0
__________________________________________________________________________________________________
input_119 (InputNode) [(None, 1)] 0
__________________________________________________________________________________________________
tf_op_layer_strided_slice_156 ( [(1,)] 0 input_118[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_157 ( [(1,)] 0 input_119[0][0]
__________________________________________________________________________________________________
input_120 (InputNode) [(None, 1)] 0
_________________________________________________________________________________________________
tf_op_layer_concat_106 (TensorF [(2,)] 0 tf_op_layer_strided_slice_162[0][
tf_op_layer_strided_slice_163[0][
...
__________________________________________________________________________________________________
tf_op_layer_strided_slice_164 ( [(1,)] 0 input_120[0][0]
__________________________________________________________________________________________________
tf_op_layer_node_128_output (Te [()] 0 tf_op_layer_Relu_55[0][0]
==================================================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
__________________________________________________________________________________________________
Why is that? How can I wrap all of those operations under the label MyLayer?
You can create your own layer by subclassing from tf.keras.layers.Layer.
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Input

class DirectLayer(tf.keras.layers.Layer):
    def __init__(self, name="direct_layer", **kwargs):
        super(DirectLayer, self).__init__(name=name, **kwargs)

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=input_shape[1:],
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=input_shape[1:], initializer="random_normal", trainable=True
        )

    def call(self, inputs):
        return tf.multiply(inputs, self.w) + self.b

x_in = Input(shape=[10])
x = DirectLayer(name="my_layer")(x_in)
x_out = DirectLayer()(x)
model = Model(x_in, x_out)

x = tf.ones([16, 10])
tf.print(model(x))
tf.print(model.summary())
I created a simple layer called DirectLayer. I built a model that uses that layer twice. The Input layer is there just to specify the shape of input data.
As you can see, you can easily specify the name of the layer.
The summary function produces the following:
Model: "functional_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 10)] 0
_________________________________________________________________
my_layer (DirectLayer) (None, 10) 20
_________________________________________________________________
direct_layer (DirectLayer) (None, 10) 20
=================================================================
Total params: 40
Trainable params: 40
Non-trainable params: 0
_________________________________________________________________
None
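To tie this back to the original question: the same subclassing trick also works for wrapping raw TF ops (such as the strided_slice and concat calls in the question's summary) under one label. Below is a minimal sketch with hypothetical names, assuming two (None, 1) inputs like the ones above:

import tensorflow as tf

class MyLayer(tf.keras.layers.Layer):
    def __init__(self, name="MyLayer", **kwargs):
        super().__init__(name=name, **kwargs)

    def call(self, inputs):
        a, b = inputs
        # The slicing and concatenation happen inside the layer, so the summary
        # shows a single MyLayer entry instead of many tf_op_layer_* entries.
        return tf.concat([a[:, 0:1], b[:, 0:1]], axis=-1)

x1 = tf.keras.Input(shape=(1,))
x2 = tf.keras.Input(shape=(1,))
out = MyLayer()([x1, x2])
tf.keras.Model([x1, x2], out).summary()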
Related
Code:
!pip install tensorflow-text==2.7.0
import tensorflow_text as text
import tensorflow_hub as hub
# ... other tf imports....
# (imports inferred from the code below)
import tensorflow as tf
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import CategoricalCrossentropy
from tensorflow.keras.metrics import Accuracy

strategy = tf.distribute.MirroredStrategy()
print('Number of GPU: ' + str(strategy.num_replicas_in_sync))  # 1 or 2, shouldn't matter

NUM_CLASS = 2

with strategy.scope():
    bert_preprocess = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
    bert_encoder = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4")

def get_model():
    text_input = Input(shape=(), dtype=tf.string, name='text')
    preprocessed_text = bert_preprocess(text_input)
    outputs = bert_encoder(preprocessed_text)
    output_sequence = outputs['sequence_output']
    x = Dense(NUM_CLASS, activation='sigmoid')(output_sequence)
    model = Model(inputs=[text_input], outputs=[x])
    return model

optimizer = Adam()
model = get_model()
model.compile(loss=CategoricalCrossentropy(from_logits=True), optimizer=optimizer, metrics=[Accuracy()])
model.summary()  # <- look at the output 1
tf.keras.utils.plot_model(model, show_shapes=True, to_file='model.png')  # <- look at the figure 1

with strategy.scope():
    optimizer = Adam()
    model = get_model()
    model.compile(loss=CategoricalCrossentropy(from_logits=True), optimizer=optimizer, metrics=[Accuracy()])
    model.summary()  # <- compare with output 1, it has already lost its shape
    tf.keras.utils.plot_model(model, show_shapes=True, to_file='model_scoped.png')  # <- compare this figure too, for ease
With scope, BERT loses seq_length, and it becomes None.
Model summary withOUT scope (note the 128 in the very last layer, which is seq_length):
Model: "model_6"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
text (InputLayer) [(None,)] 0 []
keras_layer_2 (KerasLayer) {'input_mask': (Non 0 ['text[0][0]']
e, 128),
'input_word_ids':
(None, 128),
'input_type_ids':
(None, 128)}
keras_layer_3 (KerasLayer) multiple 109482241 ['keras_layer_2[6][0]',
'keras_layer_2[6][1]',
'keras_layer_2[6][2]']
dense_6 (Dense) (None, 128, 2) 1538 ['keras_layer_3[6][14]']
==================================================================================================
Total params: 109,483,779
Trainable params: 1,538
Non-trainable params: 109,482,241
__________________________________________________________________________________________________
Model with scope:
Model: "model_7"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
text (InputLayer) [(None,)] 0 []
keras_layer_2 (KerasLayer) {'input_mask': (Non 0 ['text[0][0]']
e, 128),
'input_word_ids':
(None, 128),
'input_type_ids':
(None, 128)}
keras_layer_3 (KerasLayer) multiple 109482241 ['keras_layer_2[7][0]',
'keras_layer_2[7][1]',
'keras_layer_2[7][2]']
dense_7 (Dense) (None, None, 2) 1538 ['keras_layer_3[7][14]']
==================================================================================================
Total params: 109,483,779
Trainable params: 1,538
Non-trainable params: 109,482,241
__________________________________________________________________________________________________
If it helps, compare the plot_model figures generated above (model.png and model_scoped.png).
Another notable thing: encoder_outputs is also missing if you look at the second or third KerasLayer of both models.
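For what it's worth, here is a hedged debugging sketch (not a fix) to narrow the problem down: it compares the symbolic output shapes of the preprocessing layer when it is built inside versus outside strategy.scope(). The function name report_preprocess_shapes is made up; the TF Hub handle is the one from the code above.

import contextlib
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text  # registers the ops the preprocess model needs

def report_preprocess_shapes(strategy=None):
    ctx = strategy.scope() if strategy is not None else contextlib.nullcontext()
    with ctx:
        preprocess = hub.KerasLayer(
            "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
    text_input = tf.keras.Input(shape=(), dtype=tf.string, name='text')
    tokens = preprocess(text_input)
    for key, tensor in tokens.items():
        # (None, 128) means the static seq_length survived;
        # (None, None) means it was lost.
        print(key, tensor.shape)

report_preprocess_shapes()                                   # without scope
report_preprocess_shapes(tf.distribute.MirroredStrategy())   # with scope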
Consider
import tensorflow as tf
units=11
entrada=tf.keras.Input(name="entrada", shape=(units,))
unidad= tf.Variable([[1.0]]) # + 0.0* entrada[:,:1]
denseSoftmax=tf.keras.layers.Dense(units,name="denseSoftmax",activation="softmax")
softMaxOutput=denseSoftmax(unidad)
finalproduct=tf.keras.layers.Multiply()([entrada,softMaxOutput])
modelo=tf.keras.Model(entrada,finalproduct)
modelo.summary()
This example produces a model without trainable parameters, because the denseSoftmax layer does not act on the input. If I fake it by uncommenting + 0.0 * entrada[:,:1], then it produces the expected graph:
Layer (type) Output Shape Param # Connected to
==================================================================================================
entrada (InputLayer) [(None, 11)] 0 []
tf.__operators__.getitem (Slic (None, 1) 0 ['entrada[0][0]']
ingOpLambda)
tf.math.multiply (TFOpLambda) (None, 1) 0 ['tf.__operators__.getitem[0][0]'
tf.__operators__.add (TFOpLamb (None, 1) 0 ['tf.math.multiply[0][0]']
denseSoftmax (Dense) (None, 11) 22 ['tf.__operators__.add[0][0]']
multiply (Multiply) (None, 11) 0 ['entrada[0][0]',
'denseSoftmax[0][0]']
But faking a zero valued link to an input seems as bad as adding a constant branch in the set of input layers.
Is there a way to announce to keras that it should follow the subgraph for a series of layers that are going to be merged with the resulting output, but do not depend on the input?
Is the following what you want?
class CustomModel(tf.keras.Model):
    def __init__(self, units) -> None:
        super().__init__()
        self.entrada = tf.keras.layers.InputLayer(input_shape=(units,))
        self.unidad = tf.Variable([[1.0]])
        self.denseSoftmax = tf.keras.layers.Dense(units, name="denseSoftmax", activation="softmax")
        self.finalproduct = tf.keras.layers.Multiply()

    def call(self, inputs):
        x = self.entrada(inputs)
        softMaxOutput = self.denseSoftmax(self.unidad)
        y = self.finalproduct([x, softMaxOutput])
        return y

units = 11
modelo = CustomModel(units=units)
modelo.build(input_shape=(None, units))
modelo.summary()
Model: "custom_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 11)] 0
denseSoftmax (Dense) multiple 22
multiply (Multiply) multiple 0
=================================================================
Total params: 23
Trainable params: 23
Non-trainable params: 0
_________________________________________________________________
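As a small follow-up check (just a sketch), the subclassed model should now track all 23 parameters, including the standalone tf.Variable that does not depend on the input:

# After modelo.build(...), the Dense kernel/bias and the standalone variable
# are all tracked as trainable.
for v in modelo.trainable_variables:
    print(v.name, v.shape)   # expect shapes (1, 1), (1, 11) and (11,)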
I have a sequential model with a VGG16 at the top:
def rescale(x):
    return x / 65535.

base_model = tf.keras.applications.VGG16(
    include_top=True, weights=None, input_tensor=None, input_shape=(224, 224, 1),
    pooling=None, classes=102, classifier_activation='softmax')

model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, None, 1)),
    tf.keras.layers.Lambda(rescale),
    tf.keras.layers.experimental.preprocessing.Resizing(224, 224),
    tf.keras.layers.experimental.preprocessing.RandomFlip(mode='horizontal_and_vertical', seed=42),
    base_model
])
Output of model.summary():
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lambda (Lambda) (None, None, None, 1) 0
_________________________________________________________________
resizing (Resizing) (None, 224, 224, 1) 0
_________________________________________________________________
random_flip (RandomFlip) (None, 224, 224, 1) 0
_________________________________________________________________
vgg16 (Functional) (None, 102) 134677286
=================================================================
Total params: 134,677,286
Trainable params: 134,677,286
Non-trainable params: 0
Now I want to create a new model with two outputs:
vgg_model = model.layers[3]
last_conv_layer = vgg_model.get_layer('block5_conv3')
new_model = tf.keras.models.Model(inputs=[model.inputs], outputs=[last_conv_layer.output, model.output])
But I get this error:
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_1_6:0", shape=(None, 224, 224, 1), dtype=float32) at layer "block1_conv1". The following previous layers were accessed without issue: []
What am I missing here?
Given a fitted model in this form:
def rescale(x):
    return x / 65535.

base_model = tf.keras.applications.VGG16(
    include_top=True, weights=None, input_tensor=None, input_shape=(224, 224, 1),
    pooling=None, classes=102, classifier_activation='softmax')

model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, None, 1)),
    tf.keras.layers.Lambda(rescale),
    tf.keras.layers.experimental.preprocessing.Resizing(224, 224),
    tf.keras.layers.experimental.preprocessing.RandomFlip(mode='horizontal_and_vertical', seed=42),
    base_model
])

### model.fit(...)
You can wrap your VGG in a Model that returns all the outputs you need:
new_model = Model(inputs=model.layers[3].input,
                  outputs=[model.layers[3].output,
                           model.layers[3].get_layer('block5_conv3').output])

inp = tf.keras.Input(shape=(None, None, 1))
x = tf.keras.layers.Lambda(rescale)(inp)
x = tf.keras.layers.experimental.preprocessing.Resizing(224, 224)(x)
outputs = new_model(x)
new_model = Model(inp, outputs)
The summary of new_model:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_49 (InputLayer) [(None, None, None, 1)] 0
_________________________________________________________________
lambda_25 (Lambda) (None, None, None, 1) 0
_________________________________________________________________
resizing_25 (Resizing) (None, 224, 224, 1) 0
_________________________________________________________________
functional_47 (Functional) [(None, 102), (None, 14, 134677286
=================================================================
Total params: 134,677,286
Trainable params: 134,677,286
Non-trainable params: 0
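A usage sketch (the spatial input size 256 is arbitrary): the wrapped model returns both the class predictions and the block5_conv3 feature map in a single call.

dummy = tf.ones([1, 256, 256, 1])
preds, conv_features = new_model(dummy)
print(preds.shape)           # (1, 102)
print(conv_features.shape)   # (1, 14, 14, 512)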
I'm trying to build a GAN-like model, but I can't figure out how to properly set trainable to False for just one model. It seems all models using the sub-model are affected.
Code:
import tensorflow as tf
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense

print(tf.__version__)

def build_submodel():
    inp = tf.keras.Input(shape=(3,))
    x = Dense(5)(inp)
    model = Model(inputs=inp, outputs=x)
    return model

def build_model_A():
    inp = tf.keras.Input(shape=(3,))
    x = submodel(inp)
    x = Dense(7)(x)
    model = Model(inputs=inp, outputs=x)
    return model

def build_model_B():
    inp = tf.keras.Input(shape=(11,))
    x = Dense(3)(inp)
    x = submodel(x)
    model = Model(inputs=inp, outputs=x)
    return model

submodel = build_submodel()

model_A = build_model_A()
model_A.compile("adam", "mse")
model_A.summary()

submodel.trainable = False
# same result with freezing layers
# for layer in submodel.layers:
#     layer.trainable = False

model_B = build_model_B()
model_B.compile("adam", "mse")
model_B.summary()

model_A.summary()
Output:
Model: "model_10"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_11 (InputLayer) [(None, 3)] 0
_________________________________________________________________
model_9 (Model) (None, 5) 20
_________________________________________________________________
dense_10 (Dense) (None, 7) 42
=================================================================
Total params: 62
Trainable params: 62
Non-trainable params: 0
_________________________________________________________________
Model: "model_11"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_12 (InputLayer) [(None, 11)] 0
_________________________________________________________________
dense_11 (Dense) (None, 3) 36
_________________________________________________________________
model_9 (Model) (None, 5) 20
=================================================================
Total params: 56
Trainable params: 36
Non-trainable params: 20
_________________________________________________________________
Model: "model_10"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_11 (InputLayer) [(None, 3)] 0
_________________________________________________________________
model_9 (Model) (None, 5) 20
_________________________________________________________________
dense_10 (Dense) (None, 7) 42
=================================================================
Total params: 62
Trainable params: 42
Non-trainable params: 20
_________________________________________________________________
At first, model_A has no non-trainable weights, but after building model_B, model_A has some non-trainable weights.
Also, the summary does not show which layers are non-trainable, just the total non-trainable parameter count. Is there a better way to inspect which layers are frozen in a model?
You can use this function to show which layers are trainable or not:
from tensorflow.keras import backend as K

def print_params(model):
    def count_params(weights):
        """Count the total number of scalars composing the weights.
        # Arguments
            weights: An iterable containing the weights on which to compute params
        # Returns
            The total number of scalars composing the weights
        """
        weight_ids = set()
        total = 0
        for w in weights:
            if id(w) not in weight_ids:
                weight_ids.add(id(w))
                total += int(K.count_params(w))
        return total

    trainable_count = count_params(model.trainable_weights)
    non_trainable_count = count_params(model.non_trainable_weights)

    print('id\ttrainable : layer name')
    print('-------------------------------')
    for i, layer in enumerate(model.layers):
        print(i, '\t', layer.trainable, '\t :', layer.name)
    print('-------------------------------')
    print('Total params: {:,}'.format(trainable_count + non_trainable_count))
    print('Trainable params: {:,}'.format(trainable_count))
    print('Non-trainable params: {:,}'.format(non_trainable_count))
It will output something like this:
id trainable : layer name
-------------------------------
0 False : input_1
1 False : block1_conv1
2 False : block1_conv2
3 False : block1_pool
4 False : block2_conv1
5 False : block2_conv2
6 False : block2_pool
7 False : block3_conv1
8 False : block3_conv2
9 False : block3_conv3
10 False : block3_pool
11 False : block4_conv1
12 False : block4_conv2
13 False : block4_conv3
14 False : block4_pool
15 False : block5_conv1
16 False : block5_conv2
17 False : block5_conv3
18 False : block5_pool
19 True : global_average_pooling2d
20 True : dense
21 True : dense_1
22 True : dense_2
-------------------------------
Total params: 15,245,130
Trainable params: 530,442
Non-trainable params: 14,714,688
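The sample output above is from a VGG-style model. Applied to the models in the question, the usage is simply:

# List the trainable flag of every layer in the GAN-like models from the question.
print_params(model_A)
print_params(model_B)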
I have some problems using tf.keras to build a model. I want to define a trainable weight tensor with shape (64, 128), similar to tf.get_variable, but I can't achieve it.
I have tried many methods in the past, but I am looking for an easy one.
inputs = tf.keras.Input((128,))
weights = tf.Variable(tf.random.normal((64, 128)))
output = tf.keras.layers.Lambda(lambda x: tf.matmul(x, tf.transpose(weights)))(inputs)
model = tf.keras.Model(inputs, output)
model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_10 (InputLayer) (None, 128) 0
_________________________________________________________________
lambda_2 (Lambda) (None, 64) 0
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
The defined weights are not trainable.
In addition, I know Dense gives a trainable kernel and bias, but if I only want to add a bias, I can't use Dense.
Instead, I have to use add_weight in a custom layer, for example:
from tensorflow import keras
from tensorflow.keras import Input, Model

class Bias(keras.layers.Layer):
    def build(self, input_shape):
        self.bias = self.add_weight(shape=(64, 128), initializer='zeros', dtype=tf.float32, name='x')
        self.built = True

    def call(self, inputs):
        return inputs + self.bias

inputs = Input(shape=(64, 128))
outputs = Bias()(inputs)
model = Model(inputs=inputs, outputs=outputs)
model.summary()
Layer (type) Output Shape Param #
=================================================================
input_11 (InputLayer) (None, 64, 128) 0
_________________________________________________________________
bias_5 (Bias) (None, 64, 128) 8192
=================================================================
Total params: 8,192
Trainable params: 8,192
Non-trainable params: 0
Is there an easier method to define a trainable variable?