I'm trying to customize a model taken from TF Hub, but I can't access its layers. I get the following error:
'KerasLayer' object has no attribute 'layers'
Here is my code as an example:
import tensorflow_hub as hub
from tensorflow.keras import layers
feature_extractor_url = "https://tfhub.dev/tensorflow/efficientnet/lite0/feature-vector/1"
base_model = hub.KerasLayer(feature_extractor_url,
                            input_shape=(224,224,3))
base_model.trainable = True
import tensorflow
from tensorflow.keras.models import Model
x = base_model.layers[-10].output
x = tensorflow.keras.layers.Conv2D(4, (3, 3), padding="same", activation="relu")(x)
x = tensorflow.keras.layers.GlobalMaxPooling2D()(x)
x = tensorflow.keras.layers.Flatten()(x)
outputs = tensorflow.keras.layers.Activation('sigmoid', name="example_output")(x)
model = Model(base_model.input, outputs=outputs)
model.summary()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-43-0501ec56d6c4> in <module>()
14 from tensorflow.keras.models import Model
15
---> 16 x = base_model.layers[-10].output
17 x = tensorflow.keras.layers.Conv2D(4, (3, 3), padding="same", activation="relu")(x)
18 x = tensorflow.keras.layers.GlobalMaxPooling2D()(x)
AttributeError: 'KerasLayer' object has no attribute 'layers'
What I've tried:
I built the model using the Sequential API:
model = tf.keras.Sequential([
    base_model,
    layers.Dense(image_data.num_classes)
])
model.summary()
But I still can't access the layers inside base_model.
How can I access the layers from KerasLayer?
Thank you!
You can access the layers through the weights of the Hub model.
Unfortunately, the TF docs do not cover this topic directly. This is as far as I have been able to dig, and hopefully it sheds some light on accessing layers on Hub.
The tests below used TF 2.5.0 and TF-Hub 0.12.0.
Layers in KerasLayer object
>>> import tensorflow_hub as hub
>>> model = hub.KerasLayer("https://tfhub.dev/deepmind/ganeval-cifar10-convnet/1")
>>> model
<tensorflow_hub.keras_layer.KerasLayer object at 0x7f0c79372190>
>>> len(model.weights)
57
>>> model.weights[56]
<tf.Variable 'cifar10_convnet/linear/b:0' shape=(10,) dtype=float32, numpy=
array([-0.2734375 , -1.46875 , 0.484375 , 1.2265625 , 0.53515625,
0.96875 , 0.3671875 , 0.02282715, -0.7265625 , -1.078125 ],
dtype=float32)>
>>> model.weights[56].name
'cifar10_convnet/linear/b:0'
Notice the weights attribute above. KerasLayer also has a get_weights() method. The outputs differ as shown below: the former returns tf.Variable objects, the latter returns NumPy arrays.
>>> len(model.get_weights())
57
>>> model.get_weights()[56]
array([-0.2734375 , -1.46875 , 0.484375 , 1.2265625 , 0.53515625,
0.96875 , 0.3671875 , 0.02282715, -0.7265625 , -1.078125 ],
dtype=float32)
To get, for example, the names of all layer variables, simply run:
weights = model.weights
[w.name for w in weights]
A hint of my output:
'cifar10_convnet/conv_net_2d/conv_2d_0/w:0',
'cifar10_convnet/conv_net_2d/conv_2d_0/b:0',
'cifar10_convnet/conv_net_2d/batch_norm_0/moving_mean:0',
'cifar10_convnet/conv_net_2d/batch_norm_0/moving_variance:0'
Note that the weight, bias, moving mean, moving variance, etc. of each layer are listed as separate entries in the output, so one layer contributes several names.
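Since each entry in model.weights is a variable rather than a layer, one way to recover a per-layer view is to group the names by their prefix. The following is a minimal sketch in plain Python, assuming the names follow the scope/layer/variable:0 pattern seen in the output above; group_by_layer is a hypothetical helper, not part of TF-Hub.

```python
from collections import defaultdict

# Variable names copied from the outputs shown above.
names = [
    "cifar10_convnet/conv_net_2d/conv_2d_0/w:0",
    "cifar10_convnet/conv_net_2d/conv_2d_0/b:0",
    "cifar10_convnet/conv_net_2d/batch_norm_0/moving_mean:0",
    "cifar10_convnet/conv_net_2d/batch_norm_0/moving_variance:0",
    "cifar10_convnet/linear/b:0",
]


def group_by_layer(variable_names):
    """Map each layer prefix to the variables found under it (hypothetical helper)."""
    groups = defaultdict(list)
    for name in variable_names:
        # Split "scope/layer/var:0" into the layer prefix and the variable part.
        prefix, _, var = name.rpartition("/")
        # Drop the ":0" output index that TensorFlow appends to variable names.
        groups[prefix].append(var.split(":")[0])
    return dict(groups)


layers_to_vars = group_by_layer(names)
for layer_name, variables in layers_to_vars.items():
    print(layer_name, variables)
```

With the real model, you could pass [w.name for w in model.weights] instead of the hard-coded list.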
Layers in AutoTrackable object
This is for low-level TF2 API users.
>>> import tensorflow_hub as hub
>>> model = hub.load("https://tfhub.dev/deepmind/ganeval-cifar10-convnet/1")
>>> model
<tensorflow.python.training.tracking.tracking.AutoTrackable object at 0x7f95943ec410>
>>> len(model.variables)
57
>>> model.variables[56]
<tf.Variable 'cifar10_convnet/linear/b:0' shape=(10,) dtype=float32, numpy=
array([-0.2734375 , -1.46875 , 0.484375 , 1.2265625 , 0.53515625,
0.96875 , 0.3671875 , 0.02282715, -0.7265625 , -1.078125 ],
dtype=float32)>
Use "variables" instead of "weights" with this API.
The inner structure of the SavedModel loaded into a hub.KerasLayer is inaccessible. For this level of detail, you'll have to turn to EfficientNet source code instead.
I ran into the same problem yesterday, but luckily I found two ways to solve it.
1. Use a model from tf.keras.applications, which exposes .layers:
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = False
global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
prediction_layer = tf.keras.layers.Dense(59)
model = tf.keras.Sequential([
    base_model,
    global_average_layer,
    prediction_layer
])
model.summary()
....
fine_tune_at = 100
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False
2. Follow the TensorFlow documentation (https://tensorflow.google.cn/hub/tf2_saved_model?hl=en) and pass trainable=True to hub.KerasLayer:
base_model = hub.KerasLayer(feature_extractor_url,
                            trainable=True,
                            input_shape=(224,224,3))
model.summary()
Model: "sequential_2"
Layer (type)                 Output Shape              Param #
keras_layer_1 (KerasLayer)   (None, 1024)              3228864
dense_2 (Dense)              (None, 59)                60475
Total params: 3,289,339
Trainable params: 3,267,451
Non-trainable params: 21,888
I hope my answer will help you
I am building a model with TensorFlow Probability layers. When I call model.output.shape, I get an error:
AttributeError: 'UserRegisteredSpec' object has no attribute '_shape'
If I call output_shape = tf.shape(model.output), I get a KerasTensor:
<KerasTensor: shape=(5,) dtype=int32 inferred_value=[None, 3, 128, 128, 128] (created by layer 'tf.compat.v1.shape_15')>
How can I get the actual values [None, 3, 128, 128, 128]?
I tried output_shape.get_shape(), but that gives the Tensor shape [5].
Code to reproduce error:
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow_probability import distributions as tfd
tfd = tfp.distributions
model = tf.keras.Sequential()
model.add(tf.keras.layers.Input(10))
model.add(tf.keras.layers.Dense(2, activation="linear"))
model.add(
tfp.layers.DistributionLambda(
lambda t: tfd.Normal(
loc=t[..., :1], scale=1e-3 + tf.math.softplus(0.1 * t[..., 1:])
)
)
)
model.output.shape
tf.shape returns a KerasTensor, from which the output shape cannot be read directly.
However, you can do this:
tf.shape(model.output)
>> `<KerasTensor: shape=(2,) dtype=int32 inferred_value=[None, 1] (created by layer 'tf.compat.v1.shape_168')>`
You want to get inferred_value, so:
tf.shape(model.output)._inferred_value
>> [None, 1]
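More generally, you can get any layer's output shape with:
tf.shape(model.layers[idx].output)._inferred_value
where idx is the index of the desired layer. Note that _inferred_value is a private attribute (leading underscore), so it may change between TensorFlow versions.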
To get the output shape of every layer, you could do, for instance:
out_shape_list = []
for layer in model.layers:
    out_shape = layer.output_shape
    out_shape_list.append(out_shape)
You will get a list of output shapes, one for each layer.
I followed this tutorial to build a Siamese network for my problem.
I was using TensorFlow 2.4.1 and have now upgraded to 2.5.
This code worked wonderfully before:
base_cnn = resnet.ResNet50(
    weights="imagenet", input_shape=target_shape + (3,), include_top=False
)
flatten = layers.Flatten()(base_cnn.output)
dense1 = layers.Dense(512, activation="relu")(flatten)
dense1 = layers.BatchNormalization()(dense1)
dense2 = layers.Dense(256, activation="relu")(dense1)
dense2 = layers.BatchNormalization()(dense2)
output = layers.Dense(256)(dense2)
embedding = Model(base_cnn.input, output, name="Embedding")
trainable = False
for layer in base_cnn.layers:
    if layer.name == "conv5_block1_out":
        trainable = True
    layer.trainable = trainable
Now every ResNet, MobileNet, or EfficientNet layer (I tried them all)
triggers warnings like this:
WARNING:tensorflow:
The following Variables were used a Lambda layer's call (tf.nn.convolution_620), but
are not present in its tracked objects:
<tf.Variable 'stem_conv/kernel:0' shape=(3, 3, 3, 48) dtype=float32>
It is possible that this is intended behavior, but it is more likely
an omission. This is a strong indication that this layer should be
formulated as a subclassed Layer rather than a Lambda layer.
It compiles and seems to fit.
But do we have to initialize the models somewhat differently in 2.5?
Thanks for any pointers!
I'm not sure what the main cause of your issue is, as it is not generally reproducible. But here are some notes about that warning message. The traceback shown in your question is not from ResNet but from EfficientNet.
Now, we know that the Lambda layer exists so that arbitrary expressions can be used as a Layer when constructing Sequential and Functional API models. Lambda layers are best suited for simple operations or quick experimentation. While it is possible to use Variables with Lambda layers, this practice is discouraged as it can easily lead to bugs. For example:
import tensorflow as tf
x_input = tf.range(12.).numpy().reshape(-1, 4)
weights = tf.Variable(tf.random.normal((4, 2)), name='w')
bias = tf.ones((1, 2), name='b')
# lambda custom layer
mylayer1 = tf.keras.layers.Lambda(lambda x: tf.add(tf.matmul(x, weights),
                                                   bias), name='lambda1')
mylayer1(x_input)
WARNING:tensorflow:
The following Variables were used a Lambda layer's call (lambda1), but
are not present in its tracked objects:
<tf.Variable 'w:0' shape=(4, 2) dtype=float32, numpy=
array([[-0.753139 , -1.1668463 ],
[-1.3709341 , 0.8887151 ],
[ 0.3157893 , 0.01245957],
[-1.3878908 , -0.38395467]], dtype=float32)>
It is possible that this is intended behavior, but it is more likely
an omission. This is a strong indication that this layer should be
formulated as a subclassed Layer rather than a Lambda layer.
<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
array([[ -3.903028 , 0.7617702],
[-16.687727 , -1.8367348],
[-29.472424 , -4.43524 ]], dtype=float32)>
This is because mylayer1 does not track the tf.Variable objects it uses, so those parameters won't appear in mylayer1.trainable_weights.
mylayer1.trainable_weights
[]
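For comparison, the same affine computation can be written as a subclassed Layer so that Keras tracks the variables. This is only a minimal sketch: the class name AffineLayer is hypothetical, and unlike the Lambda snippet above it also makes the bias a trainable variable rather than a constant.

```python
import tensorflow as tf


class AffineLayer(tf.keras.layers.Layer):
    """Computes x @ w + b with variables owned (and tracked) by the layer."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # add_weight registers the variables with the layer, so they show up
        # in trainable_weights.
        self.w = self.add_weight(
            name="w",
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            name="b",
            shape=(1, self.units),
            initializer="ones",
            trainable=True,
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b


x_input = tf.reshape(tf.range(12.0), (-1, 4))  # same 3x4 input as above
layer = AffineLayer(2, name="affine1")
y = layer(x_input)

# Both variables are now tracked by the layer.
print(len(layer.trainable_weights))
```

Calling the layer once triggers build(), after which layer.trainable_weights is no longer empty and no Lambda warning is emitted.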
In general, Lambda layers can be convenient for simple stateless computation, but anything more complex should use a subclassed Layer instead. From your traceback, something similar may be happening with the stem_conv layer.
from tensorflow.keras.applications import EfficientNetB0
for layer in EfficientNetB0(weights=None).layers:
    if layer.name == 'stem_conv':
        print(layer)
<tensorflow.python.keras.layers.convolutional.Conv2D object..
A quick survey of the source code of tf.compat.v1.nn.conv2d leads to a lambda expression that might be the cause.
There is no need to revert to TF 2.4.1 here. I would always recommend trying the latest version, since newer releases address many performance issues and add new features.
I was able to run the code above without any issues in TF 2.5.
import tensorflow as tf
print(tf.__version__)
from tensorflow.keras.applications import ResNet50
from tensorflow.keras import layers, Model
img_width, img_height = 224, 224
target_shape = (img_width, img_height, 3)
base_cnn = ResNet50(
    weights="imagenet", input_shape=target_shape, include_top=False
)
flatten = layers.Flatten()(base_cnn.output)
dense1 = layers.Dense(512, activation="relu")(flatten)
dense1 = layers.BatchNormalization()(dense1)
dense2 = layers.Dense(256, activation="relu")(dense1)
dense2 = layers.BatchNormalization()(dense2)
output = layers.Dense(256)(dense2)
embedding = Model(base_cnn.input, output, name="Embedding")
trainable = False
for layer in base_cnn.layers:
    if layer.name == "conv5_block1_out":
        trainable = True
    layer.trainable = trainable
Output:
2.5.0
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
94773248/94765736 [==============================] - 1s 0us/step
As per @Olli, restarting the kernel and clearing the session resolved the problem.
Running pip install tensorflow==2.3.0 worked for me, instead of TF 2.5.
I was facing the issue related to using a Lambda layer.
I'm trying to train an autoencoder, with constraints that force one or more of the hidden/encoded nodes/neurons to have an interpretable value. My training approach uses paired images (though after training the model should operate on a single image) and utilizes a joint loss function that includes (1) the reconstruction loss for each of the images and (2) a comparison between values of the hidden/encoded vector, from each of the two images.
I've created an analogous simple toy problem and model to make this clearer. In the toy problem, the autoencoder is given a vector of length 3 as input. The encoding uses one dense layer to compute the mean (a scalar) and another dense layer to compute some other representation of the vector (given my construction, it will likely just learn an identity matrix, i.e., copy the input vector). See the figure below. The lowest node of the hidden layer is intended to compute the mean of the input vector. The rest of the hidden nodes are unconstrained aside from having to accommodate a reconstruction that matches the input.
The figure below exhibits how I wish to train the model, using paired images. "MSE" is mean-squared-error, although the identity of the actual function is not important for the question I'm asking here. The loss function is the sum of the reconstruction loss and the mean-estimation loss.
I've tried creating (1) a tf.data.Dataset to generate paired vectors, (2) a Keras model, and (3) a custom loss function. However, I'm failing to understand how to do this correctly for this particular situation.
I can't get the Model.fit() to run correctly, and to associate the model outputs with the Dataset targets as intended. See code and errors below. Can anyone help? I've done many Google and stackoverflow searches and still don't understand how I can implement this.
import tensorflow as tf
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
DTYPE = tf.dtypes.float32
N_VEC = 3
def my_generator(n):
    while True:
        # Create two identical vectors of length, except with different means.
        # An internal layer (single neuron) of the model should predict the
        # mean of the input vector. To train it to do so, with paired
        # vector inputs, use a loss function that penalizes incorrect
        # predictions of the difference of the means of two input vectors.
        input_vec1 = tf.random.normal((n,), dtype=DTYPE)
        target_mean_diff = tf.random.normal((1,), dtype=DTYPE)
        input_vec2 = input_vec1 + target_mean_diff
        # Model is a constrained autoencoder. Output targets are
        # identical to the input vectors. Including them as explicit
        # targets in this generator, for generalization.
        target_vec1 = tf.identity(input_vec1)
        target_vec2 = tf.identity(input_vec2)
        yield ({'input_vec1':input_vec1,
                'input_vec2':input_vec2},
               {'target_vec1':target_vec1,
                'target_vec2':target_vec2,
                'target_mean_diff':target_mean_diff})
def my_dataset(n, batch_size=4):
    ds = tf.data.Dataset.from_generator(my_generator,
        output_signature=({'input_vec1':tf.TensorSpec(shape=(n,), dtype=DTYPE),
                           'input_vec2':tf.TensorSpec(shape=(n,), dtype=DTYPE)},
                          {'target_vec1':tf.TensorSpec(shape=(n,), dtype=DTYPE),
                           'target_vec2':tf.TensorSpec(shape=(n,), dtype=DTYPE),
                           'target_mean_diff':tf.TensorSpec(shape=(1,), dtype=DTYPE)}),
        args=(n,))
    ds = ds.batch(batch_size)
    return ds
## Do a brief test using the Dataset
ds = my_dataset(N_VEC, batch_size=4)
ds_iter = iter(ds)
dict_inputs, dict_targets = next(ds_iter)
print(dict_inputs)
print(dict_targets)
## Define the Model
layer_encode_vec = tf.keras.layers.Dense(N_VEC, activation=None, name='encode_vec')
layer_decode_vec = tf.keras.layers.Dense(N_VEC, activation=None, name='decode_vec')
layer_encode_mean = tf.keras.layers.Dense(1, activation=None, name='encode_mean')
layer_decode_mean = tf.keras.layers.Dense(N_VEC, activation=None, name='decode_mean')
input1 = tf.keras.Input(shape=(N_VEC,), name='input_vec1')
input2 = tf.keras.Input(shape=(N_VEC,), name='input_vec2')
vec_encoded1 = layer_encode_vec(input1)
vec_encoded2 = layer_encode_vec(input2)
mean_encoded1 = layer_encode_mean(input1)
mean_encoded2 = layer_encode_mean(input2)
mean_diff = mean_encoded2 - mean_encoded1
pred_vec1 = layer_decode_vec(vec_encoded1) + layer_decode_mean(mean_encoded1)
pred_vec2 = layer_decode_vec(vec_encoded2) + layer_decode_mean(mean_encoded2)
model = tf.keras.Model(inputs=[input1, input2], outputs=[pred_vec1, pred_vec2, mean_diff])
print(model.summary())
## Define the joint loss function
def loss_total(y_true, y_pred):
    loss_reconstruct = tf.reduce_mean(tf.keras.losses.MSE(y_true[0], y_pred[0]))/2 + \
                       tf.reduce_mean(tf.keras.losses.MSE(y_true[1], y_pred[1]))/2
    loss_mean = tf.reduce_mean(tf.keras.losses.MSE(y_true[2], y_pred[2]))
    return loss_reconstruct + loss_mean
## Compile model
optimizer = tf.keras.optimizers.Adam(lr=0.01)
model.compile(optimizer=optimizer, loss=loss_total)
## Train model
history = model.fit(x=ds, epochs=10, steps_per_epoch=10)
Output: Example batch from the Dataset:
{'input_vec1': <tf.Tensor: shape=(4, 3), dtype=float32, numpy=
array([[-0.53022575, -0.02389329, 0.32843253],
[-0.61793506, -0.8276422 , -1.3469328 ],
[-0.5401968 , 0.3141346 , -1.3638284 ],
[-1.2189807 , 0.23848908, 0.75108534]], dtype=float32)>, 'input_vec2': <tf.Tensor: shape=(4, 3), dtype=float32, numpy=
array([[-0.23415083, 0.27218163, 0.6245074 ],
[-0.57636774, -0.7860749 , -1.3053654 ],
[ 0.65463066, 1.508962 , -0.16900098],
[-0.49326736, 0.9642024 , 1.4767987 ]], dtype=float32)>}
{'target_vec1': <tf.Tensor: shape=(4, 3), dtype=float32, numpy=
array([[-0.53022575, -0.02389329, 0.32843253],
[-0.61793506, -0.8276422 , -1.3469328 ],
[-0.5401968 , 0.3141346 , -1.3638284 ],
[-1.2189807 , 0.23848908, 0.75108534]], dtype=float32)>, 'target_vec2': <tf.Tensor: shape=(4, 3), dtype=float32, numpy=
array([[-0.23415083, 0.27218163, 0.6245074 ],
[-0.57636774, -0.7860749 , -1.3053654 ],
[ 0.65463066, 1.508962 , -0.16900098],
[-0.49326736, 0.9642024 , 1.4767987 ]], dtype=float32)>, 'target_mean_diff': <tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[0.29607493],
[0.04156734],
[1.1948274 ],
[0.7257133 ]], dtype=float32)>}
Output: The model summary:
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_vec1 (InputLayer)         [(None, 3)]          0
__________________________________________________________________________________________________
input_vec2 (InputLayer)         [(None, 3)]          0
__________________________________________________________________________________________________
encode_vec (Dense)              (None, 3)            12          input_vec1[0][0]
                                                                 input_vec2[0][0]
__________________________________________________________________________________________________
encode_mean (Dense)             (None, 1)            4           input_vec1[0][0]
                                                                 input_vec2[0][0]
__________________________________________________________________________________________________
decode_vec (Dense)              (None, 3)            12          encode_vec[0][0]
                                                                 encode_vec[1][0]
__________________________________________________________________________________________________
decode_mean (Dense)             (None, 3)            6           encode_mean[0][0]
                                                                 encode_mean[1][0]
__________________________________________________________________________________________________
tf.__operators__.add (TFOpLambd (None, 3)            0           decode_vec[0][0]
                                                                 decode_mean[0][0]
__________________________________________________________________________________________________
tf.__operators__.add_1 (TFOpLam (None, 3)            0           decode_vec[1][0]
                                                                 decode_mean[1][0]
__________________________________________________________________________________________________
tf.math.subtract (TFOpLambda)   (None, 1)            0           encode_mean[1][0]
                                                                 encode_mean[0][0]
==================================================================================================
Total params: 34
Trainable params: 34
Non-trainable params: 0
__________________________________________________________________________________________________
Output: The error message when calling model.fit():
Epoch 1/10
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
...
ValueError: Found unexpected keys that do not correspond to any
Model output: dict_keys(['target_vec1', 'target_vec2', 'target_mean_diff']).
Expected: ['tf.__operators__.add', 'tf.__operators__.add_1', 'tf.math.subtract']
You can pass a dict to Model for both inputs and outputs like so:
model = tf.keras.Model(
    inputs={"input_vec1": input1, "input_vec2": input2},
    outputs={
        "target_vec1": pred_vec1,
        "target_vec2": pred_vec2,
        "target_mean_diff": mean_diff,
    },
)
which avoids having to name the output layers.
For the losses, it's currently applying loss_total to each of the 3 outputs individually and summing to get the final loss, which is not what you want. So you can either break out each of the losses individually:
model.compile(
    optimizer=optimizer,
    loss={"target_vec1": "mse", "target_vec2": "mse", "target_mean_diff": "mse"},
    loss_weights={"target_vec1": 0.5, "target_vec2": 0.5, "target_mean_diff": 1},
)
or you can manually train the model using a modified loss function that takes dict input. Something like:
def loss_total(y_true, y_pred):
    loss_reconstruct = (
        tf.reduce_mean(tf.keras.losses.MSE(y_true["target_vec1"], y_pred["target_vec1"])) / 2
        + tf.reduce_mean(tf.keras.losses.MSE(y_true["target_vec2"], y_pred["target_vec2"])) / 2
    )
    loss_mean = tf.reduce_mean(tf.keras.losses.MSE(y_true["target_mean_diff"], y_pred["target_mean_diff"]))
    return loss_reconstruct + loss_mean
for epoch in range(10):
    for batch, (x, y) in zip(range(10), ds):
        with tf.GradientTape() as tape:
            outputs = model(x, training=True)
            loss = loss_total(y, outputs)
        trainable_vars = model.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        optimizer.apply_gradients(zip(gradients, trainable_vars))
        print(f"Batch: {batch}, loss: {loss.numpy()}")
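As a sanity check on the loss_weights option, the weighted sum of per-output MSE values that Keras minimizes should equal the joint loss from the question. The sketch below verifies this arithmetic in plain Python, with no TensorFlow involved; the target and prediction values are made up for illustration and do not come from a real run.

```python
def mse(y_true, y_pred):
    """Mean squared error over two equal-length lists of floats."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative targets and predictions for a single sample (made-up numbers).
targets = {
    "target_vec1": [1.0, 2.0, 3.0],
    "target_vec2": [2.0, 3.0, 4.0],
    "target_mean_diff": [1.0],
}
predictions = {
    "target_vec1": [1.0, 2.0, 5.0],
    "target_vec2": [2.0, 0.0, 4.0],
    "target_mean_diff": [3.0],
}
loss_weights = {"target_vec1": 0.5, "target_vec2": 0.5, "target_mean_diff": 1.0}

# Per-output losses, then the weighted sum requested via loss_weights.
per_output = {key: mse(targets[key], predictions[key]) for key in targets}
weighted_total = sum(loss_weights[key] * per_output[key] for key in per_output)

# Joint loss as defined in the question: half of each reconstruction error
# plus the mean-difference error.
joint_total = (
    per_output["target_vec1"] / 2
    + per_output["target_vec2"] / 2
    + per_output["target_mean_diff"]
)

print(weighted_total, joint_total)
```

The two totals agree, which is why the dict-based compile call can stand in for the hand-written loss_total when each term is a plain MSE.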
I am trying the following code to save and load a TF Keras model that has an LSTM layer with initial_state inputs. But when I try to load the model, I get this error:
ValueError: Layer lstm expects 1 inputs, but it received 3 input tensors. Inputs received: [<tf.Tensor 'reshape_1/Identity:0' shape=(None, 5, 5) dtype=float32>, <tf.Tensor 'dense_2/Identity:0' shape=(None, 5) dtype=float32>, <tf.Tensor 'dense_2/Identity:0' shape=(None, 5) dtype=float32>]
Is there any way to load the model with LSTM initial states?
import numpy as np
import tensorflow as tf
inputs = tf.keras.Input(name='input_1', shape=[25])
initial_state = tf.keras.Input(name='initial_state', shape=[5])
dense_x = tf.keras.layers.Dense(units=5)(initial_state)
reshape = tf.keras.layers.Reshape(target_shape=[5, 5])(inputs)
stacked_rnn = tf.keras.layers.LSTM(units=5, return_sequences=True)(inputs=reshape, initial_state=[dense_x, dense_x])
flatten = tf.keras.layers.Flatten()(stacked_rnn)
dense = tf.keras.layers.Dense(name='dense_1', units=1, activation='sigmoid')(flatten)
model = tf.keras.Model(inputs=[inputs, initial_state], outputs=dense, name="test_model")
print(model(inputs=[np.zeros(shape=[5, 25]), np.zeros(shape=[5, 5])]))
tf.keras.models.save_model(model, "sequential/", )
sequential_model = tf.keras.models.load_model("sequential/")
print(sequential_model(inputs=[np.zeros(shape=[5, 25]), np.zeros(shape=[5, 5])]))
The above code snippet works on TensorFlow 2.4:
import numpy as np
import tensorflow as tf
inputs = tf.keras.Input(name='input_1', shape=[25])
initial_state = tf.keras.Input(name='initial_state', shape=[5])
dense_x = tf.keras.layers.Dense(units=5)(initial_state)
reshape = tf.keras.layers.Reshape(target_shape=[5, 5])(inputs)
stacked_rnn = tf.keras.layers.LSTM(units=5, return_sequences=True)(inputs=reshape, initial_state=[dense_x, dense_x])
flatten = tf.keras.layers.Flatten()(stacked_rnn)
dense = tf.keras.layers.Dense(name='dense_1', units=1, activation='sigmoid')(flatten)
model = tf.keras.Model(inputs=[inputs, initial_state], outputs=dense, name="test_model")
#initial states to the model
print(model(inputs=[np.zeros(shape=[5, 25]), np.zeros(shape=[5, 5])]))
tf.keras.models.save_model(model, "sequential/", )
sequential_model = tf.keras.models.load_model("sequential/")
#load model with initial states
print(sequential_model(inputs=[np.zeros(shape=[5, 25]), np.zeros(shape=[5, 5])]))
Output
tf.Tensor(
[[0.5]
[0.5]
[0.5]
[0.5]
[0.5]], shape=(5, 1), dtype=float32)
tf.Tensor(
[[0.5]
[0.5]
[0.5]
[0.5]
[0.5]], shape=(5, 1), dtype=float32)
It seems that the Keras trainable attribute is ignored by TensorFlow, which makes it very inconvenient to use Keras as a syntactic shortcut in TensorFlow.
For example:
import keras
import tensorflow as tf
import numpy as np
import keras.backend as K
Conv2 = keras.layers.Conv2D(filters=16, kernel_size=3, padding='same')
Conv2.trainable = False  # This layer has been set to not trainable.
A=keras.layers.Input(batch_shape=(1,16,16,3))
B = Conv2(A)
x = np.random.randn(1, 16, 16,3)
y = np.random.randn(1,16, 16, 16)
True_y = tf.placeholder(shape=(1,16,16,16), dtype=tf.float32)
loss = tf.reduce_sum((B - True_y) ** 2)
opt_op = tf.train.AdamOptimizer(learning_rate=0.01).minimize(loss)
print(tf.trainable_variables())
# [<tf.Variable 'conv2d_1/kernel:0' shape=(3, 3, 3, 16) dtype=float32_ref>, <tf.Variable 'conv2d_1/bias:0' shape=(16,) dtype=float32_ref>]
sess = K.get_session()
for _ in range(10):
    out = sess.run([opt_op, loss], feed_dict={A:x, True_y:y})
    print(out[1])
Output:
5173.94
4968.7754
4785.889
4624.289
4482.1
4357.5757
4249.1504
4155.329
4074.634
4005.6482
This means the loss is decreasing, so the weights are still being trained.
I read the blog ''Keras as a simplified interface to TensorFlow'', but it mentioned nothing about the trainable problem.
Any suggestion is appreciated.
Your conclusion is basically correct. Keras is a wrapper around TensorFlow, but not all Keras functionality transfers directly into TensorFlow, so you need to be careful when you mix Keras and raw TF.
Specifically, in this case, if you want to call the minimize function yourself, you need to specify which variables you want to train on using the var_list argument of minimize.
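A minimal sketch of the var_list idea, assuming TF 2.x with the tf.compat.v1 graph API standing in for the TF1 calls used in the question. The two scalar variables named frozen and free are hypothetical stand-ins for layer weights; with Keras layers you would build var_list from the weights of the layers you actually want to train.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # TF1-style graph API, as used in the question

graph = tf.Graph()
with graph.as_default():
    # Two scalar variables standing in for a frozen weight and a trainable one.
    frozen = tf1.get_variable("frozen", initializer=1.0)
    free = tf1.get_variable("free", initializer=1.0)
    loss = tf.square(frozen * free - 3.0)

    # Only `free` is handed to the optimizer, so `frozen` is never updated,
    # even though both variables are in the trainable-variables collection.
    optimizer = tf1.train.GradientDescentOptimizer(learning_rate=0.1)
    train_op = optimizer.minimize(loss, var_list=[free])

    with tf1.Session(graph=graph) as sess:
        sess.run(tf1.global_variables_initializer())
        for _ in range(5):
            sess.run(train_op)
        frozen_value, free_value = sess.run([frozen, free])

print(frozen_value, free_value)
```

Without var_list, minimize falls back to every variable in the graph's trainable-variables collection, which is why the Conv2D kernel and bias in the question were still updated.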