I use TensorFlow 2.1 with a custom layer, as follows:
class Mylayer(KL.Layer):

    def __init__(self, name):
        super(Mylayer, self).__init__(name)
        self.conv = KL.Conv2D(32)

    def call(self, inputs):
        outputs = self.conv(inputs)
        np.save('outputs.npy', outputs)
        return outputs
However, whether or not I decorate train_step with tf.function, np.save fails with the message that it cannot convert a symbolic tensor to a numpy array. If I change it to np.save('outputs.txt', outputs.numpy()) without using tf.function, it says the tensor object has no attribute numpy. Also, call() seems to be invoked twice when not using tf.function: with a symbolic tensor the first time and an eager tensor the second time.
How do I save the tensor value inside call()?
Keras models are implicitly compiled into static graphs, whether you use @tf.function in the call method or not. Consequently, all tensors are of type tf.Tensor rather than tf.EagerTensor and therefore don't have the numpy() method.
To overcome this, simply pass dynamic=True to the constructor of the model that uses the layer. You will then be able to use the numpy() method.
But remember, doing so may significantly increase training and inference times.
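For example, a minimal sketch of a subclassed model wrapping the layer above (the MyModel class and its use here are illustrative, not from the original post):

import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self):
        # dynamic=True makes the model run eagerly, so tensors inside call()
        # are EagerTensors and .numpy() is available.
        super(MyModel, self).__init__(dynamic=True)
        self.my_layer = Mylayer('my_layer')

    def call(self, inputs):
        return self.my_layer(inputs)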
Related
When I create a Keras model with one or more custom layers, I can use the model.save() method to persist the Keras model using the TensorFlow SavedModel format.
I can load this model from the filesystem using the tf.keras.models.load_model() function and save it to the filesystem again.
But when I load the SavedModel from the filesystem a second time, it fails with this exception:
TypeError: f(inputs, training, training, training, training, *, training, training) missing 1 required argument: training
You can try replicating this issue with the following code:
import tensorflow as tf

class CustomLayer(tf.keras.layers.Layer):
    def call(self, inputs, *args, **kwargs):
        return inputs

model1 = tf.keras.Sequential([
    CustomLayer()
])
model1.build((None, 1))
model1.compile()
model1.save("model1")

model2 = tf.keras.models.load_model("model1")
model2.save("model2")

# This line should raise a TypeError.
model3 = tf.keras.models.load_model("model2")
Why the problem exists
The problem is that the TensorFlow SavedModel format does not actually serialize custom Python code. It only saves the TensorFlow graph generated by custom Keras layers and other Python objects.
The tf.keras.models.load_model() function does not, by default, return the original Python layer. Instead, it returns a placeholder layer containing the corresponding part of the TensorFlow computation graph. We can see this in the example from my question:
>>> model1.layers
[<__main__.CustomLayer at 0x7ff04c14ee20>]
>>> model2.layers
[<keras.saving.saved_model.load.CustomLayer at 0x7ff114fd7be0>]
When model2 is saved and loaded from the filesystem, TensorFlow cannot correctly parse the *args and **kwargs arguments in CustomLayer.call().
I don't know whether the actual bug is within the saving code, the loading code, or both.
The real fix needs to happen within TensorFlow/Keras, but in the meantime, there are
Workarounds
You can choose any ONE of the below workarounds to avoid serialization errors with custom Keras layers.
Change the signature on Layer.call()
Currently, the official method signature on Layer.call() is def call(self, inputs, *args, **kwargs):
But TensorFlow will throw a TypeError when trying to load a model with a custom layer with this signature. To fix the error, write all of your custom layers with a signature of def call(self, inputs):. If your layer behaves differently during training or inference, then you can use the method signature def call(self, inputs, training=None):
This makes it easier for TensorFlow to generate the placeholder layers in the keras.saving.saved_model.load module. But this placeholder layer is still not exactly the same as the original Python code.
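For illustration, a sketch of the recommended signature applied to the CustomLayer from the example above:

class CustomLayer(tf.keras.layers.Layer):
    def call(self, inputs, training=None):
        # The optional training flag replaces *args/**kwargs, so the saved
        # signature round-trips through save/load without a TypeError.
        return inputs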
Use the custom_objects parameter on tf.keras.models.load_model()
It is possible to load a model with its original Python layers instead of the placeholder layers. Just pass a dictionary mapping layer names to Python layer class objects. This requires your code to be able to import the original Python layer. The example in my question can be fixed as follows:
model3 = tf.keras.models.load_model(
    "model2",
    custom_objects=dict(
        CustomLayer=CustomLayer,
    ),
)
Make sure that your layer implements Layer.get_config() and returns a dictionary with all of the parameters needed to recreate the layer from scratch. The layer must be able to be recreated with Layer.from_config().
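As a rough sketch, assuming a hypothetical units constructor argument (not part of the original example):

class CustomLayer(tf.keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def call(self, inputs):
        return inputs

    def get_config(self):
        # Include every constructor argument so the layer can be rebuilt
        # with CustomLayer.from_config(config).
        config = super().get_config()
        config.update({"units": self.units})
        return config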
Import the Python layer and add it to Keras's global registry
Keras maintains a global registry of custom Python classes and other objects to refer to when loading SavedModels. You can register your custom Keras layer with the @tf.keras.utils.register_keras_serializable() decorator. For example:
@tf.keras.utils.register_keras_serializable(
    package="my_python_package"
)
class CustomLayer(tf.keras.layers.Layer):
    def call(self, inputs, *args, **kwargs):
        return inputs
This method also requires that your layer properly implement Layer.get_config().
Install the Python layer object with tf.keras.utils.custom_object_scope()
Much like the above two solutions, the tf.keras.utils.custom_object_scope() context manager can specify which custom layers to use during deserialization.
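A minimal sketch, reusing the model paths from the example above:

with tf.keras.utils.custom_object_scope({"CustomLayer": CustomLayer}):
    model3 = tf.keras.models.load_model("model2")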
I've created a keras subclass model just like this:
class SubModel(tf.keras.Model):
    def __init__(self, features, **kwargs):
        """Init function of Model.

        Args:
            features: A list of SparseFeature and DenseFeature.
        """
        assert len(features) > 0
        super(SubModel, self).__init__(name='SubModel', **kwargs)
        self.features = features
Note that there is a features attribute in the __init__ function which will be used in the call method of this model. Everything works well when I train and evaluate the model in Keras style.
But now I want to convert this model to an estimator using the tf.keras.estimator.model_to_estimator function. It raises an error: AttributeError: '_ListWrapper' object has no attribute 'get_config'.
According to my debugging, it is the features attribute added to the model that causes this error. When converting to an estimator, it regards features as a layer of the model and tries to call its get_config function when cloning the model. It seems that all attributes added to the model are treated as layers when the model is cloned.
But I really want to use features as part of the model so that it can be accessed from other methods of this model, such as call. Is there another way to solve this?
I think tf.keras.estimator.model_to_estimator works perfectly with Sequential or functional API Keras models, but poorly with subclassed models, especially when the subclass implements complicated operations.
So, if you have defined a subclassed Keras model and want to convert it to an estimator, the best way is to define a model_fn function and put the Keras model inside it, like the code below:
def model_fn(features, labels, mode):
    model = SubModel()
    outputs = model(features)
    loss = tf.keras.losses.xx(labels, outputs)
    return tf.estimator.EstimatorSpec(...)
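A rough sketch of wiring that model_fn into an estimator (the model_dir path and train_input_fn are illustrative placeholders):

estimator = tf.estimator.Estimator(model_fn=model_fn, model_dir='/tmp/submodel')
estimator.train(input_fn=train_input_fn, max_steps=1000)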
I want to add to my model a layer that, during evaluation, takes the input, applies some transformations (a quantization in this case, but can be whatever) and return it as the output. This layer must, however, be completely transparent during training, meaning that it must return the same input tensor.
I have written the following function
from keras.layers import Lambda
import keras.backend as K

def myquantize(x):
    return K.in_test_phase(K.clip(K.round(x*(2**5))/(2**5), -3.9, 3.9), x)
which I then use via a Lambda layer
# x is the incoming tensor from the previous layers
y = keras.layers.Conv1D(**args1)(x)
y = keras.layers.AveragePooling1D(pool_size=2)(y)
y = keras.layers.Lambda(myquantize)(y)
y = keras.layers.Conv1D(**args2)(y)
#...
Now, in principle the K.in_test_phase should return x during training, and that expression during test.
However, training the network with such a layer prevents it from learning (i.e. the training loss stops decreasing after 3 epochs), whereas if I remove it the network keeps training normally. I assume this layer is not actually transparent during training as expected.
in_test_phase has a training parameter which you can set explicitly to indicate whether you are training or not. If you don't set it explicitly, the value of learning_phase is used. This value changes when you reset the graph or when you call the different fit/predict/evaluate functions of the model.
Since your full code isn't present, you can make use of the training parameter. Set it to True during training, then save the model's weights with the save_weights function. When you wish to test the model, set the training parameter to False, load the weights with load_weights, and proceed accordingly.
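For example, a sketch of the two variants, reusing the quantization expression from the question (the function names are illustrative):

def myquantize_train(x):
    # training=True forces the learning phase, so the identity branch (x) is used.
    return K.in_test_phase(
        K.clip(K.round(x * (2 ** 5)) / (2 ** 5), -3.9, 3.9), x, training=True)

def myquantize_eval(x):
    # training=False forces the test phase, so the quantized branch is used.
    return K.in_test_phase(
        K.clip(K.round(x * (2 ** 5)) / (2 ** 5), -3.9, 3.9), x, training=False)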
For those who are in a similar situation, I created a custom layer like the following, which I only use during training:
class MyLayer(keras.layers.Layer):
    def __init__(self, **kwargs):
        super(MyLayer, self).__init__(**kwargs)

    def compute_output_shape(self, input_shape):
        return input_shape

    def call(self, inputs, **kwargs):
        x = inputs
        return K.identity(x)
Note that this layer always returns the input tensor, but it serves as a 'placeholder' for the next step. In the evaluation part of the code, I wrote the following:
class MyLayer(keras.layers.Layer):
    def __init__(self, **kwargs):
        super(MyLayer, self).__init__(**kwargs)

    def compute_output_shape(self, input_shape):
        return input_shape

    def call(self, inputs, **kwargs):
        x = inputs
        return  # Your actual processing here
Here, the only difference is that you actually perform the desired processing steps on your tensor. When I load my stored model, I pass this class as a custom object:
model = keras.models.load_model(model_file,custom_objects={'MyLayer':MyLayer})
Be careful to pass as MyLayer the version where the actual processing is performed.
This is my solution; other suggestions are welcome.
I have a regular keras model called e and I would like to compare its output for both y_pred and y_true in my custom loss function.
from keras import backend as K

def custom_loss(y_true, y_pred):
    return K.mean(K.square(e.predict(y_pred) - e.predict(y_true)), axis=-1)
I am getting the error: AttributeError: 'Tensor' object has no attribute 'ndim'
This is because y_true and y_pred are both tensor objects, while keras Model.predict expects to be passed a numpy array.
Any idea how I may succeed in using my keras.model in my custom loss function?
I am open to getting the output of a specified layer if need be or to converting my keras.model to a tf.estimator object (or anything else).
First, let's try to understand the error message you're getting:
AttributeError: 'Tensor' object has no attribute 'ndim'
Let's take a look at the Keras documentation and find the predict method of Keras model. We can see the description of the function parameters:
x: the input data, as a Numpy array.
So, the model is trying to get the ndim attribute of what it expects to be a numpy array. On the other hand, the custom loss function in the Keras framework gets tensors as inputs. So don't write any plain Python code inside it; it will never be executed during evaluation. This function is only called to construct the computational graph.
Okay, now that we have found out the meaning behind that error message, how can we use a Keras model inside a custom loss function? Simple! We just need to get the evaluation graph of the model.
Update
Using the global keyword is bad coding practice. Also, now in 2020 we have a better functional API in Keras that makes such hacks with layers unnecessary. Better to use something like this:
from keras import backend as K

def make_custom_loss(model):
    """Creates a loss function that uses `model` for evaluation."""
    def custom_loss(y_true, y_pred):
        return K.mean(K.square(model(y_pred) - model(y_true)), axis=-1)
    return custom_loss

custom_loss = make_custom_loss(e)
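A brief usage sketch (main_model is an illustrative name for the model being trained with this loss):

main_model.compile(optimizer='adam', loss=make_custom_loss(e))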
Deprecated
Try something like this (only for Sequential models and very old API):
def custom_loss(y_true, y_pred):
    # Your model exists in global scope
    global e

    # Get the layers of your model
    layers = [l for l in e.layers]

    # Construct a graph to evaluate your other model on y_pred
    eval_pred = y_pred
    for i in range(len(layers)):
        eval_pred = layers[i](eval_pred)

    # Construct a graph to evaluate your other model on y_true
    eval_true = y_true
    for i in range(len(layers)):
        eval_true = layers[i](eval_true)

    # Now do what you wanted to do with outputs.
    # Note that we are not returning the values, but a tensor.
    return K.mean(K.square(eval_pred - eval_true), axis=-1)
Please note that the code above is not tested. However, the general idea will stay the same regardless of the implementation: you need to construct a graph in which y_true and y_pred flow through to the final operations.
I am solving a text classification problem. I defined my classifier using the Estimator class with my own model_fn. I would like to use Google's pre-trained word2vec embedding as initial values and then further optimise it for the task at hand.
I saw this post: Using a pre-trained word embedding (word2vec or Glove) in TensorFlow
which explains how to go about it in 'raw' TensorFlow code. However, I would really like to use the Estimator class.
As an extension, I would like to then train this code on Cloud ML Engine; is there a good way of passing in the fairly large file of initial values?
Let's say we have something like:
def build_model_fn():
    def _model_fn(features, labels, mode, params):
        input_layer = features['feat']  # shape=[-1, params["sequence_length"]]
        # ... what goes here to initialize W?
        embedded = tf.nn.embedding_lookup(W, input_layer)
        ...
        return predictions
    return _model_fn

estimator = tf.contrib.learn.Estimator(
    model_fn=build_model_fn(),
    model_dir=MODEL_DIR,
    params=params)

estimator.fit(input_fn=read_data, max_steps=2500)
Embeddings are typically large enough that the only viable approach is using them to initialize a tf.Variable in your graph. This will allow you to take advantage of param servers in distributed, etc.
For this (and anything else), I would recommend you use the new "core" estimator, tf.estimator.Estimator as this will make things much easier.
From the answer in the link you provided, and knowing that we want a variable rather than a constant, we can take either of these approaches:
(2) Initialize the variable using a feed dict, or
(3) Load the variable from a checkpoint
I'll cover option (3) first since it's much easier, and better:
In your model_fn, simply initialize a variable using the Tensor returned by a tf.contrib.framework.load_variable call. This requires:
That you have a valid TF checkpoint with your embeddings
You know the fully qualified name of the embeddings variable within the checkpoint.
The code is pretty simple:
def model_fn(mode, features, labels, hparams):
    embeddings = tf.Variable(tf.contrib.framework.load_variable(
        'gs://my-bucket/word2vec_checkpoints/',
        'a/fully/qualified/scope/embeddings'
    ))
    ....
    return tf.estimator.EstimatorSpec(...)
However this approach won't work for you if your embeddings weren't produced by another TF model, hence option (2).
For (2), we need to use tf.train.Scaffold which is essentially a configuration object that holds all the options for starting a tf.Session (which estimator intentionally hides for lots of reasons).
You may specify a Scaffold in the tf.train.EstimatorSpec you return in your model_fn.
We create a placeholder in our model_fn, use it as the initial value of our embedding variable, and then pass an init_feed_dict via the Scaffold, e.g.:
def model_fn(mode, features, labels, hparams):
    embed_ph = tf.placeholder(
        shape=[hparams.vocab_size, hparams.embedding_size],
        dtype=tf.float32)
    embeddings = tf.Variable(embed_ph)
    # Define your model
    return tf.estimator.EstimatorSpec(
        ...,  # normal EstimatorSpec args
        scaffold=tf.train.Scaffold(init_feed_dict={embed_ph: my_embedding_numpy_array})
    )
What's happening here is that the init_feed_dict populates the values of the embed_ph placeholder at runtime, which in turn allows the initializer of embeddings (the assignment from the placeholder) to run.
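As a hedged sketch, the my_embedding_numpy_array referenced above would typically be loaded once at startup; on Cloud ML Engine it can be read from GCS with tf.gfile (the bucket path and file name here are illustrative):

import numpy as np
import tensorflow as tf

# The matrix must match the placeholder shape: [vocab_size, embedding_size].
with tf.gfile.GFile('gs://my-bucket/word2vec_embeddings.npy', 'rb') as f:
    my_embedding_numpy_array = np.load(f)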