My code is like this:
... (code that builds conv_model) ...
conv_model.add(keras.layers.Flatten())
input2 = keras.models.Sequential()
input2.add(keras.layers.Activation('linear', input_shape=(1,)))
model = keras.models.Sequential()
model.add(keras.layers.Merge([conv_model, input2], mode='concat'))
... (rest of the model) ...
When I run this code, it says:
UserWarning: The `Merge` layer is deprecated and will be removed after
08/2017. Use instead layers from `keras.layers.merge`, e.g. `add`,
`concatenate`, etc.
I have tried to use 'keras.layers.Concatenate' in many ways like:
model.add(keras.layers.Concatenate([conv_model, angle]))
But it says:
The first layer in a Sequential model must get an `input_shape` or
`batch_input_shape` argument
Can anybody help?
Sequential models are not supposed to work with branches.
You need a functional API model.
from keras.layers import Input, Activation, Concatenate
from keras.models import Model

input2 = Input((1,))
out2 = Activation('linear')(input2)
# chooseOne: the axis along which to concatenate (usually -1, the feature axis)
concatenated = Concatenate(axis=chooseOne)([conv_model.output, out2])
model = Model([conv_model.input, input2], concatenated)
PS: the layer Activation('linear') does absolutely nothing in any model.
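For completeness, a minimal sketch of how such a two-input functional model could be trained; conv_data, angle_data and labels are hypothetical NumPy arrays, and in practice you would usually stack more layers on top of concatenated before building the Model:
model.compile(optimizer='adam', loss='mse')
# a functional model with two inputs takes a list of arrays, one per Input
model.fit([conv_data, angle_data], labels, epochs=10, batch_size=32)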
Hi, I am having some serious problems saving and loading a TensorFlow model that combines Hugging Face transformers with some custom layers for classification. I am using the latest Hugging Face transformers TensorFlow Keras version. The idea is to extract features using DistilBERT and then run the features through a CNN to do classification and extraction. I have everything working as far as getting the correct classifications.
The problem is in saving the model once trained and then loading the model again.
I am using tensorflow.keras with TensorFlow version 2.2.
Following is the code to design the model, train it, evaluate it, and then save and load it:
bert_config = DistilBertConfig(dropout=0.2, attention_dropout=0.2, output_hidden_states=False)
bert_config.output_hidden_states = False
transformer_model = TFDistilBertModel.from_pretrained(DISTIL_BERT, config=bert_config)
input_ids_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='input_token', dtype='int32')
input_masks_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='masked_token', dtype='int32')
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
x = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(50, return_sequences=True, dropout=0.1,
                         recurrent_dropout=0, recurrent_activation="sigmoid",
                         unroll=False, use_bias=True, activation="tanh"))(embedding_layer)
x = tf.keras.layers.GlobalMaxPool1D()(x)
outputs = []
# lots of code here to define the dense layers to generate the outputs
# .....
# .....
model = Model(inputs=[input_ids_in, input_masks_in], outputs=outputs)
for model_layer in model.layers[:3]:
    logger.info(f"Setting layer {model_layer.name} to not trainable")
    model_layer.trainable = False
rms_optimizer = RMSprop(learning_rate=0.001)
model.compile(loss=SigmoidFocalCrossEntropy(), optimizer=rms_optimizer)
# the code to fit the model (which works)
# then code to evaluate the model (which also works)
# finally saving the model. This too works.
tf.keras.models.save_model(model, save_url, overwrite=True, include_optimizer=True, save_format="tf")
However, when I try to load the saved model using the following
tf.keras.models.load_model(
    path, custom_objects={"Addons>SigmoidFocalCrossEntropy": SigmoidFocalCrossEntropy})
I get the following load error
ValueError: The two structures don't have the same nested structure.
First structure: type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')
Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}
More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')" is not
Entire first structure:
.
Entire second structure:
{'input_ids': .}
I believe the issue is that the TFDistilBertModel layer can be called with a dictionary input from DistilBertTokenizer.encode(), and it happens to be the first layer. So on load, the model expects that dictionary to be the input signature of the model's call. However, the inputs defined for the model are two tensors of shape (None, 128).
So how do I tell the load function or the save function to assume the correct signatures?
I solved the issue.
The issue was that the object transformer_model in the above code is not itself a layer. So if we want to embed it inside another Keras model, we should use the internal Keras layer that is wrapped inside it.
So changing the line
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
to
embedding_layer = transformer_model.distilbert([input_ids_in, input_masks_in])[0]
makes everything work. Hope this helps someone else. Took a long time to debug through tf.keras code to figure this one out although in hindsight it is obvious. :)
I ran into the same problem, coincidentally, just yesterday. My solution is very similar to yours. I figured the problem was due to how tensorflow.keras processes custom models, so the idea was to use the layers of the custom model inside my own model. This has the advantage of not calling the layer explicitly by its name (in my case, this is useful for easily building more generic models using different pretrained encoders):
sent_encoder = getattr(transformers, self.model_name).from_pretrained(self.shortcut_weights).layers[0]
I haven't explored all the HuggingFace models, but the few that I tested seem to be custom models with only one custom layer.
Your solution also works like a charm; in fact, both solutions are the same if "distilbert" refers to ".layers[0]".
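To make the equivalence concrete, a small sketch under the assumption that transformer_model is a TFDistilBertModel, whose only inner layer is the wrapped DistilBERT main layer:
inner_by_attribute = transformer_model.distilbert
inner_by_index = transformer_model.layers[0]  # the same object for DistilBERT
embedding_layer = inner_by_index([input_ids_in, input_masks_in])[0]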
Is it possible to incorporate an Add() function in the tf.keras.Sequential() model, when defined like:
from tensorflow import keras
model = keras.Sequential([
    keras.Input(shape=(input_shape,)),
    keras.layers.Dense(32),
    keras.layers.Dense(8),
    # I want to add here
    keras.layers.Add()(some_var)
], name='my_model')
some_var is a tensor with the same size as the network output at that point. So each element needs to be added to its corresponding element in some_var.
I know I can do this quite easily with the functional API, but would prefer to use a sequential model as it would match other branches in my network.
If it's not clear: keras.layers.Add()(some_var) is just a guess at how I would like it to work. This gives the error: ValueError: A merge layer should be called on a list of inputs.
My question is specific to the style in which I define the Sequential model.
One of the main differences between the Functional and Sequential APIs is that Sequential works with a single input and a single output, whereas the Functional API works with a single input and single output, a single input and multiple outputs, or multiple inputs and multiple outputs. So using the Functional API, you can add the outputs of two layers coming from multiple inputs through keras.layers.Add().
Also, keras.layers.Add() can be used to add two input tensors directly, which is not really what it is meant for; we could equally use d = tf.add(a, b). Both c and d below are equal:
a = tf.constant(1., dtype=tf.float32, shape=(1, 3))
b = tf.constant(2., dtype=tf.float32, shape=(1, 3))
c = tf.keras.layers.Add()([a, b])
d = tf.add(a, b)
The following example is from the Keras website. You can see how it is used in the Functional API:
import keras
input1 = keras.layers.Input(shape=(16,))
x1 = keras.layers.Dense(8, activation='relu')(input1)
input2 = keras.layers.Input(shape=(32,))
x2 = keras.layers.Dense(8, activation='relu')(input2)
# equivalent to added = keras.layers.add([x1, x2])
added = keras.layers.Add()([x1, x2])
out = keras.layers.Dense(4)(added)
model = keras.models.Model(inputs=[input1, input2], outputs=out)
Thanks to @today's comment (and a since-deleted answer?!), I solved it using the tf.keras.layers.Lambda function.
model = keras.Sequential([
    keras.Input(shape=(input_shape,)),
    keras.layers.Dense(32),
    keras.layers.Dense(8),
    keras.layers.Lambda(lambda x: x + some_var)
], name='my_model')
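A note on some_var: it is captured by the Lambda's closure, so it needs to exist beforehand as a tensor (or constant) whose shape broadcasts against the (batch, 8) output of the last Dense layer. A hypothetical example:
import numpy as np
import tensorflow as tf
some_var = tf.constant(np.ones(8), dtype=tf.float32)  # hypothetical values, shape (8,)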
Usually we feed a model with external data for training. But I would like to use a tensor coming from an intermediate layer of the same model as the input for the next batch.
I believe this can be achieved by using a manual training loop. This time, though, I prefer to use fit_generator() from Keras (v2.2.4). I create the model using the Functional API.
Any help is appreciated. Thanks.
A very simple approach is to make the loop inside your own model:
inputs = Input(...)
#part 1 layers:
layer1 = SomeLayer(...)
layer2 = SomeLayer(...)
layer3 = SomeLayer(...)
intermediateLayer = IntermediateLayer(...)
#first pass:
out = layer1(inputs)
out = layer2(out)
out = layer3(out)
intermediate_out = intermediateLayer(out)
#second pass:
out = layer1(intermediate_out)
out = layer2(out)
out = layer3(out)
second_pass_out = intermediateLayer(out)
#rest of the model - you decide whether you need the first pass or only the second
out = SomeLayer(...)(second_pass_out)
out = SomeLayer(...)(out)
...
final_out = FinalLayer(...)(out)
The model then goes:
model = Model(inputs, final_out)
You can, depending on your purposes, make only the second pass participate in training, blocking gradients from the first pass.
#right after intermediate_out, before using it
intermediate_out = Lambda(lambda x: K.stop_gradient(x))(intermediate_out)
You can also create more models that will share these layers, and use each model for a purpose while they will always be updated together (as they use the same layers).
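As a sketch of that idea, reusing the names from the snippet above, a second model exposing the intermediate tensor would share its weights with the main model:
intermediate_model = Model(inputs, intermediate_out)
# intermediate_model.predict(...) returns the first-pass intermediate output,
# and its layers are updated whenever the main model is trained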
Notice that in "part 1" there are layers that get "reused", while in "rest of the model" the layers are not reused.
If for some reason you need to reuse layers in that second part, you should do it the same way it was done for "part 1".
This is how I solved my problem:
model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
model.metrics_tensors += [model.get_layer('your_intermediate_layer').output]  # this line gives access to the output of a layer during training (what I want)
Then train like this:
loss_out, ...., your_intermediate_layer_out = model.train_on_batch(X, y)
your_intermediate_layer_out is the numpy array I was looking for during the model's training.
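For illustration, a hedged sketch of the surrounding loop, assuming batch_generator yields (X, y) pairs and the model was compiled as above:
for X, y in batch_generator:
    outs = model.train_on_batch(X, y)
    loss_out = outs[0]
    your_intermediate_layer_out = outs[-1]  # numpy array of the chosen layer's output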
I'm dealing with Keras functional API.
Specifically, for my experiments I'm using the Keras ResNet50 model obtained with:
model = resnet50.ResNet50(weights='imagenet')
Obviously, to get the final output of the network we need to feed a value to the placeholder input_1.
My question is: can I somehow start inference on this graph from the relu layer that is depicted at the bottom of the picture below, provided that I feed a value of the appropriate dimensions into it?
I tried to achieve this with Keras functions. Something like:
self.inp = model.input
self.outputs = [layer.output for layer in model.layers]
self.functor = K.function([self.inp, K.learning_phase()], [self.outputs[6], self.outputs[17]])
But this approach will not work, because to compute any output I still need to feed a value into the model's input tensor.
Is recreating the graph from scratch my best option here?
Thanks
If I understood you correctly, you can just specify the input and output nodes:
base_model = tf.keras.applications.ResNet50(weights='imagenet')
inference_model = tf.keras.Model(inputs=base_model.input, outputs=base_model.get_layer('any_layer_name').output)
You can set the output to any layer name
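Running inference is then a plain predict call; images here is a hypothetical batch with the usual ResNet50 input shape:
features = inference_model.predict(images)  # images: array of shape (n, 224, 224, 3)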
I want to plot the weights of tf.layers.dense in a TensorBoard histogram, but they do not show up in the parameters. How can I do that?
The weights are added as a variable named kernel, so you could use:
import os
import tensorflow as tf

x = tf.layers.dense(...)  # the dense layer whose kernel we want
weights = tf.get_default_graph().get_tensor_by_name(
    os.path.split(x.name)[0] + '/kernel:0')
You can obviously replace tf.get_default_graph() by any other graph you are working in.
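From there, attaching the kernel to a TensorBoard histogram is one line of TF1-style summary code (the summary name is arbitrary):
tf.summary.histogram('dense_kernel', weights)
merged = tf.summary.merge_all()  # merge and write summaries as usual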
I came across this problem and just solved it. The name of tf.layers.dense does not have to be the same as the prefix of its kernel's name. My tensor is "dense_2/xxx" but its kernel is "dense_1/kernel:0". To ensure tf.get_variable works, you'd better set name=xxx in the tf.layers.dense call so that the two names share the same prefix. It works as in the demo below:
l = tf.layers.dense(input_tf_xxx, 300, name='ip1')
with tf.variable_scope('ip1', reuse=True):
    w = tf.get_variable('kernel')
By the way, my tf version is 1.3.
The latest tensorflow layers api creates all the variables using the tf.get_variable call. This ensures that if you wish to use the variable again, you can just use the tf.get_variable function and provide the name of the variable that you wish to obtain.
In the case of a tf.layers.dense, the variable is created as: layer_name/kernel. So, you can obtain the variable by saying:
with tf.variable_scope("layer_name", reuse=True):
    # do not specify the shape here or it will confuse tensorflow into creating a new one
    weights = tf.get_variable("kernel")
[Edit]: The new version of TensorFlow now has both functional and object-oriented interfaces to the layers API. If you need the layers only for computational purposes, using the functional API is a good choice; the function names start with lowercase letters, for instance tf.layers.dense(...). The layer objects are created with a capital first letter, e.g. tf.layers.Dense(...). Once you have a handle to such a layer object, you can use all of its functionality. For obtaining the weights, just use obj.trainable_weights; this returns a list of all the trainable variables found in that layer's scope.
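A short sketch of that object-oriented style; the layer name 'ip2' is hypothetical and input_tf_xxx is the placeholder reused from the earlier snippet:
dense_layer = tf.layers.Dense(300, name='ip2')  # layer object, capital D
out = dense_layer(input_tf_xxx)                 # calling it builds the layer and creates its variables
kernel, bias = dense_layer.trainable_weights    # list of the layer's trainable variables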
I am going crazy with tensorflow.
I run this:
sess.run(x.kernel)
after training, and I get the weights.
It comes from the properties described here.
I say I am going crazy because it seems there are a million slightly different ways to do something in tf, and that fragments the tutorials.
Is there anything wrong with
model.get_weights()
After I create a model, compile it and run fit, this function returns a list of numpy arrays with the weights for me.
In TF2, if you're inside a @tf.function (graph mode):
weights = optimizer.weights
If you're in eager mode (the default in TF2, except inside @tf.function-decorated functions):
weights = optimizer.get_weights()
In TF2, weights will output a list of length 2:
weights_out[0] = kernel weights
weights_out[1] = bias weights
For example, the weights of the second layer (layers[0] is the input layer, which has no weights) in a model with layer size 50 and input size 784:
inputs = keras.Input(shape=(784,), name="digits")
x = layers.Dense(50, activation="relu", name="dense_1")(inputs)
x = layers.Dense(50, activation="relu", name="dense_2")(x)
outputs = layers.Dense(10, activation="softmax", name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(...)
model.fit(...)
kernel_weight = model.layers[1].weights[0]
bias_weight = model.layers[1].weights[1]
all_weight = model.layers[1].weights
print(len(all_weight)) # 2
print(kernel_weight.shape) # (784,50)
print(bias_weight.shape) # (50,)
Try making a loop to get the weights of each layer in your sequential network, printing the name of each layer first, which you can get from:
model.summary()
Then you can get the weights of each layer by running this code:
for layer in model.layers:
    print(layer.name)
    print(layer.get_weights())