I've been trying to build a Sequential model in Keras that uses tf.nn.fractional_max_pool as its pooling layer. I know I could try writing my own custom layer in Keras, but I'm trying to see if I can use the operation already in TensorFlow. For the following code snippet:
p_ratio=[1.0, 1.44, 1.44, 1.0]
model = Sequential()
model.add(ZeroPadding2D((2,2), input_shape=(1, 48, 48)))
model.add(Conv2D(320, (3, 3), activation=PReLU()))
model.add(ZeroPadding2D((1,1)))
model.add(Conv2D(320, (3, 3), activation=PReLU()))
model.add(InputLayer(input_tensor=tf.nn.fractional_max_pool(model.layers[3].output, p_ratio)))
I get this error. I've tried some other things with Input instead of InputLayer and also the Keras Functional API but so far no luck.
Got it to work. For future reference, this is how you would need to implement it. Since tf.nn.fractional_max_pool returns 3 tensors, you need to get the first one only:
model.add(InputLayer(input_tensor=tf.nn.fractional_max_pool(model.layers[3].output, p_ratio)[0]))
Or using Lambda layer:
def frac_max_pool(x):
    return tf.nn.fractional_max_pool(x, p_ratio)[0]
With the model implementation being:
model.add(Lambda(frac_max_pool))
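For reference, a complete minimal sketch of the Lambda approach (assuming TF 2.x and a channels-last 48x48 grayscale input; the PReLU activations from the question are swapped for plain ReLU to keep it short):
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, ZeroPadding2D, Lambda

p_ratio = [1.0, 1.44, 1.44, 1.0]

def frac_max_pool(x):
    # fractional_max_pool returns (output, row_seq, col_seq); keep the output only
    return tf.nn.fractional_max_pool(x, p_ratio)[0]

model = Sequential()
model.add(ZeroPadding2D((2, 2), input_shape=(48, 48, 1)))
model.add(Conv2D(320, (3, 3), activation='relu'))
model.add(Lambda(frac_max_pool))
model.summary()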
I'm working on a toy Keras/Tensorflow project targeting the MNIST dataset. I want to build something akin to a 2D convolutional network, but instead of a stack of filters, I want to produce a dense vector representation.
Here is an example of a model that I used to create an autoencoder for a 3x3 sub-sample of the input:
model = Sequential()
model.add(Flatten(input_shape=(3, 3)))
model.add(Dense(32, activation='elu'))
model.add(Dense(4, activation='elu'))
model.add(Dense(32, activation='elu'))
model.add(Dense(9, activation='sigmoid'))
model.add(Reshape((3, 3)))
Using this model, I know that the topology is close to what I want for my 3x3 kernel. What I am trying to figure out is how to replicate/tile the first three layers of this model over my 2D image. I would like to have all of the features of the Conv2D layer such as strides/padding, but it's not clear to me if/how I could replace the kernel of that layer with an entire multi-layer "sub-model".
Some properties that I would like:
The "kernel" needs to be shared across the tiled instances so that we only have to train a single kernel.
However we define this kernel, it would be nice if it could be expressed in Keras layers.
It has all of the sampling features of Conv2D, like padding/strides/dilation.
Some things I have tried:
Keras Conv2D custom kernel initialization - seems to require the kernel to be reduced to a single tensor?
Using K.tile, but that seems to require reimplementing large parts of Conv2D, and it's not clear whether the variables that are created would be shared or new instances
You're in luck, because there's a TensorFlow function that does exactly what you want. You're looking for tf.image.extract_patches. You can just put it in a tf.keras.layers.Lambda layer to wrap it in a tf.keras.layers.Layer. A cleaner way is to subclass tf.keras.layers.Layer yourself, but that takes slightly more effort. More info on how to do that can be found in the docs for tf.keras.layers.Lambda.
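To sketch how this answers the question (assuming a 28x28x1 MNIST-shaped input; the Dense sizes are borrowed from the autoencoder above): extract the 3x3 patches with a Lambda layer, then apply Dense layers to the result. Because a Dense layer applied to a 4D tensor acts on the last axis with a single shared weight set, the multi-layer "kernel" is automatically tiled over the image:
import tensorflow as tf
from tensorflow.keras import layers, models

inp = layers.Input(shape=(28, 28, 1))
# Each output position holds one flattened 3x3 patch (9 values for 1 channel)
patches = layers.Lambda(lambda x: tf.image.extract_patches(
    images=x,
    sizes=[1, 3, 3, 1],
    strides=[1, 1, 1, 1],
    rates=[1, 1, 1, 1],      # rates > 1 would give dilation
    padding='VALID'))(inp)   # -> (batch, 26, 26, 9)
# Shared "kernel": the same Dense weights are applied at every position
x = layers.Dense(32, activation='elu')(patches)
x = layers.Dense(4, activation='elu')(x)
model = models.Model(inp, x)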
I want to use ResNet50 with Imagenet weights.
The last layer of ResNet50 is (from here)
x = layers.Dense(1000, activation='softmax', name='fc1000')(x)
I need to keep the weights of this layer but remove the softmax function.
I want to manually change it so my last layer looks like this
x = layers.Dense(1000, name='fc1000')(x)
but the weights stay the same.
Currently I call my net like this
resnet = Sequential([
    Input(shape=(224,224,3)),
    ResNet50(weights='imagenet', input_shape=(224,224,3))
])
I need the Input layer because otherwise model.compile complains that placeholders aren't filled.
Generally there are two ways of achieving this:
Quick way - supported functions:
To change the final layer's activation function, you can pass the classifier_activation argument.
So in order to get rid of the activation altogether, the model can be constructed like this:
import tensorflow as tf

resnet = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224,224,3)),
    tf.keras.applications.ResNet50(
        weights='imagenet',
        input_shape=(224,224,3),
        pooling="avg",
        classifier_activation=None
    )
])
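With classifier_activation=None the model now outputs raw logits; if probabilities are still needed downstream, the softmax can be applied outside the model. A quick sketch (the random batch here is just a stand-in for real preprocessed images):
import numpy as np

images = np.random.rand(2, 224, 224, 3).astype("float32")  # stand-in batch
logits = resnet(images)        # raw fc1000 scores, no activation
probs = tf.nn.softmax(logits)  # probabilities, if you need them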
This, however, is not going to work if you want a different function that is not supported by the Keras classifier_activation parameter (e.g. a custom activation function).
To achieve this you can use the workaround solution:
Long way - copy the model's weights
This solution proposes copying the original model's weights onto your custom one. This approach works because, apart from the activation function, you are not changing the model's architecture.
You need to:
1. Download original model.
2. Save its weights.
3. Declare your modified version of the model (in your case, without the activation function).
4. Set the weights of the new model.
The snippet below illustrates this concept in more detail:
import tensorflow as tf

# 1. Download the original resnet
resnet = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224,224,3)),
    tf.keras.applications.ResNet50(
        weights='imagenet',
        input_shape=(224,224,3),
        pooling="avg"
    )
])

# 2. Hold the weights in memory:
imagenet_weights = resnet.get_weights()

# 3. Declare the model, but without softmax
resnet_no_softmax = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224,224,3)),
    tf.keras.applications.ResNet50(
        include_top=False,
        weights='imagenet',
        input_shape=(224,224,3),
        pooling="avg"
    ),
    tf.keras.layers.Dense(1000, name='fc1000')
])

# 4. Pass the ImageNet weights onto the second resnet
resnet_no_softmax.set_weights(imagenet_weights)
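As an optional sanity check (a sketch with a random input, just to confirm the copy worked), the new model's logits should turn back into the original model's softmax outputs:
import numpy as np

x = np.random.rand(1, 224, 224, 3).astype("float32")
logits = resnet_no_softmax.predict(x)    # raw fc1000 outputs, no activation
probs = resnet.predict(x)                # original model, softmax applied
print(np.allclose(tf.nn.softmax(logits), probs, atol=1e-5))  # expect True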
Hope this helps!
Hi, I am having some serious problems saving and loading a TensorFlow model which is a combination of Hugging Face transformers + some custom layers to do classification. I am using the latest Hugging Face transformers TensorFlow Keras version. The idea is to extract features using DistilBERT and then run the features through a CNN to do classification and extraction. I have got everything to work as far as getting the correct classifications.
The problem is in saving the model once trained and then loading the model again.
I am using tf.keras with TensorFlow version 2.2.
Following is the code to design the model, train it, evaluate it, and then save and load it:
bert_config = DistilBertConfig(dropout=0.2, attention_dropout=0.2, output_hidden_states=False)
bert_config.output_hidden_states = False
transformer_model = TFDistilBertModel.from_pretrained(DISTIL_BERT, config=bert_config)
input_ids_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='input_token', dtype='int32')
input_masks_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='masked_token', dtype='int32')
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
x = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(50, return_sequences=True, dropout=0.1,
                         recurrent_dropout=0, recurrent_activation="sigmoid",
                         unroll=False, use_bias=True, activation="tanh"))(embedding_layer)
x = tf.keras.layers.GlobalMaxPool1D()(x)
outputs = []
# lots of code here to define the dense layers to generate the outputs
# .....
# .....
model = Model(inputs=[input_ids_in, input_masks_in], outputs=outputs)
for model_layer in model.layers[:3]:
    logger.info(f"Setting layer {model_layer.name} to not trainable")
    model_layer.trainable = False
rms_optimizer = RMSprop(learning_rate=0.001)
model.compile(loss=SigmoidFocalCrossEntropy(), optimizer=rms_optimizer)
# the code to fit the model (which works)
# then code to evaluate the model (which also works)
# finally saving the model. This too works.
tf.keras.models.save_model(model, save_url, overwrite=True, include_optimizer=True, save_format="tf")
However, when I try to load the saved model using the following
tf.keras.models.load_model(
    path, custom_objects={"Addons>SigmoidFocalCrossEntropy": SigmoidFocalCrossEntropy})
I get the following load error
ValueError: The two structures don't have the same nested structure.
First structure: type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')
Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}
More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')" is not
Entire first structure:
.
Entire second structure:
{'input_ids': .}
I believe the issue is that the TFDistilBertModel layer can be called with a dictionary input (produced by DistilBertTokenizer.encode()), and it happens to be the first layer. So on load, the model expects that dictionary to be the input signature of the model's call. However, the inputs defined for the model are two tensors of shape (None, 128).
So how do I tell the load function or the save function to assume the correct signatures?
I solved the issue.
The issue was that the object transformer_model in the code above is itself not a layer. So if we want to embed it inside another Keras model, we should use the internal Keras layer that is wrapped in the model.
So changing the line
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
to
embedding_layer = transformer_model.distilbert([input_ids_in, input_masks_in])[0]
makes everything work. Hope this helps someone else. It took a long time of debugging through the tf.keras code to figure this one out, although in hindsight it is obvious. :)
I ran into the same problem, coincidentally, just yesterday. My solution is very similar to yours. I suspected that the problem was due to how TensorFlow Keras processes custom models, so the idea was to use the layers of the custom model inside my model. This has the advantage of not calling the layer explicitly by its name (in my case, this is useful for easily building more generic models with different pretrained encoders):
sent_encoder = getattr(transformers, self.model_name).from_pretrained(self.shortcut_weights).layers[0]
I haven't explored all the models of Hugging Face, but the few that I tested seem to be custom models with only one custom layer.
Your solution also works like a charm; in fact, both solutions are the same if "distilbert" refers to ".layers[0]".
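For illustration, a hypothetical concrete instantiation of that generic pattern (the model class and checkpoint names below are just examples, not values from the original code):
import transformers

# stand-ins for self.model_name / self.shortcut_weights
encoder_cls = getattr(transformers, "TFDistilBertModel")
sent_encoder = encoder_cls.from_pretrained("distilbert-base-uncased").layers[0]
# sent_encoder is the inner "distilbert" layer; it can be called directly
# inside a Keras functional model, e.g. sent_encoder([input_ids, masks])[0]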
I have followed the TensorFlow Layers tutorial to create a CNN for MNIST digit classification using TensorFlow's tf.layers module. Now I'm trying to learn how to use TensorBoard from TensorBoard: Visualizing Learning. Perhaps this tutorial hasn't been updated recently, because it says its example code is a modification of that tutorial's and links to it, but the code is completely different: it manually defines a single-hidden-layer fully-connected network.
The TensorBoard tutorial shows how to use tf.summary to attach summaries to a layer by creating operations on the layer's weights tensor, which is directly accessible because we manually defined the layer, and attaching tf.summary objects to those operations. To do this if I'm using tf.layers and its tutorial code, I believe I'd have to:
Modify the Layers tutorial's example code to use the non-functional interface (Conv2D instead of conv2d and Dense instead of dense) to create the layers
Use the layer objects' trainable_weights property to get the weight tensors and attach tf.summary objects to those
Is that the best way to use TensorBoard with tf.layers, or is there a way that's more directly compatible with tf.layers and the functional interface? If so, is there an updated official TensorBoard tutorial? It would be nice if the documentation and tutorials were more unified.
You should be able to use the output of your tf.layers call to get the activations. Taking the first convolutional layer of the linked layers tutorial:
# Convolutional Layer #1
conv1 = tf.layers.conv2d(
    inputs=input_layer,
    filters=32,
    kernel_size=[5, 5],
    padding="same",
    activation=tf.nn.relu)
You could do:
tensor_name = conv1.op.name
tf.summary.histogram(tensor_name + '/activation', conv1)
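To actually see the histogram in TensorBoard you still have to merge and write the summaries; a minimal sketch of that bookkeeping (assuming the TF1 graph-mode setup of the tutorial):
merged = tf.summary.merge_all()
writer = tf.summary.FileWriter('/tmp/mnist_logs', tf.get_default_graph())
# inside your training loop:
#   summary = sess.run(merged, feed_dict=...)  # feed inputs as in the tutorial
#   writer.add_summary(summary, global_step=step)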
Not sure if this is the best way, but I believe it is the most direct way of doing what you want.
Hope this helps!
You can use something like this
with tf.name_scope('dense2'):
    preds = tf.layers.dense(inputs=dense1, units=12,
                            activation=tf.nn.sigmoid, name="dense2")
    d2_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'dense2')
    tf.summary.histogram("weights", d2_vars[0])
    tf.summary.histogram("biases", d2_vars[1])
    tf.summary.histogram("activations", preds)
Another option is to use tf.layers.Dense instead of tf.layers.dense (the difference is d versus D).
The paradigm for Dense is:
x = tf.placeholder(tf.float32, shape=[None, 100])
dlayer = tf.layers.Dense(hidden_unit)
y = dlayer(x)
With dlayer as an intermediate, you're able to do:
k = dlayer.kernel
b = dlayer.bias
k_and_b = dlayer.weights
Note that you won't get the dlayer.kernel until you apply y = dlayer(x).
Things are similar for other layers such as convolution layer. Check them with any available auto-completion.
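Putting that together, a minimal sketch (TF1-style graph mode; the names are illustrative) that attaches histograms to the tensors the Dense object exposes:
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 100])
dlayer = tf.layers.Dense(64, name='dense_demo')
y = dlayer(x)  # the layer's variables are created on this first call
tf.summary.histogram('dense_demo/kernel', dlayer.kernel)
tf.summary.histogram('dense_demo/bias', dlayer.bias)
tf.summary.histogram('dense_demo/activations', y)
merged = tf.summary.merge_all()  # evaluate and write with a FileWriter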
I made a model using Keras with TensorFlow. I used InputLayer with these lines of code:
img1 = tf.placeholder(tf.float32, shape=(None, img_width, img_heigh, img_ch))
first_input = InputLayer(input_tensor=img1, input_shape=(img_width, img_heigh, img_ch))
first_dense = Conv2D(16, 3, 3, activation='relu', border_mode='same', name='1st_conv1')(first_input)
But I get this error:
ValueError: Layer 1st_conv1 was called with an input that isn't a symbolic tensor. Received type: <class 'keras.engine.topology.InputLayer'>. Full input: [<keras.engine.topology.InputLayer object at 0x00000000112170F0>]. All inputs to the layer should be tensors.
When I use Input like this, it works fine:
first_input = Input(tensor=img1, shape=(224, 224, 3), name='1st_input')
first_dense = Conv2D(16, 3, 3, activation='relu', border_mode='same', name='1st_conv1')(first_input)
What is the difference between InputLayer and Input?
InputLayer is a layer.
Input is a tensor.
You can only call layers passing tensors to them.
The idea is:
outputTensor = SomeLayer(inputTensor)
So, only Input can be passed because it's a tensor.
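For instance, a minimal sketch with made-up shapes:
from keras.layers import Input, Dense
from keras.models import Model

inputTensor = Input(shape=(16,))       # a tensor, not a layer
outputTensor = Dense(4)(inputTensor)   # layers are called on tensors
model = Model(inputTensor, outputTensor)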
Honestly, I have no idea about the reason for the existence of InputLayer. Maybe it's supposed to be used internally. I never used it, and it seems I'll never need it.
According to the TensorFlow website, "It is generally recommend[ed] to use the functional layer API via Input (which creates an InputLayer) without directly using InputLayer."
See the page linked here to know more.
Input: Used for creating a functional model
inp = tf.keras.Input(shape=[?, ?, ?])
x = layers.Conv2D(.....)(inp)
InputLayer: Used for creating a Sequential model
x = tf.keras.Sequential()
x.add(tf.keras.layers.InputLayer(input_shape=[?, ?, ?]))
And the other difference is that when using InputLayer with the Keras Sequential model, it can be skipped by moving the input_shape parameter to the first layer after the InputLayer. That is, in a Sequential model you can skip the InputLayer and specify the shape directly in the first layer, i.e. going from this:
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(4,)),
    tf.keras.layers.Dense(8)])
To this:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, input_shape=(4,))])
To define it in simple words:
keras.layers.Input is used to instantiate a Keras tensor. In this case, your data is probably not yet a tf tensor, but maybe an np array.
On the other hand, keras.layers.InputLayer is a layer for data that is already defined as one of the tf tensor types, i.e. it can be a ragged tensor, a constant, or another type.
I hope this helps!