Tensorflow concatenate layers not recogniing list of inputs - tensorflow

I am trying to create a tensorflow model using keras.Sequential() as follows:
nn_1 = tf.keras.Sequential() model
nn_1 = tf.keras.Sequential() model
model = tf.keras.Sequential()
model.add(tf.keras.layers.concatenate([nn_1, nn_2]))
When I try this, I get the following error:
ValueError: A `Concatenate` layer should be called on a list of at least 2 inputs
I don't understand why this is happening, since I am passing a list of 2 inputs to concatenate() . Am I missing something basic?
P.S. The same thing happened with dot() (which is what i was initially trying)

Related

Tensorflow Keras output layer shape weird error

I am fairly new to TF, Keras and ML in general.
I am trying to implement a very simple MLP with an input shape of (batch_size,3,2) and an output shape of (batch_size,3), that is (if I got it right): for every 3x2 feature, there is a corresponding 3 value array label.
Here is how I create the model:
model = tf.keras.Sequential([
tf.keras.layers.Dense(50,tf.keras.activations.relu,input_shape=((3,2)),
tf.keras.layers.Dense(3)
])
and these are the X and y shapes:
X_train.shape,y_train.shape
TensorShape([64,3,2]),TensorShape([64,3])
On model.fit I am facing a weird error I cannot understand:
ValueError: Dimensions must be equal, but are 3 and 32 for ... with input shapes: [32,3,3] and [32,3]
I have no clue what's going on, I understand the batch size is 32, but where does that [32,3,3] comes from?
Moreover, if from the original 64, I lower the number (shapes) of X_train and y_train, say, to: (19,3,2) and (19,3), I get the following error instead:
InvalidArgumentError: required broadcastable shapes at loc(unknown)
What's even more weird for me is that if I specify a single unit for the output (last) layer, instead of 3 like this:
model = tf.keras.Sequential([
tf.keras.layers.Dense(50,tf.keras.activations.relu,input_shape=((3,2)),
tf.keras.layers.Dense(1)
])
model.fit works, but the predictions have shape (1,3,1) instead of my expected (3,)
I am very confused.
Whenever you have not any idea about the journey of data throughout your model, use model.summary() to see the details and what happens to the shape of data in each layer.
In this case, the input is a 2D array, and the output is a 1D array, and you just used dense layers. Dense layers can not handle 2d features in nature. For example for an image as input, you can not feed it directly to a dense layer. Instead you should use other layers such as Conv2D or Flatten your input (make it 1D) before feeding your data to the dense layer. Otherwise you will get the other dimension in the output.
Inference: If your input dimension and output dimension differs, somewhere in your model, the shape need to be changed. Most common ways to do so, is using a Flatten layer or GlobalAveragePooling and so on.
When you pass an input to a dense layer, the input should be flattened first. There are 2 ways to deal with this:
Way 1: Adding a flatten input as a first layer of your model:
model = Sequential()
model.add(Flatten(input_shape=(3,2)))
model.add(Dense(50, 'relu'))
model.add(Dense(3))
Way 2: Converting the 2D array to 1D before passing the inputs to your model:
X_train = tf.reshape(X_train, shape=([6]))
or
X_train = tf.reshape(X_train, shape=((6,)))
Then change the input shape of the first layer as:
model.add(Dense(50, 'relu', input_shape=(6,))

Keras remove activation function of last layer

I want to use ResNet50 with Imagenet weights.
The last layer of ResNet50 is (from here)
x = layers.Dense(1000, activation='softmax', name='fc1000')(x)
I need to keep the weights of this layer but remove the softmax function.
I want to manually change it so my last layer looks like this
x = layers.Dense(1000, name='fc1000')(x)
but the weights stay the same.
Currently I call my net like this
resnet = Sequential([
Input(shape(224,224,3)),
ResNet50(weights='imagenet', input_shape(224,224,3))
])
I need the Input layer because otherwise the model.compile says that placeholders aren't filled.
Generally there are two ways of achievieng this:
Quick way - supported functions:
To change the final layer's activation function, you can pass an argument classifier_activation.
So in order to get rid of activation all together, your module can be called like:
import tensorflow as tf
resnet = tf.keras.Sequential([
tf.keras.layers.Input(shape=(224,224,3)),
tf.keras.applications.ResNet50(
weights='imagenet',
input_shape=(224,224,3),
pooling="avg",
classifier_activation=None
)
])
This however, is not going to work if the you want a different function, that is not supported by Keras classifer_activation parameter (e. g. custom activation function).
To achieve this you can use the workaround solution:
Long way - copy the model's weights
This solution proposes copying the original model's weights onto your custom one. This approach works because apart from the activation function you are not chaning the model's architecture.
You need to:
1. Download original model.
2. Save it's weights.
3. Declare your modified version of the model (in your case, without the activation function).
4. Set the weights of the new model.
Below snippet explains this concept in more detail:
import tensorflow as tf
# 1. Download original resnet
resnet = tf.keras.Sequential([
tf.keras.layers.Input(shape=(224,224,3)),
tf.keras.applications.ResNet50(
weights='imagenet',
input_shape=(224,224,3),
pooling="avg"
)
])
# 2. Hold weights in memory:
imagenet_weights = resnet.get_weights()
# 3. Declare the model, but without softmax
resnet_no_softmax = tf.keras.Sequential([
tf.keras.layers.Input(shape=(224,224,3)),
tf.keras.applications.ResNet50(
include_top=False,
weights='imagenet',
input_shape=(224,224,3),
pooling="avg"
),
tf.keras.layers.Dense(1000, name='fc1000')
])
# 4. Pass the imagenet weights onto the second resnet
resnet_no_softmax.set_weights(imagenet_weights)
Hope this helps!

Isues with saving and loading tensorflow model which uses hugging face transformer model as its first layer

Hi I am having some serious problems saving and loading a tensorflow model which is combination of hugging face transformers + some custom layers to do classfication. I am using the latest Huggingface transformers tensorflow keras version. The idea is to extract features using distilbert and then run the features through CNN to do classification and extraction. I have got everything to work as far as getting the correct classifications.
The problem is in saving the model once trained and then loading the model again.
I am using tensorflow keras and tensorflow version 2.2
Following is the code to design the model, train it, evaluate it and then save and load it
bert_config = DistilBertConfig(dropout=0.2, attention_dropout=0.2, output_hidden_states=False)
bert_config.output_hidden_states = False
transformer_model = TFDistilBertModel.from_pretrained(DISTIL_BERT, config=bert_config)
input_ids_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='input_token', dtype='int32')
input_masks_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='masked_token', dtype='int32')
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
x = tf.keras.layers.Bidirectional(
tf.keras.layers.LSTM(50, return_sequences=True, dropout=0.1,
recurrent_dropout=0, recurrent_activation="sigmoid",
unroll=False, use_bias=True, activation="tanh"))(embedding_layer)
x = tf.keras.layers.GlobalMaxPool1D()(x)
outputs = []
# lots of code here to define the dense layers to generate the outputs
# .....
# .....
model = Model(inputs=[input_ids_in, input_masks_in], outputs=outputs)
for model_layer in model.layers[:3]:
logger.info(f"Setting layer {model_layer.name} to not trainable")
model_layer.trainable = False
rms_optimizer = RMSprop(learning_rate=0.001)
model.compile(loss=SigmoidFocalCrossEntropy(), optimizer=rms_optimizer)
# the code to fit the model (which works)
# then code to evaluate the model (which also works)
# finally saving the model. This too works.
tf.keras.models.save_model(model, save_url, overwrite=True, include_optimizer=True, save_format="tf")
However, when I try to load the saved model using the following
tf.keras.models.load_model(
path, custom_objects={"Addons>SigmoidFocalCrossEntropy": SigmoidFocalCrossEntropy})
I get the following load error
ValueError: The two structures don't have the same nested structure.
First structure: type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')
Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}
More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')" is not
Entire first structure:
.
Entire second structure:
{'input_ids': .}
I believe the issue is because TFDistilBertModel layer can be called using a dictionary input from DistilBertTokenizer.encode() and that happens to be the first layer. So the model compiler on load expects that to be the input signature to the call model. However, the inputs defined to the model are two tensors of shape (None, 128)
So how do I tell the load function or the save function to assume the correct signatures?
I solved the issue.
The issue was the object transformer_model in the above code is itself not a layer. So if we want to embed it inside another keras layer we should use the internal keras layer that is wrapped in the model
So changing the line
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in[0]
to
embedding_layer = transformer_model.distilbert([input_ids_in, input_masks_in])[0]
makes everything work. Hope this helps someone else. Took a long time to debug through tf.keras code to figure this one out although in hindsight it is obvious. :)
I suffered the same problem, casually, yesterday. My solution is very similar to yours, I supposed that the problem was due to how tensorflow keras processes custom models so, the idea was to use the layers of the custom model inside my model. This has the advantage of not calling explicitly the layer by its name (in my case, it is useful for easy building more generic models using different pretrained encoders):
sent_encoder = getattr(transformers, self.model_name).from_pretrained(self.shortcut_weights).layers[0]
I don't explored all the models of HuggingFace, but a few that I tested seem to be a custom model with only one custom layer.
Your solution also works like a charm, in fact, both solutions are the same if "distilbert" references to ".layers[0]".

Can't build model with DenseFeature input layer, get "'DenseFeatures' object has no attribute 'shape'"

I'm trying to build a Keras model using a DenseFeatures layer as the input; the input comes as a dict of Tensors. TF is insisting that I use model.build() to build the model before optimization, but I can't build it due to DenseFeatures not having an input shape. I get the error
AttributeError: 'DenseFeatures' object has no attribute 'shape'
How can I resolve this? Here's my code:
input_layer = tf.keras.layers.DenseFeatures(params.columns)
predictions = Dense(1, input_dim=len(params.columns), activation='softmax')(input_layer)
model = Sequential([input_layer, predictions])
model.build()
ETA: some further information for insight: I'm not actually fitting the model with this code; rather, I'm creating an EstimatorSpec to use with a Sagemaker model (thus it seems like this will probably require a bit of weird footwork between two different paradigms.)

Tensorflow Keras - feeding input to multiple model layers in parallel

With tensorflow.keras (Tensorflow 2), I want to feed my input into different layers of my model. So we are looking at a graph where the input layers branches off into 3 lines to go to 3 different convolutional layers. It has 3 outputs.
Pseudocode is something like this:
inputs = Input()
conv1 = Conv2D()(inputs)
conv2 = Conv2D()(inputs)
conv3 = Conv2D()(inputs)
model = Model(inputs=inputs, outputs=[conv1, conv2, conv3])
But I'm getting the following error when I try to fit the model with a tf DataSet stream:
ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 3 array(s), for inputs ['conv2d_1', 'conv2d_2', 'conv2d_3'] but instead got the following list of 1 arrays: [<tf.Tensor 'ExpandDims:0' shape=(None, 1) dtype=int32>]
I have verified that my code works fine if I comment out the branches and set outputs=conv1.
Note: I am not trying to feed in multiple different inputs (there are many questions and answers on here that solve this). Just one input which should branch off.
Issue solved. I should be providing a 3-array of labels.