Concatenate two output layers of the same dimension - TensorFlow

I have two hidden layers of dimension 50 from two different autoencoder models; the shape is [None, 50] for both. But when I execute the following code:
concat_layer = Concatenate()([_1.layers[7], _2.layers[11]])
softmax_layer = keras.layers.Dense(2, activation='softmax')(concat_layer)
sum_model = keras.models.Model(inputs=[_1_x_train, _2_x_train], outputs=softmax_layer)
sum_model.compile(optimizer='Adam', loss='mse')
I get the error TypeError: 'NoneType' object is not subscriptable on the line Concatenate()([_1.layers[7], _2.layers[11]])
Edit: Here is the layer structure of the two models.
_1 summary:
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 500)] 0
_________________________________________________________________
dense (Dense) (None, 250) 125250
_________________________________________________________________
dropout (Dropout) (None, 250) 0
_________________________________________________________________
dense_1 (Dense) (None, 100) 25100
_________________________________________________________________
dropout_1 (Dropout) (None, 100) 0
_________________________________________________________________
dense_2 (Dense) (None, 50) 5050
_________________________________________________________________
dropout_2 (Dropout) (None, 50) 0
_________________________________________________________________
dense_3 (Dense) (None, 50) 2550
_________________________________________________________________
dense_4 (Dense) (None, 100) 5100
_________________________________________________________________
dense_5 (Dense) (None, 250) 25250
_________________________________________________________________
dense_6 (Dense) (None, 500) 125500
=================================================================
_2 summary:
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 24765)] 0
_________________________________________________________________
dense (Dense) (None, 5000) 123830000
_________________________________________________________________
dropout (Dropout) (None, 5000) 0
_________________________________________________________________
dense_1 (Dense) (None, 2500) 12502500
_________________________________________________________________
dropout_1 (Dropout) (None, 2500) 0
_________________________________________________________________
dense_2 (Dense) (None, 1000) 2501000
_________________________________________________________________
dropout_2 (Dropout) (None, 1000) 0
_________________________________________________________________
dense_3 (Dense) (None, 500) 500500
_________________________________________________________________
dense_4 (Dense) (None, 250) 125250
_________________________________________________________________
dense_5 (Dense) (None, 100) 25100
_________________________________________________________________
dense_6 (Dense) (None, 50) 5050
_________________________________________________________________
dense_7 (Dense) (None, 50) 2550
_________________________________________________________________
dense_8 (Dense) (None, 100) 5100
_________________________________________________________________
dense_9 (Dense) (None, 250) 25250
_________________________________________________________________
dense_10 (Dense) (None, 500) 125500
_________________________________________________________________
dense_11 (Dense) (None, 1000) 501000
_________________________________________________________________
dense_12 (Dense) (None, 2500) 2502500
_________________________________________________________________
dense_13 (Dense) (None, 5000) 12505000
_________________________________________________________________
dense_14 (Dense) (None, 24765) 123849765
=================================================================

This error indicates that something is trying to take a subscript (object[index]) of an object which is None.
The inputs to a tf.keras.layers.Concatenate() layer should be tensors, but you have passed layer instances. Instead of passing the layers, pass their outputs like this:
concat_layer = Concatenate()([_1.layers[7].output, _2.layers[11].output])
In addition, your model definition should change, since you have passed input data as inputs instead of the models' input tensors. Get a model's input tensor via model.input. The modified code should look like this:
#sum_model = keras.models.Model(inputs=[_1_x_train, _2_x_train], outputs=softmax_layer)
sum_model = keras.models.Model(inputs=[_1.input, _2.input], outputs=softmax_layer)
You should pass input data to model.fit().
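Putting it together, a minimal sketch (assuming _1 and _2 are the two loaded autoencoder models, _1_x_train and _2_x_train are the corresponding training arrays, and y_train is a placeholder for your labels):
from tensorflow import keras
from tensorflow.keras.layers import Concatenate, Dense

# Use the symbolic outputs of the two 50-unit layers, not the layer objects.
concat_layer = Concatenate()([_1.layers[7].output, _2.layers[11].output])
softmax_layer = Dense(2, activation='softmax')(concat_layer)

# The model's inputs are the models' input tensors; the data goes to fit().
sum_model = keras.models.Model(inputs=[_1.input, _2.input], outputs=softmax_layer)
sum_model.compile(optimizer='Adam', loss='mse')
sum_model.fit([_1_x_train, _2_x_train], y_train)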

Related

Employing LeakyReLU as the activation function of my CNN model causes 'nan' loss during training?

When I change my CNN model's activation from ReLU to LeakyReLU, both training and validation losses become nan. How can I resolve this issue?
Here is my model's summary:
Shape of all data: (1889, 10801)
Shape of X_train: (1322, 10800, 1)
Shape of Y_train: (1322, 3)
Shape of X_test: (567, 10800, 1)
Shape of y_test: (567, 3)
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d_48 (Conv1D) (None, 10721, 128) 10368
_________________________________________________________________
batch_normalization_48 (Batc (None, 10721, 128) 512
_________________________________________________________________
activation_48 (Activation) (None, 10721, 128) 0
_________________________________________________________________
max_pooling1d_41 (MaxPooling (None, 5360, 128) 0
_________________________________________________________________
dropout_29 (Dropout) (None, 5360, 128) 0
_________________________________________________________________
conv1d_49 (Conv1D) (None, 5357, 128) 65664
_________________________________________________________________
batch_normalization_49 (Batc (None, 5357, 128) 512
_________________________________________________________________
activation_49 (Activation) (None, 5357, 128) 0
_________________________________________________________________
max_pooling1d_42 (MaxPooling (None, 2678, 128) 0
_________________________________________________________________
dropout_30 (Dropout) (None, 2678, 128) 0
_________________________________________________________________
conv1d_50 (Conv1D) (None, 2675, 128) 65664
_________________________________________________________________
batch_normalization_50 (Batc (None, 2675, 128) 512
_________________________________________________________________
activation_50 (Activation) (None, 2675, 128) 0
_________________________________________________________________
max_pooling1d_43 (MaxPooling (None, 1337, 128) 0
_________________________________________________________________
dropout_31 (Dropout) (None, 1337, 128) 0
_________________________________________________________________
conv1d_51 (Conv1D) (None, 1334, 256) 131328
_________________________________________________________________
batch_normalization_51 (Batc (None, 1334, 256) 1024
_________________________________________________________________
activation_51 (Activation) (None, 1334, 256) 0
_________________________________________________________________
max_pooling1d_44 (MaxPooling (None, 667, 256) 0
_________________________________________________________________
global_max_pooling1d_6 (Glob (None, 256) 0
_________________________________________________________________
dense_20 (Dense) (None, 512) 131584
_________________________________________________________________
dense_21 (Dense) (None, 3) 1539
=================================================================
The model was compiled as follows:
n_lr = 1e-5
weight_decay = 1e-4
adam = Adam(lr=n_lr, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=weight_decay)
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['acc'])
P.S. I'm aware that this issue has already been reported to Keras on GitHub, but it had not received any responses as of posting this question.
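One direction worth trying (an assumption on my part, not an answer from the thread): add LeakyReLU as its own layer with a small alpha, and clip gradients in the optimizer; both commonly stop losses from going to nan. A minimal sketch of the first block of the model above:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv1D, LeakyReLU, GlobalMaxPooling1D, Dense
from tensorflow.keras.optimizers import Adam

# LeakyReLU goes in as a separate layer, not a string activation.
model = Sequential([
    Conv1D(128, kernel_size=80, input_shape=(10800, 1)),  # matches conv1d_48
    LeakyReLU(alpha=0.1),  # small negative slope
    GlobalMaxPooling1D(),
    Dense(3, activation='softmax'),
])

# clipnorm bounds the gradient norm, a common guard against nan losses.
adam = Adam(lr=1e-5, clipnorm=1.0)
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['acc'])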

Split trained autoencoder to encoder and decoder

I realize now that implementing it like this would have been a good idea. However, I have an already trained and fine-tuned autoencoder that looks like this:
Model: "autoencoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
user_input (InputLayer) [(None, 5999)] 0
_________________________________________________________________
user_e00 (Dense) (None, 64) 384000
_________________________________________________________________
user_e01 (Dense) (None, 64) 4160
_________________________________________________________________
user_e02 (Dense) (None, 64) 4160
_________________________________________________________________
user_e03 (Dense) (None, 64) 4160
_________________________________________________________________
user_out (Dense) (None, 32) 2080
_________________________________________________________________
emb_dropout (Dropout) (None, 32) 0
_________________________________________________________________
user_d00 (Dense) (None, 64) 2112
_________________________________________________________________
user_d01 (Dense) (None, 64) 4160
_________________________________________________________________
user_d02 (Dense) (None, 64) 4160
_________________________________________________________________
user_d03 (Dense) (None, 64) 4160
_________________________________________________________________
user_res (Dense) (None, 5999) 389935
=================================================================
Total params: 803,087
Trainable params: 0
Non-trainable params: 803,087
_________________________________________________________________
Now I want to split it into encoder and decoder. I believe I already found the right way for the encoder, which would be:
encoder_in = model.input
encoder_out = model.get_layer(name='user_out').output
encoder = Model(encoder_in, encoder_out, name='encoder')
For the decoder I would like to do something like:
decoder_in = model.get_layer("user_d00").input
decoder_out = model.output
decoder = Model(decoder_in, decoder_out, name='decoder')
but that throws:
WARNING:tensorflow:Functional inputs must come from `tf.keras.Input` (thus holding past layer metadata), they cannot be the output of a previous non-Input layer. Here, a tensor specified as input to "decoder" was not an Input tensor, it was generated by layer emb_dropout.
Note that input tensors are instantiated via `tensor = tf.keras.Input(shape)`.
The tensor that caused the issue was: emb_dropout/cond_3/Identity:0
I believe I have to create an Input layer with the shape of the output of emb_dropout and probably attach it to user_d00 (the Dropout layer is no longer needed now that training has ended). Does anyone know how to do this correctly?
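A minimal sketch of that idea, assuming model is the trained autoencoder above and the layer names match its summary: create a fresh Input with the bottleneck shape (32,), then re-apply the trained decoder layers to it, skipping emb_dropout (Dropout is an identity at inference anyway).
from tensorflow import keras

decoder_in = keras.Input(shape=(32,), name='decoder_in')
x = decoder_in
# Re-apply the already-trained decoder layers to the new input tensor.
for name in ('user_d00', 'user_d01', 'user_d02', 'user_d03', 'user_res'):
    x = model.get_layer(name)(x)
decoder = keras.Model(decoder_in, x, name='decoder')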

Problems with visualization classification_report

I am trying to plot a classification report, but my problem has only 2 classes (0 and 1), and when I call classification_report, this is its output:
[image: classification_report output]
My model is an LSTM with GloVe embeddings for sentiment classification; this is the architecture:
Model: "sequential_6"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_6 (Embedding) (None, 55, 300) 68299200
_________________________________________________________________
spatial_dropout1d_12 (Spatia (None, 55, 300) 0
_________________________________________________________________
lstm_12 (LSTM) (None, 55, 128) 219648
_________________________________________________________________
lstm_13 (LSTM) (None, 55, 64) 49408
_________________________________________________________________
spatial_dropout1d_13 (Spatia (None, 55, 64) 0
_________________________________________________________________
dense_18 (Dense) (None, 55, 512) 33280
_________________________________________________________________
dropout_6 (Dropout) (None, 55, 512) 0
_________________________________________________________________
dense_19 (Dense) (None, 55, 64) 32832
_________________________________________________________________
dense_20 (Dense) (None, 55, 1) 65
=================================================================
Total params: 68,634,433
Trainable params: 335,233
Non-trainable params: 68,299,200
You can have classification_report return a dict, which you can then read as a pandas DataFrame via pandas.DataFrame.from_dict(), like this:
import pandas as pd
from sklearn.metrics import classification_report
display(pd.DataFrame.from_dict(classification_report(y_true, y_pred, output_dict=True)).T)
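For a self-contained check, a toy example (the labels below are illustrative, not the asker's data):
import pandas as pd
from sklearn.metrics import classification_report

y_true = [0, 1, 1, 0, 1, 0]  # illustrative ground truth
y_pred = [0, 1, 0, 0, 1, 1]  # illustrative predictions

report = classification_report(y_true, y_pred, output_dict=True)
df = pd.DataFrame.from_dict(report).T  # transpose so each class is a row
print(df.round(2))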

Select size for output vector with 1000s of labels

Most of the examples on the Internet regarding multi-label image classification are based on just a few labels. For example, with 6 classes we get:
model = models.Sequential()
model.add(layer=base)
model.add(layer=layers.Flatten())
model.add(layer=layers.Dense(units=256, activation="relu"))
model.add(layer=layers.Dense(units=6, activation="sigmoid"))
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg16 (Model) (None, 7, 7, 512) 14714688
_________________________________________________________________
flatten_1 (Flatten) (None, 25088) 0
_________________________________________________________________
dense_1 (Dense) (None, 256) 6422784
_________________________________________________________________
dense_2 (Dense) (None, 6) 1542
=================================================================
Total params: 21,139,014
Trainable params: 13,503,750
Non-trainable params: 7,635,264
However, for datasets with significantly more labels, the number of trainable parameters explodes and eventually the training process fails with a ResourceExhaustedError. For example, with 3047 labels we get:
model = models.Sequential()
model.add(layer=base)
model.add(layer=layers.Flatten())
model.add(layer=layers.Dense(units=256, activation="relu"))
model.add(layer=layers.Dense(units=3047, activation="sigmoid"))
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg16 (Model) (None, 7, 7, 512) 14714688
_________________________________________________________________
flatten_1 (Flatten) (None, 25088) 0
_________________________________________________________________
dense_1 (Dense) (None, 256) 6422784
_________________________________________________________________
dense_2 (Dense) (None, 3047) 783079
=================================================================
Total params: 21,920,551
Trainable params: 14,285,287
Non-trainable params: 7,635,264
_________________________________________________________________
Obviously, there is something wrong with my network, but I'm not sure how to overcome this issue...
A ResourceExhaustedError is related to memory: either your system does not have enough memory (typically GPU memory) for the model and batch size, or some other part of the code is causing memory issues.
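Two common mitigations (my suggestion, not part of the original answer) are reducing the batch size and replacing Flatten with global pooling, which dramatically shrinks the first Dense layer. A sketch assuming base is the VGG16 convolutional base from above:
from tensorflow.keras import layers, models

# GlobalAveragePooling2D turns (7, 7, 512) into (512,), so the next Dense
# layer needs 512*256 weights instead of 25088*256, saving a lot of memory.
model = models.Sequential()
model.add(base)  # pretrained VGG16 convolutional base
model.add(layers.GlobalAveragePooling2D())
model.add(layers.Dense(units=256, activation="relu"))
model.add(layers.Dense(units=3047, activation="sigmoid"))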

Keras - Freezing A Model And Then Adding Trainable Layers

I am taking a pretrained CNN model and trying to implement a CNN-LSTM with parallel CNNs, all sharing the same weights from the pretraining.
# load in CNN
weightsfile = 'final_weights.h5'
modelfile = '2dcnn_model.json'
# load model from json
json_file = open(modelfile, 'r')
loaded_model_json = json_file.read()
json_file.close()
fixed_cnn_model = keras.models.model_from_json(loaded_model_json)
fixed_cnn_model.load_weights(weightsfile)
# remove the last 2 dense FC layers and freeze it
fixed_cnn_model.pop()
fixed_cnn_model.pop()
fixed_cnn_model.trainable = False
print(fixed_cnn_model.summary())
This will produce the summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 32, 32, 4) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 30, 30, 32) 1184
_________________________________________________________________
conv2d_2 (Conv2D) (None, 28, 28, 32) 9248
_________________________________________________________________
conv2d_3 (Conv2D) (None, 26, 26, 32) 9248
_________________________________________________________________
conv2d_4 (Conv2D) (None, 24, 24, 32) 9248
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 32) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 10, 10, 64) 18496
_________________________________________________________________
conv2d_6 (Conv2D) (None, 8, 8, 64) 36928
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 64) 0
_________________________________________________________________
conv2d_7 (Conv2D) (None, 2, 2, 128) 73856
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 1, 1, 128) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 128) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 128) 0
_________________________________________________________________
dense_1 (Dense) (None, 512) 66048
=================================================================
Total params: 224,256
Trainable params: 0
Non-trainable params: 224,256
_________________________________________________________________
Now I will add to it, compile, and show that the non-trainable layers all become trainable.
# create sequential model to get this all before the LSTM
# initialize loss function, SGD optimizer and metrics
loss = 'binary_crossentropy'
optimizer = keras.optimizers.Adam(lr=1e-4,
                                  beta_1=0.9,
                                  beta_2=0.999,
                                  epsilon=1e-08,
                                  decay=0.0)
metrics = ['accuracy']
currmodel = Sequential()
currmodel.add(TimeDistributed(fixed_cnn_model, input_shape=(num_timewins, imsize, imsize, n_colors)))
currmodel.add(LSTM(units=size_mem,
                   activation='relu',
                   return_sequences=False))
currmodel.add(Dense(1024, activation='relu'))
currmodel.add(Dense(2, activation='softmax'))
currmodel = Model(inputs=currmodel.input, outputs=currmodel.output)
config = currmodel.compile(optimizer=optimizer, loss=loss, metrics=metrics)
print(currmodel.summary())
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
time_distributed_3_input (In (None, 5, 32, 32, 4) 0
_________________________________________________________________
time_distributed_3 (TimeDist (None, 5, 512) 224256
_________________________________________________________________
lstm_3 (LSTM) (None, 50) 112600
_________________________________________________________________
dropout_1 (Dropout) (None, 50) 0
_________________________________________________________________
dense_1 (Dense) (None, 1024) 52224
_________________________________________________________________
dropout_2 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_2 (Dense) (None, 2) 2050
=================================================================
Total params: 391,130
Trainable params: 391,130
Non-trainable params: 0
_________________________________________________________________
How am I supposed to freeze the layers in this case? I am almost 100% positive that I had working code in this format in an earlier Keras version. It seems like this is the right direction, since you define a model and declare certain layers trainable or not.
Then you add layers, which are trainable by default. However, this seems to convert all the layers to trainable.
Try adding:
for layer in currmodel.layers[:5]:
    layer.trainable = False
First, print the layer numbers in your network:
for i, layer in enumerate(currmodel.layers):
    print(i, layer.name)
Now check which layers are trainable and which are not:
for i, layer in enumerate(currmodel.layers):
    print(i, layer.name, layer.trainable)
Now you can set the trainable parameter for the layers you want. Say you want to train only the last 2 layers out of a total of 6 (the numbering starts from 0); then you can write something like this:
for layer in currmodel.layers[:4]:
    layer.trainable = False
for layer in currmodel.layers[4:]:
    layer.trainable = True
To cross-check, print the flags again and you should see the desired settings.
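For the model above specifically, a minimal sketch (the layer name time_distributed_3 is taken from the summary): freeze the wrapper that holds the pretrained CNN, then recompile, since changes to trainable only take effect when the model is compiled.
# Freeze the pretrained CNN inside the TimeDistributed wrapper; the LSTM
# and Dense heads stay trainable. Recompile so the flag change takes effect.
currmodel.get_layer('time_distributed_3').trainable = False
currmodel.compile(optimizer=optimizer, loss=loss, metrics=metrics)
currmodel.summary()  # Non-trainable params should now show the CNN's 224,256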