I'm trying to apply average pooling at each time step of the LSTM output. My architecture is below:
X_input = tf.keras.layers.Input(shape=(64,35))
X= tf.keras.layers.LSTM(512,activation="tanh",return_sequences=True,kernel_initializer=tf.keras.initializers.he_uniform(seed=45),kernel_regularizer=tf.keras.regularizers.l2(0.1))(X_input)
X= tf.keras.layers.LSTM(256,activation="tanh",return_sequences=True,kernel_initializer=tf.keras.initializers.he_uniform(seed=45),kernel_regularizer=tf.keras.regularizers.l2(0.1))(X)
X = tf.keras.layers.GlobalAvgPool1D()(X)
X = tf.keras.layers.Dense(128,activation="relu",kernel_initializer=tf.keras.initializers.he_uniform(seed=45),kernel_regularizer=tf.keras.regularizers.l2(0.1))(X)
X = tf.keras.layers.Dense(64,activation="relu",kernel_initializer=tf.keras.initializers.he_uniform(seed=45),kernel_regularizer=tf.keras.regularizers.l2(0.1))(X)
X = tf.keras.layers.Dense(32,activation="relu",kernel_initializer=tf.keras.initializers.he_uniform(seed=45),kernel_regularizer=tf.keras.regularizers.l2(0.1))(X)
# X = tf.keras.layers.Dense(16,activation="relu",kernel_initializer=tf.keras.initializers.he_uniform(seed=45),kernel_regularizer=tf.keras.regularizers.l2(0.1))(X)
output_layer = tf.keras.layers.Dense(10,activation='softmax', kernel_initializer=tf.keras.initializers.he_uniform(seed=45))(X)
model2 = tf.keras.Model(inputs = X_input,outputs = output_layer)
I want to take the average at each time step, not over each unit.
For example, I currently get the shape (None, 256) from the global average pooling layer, but I want (None, 64). What do I need to change?
I am not sure this is the most efficient way, but you can try this:
X = tf.keras.layers.Reshape(target_shape=(64,256,1))(X)
X = tf.keras.layers.TimeDistributed(tf.keras.layers.GlobalAveragePooling1D())(X)
X = tf.keras.layers.Reshape(target_shape=(64,))(X)
instead of:
X = tf.keras.layers.GlobalAvgPool1D()(X)
The summary is now:
Model: "functional_13"
Layer (type) Output Shape Param #
input_14 (InputLayer) [(None, 64, 35)] 0
lstm_26 (LSTM) (None, 64, 512) 1122304
lstm_27 (LSTM) (None, 64, 256) 787456
reshape_2 (Reshape) (None, 64, 256, 1) 0
time_distributed_8 (TimeDist (None, 64, 1) 0
reshape_3 (Reshape) (None, 64) 0
dense_61 (Dense) (None, 128) 8320
dense_62 (Dense) (None, 64) 8256
dense_63 (Dense) (None, 32) 2080
dense_64 (Dense) (None, 10) 330
Total params: 1,928,746
Trainable params: 1,928,746
Non-trainable params: 0
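The difference between the two results comes down to which axis is averaged, and that can be checked outside Keras. A minimal NumPy sketch, assuming a random array stands in for the second LSTM's output of shape (batch, 64, 256); the array and variable names are illustrative and not from the original post:

```python
import numpy as np

# Stand-in for the second LSTM's output: (batch, time_steps, units)
rng = np.random.default_rng(0)
lstm_out = rng.random((2, 64, 256))

# GlobalAveragePooling1D averages over the time axis: one value per unit
per_unit = lstm_out.mean(axis=1)

# Averaging over the units axis instead gives one value per time step
per_step = lstm_out.mean(axis=-1)

print(per_unit.shape, per_step.shape)  # (2, 256) (2, 64)
```

The Reshape + TimeDistributed workaround above computes the same reduction over the last axis. Wrapping tf.reduce_mean(x, axis=-1) in a Lambda layer would probably be another way to express it, though I have not compared the two for speed.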
I realize now that implementing it like this would have been a good idea. However, I have an already trained and fine-tuned autoencoder that looks like this:
Model: "autoencoder"
Layer (type) Output Shape Param #
user_input (InputLayer) [(None, 5999)] 0
user_e00 (Dense) (None, 64) 384000
user_e01 (Dense) (None, 64) 4160
user_e02 (Dense) (None, 64) 4160
user_e03 (Dense) (None, 64) 4160
user_out (Dense) (None, 32) 2080
emb_dropout (Dropout) (None, 32) 0
user_d00 (Dense) (None, 64) 2112
user_d01 (Dense) (None, 64) 4160
user_d02 (Dense) (None, 64) 4160
user_d03 (Dense) (None, 64) 4160
user_res (Dense) (None, 5999) 389935
Total params: 803,087
Trainable params: 0
Non-trainable params: 803,087
Now I want to split it into encoder and decoder. I believe I already found the right way for the encoder, which would be:
encoder_in = model.input
encoder_out = model.get_layer(name='user_out').output
encoder = Model(encoder_in, encoder_out, name='encoder')
For the decoder I would like to do something like:
decoder_in = model.get_layer("user_d00").input
decoder_out = model.output
decoder = Model(decoder_in, decoder_out, name='decoder')
but that throws:
WARNING:tensorflow:Functional inputs must come from `tf.keras.Input` (thus holding past layer metadata), they cannot be the output of a previous non-Input layer. Here, a tensor specified as input to "decoder" was not an Input tensor, it was generated by layer emb_dropout.
Note that input tensors are instantiated via `tensor = tf.keras.Input(shape)`.
The tensor that caused the issue was: emb_dropout/cond_3/Identity:0
I believe I have to create an Input layer with the shape of emb_dropout's output and connect it to user_d00 (the Dropout layer is no longer needed now that training has ended). Does anyone know how to do this correctly?
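One common way around this warning is to give the decoder its own Input and re-apply the trained decoder layers to it, skipping the Dropout layer. A minimal sketch, assuming the layer names from the summary above and a model whose layers form a single chain; the helper name split_autoencoder and the tiny stand-in autoencoder used to exercise it are hypothetical:

```python
import numpy as np
import tensorflow as tf


def split_autoencoder(model, bottleneck="user_out", first_decoder_layer="user_d00"):
    """Split a trained chain-style autoencoder into encoder and decoder.

    Assumes the decoder is every layer from `first_decoder_layer`
    to the end of `model.layers`.
    """
    # Encoder: reuse the original graph from the model input to the bottleneck.
    bottleneck_out = model.get_layer(name=bottleneck).output
    encoder = tf.keras.Model(model.input, bottleneck_out, name="encoder")

    # Decoder: a fresh Input with the bottleneck width replaces emb_dropout's output.
    latent_dim = int(bottleneck_out.shape[-1])
    decoder_in = tf.keras.Input(shape=(latent_dim,), name="decoder_input")

    names = [layer.name for layer in model.layers]
    x = decoder_in
    for layer in model.layers[names.index(first_decoder_layer):]:
        x = layer(x)  # calling an existing layer again reuses its trained weights
    decoder = tf.keras.Model(decoder_in, x, name="decoder")
    return encoder, decoder


# Tiny stand-in with the same layer names as the summary (sizes are made up).
inp = tf.keras.Input(shape=(20,), name="user_input")
h = tf.keras.layers.Dense(8, activation="relu", name="user_e00")(inp)
h = tf.keras.layers.Dense(4, activation="relu", name="user_out")(h)
h = tf.keras.layers.Dropout(0.5, name="emb_dropout")(h)
h = tf.keras.layers.Dense(8, activation="relu", name="user_d00")(h)
out = tf.keras.layers.Dense(20, name="user_res")(h)
autoencoder = tf.keras.Model(inp, out, name="autoencoder")

encoder, decoder = split_autoencoder(autoencoder)
```

Because the layers are shared rather than copied, chaining encoder and decoder should reproduce the original model's output at inference time, where Dropout is inactive anyway.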
I am trying to print a classification report. My problem has only 2 classes (0 and 1), but when I call classification_report, this is the output:
[image: classification report output]
My model is an LSTM with GloVe embeddings for sentiment classification. This is the architecture:
Model: "sequential_6"
Layer (type) Output Shape Param #
embedding_6 (Embedding) (None, 55, 300) 68299200
spatial_dropout1d_12 (Spatia (None, 55, 300) 0
lstm_12 (LSTM) (None, 55, 128) 219648
lstm_13 (LSTM) (None, 55, 64) 49408
spatial_dropout1d_13 (Spatia (None, 55, 64) 0
dense_18 (Dense) (None, 55, 512) 33280
dropout_6 (Dropout) (None, 55, 512) 0
dense_19 (Dense) (None, 55, 64) 32832
dense_20 (Dense) (None, 55, 1) 65
Total params: 68,634,433
Trainable params: 335,233
Non-trainable params: 68,299,200
You can have classification_report return a dict by passing output_dict=True, and then read it as a pandas DataFrame via pandas.DataFrame.from_dict(), like this:
import pandas as pd
from sklearn.metrics import classification_report
display(pd.DataFrame.from_dict(classification_report(y_true, y_pred, output_dict=True)).T)
I am trying to train a neural network on a Semantic Role Labeling task (a text classification task). The dataset consists of sentences, and the network has to predict a class for each word. Besides the embedding matrix, I am also using other features (meta_data_features). Y_train has 61 classes. The dataset contains 3306 sentences, and MAX_LEN = 67. The code for the architecture is:
embedding_layer = Embedding(67,
sentence_input = Input(shape=(67,), dtype='int32')
meta_input = Input(shape=(67,), name='meta_input')
embedded_sequences = embedding_layer(sentence_input)
x_1 = (SimpleRNN(256))(embedded_sequences)
x = concatenate([x_1, meta_input], axis=1)
x = Dropout(0.3)(x)
x = Dense(32, activation='relu')(x)
predictions = Dense(61, activation='softmax')(x)
model = Model([sentence_input,meta_input], predictions)
The snapshot of model summary is:
Layer (type) Output Shape Param # Connected to
input_1 (InputLayer) (None, 67) 0
embedding_1 (Embedding) (None, 67, 300) 1176000 input_1[0][0]
simple_rnn_1 (SimpleRNN) (None, 256) 142592 embedding_1[0][0]
meta_input (InputLayer) (None, 67) 0
concatenate_1 (Concatenate) (None, 323) 0 simple_rnn_1[0][0]
dropout_1 (Dropout) (None, 323) 0 concatenate_1[0][0]
dense_1 (Dense) (None, 32) 10368 dropout_1[0][0]
dense_2 (Dense) (None, 61) 2013 dense_1[0][0]
Total params: 1,330,973
Trainable params: 154,973
Non-trainable params: 1,176,000
The function call is:
simple_RNN_model_trainable.fit([padded_sentences, meta_data_features], padded_verbs,batch_size=32,epochs=1)
X_train consists of [padded_sentences, meta_data_features] and Y_train is padded_verbs. Their shapes are:
padded_sentences - (3306, 67)
meta_data_features - (3306, 67)
padded_verbs - (3306, 67, 1)
When I try to fit the model, I get the error, "ValueError: Error when checking target: expected dense_2 to have 2 dimensions, but got array with shape (3306, 67, 1)"
It would be great if somebody can help me in resolving the error. Thanks!
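The error says the model emits one prediction per sentence, (None, 61), while the targets hold one label per word, (3306, 67, 1). A possible fix, sketched under assumptions: keep the RNN output for every time step with return_sequences=True, attach the meta feature per word, and train with sparse_categorical_crossentropy. The vocabulary size, embedding width, and function name below are placeholders, and I am assuming padded_verbs holds integer class ids in the range 0 to 60:

```python
import numpy as np
import tensorflow as tf

layers = tf.keras.layers
MAX_LEN, N_CLASSES = 67, 61


def build_token_classifier(vocab_size=1000, emb_dim=50):
    # vocab_size and emb_dim are placeholders, not values from the original post.
    sentence_input = layers.Input(shape=(MAX_LEN,), dtype="int32", name="sentence_input")
    meta_input = layers.Input(shape=(MAX_LEN,), name="meta_input")

    embedded = layers.Embedding(vocab_size, emb_dim)(sentence_input)  # (None, 67, emb_dim)
    x_1 = layers.SimpleRNN(256, return_sequences=True)(embedded)      # (None, 67, 256)
    meta = layers.Reshape((MAX_LEN, 1))(meta_input)                    # (None, 67, 1)

    x = layers.concatenate([x_1, meta], axis=-1)                       # (None, 67, 257)
    x = layers.Dropout(0.3)(x)
    x = layers.Dense(32, activation="relu")(x)                         # applied per time step
    predictions = layers.Dense(N_CLASSES, activation="softmax")(x)     # (None, 67, 61)

    model = tf.keras.Model([sentence_input, meta_input], predictions)
    # Sparse loss: targets are integer class ids rather than one-hot vectors.
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model


model = build_token_classifier()
```

With this layout the output has one 61-way distribution per word, which lines up with per-word targets. If the labels were one-hot encoded instead, categorical_crossentropy would be the matching loss.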
I am taking a CNN model that is pretrained, and then trying to implement a CNN-LSTM with parallel CNNs all with the same weights from the pretraining.
# load in CNN
weightsfile = 'final_weights.h5'
modelfile = '2dcnn_model.json'
# load model from json
json_file = open(modelfile, 'r')
loaded_model_json = json_file.read()
fixed_cnn_model = keras.models.model_from_json(loaded_model_json)
# remove the last 2 dense FC layers and freeze it
fixed_cnn_model.trainable = False
This will produce the summary:
Layer (type) Output Shape Param #
input_1 (InputLayer) (None, 32, 32, 4) 0
conv2d_1 (Conv2D) (None, 30, 30, 32) 1184
conv2d_2 (Conv2D) (None, 28, 28, 32) 9248
conv2d_3 (Conv2D) (None, 26, 26, 32) 9248
conv2d_4 (Conv2D) (None, 24, 24, 32) 9248
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 32) 0
conv2d_5 (Conv2D) (None, 10, 10, 64) 18496
conv2d_6 (Conv2D) (None, 8, 8, 64) 36928
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 64) 0
conv2d_7 (Conv2D) (None, 2, 2, 128) 73856
max_pooling2d_3 (MaxPooling2 (None, 1, 1, 128) 0
flatten_1 (Flatten) (None, 128) 0
dropout_1 (Dropout) (None, 128) 0
dense_1 (Dense) (None, 512) 66048
Total params: 224,256
Trainable params: 0
Non-trainable params: 224,256
Now I add layers on top of it, compile, and show that all of the non-trainable parameters become trainable.
# create sequential model to get this all before the LSTM
# initialize loss function, SGD optimizer and metrics
loss = 'binary_crossentropy'
optimizer = keras.optimizers.Adam(lr=1e-4,
metrics = ['accuracy']
currmodel = Sequential()
currmodel.add(TimeDistributed(fixed_cnn_model, input_shape=(num_timewins, imsize, imsize, n_colors)))
currmodel.add(Dense(1024, activation='relu'))
currmodel.add(Dense(2, activation='softmax'))
currmodel = Model(inputs=currmodel.input, outputs = currmodel.output)
config = currmodel.compile(optimizer=optimizer, loss=loss, metrics=metrics)
Layer (type) Output Shape Param #
time_distributed_3_input (In (None, 5, 32, 32, 4) 0
time_distributed_3 (TimeDist (None, 5, 512) 224256
lstm_3 (LSTM) (None, 50) 112600
dropout_1 (Dropout) (None, 50) 0
dense_1 (Dense) (None, 1024) 52224
dropout_2 (Dropout) (None, 1024) 0
dense_2 (Dense) (None, 2) 2050
Total params: 391,130
Trainable params: 391,130
Non-trainable params: 0
How am I supposed to freeze the layers in this case? I am almost 100% positive that I had working code in this format in an earlier keras version. It seems like this is the right direction, since you define a model and declare certain layers trainable, or not.
Then you add layers, which are by default trainable. However, this seems to convert all the layers to trainable.
Try adding:
for layer in currmodel.layers[:5]:
    layer.trainable = False
First, print the layer numbers in your network:
for i, layer in enumerate(currmodel.layers):
    print(i, layer.name)
Now check which layers are trainable and which are not:
for i, layer in enumerate(model.layers):
    print(i, layer.name, layer.trainable)
Now you can set the 'trainable' attribute for the layers you want. Say you want to train only the last 2 layers out of a total of 6 (the numbering starts from 0); then you can write something like this:
for layer in model.layers[:5]:
    layer.trainable = False
for layer in model.layers[5:]:
    layer.trainable = True
To cross-check, print again and you should see the desired settings.
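When the pretrained CNN is wrapped in TimeDistributed, the outer model sees it as a single layer, so it may help to freeze the inner model itself and then count trainable parameters before compiling. A minimal sketch with a small stand-in CNN; the layer sizes and the helper count_params are illustrative rather than the original network, and behaviour may differ in older Keras releases:

```python
import numpy as np
import tensorflow as tf

layers = tf.keras.layers


def count_params(weights):
    """Total number of scalars in a list of weight variables."""
    return sum(int(np.prod(tuple(w.shape))) for w in weights)


# Stand-in for the pretrained CNN (the real one would be loaded from disk).
cnn_in = layers.Input(shape=(32, 32, 4))
h = layers.Conv2D(8, 3, activation="relu")(cnn_in)
h = layers.MaxPooling2D(4)(h)
h = layers.Flatten()(h)
cnn_out = layers.Dense(16, activation="relu")(h)
fixed_cnn_model = tf.keras.Model(cnn_in, cnn_out, name="fixed_cnn")
fixed_cnn_model.trainable = False  # freezes every layer inside the CNN

# Apply the frozen CNN to each of 5 time windows, then add trainable layers.
seq_in = layers.Input(shape=(5, 32, 32, 4))
x = layers.TimeDistributed(fixed_cnn_model)(seq_in)
x = layers.LSTM(10)(x)
seq_out = layers.Dense(2, activation="softmax")(x)
currmodel = tf.keras.Model(seq_in, seq_out)

# Check the split between frozen and trainable parameters.
frozen = count_params(currmodel.non_trainable_weights)
trainable = count_params(currmodel.trainable_weights)
print("trainable:", trainable, "non-trainable:", frozen)

# Compile after setting `trainable`, so training uses the current flags.
currmodel.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

If the non-trainable count matches the CNN's parameter count, the freeze carried through the wrapper. Changing trainable after compile generally calls for compiling again.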
I have the following sequential model that works with variable length inputs:
m = Sequential()
m.add(Embedding(len(chars), 4, name="embedding"))
m.add(Bidirectional(LSTM(16, unit_forget_bias=True, name="lstm")))
Gives the following summary:
Layer (type) Output Shape Param #
embedding (Embedding) (None, None, 4) 204
bidirectional_2 (Bidirection (None, 32) 2688
dense (Dense) (None, 51) 1683
activation_2 (Activation) (None, 51) 0
Total params: 4,575
Trainable params: 4,575
Non-trainable params: 0
However, when I try to implement the same model with the functional API, whatever I try as the Input layer shape does not give the same result as the sequential model. Here is one of my attempts:
charinput = Input(shape=(4,),name="input",dtype='int32')
embedding = Embedding(len(chars), 4, name="embedding")(charinput)
lstm = Bidirectional(LSTM(16, unit_forget_bias=True, name="lstm"))(embedding)
dense = Dense(len(chars),name="dense")(lstm)
output = Activation("softmax")(dense)
And here is the summary:
Layer (type) Output Shape Param #
input (InputLayer) (None, 4) 0
embedding (Embedding) (None, 4, 4) 204
bidirectional_1 (Bidirection (None, 32) 2688
dense (Dense) (None, 51) 1683
activation_1 (Activation) (None, 51) 0
Total params: 4,575
Trainable params: 4,575
Non-trainable params: 0
Use shape=(None,) in the input layer, in your case:
charinput = Input(shape=(None,),name="input",dtype='int32')
Try adding the argument input_length=None to the Embedding layer.
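Putting the shape=(None,) suggestion into the functional version should give a model that accepts any sequence length, like the sequential one. A minimal sketch, assuming a vocabulary of 51 characters as implied by the Dense output in the summaries; the chars list here is a placeholder:

```python
import numpy as np
import tensorflow as tf

layers = tf.keras.layers
chars = [chr(i) for i in range(51)]  # placeholder vocabulary of 51 symbols

# shape=(None,) leaves the sequence length unspecified
charinput = layers.Input(shape=(None,), name="input", dtype="int32")
embedding = layers.Embedding(len(chars), 4, name="embedding")(charinput)
lstm = layers.Bidirectional(layers.LSTM(16, unit_forget_bias=True, name="lstm"))(embedding)
dense = layers.Dense(len(chars), name="dense")(lstm)
output = layers.Activation("softmax")(dense)
model = tf.keras.Model(charinput, output)

# The same model accepts batches with different sequence lengths.
short_batch = np.random.randint(0, len(chars), size=(2, 5))
long_batch = np.random.randint(0, len(chars), size=(3, 9))
short_out = model.predict(short_batch, verbose=0)
long_out = model.predict(long_batch, verbose=0)
print(short_out.shape, long_out.shape)  # (2, 51) (3, 51)
```

The parameter count is unaffected by the input length, so model.summary() should still report 4,575 parameters.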