ValueError: Input 0 of layer sequential_40 is incompatible with the layer: expected min_ndim=3, found ndim=2. Full shape received: (None, 58) - tensorflow

I am working on a dataset about student performance in a course, and I want to predict student level (low, mid, high) according to their previous year's marks. I'm using a CNN for this purpose, but when I build and fit the model I get this error:
ValueError: Input 0 of layer sequential_40 is incompatible with the layer: : expected min_ndim=3, found ndim=2. Full shape received: (None, 58)
This is the code:
#reshaping data
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1]))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1]))
#checking the shape after reshaping
print(X_train.shape)
print(X_test.shape)
#normalizing the pixel values
X_train=X_train/255
X_test=X_test/255
#defining model
model=Sequential()
#adding convolution layer
model.add(Conv1D(32,3, activation='relu',input_shape=(28,1)))
#adding pooling layer
model.add(MaxPool1D(pool_size=2))
#adding fully connected layer
model.add(Flatten())
model.add(Dense(100,activation='relu'))
#adding output layer
model.add(Dense(10,activation='softmax'))
#compiling the model
model.compile(loss='sparse_categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
model.summary()
#fitting the model
model.fit(X_train,y_train,epochs=10)
This is the output:
Model: "sequential_40"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d_23 (Conv1D) (None, 9, 32) 128
_________________________________________________________________
max_pooling1d_19 (MaxPooling (None, 4, 32) 0
_________________________________________________________________
flatten_15 (Flatten) (None, 128) 0
_________________________________________________________________
dense_30 (Dense) (None, 100) 12900
_________________________________________________________________
dense_31 (Dense) (None, 10) 1010
=================================================================
Total params: 14,038
Trainable params: 14,038
Non-trainable params: 0

Related

Tensorflow: access to see the layer activation (fine-tuning),

I use fine-tuning. How can I see and access the activations of all layers that are inside of the convolutional base?
conv_base = VGG16(weights='imagenet',
include_top=False,
input_shape=(inp_img_h, inp_img_w, 3))
def create_functional_model():
inp = Input(shape=(inp_img_h, inp_img_w, 3))
model = conv_base(inp)
model = Flatten()(model)
model = Dense(256, activation='relu')(model)
outp = Dense(1, activation='sigmoid')(model)
return Model(inputs=inp, outputs=outp)
model = create_functional_model()
model.summary()
The model summary is
Layer (type) Output Shape Param #
=================================================================
vgg16 (Functional) (None, 7, 7, 512) 14714688
_________________________________________________________________
flatten_2 (Flatten) (None, 25088) 0
_________________________________________________________________
dense_4 (Dense) (None, 256) 6422784
_________________________________________________________________
dense_5 (Dense) (None, 1) 257
=================================================================
Total params: 21,137,729
Trainable params: 21,137,729
Non-trainable params: 0
_________________________________________________________________
Thus, the levels inside the conv_base are not accessible.
As #Frightera said in comments, you can access the base model summary by:
model.layers[0].summary()
And if you want to access activation functions of its layers you can try this:
print(model.layers[0].layers[index_of_layer].activation)
#or
print(model.layers[0].get_layer("name_of_layer").activation)

Shape of the LSTM layers in multilayer LSTM model

model = tf.keras.Sequential([tf.keras.layers.Embedding(tokenizer.vocab_size, 64),tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64,return_sequences=True))
tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
tf.keras.layers.Dense(64, activation='relu'),
tf.keras.layers.Dense(1, activation='sigmoid')
])
The second layer has 64 hidden units and since the return_sequences=True, it will output 64 sequences as well. But how can it be fed to a 32 hidden units LSTM. Won't it cause shape mismatch error?
Actually no, it won't cause it. First of all the second layer won't have the output shape of 64, but instead of 128. This is because you are using Bidirectional layer, it will be concatenated by a forward and backward pass and so you output will be (None, None, 64+64=128). You can refer to the link.
The RNN data is shaped in the following was (Batch_size, time_steps, number_of_features). This means when you try to connect two layer with different neurons the features increased or decreased based on the number of neurons.You can follow the particular link for more details.
And for your particular code this is how the model summary will look like. So to answer in short their won't be a mismatch.
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, None, 64) 32000
_________________________________________________________________
bidirectional (Bidirectional (None, None, 128) 66048
_________________________________________________________________
bidirectional_1 (Bidirection (None, 64) 41216
_________________________________________________________________
dense_2 (Dense) (None, 64) 4160
_________________________________________________________________
dense_3 (Dense) (None, 1) 65
=================================================================
Total params: 143,489
Trainable params: 143,489
Non-trainable params: 0
_________________________________________________________________

ValueError: Error when checking target: expected dense_2 to have 2 dimensions, but got array with shape (3306, 67, 1)

I am trying to train a neural network on Semantic Role Labeling task (text classification task). The dataset consist of sentences on which the neural network has to be trained to predict a class for each word. Apart from using the embedding matrix, I am also using other features (meta_data_features). The number of classes in Y_train are 61. The number 3306 represents the number of sentences in my dataset (size of my dataset). MAX_LEN = 67. The code for the architecture is:
embedding_layer = Embedding(67,
300,
embeddings_initializer=Constant(embedding_matrix),
input_length=MAX_LEN,
trainable=False)
sentence_input = Input(shape=(67,), dtype='int32')
meta_input = Input(shape=(67,), name='meta_input')
embedded_sequences = embedding_layer(sentence_input)
x_1 = (SimpleRNN(256))(embedded_sequences)
x = concatenate([x_1, meta_input], axis=1)
x = Dropout(0.3)(x)
x = Dense(32, activation='relu')(x)
predictions = Dense(61, activation='softmax')(x)
model = Model([sentence_input,meta_input], predictions)
model.compile(loss='sparse_categorical_crossentropy',
optimizer='adam',
metrics=['sparse_categorical_accuracy'])
print(model.summary())
The snapshot of model summary is:
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 67) 0
__________________________________________________________________________________________________
embedding_1 (Embedding) (None, 67, 300) 1176000 input_1[0][0]
__________________________________________________________________________________________________
simple_rnn_1 (SimpleRNN) (None, 256) 142592 embedding_1[0][0]
__________________________________________________________________________________________________
meta_input (InputLayer) (None, 67) 0
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 323) 0 simple_rnn_1[0][0]
meta_input[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout) (None, 323) 0 concatenate_1[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 32) 10368 dropout_1[0][0]
__________________________________________________________________________________________________
dense_2 (Dense) (None, 61) 2013 dense_1[0][0]
==================================================================================================
Total params: 1,330,973
Trainable params: 154,973
Non-trainable params: 1,176,000
__________________________________________________________________________________________________
The function call is:
simple_RNN_model_trainable.fit([padded_sentences, meta_data_features], padded_verbs,batch_size=32,epochs=1)
X_train constitutes [padded_sentences, meta_data_features] and Y_train is padded_verbs. Their shapes are:
padded_sentences - (3306, 67)
meta_data_features - (3306, 67)
padded_verbs - (3306, 67, 1)
When I try to fit the model, I get the error, "ValueError: Error when checking target: expected dense_2 to have 2 dimensions, but got array with shape (3306, 67, 1)"
It would be great if somebody can help me in resolving the error. Thanks!

How to read Keras's model structure?

For example:
BUFFER_SIZE = 10000
BATCH_SIZE = 64
train_dataset = train_dataset.shuffle(BUFFER_SIZE)
train_dataset = train_dataset.padded_batch(BATCH_SIZE, tf.compat.v1.data.get_output_shapes(train_dataset))
test_dataset = test_dataset.padded_batch(BATCH_SIZE, tf.compat.v1.data.get_output_shapes(test_dataset))
def pad_to_size(vec, size):
zeros = [0] * (size - len(vec))
vec.extend(zeros)
return vec
...
model = tf.keras.Sequential([
tf.keras.layers.Embedding(encoder.vocab_size, 64),
tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=False)),
tf.keras.layers.Dense(64, activation='relu'),
tf.keras.layers.Dense(1, activation='sigmoid')
])
print(model.summary())
The print reads as:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, None, 64) 523840
_________________________________________________________________
bidirectional (Bidirectional (None, 128) 66048
_________________________________________________________________
dense (Dense) (None, 64) 8256
_________________________________________________________________
dense_1 (Dense) (None, 1) 65
=================================================================
Total params: 598,209
Trainable params: 598,209
Non-trainable params: 0
I have the following question:
1) For the embedding layer, why is the ouput shape is (None, None, 64). I understand '64' is the vector length. Why are the other two None?
2) How is the output shape of bidirectional layer is (None, 128)? Why is it 128?
For the embedding layer, why is the ouput shape is (None, None, 64). I understand '64' is the vector length. Why are the other two None?
You can see this function produces (None,None) (including the batch dimension) (in other words it does input_shape=(None,) as default) if you don't define the input_shape to the first layer of the Sequential model.
If you pass in an input tensor of size (None, None) to an embedding layer, it produces a (None, None, 64) tensor assuming embedding dimension is 64. The first None is the batch dimension and the second is the time dimension (refers to the input_length parameter). So that's why you get a (None, None, 64) sized output.
How is the output shape of bidirectional layer is (None, 128)? Why is it 128?
Here, you have a Bidirectional LSTM. Your LSTM layer produces a (None, 64) sized output (when return_sequences=False). When you have a Bidirectional layer it is like having two LSTM layers (one going forward, other going backwards). And you have a default merge_mode of concat meaning that the two output states from forward and backward layers will be concatenated. This gives you a (None, 128) sized output.

ValueError: Error when checking input: expected dense_1_input to have 3 dimensions, but got array with shape (5, 1)

I know this question is asked before but i was unable to get my ans out of them.
So
state is [[ 0.2]
[ 10. ]
[ 1. ]
[-10.5]
[ 41.1]]
and
(5, 1) # np.shape(state)
and when model.predict(state) it throw
ValueError: Error when checking input: expected dense_1_input to have
3 dimensions, but got array with shape (5, 1)
But....
model = Sequential()
model.add(Dense(5,activation='relu',input_shape=(5,1)))
My first layer of model have input_shape=(5,1) which is equal to the shape of state that I am passing.
I also have 2 dense more layers after this.
And
print(model.summary())
// output
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 5, 5) 10
_________________________________________________________________
dropout_1 (Dropout) (None, 5, 5) 0
_________________________________________________________________
dense_2 (Dense) (None, 5, 5) 30
_________________________________________________________________
dropout_2 (Dropout) (None, 5, 5) 0
_________________________________________________________________
dense_3 (Dense) (None, 5, 3) 18
=================================================================
And model definition is (!!noob alert )
model = Sequential()
model.add(Dense(5,activation='relu',input_shape=(5,1)))
model.add(Dropout(0.2))
# model.add(Flatten())
model.add(Dense(5,activation='relu'))
model.add(Dropout(0.2))
# model.add(Flatten())
model.add(Dense(3,activation='softmax'))
model.compile(loss="mse", optimizer=Adam(lr=0.001), metrics=['accuracy'])
A couple things. First, the predict function assumes the first dimension of the input tensor is the batch size (even if you're only predicting for one sample), but the input_shape attribute on your first layer in the Sequential model excludes batch size, as indicated here. Second, the dense layers are being applied over the last dimension, which is not going to give you what you want since I assume your input vector has 5 features, but you are adding this last 1 dimension which makes your model outputs the wrong size. Try the following code:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam
model = Sequential()
model.add(Dense(5, activation='relu', input_shape=(5,)))
model.add(Dropout(0.2))
model.add(Dense(5,activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(3,activation='softmax'))
model.compile(loss="mse", optimizer=Adam(lr=0.001), metrics=['accuracy'])
print(model.summary())
state = np.array([0.2, 10., 1., -10.5, 41.1]) # shape (5,)
print("Prediction:", model.predict(np.expand_dims(state, 0))) # expand_dims adds batch dimension
You should see this output for model summary and also be able to see a predicted vector:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 5) 30
_________________________________________________________________
dropout (Dropout) (None, 5) 0
_________________________________________________________________
dense_1 (Dense) (None, 5) 30
_________________________________________________________________
dropout_1 (Dropout) (None, 5) 0
_________________________________________________________________
dense_2 (Dense) (None, 3) 18
=================================================================
Total params: 78
Trainable params: 78
Non-trainable params: 0