ValueError: Input 0 is incompatible with layer model: expected shape=(None, 14999, 7), found shape=(None, 7) - tensorflow

I'm trying to apply Conv1D layers for a classification model which has a numeric dataset. The neural network of my model is as follows:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv1D(8,kernel_size = 3, strides = 1,padding = 'valid', activation = 'relu',input_shape = (14999,7)))
model.add(tf.keras.layers.Conv1D(16,kernel_size = 3, strides = 1,padding = 'valid', activation = 'relu'))
model.add(tf.keras.layers.MaxPooling1D(2))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Conv1D(32,kernel_size = 3, strides = 1,padding = 'valid', activation = 'relu'))
model.add(tf.keras.layers.Conv1D(64,kernel_size = 3, strides = 1,padding = 'valid', activation = 'relu'))
model.add(tf.keras.layers.MaxPooling1D(2))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Conv1D(128,kernel_size = 3, strides = 1,padding = 'valid', activation = 'relu'))
model.add(tf.keras.layers.Conv1D(256,kernel_size = 3, strides = 1,padding = 'valid', activation = 'relu'))
model.add(tf.keras.layers.MaxPooling1D(2))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(512,activation = 'relu'))
model.add(tf.keras.layers.Dense(128,activation = 'relu'))
model.add(tf.keras.layers.Dense(32,activation = 'relu'))
model.add(tf.keras.layers.Dense(3, activation = 'softmax'))
And the model's input shape is (14999, 7).
model.summary() gives the following output
Model: "sequential_8"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d_24 (Conv1D) (None, 14997, 8) 176
_________________________________________________________________
conv1d_25 (Conv1D) (None, 14995, 16) 400
_________________________________________________________________
max_pooling1d_10 (MaxPooling (None, 7497, 16) 0
_________________________________________________________________
dropout_9 (Dropout) (None, 7497, 16) 0
_________________________________________________________________
conv1d_26 (Conv1D) (None, 7495, 32) 1568
_________________________________________________________________
conv1d_27 (Conv1D) (None, 7493, 64) 6208
_________________________________________________________________
max_pooling1d_11 (MaxPooling (None, 3746, 64) 0
_________________________________________________________________
dropout_10 (Dropout) (None, 3746, 64) 0
_________________________________________________________________
conv1d_28 (Conv1D) (None, 3744, 128) 24704
_________________________________________________________________
conv1d_29 (Conv1D) (None, 3742, 256) 98560
_________________________________________________________________
max_pooling1d_12 (MaxPooling (None, 1871, 256) 0
_________________________________________________________________
dropout_11 (Dropout) (None, 1871, 256) 0
_________________________________________________________________
flatten_3 (Flatten) (None, 478976) 0
_________________________________________________________________
dense_14 (Dense) (None, 512) 245236224
_________________________________________________________________
dense_15 (Dense) (None, 128) 65664
_________________________________________________________________
dense_16 (Dense) (None, 32) 4128
_________________________________________________________________
dense_17 (Dense) (None, 3) 99
=================================================================
Total params: 245,437,731
Trainable params: 245,437,731
Non-trainable params: 0
The code for model fitting is:
model.compile(loss = 'sparse_categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
history = model.fit(xtrain_scaled, ytrain_scaled, epochs = 30, batch_size = 5, validation_data = (xval_scaled, yval_scaled))
While executing, I'm facing the following error:
ValueError: Input 0 is incompatible with layer model: expected shape=(None, 14999, 7), found shape=(None, 7)
Could anyone help to sort out this issue?

TL;DR:
Change
model.add(tf.keras.layers.Conv1D(8,kernel_size = 3, strides = 1,padding = 'valid', activation = 'relu',input_shape = (14999,7)))
to
model.add(tf.keras.layers.Conv1D(8,kernel_size = 3, strides = 1,padding = 'valid', activation = 'relu',input_shape = (7)))
Full answer:
Assumption: I am guessing the reason you chose to write 14999 in the input shape is because that's your batch size / total size of training data
Problem with this:
Tensorflow assumes the input shape does not include the batch size
By specifying a 2D input_shape you're making Tensorflow expect a 3D input of shape (Batch_size, 14999, 7). But your input is clearly of size (Batch_size, 7)
Solution:
Change the shape from (14999, 7) to just (7)
TF will now be expecting the same shape that you are providing
PS: Don't be worried about informing your model of how many training examples you have in the dataset. TF Keras works with the assumption it can be shown unlimited training examples.

Related

Sentence classification: Why does my embedding not reduce the shape of the subsequent layer?

I want to embed sentences that all contain 5 words and a my training-set has a total vocabulary of 10000 words. I use this code:
import tensorflow as tf
vocab_size = 10000
inputs = tf.keras.layers.Input(shape=(5,vocab_size), name="input", )
embedding = tf.keras.layers.Embedding(10000, 64)(inputs)
conv2d_1 = Conv2D( filters = 32, kernel_size = (3,3),
strides =(1), padding = 'SAME',)(embedding)
model = tf.keras.models.Model(inputs=inputs, outputs=conv2d_1)
model.summary()
After running I get:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) [(None, 5, 10000)] 0
_________________________________________________________________
embedding_105 (Embedding) (None, 5, 10000, 64) 640000
_________________________________________________________________
conv2d_102 (Conv2D) (None, 5, 10000, 32) 18464
=================================================================
I want to do the embedding to convert the sparse 10000x5 tensor to a dense 64x5 tensor. Apparently that doesn't work as intended, so my question is: Why is the shape of the next layer not (None, 5, 64, 32) instead of (None, 5, 10000, 32)? How can I achieve the compactization?

Keras model shape incompatible / ValueError: Shapes (None, 3) and (None, 3, 3) are incompatible

I'm trying to train my keras model but shapes are incompatible.
The error says
ValueError: Shapes (None, 3) and (None, 3, 3) are incompatible
My train set's shape is (2000, 3, 768) and lable's shape is (2000, 3).
What is the wrong the point?
Model define & fit code
input_shape = x_train.shape[1:]
model = my_dnn(input_shape, 3)
model.fit(x_train, y_train, epochs=25, verbose=1)
Model code
def my_dnn(input, num_classes):
model = Sequential()
model.add(tf.keras.Input(input))
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dense(225))
model.add(Activation('relu'))
model.add(Dense(100))
model.add(Activation('relu'))
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))
model.compile( loss='categorical_crossentropy',
optimizer='adam',
metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
return model
In addition to what's said, it seems you are carrying the second dimension of the input data until the end of the model. So your model summary is something like this:
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 3, 1024) 787456
_________________________________________________________________
activation_1 (Activation) (None, 3, 1024) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 3, 1024) 0
_________________________________________________________________
dense_2 (Dense) (None, 3, 512) 524800
_________________________________________________________________
activation_2 (Activation) (None, 3, 512) 0
_________________________________________________________________
dense_3 (Dense) (None, 3, 225) 115425
_________________________________________________________________
activation_3 (Activation) (None, 3, 225) 0
_________________________________________________________________
dense_4 (Dense) (None, 3, 100) 22600
_________________________________________________________________
activation_4 (Activation) (None, 3, 100) 0
_________________________________________________________________
dense_5 (Dense) (None, 3, 3) 303
_________________________________________________________________
activation_5 (Activation) (None, 3, 3) 0
=================================================================
Total params: 1,450,584
Trainable params: 1,450,584
Non-trainable params: 0
As you can see, the output shape of the model (None, 3, 3) is not compatible with the label's shape (None, 3), and at some point, you need to use a Flatten layer.
There are two possible reasons:
Your problem is multi-class classification, hence you need softmax instead of sigmoid + accuracy or CategoricalAccuracy() as a metric.
Your problem is multi-label classification, hence you need binary_crossentropy and tf.keras.metrics.BinaryAccuracy()
Depending on how your dataset is built/the task you are trying to solve, you need to opt for one of those.
For case 1, ensure your data is OHE(one-hot encoded).
Also, Marco Cerliani and Amir (in the comment below) point out that the data output needs to be in a 2D format rather than 3D : you should either preprocess the data accordingly before feeding it to the network or use, as suggested in the comment below, a Flatten() at a point (probably before the final Dense())

How to use model subclassing in Keras?

Having the following model written in the sequential API:
config = {
'learning_rate': 0.001,
'lstm_neurons':32,
'lstm_activation':'tanh',
'dropout_rate': 0.08,
'batch_size': 128,
'dense_layers':[
{'neurons': 32, 'activation': 'relu'},
{'neurons': 32, 'activation': 'relu'},
]
}
def get_model(num_features, output_size):
opt = Adam(learning_rate=0.001)
model = Sequential()
model.add(Input(shape=[None,num_features], dtype=tf.float32, ragged=True))
model.add(LSTM(config['lstm_neurons'], activation=config['lstm_activation']))
model.add(BatchNormalization())
if 'dropout_rate' in config:
model.add(Dropout(config['dropout_rate']))
for layer in config['dense_layers']:
model.add(Dense(layer['neurons'], activation=layer['activation']))
model.add(BatchNormalization())
if 'dropout_rate' in layer:
model.add(Dropout(layer['dropout_rate']))
model.add(Dense(output_size, activation='sigmoid'))
model.compile(loss='mse', optimizer=opt, metrics=['mse'])
print(model.summary())
return model
When using a distributed training framework, I need to convert the syntax to use model subclassing instead.
I've looked at the docs but couldn't figure out how to do it.
Here is one equivalent subclassed implementation. Though I didn't test.
import tensorflow as tf
# your config
config = {
'learning_rate': 0.001,
'lstm_neurons':32,
'lstm_activation':'tanh',
'dropout_rate': 0.08,
'batch_size': 128,
'dense_layers':[
{'neurons': 32, 'activation': 'relu'},
{'neurons': 32, 'activation': 'relu'},
]
}
# Subclassed API Model
class MySubClassed(tf.keras.Model):
def __init__(self, output_size):
super(MySubClassed, self).__init__()
self.lstm = tf.keras.layers.LSTM(config['lstm_neurons'],
activation=config['lstm_activation'])
self.bn = tf.keras.layers.BatchNormalization()
if 'dropout_rate' in config:
self.dp1 = tf.keras.layers.Dropout(config['dropout_rate'])
self.dp2 = tf.keras.layers.Dropout(config['dropout_rate'])
self.dp3 = tf.keras.layers.Dropout(config['dropout_rate'])
for layer in config['dense_layers']:
self.dense1 = tf.keras.layers.Dense(layer['neurons'],
activation=layer['activation'])
self.bn1 = tf.keras.layers.BatchNormalization()
self.dense2 = tf.keras.layers.Dense(layer['neurons'],
activation=layer['activation'])
self.bn2 = tf.keras.layers.BatchNormalization()
self.out = tf.keras.layers.Dense(output_size,
activation='sigmoid')
def call(self, inputs, training=True, **kwargs):
x = self.lstm(inputs)
x = self.bn(x)
if 'dropout_rate' in config:
x = self.dp1(x)
x = self.dense1(x)
x = self.bn1(x)
if 'dropout_rate' in config:
x = self.dp2(x)
x = self.dense2(x)
x = self.bn2(x)
if 'dropout_rate' in config:
x = self.dp3(x)
return self.out(x)
# A convenient way to get model summary
# and plot in subclassed api
def build_graph(self, raw_shape):
x = tf.keras.layers.Input(shape=(None, raw_shape),
ragged=True)
return tf.keras.Model(inputs=[x],
outputs=self.call(x))
Build and compile the mdoel
s = MySubClassed(output_size=1)
s.compile(
loss = 'mse',
metrics = ['mse'],
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001))
Pass some tensor to create weights (check).
raw_input = (16, 16, 16)
y = s(tf.ones(shape=(raw_input)))
print("weights:", len(s.weights))
print("trainable weights:", len(s.trainable_weights))
weights: 21
trainable weights: 15
Summary and Plot
Summarize and visualize the model graph.
s.build_graph(16).summary()
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, None, 16)] 0
_________________________________________________________________
lstm (LSTM) (None, 32) 6272
_________________________________________________________________
batch_normalization (BatchNo (None, 32) 128
_________________________________________________________________
dropout (Dropout) (None, 32) 0
_________________________________________________________________
dense_2 (Dense) (None, 32) 1056
_________________________________________________________________
batch_normalization_3 (Batch (None, 32) 128
_________________________________________________________________
dropout_1 (Dropout) (None, 32) 0
_________________________________________________________________
dense_3 (Dense) (None, 32) 1056
_________________________________________________________________
batch_normalization_4 (Batch (None, 32) 128
_________________________________________________________________
dropout_2 (Dropout) (None, 32) 0
_________________________________________________________________
dense_4 (Dense) (None, 1) 33
=================================================================
Total params: 8,801
Trainable params: 8,609
Non-trainable params: 192
tf.keras.utils.plot_model(
s.build_graph(16),
show_shapes=True,
show_dtype=True,
show_layer_names=True,
rankdir="TB",
)

ValueError: Error when checking target: expected dense_2 to have 2 dimensions, but got array with shape (3306, 67, 1)

I am trying to train a neural network on Semantic Role Labeling task (text classification task). The dataset consist of sentences on which the neural network has to be trained to predict a class for each word. Apart from using the embedding matrix, I am also using other features (meta_data_features). The number of classes in Y_train are 61. The number 3306 represents the number of sentences in my dataset (size of my dataset). MAX_LEN = 67. The code for the architecture is:
embedding_layer = Embedding(67,
300,
embeddings_initializer=Constant(embedding_matrix),
input_length=MAX_LEN,
trainable=False)
sentence_input = Input(shape=(67,), dtype='int32')
meta_input = Input(shape=(67,), name='meta_input')
embedded_sequences = embedding_layer(sentence_input)
x_1 = (SimpleRNN(256))(embedded_sequences)
x = concatenate([x_1, meta_input], axis=1)
x = Dropout(0.3)(x)
x = Dense(32, activation='relu')(x)
predictions = Dense(61, activation='softmax')(x)
model = Model([sentence_input,meta_input], predictions)
model.compile(loss='sparse_categorical_crossentropy',
optimizer='adam',
metrics=['sparse_categorical_accuracy'])
print(model.summary())
The snapshot of model summary is:
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 67) 0
__________________________________________________________________________________________________
embedding_1 (Embedding) (None, 67, 300) 1176000 input_1[0][0]
__________________________________________________________________________________________________
simple_rnn_1 (SimpleRNN) (None, 256) 142592 embedding_1[0][0]
__________________________________________________________________________________________________
meta_input (InputLayer) (None, 67) 0
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 323) 0 simple_rnn_1[0][0]
meta_input[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout) (None, 323) 0 concatenate_1[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 32) 10368 dropout_1[0][0]
__________________________________________________________________________________________________
dense_2 (Dense) (None, 61) 2013 dense_1[0][0]
==================================================================================================
Total params: 1,330,973
Trainable params: 154,973
Non-trainable params: 1,176,000
__________________________________________________________________________________________________
The function call is:
simple_RNN_model_trainable.fit([padded_sentences, meta_data_features], padded_verbs,batch_size=32,epochs=1)
X_train constitutes [padded_sentences, meta_data_features] and Y_train is padded_verbs. Their shapes are:
padded_sentences - (3306, 67)
meta_data_features - (3306, 67)
padded_verbs - (3306, 67, 1)
When I try to fit the model, I get the error, "ValueError: Error when checking target: expected dense_2 to have 2 dimensions, but got array with shape (3306, 67, 1)"
It would be great if somebody can help me in resolving the error. Thanks!

Upsampling by decimal factor in Keras

I want to use an upsampling 2D layer in keras so that I can increase the image size by a decimal factor (in this case from [213,213] to [640,640]). The layer is compiled as expected, but when I want to train or predict on real images, they are upsampled only by the closest integer to the input factor. Any idea? Details below:
Network:
mp_size = (3,3)
inputs = Input(input_data.shape[1:])
lay1 = Conv2D(32, (3,3), strides=(1,1), activation='relu', padding='same', kernel_initializer='glorot_normal')(inputs)
lay2 = MaxPooling2D(pool_size=mp_size)(lay1)
lay3 = Conv2D(32, (3,3), strides=(1,1), activation='relu', padding='same', kernel_initializer='glorot_normal')(lay2)
size1=lay3.get_shape()[1:3]
size2=lay1.get_shape()[1:3]
us_size = size2[0].value/size1[0].value, size2[1].value/size1[1].value
lay4 = Concatenate(axis=-1)([UpSampling2D(size=us_size)(lay3),lay1])
lay5 = Conv2D(1, (1, 1), strides=(1,1), activation='sigmoid')(lay4)
model = Model(inputs=inputs, outputs=lay5)
Network summary when I use model.summary() :
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
input_4 (InputLayer) (None, 640, 640, 2) 0
____________________________________________________________________________________________________
conv2d_58 (Conv2D) (None, 640, 640, 32) 608 input_4[0][0]
____________________________________________________________________________________________________
max_pooling2d_14 (MaxPooling2D) (None, 213, 213, 32) 0 conv2d_58[0][0]
____________________________________________________________________________________________________
conv2d_59 (Conv2D) (None, 213, 213, 32) 9248 max_pooling2d_14[0][0]
____________________________________________________________________________________________________
up_sampling2d_14 (UpSampling2D) (None, 640.0, 640.0, 0 conv2d_59[0][0]
____________________________________________________________________________________________________
concatenate_14 (Concatenate) (None, 640.0, 640.0, 0 up_sampling2d_14[0][0]
conv2d_58[0][0]
____________________________________________________________________________________________________
conv2d_60 (Conv2D) (None, 640.0, 640.0, 65 concatenate_14[0][0]
====================================================================================================
Total params: 9,921
Trainable params: 9,921
Non-trainable params: 0
Error when training the network:
InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [1,639,639,32] vs. shape[1] = [1,640,640,32]
[[Node: concatenate_14/concat = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](up_sampling2d_14/ResizeNearestNeighbor, conv2d_58/Relu, concatenate_14/concat/axis)]]
It can be resolved by using the below code:
from keras.layers import UpSampling2D
from keras.utils.generic_utils import transpose_shape
class UpSamplingUnet(UpSampling2D):
def compute_output_shape(self, input_shape):
size_all_dims = (1,) + self.size + (1,)
spatial_axes = list(range(1, 1 + self.rank))
size_all_dims = transpose_shape(size_all_dims,
self.data_format,
spatial_axes)
output_shape = list(input_shape)
for dim in range(len(output_shape)):
if output_shape[dim] is not None:
output_shape[dim] *= size_all_dims[dim]
output_shape[dim]=int(output_shape[dim])
return tuple(output_shape)
Then alter UpSampling2D(size=us_size) to UpSamplingUnet(size=us_size).