ValueError: Dimensions must be equal, but are 2 and 64 for '{{node binary_crossentropy/mul}} with input shapes[?,2], [?,64] - tensorflow

I'm trying to do binary classification of text with a Bi-LSTM model, but I get this error: ValueError: Dimensions must be equal, but are 2 and 64 for '{{node binary_crossentropy/mul}} = Mul[T=DT_FLOAT](binary_crossentropy/Cast, binary_crossentropy/Log)' with input shapes: [?,2], [?,64].
I am a beginner, so any help would be appreciated.
text=df['text']
label=df['label']
X = pad_sequences(X, maxlen=max_len,padding=pad_type,truncating=trunc_type)
Y = pd.get_dummies(label).values
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size = 0.20)
print(X_train.shape,Y_train.shape)
print(X_test.shape,Y_test.shape)
#model creation
model=tf.keras.Sequential([
# add an embedding layer
tf.keras.layers.Embedding(word_count, 16, input_length=max_len),
tf.keras.layers.Dropout(0.2),
# add another bi-lstm layer
tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(2,return_sequences=True)),
# add a dense layer
tf.keras.layers.Dense(32, activation=tf.keras.activations.relu),
tf.keras.layers.Dense(32, activation=tf.keras.activations.relu),
tf.keras.layers.Dense(32, activation=tf.keras.activations.relu),
tf.keras.layers.Dense(32, activation=tf.keras.activations.softmax),
# add the prediction layer
tf.keras.layers.Dense(1, activation=tf.keras.activations.sigmoid),
])
model.compile(loss=tf.keras.losses.BinaryCrossentropy(), optimizer=tf.keras.optimizers.Adam(), metrics=['accuracy'])
model.summary()
history = model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs = 10, batch_size=batch_size, callbacks = [callback_func], verbose=1)

The labels are one-hot encoded with pd.get_dummies, so Y has shape (N, 2). The prediction layer therefore needs 2 output units:
# add the prediction layer
tf.keras.layers.Dense(2, activation=tf.keras.activations.sigmoid)
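Also flatten the Bi-LSTM output before the dense layers, because return_sequences=True returns a 3-D tensor with one output per time step: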
#model creation
model=tf.keras.Sequential([
# add an embedding layer
tf.keras.layers.Embedding(word_count, 16, input_length=max_len),
tf.keras.layers.Dropout(0.2),
# add another bi-lstm layer
tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(2,return_sequences=True)),
# add flatten
tf.keras.layers.Flatten(), #<========================
# add a dense layer
tf.keras.layers.Dense(32, activation=tf.keras.activations.relu),
tf.keras.layers.Dense(32, activation=tf.keras.activations.relu),
tf.keras.layers.Dense(32, activation=tf.keras.activations.relu),
tf.keras.layers.Dense(32, activation=tf.keras.activations.softmax),
# add the prediction layer
tf.keras.layers.Dense(2, activation=tf.keras.activations.sigmoid),
])
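As a rough illustration of why both changes matter, the NumPy sketch below mimics how a dense layer acts only on the last axis of its input. The sizes (a batch of 4, a max_len of 64, 4 Bi-LSTM features) and the random weights are assumptions chosen for illustration, not values taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: LSTM(2) run in both directions gives 4 features per step
batch, max_len, features = 4, 64, 4

# Bi-LSTM output with return_sequences=True: one feature vector per time step
lstm_out = rng.normal(size=(batch, max_len, features))

# A dense layer multiplies the last axis only, so the time axis survives
w_single = rng.normal(size=(features, 1))
per_step = lstm_out @ w_single
print(per_step.shape)  # (4, 64, 1)

# Flattening first merges time steps and features into one vector per sample
flat = lstm_out.reshape(batch, -1)
w_two = rng.normal(size=(max_len * features, 2))
per_sample = flat @ w_two
print(per_sample.shape)  # (4, 2)

# One-hot labels for two classes, comparable to what pd.get_dummies produces
labels = np.array([0, 1, 1, 0])
one_hot = np.eye(2)[labels]
print(one_hot.shape)  # (4, 2)
```

Only the flattened path produces an output whose shape matches the one-hot labels. An alternative that should also work is to keep a single sigmoid unit and pass the integer labels directly instead of one-hot encoding them.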

Related

Tensorflow neural network does not work, incompatible types

This is my code:
X_train, Y_train, X_test, Y_test = load_data(DATA_PATH)
model = keras.Sequential([
# input layer
# 1st dense layer
keras.layers.Dense(256, activation='relu', input_shape=(X_train.shape[1], X_train.shape[2], X_train.shape[3])),
# 2nd dense layer
keras.layers.Dense(128, activation='relu'),
# 3rd dense layer
keras.layers.Dense(64, activation='relu'),
# output layer
keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])
model.summary()
classifier = model.fit(X_train,
Y_train,
epochs=100,
batch_size=128)
X_train, Y_train, X_test and Y_test are NumPy arrays. X_train contains 800 and X_test contains 200 .png pictures of size 128x128.
Y_train contains 800 labels (80x1, 80x2, etc.) and Y_test contains the testing targets (20x1, 20x2, etc.).
When I try to run this program I get the following error:
ValueError: Shapes (None, 1) and (None, 128, 128, 10) are incompatible
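You need to reshape your input so that each image becomes a single flat vector before it reaches the first Dense layer. A Dense layer only acts on the last axis, so with 4-D image input the output stays (None, 128, 128, 10).
Here is a running example: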
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
X_train = tf.random.normal(shape=(800,128,128,3))
X_test = tf.random.normal(shape=(200,128,128,3))
Y_train = tf.random.normal(shape=(800,10))
Y_test = tf.random.normal(shape=(200,10))
#reshape
X_train = tf.reshape(X_train, shape=(800, 128*128*3))
model = keras.Sequential([
# input layer
# 1st dense layer
# input_shape excludes the batch axis, so only the flattened feature size is given
keras.layers.Dense(256, activation='relu', input_shape=(X_train.shape[1],)),
# 2nd dense layer
keras.layers.Dense(128, activation='relu'),
# 3rd dense layer
keras.layers.Dense(64, activation='relu'),
# output layer
keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])
model.summary()
classifier = model.fit(X_train,
Y_train,
epochs=100,
batch_size=128)
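The shapes involved can be checked without building a model. The sketch below is a NumPy-only illustration with assumed sizes (8 images, 3 channels, 10 classes). It shows the flattened input shape and how integer labels of shape (N, 1) can be turned into the one-hot shape (N, 10) that categorical_crossentropy expects.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small assumed sizes so the example stays cheap to run
num_images, num_classes = 8, 10

# Images as a 4-D array: (samples, height, width, channels)
images = rng.normal(size=(num_images, 128, 128, 3))

# Flatten every image into one vector; -1 lets NumPy compute 128*128*3
flat_images = images.reshape(num_images, -1)
print(flat_images.shape)  # (8, 49152)

# Integer class labels with shape (N, 1), as in the error message
int_labels = rng.integers(0, num_classes, size=(num_images, 1))

# One-hot encode them so they match a 10-unit softmax output
one_hot = np.eye(num_classes)[int_labels[:, 0]]
print(one_hot.shape)  # (8, 10)

# The per-sample shape to pass as input_shape (batch axis dropped)
input_shape = flat_images.shape[1:]
print(input_shape)  # (49152,)
```

If the labels are kept as integers instead, sparse_categorical_crossentropy is the loss that matches them.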

Tensorflow Keras Shape mismatch

While trying to implement a standard MNIST digit recognizer that many tutorials use to introduce you to neural networks, I'm encountering the error
ValueError: Shape mismatch: The shape of labels (received (1,)) should equal the shape of logits except for the last dimension (received (28, 10)).
I would like to use from_tensor_slices to process the data, since I want to apply the code to another problem where the data comes from a CSV file. Anyway, here is the code producing the error in the line model.fit(...)
import tensorflow as tf
train_dataset, test_dataset = tf.keras.datasets.mnist.load_data()
train_images, train_labels = train_dataset
train_images = train_images/255.0
train_dataset_tensor = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
num_of_validation_data = 10000
validation_data = train_dataset_tensor.take(num_of_validation_data)
train_data = train_dataset_tensor.skip(num_of_validation_data)
model = tf.keras.Sequential([
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(100, activation='sigmoid'),
tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(
optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['accuracy']
)
model.fit(train_data, batch_size=50, epochs=5)
performance = model.evaluate(validation_data)
I don't understand where the shape (28, 10) of the logits comes from. I thought I was flattening the image, essentially making a 1D vector out of the 2D image. How can I prevent the error?
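The dataset built with from_tensor_slices is never batched, so each 28x28 image is passed to the model as if it were a batch of 28 rows, which is where the (28, 10) logits come from. Batch the dataset with .batch(...) before fitting, for example: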
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
x_train = x_train[..., tf.newaxis]
x_test = x_test[..., tf.newaxis]
train_ds = tf.data.Dataset.from_tensor_slices(
(x_train, y_train)).shuffle(10000).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(100, activation='sigmoid'),
tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(
optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
loss=tf.keras.losses.SparseCategoricalCrossentropy(),  # the last layer already applies softmax, so from_logits stays False
metrics=['accuracy']
)
model.fit(train_ds)
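To see where the (28, 10) in the error message comes from, here is a small NumPy sketch that imitates a Flatten layer followed by a 10-unit dense layer. The helper functions and random weights are hypothetical stand-ins written for this illustration, not Keras internals.

```python
import numpy as np

rng = np.random.default_rng(0)


def flatten_keep_first_axis(x):
    """Mimic a Flatten layer: keep axis 0 as the batch axis, merge the rest."""
    return x.reshape(x.shape[0], -1)


def dense(x, units):
    """Toy dense layer with random weights, acting on the last axis."""
    w = rng.normal(size=(x.shape[-1], units))
    return x @ w


# Unbatched element: the 28 image rows are mistaken for 28 samples
single_image = rng.normal(size=(28, 28))
unbatched_out = dense(flatten_keep_first_axis(single_image), 10)
print(unbatched_out.shape)  # (28, 10)

# Batched elements: axis 0 really is the batch, each image becomes 784 values
batch_of_images = rng.normal(size=(32, 28, 28))
batched_out = dense(flatten_keep_first_axis(batch_of_images), 10)
print(batched_out.shape)  # (32, 10)
```

With batching, the output has one row per image, which lines up with one label per image.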

Error when checking target: expected dense_18 to have shape (1,) but got array with shape (10,)

Y_train = to_categorical(Y_train, num_classes = 10)#
random_seed = 2
X_train,X_val,Y_train,Y_val = train_test_split(X_train, Y_train, test_size = 0.1, random_state=random_seed)
Y_train.shape
model = Sequential()
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy',metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size = 86, epochs = 3,validation_data = (X_val, Y_val), verbose =2)
I have to classify the MNIST data into 10 classes. I am converting Y_train into a one-hot encoded array. I have gone through a number of answers, but none have helped. Kindly guide me in this regard, as I am a novice in ML and neural networks.
It seems there is no need to use model.add(Flatten()) as your first layer. Instead, you can use a dense layer with a specific input size, like: model.add(Dense(64, input_shape=your_input_shape, activation="relu")).
To check whether the issue comes from the layers, you can test the to_categorical() function on its own in a Jupyter notebook.
Updated Answer
Before building the model, you should reshape your input data, in this case from 28*28 to 784.
train_images = train_images.reshape((-1, 784))
test_images = test_images.reshape((-1, 784))
I also suggest normalizing the data, which can be done by simply dividing the images by 255.
After that step you should create your model.
model = Sequential([
Dense(64, activation='relu', input_shape=(784,)),
Dense(64, activation='relu'),
Dense(10, activation='softmax'),
])
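Notice input_shape=(784,): that is the shape of your flattened input.
The last step is compiling and fitting. Note that the loss is categorical_crossentropy, which expects the one-hot encoded labels produced by to_categorical.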
model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'],
)
model.fit(
train_images,
train_labels,
epochs=10,
batch_size=16,
)
In your code, you flattened the input without telling the network its input shape, which is why you ran into this issue. The point is that you should manually reshape your inputs and feed them to the Dense() layers with the input_shape parameter.
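The error in the title (expected shape (1,) but got (10,)) is also consistent with pairing sparse_categorical_crossentropy with one-hot labels. The sketch below uses plain NumPy re-implementations of the two losses, written only for illustration and not taken from Keras. They give the same value when each loss receives the label format it expects.

```python
import numpy as np


def sparse_categorical_crossentropy(probs, int_labels):
    """Loss for integer labels: pick the probability of the true class."""
    return -np.log(probs[np.arange(len(int_labels)), int_labels])


def categorical_crossentropy(probs, one_hot_labels):
    """Loss for one-hot labels: sum over classes, weighted by the label."""
    return -np.sum(one_hot_labels * np.log(probs), axis=-1)


# Made-up softmax outputs for two samples and three classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])

int_labels = np.array([0, 1])           # shape (2,)
one_hot_labels = np.eye(3)[int_labels]  # shape (2, 3)

sparse_loss = sparse_categorical_crossentropy(probs, int_labels)
dense_loss = categorical_crossentropy(probs, one_hot_labels)
print(np.allclose(sparse_loss, dense_loss))  # True
```

So either keep the integer labels and use the sparse loss, or one-hot encode them and use categorical_crossentropy, but do not mix the two.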

Problems with input shapes at finetuning VGG with Keras

I'm trying to fine-tune the last layer of VGG-16. Here is the part of the code where I build the new model:
def train2false(model):
    for layer in model.layers:
        layer.trainable = False
    return model
def define_training_layers(model):
    model.layers = model.layers[0:21]
    model = train2false(model)
    last_layer = model.get_layer('fc7')
    out = Dense(n_classes, activation='softmax', name='fc8')(last_layer)
    model = Model(input=model.input, output=out)
    return model
def compile_model(epochs, lrate, model):
    decay = lrate / epochs
    sgd = SGD(lr=lrate, momentum=0, decay=0.0002, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    print (model.summary())
    return model
def train_evaluate(model, X_train, y_train, X_test, y_test, epochs):
    model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=epochs, batch_size=32)
    # Final evaluation of the model
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Accuracy: %.2f%%" % (scores[1]*100))
    return model
X_train, X_test, labels_test, labels_train, n_classes = load_dataset()
image_input = Input(shape=(3, 224, 224))
vgg_model = VGGFace(input_tensor= image_input, include_top=True)
custom_vgg_model = define_training_layers(vgg_model)
custom_vgg_model = compile_model(epochs=50, lrate=0.001, model=custom_vgg_model)
custom_vgg_model = train_evaluate(custom_vgg_model, X_train=X_train, y_train=labels_train, X_test=X_test, y_test=labels_test, epochs=50)
I get the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError:
Dimension 1 in both shapes must be equal, but are 1000 and 2622 for
'Assign_30' (op: 'Assign') with input shapes: [4096,1000],
[4096,2622].
It works for me if I fine-tune the whole fully connected part with include_top=False instead of just the softmax layer.
Is there something that I'm missing?
Solved! I took the pre-trained weights from https://github.com/rcmalli/keras-vggface/releases/download/v1.0/rcmalli_vggface_th_weights_th_ordering.h5, which have 2622 outputs, while my model had 1000 outputs. So just change the number of outputs of the last layer in VGG.py to match.
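A mismatch like this can be spotted before loading by comparing the shapes the model expects with the shapes stored in the weights file. The helper below is a hypothetical, pure-Python sketch. The layer names and the fc7 shape are assumptions, and only the two shapes [4096,1000] and [4096,2622] come from the error message.

```python
def find_shape_mismatches(model_shapes, weight_shapes):
    """Return {layer: (model_shape, weight_shape)} for layers whose shapes differ."""
    return {
        name: (model_shapes[name], weight_shapes[name])
        for name in model_shapes
        if name in weight_shapes and model_shapes[name] != weight_shapes[name]
    }


# Hypothetical shape tables; in practice these would be read from the
# model definition and from the weights file
model_shapes = {"fc7": (4096, 4096), "fc8": (4096, 1000)}
weight_shapes = {"fc7": (4096, 4096), "fc8": (4096, 2622)}

mismatches = find_shape_mismatches(model_shapes, weight_shapes)
print(mismatches)  # {'fc8': ((4096, 1000), (4096, 2622))}
```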

input_shape parameter mismatch error in Convolution1D in keras

I want to classify a dataset using Convolution1D in Keras.
DataSet Description:
train dataset size = [340,30] ; no of sample = 340 , sample dimension = 30
test dataset size = [230,30] ; no of sample = 230 , sample dimension = 30
label size = 2
First I tried the following code, using the information from the Keras site https://keras.io/layers/convolutional/
batch_size=1
nb_epoch = 10
sizeX=340
sizeY=30
model = Sequential()
model.add(Convolution1D(64, 3, border_mode='same', input_shape=(sizeX,sizeY)))
model.add(Convolution1D(32, 3, border_mode='same'))
model.add(Convolution1D(16, 3, border_mode='same'))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
print('Train...')
model.fit(X_train_transformed, y_train, batch_size=batch_size, nb_epoch=nb_epoch,
validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test_transformed, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)
It gives the following error:
ValueError: Error when checking model input: expected convolution1d_input_1 to have 3 dimensions, but got array with shape (340, 30)
Then I transformed the train and test data from 2 dimensions into 3 dimensions using the following code:
X_train = np.reshape(X_train_transformed, (X_train_transformed.shape[0], X_train_transformed.shape[1], 1))
X_test = np.reshape(X_test_transformed, (X_test_transformed.shape[0], X_test_transformed.shape[1], 1))
Then I ran the following modified code:
batch_size=1
nb_epoch = 10
sizeX=340
sizeY=30
model = Sequential()
model.add(Convolution1D(64, 3, border_mode='same', input_shape=(sizeX,sizeY)))
model.add(Convolution1D(32, 3, border_mode='same'))
model.add(Convolution1D(16, 3, border_mode='same'))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
print('Train...')
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=nb_epoch,
validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)
But it shows the error:
ValueError: Error when checking model input: expected convolution1d_input_1 to have shape (None, 340, 30) but got array with shape (340, 30, 1)
I am unable to find the dimension mismatch here.
With the release of TF 2.0 and tf.keras, you can fairly easily update your model to work with these new versions. This can be done with the following code:
# import tensorflow 2.0
# keras doesn't need to be imported because it is built into tensorflow
from __future__ import absolute_import, division, print_function, unicode_literals
try:
    # %tensorflow_version only exists in Colab
    %tensorflow_version 2.x
except Exception:
    pass
import numpy as np
import tensorflow as tf
batch_size = 1
nb_epoch = 10
# the model only needs the size of the sample as input, explained further below
size = 30
# reshape as you had before
X_train = np.reshape(X_train_transformed, (X_train_transformed.shape[0],
X_train_transformed.shape[1], 1))
X_test = np.reshape(X_test_transformed, (X_test_transformed.shape[0],
X_test_transformed.shape[1], 1))
# define the sequential model using tf.keras
model = tf.keras.Sequential([
# the 1d convolution layers can be defined as shown with the same
# number of filters and kernel size
# instead of border_mode, the parameter is padding
# the input_shape is (the size of each sample, 1), explained below
tf.keras.layers.Conv1D(64, 3, padding='same', input_shape=(size, 1)),
tf.keras.layers.Conv1D(32, 3, padding='same'),
tf.keras.layers.Conv1D(16, 3, padding='same'),
# Dense and Activation can be combined into one layer
# where the dense layer has 1 neuron and a sigmoid activation
tf.keras.layers.Dense(1, activation='sigmoid')
])
# the model can be compiled, fit, and evaluated in the same way
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print('Train...')
model.fit(X_train, y_train, batch_size=batch_size, epochs=nb_epoch,
validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)
The problem comes from the input shape of your model. According to the Keras documentation, the input to a Conv1D layer has to have shape (batch, steps, channels). The first dimension is the number of instances you have, the second is the size of each sample, and the third is the number of channels, which in your case is only one. Overall, your training data has shape (340, 30, 1). When you define the input shape in the model, you only need to specify the second and third dimensions, so your input shape is (size, 1). The model already expects the first dimension, the number of instances, so you do not need to specify it.
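The relationship between the data shape and input_shape can be checked with NumPy alone. This sketch uses placeholder zeros with the sizes from the question (340 samples of dimension 30). The variable names are chosen for illustration.

```python
import numpy as np

# Placeholder data with the question's sizes: 340 samples, 30 values each
X_train_2d = np.zeros((340, 30))

# Add a trailing channel axis: (batch, steps, channels)
X_train_3d = X_train_2d.reshape(X_train_2d.shape[0], X_train_2d.shape[1], 1)
print(X_train_3d.shape)  # (340, 30, 1)

# input_shape is the per-sample shape, i.e. everything after the batch axis
input_shape = X_train_3d.shape[1:]
print(input_shape)  # (30, 1)
```

Note that with padding='same' the convolutions keep the steps axis, so a final Dense(1) would produce one output per step, with shape (batch, 30, 1). Adding a Flatten or pooling layer before the Dense layer is one common way to get a single prediction per sample.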
Can you try this?
X_train = np.reshape(X_train_transformed, (1, X_train_transformed.shape[0], X_train_transformed.shape[1]))
X_test = np.reshape(X_test_transformed, (1, X_test_transformed.shape[0], X_test_transformed.shape[1]))