Keras - Trying to get 'logits' - one layer before the softmax activation function - tensorflow

I'm trying to get the 'logits' out of my Keras CNN classifier.
I have tried the suggested method here: link.
First I created two models to check the implementation :
create_CNN_MNIST CNN classifier that returns the softmax probabilities.
create_CNN_MNIST_logits CNN with the same layers as in (1) with a little twist in the last layer - changed the activation function to linear to return logits.
Both models were fed with the same Train and Test data of MNIST. Then I applied softmax on the logits, I got a different output from the softmax CNN.
I couldn't find a problem in my code. Maybe you could help advise another method to extract 'logits' from the model?
the code:
def softmax(x):
"""Compute softmax values for each sets of scores in x."""
e_x = np.exp(x - np.max(x))
return e_x / e_x.sum(axis=0)
def create_CNN_MNIST_logits() :
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(10, activation='linear'))
# compile model
opt = SGD(learning_rate=0.01, momentum=0.9)
def my_sparse_categorical_crossentropy(y_true, y_pred):
return keras.losses.categorical_crossentropy(y_true, y_pred, from_logits=True)
model.compile(optimizer=opt, loss=my_sparse_categorical_crossentropy, metrics=['accuracy'])
return model
def create_CNN_MNIST() :
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(10, activation='softmax'))
# compile model
opt = SGD(learning_rate=0.01, momentum=0.9)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
return model
# load data
X_train = np.load('./data/X_train.npy')
X_test = np.load('./data/X_test.npy')
y_train = np.load('./data/y_train.npy')
y_test = np.load('./data/y_test.npy')
#create models
model_softmax = create_CNN_MNIST()
model_logits = create_CNN_MNIST_logits()
pixels = 28
channels = 1
num_labels = 10
# Reshaping to format which CNN expects (batch, height, width, channels)
trainX_cnn = X_train.reshape(X_train.shape[0], pixels, pixels, channels).astype('float32')
testX_cnn = X_test.reshape(X_test.shape[0], pixels, pixels, channels).astype('float32')
# Normalize images from 0-255 to 0-1
trainX_cnn /= 255
testX_cnn /= 255
train_y_cnn = utils.to_categorical(y_train, num_labels)
test_y_cnn = utils.to_categorical(y_test, num_labels)
#train the models:
model_logits.fit(trainX_cnn, train_y_cnn, validation_split=0.2, epochs=10,
batch_size=32)
model_softmax.fit(trainX_cnn, train_y_cnn, validation_split=0.2, epochs=10,
batch_size=32)
On the evaluation stage, I'll do softmax on the logits to check if its the same as the regular model:
#predict
y_pred_softmax = model_softmax.predict(testX_cnn)
y_pred_logits = model_logits.predict(testX_cnn)
#apply softmax on the logits to get the same result of regular CNN
y_pred_logits_activated = softmax(y_pred_logits)
Now I get different values in both y_pred_logits_activated and y_pred_softmax that lead to different accuracy on the test set.

Your models are probably being trained differently, make sure to set the seed prior to both fit commands so that they're initialised the same weights and have the same train/val split. Also, is the softmax might be incorrect:
def softmax(x):
"""Compute softmax values for each sets of scores in x."""
e_x = np.exp(x)
return e_x / e_x.sum(axis=1)
This is numerically equivalent to subtracting the max (https://stackoverflow.com/a/34969389/10475762), and the axis should be 1 if your matrix is of shape [batch, outputs].

Related

Keras layer shape incompatibility for a small MLP

I have a simple MLP built in Keras. The shapes of my inputs are:
X_train.shape - (6, 5)
Y_train.shape - 6
Create the model
model = Sequential()
model.add(Dense(32, input_shape=(X_train.shape[0],), activation='relu'))
model.add(Dense(Y_train.shape[0], activation='softmax'))
# Compile and fit
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, Y_train, epochs=10, batch_size=1, verbose=1, validation_split=0.2)
# Get output vector from softmax
output = model.layers[-1].output
This gives me the error:
ValueError: Error when checking input: expected dense_1_input to have shape (6,) but got array with shape (5,).
I have two questions:
Why do I get the above error and how can I solve it?
Is output = model.layers[-1].output the way to return the softmax vector for a given input vector? I haven't ever done this in Keras.
in the input layer use input_shape=(X_train.shape[1],) while your last layer has to be a dimension equal to the number of classes to predict
the way to return the softmax vector is model.predict(X)
here a complete example
n_sample = 5
n_class = 2
X = np.random.uniform(0,1, (n_sample,6))
y = np.random.randint(0,n_class, n_sample)
model = Sequential()
model.add(Dense(32, input_shape=(X.shape[1],), activation='relu'))
model.add(Dense(n_class, activation='softmax'))
# Compile and fit
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=10, batch_size=1, verbose=1)
# Get output vector from softmax
model.predict(X)

VGG16 training with colab reutrns WARNING:root:kernel

im still beginner with DL, here im trying to train VGG16 on a list of images to return 2048 features, but my issue was it returns 4094 features instead of 2048, so what i did to solve the issue is to create sequential model and remove one layer to be 2048 but will during the training on google cloab it returns kernel issue as below :
WARNING:root:kernel d3b89e05-3e27-4035-afcd-79a2c35a319a restarted
i have restarted the runtime and make a factorial restart, but still issue exists maybe because I did something wrong I'm my code, this issue happened only in the last code block
def read_images(folder_path, classlbl):
# load all images into a list
images = []
img_width, img_height = 224, 224
class1=[]
for img in os.listdir(folder_path):
img = os.path.join(folder_path, img)
img = load_img(img, target_size=(img_width, img_height))
class1.append(classlbl)# class one.
images.append(img)
return images, class1
#compute features for each image.
def computefeatures(model,image):
# convert the image pixels to a numpy array
image = img_to_array(image)
# reshape data for the model
image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))
# prepare the image for the VGG model
image = preprocess_input(image)
# get extracted features
features = model.predict(image)
return features
model = Sequential()
# second convolutional block
model.add(Conv2D(128, (3,3), strides=(1,1), padding="same", activation="relu"))
model.add(Conv2D(128,(2,2), strides=(1,1), padding="same",activation="relu"))
model.add(MaxPooling2D((2, 2), strides=(2,2)))
# third convolutional block
model.add(Conv2D(256, (3,3), strides=(1,1), padding="same", activation="relu"))
model.add(Conv2D(256,(3,3), strides=(1,1), padding="same",activation="relu"))
model.add(Conv2D(256,(3,3), strides=(1,1), padding="same",activation="relu"))
model.add(MaxPooling2D((2, 2), strides=(2,2)))
# third convolutional block
model.add(Conv2D(512, (3,3), strides=(1,1), padding="same", activation="relu"))
model.add(Conv2D(512,(3,3), strides=(1,1), padding="same",activation="relu"))
model.add(Conv2D(512,(3,3), strides=(1,1), padding="same",activation="relu"))
model.add(MaxPooling2D((2, 2), strides=(2,2)))
# third convolutional block
model.add(Conv2D(512, (3,3), strides=(1,1), padding="same", activation="relu"))
model.add(Conv2D(512,(3,3), strides=(1,1), padding="same",activation="relu"))
model.add(Conv2D(512,(3,3), strides=(1,1), padding="same",activation="relu"))
#DNN Backend
model.add(Flatten())
model.add(Dense(4096, activation='relu'))
model.add(Dense(4096, activation='relu'))
#Output layer for classification (1000 classes)
model.add(Dense(1000, activation='softmax'))
#use the categorical cross entropy loss function for a multi-class classifier.
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
# call the image read and
folder_path = '/content/imageDir/101_ObjectCategories/windsor_chair'
classlbl=0
images, class1 =read_images(folder_path, classlbl)
# call the fucntion to compute the features for each image.
list_features1=[]
list_features1 = np.empty((0,4096), float)# create an empty array with 0 row and 4096 columns this number from fature
# extraction from vg16
for img in range(len(images)):
f2=computefeatures(model,images[img]) # compute features forea each image
list_features1 = np.append(list_features1, f2, axis=0)
One of the most popular reasons leads to restart kernel itself is out of memory (CPU RAM). It seems like because of the read_images function, you were trying to read and store too many image files at the same time.
You may want to try ImageDataGenerator API for reading images in small batches.
OR
You can simply change your read_images function as below:
def read_images(folder_path, classlbl):
# load all images into a list
images = []
img_width, img_height = 224, 224
class1=[]
for img in os.listdir(folder_path):
img = os.path.join(folder_path, img)
img = load_img(img, target_size=(img_width, img_height))
yield img, classlbl ## <--- this line
And use it like this
result =read_images(folder_path, classlbl)
# call the fucntion to compute the features for each image.
list_features1=[]
list_features1 = np.empty((0,4096), float)# create an empty array with 0 row and 4096 columns this number from fature
# extraction from vg16
for img, _ in result: # <--- change this line
f2=computefeatures(model, img) # compute features forea each image
list_features1 = np.append(list_features1, f2, axis=0)

Error when checking target: expected dense_18 to have shape (1,) but got array with shape (10,)

Y_train = to_categorical(Y_train, num_classes = 10)#
random_seed = 2
X_train,X_val,Y_train,Y_val = train_test_split(X_train, Y_train, test_size = 0.1, random_state=random_seed)
Y_train.shape
model = Sequential()
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy',metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size = 86, epochs = 3,validation_data = (X_val, Y_val), verbose =2)
I have to classify the MNIST data into 10 classes. I am converting the Y_train into one hot encoded array. I have gone through a number of answers but none have helped. Kindly guide me in this regard as I am a novice in ML and neural network.
It seems there is no need to use model.add(Flatten()) in your first layer. Instead of doing so, you can use a dense layer with a specific input size like: model.add(Dense(64, input_shape=your_input_shape, activation="relu").
To ensure this issue happens because of the layers, you can check whether to_categorical() function works alone with jupyter notebook.
Updated Answer
Before the model, you should reshape your model. In that case 28*28 to 784.
train_images = train_images.reshape((-1, 784))
test_images = test_images.reshape((-1, 784))
I also suggest to normalize the data that could be done by simply dividing the images to 255
After that step you should create your model.
model = Sequential([
Dense(64, activation='relu', input_shape=(784,)),
Dense(64, activation='relu'),
Dense(10, activation='softmax'),
])
Have you noticed input_shape=(784,) That is the shape of your flattened input.
Last step, compiling and fitting.
model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'],
)
model.fit(
train_images,
train_labels,
epochs=10,
batch_size=16,
)
What you do is you have just flattened the input layer without feeding the network with an input. That's why you experience an issue. The point is you should manually reshape your inputs and feed forward to the Dense() layers with parameter input_shape

Why all results in CNN Keras the same

I have a problem in my CNN code in Keras
after building simple neural-network architecture and run the prediction function
all predictions have the same resutls
My own data divided into two classes (healthy , unhealthy)
I thought before that the problem was the complex of nature of pictures
but when i applied code in another data contain from (White image , black image)
i found the same results
What is the cause of the problem in your opinion
My architecture CNN code
#batch_size to train
batch_size = 32
# number of output classes
nb_classes = 2
# number of epochs to train
nb_epoch = 2
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
# convolution kernel size
nb_conv = 3
HIDDEN_LAYERS = 4
sgd = SGD(lr=0.5, decay=1e-6, momentum=0.6, nesterov=True)
Y_train = np_utils.to_categorical(Y_train, nb_classes)
Y_test = np_utils.to_categorical(Y_test, nb_classes)
X_train /= 255
X_test /= 255
# first set of CONV => RELU => POOL layers
model = Sequential()
model.add(Conv2D(20, (5, 5), padding="same",
input_shape=shape_ord))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool), strides=(2, 2)))
# second set of CONV => RELU => POOL layers
model.add(Conv2D(50, (5, 5), padding="same"))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool), strides=(2, 2)))
model.add(Dropout(0.5))
# first (and only) set of FC => RELU layers
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
#model.add(Dropout(0.5))
# softmax classifier
model.add(Dense(nb_classes))
model.add(Activation(K.sigmoid))
#categorical_crossentropy
model.compile(loss="binary_crossentropy", optimizer=sgd,metrics=["accuracy"])
The results
The result of predictions
Since you have a final sigmoid layer and are using binary_crossentropy, try encoding your labels as an Nx1 vector of 0s and 1s.
The lines below are converting them to an Nx2 vector, which you would typically use with a final softmax layer.
Y_train = np_utils.to_categorical(Y_train, nb_classes)
Y_test = np_utils.to_categorical(Y_test, nb_classes)
If Y_train and Y_test are already Nx1 of 0s and 1s, try commenting the above lines out.

input_shape parameter mismatch error in Convolution1D in keras

I want to classify a dataset using Convulation1D in keras.
DataSet Description:
train dataset size = [340,30] ; no of sample = 340 , sample dimension = 30
test dataset size = [230,30] ; no of sample = 230 , sample dimension = 30
label size = 2
Fist I try by the following code using the information from keras site https://keras.io/layers/convolutional/
batch_size=1
nb_epoch = 10
sizeX=340
sizeY=30
model = Sequential()
model.add(Convolution1D(64, 3, border_mode='same', input_shape=(sizeX,sizeY)))
model.add(Convolution1D(32, 3, border_mode='same'))
model.add(Convolution1D(16, 3, border_mode='same'))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
print('Train...')
model.fit(X_train_transformed, y_train, batch_size=batch_size, nb_epoch=nb_epoch,
validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test_transformed, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)
it gives the following error ,
ValueError: Error when checking model input: expected convolution1d_input_1 to have 3 dimensions, but got array with shape (340, 30)
Then I have transformed the Train and Test data into 3 dimension from 2 dimension by using the following code ,
X_train = np.reshape(X_train_transformed, (X_train_transformed.shape[0], X_train_transformed.shape[1], 1))
X_test = np.reshape(X_test_transformed, (X_test_transformed.shape[0], X_test_transformed.shape[1], 1))
Then I run the modified following code ,
batch_size=1
nb_epoch = 10
sizeX=340
sizeY=30
model = Sequential()
model.add(Convolution1D(64, 3, border_mode='same', input_shape=(sizeX,sizeY)))
model.add(Convolution1D(32, 3, border_mode='same'))
model.add(Convolution1D(16, 3, border_mode='same'))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
print('Train...')
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=nb_epoch,
validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)
But it shows the error ,
ValueError: Error when checking model input: expected convolution1d_input_1 to have shape (None, 340, 30) but got array with shape (340, 30, 1)
I am unable to find the dimension mismatch error here.
With the release of TF 2.0 and tf.keras, you can fairly easily update your model to work with these new versions. This can be done with the following code:
# import tensorflow 2.0
# keras doesn't need to be imported because it is built into tensorflow
from __future__ import absolute_import, division, print_function, unicode_literals
try:
%tensorflow_version 2.x
except Exception:
pass
import tensorflow as tf
batch_size = 1
nb_epoch = 10
# the model only needs the size of the sample as input, explained further below
size = 30
# reshape as you had before
X_train = np.reshape(X_train_transformed, (X_train_transformed.shape[0],
X_train_transformed.shape[1], 1))
X_test = np.reshape(X_test_transformed, (X_test_transformed.shape[0],
X_test_transformed.shape[1], 1))
# define the sequential model using tf.keras
model = tf.keras.Sequential([
# the 1d convolution layers can be defined as shown with the same
# number of filters and kernel size
# instead of border_mode, the parameter is padding
# the input_shape is (the size of each sample, 1), explained below
tf.keras.layers.Conv1D(64, 3, padding='same', input_shape=(size, 1)),
tf.keras.layers.Conv1D(32, 3, padding='same'),
tf.keras.layers.Conv1D(16, 3, padding='same'),
# Dense and Activation can be combined into one layer
# where the dense layer has 1 neuron and a sigmoid activation
tf.keras.layers.Dense(1, activation='sigmoid')
])
# the model can be compiled, fit, and evaluated in the same way
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print('Train...')
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=nb_epoch,
validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)
The problem you are having comes from the input shape of your model. According to the keras documentation the input shape of the model has to be (batch, step, channels). This means that the first dimension is the number of instances you have. The second dimension is the size of each sample. The third dimension is the number of channels which in your case would only be one. Overall, your input shape would be (340, 30, 1). When you actually define the input shape in the model, you only need to specify the the second and third dimension which means your input shape would be (size, 1). The model already expects the first dimension, the number of instances you have, as input so you do not need to specify that dimension.
Can you try this?
X_train = np.reshape(X_train_transformed, (1, X_train_transformed.shape[0], X_train_transformed.shape[1]))
X_test = np.reshape(X_test_transformed, (1, X_test_transformed.shape[0], X_test_transformed.shape[1]))