Should I convert classification output to integer and how? - tensorflow

I'm using a neural network to classify text, and the labels of the training data are 0 or 1 (i.e. binary classification). It works well during training and evaluation, but the prediction output is float values rather than the integers 0 or 1. How can I always get integer results? Do I need to convert them manually or change the network parameters?
model = Sequential()
e = Embedding(vocab_size, embedding_dim, weights=[embedding_matrix],
              input_length=max_length, trainable=False)
model.add(e)
model.add(Flatten())  # flatten the embedded sequence before the Dense layer
model.add(Dense(1, activation='sigmoid'))
# compile
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
# fit
model.fit(padded_docs, labels, epochs=5, verbose=2)
# eval
loss, accuracy = model.evaluate(padded_docs, labels, verbose=0)
print('Accuracy: %f' % (accuracy*100))
# predict
result = model.predict(padded_docs_test, verbose=2)

You need to manually convert them by setting a threshold, like:
threshold = 0.5
result = model.predict(padded_docs_test, verbose=2)
result = result > threshold
This will give binary predictions as booleans (True/False); cast them with .astype(int) if you need the integers 0 and 1. Keras uses a threshold of 0.5 when computing binary accuracy.
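The thresholding step can be sketched with NumPy alone. The helper name `to_binary_labels` and the probability values are made up for illustration; the array only stands in for the `(n_samples, 1)` output that `model.predict` would return for a single sigmoid unit.

```python
import numpy as np

def to_binary_labels(probabilities, threshold=0.5):
    """Map sigmoid probabilities to integer class labels 0 or 1."""
    probabilities = np.asarray(probabilities)
    # The comparison yields booleans; astype(int) turns them into 0/1.
    return (probabilities > threshold).astype(int)

# Hypothetical stand-in for the output of model.predict (values are made up).
example_probs = np.array([[0.12], [0.50], [0.93]])
example_labels = to_binary_labels(example_probs)
```

Note that a probability exactly equal to the threshold maps to 0 here because the comparison is strict; use `>=` if you prefer the other convention.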


Keras - Trying to get 'logits' - one layer before the softmax activation function

I'm trying to get the 'logits' out of my Keras CNN classifier.
I have tried the suggested method here: link.
First I created two models to check the implementation:
1. create_CNN_MNIST: a CNN classifier that returns the softmax probabilities.
2. create_CNN_MNIST_logits: a CNN with the same layers as in (1), with a little twist in the last layer: the activation function is changed to linear so that it returns logits.
Both models were fed the same MNIST train and test data. When I then applied softmax to the logits, I got a different output from the softmax CNN.
I couldn't find a problem in my code. Maybe you could advise another method to extract 'logits' from the model?
the code:
def softmax(x):
    """Compute softmax values for each sets of scores in x."""
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0)
def create_CNN_MNIST_logits():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())  # flatten the feature maps so the Dense layers output [batch, units]
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='linear'))
    # compile model
    opt = SGD(learning_rate=0.01, momentum=0.9)
    def my_sparse_categorical_crossentropy(y_true, y_pred):
        return keras.losses.categorical_crossentropy(y_true, y_pred, from_logits=True)
    model.compile(optimizer=opt, loss=my_sparse_categorical_crossentropy, metrics=['accuracy'])
    return model
def create_CNN_MNIST():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())  # flatten the feature maps so the Dense layers output [batch, units]
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='softmax'))
    # compile model
    opt = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
# load data
X_train = np.load('./data/X_train.npy')
X_test = np.load('./data/X_test.npy')
y_train = np.load('./data/y_train.npy')
y_test = np.load('./data/y_test.npy')
#create models
model_softmax = create_CNN_MNIST()
model_logits = create_CNN_MNIST_logits()
pixels = 28
channels = 1
num_labels = 10
# Reshaping to format which CNN expects (batch, height, width, channels)
trainX_cnn = X_train.reshape(X_train.shape[0], pixels, pixels, channels).astype('float32')
testX_cnn = X_test.reshape(X_test.shape[0], pixels, pixels, channels).astype('float32')
# Normalize images from 0-255 to 0-1
trainX_cnn /= 255
testX_cnn /= 255
train_y_cnn = utils.to_categorical(y_train, num_labels)
test_y_cnn = utils.to_categorical(y_test, num_labels)
# train the models:
model_softmax.fit(trainX_cnn, train_y_cnn, validation_split=0.2, epochs=10,
                  batch_size=32)
model_logits.fit(trainX_cnn, train_y_cnn, validation_split=0.2, epochs=10,
                 batch_size=32)
At the evaluation stage, I apply softmax to the logits to check whether the result is the same as the regular model's output:
y_pred_softmax = model_softmax.predict(testX_cnn)
y_pred_logits = model_logits.predict(testX_cnn)
#apply softmax on the logits to get the same result of regular CNN
y_pred_logits_activated = softmax(y_pred_logits)
Now I get different values in both y_pred_logits_activated and y_pred_softmax that lead to different accuracy on the test set.
Your models are probably being trained differently. Make sure to set the same random seed before creating and fitting each model, so that they are initialised with the same weights and use the same train/validation split. Also, your softmax might be incorrect:
def softmax(x):
    """Compute softmax values for each set of scores in x."""
    e_x = np.exp(x)
    # keepdims=True keeps the row sums as a column, so the division broadcasts per row
    return e_x / e_x.sum(axis=1, keepdims=True)
This is mathematically equivalent to subtracting the max (which only helps numerical stability), and the axis should be 1 if your matrix is of shape [batch, outputs].
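A row-wise softmax that also keeps the max subtraction for numerical stability could look like the sketch below. The function name `softmax_rows` and the example logits are illustrative, and the input is assumed to have shape [batch, outputs].

```python
import numpy as np

def softmax_rows(logits):
    """Row-wise softmax for an array of shape (batch, outputs)."""
    logits = np.asarray(logits, dtype=float)
    # Subtract each row's own max: the result is unchanged, but exp() cannot overflow.
    shifted = logits - logits.max(axis=1, keepdims=True)
    e_x = np.exp(shifted)
    # keepdims=True makes the row sums a column so the division is done per row.
    return e_x / e_x.sum(axis=1, keepdims=True)

# Made-up logits; the second row would overflow np.exp without the shift.
example_logits = np.array([[1.0, 2.0, 3.0], [1000.0, 1000.0, 1000.0]])
example_probs = softmax_rows(example_logits)
```

Each row of `example_probs` sums to 1, which is the property lost when the sum is taken over axis 0 for a [batch, outputs] matrix.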

Weighted custom loss

I want to write the model using a custom loss, something like loss = sum(abs(y_true-y_pred)*returns), where returns is a vector containing different weights for each prediction (different vector for each dataset, and so different for training and test sets).
Here is my network using categorical_crossentropy loss. How can I change it to use the custom loss?
model = Sequential()
model.add(Dense(2, activation='relu',input_shape=(X_train.shape[1],)) )
model.add(Dense(2, activation='relu', input_shape=(X_train.shape[1],)))
model.add(Dense(3, activation='softmax'))
callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=50)
model.compile(loss='categorical_crossentropy', optimizer='RMSprop', metrics=['AUC'])
history = model.fit(X_train, y_train, epochs=epochs, batch_size=32, validation_data=(X_test, y_test), callbacks=[callback])
You need to declare a custom loss function:
def my_loss(y_true, y_pred):
    return someFunc(y_true, y_pred)
And then pass it to model.compile:
model.compile(loss=my_loss, optimizer='RMSprop', metrics=['AUC'])
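As a rough illustration of the formula from the question, loss = sum(abs(y_true - y_pred) * returns), here is a plain NumPy sketch. The function name `weighted_abs_error` and all values are made up. Inside a Keras loss the same expression would have to be written with tensor operations, and because a loss function only receives y_true and y_pred, the weights need to reach it some other way; one option that may fit is the `sample_weight` argument of `model.fit`.

```python
import numpy as np

def weighted_abs_error(y_true, y_pred, returns):
    """Sum of absolute errors, each scaled by its own weight from `returns`."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    returns = np.asarray(returns, dtype=float)
    return float(np.sum(np.abs(y_true - y_pred) * returns))

# Made-up targets, predictions and per-prediction weights.
example_true = np.array([1.0, 0.0, 1.0])
example_pred = np.array([0.5, 0.5, 1.0])
example_returns = np.array([2.0, 4.0, 10.0])
example_loss = weighted_abs_error(example_true, example_pred, example_returns)
```

With these numbers the loss is 0.5 * 2 + 0.5 * 4 + 0 * 10 = 3.0, so the third prediction contributes nothing despite its large weight because it is exact.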

How to improve the model's accuracy?

I am a complete beginner with TensorFlow. I tried to create a simple model, but the accuracy is very low. Can someone help me figure out what is wrong?
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
train_x = [[i, j] for i in range(1000) for j in range(1000)]
train_y = [[(2 * i + 3 * j) % 10] for i in range(1000) for j in range(1000)]
model = Sequential()
model.add(Dense(32, activation="relu", input_dim=2))
model.add(Dense(32, activation="relu"))
model.add(Dense(10, activation="softmax"))
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy", metrics=['accuracy'])
model.summary()
model.fit(train_x, train_y, epochs=10, batch_size=1000)
test_x = train_x[10:60]
test_y = train_y[10:60]
model.evaluate(test_x, test_y, batch_size=100)
loss: 2.3026 - accuracy: 0.1000
There are a few ways to improve your model's accuracy:
1. Reduce the batch size (you are using 1000 samples per batch).
2. Increase the number of layers and units.
3. Increase the number of epochs.
4. Use unseen data to evaluate the model (you are evaluating on 50 elements of the training data).
I suggest reading Deep Learning with Python by Francois Chollet, section 3.5, "A multiclass classification example".
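The point about evaluating on unseen data can be sketched with a simple shuffled hold-out split in NumPy. The grid is shrunk to 100 x 100 (an assumption to keep the sketch fast), and the 80/20 ratio and the seed are arbitrary choices, not something the answer prescribes.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the split is reproducible

n = 100  # smaller than the 1000 x 1000 grid in the question
features = np.array([[i, j] for i in range(n) for j in range(n)])
labels = np.array([(2 * i + 3 * j) % 10 for i in range(n) for j in range(n)])

# Shuffle the indices once, then hold out the first 20% as a test set.
order = rng.permutation(len(features))
n_test = len(features) // 5
test_idx, train_idx = order[:n_test], order[n_test:]

train_x, train_y = features[train_idx], labels[train_idx]
test_x, test_y = features[test_idx], labels[test_idx]
```

The model would then be fitted on `train_x`/`train_y` only and evaluated on `test_x`/`test_y`, so the reported accuracy reflects samples it has never seen.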

Keras layer shape incompatibility for a small MLP

I have a simple MLP built in Keras. The shapes of my inputs are:
X_train.shape - (6, 5)
Y_train.shape - 6
# Create the model
model = Sequential()
model.add(Dense(32, input_shape=(X_train.shape[0],), activation='relu'))
model.add(Dense(Y_train.shape[0], activation='softmax'))
# Compile and fit
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, Y_train, epochs=10, batch_size=1, verbose=1, validation_split=0.2)
# Get output vector from softmax
output = model.layers[-1].output
This gives me the error:
ValueError: Error when checking input: expected dense_1_input to have shape (6,) but got array with shape (5,).
I have two questions:
1. Why do I get the above error and how can I solve it?
2. Is output = model.layers[-1].output the way to return the softmax vector for a given input vector? I haven't ever done this in Keras.
1. In the input layer, use input_shape=(X_train.shape[1],): the input shape is the number of features per sample, not the number of samples. The last layer must have as many units as there are classes to predict.
2. The way to return the softmax vector is model.predict(X).
Here is a complete example:
n_sample = 5
n_class = 2
X = np.random.uniform(0,1, (n_sample,6))
y = np.random.randint(0,n_class, n_sample)
model = Sequential()
model.add(Dense(32, input_shape=(X.shape[1],), activation='relu'))
model.add(Dense(n_class, activation='softmax'))
# Compile and fit
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=10, batch_size=1, verbose=1)
# Get output vector from softmax
model.predict(X)
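To see why the feature count (shape[1]) rather than the sample count (shape[0]) has to be used as the input size, here is a small NumPy sketch of what a dense layer computes. The names `X_demo` and `W_demo` and the random values are illustrative only, and the bias term is left out for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 1, size=(6, 5))   # 6 samples, 5 features each
n_samples, n_features = X_demo.shape

# A dense layer with 32 units multiplies by a kernel of shape (n_features, 32).
W_demo = rng.normal(size=(n_features, 32))
hidden = np.maximum(X_demo @ W_demo, 0)   # ReLU activation

# Sizing the kernel by the number of samples instead cannot be multiplied.
try:
    X_demo @ rng.normal(size=(n_samples, 32))
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```

The first product works for any number of samples, while the second fails because each sample has 5 values and the kernel expects 6, which mirrors the error message in the question.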

Error when checking target: expected dense_18 to have shape (1,) but got array with shape (10,)

Y_train = to_categorical(Y_train, num_classes = 10)
random_seed = 2
X_train,X_val,Y_train,Y_val = train_test_split(X_train, Y_train, test_size = 0.1, random_state=random_seed)
model = Sequential()
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size = 86, epochs = 3, validation_data = (X_val, Y_val), verbose = 2)
I have to classify the MNIST data into 10 classes. I am converting the Y_train into one hot encoded array. I have gone through a number of answers but none have helped. Kindly guide me in this regard as I am a novice in ML and neural network.
It seems there is no need to use model.add(Flatten()) as your first layer. Instead, you can use a Dense layer with an explicit input size, like: model.add(Dense(64, input_shape=your_input_shape, activation="relu")).
To check whether the issue really comes from the layers, you can run the to_categorical() call on its own in a Jupyter notebook.
Updated Answer
Before building the model, you should reshape your data, in this case from 28*28 to 784.
train_images = train_images.reshape((-1, 784))
test_images = test_images.reshape((-1, 784))
I also suggest normalizing the data, which can be done simply by dividing the images by 255.
After that step you should create your model.
model = Sequential([
    Dense(64, activation='relu', input_shape=(784,)),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax'),
])
Notice input_shape=(784,): that is the shape of your flattened input.
Last step, compiling and fitting.
What you did was flatten the input layer without giving the network an input shape, which is why you ran into the issue. The point is that you should manually reshape your inputs and feed them to the Dense() layers with the input_shape parameter.
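The reshaping and normalization steps can be sketched in NumPy as below, together with a look at the two label encodings. The arrays are made-up stand-ins for MNIST. One further thing that may be worth checking, as an observation rather than part of the answer above: the error in the title (expected shape (1,) but got (10,)) is consistent with one-hot labels being passed to sparse_categorical_crossentropy, which expects integer class indices, whereas categorical_crossentropy expects one-hot vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up stand-ins for MNIST: 4 images of 28 x 28 pixels with values 0..255.
images = rng.integers(0, 256, size=(4, 28, 28)).astype("float32")
int_labels = np.array([3, 0, 9, 3])

# Flatten each 28 x 28 image into a vector of 784 values and scale to [0, 1].
flat_images = images.reshape((-1, 784)) / 255.0

# One-hot encoding: one row of length 10 per label (the kind of array to_categorical returns).
one_hot = np.eye(10, dtype="float32")[int_labels]

# argmax turns one-hot rows back into integer class indices.
recovered = one_hot.argmax(axis=1)
```

Here `one_hot` has shape (4, 10) while `int_labels` has shape (4,), so whichever encoding is used has to match the loss the model is compiled with.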