Keras model classifies images as the same class - tensorflow

I am using Keras' pre-trained VGGNet16 as my base model, but I need to add layers on the end to make it work for my data. I've got the data pre-processed and formatted, so I'll jump to the part of the code actually involving the CNN.
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D, AveragePooling2D
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
import tensorflow.keras as K
.
.
.
input_t = K.Input(shape=(224,224,3))
base_model = K.applications.VGG16(include_top=False,weights="imagenet",input_tensor=input_t)
for layer in base_model.layers:
layer.trainable = False
model = K.models.Sequential()
model.add(base_model)
model.add(Flatten())
model.add(Dense(num_classes,activation='softmax'))
#Training
model.compile(loss = 'binary_crossentropy',
optimizer = Adam(learning_rate=1e-5),
metrics = ['accuracy'])
#Fit
model.fit(x_train,
y_train,
epochs = 3,
validation_split = 0.1,
validation_data =(x_test,y_test))
model.summary()
This code yields the following output.
Epoch 1/3
14/14 [==============================] - 48s 3s/step - loss: 0.5887 - accuracy: 0.2287 - val_loss: 0.4951 - val_accuracy: 0.3000
Epoch 2/3
14/14 [==============================] - 48s 3s/step - loss: 0.5170 - accuracy: 0.2220 - val_loss: 0.4972 - val_accuracy: 0.2800
Epoch 3/3
14/14 [==============================] - 48s 3s/step - loss: 0.4982 - accuracy: 0.2265 - val_loss: 0.4975 - val_accuracy: 0.2200
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg16 (Functional) (None, 7, 7, 512) 14714688
_________________________________________________________________
flatten (Flatten) (None, 25088) 0
_________________________________________________________________
dense (Dense) (None, 5) 125445
=================================================================
Total params: 14,840,133
Trainable params: 125,445
Non-trainable params: 14,714,688
_________________________________________________________________
Test loss: 0.4997282326221466
Test accuracy: 0.2720000147819519
All of my test images are being classified into the same class. Is there any reason for this?
Edit: Modified question upon realizing the model is running properly

Perhaps this isn't a great solution, but I fixed this by making the last six layers trainable with the others trainable
for layer in base_model.layers[:-6]:
layer.trainable = False

Related

training loss is Nan - but trainning data is all in range without null

When I execute my model.fit(x_train_lstm, y_train_lstm, epochs=3, shuffle=False, verbose=2)
I always get loss as nan:
Epoch 1/3
73/73 - 5s - loss: nan - accuracy: 0.5417 - 5s/epoch - 73ms/step
Epoch 2/3
73/73 - 5s - loss: nan - accuracy: 0.5417 - 5s/epoch - 74ms/step
Epoch 3/3
73/73 - 5s - loss: nan - accuracy: 0.5417 - 5s/epoch - 73ms/step
My x_training is shaped (2475, 48), y_train is shaped (2475,)
I derive my input train set in (2315, 160, 48), so 2315 sets of training data, 160 as my loopback timewindow, 48 features
corresspondingly, the y_train is 0 or 1 in shape of (2315, 1)
All in range of (-1,1):
My model is like this:
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_6 (LSTM) (None, 160, 128) 90624
dropout_4 (Dropout) (None, 160, 128) 0
lstm_7 (LSTM) (None, 160, 64) 49408
dropout_5 (Dropout) (None, 160, 64) 0
lstm_8 (LSTM) (None, 32) 12416
dense_2 (Dense) (None, 1) 33
=================================================================
Total params: 152,481
Trainable params: 152,481
Non-trainable params: 0
I tried different LSTM units: 48, 60, 128, 160, none of them work
I check my training data, all of them are in the range of (-1,1)
There is no 'null' in my dataset, x_train.isnull().values.any() outputs False
Now I have no clue where can I try more~
My model code is:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras.layers import Dropout
def create_model(win = 100, features = 9):
model = Sequential()
model.add(LSTM(units=128, activation='relu', input_shape=(win, features),
return_sequences=True))
model.add(Dropout(0.1))
model.add(LSTM(units=64, activation='relu', return_sequences=True))
model.add(Dropout(0.2))
# no need return sequences from 'the last layer'
model.add(LSTM(units=32))
# adding the output layer
model.add(Dense(units=1, activation='sigmoid'))
# may also try mean_squared_error
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
return model
here I plot some train_y samples:
Two things: try normalizing your time series data and using relu as the activation function for lstm layers is not 'conventional'. Check this post for further insights. An example:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras.layers import Dropout
import tensorflow as tf
layer = tf.keras.layers.Normalization(axis=-1)
x = tf.random.normal((500, 100, 9))
y = tf.random.uniform((500, ), dtype=tf.int32, maxval=2)
layer.adapt(x)
def create_model(win = 100, features = 9):
model = Sequential()
model.add(layer)
model.add(LSTM(units=128, activation='tanh', input_shape=(win, features),
return_sequences=True))
model.add(Dropout(0.1))
model.add(LSTM(units=64, activation='tanh', return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=32))
model.add(Dense(units=1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
return model
model = create_model()
model.fit(x, y, epochs=20)
2022.Mar.17 update
After further debugging, I eventually found out the problem is actually because of my newly added feature contains np.inf, after I remove those rows, my problem resolved, I can see the loss value now
6/6 [==============================] - 2s 50ms/step - loss: 0.6936 - accuracy: 0.5176
Note, np.inf has symbol, so make sure both np.inf and -np.inf are removed:
all_dataset = all_dataset[all_dataset.feature_issue != np.inf]
all_dataset = all_dataset[all_dataset.feature_issue != -np.inf]
2022.Mar.16
After some debugging, I addressed 2 of my new added features actually cause the problem. So, the problem comes from data, but unlike the others, my data contains no nan, no out of range(originally I thought all the data needs to be normalized)
But I can't tell the reason yet, as they looks good
I will continue research more on it tomorrow, any advice is appreciated!

Invalid Argument Error when using Tensorboard callback

I have used a Tensorboard callback in fitting a model consisting of one embedding layer and one SimpleRNN layer. The model performs binary sentiment classification for 9600 input text sequences. They have been tokenised and padded in advance.
# 1. Remove previous logs
!rm -rf ./logs/
# 2. Change to Py_file_dir
os.chdir(...)
# input_dim = 43489 (size of tokenizer word dictionary); output_dim = 100 (GloVe 100d embeddings); input_length = 1403 (length of longest text sequence).
# xtr_pad is padded, tokenised text sequences. nrow = 9600, ncol = input_length = 1403.
model= Sequential()
model.add(Embedding(input_dim, output_dim, input_length= input_length,
weights= [Embedding_matrix], trainable= False))
model.add(SimpleRNN(200))
model.add(Dense(1, activation= 'sigmoid'))
model.compile(loss='binary_crossentropy', optimizer= 'adam', metrics=['accuracy'])
tb = TensorBoard(histogram_freq=1, log_dir= 'tbcallback_prac')
tr_results= model.fit(xtr_pad, ytr, epochs= 2, batch_size= 64, verbose= 1,
validation_split= 0.2, callbacks= [tb])
# In command prompt enter: tensorboard --logdir tbcallback_prac
I have run this on Jupyterlab and on the first time the model trains without issue. I was able to view the Tensorboard statistics on local host.
However when I run this same code a second time, i.e. removing logs and fitting model it completed the first epoch of training, but returns this error before the 2nd epoch begins.
Train on 7680 samples, validate on 1920 samples
Epoch 1/2
7680/7680 [==============================] - ETA: 0s - loss: 0.2919 - accuracy: 0.9004
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
<ipython-input-12-a1cde9b5b1f4> in <module>()
7 tb = TensorBoard(histogram_freq=1, log_dir= 'tbcallback_prac')
8 tr_results= model.fit(xtr_pad, ytr, epochs= 2, batch_size= 64, verbose= 1,
----> 9 validation_split= 0.2, callbacks= [tb])
...
InvalidArgumentError: You must feed a value for placeholder tensor 'embedding_input' with dtype float and shape [?,1403]
[[{{node embedding_input}}]]
Note 1403 is the length of all padded, tokenised sequences in training input 'xtr'.
Thanks in advance for any help!
I have no issue but I think that is a dimensions problem when working on logtis and sigmoid
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 3072, 64) 64000
simple_rnn (SimpleRNN) (None, 200) 53000
dense (Dense) (None, 1) 201
=================================================================
Total params: 117,201
Trainable params: 117,201
Non-trainable params: 0
_________________________________________________________________
val_dir: F:\models\checkpoint\ale_highscores_3\validation
Epoch 1/1500
2/2 [==============================] - ETA: 0s - loss: -0.5579 - accuracy: 0.1000[<KerasTensor: shape=(None, 3072) dtype=float32 (created by layer 'embedding_input')>]
<keras.engine.functional.Functional object at 0x00000233003A8550>
Press AnyKey!
2/2 [==============================] - 14s 7s/step - loss: -0.5579 - accuracy: 0.1000 - val_loss: -0.6446 - val_accuracy: 0.1000
Epoch 2/1500
2/2 [==============================] - ETA: 0s - loss: -0.6588 - accuracy: 0.1000[<KerasTensor: shape=(None, 3072) dtype=float32 (created by layer 'embedding_input')>]
<keras.engine.functional.Functional object at 0x00000233003A8C40>
Press AnyKey!
2/2 [==============================] - 13s 7s/step - loss: -0.6588 - accuracy: 0.1000 - val_loss: -0.7242 - val_accuracy: 0.1000
Epoch 3/1500
1/2 [==============>...............] - ETA: 6s - loss: -0.1867 - accuracy: 0.1429

AttributeError: module 'tensorflow_core.compat.v2' has no attribute '__internal__'

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
import matplotlib.pyplot as plt
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)
im = plt.imshow(x_train[0], cmap="gray")
plt.show()
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train/255
x_test = x_test/255
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dense(256, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.summary()
model.compile(optimizer=SGD(), loss='categorical_crossentropy', metics=['accuracy'])
model.fit(x_train, y_train, batch_size=64, epochs=5, validation_data=(x_test, y_test))
I tried several different versions of the combination, but still reported an error about
AttributeError: module 'tensorflow_core.compat.v2' has no attribute 'internal'
AttributeError: module 'tensorflow_core.compat.v2' has no attribute '__internal__'
Generally you will get above error due to incompatibility between Tensorflow and Keras. You can import keras without any issues by upgrade to latest version. For more details you can refer this solution.
Coming to your code, have couple of issues and it can be resolved
1.to_categorical has now packed in np_utils. You need to add import as shown below
from keras.utils.np_utils import to_categorical
2.Typo mistake, replace metics to metrics in model.compile
Working code as shown below
import keras
print(keras.__version__)
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
import matplotlib.pyplot as plt
from keras.utils.np_utils import to_categorical
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)
im = plt.imshow(x_train[0], cmap="gray")
plt.show()
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train/255
x_test = x_test/255
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dense(256, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.summary()
model.compile(optimizer=SGD(), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=64, epochs=5, validation_data=(x_test, y_test))
Output:
2.5.0
(60000, 28, 28) (60000,)
(10000, 28, 28) (10000,)
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 512) 401920
_________________________________________________________________
dense_1 (Dense) (None, 256) 131328
_________________________________________________________________
dense_2 (Dense) (None, 10) 2570
=================================================================
Total params: 535,818
Trainable params: 535,818
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
938/938 [==============================] - 21s 8ms/step - loss: 1.2351 - accuracy: 0.6966 - val_loss: 0.3644 - val_accuracy: 0.9011
Epoch 2/5
938/938 [==============================] - 7s 7ms/step - loss: 0.3554 - accuracy: 0.9023 - val_loss: 0.2943 - val_accuracy: 0.9166
Epoch 3/5
938/938 [==============================] - 7s 7ms/step - loss: 0.2929 - accuracy: 0.9176 - val_loss: 0.2553 - val_accuracy: 0.9282
Epoch 4/5
938/938 [==============================] - 7s 7ms/step - loss: 0.2538 - accuracy: 0.9281 - val_loss: 0.2309 - val_accuracy: 0.9337
Epoch 5/5
938/938 [==============================] - 7s 8ms/step - loss: 0.2313 - accuracy: 0.9355 - val_loss: 0.2096 - val_accuracy: 0.9401
<keras.callbacks.History at 0x7f615c82d090>
You can refer this gist,for the above use case in tensorflow version.
Error:
AttributeError: module 'tensorflow_core.compat.v2' has no attribute '__internal__'
Solution:
Install Libraries
!pip install tensorflow==2.1
!pip install keras==2.3.1
Import
from tensorflow.keras.models import load_model

Why masking input produces the same loss as unmasked input on Keras?

I am experimenting with LSTM using variable-length input due to this reason. I wanted to be sure that loss is calculated correctly under masking. So, I trained the below model that uses Masking layer on padded sequences.
from tensorflow.keras.layers import LSTM, Masking, Dense
from tensorflow.keras.utils import to_categorical
from tensorflow.keras import models, losses
import tensorflow as tf
import numpy as np
import os
"""
For generating reproducible results, set seed.
"""
def set_seed(seed):
os.environ['PYTHONHASHSEED'] = str(seed)
np.random.seed(seed)
tf.random.set_seed(seed)
"""
Set some right most indices to mask value like padding
"""
def create_padded_seq(num_samples, timesteps, num_feats, mask_value):
feats = np.random.random([num_samples, timesteps, num_feats]).astype(np.float32) # Generate samples
for i in range(0, num_samples):
rand_index = np.random.randint(low=2, high=timesteps, size=1)[0] # Apply padding
feats[i, rand_index:, 0] = mask_value
return feats
set_seed(42)
num_samples = 100
timesteps = 6
num_feats = 1
num_classes = 3
num_lstm_cells = 1
mask_value = -100
num_epochs = 5
X_train = create_padded_seq(num_samples, timesteps, num_feats, mask_value)
y_train = np.random.randint(num_classes, size=num_samples)
cat_y_train = to_categorical(y_train, num_classes)
masked_model = models.Sequential(name='masked')
masked_model.add(Masking(mask_value=mask_value, input_shape=(timesteps, num_feats)))
masked_model.add(LSTM(num_lstm_cells, return_sequences=False))
masked_model.add(Dense(num_classes, activation='relu'))
masked_model.compile(loss=losses.categorical_crossentropy, optimizer='adam', metrics=["accuracy"])
print(masked_model.summary())
masked_model.fit(X_train, cat_y_train, batch_size=1, epochs=5, verbose=True)
This is the verbose output,
Model: "masked"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
masking (Masking) (None, 6, 1) 0
_________________________________________________________________
lstm (LSTM) (None, 1) 12
_________________________________________________________________
dense (Dense) (None, 3) 6
=================================================================
Total params: 18
Trainable params: 18
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/5
100/100 [==============================] - 0s 2ms/step - loss: 10.6379 - accuracy: 0.3400
Epoch 2/5
100/100 [==============================] - 0s 2ms/step - loss: 10.6379 - accuracy: 0.3400
Epoch 3/5
100/100 [==============================] - 0s 2ms/step - loss: 10.6379 - accuracy: 0.3400
Epoch 4/5
100/100 [==============================] - 0s 2ms/step - loss: 10.6379 - accuracy: 0.3400
Epoch 5/5
100/100 [==============================] - 0s 2ms/step - loss: 10.6379 - accuracy: 0.3400
I also removed Masking layer and trained another model on the same data to see the effect of masking, this is the model,
unmasked_model = models.Sequential(name='unmasked')
unmasked_model.add(LSTM(num_lstm_cells, return_sequences=False, input_shape=(timesteps, num_feats)))
unmasked_model.add(Dense(num_classes, activation='relu'))
unmasked_model.compile(loss=losses.categorical_crossentropy, optimizer='adam', metrics=["accuracy"])
print(unmasked_model.summary())
unmasked_model.fit(X_train, cat_y_train, batch_size=1, epochs=5, verbose=True)
And this is the verbose output,
Model: "unmasked"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (None, 1) 12
_________________________________________________________________
dense (Dense) (None, 3) 6
=================================================================
Total params: 18
Trainable params: 18
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/5
100/100 [==============================] - 0s 1ms/step - loss: 10.6379 - accuracy: 0.3400
Epoch 2/5
100/100 [==============================] - 0s 2ms/step - loss: 10.6379 - accuracy: 0.3400
Epoch 3/5
100/100 [==============================] - 0s 1ms/step - loss: 10.6379 - accuracy: 0.3400
Epoch 4/5
100/100 [==============================] - 0s 1ms/step - loss: 10.6379 - accuracy: 0.3400
Epoch 5/5
100/100 [==============================] - 0s 1ms/step - loss: 10.6379 - accuracy: 0.3400
Losses are the same in both outputs, what is the reason for that ? It seems like Masking layer has no effect on loss, is that correct ? If not, then how can I observe the effect of Masking layer ?
In the case of a multi-classification task, the problem seems to be the last activation function...
If you change relu with softmax, your network can produce probabilities in the range [0,1]

Use loss in the keras model function

I am trying to build the a very simple model using keras using the Model function, like below, where the input and output of the Model function are [img,labels] and the loss.
I am confused why this code is not working, if the output cannot be the loss. How should the Model function work and when should we use the Model function? Thanks.
sess = tf.Session()
K.set_session(sess)
K.set_learning_phase(1)
img = Input((784,),name='img')
labels = Input((10,),name='labels')
# img = tf.placeholder(tf.float32, shape=(None, 784))
# labels = tf.placeholder(tf.float32, shape=(None, 10))
x = Dense(128, activation='relu')(img)
x = Dropout(0.5)(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.5)(x)
preds = Dense(10, activation='softmax')(x)
from keras.losses import binary_crossentropy
#loss = tf.reduce_mean(categorical_crossentropy(labels, preds))
loss = binary_crossentropy(labels, preds)
print(type(loss))
model = Model([img,labels], loss, name='squeezenet')
model.summary()
As #yu-yang pointed out, the loss is specified with compile().
If you think about it, it makes sense because the real output of your model is your prediction, not the loss, the loss is only used to train the model.
A working example of your network:
import keras
from keras.optimizers import Adam
from keras.models import Model
from keras.layers import Input, Dense, Dropout
from keras.losses import categorical_crossentropy
img = Input((784,),name='img')
x = Dense(128, activation='relu')(img)
x = Dropout(0.5)(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.5)(x)
preds = Dense(10, activation='softmax')(x)
model = Model(inputs=img, outputs=preds, name='squeezenet')
model.compile(optimizer=Adam(),
loss=categorical_crossentropy,
metrics=['acc'])
model.summary()
Output:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
img (InputLayer) (None, 784) 0
_________________________________________________________________
dense_32 (Dense) (None, 128) 100480
_________________________________________________________________
dropout_21 (Dropout) (None, 128) 0
_________________________________________________________________
dense_33 (Dense) (None, 128) 16512
_________________________________________________________________
dropout_22 (Dropout) (None, 128) 0
_________________________________________________________________
dense_34 (Dense) (None, 10) 1290
=================================================================
Total params: 118,282
Trainable params: 118,282
Non-trainable params: 0
_________________________________________________________________
With MNIST dataset:
from keras.datasets import mnist
from keras.utils import to_categorical
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 784)
y_train = to_categorical(y_train, num_classes=10)
x_test = x_test.reshape(-1, 784)
y_test = to_categorical(y_test, num_classes=10)
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))
Output:
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
60000/60000 [==============================] - 4s - loss: 12.2797 - acc: 0.2360 - val_loss: 11.0902 - val_acc: 0.3116
Epoch 2/10
60000/60000 [==============================] - 4s - loss: 10.4161 - acc: 0.3527 - val_loss: 8.7122 - val_acc: 0.4589
Epoch 3/10
60000/60000 [==============================] - 4s - loss: 9.5797 - acc: 0.4051 - val_loss: 8.9226 - val_acc: 0.4460
Epoch 4/10
60000/60000 [==============================] - 4s - loss: 9.2017 - acc: 0.4285 - val_loss: 8.0564 - val_acc: 0.4998
Epoch 5/10
60000/60000 [==============================] - 4s - loss: 8.8558 - acc: 0.4501 - val_loss: 8.0878 - val_acc: 0.4980
Epoch 6/10
60000/60000 [==============================] - 5s - loss: 8.8239 - acc: 0.4521 - val_loss: 8.2495 - val_acc: 0.4880
Epoch 7/10
60000/60000 [==============================] - 4s - loss: 8.7842 - acc: 0.4547 - val_loss: 7.7146 - val_acc: 0.5211
Epoch 8/10
60000/60000 [==============================] - 4s - loss: 8.7395 - acc: 0.4575 - val_loss: 7.7944 - val_acc: 0.5163
Epoch 9/10
60000/60000 [==============================] - 5s - loss: 8.7109 - acc: 0.4593 - val_loss: 7.8235 - val_acc: 0.5145
Epoch 10/10
60000/60000 [==============================] - 4s - loss: 8.4927 - acc: 0.4729 - val_loss: 7.5933 - val_acc: 0.5288