I am attempting to print the predicted probabilities of each class outcome from my trained model, when I present new raw data. This is a multi-class classification problem, with 8 outputs and 21 inputs.
I am able to print 1 outcome when I present new data, for example:
"Example 0 prediction: 1 (15.0%)"
Instead, I would expect to see something similar to the below, where the probabilities of each class (0, 1, 2, 3, 4, 6, Wide, Out) are shown:
Example 0 prediction 0: (12.5%), prediction 1: (12.5%), prediction 2: (12.5%), prediction 3: (12.5%), prediction 4: (12.5%), prediction 6: (12.5%), prediction Wide: (12.5%), prediction Out: (12.5%)
Please note I have tried searching for similar issues including here, here and here as well as consulted the TensorFlow documentation. However, these mainly discuss alterations to the model itself e.g. softmax activation on the final layer, categorical crossentropy as the loss function etc. so that probabilities are generated.
I have included the model architecture as well as the prediction code for full visibility.
Model:
earlystopping = callbacks.EarlyStopping(monitor ="val_loss",
mode ="min", patience = 125,
restore_best_weights = True)
#define Keras
model = Sequential()
model.add(Dense(50, input_dim=21))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5,input_shape=(50,)))
model.add(Dense(50))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5,input_shape=(50,)))
model.add(Dense(8, activation='softmax'))
#compile the keras model
model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])
model.fit(X, dummy_y, validation_split=0.25, epochs=1000, batch_size=100, verbose=1, callbacks=[earlystopping])
_, accuracy3 = model.evaluate(X, dummy_y, verbose=0)
print('Accuracy: %.2f' % (accuracy3*100))
Making predictions:
class_names = ['0', '1', '2','3','4','6','Wide','Out']
predict_dataset = tf.convert_to_tensor([
[1,5,1,0.459,0.322,0.041,0.002,0.103,0.032,0.041,14,0.404,0.284,0.052,0.008,0.128,0.044,0.037,0.043,54,0,],
[1,18,5,0.512,0.286,0,0,0.083,0.024,0.095,13,0.24,0.44,0.08,0,0.08,0.08,0,0.08,173,3],
[2,11,13,0.5,0.417,0,0,0.083,0,0.083,82,0.35,0.36,0.042,0.003,0.135,0.039,0.051,0.02,51,7]
])
predictions = model(predict_dataset, training=False)
for i, logits in enumerate(predictions):
    class_idx = tf.argmax(logits).numpy()
    p = tf.nn.softmax(logits)[class_idx]
    name = class_names[class_idx]
    print("Example {} prediction: {} ({:4.1f}%)".format(i, name,100*p))
Output:
Example 0 prediction: 1 (15.0%)
Example 1 prediction: 1 (16.0%)
Example 2 prediction: 0 (16.9%)
I have tried making changes to the for loop which makes use of TensorFlow's logits, but I am still unable to get it to print each outcome and associated probability.
Any guidance is much appreciated.
In the end, instead of trying to implement a for loop, I just printed each outcome from the NumPy array.
Not the cleanest of ways, but it does the job. Hopefully useful to someone in the future.
predict_dataset = tf.convert_to_tensor([
[1,5,1,0.459,0.322,0.041,0.002,0.103,0.032,0.041,14,0.404,0.284,0.052,0.008,0.128,0.044,0.037,0.043,54,0,155]
])
predictions = model3(predict_dataset, training=False)
predictions2 = predictions.numpy()
prob_0 = predictions2[0,0]
prob_1 = predictions2[0,1]
prob_2 = predictions2[0,2]
prob_3 = predictions2[0,3]
prob_4 = predictions2[0,4]
prob_wide = predictions2[0,5]
prob_6 = predictions2[0,6]
prob_wicket = predictions2[0,7]
print(prob_0)
print(prob_1)
print(prob_2)
print(prob_3)
print(prob_4)
print(prob_wide)
print(prob_6)
print(prob_wicket)
Output
0.28349978
0.32451397
0.06382967
0.0053077294
0.20397986
0.07999096
6.386134e-08
0.038877998
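For anyone who still wants the loop version, here is a minimal sketch. It assumes the model's last layer already applies softmax (as in the question), so the output row is used directly as probabilities. Passing it through tf.nn.softmax a second time, as the loop in the question does, squashes the values toward a uniform distribution, which would explain figures close to 12.5%. The `predictions` array below is a made-up stand-in for `model(predict_dataset, training=False).numpy()`, and `format_probabilities` is a hypothetical helper, not part of TensorFlow.

```python
import numpy as np

# Class order copied from the question. It is an assumption that this matches
# the column order produced by the label encoding used at training time.
class_names = ['0', '1', '2', '3', '4', '6', 'Wide', 'Out']

def format_probabilities(probs, names):
    """Return one 'name: xx.x%' entry per class for a single prediction row."""
    return ", ".join("{}: {:4.1f}%".format(n, 100 * p) for n, p in zip(names, probs))

# Made-up stand-in for model(predict_dataset, training=False).numpy(),
# with shape (n_examples, n_classes).
predictions = np.array([
    [0.28, 0.32, 0.06, 0.01, 0.20, 0.08, 0.01, 0.04],
])

lines = []
for i, probs in enumerate(predictions):
    # The final Dense layer already applies softmax, so probs are used as-is.
    lines.append("Example {} -> {}".format(i, format_probabilities(probs, class_names)))
    print(lines[-1])
```

The same loop works for any number of rows in `predictions`, so several new examples can be printed at once.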
Related
I am trying to solve the Spoken Digit Recognition task using the LSTM model, where the audio files are converted into spectrograms and fed into an LSTM model after doing Global Average Pooling. Here is the architecture of it
tf.keras.backend.clear_session()
#input layer
input_= Input(shape = (64, 35))
lstm = LSTM(100, activation='tanh', return_sequences= True, kernel_regularizer = l2(0.000001),
recurrent_initializer = 'glorot_uniform')(input_)
lstm = GlobalAveragePooling1D(data_format='channels_first')(lstm)
dense = Dense(20, activation='relu', kernel_regularizer = l2(0.000001), kernel_initializer='glorot_uniform')(lstm)
drop = Dropout(0.8)(dense)
dense1 = Dense(25, activation='relu', kernel_regularizer = l2(0.000001), kernel_initializer= 'he_uniform')(drop)
drop = Dropout(0.95)(dense1)
output = Dense(10,activation = 'softmax', kernel_regularizer = l2(0.000001), kernel_initializer= 'glorot_uniform')(drop)
model_2 = Model(inputs = [input_], outputs = output)
model_2.summary()
Having summary as -
I need to calculate the F1 score to check the performance of the model. I have implemented a custom callback and have also tried the TensorFlow Addons F1 score. However, I do not get the correct result: the F1 score is the same constant value for every epoch.
On further digging, I found that my model predicts the same class label for every sample in an epoch, whereas it is supposed to predict all 10 classes, since 10 class labels are present in the data.
Here is my model.compile and model.predict commands. I have used TensorFlow addon here -
from tensorflow import keras
opt = keras.optimizers.Adam(0.001, clipnorm=0.8)
model_2.compile(loss='categorical_crossentropy', optimizer=opt, metrics = metric)
hist = model_2.fit([X_train_spectrogram],
[y_train_converted],
validation_data= ([X_test_spectrogram], [y_test_converted]),
epochs = 10,
verbose =1,
callbacks=[tensorBoard_callbk2, ClearMemory()],
# steps_per_epoch = 3,
batch_size=32)
Here is what I mean by getting the same prediction, the entire array is filled with the same predicted values.
Why is the model predicting the same class label, and how can I rectify it?
I have tried increasing the number of trainable parameters and both increasing and decreasing the batch size, but neither helped. If anyone knows what is going on, could you please help me out?
As the title describes, the issue I have been experiencing while training my CNN model is that the training and validation accuracies stay constant even though the corresponding losses are changing. I have included the details of the model and its training setup below. What may cause this issue?
Here is how the training set (X_train and y_train) and the test set (X_test and y_test) were prepared:
df = pd.read_csv(CSV_PATH, sep=',', header=None)
print(f'Shape of all data: {df.shape}')
y = df.iloc[:, -1].values
X = df.iloc[:, :-1].values
encoder = LabelEncoder()
encoder.fit(y)
encoded_Y = encoder.transform(y)
dummy_y = to_categorical(encoded_Y)
X_train, X_test, y_train, y_test = train_test_split(X, dummy_y, test_size=0.3, random_state=RANDOM_STATE)
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))
Here are the shapes of training and test sets:
Shape of X_train: (1322, 10800, 1)
Shape of Y_train: (1322, 3)
Shape of X_test: (567, 10800, 1)
Shape of y_test: (567, 3)
Here is my CNN model:
# Model hyper-parameters
activation_fn = 'relu'
n_lr = 1e-4
weight_decay = 1e-4
batch_size = 64
num_epochs = 200*10*10
num_classes = 3
n_dropout = 0.6
n_momentum = 0.5
n_kernel = 5
n_reg = 1e-5
# the sequential model
model = Sequential()
model.add(Conv1D(128, n_kernel, input_shape=(10800, 1)))
model.add(BatchNormalization())
model.add(Activation(activation_fn))
model.add(MaxPooling1D(pool_size=2, strides=2))
model.add(Dropout(n_dropout))
model.add(Conv1D(256, n_kernel))
model.add(BatchNormalization())
model.add(Activation(activation_fn))
model.add(MaxPooling1D(pool_size=2, strides=2))
model.add(Dropout(n_dropout))
model.add(GlobalAveragePooling1D()) # have tried model.add(Flatten()) as well
model.add(Dense(256, activation=activation_fn))
model.add(Dropout(n_dropout))
model.add(Dense(64, activation=activation_fn))
model.add(Dropout(n_dropout))
model.add(Dense(num_classes, activation='softmax'))
adam = Adam(lr=n_lr, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=weight_decay)
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['acc'])
Here is how I have evaluated the model:
Y_pred = model.predict(X_test, verbose=0)
y_pred = np.argmax(Y_pred, axis=1)
y_test_int = np.argmax(y_test, axis=1)
And, my model always predicts the same class of three classes during the model evaluation as you can see from the classification result below (via classification_result(y_test_int, y_pred) function):
              precision    recall  f1-score   support
      normal      0.743     1.000     0.852       421
         apb      0.000     0.000     0.000        45
         pvc      0.000     0.000     0.000       101
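As a quick sanity check on the report above, here is a small sketch that rebuilds labels from the support counts and scores a classifier that always answers "normal". Its accuracy equals the share of the majority class, which matches the precision reported for normal. Accuracy only looks at the argmax of the softmax output, so it can stay flat while the loss still moves. The label arrays here are reconstructed from the counts for illustration only; they are not the real data.

```python
import numpy as np

# Support counts taken from the classification report.
support = {"normal": 421, "apb": 45, "pvc": 101}

# Reconstructed integer labels: 0 = normal, 1 = apb, 2 = pvc (illustrative only).
y_true = np.repeat(np.arange(len(support)), list(support.values()))

# A classifier that always predicts the majority class.
y_pred = np.zeros_like(y_true)

accuracy = float(np.mean(y_true == y_pred))
print(round(accuracy, 3))
```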
The model was trained using the EarlyStopping callback of Keras, and training continued for 4,173 epochs. Here are the losses obtained during training for the training and validation sets:
Here are the accuracies obtained during training for the training and validation sets:
The model was implemented using Keras and hosted on Google Colab.
Although such issues are difficult to resolve without the data, there are a couple of general rules applicable.
The very first thing we do when the model does not seem to learn anything, like here (despite the mild drop in the loss), is to remove all dropout.
In fact, dropout is not supposed to be used by default; its nominal function is to guard against overfitting - but of course, before starting to worry about overfitting, you must first have some success with fitting, something that is clearly not happening here. The fact that, with a dropout rate of n_dropout = 0.6, you also seem to be rather too aggressive in its use, does not help, either.
I am trying to build an LSTM-based autoencoder that I want to use for time series data. The data are split up into sequences of different lengths, so the input to the model has shape [None, None, n_features], where the first None stands for the number of samples and the second for the time_steps of the sequence. The sequences are processed by an LSTM with return_sequences = False, the coded dimension is then recreated by RepeatVector and run through an LSTM again. In the end I would like to use the TimeDistributed layer, but how do I tell Keras that the time_steps dimension is dynamic? See my code:
from keras import backend as K
.... other dependencies .....
input_ae = Input(shape=(None, 2)) # shape: time_steps, n_features
LSTM1 = LSTM(units=128, return_sequences=False)(input_ae)
code = RepeatVector(n=K.shape(input_ae)[1])(LSTM1) # bottleneck layer
LSTM2 = LSTM(units=128, return_sequences=True)(code)
output = TimeDistributed(Dense(units=2))(LSTM2) # ??????? HOW TO ????
# no problem here so far:
model = Model(input_ae, outputs=output)
model.compile(optimizer='adam', loss='mse')
this function seems to do the trick
def repeat(x_inp):
    x, inp = x_inp
    x = tf.expand_dims(x, 1)
    x = tf.repeat(x, [tf.shape(inp)[1]], axis=1)
    return x
example
input_ae = Input(shape=(None, 2))
LSTM1 = LSTM(units=128, return_sequences=False)(input_ae)
code = Lambda(repeat)([LSTM1, input_ae])
LSTM2 = LSTM(units=128, return_sequences=True)(code)
output = TimeDistributed(Dense(units=2))(LSTM2)
model = Model(input_ae, output)
model.compile(optimizer='adam', loss='mse')
X = np.random.uniform(0,1, (100,30,2))
model.fit(X, X, epochs=5)
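To make the shape handling in repeat easier to follow, here is a NumPy-only sketch of the same two steps: add a time axis, then repeat along it using the length read from the input. The sizes and array names are illustrative stand-ins; inside the model the same thing is done with tf.expand_dims and tf.repeat on symbolic tensors.

```python
import numpy as np

batch, time_steps, units = 4, 30, 128            # illustrative sizes only
code = np.random.uniform(size=(batch, units))    # stand-in for the LSTM1 output
inp = np.zeros((batch, time_steps, 2))           # stand-in for input_ae

expanded = np.expand_dims(code, 1)                     # (batch, 1, units)
repeated = np.repeat(expanded, inp.shape[1], axis=1)   # (batch, time_steps, units)
print(repeated.shape)
```

Every time step of `repeated` holds the same encoded vector, which is what RepeatVector would produce if the sequence length were known in advance.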
I'm using tf.keras with TF 2.2
I have a tf.keras model that needs to accept multiple inputs of multiple shapes. My goal is to build it in such a way that I can train and evaluate it easily using its fit and evaluate API.
So far, the model is built as follows:
class MultipleLSTM(Model):
    def __init__(self, lstm_dims=128, name='multi_lstm', **kwargs):
        super(MultipleLSTM, self).__init__(name=name)
        # initialize encoders for every attribute
        self.encoders = []
        for key, value in kwargs.items():
            self.encoders.append(self._create_encoder(lstm_dims, value))
        # initialize the rest of the network layers
        self.concat = Concatenate(axis=0)
        self.conv_1 = Conv2D(6, 4, activation='relu')
        self.flatten = Flatten()
        self.dense = Dense(128, activation='relu')
        self.out = Dense(1, activation='sigmoid')
    def call(self, inputs):
        x_1 = self.encoders[0](inputs[0])
        x_2 = self.encoders[1](inputs[1])
        x_3 = self.encoders[2](inputs[2])
        x_4 = self.encoders[3](inputs[3])
        x = self.concat([x_1, x_2, x_3, x_4])
        # fix the shape for the convolutions
        x = tf.expand_dims(x, axis=0)
        x = tf.expand_dims(x, axis=3)
        x = self.conv_1(x)
        x = self.flatten(x)
        x = self.dense(x)
        x = self.out(x)
        return x
    def _create_encoder(self, lstm_dims, conf):
        with tf.name_scope(conf['name']) as scope:
            encoder = tf.keras.Sequential(name=scope)
            encoder.add(Embedding(conf['vocab'],
                                  conf['embed_dim'],
                                  input_length=conf['input_length']))
            encoder.add(Bidirectional(LSTM(lstm_dims)))
        return encoder
There are four different inputs, text sentences of different lengths, that are fed to four different Embedding and LSTM layers (encoders). Then the outputs of those layers are concatenated to create a single tensor that is forwarded to the subsequent layers.
To train this network, I'm passing as input a list of lists, one for each of the tokenized sentences. The label is just a number, 0 or 1 (binary classification). For example, an input could be:
x = [[1, 2, 3, 4],
[2, 3, 5],
[3, 5, 6, 7],
[1, 5, 7]]
y = 0
For now, I have implemented a custom loop that takes such input and trains the network:
def train(data, model, loss_fn, optimizer, metric, epochs=10, print_every=50):
    for epoch in range(epochs):
        print(f'Start of epoch {epoch+1}')
        for step, (x_batch, y_batch) in enumerate(data):
            with GradientTape() as tape:
                output = model(x_batch)
                loss = loss_fn(y_batch, output)
            grads = tape.gradient(loss, model.trainable_weights)
            optimizer.apply_gradients(zip(grads, model.trainable_weights))
            metric(loss)
            if step % print_every == 0:
                print(f'step {step}: mean loss = {metric.result()}')
But this prevents me from exploiting the easy to use tf.keras API, to fit and evaluate the model or even split the dataset into train and test sets. Thus, the question is: How can I create a tf.data.Dataset from such x's and y's and pass it to the fit function of tf.keras?
You can use the Keras functional API to do so. If you want more detail, see the Keras documentation page: Multi-input and multi-output models
You can pass the different inputs directly as a list to the fit and evaluate methods.
model.fit([X_train[:,0], X_train[:,1]], y_train, ...)
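Since the sentences in the question have different lengths, each input has to become a rectangular array before it can go into such a list. Here is a rough sketch of one way to do that, assuming zero-padding to a fixed length is acceptable and that token index 0 is free to serve as the padding value. The `pad` helper, the second sample and `input_length = 4` are all made up for illustration.

```python
import numpy as np

def pad(seqs, length, value=0):
    """Right-pad (or truncate) each token list to `length`; returns an int array."""
    out = np.full((len(seqs), length), value, dtype=np.int32)
    for row, seq in enumerate(seqs):
        trimmed = seq[:length]
        out[row, :len(trimmed)] = trimmed
    return out

# Each sample holds four tokenized sentences, as in the question.
# The second sample is invented for this sketch.
samples = [
    [[1, 2, 3, 4], [2, 3, 5], [3, 5, 6, 7], [1, 5, 7]],
    [[4, 1], [2, 2, 2], [7], [1, 5, 7, 3]],
]
labels = np.array([0, 1])

n_inputs = 4
input_length = 4  # hypothetical; should match the encoder's input_length

# One array per model input, each of shape (n_samples, input_length).
inputs = [pad([s[k] for s in samples], input_length) for k in range(n_inputs)]
print([a.shape for a in inputs])

# The list can then be passed the way the answer describes, for example:
# model.fit(inputs, labels, ...)
```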
I am building a model to classify sequences. I first built the model with the Keras API. As we know, the Keras API wraps TensorFlow functions, but when I converted the Keras code to the TensorFlow API, I found that the two frameworks give different results. Below is the key code.
TensorFlow code
x = tf.placeholder(tf.int32, shape=[None, time_steps], name='x_input')
y = tf.placeholder(tf.float32, shape=[None, num_classes], name='y_label')
# Define the network structure
def rnn_model(x):
    x = tf.one_hot(x, api_vob_size)
    rnn_cell_fw = tf.nn.rnn_cell.BasicLSTMCell(rnn_size)
    rnn_cell_bw = tf.nn.rnn_cell.BasicLSTMCell(rnn_size)
    # Feed the input into the RNN to get the outputs and intermediate states; the output shape is [batch_size, time_steps, rnn_size]
    outputs, states = tf.nn.bidirectional_dynamic_rnn(rnn_cell_fw, rnn_cell_bw, x, dtype=tf.float32)
    # Take the output at the last time step; the output shape is [batch_size, rnn_size]
    outputs1 = tf.concat(outputs, 2)
    output = tf.transpose(outputs1, [1, 0, 2])[-1]
    # Fully connected layer; the final output size is [batch_size, num_classes]
    fc_w = tf.Variable(tf.random_normal([2*rnn_size, num_classes]))
    fc_b = tf.Variable(tf.random_normal([num_classes]))
    return tf.matmul(output, fc_w) + fc_b
# Build the network
logits = rnn_model(x)
prediction = tf.nn.softmax(logits)
# Define the loss function and optimizer
loss_op = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits, name='cross_entropy'))
optimizer = tf.train.AdamOptimizer(learning_rate=lr)
train_op = optimizer.minimize(loss_op,name='optimizer_min')
#keras API
model = Sequential()
model.add(Bidirectional(LSTM(units=150), merge_mode='concat'))
model.add(Dense(9, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, batch_size=64)
So why do the two code blocks give different results? Thank you for any answers!
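One visible difference between the two blocks, though not necessarily the only one, is the loss. The TensorFlow code averages tf.nn.sigmoid_cross_entropy_with_logits over the classes, while the Keras code uses categorical_crossentropy on a softmax output. The NumPy sketch below evaluates both formulas on the same made-up logits and one-hot label to show that they are different objectives. The numbers are illustrative only and do not come from either model.

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1])   # made-up logits for one example
label = np.array([1.0, 0.0, 0.0])    # one-hot target

# Softmax cross-entropy (what categorical_crossentropy computes on softmax outputs).
shifted = logits - logits.max()
log_softmax = shifted - np.log(np.sum(np.exp(shifted)))
softmax_ce = float(-np.sum(label * log_softmax))

# Element-wise sigmoid cross-entropy, averaged over classes, in the numerically
# stable form max(x, 0) - x*z + log(1 + exp(-|x|)).
per_class = np.maximum(logits, 0) - logits * label + np.log1p(np.exp(-np.abs(logits)))
sigmoid_ce = float(per_class.mean())

print(softmax_ce, sigmoid_ce)
```

Other differences worth checking in the same way include the weight initialization (tf.random_normal in the TensorFlow block versus the Keras layer defaults) and how the input is encoded before it reaches the LSTM.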