I built a model based on this architecture to perform binary classification ("0" for "Delay-insensitive", "1" for "Interactive") using 5 features. The target column is vmcategory. When I train the model, the accuracy remains at zero.
You can check my Colab here, please.
Epoch 1/100
1/1 [==============================] - 29s 29s/step - loss: 0.6931 - accuracy: 0.0000e+00
Epoch 2/100
1/1 [==============================] - 7s 7s/step - loss: 0.6893 - accuracy: 0.0000e+00
Epoch 3/100
1/1 [==============================] - 7s 7s/step - loss: 0.6808 - accuracy: 0.0000e+00
Epoch 4/100
1/1 [==============================] - 7s 7s/step - loss: 0.6571 - accuracy: 0.0000e+00
Epoch 5/100
1/1 [==============================] - 7s 7s/step - loss: 0.5957 - accuracy: 0.0000e+00
Epoch 6/100
1/1 [==============================] - 7s 7s/step - loss: 0.5372 - accuracy: 0.0000e+00
Epoch 7/100
1/1 [==============================] - 7s 7s/step - loss: 0.3760 - accuracy: 0.0000e+00
Epoch 8/100
1/1 [==============================] - 7s 7s/step - loss: 0.2411 - accuracy: 0.0000e+00
Epoch 9/100
1/1 [==============================] - 7s 7s/step - loss: 0.1913 - accuracy: 0.0000e+00
Epoch 10/100
1/1 [==============================] - 7s 7s/step - loss: 0.0571 - accuracy: 0.0000e+00
Epoch 11/100
1/1 [==============================] - 7s 7s/step - loss: 0.0483 - accuracy: 0.0000e+00
Epoch 12/100
1/1 [==============================] - 7s 7s/step - loss: 0.0088 - accuracy: 0.0000e+00
Epoch 13/100
1/1 [==============================] - 7s 7s/step - loss: 6.1697e-04 - accuracy: 0.0000e+00
Epoch 14/100
1/1 [==============================] - 6s 6s/step - loss: 3.2386e-04 - accuracy: 0.0000e+00
Epoch 15/100
1/1 [==============================] - 6s 6s/step - loss: 6.8086e-06 - accuracy: 0.0000e+00
Epoch 16/100
1/1 [==============================] - 6s 6s/step - loss: 7.7796e-05 - accuracy: 0.0000e+00
Epoch 17/100
1/1 [==============================] - 7s 7s/step - loss: 1.1021e-06 - accuracy: 0.0000e+00
Epoch 18/100
1/1 [==============================] - 6s 6s/step - loss: 2.7273e-07 - accuracy: 0.0000e+00
...
Epoch 87/100
1/1 [==============================] - 6s 6s/step - loss: 1.0003e-13 - accuracy: 0.0000e+00
Epoch 88/100
1/1 [==============================] - 6s 6s/step - loss: 2.6685e-14 - accuracy: 0.0000e+00
Epoch 89/100
1/1 [==============================] - 7s 7s/step - loss: 2.4792e-12 - accuracy: 0.0000e+00
Epoch 90/100
1/1 [==============================] - 7s 7s/step - loss: 1.2417e-13 - accuracy: 0.0000e+00
Epoch 91/100
1/1 [==============================] - 7s 7s/step - loss: 1.4707e-11 - accuracy: 0.0000e+00
Epoch 92/100
1/1 [==============================] - 7s 7s/step - loss: 4.9625e-14 - accuracy: 0.0000e+00
Epoch 93/100
1/1 [==============================] - 7s 7s/step - loss: 3.7239e-13 - accuracy: 0.0000e+00
Epoch 94/100
1/1 [==============================] - 7s 7s/step - loss: 6.0243e-13 - accuracy: 0.0000e+00
Epoch 95/100
1/1 [==============================] - 6s 6s/step - loss: 1.4047e-11 - accuracy: 0.0000e+00
Epoch 96/100
1/1 [==============================] - 7s 7s/step - loss: 1.0687e-14 - accuracy: 0.0000e+00
Epoch 97/100
1/1 [==============================] - 7s 7s/step - loss: 3.4614e-16 - accuracy: 0.0000e+00
Epoch 98/100
1/1 [==============================] - 7s 7s/step - loss: 4.5617e-11 - accuracy: 0.0000e+00
Epoch 99/100
1/1 [==============================] - 7s 7s/step - loss: 1.5913e-14 - accuracy: 0.0000e+00
Epoch 100/100
1/1 [==============================] - 7s 7s/step - loss: 3.0236e-10 - accuracy: 0.0000e+00
You are using accuracy as the metric, which expects class labels as input, but you are providing class logits (or confidences) instead. Please replace accuracy with tf.keras.metrics.CategoricalAccuracy().
--Edit--
So, there is a different problem which I just noticed. You have 223461 samples, where each input is a feature vector of length 5, and the aim is binary classification.
You are treating the samples axis as the feature axis, and because of that you are trying to predict 223461 classes. To fix this you would need to make the following changes.
Make the following changes in the architecture:
model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(5,)))
model.add(Dropout(0.4))
model.add(Dense(n_outputs, activation='sigmoid'))  # here n_outputs should be 1
Remove the Conv1D and GRU layers; your input features are tabular and do not need convolutional operations.
Replace the softmax activation with sigmoid, since you are doing binary classification.
Replace CategoricalAccuracy with 'accuracy'.
Ensure your data is of shape [X_batch, 5].
Ensure your y is of shape [X_batch, 1].
Here, X_batch could be 223461 (the whole dataset at once); in more complex models you would not process all the data in a single step and would instead use a small batch size, as sketched below.
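A minimal end-to-end sketch of the corrected setup. The data here is random and only stands in for your real X and y; binary_crossentropy and the batch size of 256 are my assumptions, not taken from your notebook:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# Stand-in data with the shapes from the question: X is [223461, 5], y is [223461, 1]
X = np.random.rand(223461, 5).astype("float32")
y = np.random.randint(0, 2, size=(223461, 1)).astype("float32")

model = Sequential([
    Dense(128, activation='relu', input_shape=(5,)),
    Dropout(0.4),
    Dense(1, activation='sigmoid'),  # one sigmoid unit for binary classification
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=10, batch_size=256)  # many small batches instead of one giant step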
I created two Anaconda environments, for tensorflow2x and tensorflow1x respectively. In tensorflow2x, tensorflow 2.3.2 and keras 2.4.3 (the latest) are installed, while in tensorflow1x, tensorflow-gpu 1.15 and keras 2.3.1 are installed. Then I ran the toy example mnist_cnn.py. I found that the former TensorFlow 2 setup gives much lower accuracy than the latter TensorFlow 1 setup.
Here are the results:
# tensorflow2.3.2 + keras 2.4.3:
Epoch 1/12
60000/60000 [==============================] - 3s 54us/step - loss: 2.2795 - accuracy: 0.1270 - val_loss: 2.2287 - val_accuracy: 0.2883
Epoch 2/12
60000/60000 [==============================] - 3s 52us/step - loss: 2.2046 - accuracy: 0.2435 - val_loss: 2.1394 - val_accuracy: 0.5457
Epoch 3/12
60000/60000 [==============================] - 3s 52us/step - loss: 2.1133 - accuracy: 0.3636 - val_loss: 2.0215 - val_accuracy: 0.6608
Epoch 4/12
60000/60000 [==============================] - 3s 52us/step - loss: 1.9932 - accuracy: 0.4560 - val_loss: 1.8693 - val_accuracy: 0.7147
Epoch 5/12
60000/60000 [==============================] - 3s 52us/step - loss: 1.8430 - accuracy: 0.5239 - val_loss: 1.6797 - val_accuracy: 0.7518
Epoch 6/12
60000/60000 [==============================] - 3s 52us/step - loss: 1.6710 - accuracy: 0.5720 - val_loss: 1.4724 - val_accuracy: 0.7755
Epoch 7/12
60000/60000 [==============================] - 3s 53us/step - loss: 1.5003 - accuracy: 0.6071 - val_loss: 1.2725 - val_accuracy: 0.7928
Epoch 8/12
60000/60000 [==============================] - 3s 52us/step - loss: 1.3414 - accuracy: 0.6363 - val_loss: 1.0991 - val_accuracy: 0.8077
Epoch 9/12
60000/60000 [==============================] - 3s 53us/step - loss: 1.2129 - accuracy: 0.6604 - val_loss: 0.9603 - val_accuracy: 0.8169
Epoch 10/12
60000/60000 [==============================] - 3s 53us/step - loss: 1.1103 - accuracy: 0.6814 - val_loss: 0.8530 - val_accuracy: 0.8281
Epoch 11/12
60000/60000 [==============================] - 3s 52us/step - loss: 1.0237 - accuracy: 0.7021 - val_loss: 0.7689 - val_accuracy: 0.8350
Epoch 12/12
60000/60000 [==============================] - 3s 52us/step - loss: 0.9576 - accuracy: 0.7168 - val_loss: 0.7030 - val_accuracy: 0.8429
Test loss: 0.7029915698051452
Test accuracy: 0.8428999781608582
# tensorflow1.15.5 + keras2.3.1
Epoch 1/12
60000/60000 [==============================] - 5s 84us/step - loss: 0.2631 - accuracy: 0.9198 - val_loss: 0.0546 - val_accuracy: 0.9826
Epoch 2/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0898 - accuracy: 0.9731 - val_loss: 0.0394 - val_accuracy: 0.9866
Epoch 3/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0674 - accuracy: 0.9799 - val_loss: 0.0341 - val_accuracy: 0.9881
Epoch 4/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0563 - accuracy: 0.9835 - val_loss: 0.0320 - val_accuracy: 0.9895
Epoch 5/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0465 - accuracy: 0.9859 - val_loss: 0.0343 - val_accuracy: 0.9889
Epoch 6/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0423 - accuracy: 0.9872 - val_loss: 0.0327 - val_accuracy: 0.9892
Epoch 7/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0387 - accuracy: 0.9882 - val_loss: 0.0279 - val_accuracy: 0.9907
Epoch 8/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0351 - accuracy: 0.9893 - val_loss: 0.0269 - val_accuracy: 0.9909
Epoch 9/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0330 - accuracy: 0.9902 - val_loss: 0.0311 - val_accuracy: 0.9895
Epoch 10/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0292 - accuracy: 0.9915 - val_loss: 0.0256 - val_accuracy: 0.9919
Epoch 11/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0293 - accuracy: 0.9911 - val_loss: 0.0276 - val_accuracy: 0.9911
Epoch 12/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0269 - accuracy: 0.9917 - val_loss: 0.0264 - val_accuracy: 0.9915
Test loss: 0.026934823030711867
Test accuracy: 0.9918000102043152
What caused the poor results for tensorflow 2.3.2 + keras 2.4.3? Is there a compatibility issue between TensorFlow and Keras here?
According to the author of Keras, users should consider switching their Keras code to tf.keras in TensorFlow 2.x. In the above toy example, if I use from tensorflow import keras in place of import keras, it also leads to lower accuracy. It seems tf.keras gives poorer accuracy than keras? Maybe I ran the wrong toy example for TensorFlow 2.x? The exact substitution is shown below.
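For reference, the substitution in question only touches the import lines; everything else in mnist_cnn.py stays the same (the Sequential import below is illustrative, not copied from the original script):

# Standalone Keras style (the original mnist_cnn.py)
# import keras
# from keras.models import Sequential

# tf.keras style recommended for TensorFlow 2.x
from tensorflow import keras
from tensorflow.keras.models import Sequential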
Update:
I also note that if I downgrade tensorflow to version 2.2.1 (along with keras 2.3.1), the two setups produce about the same result. It seems there were some major changes from keras 2.3.1 to keras 2.4.0 (https://newreleases.io/project/github/keras-team/keras/release/2.4.0).
What are the main differences between keras 2.3.1 and keras 2.4.x?
Which versions of tensorflow are compatible with keras 2.4.x?
I am training a classifier model on the cats vs. dogs data. The model is a minor variant of ResNet18 and returns softmax probabilities for the classes. However, I am noticing that the validation loss is mostly NaN, whereas the training loss is steadily decreasing and behaves as expected. Training and validation accuracy increase epoch by epoch.
Epoch 1/15
312/312 [==============================] - 1372s 4s/step - loss: 0.7849 - accuracy: 0.5131 - val_loss: nan - val_accuracy: 0.5343
Epoch 2/15
312/312 [==============================] - 1372s 4s/step - loss: 0.6966 - accuracy: 0.5539 - val_loss: 13989871201999266517090304.0000 - val_accuracy: 0.5619
Epoch 3/15
312/312 [==============================] - 1373s 4s/step - loss: 0.6570 - accuracy: 0.6077 - val_loss: 747123703808.0000 - val_accuracy: 0.5679
Epoch 4/15
312/312 [==============================] - 1372s 4s/step - loss: 0.6180 - accuracy: 0.6483 - val_loss: nan - val_accuracy: 0.6747
Epoch 5/15
312/312 [==============================] - 1373s 4s/step - loss: 0.5838 - accuracy: 0.6852 - val_loss: nan - val_accuracy: 0.6240
Epoch 6/15
312/312 [==============================] - 1372s 4s/step - loss: 0.5338 - accuracy: 0.7301 - val_loss: 31236203781405710523301888.0000 - val_accuracy: 0.7590
Epoch 7/15
312/312 [==============================] - 1373s 4s/step - loss: 0.4872 - accuracy: 0.7646 - val_loss: 52170.8672 - val_accuracy: 0.7378
Epoch 8/15
312/312 [==============================] - 1372s 4s/step - loss: 0.4385 - accuracy: 0.7928 - val_loss: 2130819335420217655296.0000 - val_accuracy: 0.8101
Epoch 9/15
312/312 [==============================] - 1373s 4s/step - loss: 0.3966 - accuracy: 0.8206 - val_loss: 116842888.0000 - val_accuracy: 0.7857
Epoch 10/15
312/312 [==============================] - 1372s 4s/step - loss: 0.3643 - accuracy: 0.8391 - val_loss: nan - val_accuracy: 0.8199
Epoch 11/15
312/312 [==============================] - 1373s 4s/step - loss: 0.3285 - accuracy: 0.8557 - val_loss: 788904.2500 - val_accuracy: 0.8438
Epoch 12/15
312/312 [==============================] - 1372s 4s/step - loss: 0.3029 - accuracy: 0.8670 - val_loss: nan - val_accuracy: 0.8245
Epoch 13/15
312/312 [==============================] - 1373s 4s/step - loss: 0.2857 - accuracy: 0.8781 - val_loss: 121907.8594 - val_accuracy: 0.8444
Epoch 14/15
312/312 [==============================] - 1373s 4s/step - loss: 0.2585 - accuracy: 0.8891 - val_loss: nan - val_accuracy: 0.8674
Epoch 15/15
312/312 [==============================] - 1374s 4s/step - loss: 0.2430 - accuracy: 0.8965 - val_loss: 822.7968 - val_accuracy: 0.8776
I checked for the following (a sketch of the first check appears after this list):
Infinity/NaN in the validation data
Infinity/NaN introduced when normalizing the data (using tf.keras.applications.resnet.preprocess_input)
Whether the model is predicting only one class, which could make the loss function behave oddly
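A minimal sketch of how the NaN/Inf check might look, assuming valid_dataset (from the training code below) yields (images, labels) batches eagerly in TF2:

import numpy as np

# Scan every validation batch for non-finite values in both images and labels;
# np.isfinite flags NaN and +/-Inf in a single pass
for images, labels in valid_dataset:
    assert np.all(np.isfinite(images.numpy())), "non-finite value in validation images"
    assert np.all(np.isfinite(labels.numpy())), "non-finite value in validation labels"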
Training code for reference:
optimizer = tf.keras.optimizers.Adam(learning_rate=5e-3)
model = Resnet18(NUM_CLASSES=NUM_CLASSES)  # variant of the original model
model.compile(optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"])
history = model.fit(
    train_dataset,
    steps_per_epoch=len(X_train) // BATCH_SIZE,
    epochs=EPOCHS,
    validation_data=valid_dataset,
    validation_steps=len(X_valid) // BATCH_SIZE,
    verbose=1,
)
The most relevant answer I found was the last paragraph of the accepted answer here. However, that doesn't seem to be the case here, as the validation loss diverges by orders of magnitude from the training loss and intermittently returns NaN. It seems like the loss function is misbehaving.
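One thing worth trying (an assumption on my part, not a confirmed fix for this model) is to compute the cross-entropy from raw logits instead of softmax outputs, which moves the numerically fragile log step inside the loss. This requires dropping the final softmax activation from the model:

import tensorflow as tf

# Loss applied to raw logits; the stable log-sum-exp happens inside the loss,
# so tiny probabilities can no longer produce log(0) = -inf.
# Accuracy is unaffected, since argmax over logits equals argmax over softmax.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=5e-3),
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)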
I am currently studying the book Hands-On Machine Learning. I want to create a simple neural network for the MNIST handwritten-digit data, as described in chapter 10 of the book. But my model is stuck, and the accuracy is not increasing at all.
Here is my code:
import tensorflow as tf
from tensorflow import keras
import pandas as pd
import numpy as np
data = pd.read_csv('sample_data/mnist_train_small.csv', header=None)
test = pd.read_csv('sample_data/mnist_test.csv', header=None)
labels = data[0]
data = data.drop(0, axis=1)
test_labels = test[0]
test = test.drop(0, axis=1)
model = keras.models.Sequential([
    keras.layers.Dense(300, activation='relu', input_shape=(784,)),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(10, activation='softmax'),
])
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])
keras.utils.plot_model(model, show_shapes=True)
hist = model.fit(data.to_numpy(), labels.to_numpy(), epochs=20, validation_data=(test.to_numpy(), test_labels.to_numpy()))
The first few outputs are:
Epoch 1/20
625/625 [==============================] - 2s 3ms/step - loss: 2055059923226079526912.0000 - accuracy: 0.1115 - val_loss: 2.4539 - val_accuracy: 0.1134
Epoch 2/20
625/625 [==============================] - 2s 3ms/step - loss: 2.4160 - accuracy: 0.1085 - val_loss: 2.2979 - val_accuracy: 0.1008
Epoch 3/20
625/625 [==============================] - 2s 2ms/step - loss: 2.3006 - accuracy: 0.1110 - val_loss: 2.3014 - val_accuracy: 0.1136
Epoch 4/20
625/625 [==============================] - 2s 3ms/step - loss: 2.3009 - accuracy: 0.1121 - val_loss: 2.3014 - val_accuracy: 0.1136
Epoch 5/20
625/625 [==============================] - 2s 3ms/step - loss: 2.3009 - accuracy: 0.1121 - val_loss: 2.3014 - val_accuracy: 0.1136
Epoch 6/20
625/625 [==============================] - 2s 3ms/step - loss: 2.3008 - accuracy: 0.1121 - val_loss: 2.3014 - val_accuracy: 0.1136
Epoch 7/20
625/625 [==============================] - 2s 3ms/step - loss: 2.3008 - accuracy: 0.1121 - val_loss: 2.3014 - val_accuracy: 0.1136
Epoch 8/20
625/625 [==============================] - 2s 3ms/step - loss: 2.3008 - accuracy: 0.1121 - val_loss: 2.3014 - val_accuracy: 0.1136
Epoch 9/20
625/625 [==============================] - 2s 2ms/step - loss: 2.3008 - accuracy: 0.1121 - val_loss: 2.3014 - val_accuracy: 0.1136
Epoch 10/20
625/625 [==============================] - 2s 3ms/step - loss: 2.3008 - accuracy: 0.1121 - val_loss: 2.3014 - val_accuracy: 0.1136
Epoch 11/20
625/625 [==============================] - 2s 3ms/step - loss: 2.3008 - accuracy: 0.1121 - val_loss: 2.3014 - val_accuracy: 0.1136
Epoch 12/20
625/625 [==============================] - 2s 3ms/step - loss: 2.3008 - accuracy: 0.1121 - val_loss: 2.3014 - val_accuracy: 0.1136
Your labels are plain integer class IDs (column 0 of the CSV), so sparse_categorical_crossentropy is actually the appropriate loss here; "sparse" refers to integer-encoded labels, while categorical_crossentropy expects one-hot labels. The enormous first-epoch loss strongly suggests a different culprit: the raw pixel values (0-255) are not normalized, which can blow the weights up in the very first steps, after which the model gets stuck. Scale the inputs to [0, 1], as sketched below. Also, instead of data[] you can use data.iloc[], and the adam optimizer would converge faster than plain sgd on this problem.
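A minimal sketch of the normalization fix, reusing the DataFrame names from the question:

# Scale raw pixel values from [0, 255] to [0, 1] before training;
# unnormalized inputs are a common cause of the exploding first-epoch loss above
X_train = data.to_numpy().astype("float32") / 255.0
X_test = test.to_numpy().astype("float32") / 255.0

hist = model.fit(X_train, labels.to_numpy(), epochs=20,
                 validation_data=(X_test, test_labels.to_numpy()))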
Hi, I'm running the MNIST code on my P3 AWS machine, and the initialization process seems very long compared to my previous P2 machine (even though a P3 is faster than a P2).
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
60000/60000 [==============================] - 265s 4ms/step - loss: 0.2674 - acc: 0.9175 - val_loss: 0.0602 - val_acc: 0.9811
Epoch 2/10
60000/60000 [==============================] - 3s 51us/step - loss: 0.0860 - acc: 0.9742 - val_loss: 0.0393 - val_acc: 0.9866
Epoch 3/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0647 - acc: 0.9808 - val_loss: 0.0338 - val_acc: 0.9884
Epoch 4/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0542 - acc: 0.9839 - val_loss: 0.0337 - val_acc: 0.9887
Epoch 5/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0453 - acc: 0.9863 - val_loss: 0.0311 - val_acc: 0.9900
Epoch 6/10
60000/60000 [==============================] - 3s 51us/step - loss: 0.0412 - acc: 0.9873 - val_loss: 0.0291 - val_acc: 0.9898
Epoch 7/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0368 - acc: 0.9891 - val_loss: 0.0300 - val_acc: 0.9901
Epoch 8/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0340 - acc: 0.9897 - val_loss: 0.0298 - val_acc: 0.9897
Epoch 9/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0320 - acc: 0.9908 - val_loss: 0.0267 - val_acc: 0.9916
Epoch 10/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0286 - acc: 0.9914 - val_loss: 0.0276 - val_acc: 0.9903
Test loss: 0.02757222411266339
Test accuracy: 0.9903
I’m using Keras=2.1.4
tensorflow-gpu=1.5.0
my keras.json file is configured as follows:
{
    "floatx": "float32",
    "epsilon": 1e-07,
    "backend": "tensorflow",
    "image_data_format": "channels_last"
}
Any ideas why it is like that?
Thanks in advance.
Based on this issue:
The first epoch takes the same time, but the counter also takes into account the time taken by building the part of the computational graph that deals with training (a few seconds). This used to be done during the compile step, but now it is done lazily, on demand, to avoid unnecessary work.
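If you want that one-time graph-building cost out of your epoch timings, a warm-up step before the real fit should absorb it (a sketch; x_train/y_train/x_test/y_test stand in for the MNIST arrays in your script):

# Run a single tiny batch first so the lazy build of the training function
# happens here, not inside the first timed epoch
model.train_on_batch(x_train[:32], y_train[:32])
model.fit(x_train, y_train, batch_size=128, epochs=10,
          validation_data=(x_test, y_test))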
I'm pretty new to Keras. I have built a simple network to try it out:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation

data = np.genfromtxt("./kerastests/mydata.csv", delimiter=';')

x_target = data[:, 29]
x_training = np.delete(data, 6, axis=1)
x_training = np.delete(x_training, 28, axis=1)

model = Sequential()
model.add(Dense(20, activation='relu', input_dim=x_training.shape[1]))
model.add(Dense(10, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
model.fit(x_training, x_target)
From my source data, I have removed 2 columns, as you can see. One is a column that came with dates in string format (besides it, the dataset has a column for the day, another for the month, and another for the year, so I don't need the date column), and the other is the column I use as the target for the model.
When I train this model I get this output:
Epoch 1/10
32/816 [>.............................] - ETA: 23s - loss: 13541942.0000 - acc: 0.0000e+00
800/816 [============================>.] - ETA: 0s - loss: 11575466.0400 - acc: 0.0000e+00
816/816 [==============================] - 1s - loss: 11536905.2353 - acc: 0.0000e+00
Epoch 2/10
32/816 [>.............................] - ETA: 0s - loss: 6794785.0000 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5381360.4314 - acc: 0.0000e+00
Epoch 3/10
32/816 [>.............................] - ETA: 0s - loss: 6235184.0000 - acc: 0.0000e+00
800/816 [============================>.] - ETA: 0s - loss: 5199512.8700 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5192977.4216 - acc: 0.0000e+00
Epoch 4/10
32/816 [>.............................] - ETA: 0s - loss: 4680165.5000 - acc: 0.0000e+00
736/816 [==========================>...] - ETA: 0s - loss: 5050110.3043 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5168771.5490 - acc: 0.0000e+00
Epoch 5/10
32/816 [>.............................] - ETA: 0s - loss: 5932391.0000 - acc: 0.0000e+00
768/816 [===========================>..] - ETA: 0s - loss: 5198882.9167 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5159585.9020 - acc: 0.0000e+00
Epoch 6/10
32/816 [>.............................] - ETA: 0s - loss: 4488318.0000 - acc: 0.0000e+00
768/816 [===========================>..] - ETA: 0s - loss: 5144843.8333 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5151492.1765 - acc: 0.0000e+00
Epoch 7/10
32/816 [>.............................] - ETA: 0s - loss: 6920405.0000 - acc: 0.0000e+00
800/816 [============================>.] - ETA: 0s - loss: 5139358.5000 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5169839.2941 - acc: 0.0000e+00
Epoch 8/10
32/816 [>.............................] - ETA: 0s - loss: 3973038.7500 - acc: 0.0000e+00
672/816 [=======================>......] - ETA: 0s - loss: 5183285.3690 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5141417.0000 - acc: 0.0000e+00
Epoch 9/10
32/816 [>.............................] - ETA: 0s - loss: 4969548.5000 - acc: 0.0000e+00
768/816 [===========================>..] - ETA: 0s - loss: 5126550.1667 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5136524.5098 - acc: 0.0000e+00
Epoch 10/10
32/816 [>.............................] - ETA: 0s - loss: 6334703.5000 - acc: 0.0000e+00
768/816 [===========================>..] - ETA: 0s - loss: 5197778.8229 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5141391.2059 - acc: 0.0000e+00
Why is this happening? My data is a time series. I know that for time series people do not usually use Dense neurons, but this is just a test. What really puzzles me is that the accuracy is always 0. And in other tests, the loss even reached a NaN value.
Could anybody help here?
Your model seems to correspond to a regression model for the following reasons:
You are using linear (the default one) as an activation function in the output layer (and relu in the layer before).
Your loss is loss='mean_squared_error'.
However, the metric that you use, metrics=['accuracy'], corresponds to a classification problem. If you want to do regression, remove metrics=['accuracy']. That is, use
model.compile(optimizer='adam',loss='mean_squared_error')
Here is a list of keras metrics for regression and classification (taken from this blog post):
Keras Regression Metrics
• Mean Squared Error: mean_squared_error, MSE or mse
• Mean Absolute Error: mean_absolute_error, MAE, mae
• Mean Absolute Percentage Error: mean_absolute_percentage_error, MAPE, mape
• Cosine Proximity: cosine_proximity, cosine
Keras Classification Metrics
• Binary Accuracy: binary_accuracy, acc
• Categorical Accuracy: categorical_accuracy, acc
• Sparse Categorical Accuracy: sparse_categorical_accuracy
• Top k Categorical Accuracy: top_k_categorical_accuracy (requires you specify a k parameter)
• Sparse Top k Categorical Accuracy: sparse_top_k_categorical_accuracy (requires you specify a k parameter)
Add the following to get metrics (note that the history object comes from model.fit, not model.compile, which returns None):
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mean_squared_error'])
# OR
model.compile(optimizer='adam', loss='mean_absolute_error', metrics=['mean_absolute_error'])

history = model.fit(x_training, x_target, epochs=10)
history.history.keys()
history.history
I would like to point out something very important that has unfortunately been neglected: mean_squared_error is not an invalid loss function for classification.
However, the mathematical properties of cross-entropy, in conjunction with the assumptions behind mean_squared_error (neither of which I will expand upon in this comment), make the latter inappropriate, or at least worse than cross-entropy, when training on classification problems.
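A one-line sketch of why, for the common sigmoid-output case (my assumption here): with $\hat{y} = \sigma(z)$, the gradients of the two losses with respect to the pre-activation $z$ are

$$\frac{\partial L_{\mathrm{MSE}}}{\partial z} = (\hat{y} - y)\,\hat{y}(1 - \hat{y}), \qquad \frac{\partial L_{\mathrm{CE}}}{\partial z} = \hat{y} - y,$$

so the MSE gradient carries the extra factor $\hat{y}(1 - \hat{y})$, which vanishes when the unit saturates at a confidently wrong answer and stalls learning, while the cross-entropy gradient stays proportional to the error.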
Try this one: while trying to solve the Titanic problem from Kaggle, I forgot to fill in the missing data in the DataFrame, because of which the missing values ended up as NaN.
The model threw a similar output:
#------------------------------------------------------
Epoch 1/50
891/891 [==============================] - 3s 3ms/step - loss: 9.8239 - acc: 0.0000e+00
Epoch 2/50
891/891 [==============================] - 1s 2ms/step - loss: 9.8231 - acc: 0.0000e+00
Epoch 3/50
891/891 [==============================] - 1s 1ms/step - loss: 9.8231 - acc: 0.0000e+00
Epoch 4/50
891/891 [==============================] - 1s 1ms/step - loss: 9.8231 - acc: 0.0000e+00
Epoch 5/50
891/891 [==============================] - 1s 1ms/step - loss: 9.8231 - acc: 0.0000e+00
#------------------------------------------------------
Make sure you prepare your data before feeding it to the model.
In my case I had to make the following changes:
dataset[['Age']] = dataset[['Age']].fillna(value=dataset[['Age']].mean())  # fill missing ages with the mean age
dataset[['Fare']] = dataset[['Fare']].fillna(value=dataset[['Fare']].mean())  # fill missing fares with the mean fare
dataset[['Embarked']] = dataset[['Embarked']].fillna(value=dataset['Embarked'].value_counts().idxmax())  # fill missing ports with the most frequent value
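As a quick sanity check before training (my own addition, not part of the original fix), you can assert that no missing values remain:

# Fail fast if any NaN is still present anywhere in the DataFrame
assert not dataset.isnull().values.any(), "DataFrame still contains NaN values"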