Keras training cats vs dogs gives constant validation accuracy - tensorflow

I am following this keras tutorial to train a cats/dogs model with few data. I ran the exact code as given on the github, but the accuracy stays 0.5 and never changes.
My keras version is 2.0.9.
Found 2000 images belonging to 2 classes.
Found 800 images belonging to 2 classes.
Epoch 1/50
125/125 [==============================] - 150s 1s/step - loss: 0.7777 - acc: 0.4975 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 2/50
125/125 [==============================] - 158s 1s/step - loss: 0.6932 - acc: 0.5000 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 3/50
125/125 [==============================] - 184s 1s/step - loss: 0.6932 - acc: 0.5000 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 4/50
125/125 [==============================] - 203s 2s/step - loss: 0.6932 - acc: 0.4940 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 5/50
3/125 [..............................] - ETA: 2:30 - loss: 0.6931 - acc: 0.5417
Does anyone know what's the reason behind this?
My data directory looks like this:
data/
train/
cats/
cat846.jpg
cat828.jpg
cat926.jpg
cat382.jpg
cat792.jpg
...
dogs/
dog533.jpg
dog850.jpg
dog994.jpg
dog626.jpg
dog974.jpg
...
validation/
cats/
cat1172.jpg
cat1396.jpg
cat1336.jpg
cat1347.jpg
cat1211.jpg
...
dogs/
dog1014.jpg
dog1211.jpg
dog1088.jpg
dog1207.jpg
dog1186.jpg
...
It seems to have something to do with OS. The above result was on Ubuntu 16.04 virtual machine. I copied the code to windows, it worked correctly. But why?

Related

Using TensorFlow-metal plugin, training stops after some time without any errors?

I've followed the steps provided by Apple, which utilizes conda, to install TensorFlow to get the best out of the M1 Pro MacBook Pro. As the title is self-descriptive, the training stops after some time without any errors. Please see the Keras training log below. This has happened many times. What could be the reason behind this situation? Have you experienced the same on your end? If so, how can I overcome this situation?
...
Epoch 38/50
625/625 [==============================] - 18s 29ms/step - loss: 1.6704 - acc: 0.4178 - val_loss: 1.8169 - val_acc: 0.4044
Epoch 39/50
625/625 [==============================] - 18s 29ms/step - loss: 1.6788 - acc: 0.4157 - val_loss: 1.6830 - val_acc: 0.4029
Epoch 40/50
625/625 [==============================] - 18s 28ms/step - loss: 1.6921 - acc: 0.4089 - val_loss: 1.7088 - val_acc: 0.4049
Epoch 41/50
625/625 [==============================] - 18s 28ms/step - loss: 1.6705 - acc: 0.4170 - val_loss: 1.6650 - val_acc: 0.4182
Epoch 42/50
625/625 [==============================] - 18s 29ms/step - loss: 1.6659 - acc: 0.4177 - val_loss: 1.9102 - val_acc: 0.3443
Epoch 43/50
625/625 [==============================] - 18s 29ms/step - loss: 1.6760 - acc: 0.4166 - val_loss: 1.6647 - val_acc: 0.4222
Epoch 44/50
532/625 [========================>.....] - ETA: 2s - loss: 1.6639 - acc: 0.4217

Lower model accuracies with tensorflow2.3+keras2.4 than tensorflow1.15+keras2.3

I created two anaconda environments for tensorflow2x and tensorflow1x respectively. In tensorflow2x, the tensorflow 2.3.2 and keras 2.4.3 (the latest) are installed, while in tensorflow1x, the tensorflow-gpu 1.15 and keras 2.3.1 are installed. Then I run a toy example mnist_cnn.py. It is found that the former tensorflow2 version give much lower accuracy than that the one obtained by the latter tensorflow 1.
Here below are the results:
# tensorflow2.3.2 + keras 2.4.3:
Epoch 1/12
60000/60000 [==============================] - 3s 54us/step - loss: 2.2795 - accuracy: 0.1270 - val_loss: 2.2287 - val_accuracy: 0.2883
Epoch 2/12
60000/60000 [==============================] - 3s 52us/step - loss: 2.2046 - accuracy: 0.2435 - val_loss: 2.1394 - val_accuracy: 0.5457
Epoch 3/12
60000/60000 [==============================] - 3s 52us/step - loss: 2.1133 - accuracy: 0.3636 - val_loss: 2.0215 - val_accuracy: 0.6608
Epoch 4/12
60000/60000 [==============================] - 3s 52us/step - loss: 1.9932 - accuracy: 0.4560 - val_loss: 1.8693 - val_accuracy: 0.7147
Epoch 5/12
60000/60000 [==============================] - 3s 52us/step - loss: 1.8430 - accuracy: 0.5239 - val_loss: 1.6797 - val_accuracy: 0.7518
Epoch 6/12
60000/60000 [==============================] - 3s 52us/step - loss: 1.6710 - accuracy: 0.5720 - val_loss: 1.4724 - val_accuracy: 0.7755
Epoch 7/12
60000/60000 [==============================] - 3s 53us/step - loss: 1.5003 - accuracy: 0.6071 - val_loss: 1.2725 - val_accuracy: 0.7928
Epoch 8/12
60000/60000 [==============================] - 3s 52us/step - loss: 1.3414 - accuracy: 0.6363 - val_loss: 1.0991 - val_accuracy: 0.8077
Epoch 9/12
60000/60000 [==============================] - 3s 53us/step - loss: 1.2129 - accuracy: 0.6604 - val_loss: 0.9603 - val_accuracy: 0.8169
Epoch 10/12
60000/60000 [==============================] - 3s 53us/step - loss: 1.1103 - accuracy: 0.6814 - val_loss: 0.8530 - val_accuracy: 0.8281
Epoch 11/12
60000/60000 [==============================] - 3s 52us/step - loss: 1.0237 - accuracy: 0.7021 - val_loss: 0.7689 - val_accuracy: 0.8350
Epoch 12/12
60000/60000 [==============================] - 3s 52us/step - loss: 0.9576 - accuracy: 0.7168 - val_loss: 0.7030 - val_accuracy: 0.8429
Test loss: 0.7029915698051452
Test accuracy: 0.8428999781608582
# tensorflow1.15.5 + keras2.3.1
60000/60000 [==============================] - 5s 84us/step - loss: 0.2631 - accuracy: 0.9198 - val_loss: 0.0546 - val_accuracy: 0.9826
Epoch 2/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0898 - accuracy: 0.9731 - val_loss: 0.0394 - val_accuracy: 0.9866
Epoch 3/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0674 - accuracy: 0.9799 - val_loss: 0.0341 - val_accuracy: 0.9881
Epoch 4/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0563 - accuracy: 0.9835 - val_loss: 0.0320 - val_accuracy: 0.9895
Epoch 5/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0465 - accuracy: 0.9859 - val_loss: 0.0343 - val_accuracy: 0.9889
Epoch 6/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0423 - accuracy: 0.9872 - val_loss: 0.0327 - val_accuracy: 0.9892
Epoch 7/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0387 - accuracy: 0.9882 - val_loss: 0.0279 - val_accuracy: 0.9907
Epoch 8/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0351 - accuracy: 0.9893 - val_loss: 0.0269 - val_accuracy: 0.9909
Epoch 9/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0330 - accuracy: 0.9902 - val_loss: 0.0311 - val_accuracy: 0.9895
Epoch 10/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0292 - accuracy: 0.9915 - val_loss: 0.0256 - val_accuracy: 0.9919
Epoch 11/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0293 - accuracy: 0.9911 - val_loss: 0.0276 - val_accuracy: 0.9911
Epoch 12/12
60000/60000 [==============================] - 4s 63us/step - loss: 0.0269 - accuracy: 0.9917 - val_loss: 0.0264 - val_accuracy: 0.9915
Test loss: 0.026934823030711867
Test accuracy: 0.9918000102043152
What caused the poor results for the tensorflow 2.3.2 + keras 2.4.3?? Is there any compatibility issue between tensorflow and keras here?
According to the author of keras, users should consider switching their Keras code to tf.keras in TensorFlow 2.x. In the above toy example, if
from tensorflow import keras in place of import keras, it also leads lower accuracy. It seems tf.keras gives poorer accuracy than keras? Maybe I run a wrong toy example for Tensorflow 2.X??
Update:
I also note if I decrease tensorflow to the version 2.2.1 (along with keras 2.3.1). They will produce about the same result. It seems there are some major changes from keras 2.3.1 to keras 2.4.0 (https://newreleases.io/project/github/keras-team/keras/release/2.4.0).
What are the specific main differences between keras 2.3.1 and keras 2.4.x??
Which versions of tensorflow are compatible with keras 2.4.x??

two kind of description of model.compile get the different results

have already tried
The data pipeline and model build function is right
code
The code is https://gist.github.com/MaoXianXin/cd398521546d967560942e702c243ba7
I want to know why this two kind of description of model.compile get the different results.
see the accuracy
model.compile(optimizer=tf.keras.optimizers.RMSprop(lr=0.01), loss='categorical_crossentropy', metrics=['acc'])
Epoch 1/50
2019-05-27 13:53:20.280605: I tensorflow/stream_executor/dso_loader.cc:153] successfully opened CUDA library libcublas.so.10.0 locally
468/468 [==============================] - 6s 12ms/step - loss: 14.4332 - acc: 0.1045 - val_loss: 14.4601 - val_acc: 0.1029
Epoch 2/50
468/468 [==============================] - 3s 6ms/step - loss: 14.4354 - acc: 0.1044 - val_loss: 14.4763 - val_acc: 0.1023
Epoch 3/50
468/468 [==============================] - 3s 6ms/step - loss: 14.4359 - acc: 0.1044 - val_loss: 14.4714 - val_acc: 0.1026
Epoch 4/50
468/468 [==============================] - 3s 6ms/step - loss: 14.4359 - acc: 0.1044 - val_loss: 14.4682 - val_acc: 0.1028
model.compile('adam', 'categorical_crossentropy', metrics=['acc'])
Epoch 1/50
2019-05-27 13:51:16.122054: I tensorflow/stream_executor/dso_loader.cc:153] successfully opened CUDA library libcublas.so.10.0 locally
468/468 [==============================] - 5s 12ms/step - loss: 3.6567 - acc: 0.7388 - val_loss: 0.0732 - val_acc: 0.9791
Epoch 2/50
468/468 [==============================] - 3s 6ms/step - loss: 0.0812 - acc: 0.9760 - val_loss: 0.0449 - val_acc: 0.9854
Epoch 3/50
468/468 [==============================] - 3s 6ms/step - loss: 0.0533 - acc: 0.9836 - val_loss: 0.0428 - val_acc: 0.9869
Epoch 4/50
468/468 [==============================] - 3s 6ms/step - loss: 0.0426 - acc: 0.9871 - val_loss: 0.0446 - val_acc: 0.9872
Epoch 5/50
468/468 [==============================] - 3s 6ms/step - loss: 0.0376 - acc: 0.9886 - val_loss: 0.0449 - val_acc: 0.9867

First training epoch is very slow

Hi… I’m running mnist code in my P3 AWS machine and the initialization process seems to be very long compared to my previous P2 machine (although P3>P2)
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
60000/60000 [==============================] - 265s 4ms/step - loss: 0.2674 - acc: 0.9175 - val_loss: 0.0602 - val_acc: 0.9811
Epoch 2/10
60000/60000 [==============================] - 3s 51us/step - loss: 0.0860 - acc: 0.9742 - val_loss: 0.0393 - val_acc: 0.9866
Epoch 3/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0647 - acc: 0.9808 - val_loss: 0.0338 - val_acc: 0.9884
Epoch 4/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0542 - acc: 0.9839 - val_loss: 0.0337 - val_acc: 0.9887
Epoch 5/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0453 - acc: 0.9863 - val_loss: 0.0311 - val_acc: 0.9900
Epoch 6/10
60000/60000 [==============================] - 3s 51us/step - loss: 0.0412 - acc: 0.9873 - val_loss: 0.0291 - val_acc: 0.9898
Epoch 7/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0368 - acc: 0.9891 - val_loss: 0.0300 - val_acc: 0.9901
Epoch 8/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0340 - acc: 0.9897 - val_loss: 0.0298 - val_acc: 0.9897
Epoch 9/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0320 - acc: 0.9908 - val_loss: 0.0267 - val_acc: 0.9916
Epoch 10/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0286 - acc: 0.9914 - val_loss: 0.0276 - val_acc: 0.9903
Test loss: 0.02757222411266339
Test accuracy: 0.9903
I’m using Keras=2.1.4
tensorflow-gpu=1.5.0
my keras.json file is configured as follows:
{
"floatx": "float32",
"epsilon": 1e-07,
"backend": "tensorflow",
"image_data_format": "channels_last"
}
Any ideas why is it like that?
Thanks in advance
Based on this issue:
The first epoch takes the same time, but the counter also takes into
account the time taken by building the part of the computational graph
that deals with training (a few seconds). This used to be done during
the compile step, but now it is done lazily one demand to avoid
unnecessary work.

Why is the accuracy for my Keras model always 0 when training?

I'm pretty new to keras I have built a simple network to try:
import numpy as np;
from keras.models import Sequential;
from keras.layers import Dense,Activation;
data= np.genfromtxt("./kerastests/mydata.csv", delimiter=';')
x_target=data[:,29]
x_training=np.delete(data,6,axis=1)
x_training=np.delete(x_training,28,axis=1)
model=Sequential()
model.add(Dense(20,activation='relu', input_dim=x_training.shape[1]))
model.add(Dense(10,activation='relu'))
model.add(Dense(1));
model.compile(optimizer='adam',loss='mean_squared_error',metrics=['accuracy'])
model.fit(x_training, x_target)
From my source data, I have removed 2 columns, as you can see. One is a column that came with dates in a string format (in the dataset, besides it, I have a column for the day, another for the month, and another for the year, so I don't need that column) and the other column is the column I use as target for the model).
When I train this model I get this output:
32/816 [>.............................] - ETA: 23s - loss: 13541942.0000 - acc: 0.0000e+00
800/816 [============================>.] - ETA: 0s - loss: 11575466.0400 - acc: 0.0000e+00
816/816 [==============================] - 1s - loss: 11536905.2353 - acc: 0.0000e+00
Epoch 2/10
32/816 [>.............................] - ETA: 0s - loss: 6794785.0000 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5381360.4314 - acc: 0.0000e+00
Epoch 3/10
32/816 [>.............................] - ETA: 0s - loss: 6235184.0000 - acc: 0.0000e+00
800/816 [============================>.] - ETA: 0s - loss: 5199512.8700 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5192977.4216 - acc: 0.0000e+00
Epoch 4/10
32/816 [>.............................] - ETA: 0s - loss: 4680165.5000 - acc: 0.0000e+00
736/816 [==========================>...] - ETA: 0s - loss: 5050110.3043 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5168771.5490 - acc: 0.0000e+00
Epoch 5/10
32/816 [>.............................] - ETA: 0s - loss: 5932391.0000 - acc: 0.0000e+00
768/816 [===========================>..] - ETA: 0s - loss: 5198882.9167 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5159585.9020 - acc: 0.0000e+00
Epoch 6/10
32/816 [>.............................] - ETA: 0s - loss: 4488318.0000 - acc: 0.0000e+00
768/816 [===========================>..] - ETA: 0s - loss: 5144843.8333 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5151492.1765 - acc: 0.0000e+00
Epoch 7/10
32/816 [>.............................] - ETA: 0s - loss: 6920405.0000 - acc: 0.0000e+00
800/816 [============================>.] - ETA: 0s - loss: 5139358.5000 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5169839.2941 - acc: 0.0000e+00
Epoch 8/10
32/816 [>.............................] - ETA: 0s - loss: 3973038.7500 - acc: 0.0000e+00
672/816 [=======================>......] - ETA: 0s - loss: 5183285.3690 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5141417.0000 - acc: 0.0000e+00
Epoch 9/10
32/816 [>.............................] - ETA: 0s - loss: 4969548.5000 - acc: 0.0000e+00
768/816 [===========================>..] - ETA: 0s - loss: 5126550.1667 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5136524.5098 - acc: 0.0000e+00
Epoch 10/10
32/816 [>.............................] - ETA: 0s - loss: 6334703.5000 - acc: 0.0000e+00
768/816 [===========================>..] - ETA: 0s - loss: 5197778.8229 - acc: 0.0000e+00
816/816 [==============================] - 0s - loss: 5141391.2059 - acc: 0.0000e+00
Why is this happening? My data is a time series. I know that for time series people do not usually use Dense neurons, but it is just a test. What really tricks me is that accuracy is always 0. And, with other tests, I did even lose: gets to a "NAN" value.
Could anybody help here?
Your model seems to correspond to a regression model for the following reasons:
You are using linear (the default one) as an activation function in the output layer (and relu in the layer before).
Your loss is loss='mean_squared_error'.
However, the metric that you use- metrics=['accuracy'] corresponds to a classification problem. If you want to do regression, remove metrics=['accuracy']. That is, use
model.compile(optimizer='adam',loss='mean_squared_error')
Here is a list of keras metrics for regression and classification (taken from this blog post):
Keras Regression Metrics
•Mean Squared Error: mean_squared_error, MSE or mse
•Mean Absolute Error: mean_absolute_error, MAE, mae
•Mean Absolute Percentage Error: mean_absolute_percentage_error, MAPE,
mape
•Cosine Proximity: cosine_proximity, cosine
Keras Classification Metrics
•Binary Accuracy: binary_accuracy, acc
•Categorical Accuracy: categorical_accuracy, acc
•Sparse Categorical Accuracy: sparse_categorical_accuracy
•Top k Categorical Accuracy: top_k_categorical_accuracy (requires you
specify a k parameter)
•Sparse Top k Categorical Accuracy: sparse_top_k_categorical_accuracy
(requires you specify a k parameter)
Add following to get metrics:
history = model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mean_squared_error'])
# OR
history = model.compile(optimizer='adam', loss='mean_absolute_error', metrics=['mean_absolute_error'])
history.history.keys()
history.history
I would like to point out something that is very important and has been unfortunately neglected: mean_squared_error is not an invalid loss function for classification.
The mathematical properties of cross_entropy in conjunction with the assumptions of mean_squared_error(both of which I will not expand upon in this comment) make the latter inappropriate or worse than the cross_entropy when it comes to training on classification problems.
Try this one.
while trying to solve the Titanic problem from kaggle, I forgot to fill the missing data from the Dataframe, because of which the missing data was filled with "nan".
The model threw a similar output
#------------------------------------------------------
Epoch 1/50
891/891 [==============================] - 3s 3ms/step - loss: 9.8239 - acc: 0.0000e+00
Epoch 2/50
891/891 [==============================] - 1s 2ms/step - loss: 9.8231 - acc: 0.0000e+00
Epoch 3/50
891/891 [==============================] - 1s 1ms/step - loss: 9.8231 - acc: 0.0000e+00
Epoch 4/50
891/891 [==============================] - 1s 1ms/step - loss: 9.8231 - acc: 0.0000e+00
Epoch 5/50
891/891 [==============================] - 1s 1ms/step - loss: 9.8231 - acc: 0.0000e+00
#------------------------------------------------------
Make sure you prepare your data before feeding it to the model.
In my case I had to do the following changes
+++++++++++++++++++++++++++++++++++
dataset[['Age']] = dataset[['Age']].fillna(value=dataset[['Age']].mean())
dataset[['Fare']] = dataset[['Fare']].fillna(value=dataset[['Fare']].mean())
dataset[['Embarked']] = dataset[['Embarked']].fillna(value=dataset['Embarked'].value_counts().idxmax())