Val_accuracy is not changing between each epoch - tensorflow

I'm currently learning about gensim word2vec and am building a model that predicts a review's rating from its words. All of my code runs without errors, but every epoch reports the same val_accuracy and almost exactly the same training accuracy. I've tried changing the architecture and hyperparameters, with no luck. Any advice would be appreciated!
import numpy as np
from keras.preprocessing.sequence import pad_sequences
from keras.utils import to_categorical
word_index = {w: i+1 for i,w in enumerate(index_to_key) if i < max_words-1} # Keep just max_words (zero is reserved for unknown)
sequences = [[word_index.get(w, 0) for w in sent] for sent in reviews] # code the sentences
seqs_truncated = pad_sequences(sequences, maxlen=max_review_length, padding="pre", truncating="post")
ratings = np.asarray(Ratings)
# prepare training and validation data
x_val = seqs_truncated[:len_val]
partial_x_train = seqs_truncated[len_val:]
y_val = to_categorical(ratings[:len_val]-1, num_classes=5)
partial_y_train = to_categorical(ratings[len_val:]-1, num_classes=5)
print('Length of validation set =', len(x_val))
print('Length of training set =', len(partial_x_train))
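To see what the encoding step above produces, here is a small pure-Python sketch on a made-up vocabulary. The names `toy_vocab`, `toy_reviews` and `pad_pre_truncate_post` are hypothetical; the helper only imitates `pad_sequences(..., padding="pre", truncating="post")` for illustration and is not the Keras implementation.

```python
# Hypothetical toy data, standing in for index_to_key and reviews.
toy_vocab = ["good", "bad", "movie", "plot", "boring"]
toy_reviews = [["good", "movie"], ["bad", "plot", "really", "boring", "movie"]]
max_words = 4          # keep the first 3 words; index 0 is reserved
max_review_length = 4

# Same construction as in the question: indices start at 1.
word_index = {w: i + 1 for i, w in enumerate(toy_vocab) if i < max_words - 1}

# Words outside the kept vocabulary are coded as 0.
sequences = [[word_index.get(w, 0) for w in sent] for sent in toy_reviews]

def pad_pre_truncate_post(seq, maxlen):
    """Imitate padding="pre", truncating="post": cut the tail, pad the front with 0."""
    seq = seq[:maxlen]
    return [0] * (maxlen - len(seq)) + seq

padded = [pad_pre_truncate_post(s, max_review_length) for s in sequences]
print(word_index)  # {'good': 1, 'bad': 2, 'movie': 3}
print(padded)      # [[0, 0, 1, 3], [2, 0, 0, 0]]
```

Note that in this scheme 0 marks both padding and out-of-vocabulary words, so the embedding layer cannot tell the two apart.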
from keras.models import Sequential
from keras.layers import Bidirectional, LSTM, Dense, Dropout

def sample_network(embedding):
    network = Sequential()
    network.add(embedding)
    network.add(Bidirectional(LSTM(64)))
    network.add(Dense(64, activation='relu'))
    network.add(Dense(32, activation="relu"))
    network.add(Dense(16, activation="relu"))
    network.add(Dropout(0.5))
    network.add(Dense(5, activation='softmax'))
    return network
```
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_24 (Embedding) (None, 100, 300) 1500000
_________________________________________________________________
bidirectional_5 (Bidirection (None, 128) 186880
_________________________________________________________________
dense_16 (Dense) (None, 64) 8256
_________________________________________________________________
dense_17 (Dense) (None, 32) 2080
_________________________________________________________________
dense_18 (Dense) (None, 16) 528
_________________________________________________________________
dropout_10 (Dropout) (None, 16) 0
_________________________________________________________________
dense_19 (Dense) (None, 5) 85
=================================================================
Total params: 1,697,829
Trainable params: 1,697,829
Non-trainable params: 0
```
hist_word_vec = network_wordvec.fit(partial_x_train, partial_y_train, epochs=no_epochs,
                                    batch_size=256, validation_data=(x_val, y_val))
65/65 [==============================] - 39s 606ms/step - loss: 1.5886 - accuracy: 0.4449 - val_loss: 1.5750 - val_accuracy: 0.3875
Epoch 2/8
65/65 [==============================] - 38s 587ms/step - loss: 1.5491 - accuracy: 0.4554 - val_loss: 1.5458 - val_accuracy: 0.3875
Epoch 3/8
65/65 [==============================] - 37s 564ms/step - loss: 1.5152 - accuracy: 0.4554 - val_loss: 1.5215 - val_accuracy: 0.3875
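A validation accuracy that is identical in every epoch often means the network predicts one class for every sample. One way to check, sketched below with made-up labels, is to compare the stuck value with the share of the most frequent class in the validation labels. `toy_val_ratings` and `majority_baseline` are hypothetical; with the real data, `ratings[:len_val]` would take the place of the toy list.

```python
from collections import Counter

def majority_baseline(labels):
    """Return (most common label, fraction of samples with that label)."""
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    return label, count / len(labels)

# Hypothetical ratings on a 1-5 scale, not the asker's data.
toy_val_ratings = [5, 5, 4, 1, 5, 3, 2, 5]
label, share = majority_baseline(toy_val_ratings)
print(label, share)  # 5 0.5
```

If the share matches the reported val_accuracy (0.3875 here), the model is most likely outputting a constant class. The loss values in the log, which sit only slightly below ln(5) ≈ 1.609, point the same way: the softmax output is still close to uniform.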

Related

Invalid Argument Error when using Tensorboard callback

I have used a Tensorboard callback in fitting a model consisting of one embedding layer and one SimpleRNN layer. The model performs binary sentiment classification for 9600 input text sequences. They have been tokenised and padded in advance.
# 1. Remove previous logs
!rm -rf ./logs/
# 2. Change to Py_file_dir
os.chdir(...)
# input_dim = 43489 (size of tokenizer word dictionary); output_dim = 100 (GloVe 100d embeddings); input_length = 1403 (length of longest text sequence).
# xtr_pad is padded, tokenised text sequences. nrow = 9600, ncol = input_length = 1403.
model= Sequential()
model.add(Embedding(input_dim, output_dim, input_length= input_length,
                    weights= [Embedding_matrix], trainable= False))
model.add(SimpleRNN(200))
model.add(Dense(1, activation= 'sigmoid'))
model.compile(loss='binary_crossentropy', optimizer= 'adam', metrics=['accuracy'])
tb = TensorBoard(histogram_freq=1, log_dir= 'tbcallback_prac')
tr_results= model.fit(xtr_pad, ytr, epochs= 2, batch_size= 64, verbose= 1,
                      validation_split= 0.2, callbacks= [tb])
# In command prompt enter: tensorboard --logdir tbcallback_prac
I ran this in JupyterLab, and the first time the model trained without issue. I was able to view the TensorBoard statistics on localhost.
However, when I run the same code a second time (removing the logs and fitting the model again), it completes the first epoch of training but returns this error before the second epoch begins.
Train on 7680 samples, validate on 1920 samples
Epoch 1/2
7680/7680 [==============================] - ETA: 0s - loss: 0.2919 - accuracy: 0.9004
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
<ipython-input-12-a1cde9b5b1f4> in <module>()
7 tb = TensorBoard(histogram_freq=1, log_dir= 'tbcallback_prac')
8 tr_results= model.fit(xtr_pad, ytr, epochs= 2, batch_size= 64, verbose= 1,
----> 9 validation_split= 0.2, callbacks= [tb])
...
InvalidArgumentError: You must feed a value for placeholder tensor 'embedding_input' with dtype float and shape [?,1403]
[[{{node embedding_input}}]]
Note 1403 is the length of all padded, tokenised sequences in training input 'xtr'.
Thanks in advance for any help!
I could not reproduce the issue, but I think it is a dimensions problem when working with logits and sigmoid.
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 3072, 64) 64000
simple_rnn (SimpleRNN) (None, 200) 53000
dense (Dense) (None, 1) 201
=================================================================
Total params: 117,201
Trainable params: 117,201
Non-trainable params: 0
_________________________________________________________________
val_dir: F:\models\checkpoint\ale_highscores_3\validation
Epoch 1/1500
2/2 [==============================] - ETA: 0s - loss: -0.5579 - accuracy: 0.1000[<KerasTensor: shape=(None, 3072) dtype=float32 (created by layer 'embedding_input')>]
<keras.engine.functional.Functional object at 0x00000233003A8550>
Press AnyKey!
2/2 [==============================] - 14s 7s/step - loss: -0.5579 - accuracy: 0.1000 - val_loss: -0.6446 - val_accuracy: 0.1000
Epoch 2/1500
2/2 [==============================] - ETA: 0s - loss: -0.6588 - accuracy: 0.1000[<KerasTensor: shape=(None, 3072) dtype=float32 (created by layer 'embedding_input')>]
<keras.engine.functional.Functional object at 0x00000233003A8C40>
Press AnyKey!
2/2 [==============================] - 13s 7s/step - loss: -0.6588 - accuracy: 0.1000 - val_loss: -0.7242 - val_accuracy: 0.1000
Epoch 3/1500
1/2 [==============>...............] - ETA: 6s - loss: -0.1867 - accuracy: 0.1429

Keras model classifies images as the same class

I am using Keras' pre-trained VGGNet16 as my base model, but I need to add layers on the end to make it work for my data. I've got the data pre-processed and formatted, so I'll jump to the part of the code actually involving the CNN.
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D, AveragePooling2D
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
import tensorflow.keras as K
.
.
.
input_t = K.Input(shape=(224,224,3))
base_model = K.applications.VGG16(include_top=False,weights="imagenet",input_tensor=input_t)
for layer in base_model.layers:
    layer.trainable = False
model = K.models.Sequential()
model.add(base_model)
model.add(Flatten())
model.add(Dense(num_classes,activation='softmax'))
#Training
model.compile(loss = 'binary_crossentropy',
              optimizer = Adam(learning_rate=1e-5),
              metrics = ['accuracy'])
#Fit
model.fit(x_train,
          y_train,
          epochs = 3,
          validation_split = 0.1,
          validation_data =(x_test,y_test))
model.summary()
This code yields the following output.
Epoch 1/3
14/14 [==============================] - 48s 3s/step - loss: 0.5887 - accuracy: 0.2287 - val_loss: 0.4951 - val_accuracy: 0.3000
Epoch 2/3
14/14 [==============================] - 48s 3s/step - loss: 0.5170 - accuracy: 0.2220 - val_loss: 0.4972 - val_accuracy: 0.2800
Epoch 3/3
14/14 [==============================] - 48s 3s/step - loss: 0.4982 - accuracy: 0.2265 - val_loss: 0.4975 - val_accuracy: 0.2200
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg16 (Functional) (None, 7, 7, 512) 14714688
_________________________________________________________________
flatten (Flatten) (None, 25088) 0
_________________________________________________________________
dense (Dense) (None, 5) 125445
=================================================================
Total params: 14,840,133
Trainable params: 125,445
Non-trainable params: 14,714,688
_________________________________________________________________
Test loss: 0.4997282326221466
Test accuracy: 0.2720000147819519
All of my test images are being classified into the same class. Is there any reason for this?
Edit: Modified question upon realizing the model is running properly
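The reported loss of about 0.50 is close to what `binary_crossentropy` gives when a 5-way softmax outputs 0.2 for every class, which would fit a model that has not yet learned to separate the classes. The arithmetic can be checked with a small pure-Python sketch; both functions below are hypothetical re-implementations written for illustration and are not taken from Keras.

```python
import math

def binary_crossentropy(y_true, y_pred):
    """Mean element-wise binary cross-entropy for one sample."""
    terms = [
        -(t * math.log(p) + (1 - t) * math.log(1 - p))
        for t, p in zip(y_true, y_pred)
    ]
    return sum(terms) / len(terms)

def categorical_crossentropy(y_true, y_pred):
    """Categorical cross-entropy for one one-hot sample."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))

one_hot = [0, 0, 1, 0, 0]   # any one-hot label over 5 classes
uniform = [0.2] * 5         # softmax output that favours no class

print(round(binary_crossentropy(one_hot, uniform), 4))       # 0.5004
print(round(categorical_crossentropy(one_hot, uniform), 4))  # 1.6094
```

For single-label, multi-class targets, `categorical_crossentropy` is the usual pairing with a softmax output, so the loss in the `compile` call is worth revisiting.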
Perhaps this isn't a great solution, but I fixed this by making the last six layers trainable and keeping the others frozen:
for layer in base_model.layers[:-6]:
    layer.trainable = False

I use TensorFlow 2 to recognize captcha images, but something goes wrong

This is my code:
number = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
absPath='F:/Projects/AI/Tensorflow/verificationcode/image/'
imagePaths=os.listdir('./image')
model=tf.keras.models.Sequential([
    Conv2D(32,kernel_size=3, activation='relu'),
    Conv2D(32,kernel_size=3, activation='relu'),
    MaxPool2D((2, 2)),
    Conv2D(64, kernel_size=3, activation='relu'),
    Conv2D(64, kernel_size=3, activation='relu'),
    MaxPool2D((2, 2)),
    Conv2D(128, kernel_size=3, activation='relu'),
    Conv2D(128, kernel_size=3, activation='relu'),
    MaxPool2D((2, 2)),
    Conv2D(256, kernel_size=3, activation='relu'),
    Conv2D(256, kernel_size=3, activation='relu'),
    MaxPool2D((2, 2)),
    Flatten(),
    Dropout(0.25),
    Dense(40,activation='softmax')
])
model(inputs=tf.keras.Input(shape=(80,170,3)))
model.compile(optimizer='adam',
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.summary()
history = model.fit(x_train, y_train, batch_size=32,shuffle=True, epochs=5, validation_freq=1)
This is my model:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 78, 168, 32) 896
_________________________________________________________________
conv2d_1 (Conv2D) (None, 76, 166, 32) 9248
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 38, 83, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 36, 81, 64) 18496
_________________________________________________________________
conv2d_3 (Conv2D) (None, 34, 79, 64) 36928
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 17, 39, 64) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 15, 37, 128) 73856
_________________________________________________________________
conv2d_5 (Conv2D) (None, 13, 35, 128) 147584
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 6, 17, 128) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, 4, 15, 256) 295168
_________________________________________________________________
conv2d_7 (Conv2D) (None, 2, 13, 256) 590080
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 1, 6, 256) 0
_________________________________________________________________
flatten (Flatten) (None, 1536) 0
_________________________________________________________________
dropout (Dropout) (None, 1536) 0
_________________________________________________________________
dense (Dense) (None, 40) 61480
=================================================================
Total params: 1,233,736
Trainable params: 1,233,736
Non-trainable params: 0
I train on 3935 samples, and this is the result:
3616/3935 [==========================>...] - ETA: 10s - loss: 14.7558 - accuracy: 0.0105
3648/3935 [==========================>...] - ETA: 9s - loss: 14.7558 - accuracy: 0.0112
3680/3935 [===========================>..] - ETA: 8s - loss: 14.7558 - accuracy: 0.0128
3712/3935 [===========================>..] - ETA: 7s - loss: 14.7558 - accuracy: 0.0129
3744/3935 [===========================>..] - ETA: 6s - loss: 14.7558 - accuracy: 0.0134
3776/3935 [===========================>..] - ETA: 5s - loss: 14.7558 - accuracy: 0.0143
3808/3935 [============================>.] - ETA: 4s - loss: 14.7558 - accuracy: 0.0142
3840/3935 [============================>.] - ETA: 3s - loss: 14.7558 - accuracy: 0.0148
3872/3935 [============================>.] - ETA: 2s - loss: 14.7558 - accuracy: 0.0155
3904/3935 [============================>.] - ETA: 1s - loss: 14.7558 - accuracy: 0.0166
3935/3935 [==============================] - 135s 34ms/sample - loss: 14.7558 - accuracy: 0.0170
This is a captcha image:
captcha image
The loss does not change and the accuracy is very low.
How can I solve this? Thanks!
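One thing that stands out in the code above is that the last layer already applies `softmax` while the loss is created with `from_logits=True`, so a softmax is applied a second time inside the loss. The pure-Python sketch below (the `softmax` helper and the example scores are made up for illustration) shows how a second softmax flattens even a confident prediction over 40 outputs.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores: output 7 is strongly preferred among 40.
scores = [0.0] * 40
scores[7] = 10.0

once = softmax(scores)   # what Dense(40, activation='softmax') emits
twice = softmax(once)    # what a from_logits=True loss then works with

print(round(max(once), 3))   # close to 1: a confident prediction
print(round(max(twice), 3))  # much smaller: the confidence is squashed
```

Because the inputs to the second softmax all lie in [0, 1], its largest output over 40 classes cannot exceed e/(e+39) ≈ 0.065. Either dropping `activation='softmax'` from the last layer or using `from_logits=False` keeps the two consistent. Whether a single 40-way softmax fits the captcha labels at all depends on how `y_train` is encoded, which the question does not show.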

TF 2.2 precision and recall are always returning zeros in training and validation

Hello, I am trying to train a small model using tf.keras with TF 2.2.0. I'm using a generator which returns sequences of shape [5,120,32,64,9] and labels of shape [5,120,1], and I'm importing the metrics from tf.keras:
from tensorflow.keras.metrics import Recall, Precision, Metric
Additionally I am adding them into the compile and fit section
model.compile(
    loss="mse",
    optimizer=Adam(learning_rate=self.learning_rate),
    metrics=[Recall(), Precision()],
    sample_weight_mode="temporal",
)
if callbacks is None:
    callbacks = []
model.fit(
    data.training(),
    callbacks=callbacks,
    steps_per_epoch=epoch_size,
    epochs=epochs,
    validation_data=data.training(),
    validation_steps=validation_size,
    verbose=0,
)
(I'm aware that I'm using the training set as both training and validation data. I'm trying to find a bug in my code or in TF, since recall and precision on validation change abruptly and strongly. They never converge and jump between extremes, for example 0 - 0.8 - 0.2 - 0.9 - 0.4 - 0.8 ...)
Additionally, I'm using a generator which yields tuples of inputs and outputs, since that "corrected the problem". However, I'm still getting precision and recall of 0.00000:
100/100 [==============================] - 224s 2s/step - loss: 0.0371 - recall: 0.0000e+00 - precision: 0.0000e+00 - val_loss: 0.0331 - val_recall: 0.0000e+00 - val_precision: 0.0000e+00
Does anyone know any other trick to use in tf 2.2 that I can use in order to solve that problem?
A summary of my network follows:
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) [(None, None, 32, 64, 9)] 0
_________________________________________________________________
conv_lst_m2d_1 (ConvLSTM2D) (None, None, 30, 62, 20) 20960
_________________________________________________________________
time_distributed_MP_1 (TimeD (None, None, 15, 31, 20) 0
_________________________________________________________________
time_distributed_BN_1 (TimeD (None, None, 15, 31, 20) 80
_________________________________________________________________
time_distributed_F (TimeDist (None, None, 9300) 0
_________________________________________________________________
time_distributed_D1 (TimeDis (None, None, 32) 297632
_________________________________________________________________
time_distributed (TimeDistri (None, None, 32) 0
_________________________________________________________________
time_distributed_D2 (TimeDis (None, None, 24) 792
_________________________________________________________________
time_distributed_1 (TimeDist (None, None, 24) 0
_________________________________________________________________
time_distributed_D3 (TimeDis (None, None, 16) 400
_________________________________________________________________
time_distributed_2 (TimeDist (None, None, 16) 0
_________________________________________________________________
output (TimeDistributed) (None, None, 1) 17
=================================================================
This was happening to me, and I finally figured out why: my data was ordered. For example, all my positive samples were at the beginning of the array and all my negative samples were at the end, so for long stretches of training the network only saw samples of a single class.
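If ordered data is the cause, shuffling inputs and labels with one shared permutation before splitting or batching is a simple remedy. A minimal sketch with made-up data follows; `shuffle_together` is a hypothetical helper, and with a generator the shuffle would need to happen wherever the sample order is decided.

```python
import random

def shuffle_together(inputs, labels, seed=0):
    """Shuffle two equal-length lists with the same permutation."""
    if len(inputs) != len(labels):
        raise ValueError("inputs and labels must have the same length")
    order = list(range(len(inputs)))
    random.Random(seed).shuffle(order)
    return [inputs[i] for i in order], [labels[i] for i in order]

# Hypothetical ordered data: all positives first, then all negatives.
xs = ["p0", "p1", "p2", "n0", "n1", "n2"]
ys = [1, 1, 1, 0, 0, 0]

xs_shuffled, ys_shuffled = shuffle_together(xs, ys)
# Each input keeps its own label after shuffling.
print(list(zip(xs_shuffled, ys_shuffled)))
```

The fixed `seed` only makes the sketch reproducible; in training you would normally reshuffle with a different order every epoch.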

Why is TensorFlow GPU not working with larger batch sizes?

I am training an autoencoder network on TensorFlow GPU 1.13.1. Initially I used batch sizes of 32/64/128, but it seems the GPU is not being used at all, although "Memory-Usage" from nvidia-smi reports the following:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.39       Driver Version: 418.39       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:06:00.0 Off |                    0 |
| N/A   34C    P0    53W / 300W |  31316MiB / 32480MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
The training also stalls at the 39th step every time.
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) (None, 256, 256, 3) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, 64, 64, 96) 34944
_________________________________________________________________
batch_normalization_6 (Batch (None, 64, 64, 96) 384
_________________________________________________________________
activation_6 (Activation) (None, 64, 64, 96) 0
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 31, 31, 96) 0
_________________________________________________________________
conv2d_7 (Conv2D) (None, 31, 31, 256) 614656
_________________________________________________________________
batch_normalization_7 (Batch (None, 31, 31, 256) 1024
_________________________________________________________________
activation_7 (Activation) (None, 31, 31, 256) 0
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 15, 15, 256) 0
_________________________________________________________________
conv2d_8 (Conv2D) (None, 15, 15, 384) 885120
_________________________________________________________________
batch_normalization_8 (Batch (None, 15, 15, 384) 1536
_________________________________________________________________
activation_8 (Activation) (None, 15, 15, 384) 0
_________________________________________________________________
conv2d_9 (Conv2D) (None, 15, 15, 384) 1327488
_________________________________________________________________
batch_normalization_9 (Batch (None, 15, 15, 384) 1536
_________________________________________________________________
activation_9 (Activation) (None, 15, 15, 384) 0
_________________________________________________________________
conv2d_10 (Conv2D) (None, 15, 15, 256) 884992
_________________________________________________________________
batch_normalization_10 (Batc (None, 15, 15, 256) 1024
_________________________________________________________________
activation_10 (Activation) (None, 15, 15, 256) 0
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 7, 7, 256) 0
_________________________________________________________________
conv2d_11 (Conv2D) (None, 1, 1, 1024) 12846080
_________________________________________________________________
batch_normalization_11 (Batc (None, 1, 1, 1024) 4096
_________________________________________________________________
encoded (Activation) (None, 1, 1, 1024) 0
_________________________________________________________________
reshape_1 (Reshape) (None, 2, 2, 256) 0
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 4, 4, 128) 819328
_________________________________________________________________
activation_11 (Activation) (None, 4, 4, 128) 0
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 8, 8, 64) 204864
_________________________________________________________________
activation_12 (Activation) (None, 8, 8, 64) 0
_________________________________________________________________
conv2d_transpose_3 (Conv2DTr (None, 16, 16, 32) 51232
_________________________________________________________________
activation_13 (Activation) (None, 16, 16, 32) 0
_________________________________________________________________
conv2d_transpose_4 (Conv2DTr (None, 32, 32, 16) 12816
_________________________________________________________________
activation_14 (Activation) (None, 32, 32, 16) 0
_________________________________________________________________
conv2d_transpose_5 (Conv2DTr (None, 64, 64, 8) 3208
_________________________________________________________________
activation_15 (Activation) (None, 64, 64, 8) 0
_________________________________________________________________
conv2d_transpose_6 (Conv2DTr (None, 128, 128, 4) 804
_________________________________________________________________
activation_16 (Activation) (None, 128, 128, 4) 0
_________________________________________________________________
conv2d_transpose_7 (Conv2DTr (None, 256, 256, 3) 303
=================================================================
Total params: 17,695,435
Trainable params: 17,690,635
Non-trainable params: 4,800
_________________________________________________________________
Epoch 1/1
Found 11058 images belonging to 1 classes.
Found 11058 images belonging to 1 classes.
Found 11058 images belonging to 1 classes.
Found 44234 images belonging to 1 classes.
Found 11058 images belonging to 1 classes.
Found 44234 images belonging to 1 classes.
Found 44234 images belonging to 1 classes.
Found 44234 images belonging to 1 classes.
1/1382 [..............................] - ETA: 19:43:47 - loss: 0.6934 - accuracy: 0.1511
2/1382 [..............................] - ETA: 10:04:16 - loss: 0.6933 - accuracy: 0.1545
3/1382 [..............................] - ETA: 7:28:21 - loss: 0.6933 - accuracy: 0.1571
4/1382 [..............................] - ETA: 6:07:30 - loss: 0.6932 - accuracy: 0.1590
5/1382 [..............................] - ETA: 5:21:58 - loss: 0.6931 - accuracy: 0.1614
6/1382 [..............................] - ETA: 4:55:45 - loss: 0.6930 - accuracy: 0.1648
7/1382 [..............................] - ETA: 4:32:58 - loss: 0.6929 - accuracy: 0.1668
8/1382 [..............................] - ETA: 4:15:07 - loss: 0.6929 - accuracy: 0.1692
9/1382 [..............................] - ETA: 4:02:22 - loss: 0.6928 - accuracy: 0.1726
10/1382 [..............................] - ETA: 3:50:11 - loss: 0.6926 - accuracy: 0.1745
11/1382 [..............................] - ETA: 3:39:13 - loss: 0.6925 - accuracy: 0.1769
12/1382 [..............................] - ETA: 3:29:38 - loss: 0.6924 - accuracy: 0.1797
13/1382 [..............................] - ETA: 3:21:11 - loss: 0.6923 - accuracy: 0.1824
14/1382 [..............................] - ETA: 3:13:42 - loss: 0.6922 - accuracy: 0.1845
15/1382 [..............................] - ETA: 3:07:17 - loss: 0.6920 - accuracy: 0.1871
16/1382 [..............................] - ETA: 3:01:59 - loss: 0.6919 - accuracy: 0.1896
17/1382 [..............................] - ETA: 2:57:36 - loss: 0.6918 - accuracy: 0.1916
18/1382 [..............................] - ETA: 2:53:06 - loss: 0.6917 - accuracy: 0.1938
19/1382 [..............................] - ETA: 2:49:37 - loss: 0.6915 - accuracy: 0.1956
20/1382 [..............................] - ETA: 2:45:51 - loss: 0.6915 - accuracy: 0.1979
21/1382 [..............................] - ETA: 2:43:18 - loss: 0.6914 - accuracy: 0.2000
22/1382 [..............................] - ETA: 2:41:02 - loss: 0.6913 - accuracy: 0.2022
23/1382 [..............................] - ETA: 2:39:23 - loss: 0.6912 - accuracy: 0.2039
24/1382 [..............................] - ETA: 2:37:23 - loss: 0.6911 - accuracy: 0.2060
25/1382 [..............................] - ETA: 2:35:58 - loss: 0.6909 - accuracy: 0.2080
26/1382 [..............................] - ETA: 2:34:06 - loss: 0.6909 - accuracy: 0.2098
27/1382 [..............................] - ETA: 2:33:19 - loss: 0.6908 - accuracy: 0.2115
28/1382 [..............................] - ETA: 2:32:24 - loss: 0.6906 - accuracy: 0.2130
29/1382 [..............................] - ETA: 2:31:43 - loss: 0.6904 - accuracy: 0.2143
30/1382 [..............................] - ETA: 2:31:09 - loss: 0.6904 - accuracy: 0.2157
31/1382 [..............................] - ETA: 2:30:34 - loss: 0.6902 - accuracy: 0.2173
32/1382 [..............................] - ETA: 2:29:26 - loss: 0.6901 - accuracy: 0.2185
33/1382 [..............................] - ETA: 2:28:55 - loss: 0.6900 - accuracy: 0.2199
34/1382 [..............................] - ETA: 2:28:05 - loss: 0.6899 - accuracy: 0.2213
35/1382 [..............................] - ETA: 2:27:23 - loss: 0.6898 - accuracy: 0.2227
36/1382 [..............................] - ETA: 2:27:02 - loss: 0.6897 - accuracy: 0.2238
37/1382 [..............................] - ETA: 2:26:56 - loss: 0.6895 - accuracy: 0.2253
38/1382 [..............................] - ETA: 2:26:32 - loss: 0.6893 - accuracy: 0.2266
39/1382 [..............................] - ETA: 2:26:11 - loss: 0.6891 - accuracy: 0.2278
Even after waiting for hours, the training process doesn't move any further.
Another unusual thing I noticed is that when I set the batch size to 1, the GPU is continuously utilized.
What could be the problem?
This might be an issue with the drive where you placed the dataset. The code was working fine everywhere but not on this server. I changed the drive (from one NFS share to another) and everything works well.
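A quick way to test the storage hypothesis is to time raw file reads from the dataset location without TensorFlow involved. The sketch below is one way to do that; `measure_read_throughput` and the temporary directory are hypothetical stand-ins, and in practice you would point the function at the real image directory on each drive.

```python
import os
import tempfile
import time

def measure_read_throughput(directory, max_files=100):
    """Read up to max_files files and return (files_read, bytes_read, seconds)."""
    files_read = 0
    bytes_read = 0
    start = time.perf_counter()
    for root, _dirs, names in os.walk(directory):
        for name in names:
            if files_read >= max_files:
                break
            with open(os.path.join(root, name), "rb") as fh:
                bytes_read += len(fh.read())
            files_read += 1
        if files_read >= max_files:
            break
    return files_read, bytes_read, time.perf_counter() - start

# Demo on a throwaway directory; replace with the dataset path to compare drives.
with tempfile.TemporaryDirectory() as demo_dir:
    for i in range(3):
        with open(os.path.join(demo_dir, f"img_{i}.bin"), "wb") as fh:
            fh.write(b"\0" * 1024)
    n, size, seconds = measure_read_throughput(demo_dir)
    print(n, size)  # 3 3072
```

If the same number of files takes far longer on one mount than on another, the input pipeline rather than the GPU is the likely bottleneck, which would also be consistent with 0% GPU-Util while memory is allocated. Keep in mind that the operating system may cache files, so a second run over the same files can look faster than the first.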