Cosine Similarity function as loss function and metric in Keras Deep-learning - tensorflow

I am training an encoder-decoder network for time-series sequence prediction (multi-step input and multi-step output). Since my output is a sequence, I chose cosine similarity as both my loss function and my metric: it is important for each value of the output sequence to be as close as possible to the ground truth after training, and I felt this was the best way to measure that.
Ideally, during training, I expected the loss value and the metric to be equal (since the same function is used for both) except for the sign (a negative loss tending towards -1, a positive metric tending towards +1). I also expected both the loss and the metric to improve or deteriorate in the same direction in each epoch. However, this is not happening, and my training epochs are as follows:
Epoch 1/20
200/200 - 73s - loss: -9.3310e-01 - cosine_similarity: 0.8913 - 73s/epoch - 365ms/step
Epoch 2/20
200/200 - 72s - loss: -9.9833e-01 - cosine_similarity: 0.9779 - 72s/epoch - 362ms/step
Epoch 3/20
200/200 - 68s - loss: -9.9924e-01 - cosine_similarity: 0.9813 - 68s/epoch - 338ms/step
Epoch 4/20
200/200 - 69s - loss: -9.9955e-01 - cosine_similarity: 0.9852 - 69s/epoch - 344ms/step
Epoch 5/20
200/200 - 68s - loss: -9.9970e-01 - cosine_similarity: 0.9890 - 68s/epoch - 338ms/step
Epoch 6/20
200/200 - 69s - loss: -9.9978e-01 - cosine_similarity: 0.9920 - 69s/epoch - 344ms/step
Epoch 7/20
200/200 - 67s - loss: -9.9982e-01 - cosine_similarity: 0.9922 - 67s/epoch - 337ms/step
Epoch 8/20
200/200 - 67s - loss: -9.9985e-01 - cosine_similarity: 0.9903 - 67s/epoch - 334ms/step
Epoch 9/20
200/200 - 67s - loss: -9.9987e-01 - cosine_similarity: 0.9871 - 67s/epoch - 336ms/step
Epoch 10/20
200/200 - 68s - loss: -9.9988e-01 - cosine_similarity: 0.9830 - 68s/epoch - 340ms/step
Epoch 11/20
200/200 - 67s - loss: -9.9989e-01 - cosine_similarity: 0.9784 - 67s/epoch - 337ms/step
Epoch 12/20
200/200 - 67s - loss: -9.9990e-01 - cosine_similarity: 0.9736 - 67s/epoch - 336ms/step
Epoch 13/20
200/200 - 72s - loss: -9.9991e-01 - cosine_similarity: 0.9684 - 72s/epoch - 360ms/step
Epoch 14/20
200/200 - 70s - loss: -9.9991e-01 - cosine_similarity: 0.9639 - 70s/epoch - 352ms/step
Epoch 15/20
200/200 - 68s - loss: -9.9991e-01 - cosine_similarity: 0.9589 - 68s/epoch - 341ms/step
Epoch 15: early stopping
As can be seen above, my loss has been improving progressively (and finally triggered early stopping on account of min_delta=0.0001). However, my metric initially improved and then started deteriorating progressively. Also, the absolute values of the loss and the metric differ slightly (and the gap widens with each epoch).
Is this normal or am I doing something wrong? Any suggestions?
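For reference, the setup described above corresponds roughly to a compile call like this (a minimal sketch; the placeholder model, the axis arguments and the EarlyStopping settings are assumptions, not my actual training code):
import tensorflow as tf

# Placeholder for the encoder-decoder; only the loss/metric wiring matters here.
model = tf.keras.Sequential([tf.keras.layers.Dense(8)])
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.CosineSimilarity(axis=-1),        # minimized, so it tends towards -1
    metrics=[tf.keras.metrics.CosineSimilarity(axis=-1)],  # reported as a positive running mean
)
# Early stopping on the training loss with min_delta=0.0001 (the patience value is assumed).
early_stop = tf.keras.callbacks.EarlyStopping(monitor='loss', min_delta=0.0001, patience=3)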

Related

Unclear difference between progress output in TensorFlow 2.3.0 and 1.15.0 in same code

I am new to ML. I have installed two versions of TensorFlow, 1.15.0 and 2.3.0, in two Anaconda environments (1.15.0 so that I can use my old GTX 660 video card), and I noticed a difference in the progress output when training the same model.
Code from the book "Deep Learning with Python" by François Chollet:
import numpy as np
import os

data_dir = 'C:/Users/Username/_JupyterDocs/sund/data'
fname = os.path.join(data_dir, 'jena_climate_2009_2016.csv')

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

f = open(fname)
data = f.read()
f.close()

lines = data.split('\n')
header = lines[0].split(',')
lines = lines[1:]

float_data = np.zeros((len(lines), len(header) - 1))
for i, line in enumerate(lines):
    values = [float(x) for x in line.split(',')[1:]]
    float_data[i, :] = values

mean = float_data[:200000].mean(axis=0)
float_data -= mean
std = float_data[:200000].std(axis=0)
float_data /= std

def generator(data, lookback, delay, min_index, max_index, shuffle=False, batch_size=128, step=6):
    if max_index is None:
        max_index = len(data) - delay - 1
    i = min_index + lookback
    while 1:
        if shuffle:
            rows = np.random.randint(min_index + lookback, max_index, size=batch_size)
        else:
            if i + batch_size >= max_index:
                i = min_index + lookback
            rows = np.arange(i, min(i + batch_size, max_index))
            i += len(rows)
        samples = np.zeros((len(rows), lookback // step, data.shape[-1]))
        targets = np.zeros((len(rows),))
        for j, row in enumerate(rows):
            indices = range(rows[j] - lookback, rows[j], step)
            samples[j] = data[indices]
            targets[j] = data[rows[j] + delay][1]
        yield samples, targets

lookback = 1440
step = 6
delay = 144
batch_size = 128

train_gen = generator(float_data,
                      lookback=lookback,
                      delay=delay,
                      min_index=0,
                      max_index=200000,
                      shuffle=True,
                      step=step,
                      batch_size=batch_size)
val_gen = generator(float_data,
                    lookback=lookback,
                    delay=delay,
                    min_index=200001,
                    max_index=300000,
                    step=step,
                    batch_size=batch_size)
test_gen = generator(float_data,
                     lookback=lookback,
                     delay=delay,
                     min_index=300001,
                     max_index=None,
                     step=step,
                     batch_size=batch_size)

val_steps = (300000 - 200001 - lookback) // batch_size
test_steps = (len(float_data) - 300001 - lookback) // batch_size

import time
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from tensorflow.keras.optimizers import RMSprop

model = Sequential()
model.add(layers.GRU(32, input_shape=(None, float_data.shape[-1])))
model.add(layers.Dense(1))
model.compile(optimizer=RMSprop(), loss='mae')

start = time.perf_counter()
history = model.fit_generator(train_gen,
                              steps_per_epoch=500,
                              epochs=20,
                              validation_data=val_gen,
                              validation_steps=val_steps,
                              verbose=1)
elapsed = time.perf_counter() - start

f = open("C:/Users/Username/Desktop/log1.txt", "a")
f.write('Elapsed %.3f seconds.' % elapsed)
f.close()
print('Elapsed %.3f seconds.' % elapsed)
TF 2.3.0 progress output:
- Warning about deprecation in the output:
WARNING:tensorflow:From C:\Users\Username\AppData\Local\Temp/ipykernel_10804/2601851929.py:13: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version. Instructions for updating: Please use Model.fit, which supports generators.
-Output:
Epoch 1/20
500/500 [==============================] - 45s 89ms/step - loss: 0.3050 - val_loss: 0.2686
Epoch 2/20
500/500 [==============================] - 45s 90ms/step - loss: 0.2841 - val_loss: 0.2658
Epoch 3/20
500/500 [==============================] - 46s 92ms/step - loss: 0.2771 - val_loss: 0.2653
Epoch 4/20
500/500 [==============================] - 46s 91ms/step - loss: 0.2729 - val_loss: 0.2795
Epoch 5/20
500/500 [==============================] - 45s 90ms/step - loss: 0.2690 - val_loss: 0.2644
Epoch 6/20
500/500 [==============================] - 45s 90ms/step - loss: 0.2632 - val_loss: 0.2673
Epoch 7/20
500/500 [==============================] - 45s 90ms/step - loss: 0.2602 - val_loss: 0.2641
Epoch 8/20
500/500 [==============================] - 45s 90ms/step - loss: 0.2549 - val_loss: 0.2667
Epoch 9/20
500/500 [==============================] - 45s 91ms/step - loss: 0.2507 - val_loss: 0.2768
Epoch 10/20
500/500 [==============================] - 45s 90ms/step - loss: 0.2447 - val_loss: 0.2785
Epoch 11/20
500/500 [==============================] - 45s 90ms/step - loss: 0.2422 - val_loss: 0.2763
Epoch 12/20
500/500 [==============================] - 45s 90ms/step - loss: 0.2354 - val_loss: 0.2794
Epoch 13/20
500/500 [==============================] - 46s 92ms/step - loss: 0.2320 - val_loss: 0.2807
Epoch 14/20
500/500 [==============================] - 45s 89ms/step - loss: 0.2277 - val_loss: 0.2848
Epoch 15/20
500/500 [==============================] - 45s 90ms/step - loss: 0.2222 - val_loss: 0.2909
Epoch 16/20
500/500 [==============================] - 45s 90ms/step - loss: 0.2178 - val_loss: 0.2910
Epoch 17/20
500/500 [==============================] - 45s 89ms/step - loss: 0.2152 - val_loss: 0.2918
Epoch 18/20
500/500 [==============================] - 45s 90ms/step - loss: 0.2112 - val_loss: 0.2917
Epoch 19/20
500/500 [==============================] - 44s 89ms/step - loss: 0.2103 - val_loss: 0.2979
Epoch 20/20
500/500 [==============================] - 45s 89ms/step - loss: 0.2068 - val_loss: 0.2986
Elapsed 904.779 seconds.
TF 1.15.0 progress output:
- Warning about deprecation in the output:
WARNING:tensorflow:From C:\Users\Username\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\ops\resource_variable_ops.py:1630: calling BaseResourceVariable.init (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version. Instructions for updating: If using Keras pass *_constraint arguments to layers.
Output:
Epoch 1/20
WARNING:tensorflow:From C:\Users\Username\anaconda3\envs\tf-gpu\lib\site-packages\tensorflow_core\python\ops\math_grad.py:1424: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
499/500 [============================>.] - ETA: 0s - loss: 0.3014Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.2285
500/500 [==============================] - 63s 126ms/step - loss: 0.3014 - val_loss: 0.2686
Epoch 2/20
499/500 [============================>.] - ETA: 0s - loss: 0.2836Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.2225
500/500 [==============================] - 62s 123ms/step - loss: 0.2836 - val_loss: 0.2667
Epoch 3/20
499/500 [============================>.] - ETA: 0s - loss: 0.2761Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.3162
500/500 [==============================] - 62s 123ms/step - loss: 0.2762 - val_loss: 0.2721
Epoch 4/20
499/500 [============================>.] - ETA: 0s - loss: 0.2731Epoch 1/20
769/500 [==============================================] - 16s 21ms/step - loss: 0.2422
500/500 [==============================] - 62s 124ms/step - loss: 0.2730 - val_loss: 0.2667
Epoch 5/20
499/500 [============================>.] - ETA: 0s - loss: 0.2667Epoch 1/20
769/500 [==============================================] - 16s 21ms/step - loss: 0.3732
500/500 [==============================] - 61s 122ms/step - loss: 0.2667 - val_loss: 0.2663
Epoch 6/20
499/500 [============================>.] - ETA: 0s - loss: 0.2613Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.2088
500/500 [==============================] - 62s 124ms/step - loss: 0.2613 - val_loss: 0.2648
Epoch 7/20
499/500 [============================>.] - ETA: 0s - loss: 0.2544Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.3043
500/500 [==============================] - 62s 125ms/step - loss: 0.2544 - val_loss: 0.2710
Epoch 8/20
499/500 [============================>.] - ETA: 0s - loss: 0.2493Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.2767
500/500 [==============================] - 63s 127ms/step - loss: 0.2493 - val_loss: 0.2717
Epoch 9/20
499/500 [============================>.] - ETA: 0s - loss: 0.2455Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.2336
500/500 [==============================] - 62s 124ms/step - loss: 0.2455 - val_loss: 0.2743
Epoch 10/20
499/500 [============================>.] - ETA: 0s - loss: 0.2406Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.3041
500/500 [==============================] - 63s 126ms/step - loss: 0.2406 - val_loss: 0.2776
Epoch 11/20
499/500 [============================>.] - ETA: 0s - loss: 0.2345Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.2655
500/500 [==============================] - 62s 124ms/step - loss: 0.2344 - val_loss: 0.2779
Epoch 12/20
499/500 [============================>.] - ETA: 0s - loss: 0.2310Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.3085
500/500 [==============================] - 62s 124ms/step - loss: 0.2310 - val_loss: 0.2800
Epoch 13/20
499/500 [============================>.] - ETA: 0s - loss: 0.2271Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.3029
500/500 [==============================] - 64s 127ms/step - loss: 0.2271 - val_loss: 0.2839
Epoch 14/20
499/500 [============================>.] - ETA: 0s - loss: 0.2226Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.3110
500/500 [==============================] - 62s 125ms/step - loss: 0.2226 - val_loss: 0.2886
Epoch 15/20
499/500 [============================>.] - ETA: 0s - loss: 0.2190Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.3329
500/500 [==============================] - 62s 123ms/step - loss: 0.2190 - val_loss: 0.2919
Epoch 16/20
499/500 [============================>.] - ETA: 0s - loss: 0.2170Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.3022
500/500 [==============================] - 62s 125ms/step - loss: 0.2170 - val_loss: 0.2937
Epoch 17/20
499/500 [============================>.] - ETA: 0s - loss: 0.2132Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.2463
500/500 [==============================] - 62s 124ms/step - loss: 0.2132 - val_loss: 0.3004
Epoch 18/20
499/500 [============================>.] - ETA: 0s - loss: 0.2101Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.3423
500/500 [==============================] - 62s 124ms/step - loss: 0.2101 - val_loss: 0.3018
Epoch 19/20
499/500 [============================>.] - ETA: 0s - loss: 0.2072Epoch 1/20
769/500 [==============================================] - 17s 23ms/step - loss: 0.2689
500/500 [==============================] - 62s 125ms/step - loss: 0.2073 - val_loss: 0.3045
Epoch 20/20
499/500 [============================>.] - ETA: 0s - loss: 0.2066Epoch 1/20
769/500 [==============================================] - 17s 22ms/step - loss: 0.2809
500/500 [==============================] - 62s 124ms/step - loss: 0.2066 - val_loss: 0.2978
Elapsed 1245.008 seconds.
What are the two additional progress bars in each epoch of the TF 1.15.0 output?
From the documentation:
verbose: Integer. 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. Default is 1.
And the deprecation message is an internal TensorFlow warning that you can safely ignore; it only tells you about future versions of TensorFlow, and no action is needed on your side.
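For example, with the training code above (a sketch; in TF 2.x, Model.fit accepts the same generators, which also addresses the fit_generator deprecation warning):
# verbose=2 prints one line per epoch instead of a per-step progress bar.
history = model.fit(train_gen,
                    steps_per_epoch=500,
                    epochs=20,
                    validation_data=val_gen,
                    validation_steps=val_steps,
                    verbose=2)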

Binary Classification for binary dataset with DNN

I have a dataset of binary data like this:
age (0-9) | age (10-19) | age (20-59) | age (10-19) | gender (male) | gender (female) | ... | desired (very much) | desired (moderate) | desired (little) | desired (None)
1         | 0           | 0           | 0           | 0             | 1               | ... | 0                   | 1                  | 0                | 0
0         | 0           | 1           | 0           | 1             | 0               | ... | 1                   | 0                  | 0                | 0
The features here are the first few columns, and the target is the final 4 columns.
I'm trying to fit a DNN implemented with tensorflow/keras to this data.
Here's my model and code:
input_layer = Input(shape=(len(x_training)))
x = Dense(30, activation="relu")(input_layer)
x = Dense(20, activation="relu")(x)
x = Dense(10, activation="relu")(x)
x = Dense(5, activation="relu")(x)
output_layer = Dense(4, activation="softmax")(x)

model = Model(inputs=input_layer, outputs=output_layer)
model.compile(optimizer="sgd",
              loss="categorical_crossentropy",
              metrics=['accuracy'])
model.fit(x=x_train,
          y=y_train,
          batch_size=128,
          epochs=10,
          validation_data=(x_validate, y_validate))
And this is the training history:
Epoch 1/10
2005/2005 [==============================] - 9s 4ms/step - loss: 1.3864 - accuracy: 0.2525 - val_loss: 1.3863 - val_accuracy: 0.2533
Epoch 2/10
2005/2005 [==============================] - 6s 3ms/step - loss: 1.3863 - accuracy: 0.2518 - val_loss: 1.3864 - val_accuracy: 0.2486
Epoch 3/10
2005/2005 [==============================] - 6s 3ms/step - loss: 1.3863 - accuracy: 0.2499 - val_loss: 1.3863 - val_accuracy: 0.2487
Epoch 4/10
2005/2005 [==============================] - 6s 3ms/step - loss: 1.3863 - accuracy: 0.2515 - val_loss: 1.3863 - val_accuracy: 0.2539
Epoch 5/10
2005/2005 [==============================] - 6s 3ms/step - loss: 1.3863 - accuracy: 0.2511 - val_loss: 1.3863 - val_accuracy: 0.2504
Epoch 6/10
2005/2005 [==============================] - 6s 3ms/step - loss: 1.3863 - accuracy: 0.2501 - val_loss: 1.3863 - val_accuracy: 0.2484
Epoch 7/10
2005/2005 [==============================] - 6s 3ms/step - loss: 1.3863 - accuracy: 0.2511 - val_loss: 1.3863 - val_accuracy: 0.2468
Epoch 8/10
2005/2005 [==============================] - 6s 3ms/step - loss: 1.3863 - accuracy: 0.2509 - val_loss: 1.3863 - val_accuracy: 0.2519
Epoch 9/10
2005/2005 [==============================] - 6s 3ms/step - loss: 1.3863 - accuracy: 0.2505 - val_loss: 1.3863 - val_accuracy: 0.2463
Epoch 10/10
2005/2005 [==============================] - 6s 3ms/step - loss: 1.3863 - accuracy: 0.2512 - val_loss: 1.3863 - val_accuracy: 0.2474
<tensorflow.python.keras.callbacks.History at 0x7f6893c61e90>
The accuracy and the loss don't change at all. I have tried the following experiments, and all gave the same result:
changed the hidden layers' activation to sigmoid and tanh
changed the final layer to a single node, labeled y_train with (1,2,3) instead of one-hot encoding, and changed the loss function to sparse categorical cross-entropy
changed the optimizer to Adam
changed the data to be in (-1,1) instead of (0,1)
What am I missing here?
I figured out a method to solve this problem. I don't think it's very scientific, but it worked for my case.
First, I replaced every "1" in the training dataset with "0.8" and every "0" with "0.2".
Then I collapsed each group of related features with a weight vector. For example, if the age is "18", the one-hot features are [0,1,0,0]; after the first step they become [0.2,0.8,0.2,0.2]; multiplying this element-wise by [0.1,0.2,0.3,0.4] and summing gives 0.32, which represents the age "18" in a single value.
By applying the previous stages to the features, I got an array of length 15 instead of 22.
The third stage was to apply dimensionality reduction using PCA to bring the number of features down to 10.
This method essentially extracts new features from the existing ones by mapping them to a new domain instead of the binary domain.
This gave me an accuracy of about 85%, which was very satisfying.
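A minimal sketch of that transformation (the weight vectors, group sizes, and data below are illustrative assumptions, not the exact values used):
import numpy as np
from sklearn.decomposition import PCA

def encode_group(onehot, weights):
    # Replace 1 -> 0.8 and 0 -> 0.2, then take a weighted sum of the group.
    soft = np.where(onehot == 1, 0.8, 0.2)
    return float(np.dot(soft, weights))

# Example: the age group [0, 1, 0, 0] with weights [0.1, 0.2, 0.3, 0.4] -> 0.32
age_value = encode_group(np.array([0, 1, 0, 0]), np.array([0.1, 0.2, 0.3, 0.4]))

# After encoding every group this way, the 22 binary columns shrink to 15 values;
# PCA then reduces them further to 10 components.
X_encoded = np.random.rand(100, 15)  # placeholder for the encoded feature matrix
X_reduced = PCA(n_components=10).fit_transform(X_encoded)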

Validation accuracy zero and Loss is higher. Intent classification Using LSTM

I'm trying to build an LSTM model for intent classification using TensorFlow/Keras. But whenever I train the model for 30 or 40 epochs, the validation accuracy is zero for the first ~20 epochs and the loss is higher than the accuracy. And if I try to change the code a little, the validation accuracy ends up even lower than the loss.
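For reference, the summary below corresponds roughly to a model like this (a sketch inferred from the layer shapes and parameter counts; the hidden activation is an assumption):
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(1000, 16, input_length=100),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(24, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),
])
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])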
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 100, 16) 16000
_________________________________________________________________
bidirectional (Bidirectional (None, 64) 12544
_________________________________________________________________
dense (Dense) (None, 24) 1560
_________________________________________________________________
dense_1 (Dense) (None, 3) 75
=================================================================
Total params: 30,179
Trainable params: 30,179
Non-trainable params: 0
_________________________________________________________________
Train on 200 samples, validate on 79 samples
loss='sparse_categorical_crossentropy',optimizer='adam',metrics=['accuracy']
Epoch 1/40
2020-08-16 14:08:17.986786: W tensorflow/core/grappler/optimizers/implementation_selector.cc:310] Skipping optimization due to error while loading function libraries: Invalid argument: Functions '__inference___backward_standard_lstm_4932_5417' and '__inference___backward_standard_lstm_4932_5417_specialized_for_StatefulPartitionedCall_at___inference_distributed_function_6117' both implement 'lstm_6b97c168-2c7b-4dc3-93dd-5e68cddc574f' but their signatures do not match.
2020-08-16 14:08:20.550894: W tensorflow/core/grappler/optimizers/implementation_selector.cc:310] Skipping optimization due to error while loading function libraries: Invalid argument: Functions '__inference_standard_lstm_6338' and '__inference_standard_lstm_6338_specialized_for_sequential_bidirectional_forward_lstm_StatefulPartitionedCall_at___inference_distributed_function_7185' both implement 'lstm_62f19cc7-a1f0-447b-a17f-84a70fc095cd' but their signatures do not match.
200/200 - 10s - loss: 1.0798 - accuracy: 0.7850 - val_loss: 1.1327 - val_accuracy: 0.0000e+00
Epoch 2/40
200/200 - 1s - loss: 1.0286 - accuracy: 0.7850 - val_loss: 1.1956 - val_accuracy: 0.0000e+00
Epoch 3/40
200/200 - 1s - loss: 0.9294 - accuracy: 0.7850 - val_loss: 1.4287 - val_accuracy: 0.0000e+00
Epoch 4/40
200/200 - 1s - loss: 0.7026 - accuracy: 0.7850 - val_loss: 2.2190 - val_accuracy: 0.0000e+00
Epoch 5/40
200/200 - 1s - loss: 0.6183 - accuracy: 0.7850 - val_loss: 1.8499 - val_accuracy: 0.0000e+00
Epoch 6/40
200/200 - 1s - loss: 0.5980 - accuracy: 0.7850 - val_loss: 1.5809 - val_accuracy: 0.0000e+00
Epoch 7/40
200/200 - 1s - loss: 0.5927 - accuracy: 0.7850 - val_loss: 1.5118 - val_accuracy: 0.0000e+00
Epoch 8/40
200/200 - 1s - loss: 0.5861 - accuracy: 0.7850 - val_loss: 1.5711 - val_accuracy: 0.0000e+00
Epoch 9/40
200/200 - 1s - loss: 0.5728 - accuracy: 0.7850 - val_loss: 1.5106 - val_accuracy: 0.0000e+00
Epoch 10/40
200/200 - 1s - loss: 0.5509 - accuracy: 0.7850 - val_loss: 1.6389 - val_accuracy: 0.0000e+00
Epoch 11/40
200/200 - 1s - loss: 0.5239 - accuracy: 0.7850 - val_loss: 1.5991 - val_accuracy: 0.0000e+00
Epoch 12/40
200/200 - 1s - loss: 0.4860 - accuracy: 0.7850 - val_loss: 1.4903 - val_accuracy: 0.0000e+00
Epoch 13/40
200/200 - 1s - loss: 0.4388 - accuracy: 0.7850 - val_loss: 1.3937 - val_accuracy: 0.0000e+00
Epoch 14/40
200/200 - 1s - loss: 0.3859 - accuracy: 0.7850 - val_loss: 1.2329 - val_accuracy: 0.0000e+00
Epoch 15/40
200/200 - 1s - loss: 0.3460 - accuracy: 0.7850 - val_loss: 1.1700 - val_accuracy: 0.0000e+00
Epoch 16/40
200/200 - 1s - loss: 0.3323 - accuracy: 0.7850 - val_loss: 1.0077 - val_accuracy: 0.0127
Epoch 17/40
200/200 - 1s - loss: 0.3007 - accuracy: 0.8150 - val_loss: 1.2465 - val_accuracy: 0.2278
Epoch 18/40
200/200 - 0s - loss: 0.2752 - accuracy: 0.9200 - val_loss: 0.8890 - val_accuracy: 0.6329
Epoch 19/40
200/200 - 1s - loss: 0.2613 - accuracy: 0.9700 - val_loss: 0.9181 - val_accuracy: 0.6582
Epoch 20/40
200/200 - 1s - loss: 0.2447 - accuracy: 0.9600 - val_loss: 0.8786 - val_accuracy: 0.7468
Epoch 21/40
200/200 - 1s - loss: 0.2171 - accuracy: 0.9700 - val_loss: 0.7162 - val_accuracy: 0.8481
Epoch 22/40
200/200 - 1s - loss: 0.1949 - accuracy: 0.9700 - val_loss: 0.8051 - val_accuracy: 0.7848
Epoch 23/40
200/200 - 1s - loss: 0.1654 - accuracy: 0.9700 - val_loss: 0.4710 - val_accuracy: 0.8861
Epoch 24/40
200/200 - 1s - loss: 0.1481 - accuracy: 0.9700 - val_loss: 0.4209 - val_accuracy: 0.8861
Epoch 25/40
200/200 - 1s - loss: 0.1192 - accuracy: 0.9700 - val_loss: 0.3792 - val_accuracy: 0.8861
Epoch 26/40
200/200 - 1s - loss: 0.1022 - accuracy: 0.9700 - val_loss: 0.7279 - val_accuracy: 0.8101
Epoch 27/40
200/200 - 1s - loss: 0.0995 - accuracy: 0.9700 - val_loss: 1.3112 - val_accuracy: 0.6582
Epoch 28/40
200/200 - 1s - loss: 0.1161 - accuracy: 0.9650 - val_loss: 0.1435 - val_accuracy: 0.9747
Epoch 29/40
200/200 - 1s - loss: 0.0889 - accuracy: 0.9700 - val_loss: 0.3896 - val_accuracy: 0.8608
Epoch 30/40
200/200 - 1s - loss: 0.0830 - accuracy: 0.9700 - val_loss: 0.3840 - val_accuracy: 0.8608
Epoch 31/40
200/200 - 1s - loss: 0.0688 - accuracy: 0.9700 - val_loss: 0.3100 - val_accuracy: 0.9241
Epoch 32/40
200/200 - 1s - loss: 0.0611 - accuracy: 0.9700 - val_loss: 0.3524 - val_accuracy: 0.8987
Epoch 33/40
200/200 - 1s - loss: 0.0518 - accuracy: 0.9750 - val_loss: 0.4621 - val_accuracy: 0.8481
Epoch 34/40
200/200 - 1s - loss: 0.0457 - accuracy: 0.9900 - val_loss: 0.4344 - val_accuracy: 0.8481
Epoch 35/40
200/200 - 1s - loss: 0.0423 - accuracy: 0.9900 - val_loss: 0.4417 - val_accuracy: 0.8608
Epoch 36/40
200/200 - 1s - loss: 0.0372 - accuracy: 0.9900 - val_loss: 0.4701 - val_accuracy: 0.8481
Epoch 37/40
200/200 - 1s - loss: 0.0319 - accuracy: 0.9950 - val_loss: 0.3913 - val_accuracy: 0.8608
Epoch 38/40
200/200 - 1s - loss: 0.0309 - accuracy: 0.9950 - val_loss: 0.5739 - val_accuracy: 0.7975
Epoch 39/40
200/200 - 1s - loss: 0.0290 - accuracy: 0.9950 - val_loss: 0.5416 - val_accuracy: 0.8228
Epoch 40/40
200/200 - 1s - loss: 0.0292 - accuracy: 1.0000 - val_loss: 0.3162 - val_accuracy: 0.8861
There can be multiple reasons for the validation accuracy to be zero; you can check the points below and make changes accordingly.
The number of samples is very small (train on 200 samples and validate on 79); you can try increasing the samples through some upsampling methods.
There is a possibility that the validation set contains data unseen during training, and the learned weights are not useful for such unseen data.
For example, if you have digits from 1-9 to classify and you keep 1-7 for training and 8-9 for validation, that data becomes an unseen scenario and val_acc will be 0.
To fix this, shuffle the data before sending it to model.fit (see the sketch after the code below).
This can be a silly mistake; check whether your validation data is properly mapped to its labels.
You can also pass the validation data explicitly, as below.
model.fit(X_train, y_train, batch_size=32, epochs=30,
          shuffle=True, verbose=1, callbacks=[remote, early_stopping],
          validation_data=(X_validation, y_validation))
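A minimal sketch of shuffling features and labels together before fitting (the array shapes and names below are placeholders for the real data):
import numpy as np

# Placeholder arrays standing in for the real features and integer labels.
X = np.random.rand(279, 100)
y = np.random.randint(0, 3, size=279)

# Shuffle features and labels with the same permutation so pairs stay aligned.
perm = np.random.permutation(len(X))
X, y = X[perm], y[perm]
Note that validation_split takes the last fraction of the data before any shuffling done by fit, which is why shuffling beforehand matters.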

Issue with Tensorflow classification - loss not decreasing

Following is the reference code:
Xtrain2 = df2.iloc[:, :-1].values
ytrain2 = df2['L'].values.reshape((200, 1))
print(Xtrain2.shape, ytrain2.shape)
#--------------------------
lrelu = lambda x: tf.keras.activations.relu(x, alpha=0.1)
model2 = tf.keras.models.Sequential([
    tf.keras.layers.Dense(1501, input_dim=1501, activation='relu'),
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dense(100, activation='relu'),
    #tf.keras.layers.Dense(1),
    #tf.keras.layers.Dense(1, activation=lrelu)
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model2.compile(loss='binary_crossentropy',
               optimizer='adam',
               metrics=['accuracy'])
#--------------------------
model2.fit(Xtrain2, ytrain2, epochs=50)  #, verbose=0)
Just a simple attempt at a classifier. The last layer is sigmoid since it's a binary classifier, and the loss is appropriate for the problem. The dimension of the input is 1500 and the number of samples is 200. I get the following output:
(200, 1501) (200, 1)
Train on 200 samples
Epoch 1/50
200/200 [==============================] - 0s 2ms/sample - loss: 0.4201 - accuracy: 0.0300
Epoch 2/50
200/200 [==============================] - 0s 359us/sample - loss: -1.1114 - accuracy: 0.0000e+00
Epoch 3/50
200/200 [==============================] - 0s 339us/sample - loss: -4.6102 - accuracy: 0.0000e+00
Epoch 4/50
200/200 [==============================] - 0s 344us/sample - loss: -13.7864 - accuracy: 0.0000e+00
Epoch 5/50
200/200 [==============================] - 0s 342us/sample - loss: -34.7789 - accuracy: 0.0000e+00
.
.
.
Epoch 40/50
200/200 [==============================] - 0s 348us/sample - loss: -905166.4000 - accuracy: 0.3750
Epoch 41/50
200/200 [==============================] - 0s 344us/sample - loss: -1010177.5300 - accuracy: 0.3400
Epoch 42/50
200/200 [==============================] - 0s 354us/sample - loss: -1129819.1825 - accuracy: 0.3450
Epoch 43/50
200/200 [==============================] - 0s 379us/sample - loss: -1263355.3200 - accuracy: 0.3900
Epoch 44/50
200/200 [==============================] - 0s 359us/sample - loss: -1408803.0400 - accuracy: 0.3750
Epoch 45/50
200/200 [==============================] - 0s 355us/sample - loss: -1566850.5900 - accuracy: 0.3300
Epoch 46/50
200/200 [==============================] - 0s 359us/sample - loss: -1728280.7550 - accuracy: 0.3550
Epoch 47/50
200/200 [==============================] - 0s 354us/sample - loss: -1909759.2400 - accuracy: 0.3400
Epoch 48/50
200/200 [==============================] - 0s 379us/sample - loss: -2108889.7200 - accuracy: 0.3750
Epoch 49/50
200/200 [==============================] - 0s 369us/sample - loss: -2305491.9800 - accuracy: 0.3700
Epoch 50/50
200/200 [==============================] - 0s 374us/sample - loss: -2524282.6300 - accuracy: 0.3050
I don't see where I'm going wrong in the above code. Any help would be appreciated!
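One sanity check that may be worth running (an illustrative sketch, not a confirmed diagnosis): binary_crossentropy with a sigmoid output assumes targets in [0, 1], and a loss that keeps going negative is usually a sign that the labels fall outside that range.
import numpy as np

# Inspect the label values actually being fed to binary_crossentropy.
print(np.unique(ytrain2))

# If the labels turn out to be e.g. {-1, 1}, remapping them to {0, 1} is one option:
# ytrain2 = (ytrain2 > 0).astype("float32")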

tf.distribute.MirroredStrategy.scope: setting a vocab_size that does not match does not report an error

I am using tf.distribute.MirroredStrategy() to train a TextCNN model, but when I set vocab_size=0 or another wrong value, no error is reported in this mode. When tf.distribute.MirroredStrategy() is not used, the wrong vocab_size immediately raises an error.
Using a wrong value for vocab_size:
model = TextCNN(padding_size, vocab_size - 10, embed_size, filter_num, num_classes)
model.compile(loss='sparse_categorical_crossentropy', optimizer=tf.keras.optimizers.Adam(), metrics=['accuracy'])
model.fit(train_dataset, epochs=epoch, validation_data=valid_dataset, callbacks=callbacks)
Error:
2 root error(s) found.
(0) Invalid argument: indices[63,10] = 4726 is not in [0, 4726)
[[node text_cnn_1/embedding/embedding_lookup (defined at <ipython-input-7-6ef8a4397184>:37) ]]
[[Adam/Adam/update/AssignSubVariableOp/_45]]
(1) Invalid argument: indices[63,10] = 4726 is not in [0, 4726)
[[node text_cnn_1/embedding/embedding_lookup (defined at <ipython-input-7-6ef8a4397184>:37) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_234431]
But with strategy.scope() there is no error and it appears to work well:
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    print(vocab_size)
    model = TextCNN(padding_size, vocab_size - 1000, embed_size, filter_num, num_classes)
    model.compile(loss='sparse_categorical_crossentropy', optimizer=tf.keras.optimizers.Adam(), metrics=['accuracy'])
    model.fit(train_dataset, epochs=epoch, validation_data=valid_dataset, callbacks=callbacks)
The log looks like this (looks very good):
Learning rate for epoch 1 is 0.0010000000474974513
2813/2813 [==============================] - 16s 6ms/step - loss: 0.8097 - accuracy: 0.7418 - val_loss: 0.4567 - val_accuracy: 0.8586 - lr: 0.0010
Epoch 2/15
2813/2813 [==============================] - ETA: 0s - loss: 0.4583 - accuracy: 0.8560
Learning rate for epoch 2 is 0.0010000000474974513
2813/2813 [==============================] - 14s 5ms/step - loss: 0.4583 - accuracy: 0.8560 - val_loss: 0.4051 - val_accuracy: 0.8756 - lr: 0.0010
Epoch 3/15
2810/2813 [============================>.] - ETA: 0s - loss: 0.3909 - accuracy: 0.8768
Learning rate for epoch 3 is 0.0010000000474974513
2813/2813 [==============================] - 14s 5ms/step - loss: 0.3909 - accuracy: 0.8767 - val_loss: 0.3853 - val_accuracy: 0.8844 - lr: 0.0010
Epoch 4/15
2811/2813 [============================>.] - ETA: 0s - loss: 0.2999 - accuracy: 0.9047
Learning rate for epoch 4 is 9.999999747378752e-05
2813/2813 [==============================] - 14s 5ms/step - loss: 0.2998 - accuracy: 0.9047 - val_loss: 0.3700 - val_accuracy: 0.8865 - lr: 1.0000e-04
Epoch 5/15
2807/2813 [============================>.] - ETA: 0s - loss: 0.2803 - accuracy: 0.9114
Learning rate for epoch 5 is 9.999999747378752e-05
2813/2813 [==============================] - 15s 5ms/step - loss: 0.2803 - accuracy: 0.9114 - val_loss: 0.3644 - val_accuracy: 0.8888 - lr: 1.0000e-04
Epoch 6/15
2803/2813 [============================>.] - ETA: 0s - loss: 0.2639 - accuracy: 0.9162
Learning rate for epoch 6 is 9.999999747378752e-05
2813/2813 [==============================] - 14s 5ms/step - loss: 0.2636 - accuracy: 0.9163 - val_loss: 0.3615 - val_accuracy: 0.8896 - lr: 1.0000e-04
Epoch 7/15
2805/2813 [============================>.] - ETA: 0s - loss: 0.2528 - accuracy: 0.9188
Learning rate for epoch 7 is 9.999999747378752e-05
2813/2813 [==============================] - 14s 5ms/step - loss: 0.2526 - accuracy: 0.9189 - val_loss: 0.3607 - val_accuracy: 0.8909 - lr: 1.0000e-04
More simply, like this, it runs with no error:
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = Sequential()
    model.add(Embedding(1000, 64, input_length=20))
    test_array = np.random.randint(10000, size=(32, 20))
    model.predict(test_array)
why???