I have written a custom loss function, adjusted_r2. I am trying to print the tensor values inside the function, but when the logs are printed I don't see anything. Could somebody help me with this?
def coeff_determination(y_true, y_pred):
    from keras import backend as K
    # n_features is defined elsewhere in my code
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    SS_res = K.print_tensor(SS_res, message='SS_res = ')
    SS_tot = K.print_tensor(SS_tot, message='SS_tot = ')
    r_squared = 1 - SS_res / (SS_tot + K.epsilon())
    r_squared = K.print_tensor(r_squared, message='r_squared = ')
    adj_r_squared = 1 - ((1 - r_squared) * K.cast(K.shape(y_true)[0] - 1, "float32")
                         / K.cast(K.shape(y_true)[0] - n_features - 1, "float32"))
    adj_r_squared = K.print_tensor(adj_r_squared, message='adj_r_squared = ')
    return -adj_r_squared
The logs are:
1/250 [..............................] - ETA: 51:44 - loss: -6.7060 - coeff_determination: -6.7060 - mean_squared_error: 40.5785
2/250 [..............................] - ETA: 42:56 - loss: -7.2036 - coeff_determination: -7.2036 - mean_squared_error: 48.8251
3/250 [..............................] - ETA: 41:30 - loss: -8.0279 - coeff_determination: -8.0279 - mean_squared_error: 48.1565
4/250 [..............................] - ETA: 40:48 - loss: -9.1016 - coeff_determination: -9.1016 - mean_squared_error: 51.9965
The K.print_tensor() function works when the tensors are evaluated (see the documentation here). The tensors are not initialized when the custom loss function is being called, which is why you cannot evaluate tensor values from within the loss function. The arguments to your custom loss function are tensors that work as placeholders, without actual data attached to them.
The same problem has also been discussed in this thread.
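One way to see the printed values is to evaluate the loss on concrete tensors outside of training. A minimal sketch, assuming a TF 2.x / tf.keras setup with eager execution and a placeholder value for n_features:

import tensorflow as tf

n_features = 2  # placeholder; use the real feature count

y_true = tf.constant([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y_pred = tf.constant([[0.9], [2.1], [2.8], [4.3], [5.2], [5.7]])

# In eager mode the K.print_tensor calls fire as soon as the function runs.
loss_value = coeff_determination(y_true, y_pred)
print(float(loss_value))

During training in TF 2.x you can get a similar effect by compiling the model with run_eagerly=True, so the print calls execute on every batch.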
I am trying to move a model from TensorFlow 1 to PyTorch.
The model is quite involved and I have been unable to get a portion of it to work. In particular, I have found that a function appears to return a result in PyTorch that is around 10% off the result of the equivalent function in TensorFlow or NumPy.
I believe that this 10% difference is an error that impacts my loss function and prevents the model from learning.
I have isolated the function here and show both the Torch and NumPy ‘equivalents’. Attached is a link to the Torch model and the comparison data needed. Below are two code segments. I believe the NumPy result is the better one, because it agrees with the TensorFlow v1 result to an accuracy of 10e-05, and in the model I'm dealing with this function trains successfully when the Torch equivalent does not.
My question is: why does the NumPy function return better results than the Torch function, and is there a way of arranging the Torch function so that its accuracy is closer to the NumPy function?
Regards,
Simon
The data needed to run this review is saved here:
https://drive.google.com/file/d/1lClIUWuHDGtibSXN2h5X-cyMaalU-cbX/view?usp=sharing
The full torch model is saved in a pickle for use with torch.load:
https://drive.google.com/file/d/1bFJYC5bHme7YmIbqTOjaxXvd-yrKczxH/view?usp=sharing
The data load and two functions:
import pickle
from typing import Any, Dict

import numpy as np
import torch

with open('recovered_autoencoder_network.pkl', 'rb') as f:
    recovered_autoencoder_network = pickle.load(f)

# parameters needed for this issue
params: Dict[str, Any] = {'weight_precision': torch.float64,
                          'sindy_precision': torch.float64,
                          'target_device': 'cuda'}

sindy_autoencoder = torch.load('saved_model.pkl')
sindy_autoencoder.to(params['target_device'])
# this is a version of the 'problem' function in torch.
def calculate_first_and_second_derivative_with_torch(input_and_derivatives, stack):
    x, dx, ddx = input_and_derivatives
    layer_count = len(stack)
    for i in range(layer_count - 1):
        x = torch.mm(x, stack[i].weights) + stack[i].bias
        x = torch.sigmoid(x)
        dx_prev = torch.mm(dx, stack[i].weights)
        sigmoid_first_derivative = torch.mul(x, 1 - x)
        sigmoid_second_derivative = torch.mul(sigmoid_first_derivative, 1 - 2 * x)
        dx = torch.mul(sigmoid_first_derivative, dx_prev)
        ddx = torch.mul(sigmoid_second_derivative, torch.square(dx_prev)) \
            + torch.mul(sigmoid_first_derivative, torch.mm(ddx, stack[i].weights))
    dx = torch.mm(dx, stack[layer_count - 1].weights)
    ddx = torch.mm(ddx, stack[layer_count - 1].weights)
    return dx, ddx
# this is the equivalent 'problem' function in numpy.
def calculate_first_and_second_derivative_with_np(input, dx, ddx, weights, biases):
    dz = dx
    ddz = ddx

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    for i in range(len(weights) - 1):
        input = np.matmul(input, weights[i]) + biases[i]
        input = sigmoid(input)
        dz_prev = np.matmul(dz, weights[i])
        sigmoid_derivative = np.multiply(input, 1 - input)
        sigmoid_derivative2 = np.multiply(sigmoid_derivative, 1 - 2 * input)
        dz = np.multiply(sigmoid_derivative, dz_prev)
        ddz = np.multiply(sigmoid_derivative2, np.square(dz_prev)) \
            + np.multiply(sigmoid_derivative, np.matmul(ddz, weights[i]))
    dz = np.matmul(dz, weights[-1])
    ddz = np.matmul(ddz, weights[-1])
    return dz, ddz
dx_decode_np_test, ddx_decode_np_test = \
    calculate_first_and_second_derivative_with_np(
        recovered_autoencoder_network['v2_in_z'],
        recovered_autoencoder_network['v2_in_dz'],
        recovered_autoencoder_network['v2_in_sindy_predict'],
        recovered_autoencoder_network['v2_in_decoder_weights'],
        recovered_autoencoder_network['v2_in_decoder_biases'])

# Here I access the tensors recovered from the saved Tensorflow model and convert them to torch.
converted_stack = [torch.tensor(recovered_autoencoder_network['v2_in_z'],
                                device=torch.device(params['target_device']),
                                dtype=params['sindy_precision']),
                   torch.tensor(recovered_autoencoder_network['v2_in_dz'],
                                device=torch.device(params['target_device']),
                                dtype=params['sindy_precision']),
                   torch.tensor(recovered_autoencoder_network['v2_in_sindy_predict'],
                                device=torch.device(params['target_device']),
                                dtype=params['sindy_precision'])]

# Here I use the tensors captured from the tensorflow model (converted to torch)
# with the torch version of the function and the layers from the model.
dx_decode_torch_test, ddx_decode_torch_test = \
    calculate_first_and_second_derivative_with_torch(converted_stack,
                                                     sindy_autoencoder.ψ_decoder_to_x)

# Here I show the error between the two functions.
print(dx_decode_np_test - dx_decode_torch_test, ddx_decode_np_test - ddx_decode_torch_test)
# Here I show that the Torch weights and biases in the model feeding the Torch
# function are equivalent to the Numpy arrays feeding the Numpy function
# (the weights were initialized from those arrays after conversion to torch.Tensor).
print("\n\nWeight and bias comparison for the two models (imported from np source)\n")
for i in range(4):
    layer = sindy_autoencoder.ψ_decoder_to_x[i]
    np_w = recovered_autoencoder_network['v2_in_decoder_weights'][i]
    np_b = recovered_autoencoder_network['v2_in_decoder_biases'][i]
    w_diff = np.sum(layer.weights.cpu().detach().numpy() - np_w)
    b_diff = np.sum(layer.bias.cpu().detach().numpy() - np_b)
    print("layer {}: weights {:.5f} ({:.2%}), bias {:.5f} ({:.2%})"
          .format(i + 1, w_diff, w_diff / np.sum(np_w), b_diff, b_diff / np.sum(np_b)))
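For what it's worth, the two functions can also be compared in isolation on synthetic float64 data, independent of the saved model. This is only a diagnostic sketch: the layer widths are made up, SimpleNamespace objects stand in for the model's layer objects, and the two functions are the ones defined above.

import numpy as np
import torch
from types import SimpleNamespace

rng = np.random.default_rng(0)
sizes = [3, 8, 8, 2]  # hypothetical layer widths
np_weights = [rng.normal(size=(a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
np_biases = [rng.normal(size=(b,)) for b in sizes[1:-1]]

x = rng.normal(size=(5, sizes[0]))
dx = rng.normal(size=(5, sizes[0]))
ddx = rng.normal(size=(5, sizes[0]))

dz_np, ddz_np = calculate_first_and_second_derivative_with_np(
    x, dx, ddx, np_weights, np_biases)

# The last layer's bias is never used, so a zero placeholder is fine there.
stack = [SimpleNamespace(weights=torch.tensor(w), bias=torch.tensor(b))
         for w, b in zip(np_weights, np_biases + [np.zeros(sizes[-1])])]
dz_t, ddz_t = calculate_first_and_second_derivative_with_torch(
    [torch.tensor(x), torch.tensor(dx), torch.tensor(ddx)], stack)

# Maximum element-wise differences between the two implementations.
print(np.abs(dz_np - dz_t.numpy()).max(), np.abs(ddz_np - ddz_t.numpy()).max())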
I am using Keras to train a regression NN.
There is a problem: I would like my output to be non-negative. My training data is non-negative, in the range (0, 1).
When I try to use sigmoid or relu in the last (output) layer, the model does not learn anymore.
opt = keras.optimizers.Adam(learning_rate=lr1)
modelx = tf.keras.Sequential()
modelx.add(keras.Input(shape=(3,)))
modelx.add(layers.Dense(10, activation='relu'))
modelx.add(layers.Dense(10, activation='relu'))
modelx.add(layers.Dense(1, activation='relu'))
modelx.compile(optimizer=opt, loss="mse", metrics=['mse', 'mae'])
history = modelx.fit(X, Y, epochs=epp[i], batch_size=bs1, shuffle=True)
I keep getting a constant loss:
Epoch 1/15
8/8 [==============================] - 0s 832us/step - loss: 0.0617 - mse: 0.0617 - mae: 0.1283
Epoch 2/15
8/8 [==============================] - 0s 753us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 3/15
8/8 [==============================] - 0s 657us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 4/15
8/8 [==============================] - 0s 709us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 5/15
8/8 [==============================] - 0s 668us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 6/15
8/8 [==============================] - 0s 552us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 7/15
8/8 [==============================] - 0s 702us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 8/15
8/8 [==============================] - 0s 595us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 9/15
8/8 [==============================] - 0s 633us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 10/15
8/8 [==============================] - 0s 855us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 11/15
8/8 [==============================] - 0s 681us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 12/15
8/8 [==============================] - 0s 620us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 13/15
8/8 [==============================] - 0s 572us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 14/15
8/8 [==============================] - 0s 618us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 15/15
8/8 [==============================] - 0s 572us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
If I use an activation that can be negative (ELU, leaky ReLU), it converges again, but I don't like the results (a lot of negative predictions).
This is the data used in training:
Y is
[1.33931561e-04 6.18145666e-05 3.52451476e-05 2.14290498e-05
1.29611188e-05 7.23954384e-06 3.11468746e-06 0.00000000e+00
2.34380232e-03 1.25174497e-03 8.49408058e-04 6.40192861e-04
5.11964193e-04 4.25323200e-04 3.62861090e-04 3.15695822e-04
4.55367307e-03 2.44167538e-03 1.66357097e-03 1.25895667e-03
1.01096727e-03 8.43406857e-04 7.22607492e-04 6.31391645e-04
6.76354383e-03 3.63160579e-03 2.47773388e-03 1.87772048e-03
1.50997034e-03 1.26149051e-03 1.08235389e-03 9.47087467e-04
8.97341459e-03 4.82153620e-03 3.29189679e-03 2.49648430e-03
2.00897341e-03 1.67957417e-03 1.44210030e-03 1.26278329e-03
1.11832853e-02 6.01146660e-03 4.10605970e-03 3.11524811e-03
2.50797649e-03 2.09765783e-03 1.80184670e-03 1.57847911e-03
1.33931561e-02 7.20139701e-03 4.92022261e-03 3.73401192e-03
3.00697956e-03 2.51574148e-03 2.16159310e-03 1.89417493e-03
1.56030269e-02 8.39132742e-03 5.73438552e-03 4.35277573e-03
3.50598264e-03 2.93382514e-03 2.52133950e-03 2.20987076e-03
6.02692024e-04 3.14224047e-04 2.07946371e-04 1.52681980e-04
1.18810256e-04 9.59239558e-05 7.94245303e-05 6.69657805e-05
9.44217505e-03 5.07394568e-03 3.46459801e-03 2.62773723e-03
2.11482255e-03 1.76825858e-03 1.51841014e-03 1.32974907e-03
1.82816581e-02 9.83366730e-03 6.72124965e-03 5.10279247e-03
4.11083485e-03 3.44059321e-03 2.95739575e-03 2.59253236e-03
2.71211411e-02 1.45933889e-02 9.97790129e-03 7.57784772e-03
6.10684714e-03 5.11292783e-03 4.39638136e-03 3.85531565e-03
3.59606241e-02 1.93531106e-02 1.32345529e-02 1.00529030e-02
8.10285944e-03 6.78526246e-03 5.83536696e-03 5.11809894e-03
4.48001071e-02 2.41128322e-02 1.64912046e-02 1.25279582e-02
1.00988717e-02 8.45759709e-03 7.27435257e-03 6.38088223e-03
5.36395902e-02 2.88725538e-02 1.97478562e-02 1.50030135e-02
1.20948840e-02 1.01299317e-02 8.71333818e-03 7.64366552e-03
6.24790732e-02 3.36322754e-02 2.30045079e-02 1.74780687e-02
1.40908963e-02 1.18022663e-02 1.01523238e-02 8.90644880e-03
1.38395946e-03 7.34906514e-04 4.95781743e-04 3.71436862e-04
2.95225484e-04 2.43731309e-04 2.06607602e-04 1.78575415e-04
2.12727963e-02 1.14442802e-02 7.82324793e-03 5.94031117e-03
4.78625315e-03 4.00648422e-03 3.44432522e-03 3.01983782e-03
4.11616331e-02 2.21536538e-02 1.51507141e-02 1.15091855e-02
9.27728081e-03 7.76923713e-03 6.68204284e-03 5.86110022e-03
6.10504699e-02 3.28630275e-02 2.24781803e-02 1.70780598e-02
1.37683085e-02 1.15319900e-02 9.91976046e-03 8.70236262e-03
8.09393067e-02 4.35724012e-02 2.98056465e-02 2.26469341e-02
1.82593361e-02 1.52947429e-02 1.31574781e-02 1.15436250e-02
1.00828143e-01 5.42817748e-02 3.71331127e-02 2.82158084e-02
2.27503638e-02 1.90574959e-02 1.63951957e-02 1.43848874e-02
1.20716980e-01 6.49911485e-02 4.44605789e-02 3.37846827e-02
2.72413915e-02 2.28202488e-02 1.96329133e-02 1.72261498e-02
1.40605817e-01 7.57005222e-02 5.17880451e-02 3.93535570e-02
3.17324191e-02 2.65830017e-02 2.28706309e-02 2.00674122e-02
2.47773388e-03 1.32386197e-03 8.98751264e-04 6.77693699e-04
5.42206803e-04 4.50661604e-04 3.84663902e-04 3.34828902e-04
3.78356660e-02 2.03627485e-02 1.39253578e-02 1.05779147e-02
8.52625599e-03 7.14000011e-03 6.14060634e-03 5.38596206e-03
7.31935981e-02 3.94016350e-02 2.69519644e-02 2.04781357e-02
1.65103052e-02 1.38293386e-02 1.18965488e-02 1.04370952e-02
1.08551530e-01 5.84405215e-02 3.99785710e-02 3.03783567e-02
2.44943544e-02 2.05186771e-02 1.76524912e-02 1.54882284e-02
1.43909462e-01 7.74794080e-02 5.30051775e-02 4.02785776e-02
3.24784035e-02 2.72080156e-02 2.34084336e-02 2.05393615e-02
1.79267394e-01 9.65182945e-02 6.60317841e-02 5.01787986e-02
4.04624527e-02 3.38973541e-02 2.91643761e-02 2.55904947e-02
2.14625326e-01 1.15557181e-01 7.90583906e-02 6.00790196e-02
4.84465019e-02 4.05866926e-02 3.49203185e-02 3.06416278e-02
2.49983259e-01 1.34596068e-01 9.20849972e-02 6.99792406e-02
5.64305511e-02 4.72760311e-02 4.06762609e-02 3.56927610e-02
3.88401527e-03 2.08109041e-03 1.41685493e-03 1.07145249e-03
8.59754214e-04 7.16714840e-04 6.13593431e-04 5.35726244e-04
5.91307842e-02 3.18293506e-02 2.17709277e-02 1.65405478e-02
1.33348311e-02 1.11688063e-02 9.60725348e-03 8.42812180e-03
1.14377553e-01 6.15776108e-02 4.21250004e-02 3.20096431e-02
2.58099079e-02 2.16208977e-02 1.86009135e-02 1.63205174e-02
1.69624322e-01 9.13258709e-02 6.24790732e-02 4.74787384e-02
3.82849848e-02 3.20729891e-02 2.75945736e-02 2.42129129e-02
2.24871091e-01 1.21074131e-01 8.28331459e-02 6.29478337e-02
5.07600616e-02 4.25250805e-02 3.65882336e-02 3.21053085e-02
2.80117860e-01 1.50822391e-01 1.03187219e-01 7.84169289e-02
6.32351385e-02 5.29771719e-02 4.55818937e-02 3.99977040e-02
3.35364629e-01 1.80570651e-01 1.23541291e-01 9.38860242e-02
7.57102153e-02 6.34292633e-02 5.45755538e-02 4.78900996e-02
3.90611398e-01 2.10318912e-01 1.43895364e-01 1.09355120e-01
8.81852922e-02 7.38813547e-02 6.35692138e-02 5.57824951e-02
5.60280363e-03 3.00659184e-03 2.05009275e-03 1.55271323e-03
1.24786772e-03 1.04189102e-03 8.93396188e-04 7.81267439e-04
8.51581509e-02 4.58440865e-02 3.13599575e-02 2.38282105e-02
1.92119784e-02 1.60929027e-02 1.38442667e-02 1.21463170e-02
1.64713498e-01 8.86815811e-02 6.06698223e-02 4.61037077e-02
3.71760890e-02 3.11439143e-02 2.67951371e-02 2.35113666e-02
2.44268845e-01 1.31519076e-01 8.99796870e-02 6.83792049e-02
5.51401997e-02 4.61949259e-02 3.97460076e-02 3.48764162e-02
3.23824193e-01 1.74356570e-01 1.19289552e-01 9.06547021e-02
7.31043104e-02 6.12459376e-02 5.26968781e-02 4.62414658e-02
4.03379540e-01 2.17194065e-01 1.48599417e-01 1.12930199e-01
9.10684210e-02 7.62969492e-02 6.56477486e-02 5.76065155e-02
4.82934887e-01 2.60031560e-01 1.77909281e-01 1.35205697e-01
1.09032532e-01 9.13479608e-02 7.85986191e-02 6.89715651e-02
5.62490234e-01 3.02869054e-01 2.07219146e-01 1.57481194e-01
1.26996642e-01 1.06398972e-01 9.15494895e-02 8.03366147e-02
7.63409898e-03 4.10036625e-03 2.79846472e-03 2.12147593e-03
1.70654731e-03 1.42619014e-03 1.22407217e-03 1.07145249e-03
1.15917766e-01 6.24069562e-02 4.26924473e-02 3.24409027e-02
2.61576979e-02 2.19122893e-02 1.88516459e-02 1.65405478e-02
2.24201433e-01 1.20713546e-01 8.25864299e-02 6.27603295e-02
5.06088486e-02 4.23983885e-02 3.64792196e-02 3.20096431e-02
3.32485100e-01 1.79020136e-01 1.22480413e-01 9.30797562e-02
7.50599992e-02 6.28844876e-02 5.41067933e-02 4.74787384e-02
4.40768767e-01 2.37326726e-01 1.62374395e-01 1.23399183e-01
9.95111498e-02 8.33705868e-02 7.17343670e-02 6.29478337e-02
5.49052434e-01 2.95633316e-01 2.02268378e-01 1.53718610e-01
1.23962300e-01 1.03856686e-01 8.93619407e-02 7.84169289e-02
6.57336101e-01 3.53939906e-01 2.42162360e-01 1.84038037e-01
1.48413451e-01 1.24342785e-01 1.06989514e-01 9.38860242e-02
7.65619768e-01 4.12246496e-01 2.82056343e-01 2.14357463e-01
1.72864602e-01 1.44828884e-01 1.24617088e-01 1.09355120e-01
9.97790129e-03 5.36241365e-03 3.66197084e-03 2.77774057e-03
2.23579299e-03 1.86961220e-03 1.60562139e-03 1.40628139e-03
1.51409630e-01 8.15179597e-02 5.57683971e-02 4.23786245e-02
3.41719897e-02 2.86269662e-02 2.46293911e-02 2.16108140e-02
2.92841358e-01 1.57673506e-01 1.07874823e-01 8.19795085e-02
6.61081865e-02 5.53843202e-02 4.76531609e-02 4.18153466e-02
4.34273086e-01 2.33829052e-01 1.59981250e-01 1.21580392e-01
9.80443832e-02 8.21416743e-02 7.06769306e-02 6.20198793e-02
5.75704815e-01 3.09984598e-01 2.12087676e-01 1.61181276e-01
1.29980580e-01 1.08899028e-01 9.37007003e-02 8.22244119e-02
7.17136543e-01 3.86140144e-01 2.64194102e-01 2.00782160e-01
1.61916777e-01 1.35656382e-01 1.16724470e-01 1.02428945e-01
8.58568272e-01 4.62295690e-01 3.16300528e-01 2.40383044e-01
1.93852973e-01 1.62413736e-01 1.39748240e-01 1.22633477e-01
1.00000000e+00 5.38451236e-01 3.68406955e-01 2.79983928e-01
2.25789170e-01 1.89171090e-01 1.62772010e-01 1.42838010e-01]
X is
[[0. 0. 0. ]
[0. 0. 0.14285714]
[0. 0. 0.28571429]
...
[1. 1. 0.71428571]
[1. 1. 0.85714286]
[1. 1. 1. ]]
As you can see, it is non-negative.
Please tell me how I could change the model to give only positive predictions.
It seems this is a regression problem, and in your last layer you are still using the relu activation function. For regression models, you shouldn't use an activation in the last layer. I would recommend removing relu from the last layer (leaving it linear) and retraining the model, as sketched below.
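A minimal sketch of that change, reusing the rest of the model from the question (lr1, X, and Y are assumed to be defined as in the question; the epoch and batch settings are placeholders):

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

opt = keras.optimizers.Adam(learning_rate=lr1)  # lr1 as in the question

modelx = tf.keras.Sequential([
    keras.Input(shape=(3,)),
    layers.Dense(10, activation='relu'),
    layers.Dense(10, activation='relu'),
    layers.Dense(1),  # linear output: no activation in the last layer
])
modelx.compile(optimizer=opt, loss="mse", metrics=['mse', 'mae'])
history = modelx.fit(X, Y, epochs=15, batch_size=64, shuffle=True)  # placeholder settings

If strictly non-negative predictions are needed downstream, one simple option is to clip the model's outputs at zero after prediction rather than constraining the training itself.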
I have a fitted CRNN model with a CTC loss output.
I have the prediction and I use keras.backend.ctc_decode to decode it. As written in the documentation (https://code.i-harness.com/en/docs/tensorflow~python/tf/keras/backend/ctc_decode), the function returns a tuple with the decoded result and a tensor with the log probability of the prediction.
keras.backend.ctc_decode can accept multiple values for its prediction, but I need to pass it one at a time.
This is the code:
def decode_single_prediction(pred, num_to_char):
    input_len = np.ones(pred.shape[0]) * pred.shape[1]
    # Use greedy search. For complex tasks, you can use beam search
    decoded = keras.backend.ctc_decode(pred, input_length=input_len, greedy=True)
    # decoded[0] is supposed to be the decoded result
    # decoded[1] is supposed to be its log probability
    accuracy = float(decoded[1][0][0])
    # take the resulting encoded chars until -1 is reached
    result = decoded[0][0][:, :np.argmax(decoded[0][0] == -1)]
    output_text = tf.strings.reduce_join(num_to_char(result)).numpy().decode("utf-8")
    return (output_text, accuracy)
for image in images:
    pred = prediction_model.predict(image)
    # num_to_char is the mapping from number to char
    pred_texts, acc = decode_single_prediction(pred, num_to_char)
    print("True value: " + <true_result> + " prediction: " + pred_texts + " acc: " + str(acc))
Output:
True value: test0, prediction: test0, acc: 1.841524362564087
True value: test1, prediction: test1, acc: 0.9661365151405334
True value: test2, prediction: test2, acc: 1.0634151697158813
True value: test3, prediction: test3, acc: 2.471940755844116
True value: test4, prediction: test4, acc: 1.4866207838058472
True value: test5, prediction: test5, acc: 0.7630811333656311
True value: test6, prediction: test6, acc: 0.35642576217651367
True value: test7, prediction: test7, acc: 1.5693446397781372
True value: test8, prediction: test8, acc: 0.9700028896331787
True value: test9, prediction: test9, acc: 1.4783780574798584
The prediction is always correct. However, what I believe to be the probability is not what I expect: the values look like completely random numbers, even greater than 1 or 2! What am I doing wrong?
Thank you in advance!
I am trying to create a sample feed-forward neural network in tensorflow.js, initially using a small data set (just for a POC). There are 5 input nodes and one output node. The data is related to housing, where there are multiple inputs and we are predicting price.
x_train:
[ [ 79545.45857, 5.682861322, 7.009188143, 4.09, 23086.8005 ],
[ 79248.64245, 6.002899808, 6.730821019, 3.09, 40173.07217 ],
[ 61287.06718, 5.86588984, 8.51272743, 5.13, 36882.1594 ],
[ 63345.24005, 7.188236095, 5.586728665, 3.26, 34310.24283 ],
[ 59982.19723, 5.040554523, 7.839387785, 4.23, 26354.10947 ],
...
]
y_train:
[ [ 1059033.558 ],
[ 1505890.915 ],
[ 1058987.988 ],
[ 1260616.807 ],
[ 630943.4893 ],
...
]
const model = tf.sequential();

const config_hidden = {
  inputShape: [5],
  activation: 'sigmoid',
  units: 6
};
const config_output = {
  units: 1,
  activation: 'sigmoid'
};

const hidden = tf.layers.dense(config_hidden);
const output = tf.layers.dense(config_output);
model.add(hidden);
model.add(output);

const optimizer = tf.train.sgd(0.5);
const config = {
  optimizer: optimizer,
  loss: 'meanSquaredError',
  metrics: ['accuracy']
};
model.compile(config);

train_data().then(function () {
  console.log('Training is Complete');
});

async function train_data() {
  const options = {
    shuffle: true,
    epochs: 10,
    batchSize: 100,
    validationSplit: 0.1
  };
  for (let i = 0; i < 10; i++) {
    const res = await model.fit(xs, ys, options);
    console.log(res.history.loss[0]);
  }
}
The model compiles fine, but the loss while training is huge:
Model Successfully Compiled
Epoch 1 / 10
eta=0.0 ====================================================================>
1058ms 235us/step - acc=0.00 loss=1648912629760.00 val_acc=0.00 val_loss=1586459705344.00
Epoch 2 / 10
eta=0.0 ====================================================================>
700ms 156us/step - acc=0.00 loss=1648913285120.00 val_acc=0.00 val_loss=1586459705344.00
Epoch 3 / 10
eta=0.0 ====================================================================>
615ms 137us/step - acc=0.00 loss=1648913022976.00 val_acc=0.00 val_loss=1586459705344.00
Epoch 4 / 10
eta=0.0 ====================================================================>
852ms 189us/step - acc=0.00 loss=1648913285120.00 val_acc=0.00 val_loss=1586459705344.00
I figured it could be because the training data is not normalized, so I took the mean of the data and divided the data by it:
xs = xs.div(xs.mean(0));
x_train
[[1.1598413, 0.9507535, 1.003062 , 1.0272969, 0.6384002],
[1.1555134, 1.0042965, 0.9632258, 0.7761241, 1.1108726],
[0.8936182, 0.9813745, 1.2182286, 1.2885166, 1.0198718],
...,
There is not much change to the loss:
Model Successfully Compiled
Epoch 1 / 10
eta=0.0 ====================================================================>
841ms 187us/step - acc=0.00 loss=1648912760832.00 val_acc=0.00 val_loss=1586459705344.00
Epoch 2 / 10
eta=0.0 ====================================================================>
613ms 136us/step - acc=0.00 loss=1648913154048.00 val_acc=0.00 val_loss=1586459705344.00
Epoch 3 / 10
eta=0.0 ====================================================================>
646ms 144us/step - acc=0.00 loss=1648913022976.00 val_acc=0.00 val_loss=1586459705344.00
I then normalized the output too:
ys = ys.div(1000000);
Model Successfully Compiled
Epoch 1 / 10
eta=0.0 ====================================================================>
899ms 200us/step - acc=0.00 loss=0.202 val_acc=0.00 val_loss=0.161
Epoch 2 / 10
eta=0.0 ====================================================================>
667ms 148us/step - acc=0.00 loss=0.183 val_acc=0.00 val_loss=0.160
Epoch 3 / 10
eta=0.0 ====================================================================>
609ms 135us/step - acc=0.00 loss=0.182 val_acc=0.00 val_loss=0.159
This brought the loss down to decimal figures. However, even running 10,000 iterations on the training data does not decrease the loss substantially, e.g.
Epoch 8 / 10
eta=0.0 ====================================================================>
502ms 112us/step - acc=0.00 loss=0.181 val_acc=0.00 val_loss=0.158
Epoch 9 / 10
eta=0.0 ====================================================================>
551ms 122us/step - acc=0.00 loss=0.181 val_acc=0.00 val_loss=0.158
Epoch 10 / 10
eta=0.0 ====================================================================>
470ms 104us/step - acc=0.00 loss=0.181 val_acc=0.00 val_loss=0.158
0.18076679110527039
Finally the loss starts at around 0.202 and goes down to around 0.180. This results in incorrect predictions.
This is a very common scenario: multiple inputs with values in different ranges (e.g. the housing data used above) passed to a feed-forward neural network, with only one expected output (price in this case).
Questions:
1. What am I doing wrong in the code above?
2. Am I normalizing the data in the correct manner?
3. Am I using the correct loss function/optimizer/learning rate/activation, etc.?
4. How do I know whether the model is performing well?
5. Is there any other way to do this in tensorflow.js?
I'm going to assume you're not attempting linear regression, because of the sigmoidal activations. If you are trying linear regression, remove the sigmoidal activations everywhere. I'll try to address all the errors I can see:
Remove the sigmoid activation from the output. The sigmoid function squashes inputs to between 0 and 1, so it's not meant for regression. Your last layer does not need an activation.
Your learning rate is way too high, so I doubt the learning algorithm would be able to converge. Start off with values around 0.001 to 0.01 and adjust if required.
No, you're not normalizing correctly. Generally, data is normalized to a mean of zero and a standard deviation of one. This is done for each feature column, using the mean and standard deviation of that column only, not of all the data. The formula for example i in feature column x is: (x_i - x.mean()) / x.std(). (I don't know JavaScript; see the NumPy sketch below.)
The performance metric you provided, "accuracy", is meant for classification, not regression, and would be meaningless (if it is even provided). Minimizing your mean squared error or mean absolute error is the best way to quantify model performance.
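Here is that per-column formula written out in NumPy, since the answer only gives the formula. This is a sketch: x_train and y_train are the arrays listed in the question, and the same statistics must be reused to un-scale any predictions.

import numpy as np

x = np.array(x_train, dtype=np.float64)  # features from the question
y = np.array(y_train, dtype=np.float64)  # prices from the question

# Standardize each feature column with its own mean and standard deviation.
x_mean, x_std = x.mean(axis=0), x.std(axis=0)
x_norm = (x - x_mean) / x_std

# Scaling the target the same way keeps the loss in a sensible range;
# predictions are mapped back with y_pred * y_std + y_mean.
y_mean, y_std = y.mean(axis=0), y.std(axis=0)
y_norm = (y - y_mean) / y_std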
I am currently trying to create my own loss function for Keras (using the TensorFlow backend). It is a simple categorical crossentropy, but I am applying a factor to the 1st column to penalize loss from the 1st class more.
Yet I am new to Keras and I can't figure out how to translate my function (below), as I have to use symbolic expressions and it seems I can't go element-wise:
import numpy as np
import tensorflow as tf

# _EPSILON and prop_database are defined elsewhere in my code
def custom_categorical_crossentropy(y_true, y_pred):
    y_pred = np.clip(y_pred, _EPSILON, 1.0 - _EPSILON)
    out = np.zeros(y_true.shape).astype('float32')
    for i in range(0, y_true.shape[0]):
        for j in range(0, y_true.shape[1]):
            # penalize more all elements on class 1 so that the loss takes its
            # low proportion in the dataset into account
            if j == 0:
                out[i][j] = -(prop_database * (y_true[i][j] * np.log(y_pred[i][j])
                              + (1.0 - y_true[i][j]) * np.log(1.0 - y_pred[i][j])))
            else:
                out[i][j] = -(y_true[i][j] * np.log(y_pred[i][j])
                              + (1.0 - y_true[i][j]) * np.log(1.0 - y_pred[i][j]))
    out = np.mean(out.astype('float32'), axis=-1)
    return tf.convert_to_tensor(out, dtype=tf.float32, name='custom_loss')
Can someone help me?
Many thanks!
You can use class_weight in the fit method to penalize classes without creating a custom loss function:
weights = {
    0: 2,
    1: 1,
    2: 1,
    3: 1,
    ...
}

model.compile(optimizer=chooseOne, loss='categorical_crossentropy')
model.fit(......., class_weight=weights)
This will make the first class be twice as important as the others.
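If a custom loss is still preferred, the asker's element-wise formula can also be written with tensor operations instead of Python loops. This is only a rough sketch of that idea, not the class_weight approach above; prop_database and the number of classes are assumed to be supplied by the caller.

import tensorflow as tf
from tensorflow.keras import backend as K

def make_weighted_crossentropy(prop_database, num_classes):
    # First column weighted by prop_database, the remaining columns by 1.
    class_weights = tf.constant([prop_database] + [1.0] * (num_classes - 1),
                                dtype=tf.float32)

    def loss(y_true, y_pred):
        y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
        # Element-wise binary crossentropy, as in the NumPy version in the question.
        bce = -(y_true * K.log(y_pred) + (1.0 - y_true) * K.log(1.0 - y_pred))
        return K.mean(bce * class_weights, axis=-1)

    return loss

# e.g. model.compile(optimizer=chooseOne, loss=make_weighted_crossentropy(prop_database, 4))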