I am using Keras to train a regression NN.
The problem: I would like the output to be non-negative. My training data is non-negative, in the range [0, 1].
When I try to use sigmoid or ReLU in the last (output) layer, the model stops learning.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

opt = keras.optimizers.Adam(learning_rate=lr1)
modelx = tf.keras.Sequential()
modelx.add(keras.Input(shape=(3,)))
modelx.add(layers.Dense(10, activation='relu'))
modelx.add(layers.Dense(10, activation='relu'))
modelx.add(layers.Dense(1, activation='relu'))
modelx.compile(optimizer=opt, loss="mse", metrics=['mse', 'mae'])
history = modelx.fit(X, Y, epochs=epp[i], batch_size=bs1, shuffle=True)
I keep getting a constant loss:
Epoch 1/15
8/8 [==============================] - 0s 832us/step - loss: 0.0617 - mse: 0.0617 - mae: 0.1283
Epoch 2/15
8/8 [==============================] - 0s 753us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 3/15
8/8 [==============================] - 0s 657us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 4/15
8/8 [==============================] - 0s 709us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 5/15
8/8 [==============================] - 0s 668us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 6/15
8/8 [==============================] - 0s 552us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 7/15
8/8 [==============================] - 0s 702us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 8/15
8/8 [==============================] - 0s 595us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 9/15
8/8 [==============================] - 0s 633us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 10/15
8/8 [==============================] - 0s 855us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 11/15
8/8 [==============================] - 0s 681us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 12/15
8/8 [==============================] - 0s 620us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 13/15
8/8 [==============================] - 0s 572us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 14/15
8/8 [==============================] - 0s 618us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
Epoch 15/15
8/8 [==============================] - 0s 572us/step - loss: 0.0197 - mse: 0.0197 - mae: 0.0731
If I use an activation that can go negative (ELU, leaky ReLU), it converges again, but I don't like the results (a lot of negative predictions).
This is the data used in training:
Y is
[1.33931561e-04 6.18145666e-05 3.52451476e-05 2.14290498e-05
1.29611188e-05 7.23954384e-06 3.11468746e-06 0.00000000e+00
2.34380232e-03 1.25174497e-03 8.49408058e-04 6.40192861e-04
5.11964193e-04 4.25323200e-04 3.62861090e-04 3.15695822e-04
4.55367307e-03 2.44167538e-03 1.66357097e-03 1.25895667e-03
1.01096727e-03 8.43406857e-04 7.22607492e-04 6.31391645e-04
6.76354383e-03 3.63160579e-03 2.47773388e-03 1.87772048e-03
1.50997034e-03 1.26149051e-03 1.08235389e-03 9.47087467e-04
8.97341459e-03 4.82153620e-03 3.29189679e-03 2.49648430e-03
2.00897341e-03 1.67957417e-03 1.44210030e-03 1.26278329e-03
1.11832853e-02 6.01146660e-03 4.10605970e-03 3.11524811e-03
2.50797649e-03 2.09765783e-03 1.80184670e-03 1.57847911e-03
1.33931561e-02 7.20139701e-03 4.92022261e-03 3.73401192e-03
3.00697956e-03 2.51574148e-03 2.16159310e-03 1.89417493e-03
1.56030269e-02 8.39132742e-03 5.73438552e-03 4.35277573e-03
3.50598264e-03 2.93382514e-03 2.52133950e-03 2.20987076e-03
6.02692024e-04 3.14224047e-04 2.07946371e-04 1.52681980e-04
1.18810256e-04 9.59239558e-05 7.94245303e-05 6.69657805e-05
9.44217505e-03 5.07394568e-03 3.46459801e-03 2.62773723e-03
2.11482255e-03 1.76825858e-03 1.51841014e-03 1.32974907e-03
1.82816581e-02 9.83366730e-03 6.72124965e-03 5.10279247e-03
4.11083485e-03 3.44059321e-03 2.95739575e-03 2.59253236e-03
2.71211411e-02 1.45933889e-02 9.97790129e-03 7.57784772e-03
6.10684714e-03 5.11292783e-03 4.39638136e-03 3.85531565e-03
3.59606241e-02 1.93531106e-02 1.32345529e-02 1.00529030e-02
8.10285944e-03 6.78526246e-03 5.83536696e-03 5.11809894e-03
4.48001071e-02 2.41128322e-02 1.64912046e-02 1.25279582e-02
1.00988717e-02 8.45759709e-03 7.27435257e-03 6.38088223e-03
5.36395902e-02 2.88725538e-02 1.97478562e-02 1.50030135e-02
1.20948840e-02 1.01299317e-02 8.71333818e-03 7.64366552e-03
6.24790732e-02 3.36322754e-02 2.30045079e-02 1.74780687e-02
1.40908963e-02 1.18022663e-02 1.01523238e-02 8.90644880e-03
1.38395946e-03 7.34906514e-04 4.95781743e-04 3.71436862e-04
2.95225484e-04 2.43731309e-04 2.06607602e-04 1.78575415e-04
2.12727963e-02 1.14442802e-02 7.82324793e-03 5.94031117e-03
4.78625315e-03 4.00648422e-03 3.44432522e-03 3.01983782e-03
4.11616331e-02 2.21536538e-02 1.51507141e-02 1.15091855e-02
9.27728081e-03 7.76923713e-03 6.68204284e-03 5.86110022e-03
6.10504699e-02 3.28630275e-02 2.24781803e-02 1.70780598e-02
1.37683085e-02 1.15319900e-02 9.91976046e-03 8.70236262e-03
8.09393067e-02 4.35724012e-02 2.98056465e-02 2.26469341e-02
1.82593361e-02 1.52947429e-02 1.31574781e-02 1.15436250e-02
1.00828143e-01 5.42817748e-02 3.71331127e-02 2.82158084e-02
2.27503638e-02 1.90574959e-02 1.63951957e-02 1.43848874e-02
1.20716980e-01 6.49911485e-02 4.44605789e-02 3.37846827e-02
2.72413915e-02 2.28202488e-02 1.96329133e-02 1.72261498e-02
1.40605817e-01 7.57005222e-02 5.17880451e-02 3.93535570e-02
3.17324191e-02 2.65830017e-02 2.28706309e-02 2.00674122e-02
2.47773388e-03 1.32386197e-03 8.98751264e-04 6.77693699e-04
5.42206803e-04 4.50661604e-04 3.84663902e-04 3.34828902e-04
3.78356660e-02 2.03627485e-02 1.39253578e-02 1.05779147e-02
8.52625599e-03 7.14000011e-03 6.14060634e-03 5.38596206e-03
7.31935981e-02 3.94016350e-02 2.69519644e-02 2.04781357e-02
1.65103052e-02 1.38293386e-02 1.18965488e-02 1.04370952e-02
1.08551530e-01 5.84405215e-02 3.99785710e-02 3.03783567e-02
2.44943544e-02 2.05186771e-02 1.76524912e-02 1.54882284e-02
1.43909462e-01 7.74794080e-02 5.30051775e-02 4.02785776e-02
3.24784035e-02 2.72080156e-02 2.34084336e-02 2.05393615e-02
1.79267394e-01 9.65182945e-02 6.60317841e-02 5.01787986e-02
4.04624527e-02 3.38973541e-02 2.91643761e-02 2.55904947e-02
2.14625326e-01 1.15557181e-01 7.90583906e-02 6.00790196e-02
4.84465019e-02 4.05866926e-02 3.49203185e-02 3.06416278e-02
2.49983259e-01 1.34596068e-01 9.20849972e-02 6.99792406e-02
5.64305511e-02 4.72760311e-02 4.06762609e-02 3.56927610e-02
3.88401527e-03 2.08109041e-03 1.41685493e-03 1.07145249e-03
8.59754214e-04 7.16714840e-04 6.13593431e-04 5.35726244e-04
5.91307842e-02 3.18293506e-02 2.17709277e-02 1.65405478e-02
1.33348311e-02 1.11688063e-02 9.60725348e-03 8.42812180e-03
1.14377553e-01 6.15776108e-02 4.21250004e-02 3.20096431e-02
2.58099079e-02 2.16208977e-02 1.86009135e-02 1.63205174e-02
1.69624322e-01 9.13258709e-02 6.24790732e-02 4.74787384e-02
3.82849848e-02 3.20729891e-02 2.75945736e-02 2.42129129e-02
2.24871091e-01 1.21074131e-01 8.28331459e-02 6.29478337e-02
5.07600616e-02 4.25250805e-02 3.65882336e-02 3.21053085e-02
2.80117860e-01 1.50822391e-01 1.03187219e-01 7.84169289e-02
6.32351385e-02 5.29771719e-02 4.55818937e-02 3.99977040e-02
3.35364629e-01 1.80570651e-01 1.23541291e-01 9.38860242e-02
7.57102153e-02 6.34292633e-02 5.45755538e-02 4.78900996e-02
3.90611398e-01 2.10318912e-01 1.43895364e-01 1.09355120e-01
8.81852922e-02 7.38813547e-02 6.35692138e-02 5.57824951e-02
5.60280363e-03 3.00659184e-03 2.05009275e-03 1.55271323e-03
1.24786772e-03 1.04189102e-03 8.93396188e-04 7.81267439e-04
8.51581509e-02 4.58440865e-02 3.13599575e-02 2.38282105e-02
1.92119784e-02 1.60929027e-02 1.38442667e-02 1.21463170e-02
1.64713498e-01 8.86815811e-02 6.06698223e-02 4.61037077e-02
3.71760890e-02 3.11439143e-02 2.67951371e-02 2.35113666e-02
2.44268845e-01 1.31519076e-01 8.99796870e-02 6.83792049e-02
5.51401997e-02 4.61949259e-02 3.97460076e-02 3.48764162e-02
3.23824193e-01 1.74356570e-01 1.19289552e-01 9.06547021e-02
7.31043104e-02 6.12459376e-02 5.26968781e-02 4.62414658e-02
4.03379540e-01 2.17194065e-01 1.48599417e-01 1.12930199e-01
9.10684210e-02 7.62969492e-02 6.56477486e-02 5.76065155e-02
4.82934887e-01 2.60031560e-01 1.77909281e-01 1.35205697e-01
1.09032532e-01 9.13479608e-02 7.85986191e-02 6.89715651e-02
5.62490234e-01 3.02869054e-01 2.07219146e-01 1.57481194e-01
1.26996642e-01 1.06398972e-01 9.15494895e-02 8.03366147e-02
7.63409898e-03 4.10036625e-03 2.79846472e-03 2.12147593e-03
1.70654731e-03 1.42619014e-03 1.22407217e-03 1.07145249e-03
1.15917766e-01 6.24069562e-02 4.26924473e-02 3.24409027e-02
2.61576979e-02 2.19122893e-02 1.88516459e-02 1.65405478e-02
2.24201433e-01 1.20713546e-01 8.25864299e-02 6.27603295e-02
5.06088486e-02 4.23983885e-02 3.64792196e-02 3.20096431e-02
3.32485100e-01 1.79020136e-01 1.22480413e-01 9.30797562e-02
7.50599992e-02 6.28844876e-02 5.41067933e-02 4.74787384e-02
4.40768767e-01 2.37326726e-01 1.62374395e-01 1.23399183e-01
9.95111498e-02 8.33705868e-02 7.17343670e-02 6.29478337e-02
5.49052434e-01 2.95633316e-01 2.02268378e-01 1.53718610e-01
1.23962300e-01 1.03856686e-01 8.93619407e-02 7.84169289e-02
6.57336101e-01 3.53939906e-01 2.42162360e-01 1.84038037e-01
1.48413451e-01 1.24342785e-01 1.06989514e-01 9.38860242e-02
7.65619768e-01 4.12246496e-01 2.82056343e-01 2.14357463e-01
1.72864602e-01 1.44828884e-01 1.24617088e-01 1.09355120e-01
9.97790129e-03 5.36241365e-03 3.66197084e-03 2.77774057e-03
2.23579299e-03 1.86961220e-03 1.60562139e-03 1.40628139e-03
1.51409630e-01 8.15179597e-02 5.57683971e-02 4.23786245e-02
3.41719897e-02 2.86269662e-02 2.46293911e-02 2.16108140e-02
2.92841358e-01 1.57673506e-01 1.07874823e-01 8.19795085e-02
6.61081865e-02 5.53843202e-02 4.76531609e-02 4.18153466e-02
4.34273086e-01 2.33829052e-01 1.59981250e-01 1.21580392e-01
9.80443832e-02 8.21416743e-02 7.06769306e-02 6.20198793e-02
5.75704815e-01 3.09984598e-01 2.12087676e-01 1.61181276e-01
1.29980580e-01 1.08899028e-01 9.37007003e-02 8.22244119e-02
7.17136543e-01 3.86140144e-01 2.64194102e-01 2.00782160e-01
1.61916777e-01 1.35656382e-01 1.16724470e-01 1.02428945e-01
8.58568272e-01 4.62295690e-01 3.16300528e-01 2.40383044e-01
1.93852973e-01 1.62413736e-01 1.39748240e-01 1.22633477e-01
1.00000000e+00 5.38451236e-01 3.68406955e-01 2.79983928e-01
2.25789170e-01 1.89171090e-01 1.62772010e-01 1.42838010e-01]
X is
[[0. 0. 0. ]
[0. 0. 0.14285714]
[0. 0. 0.28571429]
...
[1. 1. 0.71428571]
[1. 1. 0.85714286]
[1. 1. 1. ]]
As you can see, it is non-negative.
Please tell me how I could change the model so it gives only non-negative predictions.
This is a regression problem, and in your last layer you are still using the ReLU activation function. For regression models, you shouldn't use an activation in the last layer. I recommend removing relu from the last layer and retraining the model.
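A minimal sketch of that change, with the same architecture as in the question but a linear output layer (the learning rate, epoch count and batch size below are placeholders, since lr1, epp[i] and bs1 are not shown in the post):

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

lr1, epochs, bs1 = 1e-3, 15, 32   # placeholder values, not from the original post

opt = keras.optimizers.Adam(learning_rate=lr1)
modelx = tf.keras.Sequential([
    keras.Input(shape=(3,)),
    layers.Dense(10, activation='relu'),
    layers.Dense(10, activation='relu'),
    layers.Dense(1),                      # no activation: linear output
])
modelx.compile(optimizer=opt, loss="mse", metrics=['mse', 'mae'])
# history = modelx.fit(X, Y, epochs=epochs, batch_size=bs1, shuffle=True)

# If strictly non-negative predictions are still required, one option (my
# suggestion, not part of the answer above) is to clip after predicting:
# preds = np.clip(modelx.predict(X), 0.0, None)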
I have a question regarding the Hyperopt score output and the best-param output. After running fmin (as shown in the example), it reported best loss: -0.0921409214092141, with the final printed score being 0.06775067750677506. However, when I used the best params it generated to fit the model, it produced a score of 0.06775067750677506.
Shouldn't I be expecting 0.0921409214092141, or am I interpreting it wrongly? I passed the parameter dictionary as XGBClassifier(**param_dict) when fitting with the params generated by Hyperopt.
best_params = fmin(fn=objective,
                   space=space,
                   algo=tpe.suggest,
                   max_evals=100,
                   trials=trials,
                   rstate=np.random.default_rng(0))
...
SCORE:
0.07046070460704607
SCORE:
0.08130081300813008
SCORE:
0.06775067750677506
100%|██████| 100/100 [04:42<00:00, 2.83s/trial, best loss: -0.0921409214092141]
print(best_params)
{'colsample_bytree': 0.8945730049917332, 'eta': 0.05817033918345017, 'gamma': 4.954333064013377, 'max_depth': 6.0, 'min_child_weight': 5.0, 'n_estimators': 258.0, 'reg_alpha': 5.0, 'reg_lambda': 8.0, 'subsample': 0.9}
Thanks in advance.
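For context, Hyperopt's fmin minimizes the returned loss, so an objective that maximizes a score usually returns the negated score, which is why "best loss" shows up as a negative number. A minimal hypothetical sketch of such an objective (the data and cross-validation here are stand-ins, not taken from the original post):

from hyperopt import STATUS_OK
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

X, y = make_classification(n_samples=200, random_state=0)  # stand-in data

def objective(params):
    # Integer-valued hyperparameters come back from the search space as floats,
    # so they are cast before being passed to XGBClassifier.
    params = dict(params,
                  max_depth=int(params['max_depth']),
                  n_estimators=int(params['n_estimators']))
    score = cross_val_score(XGBClassifier(**params), X, y, cv=3).mean()
    print('SCORE:')
    print(score)
    # fmin minimizes, so the maximized score is returned with its sign flipped.
    return {'loss': -score, 'status': STATUS_OK}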
I have a fitted CRNN model with a CTC loss output.
I have the prediction and I use keras.backend.ctc_decode to decode it. As described in the documentation (https://code.i-harness.com/en/docs/tensorflow~python/tf/keras/backend/ctc_decode), the function returns a tuple with the decoded result and a tensor with the log probability of the prediction.
keras.backend.ctc_decode can accept multiple predictions at once, but I need to pass them one at a time.
This is the code:
def decode_single_prediction(pred, num_to_char):
    input_len = np.ones(pred.shape[0]) * pred.shape[1]
    # Use greedy search. For complex tasks, you can use beam search
    decoded = keras.backend.ctc_decode(pred, input_length=input_len, greedy=True)
    # decoded[0] is supposed to be the decoded result
    # decoded[1] is supposed to be its log probability
    accuracy = float(decoded[1][0][0])
    # take the resulting encoded chars until the first -1
    result = decoded[0][0][:, :np.argmax(decoded[0][0] == -1)]
    output_text = tf.strings.reduce_join(num_to_char(result)).numpy().decode("utf-8")
    return (output_text, accuracy)

for image in images:
    pred = prediction_model.predict(image)
    # num_to_char is the mapping from number to char
    pred_texts, acc = decode_single_prediction(pred, num_to_char)
    print("True value: " + <true_result> + " prediction: " + pred_texts + " acc: " + str(acc))
Output:
True value: test0, prediction: test0, acc: 1.841524362564087
True value: test1, prediction: test1, acc: 0.9661365151405334
True value: test2, prediction: test2, acc: 1.0634151697158813
True value: test3, prediction: test3, acc: 2.471940755844116
True value: test4, prediction: test4, acc: 1.4866207838058472
True value: test5, prediction: test5, acc: 0.7630811333656311
True value: test6, prediction: test6, acc: 0.35642576217651367
True value: test7, prediction: test7, acc: 1.5693446397781372
True value: test8, prediction: test8, acc: 0.9700028896331787
True value: test9, prediction: test9, acc: 1.4783780574798584
The prediction is always correct. However, what I think is the log probability does not look like what I expect: the values seem like completely random numbers, even greater than 1 or 2! What am I doing wrong?
Thank you in advance!
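For reference, a batched call (which the question mentions but does not use) would look roughly like this; the random preds array below is only a stand-in for a stacked batch of model outputs:

import numpy as np
import tensorflow as tf
from tensorflow import keras

# Stand-in batch of softmax outputs, shape (batch, time_steps, num_classes).
preds = tf.nn.softmax(tf.random.normal((4, 20, 11))).numpy()

input_len = np.ones(preds.shape[0]) * preds.shape[1]
(decoded,), log_probs = keras.backend.ctc_decode(preds, input_length=input_len, greedy=True)
# decoded has shape (batch, max_decoded_len), padded with -1;
# log_probs has shape (batch, 1), one score per sample in the batch.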
I want to tune a hyperparameter in a slightly modified DNNClassifier. I was able to run the tuning job and it succeeded, but the output does not show the final metric for each trial. This is what the final output looks like:
{
  "completedTrialCount": "2",
  "trials": [
    {
      "trialId": "1",
      "hyperparameters": {
        "myparam": "0.003"
      }
    },
    {
      "trialId": "2",
      "hyperparameters": {
        "myparam": "0.07"
      }
    }
  ],
  "consumedMLUnits": 1.48,
  "isHyperparameterTuningJob": true
}
How do I get the final metric for each trial so I can decide which value is best?
My code looks like this.
My DNNClassifier:
classifier = DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=hu,
    optimizer=tf.train.AdamOptimizer(learning_rate=lr),
    activation_fn=tf.nn.leaky_relu,
    dropout=dr,
    n_classes=2,
    config=self.get_run_config(),
    model_dir=self.model_dir,
    weight_column=weight_column
)
tf.contrib.estimator.add_metrics(classifier, compute_metrics)

def compute_metrics(labels, predictions):
    return {'my-roc-auc': tf.metrics.auc(labels, predictions)}
The hyperparameters spec is as follows.
trainingInput:
  hyperparameters:
    hyperparameterMetricTag: my-roc-auc
    maxTrials: 2
    enableTrialEarlyStopping: True
    params:
      - parameterName: myparam
        type: DISCRETE
        discreteValues:
          - 0.0001
          - 0.0005
          - 0.001
          - 0.003
          - 0.005
          - 0.007
          - 0.01
          - 0.03
          - 0.05
          - 0.07
          - 0.1
I mostly followed the instructions here.
Fixed it. The issue was
tf.contrib.estimator.add_metrics(classifier, compute_metrics)
It should have been
classifier = tf.contrib.estimator.add_metrics(classifier, compute_metrics)
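For context, a minimal sketch of the corrected wiring under TF 1.x (the feature columns and hidden units below are placeholders, not the ones from the question):

import tensorflow as tf

def compute_metrics(labels, predictions):
    return {'my-roc-auc': tf.metrics.auc(labels, predictions)}

feature_columns = [tf.feature_column.numeric_column('x')]   # placeholder
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[10, 10],                                   # placeholder
    n_classes=2,
)
# add_metrics returns a new estimator, so the result has to be reassigned:
classifier = tf.contrib.estimator.add_metrics(classifier, compute_metrics)
# 'my-roc-auc' is then reported at evaluation time, which is the value the
# hyperparameterMetricTag in the tuning spec picks up.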
I have written a custom loss function, adjusted_r2. I am trying to print the tensor values inside the function, but when the logs are printed I don't see anything. Could somebody help me with this?
def coeff_determination(y_true, y_pred):
    from keras import backend as K
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    SS_res = K.print_tensor(SS_res, message='SS_res = ')
    SS_tot = K.print_tensor(SS_tot, message='SS_tot = ')
    r_squared = 1 - SS_res / (SS_tot + K.epsilon())
    r_squared = K.print_tensor(r_squared, message='r_squared = ')
    adj_r_squared = 1 - ((1 - r_squared) * K.cast(K.shape(y_true)[0] - 1, "float32")
                         / K.cast(K.shape(y_true)[0] - n_features - 1, "float32"))
    adj_r_squared = K.print_tensor(adj_r_squared, message='adj_r_squared = ')
    return -adj_r_squared
The logs are:
1/250 [..............................] - ETA: 51:44 - loss: -6.7060 - coeff_determination: -6.7060 - mean_squared_error: 40.5785
2/250 [..............................] - ETA: 42:56 - loss: -7.2036 - coeff_determination: -7.2036 - mean_squared_error: 48.8251
3/250 [..............................] - ETA: 41:30 - loss: -8.0279 - coeff_determination: -8.0279 - mean_squared_error: 48.1565
4/250 [..............................] - ETA: 40:48 - loss: -9.1016 - coeff_determination: -9.1016 - mean_squared_error: 51.9965
The K.print_tensor() function works when the tensors are evaluated (see the documentation here). The tensors are not initialized when the custom loss function is called, which is why you cannot evaluate tensor values from within the loss function. The arguments to your custom loss function are tensors that act as placeholders, without actual data attached to them.
The same problem has also been discussed in this thread.
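As a rough illustration of this point, the messages do appear once the loss tensor is evaluated on concrete data (a hypothetical standalone check, assuming coeff_determination is defined as above):

import numpy as np
from keras import backend as K

n_features = 3   # hypothetical feature count used inside the loss
y_true = K.constant(np.random.rand(8, 1).astype('float32'))
y_pred = K.constant(np.random.rand(8, 1).astype('float32'))

loss = coeff_determination(y_true, y_pred)
print(K.eval(loss))   # evaluating the tensor triggers the K.print_tensor messages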
I am trying to create a sample feed-forward neural network in tensorflow.js, using a small data set initially (just for a POC). There are 5 input nodes and one output node. The data is housing-related: there are multiple inputs and we are predicting the price.
x_train:
[ [ 79545.45857, 5.682861322, 7.009188143, 4.09, 23086.8005 ],
[ 79248.64245, 6.002899808, 6.730821019, 3.09, 40173.07217 ],
[ 61287.06718, 5.86588984, 8.51272743, 5.13, 36882.1594 ],
[ 63345.24005, 7.188236095, 5.586728665, 3.26, 34310.24283 ],
[ 59982.19723, 5.040554523, 7.839387785, 4.23, 26354.10947 ],
...
]
y_train:
[ [ 1059033.558 ],
[ 1505890.915 ],
[ 1058987.988 ],
[ 1260616.807 ],
[ 630943.4893 ],
...
]
const model = tf.sequential();

const config_hidden = {
  inputShape: [5],
  activation: 'sigmoid',
  units: 6
}
const config_output = {
  units: 1,
  activation: 'sigmoid'
}

const hidden = tf.layers.dense(config_hidden);
const output = tf.layers.dense(config_output);
model.add(hidden);
model.add(output);

const optimizer = tf.train.sgd(0.5);
const config = {
  optimizer: optimizer,
  loss: 'meanSquaredError',
  metrics: ['accuracy']
}

model.compile(config);
train_data().then(function () {
  console.log('Training is Complete');
});

async function train_data() {
  const options = {
    shuffle: true,
    epochs: 10,
    batch_size: 100,
    validationSplit: 0.1
  }
  for (let i = 0; i < 10; i++) {
    const res = await model.fit(xs, ys, options);
    console.log(res.history.loss[0]);
  }
}
The model compiles fine, but the loss while training the model is huge:
Model Successfully Compiled
Epoch 1 / 10
eta=0.0 ====================================================================>
1058ms 235us/step - acc=0.00 loss=1648912629760.00 val_acc=0.00 val_loss=1586459705344.00
Epoch 2 / 10
eta=0.0 ====================================================================>
700ms 156us/step - acc=0.00 loss=1648913285120.00 val_acc=0.00 val_loss=1586459705344.00
Epoch 3 / 10
eta=0.0 ====================================================================>
615ms 137us/step - acc=0.00 loss=1648913022976.00 val_acc=0.00 val_loss=1586459705344.00
Epoch 4 / 10
eta=0.0 ====================================================================>
852ms 189us/step - acc=0.00 loss=1648913285120.00 val_acc=0.00 val_loss=1586459705344.00
I figured it could be because the training data is not normalized, so I divided the inputs by their mean:
xs = xs.div(xs.mean(0));
x_train
[[1.1598413, 0.9507535, 1.003062 , 1.0272969, 0.6384002],
[1.1555134, 1.0042965, 0.9632258, 0.7761241, 1.1108726],
[0.8936182, 0.9813745, 1.2182286, 1.2885166, 1.0198718],
...,
There is not much change in the loss:
Model Successfully Compiled
Epoch 1 / 10
eta=0.0 ====================================================================>
841ms 187us/step - acc=0.00 loss=1648912760832.00 val_acc=0.00 val_loss=1586459705344.00
Epoch 2 / 10
eta=0.0 ====================================================================>
613ms 136us/step - acc=0.00 loss=1648913154048.00 val_acc=0.00 val_loss=1586459705344.00
Epoch 3 / 10
eta=0.0 ====================================================================>
646ms 144us/step - acc=0.00 loss=1648913022976.00 val_acc=0.00 val_loss=1586459705344.00
I then scaled the output too:
ys = ys.div(1000000);
Model Successfully Compiled
Epoch 1 / 10
eta=0.0 ====================================================================>
899ms 200us/step - acc=0.00 loss=0.202 val_acc=0.00 val_loss=0.161
Epoch 2 / 10
eta=0.0 ====================================================================>
667ms 148us/step - acc=0.00 loss=0.183 val_acc=0.00 val_loss=0.160
Epoch 3 / 10
eta=0.0 ====================================================================>
609ms 135us/step - acc=0.00 loss=0.182 val_acc=0.00 val_loss=0.159
This brought the loss down to small decimal values. However, even running 10,000 iterations on the training data does not decrease the loss substantially, e.g.:
Epoch 8 / 10
eta=0.0 ====================================================================>
502ms 112us/step - acc=0.00 loss=0.181 val_acc=0.00 val_loss=0.158
Epoch 9 / 10
eta=0.0 ====================================================================>
551ms 122us/step - acc=0.00 loss=0.181 val_acc=0.00 val_loss=0.158
Epoch 10 / 10
eta=0.0 ====================================================================>
470ms 104us/step - acc=0.00 loss=0.181 val_acc=0.00 val_loss=0.158
0.18076679110527039
Finally, the loss starts at around 0.202 and goes down to around 0.180, which results in incorrect predictions.
This is a very common scenario: multiple inputs with values in different ranges (e.g., the housing data used above), passed to a feed-forward neural network, with a single expected output (the price in this case).
Questions:
1. What am I doing wrong in the code above?
2. Am I normalizing the data in the correct manner?
3. Am I using the correct loss function/optimizer/learning rate/activation, etc.?
4. How do I know whether the model is performing well?
5. Is there any other way to do this in tensorflow.js?
I'm going to assume you're not attempting linear regression, because of the sigmoidal activations. If you are trying linear regression, remove the sigmoidal activations everywhere. I'll try to address all the errors I can see:
Remove the sigmoid activation from the output. The sigmoid function squashes inputs to between 0 and 1, so it's not meant for regression. Your last layer does not need an activation.
Your learning rate is way too high, so I doubt the learning algorithm is able to converge. Start with values around 0.001 to 0.01 and adjust if required.
No, you're not normalizing correctly. Generally, data is standardized to a mean of zero and a standard deviation of one. This is done for each feature column, using the mean and standard deviation of that column only, not of all the data. The formula for example i in feature column x is (x_i - x.mean()) / x.std(). (I don't know JavaScript, but see the numpy sketch below.)
The performance metric you provided, "accuracy", is meant for classification, not regression, and would be meaningless here (if it is even computed). Minimising your mean squared error or mean absolute error is the best way to quantify model performance.
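In Python/numpy terms, the per-column standardization looks like this (a minimal sketch; the rows are just the first few quoted in the question):

import numpy as np

x_train = np.array([
    [79545.45857, 5.682861322, 7.009188143, 4.09, 23086.8005],
    [79248.64245, 6.002899808, 6.730821019, 3.09, 40173.07217],
    [61287.06718, 5.86588984, 8.51272743, 5.13, 36882.1594],
])

# Each feature column is standardized with its own mean and standard deviation.
mean = x_train.mean(axis=0)
std = x_train.std(axis=0)
x_std = (x_train - mean) / std

# Keep `mean` and `std` from the training set so that new inputs can be
# transformed the same way (and predictions mapped back if the target is scaled).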