How to properly use tf.metrics.accuracy? - tensorflow

I have some trouble using the accuracy function from tf.metrics for a multiple classification problem with logits as input.
My model output looks like:
logits = [[0.1, 0.5, 0.4],
[0.8, 0.1, 0.1],
[0.6, 0.3, 0.2]]
And my labels are one hot encoded vectors:
labels = [[0, 1, 0],
[1, 0, 0],
[0, 0, 1]]
When I try to do something like tf.metrics.accuracy(labels, logits), it never gives the correct result. I am obviously doing something wrong, but I can't figure out what it is.

TL;DR
tf.metrics.accuracy calculates how often predictions match labels. It does this with two local variables it creates, total and count, which accumulate the number of correct predictions and the number of comparisons made.
acc, acc_op = tf.metrics.accuracy(labels=tf.argmax(labels, 1),
predictions=tf.argmax(logits,1))
print(sess.run([acc, acc_op]))
print(sess.run([acc]))
# Output
#[0.0, 0.66666669]
#[0.66666669]
acc (accuracy): simply returns the metric computed from total and count; it does not update them.
acc_op (update op): updates total and count with the current batch and returns the updated metric.
To understand why acc returns 0.0 on the first call, go through the details below.
Details using a simple example:
logits = tf.placeholder(tf.int64, [2,3])
labels = tf.Variable([[0, 1, 0], [1, 0, 1]])
acc, acc_op = tf.metrics.accuracy(labels=tf.argmax(labels, 1),
predictions=tf.argmax(logits,1))
Initialize the variables:
Since metrics.accuracy creates two local variables total and count, we need to call local_variables_initializer() to initialize them.
sess = tf.Session()
sess.run(tf.local_variables_initializer())
sess.run(tf.global_variables_initializer())
stream_vars = [i for i in tf.local_variables()]
print(stream_vars)
#[<tf.Variable 'accuracy/total:0' shape=() dtype=float32_ref>,
# <tf.Variable 'accuracy/count:0' shape=() dtype=float32_ref>]
Understanding update ops and accuracy calculation:
print('acc:',sess.run(acc, {logits:[[0,1,0],[1,0,1]]}))
#acc: 0.0
print('[total, count]:',sess.run(stream_vars))
#[total, count]: [0.0, 0.0]
The above returns 0.0 for accuracy because total and count are still zero, despite the inputs matching.
print('ops:', sess.run(acc_op, {logits:[[0,1,0],[1,0,1]]}))
#ops: 1.0
print('[total, count]:',sess.run(stream_vars))
#[total, count]: [2.0, 2.0]
The accuracy is recalculated when the update op is called with new inputs. Note: since all the logits and labels match, we get an accuracy of 1.0, and the local variables total and count now hold the number of correct predictions and the total number of comparisons made.
Now we call accuracy with the new inputs (not the update ops):
print('acc:', sess.run(acc,{logits:[[1,0,0],[0,1,0]]}))
#acc: 1.0
Calling acc does not update the metric with the new inputs; it just returns the value computed from the two local variables. Note: the logits and labels do not match in this case. Now calling the update op again:
print('op:',sess.run(acc_op,{logits:[[0,1,0],[0,1,0]]}))
#op: 0.75
print('[total, count]:',sess.run(stream_vars))
#[total, count]: [3.0, 4.0]
The metric is updated with the new inputs.
More information on how to use the metrics during training and how to reset them during validation can be found here.
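As a rough sketch of the reset pattern (this assumes the acc, acc_op pair defined above and the default 'accuracy' variable scope), you can re-initialize the metric's local variables between the training and validation passes:
# gather the total/count variables that tf.metrics.accuracy created
running_vars = [v for v in tf.local_variables() if 'accuracy' in v.name]
running_vars_initializer = tf.variables_initializer(var_list=running_vars)
# ... accumulate over training batches via sess.run(acc_op, feed_dict=...) ...
# reset total and count before validation
sess.run(running_vars_initializer)
# ... accumulate over validation batches via acc_op, then read the final value with sess.run(acc)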

In TF 2.0, if you are using the tf.keras API, you can define a custom class myAccuracy that inherits from tf.keras.metrics.Accuracy and overrides the update_state method like this:
# imports
# ...
class myAccuracy(tf.keras.metrics.Accuracy):
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_true = tf.argmax(y_true, 1)
        y_pred = tf.argmax(y_pred, 1)
        return super(myAccuracy, self).update_state(y_true, y_pred, sample_weight)
Then, when compiling the model you can add metrics in the usual way.
from my_awesome_models import discriminador
discriminador.compile(tf.keras.optimizers.Adam(),
loss=tf.nn.softmax_cross_entropy_with_logits,
metrics=[myAccuracy()])
from my_puzzling_datasets import train_dataset,test_dataset
discriminador.fit(train_dataset.shuffle(70000).repeat().batch(1000),
epochs=1,steps_per_epoch=1,
validation_data=test_dataset.shuffle(70000).batch(1000),
validation_steps=1)
# Train for 1 steps, validate for 1 steps
# 1/1 [==============================] - 3s 3s/step - loss: 0.1502 - accuracy: 0.9490 - val_loss: 0.1374 - val_accuracy: 0.9550
Or evaluate your model over the whole dataset:
discriminador.evaluate(test_dataset.batch(TST_DSET_LENGTH))
#> [0.131587415933609, 0.95354694]

Applied to a CNN, you can write:
x_len=24*24
y_len=2
x = tf.placeholder(tf.float32, shape=[None, x_len], name='input')
fc1 = ... # cnn's fully connected layer
keep_prob = tf.placeholder(tf.float32, name='keep_prob')
layer_fc_dropout = tf.nn.dropout(fc1, keep_prob, name='dropout')
y_pred = tf.nn.softmax(fc1, name='output')
logits = tf.argmax(y_pred, axis=1)
y_true = tf.placeholder(tf.float32, shape=[None, y_len], name='y_true')
acc, acc_op = tf.metrics.accuracy(labels=tf.argmax(y_true, axis=1), predictions=tf.argmax(y_pred, 1))
sess.run(tf.global_variables_initializer())
sess.run(tf.local_variables_initializer())
def print_accuracy(x_data, y_data, dropout=1.0):
    accuracy = sess.run(acc_op, feed_dict={y_true: y_data, x: x_data, keep_prob: dropout})
    print('Accuracy: ', accuracy)
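A hypothetical call during evaluation might then look like this (x_val and y_val are stand-ins for your own validation arrays):
# keep_prob=1.0 disables dropout during evaluation
print_accuracy(x_val, y_val, dropout=1.0)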

Extending the answer to TF 2.0, the tutorial here explains clearly how to use tf.metrics for accuracy and loss:
https://www.tensorflow.org/beta/tutorials/quickstart/advanced
Notice that it mentions that the metrics are reset after each epoch:
train_loss.reset_states()
train_accuracy.reset_states()
test_loss.reset_states()
test_accuracy.reset_states()
When the labels and predictions are one-hot encoded:
def train_step(features, labels):
    with tf.GradientTape() as tape:
        predictions = model(features)
        loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=predictions))
    gradients = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
    train_loss(loss)
    train_accuracy(tf.argmax(labels, 1), tf.argmax(predictions, 1))
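For completeness, a minimal sketch of the objects this snippet assumes (the names follow the quickstart tutorial; adapt them to your own model and optimizer):
# streaming metrics that accumulate across train_step calls
train_loss = tf.keras.metrics.Mean(name='train_loss')
train_accuracy = tf.keras.metrics.Accuracy(name='train_accuracy')
optimizer = tf.keras.optimizers.Adam()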

Here is how I use it:
test_accuracy = tf.keras.metrics.Accuracy()
# use dataset api or normal dataset from lists/np arrays
ds_test_batch = zip(x_test,y_test)
predicted_classes = np.array([])
for (x, y) in ds_test_batch:
    # training=False is needed only if there are layers with different
    # behaviour during training versus inference (e.g. Dropout).
    # Adjust the input shape to match what you used during training.
    logits = model(x.reshape(1, -1), training=False)
    prediction = tf.argmax(logits, axis=1, output_type=tf.int64)
    predicted_classes = np.concatenate([predicted_classes, prediction.numpy()])
    test_accuracy(prediction, y)
print("Test set accuracy: {:.3%}".format(test_accuracy.result()))

Related

ValueError: No gradients provided for any variable in my custom loss - Why?

Here is my code (you can copy and paste it to run it):
import tensorflow as tf
import numpy as np
from sklearn.preprocessing import MinMaxScaler
x = np.array([[1, 2], [3, 4], [5, 6], [7, 8]]).astype(np.float32)
y = np.array([[-1], [3], [7], [-2]]).astype(np.float32)
# scale x and y
x_scaler = MinMaxScaler()
x_scaler.fit(x)
x_sc = x_scaler.transform(x)
y_scaler = MinMaxScaler()
y_scaler.fit(y)
y_sc = y_scaler.transform(y)
batch_size = 2
ds = tf.data.Dataset.from_tensor_slices((x_sc, y_sc)).batch(batch_size=batch_size)
# create the model
model = tf.keras.Sequential(
[
tf.keras.layers.Input(shape=(2,)),
tf.keras.layers.Dense(units=3, activation='relu'),
tf.keras.layers.Dense(units=1)
]
)
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
def standard_loss(y_batch, y_pred, y_min_max):
    batches = y_pred.shape[0]
    loss = 0.0
    y_true_unsc = tf.convert_to_tensor(y_min_max.inverse_transform(y_batch), tf.float32)
    y_pred_unsc = tf.convert_to_tensor(y_min_max.inverse_transform(y_pred), tf.float32)
    for batch in range(batches):
        loss += tf.math.reduce_mean(tf.math.square(y_true_unsc[batch] - y_pred_unsc[batch]))
    return loss / batches
# training loop
epochs = 1
for epoch in range(epochs):
    print("\nStart of epoch %d" % (epoch,))
    for step, (x_batch, y_batch) in enumerate(ds):
        with tf.GradientTape() as tape:
            y_pred = model(x_batch, training=True)
            loss_value = standard_loss(y_batch, y_pred, y_scaler)
        grads = tape.gradient(loss_value, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
The problem is located in my cost function (standard_loss). When I don't unscale my data, everything works fine, as below:
def standard_loss(y_batch, y_pred, y_min_max):
    batches = y_pred.shape[0]
    loss = 0.0
    for batch in range(batches):
        loss += tf.math.reduce_mean(tf.math.square(y_batch[batch] - y_pred[batch]))
    return loss / batches
But when I leave it as above, I get this error:
ValueError: No gradients provided for any variable: ['dense/kernel:0', 'dense/bias:0', 'dense_1/kernel:0', 'dense_1/bias:0'].
I need to unscale my data to use it for others computations.
Could someone help me understand why this happens?
EDIT 1:
The problem is due to the tape (with tf.GradientTape() as tape), which records all the operations and then traverses them in reverse when calculating the gradient. My goal now is to figure out how to unscale my y_pred variable without the tape losing track of the operations when calculating the gradient. Any ideas?
EDIT 2:
In my custom loss, the unscale operation is a NumPy operation, and it is not recorded by the tape since it leaves the TensorFlow graph. This is why the error appears. So I'm going to look for a way to scale and unscale my data with TensorFlow operations.
SOLUTION :
EDIT 2 is the solution. Now, everything works perfectly.
In my custom loss, the unscale operation is a NumPy operation, and it is not recorded by the tape since it leaves the TensorFlow graph. This is why the error appears. The solution is to use TensorFlow operations to scale and unscale the data so that the tape can record the whole path. See the code below:
import tensorflow as tf
import numpy as np
x = tf.convert_to_tensor([[1, 2], [3, 4], [5, 6], [7, 8]], dtype=tf.float32)
y = tf.convert_to_tensor([[-1], [3], [7], [-2]], dtype=tf.float32)
# retrieve x and y min max
xmin, xmax = tf.reduce_min(x, axis=0), tf.reduce_max(x, axis=0)
ymin, ymax = tf.reduce_min(y, axis=0), tf.reduce_max(y, axis=0)
batch_size = 2
ds = tf.data.Dataset.from_tensor_slices((x, y)).batch(batch_size)
# create the model
model = tf.keras.Sequential(
[
tf.keras.layers.Input(shape=(2,)),
tf.keras.layers.Dense(units=3, activation='relu'),
tf.keras.layers.Dense(units=1)
]
)
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
def standard_loss(y_batch, y_pred):
    # unscale y_pred (note that y_batch has never been scaled)
    y_pred_unsc = y_pred * (ymax - ymin) + ymin
    return tf.reduce_mean(tf.square(y_batch - y_pred_unsc))
# training loop
epochs = 1
for epoch in range(epochs):
    print("\nStart of epoch %d" % (epoch,))
    for step, (x_batch, y_batch) in enumerate(ds):
        with tf.GradientTape() as tape:
            # scale data (note that we never leave TensorFlow operations)
            x_scale = (x_batch - xmin) / (xmax - xmin)
            y_pred = model(x_scale, training=True)
            loss_value = standard_loss(y_batch, y_pred)
        grads = tape.gradient(loss_value, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

NN on tensorflow doesn't train

I am trying to train a simple TensorFlow network on a simple model, but for some reason it doesn't learn anything. Am I making a mistake?
X, Y = read_data(file_name)
# CONSTRUCT GRAPH
x_t = tf.placeholder(shape=[None, X.shape[1]], dtype=tf.float32)
y_t = tf.placeholder(shape=[None,], dtype=tf.float32)
hidden_1 = tf.layers.dense(x_t, 50, activation=tf.nn.sigmoid)
hidden_2 = tf.layers.dense(hidden_1, 50, activation=tf.nn.sigmoid)
output = tf.layers.dense(hidden_2, 1, activation=tf.nn.sigmoid)
# DEFINE LOSS AND OPTIMIZER
loss = tf.reduce_mean(tf.square(output - y_t))
GD_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train_step = GD_optimizer.minimize(loss)
# BATCH SIZE
BATCH_SIZE = 20
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(15000):
        rand_indices = np.random.choice(X.shape[0], size=BATCH_SIZE)
        x_batch = X[rand_indices, :]
        y_batch = Y[rand_indices]
        _, temp_loss = sess.run([train_step, loss], feed_dict={x_t: x_batch, y_t: y_batch})
        print(temp_loss)
According to my understanding of your dataset description, the target value column Y is a float (real valued number) that can be in any range and not necessarily within the [0,1] interval.
On the other hand, because you use a sigmoid activation for the last layer of your model, the predicted values will always be in the [0, 1] range.
I would suggest not using a sigmoid activation in the last layer, unless your Y values are also in the [0, 1] range.
So, modify your code so that it becomes:
output = tf.layers.dense(hidden_2, 1, activation=None)

Softmax logistic regression: Different performance by scikit-learn and TensorFlow

I'm trying to learn a simple linear softmax model on some data. LogisticRegression in scikit-learn seems to work fine; now I am trying to port the code to TensorFlow, but I'm not getting the same performance, rather quite a bit worse. I understand that the results will not be exactly equal (scikit-learn has regularization parameters etc.), but this is too far off.
total = pd.read_feather('testfile.feather')
labels = total['labels']
features = total[['f1', 'f2']]
print(labels.shape)
print(features.shape)
classifier = linear_model.LogisticRegression(C=1e5, solver='newton-cg', multi_class='multinomial')
classifier.fit(features, labels)
pred_labels = classifier.predict(features)
print("SCI-KITLEARN RESULTS: ")
print('\tAccuracy:', classifier.score(features, labels))
print('\tPrecision:', precision_score(labels, pred_labels, average='macro'))
print('\tRecall:', recall_score(labels, pred_labels, average='macro'))
print('\tF1:', f1_score(labels, pred_labels, average='macro'))
# now try softmax regression with tensorflow
print("\n\nTENSORFLOW RESULTS: ")
## By default, the OneHotEncoder class will return a more efficient sparse encoding.
## This may not be suitable for some applications, such as use with the Keras deep learning library.
## In this case, we disabled the sparse return type by setting the sparse=False argument.
enc = OneHotEncoder(sparse=False)
enc.fit(labels.values.reshape(len(labels), 1)) # Reshape is required as Encoder expect 2D data as input
labels_one_hot = enc.transform(labels.values.reshape(len(labels), 1))
# tf Graph Input
x = tf.placeholder(tf.float32, [None, 2]) # 2 input features
y = tf.placeholder(tf.float32, [None, 5]) # 5 output classes
# Set model weights
W = tf.Variable(tf.zeros([2, 5]))
b = tf.Variable(tf.zeros([5]))
# Construct model
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax
clas = tf.argmax(pred, axis=1)
# Minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(cost)
# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()
# Start training
with tf.Session() as sess:
    # Run the initializer
    sess.run(init)
    # Training cycle
    for epoch in range(1000):
        # Run optimization op (backprop) and cost op (to get loss value)
        _, c = sess.run([optimizer, cost], feed_dict={x: features, y: labels_one_hot})
    # Test model
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    class_out = clas.eval({x: features})
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print("\tAccuracy:", accuracy.eval({x: features, y: labels_one_hot}))
    print('\tPrecision:', precision_score(labels, class_out, average='macro'))
    print('\tRecall:', recall_score(labels, class_out, average='macro'))
    print('\tF1:', f1_score(labels, class_out, average='macro'))
The output of this code is
(1681,)
(1681, 2)
SCI-KITLEARN RESULTS:
Accuracy: 0.822129684711
Precision: 0.837883361162
Recall: 0.784522522208
F1: 0.806251963817
TENSORFLOW RESULTS:
Accuracy: 0.694825
Precision: 0.735883666192
Recall: 0.649145125846
F1: 0.678045562185
I inspected the result of the one-hot encoding and the data, but I have no idea why the TF result is so much worse.
Any suggestion would be really appreciated.
The problem turned out to be silly: I just needed more epochs and a smaller learning rate (and for efficiency I switched to AdamOptimizer). The results are now equal, although the TF implementation is much slower.
(1681,)
(1681, 2)
SCI-KITLEARN RESULTS:
Accuracy: 0.822129684711
Precision: 0.837883361162
Recall: 0.784522522208
F1: 0.806251963817
TENSORFLOW RESULTS:
Accuracy: 0.82213
Precision: 0.837883361162
Recall: 0.784522522208
F1: 0.806251963817
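For reference, a minimal sketch of the changes described above (the learning rate and epoch count here are illustrative, not the exact values used):
# swap GradientDescentOptimizer for Adam with a smaller learning rate
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
# ...
# and train for more epochs
for epoch in range(20000):
    _, c = sess.run([optimizer, cost], feed_dict={x: features, y: labels_one_hot})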

Tensorflow: Using Batch Normalization gives poor (erratic) validation loss and accuracy

I am trying to use Batch Normalization using tf.layers.batch_normalization() and my code looks like this:
def create_conv_exp_model(fingerprint_input, model_settings, is_training):
    # Dropout placeholder
    if is_training:
        dropout_prob = tf.placeholder(tf.float32, name='dropout_prob')
    # Mode placeholder
    mode_placeholder = tf.placeholder(tf.bool, name="mode_placeholder")
    he_init = tf.contrib.layers.variance_scaling_initializer(mode="FAN_AVG")
    # Input Layer
    input_frequency_size = model_settings['bins']
    input_time_size = model_settings['spectrogram_length']
    net = tf.reshape(fingerprint_input,
                     [-1, input_time_size, input_frequency_size, 1],
                     name="reshape")
    net = tf.layers.batch_normalization(net,
                                        training=mode_placeholder,
                                        name='bn_0')
    for i in range(1, 6):
        net = tf.layers.conv2d(inputs=net,
                               filters=8*(2**i),
                               kernel_size=[5, 5],
                               padding='same',
                               kernel_initializer=he_init,
                               name="conv_%d" % i)
        net = tf.layers.batch_normalization(net,
                                            training=mode_placeholder,
                                            name='bn_%d' % i)
        with tf.name_scope("relu_%d" % i):
            net = tf.nn.relu(net)
        net = tf.layers.max_pooling2d(net, [2, 2], [2, 2], 'SAME',
                                      name="maxpool_%d" % i)
    net_shape = net.get_shape().as_list()
    net_height = net_shape[1]
    net_width = net_shape[2]
    net = tf.layers.conv2d(inputs=net,
                           filters=1024,
                           kernel_size=[net_height, net_width],
                           strides=(net_height, net_width),
                           padding='same',
                           kernel_initializer=he_init,
                           name="conv_f")
    net = tf.layers.batch_normalization(net,
                                        training=mode_placeholder,
                                        name='bn_f')
    with tf.name_scope("relu_f"):
        net = tf.nn.relu(net)
    net = tf.layers.conv2d(inputs=net,
                           filters=model_settings['label_count'],
                           kernel_size=[1, 1],
                           padding='same',
                           kernel_initializer=he_init,
                           name="conv_l")
    ### Squeeze
    squeezed = tf.squeeze(net, axis=[1, 2], name="squeezed")
    if is_training:
        return squeezed, dropout_prob, mode_placeholder
    else:
        return squeezed, mode_placeholder
And my train step looks like this:
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate_input)
    gvs = optimizer.compute_gradients(cross_entropy_mean)
    capped_gvs = [(tf.clip_by_value(grad, -2., 2.), var) for grad, var in gvs]
    train_step = optimizer.apply_gradients(gvs)
During training, I am feeding the graph with:
train_summary, train_accuracy, cross_entropy_value, _, _ = sess.run(
[
merged_summaries, evaluation_step, cross_entropy_mean, train_step,
increment_global_step
],
feed_dict={
fingerprint_input: train_fingerprints,
ground_truth_input: train_ground_truth,
learning_rate_input: learning_rate_value,
dropout_prob: 0.5,
mode_placeholder: True
})
During validation,
validation_summary, validation_accuracy, conf_matrix = sess.run(
[merged_summaries, evaluation_step, confusion_matrix],
feed_dict={
fingerprint_input: validation_fingerprints,
ground_truth_input: validation_ground_truth,
dropout_prob: 1.0,
mode_placeholder: False
})
My loss and accuracy curves (orange is training, blue is validation):
Plot of loss vs number of iterations,
Plot of accuracy vs number of iterations
The validation loss (and accuracy) seem very erratic. Is my implementation of Batch Normalization wrong? Or is this normal with Batch Normalization and I should wait for more iterations?
You need to pass is_training to tf.layers.batch_normalization(..., training=is_training) or it tries to normalize the inference minibatches using the minibatch statistics instead of the training statistics, which is wrong.
There are mainly two things to check.
1. Are you sure that you are using batch normalization (BN) correctly in the train op?
If you read the layer documentation:
Note: when training, the moving_mean and moving_variance need to be updated.
By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they
need to be added as a dependency to the train_op. Also, be sure to add
any batch_normalization ops before getting the update_ops collection.
Otherwise, update_ops will be empty, and training/inference will not work
properly.
For example:
x_norm = tf.layers.batch_normalization(x, training=training)
# ...
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss)
2. Otherwise, try lowering the "momentum" in the BN.
During training, BN keeps two moving averages of the mean and the variance that are supposed to approximate the population statistics. The moving mean and variance are initialized to 0 and 1 respectively; at each step they are multiplied by the momentum value (default 0.99) and the new batch value multiplied by (1 - momentum) is added. At inference (test) time, the normalization uses these statistics. For this reason, it takes a while for these values to arrive at the "real" mean and variance of the data.
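Concretely, lowering the momentum is a one-argument change to each batch_normalization call (0.9 here is just an illustrative value; the default is 0.99):
# moving mean/variance adapt to the data faster with a lower momentum
net = tf.layers.batch_normalization(net,
                                    training=mode_placeholder,
                                    momentum=0.9,
                                    name='bn_%d' % i)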
Source:
https://www.tensorflow.org/api_docs/python/tf/layers/batch_normalization
https://github.com/keras-team/keras/issues/7265
https://github.com/keras-team/keras/issues/3366
The original BN paper can be found here:
https://arxiv.org/abs/1502.03167
I also observed oscillations in validation loss when adding batch norm before ReLU. We found that moving the batch norm after the ReLU resolved the issue.
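In the model above, that would mean swapping the order of the two calls inside the convolution loop, roughly:
# reordered: conv -> relu -> batch norm (instead of conv -> batch norm -> relu)
with tf.name_scope("relu_%d" % i):
    net = tf.nn.relu(net)
net = tf.layers.batch_normalization(net,
                                    training=mode_placeholder,
                                    name='bn_%d' % i)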

Simple model gets 0.0 accuracy

I am training a simple model on a dataset whose labels are all equal to 0, and I am getting 0.0 accuracy.
The model is the following:
import csv
import numpy as np
import pandas as pd
import tensorflow as tf
labelsReader = pd.read_csv('data.csv',usecols = [12],header=None)
dataReader = pd.read_csv('data.csv',usecols = [1,2,3,4,5,6,7,8,9,10,11],header=None)
labels_ = labelsReader.values
data_ = dataReader.values
labels = np.float32(labels_)
data = np.float32(data_)
x = tf.placeholder(tf.float32, [None, 11])
W = tf.Variable(tf.truncated_normal([11, 1], stddev=1./11.))
b = tf.Variable(tf.zeros([1]))
y = tf.matmul(x, W) + b
# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None, 1])
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
for i in range(0, 1000):
    train_step.run(feed_dict={x: data, y_: labels})
correct_prediction = tf.equal(y, y_)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: data, y_: labels}))
And here is the dataset:
444444,0,0,0.9993089149965446,0,0,0.000691085003455425,0,0,0,0,0,0
As the model trains, the predicted y for the data point shown above decreases, reaching -1000 after 1000 iterations.
What could be the cause of the model failing to train?
Your accuracy checks whether the predicted float is exactly equal to the value you expect. With the network you made, this is a very difficult task (although you might have a chance, since you are also overfitting your data).
To get better results:
- Define accuracy as the prediction being higher/lower than a threshold (closer to 1 or closer to 0).
- Normalise your input data; I don't know the range of your input, but 444444 is a ridiculous value to feed in, and it is difficult to train weights that can handle such values. (See the sketch below.)
Also: try to add some sanity checks. For example: what is the output your model is predicting? (y.eval) And what is the cross entropy during training? (sess.run([accuracy, cross_entropy], feed_dict={x: data, y_: labels}))
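A minimal sketch of the first two suggestions (the 0.5 threshold and the min-max scaling are just one reasonable choice, not the only one):
# count a prediction as class 1 if the output is above 0.5, else class 0
predicted_class = tf.cast(y > 0.5, tf.float32)
correct_prediction = tf.equal(predicted_class, y_)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# scale each input column to [0, 1] before feeding it to the network
data = (data - data.min(axis=0)) / (data.max(axis=0) - data.min(axis=0) + 1e-8)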
Good luck!