Tensorflow Neural Network for Regression - tensorflow

I am using tensor flow library to build a pretty simple 2 layer artificial neural network to perform linear regression.
My problem is that the results seem to be far from expected. I've been trying to spot my mistake for hours but no hope. I am new to tensor flow and neural networks so it could be a trivial mistake. Could anyone have an idea what i am doing wrong?
from __future__ import print_function
import tensorflow as tf
import numpy as np
# Python optimisation variables
learning_rate = 0.02
data_size=100000
data_length=100
train_input=10* np.random.rand(data_size,data_length);
train_label=train_input.sum(axis=1);
train_label=np.reshape(train_label,(data_size,1));
test_input= np.random.rand(data_size,data_length);
test_label=test_input.sum(axis=1);
test_label=np.reshape(test_label,(data_size,1));
x = tf.placeholder(tf.float32, [data_size, data_length])
y = tf.placeholder(tf.float32, [data_size, 1])
W1 = tf.Variable(tf.random_normal([data_length, 1], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random_normal([data_size, 1]), name='b1')
y_ = tf.add(tf.matmul(x, W1), b1)
cost = tf.reduce_mean(tf.square(y-y_))
optimiser=tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
.minimize(cost)
init_op = tf.global_variables_initializer()
correct_prediction = tf.reduce_mean(tf.square(y-y_))
accuracy = tf.cast(correct_prediction, tf.float32)
with tf.Session() as sess:
sess.run(init_op)
_, c = sess.run([optimiser, cost],
feed_dict={x:train_input , y:train_label})
k=sess.run(b1)
print(k)
print(sess.run(accuracy, feed_dict={x: test_input, y: test_label}))
Thanks for your help!

There are a number of changes you have to make in your code.
First of all, you have to perform training for number of epochs and also feed the optimizer training data in batches. Your learning rate was very high. Bias is supposed to be only one input for every dense (fully connected) layer. You can plot the cost (loss) value to see how your network is converging.
In order to feed data in batches, I have made the changes in placeholders also. Check the full modified code:
from __future__ import print_function
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
# Python optimisation variables
learning_rate = 0.001
data_size=1000 # Had to change these value to fit in my memory
data_length=10
train_input=10* np.random.rand(data_size,data_length);
train_label=train_input.sum(axis=1);
train_label=np.reshape(train_label,(data_size,1));
test_input= np.random.rand(data_size,data_length);
test_label=test_input.sum(axis=1);
test_label=np.reshape(test_label,(data_size,1));
tf.reset_default_graph()
x = tf.placeholder(tf.float32, [None, data_length])
y = tf.placeholder(tf.float32, [None, 1])
W1 = tf.Variable(tf.random_normal([data_length, 1], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random_normal([1, 1]), name='b1')
y_ = tf.add(tf.matmul(x, W1), b1)
cost = tf.reduce_mean(tf.square(y-y_))
optimiser=tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)
init_op = tf.global_variables_initializer()
EPOCHS = 500
BATCH_SIZE = 32
with tf.Session() as sess:
sess.run(init_op)
loss_history = []
for epoch_no in range(EPOCHS):
for offset in range(0, data_size, BATCH_SIZE):
batch_x = train_input[offset: offset + BATCH_SIZE]
batch_y = train_label[offset: offset + BATCH_SIZE]
_, c = sess.run([optimiser, cost],
feed_dict={x:batch_x , y:batch_y})
loss_history.append(c)
plt.plot(range(len(loss_history)), loss_history)
plt.show()
# For running test dataset
results, test_cost = sess.run([y_, cost], feed_dict={x: test_input, y: test_label})
print('test cost: {:.3f}'.format(test_cost))
for t1, t2 in zip(results, test_label):
print('Prediction: {:.3f}, actual: {:.3f}'.format(t1[0], t2[0]))

Related

When restoring a network, an operation cannot be found in the restored graph

Using TensorFlow 1.9, I want to train a neural network in one Python file, and then restore the network using a different Python file. I have tried to do this using a simple example, but when I try to load my "prediction" operation, I receive an error. Specifically, the error is: KeyError: "The name 'prediction' refers to an Operation not in the graph.".
Below is my Python file to train and save the network. It generates some example data and trains a simple neural network, then saves the network every epoch.
import numpy as np
import tensorflow as tf
input_data = np.zeros([100, 10])
label_data = np.zeros([100, 1])
for i in range(100):
for j in range(10):
input_data[i, j] = i * j / 1000
label_data[i] = 2 * input_data[i, 0] + np.random.uniform(0.01)
input_placeholder = tf.placeholder(tf.float32, shape=[None, 10], name='input_placeholder')
label_placeholder = tf.placeholder(tf.float32, shape=[None, 1], name='label_placeholder')
x = tf.layers.dense(inputs=input_placeholder, units=10, activation=tf.nn.relu)
x = tf.layers.dense(inputs=x, units=10, activation=tf.nn.relu)
prediction = tf.layers.dense(inputs=x, units=1, name='prediction')
loss_op = tf.reduce_mean(tf.square(prediction - label_placeholder))
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss_op)
saver = tf.train.Saver()
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch_num in range(100):
_, loss = sess.run([train_op, loss_op], feed_dict={input_placeholder: input_data, label_placeholder: label_data})
print('epoch ' + str(epoch_num) + ', loss = ' + str(loss))
saver.save(sess, '../Models/model', global_step=epoch_num + 1)
And below is my Python file to restore the network. It loads the input and output placeholders, together with the operation required for making predictions. However, even though I have named an operation as prediction in the training code above, the code below cannot seem to find this operation in the loaded graph.
import tensorflow as tf
import numpy as np
input_data = np.zeros([100, 10])
for i in range(100):
for j in range(10):
input_data[i, j] = i * j / 1000
with tf.Session() as sess:
saver = tf.train.import_meta_graph('../Models/model-99.meta')
saver.restore(sess, '../Models/model-99')
graph = tf.get_default_graph()
input_placeholder = graph.get_tensor_by_name('input_placeholder:0')
label_placeholder = graph.get_tensor_by_name('label_placeholder:0')
prediction = graph.get_operation_by_name('prediction')
pred = sess.run([prediction], feed_dict={input_placeholder: input_data})
Why can this code not find this operation, and what should I do to correct my code?
You have to modify a single line in your loading script (tested with tf 1.8):
prediction = graph.get_tensor_by_name('prediction/BiasAdd:0')
You have to specify which tensor you want to access, as prediction is only the namespace for the dense layer. You can check the exact name during saving with prediction.name. And when restoring, use tf.get_tensor_by_name as you are interested in the value, not the operation producing it.

SVM on MNIST data with PCA using tensorflow

I intended to learn about PCA using SVD and therefore implemented it and tried to use it on MNIST data.
import numpy as np
class PCA(object):
def __init__ (self, X):
self.N, self.dim, *rest = X.shape
self.X = X
'''
U S V' = svd(X)
'''
X_std = (X - np.mean(X, axis=0))/(np.std(X, axis=0)+1e-13)
[self.U, self.s, self.Vt] = np.linalg.svd(X_std)
self.V = self.Vt.T
self.variance_ratio = self.s
def variance_explained_ratio (self):
'''
Returns the cumulative variance captured with each added principal component
'''
return np.cumsum(self.variance_ratio)/np.sum(self.variance_ratio)
def X_projected (self, r):
'''
Returns the data X projected along the first r principal components
'''
if r is None:
r = self.dim
X_proj = np.zeros((r, self.N))
P_reduce = self.V[:,0:r]
X_proj = self.X.dot(P_reduce)
return X_proj
Now with this implementation for PCA, I tried to apply it to MNIST data to see the performance with and without PCA for classification using softmax. The code for that is as follows:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# Using first 10000 images
train_data = mnist.train.images[:10000,:]
train_labels = mnist.train.labels[:10000,:]
pca1 = PCA(train_data)
pca_test = PCA(mnist.test.images)
n_components = 14
X_proj1 = pca1.X_projected(r=n_components)
X_projTest = pca_test.X_projected(r=n_components)
t1 = time.time()
x = tf.placeholder(tf.float32, [None, n_components])
W = tf.Variable(tf.zeros([n_components, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.cast(tf.nn.softmax(tf.matmul(x, W) + b), tf.float32)
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_*tf.log(y),
reduction_indices=[1]))
train_step =
tf.train.GradientDescentOptimizer(0.7).minimize(cross_entropy)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
m = 10000
for _ in range(1000):
indices = random.sample(range(0, m), 100)
batch_xs = X_proj1[indices]
batch_ys = train_labels[indices]
sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
accuracy = sess.run(accuracy, feed_dict={x: X_projTest, y_:
mnist.test.labels})
print("Accuracy: %f" % accuracy)
sess.close()
t2 = time.time()
print ("Total time taken: %f seconds" % (t2-t1))
The accuracy I obtain using this is only around 19% whereas with the train_data and train_labels, the accuracy is more than 90%. Could someone suggest where I'm going wrong?
When we use PCA or feature scaling, we set the underlying parameters on the training dataset and then just apply/transform it on the test dataset. The test dataset is not used to calculate the key parameters, or in this case, SVD should only be applied on the train dataset.
e.g. in sklearn's PCA, we use the following code :
from sklearn.decomposition import PCA
pca = PCA(n_components = 'whatever number you want')
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)
Note, that we fit on the training dataset, X_train and transform on X_test.
Similarly, for the above implementation, there's no need to create the pca_test object. Tweak the X_projTest variable to :
X_projTest = mnist.test.images.dot(pca1.V[:,0:n_components])
This should solve for the low test accuracy.

TensorFlow: Why do parameters not update when GradientDescentOptimizer train step is run?

When I run the following code, it prints a constant loss at every training step; I also tried printing the parameters, which also do not change.
I can't seem to figure out why train_step, which uses a GradientDescentOptimizer, doesnt change the weights in W_fc1, b_fc1, W_fc2, and b_fc2.
I'm a beginner to machine learning so I might be missing something obvious.
(An answer for a similar question was that weights should not be initialized at zero, but the weights here are initialized with truncated normal so that cant be the problem).
import tensorflow as tf
import numpy as np
import csv
import random
with open('wine_data.csv', 'rb') as csvfile:
input_arr = list(csv.reader(csvfile, delimiter=','))
for i in range(len(input_arr)):
input_arr[i][0] = int(input_arr[i][0]) - 1 # 0 index for one hot
for j in range(1, len(input_arr[i])):
input_arr[i][j] = float(input_arr[i][j])
random.shuffle(input_arr)
training_data = np.array(input_arr[:2*len(input_arr)/3]) # train on first two thirds of data
testing_data = np.array(input_arr[2*len(input_arr)/3:]) # test on last third of data
x_train = training_data[0:, 1:]
y_train = training_data[0:, 0]
x_test = testing_data[0:, 1:]
y_test = testing_data[0:, 0]
def weight_variable(shape):
initial = tf.truncated_normal(shape, stddev=0.1)
return tf.Variable(initial)
def bias_variable(shape):
initial = tf.constant(0.1, shape=shape)
return tf.Variable(initial)
x = tf.placeholder(tf.float32, shape=[None, 13], name='x')
y_ = tf.placeholder(tf.float32, shape=[None], name='y_')
y_one_hot = tf.one_hot(tf.cast(y_, tf.int32), 3) # actual y values
W_fc1 = weight_variable([13, 128])
b_fc1 = bias_variable([128])
fc1 = tf.matmul(x, W_fc1)+b_fc1
W_fc2 = weight_variable([128, 3])
b_fc2 = bias_variable([3])
y = tf.nn.softmax(tf.matmul(fc1, W_fc2)+b_fc2)
cross_entropy = tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=y_one_hot, logits=y))
train_step = tf.train.GradientDescentOptimizer(1e-17).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_one_hot,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
for _ in range(1000):
train_step.run(feed_dict={x: x_train, y_: y_train})
if _%10 == 0:
loss = cross_entropy.eval(feed_dict={x: x_train, y_: y_train})
print('step', _, 'loss', loss)
Thanks in advance.
From the official tensorflow documentation:
WARNING: This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.
Remove the softmax on y before feeding it into tf.nn.softmax_cross_entropy_with_logits
Also set your learning rate to something higher (like 3e-4)

Tensorflow loss minimization is increasing loss

I implemented the linear regression model shown on Tensorflow's main page: https://www.tensorflow.org/get_started/get_started
import numpy as np
import tensorflow as tf
# Model parameters
W = tf.Variable([.3], tf.float32)
b = tf.Variable([-.3], tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W * x + b
y = tf.placeholder(tf.float32)
# loss
loss = tf.reduce_sum(tf.square(linear_model - y)) # sum of the squares
# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
# training data
x_train = [1,2,3,4]
y_train = [0,-1,-2,-3]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init) # reset values to wrong
for i in range(1000):
sess.run(train, {x:x_train, y:y_train})
# evaluate training accuracy
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x:x_train, y:y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))
However, when I change the training data to x_train=[2,4,6,8] and y_train=[3,4,5,6],
the loss starts to increase over time until it reaches 'nan'
As suggested by Steven, you should probably use reduce_mean(), which seems to fix the problem of the increasing loss function. Note that I also increased the number of training steps since reduce_mean() appears to need a bit longer to converge. Be careful with increasing the learning rate, since this may reproduce the problem. Instead, if training time is not a critical factor, you might want to decrease the learning rate and increase the number of training iterations further.
With the reduce_sum() function it worked well for me after decreasing the learning rate from 0.01 to 0.001. Again, thanks to Steven for the suggestion.
import numpy as np
import tensorflow as tf
# Model parameters
W = tf.Variable([.3], tf.float32)
b = tf.Variable([-.3], tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W * x + b
y = tf.placeholder(tf.float32)
# loss
loss = tf.reduce_mean(tf.square(linear_model - y)) # sum of the squares
# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
# training data
x_train = [2,4,6,8]
y_train = [0,3,4,5]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init) # reset values to wrong
for i in range(5000):
sess.run(train, {x:x_train, y:y_train})
# evaluate training accuracy
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x:x_train, y:y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))

Simple model gets 0.0 accuracy

I am training a simple model on a dataset containing labels always equal to 0, and am getting a 0.0 accuracy.
The model is the following:
import csv
import numpy as np
import pandas as pd
import tensorflow as tf
labelsReader = pd.read_csv('data.csv',usecols = [12],header=None)
dataReader = pd.read_csv('data.csv',usecols = [1,2,3,4,5,6,7,8,9,10,11],header=None)
labels_ = labelsReader.values
data_ = dataReader.values
labels = np.float32(labels_)
data = np.float32(data_)
x = tf.placeholder(tf.float32, [None, 11])
W = tf.Variable(tf.truncated_normal([11, 1], stddev=1./11.))
b = tf.Variable(tf.zeros([1]))
y = tf.matmul(x, W) + b
# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None, 1])
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
for i in range(0, 1000):
train_step.run(feed_dict={x: data, y_: labels})
correct_prediction = tf.equal(y, y_)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: data, y_: labels}))
And here is the dataset:
444444,0,0,0.9993089149965446,0,0,0.000691085003455425,0,0,0,0,0,0
As the model trains, y of the data shown above decreases, and reaches -1000 after 1000 iterations.
What could be the cause of the failure to train the model ?
Your accuracy checks if the predicted float is exactly equal to the value you expect. With the network you made this is a very difficult task (although you might have a chance as you are also overfitting your data).
To get better results:
- Define accuracy to be higher/lower than a value (closer to 1 or closer to 0).
- Normalise your input data, I don't know the range of your input, but 444444 is a rediculous value to use as input, and it is difficult to train weights that can handle these values.
Also: try to add some sanity checks. For example: what is the output your model is predicting? (y.eval) And what is the cross entropy you have during training your network? (sess.run([accuracy,cross_entropy], feed_dict={x: data, y_: labels})
Good luck!