Tensorflow: simple linear regression

I just started using TensorFlow in Python for optimisation problems, and I gave it a try with a really simple regression model. But the results (both slope and constant) I obtain seem to be quite far off from what I expect. Can anyone point out what I have done wrong? (The code runs, but I am not sure whether I am using TensorFlow properly.)
What I did:
1. Import modules:
import matplotlib.pyplot as plt
import numpy as np
import random as ran
import tensorflow as tf
2. Create data based on a linear structure (y = 3X + 4 + error):
train_X = np.array(range(-20,20,1))
b = 3; c = 4; sd = 0.5;
error = np.random.normal(loc=0.0, scale=sd, size=40);
deterministic = b* train_X + c;
train_Y = np.add(deterministic, error)
3. Set up the optimisation:
X = tf.placeholder(tf.float32,[40])
Y = tf.placeholder(tf.float32,[40])
beta = tf.Variable(np.random.randn(), name="beta")
alpha = tf.Variable(np.random.randn(), name="alpha")
n_samples = 40
learning_rate = 0.01
pred_full = tf.add(tf.scalar_mul(beta, X),alpha)
cost = tf.reduce_mean(tf.pow(tf.subtract(Y, pred_full),2))
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
4. Run it:
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    sess.run(optimizer, feed_dict={X: train_X, Y: train_Y})
    result = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
    result_beta = sess.run(beta, feed_dict={X: train_X, Y: train_Y})
    result_alpha = sess.run(alpha, feed_dict={X: train_X, Y: train_Y})
    print('result:', result, ';', 'result_beta:', result_beta, ';', 'result_alpha:', result_alpha)
The result I obtained is:
result: 1912.99 ; result_beta: 6.75786 ; result_alpha: -0.209623
Obviously beta is supposed to be close to 3 and alpha should be close to 4. I am wondering what went wrong in my code?
Thanks

You have to call the optimizer multiple times to run multiple iterations of gradient descent. As @dv3 noted, try:
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for i in range(50):
        opt, result_alpha, result_beta = sess.run([optimizer, alpha, beta],
                                                  feed_dict={X: train_X, Y: train_Y})
    print('beta =', result_beta, 'alpha =', result_alpha)
NB: it isn't necessary to fetch each tensor value with a separate call to run(); you can pass a single list of values to fetch.
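For example, a rough sketch along the same lines (reusing X, Y, cost, beta, alpha, optimizer and init exactly as defined in the question) that also fetches the cost in the same call, so you can watch it shrink:
with tf.Session() as sess:
    sess.run(init)
    for i in range(50):
        # one run() call executes the training op and fetches every value we want to inspect
        _, c, result_beta, result_alpha = sess.run([optimizer, cost, beta, alpha],
                                                   feed_dict={X: train_X, Y: train_Y})
        if i % 10 == 0:
            print('step', i, 'cost =', c, 'beta =', result_beta, 'alpha =', result_alpha)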

Related

How to switch from GradientDescent Optimizer to Adam in Tensorflow

My code is running perfectly with Gradient Descent, but I want to compare the effectiveness of my algorithm using Adam Optimizer, so I tried to modify the following code:
# Import MNIST data
#import input_data
#mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
#fashion_mnist = input_data.read_data_sets('data/fashion')
import tensorflow as tf
# Set parameters
learning_rate = 0.01 #1e-4
training_iteration = 30
batch_size = 100
display_step = 2
# TF graph input
x = tf.placeholder("float", [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder("float", [None, 10]) # 0-9 digits recognition => 10 classes
#regularizer = tf.reduce_sum(tf.square(y))
# Create a model
# Set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
with tf.name_scope("Wx_b") as scope:
# Construct a linear model
model = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax
# Add summary ops to collect data
w_h = tf.summary.histogram("weights", W)
b_h = tf.summary.histogram("biases", b)
# More name scopes will clean up graph representation
with tf.name_scope("cost_function") as scope:
# Minimize error using cross entropy
# Cross entropy
cost_function = -tf.reduce_sum(y*tf.log(model))
# Create a summary to monitor the cost function
tf.summary.scalar("cost_function", cost_function)
with tf.name_scope("train") as scope:
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost_function)
# Initializing the variables
#init = tf.initialize_all_variables()
init = tf.global_variables_initializer()
# Merge all summaries into a single operator
merged_summary_op = tf.summary.merge_all()
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    summary_writer = tf.summary.FileWriter('/home/raed/Tensorflow/tensorflow_demo', graph_def=sess.graph_def)
    #writer.add_graph(sess.graph_def)
    # Training cycle
    for iteration in range(training_iteration):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Fit training using batch data
            sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})
            # Compute the average loss
            avg_cost += sess.run(cost_function, feed_dict={x: batch_xs, y: batch_ys})/total_batch
            # Write logs for each iteration
            summary_str = sess.run(merged_summary_op, feed_dict={x: batch_xs, y: batch_ys})
            summary_writer.add_summary(summary_str, iteration*total_batch + i)
        # Display logs per iteration step
        if iteration % display_step == 0:
            print("Iteration:" "%04d" % (iteration + 1), "cost=", "{:.9f}".format(avg_cost))
    print("Tuning completed!")
    # Test the model
    predictions = tf.equal(tf.argmax(model, 1), tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(predictions, "float"))
    print("Accuracy:", accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
to use Adam Optimizer I tried to change the following line :
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost_function)
and replace it with the AdamOptimizer :
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost_function)
When I ran the code, it completed a few iterations and then stopped with the following error.
InvalidArgumentError (see above for traceback): Nan in summary histogram for: weights
[[Node: weights = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](weights/tag, Variable/read)]]
Could you please help me understand the problem? Thanks in advance.
The problem is that the weights are initialized to zero with W = tf.Variable(tf.zeros([784, 10])); that's why you get NaN as weights.
You need to initialize them with some initializer, e.g. a normal distribution, as follows:
W = tf.Variable(tf.random_normal([784, 10], stddev=0.35),
                name="weights")

Tensorflow Neural Network for Regression

I am using the TensorFlow library to build a pretty simple two-layer artificial neural network to perform linear regression.
My problem is that the results seem to be far from what I expect. I've been trying to spot my mistake for hours with no luck. I am new to TensorFlow and neural networks, so it could be a trivial mistake. Does anyone have an idea what I am doing wrong?
from __future__ import print_function
import tensorflow as tf
import numpy as np
# Python optimisation variables
learning_rate = 0.02
data_size=100000
data_length=100
train_input=10* np.random.rand(data_size,data_length);
train_label=train_input.sum(axis=1);
train_label=np.reshape(train_label,(data_size,1));
test_input= np.random.rand(data_size,data_length);
test_label=test_input.sum(axis=1);
test_label=np.reshape(test_label,(data_size,1));
x = tf.placeholder(tf.float32, [data_size, data_length])
y = tf.placeholder(tf.float32, [data_size, 1])
W1 = tf.Variable(tf.random_normal([data_length, 1], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random_normal([data_size, 1]), name='b1')
y_ = tf.add(tf.matmul(x, W1), b1)
cost = tf.reduce_mean(tf.square(y-y_))
optimiser = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)
init_op = tf.global_variables_initializer()
correct_prediction = tf.reduce_mean(tf.square(y-y_))
accuracy = tf.cast(correct_prediction, tf.float32)
with tf.Session() as sess:
    sess.run(init_op)
    _, c = sess.run([optimiser, cost],
                    feed_dict={x: train_input, y: train_label})
    k = sess.run(b1)
    print(k)
    print(sess.run(accuracy, feed_dict={x: test_input, y: test_label}))
Thanks for your help!
There are a number of changes you have to make in your code.
First of all, you have to train for a number of epochs and feed the optimizer the training data in batches. Your learning rate was very high. The bias should be a single value per output unit of a dense (fully connected) layer, not one per training sample. You can plot the cost (loss) values to see how your network is converging.
In order to feed data in batches, I have changed the placeholders as well (their first dimension is now None). Check the full modified code:
from __future__ import print_function
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
# Python optimisation variables
learning_rate = 0.001
data_size=1000 # Had to change these values to fit in my memory
data_length=10
train_input=10* np.random.rand(data_size,data_length);
train_label=train_input.sum(axis=1);
train_label=np.reshape(train_label,(data_size,1));
test_input= np.random.rand(data_size,data_length);
test_label=test_input.sum(axis=1);
test_label=np.reshape(test_label,(data_size,1));
tf.reset_default_graph()
x = tf.placeholder(tf.float32, [None, data_length])
y = tf.placeholder(tf.float32, [None, 1])
W1 = tf.Variable(tf.random_normal([data_length, 1], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random_normal([1, 1]), name='b1')
y_ = tf.add(tf.matmul(x, W1), b1)
cost = tf.reduce_mean(tf.square(y-y_))
optimiser=tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)
init_op = tf.global_variables_initializer()
EPOCHS = 500
BATCH_SIZE = 32
with tf.Session() as sess:
    sess.run(init_op)
    loss_history = []
    for epoch_no in range(EPOCHS):
        for offset in range(0, data_size, BATCH_SIZE):
            batch_x = train_input[offset: offset + BATCH_SIZE]
            batch_y = train_label[offset: offset + BATCH_SIZE]
            _, c = sess.run([optimiser, cost],
                            feed_dict={x: batch_x, y: batch_y})
            loss_history.append(c)
    plt.plot(range(len(loss_history)), loss_history)
    plt.show()
    # For running test dataset
    results, test_cost = sess.run([y_, cost], feed_dict={x: test_input, y: test_label})
    print('test cost: {:.3f}'.format(test_cost))
    for t1, t2 in zip(results, test_label):
        print('Prediction: {:.3f}, actual: {:.3f}'.format(t1[0], t2[0]))

SVM on MNIST data with PCA using tensorflow

I intended to learn about PCA using SVD and therefore implemented it and tried to use it on MNIST data.
import numpy as np

class PCA(object):
    def __init__(self, X):
        self.N, self.dim, *rest = X.shape
        self.X = X
        '''
        U S V' = svd(X)
        '''
        X_std = (X - np.mean(X, axis=0)) / (np.std(X, axis=0) + 1e-13)
        [self.U, self.s, self.Vt] = np.linalg.svd(X_std)
        self.V = self.Vt.T
        self.variance_ratio = self.s

    def variance_explained_ratio(self):
        '''
        Returns the cumulative variance captured with each added principal component
        '''
        return np.cumsum(self.variance_ratio) / np.sum(self.variance_ratio)

    def X_projected(self, r):
        '''
        Returns the data X projected along the first r principal components
        '''
        if r is None:
            r = self.dim
        X_proj = np.zeros((r, self.N))
        P_reduce = self.V[:, 0:r]
        X_proj = self.X.dot(P_reduce)
        return X_proj
Now, with this PCA implementation, I tried to apply it to the MNIST data to see the classification performance with and without PCA, using softmax. The code for that is as follows:
import time
import random
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# Using first 10000 images
train_data = mnist.train.images[:10000,:]
train_labels = mnist.train.labels[:10000,:]
pca1 = PCA(train_data)
pca_test = PCA(mnist.test.images)
n_components = 14
X_proj1 = pca1.X_projected(r=n_components)
X_projTest = pca_test.X_projected(r=n_components)
t1 = time.time()
x = tf.placeholder(tf.float32, [None, n_components])
W = tf.Variable(tf.zeros([n_components, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.cast(tf.nn.softmax(tf.matmul(x, W) + b), tf.float32)
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y),
                                              reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.7).minimize(cross_entropy)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
m = 10000
for _ in range(1000):
    indices = random.sample(range(0, m), 100)
    batch_xs = X_proj1[indices]
    batch_ys = train_labels[indices]
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
accuracy = sess.run(accuracy, feed_dict={x: X_projTest,
                                         y_: mnist.test.labels})
print("Accuracy: %f" % accuracy)
sess.close()
t2 = time.time()
print ("Total time taken: %f seconds" % (t2-t1))
The accuracy I obtain using this is only around 19% whereas with the train_data and train_labels, the accuracy is more than 90%. Could someone suggest where I'm going wrong?
When we use PCA or feature scaling, we fit the underlying parameters on the training dataset and then just apply the transform to the test dataset. The test dataset is not used to calculate the key parameters; in this case, the SVD should only be applied to the training dataset.
For example, with sklearn's PCA we use the following code:
from sklearn.decomposition import PCA
pca = PCA(n_components=n_components)  # whatever number of components you want
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)
Note that we fit on the training dataset X_train and only transform X_test.
Similarly, for the above implementation, there is no need to create the pca_test object. Tweak the X_projTest variable to:
X_projTest = mnist.test.images.dot(pca1.V[:,0:n_components])
This should fix the low test accuracy.
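If you prefer to keep everything close to the custom implementation, a small hypothetical helper (my naming, not part of the question's class) can reuse the components fitted on the training data:
def pca_transform(fitted_pca, X_new, r):
    # project new data onto the first r principal components
    # learned from the training data (fitted_pca.V)
    return X_new.dot(fitted_pca.V[:, 0:r])

X_projTest = pca_transform(pca1, mnist.test.images, r=n_components)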

TensorFlow Linear Regression gives 'NaN' result

I am currently running a linear regression model in TensorFlow. However, I don't understand why, even when I decrease the learning_rate from 0.01 to 0.001 and increase the training iterations from 1000 to 50000, I still obtain 'nan' results for the cost function as well as for the two coefficients. Could anyone please help me spot the problem in the following code?
from __future__ import print_function
import tensorflow as tf
import numpy
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
import random
rng = numpy.random
# Parameters
learning_rate = 0.001
training_epochs = 20000 #number of iterations
display_step = 400
#read csv file
datapath = [directory path]
Ha_Noi = pd.read_csv(datapath+"HaNoi_1month_LW_WeatherTest.csv")
#Add an additional column into the table
sLength = len(Ha_Noi['accept_rate'])
Ha_Noi['accept_rate_timeT'] = pd.Series(Ha_Noi['accept_rate'], index=Ha_Noi.index)
#Shift the entries in the accept_rate column upward
Ha_Noi.accept_rate = Ha_Noi.accept_rate.shift(-1)
Ha_Noi = Ha_Noi.dropna(subset = ["longwait_percent4"])
Ha_Noi = Ha_Noi.dropna(subset=["accept_rate"])
Ha_Noi = Ha_Noi.dropna(subset = ["longwait_percent2"])
df2 = pd.DataFrame(Ha_Noi)
#split the dataset into training and testing sets
train_set, test_set = train_test_split(Ha_Noi, test_size=0.2, random_state = random.randint(20, 200))
Xtrain = train_set['longwait_percent2'].reshape(-1,1)
Ytrain = train_set['accept_rate'].reshape(-1,1)
Xtrain2 = train_set['Weather Weight_Longwait_percent2'].reshape(-1,1)
Xtest2 = test_set['Weather Weight_Longwait_percent2'].reshape(-1,1)
# Xtest = test_set['longwait_percent2'].reshape(-1,1)
# Ytest = test_set['accept_rate'].reshape(-1,1)
# Training Data
train_X = Xtrain
train_Y = Ytrain
n_samples = train_X.shape[0]
#Testing Data
Xtest = numpy.asarray(test_set['longwait_percent2'])
Ytest = numpy.asarray(test_set['accept_rate'])
# tf Graph Input
X = tf.placeholder("float")
Y = tf.placeholder("float")
# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")
# Construct a linear model
pred = tf.add(tf.multiply(X, W), b)
# Mean squared error
cost = tf.sqrt(tf.reduce_sum(tf.pow(pred-Y, 2))/(n_samples))
# Gradient descent method
# Note, minimize() knows to modify W and b because Variable objects are "trained" (trainable=True by default)
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
# Initializing the variables
init = tf.global_variables_initializer()
saver = tf.train.Saver() #save all the initialized data
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})
        # Display logs per epoch step
        if (epoch+1) % display_step == 0: # checkpoint every display_step epochs
            c = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c),
                  "W=", sess.run(W), "b=", sess.run(b))
    print("Optimization Finished!")
    training_cost = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')
    # Graphic display
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()
    testing_cost = sess.run(
        tf.reduce_sum(tf.pow(pred - Y, 2)) / (Xtest.shape[0]),
        feed_dict={X: Xtest, Y: Ytest})  # the training cost above, but without the square root
    print("Root Mean Square Error =", numpy.sqrt(testing_cost))
    print("Absolute mean square loss difference:", abs(
        training_cost - testing_cost))
    plt.plot(Xtest, Ytest, 'bo', label='Testing data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()
I don't have your data, so it's hard to tell whether the problem is caused by the data or by the training setup. You can make the learning rate and the number of training iterations much smaller, e.g. 0.00005 and 100, to see whether you still get NaN.
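If you want to narrow down where the NaN first appears, here is a rough sketch of a check you could drop into the training loop (it reuses the graph and data variables from the question, and numpy, which is already imported there):
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        for (x_val, y_val) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x_val, Y: y_val})
        c = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
        if not numpy.isfinite(c):
            # report the first epoch at which the cost blew up, then stop
            print("cost became non-finite at epoch", epoch,
                  "W =", sess.run(W), "b =", sess.run(b))
            break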

Simple model gets 0.0 accuracy

I am training a simple model on a dataset containing labels always equal to 0, and am getting a 0.0 accuracy.
The model is the following:
import csv
import numpy as np
import pandas as pd
import tensorflow as tf
labelsReader = pd.read_csv('data.csv',usecols = [12],header=None)
dataReader = pd.read_csv('data.csv',usecols = [1,2,3,4,5,6,7,8,9,10,11],header=None)
labels_ = labelsReader.values
data_ = dataReader.values
labels = np.float32(labels_)
data = np.float32(data_)
x = tf.placeholder(tf.float32, [None, 11])
W = tf.Variable(tf.truncated_normal([11, 1], stddev=1./11.))
b = tf.Variable(tf.zeros([1]))
y = tf.matmul(x, W) + b
# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None, 1])
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
for i in range(0, 1000):
    train_step.run(feed_dict={x: data, y_: labels})
correct_prediction = tf.equal(y, y_)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: data, y_: labels}))
And here is the dataset:
444444,0,0,0.9993089149965446,0,0,0.000691085003455425,0,0,0,0,0,0
As the model trains, y for the data row shown above decreases and reaches -1000 after 1000 iterations.
What could be the cause of the failure to train the model?
Your accuracy checks whether the predicted float is exactly equal to the value you expect. With the network you made, this is a very difficult task (although you might have a chance, since you are also overfitting your data).
To get better results:
- Define accuracy as the prediction being above/below a threshold (closer to 1 or closer to 0); see the sketch below.
- Normalise your input data. I don't know the range of your input, but 444444 is a ridiculous value to use as input, and it is difficult to train weights that can handle values like that.
Also: try to add some sanity checks. For example: what output is your model predicting? (y.eval) And what is the cross entropy while training your network? (sess.run([accuracy, cross_entropy], feed_dict={x: data, y_: labels}))
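As a rough sketch of the first two suggestions (the 0.5 tolerance and the per-column standardisation are arbitrary choices of mine, reusing the tensors and arrays from the question):
# accuracy as "prediction within a tolerance of the label" instead of exact equality
close_enough = tf.less(tf.abs(y - y_), 0.5)
tolerant_accuracy = tf.reduce_mean(tf.cast(close_enough, tf.float32))

# normalise the inputs column-wise before feeding them
data_norm = (data - data.mean(axis=0)) / (data.std(axis=0) + 1e-8)

for i in range(1000):
    train_step.run(feed_dict={x: data_norm, y_: labels})
print(sess.run(tolerant_accuracy, feed_dict={x: data_norm, y_: labels}))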
Good luck!