Error with TensorFlow 1.0 MNIST code

I am learning TensorFlow 1.0 with Python 3.5.2. I tried the following code, found on GitHub, but I am getting the error No module named 'tensorflowvisu'. If I remove the line import tensorflowvisu, I instead get
I = tensorflowvisu.tf_format_mnist_images(X, Ypred, Y_) # assembles 10x10 images by default
NameError: name 'tensorflowvisu' is not defined
What should I do to get this code to work? Does anyone have working MNIST code for TensorFlow 1.0 and Python 3.5 that I can follow to learn from? Any response is appreciated.
https://github.com/martin-gorner/tensorflow-mnist-tutorial/blob/master/mnist_1.0_softmax.py
import tensorflow as tf
import tensorflowvisu
from tensorflow.contrib.learn.python.learn.datasets.mnist import read_data_sets
tf.set_random_seed(0)
# neural network with 1 layer of 10 softmax neurons
#
# · · · · · · · · · · (input data, flattened pixels) X [batch, 784] # 784 = 28 * 28
# \x/x\x/x\x/x\x/x\x/ -- fully connected layer (softmax) W [784, 10] b[10]
# · · · · · · · · Y [batch, 10]
# The model is:
#
# Y = softmax( X * W + b)
# X: matrix for 100 grayscale images of 28x28 pixels, flattened (there are 100 images in a mini-batch)
# W: weight matrix with 784 lines and 10 columns
# b: bias vector with 10 dimensions
# +: add with broadcasting: adds the vector to each line of the matrix (numpy)
# softmax(matrix) applies softmax on each line
# softmax(line) applies an exp to each value then divides by the norm of the resulting line
# Y: output matrix with 100 lines and 10 columns
# Download images and labels into mnist.test (10K images+labels) and mnist.train (60K images+labels)
mnist = read_data_sets("data", one_hot=True, reshape=False, validation_size=0)
# input X: 28x28 grayscale images, the first dimension (None) will index the images in the mini-batch
X = tf.placeholder(tf.float32, [None, 28, 28, 1])
# correct answers will go here
Y_ = tf.placeholder(tf.float32, [None, 10])
# weights W[784, 10] 784=28*28
W = tf.Variable(tf.zeros([784, 10]))
# biases b[10]
b = tf.Variable(tf.zeros([10]))
# flatten the images into a single line of pixels
# -1 in the shape definition means "the only possible dimension that will preserve the number of elements"
XX = tf.reshape(X, [-1, 784])
# The model
Y = tf.nn.softmax(tf.matmul(XX, W) + b)
# loss function: cross-entropy = - sum( Y_i * log(Yi) )
# Y: the computed output vector
# Y_: the desired output vector
# cross-entropy
# log takes the log of each element, * multiplies the tensors element by element
# reduce_mean will add all the components in the tensor
# so here we end up with the total cross-entropy for all images in the batch
cross_entropy = -tf.reduce_mean(Y_ * tf.log(Y)) * 1000.0 # normalized for batches of 100 images,
# *10 because "mean" included an unwanted division by 10
# accuracy of the trained model, between 0 (worst) and 1 (best)
correct_prediction = tf.equal(tf.argmax(Y, 1), tf.argmax(Y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# training, learning rate = 0.005
train_step = tf.train.GradientDescentOptimizer(0.005).minimize(cross_entropy)
# matplotlib visualisation
allweights = tf.reshape(W, [-1])
allbiases = tf.reshape(b, [-1])
I = tensorflowvisu.tf_format_mnist_images(X, Y, Y_) # assembles 10x10 images by default
It = tensorflowvisu.tf_format_mnist_images(X, Y, Y_, 1000, lines=25) # 1000 images on 25 lines
datavis = tensorflowvisu.MnistDataVis()
# init
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
# You can call this function in a loop to train the model, 100 images at a time
def training_step(i, update_test_data, update_train_data):
    # training on batches of 100 images with 100 labels
    batch_X, batch_Y = mnist.train.next_batch(100)

    # compute training values for visualisation
    if update_train_data:
        a, c, im, w, b = sess.run([accuracy, cross_entropy, I, allweights, allbiases], feed_dict={X: batch_X, Y_: batch_Y})
        datavis.append_training_curves_data(i, a, c)
        datavis.append_data_histograms(i, w, b)
        datavis.update_image1(im)
        print(str(i) + ": accuracy:" + str(a) + " loss: " + str(c))

    # compute test values for visualisation
    if update_test_data:
        a, c, im = sess.run([accuracy, cross_entropy, It], feed_dict={X: mnist.test.images, Y_: mnist.test.labels})
        datavis.append_test_curves_data(i, a, c)
        datavis.update_image2(im)
        print(str(i) + ": ********* epoch " + str(i*100//mnist.train.images.shape[0]+1) + " ********* test accuracy:" + str(a) + " test loss: " + str(c))

    # the backpropagation training step
    sess.run(train_step, feed_dict={X: batch_X, Y_: batch_Y})
datavis.animate(training_step, iterations=2000+1, train_data_update_freq=10, test_data_update_freq=50, more_tests_at_start=True)
# to save the animation as a movie, add save_movie=True as an argument to datavis.animate
# to disable the visualisation use the following line instead of the datavis.animate line
# for i in range(2000+1): training_step(i, i % 50 == 0, i % 10 == 0)
print("max test accuracy: " + str(datavis.get_max_test_accuracy()))
# final max test accuracy = 0.9268 (10K iterations). Accuracy should peak above 0.92 in the first 2000 iterations.

I had the same issue. The solution is to run the code from within the folder where all of the code resides. Don't just copy the mnist_1.0_softmax.py code into your IDE and run it; download or clone the entire repo from the link below:
https://github.com/martin-gorner/tensorflow-mnist-tutorial.git
Once cloned, you will see that the folder contains a file called tensorflowvisu.py. So this is not a module that you install from conda or pip; it is just a file that the author uses as a module in this particular case. Go to the directory where all of this code sits in a command line and from there run
python mnist_1.0_softmax.py
Now it should work. You should see a pop-up window with 6 charts that are updated in real time.
If you want to run it from your IDE, open your IDE (in my case Atom), go to File > Open Folder > click OK > choose the file mnist_1.0_softmax.py and press Ctrl+Shift+B. The same pop-up window should appear.
The most important thing is to open the file from within the directory provided by the author.
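If you prefer to run the script from somewhere else (for example from inside an IDE project), another option is to put the cloned repo on Python's module search path before the import. A minimal sketch, assuming the repo was cloned to ~/tensorflow-mnist-tutorial (that path is an assumption, adjust it to wherever you cloned it):
import os
import sys

# hypothetical location of the cloned repo; change it to your own path
REPO_DIR = os.path.expanduser("~/tensorflow-mnist-tutorial")
sys.path.insert(0, REPO_DIR)  # makes tensorflowvisu.py importable

import tensorflowvisu  # now resolves to the file shipped with the tutorial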

Try installing matplotlib from the Anaconda prompt using:
pip install --upgrade matplotlib

Related

Problem with tensorflow initialization when it gets encapsulated

I am encapsulating an autoencoder cost calculation in order to allow it to be used with swarm algorithms. The goal is to get a cost summary of the autoencoder by passing in a few parameters, so the method creates a model, trains it and returns its cost tensor:
def getAECost(dfnormalized, adamParam, iterations):
    N_VISIBLE = 31
    N_HIDDEN = 20
    DEVICE = '/gpu:0'  # or '/cpu:0'
    ITERATIONS = 1 + iterations
    with tf.device(DEVICE):
        # create node for the input data (None rows and N_VISIBLE columns)
        X = tf.placeholder("float", [None, N_VISIBLE], name='X')
        # create nodes for hidden variables
        W_init_max = 4 * np.sqrt(6. / (N_VISIBLE + N_HIDDEN))
        W_init = tf.random_uniform(shape=[N_VISIBLE, N_HIDDEN])#,
                                   # minval=-W_init_max,
                                   # maxval=W_init_max)
        # initialize our weights and bias (W has shape [N_VISIBLE, N_HIDDEN])
        W = tf.Variable(W_init, name='W')
        # initialize only the bias of the hidden layer
        b = tf.Variable(tf.zeros([N_HIDDEN]), name='b')
        # W_prime has shape [N_HIDDEN, N_VISIBLE]
        W_prime = tf.transpose(W)  # tied weights between encoder and decoder
        b_prime = tf.Variable(tf.zeros([N_VISIBLE]), name='b_prime')

        # model that takes our parameters (behaviour of the neural network)
        def model(X, W, b, W_prime, b_prime):
            tilde_X = X
            # encode
            Y = tf.nn.sigmoid(tf.matmul(tilde_X, W) + b)  # hidden state
            # reconstruct the input
            Z = tf.nn.sigmoid(tf.matmul(Y, W_prime) + b_prime)  # reconstructed input
            return Z

        # build the model graph
        pred = model(X, W, b, W_prime, b_prime)
        # cost function: sum of squared errors
        cost = tf.reduce_sum(tf.pow(X - pred, 2))  # minimize squared error
        # placeholder for the learning rate
        learning = tf.placeholder("float", name='learning')
        train_op = tf.train.AdamOptimizer(learning).minimize(cost)  # construct an optimizer

    with tf.Session() as sess:
        # you need to initialize all variables
        tf.global_variables_initializer()
        RATIO = adamParam
        for i in range(ITERATIONS):
            # prepare the input (minibatch) to feed the autoencoder
            input_ = dfnormalized
            # train the autoencoder
            sess.run(train_op, feed_dict={X: input_, learning: RATIO})
            # after the last epoch, evaluate and return the cost
            if i == ITERATIONS - 1:
                costAE = sess.run(cost, feed_dict={X: input_})
                return costAE
It worked a few days ago (maybe I had another session in the background), with the method returning a float, but nowadays it is not working and I get the initialization error
FailedPreconditionError: Attempting to use uninitialized value W
[[{{node W/read}}]]
in the training step
sess.run(train_op, feed_dict={X: input_, learning: RATIO})
Any advice on how this initialization problem can be solved, or on how I can encapsulate a TensorFlow model and session?
Thanks
You have to actually run the variables initializer: tf.global_variables_initializer() returns an op to be executed; it does not run the initialization for you. So the solution to your problem should be replacing the line
tf.global_variables_initializer()
with
sess.run(tf.global_variables_initializer())
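For context, here is a minimal sketch of the pattern (the variable names are illustrative, not taken from the code above): every tf.Variable has to be initialized by running the init op inside the session before any op that reads it is executed.
import tensorflow as tf

v = tf.Variable(tf.zeros([3]), name='v')  # a variable in the graph, not yet initialized
doubled = v * 2.0                         # an op that reads v

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # actually runs the initializer op
    print(sess.run(doubled))                     # safe now: prints [0. 0. 0.]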
I tried what @Addy said and restructured the code to make it more legible, and now it works perfectly:
class Model:
    N_VISIBLE = 31
    N_HIDDEN = 20
    DEVICE = '/gpu:0'  # or '/cpu:0'

    with tf.device(DEVICE):
        # create node for the input data (None rows and N_VISIBLE columns)
        X = tf.placeholder("float", [None, N_VISIBLE], name='X')
        # create nodes for hidden variables
        W_init_max = 4 * np.sqrt(6. / (N_VISIBLE + N_HIDDEN))
        W_init = tf.random_uniform(shape=[N_VISIBLE, N_HIDDEN])#,
                                   # minval=-W_init_max,
                                   # maxval=W_init_max)
        # initialize our weights and bias (W has shape [N_VISIBLE, N_HIDDEN])
        W = tf.Variable(W_init, name='W')
        # initialize only the bias of the hidden layer
        b = tf.Variable(tf.zeros([N_HIDDEN]), name='b')
        # W_prime has shape [N_HIDDEN, N_VISIBLE]
        W_prime = tf.transpose(W)  # tied weights between encoder and decoder
        b_prime = tf.Variable(tf.zeros([N_VISIBLE]), name='b_prime')

        # model that takes our parameters (behaviour of the neural network)
        def model(X, W, b, W_prime, b_prime):
            tilde_X = X
            # encode
            Y = tf.nn.sigmoid(tf.matmul(tilde_X, W) + b)  # hidden state
            # reconstruct the input
            Z = tf.nn.sigmoid(tf.matmul(Y, W_prime) + b_prime)  # reconstructed input
            return Z

        # build the model graph
        pred = model(X, W, b, W_prime, b_prime)
        # cost function: sum of squared errors
        cost = tf.reduce_sum(tf.pow(X - pred, 2))  # minimize squared error
        # placeholder for the learning rate
        learning = tf.placeholder("float", name='learning')
        train_op = tf.train.AdamOptimizer(learning).minimize(cost)  # construct an optimizer

    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())

    def train(self, data, adamParam, iterations):
        input_ = data
        RATIO = adamParam
        for i in range(iterations):
            # train the autoencoder
            _ = self.sess.run(self.train_op, feed_dict={self.X: input_, self.learning: RATIO})
        # print("Model trained")

    def getAECost(self, data):
        input_ = data
        return self.sess.run(self.cost, {self.X: data})

    def trainAndGetCost(self, dataTrain, dataCost, adamParam, iterations):
        self.train(dataTrain, adamParam, iterations)
        return self.getAECost(dataCost)
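A minimal usage sketch of the class above (dfnormalized, the learning rate and the iteration count below are placeholders for your own data and hyperparameters):
model = Model()
# train on the normalized dataframe and evaluate the cost on the same data
cost = model.trainAndGetCost(dfnormalized, dfnormalized, adamParam=0.001, iterations=100)
print("Autoencoder cost:", cost)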

How does TensorFlow know which variables to change for optimization?

Code taken from: http://adventuresinmachinelearning.com/python-tensorflow-tutorial/
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# Python optimisation variables
learning_rate = 0.5
epochs = 10
batch_size = 100
# declare the training data placeholders
# input x - for 28 x 28 pixels = 784
x = tf.placeholder(tf.float32, [None, 784])
# now declare the output data placeholder - 10 digits
y = tf.placeholder(tf.float32, [None, 10])
# now declare the weights connecting the input to the hidden layer
W1 = tf.Variable(tf.random_normal([784, 300], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random_normal([300]), name='b1')
# and the weights connecting the hidden layer to the output layer
W2 = tf.Variable(tf.random_normal([300, 10], stddev=0.03), name='W2')
b2 = tf.Variable(tf.random_normal([10]), name='b2')
# calculate the output of the hidden layer
hidden_out = tf.add(tf.matmul(x, W1), b1)
hidden_out = tf.nn.relu(hidden_out)
# now calculate the hidden layer output - in this case, let's use a softmax activated
# output layer
y_ = tf.nn.softmax(tf.add(tf.matmul(hidden_out, W2), b2))
y_clipped = tf.clip_by_value(y_, 1e-10, 0.9999999)
cross_entropy = -tf.reduce_mean(tf.reduce_sum(y * tf.log(y_clipped)
+ (1 - y) * tf.log(1 - y_clipped), axis=1))
# add an optimiser
optimiser = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cross_entropy)
# finally setup the initialisation operator
init_op = tf.global_variables_initializer()
# define an accuracy assessment operation
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# start the session
with tf.Session() as sess:
    # initialise the variables
    sess.run(init_op)
    total_batch = int(len(mnist.train.labels) / batch_size)
    for epoch in range(epochs):
        avg_cost = 0
        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size=batch_size)
            _, c = sess.run([optimiser, cross_entropy],
                            feed_dict={x: batch_x, y: batch_y})
            avg_cost += c / total_batch
        print("Epoch:", (epoch + 1), "cost =", "{:.3f}".format(avg_cost))
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels}))
I wanted to ask: how does TensorFlow recognize the parameters it needs to optimize? In the above code we need to optimize W1, W2, b1 and b2, but we never specified that anywhere. We did ask GradientDescentOptimizer to minimize cross_entropy, but we never told it that it would have to change the values of W1, W2, b1 and b2 in order to do so. So how did it know which parameters cross_entropy depended on?
The answer by Cory Nezin is only partially correct, and could lead to wrong assumptions!
You actually do specify which parameters are optimized (=trainable), namely by doing this:
# now declare the weights connecting the input to the hidden layer
W1 = tf.Variable(tf.random_normal([784, 300], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random_normal([300]), name='b1')
# and the weights connecting the hidden layer to the output layer
W2 = tf.Variable(tf.random_normal([300, 10], stddev=0.03), name='W2')
b2 = tf.Variable(tf.random_normal([10]), name='b2')
In short, TensorFlow will only update tf.Variables. If you were to use something like tf.Variable(..., trainable=False), you would not get any updates, regardless of what the network depends on. You would have still specified it, and the network would still propagate through that part, but you would never receive any updates for that specific variable.
Cory's answer is correct in that the network automatically recognizes which values to update, but you specify what is trainable (and thus updated) in the first place!
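A small sketch of that distinction (the variable names here are purely illustrative): the optimizer only builds update ops for variables in the trainable collection, which you can inspect with tf.trainable_variables().
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4])
w_trainable = tf.Variable(tf.random_normal([4, 2]), name='w_trainable')
w_frozen = tf.Variable(tf.random_normal([2, 1]), trainable=False, name='w_frozen')

loss = tf.reduce_mean(tf.matmul(tf.matmul(x, w_trainable), w_frozen))

# Only w_trainable is listed; w_frozen is excluded from optimization.
print([v.name for v in tf.trainable_variables()])

# minimize() therefore only creates update ops for w_trainable,
# even though the loss also depends on w_frozen.
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)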
TensorFlow works on the premise of something called a computational graph. Essentially, whenever you say something like:
hidden_out = tf.add(tf.matmul(x, W1), b1)
TensorFlow says: OK, that output clearly depends on W1, so I'll connect an edge from hidden_out to W1. The same process happens for y_, y_clipped, and cross_entropy. So in the end you have a graph which connects cross_entropy to W1. Pick your favorite graph traversal algorithm and TensorFlow finds the connection between cross_entropy and W1.
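One way to see this dependency tracking concretely (a sketch reusing the names from the question's code) is to ask the optimizer for its gradient/variable pairs instead of calling minimize directly:
optimiser = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
grads_and_vars = optimiser.compute_gradients(cross_entropy)

# Each entry pairs a gradient tensor with the variable it will update;
# W1, b1, W2 and b2 all appear because cross_entropy depends on them in the graph.
for grad, var in grads_and_vars:
    print(var.name, grad)

train_step = optimiser.apply_gradients(grads_and_vars)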

How to switch from GradientDescent Optimizer to Adam in Tensorflow

My code runs perfectly with gradient descent, but I want to compare the effectiveness of my algorithm using the Adam optimizer, so I tried to modify the following code:
# Import MNIST data
#import input_data
#mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
#fashion_mnist = input_data.read_data_sets('data/fashion')
import tensorflow as tf
# Set parameters
learning_rate = 0.01 #1e-4
training_iteration = 30
batch_size = 100
display_step = 2
# TF graph input
x = tf.placeholder("float", [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder("float", [None, 10]) # 0-9 digits recognition => 10 classes
#regularizer = tf.reduce_sum(tf.square(y))
# Create a model
# Set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
with tf.name_scope("Wx_b") as scope:
# Construct a linear model
model = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax
# Add summary ops to collect data
w_h = tf.summary.histogram("weights", W)
b_h = tf.summary.histogram("biases", b)
# More name scopes will clean up graph representation
with tf.name_scope("cost_function") as scope:
# Minimize error using cross entropy
# Cross entropy
cost_function = -tf.reduce_sum(y*tf.log(model))
# Create a summary to monitor the cost function
tf.summary.scalar("cost_function", cost_function)
with tf.name_scope("train") as scope:
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost_function)
# Initializing the variables
#init = tf.initialize_all_variables()
init = tf.global_variables_initializer()
# Merge all summaries into a single operator
merged_summary_op = tf.summary.merge_all()
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    summary_writer = tf.summary.FileWriter('/home/raed/Tensorflow/tensorflow_demo', graph_def=sess.graph_def)
    #writer.add_graph(sess.graph_def)
    # Training cycle
    for iteration in range(training_iteration):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Fit training using batch data
            sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})
            # Compute the average loss
            avg_cost += sess.run(cost_function, feed_dict={x: batch_xs, y: batch_ys})/total_batch
            # Write logs for each iteration
            summary_str = sess.run(merged_summary_op, feed_dict={x: batch_xs, y: batch_ys})
            summary_writer.add_summary(summary_str, iteration*total_batch + i)
        # Display logs per iteration step
        if iteration % display_step == 0:
            print("Iteration:" "%04d" % (iteration + 1), "cost=", "{:.9f}".format(avg_cost))
    print("Tuning completed!")
    # Test the model
    predictions = tf.equal(tf.argmax(model, 1), tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(predictions, "float"))
    print("Accuracy:", accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
To use the Adam optimizer I tried to change the following line:
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost_function)
and replace it with the AdamOptimizer:
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost_function)
When I ran the code, I got a few iterations and then it stopped with the following error:
InvalidArgumentError (see above for traceback): Nan in summary histogram for: weights
[[Node: weights = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](weights/tag, Variable/read)]]
Could you please help me understand the problem? Thanks in advance.
The problem is that the weights are initialized to zero (W = tf.Variable(tf.zeros([784, 10]))); that is why you get NaN as weights.
You need to initialize them with some initializer, e.g. a normal distribution, as follows:
W = tf.Variable(tf.random_normal([784, 10], stddev=0.35),
                name="weights")

Tf.Print() doesn't print the shape of the tensors?

I have written a simple classification program using TensorFlow and I am getting the output, except that I could not print the shapes of the tensors for the model parameters, features and bias.
The function definitions:
import tensorflow as tf, numpy as np
from tensorflow.examples.tutorials.mnist import input_data

def get_weights(n_features, n_labels):
    # Return weights
    return tf.Variable(tf.truncated_normal((n_features, n_labels)))

def get_biases(n_labels):
    # Return biases
    return tf.Variable(tf.zeros(n_labels))

def linear(input, w, b):
    # Linear Function (xW + b)
    # return np.dot(input, w) + b
    return tf.add(tf.matmul(input, w), b)

def mnist_features_labels(n_labels):
    """Gets the first <n> labels from the MNIST dataset
    """
    mnist_features = []
    mnist_labels = []
    mnist = input_data.read_data_sets('dataset/mnist', one_hot=True)
    # In order to make quizzes run faster, we're only looking at 10000 images
    for mnist_feature, mnist_label in zip(*mnist.train.next_batch(10000)):
        # Add features and labels if it's for the first <n>th labels
        if mnist_label[:n_labels].any():
            mnist_features.append(mnist_feature)
            mnist_labels.append(mnist_label[:n_labels])
    return mnist_features, mnist_labels
The graph creation :
# Number of features (28*28 image is 784 features)
n_features = 784
# Number of labels
n_labels = 3
# Features and Labels
features = tf.placeholder(tf.float32)
labels = tf.placeholder(tf.float32)
# Weights and Biases
w = get_weights(n_features, n_labels)
b = get_biases(n_labels)
# Linear Function xW + b
logits = linear(features, w, b)
# Training data
train_features, train_labels = mnist_features_labels(n_labels)
print("Total {0} data points of Training Data, each having {1} features \n \
Total {2} number of labels,each having 1-hot encoding {3}".format(len(train_features),len(train_features[0]),\
len(train_labels),train_labels[0]
)
)
# global variables initialiser
init= tf.global_variables_initializer()
with tf.Session() as session:
    session.run(init)
The problem is here :
# shapes = tf.Print(tf.shape(features), [tf.shape(features),
#                                        tf.shape(labels),
#                                        tf.shape(w),
#                                        tf.shape(b),
#                                        tf.shape(logits)
#                                        ], message="The shapes are:")
# print("Verify shapes", shapes)
logits = tf.Print(logits, [tf.shape(features),
                           tf.shape(labels),
                           tf.shape(w),
                           tf.shape(b),
                           tf.shape(logits)],
                  message="The shapes are:")
print(logits)
I looked here, but didn't find much that was useful.
# Softmax
prediction = tf.nn.softmax(logits)
# Cross entropy
# This quantifies how far off the predictions were.
# You'll learn more about this in future lessons.
cross_entropy = -tf.reduce_sum(labels * tf.log(prediction), reduction_indices=1)
# Training loss
# You'll learn more about this in future lessons.
loss = tf.reduce_mean(cross_entropy)
# Rate at which the weights are changed
# You'll learn more about this in future lessons.
learning_rate = 0.08
# Gradient Descent
# This is the method used to train the model
# You'll learn more about this in future lessons.
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
# Run optimizer and get loss
_, l = session.run(
[optimizer, loss],
feed_dict={features: train_features, labels: train_labels})
# Print loss
print('Loss: {}'.format(l))
The output I am getting is :
Extracting dataset/mnist/train-images-idx3-ubyte.gz
Extracting dataset/mnist/train-labels-idx1-ubyte.gz
Extracting dataset/mnist/t10k-images-idx3-ubyte.gz
Extracting dataset/mnist/t10k-labels-idx1-ubyte.gz
Total 3118 data points of Training Data, each having 784 features
Total 3118 number of labels,each having 1-hot encoding [0. 1. 0.]
Tensor("Print_22:0", shape=(?, 3), dtype=float32)
Loss: 5.339271068572998
Could anyone help me understand why I am not able to see the shapes of the tensors?
That is not how you use tf.Print. It is an op that does nothing on its own (it simply returns its input) but prints the requested tensors as a side effect. You should do something like
logits = tf.Print(logits, [tf.shape(features),
                           tf.shape(labels),
                           tf.shape(w),
                           tf.shape(b),
                           tf.shape(logits)],
                  message="The shapes are:")
Now, whenever logits is evaluated (as it will be for computing the loss/gradients), the shape information will be printed.
What you are doing right now is simply printing the return value of the tf.Print op, which is just its input (tf.shape(features)).
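If the goal is to see the shapes as ordinary Python values rather than as tf.Print's side-effect output (which is written to standard error), an alternative sketch, run inside the same session block as the code above, is to fetch the shape tensors directly:
shape_tensors = [tf.shape(features), tf.shape(labels),
                 tf.shape(w), tf.shape(b), tf.shape(logits)]
shape_values = session.run(shape_tensors,
                           feed_dict={features: train_features, labels: train_labels})
for name, value in zip(["features", "labels", "w", "b", "logits"], shape_values):
    print(name, "shape:", value)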
Following @xdurch0's suggestion, I tried this:
shapes = tf.Print(logits, [tf.shape(features),
                           tf.shape(labels),
                           tf.shape(w),
                           tf.shape(b),
                           tf.shape(logits)],
                  message="The shapes are:")
# Run optimizer and get loss
_, l, resultingShapes = session.run([optimizer, loss, shapes],
                                    feed_dict={features: train_features, labels: train_labels})
print('The shapes are: ', resultingShapes.shape)
and it worked partially,
Extracting dataset/mnist/train-images-idx3-ubyte.gz
Extracting dataset/mnist/train-labels-idx1-ubyte.gz
Extracting dataset/mnist/t10k-images-idx3-ubyte.gz
Extracting dataset/mnist/t10k-labels-idx1-ubyte.gz
Total 3118 data points of Training Data, each having 784 features
Total 3118 number of labels, each having 1-hot encoding [0. 1. 0.]
The shapes are: (3118, 3)
Loss: 10.223002433776855
Could @xdurch0 suggest something to get the desired results?
My DESIRED RESULTS are:
tf.shape(features): (3118, 784), tf.shape(labels): (3118, 3),
tf.shape(w): (784, 3), tf.shape(b): (3, 1), tf.shape(logits): (3118, 3)

Which is correct shape of linear regression in my model?

I am designing a regression network to predict a person's weight, from 10 to 100 kg. My dataset has 50 training samples:
Vector 1: 1024x1, corresponding to 40 kg
Vector 2: 1024x1, corresponding to 20 kg
Vector 3: 1024x1, corresponding to 40 kg
...
Vector 50: 1024x1, corresponding to 30 kg
Hence, my dataset size is 1024x50 and the label size is 1x50. If I design a simple linear regression, like y = xW + b, then the sizes of W and b will be
W is 1024x1
b is 1x50
Am I right?
This is my TensorFlow code, but it gives a wrong prediction:
# Training Data
train_X = ...  # shape of 1024 x 50
train_Y = ...  # shape of 1 x 50
n_samples = 50
learning_rate = 0.0001
training_epochs = 1000
display_step = 50

# tf Graph Input
X = tf.placeholder("float")
Y = tf.placeholder("float")

# Set model weights
W = tf.Variable(tf.truncated_normal([1024, 1], mean=0.0, stddev=1.0, dtype=tf.float32))
b = tf.Variable(tf.zeros(1, dtype=tf.float32))

# Construct a linear model
pred = tf.add(tf.multiply(X, W), b)

# Mean squared error
cost = tf.reduce_sum(tf.pow(pred - Y, 2)) / (2 * n_samples)
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
init = tf.global_variables_initializer()

# Start training
with tf.Session() as sess:
    # Run the initializer
    sess.run(init)
    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})
        # Display logs per epoch step
        if (epoch + 1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
            print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(c),
                  "W=", sess.run(W), "b=", sess.run(b))
    print("Optimization Finished!")
W is 1024x1
b is 1x50
Am I right?
No, the shape of W is correct, but b should be a scalar (a 1x1 matrix). In your approach you would have one trainable bias per data point, which makes no sense. However, in your code it is correctly set to size 1.
What is wrong is the handling of the matrix multiplication; your model should be:
pred = tf.matmul(X, W) + b # you will have to transpose your train_X
tf.multiply is pointwise multiplication, not matrix multiplication.
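For reference, a minimal sketch of the corrected setup, assuming the data is arranged so that each row is one sample (50x1024 features, 50x1 labels; the placeholder shapes and the random stand-in data are assumptions consistent with the answer above):
import tensorflow as tf
import numpy as np

# 50 samples, each a 1024-dimensional feature vector, one weight label per sample
train_X = np.random.rand(50, 1024).astype(np.float32)             # stand-in data
train_Y = np.random.uniform(10, 100, (50, 1)).astype(np.float32)  # stand-in labels

X = tf.placeholder(tf.float32, [None, 1024])
Y = tf.placeholder(tf.float32, [None, 1])

W = tf.Variable(tf.truncated_normal([1024, 1], stddev=0.1))
b = tf.Variable(tf.zeros([1]))  # a single scalar bias

pred = tf.matmul(X, W) + b                        # shape [None, 1]
cost = tf.reduce_mean(tf.square(pred - Y)) / 2.0  # mean squared error

train_op = tf.train.GradientDescentOptimizer(0.0001).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(1000):
        _, c = sess.run([train_op, cost], feed_dict={X: train_X, Y: train_Y})
    print("final cost:", sess.run(cost, feed_dict={X: train_X, Y: train_Y}))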