Tensorflow: custom data load + asynchronous computation

This is a how-to which I believe is missing from the TF examples.
Task:
samples for each class are given in a separate dir, and thus labels are indirect (i.e. by dir)
decoupled load and computation in TF
Each separate bit can be found elsewhere; however, I think having them all together in one place will save a lot of time for TF beginners (like myself).
Let's tackle 1. In my case it is two sets of images:
# all filenames for .jpg in dir
# - list of fnames
# - list of labels
def path_fnames(f_path, label, ext = ['.jpg', '.jpeg']):
    f_n = [f_path+'/'+f for f in sorted(os.listdir(f_path)) if os.path.splitext(f)[1].lower() in ext]
    f_l = [label] * len(f_n)
    return f_n, f_l
#
def dense_to_one_hot(labels_dense, num_classes=10, dtype=np.float32):
    """Convert class labels from scalars to one-hot vectors."""
    num_labels = labels_dense.shape[0]
    index_offset = np.arange(num_labels) * num_classes
    labels_one_hot = np.zeros((num_labels, num_classes), dtype=dtype)
    labels_one_hot.flat[index_offset + labels_dense.ravel()] = 1
    return labels_one_hot
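As a quick sanity check (a toy illustration, not part of the pipeline), dense labels map to one-hot rows like this:
# toy check of dense_to_one_hot
labels = np.asarray([0, 1, 1])
print dense_to_one_hot(labels, num_classes=2)
# -> [[ 1.  0.]
#     [ 0.  1.]
#     [ 0.  1.]]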
data_dir = '/mnt/dataset/'
dir_1 = '/class_1'
dir_2 = '/class_2'
# --- get filenames for data ---
dpath = [data_dir+dir_1, data_dir+dir_2]
f_n1, f_l1 = path_fnames(dpath[0], 0)
f_n2, f_l2 = path_fnames(dpath[1], 1)
# --- create one-hot labels ---
ohl = dense_to_one_hot(np.asarray(f_l1+f_l2), num_classes=2, dtype = np.float32)
fnames = f_n1 + f_n2  # one-hot labels were created in this sequence
Now we have all file-names and one-hot labels preloaded.
Let's move on to 2.
It is based on How to prefetch data using a custom python function in tensorflow. In short it has:
a custom image reader (replace with yours)
a queue fnl_q with [filename label] pairs, which the reader uses for feeding
a queue proc_q with [sample label] pairs, which is used to feed the processing some_op
a thread which performs read_op to get [sample label] and enqueue_op to put the pair into proc_q. The thread is controlled by a tf.Coordinator
some_op which first gets data from proc_q via dequeue_many() and then does the rest of the computation (this could also be put in a separate thread)
Notes:
feature_read_op and label_read_op are two separate ops.
I use sleep() to slow things down and control the ops - only for test purposes
I have separated the "feeding" and "calculation" parts - in a real case just run them in parallel
print 'TF version:', tf.__version__
# --- params ----
im_s = [30, 30, 1] # target image size
BATCH_SIZE = 16
# image reader
# - fnl_queue: queue with [fn l] pairs
# Notes
# - to resize: image_tensor = tf.image.resize_image_with_crop_or_pad(image_tensor, HEIGHT, WIDTH)
# - how about image preprocessing?
def img_reader_jpg(fnl_queue, ch = 3, keep = False):
    fn, label = fnl_queue.dequeue()
    if keep:
        fnl_queue.enqueue([fn, label])
    img_bytes = tf.read_file(fn)
    img_u8 = tf.image.decode_jpeg(img_bytes, channels=ch)
    img_f32 = tf.cast(img_u8, tf.float32)/256.0
    #img_4 = tf.expand_dims(img_f32,0)
    return img_f32, label
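To address the preprocessing question in the comments above: any per-image op can be chained right after decoding, inside img_reader_jpg, before the pair enters proc_q. A minimal sketch, assuming the fixed target size comes from im_s:
    # e.g. crop/pad every decoded image to the target size
    img_f32 = tf.image.resize_image_with_crop_or_pad(img_f32, im_s[0], im_s[1])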
# load [feature, label] and enqueue to processing queue
# - sess: tf session
# - coord: tf Coordinator
# - [fr_op, lr_op]: feature_read_op, label_read_op
# - enqueue_op: [f l] pair enqueue op
def load_and_enqueue(sess, coord, feature_read_op, label_read_op, enqueue_op):
    i = 0
    while not coord.should_stop():
        # for testing purpose
        time.sleep(0.1)
        #print 'load_and_enqueue i=', i
        #i = i + 1
        feature, label = sess.run([feature_read_op, label_read_op])
        feed_dict = {feature_input: feature,
                     label_input: label}
        sess.run(enqueue_op, feed_dict=feed_dict)
# --- TF part ---
# filenames and labels are pre-loaded
fv = tf.constant(fnames)
lv = tf.constant(ohl)
#fnl_q = tf.FIFOQueue(len(fnames), [tf.string, tf.float32])
fnl_q = tf.RandomShuffleQueue(len(fnames), 0, [tf.string, tf.float32])
do_enq = fnl_q.enqueue_many([fv, lv])
# reading_op: feature_read_op label_read_op
feature_read_op, label_read_op = img_reader_jpg(fnl_q, ch = im_s[2])
# samples queue
f_s = im_s
l_s = 2
feature_input = tf.placeholder(tf.float32, shape=f_s, name='feature_input')
label_input = tf.placeholder(tf.float32, shape=l_s, name='label_input')
#proc_q = tf.RandomShuffleQueue(len(fnames), 0, [tf.float32, tf.float32], shapes=[f_s, l_s])
proc_q = tf.FIFOQueue(len(fnames), [tf.float32, tf.float32], shapes=[f_s, l_s])
enqueue_op = proc_q.enqueue([feature_input, label_input])
# test:
# - some op
img_batch, lab_batch = proc_q.dequeue_many(BATCH_SIZE)
some_op = [img_batch, lab_batch]
# service ops
init_op = tf.initialize_all_variables()
# let run stuff
with tf.Session() as sess:
    sess.run(init_op)
    sess.run(do_enq)
    print "fnl_q.size:", fnl_q.size().eval()
    print "proc_q.size:", proc_q.size().eval()
    # --- test thread stuff ---
    # - fill proc_q
    coord = tf.train.Coordinator()
    t = threading.Thread(target=load_and_enqueue, args=(sess, coord, feature_read_op, label_read_op, enqueue_op))
    t.start()
    time.sleep(2.1)
    coord.request_stop()
    coord.join([t])
    print "fnl_q.size:", fnl_q.size().eval()
    print "proc_q.size:", proc_q.size().eval()
    # - process a bit
    ss = sess.run(some_op)
    print 'ss[0].shape', ss[0].shape
    print ' ss[1]:\n', ss[1]
    print "fnl_q.size:", fnl_q.size().eval()
    print "proc_q.size:", proc_q.size().eval()
    print 'ok'
Typical output:
TF version: 0.6.0
fnl_q.size: 1225
proc_q.size: 0
fnl_q.size: 1204
proc_q.size: 21
ss[0].shape (16, 30, 30, 1)
ss[1]:
[[ 0. 1.]
[ 1. 0.]
[ 1. 0.]
[ 0. 1.]
[ 0. 1.]
[ 1. 0.]
[ 1. 0.]
[ 0. 1.]
[ 1. 0.]
[ 0. 1.]
[ 0. 1.]
[ 1. 0.]
[ 1. 0.]
[ 0. 1.]
[ 1. 0.]
[ 0. 1.]]
fnl_q.size: 1204
proc_q.size: 5
ok
All as expected:
a batch of [sample label] pairs is created
the pairs are shuffled
The only thing left is to apply TF as it is intended to be used, by replacing some_op :)
And a question:
One observed problem: in case I use tf.FIFOQueue for file-names and tf.RandomShuffleQueue for samples, the shuffling doesn't happen. However the other way around (as in the code above) it shuffles perfectly.
Is there any problem with shuffling for
tf.RandomShuffleQueue(len(fnames), 0, [tf.float32, tf.float32], shapes=[f_s, l_s]) ?
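One assumption worth checking here (stated as a guess, not a verified answer): the second positional argument of tf.RandomShuffleQueue is min_after_dequeue, the minimum number of elements that must remain after a dequeue, and with 0 the queue may only shuffle among whatever happens to be buffered at that moment. A sketch of the same queue with a non-zero buffer:
proc_q = tf.RandomShuffleQueue(capacity=len(fnames),
                               min_after_dequeue=4*BATCH_SIZE,  # keep a few batches buffered to shuffle among
                               dtypes=[tf.float32, tf.float32],
                               shapes=[f_s, l_s])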
ADD:
The version with two threads:
one to re-fill/update/change the file-name queue
a second to fill samples into the processing queue
Also added the correct way to stop the threads.
def load_and_enqueue(sess, coord, feature_read_op, label_read_op, enqueue_op):
    try:
        while not coord.should_stop():
            feature, label = sess.run([feature_read_op, label_read_op])
            feed_dict = {feature_input: feature,
                         label_input: label}
            sess.run(enqueue_op, feed_dict=feed_dict)
    except Exception as e:
        return

# periodically check the state of the fnl queue and refill it if needed
# - enqueue_op: 'refill' file-name/label queue
def enqueue_fnl(sess, coord, fnl_q, enqueue_op):
    try:
        while not coord.should_stop():
            time.sleep(0.5)
            s = sess.run(fnl_q.size())
            if s < (9*BATCH_SIZE):
                sess.run(enqueue_op)
    except Exception as e:
        return
# -- ops for feed part --
# filenames and labels are pre-loaded
fv = tf.constant(fnames)
lv = tf.constant(ohl)
# read op
fnl_q = tf.RandomShuffleQueue(len(fnames)*2, 0, [tf.string, tf.float32], name = 'fnl_q') # add some margin for re-fill to fit
do_fnl_enq = fnl_q.enqueue_many([fv, lv])
feature_read_op, label_read_op = img_reader_jpg(fnl_q, ch = IMG_SIZE[2])
# samples queue
feature_input = tf.placeholder(tf.float32, shape=IMG_SIZE, name='feature_input')
label_input = tf.placeholder(tf.float32, shape=LAB_SIZE, name='label_input')
proc_q = tf.FIFOQueue(len(fnames)*3, [tf.float32, tf.float32], shapes=[IMG_SIZE, LAB_SIZE], name = 'fe_la_q')
enqueue_op = proc_q.enqueue([feature_input, label_input])
# -- ops for training and eval --
img_batch, lab_batch = proc_q.dequeue_many(BATCH_SIZE)
# ... here is your model (producing logits, img_ph, lab_ph, keep_prob)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits, lab_ph))
optimizer = tf.train.AdamOptimizer(1e-4).minimize(loss)
with tf.Session() as sess:
    coord = tf.train.Coordinator()
    t_le = threading.Thread(target=load_and_enqueue, args=(sess, coord, feature_read_op, label_read_op, enqueue_op), name='load_and_enqueue')
    t_re = threading.Thread(target=enqueue_fnl, args=(sess, coord, fnl_q, do_fnl_enq), name='enqueue_fnl')  # re-enqueue thread, i.e. refilling the filename queue
    t_le.start()
    t_re.start()
    try:
        # training
        for step in xrange(823):
            # some proc
            img_v, lab_v = sess.run([img_batch, lab_batch])
            feed_dict = {img_ph: img_v,
                         lab_ph: lab_v,
                         keep_prob: 0.7}
            _, loss_v = sess.run([optimizer, loss], feed_dict=feed_dict)
    except Exception as e:
        print 'Training: Exception:', e
    # stop threads
    coord.request_stop()                                  # ask them to stop
    sess.run(fnl_q.close(cancel_pending_enqueues=True))   # tell fnl_q not to wait for enqueues anymore
    sess.run(proc_q.close(cancel_pending_enqueues=True))  # tell proc_q not to wait for enqueues anymore
    coord.join([t_le, t_re], stop_grace_period_secs=8)

Related

Tensorflow : Predict in Recurrent Neural Networks for Drawing Classification tutorial

I used the tutorial code from https://www.tensorflow.org/tutorials/recurrent_quickdraw and everything works fine until I try to make a prediction instead of just evaluating.
I wrote a new input function for prediction, based on the code in create_dataset.py:
def predict_input_fn():
    def parse_line(stroke_points):
        """Parse an ndjson line and return ink (as np array) and classname."""
        inkarray = json.loads(stroke_points)
        stroke_lengths = [len(stroke[0]) for stroke in inkarray]
        total_points = sum(stroke_lengths)
        np_ink = np.zeros((total_points, 3), dtype=np.float32)
        current_t = 0
        for stroke in inkarray:
            for i in [0, 1]:
                np_ink[current_t:(current_t + len(stroke[0])), i] = stroke[i]
            current_t += len(stroke[0])
            np_ink[current_t - 1, 2] = 1  # stroke_end
        # Preprocessing.
        # 1. Size normalization.
        lower = np.min(np_ink[:, 0:2], axis=0)
        upper = np.max(np_ink[:, 0:2], axis=0)
        scale = upper - lower
        scale[scale == 0] = 1
        np_ink[:, 0:2] = (np_ink[:, 0:2] - lower) / scale
        # 2. Compute deltas.
        np_ink = np_ink[1:, 0:2] - np_ink[0:-1, 0:2]
        np_ink = np_ink[1:, :]
        features = {}
        features["ink"] = tf.train.Feature(float_list=tf.train.FloatList(value=np_ink.flatten()))
        features["shape"] = tf.train.Feature(int64_list=tf.train.Int64List(value=np_ink.shape))
        f = tf.train.Features(feature=features)
        example = tf.train.Example(features=f)
        #t = tf.constant(np_ink)
        return example

    def parse_example(example):
        """Parse a single record which is expected to be a tensorflow.Example."""
        # feature_to_type = {
        #     "ink": tf.VarLenFeature(dtype=tf.float32),
        #     "shape": tf.FixedLenFeature((0, 2), dtype=tf.int64)
        # }
        feature_to_type = {
            "ink": tf.VarLenFeature(dtype=tf.float32),
            "shape": tf.FixedLenFeature([2], dtype=tf.int64)
        }
        example_proto = example.SerializeToString()
        parsed_features = tf.parse_single_example(example_proto, feature_to_type)
        parsed_features["ink"] = tf.sparse_tensor_to_dense(parsed_features["ink"])
        #parsed_features["shape"].set_shape((2))
        return parsed_features

    example = parse_line(FLAGS.predict_input_stroke_data)
    features = parse_example(example)
    dataset = tf.data.Dataset.from_tensor_slices(features)
    # Our inputs are variable length, so pad them.
    dataset = dataset.padded_batch(FLAGS.batch_size, padded_shapes=dataset.output_shapes)
    iterator = dataset.make_one_shot_iterator()
    next_feature_batch = iterator.get_next()
    return next_feature_batch, None  # In prediction, we have no labels
I modified the existing model_fn() function and added the following at the appropriate place:
predictions = tf.argmax(logits, axis=1)
if mode == tf.estimator.ModeKeys.PREDICT:
    preds = {
        "class_index": predictions,
        "probabilities": tf.nn.softmax(logits),
        'logits': logits
    }
    return tf.estimator.EstimatorSpec(mode, predictions=preds)
However, when I call the following code:
if (FLAGS.predict_input_stroke_data != None):
    # prepare_input_tfrecord_for_prediction()
    # predict_results = estimator.predict(input_fn=get_input_fn(
    #     mode=tf.estimator.ModeKeys.PREDICT,
    #     tfrecord_pattern=FLAGS.predict_input_temp_file,
    #     batch_size=FLAGS.batch_size))
    predict_results = estimator.predict(input_fn=predict_input_fn)
    for idx, prediction in enumerate(predict_results):
        type = prediction["class_ids"][0]  # Get the predicted class (index)
        print("Prediction Type: {}\n".format(type))
I get the following error; what is wrong in my code? Could anyone please help me? I have tried quite a few things to get the shape right but I am unable to. I also tried to first write my stroke data as a TFRecord and then use the existing input_fn to read from the TFRecord; that gives me similar, but slightly different, errors.
File "/Users/farooq/.virtualenvs/tensor1.0/lib/python3.6/site-packages/tensorflow/python/framework/common_shapes.py", line 627, in call_cpp_shape_fn
require_shape_fn)
File "/Users/farooq/.virtualenvs/tensor1.0/lib/python3.6/site-packages/tensorflow/python/framework/common_shapes.py", line 691, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Shape must be rank 2 but is rank 1 for 'Slice' (op: 'Slice') with input shapes: [?], [2], [2].
I finally solved the problem by taking my input strokes and writing them to disk as a TFRecord. I also had to write the same input strokes batch_size times to the same TFRecord, else I got the shape mismatch errors. And then invoking predict worked.
The main addition for prediction was the following function
def create_tfrecord_for_prediction(batch_size, stoke_data, tfrecord_file):
    def parse_line(stoke_data):
        """Parse the provided stroke data and return ink (as np array)."""
        inkarray = json.loads(stoke_data)
        stroke_lengths = [len(stroke[0]) for stroke in inkarray]
        total_points = sum(stroke_lengths)
        np_ink = np.zeros((total_points, 3), dtype=np.float32)
        current_t = 0
        for stroke in inkarray:
            if len(stroke[0]) != len(stroke[1]):
                print("Inconsistent number of x and y coordinates.")
                return None
            for i in [0, 1]:
                np_ink[current_t:(current_t + len(stroke[0])), i] = stroke[i]
            current_t += len(stroke[0])
            np_ink[current_t - 1, 2] = 1  # stroke_end
        # Preprocessing.
        # 1. Size normalization.
        lower = np.min(np_ink[:, 0:2], axis=0)
        upper = np.max(np_ink[:, 0:2], axis=0)
        scale = upper - lower
        scale[scale == 0] = 1
        np_ink[:, 0:2] = (np_ink[:, 0:2] - lower) / scale
        # 2. Compute deltas.
        #np_ink = np_ink[1:, 0:2] - np_ink[0:-1, 0:2]
        #np_ink = np_ink[1:, :]
        np_ink[1:, 0:2] -= np_ink[0:-1, 0:2]
        np_ink = np_ink[1:, :]
        features = {}
        features["ink"] = tf.train.Feature(float_list=tf.train.FloatList(value=np_ink.flatten()))
        features["shape"] = tf.train.Feature(int64_list=tf.train.Int64List(value=np_ink.shape))
        f = tf.train.Features(feature=features)
        ex = tf.train.Example(features=f)
        return ex

    if stoke_data is None:
        print("Error: Stroke data cannot be none")
        return
    example = parse_line(stoke_data)
    # Remove the file if it already exists
    if tf.gfile.Exists(tfrecord_file):
        tf.gfile.Remove(tfrecord_file)
    writer = tf.python_io.TFRecordWriter(tfrecord_file)
    for i in range(batch_size):
        writer.write(example.SerializeToString())
    writer.flush()
    writer.close()
Then in the main function you just have to invoke estimator.predict(), reusing the same input_fn=get_input_fn(...) argument, except pointing it to the temporarily created tfrecord_file.
Hope this helps
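For completeness, a sketch of how these pieces could be wired together in main; the get_input_fn signature and FLAGS names are taken from the commented-out code in the question above, so treat them as assumptions:
# hypothetical glue code reusing the tutorial's get_input_fn
if FLAGS.predict_input_stroke_data is not None:
    # serialize the raw strokes batch_size times into a temporary TFRecord
    create_tfrecord_for_prediction(FLAGS.batch_size, FLAGS.predict_input_stroke_data,
                                   FLAGS.predict_input_temp_file)
    predict_results = estimator.predict(input_fn=get_input_fn(
        mode=tf.estimator.ModeKeys.PREDICT,
        tfrecord_pattern=FLAGS.predict_input_temp_file,
        batch_size=FLAGS.batch_size))
    for prediction in predict_results:
        print("class index: {}".format(prediction["class_index"]))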

How to properly update variables in a while loop in TensorFlow?

Can someone please explain (or point me to the relevant place in the documentation that I've missed) how to properly update a tf.Variable() in a tf.while_loop? I am trying to update variables in the loop that will store some information until the next iteration of the loop using the assign() method. However, this isn't doing anything.
As the values of mu_tf and sigma_tf are being updated by the minimizer while step_mu isn't, I am obviously doing something wrong, but I don't understand what it is. Specifically, I should say that I know assign() does not do anything until it is executed when the graph is run, so I know that I can do
sess.run(step_mu.assign(mu_tf))
and that will update step_mu, but I want to do this in the loop correctly. I don't understand how to add an assign operation to the body of the loop.
A simplified working example of what I'm doing follows here:
import numpy as np
import tensorflow as tf
mu_true = 0.5
sigma_true = 1.5
n_events = 100000
# Placeholders
X = tf.placeholder(dtype=tf.float32)
# Variables
mu_tf = tf.Variable(initial_value=tf.random_normal(shape=[], mean=0., stddev=0.1,
                                                   dtype=tf.float32),
                    dtype=tf.float32)
sigma_tf = tf.Variable(initial_value=tf.abs(tf.random_normal(shape=[], mean=1., stddev=0.1,
                                                             dtype=tf.float32)),
                       dtype=tf.float32,
                       constraint=lambda x: tf.abs(x))
step_mu = tf.Variable(initial_value=-99999., dtype=tf.float32)
step_loss = tf.Variable(initial_value=-99999., dtype=tf.float32)
# loss function
gaussian_dist = tf.distributions.Normal(loc=mu_tf, scale=sigma_tf)
log_prob = gaussian_dist.log_prob(value=X)
negative_log_likelihood = -1.0 * tf.reduce_sum(log_prob)
# optimizer
optimizer = tf.train.AdamOptimizer(learning_rate=0.1)
# sample data
x_sample = np.random.normal(loc=mu_true, scale=sigma_true, size=n_events)
# Construct the while loop.
def cond(step):
    return tf.less(step, 10)

def body(step):
    # gradient step
    train_op = optimizer.minimize(loss=negative_log_likelihood)
    # update step parameters
    with tf.control_dependencies([train_op]):
        step_mu.assign(mu_tf)
    return tf.add(step, 1)

loop = tf.while_loop(cond, body, [tf.constant(0)])
# Execute the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    step_loss = sess.run(fetches=negative_log_likelihood, feed_dict={X: x_sample})
    print('Before loop:\n')
    print('mu_tf: {}'.format(sess.run(mu_tf)))
    print('sigma_tf: {}'.format(sess.run(sigma_tf)))
    print('step_mu: {}'.format(sess.run(step_mu)))
    print('step_loss: {}\n'.format(step_loss))
    sess.run(fetches=loop, feed_dict={X: x_sample})
    print('After loop:\n')
    print('mu_tf: {}'.format(sess.run(mu_tf)))
    print('sigma_tf: {}'.format(sess.run(sigma_tf)))
    print('step_mu: {}'.format(sess.run(step_mu)))
    print('step_loss: {}'.format(step_loss))
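For what it's worth, a minimal sketch of the control-dependency mechanics involved (an illustration, not a verified fix for this exact script): an op created by assign() only runs if something on the loop body's output path depends on it, so the handle returned by assign() has to be wired into the dependencies of the returned value. In the body above, the assign is created but nothing consumes it, so it is pruned from the graph.
# hypothetical body: the returned increment depends on the assign op,
# so the assign is forced to run each iteration
def body(step):
    train_op = optimizer.minimize(loss=negative_log_likelihood)
    assign_op = step_mu.assign(mu_tf)  # keep a handle to the assign op
    with tf.control_dependencies([train_op, assign_op]):
        return tf.add(step, 1)  # this output now forces both ops to run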

Tensorflow No gradients provided for any variable LSTM encoder-decoder

I'm new to Tensorflow and I'm trying to implement an LSTM encoder-decoder from scratch using tensorflow, according to this blog post:
https://medium.com/@shiyan/understanding-lstm-and-its-diagrams-37e2f46f1714
This is the code for the encoder (using tf.while_loop)
def decode_timestep(self, context, max_dec, result, C_t_d, H_t_d, current_index):
    with tf.variable_scope('decode_scope', reuse=True):
        W_f_d = tf.get_variable('W_f_d', (self.n_dim, self.n_dim), tf.float32)
        U_f_d = tf.get_variable('U_f_d', (self.n_dim, self.n_dim), tf.float32)
        # ...
    # Decoder
    # Forget Gate
    f_t_d = tf.nn.sigmoid(
        tf.add(tf.reduce_sum(tf.add(tf.matmul(context, W_f_d), tf.matmul(H_t_d, U_f_d))), b_f_d))
    C_t_d_f = tf.multiply(C_t_d, f_t_d)
    # Input Gate
    # Part 1. New Memory Impaction
    i_t_d = tf.nn.sigmoid(
        tf.add(tf.reduce_sum(tf.add(tf.matmul(context, W_i_d), tf.matmul(H_t_d, U_i_d))), b_i_d))
    # Part 2. Calculate New Memory
    c_d_new = tf.nn.tanh(tf.add(tf.reduce_sum(tf.add(tf.matmul(context, W_c_d), tf.matmul(H_t_d, U_c_d))), b_c_d))
    # Part 3. Update old Memory
    C_t_d_i = tf.add(C_t_d_f, tf.multiply(i_t_d, c_d_new))
    # Output Gate
    o_t_d = tf.nn.sigmoid(
        tf.add(tf.reduce_sum(tf.add(tf.matmul(context, W_o_d), tf.matmul(H_t_d, U_o_d))), b_o_d))
    # Calculate new H_t
    H_t_d_new = tf.multiply(o_t_d, tf.nn.tanh(C_t_d))
    # Write the result of this timestep to the tensor array at pos current_index
    result.write(tf.subtract(tf.subtract(max_dec, 1), current_index), H_t_d_new)
    # Decrement the current_index by 1
    index_next = tf.subtract(current_index, 1)
    return context, max_dec, result, C_t_d_i, H_t_d_new, index_next
This is the code for the decoder:
def decode_timestep(self, context, max_dec, result, C_t_d, H_t_d, current_index):
    # Same as above
    # Calculate new H_t
    H_t_d_new = tf.multiply(o_t_d, tf.nn.tanh(C_t_d))
    # Write the result of this timestep to the tensor array at pos current_index
    result.write(tf.subtract(tf.subtract(max_dec, 1), current_index), H_t_d_new)
    # Decrement the current_index by 1
    index_next = tf.subtract(current_index, 1)
    return context, max_dec, result, C_t_d_i, H_t_d_new, index_next
And this is the code for the session:
with tf.variable_scope('encode_scope', reuse=True):
    H_t_e = tf.get_variable('H_t_e')
    C_t_e = tf.get_variable('C_t_e')
with tf.variable_scope('global_scope', reuse=True):
    current_index = tf.get_variable('current_index', dtype=tf.int32)
context, _, C_t_e, H_t_e, current_index = tf.while_loop(self.encode_timestep_cond, self.encode_timestep,
                                                        [X_Sent, max_enc, C_t_e, H_t_e, current_index])
result = tf.TensorArray(tf.float32, size=max_dec)
_, _, result, C_t_e, H_t_e, current_index = tf.while_loop(self.decode_timestep_cond, self.decode_timestep,
                                                          [context, max_dec, result, C_t_e, H_t_e, current_index])
loss = tf.reduce_sum(tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(result.concat(), Y_Sent)), reduction_indices=1)))
train_step = tf.train.AdamOptimizer().minimize(loss)
The error is :
ValueError: No gradients provided for any variable, check your graph
for ops that do not support gradients, between variables
["", ...
Please help! This is the implementation that I think should be right, based on what I read and understood from that post. Thank you, and sorry about my English!
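One mechanical detail worth flagging in the code above (a likely suspect, not a confirmed diagnosis): tf.TensorArray.write() does not mutate the array in place; it returns a new TensorArray, and that returned value must be threaded through the loop. As written, the value returned by result.write(...) is discarded, so result.concat() in the loss is not connected to anything the loop computed. A minimal sketch of the corrected lines inside decode_timestep:
    # capture the TensorArray returned by write() and return it as the loop
    # variable, so the graph keeps a path from the writes to the loss
    result = result.write(tf.subtract(tf.subtract(max_dec, 1), current_index), H_t_d_new)
    return context, max_dec, result, C_t_d_i, H_t_d_new, index_next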

Comparison of GradientDescent algorithm in tensorflow with the implementation of Michael Nielsen

First I will give an overview of my problem. I have two setups:
1) A net which is based on tensorflow
2) A net which is based on code from Michael Nielsen's book http://neuralnetworksanddeeplearning.com/index.html
Both nets are completely equal. They both have:
3 hidden layers with 30 neurons each
2 input neurons, one output neuron
All activations are sigmoid
Stochastic gradient descent as the learning algorithm, with eta=3.0
quadratic cost function: cost_function = tf.scalar_mul(1.0/(N_training_set*2.0), tf.reduce_sum(tf.squared_difference(y, y_)))
batch_size of 10
weight initialization: the weights which connect the lth and (l+1)th layer are initialized with sigma=1/sqrt(N_l), where N_l is the number of neurons in the lth layer.
My problem is that the tensorflow results are very bad (a factor of 10 worse than the results one obtains with the Nielsen code).
So before I post my complete code: does anybody know whether there is a bug in tensorflow's stochastic gradient descent? (Or does anybody have a reference for how the learning rate of stochastic gradient descent is defined in tensorflow? I cannot find anything in the API.)
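For reference, tf.train.GradientDescentOptimizer applies the plain update w <- w - eta * dC/dw per mini-batch, with no implicit scaling by the batch size; note that Nielsen's update_mini_batch divides the summed gradient by the mini-batch size, while the cost above divides by N_training_set, which may be worth checking. Below is a small self-contained sanity check of the update rule (a sketch against the same old API as the code that follows):
import tensorflow as tf
# single weight, quadratic loss: C = (w-3)^2, dC/dw = 2*(w-3)
w = tf.Variable(10.0)
loss = tf.square(w - 3.0)
eta = 0.1
opt_step = tf.train.GradientDescentOptimizer(eta).minimize(loss)
grad = tf.gradients(loss, [w])[0]
manual = w - eta * grad  # expected value after one step
sess = tf.Session()
sess.run(tf.initialize_all_variables())
expected = sess.run(manual)  # evaluate before the variable changes
sess.run(opt_step)
print sess.run(w), expected  # both should be 10 - 0.1*14 = 8.6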
Here is my code for the tensorflow net:
regression.py
import readData
import matplotlib.pyplot as plt
import numpy as np
from random import randint
import random
from root_numpy import fill_hist
from ROOT import TCanvas, TH2F, TText, TF1 ,TH1D
import ROOT
import tensorflow as tf
import math
# # # # # # ##
#Read in data#
# #
function_outputs=True # apply an invertible function to the y's and train with the modified outputs y_mod! Up to now this function is just a normalization.
function_inputs=True
full_set = readData.read_data_set("./TH2D_A00_TB10.root","LHCChi2_CMSSM_nObs1061_A00_TB10","full_set",function_inputs,function_outputs)
N_full_set=full_set.get_N()
N_validation_set=10000
N_training_set=N_full_set-(N_validation_set)
full=range(0,N_full_set)
random.shuffle(full)
training_subset=full[:N_training_set]#indices for training set
validation_subset=full[N_training_set:N_training_set+N_validation_set]#indices for validation set
training_set = readData.read_data_set("./TH2D_A00_TB10.root","LHCChi2_CMSSM_nObs1061_A00_TB10","training_set",
                                      function_inputs,function_outputs,full_set=full_set,subset=training_subset)
validation_set = readData.read_data_set("./TH2D_A00_TB10.root","LHCChi2_CMSSM_nObs1061_A00_TB10","validation_set",
                                        function_inputs,function_outputs,full_set=full_set,subset=validation_subset)
#overview of the full data set, training_data set and validation_data set. The modified (normalized, in this case) members can be accessed with the x_mod() and y_mod() member functions
#the normalized data (input and output) will be used to train the net
print "full_data_set:"
print "x (inputs)"
print full_set.get_x()
print "y (outputs)"
print full_set.get_y()
print "x_mod"
print full_set.get_x_mod()
print "y_mod"
print full_set.get_y_mod()
print "------------------"
print "training_data_set:"
print "x (inputs)"
print training_set.get_x()
print "y (outputs)"
print training_set.get_y()
print "x_mod"
print training_set.get_x_mod()
print "y_mod"
print training_set.get_y_mod()
print "------------------"
print "evaluation_data_set:"
print "x (inputs)"
print validation_set.get_x()
print "y (outputs)"
print validation_set.get_y()
print "x_mod"
print validation_set.get_x_mod()
print "y_mod"
print validation_set.get_y_mod()
print "------------------"
# # # # # # # # # # # ##
#setting up the network#
# #
N_epochs = 20
learning_rate = 3.0
batch_size = 10
N1 = 2 #equals N_inputs
N2 = 30
N3 = 30
N4 = 30
N5 = 1
N_in=N1
N_out=N5
# everything is calculated directly for all elements in one batch
"""example: N_in=2, N_out=3, mini_batch_size=5, activation function=linear.
The output matrix has 5 rows, one per mini-batch element. Each row has 3 columns, one per output neuron.
W2
[[-0.31917086 -0.03908769  0.5792625 ]
 [ 1.34563279  0.03904691  0.39674851]]
b2
[ 0.40960133 -0.5495823  -0.97048181]
x_in
[[  23.2       12.2    ]
 [   0.         1.1    ]
 [   2.3        3.3    ]
 [  23.22222   24.44444]
 [ 333.       444.     ]]
y = x_in*W2 + b2
[[   9.42155647   -0.98004436   17.30874062]
 [   1.88979745   -0.50663072   -0.53405845]
 [   4.1160965    -0.51062918    1.67109203]
 [  25.8909874    -0.50280523   22.17957497]
 [ 491.5866394     3.77104688  368.08026123]]
Here it becomes clear that b2 is added to every row of the matrix x_in*W2.
W2 is the transpose of the matrix defined in the book.
"""
x = tf.placeholder(tf.float32,[None,N1])#don't take the shape=(batch_size,N1) argument, because we need this for different batch sizes
W2 = tf.Variable(tf.random_normal([N1, N2],mean=0.0,stddev=1.0/math.sqrt(N1*1.0)))  # initialize the weights of a neuron with stddev = 1/sqrt(number of neurons in the previous layer)
b2 = tf.Variable(tf.random_normal([N2]))
a2 = tf.sigmoid(tf.matmul(x, W2) + b2) #x=a1
W3 = tf.Variable(tf.random_normal([N2, N3],mean=0.0,stddev=1.0/math.sqrt(N2*1.0)))
b3 = tf.Variable(tf.random_normal([N3]))
a3 = tf.sigmoid(tf.matmul(a2, W3) + b3)
W4 = tf.Variable(tf.random_normal([N3, N4],mean=0.0,stddev=1.0/math.sqrt(N3*1.0)))
b4 = tf.Variable(tf.random_normal([N4]))
a4 = tf.sigmoid(tf.matmul(a3, W4) + b4)
W5 = tf.Variable(tf.random_normal([N4, N5],mean=0.0,stddev=1.0/math.sqrt(N4*1.0)))
b5 = tf.Variable(tf.random_normal([N5]))
y = tf.sigmoid(tf.matmul(a4, W5) + b5)
y_ = tf.placeholder(tf.float32,[None,N_out]) # ,shape=(None,N_out)
# # # # # # # # # # # # # #
#initializing and training#
# #
cost_function = tf.scalar_mul(1.0/(N_training_set*2.0),tf.reduce_sum(tf.squared_difference(y,y_)))
error_to_desired_output= y-y_
abs_error_to_desired_output= tf.abs(y-y_)
sum_abs_error_to_desired_output= tf.reduce_sum(tf.abs(y-y_))
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost_function)
init = tf.initialize_all_variables()
#launch the graph
sess = tf.Session()
sess.run(init)
N_training_batch=training_set.get_N()/batch_size  # rounds down to the nearest integer
out_mod_validation=[0]*N_epochs # output of net, when inputting x_mod of validation data. Will be saved after each epoch.
error_mod_validation_data= [0]*N_epochs #absolute error on mod validation data after each epoch
diff_mod_validation=[0]*N_epochs # error vector of validation data after each epoch. i.e. y-y_
cost_training_data=[0]*N_epochs
for i in range(0,N_epochs):
    for j in range(0,N_training_batch):
        batch_xs, batch_ys, epochs_completed = training_set.next_batch(batch_size)  # always gives the modified x's and y's. If one does not want to modify them, the function has to be set to the identity
        sess.run(train_step, feed_dict={x: batch_xs,
                                        y_: batch_ys})
    cost_training_data[i]=sess.run(cost_function, feed_dict={
        x: training_set.get_x_mod(), y_: training_set.get_y_mod()})
    out_mod_validation[i]= sess.run(y, feed_dict={
        x: validation_set.get_x_mod()})  # output of net when inputting x_mod of the validation data after each training epoch
    diff_mod_validation[i]=sess.run(error_to_desired_output, feed_dict={
        x: validation_set.get_x_mod(),y_: validation_set.get_y_mod()})
    error_mod_validation_data[i]=sess.run(sum_abs_error_to_desired_output, feed_dict={
        x: validation_set.get_x_mod(),y_: validation_set.get_y_mod()})
    print "epochs completed: "+str(i)
#now calculate everything for the unmodified/unnormalized outputs
out_validation=[0]*N_epochs # output of net, when inputting x_mod of validation data and making the normalization of the output backwards, saved after each epoch
error_validation_data=[0.0]*N_epochs
diff_validation=[0.0]*N_epochs
#make the transformation on the outputs backwards
for i in range(0,N_epochs):
    out_validation[i]=np.ndarray(shape=(validation_set.get_N(),1))
    for j in range(0,len(out_mod_validation[i])):
        out_validation[i][j]=out_mod_validation[i][j]  # do this, because otherwise we would only produce a reference
    readData.apply_inverse_function_to_outputs(out_mod_validation[i],out_validation[i],full_set.get_y_max())  # second argument will be changed!
    diff_validation[i]=np.subtract(out_validation[i],validation_set.get_y())
    error_validation_data[i]=np.sum(np.absolute(np.subtract(out_validation[i],validation_set.get_y())))
# print for 10 examples how well the output matches the desired output
for i in range(0,10):
    print "desired output"
    print validation_set.get_y()[i][0]
    print "actual output after last training epoch"
    print out_validation[-1][i][0]
    print "-------"
print "total error on validation_data set after last training"
print error_validation_data[-1]
# # # # ##
#printing#
# #
plt.figure(1)
plt.title("Costfunction of (modified) Training-data")
plt.xlabel("epochs")
plt.ylabel("cost function")
x_range=[x+1 for x in range(0,N_epochs)]
plt.plot(x_range,cost_training_data)
plt.savefig("cost_on_training_data.png")
plt.figure(2)
plt.title("f data")
plt.xlabel("epochs")
plt.ylabel("total error on validation data")
x_range=[x+1 for x in range(0,N_epochs)]
plt.plot(x_range,error_validation_data)
plt.savefig("error_on_val_data.png")
error_on_validation_data_after_training = diff_validation[-1].reshape((1,validation_set.get_N()))
hist = TH1D('hist',"Errors on val data after last training epoch",200,-10000,10000)
fill_hist(hist,error_on_validation_data_after_training[0])
canvas = TCanvas()
hist.GetXaxis().SetTitle("desired Chi^2 - outputted Chi^2")
hist.Draw()
canvas.SaveAs('error_on_val_data_hist.png')
readData.py
import numpy as np
import root_numpy
from ROOT import TFile, TH2D, TCanvas
import itertools
def apply_function_to_inputs(x,x_mod,x_max):  # Python passes everything by reference
    # normalize the inputs
    for i in range(0,len(x)):
        for j in range(0,len(x[i])):
            #print "x["+str(i)+"]["+str(j)+"]="+str(x[i][j])
            x_mod[i][j]=x[i][j]/x_max[j]
            #print "x_mod["+str(i)+"]["+str(j)+"]="+str(x_mod[i][j])

def apply_inverse_function_to_inputs(x,x_mod,x_max):  # Python passes everything by reference
    # re-normalize the inputs
    for i in range(0,len(x)):
        for j in range(0,len(x[i])):
            x_mod[i][j]=x[i][j]*x_max[j]

def apply_function_to_outputs(y,y_mod,y_max):  # Python passes everything by reference
    # normalize the outputs
    for i in range(0,len(y)):
        for j in range(0,len(y[i])):
            y_mod[i][j]=y[i][j]/y_max[j]

def apply_inverse_function_to_outputs(y,y_mod,y_max):  # Python passes everything by reference
    # re-normalize the outputs
    for i in range(0,len(y)):
        for j in range(0,len(y[i])):
            y_mod[i][j]=y[i][j]*y_max[j]
class Dataset(object):
    def __init__(self,path,hist_name,kind_of_set,function_inputs,function_outputs,full_set,subset):
        self._kind_of_set=kind_of_set
        """example
        self._x np.ndarray(shape=(N_points,2))
        [[   10.    95.]
         [   10.   100.]
         [   10.   105.]
         ...,
         [ 2490.  1185.]
         [ 2490.  1190.]
         [ 2490.  1195.]]
        self._y np.ndarray(shape=(N_points,1))
        [[  0.00000000e+00]
         [  0.00000000e+00]
         [  0.00000000e+00]
         ...,
         [  6.34848448e-06]
         [  6.34845946e-06]
         [  6.34848448e-06]]
        """
        rfile = TFile(path)
        histogram = rfile.Get(hist_name)
        # now prepare data for training:
        if kind_of_set=="full_set":
            N_points=histogram.GetXaxis().GetNbins() * histogram.GetYaxis().GetNbins()  # number of points in full_set
            self._N=N_points
            self._y=np.ndarray(shape=(N_points,1))
            self._x=np.ndarray(shape=(N_points,2))
            self._y_mod=np.ndarray(shape=(N_points,1))  # function applied to the outputs, e.g. normalization
            self._x_mod=np.ndarray(shape=(N_points,2))  # function applied to the inputs
            self._y_max=np.ndarray(shape=(1))
            self._y_max[0]=0.0
            self._x_max=np.ndarray(shape=(2))
            self._x_max[0]=0.0
            self._x_max[1]=0.0
            i=0
            for x_bin in range(0, histogram.GetXaxis().GetNbins()):
                for y_bin in range(0, histogram.GetYaxis().GetNbins()):
                    self._x[i][0]=histogram.GetXaxis().GetBinCenter(x_bin)
                    self._x[i][1]=histogram.GetYaxis().GetBinCenter(y_bin)
                    self._y[i][0]=histogram.GetBinContent(x_bin,y_bin)
                    for j in range(0,len(self._x[i])):  # the maximum values are calculated only in the full_set case
                        if self._x[i][j]>self._x_max[j]:
                            self._x_max[j]=self._x[i][j]
                    for j in range(0,len(self._y[i])):
                        if self._y[i][j]>self._y_max[j]:
                            self._y_max[j]=self._y[i][j]
                    i=i+1
            # apply function to inputs and outputs; the function can also be the identity
            apply_function_to_inputs(self._x,self._x_mod,self._x_max)
            apply_function_to_outputs(self._y,self._y_mod,self._y_max)
        elif kind_of_set=="training_set" or kind_of_set=="validation_set" or kind_of_set=="test_set":
            self._N = len(subset)  # number of elements of the data set
            self._y=np.ndarray(shape=(self._N,1))
            self._x=np.ndarray(shape=(self._N,2))
            self._y_mod=np.ndarray(shape=(self._N,1))
            self._x_mod=np.ndarray(shape=(self._N,2))
            self._y_max=full_set.get_y_max()
            self._x_max=full_set.get_x_max()
            for i in range(0,self._N):
                self._x[i][0]=full_set.get_x()[subset[i]][0]
                self._x[i][1]=full_set.get_x()[subset[i]][1]
                self._y[i][0]=full_set.get_y()[subset[i]][0]
                self._x_mod[i][0]=full_set.get_x_mod()[subset[i]][0]
                self._x_mod[i][1]=full_set.get_x_mod()[subset[i]][1]
                self._y_mod[i][0]=full_set.get_y_mod()[subset[i]][0]
        if len(self._x)==0:  # if the set has 0 entries the list is empty
            self._N_input=-1
        else:
            self._N_input = len(self._x[0])
        if len(self._y)==0:  # if the set has 0 entries the list is empty
            self._N_output=-1
        else:
            self._N_output = len(self._y[0])
        self._index_in_epoch = 0  # if one has already trained 2 mini batches in the epoch, then this is 2*batch_size
        self._epochs_completed = 0

    def get_N_input_nodes(self):
        return self._N_input

    def get_N_output_nodes(self):
        return self._N_output

    def get_N(self):
        return self._N

    def get_x(self):
        return self._x

    def get_y(self):
        return self._y

    def get_x_max(self):
        return self._x_max

    def get_y_max(self):
        return self._y_max

    def get_x_mod(self):
        return self._x_mod

    def get_y_mod(self):
        return self._y_mod

    def next_batch(self, batch_size, fake_x=False):
        start = self._index_in_epoch
        self._index_in_epoch += batch_size
        if self._index_in_epoch >= self._N:
            # Finished epoch
            self._epochs_completed += 1
            # Shuffle the data
            perm = np.arange(self._N)
            np.random.shuffle(perm)
            self._x = self._x[perm]  # shuffle all four arrays; actually one would only need to shuffle x_mod and y_mod, but for consistency we shuffle everything
            self._y = self._y[perm]
            self._x_mod = self._x_mod[perm]
            self._y_mod = self._y_mod[perm]
            # Start next epoch
            start = 0
            self._index_in_epoch = batch_size
            assert batch_size <= self._N  # an exception is thrown if batch_size > self._N
        end = self._index_in_epoch
        return self._x_mod[start:end], self._y_mod[start:end], self._epochs_completed

def read_data_set(path,hist_name,kind_of_set,function_inputs,function_outputs,full_set=None,subset=None):
    return Dataset(path,hist_name,kind_of_set,function_inputs,function_outputs,full_set,subset)
I have uploaded the corresponding data input file to
https://github.com/kanban1992/GradientDescent_Comparison

How to see multiple images through tf.image_summary

Problem: only one image is shown in TensorBoard.
Inspired by this:
How can I visualize the weights(variables) in cnn in Tensorflow?
Here is the code:
# --- image reader ---
# - rsq: random shuffle queue with [fn l] pairs
def img_reader_jpg(rsq):
    fn, label = rsq.dequeue()
    img_b = tf.read_file(fn)
    img_u = tf.image.decode_jpeg(img_b, channels=3)
    img_f = tf.cast(img_u, tf.float32)
    img_4 = tf.expand_dims(img_f,0)
    return img_4, label
# filenames and labels are pre-loaded
fv = tf.constant(fnames)
lv = tf.constant(ohl)
rsq = tf.RandomShuffleQueue(len(fnames), 0, [tf.string, tf.float32])
do_enq = rsq.enqueue_many([fv, lv])
# reading_op
image, label = img_reader_jpg(rsq)
# test: some op
im_t = tf.placeholder(tf.float32, shape=[None,30,30,3], name='img_tensor')
lab_t = tf.placeholder(tf.float32, shape=[None,2], name='lab_tensor')
some_op = tf.add(im_t,im_t)
ims_op = tf.image_summary("img", im_t)
# service ops
init_op = tf.initialize_all_variables()
# run it
with tf.Session() as sess:
    summary_writer = tf.train.SummaryWriter(summ_dir, graph_def=sess.graph_def)
    print 'log at:', summ_dir
    sess.run(init_op)
    sess.run(do_enq)
    print "rsq.size:", rsq.size().eval()
    for i in xrange(5):
        print "\ni:", i
        img_i, lab_i = sess.run([image, label])  # read image - right?
        print "I:", img_i.shape, " L:", lab_i
        feed_dict = {
            im_t: img_i
        }
        img2 = sess.run([some_op], feed_dict=feed_dict)
        # now summary part
        imss = sess.run(ims_op, feed_dict=feed_dict)
        #print "imss", imss
        summary_writer.add_summary(imss, i)
    print "rsq.size:", rsq.size().eval()
    summary_writer.close()
print 'ok'
Here is the output:
log at: /mnt/code/test_00/log/2016-01-09 17:10:37
rsq.size: 1225
i: 0
I: (1, 30, 30, 3) L: [ 1. 0.]
i: 1
I: (1, 30, 30, 3) L: [ 1. 0.]
i: 2
I: (1, 30, 30, 3) L: [ 0. 1.]
i: 3
I: (1, 30, 30, 3) L: [ 0. 1.]
i: 4
I: (1, 30, 30, 3) L: [ 0. 1.]
rsq.size: 1220
ok
Looks ok:
5 [image label] pairs were delivered
in case I uncomment print "imss", imss I can see 5 different buffers, each with its own png image
the op graph looks ok in TB
However there is only one image in TB. I suspect I have missed something important about how TF works, i.e. what causes what at graph execution time.
Second question: what do I need to do to see the result, i.e. img2 = img + img, in TB?
You are right that you will only see one image. You are calling the image summary op once in each iteration of the for loop, and each time you call it, you are passing it a single image.
What you could do to see all the images you want is to compile them into a single tensor. If we refer to the TensorFlow API (the link always changes, so find the latest one):
tf.image_summary(tag, tensor, max_images=3, collections=None, name=None)
As of TF 1.0.0, it's this:
tf.summary.image(name, tensor, max_outputs=3, collections=None)
Put your "multiple image tensor" in, set max_images to the number of images you have, and you should be able to see all the images in TensorBoard.
Let me know if there are still problems.
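For instance, a minimal sketch of the "single tensor" idea against the old API used in the question (img_list is hypothetical, collecting the five (1,30,30,3) arrays read in the loop, and numpy is assumed to be imported as np):
imgs = np.concatenate(img_list, axis=0)  # shape (5,30,30,3)
ims_op = tf.image_summary("img", im_t, max_images=5)
imss = sess.run(ims_op, feed_dict={im_t: imgs})  # im_t is the [None,30,30,3] placeholder
summary_writer.add_summary(imss, 0)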
As of r0.12, tf.image_summary has been replaced with tf.summary.image:
tf.summary.image(name, tensor, max_outputs=3, collections=None)