In TensorFlow, is it possible to append summaries to an already-merged summary_op?

Let's say some built-in function returns train_op and summary_op, where summary_op is defined by tf.summary.merge(summaries, name='summary_op'), and I cannot modify that function.
Let's also say I am going to use the built-in slim.learning.train, which takes train_op and summary_op as input arguments.
# -- typical
train_op, summary_op = model_fn(image)
slim.learning.train(train_op, summary_op=summary_op)
# -- my question
train_op, summary_op = model_fn(image)
some_other_summary_list = some_another_function()
summary_op_ = ... # is it possible to append some_other_summary_list to summary_op?
slim.learning.train(train_op, summary_op=summary_op_)
How can I combine the summaries in the already-merged summary_op with the newly collected summaries in some_other_summary_list?
-- If I call tf.summary.merge_all(tf.GraphKeys.SUMMARIES), there will be too many summaries, since model_fn() collects only the useful and necessary ones.
-- I can think of defining a separate summary_op2 and a custom train_step_fn, as in:
from tensorflow.contrib.slim.python.slim.learning import train_step
def train_step_fn(...):
    ... = train_step(...)
    if iteration % 100 == 0:
        summaries = session.run(summary_op2)
        summary_writer.add_summary(summaries, iteration)
slim.learning.train(train_op, summary_op=summary_op, train_step_fn=train_step_fn)
However, this seems like overkill if I could simply append the new summaries to summary_op. Is that possible?

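Both summary_op and any op built from some_other_summary_list are serialized-summary string tensors, and tf.summary.merge accepts such tensors even when they are themselves the output of tf.summary.merge. So you can simply merge again: tf.summary.merge([summary_op] + some_other_summary_list) if some_other_summary_list is a list of summary ops, or tf.summary.merge([summary_op, other_summary_op]) if the new summaries are already merged. This code demonstrates it: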
import tensorflow as tf
a = tf.summary.scalar('a', tf.constant(0))
b = tf.summary.scalar('b', tf.constant(1))
c = tf.summary.scalar('c', tf.constant(2))
d = tf.summary.scalar('d', tf.constant(3))
ab = tf.summary.merge([a, b])
cd = tf.summary.merge([c, d])
abcd = tf.summary.merge([ab, cd])
with tf.Session() as sess:
    writer = tf.summary.FileWriter('.', sess.graph)
    summary = sess.run(abcd)
    writer.add_summary(summary)

Related

Why am I getting shape errors when trying to pass a batch from the Tensorflow Dataset API to my session operations?

I am running into an issue while converting to the Dataset API, and I don't have enough experience with the API yet to know how to handle the situation below. We currently perform image augmentation using queueing and batching. I was tasked with evaluating the new Dataset API and converting our existing implementation to use it instead of queues.
What we would like to do is get a reference to all the paths and handle all operations from just that reference. As you can see in the dataset initialization, I have mapped parse_fn over the dataset, which reads each file and extracts the initial values from the filenames. However, when I fetch the next batch from the iterator and pass those values to get_summary, I get a shape error. I have tried a number of things, each of which just changes the error, so I thought I should ask whether anyone on SO can see that I am going about this all wrong and should take a different route. Does anything jump out as absolutely wrong in my use of the Dataset API?
Should I no longer be calling the ops this way? In most of the examples I saw, they get the batch, pass the tensors to the op, capture the result in a variable and pass that to sess.run. However, I haven't yet found an easy way of doing that with our setup without errors, so this is the approach I took instead (but it's still erroring). I'll continue to trace the problem and post here should I find anything, but if anyone sees something, please advise. Thanks!
Current Error:
... in get_summary
summary, acc = sess.run([self._summary_op, self._accuracy], feed_dict=feed_dict)
ValueError: Cannot feed value of shape (32,) for Tensor 'ph_input_labels:0', which has shape '(?, 1)'
Below is the block where the get_summary method is called and error is fired:
def perform_train():
    if __name__ == '__main__':
        #Get all our image paths
        filenames = data_layer_train.get_image_paths()
        next_batch, iterator = preproc_image_fn(filenames=filenames)
        with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
            with sess.graph.as_default():
                # Set the random seed for tensorflow
                tf.set_random_seed(cfg.RNG_SEED)
                classifier_network = c_common.create_model(len(products_to_class_dict), is_training=True)
                optimizer, global_step_var = c_common.create_optimizer(classifier_network)
                sess.run(tf.local_variables_initializer())
                sess.run(tf.global_variables_initializer())
                # Init tables and dataset iterator
                sess.run(tf.tables_initializer())
                sess.run(iterator.initializer)
                cur_epoch = 0
                blobs = None
                try:
                    epoch_size = data_layer_train.get_steps_per_epoch()
                    num_steps = num_epochs * epoch_size
                    for step in range(num_steps):
                        timer_summary.tic()
                        if blobs is None:
                            #Now populate from our training dataset
                            blobs = sess.run(next_batch)
                        # *************** Below is where it is erroring *****************
                        summary_train, acc = classifier_network.get_summary(sess, blobs["images"], blobs["labels"], blobs["weights"])
                        ...
I believe the error is in preproc_image_fn:
def preproc_image_fn(filenames, images=None, labels=None, image_paths=None, cells=None, weights=None):
    def _parse_fn(filename, label, weight):
        augment_instance = False
        paths=[]
        selected_cells=[]
        if vals.FIRST_ITER:
            #Perform our check of the path to see if _data_augmentation is within it
            #If so set augment_instance to true and replace the substring with an empty string
            new_filename = tf.regex_replace(filename, "_data_augmentation", "")
            contains = tf.equal(tf.size(tf.string_split([filename], "")), tf.size(tf.string_split([new_filename])))
            filename = new_filename
            if contains is True:
                augment_instance = True
            core_file = tf.string_split([filename], '\\').values[-1]
            product_id = tf.string_split([core_file], ".").values[0]
            label = search_tf_table_for_entry(product_id)
            weight = data_layer_train.get_weights(product_id)
        image_string = tf.read_file(filename)
        img = tf.image.decode_image(image_string, channels=data_layer_train._channels)
        img.set_shape([None, None, None])
        img = tf.image.resize_images(img, [data_layer_train._target_height, data_layer_train._target_width])
        #Previously I was returning the below, but I was getting an error from the op when assigning feed_dict stating that it didnt like the dictionary
        #retval = dict(zip([filename], [img])), label, weight
        retval = img, label, weight
        return retval
    num_files = len(filenames)
    filenames = tf.constant(filenames)
    #*********** Setup dataset below ************
    dataset = tf.data.Dataset.from_tensor_slices((filenames, labels, weights))
    dataset = dataset.map(_parse_fn)
    dataset = dataset.repeat()
    dataset = dataset.batch(32)
    iterator = dataset.make_initializable_iterator()
    batch_features, batch_labels, batch_weights = iterator.get_next()
    return {'images': batch_features, 'labels': batch_labels, 'weights': batch_weights}, iterator
def search_tf_table_for_entry(self, product_id):
    '''Looks up keys in the table and outputs the values. Will return -1 if not found '''
    if product_id is not None:
        return self._products_to_class_table.lookup(product_id)
    else:
        if not self._real_eval:
            logger().info("class not found in training {} ".format(product_id))
        return -1
Where I create the model and have the placeholders used previously:
...
def create_model(self):
    weights_regularizer = tf.contrib.layers.l2_regularizer(cfg.TRAIN.WEIGHT_DECAY)
    biases_regularizer = weights_regularizer
    # Input data.
    self._input_images = tf.placeholder(
        tf.float32, shape=(None, self._image_height, self._image_width, self._num_channels), name="ph_input_images")
    self._input_labels = tf.placeholder(tf.int64, shape=(None, 1), name="ph_input_labels")
    self._input_weights = tf.placeholder(tf.float32, shape=(None, 1), name="ph_input_weights")
    self._is_training = tf.placeholder(tf.bool, name='ph_is_training')
    self._keep_prob = tf.placeholder(tf.float32, name="ph_keep_prob")
    self._accuracy = tf.reduce_mean(tf.cast(self._correct_prediction, tf.float32))
    ...
    self.create_summaries()
def create_summaries(self):
    val_summaries = []
    with tf.device("/cpu:0"):
        for var in self._act_summaries:
            self._add_act_summary(var)
        for var in self._train_summaries:
            self._add_train_summary(var)
    self._summary_op = tf.summary.merge_all()
    self._summary_op_val = tf.summary.merge(val_summaries)
def get_summary(self, sess, images, labels, weights):
    feed_dict = {self._input_images: images, self._input_labels: labels,
                 self._input_weights: weights, self._is_training: False}
    summary, acc = sess.run([self._summary_op, self._accuracy], feed_dict=feed_dict)
    return summary, acc
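Since the error says:
Cannot feed value of shape (32,) for Tensor 'ph_input_labels:0', which has shape '(?, 1)'
my guess is that the labels passed to get_summary have shape (32,), while the ph_input_labels placeholder expects shape (?, 1). Can you just reshape them to (32, 1) before feeding, for example with labels.reshape(-1, 1)? Or add the extra dimension earlier, in _parse_fn, for example with tf.expand_dims(label, -1)? The weights will likely need the same treatment, since ph_input_weights is also declared with shape (None, 1).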
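The reshape suggested here can be tried in isolation with NumPy, since sess.run(next_batch) hands back NumPy arrays. The snippet below is only a minimal sketch: the labels array is made up to mimic a batch of 32 labels, and it does not touch the asker's actual pipeline.

```python
import numpy as np

# Hypothetical batch of 32 integer labels, shaped (32,) like the batch in the error message.
labels = np.arange(32, dtype=np.int64)

# Add a trailing axis so the array matches a placeholder declared with shape (None, 1).
labels_2d = labels.reshape(-1, 1)

# np.expand_dims is an equivalent way to add the trailing axis.
labels_2d_alt = np.expand_dims(labels, axis=-1)

print(labels.shape, labels_2d.shape)
```

Using -1 for the first dimension keeps the reshape valid for a final batch that holds fewer than 32 elements.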

How to properly update variables in a while loop in TensorFlow?

Can someone please explain (or point me to the relevant place in the documentation that I've missed) how to properly update a tf.Variable() in a tf.while_loop? I am trying to update variables in the loop that will store some information until the next iteration of the loop using the assign() method. However, this isn't doing anything.
As the values of mu_tf and sigma_tf are being updated by the minimizer, while step_mu isn't, I am obviously doing something wrong, but I don't understand what it is. Specifically, I guess I should say that I know assign() does not do anything until it is executed when the graph is run, so I know that I can do
sess.run(step_mu.assign(mu_tf))
and that will update step_mu, but I want to do this in the loop correctly. I don't understand how to add an assign operation to the body of the loop.
A simplified working example of what I'm doing follows here:
import numpy as np
import tensorflow as tf
mu_true = 0.5
sigma_true = 1.5
n_events = 100000
# Placeholders
X = tf.placeholder(dtype=tf.float32)
# Variables
mu_tf = tf.Variable(initial_value=tf.random_normal(shape=[], mean=0., stddev=0.1,
                                                   dtype=tf.float32),
                    dtype=tf.float32)
sigma_tf = tf.Variable(initial_value=tf.abs(tf.random_normal(shape=[], mean=1., stddev=0.1,
                                                             dtype=tf.float32)),
                       dtype=tf.float32,
                       constraint=lambda x: tf.abs(x))
step_mu = tf.Variable(initial_value=-99999., dtype=tf.float32)
step_loss = tf.Variable(initial_value=-99999., dtype=tf.float32)
# loss function
gaussian_dist = tf.distributions.Normal(loc=mu_tf, scale=sigma_tf)
log_prob = gaussian_dist.log_prob(value=X)
negative_log_likelihood = -1.0 * tf.reduce_sum(log_prob)
# optimizer
optimizer = tf.train.AdamOptimizer(learning_rate=0.1)
# sample data
x_sample = np.random.normal(loc=mu_true, scale=sigma_true, size=n_events)
# Construct the while loop.
def cond(step):
    return tf.less(step, 10)
def body(step):
    # gradient step
    train_op = optimizer.minimize(loss=negative_log_likelihood)
    # update step parameters
    with tf.control_dependencies([train_op]):
        step_mu.assign(mu_tf)
    return tf.add(step,1)
loop = tf.while_loop(cond, body, [tf.constant(0)])
# Execute the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    step_loss = sess.run(fetches=negative_log_likelihood, feed_dict={X: x_sample})
    print('Before loop:\n')
    print('mu_tf: {}'.format(sess.run(mu_tf)))
    print('sigma_tf: {}'.format(sess.run(sigma_tf)))
    print('step_mu: {}'.format(sess.run(step_mu)))
    print('step_loss: {}\n'.format(step_loss))
    sess.run(fetches=loop, feed_dict={X: x_sample})
    print('After loop:\n')
    print('mu_tf: {}'.format(sess.run(mu_tf)))
    print('sigma_tf: {}'.format(sess.run(sigma_tf)))
    print('step_mu: {}'.format(sess.run(step_mu)))
    print('step_loss: {}'.format(step_loss))

How to only restore variables in the checkpoint in Tensorflow?

In TensorFlow, my model is based on a pre-trained model; I added a few more variables and removed some that were in the pre-trained model. When I restore the variables from the checkpoint file, I have to explicitly specify all the variables I added to the graph so that they are excluded. For example, I did
exclude = # explicitly list all variables to exclude
variables_to_restore = slim.get_variables_to_restore(exclude=exclude)
saver = tf.train.Saver(variables_to_restore)
Is there a simpler way to do this? Namely, as long as a variable is not in checkpoint, then don't try to restore.
First find the variables that exist both in your graph and in the checkpoint, then restore only that intersection rather than every variable in the graph. Note that tf.train.list_variables returns (name, shape) pairs, so the comparison has to be made on variable names:
ckpt_var_names = set(name for name, _ in tf.train.list_variables(checkpoint_dir))
variables_can_be_restored = [v for v in tf.global_variables() if v.op.name in ckpt_var_names]
Then define a saver over just those variables and restore:
temp_saver = tf.train.Saver(variables_can_be_restored)
ckpt_state = tf.train.get_checkpoint_state(checkpoint_dir, latest_filename)
print('Loading checkpoint %s' % ckpt_state.model_checkpoint_path)
temp_saver.restore(sess, ckpt_state.model_checkpoint_path)
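The name matching behind this kind of partial restore can be sketched without TensorFlow. The example below is a minimal pure-Python sketch, assuming the checkpoint contents are available as (name, shape) pairs, which is the form tf.train.list_variables returns, and that graph variables are identified by names carrying an output suffix such as ":0". The function split_by_checkpoint and all variable names in it are hypothetical.

```python
def split_by_checkpoint(graph_var_names, checkpoint_entries):
    """Partition graph variable names into restorable and missing ones.

    graph_var_names: names as a graph reports them, e.g. "conv1/weights:0".
    checkpoint_entries: (name, shape) pairs listing what the checkpoint holds.
    """
    ckpt_names = {name for name, _shape in checkpoint_entries}
    restorable, missing = [], []
    for full_name in graph_var_names:
        # Checkpoint keys carry no ":0" output suffix, so strip it before comparing.
        base_name = full_name.split(":")[0]
        (restorable if base_name in ckpt_names else missing).append(full_name)
    return restorable, missing

# Hypothetical names, for illustration only.
graph_vars = ["conv1/weights:0", "conv1/biases:0", "new_head/weights:0"]
ckpt = [("conv1/weights", [3, 3, 3, 64]), ("conv1/biases", [64]), ("old_head/weights", [64, 10])]
restorable, missing = split_by_checkpoint(graph_vars, ckpt)
print(restorable)
print(missing)
```

The restorable group is what would go to the restoring saver, while the missing group still has to be initialized separately.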
One option is to first build the same model as in the checkpoint, and then restore the checkpoint values into that model. After restoring the variables, you can add new layers, delete existing layers or change the weights of the layers.
There is one important point to be careful about, though. After adding new layers you need to initialize them. If you use tf.global_variables_initializer(), you will lose the values of the reloaded layers. So you should initialize only the uninitialized variables; you can use the following function for this.
def initialize_uninitialized(sess):
    global_vars = tf.global_variables()
    # One boolean per variable: True if it already holds a value
    is_initialized = sess.run([tf.is_variable_initialized(var) for var in global_vars])
    not_initialized_vars = [v for (v, f) in zip(global_vars, is_initialized) if not f]
    # for i in not_initialized_vars: # only for testing
    #     print(i.name)
    if len(not_initialized_vars):
        sess.run(tf.variables_initializer(not_initialized_vars))
This is a more complete answer that works in a non-distributed setting:
import tensorflow as tf
from tensorflow.contrib.framework.python.framework import checkpoint_utils
slim = tf.contrib.slim
def scan_checkpoint_for_vars(checkpoint_path, vars_to_check):
    # Names of all variables stored in the checkpoint
    check_var_list = checkpoint_utils.list_variables(checkpoint_path)
    check_var_list = [x[0] for x in check_var_list]
    check_var_set = set(check_var_list)
    # Compare on the variable name without its ":0" suffix
    vars_in_checkpoint = [x for x in vars_to_check if x.name[:x.name.index(":")] in check_var_set]
    vars_not_in_checkpoint = [x for x in vars_to_check if x.name[:x.name.index(":")] not in check_var_set]
    return vars_in_checkpoint, vars_not_in_checkpoint
def create_easy_going_scaffold(vars_in_checkpoint, vars_not_in_checkpoint):
    model_ready_for_local_init_op = tf.report_uninitialized_variables(var_list = vars_in_checkpoint)
    model_init_vars_not_in_checkpoint = tf.variables_initializer(vars_not_in_checkpoint)
    restoration_saver = tf.train.Saver(vars_in_checkpoint)
    eg_scaffold = tf.train.Scaffold(saver=restoration_saver,
                                    ready_for_local_init_op = model_ready_for_local_init_op,
                                    local_init_op = model_init_vars_not_in_checkpoint)
    return eg_scaffold
all_vars = slim.get_variables()
ckpoint_file = tf.train.latest_checkpoint(output_chkpt_dir)
vars_in_checkpoint, vars_not_in_checkpoint = scan_checkpoint_for_vars(ckpoint_file, all_vars)
is_checkpoint_complete = len(vars_not_in_checkpoint) == 0
# Create session that can handle current checkpoint
if (is_checkpoint_complete):
    # Checkpoint is full - all variables can be found there
    print('Using normal session')
    sess = tf.train.MonitoredTrainingSession(checkpoint_dir = output_chkpt_dir,
                                             save_checkpoint_secs = save_checkpoint_secs,
                                             save_summaries_secs = save_summaries_secs)
else:
    # Checkpoint is partial - some variables need to be initialized
    print('Using easy going session')
    eg_scaffold = create_easy_going_scaffold(vars_in_checkpoint, vars_not_in_checkpoint)
    # Save all variables to next checkpoint
    saver = tf.train.Saver()
    hooks = [tf.train.CheckpointSaverHook(checkpoint_dir = output_chkpt_dir,
                                          save_secs = save_checkpoint_secs,
                                          saver = saver)]
    # Such session is a little slower during the first iteration
    sess = tf.train.MonitoredTrainingSession(checkpoint_dir = output_chkpt_dir,
                                             scaffold = eg_scaffold,
                                             hooks = hooks,
                                             save_summaries_secs = save_summaries_secs,
                                             save_checkpoint_secs = None)
with sess:
    .....

tensorflow serving uninitialized

Hello, I want to initialize the variable named result in the code below.
I tried to initialize it with this code when serving:
sess.run(tf.global_variables_initializer(),feed_dict=
{userLat:0,userLon:0})
I just want to initialize the variable.
The reason for using a variable is so that I can set validate_shape=False.
The reason for using this option is to resolve the error 'Outer dimension for outputs must be unknown, outer dimension of 'Variable:0' is 1' when deploying the model version to Google Cloud ML Engine.
However, if I initialize with the following code, every prediction outputs the value computed with the feed_dict values of 0:
sess.run(tf.global_variables_initializer(),feed_dict=
{userLat:0,userLon:0})
Is there a way to simply initialize the value of result?
Or is it possible to store the list of tensor values as a comma-separated string, without a shape?
It's a very basic question, I'm sorry. I am a beginner with TensorFlow.
I need help. Thank you for reading.
import tensorflow as tf
import sys,os
#define filename queue
filenameQueue =tf.train.string_input_producer(['./data.csv'],
shuffle=False,name='filename_queue')
# define reader
reader = tf.TextLineReader()
key,value = reader.read(filenameQueue)
#define decoder
recordDefaults = [ ["null"],[0.0],[0.0]]
sId,lat, lng = tf.decode_csv(
value, record_defaults=recordDefaults,field_delim=',')
taxiData=[]
with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    for i in range(18):
        data=sess.run([sId, lat, lng])
        tmpTaxiData=[]
        tmpTaxiData.append(data[0])
        tmpTaxiData.append(data[1])
        tmpTaxiData.append(data[2])
        taxiData.append(tmpTaxiData)
    coord.request_stop()
    coord.join(threads)
from math import sin, cos,acos, sqrt, atan2, radians
#server input data
userLat = tf.placeholder(tf.float32, shape=[])
userLon = tf.placeholder(tf.float32, shape=[])
R = 6373.0
radian=0.017453292519943295
distanceList=[]
for i in taxiData:
    taxiId=tf.constant(i[0],dtype=tf.string,shape=[])
    taxiLat=tf.constant(i[1],dtype=tf.float32,shape=[])
    taxiLon=tf.constant(i[2],dtype=tf.float32,shape=[])
    distanceValue=6371*tf.acos(tf.cos(radian*userLat)*
        tf.cos(radian*taxiLat)*tf.cos(radian*taxiLon-
        radian*126.8943311)+tf.sin(radian*37.4685225)*tf.sin(radian*taxiLat))
    tmpDistance=[]
    tmpDistance.append(taxiId)
    tmpDistance.append(distanceValue)
    distanceList.append(tmpDistance)
# result sort
sId,distances=zip(*distanceList)
indices = tf.nn.top_k(distances, k=len(distances)).indices
gather=tf.gather(sId, indices[::-1])[0:5]
result=tf.Variable(gather,validate_shape=False)
print("Done training!")
# serving
import os
from tensorflow.python.util import compat
model_version = 1
path = os.path.join("Taximodel", str(model_version))
builder = tf.saved_model.builder.SavedModelBuilder(path)
with tf.Session() as sess:
    builder.add_meta_graph_and_variables(
        sess,
        [tf.saved_model.tag_constants.SERVING],
        signature_def_map= {
            "serving_default":
                tf.saved_model.signature_def_utils.predict_signature_def(
                    inputs= {"userLat": userLat, "userLon":userLon},
                    outputs= {"result": result})
        })
    builder.save()
print('Done exporting')
You can try to define the graph so that the output tensor preserves the shape (outer dimension) of the input tensor.
For example, something like:
#server input data
userLoc = tf.placeholder(tf.float32, shape=[None, 2])
def calculate_dist(user_loc):
    distanceList = []
    for i in taxiData:
        taxiId=tf.constant(i[0],dtype=tf.string,shape=[])
        taxiLat=tf.constant(i[1],dtype=tf.float32,shape=[])
        taxiLon=tf.constant(i[2],dtype=tf.float32,shape=[])
        distanceValue=6371*tf.acos(tf.cos(radian*user_loc[0])*
            tf.cos(radian*taxiLat)*tf.cos(radian*taxiLon-
            radian*126.8943311)+tf.sin(radian*37.4685225)*tf.sin(radian*taxiLat))
        tmpDistance=[]
        tmpDistance.append(taxiId)
        tmpDistance.append(distanceValue)
        distanceList.append(tmpDistance)
    # result sort
    sId,distances=zip(*distanceList)
    indices = tf.nn.top_k(distances, k=len(distances)).indices
    return tf.gather(sId, indices[::-1])[0:5]
# calculate_dist returns strings while userLoc is float32, so map_fn needs the output dtype spelled out
result = tf.map_fn(calculate_dist, userLoc, dtype=tf.string)
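The idea of preserving the outer dimension can be illustrated outside TensorFlow. The following is a minimal NumPy sketch, not the code from the post: it assumes user locations arrive as an (N, 2) array of (lat, lon) in degrees and returns an (N, k) array of taxi ids, nearest first. It applies the standard spherical law of cosines to both user coordinates rather than the partly hard-coded formula in the question. The function nearest_taxis and the sample data are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def nearest_taxis(user_locs, taxi_ids, taxi_locs, k=5):
    """Return the ids of the k nearest taxis for every user in the batch.

    user_locs: array of shape (N, 2) holding (lat, lon) in degrees.
    taxi_locs: array of shape (M, 2) holding (lat, lon) in degrees.
    The output has shape (N, k), so the outer dimension N is preserved.
    """
    u = np.radians(np.asarray(user_locs, dtype=np.float64))[:, None, :]  # (N, 1, 2)
    t = np.radians(np.asarray(taxi_locs, dtype=np.float64))[None, :, :]  # (1, M, 2)
    # Spherical law of cosines; clip guards against rounding just outside [-1, 1].
    cos_angle = (np.sin(u[..., 0]) * np.sin(t[..., 0])
                 + np.cos(u[..., 0]) * np.cos(t[..., 0]) * np.cos(t[..., 1] - u[..., 1]))
    dist = EARTH_RADIUS_KM * np.arccos(np.clip(cos_angle, -1.0, 1.0))  # (N, M)
    order = np.argsort(dist, axis=1)[:, :k]  # nearest first
    return np.asarray(taxi_ids)[order]

# Hypothetical data: six taxis spaced along one meridian, two users.
taxi_ids = ["t0", "t1", "t2", "t3", "t4", "t5"]
taxi_locs = [[37.0, 126.0], [37.1, 126.0], [37.2, 126.0],
             [37.3, 126.0], [37.4, 126.0], [37.5, 126.0]]
users = [[37.0, 126.0], [37.5, 126.0]]
result = nearest_taxis(users, taxi_ids, taxi_locs)
print(result.shape)
```

Because every user row is handled by the same broadcasted computation, a batch of N requests yields N rows of results, which is the property the serving error message asks for.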

consistent forward / backward pass with tensorflow dropout

In reinforcement learning, one usually runs a forward pass of the neural network at each step of the episode in order to calculate the policy. Afterwards, one can calculate the parameter gradients using backpropagation. A simplified implementation of my network looks like this:
class AC_Network(object):
    def __init__(self, s_size, a_size, scope, trainer, parameters_net):
        with tf.variable_scope(scope):
            self.is_training = tf.placeholder(shape=[], dtype=tf.bool)
            self.inputs = tf.placeholder(shape=[None, s_size], dtype=tf.float32)
            # (...)
            layer = slim.fully_connected(self.inputs,
                                         layer_size,
                                         activation_fn=tf.nn.relu,
                                         biases_initializer=None)
            layer = tf.contrib.layers.dropout(inputs=layer, keep_prob=parameters_net["dropout_keep_prob"],
                                              is_training=self.is_training)
            self.policy = slim.fully_connected(layer, a_size,
                                               activation_fn=tf.nn.softmax,
                                               biases_initializer=None)
            self.actions = tf.placeholder(shape=[None], dtype=tf.int32)
            self.advantages = tf.placeholder(shape=[None], dtype=tf.float32)
            actions_onehot = tf.one_hot(self.actions, a_size, dtype=tf.float32)
            responsible_outputs = tf.reduce_sum(self.policy * actions_onehot, [1])
            self.policy_loss = - policy_loss_multiplier * tf.reduce_mean(tf.log(responsible_outputs) * self.advantages)
            local_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope)
            self.gradients = tf.gradients(self.policy_loss, local_vars)
Now during training I first roll out the episode with consecutive forward passes (again, a simplified version):
s = self.local_env.reset() # list of input variables for the first step
while done == False:
    a_dist = sess.run([self.policy],
                      feed_dict = {self.local_AC.inputs: [s],
                                   self.is_training: True})
    a = np.argmax(a_dist)
    s, r, done, extra_stat = self.local_env.step(a)
    # (...)
and at the end I calculate the gradients with a backward pass:
p_l, grad = sess.run([self.policy_loss,
                      self.gradients],
                     feed_dict={self.inputs: np.vstack(comb_observations),
                                self.is_training: True,
                                self.actions: np.hstack(comb_actions),})
(please note that I could have made a mistake somewhere above trying to remove as much as possible of the original code irrelevant to the issue in question)
So finally the question: Is there a way of ensuring that all the consecutive calls to the sess.run() will generate the same dropout structure? Ideally I would like to have exactly the same dropout structure within each episode and only change it between episodes. Things seem to work well as they are but I continue to wonder.
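What "the same dropout structure within each episode" would mean can be sketched in NumPy. This is only an illustration of the idea, not TensorFlow's dropout implementation and not a confirmed answer: it assumes the mask is sampled once per episode outside the graph and then reused for every forward pass and for the final batched pass. In a TF1 graph this would presumably mean feeding the mask through a placeholder and multiplying, which is an assumption. The helper names below are hypothetical.

```python
import numpy as np

def sample_dropout_mask(layer_size, keep_prob, rng):
    """Sample one inverted-dropout mask: kept units are scaled by 1 / keep_prob."""
    kept = rng.random_sample(layer_size) < keep_prob
    return kept.astype(np.float32) / keep_prob

def forward(hidden, mask):
    """Apply a given mask; hidden has shape (batch, layer_size)."""
    return hidden * mask

rng = np.random.RandomState(0)
# Sampled once at the start of an episode, then held fixed until the episode ends.
mask = sample_dropout_mask(layer_size=8, keep_prob=0.5, rng=rng)

# Every step of the rollout and the final batched pass reuse the same mask.
step_outputs = [forward(np.ones((1, 8), dtype=np.float32), mask) for _ in range(3)]
batch_output = forward(np.ones((3, 8), dtype=np.float32), mask)
```

Because the mask broadcasts over the batch axis, the batched pass at the end of the episode sees exactly the units that were active during the step-by-step rollout.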