Tensorflow Dataset issue at inference phase - tensorflow

I created a char-level language generation model with TensorFlow here. I used the tf.placeholder API, which, according to the Google docs:
Feeding is the least efficient way to feed data into a TensorFlow program.
I decided to change my code and replace it with the new TensorFlow Dataset API.
I used from_generator to create the Dataset:
dataset = tf.data.Dataset.from_generator(gen, (tf.int32, tf.int32),
                                         (tf.TensorShape([None, None]),
                                          tf.TensorShape([None, None])))
self.iterator = dataset.make_initializable_iterator()
self.inp, self.target = self.iterator.get_next()
As can be seen in the code above, I used [None, None] for the TensorShape to make the model more general. During training everything is fine, but at inference a problem arises. With the tf.placeholder API I used the following code to generate characters:
def inference(self):
    converter = utils.TextReader(filename=FLAGS.CONVERTER_PATH)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        samples = []
        new_state = sess.run(self.init_state)
        c = 12  # random starting token
        samples.append(c)
        for i in range(1000):
            x = np.zeros((1, 1))
            x[0, 0] = c
            feed_dict = {
                self.inp: x,
                self.init_state: new_state
            }
            preds, new_state = sess.run([self.prediction, self.final_state], feed_dict=feed_dict)
            c = utils.pick_top_n(preds, converter.vocab_size)
            samples.append(c)
        samples = np.array(samples)
        print(converter.arr_to_text(samples))
With the Dataset API, I don't have a tf.placeholder to feed my previous character. When I use the above code, the following error occurs, as expected:
InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,50] vs. shape[1] = [32,50]
At inference, the model uses the same input shape ([32,50]) that I used for training, which is not what I want (I defined TensorShape([None, None]) to handle this, but it does not work).
How can I fix this issue with the new Dataset API?
Complete code.

Related

How do you fit a tf.Dataset to a Keras Autoencoder Model when the Dataset has been generated using TFX?

Problem
As the title suggests I have been trying to create a pipeline for training an Autoencoder model using TFX. The problem I'm having is fitting the tf.Dataset returned by the DataAccessor.tf_dataset_factory object to the Autoencoder.
Below I summarise the steps I've taken through this project, and have some Questions at the bottom if you wish to skip the background information.
Intro
TFX Pipeline
The TFX components I have used so far have been:
CsvExampleGenerator (the dataset has 82 columns, all numeric, and the sample csv has 739 rows)
StatisticsGenerator / SchemaGenerator (the schema has been edited and is now loaded in using an Importer)
Transform
Trainer (this is the component I am currently having problems with)
Model
The model that I am attempting to train is based off of the example laid out here https://www.tensorflow.org/tutorials/generative/autoencoder. However, my model is being trained on tabular data, searching for anomalous results, as opposed to image data.
As I have tried a couple of solutions, I have defined the model both by subclassing Keras.Model and by using Keras.layers directly, and I outline both below:
Subclassing Keras.Model
class Autoencoder(keras.models.Model):
    def __init__(self, features):
        super(Autoencoder, self).__init__()
        self.encoder = tf.keras.Sequential([
            keras.layers.Dense(82, activation = 'relu'),
            keras.layers.Dense(32, activation = 'relu'),
            keras.layers.Dense(16, activation = 'relu'),
            keras.layers.Dense(8, activation = 'relu')
        ])
        self.decoder = tf.keras.Sequential([
            keras.layers.Dense(16, activation = 'relu'),
            keras.layers.Dense(32, activation = 'relu'),
            keras.layers.Dense(len(features), activation = 'sigmoid')
        ])
    def call(self, x):
        inputs = [keras.layers.Input(shape = (1,), name = f) for f in features]
        dense = keras.layers.concatenate(inputs)
        encoded = self.encoder(dense)
        decoded = self.decoder(encoded)
        return decoded
Using Keras.Layers (functional API)
def _build_keras_model(features: List[str]) -> tf.keras.Model:
    inputs = [keras.layers.Input(shape = (1,), name = f) for f in features]
    dense = keras.layers.concatenate(inputs)
    dense = keras.layers.Dense(32, activation = 'relu')(dense)
    dense = keras.layers.Dense(16, activation = 'relu')(dense)
    dense = keras.layers.Dense(8, activation = 'relu')(dense)
    dense = keras.layers.Dense(16, activation = 'relu')(dense)
    dense = keras.layers.Dense(32, activation = 'relu')(dense)
    outputs = keras.layers.Dense(len(features), activation = 'sigmoid')(dense)
    model = keras.Model(inputs = inputs, outputs = outputs)
    model.compile(
        optimizer = 'adam',
        loss = 'mae'
    )
    return model
TFX Trainer Component
For creating the Trainer Component I have been mainly following the implementation details laid out here: https://www.tensorflow.org/tfx/guide/trainer
As well as following the default penguins example: https://www.tensorflow.org/tfx/tutorials/tfx/penguin_simple#write_model_training_code
run_fn definition
def run_fn(fn_args: tfx.components.FnArgs) -> None:
    tft_output = tft.TFTransformOutput(fn_args.transform_output)
    train_dataset = _input_fn(
        file_pattern = fn_args.train_files,
        data_accessor = fn_args.data_accessor,
        tf_transform_output = tft_output,
        batch_size = fn_args.train_steps
    )
    eval_dataset = _input_fn(
        file_pattern = fn_args.eval_files,
        data_accessor = fn_args.data_accessor,
        tf_transform_output = tft_output,
        batch_size = fn_args.custom_config['eval_batch_size']
    )
    # model = Autoencoder(
    #     features = fn_args.custom_config['features']
    # )
    model = _build_keras_model(features = fn_args.custom_config['features'])
    model.compile(optimizer = 'adam', loss = 'mse')
    model.fit(
        train_dataset,
        steps_per_epoch = fn_args.train_steps,
        validation_data = eval_dataset,
        validation_steps = fn_args.eval_steps
    )
    ...
_input_fn definition
def _apply_preprocessing(raw_features, tft_layer):
    transformed_features = tft_layer(raw_features)
    return transformed_features

def _input_fn(
        file_pattern,
        data_accessor: tfx.components.DataAccessor,
        tf_transform_output: tft.TFTransformOutput,
        batch_size: int) -> tf.data.Dataset:
    """
    Generates features and label for tuning/training.
    Args:
        file_pattern: List of paths or patterns of input tfrecord files.
        data_accessor: DataAccessor for converting input to RecordBatch.
        tf_transform_output: A TFTransformOutput.
        batch_size: representing the number of consecutive elements of returned
            dataset to combine in a single batch
    Returns:
        A dataset that contains features where features is a
        dictionary of Tensors.
    """
    dataset = data_accessor.tf_dataset_factory(
        file_pattern,
        tfxio.TensorFlowDatasetOptions(batch_size = batch_size),
        tf_transform_output.transformed_metadata.schema
    )
    transform_layer = tf_transform_output.transform_features_layer()
    def apply_transform(raw_features):
        return _apply_preprocessing(raw_features, transform_layer)
    return dataset.map(apply_transform).repeat()
This differs from the _input_fn example given above as I was following the example in the next tfx tutorial found here: https://www.tensorflow.org/tfx/tutorials/tfx/penguin_tft#run_fn
Also for reference, there is no Target within the example data so there is no label_key to be passed to the tfxio.TensorFlowDatasetOptions object.
Error
When trying to run the Trainer component using a TFX InteractiveContext object I receive the following error.
ValueError: No gradients provided for any variable: ['dense_460/kernel:0', 'dense_460/bias:0', 'dense_461/kernel:0', 'dense_461/bias:0', 'dense_462/kernel:0', 'dense_462/bias:0', 'dense_463/kernel:0', 'dense_463/bias:0', 'dense_464/kernel:0', 'dense_464/bias:0', 'dense_465/kernel:0', 'dense_465/bias:0'].
From my own attempts to solve this I believe the problem lies in the way that an Autoencoder is trained. From the Autoencoder example linked here https://www.tensorflow.org/tutorials/generative/autoencoder the data is fitted like so:
autoencoder.fit(x_train, x_train,
epochs=10,
shuffle=True,
validation_data=(x_test, x_test))
It therefore stands to reason that the tf.Dataset should mimic this behaviour. When testing with plain Tensor objects, I was able to recreate the error above and then solve it by passing the training data as the target in the .fit() call.
Things I've Tried So Far
Duplicating Train Dataset
model.fit(
    train_dataset,
    train_dataset,
    steps_per_epoch = fn_args.train_steps,
    validation_data = eval_dataset,
    validation_steps = fn_args.eval_steps
)
This raises an error because Keras does not accept a 'y' value when a dataset is passed:
ValueError: `y` argument is not supported when using dataset as input.
Returning a dataset that is a tuple with itself
def _input_fn(...
    dataset = data_accessor.tf_dataset_factory(
        file_pattern,
        tfxio.TensorFlowDatasetOptions(batch_size = batch_size),
        tf_transform_output.transformed_metadata.schema
    )
    transform_layer = tf_transform_output.transform_features_layer()
    def apply_transform(raw_features):
        return _apply_preprocessing(raw_features, transform_layer)
    dataset = dataset.map(apply_transform)
    return dataset.map(lambda x: (x, x))
This raises an error where the keys from the features dictionary don't match the output of the model.
ValueError: Found unexpected keys that do not correspond to any Model output: dict_keys(['feature_string', ...]). Expected: ['dense_477']
At this point I switched to using the keras.model Autoencoder subclass and tried to add output keys to the Model using an output which I tried to create dynamically in the same way as the inputs.
def call(self, x):
    inputs = [keras.layers.Input(shape = (1,), name = f) for f in x]
    dense = keras.layers.concatenate(inputs)
    encoded = self.encoder(dense)
    decoded = self.decoder(encoded)
    outputs = {}
    for feature_name in x:
        outputs[feature_name] = keras.layers.Dense(1, activation = 'sigmoid')(decoded)
    return outputs
This raises the following error:
TypeError: Cannot convert a symbolic Keras input/output to a numpy array. This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.
I've been looking into solving this issue but am no longer sure if the data is being passed correctly and am beginning to think I'm getting side-tracked from the actual problem.
Questions
Has anyone managed to get an Autoencoder working when connected via TFX examples?
Did you alter the tf.Dataset or handle the examples differently from the _input_fn demonstrated?
So I managed to find an answer to this and wanted to leave what I found here in case anyone else stumbles onto a similar problem.
It turns out my feelings around the error were correct and the solution did indeed lie in how the tf.Dataset object was presented.
This can be demonstrated when I ran some code which simulated the incoming data using randomly generated tensors.
tensors = [tf.random.uniform(shape = (1, 82)) for i in range(739)]
# This gives us a list of 739 tensors which hold 1 value for 82 'features' simulating the dataset I had
dataset = tf.data.Dataset.from_tensor_slices(tensors)
dataset = dataset.map(lambda x : (x, x))
# This returns a dataset which marks the training set and target as the same
# which is what the Autoencoder model is looking for
model.fit(dataset ...)
Following this, I proceeded to do the same thing with the dataset returned by the _input_fn. However, since the tfx DataAccessor object returns a features_dict, I needed to combine the tensors in that dict to create a single tensor.
This is how my _input_fn looks now:
def create_target_values(features_dict: Dict[str, tf.Tensor]) -> tuple:
    value_tensor = tf.concat(list(features_dict.values()), axis = 1)
    return (features_dict, value_tensor)

def _input_fn(
        file_pattern,
        data_accessor: tfx.components.DataAccessor,
        tf_transform_output: tft.TFTransformOutput,
        batch_size: int) -> tf.data.Dataset:
    """
    Generates features and label for tuning/training.
    Args:
        file_pattern: List of paths or patterns of input tfrecord files.
        data_accessor: DataAccessor for converting input to RecordBatch.
        tf_transform_output: A TFTransformOutput.
        batch_size: representing the number of consecutive elements of returned
            dataset to combine in a single batch
    Returns:
        A dataset that contains (features, target_tensor) tuple where features is a
        dictionary of Tensors, and target_tensor is a single Tensor that is a concatenated tensor of all the
        feature values.
    """
    dataset = data_accessor.tf_dataset_factory(
        file_pattern,
        tfxio.TensorFlowDatasetOptions(batch_size = batch_size),
        tf_transform_output.transformed_metadata.schema
    )
    dataset = dataset.map(lambda x: create_target_values(features_dict = x))
    return dataset.repeat()
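For anyone who wants to try the (features, target) mapping without a TFX pipeline, here is a minimal sketch under some assumptions: the feature names `f0`..`f2` and the random values are made up, and `tf.data.Dataset.from_tensor_slices` stands in for `data_accessor.tf_dataset_factory`. Unlike the function above, it concatenates in an explicit key order, so the target columns do not depend on the dictionary's iteration order.

```python
import tensorflow as tf

# Hypothetical feature names and random values standing in for the TFX features dict.
feature_names = ["f0", "f1", "f2"]
features = {name: tf.random.uniform((8, 1)) for name in feature_names}


def create_target_values(features_dict):
    # Concatenate in a fixed key order so the target columns are deterministic.
    target = tf.concat([features_dict[name] for name in feature_names], axis=1)
    return features_dict, target


# Batch first so every feature has shape (batch, 1), then attach the target.
dataset = (
    tf.data.Dataset.from_tensor_slices(features)
    .batch(4)
    .map(create_target_values)
)

x, y = next(iter(dataset))
print({name: tuple(t.shape) for name, t in x.items()}, tuple(y.shape))
```

The target then has one column per feature, which is the width that a final `Dense(len(features))` layer produces.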

tf.keras.backend.function for transforming embeddings inside tf.data.dataset

I am trying to use the output of a neural network to transform data inside tf.data.dataset. Specifically, I am using a Delta-Encoder to manipulate embeddings inside the tf.data pipeline. In so doing, however, I get the following error:
OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.
I have searched the dataset pipeline page and stack overflow, but I could not find something that addresses my question. In the code below I am using an Autoencoder, as it yields an identical error with more concise code.
The offending part seems to be
[[x,]] = tf.py_function(Auto_Func, [x], [tf.float32])
inside
tf_auto_transform.
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense

num_embeddings = 100
input_dims = 1000
embeddings = np.random.normal(size = (num_embeddings, input_dims)).astype(np.float32)
target = np.zeros(num_embeddings)
#creating Autoencoder
inp = Input(shape = (input_dims,), name ='input')
hidden = Dense(10, activation = 'relu', name = 'hidden')(inp)
out = Dense(input_dims, activation = 'relu', name='output')(hidden)
auto_encoder = tf.keras.models.Model(inputs =inp, outputs=out)
Auto_Func = tf.keras.backend.function(inputs = auto_encoder.get_layer(name='input').input,
                                      outputs = auto_encoder.get_layer(name='output').output)
#Autoencoder transform for dataset.map
def tf_auto_transform(x, target):
    x_shape = x.shape
    #@tf.function
    #def func(x):
    #    return tf.py_function(Auto_Func, [x], [tf.float32])
    #[[x,]] = func(x)
    [[x,]] = tf.py_function(Auto_Func, [x], [tf.float32])
    x.set_shape(x_shape)
    return x, target
def get_dataset(X,y, batch_size = 32):
    train_ds = tf.data.Dataset.from_tensor_slices((X, y))
    train_ds = train_ds.map(tf_auto_transform)
    train_ds = train_ds.batch(batch_size)
    return train_ds
dataset = get_dataset(embeddings, target, 2)
The above code yields the following error:
OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.
I tried to eliminate the error by running the commented out section of the tf_auto_transform function, but the error persisted.
SideNote: While it is true that the Delta encoder paper has code, it is written in tf 1.x. I am trying to use tf 2.x with the tf functional API instead. Thank you for your help!
At the risk of outing myself as a n00b, the answer is to switch the order of the map and batch functions. I am trying to apply a neural network to make some changes on data. tf.keras models take batches as input, not individual samples. By batching the data first, I can run batches through my nn.
def get_dataset(X,y, batch_size = 32):
    train_ds = tf.data.Dataset.from_tensor_slices((X, y))
    #The changed order
    train_ds = train_ds.batch(batch_size)
    train_ds = train_ds.map(tf_auto_transform)
    return train_ds
It really is that simple.
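A small sketch of the batch-then-map ordering, in case it helps to see the shapes. To keep it short it makes an assumption: a fixed random weight matrix (`weights`) plays the role of the trained network, and the sizes are made up. The point is only that the mapped function receives rank-2 batches, which is what a matrix multiply (or a Keras model) expects.

```python
import numpy as np
import tensorflow as tf

input_dims = 6
embeddings = np.random.normal(size=(10, input_dims)).astype(np.float32)
target = np.zeros(10, dtype=np.float32)

# Assumption: a fixed weight matrix stands in for the trained network.
weights = tf.constant(
    np.random.normal(size=(input_dims, input_dims)).astype(np.float32))


def transform(x, y):
    # x arrives as a batch of shape (batch, input_dims) because batch() runs first.
    return tf.nn.relu(tf.matmul(x, weights)), y


dataset = (
    tf.data.Dataset.from_tensor_slices((embeddings, target))
    .batch(4)        # batch first ...
    .map(transform)  # ... then transform whole batches
)

xb, yb = next(iter(dataset))
print(tuple(xb.shape), tuple(yb.shape))
```

With the order reversed, `transform` would receive single rows of shape `(input_dims,)`, which a rank-2 operation such as `tf.matmul` does not accept.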

adding value to tensorboard for adversarial learning

I'm new to TensorBoard and have run into some problems while using it.
Problem 1:
I'm writing an adversarial learning model. To visualize the model's losses, I have the following:
actor loss
critic loss
For the learning algorithm provided in this paper,
in one (or K) batches I have to feed both the actor and the critic. Then I need to feed values only to the critic; this time there is no actor. I think that, to show the values in TensorBoard, I need to do the following:
def model():
    ...
    actor_loss = ...
    tf.summary.scalar('actor', actor_loss)
    ...
    critic_loss = ...
    tf.summary.scalar('critic', critic_loss)
my_graph = tf.Graph()
with my_graph.as_default():
    tf.reset_default_graph()
    sess = tf.Session()
    with sess.as_default():
        model()
        merged = tf.summary.merge_all()
        writer = tf.summary.FileWriter(address+ '/train',
                                       sess.graph)
        init = tf.global_variables_initializer()
        sess.run(init)
When giving input to the inner loop (where both actor and critic participate) there is no problem; we get the result as follows:
a,b,c,d,summary = sess.run( [actor_train_step, critic_train_step, actor_loss, critic_loss, merged], feed_dict = feed_dict )
writer.add_summary(summary, batch)
But when we want to give input only to the critic, the code becomes:
a,b,summary = sess.run( [critic_train_step, critic_loss, merged], feed_dict = feed_dict )
writer.add_summary(summary, batch)
But since merged depends on actor_loss, it cannot run. On the other hand, I can't just feed actor values to the model. How can I solve this issue?
Problem 2
I am not evaluating the model (calculating the score value) with tensor operations. Instead, I generate the output, feed it to another script, and get the score value from there. So after each batch/epoch I evaluate my model and get a score value from the script. How can I save this value to TensorBoard?
I cannot define a tf.summary.merge_all() operation before session initialization, as I am calculating the evaluation score during training from an outside script.
Where should I put the tf.summary.merge_all() operation?
If I want to combine Problem 1 and Problem 2 in a single project, is there anything else I have to do?
Note: I'm new to TensorBoard, so a detailed explanation would be appreciated.
Problem #1
If you only want to summarize the critic op, you should run only the summary op for the critic part instead of using tf.summary.merge_all().
For example:
def model():
    ...
    actor_loss = ...
    tf.summary.scalar('actor', actor_loss)
    ...
    critic_loss = ...
    summary_critic = tf.summary.scalar('critic', critic_loss)
a,b,summary = sess.run( [critic_train_step, critic_loss, summary_critic], feed_dict = feed_dict )
writer.add_summary(summary, batch)
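Here is a self-contained sketch of the same idea that can be run on its own. It is written against `tf.compat.v1` so it also runs on TensorFlow 2.x, and it rests on assumptions: the placeholders `actor_in` and `critic_in` and the toy "losses" are made up purely so that each summary depends on a different feed. It also shows `tf.summary.merge`, which groups an explicit list of summaries instead of everything in the graph.

```python
import tensorflow as tf

tf1 = tf.compat.v1

with tf.Graph().as_default():
    # Hypothetical stand-ins for the real actor/critic inputs and losses.
    actor_in = tf1.placeholder(tf.float32, shape=[], name="actor_in")
    critic_in = tf1.placeholder(tf.float32, shape=[], name="critic_in")
    actor_loss = actor_in * 2.0
    critic_loss = critic_in * 3.0

    summary_actor = tf1.summary.scalar("actor", actor_loss)
    summary_critic = tf1.summary.scalar("critic", critic_loss)
    # Explicit grouping for the steps where both networks are fed.
    summary_both = tf1.summary.merge([summary_actor, summary_critic])

    with tf1.Session() as sess:
        # Critic-only step: actor_in is not fed and is not needed.
        critic_only = sess.run(summary_critic, feed_dict={critic_in: 1.0})
        both = sess.run(summary_both, feed_dict={actor_in: 1.0, critic_in: 1.0})

# The fetched values are serialized Summary protos; parse them to inspect the tags.
critic_tags = [v.tag for v in tf1.Summary.FromString(critic_only).value]
both_tags = sorted(v.tag for v in tf1.Summary.FromString(both).value)
print(critic_tags, both_tags)
```

Either serialized summary can be passed to `writer.add_summary(summary, step)` as in the snippet above.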
Problem #2
To visualize the values you got from running the outside script, you can convert them to a tensor using tf.convert_to_tensor(), which is documented here, and then serialize that tensor to visualize it in TensorBoard.
For example:
vals = output_from_outside_script()
vals_tensor = sess.run(tf.convert_to_tensor(vals))
tf.summary.scalar('evaluation', vals_tensor)
Every tf.summary operation creates a Summary protobuf that serializes your tensor to an events file. Instead of running each summary op individually, TensorFlow provides tf.summary.merge_all() to run all the summary ops in your graph.
I tried to do it in your case.
Outside script:
import numpy as np
def output_from_outside_script(var):
return np.sum(var)
Code in adversarial training:
import os
import tensorflow as tf
import numpy as np
from outside_evaluation import *
sess = tf.Session()
x = sess.run(tf.constant([[1,2,3,4]], dtype=tf.float32))
X = tf.placeholder(dtype=tf.float32, shape=[1, 4])
W = tf.Variable(tf.truncated_normal([4, 10], stddev=0.1))
sess.run(tf.global_variables_initializer())
val = tf.matmul(a=X, b=W, name='matmul')
tf.summary.scalar('matmul_mean', tf.reduce_mean(val))
y = sess.run(val, feed_dict={X: x})
print('y = ', y)
vals = output_from_outside_script(y)
print('vals = ', vals)
vals_tensor = tf.convert_to_tensor(vals, name='vals_tensor')
tf.summary.scalar('evaluation', vals_tensor)
writer = tf.summary.FileWriter(os.path.join('test_log'), sess.graph)
merged = tf.summary.merge_all()
summary = sess.run(merged, feed_dict={X: x})
writer.add_summary(summary)
writer.close()
Output:
('y = ', array([[-0.51137048, -0.16054343, -0.03827953,  0.1124011 ,  0.09200752,
        -0.22235785,  0.41357356,  1.04061067, -0.08877556, -0.86647421]], dtype=float32))
('vals = ', -0.22920817)
Tensorboard log and scalar plot: [screenshots not included]
Should there be any problem, please let me know.
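As a possible alternative for Problem 2, a score computed outside the graph can be written by building the `Summary` protobuf by hand, which avoids adding a new op to the graph for every score. This is only a sketch: `external_score` is a hypothetical stand-in for the outside script, the log directory is a temporary folder, and the code uses `tf.compat.v1` so it also runs on TensorFlow 2.x.

```python
import os
import tempfile

import tensorflow as tf

tf1 = tf.compat.v1


def external_score(step):
    # Hypothetical stand-in for the score returned by the outside script.
    return 0.5 + 0.1 * step


logdir = tempfile.mkdtemp()

# FileWriter is a graph-mode API, so it is created inside a graph context.
with tf.Graph().as_default():
    writer = tf1.summary.FileWriter(logdir)
    for step in range(3):
        score = external_score(step)
        # Build the Summary proto directly; no session run is required.
        summary = tf1.Summary(
            value=[tf1.Summary.Value(tag="evaluation", simple_value=score)])
        writer.add_summary(summary, global_step=step)
    writer.close()

print(os.listdir(logdir))
```

Because no graph op is involved, this does not interact with `tf.summary.merge_all()`, so it can sit next to the summaries from Problem 1.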

"None" dimension causes error when using DataSet API Tensorflow

I am trying to use Dataset API to feed the resnet found in the latest Tensorflow official models release.
The basic code is as follows:
with tf.Session() as sess:
    print("initialized")
    features_placeholder = tf.placeholder(prepared_x.dtype, prepared_x.shape)
    labels_placeholder = tf.placeholder(dtype=tf.float32, shape=prepared_t.shape)
    dataset = tf.contrib.data.Dataset.from_tensor_slices((features_placeholder, labels_placeholder))
    dataset = dataset.shuffle(buffer_size=10000)
    dataset = dataset.batch(batch_size)
    dataset = dataset.repeat(num_epoch)
    iterator = dataset.make_initializable_iterator()
    (next_x_test, next_t_test) = iterator.get_next()
    next_x_test = tf.to_float(next_x_test, name='ToFloat')
    sess.run(iterator.initializer, feed_dict={features_placeholder: prepared_x,
                                              labels_placeholder: prepared_t})
    print(next_x_test)
    print(next_t_test)
    model = resnet_v2(resnet_size=50, num_classes=num_bins)
    output = model(next_x_test,is_training=True)
The last line raises the following error when the model is built:
ValueError: The last dimension of the inputs to Dense should be
defined. Found None.
which refers back to the resnet_v2 definition, where the final layer is a dense layer.
How can I assert the shape of my features tensor?
Use tensor.set_shape to set the shape of a tensor if it happens to be undefined.
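A minimal sketch of what `set_shape` does inside a `tf.data` pipeline, written for TensorFlow 2.x. The setup is an assumption chosen only to produce an unknown static shape: `tf.py_function` hides the shape of its result, and the helper names `load_unknown` and `load_with_shape` are made up.

```python
import tensorflow as tf


def load_unknown(i):
    # tf.py_function returns a tensor whose static shape is unknown.
    return tf.py_function(
        lambda v: tf.fill([3], tf.cast(v, tf.float32)), [i], tf.float32)


def load_with_shape(i):
    x = load_unknown(i)
    # Assert the shape we know the data has; this only updates static shape info.
    x.set_shape([3])
    return x


unknown_ds = tf.data.Dataset.range(4).map(load_unknown)
shaped_ds = tf.data.Dataset.range(4).map(load_with_shape)

print(unknown_ds.element_spec.shape)
print(shaped_ds.element_spec.shape)
```

`set_shape` does not reshape or check the data at run time; it only tells later layers (such as a Dense layer) what the dimensions are.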

How to use feedable iterator from Tensorflow Dataset API along with MonitoredTrainingSession?

The TensorFlow programmer's guide recommends using a feedable iterator to switch between training and validation datasets without reinitializing the iterator. It mainly requires feeding the handle to choose between them.
How can it be used along with tf.train.MonitoredTrainingSession?
The following method fails with the error "RuntimeError: Graph is finalized and cannot be modified."
with tf.train.MonitoredTrainingSession() as sess:
    training_handle = sess.run(training_iterator.string_handle())
    validation_handle = sess.run(validation_iterator.string_handle())
How can I get both the convenience of MonitoredTrainingSession and the ability to iterate over the training and validation datasets simultaneously?
I got the answer from the TensorFlow GitHub issue - https://github.com/tensorflow/tensorflow/issues/12859
The solution is to call iterator.string_handle() before creating the MonitoredSession.
import tensorflow as tf
from tensorflow.contrib.data import Dataset, Iterator
dataset_train = Dataset.range(10)
dataset_val = Dataset.range(90, 100)
iter_train_handle = dataset_train.make_one_shot_iterator().string_handle()
iter_val_handle = dataset_val.make_one_shot_iterator().string_handle()
handle = tf.placeholder(tf.string, shape=[])
iterator = Iterator.from_string_handle(
    handle, dataset_train.output_types, dataset_train.output_shapes)
next_batch = iterator.get_next()
with tf.train.MonitoredTrainingSession() as sess:
    handle_train, handle_val = sess.run([iter_train_handle, iter_val_handle])
    for step in range(10):
        print('train', sess.run(next_batch, feed_dict={handle: handle_train}))
        if step % 3 == 0:
            print('val', sess.run(next_batch, feed_dict={handle: handle_val}))
Output:
('train', 0)
('val', 90)
('train', 1)
('train', 2)
('val', 91)
('train', 3)
@Michael Jaison G's answer is correct. However, it does not work when you also want to use certain session_run_hooks that need to evaluate parts of the graph, such as LoggingTensorHook or SummarySaverHook.
The example below will cause an error:
import tensorflow as tf
dataset_train = tf.data.Dataset.range(10)
dataset_val = tf.data.Dataset.range(90, 100)
iter_train_handle = dataset_train.make_one_shot_iterator().string_handle()
iter_val_handle = dataset_val.make_one_shot_iterator().string_handle()
handle = tf.placeholder(tf.string, shape=[])
iterator = tf.data.Iterator.from_string_handle(
    handle, dataset_train.output_types, dataset_train.output_shapes)
feature = iterator.get_next()
pred = feature * feature
tf.summary.scalar('pred', pred)
global_step = tf.train.create_global_step()
summary_hook = tf.train.SummarySaverHook(save_steps=5,
                                         output_dir="summaries", summary_op=tf.summary.merge_all())
with tf.train.MonitoredTrainingSession(hooks=[summary_hook]) as sess:
    handle_train, handle_val = sess.run([iter_train_handle, iter_val_handle])
    for step in range(10):
        feat = sess.run(feature, feed_dict={handle: handle_train})
        pred_ = sess.run(pred, feed_dict={handle: handle_train})
        print('train: ', feat)
        print('pred: ', pred_)
        if step % 3 == 0:
            print('val', sess.run(feature, feed_dict={handle: handle_val}))
This will fail with error:
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder' with dtype string
[[Node: Placeholder = Placeholder[dtype=DT_STRING, shape=[], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
[[Node: cond/Switch_1/_15 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_18_cond/Switch_1", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
The reason is that the hook tries to evaluate the graph already on the first session.run([iter_train_handle, iter_val_handle]) call, which does not yet contain a handle in its feed_dict.
The workaround is to override the hooks that cause the problem and change the code in before_run and after_run so that they only evaluate on session.run calls containing the handle in the feed_dict (you can access the feed_dict of the current session.run call via the run_context argument of before_run and after_run).
Alternatively, you can use the latest master of TensorFlow (post-1.4), which adds a run_step_fn function to MonitoredSession. It lets you specify the following step_fn, which avoids the error (at the expense of evaluating the if statement once per training iteration):
def step_fn(step_context):
    if handle_train is None:
        handle_train, handle_val = sess.run([iter_train_handle, iter_val_handle])
    return step_context.run_with_hooks(fetches=..., feed_dict=...)
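The first workaround (a hook that only asks for its tensors when the handle is being fed) might look roughly like the sketch below. It is not the code of any built-in hook: `HandleAwareHook` is a made-up class that just records the values of `pred`, and everything goes through `tf.compat.v1` so it also runs on TensorFlow 2.x. The feed_dict of the current call is read from `run_context.original_args`.

```python
import tensorflow as tf

tf1 = tf.compat.v1


class HandleAwareHook(tf1.train.SessionRunHook):
    """Hypothetical hook that fetches `tensor` only when `handle` is being fed."""

    def __init__(self, tensor, handle):
        self._tensor = tensor
        self._handle = handle
        self.seen = []

    def before_run(self, run_context):
        feed_dict = run_context.original_args.feed_dict or {}
        # Compare by identity so the check does not rely on tensor equality rules.
        if any(key is self._handle for key in feed_dict):
            return tf1.train.SessionRunArgs(self._tensor)
        return None  # nothing extra to fetch for this run call

    def after_run(self, run_context, run_values):
        if run_values.results is not None:
            self.seen.append(int(run_values.results))


with tf.Graph().as_default():
    dataset_train = tf.data.Dataset.range(10)
    dataset_val = tf.data.Dataset.range(90, 100)
    iter_train_handle = tf1.data.make_one_shot_iterator(dataset_train).string_handle()
    iter_val_handle = tf1.data.make_one_shot_iterator(dataset_val).string_handle()

    handle = tf1.placeholder(tf.string, shape=[])
    iterator = tf1.data.Iterator.from_string_handle(
        handle,
        tf1.data.get_output_types(dataset_train),
        tf1.data.get_output_shapes(dataset_train))
    feature = iterator.get_next()
    pred = feature * feature

    hook = HandleAwareHook(pred, handle)
    with tf1.train.MonitoredTrainingSession(hooks=[hook]) as sess:
        # No handle is fed in this call, so the hook requests nothing.
        handle_train, handle_val = sess.run([iter_train_handle, iter_val_handle])
        features = [int(sess.run(feature, feed_dict={handle: handle_train}))
                    for _ in range(3)]

print(features, hook.seen)
```

The same guard could be put into a subclass of an existing hook, calling the parent's before_run and after_run only when the handle is present.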
There is a demo for using placeholder in mot_session with SessionRunHook.
This demo is about switching datasets by feeding different handle strings.
BTW, I have tried all solutions, but only this works.
dataset_switching