How to use Tensorflow's tf.cond() with two different Dataset iterators without iterating both? - tensorflow

I want to feed a CNN with the tensor "images". I want this tensor to contain images from the training set (which have a FIXED size) when the placeholder is_training is True; otherwise I want it to contain images from the test set (which do NOT have a fixed size).
This is needed because in training I take a random fixed-size crop from the training images, while in test I want to perform a dense evaluation and feed the entire images into the network (it is fully convolutional, so it will accept them).
The current NOT WORKING approach is to create two different iterators and try to select the training/test input with tf.cond at sess.run(images, {is_training: True/False}).
The problem is that BOTH iterators are evaluated. The training and test datasets are also of different sizes, so I cannot iterate both of them until the end. Is there a way to make this work? Or to rewrite this in a smarter way?
I've seen some questions/answers about this but they always used tf.assign which takes a numpy array and assigns it to a tensor. In this case I cannot use tf.assign because I already have a tensor coming from the iterators.
The current code that I have is this one. It simply checks the shape of the tensor "images":
train_filenames, train_labels = list_images(args.train_dir)
val_filenames, val_labels = list_images(args.val_dir)
graph = tf.Graph()
with graph.as_default():
    # Preprocessing (for both training and validation):
    def _parse_function(filename, label):
        image_string = tf.read_file(filename)
        image_decoded = tf.image.decode_jpeg(image_string, channels=3)
        image = tf.cast(image_decoded, tf.float32)
        return image, label
    # Preprocessing (for training)
    def training_preprocess(image, label):
        # Random flip and crop
        image = tf.image.random_flip_left_right(image)
        image = tf.random_crop(image, [args.crop, args.crop, 3])
        return image, label
    # Preprocessing (for validation)
    def val_preprocess(image, label):
        flipped_image = tf.image.flip_left_right(image)
        batch = tf.stack([image, flipped_image], axis=0)
        return batch, label
    # Training dataset
    train_filenames = tf.constant(train_filenames)
    train_labels = tf.constant(train_labels)
    train_dataset = tf.contrib.data.Dataset.from_tensor_slices((train_filenames, train_labels))
    train_dataset = train_dataset.map(_parse_function, num_threads=args.num_workers, output_buffer_size=args.batch_size)
    train_dataset = train_dataset.map(training_preprocess, num_threads=args.num_workers, output_buffer_size=args.batch_size)
    train_dataset = train_dataset.shuffle(buffer_size=10000)
    batched_train_dataset = train_dataset.batch(args.batch_size)
    # Validation dataset
    val_filenames = tf.constant(val_filenames)
    val_labels = tf.constant(val_labels)
    val_dataset = tf.contrib.data.Dataset.from_tensor_slices((val_filenames, val_labels))
    val_dataset = val_dataset.map(_parse_function, num_threads=1, output_buffer_size=1)
    val_dataset = val_dataset.map(val_preprocess, num_threads=1, output_buffer_size=1)
    train_iterator = tf.contrib.data.Iterator.from_structure(batched_train_dataset.output_types, batched_train_dataset.output_shapes)
    val_iterator = tf.contrib.data.Iterator.from_structure(val_dataset.output_types, val_dataset.output_shapes)
    train_images, train_labels = train_iterator.get_next()
    val_images, val_labels = val_iterator.get_next()
    train_init_op = train_iterator.make_initializer(batched_train_dataset)
    val_init_op = val_iterator.make_initializer(val_dataset)
    # Indicates whether we are in training or in test mode
    is_training = tf.placeholder(tf.bool)
    def f_true():
        with tf.control_dependencies([tf.identity(train_images)]):
            return tf.identity(train_images)
    def f_false():
        return val_images
    images = tf.cond(is_training, f_true, f_false)
    num_images = images.shape
with tf.Session(graph=graph) as sess:
    sess.run(train_init_op)
    # sess.run(val_init_op)
    img = sess.run(images, {is_training: True})
The problem is that when I want to use only the training iterator, I comment out the line that runs val_init_op, but then I get the following error:
FailedPreconditionError (see above for traceback): GetNext() failed because the iterator has not been initialized. Ensure that you have run the initializer operation for this iterator before getting the next element.
[[Node: IteratorGetNext_1 = IteratorGetNext[output_shapes=[[2,?,?,3], []], output_types=[DT_FLOAT, DT_INT32], _device="/job:localhost/replica:0/task:0/cpu:0"](Iterator_1)]]
If I do not comment out that line, everything works as expected: when is_training is True I get training images, and when is_training is False I get validation images. The issue is that both iterators need to be initialized, and when I evaluate one of them, the other is advanced too. Since, as I said, the datasets are of different sizes, this causes an issue.
I hope there is a way to solve it! Thanks in advance

The trick is to call iterator.get_next() inside the f_true() and f_false() functions:
def f_true():
    train_images, _ = train_iterator.get_next()
    return train_images
def f_false():
    val_images, _ = val_iterator.get_next()
    return val_images
images = tf.cond(is_training, f_true, f_false)
The same advice applies to any TensorFlow op that has a side effect, like assigning to a variable: if you want that side effect to happen conditionally, the op must be created inside the appropriate branch function passed to tf.cond().
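To make the "side effect must live inside the branch" point concrete, here is a minimal plain-Python analogy (no TensorFlow involved). It assumes that pulling from a Python iterator plays the role of iterator.get_next(); the cond helper and the item names are hypothetical and only mimic the shape of tf.cond().

```python
# Plain-Python analogy: fetching from an iterator is a side effect, so the
# place where the fetch happens decides which iterator advances.

def cond(pred, true_fn, false_fn):
    """Hypothetical stand-in for tf.cond: call only the selected branch."""
    return true_fn() if pred else false_fn()

# Case 1: both elements are fetched BEFORE the conditional (like calling
# get_next() outside the branch functions), so both iterators advance.
train_iter = iter(["train_0", "train_1", "train_2"])
val_iter = iter(["val_0", "val_1"])
train_item = next(train_iter)
val_item = next(val_iter)
picked_eager = cond(True, lambda: train_item, lambda: val_item)

# Case 2: the fetch happens INSIDE each branch, so only the chosen
# iterator advances.
train_iter2 = iter(["train_0", "train_1", "train_2"])
val_iter2 = iter(["val_0", "val_1"])
picked_lazy = cond(True, lambda: next(train_iter2), lambda: next(val_iter2))

remaining_val_eager = list(val_iter)   # one validation item was consumed
remaining_val_lazy = list(val_iter2)   # validation iterator is untouched
print(picked_eager, remaining_val_eager)
print(picked_lazy, remaining_val_lazy)
```

In both cases the training item is returned, but only in the second case does the validation iterator keep all of its elements.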


How do you fit a tf.Dataset to a Keras Autoencoder Model when the Dataset has been generated using TFX?

As the title suggests I have been trying to create a pipeline for training an Autoencoder model using TFX. The problem I'm having is fitting the tf.Dataset returned by the DataAccessor.tf_dataset_factory object to the Autoencoder.
Below I summarise the steps I've taken through this project, and have some Questions at the bottom if you wish to skip the background information.
TFX Pipeline
The TFX components I have used so far have been:
CsvExampleGenerator (the dataset has 82 columns, all numeric, and the sample csv has 739 rows)
StatisticsGenerator / SchemaGenerator (the schema has been edited and is now loaded in using an Importer)
Trainer (this is the component I am currently having problems with)
The model that I am attempting to train is based on the example laid out here. However, my model is being trained on tabular data, searching for anomalous results, as opposed to image data.
Because I have tried a couple of solutions, I have defined the model both by subclassing keras.Model and by composing keras.layers, and I outline both below:
Subclassing Keras.Model
class Autoencoder(keras.models.Model):
    def __init__(self, features):
        super(Autoencoder, self).__init__()
        self.encoder = tf.keras.Sequential([
            keras.layers.Dense(82, activation = 'relu'),
            keras.layers.Dense(32, activation = 'relu'),
            keras.layers.Dense(16, activation = 'relu'),
            keras.layers.Dense(8, activation = 'relu')
        ])
        self.decoder = tf.keras.Sequential([
            keras.layers.Dense(16, activation = 'relu'),
            keras.layers.Dense(32, activation = 'relu'),
            keras.layers.Dense(len(features), activation = 'sigmoid')
        ])
    def call(self, x):
        inputs = [keras.layers.Input(shape = (1,), name = f) for f in features]
        dense = keras.layers.concatenate(inputs)
        encoded = self.encoder(dense)
        decoded = self.decoder(encoded)
        return decoded
Subclassing Keras.Layers
def _build_keras_model(features: List[str]) -> tf.keras.Model:
    inputs = [keras.layers.Input(shape = (1,), name = f) for f in features]
    dense = keras.layers.concatenate(inputs)
    dense = keras.layers.Dense(32, activation = 'relu')(dense)
    dense = keras.layers.Dense(16, activation = 'relu')(dense)
    dense = keras.layers.Dense(8, activation = 'relu')(dense)
    dense = keras.layers.Dense(16, activation = 'relu')(dense)
    dense = keras.layers.Dense(32, activation = 'relu')(dense)
    outputs = keras.layers.Dense(len(features), activation = 'sigmoid')(dense)
    model = keras.Model(inputs = inputs, outputs = outputs)
    model.compile(
        optimizer = 'adam',
        loss = 'mae'
    )
    return model
TFX Trainer Component
For creating the Trainer Component I have been mainly following the implementation details laid out here:
As well as following the default penguins example:
run_fn definition
def run_fn(fn_args: tfx.components.FnArgs) -> None:
    tft_output = tft.TFTransformOutput(fn_args.transform_output)
    train_dataset = _input_fn(
        file_pattern = fn_args.train_files,
        data_accessor = fn_args.data_accessor,
        tf_transform_output = tft_output,
        batch_size = fn_args.train_steps
    )
    eval_dataset = _input_fn(
        file_pattern = fn_args.eval_files,
        data_accessor = fn_args.data_accessor,
        tf_transform_output = tft_output,
        batch_size = fn_args.custom_config['eval_batch_size']
    )
    # model = Autoencoder(
    #     features = fn_args.custom_config['features']
    # )
    model = _build_keras_model(features = fn_args.custom_config['features'])
    model.compile(optimizer = 'adam', loss = 'mse')
    model.fit(
        train_dataset,
        steps_per_epoch = fn_args.train_steps,
        validation_data = eval_dataset,
        validation_steps = fn_args.eval_steps
    )
_input_fn definition
def _apply_preprocessing(raw_features, tft_layer):
    transformed_features = tft_layer(raw_features)
    return transformed_features
def _input_fn(
    file_pattern: List[str],
    data_accessor: tfx.components.DataAccessor,
    tf_transform_output: tft.TFTransformOutput,
    batch_size: int) -> tf.data.Dataset:
    """Generates features and label for tuning/training.

    Args:
        file_pattern: List of paths or patterns of input tfrecord files.
        data_accessor: DataAccessor for converting input to RecordBatch.
        tf_transform_output: A TFTransformOutput.
        batch_size: representing the number of consecutive elements of returned
            dataset to combine in a single batch

    Returns:
        A dataset that contains features where features is a
        dictionary of Tensors.
    """
    dataset = data_accessor.tf_dataset_factory(
        file_pattern,
        tfxio.TensorFlowDatasetOptions(batch_size = batch_size),
    )
    transform_layer = tf_transform_output.transform_features_layer()
    def apply_transform(raw_features):
        return _apply_preprocessing(raw_features, transform_layer)
    return dataset.map(apply_transform).repeat()
This differs from the _input_fn example given above as I was following the example in the next tfx tutorial found here:
Also for reference, there is no Target within the example data so there is no label_key to be passed to the tfxio.TensorFlowDatasetOptions object.
When trying to run the Trainer component using a TFX InteractiveContext object I receive the following error.
ValueError: No gradients provided for any variable: ['dense_460/kernel:0', 'dense_460/bias:0', 'dense_461/kernel:0', 'dense_461/bias:0', 'dense_462/kernel:0', 'dense_462/bias:0', 'dense_463/kernel:0', 'dense_463/bias:0', 'dense_464/kernel:0', 'dense_464/bias:0', 'dense_465/kernel:0', 'dense_465/bias:0'].
From my own attempts to solve this, I believe the problem lies in the way that an Autoencoder is trained. In the Autoencoder example linked here, the data is fitted like so:
autoencoder.fit(x_train, x_train,
                validation_data=(x_test, x_test))
Therefore it stands to reason that the tf.Dataset should mimic this behaviour. When testing with plain Tensor objects, I was able to recreate the error above and then resolve it by passing the training data as the target in the .fit() call.
Things I've Tried So Far
Duplicating Train Dataset
model.fit(
    train_dataset,
    train_dataset,
    steps_per_epoch = fn_args.train_steps,
    validation_data = eval_dataset,
    validation_steps = fn_args.eval_steps
)
Raises error due to Keras not accepting a 'y' value when a dataset is passed.
ValueError: `y` argument is not supported when using dataset as input.
Returning a dataset that is a tuple with itself
def _input_fn(...
    dataset = data_accessor.tf_dataset_factory(
        file_pattern,
        tfxio.TensorFlowDatasetOptions(batch_size = batch_size),
    )
    transform_layer = tf_transform_output.transform_features_layer()
    def apply_transform(raw_features):
        return _apply_preprocessing(raw_features, transform_layer)
    dataset = dataset.map(apply_transform)
    return dataset.map(lambda x: (x, x))
This raises an error where the keys from the features dictionary don't match the output of the model.
ValueError: Found unexpected keys that do not correspond to any Model output: dict_keys(['feature_string', ...]). Expected: ['dense_477']
At this point I switched to using the keras.model Autoencoder subclass and tried to add output keys to the Model using an output which I tried to create dynamically in the same way as the inputs.
def call(self, x):
    inputs = [keras.layers.Input(shape = (1,), name = f) for f in x]
    dense = keras.layers.concatenate(inputs)
    encoded = self.encoder(dense)
    decoded = self.decoder(encoded)
    outputs = {}
    for feature_name in x:
        outputs[feature_name] = keras.layers.Dense(1, activation = 'sigmoid')(decoded)
    return outputs
This raises the following error:
TypeError: Cannot convert a symbolic Keras input/output to a numpy array. This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.
I've been looking into solving this issue but am no longer sure if the data is being passed correctly and am beginning to think I'm getting side-tracked from the actual problem.
Has anyone managed to get an Autoencoder working when connected via TFX examples?
Did you alter the tf.Dataset or handle the examples in a different way from the _input_fn demonstrated?
So I managed to find an answer to this and wanted to leave what I found here in case anyone else stumbles onto a similar problem.
It turns out my feelings around the error were correct and the solution did indeed lie in how the tf.Dataset object was presented.
This can be demonstrated when I ran some code which simulated the incoming data using randomly generated tensors.
tensors = [tf.random.uniform(shape = (1, 82)) for i in range(739)]
# This gives us a list of 739 tensors which hold 1 value for 82 'features' simulating the dataset I had
dataset = tf.data.Dataset.from_tensor_slices(tensors)
dataset = dataset.map(lambda x : (x, x))
# This returns a dataset which marks the training set and target as the same
# which is what the Autoencoder model is looking for
model.fit(dataset, ...)
Following this, I proceeded to do the same thing with the dataset returned by the _input_fn. However, because the TFX DataAccessor object returns a features_dict, I needed to combine the tensors in that dict to create a single target tensor.
This is how my _input_fn looks now:
def create_target_values(features_dict: Dict[str, tf.Tensor]) -> tuple:
    value_tensor = tf.concat(list(features_dict.values()), axis = 1)
    return (features_dict, value_tensor)
def _input_fn(
    file_pattern: List[str],
    data_accessor: tfx.components.DataAccessor,
    tf_transform_output: tft.TFTransformOutput,
    batch_size: int) -> tf.data.Dataset:
    """Generates features and label for tuning/training.

    Args:
        file_pattern: List of paths or patterns of input tfrecord files.
        data_accessor: DataAccessor for converting input to RecordBatch.
        tf_transform_output: A TFTransformOutput.
        batch_size: representing the number of consecutive elements of returned
            dataset to combine in a single batch

    Returns:
        A dataset that contains (features, target_tensor) tuple where features is a
        dictionary of Tensors, and target_tensor is a single Tensor that is a concatenated tensor of all the
        feature values.
    """
    dataset = data_accessor.tf_dataset_factory(
        file_pattern,
        tfxio.TensorFlowDatasetOptions(batch_size = batch_size),
    )
    dataset = dataset.map(lambda x: create_target_values(features_dict = x))
    return dataset.repeat()
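For anyone who wants to see what the (features, target) mapping does without running TFX, here is a small plain-Python sketch. It assumes each feature arrives as a (batch, 1) column, represented here as a nested list, and that concatenating along axis 1 follows the dict's insertion order. The feature names and the helper names are hypothetical, and the list comprehension is only a stand-in for tf.concat(..., axis = 1).

```python
from typing import Dict, List, Tuple

Column = List[List[float]]  # shape (batch, 1), mimicking one feature tensor

def concat_columns(features: Dict[str, Column]) -> List[List[float]]:
    """Concatenate (batch, 1) columns along axis 1, in dict insertion order."""
    columns = list(features.values())
    batch_size = len(columns[0])
    return [[col[row][0] for col in columns] for row in range(batch_size)]

def to_autoencoder_example(
        features: Dict[str, Column]) -> Tuple[Dict[str, Column], List[List[float]]]:
    """Pair the feature dict with a single target built from the same values."""
    return features, concat_columns(features)

# Hypothetical batch of two rows with three numeric features.
batch = {
    "age": [[0.1], [0.2]],
    "height": [[0.5], [0.6]],
    "weight": [[0.9], [1.0]],
}
inputs, target = to_autoencoder_example(batch)
print(target)  # [[0.1, 0.5, 0.9], [0.2, 0.6, 1.0]]
```

One thing to watch in the real pipeline: the column order of the target should match the order in which the model concatenates its inputs, otherwise the reconstruction loss compares mismatched features.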

tf.keras.backend.function for transforming embeddings inside tf.data.Dataset

I am trying to use the output of a neural network to transform data inside tf.data.Dataset. Specifically, I am using a Delta-Encoder to manipulate embeddings inside the pipeline. In so doing, however, I get the following error:
OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.
I have searched the dataset pipeline page and stack overflow, but I could not find something that addresses my question. In the code below I am using an Autoencoder, as it yields an identical error with more concise code.
The offending part seems to be
[[x,]] = tf.py_function(Auto_Func, [x], [tf.float32])
num_embeddings = 100
input_dims = 1000
embeddings = np.random.normal(size = (num_embeddings, input_dims)).astype(np.float32)
target = np.zeros(num_embeddings)
# creating Autoencoder
inp = Input(shape = (input_dims,), name = 'input')
hidden = Dense(10, activation = 'relu', name = 'hidden')(inp)
out = Dense(input_dims, activation = 'relu', name = 'output')(hidden)
auto_encoder = tf.keras.models.Model(inputs = inp, outputs = out)
Auto_Func = tf.keras.backend.function(inputs = auto_encoder.get_layer(name = 'input').input,
                                      outputs = auto_encoder.get_layer(name = 'output').input)
# Autoencoder transform for tf.data.Dataset
def tf_auto_transform(x, target):
    x_shape = x.shape
    #def func(x):
    #    return tf.py_function(Auto_Func, [x], [tf.float32])
    #[[x,]] = func(x)
    [[x,]] = tf.py_function(Auto_Func, [x], [tf.float32])
    return x, target
def get_dataset(X, y, batch_size = 32):
    train_ds = tf.data.Dataset.from_tensor_slices((X, y))
    train_ds = train_ds.map(tf_auto_transform)
    train_ds = train_ds.batch(batch_size)
    return train_ds
dataset = get_dataset(embeddings, target, 2)
The above code yields the following error:
OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.
I tried to eliminate the error by running the commented out section of the tf_auto_transform function, but the error persisted.
SideNote: While it is true that the Delta encoder paper has code, it is written in tf 1.x. I am trying to use tf 2.x with the tf functional API instead. Thank you for your help!
At the risk of outing myself as a n00b, the answer is to switch the order of the map and batch functions. I am trying to apply a neural network to make some changes on data. tf.keras models take batches as input, not individual samples. By batching the data first, I can run batches through my nn.
def get_dataset(X, y, batch_size = 32):
    train_ds = tf.data.Dataset.from_tensor_slices((X, y))
    # The changed order
    train_ds = train_ds.batch(batch_size)
    train_ds = train_ds.map(tf_auto_transform)
    return train_ds
It really is that simple.
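In case the ordering argument is not obvious, here is a plain-Python sketch of why batch-then-map works where map-then-batch does not. It assumes the model only accepts a batch (a list of samples); batch_model is a hypothetical stand-in for the Keras model and simply doubles every value.

```python
from typing import Iterable, Iterator, List

Sample = List[float]
Batch = List[Sample]

def batch(samples: Iterable[Sample], batch_size: int) -> Iterator[Batch]:
    """Group consecutive samples into lists of at most batch_size."""
    current: Batch = []
    for sample in samples:
        current.append(sample)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        yield current

def batch_model(inputs: Batch) -> Batch:
    """Hypothetical stand-in for a Keras model: accepts batches only."""
    if not inputs or not isinstance(inputs[0], list):
        raise ValueError("expected a batch of samples, got a single sample")
    return [[2.0 * value for value in sample] for sample in inputs]

embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# batch -> map: the model receives whole batches.
transformed = [batch_model(b) for b in batch(embeddings, batch_size=2)]

# map -> batch: the model receives one sample at a time and rejects it.
try:
    [batch_model(sample) for sample in embeddings]
    map_first_ok = True
except ValueError:
    map_first_ok = False

print(transformed)
print(map_first_ok)
```

The same reasoning applies to the dataset pipeline: once batch() comes first, the function passed to map() sees tensors with a leading batch dimension, which is what the model expects.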

Simple softmax classifier in tensorflow

So I am trying to write a simple softmax classifier in TensorFlow.
Here is the code:
# Neural network parameters
n_hidden_units = 500
n_classes = 10
# training set placeholders
input_X = tf.placeholder(dtype='float32',shape=(None,X_train.shape[1], X_train.shape[2]),name="input_X")
input_y = tf.placeholder(dtype='int32', shape=(None,), name="input_y")
# hidden layer
dim = X_train.shape[1]*X_train.shape[2] # dimension of each training data point
flatten_X = tf.reshape(input_X, shape=(-1, dim))
weights_hidden_layer = tf.Variable(initial_value=np.zeros((dim,n_hidden_units)), dtype ='float32')
bias_hidden_layer = tf.Variable(initial_value=np.zeros((1,n_hidden_units)), dtype ='float32')
hidden_layer_output = tf.nn.relu(tf.matmul(flatten_X, weights_hidden_layer) + bias_hidden_layer)
# output layer
weights_output_layer = tf.Variable(initial_value=np.zeros((n_hidden_units,n_classes)), dtype ='float32')
bias_output_layer = tf.Variable(initial_value=np.zeros((1,n_classes)), dtype ='float32')
output_logits = tf.matmul(hidden_layer_output, weights_output_layer) + bias_output_layer
predicted_y = tf.nn.softmax(output_logits)
# loss
one_hot_labels = tf.one_hot(input_y, depth=n_classes, axis = -1)
loss = tf.losses.softmax_cross_entropy(one_hot_labels, output_logits)
# optimizer
optimizer = tf.train.MomentumOptimizer(0.01, 0.5).minimize(
loss, var_list=[weights_hidden_layer, bias_hidden_layer, weights_output_layer, bias_output_layer])
This compiles, and I have checked the shapes of all the tensors; they match what I expect.
However, I tried to run the optimizer using the following code:
# running the optimizer
s = tf.InteractiveSession()
s.run(tf.global_variables_initializer())
for i in range(5):
    s.run(optimizer, {input_X: X_train, input_y: y_train})
    loss_i = s.run(loss, {input_X: X_train, input_y: y_train})
    print("loss at iter %i:%.4f" % (i, loss_i))
And the loss kept being the same in all iterations!
I must have messed up something, but I fail to see what.
Any ideas? I also appreciate if somebody leaves comments regarding code style and/or tensorflow tips.
The mistake is that you are initializing your weights with np.zeros. Use np.random.normal instead. You can choose the standard deviation of this Gaussian distribution based on the number of inputs going into a particular neuron. You can read more about it here.
The reason you want to initialize with a Gaussian distribution is to break symmetry. If all the weights are initialized to zero, you can use backpropagation to see that all the weights will evolve in the same way.
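For this particular network the situation is even starker than symmetric updates: with every weight at zero, the hidden ReLU output is zero and the output weights are zero, so both weight gradients are exactly zero. The sketch below checks this by hand in plain Python on a tiny 2-2-2 network. It is an illustration under assumptions: the input, label and nonzero weights are made-up values (the latter standing in for a random draw), and biases are left out.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def softmax(row):
    shift = max(row)
    exps = [math.exp(v - shift) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def weight_gradients(x, y_one_hot, w_hidden, w_out):
    """Gradients of softmax cross-entropy w.r.t. both weight matrices."""
    pre_act = matmul(x, w_hidden)
    hidden = [[max(0.0, v) for v in row] for row in pre_act]
    logits = matmul(hidden, w_out)
    probs = [softmax(row) for row in logits]
    d_logits = [[p - t for p, t in zip(p_row, t_row)]
                for p_row, t_row in zip(probs, y_one_hot)]
    grad_w_out = matmul(transpose(hidden), d_logits)
    d_hidden = matmul(d_logits, transpose(w_out))
    # ReLU passes gradient only where its input was positive.
    d_pre = [[g if z > 0.0 else 0.0 for g, z in zip(g_row, z_row)]
             for g_row, z_row in zip(d_hidden, pre_act)]
    grad_w_hidden = matmul(transpose(x), d_pre)
    return grad_w_hidden, grad_w_out

x = [[1.0, 2.0]]   # one made-up sample with two features
y = [[1.0, 0.0]]   # one-hot label for class 0

zeros = [[0.0, 0.0], [0.0, 0.0]]
g_h_zero, g_o_zero = weight_gradients(x, y, zeros, zeros)

small_hidden = [[0.1, -0.2], [0.3, 0.4]]   # stand-in for a random init
small_out = [[0.5, -0.6], [0.7, 0.8]]
g_h_small, g_o_small = weight_gradients(x, y, small_hidden, small_out)

print(g_h_zero, g_o_zero)
print(g_h_small, g_o_small)
```

With the zero initialization every entry of both gradients is zero, so the optimizer has nothing to update in the weight matrices; with the small nonzero values both gradients contain nonzero entries.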
One could visualize the weight histogram using TensorBoard to make it easier. I executed your code for this. A few more lines are needed to set up Tensorboard logging but the histogram summary of weights can be easily added.
Initialized to zeros
weights_hidden_layer = tf.Variable(initial_value=np.zeros((784,n_hidden_units)), dtype ='float32')
Xavier initialization
initializer = tf.contrib.layers.xavier_initializer()
weights_hidden_layer = tf.Variable(initializer(shape=(784,n_hidden_units)), dtype ='float32')

TensorFlow: read a frozen model, add operations, then save to a new frozen model

I am sorry in advance if the title does not reflect exactly my problem (I think it does, but I'm not sure), which I describe below.
I am working on converting a Yolo object detection model to a TensorFlow frozen model .pb and then to use that model for prediction on mobile phones.
I have successfully obtained a working .pb model (i.e. a frozen graph from Yolo's graph). But since the outputs of the network (there are two of them) are not the bounding boxes, I have to write a function for conversion (this part is not my question, I already have a working function for this task):
def get_boxes_from_output(outputs_of_the_graph, anchors,
                          num_classes, input_image_shape,
                          score_threshold=score, iou_threshold=iou):
    # Apply some operations on the outputs_of_the_graph to obtain bounding boxes information
    return boxes, scores, classes
So the pipeline is simple: I load the .pb model, feed the image data to it to get two outputs, and then apply the above function (which contains tensor operations) to these two outputs to obtain the bounding box information. The code looks like this:
model_path = 'model_data/yolo.pb'
class_names = _get_class('model_data/classes.txt')
anchors = _get_anchors('model_data/yolo_anchors.txt')
score = 0.25
iou = 0.5
# Load the Tensorflow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(model_path, 'rb') as fid:
        graph_def.ParseFromString(fid.read())
        tf.import_graph_def(graph_def, name='')
    # Get the input and output nodes (there are two outputs)
    l_input = detection_graph.get_tensor_by_name('input_1:0')
    l_output = [detection_graph.get_tensor_by_name('conv2d_10/BiasAdd:0'),
                ...]
    # Generate output tensor targets for filtered bounding boxes.
    input_image_shape = tf.placeholder(dtype=tf.float32, shape=(2, ))
    training = tf.placeholder(tf.bool, name='training')
    boxes, scores, classes = get_boxes_from_output(l_output, anchors,
                                                   len(class_names), input_image_shape,
                                                   score_threshold=score, iou_threshold=iou)
image = Image.open('./data/image1.jpg')
image = preprocess_image(image)
image_data = np.array(image, dtype='float32')
image_data = np.expand_dims(image_data, 0) # Add batch dimension.
sess = tf.Session(graph=detection_graph)
# Run the session to get the output bounding boxes
out_boxes, out_scores, out_classes = sess.run(
    [boxes, scores, classes],
    feed_dict={
        l_input: image_data,
        input_image_shape: [image.size[1], image.size[0]],
        training: False
    })
# Now how do I save a new model that outputs directly [boxes, scores, classes]
Now my question is: how do I save a new .pb model from the session, so that I can load it again elsewhere and have it directly output boxes, scores, classes?
I hope the question is clear enough.
Thank you very much in advance for your help!
Add nodes and save to a frozen model
Once you have added the new ops, you need to write the new graph using tf.train.write_graph:
boxes, scores, classes = get_boxes_from_output()
tf.train.write_graph(sess.graph_def, save_dir, 'new_cnn_weights.pb')
Then you need to freeze the above graph using the freeze_graph utility. Make sure the output_node_names are set to boxes, scores, classes as shown below:
# Freeze graph
from tensorflow.python.tools import freeze_graph
import os
input_graph_path = os.path.join(save_dir, 'new_cnn_weights.pb')
input_saver_def_path = ''
input_binary = False
output_node_names = 'boxes, scores, classes'
restore_op_name = ''
filename_tensor_name = ''
output_graph_path = os.path.join(save_dir, 'new_frozen_cnn_weights.pb')
clear_devices = False
checkpoint_path = os.path.join(save_dir, 'test_model')
freeze_graph.freeze_graph(input_graph_path, input_saver_def_path,
                          input_binary, checkpoint_path, output_node_names,
                          restore_op_name, filename_tensor_name,
                          output_graph_path, clear_devices, '')
Check the optimized graph
# Load the new optimized graph and check whether the output is consistent
with tf.gfile.GFile(os.path.join(save_dir, 'new_frozen_cnn_weights.pb'), 'rb') as f:
    graph_def_optimized = tf.GraphDef()
    graph_def_optimized.ParseFromString(f.read())
G = tf.Graph()
with tf.Session(graph=G) as sess:
    boxes, scores, classes = tf.import_graph_def(graph_def_optimized, return_elements=['boxes:0', 'scores:0', 'classes:0'])
    print('Operations in Optimized Graph:')
    print([op.name for op in G.get_operations()])
    x = G.get_tensor_by_name('import/import/input:0')
    print(sess.run([boxes, scores, classes], feed_dict={x: np.expand_dims(img, 0)}))

Tensorflow, how to pass MultiRNN state in feed_dict

I am trying to make a generative RNN model in TensorFlow. What is annoying me is that, with state_is_tuple now being True by default in the RNN library, I am having a hard time finding the best way to save state between batches. I know I can change it back to False, but I don't want to do that since it is deprecated. When I am done with the training, I need to be able to preserve the hidden states between calls to sess.run, since I will be generating the sequences one sample at a time. I figured out that I can return the state of the RNN as follows.
rnn = tf.nn.rnn_cell.MultiRNNCell(cells)
zero_state = rnn.zero_state(batch_size, tf.float32)
output, final_state = tf.nn.dynamic_rnn(rnn, self.input_sound, initial_state = zero_state)
sess = tf.Session()
state_output = sess.run(final_state, feed_dict = {self.input_sound: np.zeros((64, 32, 512))})
This would be great, but the issue emerges when I want to pass state_output back into the model. Since a placeholder can only be a single tensor object, I can't pass the state_output tuple back in.
I am looking for a very generic solution. The rnn could be a MultiRNNCell or a single LSTMCell or any other combination imaginable.
I think I figured it out. I used the following code to flatten the state tuples into a single 1D tensor. I can then chop it up when I pass it back into the model, according to the size specification of the RNN cell.
def flatten_state_tupel(x):
    result = []
    for x_ in x:
        if isinstance(x_, tf.Tensor) or not hasattr(x_, '__iter__'):
            result.append(x_)
        else:
            result.extend(flatten_state_tupel(x_))
    return result
def pack_state_tupel(state):
    return tf.concat(0, [tf.reshape(s, (-1,)) for s in flatten_state_tupel(state)])
def unpack_state_tupel(state, size):
    state = tf.reshape(state, (-1, tf.reduce_sum(flatten_state_tupel(size))))
    def _make_state_tupel(sz, i):
        if hasattr(sz, '__iter__'):
            result = []
            base_index = i
            for s in sz:
                # Advance the offset past each sub-state so slices do not overlap
                base_index, y = _make_state_tupel(s, base_index)
                result.append(y)
            return base_index, tf.nn.rnn_cell.LSTMStateTuple(*result) if isinstance(sz, tf.nn.rnn_cell.LSTMStateTuple) else tuple(result)
        else:
            return i + sz, state[..., i : i + sz]
    return _make_state_tupel(size, 0)[-1]
I use the functions as follows.
rnn = tf.nn.rnn_cell.MultiRNNCell(cells)
zero_state = pack_state_tupel(rnn.zero_state(batch_size, tf.float32))
self.initial_state = tf.placeholder_with_default(zero_state, None)
output, final_state = tf.nn.dynamic_rnn(rnn, self.input_sound, initial_state = unpack_state_tupel(self.initial_state, rnn.state_size))
packed_state = pack_state_tupel(final_state)
sess = tf.Session()
state_output = sess.run(packed_state, feed_dict = {self.input_sound: np.zeros((64, 32, 512))})
state_output = sess.run(packed_state, feed_dict = {self.input_sound: np.zeros((64, 32, 512)), self.initial_state: np.zeros(state_output.shape[0])})
This way the state is zeroed if I do not pass anything (which will be the case during training), but I can still save and pass the state between batches during generation.
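The pack/unpack idea can be checked without TensorFlow. Below is a minimal plain-Python sketch of the same round trip, under the assumption that each leaf of the state is a plain list of floats and that the size specification mirrors the state's nesting. LSTMState is a hypothetical namedtuple standing in for tf.nn.rnn_cell.LSTMStateTuple, and the helper names are my own.

```python
from collections import namedtuple

# Hypothetical stand-in for tf.nn.rnn_cell.LSTMStateTuple.
LSTMState = namedtuple("LSTMState", ["c", "h"])

def flatten(structure):
    """Return the leaves of a nested tuple structure, left to right."""
    if isinstance(structure, tuple):
        leaves = []
        for item in structure:
            leaves.extend(flatten(item))
        return leaves
    return [structure]

def pack(state):
    """Concatenate every leaf vector into one flat list."""
    flat = []
    for leaf in flatten(state):
        flat.extend(leaf)
    return flat

def unpack(flat, sizes):
    """Rebuild the nested structure described by sizes from a flat list."""
    def build(size_spec, offset):
        if isinstance(size_spec, tuple):
            parts = []
            for sub_spec in size_spec:
                # Each sub-state starts where the previous one ended.
                part, offset = build(sub_spec, offset)
                parts.append(part)
            if isinstance(size_spec, LSTMState):
                return LSTMState(*parts), offset
            return tuple(parts), offset
        return flat[offset:offset + size_spec], offset + size_spec

    rebuilt, consumed = build(sizes, 0)
    assert consumed == len(flat), "flat state does not match the size spec"
    return rebuilt

# Two stacked LSTM layers with 2 and 3 units (made-up sizes and values).
sizes = (LSTMState(c=2, h=2), LSTMState(c=3, h=3))
state = (
    LSTMState(c=[1.0, 2.0], h=[3.0, 4.0]),
    LSTMState(c=[5.0, 6.0, 7.0], h=[8.0, 9.0, 10.0]),
)

flat = pack(state)
restored = unpack(flat, sizes)
print(flat)
print(restored == state)
```

The key detail is that the offset is threaded through the recursion, so every sub-state reads its own slice of the flat vector.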