keras tape.gradient error: Input to reshape is a tensor with 1012 values, but the requested shape has 20240 [Op:Reshape] - tensorflow

I use tape.gradient(g_loss, aa_mutator.trainable_variables) to calculate the gradients of a model called aa_mutator and get the following error:
File "/home/tialan/Data/gan/code/g.py", line 297, in <module>
grads_g = tape.gradient(g_loss, aa_mutator.trainable_variables)
File "/home/tialan/tf/lib/python3.7/site-packages/tensorflow/python/eager/backprop.py", line 1086, in gradient
unconnected_gradients=unconnected_gradients)
File "/home/tialan/tf/lib/python3.7/site-packages/tensorflow/python/eager/imperative_grad.py", line 77, in imperative_grad
compat.as_str(unconnected_gradients.value))
File "/home/tialan/tf/lib/python3.7/site-packages/tensorflow/python/eager/backprop.py", line 162, in _gradient_function
return grad_fn(mock_op, *out_grads)
File "/home/tialan/tf/lib/python3.7/site-packages/tensorflow/python/ops/array_grad.py", line 782, in _ReshapeGrad
_IndexedSlicesToTensorNoWarning(grad), array_ops.shape(op.inputs[0])),
File "/home/tialan/tf/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "/home/tialan/tf/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py", line 195, in reshape
result = gen_array_ops.reshape(tensor, shape, name)
File "/home/tialan/tf/lib/python3.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 8368, in reshape
_ops.raise_from_not_ok_status(e, name)
File "/home/tialan/tf/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 6862, in raise_from_not_ok_status
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 1012 values, but the requested shape has 20240 [Op:Reshape]
Within the aa_mutator model I built a custom Keras layer:
class mutate_func_layer(layers.Layer):
    def __init__(self):
        super(mutate_func_layer, self).__init__()
    def call(self, inputs):
        Mut_pos_layer_out, input_pre, Mutation_3 = inputs
        where = where_func(Mut_pos_layer_out)
        return mutate_func(Mutation_3, where, input_pre)
with mutate_func defined as
#tf.custom_gradient
def mutate_func(x, where, input_pre):  # (Mutation_3, where, input_pre); x = Mutation_3
    print('m3_in_mutate_func')
    print(x.shape)
    aa_aft = gather_nd_func(x, where)
    print('m3_in_mutate_func')
    print(aa_aft.shape)
    aa_aft = K.argmax(aa_aft, axis=-1)
    print('m3_in_mutate_func')
    print(aa_aft.shape)
    aa_aft = tf.reshape(aa_aft, [-1])
    print('m3_in_mutate_func')
    print(aa_aft.shape)
    aa_aft = tf.cast(aa_aft, dtype=tf.float32)
    print('m3_in_mutate_func')
    print(aa_aft.shape)
    aa_seq_out = tf.tensor_scatter_nd_update(input_pre, [where], [aa_aft])
    print('m3_in_mutate_func')
    print(aa_seq_out.shape)
    def grad(upstream):
        return upstream*1, upstream*1, upstream*1
The shapes of the tensors in mutate_func are printed as:
m3_in_mutate_func
(1, 1012, 20)
m3_in_mutate_func
(675, 20)
m3_in_mutate_func
(675,)
m3_in_mutate_func
(675,)
m3_in_mutate_func
(675,)
m3_in_mutate_func
(1, 1012)
The model is able to predict given the input; the error appears only during fitting, at the tape.gradient step. Is the error caused by the custom layer? Thanks for any help or suggestions.

Related

Why TensorFlow throws this exception when loading a model that was normalized like this?

I am using the latest versions available at the time of this post:
tensorflow-gpu: 2.6.0
Python: 3.9.7
CUDA: 11.4.2
cuDNN: 8.2.4
As in the code below, when a model is normalized by calling Normalization() without arguments, loading it with load_model() throws an exception. Before loading, I can use the model without any apparent issues, which makes it look like everything is fine, since Normalization() did NOT complain and took care of the input shape. When the model is normalized with Normalization(input_dim=5), loading does NOT throw an exception, since a known shape is specified. That is odd: I would expect at least a warning that a model normalized without arguments to Normalization() will fail to load.
I'm not sure whether it's a bug, so I'm posting here before reporting it on GitHub; maybe I'm missing some setup.
Here's my code:
import numpy as np
import tensorflow as tf
def main():
    train_data = np.array([[1, 2, 3, 4, 5]])
    train_label = np.array([123])
    # Uncomment this to load the model and comment the next model and normalizer related lines.
    #model = tf.keras.models.load_model('AI/test.h5')
    normalizer = tf.keras.layers.experimental.preprocessing.Normalization()
    normalizer.adapt(train_data)
    model = tf.keras.Sequential([normalizer, tf.keras.layers.Dense(units=1)])
    model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.1), loss='mean_absolute_error')
    model.fit(train_data, train_label, epochs=3000)
    model.save('AI/test.h5')
    unseen_data = np.array([[1, 2, 3, 4, 6]])
    prediction = model.predict(unseen_data)
    print(prediction)
if __name__ == "__main__":
    main()
It throws the following exception:
Traceback (most recent call last):
File "E:\Backup\Desktop\tensorflow_test.py", line 30, in <module>
main()
File "E:\Backup\Desktop\tensorflow_test.py", line 11, in main
model = tf.keras.models.load_model('AI/test.h5')
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\saving\save.py", line 200, in load_model
return hdf5_format.load_model_from_hdf5(filepath, custom_objects,
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\saving\hdf5_format.py", line 180, in load_model_from_hdf5
model = model_config_lib.model_from_config(model_config,
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\saving\model_config.py", line 52, in model_from_config
return deserialize(config, custom_objects=custom_objects)
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\layers\serialization.py", line 208, in deserialize
return generic_utils.deserialize_keras_object(
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\utils\generic_utils.py", line 674, in deserialize_keras_object
deserialized_obj = cls.from_config(
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\sequential.py", line 434, in from_config
model.add(layer)
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\training\tracking\base.py", line 530, in _method_wrapper
result = method(self, *args, **kwargs)
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\sequential.py", line 217, in add
output_tensor = layer(self.outputs[0])
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\base_layer.py", line 976, in __call__
return self._functional_construction_call(inputs, args, kwargs,
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\base_layer.py", line 1114, in _functional_construction_call
outputs = self._keras_tensor_symbolic_call(
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\base_layer.py", line 848, in _keras_tensor_symbolic_call
return self._infer_output_signature(inputs, args, kwargs, input_masks)
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\base_layer.py", line 886, in _infer_output_signature
self._maybe_build(inputs)
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\base_layer.py", line 2659, in _maybe_build
self.build(input_shapes) # pylint:disable=not-callable
File "C:\Users\censored\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\layers\preprocessing\normalization.py", line 145, in build
raise ValueError(
ValueError: All `axis` values to be kept must have known shape. Got axis: (-1,), input shape: [None, None], with unknown axis at index: 1
Process finished with exit code 1
It looks like a bug.
Follow this link
if 'input_dim' in kwargs and 'input_shape' not in kwargs:
    # Backwards compatibility: alias 'input_dim' to 'input_shape'.
    kwargs['input_shape'] = (kwargs['input_dim'],)
if 'input_shape' in kwargs or 'batch_input_shape' in kwargs:
    # In this case we will later create an input layer
    # to insert before the current layer
    if 'batch_input_shape' in kwargs:
        batch_input_shape = tuple(kwargs['batch_input_shape'])
    elif 'input_shape' in kwargs:
        if 'batch_size' in kwargs:
            batch_size = kwargs['batch_size']
        else:
            batch_size = None
        batch_input_shape = (batch_size,) + tuple(kwargs['input_shape'])
    self._batch_input_shape = batch_input_shape
The error occurs because the Normalization layer did not receive any shape information, which leads to self._batch_input_shape = (None, None).
When the model is loaded (deserialized), the build function is called, and it requires a known shape along every kept axis:
# Sorted to avoid transposing axes.
self._keep_axis = sorted([d if d >= 0 else d + ndim for d in self.axis])
# All axes to be kept should have known shape.
for d in self._keep_axis:
    if input_shape[d] is None:
        raise ValueError(
            'All `axis` values to be kept must have known shape. Got axis: {}, '
            'input shape: {}, with unknown axis at index: {}'.format(
                self.axis, input_shape, d))
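The check quoted above can be reproduced outside Keras to see why a shape of [None, None] fails while [None, 5] passes. This is only a minimal pure-Python sketch, not the Keras implementation: the helper name check_keep_axes is hypothetical, and it simply mirrors the loop in the snippet.

```python
def check_keep_axes(axis, input_shape):
    """Mimic the axis check from the quoted build code (hypothetical helper).

    axis: tuple of axes to keep, possibly negative.
    input_shape: sequence of dimension sizes, with None for unknown dims.
    """
    ndim = len(input_shape)
    # Sorted to avoid transposing axes, as in the quoted snippet.
    keep_axis = sorted(d if d >= 0 else d + ndim for d in axis)
    for d in keep_axis:
        if input_shape[d] is None:
            raise ValueError(
                'All `axis` values to be kept must have known shape. '
                'Got axis: {}, input shape: {}, with unknown axis at '
                'index: {}'.format(axis, list(input_shape), d))
    return keep_axis


# Shape seen when no shape information was given to the layer.
try:
    check_keep_axes((-1,), (None, None))
    unknown_shape_ok = True
except ValueError as err:
    unknown_shape_ok = False
    unknown_shape_error = str(err)

# Shape seen when the feature dimension is known, e.g. input_dim=5.
known_shape_axes = check_keep_axes((-1,), (None, 5))
```

With the feature dimension known the check passes, which is consistent with Normalization(input_dim=5) loading without an exception.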

Incompatible shapes while using triplet loss and pre-trained resnet

I am trying to fine-tune a pre-trained ResNet using triplet loss. The following code is a combination of tutorials I found on the topic:
import pathlib
import tensorflow as tf
import tensorflow_addons as tfa
with tf.device('/cpu:0'):
    INPUT_SHAPE = (32, 32, 3)
    BATCH_SIZE = 16
    data_dir = pathlib.Path('/home/user/dataset/')
    base_model = tf.keras.applications.ResNet50V2(
        weights='imagenet',
        pooling='avg',
        include_top=False,
        input_shape=INPUT_SHAPE,
    )
    # following two lines are added after edit, originally it was model = base_model
    head_model = tf.keras.layers.Lambda(lambda x: tf.math.l2_normalize(x, axis=1))(base_model.output)
    model = tf.keras.Model(inputs=base_model.input, outputs=head_model)
    datagen = tf.keras.preprocessing.image.ImageDataGenerator(
        rotation_range=10,
        zoom_range=0.1,
    )
    generator = datagen.flow_from_directory(
        data_dir,
        target_size=INPUT_SHAPE[:2],
        batch_size=BATCH_SIZE,
        seed=42,
    )
    model.compile(
        optimizer=tf.keras.optimizers.Adam(0.001),
        loss=tfa.losses.TripletSemiHardLoss(),
    )
    model.fit(
        generator,
        epochs=5,
    )
Unfortunately after running the code I get the following error:
Found 4857 images belonging to 83 classes.
Epoch 1/5
Traceback (most recent call last):
File "ReID/external_process.py", line 35, in <module>
model.fit(
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 108, in _method_wrapper
return method(self, *args, **kwargs)
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 1098, in fit
tmp_logs = train_function(iterator)
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
result = self._call(*args, **kwds)
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 840, in _call
return self._stateless_fn(*args, **kwds)
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2829, in __call__
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1843, in _filtered_call
return self._call_flat(
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1923, in _call_flat
return self._build_call_outputs(self._inference_function.call(
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 545, in call
outputs = execute.execute(
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 1328 values, but the requested shape has 16
[[{{node TripletSemiHardLoss/PartitionedCall/Reshape}}]] [Op:__inference_train_function_13749]
Function call stack:
train_function
2020-10-23 22:07:09.094736: W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated.
[[{{node PyFunc}}]]
The dataset directory has 83 subdirectories, one per class, and each of these subdirectories contains the images of the given class. The dimension 1328 in the error output is the batch size (16) times the number of classes (83), and the dimension 16 is the batch size (both dimensions change accordingly if I change BATCH_SIZE).
To be honest, I do not really understand the error, so any solution or even any kind of insight into where the problem is would be deeply appreciated.
The problem is that TripletSemiHardLoss expects
labels y_true to be provided as 1-D integer Tensor with shape [batch_size] of multi-class integer labels
but flow_from_directory generates categorical (one-hot) labels by default; passing class_mode="sparse" should fix the problem.
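The numbers in the error message follow directly from the label layout. The sketch below uses plain Python lists rather than TensorFlow, with made-up class indices, to show that a one-hot batch holds 16 * 83 = 1328 values while the loss wants 16 integers. Taking the position of the 1 in each row gives the same kind of 1-D integer labels that the sparse mode is meant to provide.

```python
BATCH_SIZE = 16
NUM_CLASSES = 83

# Made-up class indices for one batch (illustration only).
class_ids = [(7 * i) % NUM_CLASSES for i in range(BATCH_SIZE)]

# Categorical (one-hot) labels: one row of NUM_CLASSES values per sample.
one_hot = [[1.0 if c == k else 0.0 for c in range(NUM_CLASSES)]
           for k in class_ids]
num_values_categorical = sum(len(row) for row in one_hot)

# Sparse labels: one integer per sample, recovered as the position of the 1.
sparse = [row.index(1.0) for row in one_hot]

print(num_values_categorical)  # 1328, the size the reshape complained about
print(len(sparse))             # 16, the size the loss expects
```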

tflite converter from a custom keras layer

I'm getting a TypeError when trying to convert a keras .h5 file to tflite.
The new layer is a gaussian kernel (Radial Basis Layer).
To be able to save and load the Keras model, I also defined the get_config() method in the custom layer, so saving and loading the model works correctly.
class RBFLayer(Layer):
    def __init__(self, output_dim, centers=None, tol=1e-6, gamma=0, **kwargs):
        super(RBFLayer, self).__init__(**kwargs)
        self.centers_ = centers
        self.output_dim = output_dim
        self.gamma_ = gamma
        self.tol_ = tol
    def build(self, input_shape):
        self.mu = K.variable(self.centers_, name='centers')
        self.gamma = K.variable(self.gamma_, name='gamma')
        self.tol = K.constant(self.tol_, name='tol')
        super(RBFLayer, self).build(input_shape)
    def call(self, inputs):  # Radial kernel
        a, b = self.mu.shape
        diff = K.reshape(K.tile(inputs, (1, a)) - K.reshape(self.mu, (1, -1)), (-1, a, b))
        l2 = K.sum(K.pow(diff, 2), axis=-1)
        res = K.exp(-1 * self.gamma * l2)
        mask = K.greater(res, self.tol)
        return K.switch(mask, res, K.zeros_like(res))
    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.output_dim)
    def get_config(self):
        config = {
            'output_dim': self.output_dim,
            'centers': self.centers_,
            'gamma': self.gamma_,
            'tol': self.tol_
        }
        base_config = super(RBFLayer, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
Now I want to save the model to tflite. I use TFLiteConverter from keras file including also 'custom_objects'.
def save_tflite(self, base_name):
    file = base_name + '.h5'
    converter = tf.lite.TFLiteConverter.from_keras_model_file(file, custom_objects={'RBFLayer': RBFLayer})
    tflite_model = converter.convert()
    open(base_name + ".tflite", "wb").write(tflite_model)
I expect to get the tflite model file including the K.variables used while training the complete model (centers, tol, gamma).
When converting I get these error messages:
airgorbn.save_tflite( base_name)
Traceback (most recent call last):
File "<ipython-input-7-cdaa1ec46233>", line 1, in <module>
airgorbn.save_tflite( base_name)
File "C:/Users/AIRFI/Hospital/keras_RadialBasis.py", line 158, in save_tflite
converter = tf.lite.TFLiteConverter.from_keras_model_file(file, custom_objects={'RBFLayer':RBFLayer})
File "C:\Users\AIRFI\Anaconda3\envs\tf\lib\site-packages\tensorflow\lite\python\lite.py", line 747, in from_keras_model_file
keras_model = _keras.models.load_model(model_file, custom_objects)
File "C:\Users\AIRFI\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\keras\saving\save.py", line 146, in load_model
return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
File "C:\Users\AIRFI\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\keras\saving\hdf5_format.py", line 212, in load_model_from_hdf5
custom_objects=custom_objects)
File "C:\Users\AIRFI\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\keras\saving\model_config.py", line 55, in model_from_config
return deserialize(config, custom_objects=custom_objects)
File "C:\Users\AIRFI\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\keras\layers\serialization.py", line 89, in deserialize
printable_module_name='layer')
File "C:\Users\AIRFI\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\keras\utils\generic_utils.py", line 192, in deserialize_keras_object
list(custom_objects.items())))
File "C:\Users\AIRFI\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\keras\engine\sequential.py", line 353, in from_config
model.add(layer)
File "C:\Users\AIRFI\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\training\tracking\base.py", line 457, in _method_wrapper
result = method(self, *args, **kwargs)
File "C:\Users\AIRFI\Anaconda3\envs\tf\lib\site-packages\tensorflow\python\keras\engine\sequential.py", line 154, in add
'Found: ' + str(layer))
TypeError: The added layer must be an instance of class Layer. Found: <__main__.RBFLayer object at 0x0000017D3A75AC50>
You need to define that layer as a custom op.
Refer to https://www.tensorflow.org/lite/guide/ops_custom

ValueError: An operation has `None` for gradient. while implementing custom loss function in Keras

I'm trying to implement the following custom loss function from this SO post; however, I've had to make some minor changes to suit my model. For some context, I'm using multi labels with 5 classes (below is an example of how they're encoded).
0 => [1, 0, 0, 0, 0]
1 => [1, 1, 0, 0, 0]
2 => [1, 1, 1, 0, 0]
3 => [1, 1, 1, 1, 0]
4 => [1, 1, 1, 1, 1]
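The encoding above is cumulative: class k has k + 1 leading ones. A small sketch in plain Python (the helper names encode and decode are hypothetical) shows the mapping and how summing a row and subtracting 1 recovers the class index, which mirrors the K.sum(..., axis=1) followed by tf.subtract in the loss function.

```python
NUM_CLASSES = 5

def encode(k):
    """Return the cumulative label for class k, e.g. 2 -> [1, 1, 1, 0, 0]."""
    return [1 if i <= k else 0 for i in range(NUM_CLASSES)]

def decode(label):
    """Recover the class index by counting the ones and subtracting 1."""
    return sum(label) - 1

labels = [encode(k) for k in range(NUM_CLASSES)]
decoded = [decode(label) for label in labels]
```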
My custom loss function
def _cohen_kappa(y_true, y_pred, num_classes=5, weights=None, metrics_collections=None, updates_collections=None, name=None):
    kappa, update_op = tf.contrib.metrics.cohen_kappa(y_true, y_pred, num_classes, weights, metrics_collections, updates_collections, name)
    kappa = K.cast(kappa, 'float32')
    K.get_session().run(tf.local_variables_initializer())
    with tf.control_dependencies([update_op]):
        kappa = tf.identity(kappa)
    return kappa
def cohen_kappa_loss(num_classes=5, weights=None, metrics_collections=None, updates_collections=None, name=None):
    def cohen_kappa(y_true, y_pred):
        y_true = K.cast(y_true, 'int32')
        y_pred = K.cast(y_pred + 0.5, 'int32')
        y_true = tf.subtract(K.sum(y_true, axis=1), tf.constant(1))
        y_pred = tf.subtract(K.sum(y_pred, axis=1), tf.constant(1))
        return -_cohen_kappa(y_true, y_pred, num_classes, weights, metrics_collections, updates_collections, name)
    return cohen_kappa
This is how I'm attempting to use my loss function:
model_cohen_kappa = cohen_kappa_loss(num_classes=5)
model.compile(loss=model_cohen_kappa,
              optimizer=optimizers.SGD(lr=0.0001, momentum=0.9),
              metrics=['accuracy'])
Unfortunately, I get the following error, which is confusing since my loss function doesn't contain K.argmax, K.round, or K.eval, the operations the error message lists as non-differentiable. Is there another non-differentiable operation in my custom loss function that I'm not noticing and that causes this error?
Traceback (most recent call last):
File "small_test.py", line 106, in <module>
main()
File "small_test.py", line 101, in main
max_queue_size=2
File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\engine\training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\engine\training_generator.py", line 40, in fit_generator
model._make_train_function()
File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\engine\training.py", line 509, in _make_train_function
loss=self.total_loss)
File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\optimizers.py", line 184, in get_updates
grads = self.get_gradients(loss, params)
File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\optimizers.py", line 91, in get_gradients
raise ValueError('An operation has `None` for gradient. '
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
While I suspect K.cast is non-differentiable, removing the below snippet from my loss function results in the following error:
kappa = K.cast(kappa, 'float32')
Error
Traceback (most recent call last):
File "small_test.py", line 106, in <module>
main()
File "small_test.py", line 91, in main
metrics=['accuracy'])
File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\engine\training.py", line 342, in compile
sample_weight, mask)
File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\engine\training_utils.py", line 421, in weighted
score_array *= weights
File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\tensorflow\python\ops\math_ops.py", line 884, in binary_op_wrapper
return func(x, y, name=name)
File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1180, in _mul_dispatch
return gen_math_ops.mul(x, y, name=name)
File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 6879, in mul
"Mul", x=x, y=y, name=name)
File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 563, in _apply_op_helper
inferred_from[input_arg.type_attr]))
TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type float64 of argument 'x'.

Tensor Object is not Iterable with BasicLSTMCell

I have the following code:
def dense_layers(pool3):
    with tf.variable_scope('local1') as scope:
        # Move everything into depth so we can perform a single matrix multiply.
        shape_d = pool3.get_shape()
        shape = shape_d[1] * shape_d[2] * shape_d[3]
        # tf_shape = tf.stack(shape)
        tf_shape = 1024
        print("shape:", shape, shape_d[1], shape_d[2], shape_d[3])
        # So note that tf_shape = 1024, this means that we have 1024 features are fed into the network. And
        # the batch size = 1024. Therefore, the aim is to divide the batch_size into num_steps so that
        reshape = tf.reshape(pool3, [-1, tf_shape])
        # Now we need to reshape/divide the batch_size into num_steps so that we would be feeding a sequence
        # And note that most importantly is to have batch_partition_length followed by step_size in the parameter list.
        lstm_inputs = tf.reshape(reshape, [batch_partition_length, step_size, tf_shape])
        # print('RNN inputs shape: ', lstm_inputs.get_shape()) # -> (128, 8, 1024).
        # Note that the state_size is the number of neurons.
        lstm = tf.contrib.rnn.BasicLSTMCell(state_size)
        lstm_outputs, final_state = tf.nn.dynamic_rnn(cell=lstm, inputs=lstm_inputs, initial_state=init_state)
        tf.assign(init_state, final_state)
So, I am taking the output of the pool layer and try to feed it into the LSTM in the network.
Initially I have declared the following:
state_size = 16
step_size = 8
batch_partition_length = int(batch_size / step_size)
init_state = tf.Variable(tf.zeros([batch_partition_length, state_size])) # -> [128, 16].
However, I am getting an error on:
lstm_outputs, final_state = tf.nn.dynamic_rnn(cell=lstm, inputs=lstm_inputs, initial_state=init_state)
As follows:
Traceback (most recent call last):
File "C:/Users/user/PycharmProjects/AffectiveComputing/Brady_with_LSTM.py", line 197, in <module>
predictions = dense_layers(conv_nets_output)
File "C:/Users/user/PycharmProjects/AffectiveComputing/Brady_with_LSTM.py", line 162, in dense_layers
lstm_outputs, final_state = tf.nn.dynamic_rnn(cell=lstm, inputs=lstm_inputs, initial_state=init_state)
File "C:\Users\user\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\ops\rnn.py", line 553, in dynamic_rnn
dtype=dtype)
File "C:\Users\user\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\ops\rnn.py", line 720, in _dynamic_rnn_loop
swap_memory=swap_memory)
File "C:\Users\user\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2623, in while_loop
result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "C:\Users\user\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2456, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "C:\Users\user\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2406, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "C:\Users\user\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\ops\rnn.py", line 705, in _time_step
(output, new_state) = call_cell()
File "C:\Users\user\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\ops\rnn.py", line 691, in <lambda>
call_cell = lambda: cell(input_t, state)
File "C:\Users\user\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\contrib\rnn\python\ops\core_rnn_cell_impl.py", line 238, in __call__
c, h = state
File "C:\Users\user\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 504, in __iter__
raise TypeError("'Tensor' object is not iterable.")
TypeError: 'Tensor' object is not iterable.
Any help is much appreciated!!
The state for LSTMs really consists of two parts:
State for the cell(s)
Previous outputs
This is alluded to in the docs for BasicLSTMCell. This paper has a good explanation of how LSTMs work, which will help you understand why you need to keep two sets of states in an LSTM implementation. The error is thrown because you need to supply a tuple of tensors for the initial state, whereas init_state is a single tensor.
That said, you have two options:
1. Supply an initial state that consists of two tensors.
2. Let the RNN cell generate its own initial state.
You would usually only do 1. if you wanted to override the default behavior. In this case you are using the default (zero) initial state, so you can do 2.
lstm_outputs, final_state = tf.nn.dynamic_rnn(cell=lstm, inputs=lstm_inputs, dtype=tf.float32)
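To see why a single tensor fails where a pair works, here is a plain-Python sketch with no TensorFlow dependency. LSTMState, zero_state and cell_step are hypothetical stand-ins for the (c, h) pair that the cell unpacks with `c, h = state`; nested lists of shape [128, 16] stand in for tensors.

```python
from collections import namedtuple

# Hypothetical stand-in for the (cell state, previous output) pair.
LSTMState = namedtuple("LSTMState", ["c", "h"])

def zero_state(batch_size, state_size):
    """Build an all-zero (c, h) pair, each of shape [batch_size, state_size]."""
    def zeros():
        return [[0.0] * state_size for _ in range(batch_size)]
    return LSTMState(c=zeros(), h=zeros())

def cell_step(state):
    """Unpack the state the way the traceback shows: `c, h = state`."""
    c, h = state
    return len(c), len(h)

# A two-part state unpacks cleanly.
pair_shapes = cell_step(zero_state(128, 16))

# A single non-iterable object cannot be unpacked into c and h.
try:
    cell_step(object())
    single_ok = True
except TypeError:
    single_ok = False
```

If you do want to pass your own initial state (option 1), it needs the same two-part structure, with one tensor for the cell state and one for the previous output.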