Loading subclassed model from separate file fails - tensorflow

I train and save a subclassed keras Model in one file, say file A, and want to load it in another file, B. The problem is that in file A I can save the model and even load it back with no problems, but the moment I try to load the model in a different file (e.g. file B) I get the following error:
ValueError: Could not find matching concrete function to call loaded from the SavedModel.
Got:
Positional arguments (1 total):
* [<tf.Tensor 'inputs:0' shape=(?, 4498) dtype=float32>, <tf.Tensor 'inputs_1:0'
shape=(?, 4) dtype=float32>]
Keyword arguments: {}
Expected these arguments to match one of the following 2 option(s):
Option 1:
Positional arguments (1 total):
* (TensorSpec(shape=(?, 4498), dtype=tf.float32, name='inputs/0'),
TensorSpec(shape=(?, 4), dtype=tf.float32, name='inputs/1'))
Keyword arguments: {}
Option 2:
Positional arguments (1 total):
* (TensorSpec(shape=(?, 4498), dtype=tf.float32, name='input_1'),
TensorSpec(shape=(?, 4), dtype=tf.float32, name='input_2'))
Keyword arguments: {}
Does anybody have an idea how to fix this? Or what might cause the problem?
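One detail that stands out in the traceback: the call received a list of two tensors, while both expected options are tuples. A hedged sketch of a workaround based on that reading (saved_dir, input_a and input_b are placeholder names, not from the question):
import tensorflow as tf

loaded = tf.keras.models.load_model(saved_dir)
# Pass the inputs as a tuple instead of a list so the call can match one of
# the saved concrete function signatures shown above.
preds = loaded((input_a, input_b))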

Related

Layer "model" expects 2 input(s), but it received 1 input tensors

I built a VQA model with two inputs (images, questions).
It trained well with the train/val datasets, but with test_dataset it keeps printing errors like the one below:
ValueError: Layer "model" expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(224, 224, 3) dtype=float32>]
The variables I used are test_qt and test_it; both are lists of tensors.
I built the dataset with this code:
test_ds = tf.data.Dataset.from_tensor_slices((test_it, test_qt))
I also tried passing each input directly and separately, but got this error:
ValueError: Data cardinality is ambiguous:
x sizes: 224, 224, 302, 302
Make sure all arrays contain the same number of samples.
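For reference, a hedged sketch of one way to build the dataset so that both tensors reach the model as a single two-input x (the stacking and batch size are assumptions; model is the trained two-input model):
import tensorflow as tf

# Stack each list of per-sample tensors into one tensor.
images = tf.stack(test_it)     # (num_samples, 224, 224, 3)
questions = tf.stack(test_qt)  # (num_samples, seq_len)

# Nest the pair in an outer tuple: a bare (a, b) tuple is read by predict()
# as (x, y), while ((a, b),) is one x with two inputs. Then batch, so each
# element carries the leading batch dimension the model expects.
test_ds = tf.data.Dataset.from_tensor_slices(((images, questions),)).batch(32)
predictions = model.predict(test_ds)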

TF Agents: How to feed faked observations in to a trained deep Q network model to examine which actions it chooses?

All descriptions of links referenced in the question below are from 2021/05/31.
I have trained a deep Q network following the version of the TF Agents tutorial on a custom problem. Now I would like to feed it some hand-crafted observations to see what actions it recommends. I have some utility functions for creating these feature vectors that I use in my PyEnvironment. However, I am not sure how to convert these bits into something I can feed into the network.
What I would like to have is something like the following:
Feed in an initial state, and see the recommended action from the network.
Manually alter the state, and see what the network recommends next.
And so on...
My environment has a stochastic component, so I want to manually modify the environment state rather than have the agent explicitly take a path through the environment.
To make progress on this question, I have been examining this tutorial on policies. It looks like my use case might be similar to the section "Random TF Policy" or the one below it on "Actor policies". However, in my case I have a loaded agent and Python (non-TF) observation, time-step, and action specs. What is the ideal approach to drive my network to produce actions from these components?
Here is something I have tried:
saved_policy = tf.compat.v2.saved_model.load(policy_dir)
# get_feat_vector returns a numpy.ndarray
observation = tf.convert_to_tensor(state.get_feat_vector(), dtype=tf.float32)
time_step = ts.restart(observation)
action_step = saved_policy.action(time_step)
and the associated error message:
File "/home/---/.local/lib/python3.8/site-packages/tensorflow/python/saved_model/function_deserialization.py", line 267, in restored_function_body
raise ValueError(
ValueError: Could not find matching function to call loaded from the SavedModel. Got:
Positional arguments (2 total):
* TimeStep(step_type=<tf.Tensor 'time_step:0' shape=() dtype=int32>, reward=<tf.Tensor 'time_step_1:0' shape=() dtype=float32>, discount=<tf.Tensor 'time_step_2:0' shape=() dtype=float32>, observation=<tf.Tensor 'time_step_3:0' shape=(170,) dtype=float32>)
* ()
Keyword arguments: {}
Expected these arguments to match one of the following 2 option(s):
Option 1:
Positional arguments (2 total):
* TimeStep(step_type=TensorSpec(shape=(None,), dtype=tf.int32, name='step_type'), reward=TensorSpec(shape=(None,), dtype=tf.float32, name='reward'), discount=TensorSpec(shape=(None,), dtype=tf.float32, name='discount'), observation=TensorSpec(shape=(None, 170), dtype=tf.float32, name='observation'))
* ()
Keyword arguments: {}
Option 2:
Positional arguments (2 total):
* TimeStep(step_type=TensorSpec(shape=(None,), dtype=tf.int32, name='time_step/step_type'), reward=TensorSpec(shape=(None,), dtype=tf.float32, name='time_step/reward'), discount=TensorSpec(shape=(None,), dtype=tf.float32, name='time_step/discount'), observation=TensorSpec(shape=(None, 170), dtype=tf.float32, name='time_step/observation'))
* ()
Keyword arguments: {}
I believe your problem might be with how you are loading and saving the model. TF-Agents recommends using the PolicySaver (see here). So maybe try running code like:
tf_agent = ...
tf_policy_saver = policy_saver.PolicySaver(policy=tf_agent.policy)
... # train agent
tf_policy_saver.save(export_dir=policy_dir_path)
and then load and run the model with:
eager_py_policy = py_tf_eager_policy.SavedModelPyTFEagerPolicy(
    policy_dir, env.time_step_spec(), env.action_spec())
policy_state = eager_py_policy.get_initial_state(1)
time_step = env.reset()
action_step = eager_py_policy.action(time_step, policy_state)
time_step = env.step(action_step.action)
policy_state = action_step.state
Or whatever manual thing you want to do with the environment and observations.
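If you would rather keep driving the raw SavedModel from the question, a hedged sketch that follows the expected specs in the error above (every TimeStep field needs a leading batch dimension):
import tensorflow as tf
from tf_agents.trajectories import time_step as ts

# get_feat_vector returns a numpy.ndarray of shape (170,); add a batch dim.
observation = tf.convert_to_tensor(state.get_feat_vector(), dtype=tf.float32)
observation = tf.expand_dims(observation, axis=0)  # shape (1, 170)

time_step = ts.TimeStep(
    step_type=tf.constant([ts.StepType.FIRST], dtype=tf.int32),
    reward=tf.constant([0.0], dtype=tf.float32),
    discount=tf.constant([1.0], dtype=tf.float32),
    observation=observation)

action_step = saved_policy.action(time_step)
print(action_step.action)  # the action recommended for the hand-crafted state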

ValueError: Could not find matching function to call loaded from the SavedModel and 'CheckpointLoadStatus' object has no attribute 'predict'

I am working on categorizing reviews into multiple labels and built a multi-label text classifier by referring to this code. The classification model is based on the BERT text model. I ran into an issue with predicting unseen data with the trained model and posted a question here. According to the solutions provided there, I tried to save my model and load it in the following way:
text_model.save('/tmp/model')
loaded_model=tf.keras.models.load_model('/tmp/model')
result = loaded_model.predict(np.asarray(item))
When I try to predict unseen data using the loaded model I get the following error.
ValueError: Could not find matching function to call loaded from the SavedModel. Got:
Positional arguments (2 total):
* Tensor("inputs:0", shape=(None, 1), dtype=int64)
* False
Keyword arguments: {}
Expected these arguments to match one of the following 4 option(s):
Option 1:
Positional arguments (2 total):
* TensorSpec(shape=(None, None), dtype=tf.int32, name='input_1')
* False
Keyword arguments: {}
Option 2:
Positional arguments (2 total):
* TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')
* True
Keyword arguments: {}
Option 3:
Positional arguments (2 total):
* TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')
* False
Keyword arguments: {}
Option 4:
Positional arguments (2 total):
* TensorSpec(shape=(None, None), dtype=tf.int32, name='input_1')
* True
Keyword arguments: {}
After studying similar cases, I then tried using save_weights and load_weights. The code is given below:
text_model.save_weights("model.hd5")
loaded_model = TEXT_MODEL(vocabulary_size=VOCAB_LENGTH,
                          embedding_dimensions=EMB_DIM,
                          cnn_filters=CNN_FILTERS,
                          dnn_units=DNN_UNITS,
                          model_output_classes=OUTPUT_CLASSES,
                          dropout_rate=DROPOUT_RATE)
loaded_model=text_model.load_weights('model.hd5')
result = loaded_model.predict(np.asarray(item1))
It gives me the error 'CheckpointLoadStatus' object has no attribute 'predict'.
If this code is not enough I have provided the code for the implementation and training part of the model in this question.
I was able to solve my issue. I hadn't built the model before loading the weights, so the layers in the subclassed model were never initialized. (Also note that load_weights returns a CheckpointLoadStatus object rather than the model, so assigning its result to loaded_model is what made the predict call fail.) The following code shows how I fixed the issue by calling the build() method:
text_model.save_weights("model.h5")
new_model = TEXT_MODEL(vocabulary_size=VOCAB_LENGTH,
                       embedding_dimensions=EMB_DIM,
                       cnn_filters=CNN_FILTERS,
                       dnn_units=DNN_UNITS,
                       model_output_classes=OUTPUT_CLASSES,
                       dropout_rate=DROPOUT_RATE)
new_model.build((2487,260))
new_model.load_weights('model.h5')
result = new_model.predict([item1])
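As an aside, a hedged alternative to build(): running one forward pass on a dummy batch also creates the variables of a subclassed model before load_weights (the shape and dtype below are assumptions mirroring the build() call above):
import numpy as np

dummy = np.zeros((1, 260), dtype=np.int64)  # one sample, sequence length 260
_ = new_model(dummy)                        # forward pass builds all sublayers
new_model.load_weights('model.h5')
result = new_model.predict([item1])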

How to get all layers' activations for a specific input for Tensorflow Hub modules?

I am new to TensorFlow Hub. I want to use the I3D module, fine-tune this network on another dataset, and I need the last hidden layer as well as some other layers' outputs. I was wondering if there is a way to get the other layers' activations. The only signature provided for I3D is "default". I think there should be a way to get the output of all layers easily with TensorFlow Hub modules.
import tensorflow_hub as hub
module = hub.Module("https://tfhub.dev/deepmind/i3d-kinetics-600/1", trainable=False)
logits = module(inp)
This will give me the final layer output. How can I get other layer's outputs, for example, the second convolution layer's output?
https://tfhub.dev/deepmind/i3d-kinetics-400/1 (and also the *-600 version) happen to export only the final layer, so there is no properly supported way to get the other layers. (That said, you can always experiment by inspecting the graph and selecting tensors by name, but this has a real risk of breaking with newer module or library versions.)
You can get the other layers by name. Using Inception-v3 as an example:
import tensorflow_hub as hub
module = hub.Module("https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1")
logits = module(inp, as_dict=True)
With as_dict=True, the call returns all of the model's named outputs rather than just the default tensor. You can view them by calling items():
print(logits.items())
This outputs a dictionary containing all the layers in the graph, a few of which are shown below:
dict_items([
('InceptionV3/Mixed_6c', <tf.Tensor 'module_2_apply_image_feature_vector/InceptionV3/InceptionV3/Mixed_6c/concat:0' shape=(1, 17, 17, 768) dtype=float32>),
('InceptionV3/Mixed_6d', <tf.Tensor 'module_2_apply_image_feature_vector/InceptionV3/InceptionV3/Mixed_6d/concat:0' shape=(1, 17, 17, 768) dtype=float32>),
('InceptionV3/Mixed_6e', <tf.Tensor 'module_2_apply_image_feature_vector/InceptionV3/InceptionV3/Mixed_6e/concat:0' shape=(1, 17, 17, 768) dtype=float32>),
('default', <tf.Tensor 'module_2_apply_image_feature_vector/hub_output/feature_vector/SpatialSqueeze:0' shape=(1, 2048) dtype=float32>),
('InceptionV3/MaxPool_5a_3x3', <tf.Tensor 'module_2_apply_image_feature_vector/InceptionV3/InceptionV3/MaxPool_5a_3x3/MaxPool:0' shape=(1, 35, 35, 192) dtype=float32>)])
Usually to get the last layer, you would use default:
sess.run(logits['default'])
But you can just as easily get other layers using their name:
sess.run(logits['InceptionV3/MaxPool_5a_3x3'])
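Putting the pieces together, a minimal end-to-end sketch with the TF1-style hub.Module API (the 299x299 placeholder is Inception v3's default input size, and the zero image is a stand-in):
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

inp = tf.placeholder(tf.float32, shape=(1, 299, 299, 3))
module = hub.Module("https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1")
outputs = module(inp, as_dict=True)  # dict of named intermediate tensors

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    img = np.zeros((1, 299, 299, 3), dtype=np.float32)  # stand-in image
    acts = sess.run(outputs['InceptionV3/MaxPool_5a_3x3'], feed_dict={inp: img})
    print(acts.shape)  # (1, 35, 35, 192)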

Why can't tensorflow determine the shape of this expression?

I have the following expression, which is giving me problems. I have defined batch_size as batch_size = tf.shape(input_tensor)[0], which dynamically determines the size of the batch from the input tensor to the model. I have used it elsewhere in the code without issue. What confuses me is that when I run the following line of code it says the shape is (?, ?); I would expect it to be (?, 128), because it knows the second dimension.
print(tf.zeros((batch_size, 128)).get_shape())
I want to know the shape since I am trying to do the following and I am getting an error.
rnn_input = tf.reduce_sum(w * decoder_input, 1)
last_out = decoder_outputs[t - 1] if t else tf.zeros((batch_size, 128))
rnn_input = tf.concat(1, (rnn_input, last_out))
This code needs to set last_out to zero on the first time step.
Here is the error: ValueError: Linear expects shape[1] of arguments: [[None, None], [None, 1024]]
I am doing something similar when I determine my initial state vector for the RNNs.
state = tf.zeros((batch_size, decoder_multi_rnn.state_size), tf.float32)
I also get (?, ?) when I try to print the size of state, but it does not throw any exceptions when I use it.
You are mixing static shapes and dynamic shapes. The static shape is what you get from tensor.get_shape(), which is a best-effort attempt to infer the shape, while the dynamic shape comes from sess.run(tf.shape(tensor)) and is always defined.
To be more precise, tf.shape(tensor) creates an op in the graph that will produce the shape tensor on the run call. If you do aop = tf.shape(tensor)[0], there is some magic through _SliceHelper that adds extra ops which extract the first element of the shape tensor on the run call.
This means that myval = tf.zeros((aop, 128)) has to run aop to obtain the dimensions, so the first dimension of myval is undefined until you issue the run call. I.e., your run call could look like sess.run(myval, feed_dict={aop: 2}), where feed_dict overrides aop with 2. Hence static shape inference reports ? for that dimension.
(EDIT: I have rewritten this answer, as what I wrote before was not to the point.)
The quick fix to your issue is to use set_shape() to update the static (inferred) shape of the Tensor:
input_tensor = tf.placeholder(tf.float32, [None, 32])
batch_size = tf.shape(input_tensor)[0]
res = tf.zeros((batch_size, 128))
print(res.get_shape())  # prints (?, ?) WHEREAS one could expect (?, 128)
res.set_shape([None, 128])
print(res.get_shape())  # prints (?, 128)
As for why TensorFlow loses the information about the second dimension being 128, I don't really know.
Maybe @Yaroslav will be able to answer.
EDIT:
The incorrect behavior was corrected following this issue.