keras.layers.TimeDistributed with Huggingface Transformer gives NotImplementedError - tensorflow

I wanted to apply BERT to a sequence of sentences in the following manner, but I am getting a NotImplementedError.
How to reproduce:
import tensorflow as tf
from transformers import BertTokenizer, TFBertModel
inputs = tf.keras.Input(shape=(50, 64), dtype='int32')
model = TFBertModel.from_pretrained('bert-base-uncased')
outputs = tf.keras.layers.TimeDistributed(model)(inputs)
NotImplementedError Traceback (most recent call last)
<ipython-input-5-631f3cd2e8b2> in <module>
----> 1 outputs = tf.keras.layers.TimeDistributed(model)(inputs)
Whereas the code works fine for
inputs = tf.keras.Input(shape=(10, 128, 128, 3))
conv_2d_layer = tf.keras.layers.Conv2D(64, (3, 3))
outputs = tf.keras.layers.TimeDistributed(conv_2d_layer)(inputs)
Is there anything I am missing here?
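A common workaround when TimeDistributed cannot introspect a wrapped model is to fold the sentence dimension into the batch dimension, run BERT once, and unfold afterwards. The following is a minimal sketch of that idea; the reshape-based approach and the [0] indexing of BERT's outputs are assumptions, not something confirmed in this thread:
import tensorflow as tf
from transformers import TFBertModel
NUM_SENTENCES, SEQ_LEN = 50, 64  # matches the question's Input shape
bert = TFBertModel.from_pretrained('bert-base-uncased')
inputs = tf.keras.Input(shape=(NUM_SENTENCES, SEQ_LEN), dtype='int32')
# Merge the batch and sentence dimensions so BERT sees a plain 2-D batch of token ids.
flat = tf.reshape(inputs, (-1, SEQ_LEN))
sequence_output = bert(flat)[0]  # [0] = last_hidden_state; newer transformers versions also expose .last_hidden_state
# Restore the per-sentence structure: (batch, sentences, tokens, hidden_size).
outputs = tf.reshape(sequence_output, (-1, NUM_SENTENCES, SEQ_LEN, sequence_output.shape[-1]))
model = tf.keras.Model(inputs, outputs)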

Related

Tensorflow cannot quantize reshape function

I am going to train my model quantization-aware. However, when I use it, tensorflow_model_optimization cannot quantize the tf.reshape function and throws an error.
tensorflow version : '2.4.0-dev20200903'
python version : 3.6.9
the code:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '3'
from tensorflow.keras.applications import VGG16
import tensorflow_model_optimization as tfmot
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
quantize_model = tfmot.quantization.keras.quantize_model
inputs = keras.Input(shape=(784,))
# img_inputs = keras.Input(shape=(32, 32, 3))
dense = layers.Dense(64, activation="relu")
x = dense(inputs)
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(10)(x)
outputs = tf.reshape(outputs, [-1, 2, 5])
model = keras.Model(inputs=inputs, outputs=outputs, name="mnist_model")
# keras.utils.plot_model(model, "my_first_model.png")
q_aware_model = quantize_model(model)
and the output:
Traceback (most recent call last):
File "<ipython-input-39-af601b78c010>", line 14, in <module>
q_aware_model = quantize_model(model)
File "/home/essys/.local/lib/python3.6/site-packages/tensorflow_model_optimization/python/core/quantization/keras/quantize.py", line 137, in quantize_model
annotated_model = quantize_annotate_model(to_quantize)
File "/home/essys/.local/lib/python3.6/site-packages/tensorflow_model_optimization/python/core/quantization/keras/quantize.py", line 210, in quantize_annotate_model
to_annotate, input_tensors=None, clone_function=_add_quant_wrapper)
...
File "/home/essys/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 667, in wrapper
raise e.ag_error_metadata.to_exception(e)
TypeError: in user code:
TypeError: tf__call() got an unexpected keyword argument 'shape'
If somebody knows, please help.
The reason is that your layer is not yet supported for QAT at the moment. If you want to quantize it, you have to write the quantization yourself with quantize_annotate_layer, pass it through quantize_scope, and apply it to your model with quantize_apply, as described here: https://www.tensorflow.org/model_optimization/guide/quantization/training_comprehensive_guide?hl=en#quantize_custom_keras_layer
I have created a batch_norm_layer here as an example.
TensorFlow 2.x QAT layer coverage is not complete; please consider using TF 1.x and adding FakeQuant ops after operators.
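A simpler route in this particular case may be to express the reshape as the built-in keras.layers.Reshape layer rather than a raw tf.reshape op, since quantize_model only understands Keras layers. This is a sketch based on that assumption, not part of the answers above:
import tensorflow_model_optimization as tfmot
from tensorflow import keras
from tensorflow.keras import layers
inputs = keras.Input(shape=(784,))
x = layers.Dense(64, activation="relu")(inputs)
x = layers.Dense(64, activation="relu")(x)
x = layers.Dense(10)(x)
# layers.Reshape is a Keras layer, so the model cloning inside quantize_model can wrap it.
outputs = layers.Reshape((2, 5))(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="mnist_model")
q_aware_model = tfmot.quantization.keras.quantize_model(model)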

Error in removing the first layer of keras model

import numpy as np
from keras.applications.vgg19 import decode_predictions
from prettytable import PrettyTable
import time
from keras import backend as K
from tensorflow import keras
from tensorflow.python import keras
from keras import models, layers, Model, Input
import tensorflow as tf

model_2 = keras.models.load_model('model_2.h5', compile=False)
model_2.summary()
predictions1 = np.load('D:/predictions_result.npy')

def profiler(model, test_input):
    data_input = test_input
    for layer in model.layers:
        start = time.time()
        im_imput = keras.layers.Input(batch_shape=model.get_layer(layer.name).get_input_shape_at(0))
        im_out = layer(im_imput)
        new_model = keras.models.Model(inputs=im_imput, outputs=im_out)
        data_input = new_model.predict(data_input)
        end = time.time() - start
        print(end)
    result = 1

profiler(model_2, predictions1)
tmp = np.zeros((1, 224, 224, 64))
for i in range(0, 1):
    tmp[i, :, :, :] = predictions1[i, :]
predictions2 = model_2.predict(tmp)
label_vgg19 = decode_predictions(predictions2)
print('label_vgg19 =', label_vgg19)
When I try to run the above code, I get the error below. My question is how to remove the first layer of the model after loading it. I initially split the VGG model into sub-models and then load a submodel. I have tried different approaches, but none works. Help is highly appreciated.
Traceback (most recent call last):
File "C:/Users/40227422/PycharmProjects/model_partititon/model_2_sock.py", line 42, in <module>
profiler(model_2,predictions1)
File "C:/Users/40227422/PycharmProjects/model_partititon/model_2_sock.py", line 28, in profiler
data_input = new_model.predict(data_input)
File "C:\Users\40227422\AppData\Local\Continuum\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\training_utils.py", line 332, in standardize_input_data
' but got array with shape ' + str(data_shape))
ValueError: Error when checking input: expected input_1 to have shape (224, 224, 3) but got array with shape (224, 224, 64)
When I try to use kerassurgeon to delete a layer with the code below, I get the error
ValueError: not enough values to unpack (expected 2, got 0)
from kerassurgeon import Surgeon
surgeon = Surgeon(model_2)
layer_1 = model_2.layers[0] # selecting the first layer
surgeon.add_job('delete_layer', layer_1)
new_model = surgeon.operate()
I was not able to recreate the error you are facing; maybe you can share reproducible code. Below are the options I tried to delete a layer, and both worked.
from kerassurgeon.operations import delete_layer
# delete layer_1 from a model
model = delete_layer(model_2, layer_1)
OR
# delete layer_1 from a model
from kerassurgeon import Surgeon
surgeon = Surgeon(model_2)
surgeon.add_job('delete_layer', layer_1)
new_model = surgeon.operate()
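If kerassurgeon keeps failing, another common approach, sketched here as an assumption rather than part of the original answer, is to rebuild the model with the functional API, feeding a new input into everything after the layer you want to drop. This only works for strictly sequential topologies such as VGG:
from tensorflow import keras
# New input matching what the second layer expects;
# (224, 224, 64) comes from the question's error message.
new_input = keras.Input(shape=(224, 224, 64))
x = new_input
for layer in model_2.layers[1:]:  # skip the first layer
    x = layer(x)
new_model = keras.Model(inputs=new_input, outputs=x)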

Unable to import a pretrained model after calling Keras.backend.clear_session()

I am trying to train a model with new data samples in each iteration of a loop in Keras (using the TensorFlow backend). Due to a GPU memory error after some iterations, I appended K.clear_session(). However, after one iteration, the code throws the error:
'Cannot interpret feed_dict key as Tensor: ' + e.args[0])
TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("Placeholder:0", shape=(7, 7, 3, 64), dtype=float32) is not an element of this graph.
If I remove K.clear_session() at the end, there is no error. Can anyone explain why this error appears in the second iteration?
I tried other methods for releasing GPU memory, but none of them worked, and this is my last option. I have pasted example code that reproduces the error. Please note that this is not the actual code; it is just a minimal example that reproduces the error I am facing in the actual code.
from __future__ import absolute_import, division, print_function, unicode_literals
import numpy as np
import tensorflow as tf
import random
seed_value = 0
import os
import keras
os.environ['PYTHONHASHSEED'] = str(seed_value)
random.seed(0)
np.random.seed(0)
from keras import backend as K
from keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
for i in range(3):
    base_model = tf.keras.applications.resnet50.ResNet50(weights='imagenet', input_shape=(32, 32, 3),
                                                         include_top=False)
    x = base_model.output
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    output = tf.keras.layers.Dense(10, activation='softmax',
                                   kernel_initializer=tf.keras.initializers.RandomNormal(seed=4))(x)
    model = tf.keras.Model(inputs=base_model.input, outputs=output)
    y_train = keras.utils.to_categorical(y_train, 10)
    y_test = keras.utils.to_categorical(y_test, 10)
    for layer in base_model.layers:
        layer.trainable = False
    optimizer = tf.train.AdamOptimizer(learning_rate=0.0001)
    model.compile(optimizer=optimizer, loss='categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, batch_size=1024, epochs=1, verbose=1)
    K.clear_session()
Traceback (most recent call last):
File "C:\Users\sirshad\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1092, in _run
subfeed, allow_tensor=True, allow_operation=False)
File "C:\Users\sirshad\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 3490, in as_graph_element
return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
File "C:\Users\sirshad\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 3569, in _as_graph_element_locked
raise ValueError("Tensor %s is not an element of this graph." % obj)
ValueError: Tensor Tensor("Placeholder:0", shape=(7, 7, 3, 64), dtype=float32) is not an element of this graph.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "E:/codes/experiments-AL/breakhis/40X-M-B/codes-AL/error_debug.py", line 22, in <module>
include_top=False)
File "C:\Users\sirshad\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\applications\__init__.py", line 70, in wrapper
return base_fun(*args, **kwargs)
File "C:\Users\sirshad\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\applications\resnet50.py", line 32, in ResNet50
return resnet50.ResNet50(*args, **kwargs)
File "C:\Users\sirshad\AppData\Local\Programs\Python\Python36\lib\site-packages\keras_applications\resnet50.py", line 291, in ResNet50
model.load_weights(weights_path)
File "C:\Users\sirshad\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\network.py", line 1544, in load_weights
saving.load_weights_from_hdf5_group(f, self.layers)
File "C:\Users\sirshad\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\saving.py", line 806, in load_weights_from_hdf5_group
K.batch_set_value(weight_value_tuples)
File "C:\Users\sirshad\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\backend.py", line 2784, in batch_set_value
get_session().run(assign_ops, feed_dict=feed_dict)
File "C:\Users\sirshad\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
run_metadata_ptr)
File "C:\Users\sirshad\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1095, in _run
'Cannot interpret feed_dict key as Tensor: ' + e.args[0])
TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("Placeholder:0", shape=(7, 7, 3, 64), dtype=float32) is not an element of this graph.
Process finished with exit code 1
I was able to overcome this issue by saving the ImageNet-pretrained model to disk and then loading it in every loop iteration after calling tf.keras.backend.clear_session(). So saving the base model to a file and then loading it works. But I am still confused why it did not work before with
base_model = tf.keras.applications.resnet50.ResNet50
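A minimal sketch of this save-then-reload workaround (the file name is an illustrative assumption):
import tensorflow as tf
# Instantiate the pretrained network once and persist it to disk.
base_model = tf.keras.applications.resnet50.ResNet50(
    weights='imagenet', input_shape=(32, 32, 3), include_top=False)
base_model.save('resnet50_base.h5')
for i in range(3):
    # Reload from disk so the weights are restored into the fresh graph
    # created after the previous clear_session() call.
    base_model = tf.keras.models.load_model('resnet50_base.h5')
    # ... build, compile, and fit the full model as before ...
    tf.keras.backend.clear_session()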

ConvLSTM2D w/ functional API

I have an autoencoder for image compression, where the encoded tensor has the shape (batch_size, 12, 64, 48):
batch_size is the number of images being fed in a batch,
12 is the number of channels of this last encoder layer,
64x48 is the width/height.
I want to input this to a ConvLSTM2D layer, and I would like the output of the ConvLSTM2D to have the same dimensions as its input.
The intention is to see image reconstruction on a video sequence, rather than on unordered images from a dataset.
Placing a ConvLSTM2D between the encoder and decoder of an autoencoder architecture has been difficult, especially because most examples use the Sequential API, and I want to use the functional API in Keras.
I tried reshaping the input, but the error persists.
import tensorflow as tf
import tensorflow.keras.backend as K

def LSTM_layer(input):
    input = tf.keras.backend.expand_dims(input, axis=-1)
    lstm1 = tf.keras.layers.ConvLSTM2D(filters=12, kernel_size=(3, 3), strides=(1, 1), data_format="channels_first",
                                       input_shape=(None, 12, 64, 48),
                                       padding='same', return_sequences=True)(input)
    return lstm1

def build_model(input_shape):
    # create an input with input_shape as the size
    input_ = tf.keras.Input(shape=input_shape, name="input_node")
    lstm_features = LSTM_layer(input_)
    model = tf.keras.Model(inputs=input_, outputs=[lstm_features])
    return model

def main():
    input_shape = (12, 64, 48)  # size of the tensor output by my encoder, channels_first assumed
    model = build_model(input_shape)

if __name__ == '__main__':
    main()
Unfortunately, this is throwing this error:
Traceback (most recent call last):
File "lstm.py", line 29, in <module>
main()
File "lstm.py", line 26, in main
model = build_model(input_shape)
File "lstm.py", line 20, in build_model
model = tf.keras.Model(inputs=input_, outputs=[lstm_features])
File "/home/hallab/.local/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 121, in __init__
super(Model, self).__init__(*args, **kwargs)
File "/home/hallab/.local/lib/python3.5/site-packages/tensorflow/python/keras/engine/network.py", line 80, in __init__
self._init_graph_network(*args, **kwargs)
File "/home/hallab/.local/lib/python3.5/site-packages/tensorflow/python/training/checkpointable/base.py", line 474, in _method_wrapper
method(self, *args, **kwargs)
File "/home/hallab/.local/lib/python3.5/site-packages/tensorflow/python/keras/engine/network.py", line 224, in _init_graph_network
'(thus holding past layer metadata). Found: ' + str(x))
ValueError: Output tensors to a Model must be the output of a TensorFlow `Layer` (thus holding past layer metadata). Found: Tensor("conv_lst_m2d/transpose_1:0", shape=(?, 12, 12, 48, 1), dtype=float32)
Most posts about this error instruct you to wrap the operation in a Lambda, but I am not implementing a custom operation here; this should be a Keras TF layer, right?
Also, in my implementation I want the output tensor from the LSTM unit to have the same shape as the input; can I get some feedback on that as well?
Thank you.
You could use Lambda to wrap the output of K.expand_dims before feeding it to the next layer, like this:
import tensorflow as tf
import tensorflow.keras.backend as K
from tensorflow.keras.layers import Lambda

def expand_dims(x):
    return K.expand_dims(x, 1)

def expand_dims_output_shape(input_shape):
    # Insert a length-1 axis at position 1, keeping all remaining dimensions.
    return (input_shape[0], 1) + tuple(input_shape[1:])

def LSTM_layer(input_):
    lstm1 = Lambda(expand_dims, expand_dims_output_shape)(input_)
    lstm1 = tf.keras.layers.ConvLSTM2D(filters=12, kernel_size=(3, 3), strides=(1, 1),
                                       data_format="channels_first", padding='same',
                                       return_sequences=False)(lstm1)
    return lstm1
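To get an output with the same shape as the ConvLSTM2D input (the second part of the question), one option is to use return_sequences=False with as many filters as input channels, so the added length-1 time axis collapses again. This is a sketch under the question's (batch, 12, 64, 48) assumption, not part of the original answer:
import tensorflow as tf
import tensorflow.keras.backend as K
from tensorflow.keras.layers import Lambda

def build_model(input_shape=(12, 64, 48)):
    input_ = tf.keras.Input(shape=input_shape, name="input_node")
    # (batch, 12, 64, 48) -> (batch, 1, 12, 64, 48): add a length-1 time axis.
    x = Lambda(lambda t: K.expand_dims(t, 1))(input_)
    # With return_sequences=False and filters=12, the output collapses back
    # to (batch, 12, 64, 48), matching the encoder's output shape.
    x = tf.keras.layers.ConvLSTM2D(filters=12, kernel_size=(3, 3),
                                   data_format="channels_first",
                                   padding='same', return_sequences=False)(x)
    return tf.keras.Model(inputs=input_, outputs=x)

model = build_model()
model.summary()  # final output shape should be (None, 12, 64, 48)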

Trying to implement a recurrent network with tf.scan()

I am trying to implement a recurrent state tensor using tf.scan. The code I have at the moment is this:
import tensorflow as tf
import math
import numpy as np

INPUTS = 10
HIDDEN_1 = 20
BATCH_SIZE = 3

def iterate_state(prev_state_tuple, input):
    with tf.name_scope('h1'):
        weights = tf.get_variable('W', shape=[INPUTS, HIDDEN_1], initializer=tf.truncated_normal_initializer(stddev=1.0 / math.sqrt(float(INPUTS))))
        biases = tf.get_variable('bias', shape=[HIDDEN_1], initializer=tf.constant_initializer(0.0))
        matmuladd = tf.matmul(inputs, weights) + biases
        unpacked_state, unpacked_out = tf.split(0, 2, prev_state_tuple)
        prev_state = unpacked_state
        state = 0.9 * prev_state + 0.1 * matmuladd
        output = tf.nn.relu(state)
        return tf.concat(0, [state, output])

def data_iter():
    while True:
        idxs = np.random.rand(BATCH_SIZE, INPUTS)
        yield idxs

with tf.Graph().as_default():
    inputs = tf.placeholder(tf.float32, shape=(BATCH_SIZE, INPUTS))
    with tf.variable_scope('states'):
        initial_state = tf.zeros([HIDDEN_1], name='initial_state')
        initial_out = tf.zeros([HIDDEN_1], name='initial_out')
        concat_tensor = tf.concat(0, [initial_state, initial_out])
        states, output = tf.scan(iterate_state, inputs,
                                 initializer=concat_tensor, name='states')
    sess = tf.Session()
    # Run the Op to initialize the variables.
    sess.run(tf.initialize_all_variables())
    iter_ = data_iter()
    for i in xrange(0, 2):
        print("iteration: ", i)
        input_data = iter_.next()
        out, st = sess.run([output, states], feed_dict={inputs: input_data})
However, I get this error when running this:
Traceback (most recent call last):
File "cycles_in_graphs_with_scan.py", line 37, in <module>
initializer=concat_tensor, name='states')
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 442, in __iter__
raise TypeError("'Tensor' object is not iterable.")
TypeError: 'Tensor' object is not iterable.
I've already tried with pack/unpack and concat/split but I get this same error.
Any ideas how to solve this problem?
You're getting an error because tf.scan() returns a single tf.Tensor, so the line:
states, output = tf.scan(...)
...cannot destructure (unpack) the tensor returned from tf.scan() into two values (states and outputs). Effectively, the code is trying to treat the result of tf.scan() as a list of length 2, and assign the first element to states and the second element to output, but—unlike a Python list or tuple—tf.Tensor does not support this.
Instead you need to extract the values from the result of tf.scan() manually. For example, using tf.split():
scan_result = tf.scan(...)
# Assumes values are packed together along `split_dim`.
states, output = tf.split(split_dim, 2, scan_result)
Alternatively, you could use tf.slice() or tf.unpack() to extract the relevant states and output values.
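Applied to the code in the question, a minimal sketch (using the same old-style tf.split signature as the rest of the snippet): tf.scan returns a single tensor of shape [BATCH_SIZE, 2 * HIDDEN_1], since the accumulator concatenates state and output, so the two halves can be split back out along dimension 1:
scan_result = tf.scan(iterate_state, inputs,
                      initializer=concat_tensor, name='states')
# Split the packed accumulator back into state and output along dimension 1.
states, output = tf.split(1, 2, scan_result)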