The embedding_rnn_seq2seq function in TensorFlow's seq2seq module provides a feed_previous argument, which means that during decoding it only uses the first decoder input, and for every subsequent step it feeds the previous decoder output back in as the next input. Is there a simple way to get this behavior from the basic_rnn_seq2seq function?
This API is now deprecated, but if anyone's still looking for a solution I suggest looking at this GitHub repo: raindeer/seq2seq_experiments
In short, to create their decoder, the author uses the following (the important part being loop_function):
def _init_seq2seq(self, encoder_inputs, decoder_inputs, cell, feed_previous):

    def inference_loop_function(prev, _):
        prev = tf.nn.xw_plus_b(prev, self.w_softmax, self.b_softmax)
        return tf.to_float(tf.equal(prev, tf.reduce_max(prev, reduction_indices=[1], keep_dims=True)))

    loop_function = inference_loop_function if feed_previous else None

    with variable_scope.variable_scope('seq2seq'):
        _, final_enc_state = rnn.rnn(cell, encoder_inputs, dtype=dtypes.float32)
        return seq2seq.rnn_decoder(decoder_inputs, final_enc_state, cell, loop_function=loop_function)
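Since the key trick is easy to miss: the loop_function projects the previous decoder output to vocabulary logits and converts them into a one-hot vector at the argmax, which is then fed back as the next decoder input. Here is a standalone TF 1.x illustration of that conversion (the example tensor is mine, not from the repo):

import tensorflow as tf

# Hypothetical logits for a single timestep over a 3-word vocabulary
logits = tf.constant([[0.1, 2.0, 0.3]])
# One-hot vector at the argmax, exactly the expression used in inference_loop_function
one_hot = tf.to_float(tf.equal(logits, tf.reduce_max(logits, reduction_indices=[1], keep_dims=True)))

with tf.Session() as sess:
    print(sess.run(one_hot))  # [[0. 1. 0.]]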
I am pretty new at this, so there might be something I am missing completely, but here is my problem: I am trying to create a Tokenizer class that uses the pretrained tokenizer models from Hugging Face. I would then like to use this class in a larger transformer model to tokenize my input data. Here is the class code:
from transformers import AutoTokenizer
from transformers import RobertaTokenizer

class Roberta(MyTokenizer):

    def build(self, *args, **kwargs):
        self.max_length = self.phd.max_length
        self.untokenized_data = self.questions + self.answers

    def tokenize_and_filter(self):
        # Initialize the tokenizer with a pretrained model
        Tokenizer = AutoTokenizer.from_pretrained('roberta')

        tokenized_inputs, tokenized_outputs = [], []

        inputs = Tokenizer(self.questions, padding=True)
        outputs = Tokenizer(self.answers, padding=True)

        tokenized_inputs = inputs['input_ids']
        tokenized_outputs = outputs['input_ids']

        return tokenized_inputs, tokenized_outputs
When I call the function tokenize_and_filter in my Transformer model as below
questions = self.get_tokenizer().tokenize_and_filter
answers = self.get_tokenizer().tokenize_and_filter
print(questions)
and I try to print the tokenized data, I get this message:
<bound method Roberta.tokenize_and_filter of <MyTokenizer.Roberta.Roberta object at
0x000002779A9E4D30>>
It appears that the function returns a method instead of a list or a tensor. I've tried passing the parameter return_tensors='tf', I have tried using the tokenizer.encode() method, I have tried both AutoTokenizer and RobertaTokenizer, and I have tried the batch_encode_plus() method; nothing seems to work.
Please help!
It seems this was a really stupid error on my part: I forgot to put parentheses when calling the function.
questions = self.get_tokenizer().tokenize_and_filter
answers = self.get_tokenizer().tokenize_and_filter
should actually be
questions = self.get_tokenizer().tokenize_and_filter()
answers = self.get_tokenizer().tokenize_and_filter()
and it works this way :)
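For anyone who hits the same output, here is a minimal illustration (the Example class is hypothetical, not part of the original code) of the difference between referencing a method and calling it:

class Example:
    def values(self):
        return [1, 2, 3]

e = Example()
print(e.values)    # <bound method Example.values of <__main__.Example object at 0x...>>
print(e.values())  # [1, 2, 3]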
I've been trying to convert a generator I built into a tf.data.Dataset.
I've come far and now I have something simple like this
def parse_image(filename):
    file = tf.io.read_file(filename)  # this will work only with filename as a tensor
    image = tf.image.decode_image(file)
    return image

def transform_img(img):
    img = parse_image(img).numpy()
    img = transforms_train(image=img)["image"]
    return img
transform_img works as expected when I call it on a filename directly, like:
plt.imshow(transform_img(array_of_filenames[0]))
but when I map it on a dataset
dataset = tf.data.Dataset.from_tensor_slices(array_of_filenames)
dataset = dataset.map(transform_img)
I get the error in the title.
I am doing something silly again, aren't I?
Thanks for helping!
It is not possible to use NumPy directly inside the map function of a TensorFlow dataset. Instead, you need to wrap the function in tf.py_function or tf.numpy_function. So it should look like the following:
dataset = dataset.map(lambda item: tf.py_function(transform_img, [item], [tf.float32]))
The first argument of py_function is the preprocessing function you want, the second argument is the list of parameters to pass to the function, and the final argument is the dtype of the return value of the preprocessing function. (The same applies to tf.numpy_function.)
I don't remember reading this in the documentation but in a tutorial; you can find it here.
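For completeness, a minimal sketch of how that could look wired together; the tf.float32 dtype and the set_shape call are assumptions on my part, since py_function drops static shape information:

def transform_wrapper(filename):
    img = tf.py_function(transform_img, [filename], tf.float32)
    img.set_shape([None, None, 3])  # assumed (height, width, channels)
    return img

dataset = tf.data.Dataset.from_tensor_slices(array_of_filenames)
dataset = dataset.map(transform_wrapper)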
I have this function, and I want to get its derivative:
def dp1_f1(x):
    return 64*x*(1-x)*(math.pow((1-2*x), 2))*math.pow((1-8*x+8*x*x), 2)
I want to get dy/dx value.
I can get this value by a numeric method, as below:
def dp_numeric_diff(x):
    delta_x = 0.0001
    return (dp1_f1(x+delta_x)-dp1_f1(x))/delta_x
I use the TensorFlow eager execution API to calculate this value:
def dp_ad_tfe(x):
    tf.enable_eager_execution()
    tfe = tf.contrib.eager
    grad_lx = tfe.gradients_function(dp1_f1)
    x = 3.0
    y = dp1_f1(x)
    rst = grad_lx(x)
    return y, rst[0]
I call this function with code below:
numeric_diff = dp_numeric_diff(x)
print('Numeric method:{0}'.format(numeric_diff))
v, d = dp_ad_tfe(x)
print('TFE:{0}'.format(d))
It will display something like this:
Numeric method:-75290405.66440672
TFE:-19208000.0
I am sure the numeric method is right. What's wrong with my TensorFlow eager execution code? By the way, the same eager execution code gets the correct answer for a simple function like x^2.
I found that the TensorFlow eager execution API can't deal with functions like math.pow. I had to provide a function that tells TensorFlow how to compute the derivative. To solve this, I had to replace math.pow with my own function, as below:
@tf.custom_gradient
def f3(x, n):
    v = tf.pow(x, n)
    def grad(dy):
        return (dy * (n * tf.pow(x, n-1))).numpy()
    return v.numpy(), grad
And I had to modify the original function as below:
def dp1_f1(x):
    return 64*x*(1-x)*f3(1-2*x, 2)*f3(1-8*x+8*x*x, 2)
Now TensorFlow eager execution API will give the right answer just as the numeric method.
TensorFlow's automatic differentiation APIs can only differentiate through compositions of TensorFlow operations, not through functions like math.pow() from other libraries. If you replace math.pow() with tf.pow(), it should work out just fine.
Something like:
import tensorflow as tf

tf.enable_eager_execution()

def dp1_f1(x):
    return 64*x*(1-x)*(tf.pow((1-2*x), 2))*tf.pow((1-8*x+8*x*x), 2)

def dp_numeric_diff(x):
    delta_x = 0.0001
    return (dp1_f1(x+delta_x)-dp1_f1(x))/delta_x

grad = tf.contrib.eager.gradients_function(dp1_f1)

print(dp_numeric_diff(3.0).numpy())  # Prints -75300000.0
print(grad(3.0)[0].numpy())          # Prints -75279680.0
Hope that helps.
(Seems this was also asked on GitHub)
The ReLU function as defined in keras/activations.py is:
def relu(x, alpha=0., max_value=None):
    return K.relu(x, alpha=alpha, max_value=max_value)
It has a max_value which can be used to clip the value. Now how can this be used/called in the code?
I have tried the following:
(a)
model.add(Dense(512,input_dim=1))
model.add(Activation('relu',max_value=250))
assert kwarg in allowed_kwargs, 'Keyword argument not understood: ' + kwarg
AssertionError: Keyword argument not understood: max_value
(b)
Rel = Activation('relu',max_value=250)
same error
(c)
from keras.layers import activations
uu = activations.relu(??,max_value=250)
The problem with this is that it expects the input to be passed as the first argument. The error is 'relu() takes at least 1 argument (1 given)'.
So how do I make this a layer?
model.add(activations.relu(max_value=250))
has the same issue: 'relu() takes at least 1 argument (1 given)'
If this cannot be used as a layer, then there seems to be no way of specifying a clip value for ReLU. This implies that the comment here https://github.com/fchollet/keras/issues/2119 closing a proposed change is wrong...
Any thoughts? Thanks!
You can use the ReLU function of the Keras backend. Therefore, first import the backend:
from keras import backend as K
Then, you can pass your own function as the activation using backend functionality. This would look like:
def relu_advanced(x):
    return K.relu(x, max_value=250)
Then you can use it like
model.add(Dense(512, input_dim=1, activation=relu_advanced))
or
model.add(Activation(relu_advanced))
Unfortunately, you must hard-code additional arguments this way. Therefore, it is better to use a function that returns your function and passes in your custom values:
def create_relu_advanced(max_value=1.):
    def relu_advanced(x):
        return K.relu(x, max_value=K.cast_to_floatx(max_value))
    return relu_advanced
Then you can pass your arguments by either
model.add(Dense(512, input_dim=1, activation=create_relu_advanced(max_value=250)))
or
model.add(Activation(create_relu_advanced(max_value=250)))
It is as easy as one lambda:
from keras.activations import relu
clipped_relu = lambda x: relu(x, max_value=3.14)
Then use it like this:
model.add(Conv2D(64, (3, 3)))
model.add(Activation(clipped_relu))
When reading a model saved in hdf5 use custom_objects dictionary:
model = load_model(model_file, custom_objects={'<lambda>': clipped_relu})
Tested below; it works:
import keras
def clip_relu(x):
    return keras.activations.relu(x, max_value=1.)

predictions = Dense(num_classes, activation=clip_relu, name='output')
This is what I did, using a Lambda layer to implement a clipped ReLU:
Step 1: define a function to do reluclip:
def reluclip(x, max_value=20):
    return K.relu(x, max_value=max_value)
Step 2: add a Lambda layer into the model:
y = Lambda(function = reluclip)(y)
def rnn_seq2seq(encoder_inputs, decoder_inputs, cell, output_projection=None, feed_previous=False, dtype=tf.float32, scope=None):
    with tf.variable_scope(scope or "rnn_seq2seq"):
        _, enc_states = rnn.rnn(cell, encoder_inputs, dtype=dtype)

        def extract_argmax(prev, i):
            if output_projection is not None:
                prev = tf.nn.xw_plus_b(prev, output_projection[0], output_projection[1])
            return tf.to_float(tf.equal(prev, tf.reduce_max(prev, reduction_indices=[1], keep_dims=True)))

        loop_function = None
        if feed_previous:
            loop_function = extract_argmax

        # seq2seq.rnn_decoder is provided in tensorflow/models/rnn/seq2seq.py
        return seq2seq.rnn_decoder(decoder_inputs, enc_states[-1], cell, loop_function=loop_function)
I want to create two RNN models, one for training and another for testing. For that, I can call the function twice, passing feed_previous as True or False.
train_op,train_states = rnn_seq2seq(enc_inp,dec_inp,cell,output_projection=op,feed_previous=False)
test_op,_ = rnn_seq2seq(enc_inp,dec_inp,cell,output_projection=op,feed_previous=True)
But if I call the above function twice, wouldn't it create two different RNNs? I am wondering if they would be able to share the weights.
Both calls operate on the same default graph and so can reuse the variables. Check out the variable scopes tutorial and see whether your variables are created with the reuse=True parameter.
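For example, one way to make the sharing explicit is to wrap both calls in a single variable scope and reuse it for the second call (a TF 1.x sketch; the scope name is illustrative):

with tf.variable_scope("shared_seq2seq") as scope:
    train_op, train_states = rnn_seq2seq(enc_inp, dec_inp, cell, output_projection=op, feed_previous=False)
    scope.reuse_variables()  # the second call reuses the variables created above
    test_op, _ = rnn_seq2seq(enc_inp, dec_inp, cell, output_projection=op, feed_previous=True)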
As a sanity check, try the following snippet to list all variables in the default graph:
[v.name for v in tf.get_default_graph().as_graph_def().node if v.op=='Variable']