Let's take the following code as an example:
inputs = keras.layers.InputLayer(1).output
output = tf.random.uniform((1, )) * inputs
I want to feed a value into inputs and have it propagate through the layers without using a Keras model.
How can it be done?
(I'm following this PyTorch tutorial about BERT word embeddings, and in the tutorial the author accesses the intermediate layers of the BERT model.)
What I want is to access the last, let's say, 4 layers for a single input token of the BERT model in TensorFlow 2 using HuggingFace's Transformers library. Each layer outputs a vector of length 768, so the last 4 layers concatenated will have a length of 4*768=3072 (for each token).
How can I implement this in TF/Keras/TF2, to get the intermediate layers of a pretrained model for an input token? (Later I will try to do the same for each token in a sentence, but for now one token is enough.)
I'm using the HuggingFace's BERT model:
!pip install transformers
from transformers import (TFBertModel, BertTokenizer)
bert_model = TFBertModel.from_pretrained("bert-base-uncased") # Automatically loads the config
bert_tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
sentence_marked = "hello"
tokenized_text = bert_tokenizer.tokenize(sentence_marked)
indexed_tokens = bert_tokenizer.convert_tokens_to_ids(tokenized_text)
print(indexed_tokens)  # prints [7592]
The output is a token ID ([7592]), which should be the input for the BERT model.
The third element of the BERT model's output is a tuple that consists of the output of the embedding layer as well as the hidden states of the intermediate layers. From the documentation:
hidden_states (tuple(tf.Tensor), optional, returned when config.output_hidden_states=True):
tuple of tf.Tensor (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
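The hidden states are returned only when config.output_hidden_states is True. This answer assumes that is the case for the loaded bert-base-uncased model; in recent versions of transformers the default is False, so you may need to enable it as described at the end of this answer. To access the hidden states of the 12 intermediate layers, you can then do the following: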
outputs = bert_model(input_ids, attention_mask)
hidden_states = outputs[2][1:]
There are 12 elements in the hidden_states tuple, one per layer from the first to the last, and each of them is a tensor of shape (batch_size, sequence_length, hidden_size). So, for example, to access the hidden state of the third layer for the fifth token of all the samples in the batch, you can do: hidden_states[2][:,4].
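To get the 3072-long vector the question asks about, the last four layers can be joined along the hidden dimension. The following is only a sketch of the indexing: it uses random NumPy arrays as a stand-in for the real outputs[2], and the names all_hidden_states, last_four and token_index are made up for illustration.

```python
import numpy as np

# Stand-in for outputs[2]: 13 arrays (embeddings + 12 layers), each of shape
# (batch_size, sequence_length, hidden_size). Random values replace real BERT
# activations so the sketch runs without downloading a model.
batch_size, sequence_length, hidden_size = 1, 3, 768
rng = np.random.default_rng(0)
all_hidden_states = tuple(
    rng.standard_normal((batch_size, sequence_length, hidden_size))
    for _ in range(13)
)

# Drop the embedding output, keeping the 12 encoder layers.
hidden_states = all_hidden_states[1:]

# Join the last four layers along the hidden dimension: (1, 3, 4 * 768).
last_four = np.concatenate(hidden_states[-4:], axis=-1)

# Vector for one token (index 1 is an arbitrary choice) of the first sample.
token_index = 1
token_vector = last_four[0, token_index]
print(token_vector.shape)  # (3072,)
```

With real TensorFlow tensors, tf.concat(hidden_states[-4:], axis=-1) should play the same role as np.concatenate here.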
Note that if the model you are loading does not return the hidden states by default, you can load the config using the BertConfig class and pass the output_hidden_states=True argument, like this:
from transformers import BertConfig, TFBertModel
config = BertConfig.from_pretrained("name_or_path_of_model",
                                    output_hidden_states=True)
bert_model = TFBertModel.from_pretrained("name_or_path_of_model",
                                         config=config)
I'm working with the Keras functional API.
Specifically, for my experiments I'm using the Keras ResNet50 model obtained with:
model = resnet50.ResNet50(weights='imagenet')
Obviously, to get the final output of the network we need to feed a value to the placeholder input_1.
My question is: can I somehow start inference on this graph from the relu layer shown at the bottom of the picture below, provided that I feed a value of the appropriate dimensions into it?
I tried to achieve this with Keras functions. Something like:
self.inp = model.input
self.outputs = [layer.output for layer in model.layers]
self.functor = K.function([self.inp, K.learning_phase()], [self.outputs[6], self.outputs[17]])
But this approach will not work, because to compute any output I again need to feed a value into the input tensor.
Is recreating the graph from scratch my best option here?
Thanks
If I got you right, you can just specify the input and output nodes:
import tensorflow as tf
base_model = tf.keras.applications.ResNet50(weights='imagenet')
inference_model = tf.keras.Model(inputs=base_model.input, outputs=base_model.get_layer('any_layer_name').output)
You can set the output to the output of any layer by passing that layer's name to get_layer.
I want to remove the last layer of the 'faster_rcnn_nas_lowproposals_coco' model, which I downloaded from https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md.
I know that in Keras we can use model.layers.pop() to remove the last layer.
But I searched the Internet and found no equivalent function in TensorFlow.
If there is no equivalent function in TensorFlow, can anyone tell me how to load a trained model zoo model with Keras?
You don't need to "pop" a layer; you just have to not use it.
Here is an example with MobileNet (put your downloaded model here instead):
from keras.applications import mobilenet
model = mobilenet.MobileNet()
x = model.layers[-2].output
The first line loads the entire model; the second takes the output of the second-to-last layer.
You can change the index in model.layers[...] to pick the layer whose output you want. So, to use the model without its last layer, the index should be -2.
Then it's possible to use it like this:
from keras.layers import Dense
from keras.models import Model
x = Dense(256)(x)
predictions = Dense(15, activation="softmax")(x)
model = Model(inputs=model.input, outputs=predictions)
With a model like this, how can one access the trained parameters, such as the weights and biases of each layer?
model = Sequential ([
Dense(xx, activation=cntk.sigmoid),
Dense(outputs)])
z = model(features)
Thanks.
The specific mechanisms are shown in this tutorial. Here is the sample that shows how to access the parameters:
model = create_model()
print(len(model.layers))
print(model.layers[0].E.shape)
print(model.layers[2].b.value)
I am using TensorFlow to make predictions on time-series data. For example, I have 50 tags and I want to predict the next 5 tags.
As shown in the following picture, I want a model like the 4th structure.
I went through the tutorial demo: Recurrent Neural Networks
But I found that it provides something like the 5th structure in the above picture, which is different.
Which model could I use? I am thinking of seq2seq models, but I'm not sure if that is the right way.
You are right that you can use a seq2seq model. For brevity I've written up an example of how you can do it in Keras, which can run on a TensorFlow backend. I haven't run the example, so it might need tweaking. If your tags are one-hot encoded, you need to use a cross-entropy loss instead.
from keras.models import Model
from keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense
# The input shape is your sequence length and your token embedding size
inputs = Input(shape=(seq_len, embedding_size))
# Build a RNN encoder
encoder = LSTM(128, return_sequences=False)(inputs)
# Repeat the encoding for every input to the decoder
encoding_repeat = RepeatVector(5)(encoder)
# Pass your (5, 128) encoding to the decoder
decoder = LSTM(128, return_sequences=True)(encoding_repeat)
# Output each timestep into a fully connected layer
sequence_prediction = TimeDistributed(Dense(1, activation='linear'))(decoder)
model = Model(inputs, sequence_prediction)
model.compile('adam', 'mse') # Or categorical_crossentropy
model.fit(X_train, y_train)
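The example above takes X_train and y_train as given. As a rough sketch of how such pairs could be cut from one long sequence of tags (50 inputs followed by the next 5 as targets), here is a plain-Python sliding window. The helper name make_windows and its parameters are made up for illustration, and the toy series is just the integers 0 to 59.

```python
def make_windows(series, input_len=50, output_len=5):
    """Split one long sequence into (input, target) pairs.

    Each input is input_len consecutive items; its target is the
    output_len items that immediately follow it.
    """
    inputs, targets = [], []
    last_start = len(series) - input_len - output_len
    for start in range(last_start + 1):
        split = start + input_len
        inputs.append(series[start:split])
        targets.append(series[split:split + output_len])
    return inputs, targets


# Toy data: 60 integer "tags" give 6 windows of 50 inputs and 5 targets.
series = list(range(60))
X, y = make_windows(series)
print(len(X), len(X[0]), len(y[0]))  # 6 50 5
```

Before calling model.fit, these lists would still need to be converted to arrays and reshaped to what the model expects, which for the example above would presumably be (samples, seq_len, embedding_size) for the inputs and (samples, 5, 1) for the targets.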