I am reading seq2seq in https://www.tensorflow.org/text/tutorials/nmt_with_attention#text_vectorization.
It says "The TextVectorization layer and many other preprocessing layers have an adapt method. This method reads one epoch of the training data, and works a lot like Model.fit".
Why do we need TextVectorization instead of a simple mapping from strings to numbers? How do we train that TextVectorization layer?
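To make the quoted description of adapt more concrete, here is a rough pure-Python sketch of the idea: one pass over the training text builds a frequency-ordered vocabulary, and that vocabulary is then used to map strings to integers. This is not the Keras implementation; the function names standardize, adapt and vectorize are hypothetical, and the reserved entries "" (padding) and "[UNK]" (out-of-vocabulary) only mirror the layer's default convention.

```python
import re
from collections import Counter

def standardize(text):
    # Lowercase and strip punctuation before splitting on whitespace.
    return re.sub(r"[^\w\s]", "", text.lower())

def adapt(texts, max_tokens=None):
    # One pass over the data: count tokens, then keep the most frequent ones.
    counts = Counter()
    for text in texts:
        counts.update(standardize(text).split())
    vocab = ["", "[UNK]"]  # index 0: padding, index 1: unknown token
    budget = None if max_tokens is None else max_tokens - len(vocab)
    vocab += [token for token, _ in counts.most_common(budget)]
    return vocab

def vectorize(text, vocab):
    # Map each token to its index; unseen tokens fall back to "[UNK]".
    index = {token: i for i, token in enumerate(vocab)}
    return [index.get(token, 1) for token in standardize(text).split()]

train_texts = ["The cat sat.", "The cat ran!", "The dog"]
vocab = adapt(train_texts)
ids = vectorize("the bird cat", vocab)
print(vocab[:4])
print(ids)
```

The point of the sketch is that the "simple mapping from string to number" still has to come from somewhere: adapt is the step that derives it from the training data, and the unknown-token entry handles words that never appeared there.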
Related
I want to work with Keras models pre-trained on ImageNet. The models and information about their performance are here.
I downloaded ILSVRC 2012 (ImageNet) dataset and evaluated ResNet50 on the validation dataset. The top-1 accuracy should be 0.749 but I get 0.68. The top-5 accuracy should be 0.921, mine is 0.884. I also tried VGG16 and MobileNet with similar discrepancies.
I preprocess the images using built-in preprocess_input function (e.g. tensorflow.keras.applications.resnet50.preprocess_input()).
My guess is that the dataset is different. How can I make sure that the validation dataset that I use for evaluation is the same as the one that was used by the authors? Could there be any other reason why I get different results?
I'm following the tutorial on the tensorflow site (https://www.tensorflow.org/tutorials/text/word_embeddings#create_a_simple_model) to learn word embeddings, and I am confused about the purpose of having a GlobalAveragePooling1D layer right after the embedding layer, as follows:
model = keras.Sequential([
    layers.Embedding(encoder.vocab_size, embedding_dim),
    layers.GlobalAveragePooling1D(),
    layers.Dense(16, activation='relu'),
    layers.Dense(1)
])
I understand what pooling means and how it's done. If someone can explain why we need a pooling layer, and what would change if we didn't use it, I'd appreciate it.
The purpose of this tutorial is to get you to understand word-embeddings through a simple toy task: binary sentiment analysis.
To start with, they make you code a simple model: take the average of all embeddings in a sentence and add a feed-forward neural net to classify this aggregated input. GlobalAveragePooling1D does this averaging.
Obviously, in the real world you'd want to use more complex models such as RNNs, LSTMs, bidirectional models, atrous-convolution-based models or Transformers, but that's not the point of this tutorial.
The "simple model" they mention is a feed-forward neural net, so it expects a fixed input dimension. When you have sequential data of variable length you need to address this somehow: averaging, padding, cropping etc. Here they average with this GlobalAveragePooling1D layer.
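As a minimal NumPy sketch of the averaging described above (not the Keras layer itself): sequences of different lengths are looked up in an embedding table and averaged over the sequence axis, so each one ends up as a vector of the same size. The table values, the sizes and the helper name average_pool are made up for illustration, and padding or masking is ignored here.

```python
import numpy as np

vocab_size, embedding_dim = 10, 4
rng = np.random.default_rng(0)
# Hypothetical embedding table: one row of size embedding_dim per token id.
embeddings = rng.normal(size=(vocab_size, embedding_dim))

def average_pool(token_ids):
    vectors = embeddings[token_ids]  # shape: (sequence_length, embedding_dim)
    return vectors.mean(axis=0)      # shape: (embedding_dim,)

short_sentence = average_pool([1, 2, 3])
long_sentence = average_pool([4, 5, 6, 7, 8, 9])

# Both sentences are now fixed-size inputs for a Dense layer.
print(short_sentence.shape, long_sentence.shape)
```

Without the pooling step, the output of the embedding layer would keep its sequence axis, so the following Dense layers would be applied per token rather than to one vector per sentence.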
I am trying to convert my CNN written with tensorflow layers to use the keras api in tensorflow (I am using the keras api provided by TF 1.x), and I am having trouble writing a custom loss function to train the model.
According to this guide, when defining a loss function it expects the arguments (y_true, y_pred)
https://www.tensorflow.org/guide/keras/train_and_evaluate#custom_losses
def basic_loss_function(y_true, y_pred):
    return ...
However, in every example I have seen, y_true is somehow directly related to the model (in the simple case it is the output of the network). In my problem, this is not the case. How do I implement this if my loss function depends on some training data that is unrelated to the tensors of the model?
To be concrete, here is my problem:
I am trying to learn an image embedding trained on pairs of images. My training data includes image pairs and annotations of matching points between the image pairs (image coordinates). The input feature is only the image pairs, and the network is trained in a siamese configuration.
I am able to implement this successfully with tensorflow layers and train it successfully with tensorflow estimators.
My current implementation builds a tf Dataset from a large database of tf Records, where the features are a dictionary containing the images and arrays of matching points. Before, I could easily feed these arrays of image coordinates to the loss function, but here it is unclear how to do so.
There is a hack I often use that is to calculate the loss within the model, by means of Lambda layers. (When the loss is independent from the true data, for instance, and the model doesn't really have an output to be compared)
In a functional API model:
def loss_calc(x):
    # arbitrary inputs, you choose them
    # according to what you gave to the Lambda layer
    loss_input_1, loss_input_2 = x

    # here you use some external data that doesn't relate to the samples
    externalData = K.constant(external_numpy_data)

    # calculate the loss
    loss = ...  # placeholder: your own loss computation goes here
    return loss
Using the outputs of the model itself (the tensor(s) that are used in your loss)
loss = Lambda(loss_calc)([model_output_1, model_output_2])
Create the model outputting the loss instead of the outputs:
model = Model(inputs, loss)
Create a dummy keras loss function for compilation:
def dummy_loss(y_true, y_pred):
    return y_pred  # where y_pred is the loss itself, the output of the model above
model.compile(loss = dummy_loss, ....)
Use any dummy array correctly sized regarding number of samples for training, it will be ignored:
model.fit(your_inputs, np.zeros((number_of_samples,)), ...)
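Putting the steps above together, a minimal end-to-end sketch could look like the following. It assumes TensorFlow 2.x with tf.keras rather than TF 1.x, and everything specific in it is hypothetical: the layer sizes, the random inputs, the contents of external_numpy_data and the toy loss formula are stand-ins for your own.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Hypothetical external data that is not part of the model's tensors.
external_numpy_data = np.ones((4,), dtype="float32")

def loss_calc(x):
    out_1, out_2 = x
    external = tf.constant(external_numpy_data)
    # Toy loss (an assumption): one value per sample, shape (batch, 1).
    return tf.reduce_mean(tf.square(out_1 - out_2 - external), axis=-1, keepdims=True)

# Siamese-style setup: the same Dense layer is applied to both inputs.
input_a = keras.Input(shape=(8,))
input_b = keras.Input(shape=(8,))
shared = keras.layers.Dense(4)
loss_out = keras.layers.Lambda(loss_calc)([shared(input_a), shared(input_b)])
model = keras.Model([input_a, input_b], loss_out)

def dummy_loss(y_true, y_pred):
    return y_pred  # y_pred already is the loss; y_true is ignored

model.compile(optimizer="sgd", loss=dummy_loss)

rng = np.random.default_rng(0)
n = 16
x = [rng.random((n, 8)).astype("float32"), rng.random((n, 8)).astype("float32")]
history = model.fit(x, np.zeros((n, 1), dtype="float32"), epochs=1, verbose=0)

# The targets do not matter: zeros and ones give the same loss value.
loss_zeros = float(np.ravel(model.evaluate(x, np.zeros((n, 1)), verbose=0))[0])
loss_ones = float(np.ravel(model.evaluate(x, np.ones((n, 1)), verbose=0))[0])
print(loss_zeros, loss_ones)
```

To get the actual embeddings after training, you would build a second Model from the same inputs to the output of the shared layer, since the training model only outputs the loss.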
Another way of doing it is to use a custom training loop.
This is much more work, though.
Although you're using TF1, you can still turn eager execution on at the very beginning of your code and do stuff like it's done in TF2. (tf.enable_eager_execution())
Follow the tutorial for custom training loops: https://www.tensorflow.org/tutorials/customization/custom_training_walkthrough
Here, you calculate the gradients yourself, of any result regarding whatever you want. This means you don't need to follow Keras standards of training.
Finally, you can use the approach you suggested of model.add_loss.
In this case, you calculate the loss exactly the same way I did in the first approach, and pass this loss tensor to add_loss.
You can probably compile a model with loss=None then (not sure), because you're going to use other losses, not the standard one.
In this case, your model's output will probably be None too, and you should fit with y=None.
I am using tensorflow 1.3.0 to train a CNN classification model. However, I need to get access to the prelogits layer to evaluate my method (i.e. while this is cast as a classification problem, the method is not a classification method but is used to extract CNN features, i.e. to produce a point in an N-dimensional vector space for an input test image).
I am using both the dataset API (with TFRecord files) and the estimator API to train the model. However, I don't see how I can get access/return the prelogits value using the Estimator API, i.e. estimator.train(), .evaluate() or .predict() since model_fn() needs to return a specific tf.estimator.EstimatorSpec object.
Previously (i.e. using the standard sess=tf.Session() method) I could train the model and get access to the prelogits layer while training (or by loading the model after training) and feed the network with a specific input to get the specific layer output with a sess.run(specific_layer) as long as the layer was named specific_layer.
I have tried to use the prediction output of EstimatorSpec but it did not work. Any ideas/suggestions?
I know what embeddings are and how they are trained. Precisely, while referring to the tensorflow's documentation, I came across two different articles. I wish to know what exactly is the difference between them.
link 1: Tensorflow | Vector Representations of words
In the first tutorial, they have explicitly trained embeddings on a specific dataset. There is a distinct session run to train those embeddings. I can then later on save the learnt embeddings as a numpy object and use the tf.nn.embedding_lookup() function while training an LSTM network.
link 2: Tensorflow | Embeddings
In this second article however, I couldn't understand what is happening.
word_embeddings = tf.get_variable("word_embeddings",
    [vocabulary_size, embedding_size])
embedded_word_ids = tf.gather(word_embeddings, word_ids)
This is given under the training embeddings sections. My doubt is: does the gather function train the embeddings automatically? I am not sure since this op ran very fast on my pc.
Generally: What is the right way to convert words into vectors (link 1 or link 2) in tensorflow for training a seq2seq model? Also, how do I train the embeddings for a seq2seq dataset, since the data for my task is in the form of separate sequences, unlike the link 1 dataset, which is one continuous sequence of words?
Alright! anyway, I have found the answer to this question and I am posting it so that others might benefit from it.
The first link is more of a tutorial that steps you through the process of exactly how the embeddings are learnt.
In practical cases, such as training seq2seq models or any other encoder-decoder models, we use the second approach, where the embedding matrix gets tuned appropriately while the model gets trained.
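To illustrate why the gather op alone runs fast and trains nothing, here is a small NumPy sketch of the second approach. The lookup is just row indexing; the embedding matrix only changes when an optimizer step applies gradients that flow back to the rows that were looked up. The toy loss, the sizes and the learning rate are assumptions made for this example, not something taken from either article.

```python
import numpy as np

rng = np.random.default_rng(0)
vocabulary_size, embedding_size = 6, 3
word_embeddings = rng.normal(size=(vocabulary_size, embedding_size))
before = word_embeddings.copy()

# The lookup itself: plain row indexing, no training involved.
word_ids = np.array([1, 4, 1])
embedded_word_ids = word_embeddings[word_ids]  # shape: (3, embedding_size)

# Toy loss (an assumption): 0.5 * sum of squares of the looked-up vectors,
# so the gradient with respect to each looked-up vector is the vector itself.
grad_embedded = embedded_word_ids

# Backward pass of a lookup: scatter-add gradients into the rows that were used.
grad_table = np.zeros_like(word_embeddings)
np.add.at(grad_table, word_ids, grad_embedded)  # repeated ids accumulate

# One plain SGD step on the embedding matrix.
learning_rate = 0.1
word_embeddings -= learning_rate * grad_table

changed_rows = np.flatnonzero(np.any(word_embeddings != before, axis=1))
print(changed_rows)
```

Only the rows for word ids that appeared in the batch are updated, which is how the matrix gets tuned gradually as the surrounding model trains on more sequences.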