How to feed hidden state vectors from one transformer directly into a layer of a different transformer - tensorflow

Transformer models take token IDs as input, which are converted into embeddings. I am wondering how to input embeddings directly.
I am asking about both the PyTorch and Keras versions of the models.
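
One way to picture this, as a minimal sketch rather than any particular library's API: the embedding layer is only a lookup that produces float vectors, so a model can expose a second entry point that accepts those vectors directly and, optionally, skips its first layers. Everything named below (TinyEncoder, the inputs_embeds and start_layer arguments, the proj adapter) is illustrative and not taken from the question. The toy PyTorch code passes hidden states from one encoder into the second layer of another, with a linear projection because the two widths differ.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TinyEncoder(nn.Module):
    """Toy stand-in for a transformer: embedding lookup, then encoder layers."""

    def __init__(self, vocab_size, d_model, num_layers=2, nhead=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(
                d_model, nhead, dim_feedforward=4 * d_model, batch_first=True
            )
            for _ in range(num_layers)
        ])

    def forward(self, input_ids=None, inputs_embeds=None, start_layer=0):
        # Either look up embeddings from token IDs or take vectors as given.
        if inputs_embeds is None:
            inputs_embeds = self.embed(input_ids)
        h = inputs_embeds
        # start_layer > 0 skips the earliest layers of this model.
        for layer in self.layers[start_layer:]:
            h = layer(h)
        return h


model_a = TinyEncoder(vocab_size=100, d_model=16).eval()
model_b = TinyEncoder(vocab_size=100, d_model=32).eval()
proj = nn.Linear(16, 32)  # hypothetical adapter between the two hidden sizes

ids = torch.randint(0, 100, (2, 5))  # batch of 2 sequences, 5 tokens each
with torch.no_grad():
    # Feeding the looked-up embeddings gives the same result as feeding IDs.
    same = torch.allclose(
        model_b(input_ids=ids),
        model_b(inputs_embeds=model_b.embed(ids)),
        atol=1e-6,
    )
    # Hidden states of model_a enter model_b at its second layer.
    hidden_a = model_a(input_ids=ids)
    out = model_b(inputs_embeds=proj(hidden_a), start_layer=1)

print(same, tuple(out.shape))
```

Many Hugging Face Transformers models accept an inputs_embeds argument that plays the same role as the one in this toy class. In Keras the equivalent idea would be to build the downstream model on a float input of shape (sequence length, hidden size) instead of an integer input followed by an Embedding layer.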

Related

Incorporate fasttext vectors in tf.keras embedding layer?

Fasttext handles OOV words easily, i.e., it can be assumed that emb = fasttext_model(raw_input) always holds. However, I am not sure how to build this into a tf.keras embedding layer. I can't simply load the matrix into Embedding, because then OOV words couldn't be handled. Is there a workaround that would let me use fasttext_model in a tf.keras model?
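
One possible workaround, sketched under assumptions: compute the word vectors outside the model and feed the resulting float array to a network that has no Embedding layer at all. In the sketch below, embed_word is a hypothetical stand-in for fasttext_model (it averages hashed character n-gram vectors from a random table, which only imitates the subword idea), and DIM, BUCKETS and embed_batch are made-up names for illustration.

```python
import zlib

import numpy as np

DIM = 8          # assumed vector size for this toy example
BUCKETS = 1000   # assumed number of n-gram hash buckets
rng = np.random.default_rng(0)
ngram_table = rng.normal(size=(BUCKETS, DIM))  # stand-in for trained vectors


def embed_word(word, n=3):
    """Hypothetical replacement for fasttext_model(word): never fails on OOV."""
    padded = f"<{word}>"
    grams = [padded[i:i + n] for i in range(len(padded) - n + 1)] or [padded]
    # crc32 is used instead of hash() so the result is stable across runs.
    idx = [zlib.crc32(g.encode("utf-8")) % BUCKETS for g in grams]
    return ngram_table[idx].mean(axis=0)


def embed_batch(sentences, max_len):
    """Turn raw sentences into a zero-padded (batch, max_len, DIM) float array."""
    out = np.zeros((len(sentences), max_len, DIM), dtype=np.float32)
    for i, sentence in enumerate(sentences):
        for j, word in enumerate(sentence.split()[:max_len]):
            out[i, j] = embed_word(word)
    return out


batch = embed_batch(["hello wrold", "a completely unseenword here"], max_len=4)
print(batch.shape)
```

An array like batch could then go into a tf.keras model whose first layer is tf.keras.Input(shape=(max_len, DIM)), so the network never needs a fixed vocabulary. The trade-off is that the vectors are no longer trainable inside the model.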

Can I generate a heat map using a method such as Grad-CAM in a concatenated CNN?

I am trying to apply Grad-CAM to my pre-trained CNN model to generate heat maps of layers. My custom CNN design is as follows:
- Adopt all the convolution layers and the pre-trained weights from the VGG16 model.
- Extract lower level features (early convolution layers) from VGG16.
- Train the fully connected layers of both normal/high and lower level features from VGG16.
- Concatenate outputs of both normal/high- and lower-level f.c. layers and then train more f.c. layers before the final prediction.
[Figure: model design]
I want to use Grad-CAM to visualize the feature maps of the low-level route and the normal/high-level route. I have already produced such heatmaps on a non-concatenated fine-tuned VGG using the last convolutional layers. My question is: on a concatenated CNN model, can the Grad-CAM method still work using the gradient of the prediction with respect to the low- and high-level feature maps respectively? If not, are there other methods that can produce heatmap visualizations for such a model? Is using the shared fully connected layer an option?
Any ideas and suggestions are much appreciated!
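
For reference, the Grad-CAM computation itself only needs two things per branch: the feature maps of the chosen convolutional layer and the gradient of the class score with respect to them. As long as the class score is differentiable with respect to both branches, which should be the case when their outputs are concatenated and passed through further layers, the same formula can be applied to each branch separately. The NumPy sketch below assumes this; the random arrays are placeholders for real activations and gradients (which in practice would come from something like tf.GradientTape), and the layer shapes are only examples.

```python
import numpy as np


def grad_cam(feature_maps, grads):
    """Grad-CAM for one image.

    feature_maps, grads: arrays of shape (H, W, K), where grads is the
    gradient of the class score with respect to feature_maps.
    """
    # Channel importance: global average of the gradients over H and W.
    weights = grads.mean(axis=(0, 1))                      # shape (K,)
    # Weighted sum of the feature maps, then ReLU.
    cam = np.maximum((feature_maps * weights).sum(axis=-1), 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam                 # scale to [0, 1]


rng = np.random.default_rng(0)

# Placeholder tensors for the two routes (example shapes, not from the post).
low_maps, low_grads = rng.normal(size=(2, 56, 56, 128))
high_maps, high_grads = rng.normal(size=(2, 14, 14, 512))

low_cam = grad_cam(low_maps, low_grads)     # heat map for the low-level route
high_cam = grad_cam(high_maps, high_grads)  # heat map for the high-level route
print(low_cam.shape, high_cam.shape)
```

Each map would then be resized to the input resolution and overlaid on the image, as with a single-branch VGG.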

How can I extract trained model information from its weight files?

I have pre-trained weight files for an audio classification model. How can I extract model information from the given weights (for example, the number of layers used to build the architecture)? This question was asked in an interview. Is it possible to extract model information from its weights?
Check these two threads:
How to read keras model weights without a model
https://github.com/keras-team/keras/issues/91

VGG16 Transfer learning with an additional input source

I am trying to use Tensorflow for transfer learning using a pre-trained VGG16 model.
However, the input to the model in my problem is an RGB image with an extra channel functioning as a binary mask. This is different than the original input on which the model was trained (224x224 RGB images).
I think that using the pretrained model is still possible in this case. How do I assign weights for connections between the first convolutional layer and the extra channel? Is transfer learning still applicable in such a scenario?
Thanks!
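
One common approach, sketched here under assumptions, is to keep the pretrained RGB filters of the first convolution and append one extra input slice for the mask channel. If that slice starts at zero, the layer initially produces exactly the same output as the original network and can learn to use the mask during fine-tuning. The sketch uses NumPy with a random array standing in for the pretrained kernel; the shape (3, 3, 3, 64) matches the Keras layout (height, width, input channels, filters) of the first VGG16 convolution, and extend_kernel and conv_at_patch are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(0)
rgb_kernel = rng.normal(size=(3, 3, 3, 64))  # stand-in for pretrained weights


def extend_kernel(kernel, init="zeros"):
    """Append one input channel to a (kh, kw, in_ch, out_ch) conv kernel."""
    kh, kw, _, out_ch = kernel.shape
    if init == "zeros":
        extra = np.zeros((kh, kw, 1, out_ch), dtype=kernel.dtype)
    elif init == "mean":
        # Alternative: start from the average of the existing channel filters.
        extra = kernel.mean(axis=2, keepdims=True)
    else:
        raise ValueError(f"unknown init: {init}")
    return np.concatenate([kernel, extra], axis=2)


def conv_at_patch(patch, kernel):
    """Convolution response at a single location: (kh, kw, C) -> (out_ch,)."""
    return np.tensordot(patch, kernel, axes=([0, 1, 2], [0, 1, 2]))


rgbm_kernel = extend_kernel(rgb_kernel)              # shape (3, 3, 4, 64)

rgb_patch = rng.normal(size=(3, 3, 3))
mask_patch = rng.integers(0, 2, size=(3, 3, 1)).astype(float)
rgbm_patch = np.concatenate([rgb_patch, mask_patch], axis=-1)

# With zero init, the mask channel does not change the pretrained response.
unchanged = np.allclose(
    conv_at_patch(rgbm_patch, rgbm_kernel),
    conv_at_patch(rgb_patch, rgb_kernel),
)
print(rgbm_kernel.shape, unchanged)
```

In a real model the extended array would be assigned to the first convolution of a network built with a 4-channel input, while all later layers keep their pretrained weights unchanged.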

What are the differences between tensorflow_serving classification, predict and regression SignatureDefs?

I am trying to serve a TensorFlow Object Detection API model in TensorFlow Serving, and I am confused by the 3 different SignatureDefs. What are the differences? When should I choose one over another?
TensorFlow Serving uses a different way of updating model weights, and a different signature mechanism is used in serving. To save a model for serving, it uses SavedModel. SavedModel provides a language-neutral format to save machine-learned models that is recoverable and hermetic. It enables higher-level systems and tools to produce, consume and transform TensorFlow models.
SavedModel supports SignatureDefs.
Graphs that are used for inference tasks typically have a set of inputs and outputs. This is called a Signature.
SavedModel uses SignatureDefs to allow generic support for signatures that may need to be saved with the graphs.
For those who previously used TF-Exporter/SessionBundle, Signatures in TF-Exporter will be replaced by SignatureDefs in SavedModel.
A SignatureDef requires specification of:
- inputs as a map of string to TensorInfo.
- outputs as a map of string to TensorInfo.
- method_name (which corresponds to a supported method name in the loading tool/system).
Classification SignatureDefs support structured calls to TensorFlow Serving's Classification API. These prescribe that there must be an inputs Tensor, and that there are two optional output Tensors: classes and scores, at least one of which must be present.
Predict SignatureDefs support calls to TensorFlow Serving's Predict API. These signatures allow you to flexibly support arbitrarily many input and output Tensors. For example, a signature my_prediction_signature can have a single logical input Tensor images that is mapped to the actual Tensor x:0 in your graph.
Regression SignatureDefs support structured calls to TensorFlow Serving's Regression API. These prescribe that there must be exactly one inputs Tensor, and one outputs Tensor.
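
The three sets of rules above can be written down as a small checker. This is plain Python, not TensorFlow code: check_signature is a hypothetical helper, and the tensor names ("x:0" and so on) are made-up stand-ins for TensorInfo entries. The method-name strings are the values TensorFlow exposes as tf.saved_model.PREDICT_METHOD_NAME, CLASSIFY_METHOD_NAME and REGRESS_METHOD_NAME.

```python
PREDICT = "tensorflow/serving/predict"
CLASSIFY = "tensorflow/serving/classify"
REGRESS = "tensorflow/serving/regress"


def check_signature(method_name, inputs, outputs):
    """Return a list of rule violations; an empty list means the rules hold.

    inputs / outputs: dicts mapping logical names to tensor names.
    """
    problems = []
    if method_name == CLASSIFY:
        # Must have an "inputs" tensor; outputs are "classes" and/or "scores".
        if "inputs" not in inputs:
            problems.append("classification needs an 'inputs' tensor")
        if not outputs or not set(outputs) <= {"classes", "scores"}:
            problems.append("classification needs 'classes' and/or 'scores'")
    elif method_name == REGRESS:
        # Exactly one "inputs" tensor and one "outputs" tensor.
        if set(inputs) != {"inputs"}:
            problems.append("regression needs exactly one 'inputs' tensor")
        if set(outputs) != {"outputs"}:
            problems.append("regression needs exactly one 'outputs' tensor")
    elif method_name == PREDICT:
        # Arbitrarily many named inputs and outputs are allowed.
        pass
    else:
        problems.append(f"unsupported method_name: {method_name}")
    return problems


# A predict signature with a logical input "images" mapped to tensor "x:0".
ok_predict = check_signature(PREDICT, {"images": "x:0"}, {"scores": "y:0"})

# A classification signature that forgot both "classes" and "scores".
bad_classify = check_signature(CLASSIFY, {"inputs": "tf_example:0"}, {})

print(ok_predict, bad_classify)
```

For an object detection model, which typically returns several tensors per image, the Predict signature is usually the one that fits, since the other two prescribe fixed output names.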
Please refer to:
https://www.tensorflow.org/serving/signature_defs
https://github.com/tensorflow/serving/issues/599
The Classify API is higher-level and more specific than the Predict API. Classify accepts tensorflow.serving.Input (which wraps a list of tf.Examples) as input and produces classes and scores as output. It is used for classification problems. Predict, on the other hand, accepts tensors as input and outputs tensors. It can be used for regression, classification and other types of inference problems.
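
The difference also shows up in the request shape. The sketch below builds (but does not send) request bodies in the style of TensorFlow Serving's REST API, where Predict takes raw tensor values under "instances" and Classify takes feature dictionaries under "examples". The host, port, model name, signature name "my_classifier" and feature names are all assumptions for illustration.

```python
import json

HOST = "http://localhost:8501"  # assumed address of a TensorFlow Serving instance
MODEL = "my_model"              # hypothetical model name


def predict_request(instances, signature_name="serving_default"):
    """Predict: tensors in, tensors out. Each instance is a raw tensor value."""
    url = f"{HOST}/v1/models/{MODEL}:predict"
    body = {"signature_name": signature_name, "instances": instances}
    return url, json.dumps(body)


def classify_request(examples, signature_name):
    """Classify: each example is a dict of named features (a tf.Example)."""
    url = f"{HOST}/v1/models/{MODEL}:classify"
    body = {"signature_name": signature_name, "examples": examples}
    return url, json.dumps(body)


# Two 2x2 single-channel "images" as nested lists.
p_url, p_body = predict_request([[[0.0, 0.1], [0.2, 0.3]],
                                 [[0.4, 0.5], [0.6, 0.7]]])

# Two examples with made-up feature names.
c_url, c_body = classify_request(
    [{"age": 31.0, "country": "NZ"}, {"age": 52.0, "country": "SE"}],
    signature_name="my_classifier",
)

print(p_url)
print(c_url)
```

Either body could then be sent with an HTTP POST to the corresponding URL once a model with a matching signature is being served.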