LBP Feature Grafting with CNN Layer using Custom Layer - tensorflow

Can we add manually extracted LBP (Local Binary Pattern) features to a CNN, in a hidden layer, using a custom layer or some other mechanism?
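One possible way to do this, sketched below, is to compute the LBP histogram outside the network and feed it in as a second input that is concatenated with the CNN's pooled features before a hidden Dense layer. This is only an illustration: the helper names `lbp_histogram` and `build_model`, the layer sizes, and the basic 8-neighbour LBP variant are assumptions, not something taken from the question.

```python
import numpy as np
import tensorflow as tf


def lbp_histogram(gray, bins=256):
    """Normalised histogram of basic 8-neighbour (radius 1) LBP codes.

    `gray` is a 2-D grayscale image of at least 3x3 pixels (hypothetical helper).
    """
    gray = np.asarray(gray, dtype=np.float32)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.int32)
    # Eight neighbours, clockwise from top-left; each one contributes one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=bins).astype(np.float32)
    return hist / hist.sum()


def build_model(image_shape=(32, 32, 1), lbp_dim=256, num_classes=10):
    """Two-input model: CNN features and LBP features meet in a hidden layer."""
    image_in = tf.keras.Input(shape=image_shape, name="image")
    lbp_in = tf.keras.Input(shape=(lbp_dim,), name="lbp")

    x = tf.keras.layers.Conv2D(16, 3, activation="relu")(image_in)
    x = tf.keras.layers.MaxPooling2D()(x)
    x = tf.keras.layers.Conv2D(32, 3, activation="relu")(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)

    # "Grafting": the hand-crafted vector joins the learned features here.
    merged = tf.keras.layers.Concatenate()([x, lbp_in])
    hidden = tf.keras.layers.Dense(64, activation="relu")(merged)
    out = tf.keras.layers.Dense(num_classes, activation="softmax")(hidden)
    return tf.keras.Model([image_in, lbp_in], out)


# Random stand-in data, only to show the shapes involved.
images = np.random.rand(4, 32, 32, 1).astype(np.float32)
lbp_features = np.stack([lbp_histogram(img[..., 0]) for img in images])
model = build_model()
preds = model.predict([images, lbp_features], verbose=0)
```

Computing LBP in preprocessing keeps the network fully differentiable. A custom layer that computes LBP inside the graph is also conceivable, but thresholding against the centre pixel has no useful gradient, so it would act as a fixed feature extractor either way.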

Related

Implement autoencoder with attention layer

I am trying to modify the autoencoder model so I can add an attention layer on the "bottleneck/compressed" layer.
Suppose I have 2 encoder Dense layers, the attention layer I want to add, and 2 decoder Dense layers.
How can I develop this architecture in tensorflow?
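A minimal sketch of one possible reading of this architecture follows. Because the bottleneck of a Dense autoencoder is a single vector rather than a sequence, the "attention" here is a simple learned softmax weighting over the bottleneck features, which is an assumption about what is wanted. The function name `build_attention_autoencoder` and all layer sizes are hypothetical.

```python
import numpy as np
import tensorflow as tf


def build_attention_autoencoder(input_dim=20, hidden_dim=64, latent_dim=8):
    inputs = tf.keras.Input(shape=(input_dim,))

    # Two encoder Dense layers.
    x = tf.keras.layers.Dense(hidden_dim, activation="relu")(inputs)
    code = tf.keras.layers.Dense(latent_dim, activation="relu")(x)

    # Feature-wise attention on the bottleneck: one softmax weight per
    # latent feature, multiplied element-wise with the code.
    weights = tf.keras.layers.Dense(
        latent_dim, activation="softmax", name="attention_weights")(code)
    attended = tf.keras.layers.Multiply(name="attended_code")([code, weights])

    # Two decoder Dense layers.
    x = tf.keras.layers.Dense(hidden_dim, activation="relu")(attended)
    outputs = tf.keras.layers.Dense(input_dim, activation="sigmoid")(x)
    return tf.keras.Model(inputs, outputs)


autoencoder = build_attention_autoencoder()
autoencoder.compile(optimizer="adam", loss="mse")

# Random stand-in data in [0, 1], only to exercise the model.
data = np.random.rand(8, 20).astype(np.float32)
reconstruction = autoencoder.predict(data, verbose=0)

# Side model that exposes the attention weights for inspection.
weight_model = tf.keras.Model(
    autoencoder.input, autoencoder.get_layer("attention_weights").output)
attention = weight_model.predict(data, verbose=0)
```

If the bottleneck were a sequence (for example with recurrent or convolutional encoders), a sequence attention layer such as `tf.keras.layers.Attention` could be used at the same position instead.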

Multi-Head attention layers - what is a wrapper multi-head layer in Keras?

I am new to attention mechanisms and I want to learn more about them by working through some practical examples. I came across a Keras implementation of multi-head attention on PyPI (keras multi-head), and I found two different ways to use it in Keras.
One way is to use multi-head as a Keras wrapper layer around either an LSTM or a CNN.
This is a snippet that implements multi-head as a wrapper layer around an LSTM in Keras. The example is taken from the keras multi-head website:
import keras
from keras_multi_head import MultiHead
model = keras.models.Sequential()
model.add(keras.layers.Embedding(input_dim=100, output_dim=20, name='Embedding'))
model.add(MultiHead(keras.layers.LSTM(units=64), layer_num=3, name='Multi-LSTMs'))
model.add(keras.layers.Flatten(name='Flatten'))
model.add(keras.layers.Dense(units=4, activation='softmax', name='Dense'))
model.build()
model.summary()
The other way is to use it separately as a stand-alone layer.
This is a snippet of the second implementation, multi-head as a stand-alone layer, also taken from keras multi-head:
import keras
from keras_multi_head import MultiHeadAttention
input_layer = keras.layers.Input(shape=(2, 3), name='Input')
att_layer = MultiHeadAttention(head_num=3, name='Multi-Head')(input_layer)
model = keras.models.Model(inputs=input_layer, outputs=att_layer)
model.compile(optimizer='adam', loss='mse', metrics={})
I have been trying to find documentation that explains this, but I have not found any yet.
Update:
What I have found is that the second implementation (MultiHeadAttention) is closer to the Transformer paper "Attention Is All You Need". However, I am still struggling to understand the first implementation, the wrapper layer.
Does the first one (the wrapper layer) combine the outputs of the multiple heads with the LSTM?
I was wondering if someone could explain the idea behind them, especially the wrapper layer.
I understand your confusion. In my experience, the MultiHead wrapper duplicates (or parallelizes) a layer to form a kind of multichannel architecture, where each channel can extract different features from the input.
For instance, each channel can have a different configuration, and the channel outputs are later concatenated to make an inference. So MultiHead can be used to wrap conventional architectures to form a multi-head CNN, a multi-head LSTM, etc.
Note that the attention layer is different. You may stack attention layers to form a new architecture. You may also parallelize the attention layer (MultiHeadAttention) and configure each layer as explained above. See here for different implementations of the attention layer.
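To make the "parallel channels" idea concrete, here is a rough sketch written with plain `tf.keras` layers rather than the keras multi-head package. It illustrates the concept described above and is not the package's actual implementation; in particular, how the package combines the head outputs may differ from the plain concatenation used here. The function name `build_parallel_lstm_model` is hypothetical.

```python
import numpy as np
import tensorflow as tf


def build_parallel_lstm_model(vocab_size=100, embed_dim=20, units=64,
                              num_heads=3, num_classes=4):
    tokens = tf.keras.Input(shape=(None,), dtype="int32", name="tokens")
    embedded = tf.keras.layers.Embedding(vocab_size, embed_dim)(tokens)

    # Each "head" is an independent LSTM with its own weights that reads
    # the same embedded sequence.
    heads = [
        tf.keras.layers.LSTM(units, name=f"head_{i}")(embedded)
        for i in range(num_heads)
    ]

    # The per-head feature vectors are joined before the classifier.
    merged = tf.keras.layers.Concatenate(name="merged_heads")(heads)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(merged)
    return tf.keras.Model(tokens, outputs)


parallel_model = build_parallel_lstm_model()

# Two random token sequences of length 10, only to check the shapes.
batch = np.random.randint(0, 100, size=(2, 10)).astype(np.int32)
probs = parallel_model.predict(batch, verbose=0)
merged_width = parallel_model.get_layer("merged_heads").output.shape[-1]
```

Nothing in this structure computes attention scores; the heads differ only because they are initialised and trained independently, which is the distinction from MultiHeadAttention drawn above.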

Can I generate heat map using method such as Grad-CAM in concatenated CNN?

I am trying to apply GradCAM to my pre-trained CNN model to generate heat maps of layers. My custom CNN design is shown as follows:
- It adopted all the convolution layers and the pre-trained weights from the VGG16 model.
- Extract lower level features (early convolution layers) from VGG16.
- Train the fully connected layers of both normal/high and lower level features from VGG16.
- Concatenate outputs of both normal/high- and lower-level f.c. layers and then train more f.c. layers before the final prediction.
model design
I want to use Grad-CAM to visualize the feature maps of the low-level route and the normal/high-level route. I have already produced such heat maps on a non-concatenated, fine-tuned VGG using the last convolutional layer. My question is: on a concatenated CNN model, can the Grad-CAM method still work using the gradient of the prediction with respect to the low- and high-level feature maps, respectively? If not, are there other methods that can produce heat map visualizations for such a model? Is using the shared fully connected layer an option?
Any ideas and suggestions are much appreciated!
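As a rough sketch of what this could look like, the code below applies the usual Grad-CAM recipe to a tiny two-branch model, taking the gradient of the class score separately with respect to a low-level and a high-level convolutional feature map. The toy model stands in for the VGG16-based design, and the names `build_two_branch_model`, `grad_cam`, `low_conv` and `high_conv` are made up for the illustration.

```python
import numpy as np
import tensorflow as tf


def build_two_branch_model(num_classes=3):
    inp = tf.keras.Input(shape=(32, 32, 3))
    low = tf.keras.layers.Conv2D(
        8, 3, padding="same", activation="relu", name="low_conv")(inp)
    x = tf.keras.layers.MaxPooling2D()(low)
    high = tf.keras.layers.Conv2D(
        16, 3, padding="same", activation="relu", name="high_conv")(x)

    # Separate fully connected routes for low- and high-level features.
    low_fc = tf.keras.layers.Dense(16, activation="relu")(
        tf.keras.layers.GlobalAveragePooling2D()(low))
    high_fc = tf.keras.layers.Dense(16, activation="relu")(
        tf.keras.layers.GlobalAveragePooling2D()(high))

    merged = tf.keras.layers.Concatenate()([low_fc, high_fc])
    out = tf.keras.layers.Dense(num_classes, activation="softmax")(merged)
    return tf.keras.Model(inp, out)


def grad_cam(model, image, layer_names, class_index):
    """Return one normalised Grad-CAM map per requested layer."""
    layer_outputs = [model.get_layer(n).output for n in layer_names]
    grad_model = tf.keras.Model(model.input, layer_outputs + [model.output])

    with tf.GradientTape() as tape:
        *feature_maps, preds = grad_model(tf.convert_to_tensor(image))
        score = preds[:, class_index]
    grads = tape.gradient(score, feature_maps)

    heatmaps = {}
    for name, fmap, grad in zip(layer_names, feature_maps, grads):
        # Channel weights are the spatially averaged gradients.
        weights = tf.reduce_mean(grad, axis=(1, 2), keepdims=True)
        cam = tf.nn.relu(tf.reduce_sum(weights * fmap, axis=-1))
        cam = cam / (tf.reduce_max(cam) + 1e-8)
        heatmaps[name] = cam.numpy()
    return heatmaps


two_branch = build_two_branch_model()
image = np.random.rand(1, 32, 32, 3).astype(np.float32)
maps = grad_cam(two_branch, image, ["low_conv", "high_conv"], class_index=0)
```

In a design like this, where the low-level map also feeds the later convolutions, the gradient with respect to the low-level map accumulates contributions from both routes, so its heat map is not specific to the low-level fully connected route alone.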

How can I create a RoI pooling layer in tensorflow/keras?

I've programmed a VGG16-based CNN and now I want to create a Faster R-CNN from it. Every architecture diagram I've seen includes a RoI pooling layer, but I don't know how to implement one. Is there a function to do this?
Keras/TensorFlow does not provide an implementation of a RoI pooling layer, so you need to code it yourself.
You can use this repository as a code reference.
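As a starting point for coding it yourself, here is a sketch of a commonly used approximation: crop each region from the feature map with `tf.image.crop_and_resize` and then max-pool the crop down to the target size. This is not exact RoI pooling as defined in the Fast R-CNN paper (it resamples bilinearly first), and the function name `roi_pool_approx` and its defaults are assumptions for the example.

```python
import tensorflow as tf


def roi_pool_approx(feature_map, boxes, box_indices, pool_size=7):
    """Approximate RoI pooling.

    feature_map: float tensor of shape (batch, height, width, channels).
    boxes:       float tensor of shape (num_rois, 4) holding normalised
                 [y1, x1, y2, x2] coordinates in [0, 1].
    box_indices: int32 tensor of shape (num_rois,) giving the batch index
                 that each box belongs to.
    Returns a tensor of shape (num_rois, pool_size, pool_size, channels).
    """
    # Resample each region to twice the target size ...
    crops = tf.image.crop_and_resize(
        feature_map, boxes, box_indices,
        crop_size=(2 * pool_size, 2 * pool_size))
    # ... then take the max over non-overlapping 2x2 windows.
    return tf.nn.max_pool2d(crops, ksize=2, strides=2, padding="VALID")


# Random stand-in feature map and three example regions.
feature_map = tf.random.normal((2, 16, 16, 8))
boxes = tf.constant([[0.0, 0.0, 0.5, 0.5],
                     [0.25, 0.25, 1.0, 1.0],
                     [0.0, 0.0, 1.0, 1.0]], dtype=tf.float32)
box_indices = tf.constant([0, 1, 1], dtype=tf.int32)
pooled = roi_pool_approx(feature_map, boxes, box_indices)
```

The function can be wrapped in a custom `tf.keras.layers.Layer` if it needs to live inside a Keras model; the fixed output size is what lets the following fully connected layers accept regions of any shape.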

How to use a bidirectional RNN layer in tensorflow?

When we add a bidirectional RNN layer, I understand that we have to concatenate the hidden states. If we use a bidirectional RNN layer in an encoder-decoder model, do we have to train the bidirectional RNN layer separately?
No. To quote from the abstract of Bidirectional Recurrent Neural Networks by Schuster and Paliwal:
The BRNN can be trained without the limitation of using input information just
up to a preset future frame. This is accomplished by training it
simultaneously in positive and negative time direction.
I guess you are talking about tf.nn.static_bidirectional_rnn.
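A small sketch of the joint training follows. It uses the Keras wrapper `tf.keras.layers.Bidirectional` rather than `tf.nn.static_bidirectional_rnn`, which is a substitution made for the example, and the layer sizes and random data are placeholders. A single backward pass yields gradients for the forward and backward LSTM weights together, so no separate training step is needed.

```python
import tensorflow as tf

# Encoder: one bidirectional LSTM; by default the final forward and
# backward states are concatenated, so the output width is 2 * units.
encoder = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(8))
head = tf.keras.layers.Dense(1)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(5, 3)),  # 5 time steps, 3 features per step
    encoder,
    head,
])

# Random stand-in batch: 4 sequences with scalar targets.
x = tf.random.normal((4, 5, 3))
y = tf.random.normal((4, 1))

with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(model(x) - y))

# One gradient call covers both directions and the Dense head.
grads = tape.gradient(loss, model.trainable_variables)

encoded = encoder(x)
```

In an encoder-decoder model the same holds: the bidirectional encoder is just another differentiable block, and the decoder loss is backpropagated through both directions at once.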