I am trying to replicate YOLO in TensorFlow.js. But instead of porting an existing model, I am building it with the Layers API, because I want to learn how to build models from scratch.
The problem is that the YOLO model uses leaky ReLU, and TensorFlow.js does not offer leaky ReLU as an activation option for conv2d layers. My understanding is that I should use no activation in the conv2d layer and instead add a tf.layers.leakyReLU layer directly to the model.
I found a Python / Keras answer, How do you use Keras LeakyReLU in Python?, but it does not apply to the JS API, especially if I want to run the model on my GPU via Node.js.
This might sound like an obvious question, but do I add the leakyReLU layer before or after the conv2d layer?
Am I missing some bit of API where I can specify an arbitrary activation?
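As a framework-agnostic sketch of the ordering (in NumPy, since the same logic applies whether the framework is TensorFlow.js or Keras): the conv layer produces a linear pre-activation, and the activation is applied to that output, so a separate leaky ReLU layer goes after the conv layer.

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    # LeakyReLU: identity for positive inputs, small slope alpha for negative ones.
    return np.where(x > 0, x, alpha * x)

# z stands in for the linear output of a conv2d layer (no activation set).
z = np.array([-1.0, 0.5, 2.0])
out = leaky_relu(z)  # negatives scaled by alpha, positives unchanged
```

The value of alpha here is just an illustrative default; YOLO implementations typically use 0.1.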
Related
I would like to deploy a trained Keras model on a microcontroller. However, there is no support for the SpatialDropout layer. I thought about removing the layer from the graph, as is done with the Dropout layer, but I could not find any indication of how SpatialDropout behaves at inference.
I have tried looking through the documentation and similar problems but couldn't find anything about it.
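For reference, SpatialDropout behaves like regular Dropout at inference: it is the identity, so it can be removed from the graph for deployment. A NumPy sketch, assuming the standard inverted-dropout convention (scaling at training time, nothing at inference):

```python
import numpy as np

def spatial_dropout(x, rate, training):
    # x: (batch, height, width, channels). At inference, dropout is the identity.
    if not training:
        return x
    # Training: drop entire feature maps (channels) and rescale the survivors,
    # unlike regular Dropout, which drops individual elements.
    keep = (np.random.rand(x.shape[0], 1, 1, x.shape[-1]) >= rate)
    return x * keep / (1.0 - rate)

x = np.ones((2, 4, 4, 3))
y = spatial_dropout(x, 0.5, training=False)  # identical to x
```

The only difference from regular Dropout is the shape of the mask (whole channels instead of single elements), which only matters during training.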
I am new to attention mechanisms and want to learn more about them by working through some practical examples. I came across a Keras implementation of multi-head attention on this website: Pypi keras multi-head. I found two different ways to implement it in Keras.
One way is to use multi-head attention as a Keras wrapper layer around either an LSTM or a CNN.
This is a snippet implementing multi-head as a wrapper layer around an LSTM in Keras. This example is taken from the keras multi-head website:
import keras
from keras_multi_head import MultiHead
model = keras.models.Sequential()
model.add(keras.layers.Embedding(input_dim=100, output_dim=20, name='Embedding'))
model.add(MultiHead(keras.layers.LSTM(units=64), layer_num=3, name='Multi-LSTMs'))
model.add(keras.layers.Flatten(name='Flatten'))
model.add(keras.layers.Dense(units=4, activation='softmax', name='Dense'))
model.build()
model.summary()
The other way is to use it separately as a stand-alone layer.
This is a snippet of the second implementation, multi-head as a stand-alone layer, also taken from keras multi-head:
import keras
from keras_multi_head import MultiHeadAttention
input_layer = keras.layers.Input(shape=(2, 3), name='Input')
att_layer = MultiHeadAttention(head_num=3, name='Multi-Head')(input_layer)
model = keras.models.Model(inputs=input_layer, outputs=att_layer)
model.compile(optimizer='adam', loss='mse', metrics={})
I have been trying to find documentation that explains this, but I have not found any yet.
Update:
What I have found is that the second implementation (MultiHeadAttention) is closer to the Transformer paper "Attention Is All You Need". However, I am still struggling to understand the first implementation, the wrapper layer.
Does the first one (as a wrapper layer) combine the output of the multi-head mechanism with the LSTM?
I was wondering if someone could explain the idea behind them, especially the wrapper layer.
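On the second implementation: the core of MultiHeadAttention in the Transformer paper is scaled dot-product attention. A minimal single-head NumPy sketch (without the learned Q/K/V projections that a real MultiHeadAttention layer adds):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: similarity scores between queries and keys,
    # softmax-normalized, used as weights over the values.
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

q = k = v = np.eye(3)[None]   # one batch, seq_len 3, dim 3
out = attention(q, k, v)      # shape (1, 3, 3)
```

A real multi-head layer runs several of these in parallel on learned projections of the input and concatenates the results; this sketch only shows the attention computation itself.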
I understand your confusion. In my experience, what MultiHead (the wrapper) does is duplicate (or parallelize) layers to form a kind of multichannel architecture, where each channel can extract different features from the input.
For instance, each channel can have a different configuration, and the channel outputs are later concatenated to make an inference. So MultiHead can be used to wrap conventional architectures to form a multihead-CNN, multihead-LSTM, etc.
Note that the attention layer is different. You may stack attention layers to form a new architecture, or parallelize the attention layer (MultiHeadAttention) and configure each head as explained above. See here for different implementations of the attention layer.
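The wrapper idea above can be sketched in NumPy (a hypothetical stand-in, with a plain linear layer in place of the wrapped LSTM): each of the layer_num copies gets its own weights, all copies see the same input, and their outputs are concatenated.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w):
    # Stand-in for the wrapped layer (an LSTM/CNN in the real MultiHead wrapper).
    return x @ w

def multi_head_wrapper(x, layer_num=3, units=4):
    # Duplicate the wrapped layer layer_num times; each copy has its own weights,
    # so each "channel" can learn different features from the same input.
    heads = []
    for _ in range(layer_num):
        w = rng.normal(size=(x.shape[-1], units))
        heads.append(dense(x, w))
    # Concatenate the parallel channels along the feature axis.
    return np.concatenate(heads, axis=-1)

x = rng.normal(size=(2, 5))
out = multi_head_wrapper(x, layer_num=3, units=4)  # shape (2, 12)
```

The output width is layer_num * units, which matches the "duplicate, then concatenate" description above.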
I'd like to add a layer to a PyTorch-based neural model; basically, I am trying to combine two codebases.
But I noticed that the layer I want to add is implemented in TensorFlow. I'd like to know if there is an easy way to integrate a TensorFlow layer into a PyTorch neural model.
The error is shown as:
module 'torch.nn' has no attribute 'tensorflow_layer'
I recently took on a task to study how to use morphological operations as activation functions for neural networks. But I had no idea how to use Keras for this kind of custom functionality. Can anyone provide suggestions or related papers?
I don't know whether morphological operations will work as an activation function on their own; I doubt it.
I think you should instead use or combine morphological operations such as dilation and erosion, and then apply a ReLU or softmax layer as the activation function.
Morphological operations are available in TensorFlow, and you can call them from your Keras application.
Reference links:
https://www.tensorflow.org/api_docs/python/tf/nn/dilation2d
https://www.tensorflow.org/api_docs/python/tf/nn/erosion2d
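To make the suggestion concrete, here is what grayscale dilation and erosion compute, sketched in 1D NumPy with a flat structuring element (the TensorFlow ops linked above do the 2D, learnable-filter version):

```python
import numpy as np

def dilate1d(x, size=3):
    # Grayscale dilation with a flat structuring element: sliding-window maximum.
    pad = size // 2
    xp = np.pad(x, pad, mode='edge')
    return np.array([xp[i:i + size].max() for i in range(len(x))])

def erode1d(x, size=3):
    # Grayscale erosion: sliding-window minimum.
    pad = size // 2
    xp = np.pad(x, pad, mode='edge')
    return np.array([xp[i:i + size].min() for i in range(len(x))])

x = np.array([0., 1., 0., 2., 0.])
d = dilate1d(x)  # peaks spread outward: [1. 1. 2. 2. 2.]
e = erode1d(x)   # isolated peaks removed: [0. 0. 0. 0. 0.]
```

Like max pooling, these are nonlinear per-window operations, which is why they are more naturally used as layers (possibly followed by a standard activation) than as a drop-in activation function.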
I see that there are many similar functions in TensorFlow and Keras, like argmax, boolean_mask... I wonder why people have to use Keras as a backend along with TensorFlow instead of using TensorFlow alone.
Keras is not a backend; it is a high-level API for building and training neural networks. Keras is capable of running on top of TensorFlow, Theano, and CNTK. Most people prefer Keras due to its simplicity compared to lower-level libraries like TensorFlow. I recommend Keras for beginners in deep learning.
A Keras tensor is a tensor object from the underlying backend (Theano, TensorFlow or CNTK), which we augment with certain attributes that allow us to build a Keras model just by knowing the inputs and outputs of the model.
Theano vs Tensorflow
TensorFlow is necessary if you wish to use coremltools. Apple has promised support for architectures created using Theano, but I haven't seen it yet.
Keras requires slightly different syntax depending on the backend in use. I like the flexibility of TensorFlow input layers and the easy access to strong Google neural networks.