I have trained a recurrent neural network (LSTM) in keras but now I am struggling to put all the pieces together. Specifically, I cannot understand how to recompose the matrices of weights.
I have one input, one hidden and one output layer, as follows:
# create the model
model = Sequential()
model.add(LSTM(100, dropout=0.5, recurrent_dropout=0.5, input_shape=(timesteps, data_dim), activation='tanh'))
model.add(Dense(5, activation='softmax'))
When I call model.get_weights() I only get the bias units for the hidden and output layers; the ones for the input layer appear to be missing. Namely, I have a 15x400 matrix, then a 400x101 and a 101x5.
Is there something that I am missing here?
Thanks for your help
Sequential is a model in Keras, not an input layer.
An input layer in a neural network simply passes the inputs on to the hidden layer, and it does not need a bias neuron.
In your case, model.get_weights() returns these arrays:
(15, 400)
(100, 400)
(400,)
(100, 5)
(5,)
Out of these, (400,) is the bias array for the LSTM layer and (5,) is the bias array for the Dense layer.
The other arrays are the kernels: (15, 400) is the LSTM input kernel, (100, 400) is the LSTM recurrent kernel, and (100, 5) is the Dense kernel.
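For reference, a minimal sketch that reproduces these shapes (timesteps is a hypothetical value here; only data_dim=15 matters for the kernel shapes):
import tensorflow as tf

# Hypothetical values matching the question: 15 input features, 10 timesteps.
timesteps, data_dim = 10, 15

model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(100, input_shape=(timesteps, data_dim), activation='tanh'))
model.add(tf.keras.layers.Dense(5, activation='softmax'))

for w in model.get_weights():
    print(w.shape)
# (15, 400)   LSTM input kernel:     data_dim x 4*units
# (100, 400)  LSTM recurrent kernel: units x 4*units
# (400,)      LSTM bias
# (100, 5)    Dense kernel
# (5,)        Dense bias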
Keras pre-trained models (VGG, ResNet, DenseNet, etc.) have weights established after training on ImageNet with input shape (224, 224, 3). However, Keras allows us to specify any other input shape (width and height should be no smaller than 32). How does Keras determine the initial weights of the first hidden layer when the input shape is other than (224, 224, 3)?
It depends on parameter include_top.
Example:
import tensorflow as tf
model = tf.keras.applications.VGG16(include_top = True, input_shape=(299, 299, 3))
model.summary()
This will throw an error, because when you pass include_top=True the whole VGG16 architecture is loaded, including the Dense layers.
Dense layers do care about the shape: because of the matrix multiplication they perform, their weight shapes are fixed, so the input shape must match the one the pre-trained weights were built for. That is why it throws an error.
-- Source Code --
Second Example:
import tensorflow as tf
model = tf.keras.applications.VGG16(include_top = False, input_shape=(299, 299, 3))
model.summary()
This time the model only has convolutional layers, because include_top=False. Convolutional layers just slide filters over the image, so the input spatial size is not a problem for ordinary convolutions.
When you pass an input_shape, Keras creates an Input layer with that shape, then builds the model, and after that loads the weights.
-- Source Code --
The only constraint here is that, since these models are trained on RGB images, the new images should also have 3 channels.
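A small sketch (my own check, not from the original answer) illustrating why this works: the convolutional kernels keep exactly the same shapes regardless of the chosen input size, so the same ImageNet weights fit either model.
import tensorflow as tf

# Two no-top VGG16 models with different input sizes.
base_224 = tf.keras.applications.VGG16(include_top=False, input_shape=(224, 224, 3))
base_299 = tf.keras.applications.VGG16(include_top=False, input_shape=(299, 299, 3))

# Every weight tensor (conv kernels and biases) has the same shape in both,
# which is why the same pre-trained weights can be loaded for either model.
for w224, w299 in zip(base_224.weights, base_299.weights):
    assert w224.shape == w299.shape
print("All weight shapes match.")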
I am training an autoencoder using keras,with the encoder part as :
self.encoder = tf.keras.Sequential()
self.encoder.add(tf.keras.layers.Dropout(rate=0.2))
self.encoder.add(layers.Dense(14, activation='relu'))
self.encoder.add(layers.Dense(10, activation='relu'))
I am using Dropout at the start to create noise. My input is a 14-dimensional dataset. What Dropout does now is randomly drop 20% of the nodes each time, meaning it drops 20% of the features each time. What I would like to do is drop a specific feature, let's say feature_3 (I suppose this means dropping a specific node), with a probability of 20% at each training step.
Could this be done using Keras?
If yes then how?
I do think you misunderstand how Dropout works.
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dropout
Your expectation is what Dropout actually does. Also, keras.layers.Dropout does not "create noise".
If you'd like to set the dropout mask:
noise_shape: 1D integer tensor representing the shape of the binary dropout mask that will be multiplied with the input. For instance, if your inputs have shape (batch_size, timesteps, features) and you want the dropout mask to be the same for all timesteps, you can use noise_shape=(batch_size, 1, features).
Note that noise_shape describes the shape of the dropout mask (i.e. which dimensions share the same mask); it is not related to adding/subtracting noise to your features.
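A minimal sketch of the docstring example above (the sizes are hypothetical): with noise_shape=(batch_size, 1, features), the same mask is reused across all timesteps.
import tensorflow as tf

batch_size, timesteps, features = 4, 10, 14
x = tf.ones((batch_size, timesteps, features))

# One mask per (sample, feature), shared across the timestep axis.
drop = tf.keras.layers.Dropout(rate=0.2, noise_shape=(batch_size, 1, features))
y = drop(x, training=True)  # training=True forces the mask to be applied
print(y.shape)              # (4, 10, 14)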
Are there any hidden layers in the following model, or just Input and output layers?
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(1024, activation='relu', input_dim=1))
model.add(tf.keras.layers.Dense(1,))
The first Dense layer is the hidden layer. Keras internally creates an input layer for the model, with 1 input unit (the value of input_dim).
There is one hidden Dense layer with 1024 neurons and a ReLU activation function.
You can plot the model with:
tf.keras.utils.plot_model(model, show_shapes=True)
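You can also inspect it with model.summary(); the parameter counts below are worked out from the layer sizes (1 input, 1024 hidden units, 1 output):
model.summary()
# Lists two Dense layers (the implicit input layer is not shown):
#   dense    (None, 1024)  2048 params  -> hidden layer: 1*1024 weights + 1024 biases
#   dense_1  (None, 1)     1025 params  -> output layer: 1024*1 weights + 1 bias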
I want to build a fully-connected (dense) layer for a regression task. I usually do it with TF2, using Keras API like:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=2, activation='sigmoid', input_shape=(1, )))
model.add(tf.keras.layers.Dense(units=2, activation='linear'))
model.compile(optimizer='adam', loss='mae')
model.fit(inp_data, out_data, epochs=1000)
Now I want to build a custom layer. The layer is composed of, say, 10 units, of which 8 have predefined, fixed, untrainable weights and biases, and 2 have randomly chosen weights and biases that are to be trained by the network. Does anyone have any idea how I can define this in TensorFlow?
Keras layers accept a trainable parameter, True by default, to indicate whether you want them to be trained. Non-trainable layers simply keep the values given to them by their initializer. If I understand correctly, you want one layer that is only partially trainable. That is not possible as such with the existing layers. Maybe you could do it with a custom layer class, but you can get equivalent behaviour by using two simple layers and then concatenating them (as long as your activation works element-wise; and even if it doesn't, like in a softmax layer, you could apply that activation after the concatenation). This is how it could work:
inputs = tf.keras.Input(shape=(1,))
# This is the trainable part of the layer (2 units)
layer_train = tf.keras.layers.Dense(units=2, activation='sigmoid')(inputs)
# This is the fixed, non-trainable part (8 units)
layer_const = tf.keras.layers.Dense(units=8, activation='sigmoid', trainable=False)(inputs)
# Merge both parts
layer = tf.keras.layers.Concatenate()([layer_train, layer_const])
# Make model
model = tf.keras.Model(inputs=inputs, outputs=layer)
# ...
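A hedged follow-up (my addition, not part of the original answer): if the fixed part should hold specific predefined weights rather than whatever the initializer produced, you can assign them explicitly once the model above is built. The arrays below are hypothetical placeholders with the right shapes, (input_dim, units) and (units,).
import numpy as np

fixed_kernel = np.ones((1, 8))   # hypothetical predefined weights
fixed_bias = np.zeros((8,))      # hypothetical predefined biases

# Find the non-trainable Dense part and overwrite its weights.
for lyr in model.layers:
    if isinstance(lyr, tf.keras.layers.Dense) and not lyr.trainable:
        lyr.set_weights([fixed_kernel, fixed_bias])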
I have a convolutional neural network of 3 layers followed by a dense layer and finally a softmax layer. The purpose is to train the convolutional layers, then replace the dense layer with an RNN layer. My problem is, I need a batch_size of 128, a number of time_steps of 100, and a hidden_state size of 100. Therefore, any input has to be of shape [batch_size, n_steps, number_features], where number_features is the size of the flattened pool output after the 3rd conv layer.
So feeding 128 (batch_size) * 100 (num_steps) images at once will not fit into my memory.
What I need to achieve is the following: I need to process 128 images at a time as a mini-batch, extract their features, and then hold on to them. After 100 mini-batches, I can feed the [128, 100, num_features] tensor into the RNN. In this case, I will not run out of memory.
So, how can I achieve this in tensorflow?
Any help is much appreciated!!
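For what it's worth, a minimal sketch of the workflow described in the question, under the assumption that the trained convolutional part is available as a Keras model cnn_features and the recurrent part as rnn_model (both names, and the image_batches iterator, are hypothetical):
import tensorflow as tf

batch_size, num_steps = 128, 100

feature_buffer = []
for step in range(num_steps):
    images = next(image_batches)                  # hypothetical iterator yielding (128, H, W, C)
    feats = cnn_features(images, training=False)  # (128, num_features)
    feature_buffer.append(feats)

# Stack along the time axis: (128, 100, num_features), then feed the RNN.
sequence = tf.stack(feature_buffer, axis=1)
rnn_output = rnn_model(sequence)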