Consider the following model:
model = Sequential()
model.add(Dense(60, input_shape=(60,), activation='relu', kernel_constraint=MaxNorm(3)))
model.add(Dropout(0.2))
model.add(Dense(30, activation='relu', kernel_constraint=MaxNorm(3)))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))
I understand the idea behind Dropout for regularization. As I understand it, Dropout is applied per layer with a rate p, which is the probability of a neuron being dropped. In the above example, I cannot tell whether the first Dropout layer is applied to the first hidden layer or to the second hidden layer. As I mentioned, dropout is applied per layer, and what confuses me is that Keras treats Dropout as a layer of its own. Moreover, if the first Dropout layer applies to the second hidden layer, what about the second Dropout layer? Is it applied to the output layer (which makes no sense at all, since you would be dropping output neurons)? Can someone please clarify these points?
As per the Keras documentation:
Applies Dropout to the input.
Therefore the input to the Dropout layer, i.e. the output of the layer that precedes it, is dropped with probability p. In your case that is the first hidden layer: on average, 20% of the 60 neurons from the first layer will be dropped at each training step.
It also wouldn't make sense for Dropout to act on the layer that follows it, because then the last Dropout would drop neurons from the output layer, which in classification is the result itself.
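To see this concretely, here is a minimal sketch (assuming TensorFlow 2.x / tf.keras; the tensor values are made up) that calls a Dropout layer directly in training mode, showing that it zeroes activations coming out of the layer before it:
import numpy as np
import tensorflow as tf

# Stand-in for the output of the first Dense(60) layer
activations = tf.ones((1, 60))
dropout = tf.keras.layers.Dropout(0.2)

# In training mode, roughly 20% of the 60 values are zeroed and the
# survivors are scaled by 1/(1 - 0.2) so the expected sum is unchanged
dropped = dropout(activations, training=True)
print("zeroed units:", int(np.sum(dropped.numpy() == 0)))

# In inference mode, Dropout is the identity
print(bool(tf.reduce_all(dropout(activations, training=False) == activations)))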
Related
I am training an autoencoder using Keras, with the encoder part as:
self.encoder = tf.keras.Sequential()
self.encoder.add(tf.keras.layers.Dropout(rate=0.2))
self.encoder.add(layers.Dense(14, activation='relu'))
self.encoder.add(layers.Dense(10, activation='relu'))
I am using Dropout at the start to create noise. My input is a 14-dimensional dataset. What Dropout does now is randomly drop 20% of the nodes each time, which means dropping 20% of the features each time. What I would like to do is drop a specific feature, let's say feature_3 (I suppose this means dropping a specific node), with a probability of 20% at each training step.
Could this be done using Keras?
If yes then how?
I do think you misunderstand how Dropout works.
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dropout
Your expectation is what dropout actually does. Also, keras.layers.Dropout does not "create noise".
If you'd like to set the dropout mask:
noise_shape: 1D integer tensor representing the shape of the binary dropout mask that will be multiplied with the input. For instance, if your inputs have shape (batch_size, timesteps, features) and you want the dropout mask to be the same for all timesteps, you can use noise_shape=(batch_size, 1, features).
Note that noise_shape describes the behavior of the feature-wise dropout mask and is not related to adding/subtracting noise to your features.
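For illustration, here is a minimal sketch of the noise_shape behavior quoted above (the shapes are made up; only the (batch_size, 1, features) pattern matters):
import tensorflow as tf

# Inputs shaped (batch_size, timesteps, features)
x = tf.random.uniform((4, 10, 14))

# None lets Keras fill in the batch size; the 1 makes the mask identical
# across timesteps, so a dropped feature stays dropped for the whole sequence
drop = tf.keras.layers.Dropout(rate=0.2, noise_shape=(None, 1, 14))
y = drop(x, training=True)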
I have spent some time trying to improve my F1 score for my multiclass text classification task. I am extracting aspects and sentiments from laptop reviews, so there are 3 labels: B_A / I_A / O. I would really appreciate any suggestions to improve my network, for example additional layers or another embedding. (Maybe I should also try something other than multiclass classification for my task.)
Currently I get an F1 score of about 60% with the following code:
#vocab_size=4840, embedding is glove6B, max_seq_length=100
model = Sequential()
model.add(Embedding(vocab_size, 300, weights=[embedding_vectors],
                    input_length=max_seq_length, trainable=False))
model.add(Dropout(0.1))
model.add(Conv1D(3000, 1, activation='relu'))
model.add(Bidirectional(LSTM(units=150, recurrent_dropout=0, return_sequences=True)))
model.add(Dense(32, activation='relu'))
model.add(Dense(n_tags, activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["categorical_accuracy"])
model.summary()
# fit model on train data
model.fit(x_train, y_train,
batch_size=64,
epochs=10)
I don't know about the data, but I do have a number of general suggestions for multiclass text classification with Keras:
Instead of a single Conv1D layer with 3000 filters, try stacking multiple Conv1D layers with fewer filters each
For the 32-neuron Dense layer, try increasing the number of neurons. Often, when the layer before the output layer is too small, the model loses accuracy
Instead of adding activation='relu' to the layers, try a LeakyReLU layer, which avoids the dying-ReLU problem if it is present
Move the Dropout from after the Embedding layer to after the Conv1D layer. I see little need for a Dropout after a non-trainable layer that only vectorizes the inputs (see the sketch after this list)
If you haven't tried any of these suggestions already, I would recommend doing so. I would especially try the 4th one, as a Dropout after an Embedding layer doesn't seem necessary.
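For reference, here is a rough sketch of what these suggestions could look like together (it reuses vocab_size, embedding_vectors, max_seq_length and n_tags from your code; the filter counts and layer sizes are guesses, not tuned values):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, Conv1D, Dropout, Bidirectional,
                                     LSTM, Dense, LeakyReLU)

model = Sequential()
model.add(Embedding(vocab_size, 300, weights=[embedding_vectors],
                    input_length=max_seq_length, trainable=False))
# Several smaller Conv1D layers instead of one 3000-filter layer
model.add(Conv1D(256, 3, padding='same'))
model.add(LeakyReLU())
model.add(Conv1D(128, 3, padding='same'))
model.add(LeakyReLU())
# Dropout after the convolutions rather than after the Embedding
model.add(Dropout(0.1))
model.add(Bidirectional(LSTM(units=150, recurrent_dropout=0, return_sequences=True)))
# A wider layer before the output layer
model.add(Dense(128))
model.add(LeakyReLU())
model.add(Dense(n_tags, activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer="rmsprop",
              metrics=["categorical_accuracy"])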
I am unsure whether I need to add a Dense input layer before adding LSTM layers in my model. For example, with the following model:
# Model
model = Sequential()
model.add(LSTM(128, input_shape=(train_x.shape[1], train_x.shape[2])))
model.add(Dense(5, activation="linear"))
Will the LSTM layer be the input layer, and the Dense layer the output layer (meaning no hidden layers)? Or does Keras create an input layer meaning the LSTM layer will be a hidden layer?
You don't need to. It depends on what you want to accomplish.
Check some cases here.
In your case, yes: the LSTM will be the first layer and the Dense layer will be the output layer.
The current configuration is okay for simple examples. Everything depends on what results you want: the model and its layers change based on the target goal. If the task is complex, you can build a mixed model with different layers and shapes. See this reference:
Mix model layering
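One quick way to verify which layers exist is to build the model and print its summary; here is a small sketch with made-up dimensions for train_x:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Dummy data just to provide an input shape: 100 samples, 10 timesteps, 8 features
train_x = np.zeros((100, 10, 8), dtype="float32")

model = Sequential()
model.add(LSTM(128, input_shape=(train_x.shape[1], train_x.shape[2])))
model.add(Dense(5, activation="linear"))

# Only two layers are listed: the LSTM (the first layer) and the Dense output;
# Keras records the input shape on the LSTM rather than adding a separate trainable layer
model.summary()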
Are there any hidden layers in the following model, or just Input and output layers?
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(1024, activation='relu', input_dim=1))
model.add(tf.keras.layers.Dense(1,))
The first Dense layer is the hidden layer. Keras internally assigns an input layer for the model, with 1 input unit (given by the input_dim parameter).
There is one hidden Dense layer with 1024 neurons and a ReLU activation function.
You can plot the model with:
tf.keras.utils.plot_model(model, show_shapes=True)
I want to build a fully-connected (dense) layer for a regression task. I usually do it with TF2, using Keras API like:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=2, activation='sigmoid', input_shape=(1, )))
model.add(tf.keras.layers.Dense(units=2, activation='linear'))
model.compile(optimizer='adam', loss='mae')
model.fit(inp_data, out_data, epochs=1000)
Now I want to build a custom layer. The layer is composed of, say, 10 units, in which 8 units have predefined, fixed, untrainable weights and biases and 2 units have randomly-initialized weights and biases to be trained by the network. Does anyone have an idea how I can define this in TensorFlow?
Keras layers accept a trainable parameter, True by default, to indicate whether you want them to be trained. Non-trainable layers will simply keep the values given to them by the initializer. If I understand correctly, you want one layer that is only partially trainable. That is not possible as such with existing layers. You might do it with a custom layer class, but you can get equivalent behavior by using two separate layers and then concatenating them (as long as your activation works element-wise; even if it doesn't, like a softmax, you can apply the activation after the concatenation). This is how it could work:
inputs = tf.keras.Input(shape=(1,))
# This is the trainable part of the layer (2 units)
layer_train = tf.keras.layers.Dense(units=2, activation='sigmoid')(inputs)
# This is the non-trainable part (8 units with fixed weights and biases)
layer_const = tf.keras.layers.Dense(units=8, activation='sigmoid', trainable=False)(inputs)
# Merge both parts
layer = tf.keras.layers.Concatenate()([layer_train, layer_const])
# Make model
model = tf.keras.Model(inputs=inputs, outputs=layer)
# ...
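If the 8 fixed units need specific predefined values rather than whatever the initializer picks, one option (a sketch of the same idea, keeping a reference to the non-trainable layer so its weights can be overwritten) is:
import numpy as np
import tensorflow as tf

inputs = tf.keras.Input(shape=(1,))

# Trainable part: 2 units
layer_train = tf.keras.layers.Dense(units=2, activation='sigmoid')(inputs)

# Non-trainable part: 8 units whose weights we set by hand below
const_dense = tf.keras.layers.Dense(units=8, activation='sigmoid', trainable=False)
layer_const = const_dense(inputs)

layer = tf.keras.layers.Concatenate()([layer_train, layer_const])
model = tf.keras.Model(inputs=inputs, outputs=layer)

# Dense kernel has shape (input_dim, units) = (1, 8), bias has shape (8,);
# the values below are placeholders for your predefined weights
const_dense.set_weights([np.ones((1, 8), dtype="float32"),
                         np.zeros((8,), dtype="float32")])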