Are hidden layers of sklearn's MLPClassifier() the same as Dense layers of keras/tensorflow?

Theoretically and practically, are the hidden layers of MLPClassifier (refer to hidden_layer_sizes)
mlp = MLPClassifier(hidden_layer_sizes=(4, 3, 2, 1),
                    max_iter=100, activation='relu',
                    solver='adam', verbose=True,
                    random_state=100, learning_rate='invscaling',
                    early_stopping=False)
the same as the Dense layers of tensorflow/keras
mlp = Sequential()
mlp.add(Dense(4))
mlp.add(Dense(3))
mlp.add(Dense(2))
mlp.add(Dense(1))
?

Yes, conceptually they are the same: hidden_layer_sizes=(4, 3, 2, 1) builds four fully connected hidden layers with 4, 3, 2 and 1 neurons, which is exactly what a stack of Dense layers gives you. Two practical differences are worth noting, though: MLPClassifier applies the chosen activation ('relu' here) to every hidden layer and appends an output layer automatically, while a Keras Dense layer defaults to a linear activation and you have to add the output layer yourself.
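So a closer Keras equivalent of the MLPClassifier above would look like this sketch (assuming a binary target, which makes sklearn's automatically added output layer a single sigmoid unit):
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

mlp = Sequential()
mlp.add(Dense(4, activation='relu'))     # hidden layers matching hidden_layer_sizes=(4, 3, 2, 1)
mlp.add(Dense(3, activation='relu'))
mlp.add(Dense(2, activation='relu'))
mlp.add(Dense(1, activation='relu'))
mlp.add(Dense(1, activation='sigmoid'))  # MLPClassifier appends this output layer itself
mlp.compile(optimizer='adam', loss='binary_crossentropy')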

Related

Input layer 0 of sequence is incompatible with the layer - CNNs

I am trying to create a CNN model using hyperparameterization for image classification. When I run the code I receive the following error:
ValueError: Input 0 of layer "sequential" is incompatible with the layer: expected shape=(None, 32, 32, 32, 3), found shape=(32, 32, 32, 3)
How to fix the error? Here is the whole code pasted below:
# first we create our actual code which requires the arguments, units, activation, dropout, lr:
def build_model(hp):
    model = ks.Sequential([
        # adding first conv2d layer
        ks.layers.Conv2D(
            # Let's tune the filters, kernel_size, activation function.
            filters=hp.Int("conv_1_filter", min_value=1, max_value=100, step=16),
            kernel_size=hp.Choice("conv_1_kernel", values=[3, 5]),
            activation=hp.Choice("conv_1_activation", ["relu", "tanh", "softmax"]),
            input_shape=(32, 32, 32, 3)
        ),
        # adding second conv2d layer
        ks.layers.Conv2D(
            # Let's tune the filters, kernel_size, activation function.
            filters=hp.Int("conv_2_filter", min_value=1, max_value=50, step=16),
            kernel_size=hp.Choice("conv_2_kernel", values=[3, 5]),
            activation=hp.Choice("conv_2_activation", ["relu", "tanh", "softmax"]),
            input_shape=(32, 32, 32, 3)
        )])
    model.add(layers.Flatten())
    # Let's tune the number of Dense layers.
    for i in range(hp.Int("num_dense_layers", 1, 3)):
        model.add(
            layers.Dense(
                # Let's tune the number of units separately
                units=hp.Int(f"units_{i}", min_value=1, max_value=100, step=16),
                activation=hp.Choice("activation", ["relu", "tanh", "softmax"])
            ))
    if hp.Boolean("dropout"):
        model.add(layers.Dropout(rate=0.25))
    model.add(layers.Dense(10, activation="softmax"))
    learning_rate = hp.Float("lr", min_value=1e-4, max_value=1e-2, sampling="log")
    model.compile(
        optimizer=ks.optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",
        metrics=["accuracy"]
    )
    return model
build_model(keras_tuner.HyperParameters())
You are getting this error due to an input shape mismatch. Conv2D's input_shape describes a single sample without the batch dimension, so input_shape=(32, 32, 32, 3) makes the model expect 5-D batches of shape (None, 32, 32, 32, 3), while your data provides 4-D batches of shape (32, 32, 32, 3).
Here I have implemented the hypermodel on the Fashion-MNIST dataset, which contains images of shape (28, 28, 1).
def build_model(hp):
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        # adding first conv2d layer
        tf.keras.layers.Conv2D(
            # Let's tune the filters, kernel_size, activation function.
            filters=hp.Int("conv_1_filter", min_value=1, max_value=100, step=16),
            kernel_size=hp.Choice("conv_1_kernel", values=[3, 5]),
            activation=hp.Choice("conv_1_activation", ["relu", "tanh", "softmax"])
        ),
        tf.keras.layers.MaxPooling2D(
            pool_size=hp.Choice('pooling_1', values=[2, 3])),
        # adding second conv2d layer
        tf.keras.layers.Conv2D(
            # Let's tune the filters, kernel_size, activation function.
            filters=hp.Int("conv_2_filter", min_value=1, max_value=50, step=16),
            kernel_size=hp.Choice("conv_2_kernel", values=[3, 5]),
            activation=hp.Choice("conv_2_activation", ["relu", "tanh", "softmax"])
        ),
        tf.keras.layers.MaxPooling2D(
            pool_size=hp.Choice('pooling_2', values=[2, 3]))
    ])
    model.add(tf.keras.layers.Flatten())
    if hp.Boolean("dropout"):
        model.add(tf.keras.layers.Dropout(rate=0.25))
    model.add(tf.keras.layers.Dense(10, activation="softmax"))
    learning_rate = hp.Float("lr", min_value=1e-4, max_value=1e-2, sampling="log")
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",
        metrics=["accuracy"]
    )
    return model
Once you provide the correct input shape, you will not get this error.
For more details, please refer to this gist and this documentation. Thank you!
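If your original dataset really consists of 32×32 RGB images (CIFAR-10-style; an assumption, since the question does not say), the minimal fix to the code above is to drop the extra dimension from input_shape, because input_shape describes one sample without the batch dimension:
ks.layers.Conv2D(
    filters=hp.Int("conv_1_filter", min_value=1, max_value=100, step=16),
    kernel_size=hp.Choice("conv_1_kernel", values=[3, 5]),
    activation=hp.Choice("conv_1_activation", ["relu", "tanh", "softmax"]),
    input_shape=(32, 32, 3)  # was (32, 32, 32, 3): the batch dimension must not be included
),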

Keras accuracy not increasing

I am trying to perform sentiment classification using Keras. I am trying to do this using a basic neural network (no RNN or other more complex type). However when I run the script I see no increase in accuracy during training/evaluation. I am guessing I am setting up the output layer incorrectly but I am not sure of that. y_train is a list [1,2,3,1,2,4,5] (5 different labels) containing the targets belonging to the features in X_train_seq_padded. The setup is as follows:
padding_len = 24 # len of each tokenized sentence
neurons = 16 # 2/3 the length of the text that is padded
model = Sequential()
model.add(Dense(neurons, input_dim = padding_len, activation = 'relu', name = 'hidden-1'))
model.add(Dense(neurons, activation = 'relu', name = 'hidden-2'))
model.add(Dense(neurons, activation = 'relu', name = 'hidden-3'))
model.add(Dense(1, activation = 'sigmoid', name = 'output_layer'))
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics=['accuracy'])
callbacks = [EarlyStopping(monitor = 'accuracy', patience = 5, mode = 'max')]
history = model.fit(X_train_seq_padded, y_train, epochs = 100, batch_size = 64, callbacks = callbacks)
First of all, in your setup above you chose sigmoid as the last-layer activation function, which is generally used for binary classification or multi-label classification; in that case the loss function should be binary_crossentropy.
If your labels are multi-class and one-hot encoded, then your last layer should be Dense(num_classes, activation='softmax') and the loss function should be categorical_crossentropy.
But if you leave your multi-class labels as integers (as your y_train is), then your last layer and loss function should be
Dense(num_classes) # with logits
SparseCategoricalCrossentropy(from_logits=True)
Or, (#Frightera)
Dense(num_classes, activation='softmax') # with probabilities
SparseCategoricalCrossentropy(from_logits=False)
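Applied to the model above, a minimal sketch (it assumes the integer labels are shifted to run from 0 to 4, since SparseCategoricalCrossentropy expects 0-indexed classes; num_classes = 5 matches the five labels in y_train):
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.losses import SparseCategoricalCrossentropy

num_classes = 5
model = Sequential()
model.add(Dense(neurons, input_dim=padding_len, activation='relu', name='hidden-1'))
model.add(Dense(neurons, activation='relu', name='hidden-2'))
model.add(Dense(neurons, activation='relu', name='hidden-3'))
model.add(Dense(num_classes, activation='softmax', name='output_layer'))  # one unit per class
model.compile(optimizer='adam',
              loss=SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])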

What does recurrent_initializer do?

I am experimenting with recurrent neural network layers in TensorFlow & Keras and I am having a look at the recurrent_initializer. I wanted to know more about its influence on the layer, so I created a SimpleRNN layer as follows:
rnn_layer = keras.layers.SimpleRNN(1, return_sequences=True, kernel_initializer=keras.initializers.ones, recurrent_initializer=keras.initializers.zeros, activation="linear")
Running this code makes the addition in the recurrent net visible:
inp = np.zeros(shape=(1, 1, 20), dtype=np.float32)
for i in range(20):
    inp[0][0][:i] = 5
    # inp[0][0][i:] = 0
    print(f"i:{i} {rnn_layer(inp)}")
output:
i:0 [[[0.]]]
i:1 [[[5.]]]
i:2 [[[10.]]]
i:3 [[[15.]]]
i:4 [[[20.]]]
i:5 [[[25.]]]
i:6 [[[30.]]]
i:7 [[[35.]]]
i:8 [[[40.]]]
i:9 [[[45.]]]
i:10 [[[50.]]]
i:11 [[[55.]]]
i:12 [[[60.]]]
i:13 [[[65.]]]
i:14 [[[70.]]]
i:15 [[[75.]]]
i:16 [[[80.]]]
i:17 [[[85.]]]
i:18 [[[90.]]]
i:19 [[[95.]]]
Now I change the recurrent_initializer to something different, like a glorot_normal distribution:
rnn_layer = keras.layers.SimpleRNN(1, return_sequences=True, kernel_initializer=keras.initializers.ones, recurrent_initializer=keras.initializers.glorot_normal(seed=0), activation="linear")
But I still get the same results. I thought it might depend on some logic which an RNN is missing but an LSTM has, so I tried it with an LSTM, but I still got the same results. I guess there is something about the recurrent logic that I'm still missing. Can someone explain to me what the recurrent_initializer's purpose is and how it affects the recurrent layer?
Thanks a lot!
Your input to the RNN layer has shape (1, 1, 20), which means one timestep per batch. The default behavior of an RNN is to RESET its state between batches, so you can't see the effect of the recurrent ops (the weights set by recurrent_initializer). Concretely, a SimpleRNN step computes h_t = activation(W·x_t + U·h_{t-1} + b); with a single timestep, h_{t-1} is always the zero initial state, so the recurrent kernel U never contributes.
You have to increase the sequence length of your input:
inp = np.ones(shape=(5, 4, 1), dtype=np.float32)  # sequence length == 4
rnn_layer1 = tf.keras.layers.LSTM(1, return_state=True, return_sequences=False,
                                  kernel_initializer=tf.keras.initializers.ones,
                                  recurrent_initializer=tf.keras.initializers.zeros,
                                  activation="linear")
rnn_layer2 = tf.keras.layers.LSTM(1, return_state=True, return_sequences=False,
                                  kernel_initializer=tf.keras.initializers.ones,
                                  recurrent_initializer=tf.keras.initializers.glorot_normal(seed=0),
                                  activation="linear")
first_sample = inp[0:1, :, :]  # shape (1, 4, 1)
print(rnn_layer1(first_sample))
print(rnn_layer2(first_sample))
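With four timesteps, the hidden state from each step now feeds into the next one through the recurrent kernel, so rnn_layer1 and rnn_layer2 finally produce different outputs for the same input.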

How to initialize CNN Layer with Gammatone Filters (or any filter) for sound regression (Or Classification)?

For my project I need to initialize the first CNN layer's kernel with Gammatone filters according to these papers ( https://www.mdpi.com/1099-4300/20/12/990/htm ), ( https://www.groundai.com/project/end-to-end-environmental-sound-classification-using-a-1d-convolutional-neural-network/1 ) and a few others. What does it exactly mean to initialize the CNN kernel with a Gammatone filter (or any filter)? How does one implement it? Is it a custom layer? Any tips and guidance would be much appreciated!
for instance
conv_1 = Conv1D(filters = 64, kernel_size = 3, kernel_initializer = *insert Gammatone Filter*, padding = 'same', activation='relu', input_shape = (timesteps, features))(decoder_outputs3)
TIA
You could use TensorFlow's constant initializer:
gammatone_filter_kernel = np.array([...])
init_kernel = tf.constant_initializer(gammatone_filter_kernel)
# ...
conv_1 = Conv1D(filters = 64, kernel_size = 3, kernel_initializer = init_kernel, padding = 'same', activation='relu', input_shape = (timesteps, features))(decoder_outputs3)
# ...
If your filter is some kind of preprocessing step for your signal, you can set the trainable attribute of the conv layer to False and the weights will stay fixed.
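For example, a minimal sketch (the filter values below are placeholders; a real gammatone bank would be computed from the papers' formulas, and the array must match Conv1D's kernel shape (kernel_size, input_channels, filters)):
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Conv1D

kernel_size, in_channels, n_filters = 3, 1, 64
gammatone_filter_kernel = np.zeros((kernel_size, in_channels, n_filters))  # placeholder, not a real gammatone bank
init_kernel = tf.constant_initializer(gammatone_filter_kernel)

conv_1 = Conv1D(filters=n_filters, kernel_size=kernel_size,
                kernel_initializer=init_kernel, padding='same',
                activation='relu', trainable=False)  # trainable=False keeps the filters fixed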

How can I improve the accuracy of this keras Neural Network?

I'm working with the Heart Disease dataset from the UCI Machine Learning Repository and I want to improve on my NN's accuracy of 0.8533.
I've tried many things, and I got the best results with these settings:
classifier = Sequential()
classifier.add(Dense(units=16, activation='relu',
                     kernel_initializer='normal', input_dim=13))
classifier.add(Dropout(0.2))
classifier.add(Dense(units=8, activation='relu',
                     kernel_initializer='normal'))
classifier.add(Dropout(0.2))
classifier.add(Dense(units=1, activation='sigmoid'))
classifier.compile(optimizer='adam', loss='binary_crossentropy',
                   metrics=['accuracy'])
classifier.fit(x=attributes, y=classes, batch_size=1, epochs=1000,
               validation_split=0.25)
I've changed the number of nodes to 10 and 5 respectively, changed the optimizer to rmsprop and sgd, and changed the kernel_initializer to 'normal' and 'random_uniform'. Even so, the accuracy hasn't improved.
What tips could you guys give me to make the accuracy higher?
If you are not overfitting, try reducing the Dropout rate. Have you studied the dataset? Maybe there are unnecessary features and you can reduce dimensionality, with PCA for example. Try using K-Fold cross-validation on your dataset to get a more realistic accuracy estimate. Have you tried other classifiers? Neural networks are very cool, but sometimes SVM, logistic regression, random forest or XGBoost work pretty decently and are easier to tune with a grid search in sklearn. Good luck!
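For instance, a minimal K-Fold sketch (assuming attributes and classes are the arrays from the question; build_classifier is a hypothetical helper that rebuilds the Sequential model above, and the fold count and epochs are arbitrary choices):
import numpy as np
from sklearn.model_selection import StratifiedKFold

scores = []
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for train_idx, test_idx in kfold.split(attributes, classes):
    model = build_classifier()  # hypothetical helper rebuilding the model above
    model.fit(attributes[train_idx], classes[train_idx],
              batch_size=16, epochs=100, verbose=0)
    _, acc = model.evaluate(attributes[test_idx], classes[test_idx], verbose=0)
    scores.append(acc)
print(f"mean accuracy: {np.mean(scores):.4f} +/- {np.std(scores):.4f}")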