So I have this assignment to train a very simple neural network. Our dataset has 6 features that are fed into the network, and we are required to train it and then predict one output number. The professor gave us the code and basically told us to learn by ourselves lol. So my doubt is: in the following code, where the layers of the neural network are defined, does the first Dense layer (the one with 50 nodes) correspond to the input layer, or is it the first hidden layer?
If it's the first hidden layer, how are input layers defined?
Thanks in advance!
def get_compiled_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(50, activation='relu', input_shape=(6,)),
        tf.keras.layers.Dense(30, activation='relu'),
        tf.keras.layers.Dense(30, activation='relu'),
        tf.keras.layers.Dense(1, activation='linear'),
    ])
    return model
The first Dense layer is the first hidden layer. Keras automatically provides an input layer in Sequential models, and its size is defined by the input_shape or input_dim argument of the first layer.
You can also explicitly state the input layer as follows:
def get_compiled_model():
    model = tf.keras.Sequential([
        tf.keras.layers.InputLayer((6,)),
        tf.keras.layers.Dense(50, activation='relu'),
        tf.keras.layers.Dense(30, activation='relu'),
        tf.keras.layers.Dense(30, activation='relu'),
        tf.keras.layers.Dense(1, activation='linear'),
    ])
    return model
It is the first hidden layer. The input layer isn't defined as a separate layer; it simply consists of the input data, and its size is defined by input_shape=(6,).
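You can verify this by building the model and printing its summary; only the Dense layers show up as layers, and the input appears only as the first layer's input shape (a minimal sketch, assuming TensorFlow 2.x):
model = get_compiled_model()
model.summary()           # lists the four Dense layers, no separate input layer
print(model.input_shape)  # (None, 6) -- None is the batch dimension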
I have implemented a CNN with an LSTM layer. My input consists of four images. The images were transformed into a tensor by feature extraction. The input shape is (4,256,256,3).
The following is the structure of my model:
from tensorflow import keras
from tensorflow.keras.layers import (Conv2D, Dense, Dropout, Flatten,
                                     LSTM, MaxPooling2D, TimeDistributed)

model = keras.models.Sequential()
model.add(TimeDistributed(Conv2D(32, (3, 3), padding='same', activation='relu'),
                          input_shape=(4, 256, 256, 3)))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
model.add(TimeDistributed(Dropout(0.25)))
model.add(TimeDistributed(Conv2D(64, (3, 3), padding='same', activation='relu')))
model.add(TimeDistributed(MaxPooling2D((4, 4))))
model.add(TimeDistributed(Dropout(0.25)))
model.add(TimeDistributed(Conv2D(128, (3, 3), padding='same', activation='relu')))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
model.add(TimeDistributed(Dropout(0.25)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(128, activation='tanh'))
# finalize with standard Dense, Dropout...
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='relu'))
optim = keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=optim, loss='mse')
history = model.fit(x=X, y=Y, batch_size=4, epochs=5,
                    validation_split=0.2, validation_data=(X, Y))
My problem is that my model predicts the same values for all inputs.
What could be the problem?
You use the same data for training and validation: validation_data=(X, Y) overrides validation_split, so the model is validated on the very data it was trained on, which defeats the whole point of validation. Perhaps the mistake lies in this. Try to split the data, or apply cross-validation.
Also, applying a ReLU activation to the last layer in combination with the MSE loss looks strange: ReLU clips every negative prediction to zero, while a real-valued target can be unbounded, so a linear output is the usual choice. The data should also be normalized.
I hope this will help you.
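For example (a minimal sketch; X, Y, and the model are the ones from the question, and the 80/20 split is an arbitrary choice):
from sklearn.model_selection import train_test_split

# Use a linear output so the regression target is not clipped at zero
# (this replaces model.add(Dense(1, activation='relu')) above):
model.add(Dense(1, activation='linear'))

# Hold out a real validation set instead of validating on the training data.
X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size=0.2, random_state=42)

optim = keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=optim, loss='mse')
history = model.fit(X_train, Y_train, batch_size=4, epochs=5,
                    validation_data=(X_val, Y_val))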
If you are working on a classification problem, specifically binary classification, then use a sigmoid activation instead of softmax, and note that MSE is not a good loss choice for binary classification; binary cross-entropy is the standard pairing.
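If this were binary classification, the head would look roughly like this (a minimal sketch; the rest of the model is unchanged):
model.add(Dense(1, activation='sigmoid'))  # one unit, sigmoid for binary labels
model.compile(optimizer='adam',
              loss='binary_crossentropy',  # not MSE
              metrics=['accuracy'])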
Given a pre-trained well-performing auto-encoder. When I train a classifier on encodings (produced by the auto-encoder) the classifier does very poorly. In particular, it does much worse than training a classifier on normal inputs (i.e. unencoded inputs).
However, when I fine-tune the encoder based on classification loss, the classifier does quite well.
Why are encoded representations bad for classification?
Details: I’m working on CIFAR-100 and trying to classify coarse image labels, i.e. 20 classes (but I think I had the same problem when doing classification on CIFAR-10). The classifier has 5 layers and I’m using dropout:
classifier = tf.keras.Sequential([
tf.keras.layers.Dense(512,
activation='relu',
name='classifier_hidden_1'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(256,
activation='relu',
name='classifier_hidden_2'),
tf.keras.layers.Dense(128,
activation='relu',
name='classifier_hidden_3'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(64,
activation='relu',
name='classifier_hidden_4'),
tf.keras.layers.Dense(num_classes,
activation=None,
name='classifier_out'),
], name='classifier')
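For context, the two setups being compared look roughly like this (a minimal sketch; encoder is an assumed name for the pre-trained encoder half of the auto-encoder, and in practice the classifier should be rebuilt between the two runs so weights aren't shared):
import tensorflow as tf

def make_model(encoder, classifier, trainable_encoder):
    # trainable=False freezes the encoder, so the classifier learns on
    # fixed encodings; trainable=True fine-tunes the encoder as well.
    encoder.trainable = trainable_encoder
    model = tf.keras.Sequential([encoder, classifier])
    model.compile(
        optimizer='adam',
        # classifier_out has no activation, so train from logits.
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=['accuracy'])
    return model

frozen = make_model(encoder, classifier, trainable_encoder=False)
finetuned = make_model(encoder, classifier, trainable_encoder=True)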
I am just getting into Keras and TensorFlow.
I'm having a lot of problems adding an input normalization layer to a sequential model.
Right now my model is:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(256, input_shape=(13,), activation='relu'))
model.add(tf.keras.layers.LayerNormalization(axis=-1, center=True, scale=True))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(1))
model.summary()
My doubts are whether I should first call an adapt function, and how to use it in the sequential model.
Thanks to all!!
I'm trying to figure this out as well. According to this example, adapt is not necessary: LayerNormalization computes its statistics from each sample at run time, so there is nothing to adapt beforehand.
model = tf.keras.models.Sequential([
# Reshape into "channels last" setup.
tf.keras.layers.Reshape((28,28,1), input_shape=(28,28)),
tf.keras.layers.Conv2D(filters=10, kernel_size=(3,3),data_format="channels_last"),
# LayerNorm Layer
tf.keras.layers.LayerNormalization(axis=3 , center=True , scale=True),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
model.fit(x_test, y_test)
Also, make sure LayerNormalization is really what you want: it normalizes each individual sample across its own features, independently of the rest of the dataset. If the goal is to normalize inputs using statistics computed over the training data, batch normalization or the preprocessing Normalization layer (which is the one that actually uses adapt) may be more appropriate.
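A minimal sketch of the latter, assuming x_train holds the 13-feature training data (in older TF versions the layer lives under tf.keras.layers.experimental.preprocessing):
import tensorflow as tf

# Normalization learns per-feature mean/variance from the data via adapt().
norm = tf.keras.layers.Normalization(axis=-1)
norm.adapt(x_train)  # compute the statistics once, before training

model = tf.keras.models.Sequential([
    norm,
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1),
])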
I am having trouble understanding how to build a TensorFlow model and preprocess a pandas DataFrame.
I've been following this documentation:
https://www.tensorflow.org/tutorials/load_data/pandas_dataframe
First question:
dataset = tf.data.Dataset.from_tensor_slices((df.values, target.values))
train_dataset = dataset.shuffle(len(df)).batch(1)
def get_compiled_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(10, activation='relu'),
        tf.keras.layers.Dense(10, activation='relu'),
        tf.keras.layers.Dense(1)
    ])
    return model
Since the heart disease DataFrame has 13 features, why does the documentation initiate the first Dense layer with only 10 units? Why doesn't it look something like this?
def get_compiled_model():
    model = tf.keras.Sequential([
        tf.keras.layers.InputLayer(input_shape=(13,)),
        tf.keras.layers.Dense(10, activation='relu'),
        tf.keras.layers.Dense(10, activation='relu'),
        tf.keras.layers.Dense(1)
    ])
    return model
In the docs it starts with a Dense layer of 10 units even though there are 13 features; I don't understand how this works.
Last question:
After I've trained the model
model = get_compiled_model()
model.fit(train_dataset, epochs=15)
How do I make predictions on a single instance or multiple instances of the training data using model.predict()?
Do I have to convert the instance to a tensor first before passing it into the model.predict method?
i.e.
model.predict(tensor(instance))
Thanks!
There are different ways to build a model:
1. call model.build() with an input shape;
2. call model.fit() with some data;
3. specify an input_shape argument in the first layer(s) for an automatic build.
Even if you don't specify an input layer, the model will infer the input size from the data the first time you fit it; a short sketch follows below.
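For instance (a minimal sketch of the three options, assuming the 13-feature heart disease data):
model = get_compiled_model()

# Option 1: build explicitly; None stands for the (unknown) batch size.
model.build(input_shape=(None, 13))

# Option 2: just fit; the first batch of train_dataset fixes the input size.
# model.fit(train_dataset, epochs=15)

# Option 3: declare the shape in the first layer when defining the model:
# tf.keras.layers.Dense(10, activation='relu', input_shape=(13,))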
model.predict() should be given input shaped like the training data. It can be a single instance or multiple instances (batches), and a NumPy array is accepted directly, so you don't need to convert it to a tensor first.
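For example (a minimal sketch, assuming 13 features; the values are random placeholders):
import numpy as np

# A single instance still needs a batch dimension: shape (1, 13).
instance = np.random.rand(1, 13).astype(np.float32)
print(model.predict(instance))

# Multiple instances at once: shape (n, 13).
batch = np.random.rand(8, 13).astype(np.float32)
print(model.predict(batch))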
I have trained a recurrent neural network (LSTM) in Keras, but now I am struggling to put all the pieces together. Specifically, I cannot understand how to recompose the weight matrices.
I have one input, one hidden and one output layer, as follows:
# create the model
model = Sequential()
model.add(LSTM(100, dropout=0.5, recurrent_dropout=0.5, input_shape=(timesteps, data_dim), activation='tanh'))
model.add(Dense(5, activation='softmax'))
When I call model.get_weights() I only get the bias units for the hidden and output layers; the ones for the input layer appear to be missing. Namely, I have a 15x400 matrix, then a 400x101 and a 101x5.
Is there something that I am missing out here?
Thanks for your help
Sequential is a model in Keras, not an input layer.
An input layer in a neural network simply passes the inputs on to the hidden layer, and it does not need a bias neuron.
In your case, model.get_weights() returns these arrays:
(15, 400) - the LSTM input kernel: 15 input features, and 400 columns because the 4 LSTM gates each have 100 units
(100, 400) - the LSTM recurrent kernel (hidden-to-hidden weights)
(400,) - the bias array for the LSTM layer
(100, 5) - the kernel of the Dense layer
(5,) - the bias array for the Dense layer
So no biases are missing; an input layer simply has no weights of its own.
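To see which array is which, you can print each weight's name next to its shape (a minimal sketch using the model from the question; exact names vary across Keras versions):
for w in model.weights:
    print(w.name, w.shape)
# e.g. lstm/kernel           (15, 400)   input-to-gate weights
#      lstm/recurrent_kernel (100, 400)  hidden-to-gate weights
#      lstm/bias             (400,)
#      dense/kernel          (100, 5)
#      dense/bias            (5,)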