machine learning/ deep learning model python - tensorflow

I have made a deep learning model for predicting the real estate prices based on different parameters such as: area, number of rooms, floor, number of floors, age of the building, type of building, etc.
My dataset is scrapped, data is prepared to analysis (discrete columns as dummies itp). My model is shown in the following code. The problem is that the model's accuracy is always 0.
Here is my code:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=6, activation='relu'))
model.add(tf.keras.layers.Dense(units=6, activation='relu'))
model.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
model.fit(X_train, y_train, batch_size = 32, epochs = 100)
accuracy_score(y_test, y_pred)
I couldn't find tutorial for similar case.

Predicting real estate prices needs a regression deep learninng model.
so the problem in that code is
1- in the output layer you're using 'sigmoid' function, change it to 'linear'
2- change the loss to mean absolute error or mean squarred error
3- change the metrics too to mae or mse
4-probably add more epochs, a neural network regression probably needs way more epochs to learn
Try those changes they should fix the problems

Related

LSTM training difficulties

I wanted to train LSTM model for tabular time series data. My data shape is
((7342689, 50, 5), (7342689,))
I was having a hard time to handle the training loss. Initially I tried with default learning rate , but it didn't help. My class label is severely skewed. I have added focal loss and class weights to handle class imbalance issues. I have tried with adding one more layer with 50 neurons, but that loss started to increase instead of decrease. I appreciate your suggestions. Thanks!
Here is my current model architecture:
adam = Adam(learning_rate=0.0001)
model = keras.Sequential()
model.add(LSTM(100, input_shape = (50, 5)))
model.add(Dropout(0.5))
model.add(Dense(1, activation="sigmoid"))
model.compile(loss=tfa.losses.SigmoidFocalCrossEntropy()
, metrics=[keras.metrics.binary_accuracy]
, optimizer=adam)
model.summary()
class_weights = dict(zip(np.unique(y_train), class_weight.compute_class_weight('balanced', classes=np.unique(y_train),y=y_train)))
history=model.fit(X_train, y_train, batch_size=64, epochs=50,class_weight=class_weights)
The loss of the model first decreased and then increased, which may be because the optimization process got stuck in a local optimal solution. Maybe you can try reducing the learning rate and increasing the epoch.

Can this be considered overfitting?

I have 4 classes, each with 1350 images. The validation set has 20% of the total images (it is generated automatically). The training model uses MobilenetV2 network:
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE, include_top=False, weights='imagenet')
The model is created:
model = tf.keras.Sequential([
base_model,
tf.keras.layers.Conv2D(32, 3, activation='relu', kernel_regularizer=regularizers.l2(0.001)),
tf.keras.layers.Dropout(0.5),
tf.keras.layers.MaxPool2D(),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(4, activation='softmax', kernel_regularizer=regularizers.l2(0.001))
])
The model is trained through 20 epochs and then fine tunning is done in 15 epochs. The result is as follows:
Image of the model trained before fine tunning
Image of the model trained after 15 epochs and fine tunning
A bit difficult to tell without the numeric values of validation loss but I would say the results before fine tuning are slightly over fitting and for after fine tuning less over fitting. A couple of additional things you could do. One is try using an adjustable learning rate using the callback tf.keras.callbacks.ReduceLROnPlateau. Set it up to monitor validation loss. Documentation is here. I set factor=.5 and patience=1. Second replace the Flatten layer with tf.keras.layers.GlobalMaxPool2D and see if it improves the validation loss.

Keras model not learning and predicting only one class out of three classes

New to the field of deep learning and currently working on this competition for predicting the earthquake damage to buildings.
The model I created starts at an accuracy of .56 but remains at this for any number of epochs i let it run. When finished, the model only predicts one of the three classes (which I one hot encoded into a dataframe with three columns). Changing the number of layers, optimizers, data preparation, dropout wont change anything. Even trying to overfit my model with the over-parameterization of the neural network will still have the same accuracy and a non-learning model.
What am I doing wrong?
This is my code:
model = keras.models.Sequential()
model.add(keras.layers.Dense(64, input_dim = 85, activation = "relu"))
keras.layers.Dropout(0.3)
model.add(keras.layers.Dense(128, activation = "relu"))
keras.layers.Dropout(0.3)
model.add(keras.layers.Dense(256, activation = "relu"))
keras.layers.Dropout(0.3)
model.add(keras.layers.Dense(512, activation = "relu"))
model.add(keras.layers.Dense(3, activation = "softmax"))
adam = keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer = adam,
loss='categorical_crossentropy',
metrics = ['accuracy'])
history = model.fit(traindata, trainlabels,
epochs = 5,
validation_split = 0.2,
verbose = 1,)
There's nothing visually wrong with your model, but it may be too haevy to learn any useful features.
Try normalizing your input with https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html
Start with only 2 layers, and a few numbers of neurons.
Increase batch_size and try learning_rate scheduling.
Observe the validation_accuracy, stop when it starts to overfit.
Finally, for a 3-class classification, 56% accuracy is better than baseline, remmeber it's a competition so the data is not dummy playground data which you can expect to get a 90% accuracy with an MLP in the first try.
Finally, try hyperparameter optimization with tuner.

How to calculate confidence score of a Neural Network prediction

I am using a deep neural network model (implemented in keras)to make predictions. Something like this:
def make_model():
model = Sequential()
model.add(Conv2D(20,(5,5), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(20, activation = "relu"))
model.add(Lambda(lambda x: tf.expand_dims(x, axis=1)))
model.add(SimpleRNN(50, activation="relu"))
model.add(Dense(1, activation="sigmoid"))
model.compile(loss = "binary_crossentropy", optimizer = adagrad, metrics = ["accuracy"])
return model
model = make_model()
model.fit(x_train, y_train, validation_data = (x_validation,y_validation), epochs = 25, batch_size = 25, verbose = 1)
##Prediciton:
prediction = model.predict_classes(x)
probabilities = model.predict_proba(x) #I assume these are the probabilities of class being predictied
My problem is a classification(binary) problem. I wish to calculate the confidence score of each of these prediction i.e. I wish to know - Is my model 99% certain it is "0" or is it 58% it is "0".
I have found some views on how to do it, but can't implement them. The approach I wish to follow says: "With classifiers, when you output you can interpret values as the probability of belonging to each specific class. You can use their distribution as a rough measure of how confident you are that an observation belongs to that class."
How should I predict with something like above model so that I get its confidence about each predictions? I would appreciate some practical examples (preferably in Keras).
The softmax is a problematic way to estimate a confidence of the model`s prediction.
There are a few recent papers about this topic.
You can look for "calibration" of neural networks in order to find relevant papers.
This is one example you can start with - https://arxiv.org/pdf/1706.04599.pdf
In Keras, there is a method called predict() that is available for both Sequential and Functional models. It will work fine in your case if you are using binary_crossentropy as your loss function and a final Dense layer with a sigmoid activation function.
Here is how to call it with one test data instance. Below, mymodel.predict() will return an array of two probabilities adding up to 1.0. These values are the confidence scores that you mentioned. You can further use np.where() as shown below to determine which of the two probabilities (the one over 50%) will be the final class.
yhat_probabilities = mymodel.predict(mytestdata, batch_size=1)
yhat_classes = np.where(yhat_probabilities > 0.5, 1, 0).squeeze().item()
I've come to understand that the probabilities that are output by logistic regression can be interpreted as confidence.
Here are some links to help you come to your own conclusion.
https://machinelearningmastery.com/how-to-score-probability-predictions-in-python/
how to assess the confidence score of a prediction with scikit-learn
https://stats.stackexchange.com/questions/34823/can-logistic-regressions-predicted-probability-be-interpreted-as-the-confidence
https://kiwidamien.github.io/are-you-sure-thats-a-probability.html
Feel free to upvote my answer if you find it useful.
How about to use a softmax as the activation in the last layer? Let's say something like this:
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer = adagrad, metrics = ["accuracy"])
In this way, for each data point, you will be given a probabilistic-ish result by the model, which tells what is the likelihood that your data point belongs to each of two classes.
For example for a given X, if the model returns (0.3,0.7), you will know it is more likely that X belongs to class 1 than class 0. and you know that the likelihood has been estimated to be 0.7 over 0.3.

RNN Not Generalizing on Text Classification

I am using keras and RNN to classify slack text data on whether the text is reaction worthy or not (1 - emoji, 0 - no emoji). I have removed usernames and urls from the text as well as dropped duplicates with different target variables.
I am not able to get the model to generalize to unseen data. The loss of the train/val sets look good and continually decrease but the accuracy of the val set only decreases.
I am using a pretrained GLOVE word embedding since my training size is only about 25,000 sentences.
I have added additional layers, changed my regularization value and increased dropout but get similar results. Is my model not complex enough to generalize the data? The times i added additional layers they were much smaller but deeper because the training time was about 2 min per epoch.
Any insight would be appreciated.
embedding_layer = Embedding(len(word_index) + 1,
100,
weights=[embeddings_matrix],
input_length=max_message_length,
embeddings_regularizer=l2(0.001),
trainable=True)
# Creating the Model
model = Sequential()
model.add(embedding_layer)
model.add(Convolution1D(filters=32, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Dropout(0.7))
model.add(layers.GRU(128))
model.add(Dropout(0.7))
model.add(Dense(1, activation='sigmoid'))
# Compiling the model with our given Optimizer
optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.000025)
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
print(model.summary())