Constrained Optimization Tensorflow - tensorflow

I have a trained classifier neural network in Keras. Let the neural network be f(x). I want to find the vectors x such that when ||x||^2 = 1, f(x) is maximized. I currently have trained my neural network with Keras
model = Sequential()
model.add(Dense(500, activation='sigmoid'))
model.add(Dense(500, activation='sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy', auc])
model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=2, verbose = 1, callbacks=[earlyStopping])
I want to know if there is a way to solve this constrained optimization problem once my Neural network has already been trained. There is a scipy optimize which can do this for general functions. Is there a way to do this for a neural network. Please include a code sample.

If I understand you correctly you have finished training your neural network and would like to find the input x which maximizes the probability of it being in a certain class (where the output is close to 1.0)
You could write a small function to assess the performance of your network using the predict_proba() method to get classification probabilities on test data, and then optimise this function using scipy:
model = Sequential()
model.add(Dense(500, activation='sigmoid'))
model.add(Dense(500, activation='sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy', auc])
model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=2, verbose = 1, callbacks=[earlyStopping])
def f(x):
prediction = model.predict_proba(x)
return -prediction
a = scipy.optimize.minimize(f, x0=np.random.randn(500))
optimal_x = a.x
optimal_x will be the input x which maximises the certainty with which your classifier puts it in one specific class.

Related

How to use LayerNormalization layer in a Keras sequential Model?

I am just getting into Keras and Tensor flow.
Im having a lot of problems adding an input normalization layer in a sequential model.
Now my model is ;
model = tf.keras.models.Sequential()
model.add(keras.layers.Dense(256, input_shape=(13, ), activation='relu'))
model.add(tf.keras.layers.LayerNormalization(axis=-1 , center=True , scale=True))
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(64, activation='relu'))
model.add(keras.layers.Dense(64, activation='relu'))
model.add(keras.layers.Dense(1))
model.summary()
My doubts are whether I should first perform an adapt function and how to use it in the sequential model.
Thanks to all!!
I'm trying to figure this out as well. According to this example, adapt is not necessary.
model = tf.keras.models.Sequential([
# Reshape into "channels last" setup.
tf.keras.layers.Reshape((28,28,1), input_shape=(28,28)),
tf.keras.layers.Conv2D(filters=10, kernel_size=(3,3),data_format="channels_last"),
# LayerNorm Layer
tf.keras.layers.LayerNormalization(axis=3 , center=True , scale=True),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
model.fit(x_test, y_test)
Also, make sure you want a LayerNormalization. If I understand correctly, that normalizes every input on its own. Batch normalization may be more appropriate. See this for more info.

Keras binary classification model's AUC score doesn't increase

I have a imbalanced dataset which has 57000 zeros and 2500 ones. I gave class weights as an input to my model, tried to change optimizers, tried to resize number of layers and neurons. Finally I stick to ;
because it was the only one that seems systematic, tried to change layer weight regularization rules but nothing helped me yet. I am not talking about just for my validation AUC score, even train success doesn't rise satisfyingly.
Here is how I declared my model, don't mind if you think the problem is layer and node sizes. I think I tried everything that sounds sensible.
class_weight = {0: 23.59,
1: 1.}
model=Sequential()
model.add(Dense(40, input_dim=x_train.shape[1], activation='relu'))
model.add(Dense(33, activation='relu',kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4),bias_regularizer=regularizers.l2(1e-4),activity_regularizer=regularizers.l2(1e-5)))
model.add(Dense(28, activation='relu'))
model.add(Dense(20, activation='relu'))
model.add(Dense(15, activation='relu'))
model.add(Dense(9, activation='relu'))
model.add(Dense(5, activation='relu'))
model.add(Dense(1, activation='sigmoid',kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4),bias_regularizer=regularizers.l2(1e-4),activity_regularizer=regularizers.l2(1e-5)))
opt = keras.optimizers.SGD(learning_rate=0.1)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['AUC'])
model.fit(x_train,y_train,epochs=600,verbose=1,validation_data=(x_test,y_test),class_weight=class_weight)
After approximate 100 epoch, it was stuck at 0.73-0.75 auc, doesn't rise anymore. I couldn't even overfit my model

How to specify number of layers in keras?

I'm trying to define a fully connected neural network in keras using tensorflow backend, I have a sample code but I dont know what it means.
model = Sequential()
model.add(Dense(10, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(50, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(20, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(10, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal'))
model.add(Dense(y.shape[1],activation='softmax'))
From the above code I want to know what is the number of inputs to my network, number of outputs, number of hidden layers and number of neurons in each layer. And what is the number coming after model.add(Dense ? assuming x.shape[1]=60.
What is the name of this network exacly? Should I call it a fully connected network or convolutional network?
That should be quite easy.
For knowing about the model's inputs and outputs use,
input_tensor = model.input
output_tensor = model.output
You can print these tf.Tensor objects to get the shape and dtype.
For fetching the Layers of a model use,
layers = model.layers
print( layers[0].units )
With these tricks you can easily get the input and output tensors for a model or its layer.

A neural network that can't overfit?

I am fitting a model to some noisy satellite data. The labels are measurements of rock on the bars of a river. There is a noisy but significant relationship. I only have 250 points but the method would expand and eventually run off much bigger datasets. I'm looking at a mix of models (RANSAC, Huber, SVM Regression) and DNNs. My DNN results seem too good to be true. The network looks like:
model = Sequential()
model.add(Dense(128, kernel_regularizer= regularizers.l2(0.001), input_dim=NetworkDims, kernel_initializer='he_normal', activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(128, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(32, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, kernel_initializer='normal'))
# Compile model
model.compile(loss='mean_squared_error', optimizer='adam')
return model
And when I save the history and plot training loss (green dots) and validation loss (cyan line) vs epoch I get this:
Training and validation loss just creep down. With a small dataset, I was expecting the validation loss to go its own way. In fact, if I run a 10-fold cross val score with this network, the error reported by cross val score does creep down. This just looks too good to be true. It implies that I could train this thing for 1000 epochs and still improve results. If it looks too good to be true, it usually is, but why?
EDIT: More results.
So I tried to cut dropout to 0.1 at each and remove the L2. Inteesting. With the toned-down drop-out, I get even better results:
10% dropout rate
Without the L2, there is overfitting:
No L2 reg
My guess would be that you have such a high dropout on every layer, which is why it's having trouble just overfitting on the training data. My prediction is that if you lower that dropout and regularization, it'll learn the training data much faster.
I'm not too sure if the results are too good to be true because it's hard to base how good a model is based on loss function. But it should be the dropout and regularization that is preventing it from overfitting in a few epochs.

Why is my Neural Net not learning

I have a CNN that I'm trying to train and I cant figure out why its not learning. It has 32 classes which are different types of clothes of about 1000 images in each folder.
Issue is this is the result at the end of training which takes about 9 hours on my GPU
loss: 3.3403 - acc: 0.0542 - val_loss: 3.3387 - val_acc: 0.0534
If anyone could give me directions on how to get this network to train better I would be grateful.
# dimensions of our images.
img_width, img_height = 228, 228
train_data_dir = 'Clothes/train'
validation_data_dir = 'Clothes/test'
nb_train_samples = 25061
nb_validation_samples = 8360
epochs = 20
batch_size = 64
if K.image_data_format() == 'channels_first':
input_shape = (3, img_width, img_height)
else:
input_shape = (img_width, img_height, 3)
model = Sequential()
model.add(Conv2D(filters=64, kernel_size=2, padding='same', activation='tanh', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.3))
model.add(Conv2D(filters=32, kernel_size=2, padding='same', activation='tanh'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.3))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(32, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
train_datagen = ImageDataGenerator(
rescale=1. / 255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
[test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='categorical',
shuffle = True)
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='categorical',
shuffle = True)
history = model.fit_generator(
train_generator,
steps_per_epoch=nb_train_samples // batch_size,
epochs=epochs,
validation_data=validation_generator,
validation_steps=nb_validation_samples // batch_size)
Here is the plot of the training & validation loss
A network may not converge/learn for several reasons, but here is a list of tips that I think is relevant in your case (based on my own experience):
Transfer Learning: The first thing you should know is that it's very hard to train an image classifier from scratch for most problems, you need much more computing power and time for that. I strongly recommend using transfer learning. There are multiple trained architectures available in Keras that you can use as initial weights fr your network (or other methods).
Training step: For the optimizer, I recommend to use Adam first and to vary the learning rate to see how the loss responds. Also, since you are using Convolutional Layers, you should consider adding Batch Normalization Layers, that can speed significantly the training time, and change the Convolutional activations to 'relu', which make them much faster to train.
You could also try decreasing the Dropout values but I don't think that's the main issue here. Also, If you are considering training your network from scratch,
you should start with fewer layers and add more gradually to get a better idea of ​​what's going on.
train/test split: I see that you are using 8360 observations in your test set. Given the size of your training set, I think it's too much. 1000 for example is enough. The more training samples you have, the more satisfying your results will be.
Also, before judging the accuracy of your model, you should start by establishing a baseline model to benchmark your model. The baseline model depends on your problem, but in general I choose a model that predicts the most common class in the dataset. You should also look at another metric 'top_k_accuracy' available in Keras that is interesting when you have a relatively high number of classes to predict. It helps you to see how close your model is to the right prediction.
First, in order to keep your sanity, check carefully for any bugs, and that your data is being sent in as intended
You might want to add a Top K accuracy metric to get a better idea of whether it's close to getting it, or totally wrong.
Here are some tuning things to try:
Change the kernel size to 3 and activation to relu
model.add(Conv2D(filters=64, kernel_size=3, padding='same', activation='relu'))
If you think your model is underfitting then try increasing the number of Conv layers per pooling to start with. But you could also increase the number of filters or the number of conv + pool repetitions.
Adam optimizer might learn a bit faster than RMS prop
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
Probably the biggest improvement would be to get more data. I think your data set is probably too small for the scope of the problem.
You might want to try transfer learning from a pre-trained image recognition network.