How to extract the hidden vector (the output of the ReLU after the third encoder layer) as the image representation - tensorflow

I am implementing an autoencoder using the Fashion MNIST dataset. The code for the autoencoder:
from tensorflow.keras import Model, Sequential, layers
class MNISTClassifier(Model):
    def __init__(self):
        super(MNISTClassifier, self).__init__()
        self.encoder = Sequential([
            layers.Dense(128, activation="relu"),
            layers.Dense(64, activation="relu"),
            layers.Dense(32, activation="relu")
        ])
        self.decoder = Sequential([
            layers.Dense(64, activation="relu"),
            layers.Dense(128, activation="relu"),
            layers.Dense(784, activation="relu")
        ])
    def call(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded
autoencoder = MNISTClassifier()
Now I want to train an SVM classifier on the image representations extracted from the above autoencoder.
Once this fully connected autoencoder is trained, I want to extract, for each image, the 32-dimensional hidden vector (the output of the ReLU after the third encoder layer) as the image representation, and then train a linear SVM classifier on the Fashion MNIST training images using these 32-dimensional features.
How do I extract the 32-dimensional hidden vector?
Thanks in advance!

I recommend using the Functional API to define a model with multiple outputs, because it gives clearer code. However, you can also do this with a Sequential model by taking the output of any layer you want and adding it to your model's outputs.
Print model.summary() and check the layers to find the one you want to branch from. You can access each layer's output by its index with model.layers[index].output.
Then you can create a multi-output model from the layers you want, like this:
third_layer = model.layers[2]
last_layer = model.layers[-1]
my_model = Model(inputs=model.input, outputs=(third_layer.output, last_layer.output))
Then you can access the outputs of both layers you have defined:
third_layer_predict, last_layer_predict = my_model.predict(X_test)
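For the model in the question specifically, the encoder is stored as its own Sequential attribute of a subclassed Model, so indexing `model.layers` and using `model.input` may not apply directly. A minimal sketch of an alternative, assuming random stand-in data instead of Fashion MNIST and a renamed class (`Autoencoder` is a hypothetical name), is to call the encoder sub-model on its own after training:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model, Sequential, layers


class Autoencoder(Model):
    """Same architecture as in the question (class renamed for illustration)."""

    def __init__(self):
        super().__init__()
        self.encoder = Sequential([
            layers.Dense(128, activation="relu"),
            layers.Dense(64, activation="relu"),
            layers.Dense(32, activation="relu"),
        ])
        self.decoder = Sequential([
            layers.Dense(64, activation="relu"),
            layers.Dense(128, activation="relu"),
            layers.Dense(784, activation="relu"),
        ])

    def call(self, x):
        return self.decoder(self.encoder(x))


# Assumption: random data shaped like flattened 28x28 images.
x_train = np.random.rand(256, 784).astype("float32")

autoencoder = Autoencoder()
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(x_train, x_train, epochs=1, batch_size=64, verbose=0)

# Run only the encoder half to obtain the 32-dimensional representation.
features = autoencoder.encoder(x_train, training=False).numpy()
print(features.shape)  # (256, 32)
```

The resulting `features` array could then be passed, together with the labels, to a linear SVM such as scikit-learn's `LinearSVC`.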

Related

Categorical_crossentropy loss function has value of 0.0000e +00 for a BiLSTM sentiment analysis model

This is the graph of my model:
[image: Model]
Code:
def model_creation(vocab_size, embedding_dim, embedding_matrix,
                   rnn_units, batch_size,
                   train_embed=False):
    model = Sequential(
        [
            Embedding(vocab_size, embedding_dim,
                      weights=[embedding_matrix], trainable=train_embed, mask_zero=True),
            Bidirectional(LSTM(rnn_units, return_sequences=True, dropout=0.5)),
            Bidirectional(LSTM(rnn_units, dropout=0.25)),
            Dense(1, activation="softmax")
        ])
    return model
The embedding layer receives an embedding matrix with values from Word2Vec.
This is the code for the embedding matrix:
Embedding Matrix
def create_embedding_matrix(encoder, dict_w2v):
    embedding_dim = 50
    embedding_matrix = np.zeros((encoder.vocab_size, embedding_dim))
    for word in encoder.tokens:
        embedding_vector = dict_w2v.get(word)
        if embedding_vector is not None:  # dictionary contains word
            test = encoder.encode(word)
            token_id = encoder.encode(word)[0]
            embedding_matrix[token_id] = embedding_vector
    return embedding_matrix
Dataset
I'm using the amazon product dataset https://jmcauley.ucsd.edu/data/amazon/
This is what the dataframe looks like.
I'm only interested in overall and reviewText.
overall is my label and reviewText is my feature.
overall has a range of [1,5].
Problem
During training with categorical_crossentropy, the loss stays at 0.0000e+00. I don't think the loss is being minimized properly, so the accuracy is always 0.1172.
Did I configure my model wrong, or is there some other problem? How do I fix my loss function issue? Please tell me if this is not clear enough and I'll provide more information. I'm not sure what the problem is.
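One likely explanation, offered as a sketch rather than a confirmed diagnosis: the output layer is Dense(1, activation="softmax"), and a softmax over a single unit always returns 1, so the log term in the cross-entropy is always 0 regardless of the weights. The pure-Python helpers below (`softmax` and `categorical_crossentropy` are hypothetical names written for illustration, not the Keras functions) reproduce that arithmetic:

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy for one sample: -sum(y_true * log(y_pred))."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))


# With a single output unit the softmax normalises over one value only,
# so the predicted "probability" is 1 whatever the logit is.
for logit in (-3.0, 0.0, 7.5):
    probs = softmax([logit])
    # Assumption: the raw rating (here 4) is fed in as the target, as in the question.
    loss = categorical_crossentropy([4.0], probs)
    print(logit, probs, loss)
```

Under this reading, an output layer with five units and class indices 0 to 4 (or one-hot targets) would give the loss something to optimise.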

Keras accuracy not increasing

I am trying to perform sentiment classification using Keras. I am trying to do this using a basic neural network (no RNN or other more complex type). However when I run the script I see no increase in accuracy during training/evaluation. I am guessing I am setting up the output layer incorrectly but I am not sure of that. y_train is a list [1,2,3,1,2,4,5] (5 different labels) containing the targets belonging to the features in X_train_seq_padded. The setup is as follows:
padding_len = 24 # len of each tokenized sentence
neurons = 16 # 2/3 the length of the text that is padded
model = Sequential()
model.add(Dense(neurons, input_dim = padding_len, activation = 'relu', name = 'hidden-1'))
model.add(Dense(neurons, activation = 'relu', name = 'hidden-2'))
model.add(Dense(neurons, activation = 'relu', name = 'hidden-3'))
model.add(Dense(1, activation = 'sigmoid', name = 'output_layer'))
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics=['accuracy'])
callbacks = [EarlyStopping(monitor = 'accuracy', patience = 5, mode = 'max')]
history = model.fit(X_train_seq_padded, y_train, epochs = 100, batch_size = 64, callbacks = callbacks)
First of all, in the setup above the last layer uses a sigmoid activation, which is generally used for binary or multi-label classification; in that case the loss function should be binary_crossentropy.
If your labels are multi-class and one-hot encoded, then your last layer should be Dense(num_classes, activation='softmax') and the loss function should be categorical_crossentropy.
If you keep your multi-class labels as integers instead of one-hot encoding them, then your last layer and loss function should be
Dense(num_classes) # with logits
SparseCategoricalCrossentropy(from_logits= True)
Or, (#Frightera)
Dense(num_classes, activation='softmax') # with probabilities
SparseCategoricalCrossentropy(from_logits=False)
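The two integer-label variants above should give the same loss value. A small check, assuming random logits in place of a real Dense(num_classes) output and the label list from the question shifted from 1..5 to 0..4 (the sparse loss expects class indices starting at 0):

```python
import numpy as np
import tensorflow as tf

num_classes = 5
# Labels as given in the question (1..5), shifted to 0..4.
y = np.array([1, 2, 3, 1, 2, 4, 5]) - 1

# Assumption: random logits stand in for the output of Dense(num_classes).
rng = np.random.default_rng(0)
logits = tf.constant(rng.normal(size=(len(y), num_classes)), dtype=tf.float32)

loss_from_logits = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_from_probs = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)

# Variant 1: raw logits. Variant 2: softmax applied first.
a = float(loss_from_logits(y, logits))
b = float(loss_from_probs(y, tf.nn.softmax(logits)))
print(a, b)
```

Passing logits with from_logits=True lets the loss apply the softmax internally, which is generally the more numerically stable of the two.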

<NameError: name 'categorical_crossentropy' is not defined> when trying to load a model

I have a custom keras model built:
def create_model(input_dim,
                 filters,
                 kernel_size,
                 strides,
                 padding,
                 rnn_units=256,
                 output_dim=30,
                 dropout_rate=0.5,
                 cell=GRU,
                 activation='tanh'):
    """
    Creates simple Conv-Bi-RNN model used for word classification approach.
    :params:
        input_dim - Integer, size of inputs (Example: 161 if using spectrogram, 13 for mfcc)
        filters - Integer, number of filters for the Conv1D layer
        kernel_size - Integer, size of kernel for Conv layer
        strides - Integer, stride size for the Conv layer
        padding - String, padding version for the Conv layer ('valid' or 'same')
        rnn_units - Integer, number of units/neurons for the RNN layer(s)
        output_dim - Integer, number of output neurons/units at the output layer
            NOTE: For speech_to_text approach, this number will be number of characters that may occur
        dropout_rate - Float, percentage of dropout regularization at each RNN layer, between 0 and 1
        cell - Keras function, for a type of RNN layer * Valid solutions: LSTM, GRU, BasicRNN
        activation - String, activation type at the RNN layer
    :returns:
        model - Keras Model object
    """
    keras.losses.custom_loss = 'categorical_crossentropy'
    #Defines Input layer for the model
    input_data = Input(name='inputs', shape=input_dim)
    #Defines 1D Conv block (Conv layer + batch norm)
    conv_1d = Conv1D(filters,
                     kernel_size,
                     strides=strides,
                     padding=padding,
                     activation='relu',
                     name='layer_1_conv',
                     dilation_rate=1)(input_data)
    conv_bn = BatchNormalization(name='conv_batch_norm')(conv_1d)
    #Defines Bi-Directional RNN block (Bi-RNN layer + batch norm)
    layer = cell(rnn_units, activation=activation,
                 return_sequences=True, implementation=2, name='rnn_1', dropout=dropout_rate)(conv_bn)
    layer = BatchNormalization(name='bt_rnn_1')(layer)
    #Defines Bi-Directional RNN block (Bi-RNN layer + batch norm)
    layer = cell(rnn_units, activation=activation,
                 return_sequences=True, implementation=2, name='final_layer_of_rnn')(layer)
    layer = BatchNormalization(name='bt_rnn_final')(layer)
    layer = Flatten()(layer)
    #squish RNN features to match number of classes
    time_dense = Dense(output_dim)(layer)
    #Define model predictions with softmax activation
    y_pred = Activation('softmax', name='softmax')(time_dense)
    #Defines Model itself, and use lambda function to define output length based on inputs
    model = Model(inputs=input_data, outputs=y_pred)
    model.output_length = lambda x: cnn_output_length(x, kernel_size, padding, strides)
    #Adds categorical crossentropy loss for the classification model
    model = add_categorical_loss(model , output_dim)
    #compile the model with choosen loss and optimizer
    model.compile(loss={'categorical_crossentropy': lambda y_true, y_pred: y_pred},
                  optimizer=keras.optimizers.RMSprop(), metrics=['accuracy'])
    print("\r\ncompile the model with choosen loss and optimizer\r\n")
    print(model.summary())
    return model
and after training the model:
checkpointer = ModelCheckpoint(filepath=save_path+'tst_model.hdf5')
# Train the chosen model with the data generator
hist = model.fit_generator(generator=generator.next_train(),  # Calls the generator's next_train function, which generates a new batch of training data
                           steps_per_epoch=steps_per_epoch,  # Defines how many training steps there are
                           epochs=epochs,  # Defines how many epochs the training process takes
                           validation_data=generator.next_valid(),  # Calls the generator's next_valid function, which generates a new batch of validation data
                           validation_steps=validation_steps,  # Defines how many validation steps there are
                           callbacks=[checkpointer],  # Defines all callbacks (in this case only the model checkpointer that saves the model)
                           verbose=verbose)
After that I try to load the latest checkpoint model as follows:
from keras.models import load_model
model = load_model(filepath=save_path+'tst_model.hdf5')
and get:
NameError: name 'categorical_crossentropy' is not defined
What am I doing wrong?
Using:
Ubuntu 18.04
Python 3.6.8
TensorFlow 2.0
TensorFlow backend 2.3.1
You must import the library.
from tensorflow.keras.losses import categorical_crossentropy
When you load your model, TensorFlow automatically tries to compile it (see the compile argument of tf.keras.models.load_model). There are two ways to get rid of this error:
If you provided a custom loss for the model, pass it to tf.keras.models.load_model() through the custom_objects argument (a dict).
Set the compile argument to False.
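The second option can be sketched as follows. This is a minimal illustration under assumptions: the tiny model and the temporary `tst_model.h5` path are hypothetical stand-ins for the checkpoint in the question, and saving to HDF5 assumes h5py is installed.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical stand-in model; the layer sizes are arbitrary.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="rmsprop", loss="categorical_crossentropy")

path = os.path.join(tempfile.mkdtemp(), "tst_model.h5")
model.save(path)

# compile=False restores the architecture and weights only,
# so no loss function has to be resolved by name at load time.
restored = tf.keras.models.load_model(path, compile=False)

# Recompile only if further training or evaluation is needed.
restored.compile(optimizer="rmsprop", loss="categorical_crossentropy")

# Both models should produce the same predictions.
x = np.random.rand(2, 4).astype("float32")
same = np.allclose(model.predict(x, verbose=0), restored.predict(x, verbose=0))
print(same)
```

If the model is only used for inference after loading, the recompile step can be left out.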

How to get output from a specific layer in keras.tf, the bottleneck layer in autoencoder?

I am developing an autoencoder for clustering certain groups of images.
input_images->...->bottleneck->...->output_images
I have calibrated the autoencoder to my satisfaction and saved the model; everything has been developed using keras.tensorflow on python3.
The next step is to apply the autoencoder to a ton of images and cluster them according to cosine distance in the bottleneck layer. Oops, I just realized that I don't know the syntax in keras.tf for running the model on a batch up to a specific layer rather than to the output layer. Thus the question:
How do I run something like Model.predict_on_batch or Model.predict_generator up to the certain "bottleneck" layer and retrieve the values on that layer rather than the values on the output layer?
You need to define a new model (if you didn't define the encoder and decoder as separate models initially, which is usually the easiest option).
If your model was defined without reusing layers, it's just:
inputs = model.input
outputs= model.get_layer('bottleneck').output
encoder = Model(inputs, outputs)
Use the encoder model as any other model.
The full code would be like this,
# ENCODER
encoding_dim = 37310
input_layer = Input(shape=(encoding_dim,))
encoder = Dense(500, activation='tanh')(input_layer)
encoder = Dense(100, activation='tanh')(encoder)
encoder = Dense(50, activation='tanh', name='bottleneck_layer')(encoder)
decoder = Dense(100, activation='tanh')(encoder)
decoder = Dense(500, activation='tanh')(decoder)
decoder = Dense(37310, activation='sigmoid')(decoder)
# full model
model_full = models.Model(input_layer, decoder)
model_full.compile(optimizer='adam', loss='mse')
model_full.fit(x, y, epochs=20, batch_size=16)
# bottleneck model
bottleneck_output = model_full.get_layer('bottleneck_layer').output
model_bottleneck = models.Model(inputs = model_full.input, outputs = bottleneck_output)
bottleneck_predictions = model_bottleneck.predict(X_test)

Tensorflow dense layers worse than keras sequential

I am trying to train an agent on the inverse-pendulum (similar to cart-pole) problem, which is a benchmark of reinforcement learning. I use the neural-fitted-Q-iteration algorithm, which uses a multi-layer neural network to evaluate the Q function.
I build the neural network with Keras.Sequential and with tf.layers.dense respectively, and leave everything else the same. However, Keras gives me good results and TensorFlow does not. In fact, the TensorFlow version doesn't work at all: its loss keeps increasing and the agent learns nothing from the training.
Here I present the code for Keras as follows
def build_model():
    model = Sequential()
    model.add(Dense(5, input_dim=3))
    model.add(Activation('sigmoid'))
    model.add(Dense(5))
    model.add(Activation('sigmoid'))
    model.add(Dense(1))
    model.add(Activation('sigmoid'))
    adam = Adam(lr=1E-3)
    model.compile(loss='mean_squared_error', optimizer=adam)
    return model
and the tensorflow version is
class NFQ_fit(object):
    """
    neural network approximator for NFQ iteration
    """
    def __init__(self, sess, N_feature, learning_rate=1E-3, batch_size=100):
        self.sess = sess
        self.N_feature = N_feature
        self.learning_rate = learning_rate
        self.batch_size = batch_size
        # DNN structure
        self.inputs = tf.placeholder(tf.float32, [None, N_feature], 'inputs')
        self.labels = tf.placeholder(tf.float32, [None, 1], 'labels')
        self.l1 = tf.layers.dense(inputs=self.inputs,
                                  units=5,
                                  activation=tf.sigmoid,
                                  use_bias=True,
                                  kernel_initializer=tf.truncated_normal_initializer(0.0, 1E-2),
                                  bias_initializer=tf.constant_initializer(0.0),
                                  kernel_regularizer=tf.contrib.layers.l2_regularizer(1E-4),
                                  name='hidden-layer-1')
        self.l2 = tf.layers.dense(inputs=self.l1,
                                  units=5,
                                  activation=tf.sigmoid,
                                  use_bias=True,
                                  kernel_initializer=tf.truncated_normal_initializer(0.0, 1E-2),
                                  bias_initializer=tf.constant_initializer(0.0),
                                  kernel_regularizer=tf.contrib.layers.l2_regularizer(1E-4),
                                  name='hidden-layer-2')
        self.outputs = tf.layers.dense(inputs=self.l2,
                                       units=1,
                                       activation=tf.sigmoid,
                                       use_bias=True,
                                       kernel_initializer=tf.truncated_normal_initializer(0.0, 1E-2),
                                       bias_initializer=tf.constant_initializer(0.0),
                                       kernel_regularizer=tf.contrib.layers.l2_regularizer(1E-4),
                                       name='outputs')
        # optimization
        # self.mean_loss = tf.losses.mean_squared_error(self.labels, self.outputs)
        self.mean_loss = tf.reduce_mean(tf.square(self.labels-self.outputs))
        self.regularization_loss = tf.losses.get_regularization_loss()
        self.loss = self.mean_loss # + self.regularization_loss
        self.train_op = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.loss)
The two models are the same. Both of them have two hidden layers with the same dimensions. I suspect the problem comes from the kernel initialization, but I don't know how to fix it.
Using Keras is great. If you want better TensorFlow integration check out tf.keras. There's no particular reason to use tf.layers if the Keras (or tf.keras) defaults work better.
In this case glorot_uniform looks like the default initializer. This is also the global TensorFlow default, so consider removing the kernel_initializer argument instead of the explicit truncated normal initialization in your question (or passing Glorot explicitly).
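To make the difference in initial weight scale concrete, here is a rough comparison using tf.keras layers (an assumption: the question's TF1 tf.layers code is replaced by tf.keras.layers.Dense for illustration). It builds one layer with the default initializer and one with a truncated normal of standard deviation 1E-2, both for 3 input features and 5 units as in the question:

```python
import numpy as np
import tensorflow as tf

# Dense layer left at its defaults vs. one with the question's explicit initializer.
default_layer = tf.keras.layers.Dense(5)
small_init_layer = tf.keras.layers.Dense(
    5,
    kernel_initializer=tf.keras.initializers.TruncatedNormal(mean=0.0, stddev=1e-2),
)

# Create the weights for 3 input features, as in the question's first layer.
default_layer.build((None, 3))
small_init_layer.build((None, 3))

# get_weights() returns [kernel, bias] as NumPy arrays.
default_std = float(np.std(default_layer.get_weights()[0]))
small_std = float(np.std(small_init_layer.get_weights()[0]))

print(type(default_layer.kernel_initializer).__name__)
print(default_std, small_std)
```

The default kernel weights come out far larger in magnitude than those drawn with stddev 1E-2, which is consistent with the suggestion to drop the explicit kernel_initializer.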