Deep Convolutional Autoencoder for movie similarity - tensorflow

i am new to python and i have a dataset that contains movie descriptions and i am trying to create a model that can calculate movie similarity based on these descriptions.
so i started by turning each movie description into a Word2Vec vector where each word has a size 100,since the longest movie description in my dataset has 213 words, each movie description is turned into a vector of size 21300.
now my next step is to reduce the dimensionality of these vectors using a convolutional autoencoder.
it was recommended to me that i turn each 21300-sized vector into a 150 by 142 matrix so i did that, my goal is to compress these matrices from 150 by 142 to 5 by 5 matrix which i will then flatten and use to calculate cosine similarity between different compressed movie vectors.
now here is my faulty code so far:
encoder_input = keras.Input(shape=(21300,), name='sum')
encoded= tf.keras.layers.Reshape((150,142),input_shape=(21300,))(encoder_input)
x = tf.keras.layers.Conv1D(32, 3, activation="relu", padding="same",input_shape=(16,150,142))(encoded)
x = tf.keras.layers.MaxPooling1D(2, padding="same")(x)
x = tf.keras.layers.Conv1D(32, 3, activation="relu", padding="same")(x)
x = tf.keras.layers.MaxPooling1D(2, padding="same")(x)
x = tf.keras.layers.Conv1D(16, 3, activation="relu", padding="same")(x)
x = tf.keras.layers.MaxPooling1D(2, padding="same")(x)
x = tf.keras.layers.Conv1D(16, 3, activation="relu", padding="same")(x)
x = tf.keras.layers.MaxPooling1D(2, padding="same")(x)
x = tf.keras.layers.Conv1D(8, 3, activation="relu", padding="same")(x)
x = tf.keras.layers.MaxPooling1D(2, padding="same")(x)
x=tf.keras.layers.Flatten()(x)
encoder_output=keras.layers.Dense(units=25, activation='relu',name='encoder')(x)
x= tf.keras.layers.Reshape((5,5),input_shape=(25,))(encoder_output)
# Decoder
decoder_input=tf.keras.layers.Conv1D(8, 3, activation='relu', padding='same')(x)
x = tf.keras.layers.UpSampling1D(2)(decoder_input)
x = tf.keras.layers.Conv1D(16, 3, activation='relu')(x)
x = tf.keras.layers.UpSampling1D(2)(x)
x = tf.keras.layers.Conv1D(16, 3, activation='relu')(x)
x = tf.keras.layers.UpSampling1D(2)(x)
x = tf.keras.layers.Conv1D(32, 3, activation='relu')(x)
x = tf.keras.layers.UpSampling1D(2)(x)
x = tf.keras.layers.Conv1D(32, 3, activation='relu')(x)
x = tf.keras.layers.UpSampling1D(2)(x)
#x=tf.keras.layers.Flatten()(x)
decoder_output = keras.layers.Conv1D(1, 3, activation='relu', padding='same')(x)
opt = tf.keras.optimizers.Adam(learning_rate=0.001, decay=1e-6)
autoencoder = keras.Model(encoder_input, decoder_output, name='autoencoder')
autoencoder.compile(opt, loss='mse')
autoencoder.summary()
history = autoencoder.fit(
movies_vector,
movies_vector,
epochs=25
)
print("ENCODER READY")
#USING THE MIDDLE LAYER
encoder = keras.Model(inputs=autoencoder.input,
outputs=autoencoder.get_layer('encoder').output)
running this code produces the following error:
ValueError: Dimensions must be equal, but are 100 and 21300 for '{{node mean_squared_error/SquaredDifference}} = SquaredDifference[T=DT_FLOAT](mean_squared_error/remove_squeezable_dimensions/Squeeze, IteratorGetNext:1)' with input shapes: [?,100], [?,21300].
how can i fix this autoencoder?

I was able to reproduce the error with dummy data. Changing the decoder model as follows will help.
decoder_input=tf.keras.layers.Conv1D(8, 3, activation='relu', padding='same')(x)
x = tf.keras.layers.UpSampling1D(2)(decoder_input)
x = tf.keras.layers.Conv1D(16, 3, activation='relu')(x)
x = tf.keras.layers.UpSampling1D(2)(x)
x = tf.keras.layers.Conv1D(16, 3, activation='relu')(x)
x = tf.keras.layers.UpSampling1D(2)(x)
x = tf.keras.layers.Conv1D(32, 3, activation='relu')(x)
x = tf.keras.layers.UpSampling1D(2)(x)
x = tf.keras.layers.Conv1D(32, 3, activation='relu')(x)
x = tf.keras.layers.UpSampling1D(2)(x)
x=tf.keras.layers.Conv1D(213, 3, activation='relu', padding='same')(x)
decoder_output = tf.keras.layers.Flatten()(x)
Please find the gist here. Thank you.

Related

Is there a way to setup ConvLSTM2D model to input and output dataset with different timesteps?

I am trying to set up a ConvLSTM2D architecture that takes as input and output the data with the following shapes:
Input shape: (1000, 10, 100, 100, 1)
Output shape: (1000, 5, 100, 100, 1)
As you can notice, the difference between these input and output shapes are timesteps. Is there any way I can set up such an architecture?
The following is the one that I am using right now, which doesn't allow me to specify input / output timesteps separately.
inputs = layers.Input(shape=(None, x_train.shape[2], x_train.shape[2], 1 ))
x = layers.ConvLSTM2D(filters=64, kernel_size=(5,5), padding="same", return_sequences=True, activation="relu")(inputs)
x = layers.BatchNormalization()(x)
x = layers.ConvLSTM2D(filters=64, kernel_size=(3,3), padding="same", return_sequences=True, activation="relu")(x)
x = layers.BatchNormalization()(x)
x = layers.ConvLSTM2D(filters=64, kernel_size=(1,1), padding="same", return_sequences=True, activation="relu")(x)
outputs = layers.Conv3D(filters=1, kernel_size=(3,3,3), activation="sigmoid", padding="same")(x)
model = keras.models.Model(inputs, outputs)
model.compile(loss=keras.losses.binary_crossentropy, optimizer=keras.optimizers.Adam())
model.summary()
Appreciate any help!

movie similarity using Word2Vec and deep Convolutional Autoencoders

i am new to python and i am trying to create a model that can measure how similar movies are based on the movies description,the steps i followed so far are:
1.turn each movie description into a vector of 100*(maximum number of words possible for a movie description) values using Word2Vec, this results in a 21300-values vector for each movie description.
2.create a deep convolutional autoencoder that tries to compress each vector(and hopefully extract meaning from it).
while the first step was successful and i am still struggling with the autoencoder, here is my code so far:
encoder_input = keras.Input(shape=(21300,), name='sum')
encoded= tf.keras.layers.Reshape((150,142,1),input_shape=(21300,))(encoder_input)
x = tf.keras.layers.Conv2D(128, (3, 3), activation="relu", padding="same",input_shape=(1,128,150,142))(encoded)
x = tf.keras.layers.MaxPooling2D((2, 2), padding="same")(x)
x = tf.keras.layers.Conv2D(64, (3, 3), activation="relu", padding="same")(x)
x = tf.keras.layers.MaxPooling2D((2, 2), padding="same")(x)#49*25*64
x = tf.keras.layers.Conv2D(32, (3, 3), activation="relu", padding="same")(x)
x = tf.keras.layers.MaxPooling2D((2, 2), padding="same")(x)#25*13*32
x = tf.keras.layers.Conv2D(16, (3, 3), activation="relu", padding="same")(x)
x = tf.keras.layers.MaxPooling2D((2, 2), padding="same")(x)
x = tf.keras.layers.Conv2D(8, (3, 3), activation="relu", padding="same")(x)
x = tf.keras.layers.MaxPooling2D((2, 2), padding="same")(x)
x=tf.keras.layers.Flatten()(x)
encoder_output=keras.layers.Dense(units=90, activation='relu',name='encoder')(x)
x= tf.keras.layers.Reshape((10,9,1),input_shape=(28,))(encoder_output)
# Decoder
decoder_input=tf.keras.layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = tf.keras.layers.UpSampling2D((2, 2))(decoder_input)
x = tf.keras.layers.Conv2D(16, (3, 3), activation='relu')(x)
x = tf.keras.layers.UpSampling2D((2, 2))(x)
x = tf.keras.layers.Conv2D(32, (3, 3), activation='relu')(x)
x = tf.keras.layers.UpSampling2D((2, 2))(x)
x = tf.keras.layers.Conv2D(64, (3, 3), activation='relu')(x)
x = tf.keras.layers.UpSampling2D((2, 2))(x)
x = tf.keras.layers.Conv2D(128, (3, 3), activation='relu')(x)
x = tf.keras.layers.UpSampling2D((2, 2))(x)
decoder_output = keras.layers.Conv2D(1, (3, 3), activation='relu', padding='same')(x)
autoencoder = keras.Model(encoder_input, decoder_output)
opt = tf.keras.optimizers.Adam(learning_rate=0.001, decay=1e-6)
autoencoder = keras.Model(encoder_input, decoder_output, name='autoencoder')
autoencoder.compile(opt, loss='mse')
print("STARTING FITTING")
history = autoencoder.fit(
movies_vector,
movies_vector,
epochs=25,
)
print("ENCODER READY")
#USING THE MIDDLE LAYER
encoder = keras.Model(inputs=autoencoder.input,
outputs=autoencoder.get_layer('encoder').output)
running this code gives me the following error:
required broadcastable shapes [[node mean_squared_error/SquaredDifference (defined at tmp/ipykernel_52/3425712667.py:119) ]] [Op:__inference_train_function_1568]
i have two questions:
1.how can i fix this error?
2.how can i improve my autoencoder so that i can use the compressed vectors to test for movie similarity?
The output of your model is (batch_size, 260, 228, 1), while your targets appear to be (batch_size, 21300). You can solve that problem by either adding a tf.keras.layers.Flatten() layer to the end of your model, or by not flattening your input.
You probably should not be using 2D convolutions, as there is no spatial or temporal correlation between adjacent feature channels in most text embedding. You should be able to safely reshape to (150,142) rather than (150, 142, 1) and use 1D convolution, pooling, and upsampling layers.

Tensorflow ModelCheckpoint not saving model, no loss after reloading

The callback is saving checkpoint files, but not the SavedModel model.pb file. Additionally, when I load the model from the checkpoints it does not reload 'val_loss' which I'm conditioning "save_best_model" on.
I tried using a model.save() only on the best iteration but was having trouble with getting that to work correctly and it would be more convenient to use the ModelCheckpoint callback.
Here is the relevant code
LOSS = tf.keras.losses.MeanSquaredError(),
#multi output 3 categories from 0 to 1
model = ImgToClassSimpleContinuous(img_height, img_width)
checkpoint_filename = "../chkpts/ImgToClassSimpleContinuous/checkpoint_dir"
model.load_weights(checkpoint_filename)
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_filename,
verbose=1,mode='min', monitor="val_loss", save_best_only=True, save_weights_only=False)
model.compile(
optimizer='adam',
loss = [LOSS, LOSS, LOSS],
metrics=['mse'])
model.fit(
dataset_to_use,
validation_data = dataset_validation_batched,
# validation_steps=50,
epochs=MAX_EPOCHS,
batch_size=BATCH_SIZE,
callbacks=[cp_callback]
)
class ImgToClassSimpleContinuous(Model):
'''
pair with loss = categorical_crossentropy
'''
in_types = [DataType.d]
out_types = [DataType.tlc, DataType.tls, DataType.tll]
def __init__(self, img_height, img_width, *args, **kwargs):
super().__init__(ImgToClassSimple, *args, **kwargs)
initializer = 'he_normal'
input_shape = (img_height, img_width, 1)
inputs = tf.keras.Input(shape=input_shape)
flat_pix = layers.Flatten()(inputs)
x = layers.Conv2D(8, 3, padding='same', kernel_initializer=initializer)(inputs)
x = layers.PReLU()(x)
x = layers.Conv2D(8, 3, padding='same', kernel_initializer=initializer)(x)
x = layers.PReLU()(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.BatchNormalization()(x)
x = layers.Conv2D(16, 3, padding='same', kernel_initializer=initializer)(x)
x = layers.PReLU()(x)
x = layers.Conv2D(16, 3, padding='same', kernel_initializer=initializer)(x)
x = layers.PReLU()(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.BatchNormalization()(x)
t = layers.Conv2D(32, 3, padding='same', kernel_initializer=initializer)(x)
t = layers.PReLU()(t)
t = layers.Conv2D(32, 3, padding='same', kernel_initializer=initializer)(t)
t = layers.PReLU()(t)
t = layers.MaxPooling2D(pool_size=(2, 2))(t)
t = layers.BatchNormalization()(t)
t = tf.keras.layers.GlobalAveragePooling2D()(t)
t = layers.Flatten()(t)
s = layers.Conv2D(32, 3, padding='same', kernel_initializer=initializer)(x)
s = layers.PReLU()(s)
s = layers.Conv2D(32, 3, padding='same', kernel_initializer=initializer)(s)
s = layers.PReLU()(s)
s = layers.MaxPooling2D(pool_size=(2, 2))(s)
s = layers.BatchNormalization()(s)
s = tf.keras.layers.GlobalAveragePooling2D()(s)
s = layers.Flatten()(s)
l = layers.Conv2D(32, 3, padding='same', kernel_initializer=initializer)(x)
l = layers.PReLU()(l)
l = layers.Conv2D(32, 3, padding='same', kernel_initializer=initializer)(l)
l = layers.PReLU()(l)
l = layers.MaxPooling2D(pool_size=(2, 2))(l)
l = layers.BatchNormalization()(l)
l = tf.keras.layers.GlobalAveragePooling2D()(l)
l = layers.Flatten()(l)
t = layers.Dense(1, activation='sigmoid')(t)
s = layers.Dense(1, activation='sigmoid')(s)
l = layers.Dense(1, activation='sigmoid')(l)
# A Dense classifier with a single unit (binary classification)
self.model = tf.keras.Model(inputs, [t, s, l])
tf.keras.utils.plot_model(self.model, to_file="...", show_shapes=True)
def call(self, x):
return self.model(x)

Matplotlib ValueError: num must be 1 <= num <= 20, not 0

I'm following this tutorial of Building Autoencoders in Keras on MNIST handwritten digits.
Here is the code bellow:
input_img = Input(shape=(28, 28, 1)) # adapt this if using `channels_first` image data format
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
After loading Mnist dataset and train our model, here we are going to plot our original and reconstructed images
decoded_imgs = autoencoder.predict(x_test)
n = 10
plt.figure(figsize=(20, 4))
for i in range(n):
# display original
ax = plt.subplot(2, n, i)
plt.imshow(x_test[i].reshape(28, 28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
# display reconstruction
ax = plt.subplot(2, n, i + n)
plt.imshow(decoded_imgs[i].reshape(28, 28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.show()
I searched a lot to fix this problem without finding a solution, here is the error shown bellow:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-35-d0a536786436> in <module>()
5 for i in range(n):
6 # display original
----> 7 ax = plt.subplot(2, n, i)
8 plt.imshow(x_test[i].reshape(28, 28))
9 plt.gray()
2 frames
/usr/local/lib/python3.6/dist-packages/matplotlib/axes/_subplots.py in __init__(self, fig, *args, **kwargs)
64 if num < 1 or num > rows*cols:
65 raise ValueError(
---> 66 f"num must be 1 <= num <= {rows*cols}, not {num}")
67 self._subplotspec = GridSpec(
68 rows, cols, figure=self.figure)[int(num) - 1]
ValueError: num must be 1 <= num <= 20, not 0
<Figure size 1440x288 with 0 Axes>
On the first loop, i==0 because range(10) is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]. You can't use 0 as an index for the subplots, which causes that error. You should instead use i+1 in your plt.subplot() to get the correct axis.

ValueError: Input 0 is incompatible with layer conv2d_2: expected ndim=4, found ndim=5 in Keras

I am trying to use a function to add layers to a very deep CNN using keras. Here is my function:
def add_layer(input_shape, kernel_size, filters, count):
x = Conv2D(filters, (kernel_size, kernel_size), padding = 'same', activation= None)(Input(input_shape))
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = Conv2D(filters, (kernel_size, kernel_size), padding = 'same', activation= None)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
return keras.layers.add([x,Input(input_shape)])
When I call this function from:
x = Input(shape = (6,264,264))
y = Conv2D(64, (7, 7), padding='same', activation='relu')(x)
y = MaxPooling2D((2,2))(y)
y = add_layer(y.shape, 3, 64, 3)
It gives following error:
ValueError: Input 0 is incompatible with layer conv2d_2: expected ndim=4, found ndim=5
When I remove the add_layer function and simply terminate the maxpooling to a dense layer, I get:
AttributeError: 'Tensor' object has no attribute 'ndim'
What could be the problem ? (Additionally my input has 50 np arrays of size (6,264,264)) i.e (50,6,264,264)
Pretty sure that your line,
x = Conv2D(filters, (kernel_size, kernel_size), padding = 'same', activation= None)(Input(input_shape))
should be
x = Conv2D(filters, (kernel_size, kernel_size), padding = 'same', activation= None)(Input(batch_shape=input_shape))
By default the ny_layer.shape would add a batch size in the shape. However, for Input(input_shape) the first argument assumes the shape is without is batch size and adds another dimension to its output. This would explain the origin of the extra dimension.