Keras TimeDistributed Not Masking CNN Model - tensorflow

For the sake of example, I have an input consisting of 2 images,of total shape (2,299,299,3). I'm trying to apply inceptionv3 on each image, and then subsequently process the output with an LSTM. I'm using a masking layer to exclude a blank image from being processed (specified below).
The code is:
import numpy as np
from keras import backend as K
from keras.models import Sequential,Model
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D, BatchNormalization, \
Input, GlobalAveragePooling2D, Masking,TimeDistributed, LSTM,Dense,Flatten,Reshape,Lambda, Concatenate
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.applications import inception_v3
IMG_SIZE=(299,299,3)
def create_base():
base_model = inception_v3.InceptionV3(weights='imagenet', include_top=False)
x = GlobalAveragePooling2D()(base_model.output)
base_model=Model(base_model.input,x)
return base_model
base_model=create_base()
#Image mask to ignore images with pixel values of -1
IMAGE_MASK = -2*np.expand_dims(np.ones(IMG_SIZE),0)
final_input=Input((2,IMG_SIZE[0],IMG_SIZE[1],IMG_SIZE[2]))
final_model = Masking(mask_value = -2.)(final_input)
final_model = TimeDistributed(base_model)(final_model)
final_model = Lambda(lambda x: x, output_shape=lambda s:s)(final_model)
#final_model = Reshape(target_shape=(2, 2048))(final_model)
#final_model = Masking(mask_value = 0.)(final_model)
final_model = LSTM(5,return_sequences=False)(final_model)
final_model = Model(final_input,final_model)
#Create a sample test image
TEST_IMAGE = np.ones(IMG_SIZE)
#Create a test sample input, consisting of a normal image and a masked image
TEST_SAMPLE = np.concatenate((np.expand_dims(TEST_IMAGE,axis=0),IMAGE_MASK))
inp = final_model.input # input placeholder
outputs = [layer.output for layer in final_model.layers] # all layer outputs
functors = [K.function([inp]+ [K.learning_phase()], [out]) for out in outputs]
layer_outs = [func([np.expand_dims(TEST_SAMPLE,0), 1.]) for func in functors]
This does not work correctly. Specifically, the model should mask the IMAGE_MASK part of the input, but it instead processes it with inception (giving a nonzero output). here are the details:
layer_out[-1] , the LSTM output is fine:
[array([[-0.15324114, -0.09620268, -0.01668587, 0.07938149, -0.00757846]], dtype=float32)]
layer_out[-2] and layer_out[-3] , the LSTM input is wrong, it should have all zeros in the second array:
[array([[[ 0.37713543, 0.36381325, 0.36197218, ..., 0.23298527,
0.43247852, 0.34844452],
[ 0.24972123, 0.2378867 , 0.11810347, ..., 0.51930511,
0.33289322, 0.33403745]]], dtype=float32)]
layer_out[-4], the input to the CNN is correctly masked:
[[ 1., 1., 1.],
[ 1., 1., 1.],
[ 1., 1., 1.],
...,
[ 1., 1., 1.],
[ 1., 1., 1.],
[ 1., 1., 1.]]],
[[[-0., -0., -0.],
[-0., -0., -0.],
[-0., -0., -0.],
...,
[-0., -0., -0.],
[-0., -0., -0.],
[-0., -0., -0.]],
Note that the code seems to work correctly with a simpler base_model such as:
def create_base():
input_layer=Input(IMG_SIZE)
base_model=Flatten()(input_layer)
base_model=Dense(2048)(base_model)
base_model=Model(input_layer,base_model)
return base_model
I have exhausted most online resources on this. Permutations of this question have been asked on Keras's github, such as here, here and here, but I can't seem to find any concrete resolution.
The links suggest that the issues seem to be stemming from a combination of TimeDistributed being applied to BatchNormalization, and the hacky fixes of either the Lambda identity layer, or Reshape layers remove errors but don't seem to output the correct model.
I've tried to force the base model to support masking via:
base_model.__setattr__('supports_masking',True)
and I've also tried applying an identity layer via:
TimeDistributed(Lambda(lambda x: base_model(x), output_shape=lambda s:s))(final_model)
but none of these seem to work. Note that I would like the final model to be trainable, in particular the CNN part of it should remain trainable.

Not entirely sure this will work, but based on the comment made here, with a newer version of tensorflow + keras it should work:
final_model = TimeDistributed(Flatten())(final_input)
final_model = Masking(mask_value = -2.)(final_model)
final_model = TimeDistributed(Reshape(IMG_SIZE))(final_model)
final_model = TimeDistributed(base_model)(final_model)
final_model = Model(final_input,final_model)
I took a look at the source code of masking, and I noticed Keras creates a mask tensor that only reduces the last axis. As long as you're dealing with 5D tensors, it will cause no problem, but when you reduce the dimensions for the LSTM, this masking tensor becomes incompatible.
Doing the first flatten step, before masking, will assure that the masking tensor works properly for 3D tensors. Then you expand the image again to its original size.
I'll probably try to install newer versions soon to test it myself, but these installing procedures have caused too much trouble and I'm in the middle of something important here.
On my machine, this code compiles, but that strange error appears in prediction time (see link at the first line of this answer).
Creating a model for predicting the intermediate layers
I'm not sure, by the code I've seen, that the masking function is kept internally in tensors. I don't know exactly how it works, but it seems to be managed separately from the building of the functions inside the layers.
So, try using a keras standard model to make the predictions:
inp = final_model.input # input placeholder
outputs = [layer.output for layer in final_model.layers] # all layer outputs
fullModel = Model(inp,outputs)
layerPredictions = fullModel.predict(np.expand_dims(TEST_SAMPLE,0))
print(layerPredictions[-2])

It seems to be working as intended. Masking in Keras doesn't produce zeros as you would expect, it instead skips the timesteps that are masked in upstream layers such as LSTM and loss calculation. In case of RNNs, Keras (at least tensorflow) is implemented such that the states from the previous step are carried over, tensorflow_backend.py. This is done in part to preserve the shapes of tensors when dynamic input is given.
If you really want zeros you will have to implement your own layer with a similar logic to Masking and return zeros explicitly. To solve your problem, you need a mask before the final LSTM layer using the final_input:
class MyMask(Masking):
"""Layer that adds a mask based on initial input."""
def compute_mask(self, inputs, mask=None):
# Might need to adjust shapes
return K.any(K.not_equal(inputs[0], self.mask_value), axis=-1)
def call(self, inputs):
# We just return input back
return inputs[1]
def compute_output_shape(self, input_shape):
return input_shape[1]
final_model = MyMask(mask_value=-2.)([final_input, final_model])
You probably can attach the mask in a simpler manner but this custom class essentially adds a mask based on your initial inputs and outputs a Keras tensor that now has a mask.
Your LSTM will ignore in your example the second image. To confirm you can return_sequences=Trueand check that the output for 2 images are identical.

I'm trying implement the same thing, I want my LSTM sequences to have variable sizes. However I can't even implement your original model. I obtain the following error: TypeError: Layer input_1 does not support masking, but was passed an input_mask: Tensor("time_distributed_1/Reshape_1:0", shape=(?, 100, 100), dtype=bool) I'm using tensorflow 1.10 and keras 2.2.2
I solved the problem by adding a second input, a mask to specify which timesteps to take into account for the LSTM. That way the image sequence always has the same number of timesteps, the CNN always generates an output, but some of them are ignored for the LSTM input. However, the missing images need to be chosen carefully so that the batch normalization is not affected.
def LSTM_CNN(params):
resnet = ResNet50(include_top=False, weights='imagenet', pooling = 'avg')
input_layer = Input(shape=(params.numFrames, params.height, params.width, 3))
input_mask = Input(shape=(params.numFrames,1))
curr_layer = TimeDistributed(resnet)(input_layer)
resnetOutput = Dropout(0.5)(curr_layer)
curr_layer = multiply([resnetOutput,input_mask])
cnn_output = curr_layer
curr_layer = Masking(mask_value=0.0)(curr_layer)
lstm_out = LSTM(256, dropout=0.5)(curr_layer)
output = Dense(output_dim=params.numClasses, activation='sigmoid')(lstm_out)
model = Model([input_layer, input_mask], output)
return model

Related

keras: add constant number to all input values

I want to create a simple toy model in keras. The model should take an input, then add a 1 to every element and produce an output.
I found an example using keras, but it requires 2 inputs
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# create model
input1 = layers.Input(shape=(2,))
input2 = layers.Input(shape=(2,))
added = layers.Add()([input1, input2])
model = keras.models.Model(inputs=[input1, input2], outputs=added)
# run inference
input_shape = (2,)
x1 = tf.ones(input_shape)
x2 = tf.ones(input_shape)
y = model([x1, x2])
However, I need the model to only have a single input and simply increase every input value by 1, for example.
You can replace the second input of your toy model with a call to tf.ones_like:
input1 = layers.Input(shape=())
added = layers.Add()([input1, tf.ones_like(input1)])
model = keras.models.Model(inputs=input1, outputs=added)
tf.ones_like creates a tensor full of ones of the shape of the tensor passed as an argument. As this op depends only on the shape of the input tensor, you can technically create your network without a specified input shape, and it will accept any shape as input:
>>> model(3)
<tf.Tensor: shape=(), dtype=float32, numpy=4.0>
>>> model(tf.ones((1,2,3)))
<tf.Tensor: shape=(1, 2, 3), dtype=float32, numpy=
array([[[2., 2., 2.],
[2., 2., 2.]]], dtype=float32)>

How does Tensorflow or Keras handle model weight inititialization and when does it happen?

After reading the answer to this question I am a bit confused as to when exactly TensorFlow initializes the weight and bias variables.
As per the answers, Compile defines the loss function, the optimizer and the metrics. That's all.
Since the compile() method doesn't initialize it then that would suggest that it happens during the fit() method run.
However the issue with that is, in case of loading models or loading weights how would fit() know that the weights, its presented with, are actually useful and should not be thrown away and then assigned random values in place of those.
We pass the type of intitializer in the argument kernel_initializer while declaring the layer. For example:
dense02 = tf.keras.layers.Dense(units=10,
kernel_initializer='glorot_uniform',
bias_initializer='zeros')
So an obvious question would be whether the weights are initialized layer by layer during the first epoch forward pass or does it happen for all layers before the first epoch.
(What I am trying to say is that if there say 5 Dense layers in the model, then does the initialization happen say a layer at a time, i.e. the first Dense layer gets initialized then the forward pass happens for that layer, then the second layer is initialized and the forward pass for second Dense layer happens and so on)
Another aspect is regarding transfer learning, when stacking custom layers on top of a trained model, the trained model layers have the weights, while the layers that I added wouldn't have any useful layers. So how would TensorFlow know to only initialize the variables of the layers I added and not the mess up the layers of the transferred model (provided, I don't have trainable=False)
How does TensorFlow or Keras handle weight initialization?
The weights are initialized when the model is created (when each layer in model is initialized), i.e before the compile() and fit():
import tensorflow as tf
from tensorflow.keras import models, layers
inputs = layers.Input((3, ))
outputs = layers.Dense(units=10,
kernel_initializer='glorot_uniform',
bias_initializer='zeros')(inputs)
model = models.Model(inputs=inputs, outputs=outputs)
for layer in model.layers:
print("Config:\n{}\nWeights:\n{}\n".format(layer.get_config(), layer.get_weights()))
Outputs:
Config:
{'batch_input_shape': (None, 3), 'dtype': 'float32', 'sparse': False, 'ragged': False, 'name': 'input_1'}
Weights:
[]
Config:
{'name': 'dense', 'trainable': True, 'dtype': 'float32', 'units': 10, 'activation': 'linear', 'use_bias': True, 'kernel_initializer': {'class_name': 'GlorotUniform', 'config': {'seed': None}}, 'bias_initializer': {'class_name': 'Zeros', 'config': {}}, 'kernel_regularizer': None, 'bias_regularizer': None, 'activity_regularizer': None, 'kernel_constraint': None, 'bias_constraint': None}
Weights:
[array([[-0.60352975, 0.08275259, -0.6521113 , -0.5860774 , -0.42276743,
-0.3142944 , -0.28118378, 0.07770532, -0.5644444 , -0.47069687],
[ 0.4611913 , 0.35170448, -0.62191975, 0.5837332 , -0.3390234 ,
-0.4033073 , 0.03493106, -0.06078851, -0.53159714, 0.49872506],
[ 0.43685734, 0.6160207 , 0.01610583, -0.3673877 , -0.14144647,
-0.3792309 , 0.05478126, 0.602067 , -0.47438127, 0.36463356]],
dtype=float32), array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)]
After doing a bit more research, even though Mr. For Example's answer is correct, lets get a bit more deep into how initialization works in TensorFlow Keras.
As per the tf.keras.layers.Layer Doc, we can create variables in the following two methods:
__init__(self, ...): Defines custom layer attributes, and creates layer state variable that do not depend on input shapes, using add_weight()
build(self, input_shape): This method can be used to create weights that depend on the shape(s) of the input(s), using add_weight()
kb
The below code shows an example of a basic layer with 2 variables that does the computation: y = w . x + b:
class SimpleDense(Layer):
def __init__(self, units=32):
super(SimpleDense, self).__init__()
self.units = units
def build(self, input_shape): # Create the state of the layer (weights)
w_init = tf.random_normal_initializer()
self.w = tf.Variable(
initial_value=w_init(shape=(input_shape[-1], self.units),
dtype='float32'),
trainable=True)
b_init = tf.zeros_initializer()
self.b = tf.Variable(
initial_value=b_init(shape=(self.units,), dtype='float32'),
trainable=True)
def call(self, inputs): # Defines the computation from inputs to outputs
return tf.matmul(inputs, self.w) + self.b
# Instantiates the layer.
linear_layer = SimpleDense(4)
# This will also call `build(input_shape)` and create the weights.
y = linear_layer(tf.ones((2, 2)))
assert len(linear_layer.weights) == 2
# These weights are trainable, so they're listed in `trainable_weights`:
assert len(linear_layer.trainable_weights) == 2
The most interesting thing to note in the above code is when the build method is called.
The build() is called when the layer (after it has been initialized) is assigned some sort of input whether it be actual values or just a TensorFlow placeholder.
When using a Keras Sequential model, we add a layer to the model, it automatically assigns the input placeholder to the layer and there by initializing it at the same time.
Thus we see the weights before the calling of compile() or the fit() methods of the Keras Model. (Note that __call__() will automatically build the layer (if it has not been built yet) by calling build())
Regarding Transfer Learning, when we are loading the transferred model, we are loading already built layers, so the build method is not called again when you add the layers to your own model.
In other words, the layers, of the transferred model, already have had the input placeholder assigned to it and the build() method has already been called when the transferred model was being trained.
Useful References:
Keras Layer Doc
TF Tutorial: Custom Layers

The list of Numpy arrays that you are passing to your model is not the size the model expected

I am trying to obtain three different loss functions in Keras by passing them like so
input_img = Input(shape=(728,))
encoded = Dense(450, activation='relu')(input_img)
encoded = Dense(250, activation='relu')(encoded)
encoded= Dense(20, activation='relu')(encoded)
decoded = Dense(250, activation='relu')(encoded)
decoded = Dense(450, activation='relu')(decoded)
decoded = Dense(728, activation='sigmoid')(decoded)
loss1 = Dense(728, activation='sigmoid', name='p1')(decoded)
loss2 = Dense(728, activation='sigmoid', name='p2')(decoded)
loss3 = Dense(728, activation='sigmoid', name='p3')(decoded)
I defined three different loss functions and compiled succesfully
autoencoder = Model(inputs = [input_img], outputs=[loss1,loss2,loss3])
autoencoder.compile(optimizer='Adam', loss = [w_loss,b_loss, loss], metrics = [w_loss,b_loss], loss_weights=[1., 1., 1.])
I then fit the model
history_modified = autoencoder.fit(X_train, X_train, epochs=200, batch_size= 100, shuffle=True, validation_data=(X_test, X_test))
Where X_train dimensions are (100000, 728) and X_test dimensions are (50000,728)
The error I'm getting is
Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 3 array(s), but instead got the following list of 1 arrays: [array([[0., 0., 0., ..., 0., 0., 0.],
I don't what exactly is causing the problem, but I think it might have to do with the layers and how I have multiple loss functions.
As the error says, it has the problem with the shape of the target. Ideally, as there are three outputs (loss1,loss2,loss3) from the model, there should be three arrays in the output to compare the three output arrays. You must be passing only one array in the target and hence this error.

Custom Keras binary_crossentropy loss function not working

I’m trying to re-define keras’s binary_crossentropy loss function so that I can customize it but it’s not giving me the same results as the existing one.
I'm using TF 1.13.1 with Keras 2.2.4.
I went through Keras’s github code. My understanding is that the loss in model.compile(optimizer='adam', loss='binary_crossentropy', metrics =['accuracy']), is defined in losses.py, using binary_crossentropy defined in tensorflow_backend.py.
I ran a dummy data and model to test it. Here are my findings:
The custom loss function outputs the same results as keras’s one
Using the custom loss in a keras model gives different accuracy results
from numpy.random import seed
seed(1)
from tensorflow import set_random_seed
set_random_seed(2)
import tensorflow as tf
from keras import losses
import keras.backend as K
import keras.backend.tensorflow_backend as tfb
from keras.layers import Dense
from keras import Sequential
#Dummy check of loss output
def binary_crossentropy_custom(y_true, y_pred):
return K.mean(binary_crossentropy_custom_tf(y_true, y_pred), axis=-1)
def binary_crossentropy_custom_tf(target, output, from_logits=False):
"""Binary crossentropy between an output tensor and a target tensor.
# Arguments
target: A tensor with the same shape as `output`.
output: A tensor.
from_logits: Whether `output` is expected to be a logits tensor.
By default, we consider that `output`
encodes a probability distribution.
# Returns
A tensor.
"""
# Note: tf.nn.sigmoid_cross_entropy_with_logits
# expects logits, Keras expects probabilities.
if not from_logits:
# transform back to logits
_epsilon = tfb._to_tensor(tfb.epsilon(), output.dtype.base_dtype)
output = tf.clip_by_value(output, _epsilon, 1 - _epsilon)
output = tf.log(output / (1 - output))
return tf.nn.sigmoid_cross_entropy_with_logits(labels=target,
logits=output)
logits = tf.constant([[-3., -2.11, -1.22],
[-0.33, 0.55, 1.44],
[2.33, 3.22, 4.11]])
labels = tf.constant([[1., 1., 1.],
[1., 1., 0.],
[0., 0., 0.]])
custom_sigmoid_cross_entropy_with_logits = binary_crossentropy_custom(labels, logits)
keras_binary_crossentropy = losses.binary_crossentropy(y_true=labels, y_pred=logits)
with tf.Session() as sess:
print('CUSTOM sigmoid_cross_entropy_with_logits: ', sess.run(custom_sigmoid_cross_entropy_with_logits), '\n')
print('KERAS keras_binary_crossentropy: ', sess.run(keras_binary_crossentropy), '\n')
#CUSTOM sigmoid_cross_entropy_with_logits: [16.118095 10.886106 15.942386]
#KERAS keras_binary_crossentropy: [16.118095 10.886106 15.942386]
#Dummy check of model accuracy
X_train = tf.random.uniform((3, 5), minval=0, maxval=1, dtype=tf.dtypes.float32)
labels = tf.constant([[1., 0., 0.],
[0., 0., 1.],
[1., 0., 0.]])
model = Sequential()
#First Hidden Layer
model.add(Dense(5, activation='relu', kernel_initializer='random_normal', input_dim=5))
#Output Layer
model.add(Dense(3, activation='sigmoid', kernel_initializer='random_normal'))
#I ran model.fit for each model.compile below 10 times using the same X_train and provide the range of accuracy measurement
# model.compile(optimizer='adam', loss='binary_crossentropy', metrics =['accuracy']) #0.748 < acc < 0.779
# model.compile(optimizer='adam', loss=losses.binary_crossentropy, metrics =['accuracy']) #0.761 < acc < 0.778
model.compile(optimizer='adam', loss=binary_crossentropy_custom, metrics =['accuracy']) #0.617 < acc < 0.663
history = model.fit(X_train, labels, steps_per_epoch=100, epochs=1)
I'd expect the custom loss function to give similar model accuracy output but it does not. Any idea? Thanks!
Keras automatically selects which accuracy implementation to use according to the loss, and this won't work if you use a custom loss. But in this case you can just explictly use the right accuracy, which is binary_accuracy:
model.compile(optimizer='adam', loss=binary_crossentropy_custom, metrics =['binary_accuracy'])

Tensorflow tf.nn.embedding_lookup

is there a small neural network in tf.nn.embedding_lookup??
When I train some data, a value of the same index is changing.
So is it trained also? while I'm training my model
I checked the official embedding_lookup code but I can not see any tf.Variables for train embedding parameter.
But when I print all tf.Variables then I can found a Variable which is within embedding scope
Thank you.
Yes, the embedding is learned. You can look at the tf.nn.embedding_lookup operation as doing the following matrix multiplication more efficiently:
import tensorflow as tf
import numpy as np
NUM_CATEGORIES, EMBEDDING_SIZE = 5, 3
y = tf.placeholder(name='class_idx', shape=(1,), dtype=tf.int32)
RS = np.random.RandomState(42)
W_em_init = RS.randn(NUM_CATEGORIES, EMBEDDING_SIZE)
W_em = tf.get_variable(name='W_em',
initializer=tf.constant_initializer(W_em_init),
shape=(NUM_CATEGORIES, EMBEDDING_SIZE))
# Using tf.nn.embedding_lookup
y_em_1 = tf.nn.embedding_lookup(W_em, y)
# Using multiplication
y_one_hot = tf.one_hot(y, depth=NUM_CATEGORIES)
y_em_2 = tf.matmul(y_one_hot, W_em)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
sess.run([y_em_1, y_em_2], feed_dict={y: [1.0]})
# [array([[ 1.5230298 , -0.23415338, -0.23413695]], dtype=float32),
# array([[ 1.5230298 , -0.23415338, -0.23413695]], dtype=float32)]
The variable W_em will be trained in exactly the same way irrespective of whether you use y_em_1 or y_em_2 formulation; y_em_1 is likely to be more efficient, though.