CNN with LSTM-Layer - tensorflow

I have implemented a CNN with an LSTM layer. My input consists of four images. The images were transformed into a tensor by feature extraction. The input shape is (4,256,256,3).
The following is the structure of my model:
model = keras.models.Sequential()
model.add(TimeDistributed(Conv2D(32,(3,3),padding = 'same', activation = 'relu'),input_shape = (4,256,256,3)))
model.add(TimeDistributed(Conv2D(64,(3,3),padding = 'same', activation = 'relu')))
model.add(TimeDistributed(Conv2D(128,(3,3),padding = 'same', activation = 'relu')))
model.add(LSTM(128, activation='tanh'))  # finalize with standard Dense, Dropout...
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='relu'))
optim = keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=optim, loss=['MSE'])
history = model.fit(x=X, y=Y, batch_size=4, epochs=5, validation_split=0.2, validation_data=(X,Y))
My problem is that my model predicts the same values for all inputs.
What could be the problem?

You are using the same data for training and validation, which defeats the whole point of validation. Perhaps the mistake lies there. Try splitting the data, or apply cross-validation.
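As a rough illustration of the split suggested here, the following is a minimal NumPy sketch. The arrays X and Y are random stand-ins (the images are shrunk to 8x8 to keep it light), and the 80/20 ratio is an assumption. Note that Keras lets validation_data override validation_split, so only one of the two should be passed to fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: 20 samples, each a sequence of 4 small RGB images (assumed shapes)
X = rng.random((20, 4, 8, 8, 3))
Y = rng.random((20, 1))

# Shuffle the sample indices once, then hold out the first 20% for validation
indices = rng.permutation(len(X))
n_val = int(0.2 * len(X))
val_idx, train_idx = indices[:n_val], indices[n_val:]

X_train, Y_train = X[train_idx], Y[train_idx]
X_val, Y_val = X[val_idx], Y[val_idx]

# The two index sets must not overlap, otherwise validation tells you nothing
overlap = np.intersect1d(train_idx, val_idx)
print(len(X_train), len(X_val), len(overlap))
```

The held-out arrays would then go into fit as validation_data=(X_val, Y_val), with validation_split left out.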
Also, applying the ReLU activation function to the last layer in combination with the MSE loss looks strange. At the very least, ReLU can produce unbounded outputs, and the data should be normalized.
I hope this helps.
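To make the ReLU concern concrete: one possible (not confirmed) explanation for constant predictions is that the final ReLU unit receives a negative pre-activation for every input. In that case both its output and its gradient are zero, so training cannot move it. The numbers below are made up purely for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def relu_grad(z):
    # Derivative of ReLU: 1 where z > 0, else 0
    return (z > 0).astype(float)

# Made-up pre-activations of the final Dense(1) unit, all negative
z = np.array([-2.0, -0.5, -0.1, -3.0, -1.2])
y_true = np.array([0.3, 0.7, 0.5, 0.9, 0.1])

y_pred = relu(z)

# Gradient of the mean squared error with respect to each pre-activation:
# d/dz mean((relu(z) - y)^2) = 2 * (relu(z) - y) / n * relu'(z)
grad_z = 2.0 * (y_pred - y_true) / len(z) * relu_grad(z)

print(y_pred)   # the same prediction for every input
print(grad_z)   # no gradient signal reaches the earlier layers
```

With a linear output unit the last factor would be 1 instead of relu'(z), so the gradient would not vanish in this situation.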

If you are working on a classification problem, specifically binary classification, then use a sigmoid activation on the output layer instead of softmax. MSE loss is also not a good choice for binary classification; binary cross-entropy is the usual one.
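A small numeric sketch of why MSE is usually considered a poor match for a sigmoid output: for a confidently wrong prediction, the gradient with respect to the logit is much smaller under MSE than under binary cross-entropy. The logit value -5.0 is an arbitrary example, not taken from the question.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One positive example (label 1) that the model gets confidently wrong
y = 1.0
z = -5.0            # logit fed into the sigmoid output unit
p = sigmoid(z)      # predicted probability, close to 0

# Loss values
bce = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
mse = (p - y) ** 2

# Gradients with respect to the logit z
grad_bce = p - y                            # sigmoid + cross-entropy simplifies to p - y
grad_mse = 2.0 * (p - y) * p * (1.0 - p)    # chain rule through the sigmoid

print(bce, mse)
print(grad_bce, grad_mse)
```

The extra p * (1 - p) factor in the MSE gradient shrinks towards zero exactly when the sigmoid saturates, which slows down learning.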


One back-propagation pass in keras [duplicate]

I am interested in building reinforcement learning models with the simplicity of the Keras API. Unfortunately, I am unable to extract the gradient of the output (not error) with respect to the weights. I found the following code that performs a similar function (from "Saliency maps of neural networks (using Keras)"):
get_output = theano.function([model.layers[0].input],model.layers[-1].output,allow_input_downcast=True)
fx = theano.function([model.layers[0].input] ,T.jacobian(model.layers[-1].output.flatten(),model.layers[0].input), allow_input_downcast=True)
grad = fx([trainingData])
Any ideas on how to calculate the gradient of the model output with respect to the weights for each layer would be appreciated.
To get the gradients of model output with respect to weights using Keras you have to use the Keras backend module. I created this simple example to illustrate exactly what to do:
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras import backend as k
model = Sequential()
model.add(Dense(12, input_dim=8, init='uniform', activation='relu'))
model.add(Dense(8, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
To calculate the gradients we first need to find the output tensor. For the output of the model (what my initial question asked) we simply call model.output. We can also find the gradients of outputs for other layers by calling model.layers[index].output
outputTensor = model.output #Or model.layers[index].output
Then we need to choose the variables with respect to which the gradient is taken.
listOfVariableTensors = model.trainable_weights
#or variableTensors = model.trainable_weights[0]
We can now calculate the gradients. It is as easy as the following:
gradients = k.gradients(outputTensor, listOfVariableTensors)
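To actually evaluate the gradients for a given input, we need to use a bit of TensorFlow.
import numpy as np
import tensorflow as tf
trainingExample = np.random.random((1,8))
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())  # initialize the (untrained) weights in this session
evaluated_gradients = sess.run(gradients,feed_dict={model.input:trainingExample})
And that's it!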
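The session-based code above targets TensorFlow 1.x. In TensorFlow 2.x, tf.InteractiveSession is no longer in the top-level API and k.gradients does not work in eager mode, so a rough equivalent uses tf.GradientTape. This is only a sketch under the assumption that tf.keras is being used; the model simply mirrors the three Dense layers above, and the input is random.

```python
import numpy as np
import tensorflow as tf

# Same layer sizes as the example model, written with tf.keras
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(12, activation="relu"),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# A single random input example with 8 features
training_example = tf.constant(np.random.random((1, 8)), dtype=tf.float32)

# Record the forward pass so the output can be differentiated
with tf.GradientTape() as tape:
    output = model(training_example)

# One gradient tensor per trainable weight (kernel and bias of each layer)
gradients = tape.gradient(output, model.trainable_weights)

for weight, grad in zip(model.trainable_weights, gradients):
    print(tuple(weight.shape), tuple(grad.shape))
```

If the output has more than one element, tape.gradient differentiates the sum of those elements; for a single example with one output unit, as here, that is just the gradient of that output.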
The answer below uses the binary cross-entropy loss; feel free to substitute your own loss function.
outputTensor = model.output
listOfVariableTensors = model.trainable_weights
bce = keras.losses.BinaryCrossentropy()
loss = bce(labels, outputTensor)  # Keras losses take (y_true, y_pred); labels holds the targets
gradients = k.gradients(loss, listOfVariableTensors)
sess = tf.InteractiveSession()
evaluated_gradients = sess.run(gradients,feed_dict={model.input:training_data1})
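For TensorFlow 2.x, the gradient of a loss with respect to the weights could be sketched along the following lines. The model, the batch of features and the labels are all stand-ins invented for this example, and tf.keras is assumed.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(12, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Random stand-in batch: 4 examples with 8 features, plus binary labels
features = tf.constant(np.random.random((4, 8)), dtype=tf.float32)
labels = tf.constant([[0.0], [1.0], [1.0], [0.0]])

bce = tf.keras.losses.BinaryCrossentropy()

with tf.GradientTape() as tape:
    predictions = model(features)
    loss = bce(labels, predictions)   # argument order is (y_true, y_pred)

# Gradient of the scalar loss with respect to every trainable weight
loss_gradients = tape.gradient(loss, model.trainable_weights)
print(float(loss), [tuple(g.shape) for g in loss_gradients])
```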

Multivariate Binary Classification Prediction Tensorflow 2 LSTM

I am currently working on the implementation of an LSTM to predict a binary outcome (either 0 or 1) for a given set of normalized and scaled features.
self._regressor.add(LSTM(units=60, activation='relu', return_sequences=True, input_shape=(data.x_train.shape[1], data.x_train.shape[2])))
self._regressor.add(LSTM(units=60, activation='relu', return_sequences=True))
self._regressor.add(LSTM(units=80, activation='relu', return_sequences=True))
self._regressor.add(LSTM(units=120, activation='relu'))
#this is the output layer
self._regressor.add(Dense(units=1, activation='sigmoid'))
print("TensorFlow Summary\n {}".format(self._regressor.summary()))
#run regressor
self._regressor.compile(optimizer='adam', loss="binary_crossentropy", metrics=['accuracy'])
self._regressor.fit(data.x_train, data.y_train, epochs=1, batch_size=32)
data.y_pred_scaled = self._regressor.predict(data.x_test)
data.y_pred = self._scaler_target.inverse_transform(data.y_pred_scaled)
scores = self._regressor.evaluate(data.x_test, data.y_test, verbose=0)
My issue here is that the output of my prediction has a range of max: 0.5188445 and min: 0.518052, implying to me that all of my classifications are positive (which is definitely incorrect). I even tried predict_classes and this yielded an array of 1's.
I am struggling to find where my issue is despite numerous searches online. I have ensured that my final output layer uses a sigmoid function and that the loss is binary_crossentropy. My data has been scaled using sklearn's MinMaxScaler with feature_range=(0,1). I am running my code through a debugger and everything up to the prediction step looks good so far. I am just struggling with quantifying the output of the predictions.
Any help would be greatly appreciated.
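For reference, here is a small sketch of how sigmoid outputs are usually turned into class labels, and why the reported range yields only 1's. The array is a stand-in built from the min and max values quoted in the question; it is not real model output.

```python
import numpy as np

# Stand-in for the output of predict(): sigmoid probabilities spanning
# the min/max range reported in the question
y_prob = np.array([[0.518052], [0.5184], [0.5188445]])

# Convert probabilities to class labels with the usual 0.5 threshold
y_class = (y_prob > 0.5).astype(int)

# A very small spread means the output barely changes from one input to the next
spread = y_prob.max() - y_prob.min()

print(y_class.ravel())
print(spread)
```

Every value sits just above 0.5, so thresholding maps all of them to class 1. The tiny spread suggests the network output is almost independent of its input, which is consistent with a model that has hardly trained (the code above runs a single epoch).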

How to build a pretrained CNN-LSTM network with Keras

I'm trying to use a CNN-LSTM network with Keras in order to analyze videos. I read about it and ran into the TimeDistributed wrapper and some examples.
I tried the network described below, which is composed of convolutional and pooling layers followed by recurrent and dense layers.
model = Sequential()
model.add(TimeDistributed(Conv2D(2, (2,2), activation= 'relu' ), input_shape=(None, IMG_SIZE, IMG_SIZE, 3)))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(Dense(50, activation = 'softmax'))
model.compile(loss = 'categorical_crossentropy' , optimizer = 'adam' , metrics = ['acc'])
I haven't tested the model properly, since my dataset is too small. However, during training the network reaches an accuracy of 0.98 in 4-5 epochs (perhaps it is overfitting, but that isn't a problem yet because I hope to get a suitable dataset later).
Then I read about how to use a pretrained convolutional network (MobileNet, ResNet or Inception) as a feature extractor for an LSTM network, so I used the following code:
inputs = Input(shape = (frames, IMG_SIZE, IMG_SIZE, 3))
cnn_base = InceptionV3(include_top = False, weights='imagenet', input_shape = (IMG_SIZE, IMG_SIZE, 3))
cnn_out = GlobalAveragePooling2D()(cnn_base.output)
cnn = Model(inputs=cnn_base.input, outputs=cnn_out)
encoded_frames = TimeDistributed(cnn)(inputs)
encoded_sequence = LSTM(256)(encoded_frames)
hidden_layer = Dense(1024, activation="relu")(encoded_sequence)
outputs = Dense(50, activation="softmax")(hidden_layer)
model = Model([inputs], outputs)
In this case, when training the model it always shows accuracy ~0.02 (it is the baseline 1/50).
Since the first model at least learned something, I am wondering if there is an error in the way the network is built in the second case.
Has anybody faced this situation? Any advice?
Thank you.
The reason is that you have a very small amount of data and are retraining all of the Inception V3 weights. You either have to train the model on more data, or train it for more epochs with hyperparameter tuning.
The ideal way is to freeze the base model with base_model.trainable = False (base_model is cnn_base in your code) and train only the new layers that you added on top of the Inception V3 layers.
If you then want to fine-tune, unfreeze the top layers of the base model (the Inception V3 layers) and keep the bottom layers un-trainable. You can do it as below:
# Let's take a look to see how many layers are in the base model
print("Number of layers in the base model: ", len(base_model.layers))
# Fine-tune from this layer onwards
fine_tune_at = 100
# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False
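The two stages described above (freeze everything, then unfreeze only the top of the base) can be sketched end to end as follows. To keep it light and avoid downloading weights, a tiny two-layer CNN stands in for InceptionV3; FRAMES, IMG_SIZE, the layer sizes and fine_tune_at = 1 are assumptions chosen for this sketch, not values from the question.

```python
import tensorflow as tf
from tensorflow.keras import layers

FRAMES, IMG_SIZE, NUM_CLASSES = 4, 32, 50   # assumed values for the sketch

# Small stand-in for the pretrained base (InceptionV3 in the question)
base_model = tf.keras.Sequential([
    tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3)),
    layers.Conv2D(8, 3, activation="relu"),
    layers.Conv2D(16, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
])

# Stage 1: freeze every weight in the base, train only the new layers
base_model.trainable = False

inputs = tf.keras.Input(shape=(FRAMES, IMG_SIZE, IMG_SIZE, 3))
encoded_frames = layers.TimeDistributed(base_model)(inputs)
encoded_sequence = layers.LSTM(32)(encoded_frames)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(encoded_sequence)
model = tf.keras.Model(inputs, outputs)

n_trainable_frozen = len(model.trainable_weights)   # LSTM + Dense weights only

# Stage 2: unfreeze the base, then re-freeze everything below fine_tune_at
base_model.trainable = True
fine_tune_at = 1
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False

n_trainable_finetune = len(model.trainable_weights)

# Changes to trainable take effect for training after compiling again
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["acc"])

print(n_trainable_frozen, n_trainable_finetune)
```

With the real InceptionV3 base the same pattern applies, only with far more layers, which is why the answer picks a larger fine_tune_at.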

How to specify number of layers in keras?

I'm trying to define a fully connected neural network in Keras using the TensorFlow backend. I have some sample code but I don't know what it means.
model = Sequential()
model.add(Dense(10, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(50, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(20, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(10, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal'))
From the above code I want to know the number of inputs to my network, the number of outputs, the number of hidden layers and the number of neurons in each layer. And what is the number that comes right after model.add(Dense( ? Assume x.shape[1]=60.
What is the name of this network exactly? Should I call it a fully connected network or a convolutional network?
That should be quite easy.
To inspect the model's inputs and outputs, use:
input_tensor = model.input
output_tensor = model.output
You can print these tf.Tensor objects to see their shape and dtype.
To fetch the layers of a model, use:
layers = model.layers
print(layers[0].units)
With these tricks you can easily get the input and output tensors of a model or of any of its layers.
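To answer the counting part of the question directly, here is a plain-Python sketch that reads the sizes off the Dense(...) calls, assuming x.shape[1] == 60 as stated. The first argument of Dense is the number of units (neurons) in that layer, and since the model contains only Dense layers it is a fully connected network, not a convolutional one.

```python
# Layer widths read off the Dense(...) calls in the question, assuming x.shape[1] == 60
n_inputs = 60
layer_units = [10, 50, 20, 10, 1]   # first argument of each Dense layer

hidden_units = layer_units[:-1]     # every Dense layer except the last one
n_outputs = layer_units[-1]

# A Dense layer with n_in inputs and n_out units has n_in * n_out weights plus n_out biases
params_per_layer = []
n_in = n_inputs
for n_out in layer_units:
    params_per_layer.append(n_in * n_out + n_out)
    n_in = n_out

print("inputs:", n_inputs)
print("hidden layers:", len(hidden_units), hidden_units)
print("outputs:", n_outputs)
print("parameters per layer:", params_per_layer, "total:", sum(params_per_layer))
```

So the network has 60 inputs, four hidden layers with 10, 50, 20 and 10 neurons, and a single output. The same per-layer parameter counts should show up in model.summary().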

RNN Not Generalizing on Text Classification

I am using Keras and an RNN to classify Slack text data by whether the text is reaction-worthy or not (1 - emoji, 0 - no emoji). I have removed usernames and URLs from the text and dropped duplicates that had different target values.
I am not able to get the model to generalize to unseen data. The losses on the train/val sets look good and decrease continually, but the accuracy on the val set only decreases.
I am using a pretrained GloVe word embedding since my training set is only about 25,000 sentences.
I have added additional layers, changed my regularization value and increased dropout, but I get similar results. Is my model not complex enough to generalize to the data? When I added additional layers, they were much smaller but deeper, because the training time was about 2 min per epoch.
Any insight would be appreciated.
embedding_layer = Embedding(len(word_index) + 1,
# Creating the Model
model = Sequential()
model.add(Convolution1D(filters=32, kernel_size=3, padding='same', activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compiling the model with our given Optimizer
optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.000025)
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
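The Embedding call above is cut off, so as a hedged illustration only, this is how a GloVe weight matrix with len(word_index) + 1 rows is commonly assembled before being handed to an Embedding layer. The vocabulary, the vectors and EMBEDDING_DIM are all made up for this sketch; real vectors would be parsed from a GloVe text file.

```python
import numpy as np

EMBEDDING_DIM = 4   # assumed; real GloVe vectors come in sizes such as 50, 100, 200, 300

# Hypothetical tokenizer vocabulary (word -> integer index, starting at 1)
word_index = {"hello": 1, "team": 2, "deploy": 3}

# Hypothetical pretrained vectors, normally parsed from a GloVe text file
glove_vectors = {
    "hello": np.array([0.1, 0.2, 0.3, 0.4]),
    "team": np.array([0.5, 0.6, 0.7, 0.8]),
}

# Row 0 is reserved for padding, hence len(word_index) + 1 rows
embedding_matrix = np.zeros((len(word_index) + 1, EMBEDDING_DIM))
for word, i in word_index.items():
    vector = glove_vectors.get(word)
    if vector is not None:
        embedding_matrix[i] = vector   # words missing from GloVe stay all-zero

# Fraction of the vocabulary that actually has a pretrained vector
coverage = np.mean([w in glove_vectors for w in word_index])
print(embedding_matrix.shape, coverage)
```

Checking the coverage can be worthwhile for chat data: informal spellings and slang often have no pretrained vector, and every such word ends up with the same all-zero row.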