Understanding shapes in keras layers - tensorflow

I am learning Tensorflow and Keras to implement LSTM many-to-many model where the length of input sequence is equal to the length of the output sequence.
Sample Code:
Inputs:
voc_size = 10000
embed_dim = 64
lstm_units = 75
size_batch = 30
count_classes = 5
Model:
from tensorflow.keras.layers import (Bidirectional, LSTM,
                                     Dense, Embedding, TimeDistributed)
from tensorflow.keras import Sequential
def sample_build(embed_dim, voc_size, batch_size, lstm_units, count_classes):
    model = Sequential()
    model.add(Embedding(input_dim=voc_size,
                        output_dim=embed_dim, input_length=50))
    model.add(Bidirectional(LSTM(units=lstm_units, return_sequences=True),
                            merge_mode="ave"))
    model.add(Dense(200))
    model.add(TimeDistributed(Dense(count_classes + 1)))
    # Compile model
    model.compile(loss='categorical_crossentropy',
                  optimizer='rmsprop',
                  metrics=['accuracy'])
    model.summary()
    return model
# lstm_units is the variable defined under "Inputs" (rnn_units was undefined)
sample_model = sample_build(embed_dim, voc_size,
                            size_batch, lstm_units,
                            count_classes)
I am having trouble understanding the shapes of input and output for each layer. For example, the shape of the output of Embedding_Layer is (BATCH_SIZE, time_steps, length_of_input) and in this case, it is (30, 50, 64).
Similarly, the output shape of the Bidirectional LSTM layer is (30, 50, 75). This will be the input for the next Dense layer with 200 units. But the shape of the weight matrix of the Dense layer is (number of units in the current layer, number of units in the previous layer), which is (200, 75) in this case. So how does the matrix calculation happen between the 2D weight matrix of the Dense layer and the 3D output of the Bidirectional layer? Any explanation of the shapes would be helpful.

The Dense layer can operate on 3D input. Conceptually, it flattens the input to shape (batch_size * time_steps, features), applies the dense layer, and reshapes the result back to (batch_size, time_steps, units). In your model the kernel has shape (75, 200), so the output is (30, 50, 200). The Keras documentation of the Dense layer says:
Note: If the input to the layer has a rank greater than 2, then Dense computes the dot product between the inputs and the kernel along the last axis of the inputs and axis 1 of the kernel (using tf.tensordot). For example, if input has dimensions (batch_size, d0, d1), then we create a kernel with shape (d1, units), and the kernel operates along axis 2 of the input, on every sub-tensor of shape (1, 1, d1) (there are batch_size * d0 such sub-tensors). The output in this case will have shape (batch_size, d0, units).
Another point regarding the output of the Embedding layer: as you said, it is a 3D output, but the shape corresponds to (batch_size, input_length, output_dim), which is (30, 50, 64) here for a batch of 30. input_dim is the vocabulary size and does not appear in the output shape.
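To make the shape arithmetic concrete, here is a minimal NumPy sketch of what Dense does with a 3D input. NumPy stands in for Keras here, and the random arrays are placeholders for the Bidirectional LSTM output and the Dense kernel, so treat it as an illustration of the shapes rather than the actual Keras implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, time_steps, features, units = 30, 50, 75, 200

# Placeholder for the Bidirectional LSTM output: (batch, time_steps, features)
x = rng.normal(size=(batch_size, time_steps, features))
# Dense kernel with shape (d1, units), as in the quoted documentation note
kernel = rng.normal(size=(features, units))
bias = np.zeros(units)

# Contract the last axis of the input with the first axis of the kernel
out_tensordot = np.tensordot(x, kernel, axes=([2], [0])) + bias

# Equivalent view: flatten to 2D, multiply, reshape back to 3D
out_flat = (x.reshape(-1, features) @ kernel + bias).reshape(
    batch_size, time_steps, units)

print(out_tensordot.shape)                   # (30, 50, 200)
print(np.allclose(out_tensordot, out_flat))  # True
```

Both routes give the same result, which is why the same (75, 200) kernel is applied independently at each of the 50 timesteps.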

Related

How can I multiply a tensor with an unknown dimension to a tensorflow variable?

I'm working in Keras (Tensorflow 2). I'd like to multiply each element of a tensor with its own trainable weight. Let's say that my input tensor is 1D, with 10 elements; so I try to define the input as a Keras input tensor, the weights as a tf.Variable, and I try to use the Keras Multiply layer, thus:
import tensorflow as tf
inputs = tf.keras.layers.Input(shape=(10,), name='inputs')
weights = tf.Variable(tf.random.normal([10]), name='weights')
outputs = tf.keras.layers.Multiply()([inputs, weights])
Now when I inspect the dimensions they are:
inputs: shape=(None, 10)
weights: shape=(10,)
outputs: shape=(10, 10)
The input dimension has a None dimension, for the batch size, which is what I expect and want. However I expected outputs to have shape=(None, 10). Instead, the initial dimension for the batch size seems to have taken a fixed size of 10. How should I correct this?
You need to broadcast weights along dimension 0. To do that, weights needs a leading axis of size 1 that lines up with the batch axis of inputs.
That is, weights must have the shape (1, 10), not (10,).
This can be done using:
weights = tf.Variable(tf.random.normal([1, 10]), name='weights')
or
weights = tf.Variable(tf.random.normal([10]), name='weights')
...
weights = tf.expand_dims(weights, axis=0)
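The broadcasting rule behind this fix can be checked with a small NumPy sketch. NumPy is used as a stand-in for the TensorFlow tensors, and the batch size of 4 is a made-up value, since Keras only knows it as None until data is passed in.

```python
import numpy as np

batch = 4  # hypothetical batch size; Keras shows this axis as None
inputs = np.ones((batch, 10))

weights = np.arange(10, dtype=float)           # shape (10,)
weights_2d = np.expand_dims(weights, axis=0)   # shape (1, 10)

# The size-1 leading axis is stretched to match the batch axis
outputs = inputs * weights_2d

print(weights_2d.shape)  # (1, 10)
print(outputs.shape)     # (4, 10)
```

Every row of outputs is scaled by the same 10 weights, which is the per-element trainable weighting the question asks for.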

Unable to track record by record processing in LSTM algorithm for text classification?

We are working on multi-class text classification, and the following is the process we have used.
1) We created 300-dimensional word vectors with word2vec from our own data and passed them as the weights of the Embedding layer in front of the LSTM.
2) We then used one LSTM layer and one dense layer.
Here is my code:
input_layer = layers.Input((train_seq_x.shape[1], ))
embedding_layer = layers.Embedding(len(word_index)+1, 300, weights=[embedding_matrix], trainable=False)(input_layer)
embedding_layer = layers.SpatialDropout1D(0.3)(embedding_layer)
lstm_layer1 = layers.LSTM(300,return_sequences=True,activation="relu")(embedding_layer)
lstm_layer1 = layers.Dropout(0.5)(lstm_layer1)
flat_layer = layers.Flatten()(lstm_layer1)
output_layer = layers.Dense(33, activation="sigmoid")(flat_layer)
model = models.Model(inputs=input_layer, outputs=output_layer)
model.compile(optimizer=optimizers.Adam(), loss='categorical_crossentropy',metrics=['accuracy'])
Please help me with the questions below:
Q1) Why did we pass the word embedding vectors (300 dimensions) as weights to the embedding layer?
Q2) How can we know the optimal number of neurons in the LSTM layer?
Q3) Can you please explain how a single record is processed by the LSTM?
Please let me know if you require more information.
Q1) Why did we pass word embedding vector(300 dim's) as weights in LSTM embedding layer?
In a very simplistic way, you can think of an embedding layer as a lookup table which converts a word (represented by its index in a dictionary) to a vector. It is a trainable layer. Since you have already trained word embeddings, instead of initializing the embedding layer with random weights you initialize it with the vectors you have learned.
Embedding(len(word_index)+1, 300, weights=[embedding_matrix], trainable=False)(input_layer)
So here you are:
creating an embedding layer, or lookup table, which can look up word indices 0 to len(word_index);
mapping each looked-up word to a vector of size 300;
loading this lookup table with the vectors from "embedding_matrix" (which is a pretrained model);
freezing the weights in this layer with trainable=False.
You have passed 300 because it is the vector size of your pretrained model (embedding_matrix).
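The lookup-table view can be sketched in a few lines of NumPy. The tiny vocabulary and the random matrix below are assumptions for illustration only; in the question, embedding_matrix would hold the word2vec vectors.

```python
import numpy as np

vocab_size, embed_dim = 6, 300  # hypothetical tiny vocabulary
rng = np.random.default_rng(0)
# Stand-in for the pretrained matrix: one 300-dimensional row per word index
embedding_matrix = rng.normal(size=(vocab_size, embed_dim))

# A batch of 2 padded sentences, each given as word indices (0 = padding)
batch_of_indices = np.array([[1, 2, 3, 0],
                             [4, 5, 0, 0]])

# The "embedding layer" is just row lookup by index
embedded = embedding_matrix[batch_of_indices]

print(embedded.shape)  # (2, 4, 300)
```

Each index is replaced by its row of the matrix, so a (batch, sequence) array of integers becomes a (batch, sequence, 300) array of floats.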
Q2) How can we know optimal number of neural in LSTM layer?
You have created an LSTM layer which takes a vector of size 300 as input at each timestep and returns a vector of size 300. The output size and the number of stacked LSTMs are hyperparameters, which are tuned manually (usually using K-fold cross-validation).
Q3) Can you please explain how the single record processing in LSTM algorithm?
A single record (one or more sentences) is converted into indices of the vocabulary, so for every sentence you have an array of indices.
A batch of these sentences is created and fed as input to the model.
The LSTM is unrolled over the sequence: at each timestep it receives the embedding vector of one index as input.
Finally, the output of the LSTM is forward propagated through a final dense layer of size 33. So it looks like each input is mapped to one of 33 classes in your case.
Simple example
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Embedding
from tensorflow.keras.utils import to_categorical
training_data = ["it was a good movie".split(), "it was a bad movie".split()]
training_target = [1, 0]
# Map every word to an integer index; index 0 is kept free for padding
words = sorted({word for s in training_data for word in s})
word_index = {word: i + 1 for i, word in enumerate(words)}
model = Sequential()
# Lookup table: one 50-dimensional vector per word index
model.add(Embedding(len(word_index) + 1, 50))
model.add(LSTM(10, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Each sentence becomes an array of 5 word indices
x = np.array([[word_index[word] for word in s] for s in training_data])
y = to_categorical(training_target)
model.fit(x, y)
model.summary()
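For Q3, the step-by-step processing of one record can be sketched with a hand-written LSTM cell in NumPy. This is only an illustration under assumptions: the weights are random, the sizes are made up, and the gate ordering (input, forget, candidate, output) is a convention chosen for this sketch, not a statement about Keras internals.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(sequence, W, U, b):
    """Run one record through an LSTM cell, one timestep at a time.

    sequence: (time_steps, input_dim) embedded words of one record
    W: (input_dim, 4 * units), U: (units, 4 * units), b: (4 * units,)
    Returns the hidden state at every timestep: (time_steps, units).
    """
    units = U.shape[0]
    h = np.zeros(units)  # hidden state
    c = np.zeros(units)  # cell state
    hidden_states = []
    for x_t in sequence:              # one embedded word per timestep
        z = x_t @ W + h @ U + b       # all four gates in one product
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)                # candidate cell update
        c = f * c + i * g             # forget old memory, add new
        h = o * np.tanh(c)            # expose part of the cell state
        hidden_states.append(h)
    return np.stack(hidden_states)

rng = np.random.default_rng(0)
time_steps, input_dim, units = 5, 8, 3  # hypothetical sizes
record = rng.normal(size=(time_steps, input_dim))
W = rng.normal(size=(input_dim, 4 * units)) * 0.1
U = rng.normal(size=(units, 4 * units)) * 0.1
b = np.zeros(4 * units)

states = lstm_forward(record, W, U, b)
print(states.shape)  # (5, 3)
```

With return_sequences=True the layer would pass on all rows of states; without it, only the last row, states[-1], goes to the next layer.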

CNN features(dimensions) feed to LSTM Tensorflow

Recently I have been working on a project in which I take images as input to a CNN, extract the features, and feed them to an LSTM for training. I am using a 2-layer CNN for feature extraction; I take the features from the fully connected layer and try to feed them to the LSTM. The problem is that when I feed the FC layer to the LSTM as input, I get an error about wrong dimensions. My FC layer is a tensor with dimensions (128, 1024). I tried to reshape it with tf.reshape(fc, [-1]), which gives me a tensor of dimension (131072,), and it still won't work. Could anyone give me ideas on how I am supposed to feed the FC layer to the LSTM? Here is part of my code and the error I get.
# Convolution Layer with 32 filters and a kernel size of 5
conv1 = tf.layers.conv2d(x, 32, 5, activation=tf.nn.relu)
# Max Pooling (down-sampling) with strides of 2 and kernel size of 2
conv1 = tf.layers.max_pooling2d(conv1, 2, 2)
# Convolution Layer with 64 filters and a kernel size of 3
conv2 = tf.layers.conv2d(conv1, 64, 3, activation=tf.nn.relu)
# Max Pooling (down-sampling) with strides of 2 and kernel size of 2
conv2 = tf.layers.max_pooling2d(conv2, 2, 2)
# Flatten the data to a 1-D vector for the fully connected layer
fc1 = tf.contrib.layers.flatten(conv2)
# Fully connected layer (in contrib folder for now)
fc1 = tf.layers.dense(fc1, 1024)
# Apply Dropout (if is_training is False, dropout is not applied)
fc1 = tf.layers.dropout(fc1, rate=dropout, training=is_training)
s = tf.reshape(fc1, [1])
rnn_cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
outputs, states = rnn.static_rnn(rnn_cell, s, dtype=tf.float32)
return tf.matmul(outputs[-1], rnn_weights['out']) + rnn_biases['out']
here is the error:
ValueError: Cannot reshape a tensor with 131072 elements to shape [1] (1 elements) for 'ConvNet/Reshape' (op: 'Reshape') with input shapes: [128,1024], [1] and with input tensors computed as partial shapes: input[1] = [1].
You have a logical error in how you approach the problem. Collapsing the data to a 1D tensor is not going to solve anything (even if you get it to work correctly).
If you are taking a sequence of images as input, your input tensor should be 5D (batch, sequence_index, x, y, channel) or some permutation of that. conv2d should complain about the extra dimension, so you are probably missing one of them. You should fix that first.
Next, use conv3d and max_pool3d with a window of 1 for the depth (since you don't want the different frames to interact at this stage).
When you are done you should still have a 5D tensor, but the x and y dimensions should be 1 (you should check this, and fix the operation if that's not the case).
The RNN part expects 3D tensors (batch, sequence_index, feature_index). You can use tf.squeeze to remove the size-1 dimensions from your 5D tensor and get this 3D tensor. You shouldn't have to reshape anything.
If you don't use batches, that's OK, but the operations will still expect the dimension to be there (for you it will be 1). Missing the dimension will cause problems with shapes down the line.
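The squeeze step described above can be checked on plain arrays. This is a NumPy sketch with made-up sizes; np.squeeze is used as a stand-in for tf.squeeze, and the zero array is a placeholder for the output of the convolutional stack.

```python
import numpy as np

batch, seq_len, features = 2, 8, 1024  # hypothetical sizes

# Placeholder for the CNN output: (batch, sequence_index, x, y, channel)
# with the spatial dimensions x and y already reduced to 1
cnn_out = np.zeros((batch, seq_len, 1, 1, features))

# Drop only the two size-1 spatial axes; batch and sequence stay intact
rnn_in = np.squeeze(cnn_out, axis=(2, 3))

print(rnn_in.shape)  # (2, 8, 1024)
```

Naming the axes explicitly avoids accidentally squeezing the batch axis when the batch size happens to be 1.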

Getting the output of layer as a feature vector (KERAS)

I have a CNN model in keras (used for signal classification):
cnn = Sequential()
cnn.add(Conv1D(10,kernel_size=8,strides=4, padding="same",activation="relu",input_shape=(Dimension_of_input,1)))
cnn.add(MaxPooling1D(pool_size=3))
cnn.add(Conv1D(10,kernel_size=8,strides=4, padding="same",activation="relu"))
cnn.add(MaxPooling1D(2))
cnn.add(Flatten())
cnn.add(Dense(2, activation="softmax"))
Using the method 'model.summary()', I can get the shape of the output of each layer. In my model, the output of the last max pooling layer is (None, 1, 30) and of flatten layer is (None, 30).
For each train and test sample: Is it possible in keras to get the output of the flatten layer as a feature vector with the 30 features (numbers), before it is given as input to the dense layer??
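Select the output of the Flatten layer (the second-to-last layer; cnn.layers[-1] is the final Dense layer) by:
flat_output = cnn.layers[-2].output
then create a new model that reuses the input of the trained model:
features = Model(inputs=cnn.input, outputs=flat_output)
So,
feature_vec = features.predict(x_train)
gives you the output of the flatten layer as a feature vector for each training sample. x_train must have the same shape the CNN was trained on, (n_samples, Dimension_of_input, 1).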

Convolutional neural network Conv1d input shape

I am trying to create a CNN to classify data. My Data is X[N_data, N_features]
I want to create a neural net capable of classifying it. My problem is concerning the input shape of a Conv1D for the keras back end.
I want to repeat a filter over, let's say, 10 features and then keep the same weights for the next ten features.
For each data point, my convolutional layer would create N_features/10 new neurons.
How can I do so? What should I put in input_shape?
def cnn_model():
    model = Sequential()
    model.add(Conv1D(filters=1, kernel_size=10, strides=10,
                     input_shape=(1, 1, N_features), kernel_initializer='uniform',
                     activation='relu'))
    model.flatten()
    model.add(Dense(N_features/10, init='uniform', activation='relu'))
Any advice?
thank you!
Try:
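def cnn_model():
    model = Sequential()
    # Input is (steps, channels): N_features steps with 1 channel each
    model.add(Conv1D(filters=1, kernel_size=10, strides=10,
                     input_shape=(N_features, 1), kernel_initializer='uniform',
                     activation='relu'))
    # Flatten is a layer, not a method of the model
    model.add(Flatten())
    # Dense needs an integer number of units
    model.add(Dense(N_features // 10, kernel_initializer='uniform', activation='relu'))
    ....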
And reshape your x to shape (nb_of_examples, nb_of_features, 1).
EDIT:
Conv1D was designed for sequence analysis: the convolutional filters are the same no matter which part of the sequence they are applied to. The second dimension is the so-called features dimension, where you can have a vector of multiple features at each timestep. One may think of the sequence dimension as analogous to the spatial dimensions, and of the feature dimension as analogous to the channel or color dimension, in Conv2D. As @putonspectacles mentioned in his comment, you may set the sequence dimension to None in order to make your network invariant to input length.
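To see what a single filter with kernel_size=10 and strides=10 computes on input reshaped to (nb_of_examples, nb_of_features, 1), here is a NumPy sketch. The sizes and random weights are assumptions, and it relies on the stride being equal to the kernel size and on nb_of_features being divisible by 10, so the windows do not overlap.

```python
import numpy as np

rng = np.random.default_rng(0)
n_data, n_features, kernel_size = 3, 40, 10  # hypothetical sizes

x = rng.normal(size=(n_data, n_features))
x3d = x.reshape(n_data, n_features, 1)  # (examples, steps, channels)

w = rng.normal(size=(kernel_size,))  # one filter over one channel
b = 0.0

# stride == kernel_size, so the input splits into non-overlapping windows
windows = x3d[:, :, 0].reshape(n_data, n_features // kernel_size, kernel_size)

# The same 10 weights are applied to every window, followed by relu
conv_out = np.maximum(windows @ w + b, 0.0)[..., np.newaxis]

print(conv_out.shape)  # (3, 4, 1)
```

The output has N_features / 10 positions per example, each produced by the same shared weights, which matches what the question asks for.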
@Marcin's answer might work, but my suggestion, given the documentation here:
When using this layer as the first layer in a model, provide an
input_shape argument (tuple of integers or None, e.g. (10, 128) for
sequences of 10 vectors of 128-dimensional vectors, or (None, 128) for
variable-length sequences of 128-dimensional vectors.
would be:
model = Sequential()
model.add(Conv1D(filters=1, kernel_size=10, strides=10,
                 input_shape=(None, N_features), kernel_initializer='uniform',
                 activation='relu'))
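Note that input_shape does not include the batch axis, so the None here leaves the sequence length (the timesteps axis) unspecified, and N_features is treated as the number of channels at each timestep. The input data must therefore be 3D, (N_data, timesteps, N_features). The strides argument controls the step between successive filter positions along the timesteps axis.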
To feed ordinary tabular data of shape (nrows, ncols) to Keras's Conv1D, the following 2 steps are needed:
# reshape returns a new array, so assign the result
xtrain = xtrain.reshape(nrows, ncols, 1)
# For the Conv1D layer:
input_shape = (ncols, 1)
For example, take the first 4 features of the iris dataset.
To see the usual format and its shape:
iris_array = np.array(irisdf.iloc[:,:4].values)
print(iris_array[:5])
print(iris_array.shape)
The output shows the usual format and its shape:
[[5.1 3.5 1.4 0.2]
[4.9 3. 1.4 0.2]
[4.7 3.2 1.3 0.2]
[4.6 3.1 1.5 0.2]
[5. 3.6 1.4 0.2]]
(150, 4)
The following code alters the format:
nrows, ncols = iris_array.shape
iris_array = iris_array.reshape(nrows, ncols, 1)
print(iris_array[:5])
print(iris_array.shape)
Output of the above code, showing the new data format and its shape:
[[[5.1]
  [3.5]
  [1.4]
  [0.2]]
 [[4.9]
  [3. ]
  [1.4]
  [0.2]]
 [[4.7]
  [3.2]
  [1.3]
  [0.2]]
 [[4.6]
  [3.1]
  [1.5]
  [0.2]]
 [[5. ]
  [3.6]
  [1.4]
  [0.2]]]
(150, 4, 1)
This format works well with Keras's Conv1D. For this data, input_shape=(4, 1) is needed.