How to expand output of embedding layer in keras - tensorflow

I have the following network:
model = Sequential()
model.add(Embedding(400000, 100, weights=[emb], input_length=12, trainable=False))
model.add(Conv2D(256,(2,2),activation='relu'))
The output from the embedding layer has shape (batchSize, 12, 100). The Conv2D layer requires an input of shape (batchSize, filter, 12, 100), and I get the following error:
Input 0 is incompatible with layer conv2d_1: expected ndim=4, found ndim=3
So, how can I expand the output from the embedding layer to make it proper for the Conv2D layer?
I'm using Keras with Tensorflow as the back end.

Adding a Reshape layer should be the way to go: https://keras.io/layers/core/#reshape
Depending on the concrete situation, Conv1D could also work.

I managed to add another dimension with the following piece of code:
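from keras import backend as K
from keras.layers import Lambda
model = Sequential()
model.add(Embedding(400000, 100, weights=[emb], input_length=12, trainable=False))
# Append a channel axis: (batchSize, 12, 100) -> (batchSize, 12, 100, 1)
model.add(Lambda(lambda x: K.expand_dims(x, 3)))
model.add(Conv2D(256,(2,2),activation='relu'))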
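The shape change performed by the Lambda step can be checked outside Keras. The following is a minimal NumPy sketch, assuming a hypothetical batch of 4 sequences; NumPy stands in for the Keras backend here, and `batch` is an illustrative name that does not appear in the question.

```python
import numpy as np

# Hypothetical stand-in for the Embedding output: (batchSize, 12, 100).
batch = np.zeros((4, 12, 100), dtype=np.float32)

# Append a trailing channel axis, mirroring expand_dims(x, 3) in the Lambda layer.
expanded = np.expand_dims(batch, 3)

# Reshaping every sample to (12, 100, 1) produces the same shape.
reshaped = batch.reshape((-1, 12, 100, 1))

print(expanded.shape)  # (4, 12, 100, 1)
print(reshaped.shape)  # (4, 12, 100, 1)
```

In Keras itself, the Reshape layer suggested above with a target shape of (12, 100, 1) should have the same effect as the Lambda layer.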

Related

Tensorflow Keras output layer shape weird error

I am fairly new to TF, Keras and ML in general.
I am trying to implement a very simple MLP with an input shape of (batch_size,3,2) and an output shape of (batch_size,3), that is (if I got it right): for every 3x2 feature, there is a corresponding 3 value array label.
Here is how I create the model:
model = tf.keras.Sequential([
tf.keras.layers.Dense(50,tf.keras.activations.relu,input_shape=(3,2)),
tf.keras.layers.Dense(3)
])
and these are the X and y shapes:
X_train.shape,y_train.shape
TensorShape([64,3,2]),TensorShape([64,3])
On model.fit I am facing a weird error I cannot understand:
ValueError: Dimensions must be equal, but are 3 and 32 for ... with input shapes: [32,3,3] and [32,3]
I have no clue what's going on. I understand the batch size is 32, but where does that [32,3,3] come from?
Moreover, if from the original 64, I lower the number (shapes) of X_train and y_train, say, to: (19,3,2) and (19,3), I get the following error instead:
InvalidArgumentError: required broadcastable shapes at loc(unknown)
What's even more weird for me is that if I specify a single unit for the output (last) layer, instead of 3 like this:
model = tf.keras.Sequential([
tf.keras.layers.Dense(50,tf.keras.activations.relu,input_shape=(3,2)),
tf.keras.layers.Dense(1)
])
model.fit works, but the predictions have shape (1,3,1) instead of my expected (3,)
I am very confused.
Whenever you are unsure how data flows through your model, call model.summary() to see the output shape of every layer.
In this case, each input sample is a 2D array of shape (3,2) and each label is a 1D array of shape (3,), and you only used Dense layers. A Dense layer acts on the last axis only, so it does not collapse 2D features: an input of shape (batch_size,3,2) becomes (batch_size,3,50) after the first layer and (batch_size,3,3) after the second, which is the [32,3,3] in the error. For an image as input, for example, you should not feed it directly to a Dense layer. Instead, use other layers such as Conv2D, or Flatten your input (make it 1D) before feeding it to the Dense layer. Otherwise the extra dimension carries through to the output.
In general: if your input and output dimensionality differ, the shape has to change somewhere in your model. The most common ways to do this are a Flatten layer or a global pooling layer such as GlobalAveragePooling.
In your case, the input should be flattened before it reaches the first Dense layer. There are 2 ways to deal with this:
Way 1: Adding a Flatten layer as the first layer of your model:
model = Sequential()
model.add(Flatten(input_shape=(3,2)))
model.add(Dense(50, 'relu'))
model.add(Dense(3))
Way 2: Converting each 3x2 sample to a 6-value 1D vector before passing the inputs to your model, while keeping the batch dimension:
X_train = tf.reshape(X_train, shape=(-1, 6))
Then change the input shape of the first layer as:
model.add(Dense(50, 'relu', input_shape=(6,)))
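To see where the [32,3,3] shape in the error comes from, it may help to reproduce the arithmetic of the Dense layers by hand. The sketch below uses NumPy with random weights; the weight matrices `W1`, `W2` and `W1_flat` are hypothetical stand-ins for what Keras would create, and biases are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3, 2))        # same shape as X_train in the question

# Hypothetical weights for Dense(50) and Dense(3).
W1 = rng.normal(size=(2, 50))
W2 = rng.normal(size=(50, 3))

# Without Flatten: the matrix product acts on the last axis only.
hidden = np.maximum(X @ W1, 0.0)       # ReLU, shape (64, 3, 50)
out_unflattened = hidden @ W2          # shape (64, 3, 3)

# With Flatten: every sample becomes a 6-value vector first.
X_flat = X.reshape((X.shape[0], -1))   # shape (64, 6)
W1_flat = rng.normal(size=(6, 50))
out_flat = np.maximum(X_flat @ W1_flat, 0.0) @ W2   # shape (64, 3)

print(out_unflattened.shape)  # (64, 3, 3)
print(out_flat.shape)         # (64, 3)
```

Only the flattened version produces one 3-value output per sample, which is the shape the labels have.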

Keras: ValueError: Input 0 of layer sequential is incompatible with the layer: expected axis

I am trying to build a neural network using Keras but am getting the error:
ValueError: Input 0 of layer sequential is incompatible with the layer: expected axis -1 of input shape to have value 25168 but received input with shape (None, 34783)
I defined the model to be:
model = Sequential()
model.add(Dense(1024, input_dim = len(X), activation = 'relu'))
model.add(Dense(6, activation='softmax'))
In this, X is the result of applying scikit-learn's CountVectorizer() (after it has been fitted) as follows:
X = count_vectorizer.transform(X).todense()
Is there any method to fix this? Looking around I found that I might need to reshape the data, however I have no idea how and where.
You are passing the number of samples, len(X) (the same as X.shape[0]), as input_dim, which is wrong.
Keras expects input_dim to be the number of features per sample, which for your 2D input is X.shape[-1].
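The difference between the two values is easy to check on a small array. This is a minimal sketch with a hypothetical toy matrix of 5 documents and 8 vocabulary terms; the real matrix in the question is far larger.

```python
import numpy as np

# Hypothetical toy matrix: 5 documents (rows), 8 vocabulary terms (columns).
X = np.zeros((5, 8))

n_samples = len(X)          # number of rows, same as X.shape[0]
n_features = X.shape[-1]    # number of columns, the value input_dim should receive

print(n_samples, n_features)  # 5 8
```

With this change, the first layer from the question would read Dense(1024, input_dim=X.shape[-1], activation='relu').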

Why do we need to put one more layer and where is the softmax activation function?

I'm reading and testing the basic CNN example from the TensorFlow tutorial website.
The model from the tutorial looks like this:
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
# 1.why do we need the next line ?
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))
Two basic questions:
We are building CNN network.
Why do we need the last layer (model.add(layers.Dense(64, activation='relu'))) ?
It is not part of the CNN network, and from my tests, I'm getting same results (accuracy) with or without this last layer
In the tutorial they wrote that they used softmax in the last layer:
"CIFAR has 10 output classes, so you use a final Dense layer with 10 outputs and a softmax activation"
but they didn't use softmax in their code.
I checked the documentation, and the default activation function is None, not softmax. So does the tutorial have a mistake, and is softmax not actually used?
Convolutional Neural Network (CNN)
A CNN consists of (conv-pool) x n, then (flatten or global pool), then (Dense) x m, where the (conv-pool) x n part extracts the features from a 2D signal and the (Dense) x m part selects among the features from the previous layers.
The output of the last convolutional layer is (4,4,64), that is, 64 feature maps of size 4 × 4 (2D signals). We then flatten them to get a 4 × 4 × 64 = 1024-dim vector (alternatively, global max/avg pooling gives a 64-dim vector). With Flatten you have a 1024-dim vector and 10 classes, so mapping directly from 1024 to 10 drastically reduces the dimension in a single step, which can lose important features. This is known as a 'representational bottleneck'. To avoid this, you can insert a Dense layer (with, say, 64 neurons) that first projects the 1024-dim vector to a 64-dim vector, followed by the final layer from 64 dims to 10 dims. If you use global max/avg pooling, you can skip the additional Dense layer. In your case it seems that the representational bottleneck is avoided.
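The (4,4,64) shape and the 1024-dim vector can be traced by hand. The sketch below assumes the Keras defaults used in the tutorial code: 'valid' padding with stride 1 for Conv2D, and a pooling stride equal to the pool size for MaxPooling2D. The helper names `conv_out` and `pool_out` are illustrative, not part of any library.

```python
def conv_out(size, kernel):
    """Output length of a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output length of non-overlapping pooling (floor division)."""
    return size // pool


size = 32                    # input images are 32 x 32
size = conv_out(size, 3)     # Conv2D(32, (3, 3))   -> 30
size = pool_out(size, 2)     # MaxPooling2D((2, 2)) -> 15
size = conv_out(size, 3)     # Conv2D(64, (3, 3))   -> 13
size = pool_out(size, 2)     # MaxPooling2D((2, 2)) -> 6
size = conv_out(size, 3)     # Conv2D(64, (3, 3))   -> 4

channels = 64                # filters in the last Conv2D layer
flat_dim = size * size * channels

print(size, flat_dim)        # 4 1024
```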
The tutorial is using
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['accuracy'])
TensorFlow has an efficient implementation that computes the loss directly from logits. With from_logits=True, the loss function applies softmax internally, so you do not need softmax in the last layer; training behaves as if you had used it.
If you still wish to use softmax in the Dense layer, you can, but then use from_logits=False in compile(). However, the latter approach is less efficient, as it requires double work.
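The equivalence of the two setups can be illustrated in plain Python. This is a sketch of the underlying arithmetic only, not TensorFlow's actual implementation; the function names, the three-class `logits` list and the `label` value are hypothetical.

```python
import math


def softmax(logits):
    """Softmax with max-subtraction for numerical stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ce_from_probs(probs, label):
    """Cross-entropy when the model already outputs probabilities."""
    return -math.log(probs[label])


def ce_from_logits(logits, label):
    """Cross-entropy computed directly from logits via log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


logits = [2.0, 1.0, 0.1]   # hypothetical raw outputs of the last Dense layer
label = 0                  # hypothetical true class index

loss_a = ce_from_probs(softmax(logits), label)   # softmax layer, from_logits=False
loss_b = ce_from_logits(logits, label)           # no softmax layer, from_logits=True

print(loss_a, loss_b)      # the two values agree up to rounding
```

Both routes give the same loss, which is why the final Dense layer can leave out the softmax when the loss is told to expect logits.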
The purpose of a dense (fully connected) layer before the final dense layer is to weight, or vote on, the extracted features so that the most appropriate label can be selected in the final layer. For example, a few more neurons in this layer can help select the label cat.
Check this link out for a deeper understanding of fc layers: https://missinglink.ai/guides/convolutional-neural-networks/fully-connected-layers-convolutional-neural-networks-complete-guide/
A softmax layer typically maps the predictions (logits) into a more interpretable format in which the values of the tensor are non-negative and add up to 1:
[2.0, 1.0, 0.1] # Before applying softmax (logits)
[0.659, 0.242, 0.099] # After applying softmax (rounded)
Note: The typical way of using the predictions is to take the index of the highest value in the tensor:
import numpy as np
preds = model.predict(batch_data)
highest_val = np.argmax(preds, axis=-1) # index of the highest value for each sample in the batch
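Softmax does not change which class scores highest, so the predicted class can be read from the logits or from the probabilities. The sketch below uses a hypothetical batch of two samples with three classes. Note that np.argmax without an axis argument works on the flattened array, so for a batch it is safer to pass axis=-1.

```python
import numpy as np

# Hypothetical logits for a batch of 2 samples and 3 classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 3.0, 1.5]])

# Row-wise softmax; subtracting the row maximum avoids overflow.
shifted = logits - logits.max(axis=-1, keepdims=True)
exps = np.exp(shifted)
probs = exps / exps.sum(axis=-1, keepdims=True)

# One predicted class index per sample.
classes_from_logits = np.argmax(logits, axis=-1)
classes_from_probs = np.argmax(probs, axis=-1)

print(classes_from_logits)  # [0 1]
print(classes_from_probs)   # [0 1]
```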

Can't build a CNN with Keras for vectors - problem with dimensions

Let us say that I build an extremely simple CNN with Keras to classify vectors.
My input (X_train) is a matrix in which each row is a vector and each column is a feature. My labels (y_train) are a matrix in which each row is a one-hot encoded vector. This is a binary classifier.
My CNN is built as follows:
model = Sequential()
model.add(Conv1D(64,3))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dense(2))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train,y_train,batch_size = 32)
But when I try to run this code, I get back this error message:
Input 0 is incompatible with layer conv1d_23: expected ndim=3, found ndim=2
Why would Keras expect 3 dims? I have one dim for samples and one for features. And more importantly, how can I fix this?
X_train is supposed to have the shape (batch_size, steps, input_dim); see the documentation. It seems like you are missing one of the dimensions.
I would guess input_dim in your case is 1 and that is why it is missing. If so, change the
model.fit
line to
model.fit(tf.expand_dims(X_train,-1), y_train,batch_size = 32)
Your code is not a minimal working example, so I am not able to verify if that is the only problem, but this should hopefully fix your current error message.
A Conv1D layer expects an input with shape (samples, width, channels), so this does not match your input data, producing an error.
The convolution operation is done on the width dimension, so assuming that you want to do convolution on what you call features, then you should reshape your data to add a dummy channels dimension with a value of one:
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
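Both fixes add the same trailing axis. The following NumPy sketch, assuming hypothetical data with 10 samples and 20 features, shows that the reshape and an expand_dims call produce the same 3D shape, and what width the Conv1D(64, 3) layer would then output with 'valid' padding and stride 1.

```python
import numpy as np

# Hypothetical data: 10 samples with 20 features each.
X_train = np.zeros((10, 20), dtype=np.float32)

# Option 1: reshape, as in the answer above.
X_reshaped = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))

# Option 2: append a channel axis at the end.
X_expanded = np.expand_dims(X_train, -1)

print(X_reshaped.shape, X_expanded.shape)  # (10, 20, 1) (10, 20, 1)

# Width of the Conv1D output for kernel size 3 ('valid' padding, stride 1).
out_width = X_reshaped.shape[1] - 3 + 1
print(out_width)  # 18
```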

Keras: difference of InputLayer and Input

I made a model using Keras with TensorFlow. I use InputLayer with these lines of code:
img1 = tf.placeholder(tf.float32, shape=(None, img_width, img_heigh, img_ch))
first_input = InputLayer(input_tensor=img1, input_shape=(img_width, img_heigh, img_ch))
first_dense = Conv2D(16, 3, 3, activation='relu', border_mode='same', name='1st_conv1')(first_input)
But I get this error:
ValueError: Layer 1st_conv1 was called with an input that isn't a symbolic tensor. Received type: <class 'keras.engine.topology.InputLayer'>. Full input: [<keras.engine.topology.InputLayer object at 0x00000000112170F0>]. All inputs to the layer should be tensors.
When I use Input like this, it works fine:
first_input = Input(tensor=img1, shape=(224, 224, 3), name='1st_input')
first_dense = Conv2D(16, 3, 3, activation='relu', border_mode='same', name='1st_conv1')(first_input)
What is the difference between InputLayer and Input?
InputLayer is a layer.
Input is a tensor.
You can only call layers passing tensors to them.
The idea is:
outputTensor = SomeLayer(inputTensor)
So, only Input can be passed because it's a tensor.
Honestly, I have no idea about the reason for the existence of InputLayer. Maybe it's supposed to be used internally. I never used it, and it seems I'll never need it.
According to tensorflow website, "It is generally recommend to use the functional layer API via Input, (which creates an InputLayer) without directly using InputLayer."
Input: Used for creating a functional model
inp=tf.keras.Input(shape=[?,?,?])
x=layers.Conv2D(.....)(inp)
Input Layer: used for creating a sequential model
x=tf.keras.Sequential()
x.add(tf.keras.layers.InputLayer(shape=[?,?,?]))
The other difference is that, when using InputLayer with the Keras Sequential model, it can be skipped by moving the input_shape parameter to the first layer after the InputLayer.
That is, in a Sequential model you can skip the InputLayer and specify the shape directly in the first layer, going from this:
model = tf.keras.Sequential([
tf.keras.layers.InputLayer(input_shape=(4,)),
tf.keras.layers.Dense(8)])
to this:
model = tf.keras.Sequential([
tf.keras.layers.Dense(8, input_shape=(4,))])
To define it in simple words:
keras.layers.Input is used to instantiate a Keras Tensor. In this case, your data is probably not a tf tensor, maybe an np array.
On the other hand, keras.layers.InputLayer is a layer where your data is already defined as one of the tf tensor types, i.e., can be a ragged tensor or constant or other types.
I hope this helps!