Layer "model" expects 2 input(s), but it received 1 input tensors - multimodal

I built a VQA model and set two inputs (images, questions).
It trained fine with the train/val datasets, but with the test dataset it keeps printing errors like the one below:
ValueError: Layer "model" expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(224, 224, 3) dtype=float32>]
The variables I used are:
test_qt
test_it
test_qt and test_it are both lists of tensors.
I built the dataset with this code:
test_ds = tf.data.Dataset.from_tensor_slices((test_it, test_qt))
I also tried giving each input directly and separately, but got this error:
ValueError: Data cardinality is ambiguous:
x sizes: 224, 224, 302, 302
Make sure all arrays contain the same number of samples.
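For reference, a minimal sketch of one way to build such a two-input test dataset, assuming test_it and test_qt hold the same number of samples (the batch size of 32 is arbitrary). Wrapping both inputs in a single outer tuple keeps Keras from unpacking the pair as (x, y):
import tensorflow as tf
images = tf.stack(test_it)     # list of image tensors -> one (N, H, W, C) tensor
questions = tf.stack(test_qt)  # list of question tensors -> one tensor
test_ds = tf.data.Dataset.from_tensor_slices(((images, questions),))
predictions = model.predict(test_ds.batch(32))  # .batch() restores the batch dimension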

Related

How can Keras calculate the number of parameters at an early stage when there are still None dimensions?

Sorry for the very basic question (I'm new to Keras). I was wondering how Keras can calculate the number of parameters for each layer at an early stage (before fit), given that model.summary shows dimensions that still have None values at that point. Are these values already determined in some way, and if so, why not show them in the summary?
I ask because I'm having a hard time figuring out my "tensor shape bug" (I'm trying to determine the output dimensions of the C5 block of my ResNet50 model, but I cannot see them in model.summary even though I can see the number of parameters).
I give below an example based on the C5_reduced layer in RetinaNet, which is fed by the C5 layer of ResNet50. C5_reduced is
Conv2D(256, kernel_size=1, strides=1, padding='same')
Based on model.summary for this particular layer:
C5_reduced (Conv2D) (None, None, None, 256) 524544
I've guessed that C5 is (None, 1, 1, 2048), because 2048*256 + 256 = 524544 (I don't know how to confirm or refute that hypothesis). So if it's already known, why not show it in the summary? If dimensions 2 and 3 had been different, the number of parameters would have been different too, right?
If you pass the exact input shape to the very first layer (or input layer) of your network, you will get the output shapes you want. For instance, I used an input layer here:
input_1 (InputLayer) [(None, 224, 224, 3)] 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
I passed the input as (224, 224, 3); 3 represents the depth (number of channels) here. Note that parameter calculation for convolutional layers differs from that for Dense layers.
If you do the following:
tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(150, 150, 3))
You will see:
conv2d (Conv2D) ---> (None, 148, 148, 16)
The spatial dimensions are reduced to 148x148 because in Keras padding is 'valid' by default and strides default to 1, so each axis shrinks to input - kernel + 1 = 150 - 3 + 1 = 148.
So then, what are the None values?
The first None value is the batch size; in Keras, the first dimension is always the batch size. You can fix it by passing it explicitly, or leave it to be determined while fitting or predicting.
In 2D convolution, the expected input is (batch_size, height, width, channels). You can also have shapes such as (None, None, None, 3), which means varying image sizes are allowed.
Edit:
tf.keras.layers.Input(shape = (None, None, 3)),
tf.keras.layers.Conv2D(16, (3,3), activation='relu')
Produces:
conv2d_21 (Conv2D) (None, None, None, 16) 448
Regarding your question: how are the parameters calculated even though we passed the image height and width as None?
Convolution parameters are calculated according to:
(filter_height * filter_width * input_image_channels + 1) * number_of_filters
When we plug them into the formula:
filter_height = 3
filter_width = 3
input_image_channels = 3
number_of_filters = 16
Parameters = (3 x 3 x 3 + 1) * 16 = 28 * 16 = 448
Notice that we only needed the input image's channel count, which is 3, indicating an RGB image.
If you want to calculate the params for later convolutions, remember that the number of filters in the previous layer becomes the number of input channels for the current layer.
That's how you can end up with None dimensions beyond the batch size: Keras only needs to know the channel count in that case. Alternatively, you can leave the dimensions unspecified while creating the model and pass them while fitting the model with the dataset.
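As a quick sanity check of the formula (a minimal sketch): even with height and width left as None, Keras already reports the 448 parameters computed above.
import tensorflow as tf
inputs = tf.keras.Input(shape=(None, None, 3))   # only the channel count is fixed
outputs = tf.keras.layers.Conv2D(16, (3, 3), activation='relu')(inputs)
check_model = tf.keras.Model(inputs, outputs)
print(check_model.count_params())  # (3 * 3 * 3 + 1) * 16 = 448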
You need to define an input layer for your model. The total number of trainable parameters is unknown until you either (a) compile the model and feed it data, at which point the model builds a graph based on the input dimensions and you can then determine the number of params, or (b) define an input layer with the input dimensions stated, after which you can find the number of params with model.summary().
The point is that the model cannot know the number of parameters between the input and the first hidden layer until that layer is defined, or until you run inference and give it the shape of the input.

Keras BatchNormalization layer incompatibility error

I have the following (partial) network architecture, obtained by:
...
pool = GlobalAvgPool()(gc_2)
predictions = Dense(units=32, activation='relu', use_bias=False)(pool)
predictions = BatchNormalization()(predictions)
...
I am trying to insert a batch normalization layer, but I get the following error:
ValueError: Input 0 of layer batch_normalization_1 is incompatible with the layer: expected ndim=2, found ndim=3. Full shape received: [None, 1, 32]
I am guessing the second dimension is causing this mishap. Is there any way I can get rid of it?
If your model compiled successfully, there is no problem with your model definition.
This error is more likely to happen because the shape and dimensions of the input data are incompatible with the shape your model expects.
expected ndim=2, found ndim=3 means that the layer requires a 2D tensor of shape (batch, features), but it received a 3D tensor of shape [None, 1, 32]; the extra singleton dimension needs to be removed.
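A minimal sketch of the likely fix, using a stand-in array in place of the real data: squeeze out the singleton axis so the input matches what the layer expects.
import numpy as np
data = np.zeros((4, 1, 32), dtype=np.float32)  # stand-in for the real input
data = np.squeeze(data, axis=1)                # (4, 1, 32) -> (4, 32)
print(data.shape)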

Why is it asking for labels to have some other shape?

Hello, I am trying to get an output array of 7 classes, but when I run my code it says it expects my output labels to have a different shape. Here is my code -
def make_model(self):
    self.model.add(InceptionV3(include_top=False,
                               input_shape=(self.WIDTH, self.HEIGHT, 3),
                               weights="imagenet"))
    self.model.add(Dense(7, activation='softmax'))
    self.model.layers[0].trainable = False
My model compilation and fitting part:
def train(self):
    self.model.compile(optimizer=self.optimizer, loss='mse', metrics=['accuracy'])
    self.model.fit(x=x, y=y, batch_size=64,
                   validation_split=0.15, shuffle=True, epochs=self.epochs,
                   callbacks=[self.tensorboard, self.reducelr])
I get the error -
File "model.py", line 60, in train
callbacks=[self.tensorboard, self.reducelr])
ValueError: A target array with shape (23639, 7) was passed for an output of shape (None, 6, 13, 7) while using as loss `mean_squared_error`. This loss expects targets to have the same shape as the output.
Here it is saying that it expected (None, 6, 13, 7); however, I gave it labels of shape (23639, 7).
We can clearly see that in self.model.add(Dense(7, activation='softmax')) I specified 7 as the number of output categories.
Here is the model summary -
So can someone tell me what is wrong here?
By the way, I did try using categorical_crossentropy to see if it makes a difference, but it didn't.
In case you wanted the full code -
Full Code
The problem is in the output of InceptionV3: it returns 4D feature maps, so you need to reduce their dimensionality before the final Dense layer in order to match the target dimensionality (2D). You can do this using Flatten or GlobalPooling layers.
If yours is a classification problem, I also recommend you use categorical_crossentropy (if you have one-hot encoded labels) or sparse_categorical_crossentropy (if you have integer-encoded labels); mse is suited to regression problems.
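A sketch of the suggested fix, pooling the 4D feature maps before the classification head; the (299, 299, 3) input shape is a stand-in for the asker's (self.WIDTH, self.HEIGHT, 3):
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Sequential
model = Sequential([
    InceptionV3(include_top=False, input_shape=(299, 299, 3),
                weights="imagenet"),
    GlobalAveragePooling2D(),        # (None, H, W, 2048) -> (None, 2048)
    Dense(7, activation='softmax'),  # now outputs (None, 7)
])
model.layers[0].trainable = False
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])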

The input dimension of the LSTM layer in Keras

I'm trying keras.layers.LSTM.
The following code works.
#!/usr/bin/python3
import tensorflow as tf
import numpy as np
from tensorflow import keras
data = np.array([1, 2, 3]).reshape((1, 3, 1))
x = keras.layers.Input(shape=(3, 1))
y = keras.layers.LSTM(10)(x)
model = keras.Model(inputs=x, outputs=y)
print(model.predict(data))
As shown above, the input data shape is (1, 3, 1), while the shape given to the Input layer is (3, 1). I'm a little confused by this apparent inconsistency in the dimensions.
If I use the following shape in the Input layer, it doesn't work:
x = keras.layers.Input(shape=(1, 3, 1))
The error message is as follows:
ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1, 3, 1]
It seems that the rank of the input must be 3, but why should we use a rank-2 shape in the Input layer?
Keras works with "batches" of "samples". Since most models use variable batch sizes that you define only when fitting, for convenience you don't need to care about the batch dimension, only about the sample shape.
That said, when you use shape = (3,1), this is the same as defining batch_shape = (None, 3, 1) or batch_input_shape = (None, 3, 1).
The three options mean:
A variable batch size: None
With samples of shape (3, 1).
It's important to know this distinction especially when you are going to create custom layers, losses or metrics. The actual tensors all have the batch dimension and you should take that into account when making operations with tensors.
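A minimal sketch of the equivalence described above:
from tensorflow import keras
a = keras.layers.Input(shape=(3, 1))              # sample shape only
b = keras.layers.Input(batch_shape=(None, 3, 1))  # batch dimension spelled out
print(a.shape, b.shape)  # (None, 3, 1) (None, 3, 1)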
Check out the documentation for tf.keras.Input. The signature is as follows:
tf.keras.Input(
shape=None,
batch_size=None,
name=None,
dtype=None,
sparse=False,
tensor=None,
**kwargs
)
shape: defines the shape of a single sample, with variable batch size.
Notice that shape does not include the batch dimension; if you need a fixed batch size, pass batch_size as a parameter explicitly.
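For example (a minimal sketch):
import tensorflow as tf
x = tf.keras.Input(shape=(3, 1), batch_size=8)  # fixes the batch dimension
print(x.shape)  # (8, 3, 1)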

Feed TensorFlow or Keras neural nets input with custom dimensions

I would like to feed a neural net inputs of the following shape:
Each training entry is a 2D array with dimensions 700x10; there are 204 training entries in total.
The labels are just a 1-dimensional array of size 204 (binary output).
I tried to just use Dense layers:
model = Sequential()
model.add(Dense(300, activation='relu', input_shape=(700, 10)))
model.add(Dense(1, activation='sigmoid'))
But then I get the following error (not related to input_shape on the first layer; it occurs during validation of the output):
ValueError: Error when checking target: expected dense_2 to have 3 dimensions, but got array with shape (204, 1)
204 is the number of training samples.
Stacktrace:
model.fit(xTrain, yTrain, epochs=4, batch_size=6)
File "keras\models.py", line 867, in fit
initial_epoch=initial_epoch)
File "keras\engine\training.py", line 1522, in fit
batch_size=batch_size)
File "keras\engine\training.py", line 1382, in _standardize_user_data
exception_prefix='target')
File "keras\engine\training.py", line 132, in _standardize_input_data
What I found out while debugging the Keras code:
It fails during validation, before training; it is validating the output array.
Given the network structure, the Dense layers somehow produce a (700, 1)-shaped output, and validation fails afterwards, since my labels are just a 1-D array of 204 entries.
How do I overcome this issue? I tried adding Flatten() after the Dense() layer, but that probably hurts accuracy: I would like to keep the information specific to each of the 700 points grouped together.
The Dense layer works on only one dimension, the last.
If you input (700, 10) to it, it will output (700, units). Check your model.summary() to see this.
A simple solution is to flatten your data before applying the Dense layers:
model.add(Flatten(input_shape=(700,10)))
model.add(Dense(300,...))
model.add(Dense(1,...))
This way, the Dense layer will see a simple (7000,) input.
Now, if you do want your model to understand those two dimensions separately, you should perhaps try more elaborate structures, such as the sketch below. What to do will depend a lot on what your data is and what you want to do, and on how you want your model to understand it.
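For instance, a hypothetical sketch that treats the 700 points as a sequence with 10 channels each, so per-point information is mixed locally before being aggregated:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, GlobalAveragePooling1D, Dense
model = Sequential([
    Conv1D(64, kernel_size=5, activation='relu', input_shape=(700, 10)),
    GlobalAveragePooling1D(),        # (None, 696, 64) -> (None, 64)
    Dense(1, activation='sigmoid'),  # matches the (204,) binary labels
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])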