Simple ML Algo not working: ValueError: Error when checking input: expected dense_4_input to have shape (None, 5) but got array with shape (5, 1) - numpy

I have an incredibly simple algorithm that is erroring with "ValueError: Error when checking input: expected dense_4_input to have shape (None, 5) but got array with shape (5, 1)".
Here is the code I am running.
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense
x = np.array([[1],[2],[3],[4],[5]])
y = np.array([[1],[2],[3],[4],[5]])
x_val = np.array([[6],[7]])
x_val = np.array([[6],[7]])
model = Sequential()
model.add(Dense(1, input_dim=5))
model.compile(optimizer='rmsprop', loss='mse')
model.fit(x, y, epochs=2, validation_data=(x_val, y_val))

There are two problems:
First: As the error message says, the network expects an array of shape (*, 5). The asterisk marks the batch dimension, which can be any size: if you have many examples and each one is a vector of shape (1, 5), you can stack them row by row and pass the whole array to the network in one go. Your x has shape (5, 1) instead, i.e. five examples with one feature each, so you have to make x a row vector:
x = np.array([[1,2,3,4,5]])
See also "Specifying the input shape" in the Keras docs.
Second: You set the output size of the first layer to one, so the 5-dimensional input is connected to a single neuron and the network produces one value per example. Your y, however, has 5 values, so the target and the network output don't match.
So you have to go with a scalar y:
y = np.array([1])
Furthermore, your validation data must have the same dimensions as your training data. Additionally, there is a typo in your code: y_val is never defined (x_val is assigned twice instead).
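A quick NumPy check makes the shape mismatch concrete. The sketch below mimics what a Dense(1) layer with 5 inputs computes (inputs @ weights + bias) using hypothetical all-ones weights; it is a stand-in for illustration, not Keras itself.

```python
import numpy as np

# Hypothetical stand-in for Dense(1, input_dim=5): 5 input features -> 1 output.
weights = np.ones((5, 1))
bias = np.zeros(1)

x_bad = np.array([[1], [2], [3], [4], [5]])   # shape (5, 1): 5 samples, 1 feature each
x_good = np.array([[1, 2, 3, 4, 5]])          # shape (1, 5): 1 sample, 5 features

print(x_bad.shape, x_good.shape)  # (5, 1) (1, 5)

# One sample with 5 features maps to one output value.
out = x_good @ weights + bias
print(out.shape)  # (1, 1)

# The original array has 1 feature per row, so it cannot be
# multiplied by a weight matrix that expects 5 features.
try:
    x_bad @ weights
    mismatch = False
except ValueError:
    mismatch = True
print(mismatch)  # True
```

If the intent was five samples with one feature each, the alternative would be to keep x and y as they are and declare input_dim=1 instead.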

Related

Tensorflow Keras output layer shape weird error

I am fairly new to TF, Keras and ML in general.
I am trying to implement a very simple MLP with an input shape of (batch_size,3,2) and an output shape of (batch_size,3), that is (if I got it right): for every 3x2 feature, there is a corresponding 3 value array label.
Here is how I create the model:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, tf.keras.activations.relu, input_shape=(3, 2)),
    tf.keras.layers.Dense(3)
])
and these are the X and y shapes:
X_train.shape,y_train.shape
TensorShape([64,3,2]),TensorShape([64,3])
On model.fit I am facing a weird error I cannot understand:
ValueError: Dimensions must be equal, but are 3 and 32 for ... with input shapes: [32,3,3] and [32,3]
I have no clue what's going on. I understand the batch size is 32, but where does that [32,3,3] come from?
Moreover, if from the original 64, I lower the number (shapes) of X_train and y_train, say, to: (19,3,2) and (19,3), I get the following error instead:
InvalidArgumentError: required broadcastable shapes at loc(unknown)
What's even more weird for me is that if I specify a single unit for the output (last) layer, instead of 3 like this:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, tf.keras.activations.relu, input_shape=(3, 2)),
    tf.keras.layers.Dense(1)
])
model.fit works, but the predictions have shape (1,3,1) instead of my expected (3,)
I am very confused.
Whenever you are unsure how data flows through your model, call model.summary() to see the details, including the shape of the data after each layer.
In this case each input sample is a 2D array and each label is a 1D array, yet the model contains only Dense layers. A Dense layer does not flatten its input; it acts on the last axis only, so the extra dimension is carried through: (batch, 3, 2) becomes (batch, 3, 50) and then (batch, 3, 3), which is the [32,3,3] in your error. For the same reason you would not feed an image directly to a Dense layer. Instead, use a layer such as Conv2D, or Flatten the input (make it 1D) before the Dense layer.
In general, if the input and output ranks differ, the shape has to change somewhere in the model. The most common ways to do that are a Flatten layer or a pooling layer such as GlobalAveragePooling.
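The effect of stacking Dense layers on an unflattened input can be traced with NumPy alone. The helper below is a hypothetical stand-in for a Dense layer (a matrix product on the last axis, with random weights and no bias or activation), meant only to show the shapes.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(inputs, units):
    """Hypothetical stand-in for a Dense layer: acts on the last axis only."""
    kernel = rng.standard_normal((inputs.shape[-1], units))
    return inputs @ kernel

batch = np.zeros((32, 3, 2))            # one batch of 3x2 samples

# Without flattening: the middle axis of size 3 survives both layers.
unflattened = dense(dense(batch, 50), 3)
print(unflattened.shape)                # (32, 3, 3)

# With flattening: each sample becomes a vector of 6 features first.
flat = batch.reshape(len(batch), -1)    # (32, 6)
flattened = dense(dense(flat, 50), 3)
print(flattened.shape)                  # (32, 3)
```

Only the flattened version produces an output whose shape matches labels of shape (batch, 3).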
When you pass an input to a Dense layer, it should be flattened first. There are two ways to do this:
Way 1: Add a Flatten layer as the first layer of your model:
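from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten
model = Sequential()
model.add(Flatten(input_shape=(3, 2)))  # (batch, 3, 2) -> (batch, 6)
model.add(Dense(50, 'relu'))
model.add(Dense(3))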
Way 2: Flatten each 3x2 sample to a 6-element vector before passing the inputs to your model, keeping the batch axis:
X_train = tf.reshape(X_train, shape=(-1, 6))
Then change the input shape of the first layer to:
model.add(Dense(50, 'relu', input_shape=(6,)))
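When flattening by hand, the batch axis has to be preserved. The NumPy sketch below (tf.reshape accepts -1 in the same way) uses a made-up array with the same shape as X_train to show which target shape works and which does not.

```python
import numpy as np

# Made-up data with the same shape as X_train in the question.
X_train = np.arange(64 * 3 * 2).reshape(64, 3, 2)

# -1 lets NumPy infer the batch size, so each 3x2 sample becomes one row of 6.
X_flat = X_train.reshape(-1, 6)
print(X_flat.shape)  # (64, 6)

# Each row holds exactly the values of the matching sample.
same_values = np.array_equal(X_flat[0], X_train[0].ravel())
print(same_values)  # True

# Asking for 6 elements in total fails, because the array holds 64 * 6 values.
try:
    X_train.reshape(6)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True
```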

Why is it asking for labels to have some other shape?

Hello, I am trying to get an output array of 7 classes. But when I run my code, it says that it expects my labels to have some other shape. Here is my code:
def make_model(self):
    self.model.add(InceptionV3(include_top=False,
                               input_shape=(self.WIDTH, self.HEIGHT, 3),
                               weights="imagenet"))
    self.model.add(Dense(7, activation='softmax'))
    self.model.layers[0].trainable = False
My model compilation and fitment part
def train(self):
    self.model.compile(optimizer=self.optimizer, loss='mse', metrics=['accuracy'])
    self.model.fit(x=x, y=y, batch_size=64,
                   validation_split=0.15, shuffle=True, epochs=self.epochs,
                   callbacks=[self.tensorboard, self.reducelr])
I get the error -
File "model.py", line 60, in train
callbacks=[self.tensorboard, self.reducelr])
ValueError: A target array with shape (23639, 7) was passed for an output of shape (None, 6, 13, 7) while using as loss `mean_squared_error`. This loss expects targets to have the same shape as the output.
It says that it expected (None, 6, 13, 7), but I gave it labels of shape (23639, 7).
We can clearly see that in self.model.add(Dense(7, activation='softmax')) I have specified 7 as the number of output categories.
Here is the model summary -
So can someone tell me what is wrong here?
By the way, I did try using categorical_crossentropy to see if it makes a difference, but it didn't.
The problem is the output of InceptionV3: with include_top=False it returns a 4D tensor, so the final Dense layer also produces a 4D output, (None, 6, 13, 7). You need to reduce the dimensionality before the final Dense layer so that the output matches your 2D targets. You can do this with a Flatten or global pooling layer.
If yours is a classification problem, I also recommend using categorical_crossentropy (if you have one-hot encoded labels) or sparse_categorical_crossentropy (if you have integer-encoded labels). mse is suited to regression problems.
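To see what the pooling step does to the shapes, here is a minimal NumPy sketch. The feature maps are random, the channel count (8) is a small placeholder for the much larger one InceptionV3 produces, and the softmax layer is a hand-written stand-in rather than Keras code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature maps: 4 images, a 6x13 spatial grid, 8 channels.
features = rng.standard_normal((4, 6, 13, 8))

# Global average pooling: average over the two spatial axes.
pooled = features.mean(axis=(1, 2))
print(pooled.shape)  # (4, 8)

# Stand-in for Dense(7, activation='softmax') applied to the pooled features.
kernel = rng.standard_normal((8, 7))
logits = pooled @ kernel
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)
print(probs.shape)  # (4, 7)

# Integer-encoded labels and their one-hot form.
int_labels = np.array([0, 3, 6, 2])
one_hot = np.eye(7)[int_labels]
print(one_hot.shape)  # (4, 7)
```

In the Keras model this would correspond to adding a GlobalAveragePooling2D() layer between InceptionV3 and the final Dense layer, after which the output shape (None, 7) matches one-hot targets of shape (23639, 7).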

issue with fitting data with TensorFlow Keras

I am attempting to create a model for deciphering handwritten text. The issue I am encountering right now is feeding my data to the model.
I start out with a list of file names with each file as a picture. I also have a list of labels for each.
I then iterate through the file names and load those images.
for i in range(len(images)):
    print(len(images) - i)
    images[i] = np.array(cv2.imread(images[i]))
I then compile the model. And feed the lists to it as such.
self.model.fit(np.array(imgs), np.array(labels), epochs=10, validation_data=(np.array(test_images), np.array(test_labels)), callbacks=[checkpoint])
I get this error:
ValueError: Error when checking input: expected conv2d_1_input to have 4 dimensions, but got array with shape (80, 1)
My np array of images is size (80, 1), which is what I thought I was supposed to be feeding to the model, but I am confused as to why it is complaining.
Conv2D expects input:
Input shape: 4D tensor with shape: (batch_size, channels, rows, cols)
if data_format='channels_first' or 4D tensor with shape: (batch_size,
rows, cols, channels) if data_format='channels_last'.
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D
What you should be feeding the model should have the shape (batch_size, h, w, c)
where h, w, c are height, width and number of channels in the image respectively.
The problem could be that cv2 cannot find the images. In that case cv2.imread does not throw an error; it simply returns None, which would explain the shape you got, (80, 1). As a start, add a check for None values in the for loop and make sure the image paths are correct.
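A check of that kind could look like the sketch below. stack_images is a hypothetical helper, not part of cv2 or Keras; it takes the already-loaded images (so the example runs without OpenCV), rejects None entries and mismatched sizes, and returns a single 4D array.

```python
import numpy as np

def stack_images(loaded, paths):
    """Stack loaded images into one (batch, h, w, c) array.

    `loaded` is a list of arrays (or None for files that failed to load);
    `paths` are the matching file names, used only for the error message.
    """
    missing = [p for img, p in zip(loaded, paths) if img is None]
    if missing:
        raise FileNotFoundError(f"could not read: {missing}")
    shapes = {img.shape for img in loaded}
    if len(shapes) != 1:
        raise ValueError(f"images differ in shape: {sorted(shapes)}")
    return np.stack(loaded)

# Fake 28x28 colour images standing in for cv2.imread results.
good = [np.zeros((28, 28, 3), dtype=np.uint8) for _ in range(3)]
batch = stack_images(good, ["a.png", "b.png", "c.png"])
print(batch.shape)  # (3, 28, 28, 3)
```

With real data, `loaded` would be `[cv2.imread(p) for p in paths]`; images of different sizes would need resizing to a common size before they can be stacked.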

RNN with multiple input sequences for each target

A standard RNN computational graph looks as follows (in my case, for regression to a single scalar value y):
I want to construct a network which accepts as input m sequences X_1...X_m (where both m and sequence lengths vary), runs the RNN on each sequence X_i to obtain a representation vector R_i, averages the representations and then runs a fully connected net to compute the output y_hat. Computational graph should look something like this:
Question
Can this be implemented (preferably) in Keras? Otherwise in TensorFlow? I'd very much appreciate if someone can point me to a working implementation of this or something similar.
There isn't a straightforward Keras implementation, as Keras enforces the batch axis (samples dimension, dimension 0) as fixed for the input & output layers (but not all layers in-between), whereas you seek to collapse it by averaging. There is, however, a workaround; see below:
import tensorflow.keras.backend as K
from tensorflow.keras.layers import Input, Dense, GRU, Lambda
from tensorflow.keras.layers import Reshape, GlobalAveragePooling1D
from tensorflow.keras.models import Model
from tensorflow.keras.utils import plot_model
import numpy as np

def make_model(batch_shape):
    ipt = Input(batch_shape=batch_shape)
    x = Lambda(lambda x: K.squeeze(x, 0))(ipt)
    x, s = GRU(4, return_state=True)(x)  # s == last returned state
    x = Lambda(lambda x: K.expand_dims(x, 0))(s)
    x = GlobalAveragePooling1D()(x)  # averages along axis1 (original axis2)
    x = Dense(32, activation='relu')(x)
    out = Dense(1, activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile('adam', 'binary_crossentropy')
    return model

def make_data(batch_shape):
    return (np.random.randn(*batch_shape),
            np.random.randint(0, 2, (batch_shape[0], 1)))

m, timesteps = 16, 100
batch_shape = (1, m, timesteps, 1)
model = make_model(batch_shape)
model.summary()  # see model structure
plot_model(model, show_shapes=True)
x, y = make_data(batch_shape)
model.train_on_batch(x, y)
Above assumes the task is binary classification, but you can easily adapt it to anything else - the main task's tricking Keras by feeding m samples as 1, and the rest of layers can freely take m instead as Keras doesn't enforce the 1 there.
Note, however, that I cannot guarantee this'll work as intended per the following:
(1) Keras treats all entries along the batch axis as independent, whereas your samples are claimed as dependent
(2) Per (1), the main concern is backpropagation: I'm not really sure how gradient will flow with all the dimensionality shuffling going on.
(3) (1) is also consequential for stateful RNNs, as Keras constructs batch_size number of independent states, which'll still likely behave as intended as all they do is keep memory, but still worth understanding fully - see here
(2) is the "elephant in the room", but aside that, the model fits your exact description. Chances are, if you've planned out forward-prop and all dims agree w/ code's, it'll work as intended - else, and also for sanity-check, I'd suggest opening another question to verify gradients flow as you intend them to per above code.
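The shape shuffling in make_model can be followed with NumPy alone. In this sketch the GRU is replaced by a random array standing in for its final states (one vector per sequence), so it illustrates only the shapes and the averaging step, not the recurrent computation or the gradients.

```python
import numpy as np

rng = np.random.default_rng(0)
m, timesteps, units = 16, 100, 4

ipt = rng.standard_normal((1, m, timesteps, 1))  # what the model is fed
seqs = np.squeeze(ipt, axis=0)                   # (m, timesteps, 1): m sequences

# Stand-in for the GRU's final state: one vector of size `units` per sequence.
states = rng.standard_normal((m, units))         # (m, units)

x = np.expand_dims(states, axis=0)               # (1, m, units)
pooled = x.mean(axis=1)                          # (1, units): average of the m vectors
print(seqs.shape, x.shape, pooled.shape)         # (16, 100, 1) (1, 16, 4) (1, 4)
```

The pooled row is the plain average of the m representation vectors, which is the quantity the Dense layers then consume.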
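model.summary() (the shapes shown correspond to m=32, GRU(16) and Dense(8) rather than the values in the code above):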
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(1, 32, 100, 1)]         0
_________________________________________________________________
lambda (Lambda)              (32, 100, 1)              0
_________________________________________________________________
gru (GRU)                    [(32, 16), (32, 16)]      864
_________________________________________________________________
lambda_1 (Lambda)            (1, 32, 16)               0
_________________________________________________________________
global_average_pooling1d (Gl (1, 16)                   0
_________________________________________________________________
dense (Dense)                (1, 8)                    136
_________________________________________________________________
dense_1 (Dense)              (1, 1)                    9
On LSTMs: an LSTM returns two final states, one for the cell state and one for the hidden state (see the source code); you should understand exactly what this means if you are to use it. If you do, you'll need to concatenate them:
from tensorflow.keras.layers import LSTM, concatenate
# ...
x, s1, s2 = LSTM(4, return_state=True)(x)  # units is required; 4 mirrors the GRU above
x = concatenate([s1, s2], axis=-1)
# ...

The input dimension of the LSTM layer in Keras

I'm trying keras.layers.LSTM.
The following code works.
#!/usr/bin/python3
import tensorflow as tf
import numpy as np
from tensorflow import keras
data = np.array([1, 2, 3]).reshape((1, 3, 1))
x = keras.layers.Input(shape=(3, 1))
y = keras.layers.LSTM(10)(x)
model = keras.Model(inputs=x, outputs=y)
print (model.predict(data))
As shown above, the input data shape is (1, 3, 1), and the actual input shape in the Input layer is (3, 1). I'm a little bit confused about this inconsistency of the dimension.
If I use the following shape in the Input layer, it doesn't work:
x = keras.layers.Input(shape=(1, 3, 1))
The error message is as follows:
ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1, 3, 1]
It seems that the rank of the input must be 3, but why should we use a rank-2 shape in the Input layer?
Keras works with "batches" of "samples". Since most models use variable batch sizes that you define only when fitting, for convenience you don't need to specify the batch dimension, only the shape of a single sample.
That said, using shape = (3,1) is the same as defining batch_shape = (None, 3, 1) or batch_input_shape = (None, 3, 1).
The three options mean:
A variable batch size: None
With samples of shape (3, 1).
It's important to know this distinction especially when you are going to create custom layers, losses or metrics. The actual tensors all have the batch dimension and you should take that into account when making operations with tensors.
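As a small illustration of taking the batch dimension into account, the NumPy sketch below computes a loss per sample by reducing every axis except the first. per_sample_mse is a hypothetical helper written for this example; a custom loss built from tensor operations would follow the same pattern.

```python
import numpy as np

def per_sample_mse(y_true, y_pred):
    """Mean squared error per sample: reduce every axis except the batch axis."""
    sample_axes = tuple(range(1, y_true.ndim))
    return np.mean((y_true - y_pred) ** 2, axis=sample_axes)

y_true = np.zeros((4, 3, 1))   # batch of 4 samples, each of shape (3, 1)
y_pred = np.ones((4, 3, 1))
losses = per_sample_mse(y_true, y_pred)
print(losses.shape)  # (4,)
```

Reducing over all axes, including axis 0, would mix the samples of a batch together, which is usually not what a per-sample loss or metric intends.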
Check out the documentation for tf.keras.Input. The signature is:
tf.keras.Input(
    shape=None,
    batch_size=None,
    name=None,
    dtype=None,
    sparse=False,
    tensor=None,
    **kwargs
)
shape: the shape of a single sample, not including the batch size; the batch size is left variable.
Note that shape does not take the batch size as its first value. To fix the batch size, pass batch_size as a parameter explicitly.
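The relationship between shape, batch_size, and the tensor shape the model ends up with can be written as a tiny helper. full_batch_shape is hypothetical (it is not a Keras function) and only mirrors how the batch axis is prepended to the sample shape.

```python
def full_batch_shape(shape, batch_size=None):
    """Return the shape including the batch axis; None means a variable batch size."""
    return (batch_size, *shape)

print(full_batch_shape((3, 1)))                # (None, 3, 1)
print(full_batch_shape((3, 1), batch_size=1))  # (1, 3, 1)
print(full_batch_shape((1, 3, 1)))             # (None, 1, 3, 1)
```

The last call reproduces the rank-4 shape from the error message in the question: passing shape=(1, 3, 1) describes a sample that is already three-dimensional, so the LSTM receives four dimensions once the batch axis is added.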