I trained a simple MLP model using the new tf.keras version 2.2.4-tf. Here is what the model looks like:
from tensorflow.keras.layers import Input, Dense, Dropout
from tensorflow.keras.models import Model

# `activation` is defined elsewhere in my script
input_layer = Input(batch_shape=(138, 28))
first_layer = Dense(30, activation=activation, name="first_dense_layer_1")(input_layer)
first_layer = Dropout(0.1)(first_layer, training=True)  # training=True keeps dropout active at inference
second_layer = Dense(15, activation=activation, name="second_dense_layer")(first_layer)
out = Dense(1, name='output_layer')(second_layer)
model = Model(input_layer, out)
I'm getting an error when I try to predict with prediction_result = model.predict(test_data, batch_size=138). The test_data has shape (69, 28), so it is smaller than the batch_size of 138. Here is the error; it seems like the issue comes from the first dropout layer:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [138,30] vs. [69,30]
[[node model/dropout/dropout/mul_1 (defined at ./mlp_new_tf.py:471) ]] [Op:__inference_distributed_function_1700]
The same code works with no issues in the older versions of Keras (2.2.4) and TensorFlow (1.12.0). How can I fix the issue? I don't have more test data, so I can't make the test_data set larger!
Since you are seeing the issue at prediction time, one way of getting around this is to pad the test data up to a multiple of your batch size. It shouldn't slow down the prediction, since the number of batches doesn't change. numpy.pad should do the trick.
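For example, a minimal sketch of that padding, assuming test_data is a 2-D NumPy array as in the question:

import numpy as np

batch_size = 138
n_test = test_data.shape[0]        # 69 in the question
pad_rows = (-n_test) % batch_size  # rows needed to reach a multiple of batch_size

# Zero-pad along the sample axis only, predict, then drop the padded outputs.
padded = np.pad(test_data, ((0, pad_rows), (0, 0)), mode="constant")
prediction_result = model.predict(padded, batch_size=batch_size)[:n_test]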
I'm trying to resize some images to use them with Inception. I want to do it as a separate preprocessing step to speed things up later. Running tf.image.resize on all of the images at once crashes, and so does a loop.
I'd like to do it in batches, but I don't know how to do that without making it part of my neural network model; I want to keep it as a separate preprocessing step.
I made this:
import tensorflow as tf

inception = tf.keras.applications.inception_v3.InceptionV3(include_top=True, input_shape=(299, 299, 3))
inception = tf.keras.Model([inception.input], [inception.layers[-2].output])
for layer in inception.layers:
    layer.trainable = False

resize_incept = tf.keras.Sequential([
    tf.keras.layers.Resizing(299, 299),
    inception,
])
resize_incept.compile()
So can I just call it on my images? But then how do I batch it? When I call it like resize_incept(images) it crashes (too big), but if I call resize_incept(images, batch_size=25), it doesn't work (TypeError: call() got an unexpected keyword argument 'batch_size').
EDIT: I'm trying to figure out if I can use tf.data.Dataset for this.
EDIT2: I've put my data (which was an array of shape (batch, 32, 32, 3)) into a tf.data.Dataset so I can try this:
train_dataset = train_dataset.map(lambda x, y: (resize_incept(x), y))
But when I try it, it gives me this error:
ValueError: Input 0 is incompatible with layer model_1: expected shape=(None, 299, 299, 3), found shape=(299, 299, 3)
The problem seems to be that whatever comes out of the resize layer is somehow wrong for the inception layer (what I put in at first is (32, 32, 3), and there are no complaints about those dimensions). But the inception layer already has input_shape=(299, 299, 3), so I would think that's the shape it would take?
If you want it as a separate preprocessing step, I would recommend using the image_dataset_from_directory() function included in Keras preprocessing.
tf.keras.preprocessing.image_dataset_from_directory(
    directory,
    labels="inferred",
    label_mode="int",
    class_names=None,
    color_mode="rgb",
    batch_size=32,
    image_size=(256, 256),
    shuffle=True,
    seed=None,
    validation_split=None,
    subset=None,
    interpolation="bilinear",
    follow_links=False,
    crop_to_aspect_ratio=False,
    **kwargs)
You can set the batch size and the desired image size, and choose whether to split the data or not. Also, be careful with crop_to_aspect_ratio, as you may lose important features from the cropped areas.
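For instance, here is a rough sketch of how it could replace the manual resizing; the directory path and the one-sub-folder-per-class layout are assumptions of this sketch, not from the question:

import tensorflow as tf

# Hypothetical layout: "images/" contains one sub-folder per class.
ds = tf.keras.preprocessing.image_dataset_from_directory(
    "images",
    batch_size=25,
    image_size=(299, 299),  # resizing happens here, batch by batch
)

# Each batch now has shape (25, 299, 299, 3) and can be passed to the
# frozen Inception feature extractor built in the question.
features_ds = ds.map(lambda x, y: (inception(x), y))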
If you want to know more about its parameters and see a code example, check the image_dataset_from_directory documentation.
Good luck!
I have a problem that I currently cannot solve. During training, my loss function explodes and becomes inf or NaN, because the MSE over all errors becomes huge when the predictions (at the beginning of training) are bad. That is the normal, intended behavior and correct. But how do I train a ConvLSTM, and with which loss function, so it can learn a multi-step multi-variate output?
E.g. I try a (32, None, 200, 12) input to predict a (32, None, 30, 12) output: 32 is the batch size, None is the number of samples (>3000), 200 is the number of input time steps with 12 features each, and the output is 30 time steps, 12 features wide.
My ConvLSTM model:
from tensorflow.keras.layers import (Input, Conv1D, Dropout, LSTM, Dense,
                                     Reshape, LeakyReLU)
from tensorflow.keras.models import Model

input = Input(shape=(None, input_shape[1]))
conv1d_1 = Conv1D(filters=64, kernel_size=3, activation=LeakyReLU())(input)
conv1d_2 = Conv1D(filters=32, kernel_size=3, activation=LeakyReLU())(conv1d_1)
dropout = Dropout(0.3)(conv1d_2)
lstm_1 = LSTM(32, activation=LeakyReLU())(dropout)
dense_1 = Dense(forecasting_horizon * input_shape[1], activation=LeakyReLU())(lstm_1)
output = Reshape((forecasting_horizon, input_shape[1]))(dense_1)
model = Model(inputs=input, outputs=output)
My ds generation:
ds_inputs = tf.keras.utils.timeseries_dataset_from_array(
    df[:-forecast_horizon], None, sequence_length=window_size,
    sequence_stride=1, shuffle=False, batch_size=None)
ds_targets = tf.keras.utils.timeseries_dataset_from_array(
    df[forecast_horizon:], None, sequence_length=forecast_horizon,
    sequence_stride=1, shuffle=False, batch_size=None)
ds_inputs = ds_inputs.batch(batch_size, drop_remainder=True)
ds_targets = ds_targets.batch(batch_size, drop_remainder=True)
ds = tf.data.Dataset.zip((ds_inputs, ds_targets))
ds = ds.shuffle(buffer_size=(len(ds)))
Besides MSE, I have already tried MeanAbsoluteError, MeanSquaredLogarithmicError, MeanAbsolutePercentageError, and CosineSimilarity; the last produces nonsense. MSLE works best, but it does not penalize large errors, so the MSE (used as a metric) varies enormously during training. Additionally, after a while the network becomes stale and the loss no longer improves (my explanation is that the differences in loss become too small on the logarithmic scale, so the weights cannot be adjusted well).
I can partially answer my own question. One issue is that I used ReLU/LeakyReLU, which leads to an exploding-gradient problem: the RNN/LSTM layer applies the same weights at every time step, so the values add up and can explode, and there is no way for them to be pushed back down (ReLU's minimum is 0). With tanh as the activation, negative values are possible, which allows the internal state to shrink again and greatly reduces the chance of exploding weights/predictions within the network. After some tests, the LSTM layer stays numerically stable.
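For reference, a sketch of the model from the question with that change applied; the names inputs, outputs, and n_features (standing in for input_shape[1]) are this sketch's, not from the original code:

from tensorflow.keras.layers import (Input, Conv1D, Dropout, LSTM, Dense,
                                     Reshape, LeakyReLU)
from tensorflow.keras.models import Model

inputs = Input(shape=(None, n_features))  # n_features = 12 in the question
x = Conv1D(filters=64, kernel_size=3, activation=LeakyReLU())(inputs)
x = Conv1D(filters=32, kernel_size=3, activation=LeakyReLU())(x)
x = Dropout(0.3)(x)
x = LSTM(32, activation="tanh")(x)        # tanh keeps the recurrent state bounded
x = Dense(forecasting_horizon * n_features)(x)
outputs = Reshape((forecasting_horizon, n_features))(x)
model = Model(inputs=inputs, outputs=outputs)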
I have the following (partial) network architecture, obtained by:
...
pool = GlobalAvgPool()(gc_2)
predictions = Dense(units=32, activation='relu', use_bias=False)(pool)
predictions = BatchNormalization()(predictions)
...
I am trying to insert a batch normalization layer, but I get the following error:
ValueError: Input 0 of layer batch_normalization_1 is incompatible with the layer: expected ndim=2, found ndim=3. Full shape received: [None, 1, 32]
I am guessing the second dimension is causing this mishap. Is there any way I can get rid of it?
If your model compiles successfully, there is no problem with your model definition.
This is more likely to happen because the shape and dimensions of your input data are incompatible with the input shape your model expects.
expected ndim=2, found ndim=3 means that the layer requires a 2-D tensor of shape (batch, features), but it received a 3-D tensor of shape [None, 1, 32]. Check the shape of the data you feed into the model; a stray dimension of size 1 in the input will propagate through the network and can produce exactly this mismatch.
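As a minimal sketch of that check, assuming the input is a NumPy array (the name x_train is hypothetical):

import numpy as np

print(x_train.shape)  # e.g. (N, 1, 32): a stray axis of size 1

# Drop the size-1 middle axis so each sample is a flat feature vector again.
x_train = np.squeeze(x_train, axis=1)  # (N, 1, 32) -> (N, 32)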
Let us say that I build an extremely simple CNN with Keras to classify vectors.
My input (X_train) is a matrix in which each row is a vector and each column is a feature. My labels (y_train) form a matrix in which each row is a one-hot encoded vector. This is a binary classifier.
My CNN is built as follows:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, Activation, Flatten, Dense

model = Sequential()
model.add(Conv1D(64, 3))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dense(2))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=32)
But when I try to run this code, I get back this error message:
Input 0 is incompatible with layer conv1d_23: expected ndim=3, found ndim=2
Why would Keras expect 3 dimensions? One dimension for samples and one for features should be enough. And, more importantly, how can I fix this?
X_train is supposed to have the shape (batch_size, steps, input_dim); see the documentation. It seems like you are missing one of the dimensions.
I would guess that input_dim in your case is 1, and that is the dimension you are missing. If so, change the model.fit line to

model.fit(tf.expand_dims(X_train, -1), y_train, batch_size=32)
Your code is not a minimal working example, so I am not able to verify if that is the only problem, but this should hopefully fix your current error message.
A Conv1D layer expects an input with shape (samples, width, channels), so this does not match your input data, producing an error.
The convolution operation is performed along the width dimension, so, assuming you want to convolve over what you call features, you should reshape your data to add a dummy channels dimension with a value of one:
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
I am currently learning deep learning and Keras. When I execute this code I get a weird error: "TypeError: Unable to build Dense layer with non-floating point dtype", and I can't figure out what the problem is. What am I missing? How do I fix this weird error?
The error shows up at the model.fit(...) call.
def create_nerual_network():
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))  # Simple Dense layer
    model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))  # Simple Dense layer
    model.add(tf.keras.layers.Dense(2, activation=tf.nn.softmax))  # Output layer
    return model
train_images, train_labels = load_dataset() #this function works fine
model = create_nerual_network()
model.compile(optimizer='adam',loss='sparse_categorical_crossentropy',metrics=['accuracy'])
model.fit(train_images, train_labels, epochs = 15, verbose=2)
train_loss, train_acc = model.evaluate(train_images, train_labels)
It is interesting that you do not specify your input shape anywhere before the model compilation, but maybe newer versions of Keras can infer it from the provided input.
In that case, I am quite certain that the problem is with train_images. Look at the dtype of that array: it is probably uint8, the usual format for images, since they use 8-bit integers for each color channel.
It is common practice to at least normalize your data before training and always convert it to float.
Try putting this before calling model.fit:
train_images = train_images / 256.
This will normalize your images into the range [0, 1) and convert the array to floats. It is possible that you also have to convert your labels to floats.
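For example, a quick check along those lines (array names as in the question; the printed dtypes are just what one would typically see):

print(train_images.dtype, train_labels.dtype)  # e.g. uint8 int32 before conversion

train_images = train_images / 256.0  # float array with values in [0, 1)
print(train_images.dtype)            # float64 after the division

model.fit(train_images, train_labels, epochs=15, verbose=2)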