I currently have a simple LSTM model implemented in Keras, with a training set x_train of shape (1, 18227, 98) and a test set x_test of shape (1, 3217, 98), i.e. (batch, timesteps, features). The model trains without a hitch, but when I attempt to evaluate it on my test set, I receive this error:
File "keras_LSTM.py", line 170, in <module>
loss, f1, precision = model.evaluate(x_test, y_test,
batch_size=batch_size)
File
"/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-
packages/keras/engine/training.py", line 1102, in evaluate
batch_size=batch_size)
File
"/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/engine/training.py", line 751, in _standardize_user_data
exception_prefix='input')
File
"/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-
packages/keras/engine/training_utils.py", line 138, in
standardize_input_data
str(data_shape))
ValueError: Error when checking input: expected lstm_1_input to have
shape (18227, 98) but got array with shape (3217, 98)
Any help would be greatly appreciated - I will provide code if needed. It should also be noted that my input shapes are three-dimensional; the error report, however, omits the batch_size dimension and outputs a tuple of (sequence_length, feature_number).
The Keras LSTM layer expects a 3-dimensional input of shape (batch_size, seq_length, input_dims), but you have assigned yours incorrectly. Try this.
First, drop the leading batch dimension of 1 from your (1, 18227, 98) training set and (1, 3217, 98) test set:
X_train = x_train.reshape(-1, 98)
X_test = x_test.reshape(-1, 98)
Now choose a seq_length; I chose 10.
import numpy as np

seq_length = 10
X_train1 = []
X_test1 = []
# Slide a window of seq_length steps over each sequence. The train and
# test sequences have different lengths, so loop over each separately.
for i in range(X_train.shape[0] - seq_length):
    X_train1.append(X_train[i:i+seq_length])
    # labels.append(labels[i+seq_length-1])
for i in range(X_test.shape[0] - seq_length):
    X_test1.append(X_test[i:i+seq_length])
X_train1 = np.reshape(X_train1, (-1, seq_length, 98))
X_test1 = np.reshape(X_test1, (-1, seq_length, 98))
Now, you are good to go
input_dims = 98 # an integer
seq_length = 10 # an integer
model = Sequential()
model.add(LSTM(128, activation='relu', input_shape=(seq_length, input_dims), return_sequences=True))
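If it helps, training on the windowed arrays might look like the sketch below; y_train1 and y_test1 are hypothetical label arrays that you would build with the same sliding window (see the commented-out labels line above).
from keras.layers import Dense

model.add(LSTM(64))   # final LSTM collapses the sequence to (batch, 64)
model.add(Dense(1))   # adapt units/activation to your actual targets
model.compile(optimizer='adam', loss='mse')
model.fit(X_train1, y_train1, batch_size=32, epochs=10)
model.evaluate(X_test1, y_test1, batch_size=32)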
You were using a single long sequence for your model, which is an ineffective approach.
Ankish Bansal's idea of splitting the long sequence into small windows may be a good approach. But you might want to keep the entire sequence connected for some reason.
In that case, you should set your input_shape=(None,98), this way, your model accepts any sequence length. (Provided you don't use any Flatten layer or others that require fixed dimensions.)
But, if there is "more than one sequence" in your data, you should probably review everything, because the number of sequences usually should be the batch size.
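For example, a minimal sketch of the variable-length setup (layer sizes are placeholders, and the TimeDistributed head is just one option):
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

# input_shape=(None, 98): any sequence length, 98 features per step
model = Sequential()
model.add(LSTM(128, input_shape=(None, 98), return_sequences=True))
model.add(TimeDistributed(Dense(1)))  # no Flatten: that would need a fixed length
model.compile(optimizer='adam', loss='mse')
# Both x_train (1, 18227, 98) and x_test (1, 3217, 98) are now valid inputs.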
I am experimenting with a stateful LSTM on a time-series regression problem using TensorFlow. I apologize that I cannot share the dataset.
Below is my code.
train_feature = train_feature.reshape((train_feature.shape[0], 1, train_feature.shape[1]))
val_feature = val_feature.reshape((val_feature.shape[0], 1, val_feature.shape[1]))

batch_size = 64

model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(50, batch_input_shape=(batch_size, train_feature.shape[1], train_feature.shape[2]), stateful=True))
model.add(tf.keras.layers.Dense(1))

model.compile(optimizer='adam',
              loss='mse',
              metrics=[tf.keras.metrics.RootMeanSquaredError()])

model.fit(train_feature, train_label,
          epochs=10,
          batch_size=batch_size)
When I run the above code, at the end of the first epoch I get the following error.
InvalidArgumentError: [_Derived_] Invalid input_h shape: [1,64,50] [1,49,50]
[[{{node CudnnRNN}}]]
[[sequential_1/lstm_1/StatefulPartitionedCall]] [Op:__inference_train_function_1152847]
Function call stack:
train_function -> train_function -> train_function
However, the model trains successfully if I change the batch_size to 1 and change the training code to the following.
total_epochs = 10
for i in range(total_epochs):
    model.fit(train_feature, train_label,
              epochs=1,
              validation_data=(val_feature, val_label),
              batch_size=batch_size,
              shuffle=False)
    model.reset_states()
Nevertheless, with very large data (1 million rows), training takes a very long time when the batch_size is 1.
So, I wonder, how to train a stateful LSTM with a batch size larger than 1 (e.g. 64), without getting the invalid input_h shape error?
Thanks for your answers.
The fix is to ensure batch size never changes between batches. They must all be the same size.
Method 1
One way is to use a batch size that perfectly divides your dataset into equal-sized batches. For example, if the total size of your data is 1500 examples, use a batch size of 50 or 100 or some other proper divisor of 1500.
batch_size = len(data) // proper_divisor  # integer division so batch_size is an int
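For instance, a small helper (the name is my own) to pick the largest divisor of the dataset size that does not exceed a desired batch size:
def largest_divisor_batch_size(n_samples, max_batch_size=64):
    # Return the largest divisor of n_samples that is <= max_batch_size.
    for candidate in range(max_batch_size, 0, -1):
        if n_samples % candidate == 0:
            return candidate

batch_size = largest_divisor_batch_size(len(train_feature))  # e.g. 60 for 1500 samples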
Method 2
The other way is to drop any batch that is smaller than the specified size, which can be done with the TensorFlow Dataset API by setting drop_remainder to True.
batch_size = 64
train_data = tf.data.Dataset.from_tensor_slices((train_feature, train_label))
train_data = train_data.repeat().batch(batch_size, drop_remainder=True)
steps_per_epoch = len(train_feature) // batch_size
model.fit(train_data,
          epochs=10, steps_per_epoch=steps_per_epoch)
When using the Dataset API like this, you will also need to specify how many rounds of training count as an epoch (essentially how many batches count as one epoch). A tf.data.Dataset instance (the result of tf.data.Dataset.from_tensor_slices) doesn't know the size of the data it's streaming to the model, so what constitutes one epoch has to be specified manually with steps_per_epoch.
Your new code will look like this:
train_feature = train_feature.reshape((train_feature.shape[0], 1, train_feature.shape[1]))
val_feature = val_feature.reshape((val_feature.shape[0], 1, val_feature.shape[1]))
batch_size = 64
train_data = tf.data.Dataset.from_tensor_slices((train_feature, train_label))
train_data = train_data.repeat().batch(batch_size, drop_remainder=True)
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(50, batch_input_shape=(batch_size, train_feature.shape[1], train_feature.shape[2]), stateful=True))
model.add(tf.keras.layers.Dense(1))
model.compile(optimizer='adam',
              loss='mse',
              metrics=[tf.keras.metrics.RootMeanSquaredError()])

steps_per_epoch = len(train_feature) // batch_size
model.fit(train_data,
          epochs=10, steps_per_epoch=steps_per_epoch)
You can also include the validation set, like this (other code not shown):
batch_size = 64
val_data = tf.data.Dataset.from_tensor_slices((val_feature, val_label))
val_data = val_data.repeat().batch(batch_size, drop_remainder=True)
validation_steps = len(val_feature) // batch_size
model.fit(train_data, epochs=10,
          steps_per_epoch=steps_per_epoch,
          validation_data=val_data,
          validation_steps=validation_steps)
Caveat: This means a few datapoints will never be seen by the model. To get around that, you can shuffle the dataset each epoch, so that the datapoints left out at the end of each epoch change, giving every datapoint a chance to be seen by the model.
buffer_size = 1000  # bigger is slower but shuffles more thoroughly
train_data = tf.data.Dataset.from_tensor_slices((train_feature, train_label))
train_data = train_data.shuffle(buffer_size=buffer_size, reshuffle_each_iteration=True)
train_data = train_data.repeat().batch(batch_size, drop_remainder=True)
Why the error occurs
Stateful RNNs and their variants (LSTM, GRU, etc.) require a fixed batch size. The reason is simply that statefulness is one way to realize Truncated Backpropagation Through Time: the final hidden state of one batch is passed as the initial hidden state of the next batch. The final hidden state of the first batch has to have exactly the same shape as the initial hidden state of the next batch, which requires that the batch size stay the same across batches.
When you set the batch size to 64, model.fit will use whatever data remains at the end of an epoch as the final batch, and this may hold fewer than 64 datapoints. You get the error because that batch size differs from what the stateful LSTM expects. You don't have this problem with a batch size of 1 because the remaining data at the end of an epoch always contains exactly 1 datapoint: 1 is a divisor of every integer. More generally, if you pick any other divisor of your data size, you should not get the error.
In the error message you posted, it appears the last batch has a size of 49 instead of 64. On a side note: the reason the shapes look different from the input is that, under the hood, Keras works with the tensors in time-major layout (i.e. the first axis indexes the steps of the sequence). When you pass a tensor of shape (10, 15, 2) representing (batch_size, steps_per_sequence, num_features), Keras reshapes it to (15, 10, 2) under the hood.
Hello, I am trying to get an output array of 7 classes, but when I run my code it says it expects my output labels to have a different shape. Here is my code -
def make_model(self):
    self.model.add(InceptionV3(include_top=False,
                               input_shape=(self.WIDTH, self.HEIGHT, 3),
                               weights="imagenet"))
    self.model.add(Dense(7, activation='softmax'))
    self.model.layers[0].trainable = False
My model compilation and fitting part:
def train(self):
    self.model.compile(optimizer=self.optimizer, loss='mse', metrics=['accuracy'])
    self.model.fit(x=x, y=y, batch_size=64,
                   validation_split=0.15, shuffle=True, epochs=self.epochs,
                   callbacks=[self.tensorboard, self.reducelr])
I get the error -
File "model.py", line 60, in train
callbacks=[self.tensorboard, self.reducelr])
ValueError: A target array with shape (23639, 7) was passed for an output of shape (None, 6, 13, 7) while using as loss `mean_squared_error`. This loss expects targets to have the same shape as the output.
Now here it is saying that it expected (None, 6, 13, 7), however I gave it labels of shape (23639, 7).
We can clearly see that in self.model.add(Dense(7, activation='softmax')) I have specified 7 as the number of output categories.
Here is the model summary -
So can someone tell me what is wrong here?
By the way, I did try using categorical_crossentropy to see if it made a difference, but it didn't.
In case you wanted the full code -
Full Code
The problem is in the output of the InceptionV3: it returns 4D feature maps, so you need to reduce their dimensionality before the final dense layer in order to match the target dimensionality (2D). You can do this using Flatten or GlobalPooling layers.
If yours is a classification problem, I also recommend you use categorical_crossentropy (if you have one-hot encoded labels) or sparse_categorical_crossentropy (if you have integer-encoded labels). mse is suited for regression problems.
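For example, a minimal sketch of the fix (tf.keras imports for concreteness; WIDTH and HEIGHT stand in for self.WIDTH and self.HEIGHT from the question):
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Sequential

model = Sequential()
model.add(InceptionV3(include_top=False, input_shape=(WIDTH, HEIGHT, 3), weights="imagenet"))
model.add(GlobalAveragePooling2D())        # (None, 6, 13, 2048) -> (None, 2048)
model.add(Dense(7, activation='softmax'))  # (None, 7), matching the (23639, 7) targets
model.layers[0].trainable = False
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])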
I want to get the value of an intermediate tensor in a convolutional neural network for a specific input. I know how to do this in keras and even though I have trained a model using keras, I'm going to move towards constructing and training the model using only tensorflow. Therefore, I want to move away from something like K.function(input_layer, output_layer) which is fairly simple, and instead use tensorflow. I believe I should use placeholder values, like the following approach:
with tf.compat.v1.Session(graph=tf.Graph()) as sess:
    loaded_model = tf.keras.models.load_model(filepath)
    graph = tf.compat.v1.get_default_graph()
    images = tf.compat.v1.placeholder(tf.float32, shape=(None, 28, 28, 1))  # to specify input as MNIST images
    output_tensor = graph.get_tensor_by_name(tensor_name)  # tensor_name is 'dense_1/MatMul:0'
    output = sess.run([output_tensor], feed_dict={images: x_test[0:1]})  # x_test[0:1] has shape (1, 28, 28, 1)
    print(output)
However, I get the following error message for the sess.run() line: Invalid argument: You must feed a value for placeholder tensor 'conv2d_2_input' with dtype float and shape [?,28,28,1]. I am unsure why I get this message, because the image used for feed_dict is of type float and is, as far as I can tell, the correct shape. Any help would be appreciated.
You must use the input tensor from the Keras model, not make your own new placeholder, which would be disconnected from the rest of the model:
with tf.Graph().as_default(), tf.compat.v1.Session() as sess:
    # Load model
    loaded_model = tf.keras.models.load_model(filepath)
    # Take model input tensor
    images = loaded_model.input
    # Take output of the second layer (index 1)
    output_tensor = loaded_model.layers[1].output
    # Evaluate
    output = sess.run(output_tensor, feed_dict={images: x_test[0:1]})
    print(output)
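As a side note, in TF 2.x eager mode the same thing can be done without sessions by wrapping the layer of interest in a new model (a sketch, not part of the original answer):
import tensorflow as tf

loaded_model = tf.keras.models.load_model(filepath)
# A sub-model mapping the original input to the intermediate output
intermediate_model = tf.keras.Model(inputs=loaded_model.input,
                                    outputs=loaded_model.layers[1].output)
output = intermediate_model(x_test[0:1])  # eager tensor, no session or feed_dict
print(output.numpy())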
Getting a ValueError related to shape when passing a TensorFlow Dataset into Keras's model.fit function.
My dataset's X_train has shape (100 samples x 62 features) and Y_train has shape (100 samples x 1 label).
Reproducible code below:
import numpy as np
from tensorflow.keras import layers, Sequential, optimizers
from tensorflow.data import Dataset
num_samples = 100
num_features = 62
num_labels = 1
batch_size = 32
steps_per_epoch = int(num_samples/batch_size)
X_train = np.random.rand(num_samples,num_features)
Y_train = np.random.rand(num_samples, num_labels)
final_dataset = Dataset.from_tensor_slices((X_train, Y_train))
model = Sequential()
model.add(layers.Dense(256, activation='relu',input_shape=(num_features,)))
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(num_labels, activation='softmax'))
model.compile(optimizer=optimizers.Adam(0.001), loss='categorical_crossentropy',metrics=['accuracy'])
history = model.fit(final_dataset, epochs=10, batch_size=batch_size, steps_per_epoch=steps_per_epoch)
The error is:
ValueError: Error when checking input: expected dense_input to have shape (62,) but got array with shape (1,)
Why is dense_input getting an array of shape (1,)? I am clearly passing it an X_train of shape (n_samples, n_features).
Interestingly, the error goes away if I apply a batch(some_number) call to the dataset, but it seems like I am missing something.
It's intended behavior.
When you use a TensorFlow Dataset, you shouldn't specify batch_size in the fit method of Model. Instead, as you mentioned, you have to generate the batches with the dataset itself.
As mentioned here in the documentation
batch_size: Integer or None. Number of samples per gradient update. If unspecified, batch_size will default to 32. Do not specify the batch_size if your data is in the form of symbolic tensors, dataset, dataset iterators, generators, or keras.utils.Sequence instances (since they generate batches).
The classical behavior is therefore to do as you did: generate the batches with the dataset.
Also use repeat if you want to perform multiple epochs. On the .fit side, you'll have to specify steps_per_epoch to indicate how many batches make one epoch, and epochs for your number of epochs.
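Applied to the code from the question, that would look something like this sketch (the rest of the model definition stays as it was):
batch_size = 32
steps_per_epoch = num_samples // batch_size

# Batch (and repeat) the dataset instead of passing batch_size to fit
final_dataset = Dataset.from_tensor_slices((X_train, Y_train))
final_dataset = final_dataset.repeat().batch(batch_size)

history = model.fit(final_dataset, epochs=10, steps_per_epoch=steps_per_epoch)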
I would like to do transfer learning from the given InceptionV3 in the TensorFlow example. Following the classify-image example and the operator and tensor names given here https://github.com/AKSHAYUBHAT/VisualSearchServer/blob/master/notebooks/notebook_network.ipynb I can create my graph. But when I put a batch of images of size (100, 299, 299, 3) into the pre-computed Inception graph, I get the following shape error at the pool_3 layer:
ValueError: Cannot reshape a tensor with 204800 elements to shape [1, 2048] (2048 elements)
It seems that this InceptionV3 graph doesn't accept an image batch as input. Am I wrong?
Actually it works for transfer learning if you extract the right thing. There is no problem feeding a batch of images in the shape of [N, 299, 299, 3] as ResizeBilinear:0 and then using the pool_3:0 tensor. It's the reshaping afterwards that breaks, but you can reshape yourself (you'll have your own layers afterwards anyway). If you wanted to use the original classifier with a batch, you could add your own reshaping on top of pool_3:0 and then add the softmax layer, reusing the weights/biases tensors of the original softmax.
TLDR: With double_img being a stack of two images with shape (2, 299, 299, 3) this works:
pooled_2 = sess.graph.get_tensor_by_name("pool_3:0").eval(session=sess, feed_dict={'ResizeBilinear:0':double_img})
pooled_2.shape
# => (2, 1, 1, 2048)
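From there you can flatten the pooled features yourself before adding your own layers (a sketch):
import numpy as np

# Collapse the two singleton spatial dims: (2, 1, 1, 2048) -> (2, 2048)
features = np.reshape(pooled_2, (-1, 2048))
features.shape
# => (2, 2048)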
You're not wrong. This seems like a very reasonable feature request, so I've opened a ticket for it on github. Follow that for updates.
Something like this should do it:
with g.as_default():
    inputs = tf.placeholder(tf.float32, shape=[batch_size, 299, 299, 3],
                            name='input')
    with slim.arg_scope(inception.inception_v3_arg_scope()):
        logits, end_points = inception.inception_v3(inputs,
                                                    num_classes=FLAGS.num_classes,
                                                    is_training=False)
    variables_to_restore = slim.get_variables_to_restore(exclude=exclude)
    sess = tf.Session()
    saver = tf_saver.Saver(variables_to_restore)
Then you should be able to call the operation:
sess.run("pool_3:0",feed_dict={'ResizeBilinear:0':images})
etarion made a very good point. However, we don't have to reshape it ourselves; instead, we can change the value of the shape that the reshape op takes as input. I.e.,
input_tensor_name = 'import/input:0'
shape_tensor_name = 'import/InceptionV3/Predictions/Shape:0'
output_tensor_name = 'import/InceptionV3/Predictions/Reshape_1:0'

output_tensor = tf.import_graph_def(
    graph.as_graph_def(),
    input_map={input_tensor_name: image_batch,
               shape_tensor_name: [batch_size, num_class]},
    return_elements=[output_tensor_name])
These tensor names are based on inception_v3_2016_08_28_frozen.pb.
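A hedged usage sketch, assuming image_batch is a tensor already in the default graph (e.g. a tf.constant of shape [batch_size, 299, 299, 3]):
import tensorflow as tf

with tf.Session() as sess:
    # import_graph_def returns a list; here it holds the single remapped tensor
    predictions = sess.run(output_tensor[0])
    print(predictions.shape)  # expected: (batch_size, num_class)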