ValueError: Input 0 of layer is incompatible with the layer: expected shape=(None, 224, 224, 3), found shape=(224, 224, 3), what is the problem? - tensorflow

I am trying to build a machine learning model using pre-trained VGG16 with tensorflow, but I keep getting the same problem with the shape of the input. Compared to other public codes, the only difference is that I use a tf.data.dataset to share the data, instead of the DirectoryIterator of tf.image
Here is my code:
zip_ref = ZipFile(zip_file, 'r')
zip_ref.extractall(repository_dir)
zip_ref.close()
train_dir = os.path.join(repository_dir, "seg_train", "seg_train")
test_dir = os.path.join(repository_dir, "seg_test", "seg_test")
os.system(f"rm -r {os.path.join(repository_dir, 'seg_pred')}")
# load variables
validation_percentage = 0.2
label_mode = "int"
# for our model purposes
img_size = (224, 224)
color_mode='rgb'
data_train, data_val = image_dataset_from_directory(
train_dir,
batch_size=None,
label_mode=label_mode,
color_mode=color_mode,
image_size=img_size,
validation_split=validation_percentage,
subset="both",
seed=123,
)
data_test = image_dataset_from_directory(
test_dir,
batch_size=None,
label_mode=label_mode,
color_mode=color_mode,
image_size=img_size,
)
classes = data_train.class_names
print(classes)
scale = 1.0/255
normalization_layer = tf.keras.layers.Rescaling(scale)
data_train_norm = data_train.map(lambda x,y: (normalization_layer(x), y))
data_val_norm = data_val.map(lambda x,y: (normalization_layer(x), y))
data_test_norm = data_test.map(lambda x,y: (normalization_layer(x), y))
input_size = None
for img, label in data_train_norm.take(1).as_numpy_iterator():
input_size = img.shape
print(input_size)
base_model = VGG16(
input_shape=input_size, # Shape of our images
include_top = False, # Leave out the last fully connected layer
weights = 'imagenet'
)
# we do not train the parameters
for layer in base_model.layers:
layer.trainable = False
# Flatten the output layer to 1 dimension
x = layers.Flatten()(base_model.output)
# https://medium.com/analytics-vidhya/car-brand-classification-using-vgg16-transfer-learning-f219a0f09765
# FC layer very simple and with a softmax activation unit
x = layers.Dense(len(classes), activation="softmax")(x)
landscapeModel01 = Model(inputs=base_model.input, outputs=x, name="landscapeModel01")
loss = "sparse_categorical_crossentropy"
optimizer = "adam"
landscapeModel01.compile(
optimizer=optimizer,
loss=loss,
metrics=["loss","accuracy"]
)
#fit data
shuffle=True # variable
epochs=50 # variable, according if it is able to converge
batch_size = 200
print(landscapeModel01.input)
landscapeModel01.fit(
data_train_norm,
validation_data=data_val_norm,
epochs=epochs,
shuffle=shuffle,
batch_size=batch_size
)
and this is the error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In [10], line 8
4 batch_size = 200
6 print(landscapeModel01.input)
----> 8 landscapeModel01.fit(
9 data_train_norm,
10 validation_data=data_val_norm,
11 epochs=epochs,
12 shuffle=shuffle,
13 batch_size=batch_size
14 )
File ~/anaconda3/envs/faa/lib/python3.10/site-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
67 filtered_tb = _process_traceback_frames(e.__traceback__)
68 # To get the full stack trace, call:
69 # `tf.debugging.disable_traceback_filtering()`
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
File /tmp/__autograph_generated_file8y_bf523.py:15, in outer_factory.<locals>.inner_factory.<locals>.tf__train_function(iterator)
13 try:
14 do_return = True
---> 15 retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
16 except:
17 do_return = False
ValueError: in user code:
File "/home/renan/anaconda3/envs/faa/lib/python3.10/site-packages/keras/engine/training.py", line 1160, in train_function *
return step_function(self, iterator)
File "/home/renan/anaconda3/envs/faa/lib/python3.10/site-packages/keras/engine/training.py", line 1146, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/home/renan/anaconda3/envs/faa/lib/python3.10/site-packages/keras/engine/training.py", line 1135, in run_step **
outputs = model.train_step(data)
File "/home/renan/anaconda3/envs/faa/lib/python3.10/site-packages/keras/engine/training.py", line 993, in train_step
y_pred = self(x, training=True)
File "/home/renan/anaconda3/envs/faa/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/renan/anaconda3/envs/faa/lib/python3.10/site-packages/keras/engine/input_spec.py", line 295, in assert_input_compatibility
raise ValueError(
ValueError: Input 0 of layer "landscapeModel01" is incompatible with the layer: expected shape=(None, 224, 224, 3), found shape=(224, 224, 3)
What can I fix to make the code work?
versions:
tensorflow==2.10.0
#EDIT
I just found the solution: I was loading images with a batch size equals none, but the trained model demanded that the images had one, even if it was 1.

Solution
I just needed to load images in the image_dataset_from_directory with a batch_size parameter different from None. Considering my investigation did not consider data augmentation in the beginning, I chose 1.

Related

ML error says: "ValueError: Input 0 of layer "sequential_6" is incompatible with the layer: expected shape=(None, 42), found shape=(None, 41)"

I needed to increase the accuracy of a model. So I tried using TabNet.
I'm attaching the train & test data in a google drive link
Link: https://drive.google.com/drive/folders/1ronho26m9uX9_ooBTh0M81ox1a43ika8?usp=sharing
Here is my code.
import tensorflow as tf
import pandas as pd
# Load the train and test data into pandas dataframes
train_df = pd.read_csv("train.csv")
#train_df1 = pd.read_csv("train.csv")
test_df = pd.read_csv("test.csv")
# Split the target variable and the features
train_labels = train_df[[f'F_{i}' for i in range(40)]]
#train_labels=trai
test_labels = train_df.target
# Convert the dataframes to tensors
train_dataset = tf.data.Dataset.from_tensor_slices((train_df.values, train_labels.values))
test_dataset = tf.data.Dataset.from_tensor_slices((test_df.values, test_labels.values))
# Define the model using the TabNet architecture
model = tf.keras.models.Sequential([
tf.keras.layers.Input(shape=(train_df.shape[1],)),
tf.keras.layers.Dense(32, activation="relu"),
tf.keras.layers.Dense(64, activation="relu"),
tf.keras.layers.Dense(128, activation="relu"),
tf.keras.layers.Dense(1)
])
# Compile the model with a mean squared error loss function and the Adam optimizer
model.compile(loss="mean_squared_error", optimizer="adam")
# Train the model on the training data
model.fit(train_dataset.batch(32), epochs=5)
# Make predictions on the test data
predictions = model.predict(test_dataset.batch(32))
#predictions = model.predict(test_dataset)
# Evaluate the model on the test data
mse = tf.keras.losses.mean_squared_error(test_labels, predictions)
print("Mean Squared Error:", mse.numpy().mean())
I don't know what's wrong with it as I'm just a beginner.
Here is the error code:
ValueError Traceback (most recent call last)
<ipython-input-40-87712e1604a9> in <module>
24
25 # Make predictions on the test data
---> 26 predictions = model.predict(test_dataset.batch(32))
27 #predictions = model.predict(test_dataset)
28
1 frames
/usr/local/lib/python3.8/dist-packages/keras/engine/training.py in tf__predict_function(iterator)
13 try:
14 do_return = True
---> 15 retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
16 except:
17 do_return = False
ValueError: in user code:
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1845, in predict_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1834, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1823, in run_step **
outputs = model.predict_step(data)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1791, in predict_step
return self(x, training=False)
File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/usr/local/lib/python3.8/dist-packages/keras/engine/input_spec.py", line 264, in assert_input_compatibility
raise ValueError(f'Input {input_index} of layer "{layer_name}" is '
ValueError: Input 0 of layer "sequential_6" is incompatible with the layer: expected shape=(None, 42), found shape=(None, 41)
I didn't really know what to do. So I'm expecting help from you guys. Would be grateful to whatever tips you guys give.

Keras slicing index out of range beginner question

I am using Keras and Tensorflow for the first time. I am trying to get vgg16 to train on the imagenette dataset.
Im not sure why I am getting an index out of range error. Maybe I am doing something obviously wrong. Ive been messing with this for awhile now so any help would be great!
Attached is the error and source code
I resized my images to the correct size based on the vgg16 documentation: https://keras.io/api/applications/vgg/#vgg16-function
`
tfds_name = 'imagenette'
(ds_train, ds_validation), ds_info= tfds.load(
name=tfds_name,
split=['train', 'validation'],
with_info = True,
as_supervised=True)
#model from assignment link https://keras.io/api/applications/vgg/#vgg16-function
ourModel = tf.keras.applications.VGG16(
include_top=True, #3 fill layers on top
weights="imagenet", #use imagenet
input_tensor=None, #use another layer as input
input_shape=None, #inly set if include to false
pooling=None, #use with include top false
classes=1000, #number of classes to set, we use imagenet values
classifier_activation="softmax", # classifier on input can only be none or softmax on pretrained
)
#make it so layers frozen
#for layer in ourModel.layers[:-1]:
# layer.trainable = False
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
ourModel.compile(optimizer="adam",
loss=loss_fn,
metrics=['accuracy'])
def reshape(img,label):
img = tf.cast(img, tf.float32)
img = tf.image.resize(img, (224,224))
resize_image = tf.reshape(img, [-1, 224, 224, 3])
resize_image = preprocess_input(resize_image)
return resize_image, label
ds_train = ds_train.map(reshape)
ds_validation = ds_validation.map(reshape)
ourModel.fit(ds_train,
epochs=10,
validation_data = ds_validation)
`
Error:
ValueError: in user code:
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1051, in train_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1040, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1030, in run_step **
outputs = model.train_step(data)
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 890, in train_step
loss = self.compute_loss(x, y, y_pred, sample_weight)
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 949, in compute_loss
y, y_pred, sample_weight, regularization_losses=self.losses)
File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 212, in __call__
batch_dim = tf.shape(y_t)[0]
ValueError: slice index 0 of dimension 0 out of bounds. for '{{node strided_slice}} = StridedSlice[Index=DT_INT32, T=DT_INT32, begin_mask=0, ellipsis_mask=0, end_mask=0, new_axis_mask=0, shrink_axis_mask=1](Shape, strided_slice/stack, strided_slice/stack_1, strided_slice/stack_2)' with input shapes: [0], [1], [1], [1] and with computed input tensors: input[1] = <0>, input[2] = <1>, input[3] = <1>.
Your reshape function is creating a single batch for the image data but not for label data. So return resize_image, label[None, ...] should fix the above bug.
But the right way to batch the train (and valid dataset) is to do: ds_train=ds_train.batch(BATCH_SIZE) and then removing tf.reshape(img, [-1, 224, 224, 3])

The last dimension of the inputs to a Dense layer should be defined. Found None. Full input shape received: (None, None)

I am trying to build a binary Prediction model through Keras TensorFlow.
And I am having trouble when I add data augmentation inside.
This is my code.
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)
train_dir = 'C:/train'
test_dir = 'C:/test'
train_data = train_datagen.flow_from_directory(train_dir,target_size=(224,224),class_mode='binary',seed=42)
test_data = test_datagen.flow_from_directory(test_dir,target_size=(224,224),class_mode='binary',seed=42)
tf.random.set_seed(42)
from tensorflow.keras.layers.experimental import preprocessing
data_augmentation = keras.Sequential([
preprocessing.RandomFlip("horizontal"),
preprocessing.RandomZoom(0.2),
preprocessing.RandomRotation(0.2),
preprocessing.RandomHeight(0.2),
preprocessing.RandomWidth(0.2),
], name='data_augmentation')
model_1 = tf.keras.Sequential([
tf.keras.layers.Input(shape=(224,224,3),name='input_layer'),
data_augmentation,
tf.keras.layers.Conv2D(20,3,activation='relu'),
tf.keras.layers.MaxPool2D(pool_size=2),
tf.keras.layers.Conv2D(20,3,activation='relu'),
tf.keras.layers.MaxPool2D(pool_size=2),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(1,activation='sigmoid')
])
model_1.compile(loss=tf.keras.losses.binary_crossentropy,
optimizer='Adam',
metrics=['accuracy'])
model_1.fit(train_data,epochs=10,validation_data=test_data)
I ready tried this way but error again
inputs = tf.keras.layers.Input(shape=(224,224,3),name='input_layer')
x = data_augmentation(inputs)
x = tf.keras.layers.Conv2D(20,3,activation='relu')(x)
x = tf.keras.layers.Conv2D(20,3,activation='relu')(x)
x = tf.keras.layers.MaxPool2D(pool_size=2)(x)
x = tf.keras.layers.Flatten()(x)
outputs = tf.keras.layers.Dense(1,activation='sigmoid')(x)
model_1 = tf.keras.Model(inputs,outputs)
and this is error message:
Traceback (most recent call last):
File "C:/Users/pondy/PycharmProjects/pythonProject2/main.py", line 60, in <module>
model_1 = tf.keras.Sequential([
File "C:\Users\pondy\PycharmProjects\pythonProject2\venv\lib\site-packages\tensorflow\python\training\tracking\base.py", line 530, in _method_wrapper
result = method(self, *args, **kwargs)
File "C:\Users\pondy\PycharmProjects\pythonProject2\venv\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:\Users\pondy\PycharmProjects\pythonProject2\venv\lib\site-packages\keras\layers\core\dense.py", line 139, in build
raise ValueError('The last dimension of the inputs to a Dense layer '
ValueError: The last dimension of the inputs to a Dense layer should be defined. Found None. Full input shape received: (None, None)
Process finished with exit code 1
if didn't add data_augmentation inside it's not error
thank you for help<3
your code will work if you remove
preprocessing.RandomHeight(0.2),
preprocessing.RandomWidth(0.2),

tf.data.Dataset.from_generator : value error "not enough values to unpack (expected 2, got 1)"

I am currently trying to do a classification problem using a pre-trained transformer model. I wrote a custom generator using tf.data.Dataset.from_generator method. The model takes two inputs : input_id and attn_mask. While calling model.fit i am getting value error "not enough values to unpack (expected 2, got 1)" The received arguments list shows it got both input_id and attn_mask. Can anyone help me solve this?
import tensorflow.keras as keras
from tensorflow.keras.models import Model
from transformers import TFBertModel,BertConfig
def _input_fn():
x = (train_data.iloc[:,0:512]).to_numpy()
y = (train_data.iloc[:,512:516]).to_numpy()
attn = np.asarray(np.tile(attn_mask,x.shape[0]).reshape(-1,512))
def generator():
for s1, s2, l in zip(x, attn, y):
yield {"input_id": s1, "attn_mask": s2}, l
dataset = tf.data.Dataset.from_generator(generator, output_types=({"input_id": tf.int32, "attn_mask": tf.int32}, tf.int32))
#dataset = dataset.batch(2)
#dataset = dataset.shuffle
return dataset
train_data is the dataframe containing the training data(16000 x 516). Last four columns are one hot encoded labels. Since i am not using the Autotokenizer function, i am passing the attention mask as attn_mask.
my model
bert = 'bert-base-uncased'
config = BertConfig(dropout=0.2, attention_dropout=0.2)
config.output_hidden_states = False
transformer_model = TFBertModel.from_pretrained(bert, config = config)
input_ids_in = tf.keras.layers.Input(shape=(512), name='input_id', dtype='int32')
input_masks_in = tf.keras.layers.Input(shape=(512), name='attn_mask', dtype='int32')
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
#cls_token = embedding_layer[:,0,:]
#X = tf.keras.layers.BatchNormalization()(cls_token)
X = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(50, return_sequences=True, dropout=0.1, recurrent_dropout=0.1))(embedding_layer)
X = tf.keras.layers.GlobalMaxPool1D()(X)
#X = tf.keras.layers.BatchNormalization()(X)
X = tf.keras.layers.Dense(50, activation='relu')(X)
X = tf.keras.layers.Dropout(0.2)(X)
X = tf.keras.layers.Dense(4, activation='softmax')(X)
model = tf.keras.Model(inputs=[input_ids_in, input_masks_in], outputs = X)
for layer in model.layers[:3]:
layer.trainable = False
optimizer = tf.keras.optimizers.Adam(0.001, beta_1=0.9, beta_2=0.98,
epsilon=1e-9)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
epochs = 1
batch_size =2
history = model.fit(_input_fn(), epochs= epochs, batch_size= batch_size, verbose=2)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tmp/ipykernel_16908/300834086.py in <module>
2 batch_size =2
3 #history = model.fit(trainDataGenerator(batch_size), epochs= epochs, validation_data=valDataGenerator(batch_size), verbose=2) #
----> 4 history = model.fit(_input_fn(), epochs= epochs, batch_size= batch_size, verbose=2) #validation_data=val_ds,
~/.local/lib/python3.8/site-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
65 except Exception as e: # pylint: disable=broad-except
66 filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67 raise e.with_traceback(filtered_tb) from None
68 finally:
69 del filtered_tb
~/.local/lib/python3.8/site-packages/transformers/models/bert/modeling_tf_bert.py in call(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict, training, **kwargs)
1124 kwargs_call=kwargs,
1125 )
-> 1126 outputs = self.bert(
1127 input_ids=inputs["input_ids"],
1128 attention_mask=inputs["attention_mask"],
~/.local/lib/python3.8/site-packages/transformers/models/bert/modeling_tf_bert.py in call(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict, training, **kwargs)
771 raise ValueError("You have to specify either input_ids or inputs_embeds")
772
--> 773 batch_size, seq_length = input_shape
774
775 if inputs["past_key_values"] is None:
ValueError: Exception encountered when calling layer "bert" (type TFBertMainLayer).
not enough values to unpack (expected 2, got 1)
Call arguments received:
• input_ids=tf.Tensor(shape=(512,), dtype=int32)
• attention_mask=tf.Tensor(shape=(512,), dtype=int32)
• token_type_ids=None
• position_ids=None
• head_mask=None
• inputs_embeds=None
• encoder_hidden_states=None
• encoder_attention_mask=None
• past_key_values=None
• use_cache=True
• output_attentions=False
• output_hidden_states=False
• return_dict=True
• training=True
• kwargs=<class 'inspect._empty'>
Edit:
Adding the output of calling _input_fn()
<FlatMapDataset shapes: ({input_id: <unknown>, attn_mask: <unknown>}, <unknown>), types: ({input_id: tf.int32, attn_mask: tf.int32}, tf.int32)>
I solved this error by batching my tf.data.Dataset. This gave the TensorSpec in my dataset a shape that had two values to unpack ->
TensorSpec(shape=(16, 200)...
This is what the error refers to.
Solution
print(train_ds) #Before Batching
new_train_ds = train_ds.batch(16, drop_remainder=True)
print(new_train_ds) #After Batching
# Before Batching
<MapDataset element_spec=({'input_ids': TensorSpec(shape=(200,),
dtype=tf.float64, name=None), 'attention_mask': TensorSpec(shape=
(200,), dtype=tf.float64, name=None)}, TensorSpec(shape=(11,),
dtype=tf.float64, name=None))>
# After Batching
<BatchDataset element_spec=({'input_ids': TensorSpec(shape=(16, 200),
dtype=tf.float64, name=None), 'attention_mask': TensorSpec(shape=(16,
200), dtype=tf.float64, name=None)}, TensorSpec(shape=(16, 11),
dtype=tf.float64, name=None))>

How to read next batch in tensorflow automatically?

I have 70 training sample, 10 testing samples, with every sample contains 11*99 elements. I want to use LSTM to classify the testing samples, here is the code:
import tensorflow as tf
import scipy.io as sc
# data read
feature_training = sc.loadmat("feature_training_reshaped.mat")
feature_training_reshaped = feature_training['feature_training_reshaped']
print (feature_training_reshaped.shape)
feature_testing = sc.loadmat("feature_testing_reshaped.mat")
feature_testing_reshaped = feature_testing['feature_testing_reshaped']
print (feature_testing_reshaped.shape)
label_training = sc.loadmat("label_training.mat")
label_training = label_training['aa']
print (label_training.shape)
label_testing = sc.loadmat("label_testing.mat")
label_testing = label_testing['label_testing']
print (label_testing.shape)
a=feature_training_reshaped.reshape([70, 11, 99])
b=feature_testing_reshaped.reshape([10, 11, 99])
print (a.shape)
# hyperparameters
lr = 0.001
training_iters = 1000
batch_size = 70
n_inputs = 99 # MNIST data input (img shape: 11*99)
n_steps = 11 # time steps
n_hidden_units = 128 # neurons in hidden layer
n_classes = 2 # MNIST classes (0-9 digits)
# tf Graph input
x = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
y = tf.placeholder(tf.float32, [None, n_classes])
# Define weights
weights = {
# (28, 128)
'in': tf.Variable(tf.random_normal([n_inputs, n_hidden_units])),
# (128, 10)
'out': tf.Variable(tf.random_normal([n_hidden_units, n_classes]))
}
biases = {
# (128, )
'in': tf.Variable(tf.constant(0.1, shape=[n_hidden_units, ])),
# (10, )
'out': tf.Variable(tf.constant(0.1, shape=[n_classes, ]))
}
def RNN(X, weights, biases):
# hidden layer for input to cell
########################################
# all the data in this batch flow into this layer in one time
# transpose the inputs shape from 70batch, 11steps,99inputs
# X ==> (70 batch * 11 steps, 99 inputs)
X = tf.reshape(X, [-1, n_inputs])
# into hidden
# X_in = (70 batch * 11 steps, 99 inputs)
X_in = tf.matmul(X, weights['in']) + biases['in']
# another shape transpose X_in ==> (70 batch, 11 steps, 128 hidden),
X_in = tf.reshape(X_in, [-1, n_steps, n_hidden_units])
# cell
##########################################
# basic LSTM Cell.
lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden_units, forget_bias=1.0, state_is_tuple=True)
# lstm cell is divided into two parts (c_state, h_state)
##### TAKE Care, batch_size should be 10 when the testing dataset only has 10 data
_init_state = lstm_cell.zero_state(batch_size, dtype=tf.float32)
print ("_init_state:", _init_state)
# You have 2 options for following step.
# 1: tf.nn.rnn(cell, inputs);
# 2: tf.nn.dynamic_rnn(cell, inputs).
# If use option 1, you have to modified the shape of X_in, go and check out this:
# https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/recurrent_network.py
# In here, we go for option 2.
# dynamic_rnn receive Tensor (batch, steps, inputs) or (steps, batch, inputs) as X_in.
# Make sure the time_major is changed accordingly.
outputs, final_state = tf.nn.dynamic_rnn(lstm_cell, X_in, initial_state=_init_state, time_major=False)
# outputs size would be a tensor [70,11,128]; size of X_in is (70 batch, 11 steps, 128 hidden)
# final_state size would be [batch_size, outputs],which is [70,128]
print (outputs)
print (final_state)
# hidden layer for output as the final results
#############################################
results = tf.matmul(final_state[1], weights['out']) + biases['out']
# # or
# unpack to list [(batch, outputs)..] * steps
# outputs = tf.unpack(tf.transpose(outputs, [1, 0, 2])) # states is the last outputs
# results = tf.matmul(outputs[-1], weights['out']) + biases['out']
return results
pred = RNN(x, weights, biases)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))
train_op = tf.train.AdamOptimizer(lr).minimize(cost)
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
init = tf.initialize_all_variables()
with tf.Session() as sess:
sess.run(init)
step = 0
while step * batch_size < training_iters:
# batch_xs, batch_ys = fea.next_batch(batch_size)
# batch_xs = batch_xs.reshape([batch_size, n_steps, n_inputs])
sess.run([train_op], feed_dict={
x: a,
y: label_training,
})
if step % 10 == 0:
print(sess.run(accuracy, feed_dict={
x: b,
y: label_testing,
}))
step += 1
At last, I got the result & error:
(770, 99)
(110, 99)
(70, 2)
(10, 2)
(70, 11, 99)
('_init_state:', LSTMStateTuple(c=<tf.Tensor 'zeros:0' shape=(70, 128) dtype=float32>, h=<tf.Tensor 'zeros_1:0' shape=(70, 128) dtype=float32>))
Tensor("RNN/transpose:0", shape=(70, 11, 128), dtype=float32)
LSTMStateTuple(c=<tf.Tensor 'RNN/while/Exit_2:0' shape=(70, 128) dtype=float32>, h=<tf.Tensor 'RNN/while/Exit_3:0' shape=(70, 128) dtype=float32>)
Traceback (most recent call last):
File "/home/xiangzhang/RNN.py", line 150, in <module>
y: label_testing,
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 717, in run
run_metadata_ptr)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 915, in _run
feed_dict_string, options, run_metadata)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 965, in _do_run
target_list, options, run_metadata)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 985, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [10,128] vs. shape[1] = [70,128]
[[Node: RNN/while/BasicLSTMCell/Linear/concat = Concat[N=2, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](RNN/while/BasicLSTMCell/Linear/concat/concat_dim, RNN/while/TensorArrayRead, RNN/while/Identity_3)]]
Caused by op u'RNN/while/BasicLSTMCell/Linear/concat', defined at:
File "/home/xiangzhang/RNN.py", line 128, in <module>
pred = RNN(x, weights, biases)
File "/home/xiangzhang/RNN.py", line 110, in RNN
outputs, final_state = tf.nn.dynamic_rnn(lstm_cell, X_in, initial_state=_init_state, time_major=False)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 836, in dynamic_rnn
dtype=dtype)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 1003, in _dynamic_rnn_loop
swap_memory=swap_memory)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2518, in while_loop
result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2356, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2306, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 988, in _time_step
(output, new_state) = call_cell()
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 974, in <lambda>
call_cell = lambda: cell(input_t, state)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell.py", line 310, in __call__
concat = _linear([inputs, h], 4 * self._num_units, True)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell.py", line 907, in _linear
res = math_ops.matmul(array_ops.concat(1, args), matrix)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 872, in concat
name=name)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 436, in _concat
values=values, name=name)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
op_def=op_def)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2380, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/xiangzhang/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1298, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [10,128] vs. shape[1] = [70,128]
[[Node: RNN/while/BasicLSTMCell/Linear/concat = Concat[N=2, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](RNN/while/BasicLSTMCell/Linear/concat/concat_dim, RNN/while/TensorArrayRead, RNN/while/Identity_3)]]
Process finished with exit code 1
I thought the reason maybe is the testing dataset is only 10, less than batch_size=70, so that when I run the testing dataset, the code _init_state = lstm_cell.zero_state(batch_size, dtype=tf.float32) would has the unmatch error.
There are two ways to solve it but I don't know how to implement any of it neither:
change the batch_size value, set it as 70 when training, 10 when testing. But, I don't know how to code it, please tell me how to do?
Or, I can set the batch_size=10 , and automatically read the training dataset ten by ten. Also, I don't know how to read next batch in tensorflow automatically, and the command next_batch in MNIST dataset can not work.
The second solution is particular important, please kindly to help me, thanks very much.