Why do these error messages show up in Compilation - tensorflow

I am trying to compile the Keras model to train and test the dataset. But in the compilation process, the following error messages show up. Could anyone help me how to solve this? I have been checking other pages and followed their suggestions, but none of them really helps me to solve that.
model = tf.keras.Sequential([
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(128, activation="relu"), # Rectified Linear Unit.
tf.keras.layers.Dense(10, activation="softmax")
model.compile(optimizer="adam", loss="sparse_categorial_crossentropy", metrics=["accuracy"])
The lines below appear when I try to compile and run.
Traceback (most recent call last):
File "/home/eaindra/PycharmProjects/NeuralNetwork/Tensorflow1.py", line 42, in <module>
model.compile(optimizer="adam", loss="sparse_categorial_crossentropy", metrics=["accuracy"])
File "/home/eaindra/anaconda3/envs/tensor/lib/python3.6/site-packages/tensorflow_core/python/training/tracking/base.py", line 457, in _method_wrapper
result = method(self, *args, **kwargs)
File "/home/eaindra/anaconda3/envs/tensor/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 336, in compile
self.loss, self.output_names)
File "/home/eaindra/anaconda3/envs/tensor/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_utils.py", line 1351, in prepare_loss_functions
loss_functions = [get_loss_function(loss) for _ in output_names]
File "/home/eaindra/anaconda3/envs/tensor/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_utils.py", line 1351, in <listcomp>
loss_functions = [get_loss_function(loss) for _ in output_names]
File "/home/eaindra/anaconda3/envs/tensor/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_utils.py", line 1087, in get_loss_function
loss_fn = losses.get(loss)
File "/home/eaindra/anaconda3/envs/tensor/lib/python3.6/site-packages/tensorflow_core/python/keras/losses.py", line 1183, in get
return deserialize(identifier)
File "/home/eaindra/anaconda3/envs/tensor/lib/python3.6/site-packages/tensorflow_core/python/keras/losses.py", line 1174, in deserialize
printable_module_name='loss function')
File "/home/eaindra/anaconda3/envs/tensor/lib/python3.6/site-packages/tensorflow_core/python/keras/utils/generic_utils.py", line 210, in deserialize_keras_object
raise ValueError('Unknown ' + printable_module_name + ':' + object_name)
ValueError: Unknown loss function:sparse_categorial_crossentropy

It seems to be a typo on the loss function. You wrote categorial instead of categorical and missed the closing square brackets in the model definition.
Fixed code segment attached below;
model = tf.keras.Sequential([
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(128, activation="relu"), # Rectified Linear Unit.
tf.keras.layers.Dense(10, activation="softmax")])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

Related

ML error says: "ValueError: Input 0 of layer "sequential_6" is incompatible with the layer: expected shape=(None, 42), found shape=(None, 41)"

I needed to increase the accuracy of a model. So I tried using TabNet.
I'm attaching the train & test data in a google drive link
Link: https://drive.google.com/drive/folders/1ronho26m9uX9_ooBTh0M81ox1a43ika8?usp=sharing
Here is my code.
import tensorflow as tf
import pandas as pd
# Load the train and test data into pandas dataframes
train_df = pd.read_csv("train.csv")
#train_df1 = pd.read_csv("train.csv")
test_df = pd.read_csv("test.csv")
# Split the target variable and the features
train_labels = train_df[[f'F_{i}' for i in range(40)]]
#train_labels=trai
test_labels = train_df.target
# Convert the dataframes to tensors
train_dataset = tf.data.Dataset.from_tensor_slices((train_df.values, train_labels.values))
test_dataset = tf.data.Dataset.from_tensor_slices((test_df.values, test_labels.values))
# Define the model using the TabNet architecture
model = tf.keras.models.Sequential([
tf.keras.layers.Input(shape=(train_df.shape[1],)),
tf.keras.layers.Dense(32, activation="relu"),
tf.keras.layers.Dense(64, activation="relu"),
tf.keras.layers.Dense(128, activation="relu"),
tf.keras.layers.Dense(1)
])
# Compile the model with a mean squared error loss function and the Adam optimizer
model.compile(loss="mean_squared_error", optimizer="adam")
# Train the model on the training data
model.fit(train_dataset.batch(32), epochs=5)
# Make predictions on the test data
predictions = model.predict(test_dataset.batch(32))
#predictions = model.predict(test_dataset)
# Evaluate the model on the test data
mse = tf.keras.losses.mean_squared_error(test_labels, predictions)
print("Mean Squared Error:", mse.numpy().mean())
I don't know what's wrong with it as I'm just a beginner.
Here is the error code:
ValueError Traceback (most recent call last)
<ipython-input-40-87712e1604a9> in <module>
24
25 # Make predictions on the test data
---> 26 predictions = model.predict(test_dataset.batch(32))
27 #predictions = model.predict(test_dataset)
28
1 frames
/usr/local/lib/python3.8/dist-packages/keras/engine/training.py in tf__predict_function(iterator)
13 try:
14 do_return = True
---> 15 retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
16 except:
17 do_return = False
ValueError: in user code:
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1845, in predict_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1834, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1823, in run_step **
outputs = model.predict_step(data)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1791, in predict_step
return self(x, training=False)
File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/usr/local/lib/python3.8/dist-packages/keras/engine/input_spec.py", line 264, in assert_input_compatibility
raise ValueError(f'Input {input_index} of layer "{layer_name}" is '
ValueError: Input 0 of layer "sequential_6" is incompatible with the layer: expected shape=(None, 42), found shape=(None, 41)
I didn't really know what to do. So I'm expecting help from you guys. Would be grateful to whatever tips you guys give.

How to deal with incompatible shapes while trying to fit a model in Tensorflow Keras

So, I have done some modifications with the VGG16 neural network to make a linear-regression and classification model.
And at last I am writing the following code to compile and fit my data into the model to initiate training->
class Facetracker(Model):
# The initialization function-
def __init__(self,eyetracker,**kwargs):
super().__init__(**kwargs)
self.model = eyetracker #Instantiating the model
def compile(self,opt,classlosss,localization_loss,**kwargs):
super().compile(**kwargs)
self.classloss = class_loss
self.localization_loss = regress_loss
self.opt = optimizer
# Defining the training step
def train_step(self,batch,**kwargs):
X,y = batch #unpacking our data
with tf.GradientTape() as tape:
classes,coords = self.model(X,training=True)
batch_classloss = self.classloss(y[0],classes)
batch_localloss = self.localization_loss(tf.cast(y[1],tf.float32),coords)
# calculating total loss-
total_loss = batch_localloss+0.5*batch_classloss
grad = tape.gradient(total_loss,self.model.trainable_variables)
optimizer.apply_gradients(zip(grad,self.model.trainable_variables))
return{
"total_loss":total_loss,
"class_loss":batch_classloss,
"localilzation_loss":batch_localloss
}
def test_step(self,batch):
X,y = batch
classes,coords = self.model(X,training=False)
batch_classloss = self.classloss(y[0],classes)
batch_localloss = self.localization_loss(tf.cast(y[1],tf.float32),coords)
total_loss = batch_localloss+0.5*batch_classloss
return{
"total_loss": total_loss,
"class_loss": batch_classloss,
"localilzation_loss": batch_localloss
}
# def call(self, X, **kwargs):
# return self.model(X,**kwargs)
# Replacing the call function with a lambda function
lambda self,X,**kwargs: self.model(X,**kwargs)
# Subclassing our model-
print("Subclassing.....")
model = Facetracker(facetracker)
print("Compiling......")
model.compile(optimizer,classlosss=class_loss,localization_loss=localization_loss)
# Preparing the log directory
logdir="logdir"
tensorboard_callbacks = tf.keras.callbacks.TensorBoard(log_dir=logdir)
print("Fitting the model")
hist = model.fit(train.take(80),
epochs=16,
initial_epoch =8,
validation_data=val,
validation_steps =8,
validation_freq=2,
callbacks = [[tensorboard_callbacks]])
This includes the class prepared for training the model and the last few lines are subclassing the model and fitting the prepared data into the model.
The error I now get seems pretty hefty to me and it goes like this->
File "C:\Users\Radhe Krishna\OneDrive\Documents\MarkATT\main.py", line 535, in <module>
hist = model.fit(train.take(80),
File "C:\Users\Radhe Krishna\OneDrive\Documents\MarkATT\MarkATT\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:\Users\Radhe Krishna\OneDrive\Documents\MarkATT\MarkATT\lib\site-packages\tensorflow\python\eager\execute.py", line 54, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:
Detected at node 'gradient_tape/sub_2/BroadcastGradientArgs' defined at (most recent call last):
File "C:\Users\Radhe Krishna\OneDrive\Documents\MarkATT\main.py", line 535, in <module>
hist = model.fit(train.take(80),
File "C:\Users\Radhe Krishna\OneDrive\Documents\MarkATT\MarkATT\lib\site-packages\keras\utils\traceback_utils.py", line 64, in error_handler
return fn(*args, **kwargs)
File "C:\Users\Radhe Krishna\OneDrive\Documents\MarkATT\MarkATT\lib\site-packages\keras\engine\training.py", line 1409, in fit
tmp_logs = self.train_function(iterator)
File "C:\Users\Radhe Krishna\OneDrive\Documents\MarkATT\MarkATT\lib\site-packages\keras\engine\training.py", line 1051, in train_function
return step_function(self, iterator)
File "C:\Users\Radhe Krishna\OneDrive\Documents\MarkATT\MarkATT\lib\site-packages\keras\engine\training.py", line 1040, in step_function
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "C:\Users\Radhe Krishna\OneDrive\Documents\MarkATT\MarkATT\lib\site-packages\keras\engine\training.py", line 1030, in run_step
outputs = model.train_step(data)
File "C:\Users\Radhe Krishna\OneDrive\Documents\MarkATT\main.py", line 498, in train_step
grad = tape.gradient(total_loss,self.model.trainable_variables)
Node: 'gradient_tape/sub_2/BroadcastGradientArgs'
Incompatible shapes: [2,4] vs. [8]
[[{{node gradient_tape/sub_2/BroadcastGradientArgs}}]] [Op:__inference_train_function_22026]
It would be a great help if someone could examine this problem and help me get out of it. I have been quite struggling with it now.
Thanks in advance!!!
I has planned to initiate training with the code provided, but I firstly received an error for the call function and as I resolved it by converting it into lambda function- This issue of incompatible shapes came up. I tries to adjust the integers in the input data and batch related fields but nothing worked.

Incompatible shapes while using triplet loss and pre-trained resnet

I am trying to use pre-trained resnet and fine-tune it using triplet loss. The following code I came up with is a combination of tutorials I found on the topic:
import pathlib
import tensorflow as tf
import tensorflow_addons as tfa
with tf.device('/cpu:0'):
INPUT_SHAPE = (32, 32, 3)
BATCH_SIZE = 16
data_dir = pathlib.Path('/home/user/dataset/')
base_model = tf.keras.applications.ResNet50V2(
weights='imagenet',
pooling='avg',
include_top=False,
input_shape=INPUT_SHAPE,
)
# following two lines are added after edit, originally it was model = base_model
head_model = tf.keras.layers.Lambda(lambda x: tf.math.l2_normalize(x, axis=1))(base_model.output)
model = tf.keras.Model(inputs=base_model.input, outputs=head_model)
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
rotation_range=10,
zoom_range=0.1,
)
generator = datagen.flow_from_directory(
data_dir,
target_size=INPUT_SHAPE[:2],
batch_size=BATCH_SIZE,
seed=42,
)
model.compile(
optimizer=tf.keras.optimizers.Adam(0.001),
loss=tfa.losses.TripletSemiHardLoss(),
)
model.fit(
generator,
epochs=5,
)
Unfortunately after running the code I get the following error:
Found 4857 images belonging to 83 classes.
Epoch 1/5
Traceback (most recent call last):
File "ReID/external_process.py", line 35, in <module>
model.fit(
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 108, in _method_wrapper
return method(self, *args, **kwargs)
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 1098, in fit
tmp_logs = train_function(iterator)
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
result = self._call(*args, **kwds)
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 840, in _call
return self._stateless_fn(*args, **kwds)
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2829, in __call__
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1843, in _filtered_call
return self._call_flat(
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1923, in _call_flat
return self._build_call_outputs(self._inference_function.call(
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 545, in call
outputs = execute.execute(
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 1328 values, but the requested shape has 16
[[{{node TripletSemiHardLoss/PartitionedCall/Reshape}}]] [Op:__inference_train_function_13749]
Function call stack:
train_function
2020-10-23 22:07:09.094736: W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated.
[[{{node PyFunc}}]]
The dataset directory has 83 subdirectories, one per class and each of this subdirectories contains images of given class. The dimension 1328 in the error output is the batch size (16) times number of classes (83), and the dimension 16 is the batch size (both dimensions change accordingly if I change the BATCH_SIZE.
To be honest I do not really understand the error, so any solution or even any kind of indight where is the problem is deeply appreciated.
The problem is that the TripletSemiHardLoss expects
labels y_true to be provided as 1-D integer Tensor with shape [batch_size] of multi-class integer labels
but the flow_from_directory by default generate categorical labels; using class_mode="sparse" should fix the problem.

L2-normalization with Keras Backend?

I'd like to normalize the inputs going into my neural network but, as I'm defining my model in this way:
df = pd.read_csv(r'C:\Users\Davide Mori\PycharmProjects\pythonProject\Dataset.csv')
print(df)
target_column = ['W_mag', 'W_phase']
predictors = list(set(list(df.columns)) - set(target_column))
X = df[predictors].values
Y = df[target_column].values
def get_model(n_inputs, n_outputs):
model = Sequential()
model.add(Dense(1000,input_dim= n_inputs, activation='relu'))
#model.add(Lambda(lambda x: K.l2_normalize(x, axis=1)))
model.add(Dense(1000, activation='linear', activity_regularizer=regularizers.l1(0.0001)))
model.add(Activation('relu'))
model.add(Dense(n_outputs, activation='linear'))
model.compile(optimizer="adam", loss="mean_squared_error", metrics=["mean_squared_error"])
model.summary()
return model
n_inputs, n_outputs = X.shape[1], Y.shape[1]
model = get_model(n_inputs, n_outputs)
# fit the model on all data
model.fit(X, Y, epochs=100, batch_size=1)
how do I apply the lambda layer to my inputs? Isn't wrong the commented line position? Because If I put the lambda layer there I'm normalizing what is already be "transformed" by the first hidden layer,right? How can I solve this problem?
This is the error I have when putting the lambda layer before everything else :
2020-10-12 15:08:46.036872: I
tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports
instructions that this TensorFlow binary was not compiled to use: AVX AVX2
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "C:\Program Files\JetBrains\PyCharm
2020.2.2\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197,
in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the
script
File "C:\Program Files\JetBrains\PyCharm
2020.2.2\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line
18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/Users/Davide Mori/PycharmProjects/pythonProject/prova_rete_sfs.py",
line 60, in <module>
model = get_model(n_inputs, n_outputs)
File "C:/Users/Davide Mori/PycharmProjects/pythonProject/prova_rete_sfs.py",
line 52, in get_model
model.summary()
File "C:\Users\Davide Mori\Anaconda3\envs\pythonProject\lib\site-
packages\tensorflow_core\python\keras\engine\network.py", line 1302, in
summary
raise ValueError('This model has not yet been built. '
ValueError: This model has not yet been built. Build the model first by
calling `build()` or calling `fit()` with some data, or specify an
`input_shape` argument in the first layer(s) for automatic build.

Resource exhausted: OOM when allocating tensor with shape[845246,300]

I am working with a sequence to sequence language model, and after changing the code to pass custom word embedding weights to the Embeddings layer, I am receiving a OOM error when I try to train on the gpu.
Here is the relevant code:
def create_model(word_map, X_train, Y_train, vocab_size, max_length):
# define model
model = Sequential()
# get custom embedding weights as matrix
embedding_matrix = get_weights_matrix_from_word_map(word_map)
model.add(Embedding(len(word_map)+1, 300, weights=[embedding_matrix], input_length=max_length-1))
model.add(LSTM(50))
model.add(Dense(vocab_size, activation='softmax'))
print(model.summary())
# compile network
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, Y_train, epochs=100, verbose=2)
return model
And here is the full error log from the server:
File "/home2/slp24/thesis/UpdatedLanguageModel_7_31.py", line 335, in create_model_2
model.fit(X_train, Y_train, batch_size=32, epochs=1, verbose=2) ## prev X, y
File "/opt/python-3.4.1/lib/python3.4/site-packages/keras/models.py", line 963, in fit
validation_steps=validation_steps)
File "/opt/python-3.4.1/lib/python3.4/site-packages/keras/engine/training.py", line 1682, in fit
self._make_train_function()
File "/opt/python-3.4.1/lib/python3.4/site-packages/keras/engine/training.py", line 990, in _make_train_function
loss=self.total_loss)
File "/opt/python-3.4.1/lib/python3.4/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/opt/python-3.4.1/lib/python3.4/site-packages/keras/optimizers.py", line 466, in get_updates
m_t = (self.beta_1 * m) + (1. - self.beta_1) * g
File "/opt/python-3.4.1/lib/python3.4/site-packages/tensorflow/python/ops/math_ops.py", line 898, in binary_op_wrapper
y = ops.convert_to_tensor(y, dtype=x.dtype.base_dtype, name="y")
File "/opt/python-3.4.1/lib/python3.4/site-packages/tensorflow/python/framework/ops.py", line 932, in convert_to_tensor
as_ref=False)
File "/opt/python-3.4.1/lib/python3.4/site-packages/tensorflow/python/framework/ops.py", line 1022, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/opt/python-3.4.1/lib/python3.4/site-packages/tensorflow/python/ops/gradients_impl.py", line 100, in _IndexedSlicesToTensor
value.values, value.indices, value.dense_shape[0], name=name)
File "/opt/python-3.4.1/lib/python3.4/site-packages/tensorflow/python/ops/gen_math_ops.py", line 5186, in unsorted_segment_sum
num_segments=num_segments, name=name)
File "/opt/python-3.4.1/lib/python3.4/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/opt/python-3.4.1/lib/python3.4/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/opt/python-3.4.1/lib/python3.4/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[845246,300] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: training/Adam/mul_2/y = UnsortedSegmentSum[T=DT_FLOAT, Tindices=DT_INT32, Tnumsegments=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](training/Adam/gradients/embedding_1/Gather_grad/Reshape, training/Adam/gradients/embedding_1/Gather_grad/Reshape_1/_101, training/Adam/mul_2/strided_slice)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
Edit:
So far I have tried
Adding batching, started with batch_size=32
I am currently working to decrease the number of output classes from 845,286. I think something went wrong when I calculated the custom embedding matrix, specifically when I was "connecting" the vocabulary token index's assigned during preprocessing and the y_categorical values assigned by Keras that the model uses...
Any help or guidance is greatly appreciated! I have searched many similar issued but have not been able to apply those fixes to my code thus far. Thank you
You're exceeding the memory size of your GPU.
You can:
Train/Predict with smaller batches
Or, if even a batch_size=1 is too much, you need a model with less parameters.
Hint, the length in that tensor (845246) is really really big. Is that the correct length?
I had the same problem with Google Colab GPU
The batch size was 64 and this error has appeared and after I reduced the batch size to 32 it worked properly