xgboost for categorical data gives an error when evaluating

xgboost for categorical data gives an error when evaluating - xgboost

I'm trying to run xgboost classifier on a categorical dataset.
The only way i succeeded to create a model is with this code:
import xgboost as xgb
df = pd.DataFrame(data={'1':['MM', 'GC', 'AU', 'GU', 'BB'],
'2': ['MM', 'GC', 'BB', 'GU', 'AU'], '3': ['MM', 'GC', 'AU', 'GC', 'GC'],
'4': ['MM', 'MM', 'AU', 'GC', 'AU'],'5': ['MM', 'GC', 'BB', 'GC', 'MM'],
'class': [0,0,1,1,0]})
X2, y2 = df[['1','2','3','4','5']],df['class']
X2 = X2.astype('category')
Xy = xgb.DMatrix(X2, y2, enable_categorical=True)
booster = xgb.train({"tree_method": "hist", "max_cat_to_onehot": 5}, Xy)
booster.save_model("categorical-model.json")
SHAP = booster.predict(Xy, pred_interactions=True)
scores2 = cross_val_score(booster, X2, y2, scoring='roc_auc', cv=cv, n_jobs=-1)
print('Mean ROC AUC: %.5f' % mean(scores2))
The problem that it gives me this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-200-76bcf397e943> in <module>
----> 1 scores2 = cross_val_score(booster, X2, y2, scoring='roc_auc', cv=cv, n_jobs=-1)
2 # summarize performance
3 print('Mean ROC AUC: %.5f' % mean(scores2))
~\anaconda3\lib\site-packages\sklearn\model_selection\_validation.py in cross_val_score(estimator, X, y, groups, scoring, cv, n_jobs, verbose, fit_params, pre_dispatch, error_score)
506 """
507 # To ensure multimetric format is not supported
--> 508 scorer = check_scoring(estimator, scoring=scoring)
509
510 cv_results = cross_validate(
~\anaconda3\lib\site-packages\sklearn\metrics\_scorer.py in check_scoring(estimator, scoring, allow_none)
446 """
447 if not hasattr(estimator, "fit"):
--> 448 raise TypeError(
449 "estimator should be an estimator implementing 'fit' method, %r was passed"
450 % estimator
TypeError: estimator should be an estimator implementing 'fit' method, <xgboost.core.Booster object at 0x0000022E071161C0> was passed
And I don't understand why.
Could you also help me with the code to calculate the accuracy of the model?
Thank you!

Related

'key of type tuple not found and not a MultiIndex' while generating ROC for multi-class classification

I am trying to generate a ROC curve using XGBoost through a multi-class classification but facing this 'key of type tuple not found and not a MultiIndex' everytime.
Classification:
from xgboost import XGBClassifier
from xgboost import plot_tree
from sklearn import metrics
from sklearn.metrics import roc_curve, auc
from itertools import cycle
from sklearn.metrics import roc_auc_score
model = XGBClassifier()
model = model.fit(x_train, y_train)
print('Accuracy:', model.score(x_test,y_test))
score=cross_val_score(model,X,y,cv=5)
print(score)
print('CV Score:',np.mean(score))
y_pred1=model.predict(x_test)
Generating ROC:
n_classes = 5
fpr = dict()
tpr = dict()
roc_auc = dict()
lw=2
for i in range(n_classes):
fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_pred1[:, i])
roc_auc[i] = auc(fpr[i], tpr[i])
colors = cycle(['blue', 'red', 'green', 'yellow', 'pink'])
for i, color in zip(range(n_classes), colors):
plt.plot(fpr[i], tpr[i], color=color, lw=2,
label='ROC curve of class {0} (area = {1:0.2f})'
''.format(i, roc_auc[i]))
plt.plot([0, 1], [0, 1], 'k--', lw=lw)
plt.xlim([-0.05, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic for multi-class data')
plt.legend(loc="lower right")
plt.show()
Out:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-34-14f08a1b6222> in <module>
5 lw=2
6 for i in range(n_classes):
----> 7 fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_pred1[:, i])
8 roc_auc[i] = auc(fpr[i], tpr[i])
9 colors = cycle(['blue', 'red', 'green', 'yellow', 'pink'])
2 frames
/usr/local/lib/python3.8/dist-packages/pandas/core/series.py in _get_values_tuple(self, key)
1014
1015 if not isinstance(self.index, MultiIndex):
-> 1016 raise KeyError("key of type tuple not found and not a MultiIndex")
1017
1018 # If key is contained, would have returned by now
KeyError: 'key of type tuple not found and not a MultiIndex'
Q: Why is it returning a multi-index error even after I have 5 classes in my dataframe?

tf.data.Dataset.from_generator : value error "not enough values to unpack (expected 2, got 1)"

I am currently trying to do a classification problem using a pre-trained transformer model. I wrote a custom generator using tf.data.Dataset.from_generator method. The model takes two inputs : input_id and attn_mask. While calling model.fit i am getting value error "not enough values to unpack (expected 2, got 1)" The received arguments list shows it got both input_id and attn_mask. Can anyone help me solve this?
import tensorflow.keras as keras
from tensorflow.keras.models import Model
from transformers import TFBertModel,BertConfig
def _input_fn():
x = (train_data.iloc[:,0:512]).to_numpy()
y = (train_data.iloc[:,512:516]).to_numpy()
attn = np.asarray(np.tile(attn_mask,x.shape[0]).reshape(-1,512))
def generator():
for s1, s2, l in zip(x, attn, y):
yield {"input_id": s1, "attn_mask": s2}, l
dataset = tf.data.Dataset.from_generator(generator, output_types=({"input_id": tf.int32, "attn_mask": tf.int32}, tf.int32))
#dataset = dataset.batch(2)
#dataset = dataset.shuffle
return dataset
train_data is the dataframe containing the training data(16000 x 516). Last four columns are one hot encoded labels. Since i am not using the Autotokenizer function, i am passing the attention mask as attn_mask.
my model
bert = 'bert-base-uncased'
config = BertConfig(dropout=0.2, attention_dropout=0.2)
config.output_hidden_states = False
transformer_model = TFBertModel.from_pretrained(bert, config = config)
input_ids_in = tf.keras.layers.Input(shape=(512), name='input_id', dtype='int32')
input_masks_in = tf.keras.layers.Input(shape=(512), name='attn_mask', dtype='int32')
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
#cls_token = embedding_layer[:,0,:]
#X = tf.keras.layers.BatchNormalization()(cls_token)
X = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(50, return_sequences=True, dropout=0.1, recurrent_dropout=0.1))(embedding_layer)
X = tf.keras.layers.GlobalMaxPool1D()(X)
#X = tf.keras.layers.BatchNormalization()(X)
X = tf.keras.layers.Dense(50, activation='relu')(X)
X = tf.keras.layers.Dropout(0.2)(X)
X = tf.keras.layers.Dense(4, activation='softmax')(X)
model = tf.keras.Model(inputs=[input_ids_in, input_masks_in], outputs = X)
for layer in model.layers[:3]:
layer.trainable = False
optimizer = tf.keras.optimizers.Adam(0.001, beta_1=0.9, beta_2=0.98,
epsilon=1e-9)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
epochs = 1
batch_size =2
history = model.fit(_input_fn(), epochs= epochs, batch_size= batch_size, verbose=2)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tmp/ipykernel_16908/300834086.py in <module>
2 batch_size =2
3 #history = model.fit(trainDataGenerator(batch_size), epochs= epochs, validation_data=valDataGenerator(batch_size), verbose=2) #
----> 4 history = model.fit(_input_fn(), epochs= epochs, batch_size= batch_size, verbose=2) #validation_data=val_ds,
~/.local/lib/python3.8/site-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
65 except Exception as e: # pylint: disable=broad-except
66 filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67 raise e.with_traceback(filtered_tb) from None
68 finally:
69 del filtered_tb
~/.local/lib/python3.8/site-packages/transformers/models/bert/modeling_tf_bert.py in call(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict, training, **kwargs)
1124 kwargs_call=kwargs,
1125 )
-> 1126 outputs = self.bert(
1127 input_ids=inputs["input_ids"],
1128 attention_mask=inputs["attention_mask"],
~/.local/lib/python3.8/site-packages/transformers/models/bert/modeling_tf_bert.py in call(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict, training, **kwargs)
771 raise ValueError("You have to specify either input_ids or inputs_embeds")
772
--> 773 batch_size, seq_length = input_shape
774
775 if inputs["past_key_values"] is None:
ValueError: Exception encountered when calling layer "bert" (type TFBertMainLayer).
not enough values to unpack (expected 2, got 1)
Call arguments received:
• input_ids=tf.Tensor(shape=(512,), dtype=int32)
• attention_mask=tf.Tensor(shape=(512,), dtype=int32)
• token_type_ids=None
• position_ids=None
• head_mask=None
• inputs_embeds=None
• encoder_hidden_states=None
• encoder_attention_mask=None
• past_key_values=None
• use_cache=True
• output_attentions=False
• output_hidden_states=False
• return_dict=True
• training=True
• kwargs=<class 'inspect._empty'>
Edit:
Adding the output of calling _input_fn()
<FlatMapDataset shapes: ({input_id: <unknown>, attn_mask: <unknown>}, <unknown>), types: ({input_id: tf.int32, attn_mask: tf.int32}, tf.int32)>

I solved this error by batching my tf.data.Dataset. This gave the TensorSpec in my dataset a shape that had two values to unpack ->
TensorSpec(shape=(16, 200)...
This is what the error refers to.
Solution
print(train_ds) #Before Batching
new_train_ds = train_ds.batch(16, drop_remainder=True)
print(new_train_ds) #After Batching
# Before Batching
<MapDataset element_spec=({'input_ids': TensorSpec(shape=(200,),
dtype=tf.float64, name=None), 'attention_mask': TensorSpec(shape=
(200,), dtype=tf.float64, name=None)}, TensorSpec(shape=(11,),
dtype=tf.float64, name=None))>
# After Batching
<BatchDataset element_spec=({'input_ids': TensorSpec(shape=(16, 200),
dtype=tf.float64, name=None), 'attention_mask': TensorSpec(shape=(16,
200), dtype=tf.float64, name=None)}, TensorSpec(shape=(16, 11),
dtype=tf.float64, name=None))>

How to fix "pop from empty list" error while using Keras tuner search method with TPU in google colab?

I previously was able to run the search method of keras tuner on my model with GPU runtime of Google colab. But when I switched to the TPU runtime, I get the following error. I haven't been able to come to the conclusion of how to access a google cloud storage for the TPU runtime to save the checkpoint folder that the keras tuner saves model checkpoints in. I also don't know how to do it and I'm getting the following error. Please help me resolve this issue.
My code:
def post_se(hp):
ip = Input(shape=(6, 128))
x = Masking()(ip)
x = LSTM(units=hp.Choice('lstm_1', values = [8,16,32,64,128,256,512]),return_sequences=True)(x)
x = Dropout(hp.Choice(name='Dropout', values = [0.0,0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]))(x)
x = LSTM(units=hp.Choice('lstm_2', values = [8,16,32,64,128,256,512]))(x)
x = Dropout(hp.Choice(name='Dropout_2', values = [0.0,0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]))(x)
y = Permute((2, 1))(ip)
y = Conv1D(hp.Choice('conv_1_filter', values = [32,64,128,256,512]), hp.Choice(name='conv_1_filter_size', values = [3,5,7,8,9]), padding='same', kernel_initializer='he_uniform')(y)
y = BatchNormalization()(y)
y = Activation('relu')(y)
y = squeeze_excite_block(y)
y = Conv1D(hp.Choice('conv_2_filter', values = [32,64,128,256,512]), hp.Choice(name='conv_2_filter_size',values = [3,5,7,8,9]), padding='same', kernel_initializer='he_uniform')(y)
y = BatchNormalization()(y)
y = Activation('relu')(y)
y = squeeze_excite_block(y)
y = Conv1D(hp.Choice('conv_3_filter', values = [32,64,128,256,512,]), hp.Choice(name='conv_3_filter_size',values = [3,5,7,8,9]), padding='same', kernel_initializer='he_uniform')(y)
y = BatchNormalization()(y)
y = Activation('relu')(y)
y = GlobalAveragePooling1D()(y)
x = concatenate([x,y])
# batch_size = hp.Choice('batch_size', values=[32, 64, 128, 256, 512, 1024, 2048, 4096])
out = Dense(num_classes, activation='softmax')(x)
model = Model(ip, out)
if gpu:
opt = keras.optimizers.Adam(learning_rate=0.001)
if tpu:
opt = keras.optimizers.Adam(learning_rate=8*0.001)
model.compile(optimizer=opt, loss='categorical_crossentropy',metrics=['accuracy'])
# model.summary()
return model
if gpu:
tuner = kt.tuners.BayesianOptimization(post_se,
objective='val_accuracy',
max_trials=30,
seed=42,
project_name='Model_gpu')
# Will stop training if the "val_loss" hasn't improved in 30 epochs.
tuner.search(X_train, train_label, epochs=200, validation_split=0.1, shuffle=True, callbacks=[tensorflow.keras.callbacks.EarlyStopping('val_loss', patience=30)])
if tpu:
print("TPU")
with strategy.scope():
tuner = kt.tuners.BayesianOptimization(post_se,
objective='val_accuracy',
max_trials=30,
seed=42,
project_name='Model_tpu')
# Will stop training if the "val_loss" hasn't improved in 30 epochs.
tuner.search(X_train, train_label, epochs=200, validation_split=0.1, shuffle=True, callbacks=[tensorflow.keras.callbacks.EarlyStopping('val_loss', patience=30)])
The error log
---------------------------------------------------------------------------
UnimplementedError Traceback (most recent call last)
/usr/lib/python3.7/contextlib.py in __exit__(self, type, value, traceback)
129 try:
--> 130 self.gen.throw(type, value, traceback)
131 except StopIteration as exc:
10 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in resource_creator_scope(resource_type, resource_creator)
2957 resource_creator):
-> 2958 yield
2959
<ipython-input-15-24c1e1bb603d> in <module>()
17 # Will stop training if the "val_loss" hasn't improved in 30 epochs.
---> 18 tuner.search(X_train, train_label, epochs=200, validation_split=0.1, shuffle=True, callbacks=[tensorflow.keras.callbacks.EarlyStopping('val_loss', patience=30)])
/usr/local/lib/python3.7/dist-packages/keras_tuner/engine/base_tuner.py in search(self, *fit_args, **fit_kwargs)
178 self.on_trial_begin(trial)
--> 179 results = self.run_trial(trial, *fit_args, **fit_kwargs)
180 # `results` is None indicates user updated oracle in `run_trial()`.
/usr/local/lib/python3.7/dist-packages/keras_tuner/engine/tuner.py in run_trial(self, trial, *args, **kwargs)
303 copied_kwargs["callbacks"] = callbacks
--> 304 obj_value = self._build_and_fit_model(trial, *args, **copied_kwargs)
305
/usr/local/lib/python3.7/dist-packages/keras_tuner/engine/tuner.py in _build_and_fit_model(self, trial, *args, **kwargs)
233 model = self._try_build(hp)
--> 234 return self.hypermodel.fit(hp, model, *args, **kwargs)
235
/usr/local/lib/python3.7/dist-packages/keras_tuner/engine/hypermodel.py in fit(self, hp, model, *args, **kwargs)
136 """
--> 137 return model.fit(*args, **kwargs)
138
/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
66 filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67 raise e.with_traceback(filtered_tb) from None
68 finally:
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in _numpy(self)
1116 except core._NotOkStatusException as e: # pylint: disable=protected-access
-> 1117 raise core._status_to_exception(e) from None # pylint: disable=protected-access
1118
UnimplementedError: File system scheme '[local]' not implemented (file: './untitled_project/trial_78ed6883514d67dc6222064095c134cb/checkpoints/epoch_0/checkpoint_temp/part-00000-of-00001')
Encountered when executing an operation using EagerExecutor. This error cancels all future operations and poisons their output tensors.
During handling of the above exception, another exception occurred:
IndexError Traceback (most recent call last)
<ipython-input-15-24c1e1bb603d> in <module>()
16 seed=42)
17 # Will stop training if the "val_loss" hasn't improved in 30 epochs.
---> 18 tuner.search(X_train, train_label, epochs=200, validation_split=0.1, shuffle=True, callbacks=[tensorflow.keras.callbacks.EarlyStopping('val_loss', patience=30)])
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py in __exit__(self, exception_type, exception_value, traceback)
454 "tf.distribute.set_strategy() out of `with` scope."),
455 e)
--> 456 _pop_per_thread_mode()
457
458
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribution_strategy_context.py in _pop_per_thread_mode()
64
65 def _pop_per_thread_mode():
---> 66 ops.get_default_graph()._distribution_strategy_stack.pop(-1) # pylint: disable=protected-access
67
68
IndexError: pop from empty list
For some extra info, I am attaching my code in this post.

This is your error:
UnimplementedError: File system scheme '[local]' not implemented (file: './untitled_project/trial_78ed6883514d67dc6222064095c134cb/checkpoints/epoch_0/checkpoint_temp/part-00000-of-00001')
See https://stackoverflow.com/a/62881833/14043558 for a solution.

TypeError when trying to use EarlyStopping with f1-metric as stopping criterion

I want for training a CNN with Early Stopping and want to use the f1-metric as stopping criterion.
When I compile the code for the CNN model I get the a TypeError as error message.
I'm still using Tensorflow 1.4 would like to avoid an upgrade to 2.0, because I have in mind that my previous code doesn't work anymore.
The error message is as follows:
TypeError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in _num_samples(x)
158 try:
--> 159 return len(x)
160 except TypeError:
14 frames
/usr/local/lib/python3.6/dist-
packages/tensorflow_core/python/framework/ops.py in __len__(self)
740 "Please call `x.shape` rather than `len(x)` for "
--> 741 "shape information.".format(self.name))
742
TypeError: len is not well defined for symbolic Tensors. (dense_16_target:0) Please call `x.shape` rather than `len(x)` for shape information.
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-44-cd3da16e057c> in <module>()
----> 1 model = model_cnn(False,False, False,True,6, 0.2, 0.5)
2 X_train, X_val, y_train, y_val = split_data(X_train, y_train,1)
3 cnn, ep = train_model_es(model, X_train, y_train, X_val, y_val, X_test, y_test, 50, 500,1)
<ipython-input-42-d275d9c69c03> in model_cnn(spat, extra_pool, avg_pool, cw, numb_conv, drop_conv, drop_dense)
36 if cw == True:
37 print("sparse categorical crossentropy")
---> 38 model.compile(loss="sparse_categorical_crossentropy", optimizer=Adam(), metrics=['accuracy', f1_metric])
39 #model.compile(loss="sparse_categorical_crossentropy", optimizer=Adam(), metrics=['accuracy'])
40 print("nothing")
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in compile(self, optimizer, loss, metrics, loss_weights, sample_weight_mode, weighted_metrics, target_tensors, **kwargs)
452 output_metrics = nested_metrics[i]
453 output_weighted_metrics = nested_weighted_metrics[i]
--> 454 handle_metrics(output_metrics)
455 handle_metrics(output_weighted_metrics, weights=weights)
456
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in handle_metrics(metrics, weights)
421 metric_result = weighted_metric_fn(y_true, y_pred,
422 weights=weights,
--> 423 mask=masks[i])
424
425 # Append to self.metrics_names, self.metric_tensors,
/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py in weighted(y_true, y_pred, weights, mask)
426 """
427 # score_array has ndim >= 2
--> 428 score_array = fn(y_true, y_pred)
429 if mask is not None:
430 # Cast the mask to floatX to avoid float64 upcasting in Theano
<ipython-input-9-b21dc3bd89a6> in f1_metric(y_test, y_pred)
1 def f1_metric(y_test, y_pred):
----> 2 f1 = f1_score(y_test, y_pred, average='macro')
3 return f1
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_classification.py in f1_score(y_true, y_pred, labels, pos_label, average, sample_weight, zero_division)
1097 pos_label=pos_label, average=average,
1098 sample_weight=sample_weight,
-> 1099 zero_division=zero_division)
1100
1101
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_classification.py in fbeta_score(y_true, y_pred, beta, labels, pos_label, average, sample_weight, zero_division)
1224 warn_for=('f-score',),
1225 sample_weight=sample_weight,
-> 1226 zero_division=zero_division)
1227 return f
1228
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_classification.py in precision_recall_fscore_support(y_true, y_pred, beta, labels, pos_label, average, warn_for, sample_weight, zero_division)
1482 raise ValueError("beta should be >=0 in the F-beta score")
1483 labels = _check_set_wise_labels(y_true, y_pred, average, labels,
-> 1484 pos_label)
1485
1486 # Calculate tp_sum, pred_sum, true_sum ###
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_classification.py in _check_set_wise_labels(y_true, y_pred, average, labels, pos_label)
1299 str(average_options))
1300
-> 1301 y_type, y_true, y_pred = _check_targets(y_true, y_pred)
1302 present_labels = unique_labels(y_true, y_pred)
1303 if average == 'binary':
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_classification.py in _check_targets(y_true, y_pred)
78 y_pred : array or indicator matrix
79 """
---> 80 check_consistent_length(y_true, y_pred)
81 type_true = type_of_target(y_true)
82 type_pred = type_of_target(y_pred)
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in check_consistent_length(*arrays)
206 """
207
--> 208 lengths = [_num_samples(X) for X in arrays if X is not None]
209 uniques = np.unique(lengths)
210 if len(uniques) > 1:
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in <listcomp>(.0)
206 """
207
--> 208 lengths = [_num_samples(X) for X in arrays if X is not None]
209 uniques = np.unique(lengths)
210 if len(uniques) > 1:
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in _num_samples(x)
159 return len(x)
160 except TypeError:
--> 161 raise TypeError(message)
162
163
TypeError: Expected sequence or array-like, got <class 'tensorflow.python.framework.ops.Tensor'>
And here is the relevant code:
def f1_metric(y_test, y_pred):
f1 = f1_score(y_test, y_pred, average='macro')
return f1
def train_model_es(model, X, y, X_val, y_val, X_test, y_test):
es = EarlyStopping(monitor='f1_metric', mode='max', patience=20, restore_best_weights=True)
y = np.argmax(y, axis=1)
y_val = np.argmax(y_val, axis=1)
y_test = np.argmax(y_test, axis=1)
class_weights = class_weight.compute_class_weight('balanced', np.unique(y), y)
class_weights = dict(enumerate(class_weights))
history = model.fit(X, y, class_weight=class_weights, batch_size=32,epochs=20, verbose=1,
validation_data=(X_val, y_val), callbacks=[es])
def model_cnn():
model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3), input_shape=(28,28,1),padding='same'))
model.add(BatchNormalization())
model.add(ELU())
model.add(Conv2D(32, kernel_size=(3,3), padding='same'))
model.add(BatchNormalization())
model.add(ELU())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(256))
model.add(BatchNormalization())
model.add(ELU())
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss="sparse_categorical_crossentroy",optimizer=Adam(),metrics=["accuracy",f1_metric])
return model
Does anyone have a tip on how to fix this error message.
Many thanks for every hint

As the error message suggests,
Error 1) You are doing a len() operation on symbolic tensor. You cannot do that operation on symbolic tensor. You can find difference between a variable tensor and symbolic tensor here.
Error 2) You are using a tensor for the operation which expects array as input. Can you please convert y_true and y_pred from tensor to array and use in the f1_score and other operations.
Example - To convert tensor to array
%tensorflow_version 1.x
print(tf.__version__)
import tensorflow as tf
import numpy as np
x = tf.constant([1,2,3,4,5,6])
print("Type of x:",x)
with tf.Session() as sess:
y = np.array(x.eval())
print("Type of y:",y.shape,y)
Output -
1.15.2
Type of x: Tensor("Const_24:0", shape=(6,), dtype=int32)
Type of y: (6,) [1 2 3 4 5 6]

sklearn classification_report ValueError: Unknown label type:

I am trying a simple classification report on the output from a Keras model prediction. The format of the inputs are two 1D arrays, but the error is still thrown.
Y_pred = np.squeeze(model.predict(test_data[0:5]))
classification_report(test_labels[0:5], Y_pred)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-235-49afd2f46d17> in <module>()
----> 1 classification_report(test_labels[0:5], Y_pred)
/Library/Python/2.7/site-packages/sklearn/metrics/classification.pyc in classification_report(y_true, y_pred, labels, target_names, sample_weight, digits)
1356
1357 if labels is None:
-> 1358 labels = unique_labels(y_true, y_pred)
1359 else:
1360 labels = np.asarray(labels)
/Library/Python/2.7/site-packages/sklearn/utils/multiclass.pyc in unique_labels(*ys)
97 _unique_labels = _FN_UNIQUE_LABELS.get(label_type, None)
98 if not _unique_labels:
---> 99 raise ValueError("Unknown label type: %s" % repr(ys))
100
101 ys_labels = set(chain.from_iterable(_unique_labels(y) for y in ys))
ValueError: Unknown label type: (array([-0.38947693, 0.18258421, -0.00295772, -0.06293461, -0.29382696]), array([-0.46586546, 0.1359883 , -0.00223112, -0.08303966, -0.29208803]))
Both of the inputs are of the same type, so I am confused why this would not work? I have tried changing the type explicitly to dtype=float and flattening the inputs, but it still does not work.

classification_report works only for classification problems.
If you have a classification problem (eg, binary), use the following
Y_pred = np.squeeze(model.predict(test_data[0:5]))
threshold = 0.5
classification_report(test_labels[0:5], Y_pred > threshold)
threshold will make everything greater than 0.5 (in example above), 1.0

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

xgboost for categorical data gives an error when evaluating - xgboost

Related

'key of type tuple not found and not a MultiIndex' while generating ROC for multi-class classification

tf.data.Dataset.from_generator : value error "not enough values to unpack (expected 2, got 1)"

How to fix "pop from empty list" error while using Keras tuner search method with TPU in google colab?

TypeError when trying to use EarlyStopping with f1-metric as stopping criterion

sklearn classification_report ValueError: Unknown label type:

Categories

Resources