How to use both custom eval_metric and a built-in metric in xgboost.XGBClassifier - xgboost

Hello!
I am trying to pass a list of eval_metrics consisting of one custom eval function and multiple built-in eval functions.
When I use a list of built-in functions, everything works OK:
model.fit(
    X_train_inner,
    y_train_inner,
    early_stopping_rounds=20,
    eval_metric=["error", "logloss", "map"],
    eval_set=[(X_test_inner, y_test_inner)])
Also, when I use my custom function on its own, everything is OK:
model.fit(
    X_train_inner,
    y_train_inner,
    early_stopping_rounds=20,
    eval_metric=custom_f1_eval_function,
    eval_set=[(X_test_inner, y_test_inner)])
But how do I pass both custom and builtin functions to the eval_metric parameter?
Thank you!

https://github.com/dmlc/xgboost/issues/1125
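import numpy as np
from sklearn.metrics import roc_auc_score

def auc_copc(preds, dtrain):
    # Convert raw margins to probabilities before scoring
    preds = 1.0 / (1.0 + np.exp(-preds))
    labels = dtrain.get_label()
    auc = roc_auc_score(labels, preds)
    copc = np.sum(labels) / np.sum(preds)
    # Return several (name, value) pairs from a single custom function
    return [('auc', auc), ('copc', copc)]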
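The idea in the snippet above is that one custom function can report several metrics at once by returning a list of (name, value) pairs. Here is a minimal sketch of the same computation that runs without xgboost or scikit-learn. FakeDMatrix and rank_auc are hypothetical helpers introduced only for illustration: the first stands in for an object with get_label(), the second is a simple pairwise AUC rather than roc_auc_score.

```python
import numpy as np

class FakeDMatrix:
    """Hypothetical stand-in for a DMatrix; only get_label() is provided."""
    def __init__(self, labels):
        self._labels = np.asarray(labels, dtype=float)

    def get_label(self):
        return self._labels

def rank_auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.
    Ties count as half. Assumes both classes are present."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def auc_copc(preds, dtrain):
    # Raw margins -> probabilities
    probs = 1.0 / (1.0 + np.exp(-preds))
    labels = dtrain.get_label()
    auc = rank_auc(labels, probs)
    # COPC: observed positives divided by the sum of predicted probabilities
    copc = float(labels.sum() / probs.sum())
    return [("auc", auc), ("copc", copc)]

margins = np.array([2.0, -1.0, 0.5, -2.0])
result = auc_copc(margins, FakeDMatrix([1, 0, 1, 0]))
print(result)
```

Whether a given xgboost version accepts a list of pairs from a custom metric depends on the API and release in use, so treat this as a sketch of the pattern rather than a guaranteed interface.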

Related

Save histogram during evaluation with estimator api

Is it possible to save a histogram during evaluation using the estimator API?
I couldn't find a solution, since the Estimator API does not write any summaries during evaluation and I can only add scalars to the evaluation metrics.
For the sake of those who came here and haven't found a solution: I used the SummarySaverHook approach, with a slight modification:
summary_writer = tf.compat.v1.summary.FileWriter(
    logdir=self.model_dir + '/eval_histograms/',
    filename_suffix='.host_call')
summary_ops = [tf.compat.v1.summary.histogram(name=k, values=v)
               for k, v in
               prepare_endpoints_for_summary(endpoints).items()]
eval_hooks = [
    tf.estimator.SummarySaverHook(save_steps=1,
                                  summary_writer=summary_writer,
                                  summary_op=summary_op)
    for summary_op in summary_ops]
And it worked fine!
You can use SummarySaverHook:
eval_hooks = []
eval_summary_hook = tf.train.SummarySaverHook(
    save_steps=1,
    output_dir='model_dir',
    summary_op=tf.summary.histogram(logits.name, logits))
eval_hooks.append(eval_summary_hook)
# Pass the hook list built above to the EstimatorSpec
return tf.estimator.EstimatorSpec(mode=mode,
                                  loss=loss,
                                  eval_metric_ops=eval_metric_ops,
                                  evaluation_hooks=eval_hooks)

Custom Keras metric, changing

I am currently trying to create my own loss function for Keras (using the TensorFlow backend). It is a simple categorical crossentropy, but I apply a factor to the 1st column so that loss from the 1st class is penalized more.
I am new to Keras and I can't figure out how to translate my function (below), since I have to use symbolic expressions and it seems I can't go element-wise:
def custom_categorical_crossentropy(y_true, y_pred):
    y_pred = np.clip(y_pred, _EPSILON, 1.0-_EPSILON)
    out = np.zeros(y_true.shape).astype('float32')
    for i in range(0, y_true.shape[0]):
        for j in range(0, y_true.shape[1]):
            # penalize more all elements on class 1 so that loss takes its low proportion in the dataset into account
            if j == 0:
                out[i][j] = -(prop_database*(y_true[i][j] * np.log(y_pred[i][j]) + (1.0 - y_true[i][j]) * np.log(1.0 - y_pred[i][j])))
            else:
                out[i][j] = -(y_true[i][j] * np.log(y_pred[i][j]) + (1.0 - y_true[i][j]) * np.log(1.0 - y_pred[i][j]))
    out = np.mean(out.astype('float32'), axis=-1)
    return tf.convert_to_tensor(out,
                                dtype=tf.float32,
                                name='custom_loss')
Can someone help me?
Many thanks!
You can use class_weight in the fit method to penalize classes without writing a custom function:
weights = {
    0: 2,
    1: 1,
    2: 1,
    3: 1,
    ...
}
model.compile(optimizer=chooseOne, loss='categorical_crossentropy')
model.fit(......., class_weight = weights)
This will make the first class twice as important as the others.
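To see what "twice as important" means numerically, here is a minimal NumPy sketch, assuming class weights act as per-sample multipliers on the crossentropy of each sample according to its true class. The function name weighted_categorical_crossentropy is hypothetical and is not a Keras API; it only illustrates the arithmetic.

```python
import numpy as np

def weighted_categorical_crossentropy(y_true, y_pred, class_weight, eps=1e-7):
    """Per-sample categorical crossentropy scaled by the weight of the true class."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # Standard categorical crossentropy for one-hot targets
    per_sample = -np.sum(y_true * np.log(y_pred), axis=-1)
    # Look up the weight of each sample's true class
    true_ids = np.argmax(y_true, axis=-1)
    sample_weights = np.array([class_weight[int(c)] for c in true_ids], dtype=float)
    return per_sample * sample_weights

y_true = np.array([[1.0, 0.0], [0.0, 1.0]])
y_pred = np.array([[0.5, 0.5], [0.5, 0.5]])
losses = weighted_categorical_crossentropy(y_true, y_pred, {0: 2, 1: 1})
print(losses)
```

With identical predictions, the sample from class 0 contributes twice the loss of the sample from class 1, which is the effect the weights dictionary is meant to produce.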

Output error rate per label / confusion matrix

I trained an image classifier using Keras up to around 98% test accuracy. I know that the overall accuracy is 98%, but I want to know the accuracy/error per distinct class/label.
Does Keras have a built-in function for that, or would I have to test this myself per class/label?
Update: Thanks @gionni. I didn't know the actual term was "Confusion Matrix", but that's what I am looking for. That being said, is there a function to generate one? I have to use Keras 1.2.2, by the way.
I had a similar issue, so I can share my code with you. The following function computes the accuracy for a single class:
def single_class_accuracy(interesting_class_id):
    def fn(y_true, y_pred):
        class_id_preds = K.argmax(y_pred, axis=-1)
        # Replace class_id_preds with class_id_true for recall here
        positive_mask = K.cast(K.equal(class_id_preds, interesting_class_id), 'int32')
        true_mask = K.cast(K.equal(y_true, interesting_class_id), 'int32')
        acc_mask = K.cast(K.equal(positive_mask, true_mask), 'float32')
        class_acc = K.mean(acc_mask)
        return class_acc
    return fn
Now, if you want the accuracy for class 0, you can add it to the metrics when compiling the model:
model.compile(..., metrics=[..., single_class_accuracy(0)])
If you want the accuracy for every class, you can write:
model.compile(...,
              metrics=[...] + [single_class_accuracy(i) for i in range(nb_of_classes)])
There may be better options, but you can use this:
import numpy as np
# gather each true label
distinct, counts = np.unique(trueLabels, axis=0, return_counts=True)
for dist, count in zip(distinct, counts):
    selector = (trueLabels == dist).all(axis=-1)
    selectedX = testData[selector]
    selectedY = trueLabels[selector]
    print('\n\nEvaluating for ' + str(count) + ' occurrences of class ' + str(dist))
    print(model.evaluate(selectedX, selectedY, verbose=0))
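Since the question asks for a confusion matrix, here is a minimal NumPy sketch that does not depend on any Keras version. It assumes you already have integer class ids for the true labels and for the predictions (for example, the argmax of the model output). The names confusion_matrix and per_class_recall are defined here for illustration and are not Keras functions.

```python
import numpy as np

def confusion_matrix(true_ids, pred_ids, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # Count each (true, predicted) pair, including repeated pairs
    np.add.at(cm, (np.asarray(true_ids), np.asarray(pred_ids)), 1)
    return cm

def per_class_recall(cm):
    """Fraction of samples of each true class that were predicted correctly."""
    totals = cm.sum(axis=1)
    return np.divide(np.diag(cm), totals,
                     out=np.zeros(len(cm), dtype=float),
                     where=totals > 0)

true_ids = [0, 0, 1, 1, 2]
pred_ids = [0, 1, 1, 1, 0]
cm = confusion_matrix(true_ids, pred_ids, num_classes=3)
recall = per_class_recall(cm)
print(cm)
print(recall)
```

The diagonal holds the correctly classified counts per class, so the per-class error rate is one minus the recall computed here.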

Keras How to use max_value in Relu activation function

The relu function as defined in keras/activations.py is:
def relu(x, alpha=0., max_value=None):
    return K.relu(x, alpha=alpha, max_value=max_value)
It has a max_value which can be used to clip the value. Now how can this be used/called in the code?
I have tried the following:
(a)
model.add(Dense(512,input_dim=1))
model.add(Activation('relu',max_value=250))
assert kwarg in allowed_kwargs, 'Keyword argument not understood: ' + kwarg
AssertionError: Keyword argument not understood: max_value
(b)
Rel = Activation('relu',max_value=250)
same error
(c)
from keras.layers import activations
uu = activations.relu(??,max_value=250)
The problem with this is that it expects the input to be present in the first value. The error is 'relu() takes at least 1 argument (1 given)'
So how do I make this a layer?
model.add(activations.relu(max_value=250))
has the same issue 'relu() takes at least 1 argument (1 given)'
If this function cannot be used as a layer, then there seems to be no way of specifying a clip value for ReLU. This implies that the comment closing a proposed change at https://github.com/fchollet/keras/issues/2119 is wrong...
Any thoughts? Thanks!
You can use the ReLU function of the Keras backend. Therefore, first import the backend:
from keras import backend as K
Then, you can pass your own function as activation using backend functionality.
This would look like
def relu_advanced(x):
    return K.relu(x, max_value=250)
Then you can use it like
model.add(Dense(512, input_dim=1, activation=relu_advanced))
or
model.add(Activation(relu_advanced))
Unfortunately, this hard-codes the additional arguments.
It is therefore better to use a function that returns your activation function and passes in your custom values:
def create_relu_advanced(max_value=1.):
    def relu_advanced(x):
        return K.relu(x, max_value=K.cast_to_floatx(max_value))
    return relu_advanced
Then you can pass your arguments by either
model.add(Dense(512, input_dim=1, activation=create_relu_advanced(max_value=250)))
or
model.add(Activation(create_relu_advanced(max_value=250)))
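To make the effect of max_value concrete, here is a minimal NumPy sketch of the same factory pattern. It assumes the clipped ReLU computes min(max(x, 0), max_value) element-wise. The name create_clipped_relu is hypothetical and uses NumPy in place of the Keras backend, so it illustrates the arithmetic only and is not usable as a Keras activation.

```python
import numpy as np

def create_clipped_relu(max_value=1.0):
    """Return a function computing min(max(x, 0), max_value) element-wise."""
    def clipped_relu(x):
        return np.minimum(np.maximum(x, 0.0), max_value)
    return clipped_relu

relu_250 = create_clipped_relu(max_value=250.0)
outputs = relu_250(np.array([-5.0, 100.0, 300.0]))
print(outputs)
```

The closure captures max_value once, so each call to the factory yields an activation with its own clip value and no hard-coded constant.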
This is as easy as a single lambda:
from keras.activations import relu
clipped_relu = lambda x: relu(x, max_value=3.14)
Then use it like this:
model.add(Conv2D(64, (3, 3)))
model.add(Activation(clipped_relu))
When loading a model saved in HDF5, use the custom_objects dictionary:
model = load_model(model_file, custom_objects={'<lambda>': clipped_relu})
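The '<lambda>' key may look odd. It appears to come from the fact that every Python lambda has the name '<lambda>', and the custom_objects lookup above is keyed by that name. A minimal pure-Python sketch follows; the two functions are simplified scalar stand-ins, not Keras activations.

```python
# Simplified scalar stand-ins for the activations discussed above
clipped_relu = lambda x: min(max(x, 0.0), 3.14)

def clip_relu(x):
    return min(max(x, 0.0), 1.0)

lambda_name = clipped_relu.__name__   # every lambda reports the same name
def_name = clip_relu.__name__         # a def keeps its own name

# A plain dict playing the role of custom_objects, keyed by function name
custom_objects = {lambda_name: clipped_relu, def_name: clip_relu}
print(lambda_name, def_name)
```

Because all lambdas share one name, a model with two different lambda activations could not be told apart by name, which is one reason to prefer a named def for anything you intend to save and reload.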
I tested the following, and it works:
import keras
from keras.layers import Dense

def clip_relu(x):
    return keras.activations.relu(x, max_value=1.)

predictions = Dense(num_classes, activation=clip_relu, name='output')
This is what I did, using a Lambda layer to implement a clipped ReLU:
Step 1: define a function to do reluclip:
def reluclip(x, max_value=20):
    return K.relu(x, max_value=max_value)
Step 2: add the Lambda layer to the model:
y = Lambda(function=reluclip)(y)

Tensorflow seq2seq weight sharing

def rnn_seq2seq(encoder_inputs, decoder_inputs, cell, output_projection=None, feed_previous=False, dtype=tf.float32, scope=None):
    with tf.variable_scope(scope or "rnn_seq2seq"):
        _, enc_states = rnn.rnn(cell, encoder_inputs, dtype=dtype)

        def extract_argmax(prev, i):
            if output_projection is not None:
                prev = tf.nn.xw_plus_b(prev, output_projection[0], output_projection[1])
            return tf.to_float(tf.equal(prev, tf.reduce_max(prev, reduction_indices=[1], keep_dims=True)))

        loop_function = None
        if feed_previous:
            loop_function = extract_argmax
        # seq2seq.rnn_decoder is provided in tensorflow/models/rnn/seq2seq.py
        return seq2seq.rnn_decoder(decoder_inputs, enc_states[-1], cell, loop_function=loop_function)
I want to create two RNN models, one for training and another for testing. For that, I can call the function twice, with feed_previous set to False or True.
train_op,train_states = rnn_seq2seq(enc_inp,dec_inp,cell,output_projection=op,feed_previous=False)
test_op,_ = rnn_seq2seq(enc_inp,dec_inp,cell,output_projection=op,feed_previous=True)
But if I call the above function twice, wouldn't it create two different RNNs? I am wondering whether they would be able to share the weights.
Both calls operate on the same default graph, so they can reuse the variables. Check out the variable scopes tutorial and see whether your variables are created with the reuse=True parameter.
As a sanity check, try the following snippet to list all variables in the default graph:
[v.name for v in tf.get_default_graph().as_graph_def().node if v.op=='Variable']
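The sharing rule behind reuse=True can be pictured with a toy model in plain Python. This is not TensorFlow code: VariableStore and build_model are hypothetical names, and the sketch only mimics the idea that variables are looked up by name, so a second model built with reuse enabled gets the same objects instead of new ones.

```python
class VariableStore:
    """Toy name-keyed variable registry (an illustration, not TensorFlow)."""
    def __init__(self):
        self._variables = {}

    def get_variable(self, name, initial_value, reuse=False):
        exists = name in self._variables
        if reuse and not exists:
            raise ValueError("Variable %s does not exist" % name)
        if exists and not reuse:
            raise ValueError("Variable %s already exists" % name)
        if not exists:
            # A one-element list acts as a mutable cell for the value
            self._variables[name] = [initial_value]
        return self._variables[name]

    def names(self):
        return sorted(self._variables)

def build_model(store, feed_previous, reuse):
    # Both models ask for the variable under the same scoped name
    weights = store.get_variable("rnn_seq2seq/weights", 0.0, reuse=reuse)
    return {"feed_previous": feed_previous, "weights": weights}

store = VariableStore()
train_model = build_model(store, feed_previous=False, reuse=False)
test_model = build_model(store, feed_previous=True, reuse=True)

# An update made through the training model is visible to the test model
train_model["weights"][0] = 1.5
print(store.names(), test_model["weights"][0])
```

Listing the names in the store plays the same role as the sanity check above: if the second model had created its own weights, a second name would show up.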