ValueError: No gradients provided for any variable Huggingface - tensorflow

Hi, I am following the Hugging Face course for Question Answering.
I built my own Dataset, all the features are present, and I get the exact same results up until fitting the model.
There I get the above error.
After some research it seems this is caused by not having the columns in the correct order.
The tokenizer does output them in a different order and I changed it, but neither the order from the course nor the order from the tokenizer seems to work.
Can someone think of another issue?
I don't have the Data Collator as it's deprecated now.
Token Type Ids are commented out because the tokenizer does not return them.
I'm using "distilbert-base-cased-distilled-squad" because I just want to experiment and it seems like the fastest (smallest) model.
tf_train_dataset = train_dataset.to_tf_dataset(
    columns=[
        "attention_mask",
        "end_positions",
        "input_ids",
        "start_positions",
        # "token_type_ids",
    ],
    shuffle=True,
    batch_size=4,
)
Thank you very much!
edit: I get the same error with the model from the tutorial.
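For completeness, here is a sketch of how I could restructure the call so that Keras sees explicit labels. Moving the answer positions into label_cols and compiling without a loss (so the model's internal loss is used) is only a guess on my part; the optimizer settings are placeholders too.
import tensorflow as tf
from transformers import TFAutoModelForQuestionAnswering

model = TFAutoModelForQuestionAnswering.from_pretrained(
    "distilbert-base-cased-distilled-squad"
)

# Guess: expose the answer positions as labels instead of inputs,
# so model.fit() actually receives targets to compute gradients from.
tf_train_dataset = train_dataset.to_tf_dataset(
    columns=["attention_mask", "input_ids"],
    label_cols=["start_positions", "end_positions"],
    shuffle=True,
    batch_size=4,
)

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5))
model.fit(tf_train_dataset, epochs=1)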

Related

How to save the best model instead of the last one for Detectron2

I want to save the best model instead of the last model for detectron2. The evaluation metric I want to use is AP50 or something similar. The code I currently have is:
trainer.register_hooks([
    EvalHook(eval_period=20, eval_function=lambda: {'AP50': function?}),
    BestCheckpointer(eval_period=20, checkpointer=trainer.checkpointer,
                     val_metric="AP50", mode="max")
])
But I have no idea what I have to substitute for the function in EvalHook. I use a subset of the COCO dataset to train the model, and I saw that detectron2 contains some evaluation measures for the COCO dataset, but I have no idea how to implement this.
This notebook has an implementation of what you asked and what I am searching for...
trainer.resume_or_load(resume=False)
if cfg.TEST.AUG.ENABLED:
    trainer.register_hooks(
        [hooks.EvalHook(0, lambda: trainer.test_with_TTA(cfg, trainer.model))]  # this block uses a hook to run evaluation periodically
    )  # https://detectron2.readthedocs.io/en/latest/modules/engine.html#detectron2.engine.hooks.EvalHook
trainer.train()
Try this...
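If you want BestCheckpointer to track AP50 specifically, a rough sketch of an eval_function built on detectron2's COCO evaluation utilities might look like the following. The dataset name "my_coco_val" is a placeholder, and I have not verified this end to end.
from detectron2.data import build_detection_test_loader
from detectron2.engine import hooks
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

def eval_ap50():
    # Run COCO-style evaluation on the validation split and return the
    # metric that BestCheckpointer is watching ("AP50" of the "bbox" task).
    evaluator = COCOEvaluator("my_coco_val", output_dir=cfg.OUTPUT_DIR)
    val_loader = build_detection_test_loader(cfg, "my_coco_val")
    results = inference_on_dataset(trainer.model, val_loader, evaluator)
    return {"AP50": results["bbox"]["AP50"]}

trainer.register_hooks([
    hooks.EvalHook(eval_period=20, eval_function=eval_ap50),
    hooks.BestCheckpointer(eval_period=20, checkpointer=trainer.checkpointer,
                           val_metric="AP50", mode="max"),
])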
I will report here if it works.

Tensorflow validation_data error for multi-input model

My tensorflow 2.6 model has two inputs. When I train this model without validation data, à la model.fit(x=[train_data1, train_data2], y=train_target), it works perfectly. When I try to add some validation data, however, I receive errors.
model.fit(x=[train_data1, train_data2], y=train_target,
          validation_data=([val_data1, val_data2], val_target))
throws the following error:
Layer Input__ expects 2 input(s), but it received 3 input tensors.
The closest thing I got for help is this question. There, the answerer suggests doing exactly as I have done. What can be done so that this model can use validation_data?
After an hour of beating my head against the wall, I restarted the kernel then tried
model.fit(x=[train_data1, train_data2], y=train_target,
          validation_data=([val_data1, val_data2], val_target))
again, just like in the question. It worked...
Like every IT person in the history of the human race will remind you, "Did you try turning it off and on again?" Lesson learned.
try wrapping it in a numpy array or a tensor like this:
validation_data=(np.array([val_data1, val_data2]), val_target)
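For comparison, here is a minimal self-contained two-input model and fit call with validation_data; the layer sizes and random data are made up.
import numpy as np
import tensorflow as tf

# Toy data for a model with two separate inputs.
train_data1, train_data2 = np.random.rand(100, 8), np.random.rand(100, 4)
train_target = np.random.rand(100, 1)
val_data1, val_data2 = np.random.rand(20, 8), np.random.rand(20, 4)
val_target = np.random.rand(20, 1)

in1 = tf.keras.Input(shape=(8,))
in2 = tf.keras.Input(shape=(4,))
merged = tf.keras.layers.Concatenate()([in1, in2])
out = tf.keras.layers.Dense(1)(merged)
model = tf.keras.Model(inputs=[in1, in2], outputs=out)
model.compile(optimizer="adam", loss="mse")

model.fit(x=[train_data1, train_data2], y=train_target,
          validation_data=([val_data1, val_data2], val_target),
          epochs=1)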

Predict probability of predicted class

ML beginner here.
I have a dataset containing the GPA, GRE, TOEFL, SOP & LOR rankings (out of 5), etc. (all numerical), and a final column that states whether or not they were admitted to a university (0 or 1), which is what we'll use as y_train.
I'm supposed to not just classify the predicted labels, but also calculate the probability of each person getting admitted.
edit: so from the first comment, I built a Logistic Regression model, and with some googling I found 'predict_proba' from sklearn and tried implementing it. There weren't any syntactical errors, but the values given by predict_proba were horribly wrong.
Link: https://github.com/tarunn2799/gre-pred/blob/master/GRE%20Admission%20Probability-%20Extraaedge.ipynb
Please help me find where I've gone wrong, and share any tips to reduce the loss.
thank you!
I read your notebook, but I'm confused why you think the predict_proba values are horribly wrong.
Is the prediction accuracy not good, or is the format of predict_proba not what you expected?
You could use sklearn.metrics.accuracy_score() and sklearn.metrics.confusion_matrix() to check your predicted labels, or use sklearn.metrics.roc_auc_score() to check the result of predict_proba. Checking both the train & test parts is better.
I think the format of predict_proba is correct, or maybe you could try predict_log_proba() to calculate the log probability?
Hope this could help you.
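For example, here is a minimal sketch of checking predict_proba against roc_auc_score; the column names are assumptions based on the dataset description and will need to match your dataframe.
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Assumed column names; adjust to match your data.
X = df[["GRE Score", "TOEFL Score", "CGPA", "SOP", "LOR"]]
y = df["Admitted"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Column 1 of predict_proba is P(admitted = 1) for each applicant.
probs = clf.predict_proba(X_test)[:, 1]
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
print("roc auc :", roc_auc_score(y_test, probs))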

In Tensorflow-Serving, is it possible to get only the top-k prediction results?

When using the code in https://www.tensorflow.org/serving, but with a DNNClassifier Estimator model, the curl/query request returns all the possible label classes and their associated scores.
Using a model with 100,000+ possible output/label classes, the response becomes too large. Is there any way to limit the number of outputs to the top-k results? (Similar to how it can be done in keras).
The only possibility I could think of is feeding some parameter into the predict API through the signatures, but I haven't found any parameters that would give this functionality. I've read through a ton of documentation + code and googled a ton, but to no avail.
Any help would be greatly appreciated. Thanks in advance for any responses. <3
AFAIC, there are 2 ways to support your need.
You could add some lines in tensorflow-serving source code referring to this
You could do something like this while training/retraining your model.
Hope this will help.
Putting this up here in case it helps anyone. It's possible to override the classification_output() function in head.py (which is used by dnn.py) in order to filter the top-k results. You can insert this snippet into your main.py / train.py file, and whenever you save a DNNClassifier model, that model will output at most num_top_k_results classes when doing inference/serving. The vast majority of the method is copied from the original classification_output() function. (Note this may or may not work with 1.13 / 2.0 as it hasn't been tested on those.)
import tensorflow as tf
from tensorflow.python.estimator.canned import head as head_lib
from tensorflow.python.ops import array_ops, math_ops, string_ops
from tensorflow.python.saved_model import export_output

num_top_k_results = 5

def override_classification_output(scores, n_classes, label_vocabulary=None):
    batch_size = array_ops.shape(scores)[0]
    if label_vocabulary:
        export_class_list = label_vocabulary
    else:
        export_class_list = string_ops.as_string(math_ops.range(n_classes))
    # Get the top_k results
    top_k_scores, top_k_indices = tf.nn.top_k(scores, num_top_k_results)
    # Using the top_k_indices, get the associated class names (from the vocabulary)
    top_k_classes = tf.gather(tf.convert_to_tensor(value=export_class_list),
                              tf.squeeze(top_k_indices))
    export_output_classes = array_ops.tile(
        input=array_ops.expand_dims(input=top_k_classes, axis=0),
        multiples=[batch_size, 1])
    return export_output.ClassificationOutput(
        scores=top_k_scores,
        # `ClassificationOutput` requires string classes.
        classes=export_output_classes)

# Override the original method with our custom one.
head_lib._classification_output = override_classification_output
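As a usage note, the override just has to be applied before the classifier is built and exported. A rough sketch of where it fits; the feature columns, input functions, and export path below are placeholders, not tested code.
# After applying the override above, build and export the classifier as usual.
feature_columns = [tf.feature_column.numeric_column("x", shape=[28 * 28])]

classifier = tf.estimator.DNNClassifier(
    hidden_units=[256, 64],
    feature_columns=feature_columns,
    n_classes=100000)

classifier.train(input_fn=train_input_fn, steps=1000)

# The exported SavedModel now returns only the top-k classes/scores when served.
classifier.export_savedmodel("export_dir", serving_input_receiver_fn)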

Gradient for Each Example Using map_fn

I want to get the gradient of a layer with respect to a parameter matrix for each example. Normally, I would need a Jacobian, but following this idea, I decided to use map_fn so I could feed forward data in a batch rather than one by one. This gives me a problem I do not understand, unfortunately. With the code
get_grads = tf.map_fn(lambda x: tf.gradients(x, W['1'])[0], softmax_probs)
sess.run(get_grads, feed_dict={x: images[0:100]})
I get this error
InvalidArgumentError: TensorArray map_21/TensorArray_36#map_21/while/gradients: Could not write to TensorArray index 0 because it has already been read.
W['1'] is a variable in the graph. Ideas?
It seems like your issue may be connected with the bug
https://github.com/tensorflow/tensorflow/issues/7643
One commenter posts a possible fix at the end. You could try that out.
Alternatively, if what you want is the Jacobian, then you can check out this solution:
https://github.com/tensorflow/tensorflow/issues/675#issuecomment-362853672
although it appears that it will not work when nested.
I don't think this will work because x in this case is a loop variable which TensorFlow does not know how to connect to softmax_probs.
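For what it's worth, on newer TensorFlow (2.x) the same per-example gradients can be obtained with tf.GradientTape.jacobian instead of map_fn. A minimal sketch, with a made-up single-layer softmax model standing in for the tensors in the question:
import tensorflow as tf

# Toy stand-ins for the tensors in the question.
x = tf.random.normal([100, 784])                     # a batch of flattened images
W = {'1': tf.Variable(tf.random.normal([784, 10]))}  # the parameter matrix

with tf.GradientTape() as tape:
    softmax_probs = tf.nn.softmax(tf.matmul(x, W['1']))

# Jacobian of every per-example output w.r.t. the weight matrix:
# shape [100, 10, 784, 10], i.e. one gradient block per example and class.
per_example_grads = tape.jacobian(softmax_probs, W['1'])
print(per_example_grads.shape)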