How to save the best model instead of the last one for Detectron2 - object-detection

I want to save the best model instead of the last model for detectron2. The evaluation metric I want to use is AP50 or something similar. The code I currently have is:
trainer.register_hooks([
EvalHook(eval_period=20, eval_function=lambda:{'AP50':function?}),
BestCheckpointer(eval_period=20, checkpointer=trainer.checkpointer, val_metric= "AP50", mode="max")
])
But I have no idea what I have to substitute for the function in EvalHook. I use a subset of the coco dataset to train the model, and I saw that detectron2 contains some evaluation measures for the coco dataset, but I have no idea how to implement this.

This notebook has an implementation of what you asked and what I am searching for...
trainer.resume_or_load(resume=False)
if cfg.TEST.AUG.ENABLED:
trainer.register_hooks(
[hooks.EvalHook(0, lambda: trainer.test_with_TTA(cfg, trainer.model))] #this block uses a hook to run evalutaion periodically
) #https://detectron2.readthedocs.io/en/latest/modules/engine.html#detectron2.engine.hooks.EvalHook
trainer.train()
Try this...
I will report here if it works.

Related

Reload Keras-Tuner Trials from the directory

I'm trying to reload or access the Keras-Tuner Trials after the Tuner's search has completed for inspecting the results. I'm not able to find any documentation or answers related to this issue.
For example, I set up BayesianOptimization to search for the best hyper-parameters as follows:
## Build Hyper Parameter Search
tuner = kt.BayesianOptimization(build_model,
objective='val_categorical_accuracy',
max_trials=10,
directory='kt_dir',
project_name='lstm_dense_bo')
tuner.search((X_train_seq, X_train_num), y_train_cat,
epochs=30,
batch_size=64,
validation_data=((X_val_seq, X_val_num), y_val_cat),
callbacks=[callbacks.EarlyStopping(monitor='val_loss', patience=3,
restore_best_weights=True)])
I see this creates trial files in the directory kt_dir with project name lstm_dense_bo such as below:
Now, if I restart my Jupyter kernel, how can I reload these trials into a Tuner object and subsequently inspect the best model or the best hyperparameters or the best trial?
I'd very much appreciate your help. Thank you
I was trying to do the same thing. I was looking into the keras docs for an easier way than this but could not find one - so if any other SO-ers have a better idea, please let us know!
Load the previous tuner. Make sure overwrite=False or else you'll delete your trials.
workdir = "mlp_202202151345"
obj = "val_recall"
tuner = kt.Hyperband(
hypermodel=build_model,
metrics=metrics,
objective=kt.Objective(obj, direction="max"),
executions_per_trial=1,
overwrite=False,
directory=workdir,
project_name="keras_tuner",
)
Look for a trial you want to load. Note that TensorBoard works really well for this. In this example, I'm loading 1a38ebaba07b77501999cb1c4ab9413e.
Here's the part that I could not find in Keras docs. This might be dependent on the tuner you use (I am using Hyperband):
tuner.oracle.get_trial('1a38ebaba07b77501999cb1c4ab9413e')
Returns a Trial object (also could not find in the docs). The Trial object has a hyperparameters attribute that will return that trial's hyperparameters. Now:
tuner.hypermodel.build(trial.hyperparameters)
Gives you the trial's model for training, evaluation, predictions, etc.
NOTE This seems convuluted and hacky, would love to see a better way.
j7skov has correctly mentioned that you need to reload previous tuner and set the parameter overwrite=False(so that tuner will not overwrite already generated trials).
Further if you want to load first K best models then we need to use tuner's get_best_models method as below
# This will load 10 best hyper tuned models with the weights
# corresponding to their best checkpoint (at the end of the best epoch of best trial).
best_model_count = 10
bo_tuner_best_models = tuner.get_best_models(num_models=best_model_count)
Then you can access a specific best model as below
best_model_id = 7
model = bo_tuner_best_models[best_model_id]
This method is for querying the models trained during the search. For best performance, it is recommended to retrain your Model on the full dataset using the best hyperparameters found during search, which can be obtained using tuner.get_best_hyperparameters().
tuner_best_hyperparameters = tuner.get_best_hyperparameters(num_trials=best_model_count)
best_hp = tuner_best_hyperparameters[best_model_id]
model = tuner.hypermodel.build(best_hp)
If you want to just display hyperparameters for the K best models then use tuner's results_summary method as below
tuner.results_summary(num_trials=best_model_count)
For further reference visit this page.
Inspired by j7skov, I found that the models can be reloaded
by manipulating tuner.oracle.trials and tuner.load_model.
By assigning tuner.oracle.trials to a variable, we can find that it is a dict object containing all relavant trials in the tuning process.
The keys of the dictionary are the trial_id, and the values of the
dictionary are the instance of the Trial object.
Alternatively, we can return the best few trials by using tuner.oracle.get_best_trials.
To inspect the hyperparameters of the trial, we can use the summary method of the instance.
To load the model, we can pass the trial instance to tuner.load_model.
Beware that different versions can lead to incompatibilities.
For example the directory structure is a little different between keras-tuner==1.0 and keras-tuner==1.1 as far as I know.
Using your example, the working flow may be summarized as follows.
# Recreate the tuner object
tuner = kt.BayesianOptimization(build_model,
objective='val_categorical_accuracy',
max_trials=10,
directory='kt_dir',
project_name='lstm_dense_bo',
overwrite=False)
# Return all trials from the oracle
trials = tuner.oracle.trials
# Print out the ID and the score of all trials
for trial_id, trial in trials.items():
print(trial_id, trial.score)
# Return best 5 trials
best_trials = tuner.oracle.get_best_trials(num_trials=5)
for trial in best_trials:
trial.summary()
model = tuner.load_model(trial)
# Do some stuff to the model
using
tuner = kt.BayesianOptimization(build_model,
objective='val_categorical_accuracy',
max_trials=10,
directory='kt_dir',
project_name='lstm_dense_bo')
will load the tuner again.

Learning to rank how to save model

I successfully managed to implement learning to rank by following the tutorial TF-Ranking for sparse features using the ANTIQUE question answering dataset.
Now my goal is to successfully save the learned model to disk so that I can easily load it without training again. Due to the Tensorflow docs, the estimator.export_saved_model() method seems to be the way to go. But I can't wrap my head around how to tell Tensorflow how my feature structure looks like. Due to the docs here the easiest way seems to be calling tf.estimator.export.build_parsing_serving_input_receiver_fn(), which returns me the required inpur receiver function which I have to pass to the export_saved_model function. But how do I tell Tensorflow how my features from my learning to rank model look like?
From my current understanding the model has context feature specs and example feature specs. So I guess I somehow have to combine those two specs into one feature description, which I then can pass to the build_parsing_serving_input_receiver_fn function?
So I think you are on the right track;
You can get a build_ranking_serving_input_receiver_fn like this: (substitue context_feature_columns(...) and example_feature_columns(...) with defs you probably have for creating your own context and example structures for your training data):
def example_serving_input_fn():
context_feature_spec = tf.feature_column.make_parse_example_spec(
context_feature_columns(_VOCAB_PATHS).values())
example_feature_spec = tf.feature_column.make_parse_example_spec(
list(example_feature_columns(_VOCAB_PATHS).values()))
servingInputReceiver = tfr.data.build_ranking_serving_input_receiver_fn(
data_format=tfr.data.ELWC,
context_feature_spec=context_feature_spec,
example_feature_spec=example_feature_spec,
list_size=_LIST_SIZE,
receiver_name="input_ranking_data",
default_batch_size=None)
return servingInputReceiver
And then pass this to export_saved_model like this:
ranker.export_saved_model('path_to_save_model', example_serving_input_fn())
(ranker here is a tf.estimator.Estimator, maybe you called this 'estimator' in your code)
ranker = tf.estimator.Estimator(
model_fn=model_fn,
model_dir=_MODEL_DIR,
config=run_config)

How to implement the tensor product of two layers in Keras/Tf

I'm trying to set up a DNN for classification and at one point I want to take the tensor product of a vector with itself. I'm using the Keras functional API at the moment but it isn't immediately clear that there is a layer that does this already.
I've been attempting to use a Lambda layer and numpy in order to try this, but it's not working.
Doing a bit of googling reveals
tf.linalg.LinearOperatorKronecker, which does not seem to work either.
Here's what I've tried:
I have a layer called part_layer whose output is a single vector (rank one tensor).
keras.layers.Lambda(lambda x_array: np.outer(x_array, x_array),) ( part_layer) )
Ideally I would want this to to take a vector of the form [1,2] and give me [[1,2],[2,4]].
But the error I'm getting suggests that the np.outer function is not recognizing its arguments:
AttributeError: 'numpy.ndarray' object has no attribute '_keras_history
Any ideas on what to try next, or if there is a simple function to use?
You can use two operations:
If you want to consider the batch size you can use the Dot function
Otherwise, you can use the the dot function
In both case the code should look like this:
dot_lambda = lambda x_array: tf.keras.layers.dot(x_array, x_array)
# dot_lambda = lambda x_array: tf.keras.layers.Dot(x_array, x_array)
keras.layers.Lambda(dot_lamda)( part_layer)
Hope this help.
Use tf.tensordot(x_array, x_array, axes=0) to achieve what you want. For example, the expression print(tf.tensordot([1,2], [1,2], axes=0)) gives the desired result: [[1,2],[2,4]].
Keras/Tensorflow needs to keep an history of operations applied to tensors to perform the optimization. Numpy has no notion of history, so using it in the middle of a layer is not allowed. tf.tensordot performs the same operation, but keeps the history.

Training data in two steps with same accuracy?

I am trying to implement active learning machine(an experiment for a project) algorithm, where I want to train separately, please check my code below.
clf = BernoulliNB()
clf.fit(X_train[0:40], y_train[0:40])
clf.fit(X_train[40:], y_train[40:])
The above usually done like this
clf = BernoulliNB()
clf.fit(X_train, y_train)
Both have different accuracy score. I want to add training data to existing model itself since its computationally expensive - I don't want my model to do one more time computation.
Any way I can ?
You should use partial_fit to train your model in batches.
clf = BernoulliNB()
clf.partial_fit(X_train[0:40], y_train[0:40])
clf.partial_fit(X_train[40:], y_train[40:])
Please check this to know more about the function.
Hope this helps:)
This is called online training or Incremental learning used for large data. Please see this page for strategies.
Essentially, in scikit-learn, you need partial_fit() with all the labels in y known in advance.
partial_fit(X, y, classes=None, sample_weight=None)
classes : array-like, shape = [n_classes] (default=None)
List of all the classes that can possibly appear in the y vector. Must be provided at the first call to partial_fit, can be omitted in subsequent calls.
If you simply do this:
clf.partial_fit(X_train[0:40], y_train[0:40])
clf.partial_fit(X_train[40:], y_train[40:])
Then there is a possibility that that if any class which is not present in the first 40 samples, and comes in next iterations of partial_fit(), then it will throw an error.
So ideally you should be doing this:
# First call
clf.partial_fit(X_train[0:40], y_train[0:40], classes = np.unique(y_train))
# subsequent calls
clf.partial_fit(X_train[40:80], y_train[40:80])
clf.partial_fit(X_train[80:], y_train[80:])
and so on..

tensorflow retrain model file

im getting started with tensorflow und using retrain.py to teach it some new categories - this works well - however i have some questions:
In the comments of retrain.py it says:
"This produces a new model file that can be loaded and run by any TensorFlow
program, for example the label_image sample code"
however I havent found where this new model file is saved to ?
also: it does contain the whole model, right ? not just the retrained part ?
Thanks for clearing this up
1)I think you may want to save the new model.
When you want to save a model after some process, you can use
saver.save(sess, 'directory/model-name', *optional-arg).
Check out https://www.tensorflow.org/api_docs/python/tf/train/Saver
If you change model-name by epoch or any measure you would like to use, you can save the new model(otherwise, it may overlap with previous models saved).
You can find the model saved by searching 'checkpoint', '.index', '.meta'.
2)Saving the whole model or just part of it?
It's the part you need to learn bunch of ideas on tf.session and savers. You can save either the whole or just part, it's up to you. Again, start from the above link. The moral is that you put the variables you would like to save in a list quoted as 'var_list' in the link, and you can save only for them. When you call them back, you now also need to specify which variables in your model correspond to the variables in the loaded variables.
While running retrain.py you can give --output_graph and --output_labels parameters which specify the location to save graph (default is /tmp/output_graph.pb) and the labels as well. You can change those as per your requirements.