FANN incremental learning - training-data

I am now using FANN to do incremental learning. Would anyone let me know whether my program is wrong or not? Thank you.
I have a data set to train. But in future I will get some new data set. I wanna incrementally train the current nn with the new data set, called "incremental learning".
I firstly create and train a nn with old data "old.data". I also set the training algorithm.
struct fann *ann = fann_create_standard(num_layers, num_input, num_neurons_hidden, num_output);
fann_set_activation_function_hidden(ann, FANN_SIGMOID_SYMMETRIC);
fann_set_activation_function_output(ann, FANN_SIGMOID_SYMMETRIC);
fann_set_training_algorithm(ann, FANN_TRAIN_INCREMENTAL);
fann_train_on_file(ann, "old.data", max_epochs, epochs_between_reports, desired_error);
fann_save(ann, "mynn.net");
fann_destroy(ann);
Then when I have new data set "new.data", I think I can program like this:
struct fann *ann = fann_create_from_file("mynn.net");
fann_train_on_file(ann, "new.data", max_epochs, epochs_between_reports, desired_error);
Is my program correct?

No
if you train with new data maybe lost old.data's experience

Related

Can you build a model that normalises FEATURES using the test set while avoiding data leakage?

Just can't wrap my head around this one.
I understand that:
Normalising a target variable using the test set uses information on that target variable in the test set. This means we get inflated performance metrics that cannot be replicated once we receive a new test set (which does not have a target variable available).
However, when we receive a new test set, we do have predictor variables available. What is wrong with using these to normalise? Yes, the predictors contain information that relates to the target variable, however that's literally the definition of predicting using a model, we use the information in predictors to get specific predictions for a target. Why can't it be built-in to the model definition that it uses input data to normalise, before predicting?
The performance metrics, surely, wouldn't be skewed as we are just using information from the predictors.
Yes, the test set is supposed to be 'unseen', but in reality, surely it's only the test set target variable that is unseen, not the predictors.
I have read around this and answers so far are vague, just repeating that test set is unseen and that we gain information about the test set. I would really appreciate an answer on why we can't use specifically the predictors, as I think the target case is obvious.
Thanks in advance!!
Having gone away and thought about my Q - normalising our data on the training set as well - I realise this doesn't make much sense. Normalising is not part of the training, but something we do before training, therefore normalising w/ test set features is fine as an idea, but we then would have to go train this normalised data on the training set outcomes. I originally thought "normalise on more data" > "normalise on less data" but actually we'd normalise on one set (training + test), then fit on another (training). Probably get a more poorly trained model as a result and so as I believe it's a stupid idea!

Reload Keras-Tuner Trials from the directory

I'm trying to reload or access the Keras-Tuner Trials after the Tuner's search has completed for inspecting the results. I'm not able to find any documentation or answers related to this issue.
For example, I set up BayesianOptimization to search for the best hyper-parameters as follows:
## Build Hyper Parameter Search
tuner = kt.BayesianOptimization(build_model,
objective='val_categorical_accuracy',
max_trials=10,
directory='kt_dir',
project_name='lstm_dense_bo')
tuner.search((X_train_seq, X_train_num), y_train_cat,
epochs=30,
batch_size=64,
validation_data=((X_val_seq, X_val_num), y_val_cat),
callbacks=[callbacks.EarlyStopping(monitor='val_loss', patience=3,
restore_best_weights=True)])
I see this creates trial files in the directory kt_dir with project name lstm_dense_bo such as below:
Now, if I restart my Jupyter kernel, how can I reload these trials into a Tuner object and subsequently inspect the best model or the best hyperparameters or the best trial?
I'd very much appreciate your help. Thank you
I was trying to do the same thing. I was looking into the keras docs for an easier way than this but could not find one - so if any other SO-ers have a better idea, please let us know!
Load the previous tuner. Make sure overwrite=False or else you'll delete your trials.
workdir = "mlp_202202151345"
obj = "val_recall"
tuner = kt.Hyperband(
hypermodel=build_model,
metrics=metrics,
objective=kt.Objective(obj, direction="max"),
executions_per_trial=1,
overwrite=False,
directory=workdir,
project_name="keras_tuner",
)
Look for a trial you want to load. Note that TensorBoard works really well for this. In this example, I'm loading 1a38ebaba07b77501999cb1c4ab9413e.
Here's the part that I could not find in Keras docs. This might be dependent on the tuner you use (I am using Hyperband):
tuner.oracle.get_trial('1a38ebaba07b77501999cb1c4ab9413e')
Returns a Trial object (also could not find in the docs). The Trial object has a hyperparameters attribute that will return that trial's hyperparameters. Now:
tuner.hypermodel.build(trial.hyperparameters)
Gives you the trial's model for training, evaluation, predictions, etc.
NOTE This seems convuluted and hacky, would love to see a better way.
j7skov has correctly mentioned that you need to reload previous tuner and set the parameter overwrite=False(so that tuner will not overwrite already generated trials).
Further if you want to load first K best models then we need to use tuner's get_best_models method as below
# This will load 10 best hyper tuned models with the weights
# corresponding to their best checkpoint (at the end of the best epoch of best trial).
best_model_count = 10
bo_tuner_best_models = tuner.get_best_models(num_models=best_model_count)
Then you can access a specific best model as below
best_model_id = 7
model = bo_tuner_best_models[best_model_id]
This method is for querying the models trained during the search. For best performance, it is recommended to retrain your Model on the full dataset using the best hyperparameters found during search, which can be obtained using tuner.get_best_hyperparameters().
tuner_best_hyperparameters = tuner.get_best_hyperparameters(num_trials=best_model_count)
best_hp = tuner_best_hyperparameters[best_model_id]
model = tuner.hypermodel.build(best_hp)
If you want to just display hyperparameters for the K best models then use tuner's results_summary method as below
tuner.results_summary(num_trials=best_model_count)
For further reference visit this page.
Inspired by j7skov, I found that the models can be reloaded
by manipulating tuner.oracle.trials and tuner.load_model.
By assigning tuner.oracle.trials to a variable, we can find that it is a dict object containing all relavant trials in the tuning process.
The keys of the dictionary are the trial_id, and the values of the
dictionary are the instance of the Trial object.
Alternatively, we can return the best few trials by using tuner.oracle.get_best_trials.
To inspect the hyperparameters of the trial, we can use the summary method of the instance.
To load the model, we can pass the trial instance to tuner.load_model.
Beware that different versions can lead to incompatibilities.
For example the directory structure is a little different between keras-tuner==1.0 and keras-tuner==1.1 as far as I know.
Using your example, the working flow may be summarized as follows.
# Recreate the tuner object
tuner = kt.BayesianOptimization(build_model,
objective='val_categorical_accuracy',
max_trials=10,
directory='kt_dir',
project_name='lstm_dense_bo',
overwrite=False)
# Return all trials from the oracle
trials = tuner.oracle.trials
# Print out the ID and the score of all trials
for trial_id, trial in trials.items():
print(trial_id, trial.score)
# Return best 5 trials
best_trials = tuner.oracle.get_best_trials(num_trials=5)
for trial in best_trials:
trial.summary()
model = tuner.load_model(trial)
# Do some stuff to the model
using
tuner = kt.BayesianOptimization(build_model,
objective='val_categorical_accuracy',
max_trials=10,
directory='kt_dir',
project_name='lstm_dense_bo')
will load the tuner again.

Learning to rank how to save model

I successfully managed to implement learning to rank by following the tutorial TF-Ranking for sparse features using the ANTIQUE question answering dataset.
Now my goal is to successfully save the learned model to disk so that I can easily load it without training again. Due to the Tensorflow docs, the estimator.export_saved_model() method seems to be the way to go. But I can't wrap my head around how to tell Tensorflow how my feature structure looks like. Due to the docs here the easiest way seems to be calling tf.estimator.export.build_parsing_serving_input_receiver_fn(), which returns me the required inpur receiver function which I have to pass to the export_saved_model function. But how do I tell Tensorflow how my features from my learning to rank model look like?
From my current understanding the model has context feature specs and example feature specs. So I guess I somehow have to combine those two specs into one feature description, which I then can pass to the build_parsing_serving_input_receiver_fn function?
So I think you are on the right track;
You can get a build_ranking_serving_input_receiver_fn like this: (substitue context_feature_columns(...) and example_feature_columns(...) with defs you probably have for creating your own context and example structures for your training data):
def example_serving_input_fn():
context_feature_spec = tf.feature_column.make_parse_example_spec(
context_feature_columns(_VOCAB_PATHS).values())
example_feature_spec = tf.feature_column.make_parse_example_spec(
list(example_feature_columns(_VOCAB_PATHS).values()))
servingInputReceiver = tfr.data.build_ranking_serving_input_receiver_fn(
data_format=tfr.data.ELWC,
context_feature_spec=context_feature_spec,
example_feature_spec=example_feature_spec,
list_size=_LIST_SIZE,
receiver_name="input_ranking_data",
default_batch_size=None)
return servingInputReceiver
And then pass this to export_saved_model like this:
ranker.export_saved_model('path_to_save_model', example_serving_input_fn())
(ranker here is a tf.estimator.Estimator, maybe you called this 'estimator' in your code)
ranker = tf.estimator.Estimator(
model_fn=model_fn,
model_dir=_MODEL_DIR,
config=run_config)

Training data in two steps with same accuracy?

I am trying to implement active learning machine(an experiment for a project) algorithm, where I want to train separately, please check my code below.
clf = BernoulliNB()
clf.fit(X_train[0:40], y_train[0:40])
clf.fit(X_train[40:], y_train[40:])
The above usually done like this
clf = BernoulliNB()
clf.fit(X_train, y_train)
Both have different accuracy score. I want to add training data to existing model itself since its computationally expensive - I don't want my model to do one more time computation.
Any way I can ?
You should use partial_fit to train your model in batches.
clf = BernoulliNB()
clf.partial_fit(X_train[0:40], y_train[0:40])
clf.partial_fit(X_train[40:], y_train[40:])
Please check this to know more about the function.
Hope this helps:)
This is called online training or Incremental learning used for large data. Please see this page for strategies.
Essentially, in scikit-learn, you need partial_fit() with all the labels in y known in advance.
partial_fit(X, y, classes=None, sample_weight=None)
classes : array-like, shape = [n_classes] (default=None)
List of all the classes that can possibly appear in the y vector. Must be provided at the first call to partial_fit, can be omitted in subsequent calls.
If you simply do this:
clf.partial_fit(X_train[0:40], y_train[0:40])
clf.partial_fit(X_train[40:], y_train[40:])
Then there is a possibility that that if any class which is not present in the first 40 samples, and comes in next iterations of partial_fit(), then it will throw an error.
So ideally you should be doing this:
# First call
clf.partial_fit(X_train[0:40], y_train[0:40], classes = np.unique(y_train))
# subsequent calls
clf.partial_fit(X_train[40:80], y_train[40:80])
clf.partial_fit(X_train[80:], y_train[80:])
and so on..

Mxnet Gluon custom data iterator

I have written a custom data iterator using mx.io.DataIter class. What's the easiest way to use this data iterator with Gluon interface?
I went through the documentation and couldn't find an easy way to do so. One of my idea was to use it as iterator and get data adn label from each batch as follows.
for e in range(epochs):
train_iter.reset()
for batch_data in train_iter:
data = nd.concatenate(([d for d in batch_data.data]))
label = nd.concatenate(([l for l in batch_data.label]))
with autograd.record():
output = net(data)
loss = softmax_cross_entropy(output, label)
loss.backward()
trainer.step(batch_size)
print(nd.mean(loss).asscalar())
But this may not be optimal as I need to concatenate per batch.
What's the optimal way to achieve this? i.e. is there a systematic
way to write a simple custom iterator for gluon?
How do I add context information in above cases?
I think your approach works. Basically you can get data from batch_data.data and label from batch_data.label and feed them into the network.
I'm not sure why you need to concat the data and labels - maybe it's to do with your network definition.
If you ever need to split the data and train on multiple GPUs, you can use the gluon.utils.split_and_load function to do that.