Is there any solution for failing to load model in tensorflow2.3? - tensorflow

I try to use tf.keras.models.load_model to load saved model in tensorflow 2.3.
However, I got the same error in
https://github.com/tensorflow/tensorflow/issues/41535
It seems an important function. But this issue is still not solved. Does anyone know if there is any alternative method to implement the same result?

I found an alternative method to load custom model in tensorflow 2.3. You need to do some following changes. I will explain by some code snapshots
for __init__() of custom model. Before,
def __init__(self, mask_ratio=0.1, hyperparam=0.1, **kwargs):
layers = []
layer_configs = {}
if 'layers' in kwargs.keys():
layer_configs = kwargs['layers']
for config in layer_configs:
layer = tf.keras.layers.deserialize(config)
layers.append(layer)
super(custom_model, self).__init__(layers) # custom_model is your custom model class
self.mask_ratio = mask_ratio
self.hyperparam = hyperparam
...
After,
def __init__(self, mask_ratio=0.1, hyperparam=0.1, **kwargs):
super(custom_model, self).__init__() # custom_model is your custom model class
self.mask_ratio = mask_ratio
self.hyperparam = hyperparam
...
define two functions in your custom model class
def get_config(self):
config = {
'mask_ratio': self.mask_ratio,
'hyperparam': self.hyperparam
}
base_config = super(custom_model, self).get_config()
return dict(list(config.items()) + list(base_config.items()))
#classmethod
def from_config(cls, config):
#config = cls().get_config()
return cls(**config)
After finishing training, save model using 'h5' format
model.save(file_path, save_format='h5')
Finally, load model as following codes,
model = tf.keras.models.load_model(model_path, compile=False, custom_objects={'custom_model': custom_model})

Related

Weights of pre-trained BERT model not initialized

I am using the Language Interpretability Toolkit (LIT) to load and analyze a BERT model that I pre-trained on an NER task.
However, when I'm starting the LIT script with the path to my pre-trained model passed to it, it fails to initialize the weights and tells me:
modeling_utils.py:648] loading weights file bert_remote/examples/token-classification/Data/Models/results_21_03_04_cleaned_annotations/04.03._8_16_5e-5_cleaned_annotations/04-03-2021 (15.22.23)/pytorch_model.bin
modeling_utils.py:739] Weights of BertForTokenClassification not initialized from pretrained model: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
modeling_utils.py:745] Weights from pretrained model not used in BertForTokenClassification: ['bert.embeddings.position_ids']
It then simply uses the bert-base-german-cased version of BERT, which of course doesn't have my custom labels and thus fails to predict anything. I think it might have to do with PyTorch, but I can't find the error.
If relevant, here is how I load my dataset into CoNLL 2003 format (modification of the dataloader scripts found here):
def __init__(self):
# Read ConLL Test Files
self._examples = []
data_path = "lit_remote/lit_nlp/examples/datasets/NER_Data"
with open(os.path.join(data_path, "test.txt"), "r", encoding="utf-8") as f:
lines = f.readlines()
for line in lines[:2000]:
if line != "\n":
token, label = line.split(" ")
self._examples.append({
'token': token,
'label': label,
})
else:
self._examples.append({
'token': "\n",
'label': "O"
})
def spec(self):
return {
'token': lit_types.Tokens(),
'label': lit_types.SequenceTags(align="token"),
}
And this is how I initialize the model and start the LIT server (modification of the simple_pytorch_demo.py script found here):
def __init__(self, model_name_or_path):
self.tokenizer = transformers.AutoTokenizer.from_pretrained(
model_name_or_path)
model_config = transformers.AutoConfig.from_pretrained(
model_name_or_path,
num_labels=15, # FIXME CHANGE
output_hidden_states=True,
output_attentions=True,
)
# This is a just a regular PyTorch model.
self.model = _from_pretrained(
transformers.AutoModelForTokenClassification,
model_name_or_path,
config=model_config)
self.model.eval()
## Some omitted snippets here
def input_spec(self) -> lit_types.Spec:
return {
"token": lit_types.Tokens(),
"label": lit_types.SequenceTags(align="token")
}
def output_spec(self) -> lit_types.Spec:
return {
"tokens": lit_types.Tokens(),
"probas": lit_types.MulticlassPreds(parent="label", vocab=self.LABELS),
"cls_emb": lit_types.Embeddings()
This actually seems to be expected behaviour. In the documentation of the GPT models the HuggingFace team writes:
This will issue a warning about some of the pretrained weights not being used and some weights being randomly initialized. That’s because we are throwing away the pretraining head of the BERT model to replace it with a classification head which is randomly initialized.
So it seems to not be a problem for the fine-tuning. In my use case described above it worked despite the warning as well.

In tensorflow, for custom layers that need arguments at instantialion, does the get_config method need overriding?

Ubuntu - 20.04,
Tensorflow - 2.2.0,
Tensorboard - 2.2.1
I have read that one needs to reimplement the config method in order for a custom layer to be serializable.
I have a custom layer that accepts arguments in its __init__. It uses another custom layer and that consumes arguments in its __init__ as well. I can:
Without Tensorboard callbacks:
Use them in a model both in eager model and graph form
Run tf.saved_model.save and it executes without a glich
Load the thus saved model using tf.saved_model.load and it loads the model saved in 2. above
I can call model(input) the loaded model. I can also call 'call_and_return_all_conditional_losses(input)` and they run right as well
With Tensorboard callbacks:
All of the above (can .fit, save, load, predict from loaded etc) except.. While running fit i get
WARNING:tensorflow:Model failed to serialize as JSON. Ignoring... Layer PREPROCESS_MONSOON has arguments in `__init__` and therefore must override `get_config`.
Pasting the entire code here that can be run end to end. You just need to have tensorflow 2 installed. Please delete/add the callbacks (only tensorboard callbacks is there) to .fit to see the two behaviors mentioned above
import pandas as pd
import tensorflow as tf
from tensorflow.keras import layers as l
from tensorflow import keras as k
import numpy as np
##making empty directories
import os
os.makedirs('r_data',exist_ok=True)
os.makedirs('r_savedir',exist_ok=True)
#Preparing the dataset
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train_ = pd.DataFrame(x_train.reshape(60000,-1),columns = ['col_'+str(i) for i in range(28*28)])
x_test_ = pd.DataFrame(x_test.reshape(10000,-1),columns = ['col_'+str(i) for i in range(28*28)])
x_train_['col_cat1'] = [np.random.choice(['a','b','c','d','e','f','g','h','i']) for i in range(x_train_.shape[0])]
x_test_['col_cat1'] = [np.random.choice(['a','b','c','d','e','f','g','h','i','j']) for i in range(x_test_.shape[0])]
x_train_['col_cat2'] = [np.random.choice(['a','b','c','d','e','f','g','h','i']) for i in range(x_train_.shape[0])]
x_test_['col_cat2'] = [np.random.choice(['a','b','c','d','e','f','g','h','i','j']) for i in range(x_test_.shape[0])]
x_train_[np.random.choice([True,False],size = x_train_.shape,p=[0.05,0.95]).reshape(x_train_.shape)] = np.nan
x_test_[np.random.choice([True,False],size = x_test_.shape,p=[0.05,0.95]).reshape(x_test_.shape)] = np.nan
x_train_.to_csv('r_data/x_train.csv',index=False)
x_test_.to_csv('r_data/x_test.csv',index=False)
pd.DataFrame(y_train).to_csv('r_data/y_train.csv',index=False)
pd.DataFrame(y_test).to_csv('r_data/y_test.csv',index=False)
#**THE MAIN LAYER THAT WE ARE TALKING ABOUT**
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow import feature_column
import os
class NUM_TO_DENSE(layers.Layer):
def __init__(self,num_cols):
super().__init__()
self.keys = num_cols
self.keys_all = self.keys+[str(i)+'__nullcol' for i in self.keys]
# def get_config(self):
# config = super().get_config().copy()
# config.update({
# 'keys': self.keys,
# 'keys_all': self.keys_all,
# })
# return config
def build(self,input_shape):
def create_moving_mean_vars():
return tf.Variable(initial_value=0.,shape=(),dtype=tf.float32,trainable=False)
self.moving_means_total = {t:create_moving_mean_vars() for t in self.keys}
self.layer_global_counter = tf.Variable(initial_value=0.,shape=(),dtype=tf.float32,trainable=False)
def call(self,inputs, training = True):
null_cols = {k:tf.math.is_finite(inputs[k]) for k in self.keys}
current_means = {}
def compute_update_current_means(t):
current_mean = tf.math.divide_no_nan(tf.reduce_sum(tf.where(null_cols[t],inputs[t],0.),axis=0),\
tf.reduce_sum(tf.cast(tf.math.is_finite(inputs[t]),tf.float32),axis=0))
self.moving_means_total[t].assign_add(current_mean)
return current_mean
if training:
current_means = {t:compute_update_current_means(t) for t in self.keys}
outputs = {t:tf.where(null_cols[t],inputs[t],current_means[t]) for t in self.keys}
outputs.update({str(k)+'__nullcol':tf.cast(null_cols[k],tf.float32) for k in self.keys})
self.layer_global_counter.assign_add(1.)
else:
outputs = {t:tf.where(null_cols[t],inputs[t],(self.moving_means_total[t]/self.layer_global_counter))\
for t in self.keys}
outputs.update({str(k)+'__nullcol':tf.cast(null_cols[k],tf.float32) for k in self.keys})
return outputs
class PREPROCESS_MONSOON(layers.Layer):
def __init__(self,cat_cols_with_unique_values,num_cols):
'''cat_cols_with_unqiue_values: (dict) {'col_cat':[unique_values_list]}
num_cols: (list) [num_cols_name_list]'''
super().__init__()
self.cat_cols = cat_cols_with_unique_values
self.num_cols = num_cols
# def get_config(self):
# config = super().get_config().copy()
# config.update({
# 'cat_cols': self.cat_cols,
# 'num_cols': self.num_cols,
# })
# return config
def build(self,input_shape):
self.ntd = NUM_TO_DENSE(self.num_cols)
self.num_colnames = self.ntd.keys_all
self.ctd = {k:layers.DenseFeatures\
(feature_column.embedding_column\
(feature_column.categorical_column_with_vocabulary_list\
(k,v),tf.cast(tf.math.ceil(tf.math.log(tf.cast(len(self.cat_cols[k]),tf.float32))),tf.int32).numpy()))\
for k,v in self.cat_cols.items()}
self.cat_colnames = [i for i in self.cat_cols]
self.dense_colnames = self.num_colnames+self.cat_colnames
def call(self,inputs,training=True):
dense_num_d = self.ntd(inputs,training=training)
dense_cat_d = {k:self.ctd[k](inputs) for k in self.cat_colnames}
dense_num = tf.stack([dense_num_d[k] for k in self.num_colnames],axis=1)
dense_cat = tf.concat([dense_cat_d[k] for k in self.cat_colnames],axis=1)
dense_all = tf.concat([dense_num,dense_cat],axis=1)
return dense_all
##Inputs
label_path = 'r_data/y_train.csv'
data_path = 'r_data/x_train.csv'
max_epochs = 100
batch_size = 32
shuffle_seed = 42
##Creating layer inputs
dfs = pd.read_csv(data_path,nrows=1)
cdtypes_x = dfs.dtypes
nc = list(dfs.select_dtypes(include=[int,float]).columns)
oc = list(dfs.select_dtypes(exclude=[int,float]).columns)
cdtypes_y = pd.read_csv(label_path,nrows=1).dtypes
dfc = pd.read_csv(data_path,usecols=oc)
ccwuv = {i:list(pd.Series(dfc[i].unique()).dropna()) for i in dfc.columns}
preds_name = pd.read_csv(label_path,nrows=1).columns
##creating datasets
dataset = tf.data.experimental.make_csv_dataset(
'r_data/x_train.csv',batch_size, column_names=cdtypes_x.index,prefetch_buffer_size=1,
shuffle=True,shuffle_buffer_size=10000,shuffle_seed=shuffle_seed)
labels = tf.data.experimental.make_csv_dataset(
'r_data/y_train.csv',batch_size, column_names=cdtypes_y.index,prefetch_buffer_size=1,
shuffle=True,shuffle_buffer_size=10000,shuffle_seed=shuffle_seed)
dataset = tf.data.Dataset.zip((dataset,labels))
##CREATING NETWORK
p = PREPROCESS_MONSOON(cat_cols_with_unique_values=ccwuv,num_cols=nc)
indict = {}
for i in nc:
indict[i] = k.Input(shape = (), name=i,dtype=tf.float32)
for i in ccwuv:
indict[i] = k.Input(shape=(), name=i,dtype=tf.string)
x = p(indict)
x = l.BatchNormalization()(x)
x = l.Dense(10,activation='relu',name='dense_1')(x)
predictions = l.Dense(10,activation=None,name=preds_name[0])(x)
model = k.Model(inputs=indict,outputs=predictions)
##Compiling model
model.compile(optimizer=k.optimizers.Adam(),
loss=k.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['sparse_categorical_accuracy'])
##callbacks
log_dir = './tensorboard_dir/no_config'
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
## Fit model on training data
history = model.fit(dataset,
batch_size=64,
epochs=30,
steps_per_epoch=5,
validation_split=0.,
callbacks = [tensorboard_callback])
#saving the model
tf.saved_model.save(model,'r_savedir')
#loading the model
model = tf.saved_model.load('r_savedir')
##Predicting on loaded model
for i in dataset:
print(model(i[0],training=False))
break
I have commented out the part from the code where i override the config files in my custom layers and you can comment them in and the Warning about the layers not being serializable would go away.
Question:
Do i or do i not need to override the config method in order to make a custom layer that accepts arguments in __init__ serializable?
Thank you in advance for help
You must add 'get_config' to your code
def get_config(self):
config = super().get_config()
return config
The NUM_TO_DENSE class must be like this
class NUM_TO_DENSE(layers.Layer):
def __init__(self,num_cols):
super().__init__()
self.keys = num_cols
self.keys_all = self.keys+[str(i)+'__nullcol' for i in self.keys]
def get_config(self):
config = super().get_config()
return config

Balanced Accuracy Score in Tensorflow

I am implementing a CNN for an highly unbalanced classification problem and I would like to implement custum metrics in tensorflow to use the Select Best Model callback.
Specifically I would like to implement the balanced accuracy score, which is the average of the recall of each class (see sklearn implementation here), does someone know how to do it?
I was facing the same issue so I implemented a custom class based off SparseCategoricalAccuracy:
class BalancedSparseCategoricalAccuracy(keras.metrics.SparseCategoricalAccuracy):
def __init__(self, name='balanced_sparse_categorical_accuracy', dtype=None):
super().__init__(name, dtype=dtype)
def update_state(self, y_true, y_pred, sample_weight=None):
y_flat = y_true
if y_true.shape.ndims == y_pred.shape.ndims:
y_flat = tf.squeeze(y_flat, axis=[-1])
y_true_int = tf.cast(y_flat, tf.int32)
cls_counts = tf.math.bincount(y_true_int)
cls_counts = tf.math.reciprocal_no_nan(tf.cast(cls_counts, self.dtype))
weight = tf.gather(cls_counts, y_true_int)
return super().update_state(y_true, y_pred, sample_weight=weight)
The idea is to set each class weight inversely proportional to its size.
This code produces some warnings from Autograph but I believe those are Autograph bugs, and the metric seems to work fine.
There are 3 ways I can think of tackling the situation :-
1)Random Under-sampling - In this method you can randomly remove samples from the majority classes.
2)Random Over-sampling - In this method you can increase the samples by replicating them.
3)Weighted cross entropy - You can also use weighted cross entropy so that the loss value can be compensated for the minority classes. See here
I have personally tried method2 and it does increase my accuracy by significant value but it may vary from dataset to dataset
NOTE
It appears that the implementation/API of the Recall class, which I used as a template for my answer, has been modified in the newer TF versions (as pointed out by #guilaumme-gaudin), so I recommend you look at the Recall implementation used in your current TF version and take it from there to implement the metric using the same approach I describe in the original post, this way I don't have to update my answer every time the TF team modifies the implementation/API of its metrics.
Original post
I'm not an expert in Tensorflow but using a bit of pattern matching between metrics implementations in the tf source code I came up with this
from tensorflow.python.keras import backend as K
from tensorflow.python.keras.metrics import Metric
from tensorflow.python.keras.utils import metrics_utils
from tensorflow.python.ops import init_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.keras.utils.generic_utils import to_list
class BACC(Metric):
def __init__(self, thresholds=None, top_k=None, class_id=None, name=None, dtype=None):
super(BACC, self).__init__(name=name, dtype=dtype)
self.init_thresholds = thresholds
self.top_k = top_k
self.class_id = class_id
default_threshold = 0.5 if top_k is None else metrics_utils.NEG_INF
self.thresholds = metrics_utils.parse_init_thresholds(
thresholds, default_threshold=default_threshold)
self.true_positives = self.add_weight(
'true_positives',
shape=(len(self.thresholds),),
initializer=init_ops.zeros_initializer)
self.true_negatives = self.add_weight(
'true_negatives',
shape=(len(self.thresholds),),
initializer=init_ops.zeros_initializer)
self.false_positives = self.add_weight(
'false_positives',
shape=(len(self.thresholds),),
initializer=init_ops.zeros_initializer)
self.false_negatives = self.add_weight(
'false_negatives',
shape=(len(self.thresholds),),
initializer=init_ops.zeros_initializer)
def update_state(self, y_true, y_pred, sample_weight=None):
return metrics_utils.update_confusion_matrix_variables(
{
metrics_utils.ConfusionMatrix.TRUE_POSITIVES: self.true_positives,
metrics_utils.ConfusionMatrix.TRUE_NEGATIVES: self.true_negatives,
metrics_utils.ConfusionMatrix.FALSE_POSITIVES: self.false_positives,
metrics_utils.ConfusionMatrix.FALSE_NEGATIVES: self.false_negatives,
},
y_true,
y_pred,
thresholds=self.thresholds,
top_k=self.top_k,
class_id=self.class_id,
sample_weight=sample_weight)
def result(self):
"""
Returns the Balanced Accuracy (average between recall and specificity)
"""
result = (math_ops.div_no_nan(self.true_positives, self.true_positives + self.false_negatives) +
math_ops.div_no_nan(self.true_negatives, self.true_negatives + self.false_positives)) / 2
return result
def reset_states(self):
num_thresholds = len(to_list(self.thresholds))
K.batch_set_value(
[(v, np.zeros((num_thresholds,))) for v in self.variables])
def get_config(self):
config = {
'thresholds': self.init_thresholds,
'top_k': self.top_k,
'class_id': self.class_id
}
base_config = super(BACC, self).get_config()
return dict(list(base_config.items()) + list(config.items()))
I've simply taken the Recall class implementation from the source code as a template and I extended it to make sure it has a TP,TN,FP and FN defined.
After that I modified the result method so that it calculates balanced accuracy and voila :)
I compared the results from this with sklearn's balanced accuracy score and the values matched so I think it's correct, but do double check just in case.
I have not tested this code yet, but looking at the source code of tensorflow==2.1.0, this might work for the binary classification case:
from tensorflow.keras.metrics import Recall
from tensorflow.python.ops import math_ops
class BalancedBinaryAccuracy(Recall):
def result(self):
result = (math_ops.div_no_nan(self.true_positives, self.true_positives + self.false_negatives) +
math_ops.div_no_nan(self.true_negatives, self.true_negatives + self.false_positives)) / 2
return result[0] if len(self.thresholds) == 1 else result
As an alternative to writing a custom metric, you can write a custom callback using the metrics already implemented ad available via the training logs. For example you can define the training balanced accuracy callback like this:
class TrainBalancedAccuracyCallback(tf.keras.callbacks.Callback):
def __init__(self, **kargs):
super(TrainBalancedAccuracyCallback, self).__init__(**kargs)
def on_epoch_end(self, epoch, logs={}):
train_sensitivity = logs['tp'] / (logs['tp'] + logs['fn'])
train_specificity = logs['tn'] / (logs['tn'] + logs['fp'])
logs['train_sensitivity'] = train_sensitivity
logs['train_specificity'] = train_specificity
logs['train_balacc'] = (train_sensitivity + train_specificity) / 2
print('train_balacc', logs['train_balacc'])
and the same for the validation:
class ValBalancedAccuracyCallback(tf.keras.callbacks.Callback):
def __init__(self, **kargs):
super(ValBalancedAccuracyCallback, self).__init__(**kargs)
def on_epoch_end(self, epoch, logs={}):
val_sensitivity = logs['val_tp'] / (logs['val_tp'] + logs['val_fn'])
val_specificity = logs['val_tn'] / (logs['val_tn'] + logs['val_fp'])
logs['val_sensitivity'] = val_sensitivity
logs['val_specificity'] = val_specificity
logs['val_balacc'] = (val_sensitivity + val_specificity) / 2
print('val_balacc', logs['val_balacc'])
and then you can use these as values to the callback argument of the fit method of the model.

A custom alternating update rule with keras

I would like to use an alternating update rule with keras.
I.e. per-batch I would like to call a regular gradient-based step, and next call a custom step.
I thought about implementing it by either inheriting an optimizer or a callback (and use the on-batch calls). However, neither would do, because they both lack the batch-data and batch-labels (and I need both).
Any idea on how to implement a custom alternating update with keras?
If required, I don't mind directly calling tensorflow specific methods, as long as I can keep use the project wrapped with the keras framework (with model.fit, model.predict .. )
try create a custom callback
import keras.callbacks as callbacks
class JSONMetrics(callbacks.Callback):
_model = None
_each_epoch = None
_metrics = None
_epoch = None
_file_json = None
def __init__(self,model,each_epoch,logger=None):
self._file_json = "file_log.json"
self._model = model
self._each_epoch= each_epoch
self._epoch = 0
self._metrics = {'loss':[], 'acc':[]}
def on_epoch_begin(self, epoch, logs):
# print('Epoch {0} begin'.format(epoch))
try:
with open(self._file_json, 'r') as f:
self._metrics = json.load(f)
def on_epoch_end(self, epoch, logs):
self._logger.info('Nemesis: Epoch {0} end'.format(epoch))
self._metrics['loss'].append(logs.get('loss'))
self._metrics['acc'].append(logs.get('acc'))
with open(self._file_json, 'w') as f:
data = json.dump(self._metrics, f)
if self._epoch % self._each_epoch == 0:
file_name = 'weights%08d.h5' % self._epoch
#print('Saving weights at {0} file'.format(file_name))
self._model.save_weights(file_name)
self._epoch += 1
You can evoke the self.model to solve your problem and save the acc and loss for example.

Custom scikit-learn pickling doesn't work inside a grid search

I have written a scikit-learn estimator. It has a parameter and a model_ attribute that is set by fit.
class MyEstimator(BaseEstimator, TransformerMixin):
def __init__(self, param="default"):
self.param = param
self.model_ = None
def fit(self, x, y):
# Sets the value of self.model_
I want to be able to pickle MyEstimator, but the model_ object I create cannot be serialized with pickle because it is a keras model. Following the example of the blog post "Pickling Keras Models" I added the following pickling handler methods to my class.
class MyEstimator(BaseEstimator, TransformerMixin):
def __getstate__(self):
state = super().__getstate__().copy()
with tempfile.NamedTemporaryFile(suffix=".hdf5", delete=True) as fd:
keras.models.save_model(self.model_, fd.name, overwrite=True)
state["model_"] = fd.read()
return state
def __setstate__(self, state):
super().__setstate__(state)
with tempfile.NamedTemporaryFile(suffix=".hdf5", delete=True) as fd:
fd.write(state["model_"])
fd.flush()
self.__dict__["model_"] = keras.models.load_model(fd.name)
This replaces the unpickleable model_ member with a representation generated by keras' serializer that can be pickled. Using this customization I can call fit, serialize and deserialize, and get back my original model. Everything works.
e = MyEstimator()
e.fit(x, y)
with open("myfile.pk", mode="wb") as f:
pickle.dump(e, f)
with open("myfile.pk", mode="rb") as f:
pickle.load(f) # Returns a copy of e
However, serialization does not work when I try to put MyEstimator in a pipeline and pickle the result of a GridSearchCV.
s = GridSearchCV(Pipeline([
# ...
("estimator", MyEstimator())
# ...
]))
s.fit(x, y)
with open("myfile.pk", mode="wb") as f:
pickle.dump(s, f)
During the pickle.dump call I expect to see MyEstimator.__getstate__ get called with a fitted self.model_ object. (This is what happens when I serialize the model by itself, outside the grid search.) Instead self.model_ is None, so I am unable to serialize the best_estimator_ generated by my grid search.
It looks like grid search serialization is instantiating a new MyEstimator object instead of using the one that was in the pipeline. This seems wrong to me. I've looked through the scikit-learn code, but can't see where this is happening.
Is this a bug in scikit-learn, or am I doing something wrong?
(Note: keras does have a wrapper layer that can convert some keras models into scikit-learn estimators, but I can't use that here for other reasons and I'm not sure it wouldn't just have the same problem.)
The search object contains a mixed of MyEstimator objects, some of which have not had fit called on them. The fix is to check if model_ is None before trying to serialize it with the keras tools.
class MyEstimator(BaseEstimator, TransformerMixin):
def __getstate__(self):
state = super().__getstate__().copy()
if self.model_ is not None:
with tempfile.NamedTemporaryFile(suffix=".hdf5", delete=True) as fd:
keras.models.save_model(self.model_, fd.name, overwrite=True)
state["model_"] = fd.read()
return state
def __setstate__(self, state):
super().__setstate__(state)
if self.model_ is not None:
with tempfile.NamedTemporaryFile(suffix=".hdf5", delete=True) as fd:
fd.write(state["model_"])
fd.flush()
self.__dict__["model_"] = keras.models.load_model(fd.name)
I don't know why there would be any unfitted models in the search object after the grid search had completed, but there are.