xgboost library 0.9 vs 1.3.3 - xgboost

I spent some time tuning the hyper-parameters of an XGBRegressor based on xgboost version 0.9 (Python from Anaconda on Windows).
After installing the latest Anaconda and xgboost (version 1.3.3 this time) on a different PC, I noticed different results from the same code. Here is a simplified version:
from xgboost import XGBRegressor
params = {
    'max_depth': 9,
    'n_estimators': 400,
    'learning_rate': 0.1,
    'min_child_weight': 2,
    'reg_alpha': 0.05,
    'reg_lambda': 1,
    'colsample_bytree': .8,
    'colsample_bynode': 1,
    'colsample_bylevel': .9,
    'objective': 'reg:squarederror'
}
model = XGBRegressor(**params)
model.fit(X, Y)
When I use the model to make predictions, I get different results with xgboost 1.3.3. I tried to match the hyper-parameters by adding all the new default parameters I found in the 1.3.3 library:
    'base_score': 0.5,
    'booster': 'gbtree',
    'gamma': 0,
    'max_delta_step': 0,
    'random_state': 0,
    'scale_pos_weight': 1,
    'subsample': 1,
    'seed': 0
but the results still differ. Can you help me adapt the code so that I get the same results in the new environment? I need this to avoid redoing the tuning.
Thank you!

You need to find out which version of XGBoost you used to get the desired results. If it was xgboost 0.9, you need to install that version. To do so, first uninstall xgboost
pip uninstall xgboost
and then install xgboost 0.9:
pip install xgboost==0.9
You can also look for updates here
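After reinstalling, a quick sanity check (a minimal sketch) is to confirm the interpreter is picking up the pinned release:
import xgboost
print(xgboost.__version__)  # should match the version you just installed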

Related

Getting KeyError : 'callable_inputs' when trying to save a TF model in S3 bucket

I'm using sagemaker 2.5.1 and tensorflow 2.3.0.
The weird part is that the same code worked before; the only change I can think of is the new release of these two libraries.
This appears to be a bug with SageMaker.
I'm assuming you are using a TensorFlow estimator to train the model. Something like this:
estimator = TensorFlow(
    entry_point='script.py',
    role=role,
    train_instance_count=1,
    train_instance_type='ml.p3.2xlarge',
    framework_version='2.3.0',
    py_version='py37',
    script_mode=True,
    hyperparameters={
        'epochs': 100,
        'batch-size': 256,
        'learning-rate': 0.001
    }
)
If that's the case, either TensorFlow 2.2 or TensorFlow 2.3 triggers this error when debugger callbacks are enabled. To fix the issue, you can set debugger_hook_config to False:
estimator = TensorFlow(
    entry_point='script.py',
    role=role,
    train_instance_count=1,
    train_instance_type='ml.p3.2xlarge',
    framework_version='2.3.0',
    py_version='py37',
    script_mode=True,
    debugger_hook_config=False,
    hyperparameters={
        'epochs': 100,
        'batch-size': 256,
        'learning-rate': 0.001
    }
)
The problem actually comes from smdebug version 0.9.1.
Downgrading to 0.8.1 solves the issue.
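If you want to pin the working version explicitly (assuming pip manages your environment), a minimal sketch:
pip uninstall smdebug
pip install smdebug==0.8.1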

How to use the efficientnet-lite provided by tfhub for further training (fine-tuning) on tf2.1

The version I use is tensorflow-gpu version 2.1.0, installed from pip.
import tensorflow as tf
import tensorflow_hub as hub

tf.keras.backend.set_learning_phase(True)
module_url = "https://tfhub.dev/tensorflow/efficientnet/lite0/classification/2"
module2 = tf.keras.Sequential([
    hub.KerasLayer(module_url, trainable=False, input_shape=(224, 224, 3))
])
output1 = module2(tf.ones(shape=(1, 224, 224, 3)))
print(module2.summary())
When I set trainable=True, the call raises an error.
So, can't I retrain it on TF 2.1?
The EfficientNet-Lite models on TFHub are based on TensorFlow 1 and are therefore subject to many restrictions under TF2, including the inability to fine-tune them, as you've discovered. The EfficientNet models were updated to TF2, but we're still waiting for their Lite counterparts.
https://www.tensorflow.org/hub/model_compatibility
https://github.com/tensorflow/hub/issues/751
UPDATE: Beginning October 5, 2021, the EfficientNet-Lite models on TFHub are available for TensorFlow 2.
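With a TF2-compatible release, fine-tuning comes down to setting trainable=True on the hub layer. A minimal sketch, assuming the handle above now serves a TF2 SavedModel; the classification head and its size are purely illustrative (in practice you would likely use the model's feature-vector variant under a new head):
import tensorflow as tf
import tensorflow_hub as hub

# Assumption: a TF2-compatible EfficientNet-Lite SavedModel is available at this handle.
module_url = "https://tfhub.dev/tensorflow/efficientnet/lite0/classification/2"

model = tf.keras.Sequential([
    hub.KerasLayer(module_url, trainable=True, input_shape=(224, 224, 3)),
    tf.keras.layers.Dense(5, activation='softmax')  # hypothetical head for your own classes
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.summary()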

How to get same model and prediction in scikit-learn 0.17.1 and 0.22.1?

My current code is written in Python 2.7 using scikit-learn 0.17.1. I am trying to move to the latest versions: Python 3.7 and scikit-learn 0.22.1. For the exact same feature set, my model object after calling .fit is different: the values of coef_ do not match at all, and as a result my predictions do not match either. Is this expected?
This is a big code-rewrite exercise, and in the end I need to verify that my changes work correctly.
My initial test plan was to compare predictions for the exact same inputs between the old code and the new code. But if the predictions are going to differ, I will have to check instead that prediction accuracy has improved in the new code, and I would still not be sure whether I missed some transformation.
If you have faced a similar problem, please advise what you did in such a scenario.
Here is the code I am using, which is the same in 0.17.1 and 0.22.1:
model = ElasticNetCV(l1_ratio=[0.5, 0.75, 0.95, 0.99, 1.0], normalize=True, n_jobs=1)
model.fit(features.values, targets.values)
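If you go the route of comparing predictions between the two environments, a tolerance-based comparison is a useful first check. A minimal sketch, assuming predictions from each environment have been saved as NumPy arrays (the file names are illustrative):
import numpy as np

# Hypothetical files: predictions exported from each environment.
old_preds = np.load('preds_sklearn_0_17_1.npy')
new_preds = np.load('preds_sklearn_0_22_1.npy')

print(np.allclose(old_preds, new_preds, rtol=1e-4, atol=1e-6))
print(np.max(np.abs(old_preds - new_preds)))  # largest absolute difference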

Beam Search Decoder Tensorflow 2.0

I am looking to implement a sequence to sequence neural net with attention and beam search in Tensorflow 2.0 alpha. While the tutorials on their website have been very useful, I am having trouble figuring out the best way to implement beam search since the contrib library is deprecated - can anyone point me in the right direction?
I tried to use TF 2.0's upgrade script to upgrade my TensorFlow 1.x beam search to 2.0, but it does not support the contrib library.
This is how the beam search code looked in 1.x:
decoder = tf.contrib.seq2seq.BeamSearchDecoder(
    cell=decoder_cell,
    embedding=self.embeddings,
    start_tokens=tf.fill([self.batch_size], tf.constant(2)),
    end_token=tf.constant(3),
    initial_state=initial_state,
    beam_width=self.beam_width,
    output_layer=self.projection_layer
)
outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(
    decoder, output_time_major=True, maximum_iterations=summary_max_len, scope=decoder_scope)
self.prediction = tf.transpose(outputs.predicted_ids, perm=[1, 2, 0])
Some of the TensorFlow 1.x APIs were moved to different locations in TensorFlow 2.x.
tf.contrib is one such module; parts of it moved to TensorFlow Addons.
In particular, tf.contrib.seq2seq.BeamSearchDecoder became tfa.seq2seq.BeamSearchDecoder in TF 2.x:
tfa.seq2seq.BeamSearchDecoder(
    cell: tf.keras.layers.Layer,
    beam_width: int,
    embedding_fn: Optional[Callable] = None,
    output_layer: Optional[tf.keras.layers.Layer] = None,
    length_penalty_weight: tfa.types.FloatTensorLike = 0.0,
    coverage_penalty_weight: tfa.types.FloatTensorLike = 0.0,
    reorder_tensor_arrays: bool = True,
    **kwargs
)
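As a rough sketch of how the 1.x snippet above might map onto TensorFlow Addons (decoder_cell, embeddings, initial_state, projection_layer, beam_width, batch_size and summary_max_len are the placeholders from the question; check the tfa.seq2seq docs for the exact call pattern of your Addons version):
import tensorflow as tf
import tensorflow_addons as tfa

decoder = tfa.seq2seq.BeamSearchDecoder(
    cell=decoder_cell,
    beam_width=beam_width,
    output_layer=projection_layer,
    maximum_iterations=summary_max_len,  # forwarded to the base decoder
)

# The decoder state must be tiled to the beam width, and the decoder instance
# is called with the embedding matrix plus start/end token ids.
tiled_state = tfa.seq2seq.tile_batch(initial_state, multiplier=beam_width)
outputs, _, _ = decoder(
    embeddings,
    start_tokens=tf.fill([batch_size], 2),
    end_token=3,
    initial_state=tiled_state,
)
prediction = tf.transpose(outputs.predicted_ids, perm=[0, 2, 1])  # [batch, beam, time]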

Keras / tensorflow - limit number of cores (intra_op_parallelism_threads not working)

I've been trying to run Keras on a CPU cluster, and for this I need to limit the number of cores used (it's a shared system). To do so, I landed on this answer. However, it simply doesn't work. I tried running this basic code:
from keras.applications.vgg16 import VGG16
from keras import backend as K
import numpy as np
conf = K.tf.ConfigProto(device_count={'CPU': 1},
                        intra_op_parallelism_threads=2,
                        inter_op_parallelism_threads=2)
K.set_session(K.tf.Session(config=conf))
model = VGG16(weights='imagenet', include_top=False)
x = np.random.randn(1000, 224, 224, 3)
features = model.predict(x)
When I run this and check htop, it uses all (128) logical cores. Is this a bug in Keras, or am I doing something wrong?
Keras also warns that my CPU supports SSE4.1 and SSE4.2, which are not used because I didn't compile TensorFlow from source. Would compiling from source also fix the original problem?
EDIT: I've found a workaround when launching the keras script from a unix machine:
taskset -c 0-23 python keras_script.py
This will run the script on the first 24 cores of the machine. It works, but it would still be nice if this was available from within keras/tensorflow.
I found this snippet of code that works for me, hope it helps:
from keras import backend as K
import tensorflow as tf
jobs = 2  # number of cores/threads to use
config = tf.ConfigProto(intra_op_parallelism_threads=jobs,
                        inter_op_parallelism_threads=jobs,
                        allow_soft_placement=True,
                        device_count={'CPU': jobs})
session = tf.Session(config=config)
K.set_session(session)
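If you are on TensorFlow 2.x, where ConfigProto and Session are gone, a roughly equivalent sketch uses the tf.config.threading API (these calls must run before any other TensorFlow work):
import tensorflow as tf

jobs = 2  # number of threads to allow

# Must be set before TensorFlow initializes its thread pools,
# i.e. before building or running any model.
tf.config.threading.set_intra_op_parallelism_threads(jobs)
tf.config.threading.set_inter_op_parallelism_threads(jobs)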