GridSearch for XGBoost

I am reading the grid search tutorial for XGBoost on Analytics Vidhya.
It has the following code:
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

param_test1 = {'max_depth': range(3, 10, 2), 'min_child_weight': range(1, 6, 2)}
gsearch1 = GridSearchCV(
    estimator=XGBClassifier(
        learning_rate=0.1,
        n_estimators=140,
        max_depth=5,
        min_child_weight=1,
        gamma=0,
        subsample=0.8,
        colsample_bytree=0.8,
        objective='binary:logistic',
        nthread=4,
        scale_pos_weight=1,
        seed=27),
    param_grid=param_test1,
    scoring='roc_auc',
    n_jobs=4, iid=False,
    cv=5)
My question is: if the grid search is used to find better values for max_depth and min_child_weight, why are these two parameters also set in gsearch1 (to 5 and 1, respectively)?
Moreover, in my own code, when I comment these two lines out, the result changes. Why is that?
Thanks
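For reference, GridSearchCV clones the estimator and, for each point on the grid, overrides every parameter named in param_grid via set_params, so the constructor values only act as defaults for parameters the grid does not cover. A minimal sketch of that behaviour (the numbers here are illustrative, not from the tutorial):

from sklearn.base import clone
from xgboost import XGBClassifier

base = XGBClassifier(max_depth=5, min_child_weight=1, n_estimators=140)
candidate = clone(base)
# This is effectively what GridSearchCV does for each grid point:
candidate.set_params(max_depth=7, min_child_weight=3)
print(candidate.get_params()['max_depth'])         # 7, not 5
print(candidate.get_params()['min_child_weight'])  # 3, not 1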


Hierarchical Bayesian Linear Regression using PyMC3 is super slow

I am trying to write some code implementing a hierarchical Bayesian model (HBM) for logistic regression, using the Adult dataset from the UCI repository.
I have already written the code, but sampling is super slow, on the order of 107 seconds per sample, even with only 64 dimensions or features. Am I doing something wrong?
I am attaching the code for reference. Following earlier suggestions, I also rescaled the data to try to speed things up, to no avail.
I appreciate any feedback.
The code is a mixture of what has been written here and here.
# Reload the dataset, this time keeping the country as a column
# for hierarchical modeling instead of one-hot encoding it
adult_df = pd.read_csv('adult.data', header=None, sep=', ')
adult_df.columns = ["Age", "WorkClass", "fnlwgt", "Education", "EducationNum",
                    "MaritalStatus", "Occupation", "Relationship", "Race", "Gender",
                    "CapitalGain", "CapitalLoss", "HoursPerWeek", "NativeCountry", "Income"]
adult_df["Income"] = adult_df["Income"].map({"<=50K": 0, ">50K": 1})
adult_df.drop("CapitalGain", axis=1, inplace=True)
adult_df.drop("CapitalLoss", axis=1, inplace=True)
adult_df.Age = adult_df.Age.astype(float)
adult_df.fnlwgt = adult_df.fnlwgt.astype(float)
adult_df.EducationNum = adult_df.EducationNum.astype(float)
adult_df.HoursPerWeek = adult_df.HoursPerWeek.astype(float)
# dropping native country here!!
adult_df = pd.get_dummies(adult_df, columns=[
    "WorkClass", "Education", "MaritalStatus", "Occupation", "Relationship",
    "Race", "Gender",
])
standard_scaler_cols = ["Age", "fnlwgt", "EducationNum", "HoursPerWeek"]
other_cols = list(set(adult_df.columns) - set(standard_scaler_cols))
mapper = DataFrameMapper(
    [([col], StandardScaler()) for col in standard_scaler_cols] +
    [(col, None) for col in other_cols]
)
le = preprocessing.LabelEncoder()
country_idx = le.fit_transform(adult_df['NativeCountry'])
y_all = adult_df["Income"].values
pd.value_counts(pd.Series(y_all))
adult_df.drop("Income", axis=1, inplace=True)
adult_df.drop("NativeCountry", axis=1, inplace=True)
n_countries = len(set(country_idx))
n_features = len(adult_df.columns)
min_max_scaler = preprocessing.MinMaxScaler()
adult_df = min_max_scaler.fit_transform(adult_df)
X_train, X_test, y_train, y_test, country_idx_train, country_idx_test = train_test_split(
    adult_df, y_all, country_idx, train_size=0.1, test_size=0.25,
    stratify=y_all, random_state=rs)
with pm.Model() as multilevel_model:
    # Hyperpriors for the coefficients
    mu_theta = pm.MvNormal(name='mu_a', mu=np.zeros(n_features),
                           cov=np.eye(n_features), shape=n_features)
    packed_L_theta = pm.LKJCholeskyCov('packed_L', n=n_features,
                                       eta=2., sd_dist=pm.HalfCauchy.dist(2.5))
    L_theta = pm.expand_packed_triangular(n_features, packed_L_theta)
    theta = pm.MvNormal(mu=mu_theta, name='mu_theta', chol=L_theta,
                        shape=[n_countries, n_features])
    # Hyperpriors for the intercept (Comment 1)
    mu_b = pm.StudentT('mu_b', nu=3, mu=0., sd=1.0)
    sigma_b = pm.HalfNormal('sigma_b', sd=1.0)
    b = pm.Normal('b', mu=mu_b, sd=sigma_b, shape=[n_countries, 1])
    # Calculate predictions given values for intercept and slope
    yhat = pm.invlogit(b[country_idx_train] +
                       pm.math.dot(theta[country_idx_train], np.asarray(X_train).T))
    # Make predictions fit reality
    y = pm.Binomial('y', n=np.ones(y_train.shape[0]), p=yhat, observed=y_train)
You will probably have more success on our Discourse with PyMC3 questions: https://discourse.pymc.io/ I invite you to move your question there.
The first thing I would check is whether your Theano is compiling against MKL libraries, or maybe even falling back to Python mode. If you installed things via conda, that should give you MKL; if you're using pip, it might be more difficult. http://deeplearning.net/software/theano/troubleshooting.html#test-blas
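A quick way to check this from Python (a sketch; theano.config is Theano's documented configuration object, and check_blas.py ships with Theano under theano/misc/):

import theano

# Empty ldflags usually means Theano fell back to its slow default BLAS
print(theano.config.blas.ldflags)
print(theano.config.device)

# Theano also ships a BLAS benchmark script, runnable as:
#   python `python -c "import theano, os; print(os.path.dirname(theano.__file__))"`/misc/check_blas.py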

How to export adversarial examples for Facenet in Cleverhans?

I am trying to follow this blog https://brunolopezgarcia.github.io/2018/05/09/Crafting-adversarial-faces.html to generate adversarial face images against Facenet. The code is here https://github.com/tensorflow/cleverhans/tree/master/examples/facenet_adversarial_faces and works fine! My question is: how can I export these adversarial images? Perhaps the question is too straightforward, which is why the blog doesn't mention it and only shows some sample pictures.
I was thinking it is not a hard problem, since I know the generated adversarial samples are in adv. But this adv (float32) came from faces1, after being prewhitened and normalized. To restore integer images from adv (float32), I have to reverse the normalization and the prewhitening. It seems that if we want to output images from Facenet, we have to go through this process.
I am new to Facenet and Cleverhans, so I am not sure whether this is the best way to do it, or whether there is a common way (such as ready-made functions) to export images from Facenet.
In facenet_fgsm.py, we finally get the adversarial samples. I need to export adv as plain integer images.
adv = sess.run(adv_x, feed_dict=feed_dict)
In set_loader.py, there is some kind of normalization:
def load_testset(size):
    # Load images paths and labels
    pairs = lfw.read_pairs(pairs_path)
    paths, labels = lfw.get_paths(testset_path, pairs, file_extension)

    # Random choice
    permutation = np.random.choice(len(labels), size, replace=False)
    paths_batch_1 = []
    paths_batch_2 = []
    for index in permutation:
        paths_batch_1.append(paths[index * 2])
        paths_batch_2.append(paths[index * 2 + 1])

    labels = np.asarray(labels)[permutation]
    paths_batch_1 = np.asarray(paths_batch_1)
    paths_batch_2 = np.asarray(paths_batch_2)

    # Load images
    faces1 = facenet.load_data(paths_batch_1, False, False, image_size)
    faces2 = facenet.load_data(paths_batch_2, False, False, image_size)

    # Change pixel values to 0 to 1 values
    min_pixel = min(np.min(faces1), np.min(faces2))
    max_pixel = max(np.max(faces1), np.max(faces2))
    faces1 = (faces1 - min_pixel) / (max_pixel - min_pixel)
    faces2 = (faces2 - min_pixel) / (max_pixel - min_pixel)
In the facenet.py load_data function, there is a prewhitening step:
nrof_samples = len(image_paths)
images = np.zeros((nrof_samples, image_size, image_size, 3))
for i in range(nrof_samples):
    img = misc.imread(image_paths[i])
    if img.ndim == 2:
        img = to_rgb(img)
    if do_prewhiten:
        img = prewhiten(img)
    img = crop(img, do_random_crop, image_size)
    img = flip(img, do_random_flip)
    images[i, :, :, :] = img
return images
I hope some expert can point me to a hidden function in Facenet or Cleverhans that directly exports the adv images; otherwise, reversing the normalization and prewhitening seems awkward. Thank you very much.
I don't know much about the Facenet code. From your description, it seems you will have to save the values of min_pixel and max_pixel to reverse the normalization, and then look at the prewhiten function to see how you can reverse it. I'll email Bruno to see if he has any further comments to help you out.
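Roughly, the inversion could look like the sketch below. It assumes you saved min_pixel and max_pixel from load_testset, plus the per-image mean and adjusted std that prewhiten computed (in the stock facenet.prewhiten, each image is rescaled as (x - mean) / adjusted_std); means and std_adjs here are hypothetical arrays you would have to record yourself at load time:

import numpy as np

def restore_images(adv, min_pixel, max_pixel, means, std_adjs):
    # Undo the 0-1 normalization applied in load_testset
    x = adv * (max_pixel - min_pixel) + min_pixel
    # Undo prewhiten: it computed (img - mean) / adjusted_std per image,
    # so multiply back by the saved per-image std and add the mean
    imgs = x * std_adjs[:, None, None, None] + means[:, None, None, None]
    return np.clip(imgs, 0, 255).astype(np.uint8)

# e.g. scipy.misc.imsave('adv_0.png', restore_images(adv, ...)[0])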
EDIT: Now image exporting is included in the Facenet example of Cleverhans: https://github.com/tensorflow/cleverhans/commit/08f6fb9cf2a7f199467d5ed60179fc3ae9140458

Soft attention from scratch for video sequences

I am trying to implement soft attention for video sequence classification. Since most implementations and examples are about NLP, I tried following this schema [1], but for video: basically an LSTM with an attention model in between.
[1] https://blog.heuritech.com/2016/01/20/attention-mechanism/
My code for the attention layer is the following, which I am not sure is implemented correctly:
def attention_layer(self, input, context):
    # Input is a Tensor: [batch_size, lstm_units]
    # Input (seq_length, batch_size, lstm_units)
    # Context is an LSTMStateTuple: [batch_size, lstm_units]. hidden_state, output = StateTuple
    hidden_state, _ = context
    weights_y = tf.get_variable("att_weights_Y", [self.lstm_units, self.lstm_units],
                                initializer=tf.contrib.layers.xavier_initializer())
    weights_c = tf.get_variable("att_weights_c", [self.lstm_units, self.lstm_units],
                                initializer=tf.contrib.layers.xavier_initializer())
    z_ = []
    for feat in input:
        # Equation => M = tanh(Wc c + Wy y)
        Wcc = tf.matmul(hidden_state, weights_c)
        Wyy = tf.matmul(feat, weights_y)
        m = tf.add(Wcc, Wyy)
        m = tf.tanh(m, name='M_matrix')
        # Equation => s = softmax(m)
        s = tf.nn.softmax(m, name='softmax_att')
        z = tf.multiply(feat, s)
        z_.append(z)
    out = tf.stack(z_, axis=1)
    out = tf.reduce_sum(out, 1)
    return out, s
Adding this layer between my LSTMs (or at the beginning of my two LSTMs) makes training very slow. More specifically, it takes a lot of time when I declare my optimizer:
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
My questions are:
Is the implementation correct? If it is, is there a way to optimize it so that it trains properly? (One possible vectorization is sketched below.)
I was not able to make it work with the seq2seq APIs. Is there any TensorFlow API that lets me tackle this specific issue?
Does it actually make sense to use this for sequence classification?
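One likely culprit for the slow minimize call: the Python for loop over time steps unrolls the graph once per frame, so graph construction and gradient building grow with sequence length. A hedged sketch of the same layer vectorized with a single einsum over all time steps (assuming input is one [seq_length, batch_size, lstm_units] tensor, as the comment above suggests):

def attention_layer_vectorized(self, input, context):
    # input: [seq_length, batch_size, lstm_units]
    hidden_state, _ = context
    weights_y = tf.get_variable("att_weights_Y", [self.lstm_units, self.lstm_units],
                                initializer=tf.contrib.layers.xavier_initializer())
    weights_c = tf.get_variable("att_weights_c", [self.lstm_units, self.lstm_units],
                                initializer=tf.contrib.layers.xavier_initializer())
    # Wc c is identical for every time step, so compute it once: [batch, units]
    Wcc = tf.matmul(hidden_state, weights_c)
    # Wy y for all time steps in one shot: [seq, batch, units]
    Wyy = tf.einsum('sbu,uv->sbv', input, weights_y)
    m = tf.tanh(Wyy + Wcc)            # Wcc broadcasts over the seq axis
    s = tf.nn.softmax(m)              # same axis as the original (units)
    z = input * s                     # [seq, batch, units]
    out = tf.reduce_sum(z, axis=0)    # sum over time: [batch, units]
    return out, s

This keeps the graph size constant in sequence length. Note that, like the original, the softmax here normalizes over the units axis; attention over time would instead normalize across the seq axis.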

Error in prediction in svm classifier after one hot encoding

I used one-hot encoding on my dataset before training my SVM classifier, which increased the number of features in the training set to 982. But when predicting on the test dataset, which has 7 features, I get the error "X has 7 features per sample; expecting 982". I don't understand how to increase the number of features in the test dataset.
My code is:
import numpy as np
import pandas as pd
from sklearn import svm
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

df = pd.read_csv('train.csv', header=None)
features = df.iloc[:, :-1].values
labels = df.iloc[:, -1].values
encode = LabelEncoder()
features[:, 2] = encode.fit_transform(features[:, 2])
features[:, 3] = encode.fit_transform(features[:, 3])
features[:, 4] = encode.fit_transform(features[:, 4])
features[:, 5] = encode.fit_transform(features[:, 5])
df1 = pd.DataFrame(features)
# --------------------------- ONE HOT ENCODING --------------------------------#
hotencode = OneHotEncoder(categorical_features=[2])
features = hotencode.fit_transform(features).toarray()
hotencode = OneHotEncoder(categorical_features=[14])
features = hotencode.fit_transform(features).toarray()
hotencode = OneHotEncoder(categorical_features=[37])
features = hotencode.fit_transform(features).toarray()
hotencode = OneHotEncoder(categorical_features=[466])
features = hotencode.fit_transform(features).toarray()
X = np.array(features)
y = np.array(labels)
clf = svm.LinearSVC()
clf.fit(X, y)
d_test = pd.read_csv('query.csv')
Z_test = np.array(d_test)
confidence = clf.predict(Z_test)
print("The query image belongs to Class ")
print(confidence)
The test dataset (query.csv):
1 0.076 1 3232236298 2886732679 3128 60604
The short answer: you need to apply the same OHE transform (or LE+OHE, in your case) to the test set.
For good advice, see Scikit Learn OneHotEncoder fit and transform Error: ValueError: X has different shape than during fitting or How to deal with imputation and hot one encoding in pandas?
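A minimal sketch of what that looks like with the pre-0.20 encoders the question uses: fit each encoder once on the training data, keep the fitted objects, and call transform (never fit_transform) on the test rows. The column indices follow the question and are assumptions:

from sklearn.preprocessing import LabelEncoder, OneHotEncoder

label_encoders = {}
for col in [2, 3, 4, 5]:
    le = LabelEncoder()
    features[:, col] = le.fit_transform(features[:, col])
    label_encoders[col] = le

# One encoder covering all categorical columns at once avoids the
# shifting column indices (14, 37, 466, ...) after each expansion
hotencode = OneHotEncoder(categorical_features=[2, 3, 4, 5])
X = hotencode.fit_transform(features).toarray()

# Prediction time: reuse the fitted encoders so the test set also gets 982 columns
Z = d_test.values
for col in [2, 3, 4, 5]:
    Z[:, col] = label_encoders[col].transform(Z[:, col])
Z_test = hotencode.transform(Z).toarray()

One caveat: LabelEncoder.transform will raise an error on categories that never appeared in the training data.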

per_image_whitening in Python

I'm trying to set up TensorFlow to accept one image at a time, but I believe I'm getting incorrect results because I pass a regular array without first applying tf.image.per_image_whitening(). Is there an easy way to do this in Python for an individual image, without using the image queue?
Here's my code so far:
im = Image.open(request.FILES.values()[0])
im = im.convert('RGB')
im = im.crop((0, 0, cifar10.IMAGE_SIZE, cifar10.IMAGE_SIZE))
(width, height) = im.size
image_array = list(im.getdata())
image_array = np.array(image_array)
image_array = image_array.reshape((1, height, width, 3))
# tf.image.per_image_whitening() should be done here
#mean = np.mean(image_array)
#stddev = np.std(image_array)
#adjusted_stddev = max(stddev, 1.0 / np.sqrt(image_array.size))
feed_dict = {"shuffle_batch:0": image_array}
# predictions always returns something close to [1, 0]
predictions = sess.run(tf.nn.softmax(logits), feed_dict=feed_dict)
If you want to avoid the image queue and do the predictions one by one, I think
image_array = (image_array - mean) / adjusted_stddev
should do the trick.
If you want to do the predictions in batches, it's a little more complicated, as per_image_whitening (now per_image_standardization) only works on single images. So you need to apply it before you form the batch, as above, or set up a preprocessing procedure.
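Putting it together, here is a minimal NumPy re-implementation for a single HxWx3 image (a sketch following the tf.image.per_image_standardization docs, which lower-bound the stddev by 1/sqrt(num_elements)):

import numpy as np

def per_image_standardization(image):
    image = image.astype(np.float32)
    mean = image.mean()
    # Avoid dividing by ~0 for nearly uniform images
    adjusted_stddev = max(image.std(), 1.0 / np.sqrt(image.size))
    return (image - mean) / adjusted_stddev

# For the code above, applied to the single image in the batch:
# image_array = per_image_standardization(image_array[0])[None, ...]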