How can I maximize XGBoost's GPU memory usage? (XGBoost does not use the full GPU)

I am trying to use XGBoost.
However, it behaves differently from Keras, which uses the full GPU memory when training an ANN model.
As shown below, XGBoost uses only a small amount of GPU memory.
How can I make XGBoost use the full GPU?
from xgboost import XGBRegressor
from sklearn.model_selection import GridSearchCV

grid = {'eta': [0.01, 0.1, 0.2],
        'min_child_weight': [1, 2, 3, 4],
        'max_depth': [3, 4, 5, 6],
        'subsample': [0.5, 0.6, 0.7, 0.8],
        'nthread': [8],
        'colsample_bytree': [0.5, 0.6, 0.7, 0.8]}
XGBoost_model = XGBRegressor(n_estimators=300, tree_method='gpu_hist', gpu_id=0)
XGBoost_211220 = GridSearchCV(estimator=XGBoost_model,
                              param_grid=grid,
                              scoring='neg_mean_absolute_error',
                              cv=10,
                              n_jobs=-1,
                              verbose=10)
XGBoost_211220.fit(sc_x_train_211220, sc_y_train_211220)
Above is the code I use to train XGBoost.

Related

Sort two-dimensional array based on ranks from a different two-dimensional array

I have two two-dimensional numpy arrays:
import numpy as np
import scipy.stats
arr_data = np.array( [[0.3, 0.1, 0.7, 0.5], [0.1, 0.5, 0.4, 0.07]] )
weights = np.array( [[0.05, 0.1, 0.35, 0.5], [0.2, 0.4, 0.1, 0.3]] )
I need to sort both of them based on one common ranking. The common ranks are generated from values in the first array along axis=1:
ranks = scipy.stats.rankdata(arr_data, axis=1).astype(int)
print('data', arr_data)
print('ranks',ranks)
The obtained ranks are as follows:
[[2 1 4 3]
 [2 4 3 1]]
I'm stuck with how to proceed to obtain the following sorted arrays:
for arr_data: [[0.1, 0.3, 0.5, 0.7], [0.07, 0.1, 0.4, 0.5]]
for weights: [[0.1, 0.05, 0.5, 0.35], [0.3, 0.2, 0.1, 0.4]]
i.e., my weights are sorted based on the data array's ranking. Ultimately, I want to multiply the data with their corresponding weights while keeping the order of the sorted values from the data array. In my project, I have very large datasets, so I'd like to avoid Python lists and looping.
Turns out there is an elegant solution:
import numpy as np
data = np.array([[0.3, 0.1, 0.7, 0.5], [0.1, 0.5, 0.4, 0.07]])
weights = np.array([[0.05, 0.1, 0.35, 0.5], [0.2, 0.4, 0.1, 0.3]])
ranks = np.argsort(data, axis=1)  # indices that would sort each row (not rankdata-style 1-based ranks)
sorted_data = np.take_along_axis(data, ranks, axis=1)
sorted_weights = np.take_along_axis(weights, ranks, axis=1)
print('data\n', data)
print('weights\n', weights)
print("sorted data\n",sorted_data)
print("sorted weights\n", sorted_weights)

Scaler Transform help sklearn

I'm working on a logistic regression assignment and my professor has this code example.
What is the new_x variable and why are we transforming it as a matrix?
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

data = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8],
                     'Label': ['green', 'green', 'green', 'green',
                               'red', 'red', 'red', 'red'],
                     'Height': [5, 5.5, 5.33, 5.75, 6.00, 5.92, 5.58, 5.92],
                     'Weight': [100, 150, 130, 150, 180, 190, 170, 165],
                     'Foot': [6, 8, 7, 9, 13, 11, 12, 10]},
                    columns=['id', 'Height', 'Weight', 'Foot', 'Label'])
X = data[['Height', 'Weight']].values
scaler = StandardScaler()
scaler.fit(X)
X = scaler.transform(X)
Y = data['Label'].values
log_reg_classifier = LogisticRegression()
log_reg_classifier.fit(X, Y)
new_x = scaler.transform(np.asmatrix([6, 160]))
predicted = log_reg_classifier.predict(new_x)
accuracy = log_reg_classifier.score(X, Y)
Let's take it step by step.
data = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8],
                     'Label': ['green', 'green', 'green', 'green',
                               'red', 'red', 'red', 'red'],
                     'Height': [5, 5.5, 5.33, 5.75, 6.00, 5.92, 5.58, 5.92],
                     'Weight': [100, 150, 130, 150, 180, 190, 170, 165],
                     'Foot': [6, 8, 7, 9, 13, 11, 12, 10]},
                    columns=['id', 'Height', 'Weight', 'Foot', 'Label'])
You create an initial DataFrame that contains the columns ['id', 'Height', 'Weight', 'Foot', 'Label'].
X = data[['Height', 'Weight']].values
You then obtain a np.array that contains only height and weight using data[['Height', 'Weight']].values. See the pandas docs on indexing and slicing for more info. You can obtain the size of the feature matrix with X.shape, i.e., [n, 2].
scaler = StandardScaler()
scaler.fit(X)
X = scaler.transform(X)
Y = data['Label'].values
log_reg_classifier = LogisticRegression()
log_reg_classifier.fit(X, Y)
You use those two features only to train the logistic regression after standardization.
That is, your classifier is learned on only two features (i.e., height and weight), but on multiple samples. Every classifier in sklearn implements the fit() method to fit the classifier to the training data.
As your model is trained on a feature matrix with two features, the sample that you want to predict (new_x) also needs two features. Thus, you first create np.asmatrix([6, 160]) with shape [1, 2] and elements [height=6, weight=160], scale it, and pass it to your trained model. log_reg_classifier.predict(new_x) returns the prediction. You assess the performance of the classifier by comparing the predictions with the true labels and calculating the (mean) accuracy. Et voilà.
new_x = scaler.transform(np.asmatrix([6, 160]))
predicted = log_reg_classifier.predict(new_x)
accuracy = log_reg_classifier.score(X, Y)
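As a side note (my own sketch with toy numbers, not part of the assignment): np.asmatrix is only used to get a 2-D input of shape [1, 2]; a plain 2-D np.array works the same way, since both the scaler and the classifier expect input of shape (n_samples, n_features).
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy height/weight values, not the professor's data.
X = np.array([[5.0, 100], [5.5, 150], [6.0, 180], [5.9, 190]])
scaler = StandardScaler().fit(X)

# A single new sample must still be 2-D: shape (1, 2), like np.asmatrix([6, 160]).
new_x = scaler.transform(np.array([[6, 160]]))
print(new_x.shape)  # (1, 2)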

Element-wise assignment in tensorflow

In NumPy, this can easily be done as follows:
>>> img
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]], dtype=int32)
>>> img[img > 5] = [1, 2, 3, 4]
>>> img
array([[1, 2, 3],
       [4, 5, 1],
       [2, 3, 4]], dtype=int32)
However, no similar operation seems to exist in TensorFlow.
You can never assign a value to a tensor in TensorFlow, because a change in a tensor's value is not traceable by backpropagation, but you can still derive another tensor from the original one. Here is a solution:
import tensorflow as tf
tf.enable_eager_execution()
img = tf.constant(list(range(1, 10)), shape=[3, 3])
replace_mask = img > 5
keep_mask = tf.logical_not(replace_mask)
# Values and positions that stay unchanged.
keep = tf.boolean_mask(img, keep_mask)
keep_index = tf.where(keep_mask)
# Positions to overwrite and the new values for them.
replace_index = tf.where(replace_mask)
replace = tf.random_uniform((tf.shape(replace_index)[0],), 0, 10, tf.int32)
# Scatter both kept and replacement values into a new tensor of the original shape.
updates = tf.concat([keep, replace], axis=0)
indices = tf.concat([keep_index, replace_index], axis=0)
result = tf.scatter_nd(tf.cast(indices, tf.int32), updates, shape=tf.shape(img))
Actually there is a way to achieve this. Very similar to @Jie.Zhou's answer, you can replace tf.constant with tf.Variable and then replace tf.scatter_nd with tf.scatter_nd_update, as sketched below.
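A minimal sketch of that variant, assuming TF 1.x with eager execution as in the snippet above (whether tf.scatter_nd_update accepts an eager variable directly in your TF version is an assumption here):
import tensorflow as tf
tf.enable_eager_execution()

# A tf.Variable, unlike a constant tensor, can be updated in place.
img = tf.Variable(tf.reshape(tf.range(1, 10), [3, 3]))
replace_index = tf.where(img > 5)  # positions of the values to overwrite
replace = tf.random_uniform((tf.shape(replace_index)[0],), 0, 10, tf.int32)
# Write the replacement values into the selected positions of the variable.
tf.scatter_nd_update(img, replace_index, replace)
print(img.numpy())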

Map RGB Semantic Maps to One Hot Encodings and vice versa in TensorFlow

The image below is a sample semantic map from the Cityscapes Dataset. It's provided in the form of an RGB image where each specific colour represents a class.
In some deep learning tasks, we would like to map this into a one hot encoding. For example, if it has 20 classes, then this image would be mapped from H x W x 3 to H x W x 20.
How do we do this in TensorFlow?
My solution is below. I'm looking forward to suggestions on how to make it more efficient, or to a more efficient answer.
import tensorflow as tf
import numpy as np
import scipy.misc
img = scipy.misc.imread('aachen_000000_000019_gtFine_color.png', mode = 'RGB')
palette = np.array(
[[128, 64, 128],
[244, 35, 232],
[ 70, 70, 70],
[102, 102, 156],
[190, 153, 153],
[153, 153, 153],
[250, 170, 30],
[220, 220, 0],
[107, 142, 35],
[152, 251, 152],
[ 70, 130, 180],
[220, 20, 60],
[255, 0, 0],
[ 0, 0, 142],
[ 0, 0, 70],
[ 0, 60, 100],
[ 0, 80, 100],
[ 0, 0, 230],
[119, 11, 32],
[ 0, 0, 0],
[255, 255, 255]], np.uint8)
semantic_map = []
for colour in palette:
    class_map = tf.reduce_all(tf.equal(img, colour), axis=-1)
    semantic_map.append(class_map)
semantic_map = tf.stack(semantic_map, axis=-1)
# NOTE cast to tf.float32 because most neural networks operate in float32.
semantic_map = tf.cast(semantic_map, tf.float32)
magic_number = tf.reduce_sum(semantic_map)
print(semantic_map.shape)
palette = tf.constant(palette, dtype=tf.uint8)
class_indexes = tf.argmax(semantic_map, axis=-1)
# NOTE this operation flattens class_indexes
class_indexes = tf.reshape(class_indexes, [-1])
color_image = tf.gather(palette, class_indexes)
color_image = tf.reshape(color_image, [1024, 2048, 3])
sess = tf.Session()
# NOTE magic_number checks that there are only 1024*2048 1s in the entire
# 1024*2048*21 tensor.
magic_number_val = sess.run(magic_number)
assert magic_number_val == 1024*2048
color_image_val = sess.run(color_image)
scipy.misc.imsave('test.png', color_image_val)

Why does TensorFlow's hash_bucket_size matter?

I am creating a DNNClassifier with sparse columns. The training data looks like this:
samples  col1                        col2              price  label
eg1      [[0,1,0,0,0,2,0,1,0,3,...]  [[0,0,4,5,0,...]  5.2    0
eg2       [0,0,...]                   [0,0,...]         0      1
eg3       [0,0,...]]                  [0,0,...]         0      1
The following snippet can run successfully,
import tensorflow as tf
sparse_feature_a = tf.contrib.layers.sparse_column_with_hash_bucket('col1', 3, dtype=tf.int32)
sparse_feature_b = tf.contrib.layers.sparse_column_with_hash_bucket('col2', 1000, dtype=tf.int32)
sparse_feature_a_emb = tf.contrib.layers.embedding_column(sparse_id_column=sparse_feature_a, dimension=2)
sparse_feature_b_emb = tf.contrib.layers.embedding_column(sparse_id_column=sparse_feature_b, dimension=2)
feature_c = tf.contrib.layers.real_valued_column('price')
estimator = tf.contrib.learn.DNNClassifier(
    feature_columns=[sparse_feature_a_emb, sparse_feature_b_emb, feature_c],
    hidden_units=[5, 3],
    n_classes=2,
    model_dir='./tfTmp/tfTmp0')

# Input builders
def input_fn_train():  # returns x, y (where y represents the label's class index).
    features = {'col1': tf.SparseTensor(indices=[[0, 1], [0, 5], [0, 7], [0, 9]],
                                        values=[1, 2, 1, 3],
                                        dense_shape=[3, int(250e6)]),
                'col2': tf.SparseTensor(indices=[[0, 2], [0, 3]],
                                        values=[4, 5],
                                        dense_shape=[3, int(100e6)]),
                'price': tf.constant([5.2, 0, 0])}
    labels = tf.constant([0, 1, 1])
    return features, labels

estimator.fit(input_fn=input_fn_train, steps=100)
However, I have a question about this line,
sparse_feature_a = tf.contrib.layers.sparse_column_with_hash_bucket('col1', 3, dtype=tf.int32)
where 3 means hash_bucket_size=3, but this sparse tensor includes 4 non-zero values,
'col1': tf.SparseTensor(indices=[[0, 1], [0, 5], [0, 7], [0, 9]],
                        values=[1, 2, 1, 3],
                        dense_shape=[3, int(250e6)])
It seems hash_bucket_size does nothing here. No matter how many non-zero values you have in your sparse tensor, you just need to set it to an integer > 1 and it works correctly.
I know my understanding may not be right. Could anyone explain how hash_bucket_size works? Thanks a lot!
hash_bucket_size works by taking the original indices, hashing them into a space of the specified size, and using the hashed indices as features.
This means you can specify your model before knowing the full range of possible indices, at the cost of some indices possibly colliding.
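A minimal sketch of that idea (illustrative only; tf.string_to_hash_bucket_fast is used here as a stand-in and is not necessarily the exact hash that sparse_column_with_hash_bucket applies internally):
import tensorflow as tf
tf.enable_eager_execution()

# Four raw ids hashed into a small and a large bucket space.
ids = tf.constant(['1', '2', '1', '3'])
small = tf.string_to_hash_bucket_fast(ids, 3)     # distinct ids may collide
large = tf.string_to_hash_bucket_fast(ids, 1000)  # collisions are unlikely
print(small.numpy(), large.numpy())
With hash_bucket_size=3, several distinct ids can land in the same bucket and the model can no longer tell them apart, which is why a too-small value quietly degrades the features rather than raising an error.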