When I run the xgboost rank demo with 2 samples for every group and eval_metric=auc, it shows the warning 'Dataset is empty, or contains only positive or negative samples'.
I have tried modifying the dtarget for the training and validation groups many times and found that it has no effect; the problem occurs only when I set 2 samples for every group in dgroup, such as [2, 2, 2]. I don't know where the problem is.
My xgboost params are:
xgb_rank_params1 = {
    'booster': 'gbtree',
    'eta': 0.1,
    'gamma': 1.0,
    'min_child_weight': 0.1,
    'objective': 'rank:pairwise',
    'eval_metric': 'auc',
    'max_depth': 6,
    'num_boost_round': 10,
    'save_period': 0
}
The data-preparation code is:
import numpy as np
from xgboost import DMatrix, train

n_group = 3
n_choice = 2
dtrain = np.random.uniform(0, 100, [n_group * n_choice, 2])
dtarget = [1, 0, 1, 0, 1, 0]
# **problem here: when I set n_choice = 2 samples for every group**
dgroup = np.array([n_choice for i in range(n_group)]).flatten()
# build the train DMatrix; setting the group sizes is very important here!
xgbTrain = DMatrix(dtrain, label=dtarget)
xgbTrain.set_group(dgroup)
# generate eval data
dtrain_eval = np.random.uniform(0, 100, [n_group * n_choice, 2])
xgbTrain_eval = DMatrix(dtrain_eval, label=dtarget)
xgbTrain_eval.set_group(dgroup)
evallist = [(xgbTrain, 'train'), (xgbTrain_eval, 'eval')]
rankModel = train(xgb_rank_params1, xgbTrain, num_boost_round=20, evals=evallist)
The output says:
[15:54:52] WARNING: C:/Users/Administrator/workspace/xgboost-win64_release_1.6.0/src/metric/auc.cc:330: Dataset is empty, or contains only positive or negative samples.
[0] train-auc:nan eval-auc:nan
[15:54:52] WARNING: C:/Users/Administrator/workspace/xgboost-win64_release_1.6.0/src/metric/auc.cc:330: Dataset is empty, or contains only positive or negative samples.
[15:54:52] WARNING: C:/Users/Administrator/workspace/xgboost-win64_release_1.6.0/src/metric/auc.cc:330: Dataset is empty, or contains only positive or negative samples.
[1] train-auc:nan eval-auc:nan
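For reference, here is a quick check (my own sketch, not part of the original demo) that every group in this setup contains both a positive and a negative label, so a per-group AUC should in principle be defined if the metric is evaluated group-wise as the ranking setup suggests:

dtarget = [1, 0, 1, 0, 1, 0]
dgroup = [2, 2, 2]

start = 0
for size in dgroup:
    # every group holds labels {0, 1}, i.e. one positive and one negative
    print(set(dtarget[start:start + size]))
    start += size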
How to put a stopping condition on the jacobian (or gradient) for the Newton-CG method?
I want the algorithm to stop when the jacobian reaches 1e-2. Is that possible with Newton-CG?
input:
scipy.optimize.minimize(f, [5.0,1.0,2.0,5.0], args=Data, method='Newton-CG',jac=Jacf)
output:
jac: array([7.64265411e-08, 1.74985718e-08, 4.12408407e-07, 5.02972841e-08])
message: 'Optimization terminated successfully.'
nfev: 12
nhev: 0
nit: 11
njev: 68
status: 0
success: True
x: array([0.22545395, 0.3480084 , 1.06811724, 1.64873479])
In the BFGS method, which is similar to Newton-CG, there is a gtol option that allows stopping the iteration when the gradient reaches some value. But in Newton-CG there is no such option.
Does anyone know how to stop the iteration when the jacobian reaches 1e-2?
Here are some details to reproduce my code:
import numpy as np
import pandas as pd

def convert_line2matrix(a):
    n = len(a)
    if (np.sqrt(n) % 1 == 0):
        d = int(np.sqrt(n))
        Mat = np.zeros((d, d))
        for i in range(d):
            for j in range(d):
                Mat[i, j] = a[j + d * i]
    else:
        raise ValueError(f"{a} can't be converted into a (n x n) matrix. The array has {len(a)} elements, \n\t thus impossible to build a square matrix with {len(a)} elements.")
    return Mat
def convert_matrix2line(Matrix):
    result = []
    dim = len(Matrix)
    for i in range(dim):
        for j in range(dim):
            result.append(Matrix[i, j])
    return np.array(result)
my_data = np.array([[0.21530249, 0.32450331, 0 ],
[0.1930605 , 0.31788079, 0 ],
[0.17793594, 0.31788079, 0 ],
[0.16459075, 0.31125828, 1 ],
[0.24822064, 0.31125828, 0 ],
[0.28647687, 0.32450331, 0 ],
[0.32829181, 0.31788079, 0 ],
[0.38879004, 0.32450331, 0 ],
[0.42882562, 0.32450331, 0 ],
[0.47419929, 0.32450331, 0 ],
[0.5044484 , 0.32450331, 0 ],
[0.1797153 , 0.31125828, 0 ],
[0.16548043, 0.31125828, 1 ],
[0.17793594, 0.29801325, 1 ],
[0.1930605 , 0.31788079, 0 ]])
Data = pd.DataFrame(my_data, columns=['X_1','X_2', 'Allum'])
def logLB(params, Data):
    B = convert_line2matrix(params)
    X = np.array(Data.iloc[:, :len(B)])
    Y = np.array(Data.iloc[:, len(B)])
    result = 0
    n = len(Data)
    BB = np.transpose(B) @ B
    for i in range(n):
        if (1 - np.exp(-X[i].T @ BB @ X[i]) > 0):
            result += Y[i] * (-np.transpose(X[i]) @ BB @ X[i]) + (1 - Y[i]) * np.log(1 - np.exp(-X[i].T @ BB @ X[i]))
    return result
def f(params, Data):
    return -logLB(params, Data)
def dlogLB(params, Data):
    B = convert_line2matrix(params)
    X = np.array(Data.iloc[:, :len(B)])
    Y = np.array(Data.iloc[:, len(B)])
    BB = B.T @ B
    N = len(Data)
    M = len(B)
    Jacobian = np.zeros(np.shape(B))
    for n in range(len(B)):
        for m in range(len(B)):
            result = 0
            for c in range(N):
                som = 0
                for i in range(M):
                    som += X[c, m] * B[n, i] * X[c, i]
                if (1 - np.exp(-X[c].T @ BB @ X[c]) > 0):
                    result += -2 * Y[c] * som + (1 - Y[c]) * np.exp(-X[c].T @ BB @ X[c]) * (2 * som) / (1 - np.exp(-X[c].T @ BB @ X[c]))
            Jacobian[n, m] = result
    return convert_matrix2line(Jacobian)
def Jacf(params, Data):
    return -dlogLB(params, Data)
I assume that you want to stop the optimizer as soon as the Euclidean norm of the gradient reaches a specific value, which is exactly the meaning of the BFGS method's gtol option. Otherwise it doesn't make sense mathematically, since the evaluated gradient is a vector and thus can't be compared to a scalar value.
The Newton-CG method doesn't provide a similar option. However, you could use a simple callback that is called after each iteration and terminates the algorithm when the callback returns True. Unfortunately, you can only terminate the optimizer by a callback with the trust-constr method (see the sketch just below). For all other methods the callback's return value is ignored, so it's very limited.
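For completeness, a minimal sketch of that trust-constr route, reusing the question's f, Jacf and Data: with this method the callback receives the current iterate and a state object, and returning True terminates the run.

import numpy as np
from scipy.optimize import minimize

def make_stopping_callback(eps, args, jac):
    # stop as soon as the Euclidean norm of the gradient drops below eps
    def callback(xk, state):
        return np.linalg.norm(jac(xk, *args)) <= eps
    return callback

res = minimize(f, [5.0, 1.0, 2.0, 5.0], args=Data, method='trust-constr',
               jac=Jacf, callback=make_stopping_callback(1e-2, (Data,), Jacf))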
If you want to stay with Newton-CG, a possible hacky and ugly way to terminate the optimizer from the callback anyway is to raise an exception:
import numpy as np
from scipy.optimize import minimize

class Callback:
    def __init__(self, eps, args, jac):
        self.eps = eps
        self.args = args
        self.jac = jac
        self.x = None
        self.gtol = None

    def __call__(self, xk):
        self.x = xk
        self.gtol = np.linalg.norm(self.jac(xk, *self.args))
        if self.gtol <= self.eps:
            raise Exception("Gradient norm is below threshold")
Here, xk is the current iterate, eps your desired tolerance, args a tuple containing your optional objective and gradient arguments, and jac the gradient. Then you can use it like this:
from scipy.optimize import minimize

cb = Callback(1.0e-1, (Data,), Jacf)
try:
    res = minimize(f, [5.0, 1.0, 2.0, 5.0], args=Data, method='Newton-CG',
                   jac=Jacf, callback=cb)
except Exception:
    x = cb.x
    gtol = cb.gtol
print(f"gtol = {gtol:E}, x = {x}")
which yields
gtol = 5.515263E-02, x = [14.43322108 -5.18163542 0.22582261 -0.04859385]
I want to sample data using a weighted distribution (probability).
The example is as below; the class distribution is:
doc_distribution = {0: 40, 1: 18, 2: 8, 3: 598, ... , 9: 177}
I would like to make batches of the dataset with equal class probability.
total_dataset = 0
init_dist = []
for value in doc_distribution.values():
    total_dataset += value
for value in doc_distribution.values():
    init_dist.append(value / total_dataset)

target_dist = []
for value in doc_distribution.values():
    target_dist.append(1 / len(doc_distribution))
Then I build the input_fn of tf.estimator to export the model:
def input_fn(ngram_words, labels, opts):
    dataset = tf.data.Dataset.from_tensor_slices((ngram_words, labels))
    rej = tf.data.experimental.rejection_resample(class_func=lambda _, c: c,
                                                  target_dist=target_dist, initial_dist=init_dist, seed=opts.seed)
    dataset = dataset.shuffle(buffer_size=len(ngram_words) * 2, seed=opts.seed)
    return dataset.batch(20)
Finally, I could get the result of rejection_resample as below:
# 'a' is presumably the dataset returned by input_fn
for next_elem in a:
    k = next_elem[1]
    break

dist = {}
for val in np.array(k):
    if val in dist:
        dist[val] += 1
    else:
        dist[val] = 1
print(dist)
The result is: {3: 33, 8: 14, 4: 17, 7: 5, 5: 10, 9: 12, 0: 6, 6: 3}
I don't know why rejection_resample doesn't work well; I just want to extract samples equally.
How should I fix it?
Are there any methods to sample equally in the input_fn of tf.estimator?
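One observation, not from the original thread: rejection_resample returns a dataset transformation that only takes effect when passed to Dataset.apply, and in the input_fn above rej is created but never applied. A hedged sketch of the documented usage, where the resampled elements come back as (class, element) pairs:

dataset = dataset.apply(
    tf.data.experimental.rejection_resample(
        class_func=lambda _, c: c,
        target_dist=target_dist,
        initial_dist=init_dist,
        seed=opts.seed))
# drop the extra class label that rejection_resample prepends
dataset = dataset.map(lambda extra_label, features_and_label: features_and_label)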
We can use tf.data.experimental.sample_from_datasets instead of rejection_resample.
unbatched_dataset = [(dataset.filter(lambda _, label: label == i)) for i in range(0, classify_num)]
weights = [1 / classify_num] * classify_num
balanced_ds = tf.data.experimental.sample_from_datasets(unbatched_dataset, weights, seed=opts.seed)
dataset = balanced_ds.shuffle(buffer_size = 1000, seed = opts.seed).repeat(opts.epochs)
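A minimal self-contained sketch of the same idea, assuming TF 2.x and a toy labelled dataset (the class count and data here are made up for illustration); each per-class dataset is repeated so minority classes don't run out:

import tensorflow as tf

features = tf.range(10, dtype=tf.int64)
labels = tf.constant([0, 0, 0, 0, 0, 0, 0, 1, 1, 2], dtype=tf.int64)
dataset = tf.data.Dataset.from_tensor_slices((features, labels))

classify_num = 3  # number of classes in the toy data
per_class = [dataset.filter(lambda _, label, i=i: label == i).repeat()
             for i in range(classify_num)]
weights = [1 / classify_num] * classify_num
balanced = tf.data.experimental.sample_from_datasets(per_class, weights, seed=0)

for feat, label in balanced.take(9):
    print(int(feat), int(label))  # labels come out roughly uniformly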
I am doing some work using image processing and sparse coding. The problem is that the following code works only on some images.
Here is the image that it works perfectly on:
And here is the image that it loops forever on:
Here is the code:
import cv2
import numpy as np
import networkx as nx
from preproc import Preproc

# From https://github.com/vicariousinc/science_rcn/blob/master/science_rcn/learning.py
def sparsify(bu_msg, suppress_radius=3):
    """Make a sparse representation of the edges by greedily selecting features from the
    output of preprocessing layer and suppressing overlapping activations.

    Parameters
    ----------
    bu_msg : 3D numpy.ndarray of float
        The bottom-up messages from the preprocessing layer.
        Shape is (num_feats, rows, cols)
    suppress_radius : int
        How many pixels in each direction we assume this filter
        explains when included in the sparsification.

    Returns
    -------
    frcs : see train_image.
    """
    frcs = []
    img = bu_msg.max(0) > 0
    while True:
        r, c = np.unravel_index(img.argmax(), img.shape)
        print(r, c)
        if not img[r, c]:
            break
        frcs.append((bu_msg[:, r, c].argmax(), r, c))
        img[r - suppress_radius:r + suppress_radius + 1,
            c - suppress_radius:c + suppress_radius + 1] = False
    return np.array(frcs)
if __name__ == '__main__':
    img = cv2.imread('https://i.stack.imgur.com/Nb08A.png', 0)
    img2 = cv2.imread('https://i.stack.imgur.com/2MW93.png', 0)

    prp = Preproc()
    bu_msg = prp.fwd_infer(img)
    frcs = sparsify(bu_msg)
and the accompanying preprocessing code:
"""A pre-processing layer of the RCN model. See Sec S8.1 for details.
"""
import numpy as np
from scipy.ndimage import maximum_filter
from scipy.ndimage.filters import gaussian_filter
from scipy.signal import fftconvolve
class Preproc(object):
"""
A simplified preprocessing layer implementing Gabor filters and suppression.
Parameters
----------
num_orients : int
Number of edge filter orientations (over 2pi).
filter_scale : float
A scale parameter for the filters.
cross_channel_pooling : bool
Whether to pool across neighboring orientation channels (cf. Sec S8.1.4).
Attributes
----------
filters : [numpy.ndarray]
Kernels for oriented Gabor filters.
pos_filters : [numpy.ndarray]
Kernels for oriented Gabor filters with all-positive values.
suppression_masks : numpy.ndarray
Masks for oriented non-max suppression.
"""
def __init__(self,
num_orients=16,
filter_scale=2.,
cross_channel_pooling=False):
self.num_orients = num_orients
self.filter_scale = filter_scale
self.cross_channel_pooling = cross_channel_pooling
self.suppression_masks = generate_suppression_masks(filter_scale=filter_scale,
num_orients=num_orients)
def fwd_infer(self, img, brightness_diff_threshold=18.):
"""Compute bottom-up (forward) inference.
Parameters
----------
img : numpy.ndarray
The input image.
brightness_diff_threshold : float
Brightness difference threshold for oriented edges.
Returns
-------
bu_msg : 3D numpy.ndarray of float
The bottom-up messages from the preprocessing layer.
Shape is (num_feats, rows, cols)
"""
filtered = np.zeros((len(self.filters),) + img.shape, dtype=np.float32)
for i, kern in enumerate(self.filters):
filtered[i] = fftconvolve(img, kern, mode='same')
localized = local_nonmax_suppression(filtered, self.suppression_masks)
# Threshold and binarize
localized *= (filtered / brightness_diff_threshold).clip(0, 1)
localized[localized < 1] = 0
if self.cross_channel_pooling:
pooled_channel_weights = [(0, 1), (-1, 1), (1, 1)]
pooled_channels = [-np.ones_like(sf) for sf in localized]
for i, pc in enumerate(pooled_channels):
for channel_offset, factor in pooled_channel_weights:
ch = (i + channel_offset) % self.num_orients
pos_chan = localized[ch]
if factor != 1:
pos_chan[pos_chan > 0] *= factor
np.maximum(pc, pos_chan, pc)
bu_msg = np.array(pooled_channels)
else:
bu_msg = localized
# Setting background to -1
bu_msg[bu_msg == 0] = -1.
return bu_msg
#property
def filters(self):
return get_gabor_filters(
filter_scale=self.filter_scale, num_orients=self.num_orients, weights=False)
#property
def pos_filters(self):
return get_gabor_filters(
filter_scale=self.filter_scale, num_orients=self.num_orients, weights=True)
def get_gabor_filters(size=21, filter_scale=4., num_orients=16, weights=False):
    """Get Gabor filter bank. See Preproc for parameters and returns."""
    def _get_sparse_gaussian():
        """Sparse Gaussian."""
        size = 2 * np.ceil(np.sqrt(2.) * filter_scale) + 1
        alt = np.zeros((int(size), int(size)), np.float32)
        alt[int(size // 2), int(size // 2)] = 1
        gaussian = gaussian_filter(alt, filter_scale / np.sqrt(2.), mode='constant')
        gaussian[gaussian < 0.05 * gaussian.max()] = 0
        return gaussian

    gaussian = _get_sparse_gaussian()
    filts = []
    for angle in np.linspace(0., 2 * np.pi, num_orients, endpoint=False):
        acts = np.zeros((size, size), np.float32)
        x, y = np.cos(angle) * filter_scale, np.sin(angle) * filter_scale
        acts[int(size / 2 + y), int(size / 2 + x)] = 1.
        acts[int(size / 2 - y), int(size / 2 - x)] = -1.
        filt = fftconvolve(acts, gaussian, mode='same')
        filt /= np.abs(filt).sum()  # Normalize to ensure the maximum output is 1
        if weights:
            filt = np.abs(filt)
        filts.append(filt)
    return filts
def generate_suppression_masks(filter_scale=4., num_orients=16):
    """
    Generate the masks for oriented non-max suppression at the given filter_scale.
    See Preproc for parameters and returns.
    """
    size = 2 * int(np.ceil(filter_scale * np.sqrt(2))) + 1
    cx, cy = size // 2, size // 2
    filter_masks = np.zeros((num_orients, size, size), np.float32)
    # Compute for orientations [0, pi), then flip for [pi, 2*pi)
    for i, angle in enumerate(np.linspace(0., np.pi, num_orients // 2, endpoint=False)):
        x, y = np.cos(angle), np.sin(angle)
        for r in range(1, int(np.sqrt(2) * size / 2)):
            dx, dy = round(r * x), round(r * y)
            if abs(dx) > cx or abs(dy) > cy:
                continue
            filter_masks[i, int(cy + dy), int(cx + dx)] = 1
            filter_masks[i, int(cy - dy), int(cx - dx)] = 1
    filter_masks[num_orients // 2:] = filter_masks[:num_orients // 2]
    return filter_masks
def local_nonmax_suppression(filtered, suppression_masks, num_orients=16):
    """
    Apply oriented non-max suppression to the filters, so that only a single
    orientated edge is active at a pixel. See Preproc for additional parameters.

    Parameters
    ----------
    filtered : numpy.ndarray
        Output of filtering the input image with the filter bank.
        Shape is (num feats, rows, columns).

    Returns
    -------
    localized : numpy.ndarray
        Result of oriented non-max suppression.
    """
    localized = np.zeros_like(filtered)
    cross_orient_max = filtered.max(0)
    filtered[filtered < 0] = 0
    for i, (layer, suppress_mask) in enumerate(zip(filtered, suppression_masks)):
        competitor_maxs = maximum_filter(layer, footprint=suppress_mask, mode='nearest')
        localized[i] = competitor_maxs <= layer
    localized[cross_orient_max > filtered] = 0
    return localized
The problem I found was that np.unravel_index returns all the positions of features for the first image, whereas it returns only (0, 0) continuously for the second. My hypothesis is that it is either a problem with the preprocessing code or a bug in the np.unravel_index function itself, but I am not too sure.
Okay, so it turns out there is an underlying problem when calling argmax on the image. I rewrote the sparsification script to not use argmax, and it produces exactly the same results. It should now work with any image.
def sparsify(bu_msg, suppress_radius=3):
    """Make a sparse representation of the edges by greedily selecting features from the
    output of preprocessing layer and suppressing overlapping activations.

    Parameters
    ----------
    bu_msg : 3D numpy.ndarray of float
        The bottom-up messages from the preprocessing layer.
        Shape is (num_feats, rows, cols)
    suppress_radius : int
        How many pixels in each direction we assume this filter
        explains when included in the sparsification.

    Returns
    -------
    frcs : see train_image.
    """
    frcs = []
    img = bu_msg.max(0) > 0
    for (r, c), _ in np.ndenumerate(img):
        if img[r, c]:
            frcs.append((bu_msg[:, r, c].argmax(), r, c))
            img[r - suppress_radius:r + suppress_radius + 1,
                c - suppress_radius:c + suppress_radius + 1] = False
    return np.array(frcs)
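As an aside (my own diagnosis, not confirmed in the original thread), one plausible trigger for the endless loop is a negative slice start when the selected feature lies within suppress_radius of the border: numpy reads the negative start as counting from the end, the slice comes out empty, the pixel is never cleared, and argmax keeps returning it. A small demonstration:

import numpy as np

img = np.zeros((8, 8), dtype=bool)
img[1, 1] = True  # active pixel near the border
suppress_radius = 3

r, c = np.unravel_index(img.argmax(), img.shape)
# r - suppress_radius == -2, which numpy reads as "2 from the end",
# so this slice is empty and nothing is cleared
img[r - suppress_radius:r + suppress_radius + 1,
    c - suppress_radius:c + suppress_radius + 1] = False
print(img[r, c])  # still True -> the while-loop version would spin forever

# Clamping the window start at 0 avoids the empty slice:
img[max(r - suppress_radius, 0):r + suppress_radius + 1,
    max(c - suppress_radius, 0):c + suppress_radius + 1] = False
print(img[r, c])  # False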
Given a tensorflow dataset
Train_dataset = tf.data.Dataset.from_tensor_slices((Train_Image_Filenames,Train_Image_Labels))
Train_dataset = Train_dataset.map(Parse_JPEG_Augmented)
...
I would like to stratify my batches to deal with class imbalance. I found tf.contrib.training.stratified_sample and thought I could use it in the following way:
Train_dataset_iter = Train_dataset.make_one_shot_iterator()
Train_dataset_Image_Batch,Train_dataset_Label_Batch = Train_dataset_iter.get_next()
Train_Stratified_Images,Train_Stratified_Labels = tf.contrib.training.stratified_sample(Train_dataset_Image_Batch,Train_dataset_Label_Batch,[1/Classes]*Classes,Batch_Size)
But it gives the following error, and I'm not sure this would let me keep the performance benefits of the tensorflow dataset, since I might then have to pass Train_Stratified_Images and Train_Stratified_Labels via feed_dict?
File "/xxx/xxx/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/training/python/training/sampling_ops.py", line 192, in stratified_sample
with ops.name_scope(name, 'stratified_sample', list(tensors) + [labels]):
File "/xxx/xxx/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 459, in __iter__
"Tensor objects are only iterable when eager execution is "
TypeError: Tensor objects are only iterable when eager execution is enabled. To iterate over this tensor use tf.map_fn.
What would be the "best practice" way of using dataset with stratified batches?
Below is a simple example demonstrating the usage of sample_from_datasets (thanks @Agade for the idea).
import math
import tensorflow as tf
import numpy as np

def print_dataset(name, dataset):
    elems = np.array([v.numpy() for v in dataset])
    print("Dataset {} contains {} elements :".format(name, len(elems)))
    print(elems)

def combine_datasets_balanced(dataset_smaller, size_smaller, dataset_bigger, size_bigger, batch_size):
    ds_smaller_repeated = dataset_smaller.repeat(count=int(math.ceil(size_bigger / size_smaller)))
    # we repeat the smaller dataset so that the 2 datasets are about the same size
    balanced_dataset = tf.data.experimental.sample_from_datasets([ds_smaller_repeated, dataset_bigger], weights=[0.5, 0.5])
    # each element in the resulting dataset is randomly drawn (without replacement) from the even dataset with probability 0.5 or from the odd one with probability 0.5
    balanced_dataset = balanced_dataset.take(2 * size_bigger).batch(batch_size)
    return balanced_dataset

N, M = 3, 10
even = tf.data.Dataset.range(0, 2 * N, 2).repeat(count=int(math.ceil(M / N)))
odd = tf.data.Dataset.range(1, 2 * M, 2)
even_odd = combine_datasets_balanced(even, N, odd, M, 2)

print_dataset("even", even)
print_dataset("odd", odd)
print_dataset("even_odd", even_odd)
Output :
Dataset even contains 12 elements : # 12 = 4 x N (because of .repeat)
[0 2 4 0 2 4 0 2 4 0 2 4]
Dataset odd contains 10 elements :
[ 1 3 5 7 9 11 13 15 17 19]
Dataset even_odd contains 10 elements : # 10 = 2 x M / 2 (2xM because of .take(2 * M) and /2 because of .batch(2))
[[ 0 2]
[ 1 4]
[ 0 2]
[ 3 4]
[ 0 2]
[ 4 0]
[ 5 2]
[ 7 4]
[ 0 9]
[ 2 11]]