Debug when "failed to create the sampler" - rstan

Is it possible to debug the following error? Is there any Stan equivalent of R's browser()?
I want to inspect objects in the transformed data block of a Stan file when rstan::sampling() fails to create a stanfit object.
failed to create the sampler; sampling not done
Error in new_CppObject_xp(fields$.module, fields$.pointer, ...) :
Exception: binomial_rng: Probability parameter is nan, but must be in the interval [0, 1] (in 'model23b420c17ad_SBC' at line 247)
AN ANSWER: Stan's print() as a debugger:
Using print(), we can print any object in the transformed data block, regardless of whether rstan::sampling() subsequently succeeds or fails.
m <- rstan::stan_model(model_code = '
data { real x; }
transformed data {
  real z;
  z = Phi( (-1.14194 + 2.66963) / (-0.257783) );
  print("Here, we can use print() as a debugger")
  print("")
  print("z = ", z)
}
parameters { real y; }
model { y ~ normal(z, 1); }
generated quantities { real zhat = z; }
')
f <- rstan::sampling(m, data = list(x = 1), iter = 100, chains = 1)
rstan::extract(f)[["zhat"]]
As a result of the above toy code, the objects specified in the Stan file are printed in the R console as follows:
> f <- rstan::sampling(m, data = list(x = 1), iter = 100, chains = 1)
Here, we can use print() as a debugger
z = 1.54953e-009
SAMPLING FOR MODEL 'adff65652652045694506de44240c84c' NOW (CHAIN 1).
Chain 1:
Chain 1: Gradient evaluation took 0 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1:
Chain 1:
Chain 1: WARNING: There aren't enough warmup iterations to fit the
Chain 1: three stages of adaptation as currently configured.
Chain 1: Reducing each adaptation stage to 15%/75%/10% of
Chain 1: the given number of warmup iterations:
Chain 1: init_buffer = 7
Chain 1: adapt_window = 38
Chain 1: term_buffer = 5
Chain 1:
Chain 1: Iteration: 1 / 100 [ 1%] (Warmup)
Chain 1: Iteration: 10 / 100 [ 10%] (Warmup)
Chain 1: Iteration: 20 / 100 [ 20%] (Warmup)
Chain 1: Iteration: 30 / 100 [ 30%] (Warmup)
Chain 1: Iteration: 40 / 100 [ 40%] (Warmup)
Chain 1: Iteration: 50 / 100 [ 50%] (Warmup)
Chain 1: Iteration: 51 / 100 [ 51%] (Sampling)
Chain 1: Iteration: 60 / 100 [ 60%] (Sampling)

Related

minisom python package pca initialization code

The PCA initialization code copied from the minisom package is shown below.
def pca_weights_init(self, data):
    """Initializes the weights to span the first two principal components.
    This initialization doesn't depend on random processes and
    makes the training process converge faster.
    It is strongly reccomended to normalize the data before initializing
    the weights and use the same normalization for the training data.
    """
    if self._input_len == 1:
        msg = 'The data needs at least 2 features for pca initialization'
        raise ValueError(msg)
    self._check_input_len(data)
    if len(self._neigx) == 1 or len(self._neigy) == 1:
        msg = 'PCA initialization inappropriate:' + \
              'One of the dimensions of the map is 1.'
        warn(msg)
    pc_length, pc = linalg.eig(cov(transpose(data)))
    pc_order = argsort(-pc_length)
    for i, c1 in enumerate(linspace(-1, 1, len(self._neigx))):
        for j, c2 in enumerate(linspace(-1, 1, len(self._neigy))):
            self._weights[i, j] = c1*pc[pc_order[0]] + c2*pc[pc_order[1]]
Shouldn't the last line be changed to the following, since eigenvectors are returned column-wise?
self._weights[i, j] = c1*pc[:,pc_order[0]] + c2*pc[:,pc_order[1]]
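The reading of the documentation behind this question can be checked directly: numpy.linalg.eig documents that column v[:, i] is the eigenvector for eigenvalue w[i]. A minimal standalone check (plain NumPy, not minisom itself):
import numpy as np

# A non-symmetric matrix, so rows and columns of the eigenvector
# matrix clearly differ.
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
vals, vecs = np.linalg.eig(A)

for i in range(len(vals)):
    col = vecs[:, i]  # column i satisfies A @ v == lambda * v
    row = vecs[i, :]  # row i does not, for this matrix
    print(np.allclose(A @ col, vals[i] * col),
          np.allclose(A @ row, vals[i] * row))
# prints "True False" twice, supporting the proposed pc[:, pc_order[k]] fix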

How to stop the iteration when the Jacobian reaches an arbitrary (small) value in the Newton-CG method?

How can I put a stopping condition on the Jacobian (or gradient) for the Newton-CG method?
I want the algorithm to stop when the Jacobian reaches 1e-2. Is that possible with Newton-CG?
input:
scipy.optimize.minimize(f, [5.0,1.0,2.0,5.0], args=Data, method='Newton-CG',jac=Jacf)
output:
     jac: array([7.64265411e-08, 1.74985718e-08, 4.12408407e-07, 5.02972841e-08])
 message: 'Optimization terminated successfully.'
    nfev: 12
    nhev: 0
     nit: 11
    njev: 68
  status: 0
 success: True
       x: array([0.22545395, 0.3480084 , 1.06811724, 1.64873479])
The BFGS method, which is similar to Newton-CG, has a gtol option that stops the iteration when the gradient reaches some value, but Newton-CG has no such option.
Does anyone know how to stop the iteration when the Jacobian reaches 1e-2?
Here are some details to reproduce my code:
import numpy as np
import pandas as pd

def convert_line2matrix(a):
    n = len(a)
    if np.sqrt(n) % 1 == 0:
        d = int(np.sqrt(n))
        Mat = np.zeros((d, d))
        for i in range(d):
            for j in range(d):
                Mat[i, j] = a[j + d*i]
    else:
        raise ValueError(f"{a} can't be converted into a (n x n) matrix. "
                         f"The array has {len(a)} elements, thus it is "
                         f"impossible to build a square matrix with {len(a)} elements.")
    return Mat

def convert_matrix2line(Matrix):
    result = []
    dim = len(Matrix)
    for i in range(dim):
        for j in range(dim):
            result.append(Matrix[i, j])
    return np.array(result)
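As a side note (not part of the question's code), both helpers are equivalent to NumPy's row-major reshape and ravel, which match the a[j + d*i] indexing above; a minimal sketch with hypothetical _np-suffixed names:
import numpy as np

def convert_line2matrix_np(a):
    # Row-major reshape reproduces Mat[i, j] = a[j + d*i].
    d = int(np.sqrt(len(a)))
    if d * d != len(a):
        raise ValueError(f"cannot reshape {len(a)} elements into a square matrix")
    return np.asarray(a, dtype=float).reshape(d, d)

def convert_matrix2line_np(Matrix):
    # ravel() walks the matrix row by row, like the nested loops above.
    return np.asarray(Matrix, dtype=float).ravel()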
my_data = np.array([[0.21530249, 0.32450331, 0],
                    [0.1930605 , 0.31788079, 0],
                    [0.17793594, 0.31788079, 0],
                    [0.16459075, 0.31125828, 1],
                    [0.24822064, 0.31125828, 0],
                    [0.28647687, 0.32450331, 0],
                    [0.32829181, 0.31788079, 0],
                    [0.38879004, 0.32450331, 0],
                    [0.42882562, 0.32450331, 0],
                    [0.47419929, 0.32450331, 0],
                    [0.5044484 , 0.32450331, 0],
                    [0.1797153 , 0.31125828, 0],
                    [0.16548043, 0.31125828, 1],
                    [0.17793594, 0.29801325, 1],
                    [0.1930605 , 0.31788079, 0]])
Data = pd.DataFrame(my_data, columns=['X_1', 'X_2', 'Allum'])
def logLB(params, Data):
    B = convert_line2matrix(params)
    X = np.array(Data.iloc[:, :len(B)])
    Y = np.array(Data.iloc[:, len(B)])
    result = 0
    n = len(Data)
    BB = np.transpose(B) @ B
    for i in range(n):
        if 1 - np.exp(-X[i].T @ BB @ X[i]) > 0:
            result += Y[i]*(-np.transpose(X[i]) @ BB @ X[i]) + \
                      (1 - Y[i])*np.log(1 - np.exp(-X[i].T @ BB @ X[i]))
    return result

def f(params, Data):
    return -logLB(params, Data)
def dlogLB(params, Data):
    B = convert_line2matrix(params)
    X = np.array(Data.iloc[:, :len(B)])
    Y = np.array(Data.iloc[:, len(B)])
    BB = B.T @ B
    N = len(Data)
    M = len(B)
    Jacobian = np.zeros(np.shape(B))
    for n in range(len(B)):
        for m in range(len(B)):
            result = 0
            for c in range(N):
                som = 0
                for i in range(M):
                    som += X[c, m]*B[n, i]*X[c, i]
                if 1 - np.exp(-X[c].T @ BB @ X[c]) > 0:
                    result += -2*Y[c]*som + (1 - Y[c])*np.exp(-X[c].T @ BB @ X[c])*(2*som)/(1 - np.exp(-X[c].T @ BB @ X[c]))
            Jacobian[n, m] = result
    return convert_matrix2line(Jacobian)

def Jacf(params, Data):
    return -dlogLB(params, Data)
I assume that you want to stop the optimizer as soon as the Euclidean norm of the gradient reaches a specific value, which is exactly the meaning of the BFGS method's gtol option. Otherwise, it doesn't make sense mathematically, since the evaluated gradient is a vector and thus can't be compared to a scalar value.
The Newton-CG method doesn't provide a similar option. However, you could use a simple callback that is called after each iteration and terminates the algorithm when the callback returns True. Unfortunately, the callback can only terminate the optimizer with the trust-constr method; for all other methods the callback's return value is ignored, so this is very limited.
A possible hacky and ugly way to terminate the optimizer from the callback anyway is to raise an exception:
import numpy as np
from scipy.optimize import minimize

class Callback:
    def __init__(self, eps, args, jac):
        self.eps = eps
        self.args = args
        self.jac = jac
        self.x = None
        self.gtol = None

    def __call__(self, xk):
        self.x = xk
        self.gtol = np.linalg.norm(self.jac(xk, *self.args))
        if self.gtol <= self.eps:
            raise Exception("Gradient norm is below threshold")
Here, xk is the current iterate, eps your desired tolerance, args a tuple containing your optional objective and gradient arguments, and jac the gradient. Then, you can use it like this:
cb = Callback(1.0e-1, (Data,), Jacf)

try:
    res = minimize(f, [5.0, 1.0, 2.0, 5.0], args=Data, method='Newton-CG',
                   jac=Jacf, callback=cb)
except Exception:
    x = cb.x
    gtol = cb.gtol
    print(f"gtol = {gtol:E}, x = {x}")
which yields
gtol = 5.515263E-02, x = [14.43322108 -5.18163542 0.22582261 -0.04859385]
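If you want to try the callback without the data above, here is a self-contained sketch using SciPy's Rosenbrock test function (rosen and rosen_der ship with scipy.optimize); the 1e-2 threshold is just an illustrative choice:
from scipy.optimize import minimize, rosen, rosen_der

# rosen_der takes no extra arguments, so args is an empty tuple.
cb = Callback(1.0e-2, (), rosen_der)
try:
    res = minimize(rosen, [1.3, 0.7, 0.8, 1.9, 1.2],
                   method='Newton-CG', jac=rosen_der, callback=cb)
    print(res.x)  # finished before the gradient norm dropped below 1e-2
except Exception:
    print(f"stopped early: gtol = {cb.gtol:E}, x = {cb.x}")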

Using tensorflow dataset with stratified sampling

Given a tensorflow dataset
Train_dataset = tf.data.Dataset.from_tensor_slices((Train_Image_Filenames,Train_Image_Labels))
Train_dataset = Train_dataset.map(Parse_JPEG_Augmented)
...
I would like to stratify my batches to deal with class imbalance. I found tf.contrib.training.stratified_sample and thought I could use it in the following way:
Train_dataset_iter = Train_dataset.make_one_shot_iterator()
Train_dataset_Image_Batch,Train_dataset_Label_Batch = Train_dataset_iter.get_next()
Train_Stratified_Images,Train_Stratified_Labels = tf.contrib.training.stratified_sample(Train_dataset_Image_Batch,Train_dataset_Label_Batch,[1/Classes]*Classes,Batch_Size)
But it gives the following error, and I'm not sure this approach would keep the performance benefits of the tf.data pipeline, since I might then have to pass Train_Stratified_Images and Train_Stratified_Labels via feed_dict:
File "/xxx/xxx/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/training/python/training/sampling_ops.py", line 192, in stratified_sample
with ops.name_scope(name, 'stratified_sample', list(tensors) + [labels]):
File "/xxx/xxx/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 459, in __iter__
"Tensor objects are only iterable when eager execution is "
TypeError: Tensor objects are only iterable when eager execution is enabled. To iterate over this tensor use tf.map_fn.
What would be the "best practice" way of using dataset with stratified batches?
Below is a simple example demonstrating the usage of sample_from_datasets (thanks @Agade for the idea).
import math
import tensorflow as tf
import numpy as np

def print_dataset(name, dataset):
    elems = np.array([v.numpy() for v in dataset])
    print("Dataset {} contains {} elements :".format(name, len(elems)))
    print(elems)

def combine_datasets_balanced(dataset_smaller, size_smaller, dataset_bigger, size_bigger, batch_size):
    # repeat the smaller dataset so that the two datasets are about the same size
    ds_smaller_repeated = dataset_smaller.repeat(count=int(math.ceil(size_bigger / size_smaller)))
    # each element of the result is randomly drawn (without replacement) from the
    # repeated smaller dataset with probability 0.5 or from the bigger one with probability 0.5
    balanced_dataset = tf.data.experimental.sample_from_datasets(
        [ds_smaller_repeated, dataset_bigger], weights=[0.5, 0.5])
    balanced_dataset = balanced_dataset.take(2 * size_bigger).batch(batch_size)
    return balanced_dataset

N, M = 3, 10
even = tf.data.Dataset.range(0, 2 * N, 2).repeat(count=int(math.ceil(M / N)))
odd = tf.data.Dataset.range(1, 2 * M, 2)
even_odd = combine_datasets_balanced(even, N, odd, M, 2)

print_dataset("even", even)
print_dataset("odd", odd)
print_dataset("even_odd", even_odd)
Output :
Dataset even contains 12 elements : # 12 = 4 x N (because of .repeat)
[0 2 4 0 2 4 0 2 4 0 2 4]
Dataset odd contains 10 elements :
[ 1 3 5 7 9 11 13 15 17 19]
Dataset even_odd contains 10 elements : # 10 = 2 x M / 2 (2xM because of .take(2 * M) and /2 because of .batch(2))
[[ 0 2]
[ 1 4]
[ 0 2]
[ 3 4]
[ 0 2]
[ 4 0]
[ 5 2]
[ 7 4]
[ 0 9]
[ 2 11]]
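A side note on API churn: sample_from_datasets later moved out of the experimental namespace. On recent TF 2.x releases (around 2.7 and later, to the best of my knowledge) the same call should be available as a static method on tf.data.Dataset; dataset_a and dataset_b below are placeholders for the two datasets being mixed:
import tensorflow as tf

# Same semantics as tf.data.experimental.sample_from_datasets,
# just under the stable namespace in newer TF releases.
balanced_dataset = tf.data.Dataset.sample_from_datasets(
    [dataset_a, dataset_b], weights=[0.5, 0.5])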

Matrix Inversion in CBLAS/LAPACK vs Python

The matrix I am trying to invert is:
    [ 1  0  1]
A = [ 2  0  1]
    [-1  1  1]
The true inverse is:
       [-1  1  0]
A^-1 = [-3  2  1]
       [ 2 -1  0]
Using Python's numpy.linalg.inv, I get the correct answer. One of my routines for the matrix inverse uses dgetri_:
void compute_matrix_inverse_dbl( double* matrix,
                                 int order,
                                 double* inverse )
{
    int N, lwork;
    int success;
    int *pivot;
    double* workspace;

    //===Allocate Space===//
    pivot = malloc(order * order * order * sizeof(*pivot));
    workspace = malloc(order * order * sizeof(*workspace));

    //===Run Setup===//
    N = order;
    copy_array_dbl(matrix, order*order, inverse);
    lwork = order*order;

    //===Factor Matrix===//
    dgetrf_(&N, &N, inverse, &N, pivot, &success);

    //===Compute Inverse===//
    dgetri_(&N, inverse, &N, pivot, workspace, &lwork, &success);

    //===Clean Up===//
    free(workspace);
    free(pivot);
    return;
}
Using this routine, I get:
       [-1  1  ±e1]
A^-1 = [-3  2   1 ]
       [ 2 -1  ±e2]
where e1 and e2 are small numbers on the order of machine precision, 1e-16. Now perhaps dgetri_ is not the best routine to use; however, when I invert using QR decomposition via zgeqrf_ and zungqr_, I get a similar answer, and when I use dgesvd_ to invert via SVD, I get a similar answer as well.
Python seems to use a routine called _umath_linalg.inv. So I have a few questions:
What does that routine do?
What CBLAS/LAPACK routine can I use to invert this matrix and get a result like Python's, such that e1 and e2 are replaced by proper zeros?
It seems that numpy.linalg is a lite version of scipy.linalg, as per its description:
This module is a lite version of the linalg.py module in SciPy which
contains high-level Python interface to the LAPACK library.
Looking at scipy.linalg.inv, it calls getrf, then getri.
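Since numpy.linalg.inv therefore bottoms out in the same getrf/getri pair, the tiny e1 and e2 look like ordinary floating-point round-off from LU-based inversion rather than a bug in the C code. A quick NumPy check on the matrix from the question (no specific output asserted):
import numpy as np

A = np.array([[ 1., 0., 1.],
              [ 2., 0., 1.],
              [-1., 1., 1.]])

A_inv = np.linalg.inv(A)
print(A_inv)

# Residual of the reconstruction A @ A_inv - I; anything around
# 1e-16 is at double-precision machine epsilon and is harmless.
print(np.max(np.abs(A @ A_inv - np.eye(3))))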

WinBUGS Examples Vol 1, Dyes example returns error

I am currently going through Examples Volume 1 and came across an error with the Dyes example.
When I try to load inits from the example, it returns "this chain contains uninitialized variables". I am not sure which part is wrong, since at first sight theta, tau.btw, and tau.with are all specified and nothing is left out.
I am using the code directly from Examples Vol 1 under the Help tab. The same error occurred for all three choices of prior for the between-variation.
I would really appreciate any advice on the problem. Thanks in advance.
Below is the code I copied directly from the dyes example.
model
{
    for (i in 1 : batches) {
        mu[i] ~ dnorm(theta, tau.btw)
        for (j in 1 : samples) {
            y[i, j] ~ dnorm(mu[i], tau.with)
        }
    }
    theta ~ dnorm(0.0, 1.0E-10)

    # prior for within-variation
    sigma2.with <- 1 / tau.with
    tau.with ~ dgamma(0.001, 0.001)

    # Choice of priors for between-variation
    # Prior 1: uniform on SD
    #sigma.btw ~ dunif(0, 100)
    #sigma2.btw <- sigma.btw * sigma.btw
    #tau.btw <- 1 / sigma2.btw

    # Prior 2: uniform on intra-class correlation coefficient,
    # ICC = sigma2.btw / (sigma2.btw + sigma2.with)
    ICC ~ dunif(0, 1)
    sigma2.btw <- sigma2.with * ICC / (1 - ICC)
    tau.btw <- 1 / sigma2.btw

    # Prior 3: gamma(0.001, 0.001) NOT RECOMMENDED
    #tau.btw ~ dgamma(0.001, 0.001)
    #sigma2.btw <- 1 / tau.btw
}
Data
list(batches = 6, samples = 5,
     y = structure(
         .Data = c(1545, 1440, 1440, 1520, 1580,
                   1540, 1555, 1490, 1560, 1495,
                   1595, 1550, 1605, 1510, 1560,
                   1445, 1440, 1595, 1465, 1545,
                   1595, 1630, 1515, 1635, 1625,
                   1520, 1455, 1450, 1480, 1445), .Dim = c(6, 5)))
Inits1
list(theta = 1500, tau.with = 1, sigma.btw = 1)
Inits2
list(theta = 1500, tau.with = 1, ICC = 0.5)
Inits3
list(theta = 1500, tau.with = 1, tau.btw = 1)
That is not an error per se. Yes, you have provided inits for the parameters of interest.
However, the six mu[i] variables are not data; they are variables drawn from mu[i] ~ dnorm(theta, tau.btw).
You could provide initial values for these as well (for example, mu = c(1500, 1500, 1500, 1500, 1500, 1500), matching the scale of theta), but in my opinion it is best to just click on "gen inits" if you are using WinBUGS from the GUI; this will generate initial values for those.