Binomial regression with group effects - Bayesian

I am trying to build a binomial regression model with rstan.
The aim is to estimate the effect size bt between the two conditions X for a group t as a whole, and the effect size btg for its subgroups g.
library(rstan)

df <- data.frame(hits = c(36, 1261, 36, 1261, 49, 1248,
                          17, 7670, 25, 759, 29, 755),
                 trials = c(118, 53850, 184, 53784, 209, 53759,
                            118, 53850, 184, 53784, 209, 53759),
                 X = rep(c(1, 0), 6),
                 g = rep(rep(1:3, each = 2), 2),
                 t = rep(1:2, each = 6),
                 tg = rep(1:6, each = 2))

stanIn <- list(Nt = length(unique(df$t)),    #number of groups t
               Nc = length(df$t),            #number of rows
               Ng = length(unique(df$g)),    #number of subgroups g
               Ntg = length(unique(df$tg)),  #number of t and g combinations
               N = df$trials,
               n = df$hits,
               X = df$X,    #condition 1 or 0
               t = df$t,    #index of groups
               g = df$g,    #index of subgroups
               tg = df$tg)  #index of combinations between t and g

model <- stan(data = stanIn, file = "minimal.stan", chains = 4)
with minimal.stan as below.
data {
  int<lower=1> Nt;
  int<lower=1> Nc;
  int<lower=1> Ng;
  int<lower=1> Ntg;
  int<lower=1> N[Nc];
  int<lower=0> n[Nc];
  int<lower=0,upper=1> X[Nc];
  int<lower=1> t[Nc];
  int<lower=0> g[Nc];
  int<lower=1> tg[Nc];
}
parameters {
  real at[Nt];   // group intercepts
  real bt[Nt];   // group slopes
  real btg[Ntg]; // subgroup slopes
  real atg[Ntg]; // subgroup intercepts
}
transformed parameters {
  vector[Nc] theta;   // binomial probabilities
  for (i in 1:Nc) {   // linear model
    theta[i] = inv_logit((atg[tg[i]] + at[t[i]]) + (bt[t[i]] + btg[tg[i]]) * X[i]);
    //theta[i] = inv_logit(at[t[i]] + bt[t[i]] * X[i]);      // group effect
    //theta[i] = inv_logit(atg[tg[i]] + btg[tg[i]] * X[i]);  // subgroup effects
  }
}
model {
  at ~ normal(0.0, 20.0);
  bt ~ normal(0.0, 20.0);
  atg ~ normal(0.0, 20.0);
  btg ~ normal(0.0, 20.0);
  n ~ binomial(N, theta);
}
I can model the overall group effect with the first commented line (in transformed parameters) and the subgroup effects with the second commented line. The idea was to combine both to get a group effect plus each subgroup's deviation from it (the uncommented line).
However, this gives really weird results for bt and btg (A), and I was expecting something more like (B). (I cannot recreate the behavior seen in A with a minimal example; it only occurs with the full dataset.)
If it's not apparent from the type of question, I am completely new to statistical modeling and suspect I have a conceptual error. I would be grateful for any hint on this issue, or for a source to read up on these things (it feels like a common setup, but I did not find anything).

I see several issues. Most directly, the model is not identified, because the parameters atg and btg already include at and bt. I've changed those to ag and bg, since they are your subgroup parameters, and indexed them as such below:
library(rstan)
df <- data.frame(hits = c(36, 1261, 36, 1261, 49, 1248,
17, 7670, 25, 759, 29, 755),
trials = c(118, 53850, 184, 53784, 209, 53759,
118, 53850, 184, 53784, 209, 53759),
X = rep(c(1,0), 6),
g = rep(rep(1:3, each=2), 2),
t = rep(1:2, each=6),
tg = rep(1:6, each=2) )
stanIn <- list(Nt = length(unique(df$t)), #number of groups t
Nc = length(df$t), #number of rows
Ng = length(unique(df$g)), #number of subgroups g
N = df$trials,
n = df$hits,
X = df$X, #condition 1 or 0
t = df$t, #index of groups
g = df$g) #index of subgroups
model <- stan(data = stanIn, file = "minimal.stan",
cores = 4, chains = 4,
control = list(max_treedepth = 14))
The following modified Stan model samples without issues:
data {
  int<lower=1> Nt;
  int<lower=1> Nc;
  int<lower=1> Ng;
  int<lower=1> N[Nc];
  int<lower=0> n[Nc];
  vector<lower=0,upper=1>[Nc] X;
  int<lower=1> t[Nc];
  int<lower=0> g[Nc];
}
parameters {
  vector<offset=0, multiplier=20>[Nt] at; // group intercepts
  vector<offset=0, multiplier=20>[Nt] bt; // group slopes
  vector<offset=0, multiplier=20>[Ng] ag; // subgroup intercepts
  vector<offset=0, multiplier=20>[Ng] bg; // subgroup slopes
}
model {
  at ~ normal(0.0, 20.0);
  bt ~ normal(0.0, 20.0);
  ag ~ normal(0.0, 20.0);
  bg ~ normal(0.0, 20.0);
  n ~ binomial_logit(N, ag[g] + at[t] + (bt[t] + bg[g]) .* X);
}
generated quantities {
  vector[Nc] theta = inv_logit(ag[g] + at[t] + (bt[t] + bg[g]) .* X);
}
Of note, I had to use a high max_treedepth to fit the model, but it's hard for me to comment on that without understanding the data. I've also moved theta to generated quantities in case you want those values, since binomial_logit handles the inverse-logit directly.
I've also set up a non-centered parameterization using offset and multiplier so that the Stan sampler has a better-scaled parameter space to sample from. Finally, I've replaced the loops with vectorized expressions.
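For intuition, the offset/multiplier declaration is roughly equivalent to the following explicit non-centered construction (a sketch only, using the same normal(0, 20) prior as above):
data {
  int<lower=1> Nt;
}
parameters {
  vector[Nt] at_raw;                // sampled on a unit scale
}
transformed parameters {
  vector[Nt] at = 0 + 20 * at_raw;  // shift by the offset, scale by the multiplier
}
model {
  at_raw ~ normal(0, 1);            // implies at ~ normal(0, 20)
}
Declaring at with offset=0, multiplier=20 lets Stan do this rescaling internally, so the model block can keep the natural at ~ normal(0, 20) statement.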


Supply different families of priors as a parameter in the bugs/stan model

This is the classic eight schools example from Bayesian Data Analysis by Andrew Gelman. Please see the Stan file and R code below. I use a Cauchy prior with parameter A for the hyperparameter tau in the Stan file. I am trying to supply the R function "school" with priors outside the Cauchy family, for example a uniform(0, 1000) prior, so that I do not have to create a different Stan file for each new prior. Is this possible within Stan or BUGS?
schools.stan:
data {
  int<lower=0> J;          // number of schools
  real y[J];               // estimated treatment effects
  real<lower=0> sigma[J];  // standard error of effect estimates
  real<lower=0> A;
}
parameters {
  real mu;                 // population treatment effect
  real<lower=0> tau;       // standard deviation in treatment effects
  vector[J] eta;           // unscaled deviation from mu by school
}
transformed parameters {
  vector[J] theta = mu + tau * eta;  // school treatment effects
}
model {
  eta ~ normal(0, 1);
  y ~ normal(theta, sigma);
  tau ~ cauchy(0, A);
}

school <- function(A = 100){
  schools_dat <- list(J = 8,
                      y = c(28, 8, -3, 7, -1, 1, 18, 12),
                      sigma = c(15, 10, 16, 11, 9, 11, 10, 18),
                      A = A)
  fit <- stan(file = "schools.stan", data = schools_dat, iter = 20)
  print(fit)
}
school()
I tried the following, but have no idea how to change the Stan file correspondingly.
school <- function(prior = "dunif(0,1000"){
  schools_dat <- list(J = 8,
                      y = c(28, 8, -3, 7, -1, 1, 18, 12),
                      sigma = c(15, 10, 16, 11, 9, 11, 10, 18),
                      prior = prior)
  fit <- stan(file = "schools.stan", data = schools_dat, iter = 20)
  print(fit)
}
school()
It's possible to pre-specify more than one distribution in the Stan code, and then specify which distribution you want in the input data. Stan isn't really intended to be used this way, but it can be done!
Here's an example. I've added a new data variable, tau_prior; it's an integer that specifies which prior you want to use for tau. 1 = Cauchy, 2 = uniform, 3 = exponential. In addition, for each type of prior, there's a data variable that sets a hyperparameter. (Hyperparameters for the distributions that aren't chosen have no effect.)
data {
  int<lower=0> J;          // number of schools
  real y[J];               // estimated treatment effects
  real<lower=0> sigma[J];  // standard error of effect estimates
  int<lower=1,upper=3> tau_prior;
  real<lower=0> cauchy_sigma;
  real<lower=0> uniform_beta;
  real<lower=0> exponential_beta;
}
parameters {
  real mu;                 // population treatment effect
  real<lower=0> tau;       // standard deviation in treatment effects
  vector[J] eta;           // unscaled deviation from mu by school
}
transformed parameters {
  vector[J] theta = mu + tau * eta;  // school treatment effects
}
model {
  eta ~ normal(0, 1);
  y ~ normal(theta, sigma);
  if (tau_prior == 1) {
    tau ~ cauchy(0, cauchy_sigma);
  } else if (tau_prior == 2) {
    tau ~ uniform(0, uniform_beta);
  } else if (tau_prior == 3) {
    tau ~ exponential(exponential_beta);
  }
}
I've also modified the R function so that it provides default values for each hyperparameter, on a scale similar to the one you've used already.
school <- function(tau_prior = 1,
                   cauchy_sigma = 100,
                   uniform_beta = 1000,
                   exponential_beta = 0.01) {
  schools_dat <- list(J = 8,
                      y = c(28, 8, -3, 7, -1, 1, 18, 12),
                      sigma = c(15, 10, 16, 11, 9, 11, 10, 18),
                      tau_prior = tau_prior,
                      cauchy_sigma = cauchy_sigma,
                      uniform_beta = uniform_beta,
                      exponential_beta = exponential_beta)
  fit <- stan(file = "schools.stan", data = schools_dat, iter = 20)
  print(fit)
}
# The default: use a Cauchy prior with scale 100.
school()
# Use a uniform prior with the default upper limit (1000).
school(tau_prior = 2)
# Use an exponential prior with a non-default rate (1).
school(tau_prior = 3, exponential_beta = 1)

How to sample from a sum of two distributions: binomial and poisson

Is there a way to predict a value from a sum of two distributions? I am getting a syntax error from rstan when I try to estimate y here: y ~ binomial(,) + poisson()
library(rstan)
BH_model_block <- "
data{
int y;
int a;
}
parameters{
real <lower = 0, upper = 1> c;
real <lower = 0, upper = 1> b;
}
model{
y ~ binomial(a,b)+ poisson(c);
}
"
BH_model <- stan_model(model_code = BH_model_block)
BH_fit <- sampling(BH_model,
data = list(y = 5,
a = 2),
iter= 1000)
Produces this error:
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
error in 'model2c6022623d56_457bd7ab767c318c1db686d1edf0b8f6' at line 13, column 20
-------------------------------------------------
11:
12: model{
13: y ~ binomial(a,b)+ poisson(c);
^
14: }
-------------------------------------------------
PARSER EXPECTED: ";"
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model '457bd7ab767c318c1db686d1edf0b8f6' due to the above error.
Stan doesn't support integer parameters, so you can't technically do that. For two real variables, it'd look like this:
parameters {
  real x;
  real y;
}
transformed parameters {
  real z = x + y;
}
model {
  x ~ normal(0, 1);
  y ~ gamma(0.1, 2);
}
Then you get the sum distribution for z. If the variables are discrete, it won't compile.
If you don't need z in the model, then you can do this in the generated quantities block,
generated quantities {
  int x = binomial_rng(a, b);
  int y = poisson_rng(c);
  int z = x + y;
}
The drawback of doing this is that none of the variables are available in the model block. If you need discrete parameters, they need to be marginalized as described in the user's guide chapter on latent discrete parameters (also in the chapter on mixtures and HMMs). This is not so easy with a Poisson, because its support isn't bounded. If the expectations of the two discrete distributions are small, then you can do it approximately with a loop over plausible values.
It looks from the example in the original post like z is data (y in your code). That's a slightly different marginalization over x and y: you only sum over the x and y such that x + y = z, so the combinatorics are greatly reduced.
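For concreteness, here is a minimal sketch of that marginalization, assuming the observed sum z and the binomial trial count a are passed in as data (the positive constraint on c is my choice; this is an illustration of the approach, not tested against your data):
data {
  int<lower=0> z;  // observed sum of the binomial and poisson draws
  int<lower=0> a;  // number of binomial trials
}
parameters {
  real<lower=0, upper=1> b;  // binomial success probability
  real<lower=0> c;           // poisson rate
}
model {
  // sum over all x such that x + (z - x) = z, i.e. x in 0:min(a, z)
  vector[min(a, z) + 1] lp;
  for (x in 0:min(a, z))
    lp[x + 1] = binomial_lpmf(x | a, b) + poisson_lpmf(z - x | c);
  target += log_sum_exp(lp);
}
Each term in lp is the log probability of one way a binomial count x and a Poisson count z - x could have produced the observed total, and log_sum_exp adds them on the probability scale.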
An alternative is to replace the Binomial with a Poisson, and use Poisson additivity:
BH_model_block <- "
data{
  int y;
  int a;
}
parameters{
  real <lower = 0, upper = 1> c;
  real <lower = 0, upper = 1> b;
}
model{
  y ~ poisson(a * b + c);
}
"
This differs in that if b is not small, the Binomial has a lower variance than the Poisson, but maybe there is overdispersion anyhow?
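For reference on that variance claim (standard moment formulas, not specific to this model):

$$\operatorname{Var}[\mathrm{Binomial}(a, b)] = a\,b\,(1 - b), \qquad \operatorname{Var}[\mathrm{Poisson}(a\,b)] = a\,b,$$

so swapping the Binomial term for a Poisson inflates the variance of the sum by $a b^2$, which is negligible only when b is small.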

Different FFT results from Matlab fft and Objective-c fft

Here is my code in Matlab:
x = [1 2 3 4];
result = fft(x);
a = real(result);
b = imag(result);
Result from matlab:
a = [10,-2,-2,-2]
b = [ 0, 2, 0,-2]
And my runnable code in Objective-C:
int length = 4;
float* x = (float *)malloc(sizeof(float) * length);
x[0] = 1;
x[1] = 2;
x[2] = 3;
x[3] = 4;
// Setup the length
vDSP_Length log2n = log2f(length);
// Calculate the weights array. This is a one-off operation.
FFTSetup fftSetup = vDSP_create_fftsetup(log2n, FFT_RADIX2);
// For an FFT, numSamples must be a power of 2, i.e. is always even
int nOver2 = length/2;
// Define complex buffer
COMPLEX_SPLIT A;
A.realp = (float *) malloc(nOver2*sizeof(float));
A.imagp = (float *) malloc(nOver2*sizeof(float));
// Generate a split complex vector from the sample data
vDSP_ctoz((COMPLEX*)x, 2, &A, 1, nOver2);
// Perform a forward FFT using fftSetup and A
vDSP_fft_zrip(fftSetup, &A, 1, log2n, FFT_FORWARD);
//Take the fft and scale appropriately
Float32 mFFTNormFactor = 0.5;
vDSP_vsmul(A.realp, 1, &mFFTNormFactor, A.realp, 1, nOver2);
vDSP_vsmul(A.imagp, 1, &mFFTNormFactor, A.imagp, 1, nOver2);
printf("After FFT: \n");
printf("%.2f | %.2f \n",A.realp[0], 0.0);
for (int i = 1; i < nOver2; i++) {
    printf("%.2f | %.2f \n", A.realp[i], A.imagp[i]);
}
printf("%.2f | %.2f \n",A.imagp[0], 0.0);
The output from Objective-C:
After FFT:
10.0 | 0.0
-2.0 | 2.0
The results are close, but I wonder where the rest is? I know I missed something, but I don't know what it is.
Update: I found another answer here. I updated the output:
After FFT:
10.0 | 0.0
-2.0 | 2.0
-2.0 | 0.0
but even then there is still one element missing: -2.0 | -2.0
Performing an FFT delivers a right-hand spectrum and a left-hand spectrum.
If you have N samples, the frequencies you will return are:
( -f(N/2), -f(N/2-1), ..., -f(1), f(0), f(1), f(2), ..., f(N/2-1) )
If A(f(i)) is the complex amplitude A of the frequency component f(i), the following relation is true:
Real{A(f(i))} = Real{A(-f(i))} and Imag{A(f(i))} = -Imag{A(-f(i))}
This means the information in the right-hand spectrum and the left-hand spectrum is the same. However, the sign of the imaginary part is different.
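Equivalently, in DFT index notation this is the standard conjugate-symmetry property of real-valued signals (stated here for reference):

$$X[N - k] = \overline{X[k]}, \qquad k = 1, \dots, N - 1,$$

so the left-hand (negative-frequency) half of the spectrum is just the complex conjugate of the right-hand half.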
Matlab returns the frequencies in a different order.
Matlab's order is:
( f(0), f(1), f(2), ..., f(N/2-1), -f(N/2), -f(N/2-1), ..., -f(1) )
To get the order above, use the Matlab function fftshift().
In the case of 4 samples you get in Matlab:
a = [10,-2,-2,-2]
b = [ 0, 2, 0,-2]
This means:
A(f(0)) = 10 (DC value)
A(f(1)) = -2 + 2i (first frequency component of the right-hand spectrum)
A(-f(2)) = -2 (second frequency component of the left-hand spectrum)
A(-f(1)) = -2 - 2i (first frequency component of the left-hand spectrum)
I do not understand your Objective-C code.
However, it seems to me that the program returns the right-hand spectrum only.
So everything is fine.

Dirichlet-Multinomial WinBUGS code

I'm trying to code a Dirichlet-multinomial model using BUGS.
Basically I have 18 regions and 3 categories per region. For example, in
Region 1: 0.50 belongs to Low, 0.30 belongs to Middle, and 0.20 belongs to High. The list goes on to Region 18, of course with varying proportions. The only code I have is this:
model {
  for (i in 1:N) {
    x[1:3] ~ dmulti(p[], n[i])
    p[1:3] ~ ddirch(alpha[])
  }
  for (k in 1:3) {
    alpha[k] <- 1
  }
}

DATA
list(n = c(38483, 2259, 1900),
     x = c(29256.42719, 1857.431404, 1548.007808,
           29256.42719, 1857.431404, 1548.007808,
           29256.42719, 1857.431404, 1548.007808),
     N = 3)
I shortened it to 3 regions just as an example. It shows 'Dirichlet36' after clicking 'gen inits'. Please help me to code this.
This may be helpful (source):
Learning about the parameters of a Dirichlet distribution
Suppose as part of a model there are J probability arrays p[j, 1:K], j = 1, ..., J, where K is the dimension of each array and sum(p[j, 1:K]) = 1 for all j. We give each of them a Dirichlet prior:
p[j, 1:K] ~ ddirch(alpha[])
and we would like to learn about alpha[]. However, the parameters alpha[] of a Dirichlet distribution cannot be stochastic nodes. The trick is to note that if delta[k] ~ dgamma(alpha[k], 1), then the vector with elements delta[k] / sum(delta[1:K]), k = 1, ..., K, is Dirichlet with parameters alpha[k], k = 1, ..., K. So the following construction should allow learning about the parameters alpha[]:
for (k in 1:K) {
  p[j, k] <- delta[j, k] / sum(delta[j,])
  delta[j, k] ~ dgamma(alpha[k], 1)
}
A prior can be put directly on the alpha[k]'s.
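For reference, the distributional fact being used here is the standard Gamma-to-Dirichlet construction:

$$\delta_k \sim \mathrm{Gamma}(\alpha_k, 1) \ \text{independently} \;\Longrightarrow\; \left(\frac{\delta_1}{\sum_j \delta_j}, \ldots, \frac{\delta_K}{\sum_j \delta_j}\right) \sim \mathrm{Dirichlet}(\alpha_1, \ldots, \alpha_K).$$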

Checking if lines intersect and if so return the coordinates

I've written some code below to check whether two line segments intersect and, if they do, to tell me where. As input I have the (x, y) coordinates of both ends of each line. It appeared to be working correctly, but in the scenario where line A is (532.87, 787.79)-(486.34, 769.85) and line B is (490.89, 764.018)-(478.98, 783.129), it says they intersect at (770.136, 487.08) even though the lines don't intersect at all.
Does anyone have any idea what is incorrect in the code below?
double dy[2], dx[2], m[2], b[2];
double xint, yint, xi, yi;
WsqT_Location_Message *location_msg_ptr = OPC_NIL;

FIN (intersect (<args>));

dy[0] = y2 - y1;
dx[0] = x2 - x1;
dy[1] = y4 - y3;
dx[1] = x4 - x3;
m[0] = dy[0] / dx[0];
m[1] = dy[1] / dx[1];
b[0] = y1 - m[0] * x1;
b[1] = y3 - m[1] * x3;

if (m[0] != m[1])
{
    //slopes not equal, compute intercept
    xint = (b[0] - b[1]) / (m[1] - m[0]);
    yint = m[1] * xint + b[1];

    //is intercept in both line segments?
    if ((xint <= max(x1, x2)) && (xint >= min(x1, x2)) &&
        (yint <= max(y1, y2)) && (yint >= min(y1, y2)) &&
        (xint <= max(x3, x4)) && (xint >= min(x3, x4)) &&
        (yint <= max(y3, y4)) && (yint >= min(y3, y4)))
    {
        if (xi && yi)
        {
            xi = xint;
            yi = yint;
            location_msg_ptr = (WsqT_Location_Message*)op_prg_mem_alloc(sizeof(WsqT_Location_Message));
            location_msg_ptr->current_latitude = xi;
            location_msg_ptr->current_longitude = yi;
        }
        FRET(location_msg_ptr);
    }
}
FRET(location_msg_ptr);
}
There is a great and simple theory about lines and their intersections that is based on adding an extra dimension to your points and lines. In this theory a line can be created from two points with one line of code, and the point of intersection of two lines can be calculated with one line of code. Moreover, points at infinity and lines at infinity can be represented with ordinary real numbers.
You have probably heard of the homogeneous representation, where a point [x, y] is written as [x, y, 1] and the line ax + by + c = 0 is written as [a, b, c]?
The transition back to Cartesian coordinates for a general homogeneous point [x, y, w] is [x/w, y/w]. This little trick makes all the difference, including the representation of lines at infinity (e.g. [1, 0, 0]) and making the line representation look just like the point representation. This introduces a great symmetry into the formulas for numerous line/point manipulations and is extremely convenient in programming. For example:
It is very easy to find the intersection of two lines through the cross product:
p = l1 x l2
A line can be created from two points in a similar way:
l = p1 x p2
In OpenCV code it is just:
line = p1.cross(p2);
p = line1.cross(line2);
Note that there are no special cases (such as division by zero or parallel lines) to be concerned with in the cross products themselves. My point is, I suggest rewriting your code to take advantage of this elegant theory of lines and points.
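Putting the pieces together, the whole computation can be summarized compactly (my restatement of the above, with the segment endpoints written as homogeneous points [x, y, 1]):

$$l_1 = p_1 \times p_2, \quad l_2 = p_3 \times p_4, \quad p = l_1 \times l_2 = [p_x, p_y, p_w],$$

and the intersection point in Cartesian coordinates is $(p_x / p_w,\; p_y / p_w)$. If $p_w = 0$ the lines are parallel and the "intersection" is a point at infinity; you would still need your bounding-box check to decide whether the intersection lies within both segments.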
Finally, if you don't use OpenCV, you can use a 3D point class and create your own cross-product function similar to this one:
template<typename _Tp> inline Point3_<_Tp> Point3_<_Tp>::cross(const Point3_<_Tp>& pt) const
{
return Point3_<_Tp>(y*pt.z - z*pt.y, z*pt.x - x*pt.z, x*pt.y - y*pt.x);
}