Stan version of a JAGS model which includes a sum of discrete values - Is it possible? - bayesian

I was trying to run this model in Stan. I have a running JAGS version of it (that returns highly autocorrelated parameters) and I know how to formulate it as CDF of a double exponential (with two rates), which would probably run without problems. However, I would like to use this version as a starting point for similar but more complex models.
By now I suspect that a model like this is not possible in Stan. Because of the discreteness introduced by taking the sum of Boolean values, Stan may not be able to calculate gradients.
Does anyone know whether this is the case, or whether I am doing something else wrong in this model? I paste the errors I get below the model code.
Many thanks in advance
Jan
Model:
data {
int y[11];
int reps[11];
real soas[11];
}
parameters {
real<lower=0.001,upper=0.200> v1;
real<lower=0.001,upper=0.200> v2;
}
model {
real dif[11,96];
real cf[11];
real p[11];
real t1[11,96];
real t2[11,96];
for (i in 1:11){
for (r in 1:reps[i]){
t1[i,r] ~ exponential(v1);
t2[i,r] ~ exponential(v2);
dif[i,r] <- (t1[i,r]+soas[i]<=(t2[i,r]));
}
cf[i] <- sum(dif[i]);
p[i] <-cf[i]/reps[i];
y[i] ~ binomial(reps[i],p[i]);
}
}
Here is some dummy data:
import numpy
psy_dat = {
'soas' : numpy.array(range(-100,101,20)),
'y' : [47, 46, 62, 50, 59, 47, 36, 13, 7, 2, 1],
'reps' : [48, 48, 64, 64, 92, 92, 92, 64, 64, 48, 48]
}
And here are the errors:
DIAGNOSTIC(S) FROM PARSER:
Warning (non-fatal): Left-hand side of sampling statement (~) contains a non-linear transform of a parameter or local variable.
You must call increment_log_prob() with the log absolute determinant of the Jacobian of the transform.
Sampling Statement left-hand-side expression:
get_base1(get_base1(t1,i,"t1",1),r,"t1",2) ~ exponential_log(...)
Warning (non-fatal): Left-hand side of sampling statement (~) contains a non-linear transform of a parameter or local variable.
You must call increment_log_prob() with the log absolute determinant of the Jacobian of the transform.
Sampling Statement left-hand-side expression:
get_base1(get_base1(t2,i,"t2",1),r,"t2",2) ~ exponential_log(...)
And at runtime:
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
stan::prob::exponential_log(N4stan5agrad3varE): Random variable is nan:0, but must not be nan!
If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Rejecting proposed initial value with zero density.
Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model
Here is a working JAGS version of this model:
model {
for ( n in 1 : N ) {
for (r in 1 : reps[n]){
t1[r,n] ~ dexp(v1)
t2[r,n] ~ dexp(v2)
c[r,n] <- (1.0*((t1[r,n]+durs[n])<=t2[r,n]))
}
p[n] <- max(min(sum(c[,n]) / reps[n], 0.99999999999999), 1-0.99999999999999)
y[n] ~ dbin(p[n],reps[n])
}
v1 ~ dunif(0.0001,0.2)
v2 ~ dunif(0.0001,0.2)
}
With regard to the min() and max(): See this post https://stats.stackexchange.com/questions/130978/observed-node-inconsistent-when-binomial-success-rate-exactly-one?noredirect=1#comment250046_130978.

I'm still not sure what model you are trying to estimate (it would be best if you post the JAGS code) but what you have above cannot work in Stan. Stan is closer to C++ in the sense that you have to declare and then define objects. In your Stan program, you have the two declarations
real t1[11,96];
real t2[11,96];
but no definitions of t1 or t2. Consequently, they are initialized to NaN and when you do
t1[i,r] ~ exponential(v1);
that gets parsed as something like
for(i in 1:11) for(r in 1:reps[i]) lp__ += log(v1) - v1 * NaN
where lp__ is an internal symbol that holds value of the log-posterior, which becomes NaN and it cannot do Metropolis-style updates of the parameters.
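The effect of an undefined local variable can be reproduced outside Stan. The following is a minimal Python sketch, not Stan's actual implementation; the function name exponential_lpdf and the rate 0.05 are stand-ins chosen here to mirror the expression above.

```python
import math

def exponential_lpdf(y, rate):
    # Log density of an exponential distribution: log(rate) - rate * y.
    return math.log(rate) - rate * y

lp = 0.0
t1 = float("nan")  # stands in for a declared-but-never-assigned local variable
lp += exponential_lpdf(t1, 0.05)

print(math.isnan(lp))  # True: a single NaN term makes the whole log density NaN
```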
Perhaps you meant for t1 and t2 to be unknown parameters, in which case they should be declared in the parameters block. The following [EDITED] Stan program is valid and should work, but it might not be the program you had in mind (it does not make a lot of sense to me and the discontinuity in dif will probably preclude Stan from sampling from this posterior distribution efficiently).
data {
int<lower=1> N;
int y[N];
int reps[N];
real soas[N];
}
parameters {
real<lower=0.001,upper=0.200> v1;
real<lower=0.001,upper=0.200> v2;
real<lower=0> t1[N,max(reps)]; // exponential variates must be non-negative
real<lower=0> t2[N,max(reps)];
}
model {
for (i in 1:N) {
real dif[reps[i]];
for (r in 1:reps[i]) {
dif[r] <- (t1[i,r]+soas[i]) <= t2[i,r];
}
y[i] ~ binomial(reps[i], (1.0 + sum(dif)) / (1.0 + reps[i]));
}
to_array_1d(t1) ~ exponential(v1);
to_array_1d(t2) ~ exponential(v2);
}

Related

Rjags jags model compiling error when using for loop

I am using the rjags package to run MCMC. I have a binomial dataset and I tried to use for loops to estimate parameters for multiple datasets from different authors within one combined dataset.
I specified the JAGS model and uninformative priors for each parameter whose posterior I want, but I keep getting an error message like this:
jcode <- "model{
for (i in 1:3){
n.pos[i] ~ dbinom(seropos_est[i],N[i]) #fit to binomial data
seropos_est[i] = 1-exp(-lambdaS1*age[i]) #catalytic model
}
for (i in 4:7) {
n.pos[i] ~ dbinom(seropos_est[i],N[i]) #fit to binomial data
seropos_est[i] = 1-exp(-lambdaS2*age[i]) #catalytic model
}
for (i in 8:11) {
n.pos[i] ~ dbinom(seropos_est[i],N[i]) #fit to binomial data
seropos_est[i] = 1-exp(-lambdaS3*age[i]) #catalytic model
}
#priors
lambdaS1 ~ dnorm(0,1) #uninformative prior
lambdaS2 ~ dnorm(0,1) #uninformative prior
lambdaS3 ~ dnorm(0,1) #uninformative prior
}"
Parameter vector
paramVector <- c("lambdaS1", "lambdaS2", "lambdaS3")
mcmc.length=50000
jdat = list(n.pos= df_chik$N.pos,
N=df_chik$N,
age=df_chik$agemid)
jmod = jags.model(textConnection(jcode), data=jdat, n.chains=4, n.adapt=15000)
jpos = coda.samples(jmod, paramVector, n.iter=mcmc.length)
Error message
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 11
Unobserved stochastic nodes: 3
Total graph size: 74
Initializing model
Deleting model
This is the error message that I keep getting. I would appreciate it if anyone could help me out with this!
The text you show under “Error message” i.e. this text:
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 11
Unobserved stochastic nodes: 3
Total graph size: 74
Initializing model
Deleting model
... is not an error, but the expected output of rjags. But I suspect that you have not copied the real error message, which is probably something along the lines of "invalid parent values for node n.pos[1]". The reason for that is that for seropos_est[] you have relationships of the form:
seropos_est[i] = 1-exp(-lambdaS1*age[i])
Where lambdaS1 is an unconstrained variable. Therefore, the result of exp(-lambdaS1*age[i]) can be above 1, which means that seropos_est[i] can be negative, which is invalid for a probability parameter. In fact, given the normal prior for lambdaS1 the model will initialise that variable with a value of zero, meaning that seropos_est[i] is initialised to zero, which is also invalid if any of your n.pos are greater than zero. You therefore need to re-specify your model to constrain seropos_est to valid parameter space, possibly by changing the prior for lambdaS1 etc (presuming that age is positive).
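To make the point about invalid probabilities concrete, here is a small Python sketch of the catalytic expression from the question. The age value of 10 and the three lambda values are hypothetical, picked only to show the three cases.

```python
import math

def seropos(lam, age):
    # Catalytic model from the question: 1 - exp(-lambda * age).
    return 1.0 - math.exp(-lam * age)

age = 10.0  # hypothetical age midpoint
for lam in (-0.1, 0.0, 0.1):
    p = seropos(lam, age)
    # The last column says whether p is a usable binomial probability.
    print(lam, p, 0.0 < p < 1.0)
```

Only the positive lambda gives a value strictly between 0 and 1, which is why a prior restricted to positive values is one way to keep seropos_est valid.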
Also, you have in your code:
lambdaS1 ~ dnorm(0,1) #uninformative prior
But this is certainly not uninformative. In any case, there is really no such thing as an 'uninformative prior' - all priors contain some information, by definition. The best you can do is a 'minimally informative prior' or 'non-informative prior', which is why this terminology is generally recommended rather than the misleading word 'uninformative'.
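One reason dnorm(0,1) is informative is that the second argument of dnorm in JAGS and BUGS is a precision, not a variance or standard deviation. A minimal Python sketch of the conversion (the helper name is made up here):

```python
import math

def sd_from_precision(tau):
    # JAGS/BUGS dnorm(mu, tau) is parameterized by precision tau = 1 / variance.
    return 1.0 / math.sqrt(tau)

print(sd_from_precision(1.0))     # standard deviation implied by dnorm(0, 1)
print(sd_from_precision(0.0001))  # standard deviation implied by dnorm(0, 0.0001)
```

A precision of 1 corresponds to a standard deviation of 1, so most of the prior mass for lambdaS1 sits within about two units of zero, whereas a precision of 0.0001 corresponds to a standard deviation of about 100.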
For future, it would help us to help you if your question contained a minimal reproducible example, so that we can run the model and see exactly what you see. In this case all that is really missing is access to the data.
Hope that helps,
Matt

Stan. Using target += syntax

I'm starting to learn Stan.
Could anyone explain when and how to use syntax such as... ?
target +=
instead of just:
y ~ normal(mu, sigma)
For example in Stan manual you can find the following example.
model {
real ps[K]; // temp for log component densities
sigma ~ cauchy(0, 2.5);
mu ~ normal(0, 10);
for (n in 1:N) {
for (k in 1:K) {
ps[k] = log(theta[k])
+ normal_lpdf(y[n] | mu[k], sigma[k]);
}
target += log_sum_exp(ps);
}
}
I think the target line increments the target value, which I take to be the logarithm of the posterior density.
But the posterior density of which parameter?
When is it initialized and updated?
After Stan finishes (and converges), how do I access its value and how do I use it?
Other examples:
data {
int<lower=0> J; // number of schools
real y[J]; // estimated treatment effects
real<lower=0> sigma[J]; // s.e. of effect estimates
}
parameters {
real mu;
real<lower=0> tau;
vector[J] eta;
}
transformed parameters {
vector[J] theta;
theta = mu + tau * eta;
}
model {
target += normal_lpdf(eta | 0, 1);
target += normal_lpdf(y | theta, sigma);
}
The example above uses target twice instead of just once.
Another example:
data {
int<lower=0> N;
vector[N] y;
}
parameters {
real mu;
real<lower=0> sigma_sq;
vector<lower=-0.5, upper=0.5>[N] y_err;
}
transformed parameters {
real<lower=0> sigma;
vector[N] z;
sigma = sqrt(sigma_sq);
z = y + y_err;
}
model {
target += -2 * log(sigma);
z ~ normal(mu, sigma);
}
This last example even mixes both methods.
To make it even more difficult, I've read that
y ~ normal(0,1);
has the same effect as
increment_log_prob(normal_log(y,0,1));
Could anyone explain why, please?
Could anyone provide a simple example written in two different ways, with "target +=" and in the regular simpler "y ~" way, please?
Regards
The syntax
target += u;
adds u to the target log density.
The target density is the density from which the sampler samples and it needs to be equal to the joint density of all the parameters given the data up to a constant (which is usually achieved via Bayes's rule by coding as the joint density of parameters and modeled data up to a constant). You access it as lp__ in the posterior, but be careful, as it also contains the Jacobians arising from the constraints and drops constants in sampling statements---you do not want to use it for model comparison.
From a sampling perspective, writing
target += normal_lpdf(y | mu, sigma);
has the same effect as
y ~ normal(mu, sigma);
The _lpdf signals it's the log probability density function for the normal, which is implicit in the sampling notation. The sampling notation is just shorthand for the target += syntax, and in addition, drops constant terms in the log density.
It's explained in the statements section of the language reference (the second part of the manual) and used in multiple examples through the programmer's guide (the first part of the manual).
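The remark about dropped constants can be illustrated outside Stan. The Python sketch below is not Stan's implementation; it only compares the full normal log density with a version that omits the parameter-free term -0.5*log(2*pi), using made-up values for y, mu and sigma.

```python
import math

def normal_lpdf(y, mu, sigma):
    # Full log density, the quantity added by target += normal_lpdf(y | mu, sigma).
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

def normal_lpdf_dropped(y, mu, sigma):
    # The same expression without the constant that depends on no parameter,
    # which is the kind of term the ~ notation is allowed to drop.
    return -math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

y = 1.3
diffs = []
for mu, sigma in [(0.0, 1.0), (2.0, 0.5), (-1.0, 3.0)]:
    diffs.append(normal_lpdf(y, mu, sigma) - normal_lpdf_dropped(y, mu, sigma))
print([round(d, 6) for d in diffs])  # the same value for every (mu, sigma)
```

Since the two versions differ by a constant that does not involve mu or sigma, they define the same posterior for sampling purposes, but lp__ values built from them are not comparable across models.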
I am just starting to learn Stan and Bayesian statistics, and mainly rely on John Kruschke's book "Doing Bayesian Data Analysis". Here, in chapter 14.3.3, he explains:
Thus, the essence of computation in Stan is dealing with the logarithm
of the posterior probability density and its gradient; there is no
direct random sampling of parameters from distributions.
As a result (still rephrasing Kruschke), a
model [...] like y ∼ normal(mu,sigma) [actually] means to multiply the current posterior probability by the density of the normal distribution at the datum value y.
By the rules of logarithms, this multiplication is equivalent to adding the log probability density of a given datum y to the current log probability (log(a*b) = log(a) + log(b), hence the equivalence of multiplication and sum).
I concede that I don't grasp the full implications of that, but I think it points in the right direction as to what, mathematically speaking, target += does.
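Written out as a formula, the idea is roughly the following. The notation is an assumption made here, not Kruschke's: theta stands for all parameters, y_1, ..., y_N for the data, and the observations are taken to be conditionally independent given theta.

```latex
\log p(\theta \mid y_1, \dots, y_N)
  = \log p(\theta) + \sum_{n=1}^{N} \log p(y_n \mid \theta) + \text{const}
```

Each sampling statement or target += statement contributes one of the terms on the right-hand side.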

WinBugs error Trap -undefined real result

I am writing a WinBugs code for the Bayesian Statistics question :
Consider the following model that takes into account the fact that VIX (first variable) provides information for the variance of SP500 (second variable) and the fact that $Y_t^S$ and $Y_t^V$ may be correlated:
The model is at http://i.stack.imgur.com/qMHdq.png
for $t = 1, \ldots, 200$, where $\rho$ reflects the correlation between the increments of $Y_t^S$ and $Y_t^V$, $\alpha$ is a parameter taking values in the real line and $N_2(M,V)$ denotes a bivariate normal distribution with mean $M$ and covariance matrix $V$.
(The question is:)
Assign suitable priors to the parameters $\mu_s$, $\mu_v$, $\sigma$, $\omega$, $\rho$, $\alpha$ and write a WinBugs script to fit this model to your data. Implement it to sample from the posterior distribution of this model's parameters.
The WinBugs Code is :
model{for(i in 1:200){
y[i+1,1:2] ~ dnorm(mean[i,1:2],tau[i,1:2,1:2])
mean[i,1] <- y[i,1]+mu[1]+alpha*exp(y[i,2])
mean[i,2]<- y[i,2]+mu[2]
tau[i,1,1]<-exp(y[i,2])/prec[1]
tau[i,1,2]<-exp(y[i,2]/2)*rho/sqrt(prec[1]*prec[2])
tau[i,2,1]<-exp(y[i,2]/2)*rho/sqrt(prec[1]*prec[2])
tau[i,2,2]<-(1/(prec[2]))
}
mu[1] ~ dnorm (0, 0.0001)
mu[2] ~ dnorm (0, 0.0001)
prec[1] ~ dgamma (0.001, 0.001)
prec[2] ~ dgamma (0.001, 0.001)
alpha~dnorm(1,10000)
rho~dnorm(0,10)
}
list(y =structure(.Data= c(3.291839303,3.296274588,3.295265738,3.297438773,3.298200053,3.298412011,3.296300932,3.296426043,3.294455203,3.294481658,3.285708048,3.284464574,3.287575569,3.283348727,3.283355512,3.280935583,3.285914948,3.287111684,3.286400327,3.289303491,3.291186746,3.29116009,3.294849647,3.297015994,3.298090756,3.299369994,3.298503754,3.300578094,3.301034339,3.301056053,3.300321518,3.301761166,3.301524809,3.301186314,3.3005194,3.302700982,3.301364274,3.298512491,3.300093081,3.300475917,3.297878641,3.297570124,3.300808449,3.301370783,3.303489809,3.303282476,3.299788312,3.297272339,3.300660688,3.293581304,3.297289862,3.296182373,3.294970773,3.289178542,3.289180774,3.294003026,3.29332277,3.286703413,3.294221453,3.285154331,3.280152517,3.272941046,3.273626206,3.27009395,3.270156904,3.27571666,3.279669225,3.28808818,3.284906505,3.290217199,3.293269718,3.292617095,3.29777145,3.297169381,3.299866701,3.304931922,3.30488027,3.303649561,3.306118232,3.307754826,3.307906605,3.309259582,3.309562037,3.309257451,3.309487508,3.309591846,3.309911091,3.312135025,3.311482607,3.312336061,3.314604473,3.315846543,3.31534678,3.316563686,3.315458122,3.312482018,3.315245917,3.316877848,3.316372983,3.317095535,3.31393257,3.313829271,3.30666945,3.308634834,3.301535654,3.298772321,3.295069851,3.303820042,3.314126455,3.316106697,3.317758387,3.318516185,3.318455693,3.319890621,3.320264714,3.318136407,3.313635254,3.313487574,3.30547605,3.30159638,3.306618004,3.314318146,3.31065296,3.307123626,3.306002323,3.303470376,3.299435382,3.305226653,3.305899267,3.30794935,3.314530804,3.312139259,3.313253293,3.307399755,3.301498781,3.305620033,3.299940723,3.305534079,3.311760217,3.309951512,3.314398169,3.312911143,3.311062677,3.315674421,3.315661824,3.319830321,3.321596359,3.322289603,3.322153111,3.321691617,3.324344199,3.324212469,3.325408924,3.325076221,3.32443474,3.32314893,3.325800858,3.323825279,3.321915182,3.322434321,3.316234618,3.317944305,3.310514886,3.309681258,3.315119807,3.312473558
,3.31831173,3.31686738,3.322115879,3.319994568,3.323891208,3.323132421,3.320457869,3.314088528,3.313054794,3.314082206,3.319364268,3.315527433,3.31380186,3.315332072,3.318192769,3.317296379,3.318459865,3.320391417,3.322645108,3.320650938,3.321358125,3.323588265,3.323250037,3.318309644,3.32230201,3.321658486,3.323862366,3.324885109,3.325862386,3.324060105,3.325261087,3.323633617,3.319212277,3.323930349,3.325205636,-1.674871187,-1.837305384,-1.784901741,-1.824437164,-1.877095042,-1.853296595,-1.793076756,-1.802020721,-1.75360385,-1.750339701,-1.541660595,-1.537570704,-1.640896418,-1.545769835,-1.571902641,-1.556650006,-1.604336613,-1.6935902,-1.699715676,-1.778820579,-1.811756808,-1.762148494,-1.818778584,-1.826568672,-1.857709419,-1.859185357,-1.880873164,-1.863628277,-1.868840571,-1.857709419,-1.838025906,-1.843086364,-1.823727823,-1.815963058,-1.796505852,-1.835147398,-1.795132589,-1.739332463,-1.780168274,-1.785580061,-1.751643889,-1.700330607,-1.790343193,-1.795818949,-1.839468745,-1.833711714,-1.727193104,-1.651880385,-1.754258154,-1.611526503,-1.656547093,-1.59284645,-1.575092078,-1.5540471,-1.583117287,-1.674274013,-1.621581021,-1.528943106,-1.641471071,-1.453534332,-1.345690975,-1.216718593,-1.28451135,-1.161741385,-1.197198918,-1.315549541,-1.462376193,-1.587427911,-1.495750895,-1.563454293,-1.585808919,-1.589591272,-1.683878412,-1.639174734,-1.676066767,-1.705884658,-1.663594506,-1.654210604,-1.6972603,-1.728462971,-1.76413233,-1.79444677,-1.777474973,-1.770778032,-1.720871468,-1.751643889,-1.708364571,-1.716473539,-1.710229163,-1.73420046,-1.778820579,-1.79788129,-1.823727823,-1.83658546,-1.750339701,-1.689935542,-1.782193745,-1.808267093,-1.814558711,-1.854765047,-1.694811844,-1.654210604,-1.464249161,-1.394472583,-1.352258787,-1.379888524,-1.255280835,-1.422607479,-1.548864573,-1.565558689,-1.633460313,-1.659476569,-1.685086464,-1.677263996,-1.644350056,-1.596113873,-1.433397543,-1.499648104,-1.401421332,-1.350612172,-1.428435452,-1.538591373,-1.51144575
8,-1.415487857,-1.373953779,-1.335931446,-1.299891813,-1.357631945,-1.402730434,-1.449377291,-1.570312304,-1.556650006,-1.618216566,-1.527933706,-1.379038217,-1.453534332,-1.356803139,-1.423054399,-1.522402875,-1.47367507,-1.54680019,-1.524410013,-1.463312172,-1.527429445,-1.541148304,-1.628349281,-1.665956408,-1.602685826,-1.622143032,-1.631185029,-1.689327925,-1.67367725,-1.727193104,-1.71772782,-1.71334574,-1.749688341,-1.769444817,-1.716473539,-1.6935902,-1.705265784,-1.636312824,-1.644350056,-1.555087327,-1.545769835,-1.623831253,-1.591760035,-1.613194194,-1.610416485,-1.709607188,-1.703411805,-1.770778032,-1.745142444,-1.731645785,-1.622705408,-1.602685826,-1.643773495,-1.676665175,-1.631185029,-1.641471071,-1.667139772,-1.663005033,-1.660651132,-1.708985657,-1.766120707,-1.800638718,-1.711474452,-1.728462971,-1.782869953,-1.79925891,-1.714595509,-1.752296718,-1.755568243,-1.791708899,-1.807570829,-1.820896234,-1.76413233,-1.812456437,-1.746438846,-1.674274013,-1.792392558,-1.782193745),
.Dim=c(201,2))
)
list( mu=c(0,0), prec=c(1,1),alpha=1,rhi=0.5)
I get an error "multivariate node expected" while compiling the model. What is wrong in the code?
Model
You cannot pass a vector of means and a matrix to dnorm, which is what you are currently doing. dnorm is a univariate distribution, but the node on the left-hand side, y[i+1,1:2], is multivariate. The model that you specify is actually bivariate normal, which in BUGS and JAGS is written with dmnorm. dmnorm takes a vector of means and a precision matrix, i.e. the inverse of the variance-covariance matrix, so check whether tau as you have defined it is a covariance matrix (in which case it needs to be inverted, e.g. with inverse()) or already a precision matrix. Try changing dnorm to dmnorm at the top of your model.
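Note that dmnorm in BUGS and JAGS is parameterized by a precision matrix, i.e. the inverse of the covariance matrix. As a small illustration of that conversion for the bivariate case, here is a Python sketch; the numbers for the two standard deviations and the correlation are hypothetical and not taken from the question's data.

```python
def inverse_2x2(m):
    """Invert a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical values: two standard deviations and a correlation.
s1, s2, rho = 0.5, 0.2, 0.3
cov = [[s1 * s1, rho * s1 * s2],
       [rho * s1 * s2, s2 * s2]]

prec = inverse_2x2(cov)  # this is what dmnorm expects as its second argument
print(prec)
```

A correlation outside (-1, 1) would make the covariance matrix invalid, so a prior such as a uniform on (-1, 1) for rho may be safer than an unbounded normal.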

Running a logistic model in JAGS - Can you vectorize instead of looping over individual cases?

I'm fairly new to JAGS, so this may be a dumb question. I'm trying to run a model in JAGS that predicts the probability that a one-dimensional random walk process will cross boundary A before crossing boundary B. This model can be solved analytically via the following logistic model:
Pr(A,B) = 1/(1 + exp(-2 * (d/sigma) * theta))
where "d" is the mean drift rate (positive values indicate drift toward boundary A), "sigma" is the standard deviation of that drift rate and "theta" is the distance between the starting point and the boundary (assumed to be equal for both boundaries).
My dataset consists of 50 participants, who each provide 1800 observations. My model assumes that d is determined by a particular combination of observed environmental variables (which I'll just call 'x'), and a weighting coefficient that relates x to d (which I'll call 'beta'). Thus, there are three parameters: beta, sigma, and theta. I'd like to estimate a single set of parameters for each participant. My intention is to eventually run a hierarchical model, where group level parameters influence individual level parameters. However, for simplicity, here I will just consider a model in which I estimate a single set of parameters for one participant (and thus the model is not hierarchical).
My model in rjags would be as follows:
model{
for ( i in 1:Ntotal ) {
d[i] <- x[i] * beta
probA[i] <- 1/(1+exp(-2 * (d[i]/sigma) * theta ) )
y[i] ~ dbern(probA[i])
}
beta ~ dunif(-10,10)
sigma ~ dunif(0,10)
theta ~ dunif(0,10)
}
This model runs fine, but takes ages to run. I'm not sure how JAGS carries out the code, but if this code were run in R, it would be rather inefficient because it would have to loop over cases, running the model for each case individually. The time required to run the analysis would therefore increase rapidly as the sample size increases. I have a rather large sample, so this is a concern.
Is there a way to vectorise this code so that it can calculate the likelihood for all of the data points at once? For example, if I were to run this as a simple maximum likelihood model. I would vectorize the model and calculate the probability of the data given particular parameter values for all 1800 cases provided by the participant (and thus would not need the for loop). I would then take the log of these likelihoods and add them all together to give a single loglikelihood for the all observations given by the participant. This method has enormous time savings. Is there a way to do this in JAGS?
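For reference, the summed log-likelihood described above might look like the following in Python. This is only a sketch of the maximum-likelihood calculation, not of anything JAGS does internally, and the data values are hypothetical.

```python
import math

def log_sigmoid(z):
    # Numerically stable log(1 / (1 + exp(-z))).
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def log_likelihood(beta, sigma, theta, x, y):
    """Sum of Bernoulli log-likelihoods with Pr(A) = 1/(1 + exp(-2*(d/sigma)*theta))."""
    total = 0.0
    for xi, yi in zip(x, y):
        z = 2.0 * (xi * beta / sigma) * theta
        total += log_sigmoid(z) if yi == 1 else log_sigmoid(-z)
    return total

x = [0.2, -0.4, 1.1, 0.7]  # hypothetical covariate values
y = [1, 0, 1, 1]           # hypothetical choices (1 = boundary A crossed first)
print(log_likelihood(1.0, 1.0, 2.0, x, y))
print(log_likelihood(2.0, 1.0, 1.0, x, y))  # same value as the line above
```

The two calls give the same result because only the combination beta*theta/sigma enters the likelihood, so in this simplified model the three parameters cannot be separated from the data alone.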
EDIT
Thanks for the responses, and for pointing out that the parameters in the model I showed might be unidentified. I should've pointed out that model was a simplified version. The full model is below:
model{
for ( i in 1:Ntotal ) {
aExpectancy[i] <- 1/(1+exp(-gamma*(aTimeRemaining[i] - aDiscrepancy[i]*aExpectedLag[i]) ) )
bExpectancy[i] <- 1/(1+exp(-gamma*(bTimeRemaining[i] - bDiscrepancy[i]*bExpectedLag[i]) ) )
aUtility[i] <- aValence[i]*aExpectancy[i]/(1 + discount * (aTimeRemaining[i]))
bUtility[i] <- bValence[i]*bExpectancy[i]/(1 + discount * (bTimeRemaining[i]))
aMotivationalValueMean[i] <- aUtility[i]*aQualityMean[i]
bMotivationalValueMean[i] <- bUtility[i]*bQualityMean[i]
aMotivationalValueVariance[i] <- (aUtility[i]*aQualitySD[i])^2 + (bUtility[i]*bQualitySD[i])^2
bMotivationalValueVariance[i] <- (aUtility[i]*aQualitySD[i])^2 + (bUtility[i]*bQualitySD[i])^2
mvDiffVariance[i] <- aMotivationalValueVariance[i] + bMotivationalValueVariance[i]
meanDrift[i] <- (aMotivationalValueMean[i] - bMotivationalValueMean[i])
probA[i] <- 1/(1+exp(-2*(meanDrift[i]/sqrt(mvDiffVariance[i])) *theta ) )
y[i] ~ dbern(probA[i])
}
theta ~ dunif(0,10)
discount ~ dunif(0,10)
gamma ~ dunif(0,1)
}
In this model, the estimated parameters are theta, discount, and gamma, and these parameters can be recovered. When I run the model on the observations for a single participant (Ntotal = 1800), the model takes about 5 minutes to run, which is totally fine. However, when I run the model on the entire sample (45 participants x roughly 1800 cases each = 78,900 observations), I've had it running for 24 hours and it's less than 50% of the way through. This seems odd, as I would expect it to just take 45 times as long, so 4 or 5 hours at most. Am I missing something?
I hope I am not misreading the situation (and I apologize in advance if I am), but your question seems to come from a conceptual misunderstanding of how JAGS works (or WinBUGS or OpenBUGS, for that matter).
Your model code is not executed line by line like an R script, because it is not written in a procedural programming language. So vectorizing will not help.
What you wrote is just a description of your model, because JAGS' language is a declarative one.
Once JAGS reads your model, it sets up a Markov chain (MCMC) whose stationary distribution is the posterior distribution of your parameters given your (observed) data. JAGS does nothing else with your program.
All the time you have spent waiting for the program to run was actually spent waiting (and hoping) for the Markov chain to reach its relaxation (mixing) time.
So what makes the run take so long is most likely that the resulting chain mixes poorly, or something along those lines.
That is why vectorizing a model description that is read only once will be of very little help.
So, your problem lies somewhere else.
I hope it helps and, if not, sorry.
All the best.
You can't vectorise in the same way that you would in R, but if you can group observations with the same probability expression (i.e. common d[i]) then you can use a Binomial rather than Bernoulli distribution which will help enormously. If each observation has a unique d[i] then you are stuck I'm afraid.
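As a sketch of the grouping step (in Python, with hypothetical data and a made-up helper name), Bernoulli observations that share the same covariate value can be collapsed into success counts and trial counts before they are passed to JAGS:

```python
from collections import defaultdict

def to_binomial(x, y):
    """Collapse Bernoulli outcomes y[i] that share a covariate value x[i] into counts."""
    groups = defaultdict(lambda: [0, 0])  # x value -> [successes, trials]
    for xi, yi in zip(x, y):
        groups[xi][0] += yi
        groups[xi][1] += 1
    xs = sorted(groups)
    return xs, [groups[v][0] for v in xs], [groups[v][1] for v in xs]

x = [0.5, 0.5, 1.0, 1.0, 1.0, 2.0]
y = [1, 0, 1, 1, 0, 1]
print(to_binomial(x, y))  # ([0.5, 1.0, 2.0], [1, 2, 1], [2, 3, 1])
```

The model would then loop over the unique x values and use dbin with the grouped counts instead of dbern on every single observation, which shrinks the graph whenever many observations share the same d[i].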
Another alternative is to look at Stan which is generally faster for large data sets like yours.
Matt
thanks for the responses. Yes, you make a good point that the parameters in the model I showed might be unidentified.
I should've pointed out that model was a simplified version. The full model is below:
model{
for ( i in 1:Ntotal ) {
aExpectancy[i] <- 1/(1+exp(-gamma*(aTimeRemaining[i] - aDiscrepancy[i]*aExpectedLag[i]) ) )
bExpectancy[i] <- 1/(1+exp(-gamma*(bTimeRemaining[i] - bDiscrepancy[i]*bExpectedLag[i]) ) )
aUtility[i] <- aValence[i]*aExpectancy[i]/(1 + discount * (aTimeRemaining[i]))
bUtility[i] <- bValence[i]*bExpectancy[i]/(1 + discount * (bTimeRemaining[i]))
aMotivationalValueMean[i] <- aUtility[i]*aQualityMean[i]
bMotivationalValueMean[i] <- bUtility[i]*bQualityMean[i]
aMotivationalValueVariance[i] <- (aUtility[i]*aQualitySD[i])^2 + (bUtility[i]*bQualitySD[i])^2
bMotivationalValueVariance[i] <- (aUtility[i]*aQualitySD[i])^2 + (bUtility[i]*bQualitySD[i])^2
mvDiffVariance[i] <- aMotivationalValueVariance[i] + bMotivationalValueVariance[i]
meanDrift[i] <- (aMotivationalValueMean[i] - bMotivationalValueMean[i])
probA[i] <- 1/(1+exp(-2*(meanDrift[i]/sqrt(mvDiffVariance[i])) *theta ) )
y[i] ~ dbern(probA[i])
}
theta ~ dunif(0,10)
discount ~ dunif(0,10)
gamma ~ dunif(0,1)
}
In this model, the estimated parameters are theta, discount, and gamma, and these parameters can be recovered.
When I run the model on the observations for a single participant (Ntotal = 1800), the model takes about 5 minutes to run, which is totally fine.
However, when I run the model on the entire sample (45 participants x roughly 1800 cases each = 78,900 observations), I've had it running for 24 hours and it's less than 50% of the way through.
This seems odd, as I would expect it to just take 45 times as long, so 4 or 5 hours at most. Am I missing something?

Error:"Multiple definitions of node" in OpenBUGS.

So I thought the following code would work in OpenBUGS, but instead it gives me a "Multiple definitions of node Z" error.
model
{
Z <- round(X)
X ~ dnorm(0,1)T(-2,2)
}
list(Z=0)
Even if I replace Z <- round(X) with Z <- X I continue to get the same error. From this fact we can deduce that the error is resulting from the use of a logical assignment for an observable variable and in particular, the error is not due to the round() operation.
Why does BUGS not allow this? Also, what is a good work-around in this case? Here is a more general version that I want to implement, which is essentially modeling a discrete Gaussian with walls (the truncation):
model
{
for(i in 1:N){
Z[i] <- round(X[i])
X[i] ~ dnorm(mu,1)T(-2,2)
}
mu ~ dunif(-2,2)
}
Essentially, I want Z to be distributed with something like a discrete Gaussian with "walls" (the truncation) and I want to estimate mu from data on Z. I suppose I can try to make Z into a categorical variable and estimate the parameters but this seems theoretically painful. Is there some BUGS trick I can use to get my intended model?
WinBUGS and OpenBUGS don't allow observed data to be a deterministic function of an unobserved variable. As you suggest, you could use dcat() and express the probabilities in terms of the normal distribution.
Alternatively, you might prefer to switch to JAGS, which has a distribution dround() that deals with just this situation: data that are rounded to a given number of decimal places, in your case 0. However, this forum post suggests there's a bug in the current stable release for this case, and you might need to download the development version.
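For the dcat() route, the category probabilities have to be written in terms of the normal CDF. The Python sketch below shows one way to compute them for Z = round(X) with X ~ Normal(mu, 1) truncated to [-2, 2]; the function names are made up here, and in BUGS the same expressions would be built from phi().

```python
import math

def phi(z):
    # Standard normal CDF.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rounded_truncnorm_probs(mu, lo=-2.0, hi=2.0):
    """P(Z = k) for Z = round(X), X ~ Normal(mu, 1) truncated to [lo, hi]."""
    norm = phi(hi - mu) - phi(lo - mu)  # mass of the untruncated normal inside the walls
    probs = {}
    for k in range(int(lo), int(hi) + 1):
        a = max(k - 0.5, lo)  # the lowest and highest bins are cut off by the walls
        b = min(k + 0.5, hi)
        probs[k] = (phi(b - mu) - phi(a - mu)) / norm
    return probs

probs = rounded_truncnorm_probs(0.3)
print(probs)
print(sum(probs.values()))
```

Since dcat() labels its categories 1, 2, ..., the observed Z values from -2 to 2 would need to be shifted, for example to Z + 3, before being used as data.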