Can we get the residuals from JAGS output? - bayesian

Say we fit a Bayesian linear mixed model using JAGS (or WinBUGS), does the output object include the model residuals? How can we find the residuals?
Thanks!

A JAGS (BUGS) model simply outputs the values of the nodes that you have told it to monitor. To get residuals, you need to define them as nodes in the model and then monitor them. For example:
model {
bResponse ~ dnorm(0, 5^-2)
sResponse ~ dunif(0, 5)
for (i in 1:length(Response)) {
eResponse[i] <- bResponse
Response[i] ~ dlnorm(eResponse[i], sResponse^-2)
Residual[i] <- (log(Response[i]) - eResponse[i]) / sResponse
}
}
where Residual[i] defines a residual for each of the i values of Response. Note that the above example does not deal with specifying which values to monitor.
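As a minimal sketch of the monitoring step, assuming the rjags package and a hypothetical simulated Response vector (neither is part of the original answer), the residual node can be added to the list of monitored variables. Because dlnorm is parameterised by the mean on the log scale, the sketch subtracts eResponse[i] directly from log(Response[i]).

```r
# Hypothetical data: positive responses suitable for a log-normal model
set.seed(1)
Response <- rlnorm(20, meanlog = 1, sdlog = 0.5)

model_string <- "model {
  bResponse ~ dnorm(0, 5^-2)
  sResponse ~ dunif(0, 5)
  for (i in 1:length(Response)) {
    eResponse[i] <- bResponse
    Response[i] ~ dlnorm(eResponse[i], sResponse^-2)
    Residual[i] <- (log(Response[i]) - eResponse[i]) / sResponse
  }
}"

# Nodes to monitor: the residuals are listed like any other node
monitored <- c("bResponse", "sResponse", "Residual")

# Only run the sampler if rjags (and JAGS itself) is installed
if (requireNamespace("rjags", quietly = TRUE)) {
  jm <- rjags::jags.model(textConnection(model_string),
                          data = list(Response = Response),
                          n.chains = 2, n.adapt = 1000)
  samples <- rjags::coda.samples(jm, variable.names = monitored, n.iter = 2000)
  draws <- as.matrix(samples)                        # one column per monitored node
  resid_cols <- grep("^Residual\\[", colnames(draws))
  post_mean_resid <- colMeans(draws[, resid_cols])   # posterior mean residual per observation
}
```

Each residual is itself a posterior distribution, so summarising it by its posterior mean (as above) or median is a choice rather than a requirement.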

Related

Rjags jags model compiling error when using for loop

I am using the rjags package to run MCMC. I have a binomial dataset and tried to use for loops to estimate parameters for multiple datasets from different authors within one combined dataset.
I specified the JAGS model and uninformative priors for each parameter whose posterior I want, but I keep getting an error message like this:
jcode <- "model{
for (i in 1:3){
n.pos[i] ~ dbinom(seropos_est[i],N[i]) #fit to binomial data
seropos_est[i] = 1-exp(-lambdaS1*age[i]) #catalytic model
}
for (i in 4:7) {
n.pos[i] ~ dbinom(seropos_est[i],N[i]) #fit to binomial data
seropos_est[i] = 1-exp(-lambdaS2*age[i]) #catalytic model
}
for (i in 8:11) {
n.pos[i] ~ dbinom(seropos_est[i],N[i]) #fit to binomial data
seropos_est[i] = 1-exp(-lambdaS3*age[i]) #catalytic model
}
#priors
lambdaS1 ~ dnorm(0,1) #uninformative prior
lambdaS2 ~ dnorm(0,1) #uninformative prior
lambdaS3 ~ dnorm(0,1) #uninformative prior
}"
Parameter vector
paramVector <- c("lambdaS1", "lambdaS2", "lambdaS3")
mcmc.length=50000
jdat = list(n.pos= df_chik$N.pos,
N=df_chik$N,
age=df_chik$agemid)
jmod = jags.model(textConnection(jcode), data=jdat, n.chains=4, n.adapt=15000)
jpos = coda.samples(jmod, paramVector, n.iter=mcmc.length)
Error message
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 11
Unobserved stochastic nodes: 3
Total graph size: 74
Initializing model
Deleting model
This is the error message that I keep getting. I would appreciate it if anyone could help me out with this!
The text you show under “Error message” i.e. this text:
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 11
Unobserved stochastic nodes: 3
Total graph size: 74
Initializing model
Deleting model
... is not an error, but the expected output of rjags. But I suspect that you have not copied the real error message, which is probably something along the lines of "invalid parent values for node n.pos[1]". The reason for that is that for seropos_est[] you have relationships of the form:
seropos_est[i] = 1-exp(-lambdaS1*age[i])
Where lambdaS1 is an unconstrained variable. Therefore, the result of exp(-lambdaS1*age[i]) can be above 1, which means that seropos_est[i] can be negative, which is invalid for a probability parameter. In fact, given the normal prior for lambdaS1 the model will initialise that variable with a value of zero, meaning that seropos_est[i] is initialised to zero, which is also invalid if any of your n.pos are greater than zero. You therefore need to re-specify your model to constrain seropos_est to valid parameter space, possibly by changing the prior for lambdaS1 etc (presuming that age is positive).
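The arithmetic behind this can be checked directly in R. The sketch below uses hypothetical ages and rate values (not the asker's data) to show where the catalytic model leaves the valid range for a binomial probability.

```r
# Hypothetical age midpoints, all positive
age <- c(2, 7, 15, 30)

# Catalytic model from the question
seropos <- function(lambda, age) 1 - exp(-lambda * age)

p_neg  <- seropos(-0.1, age)  # negative rate: every value is below 0
p_zero <- seropos(0, age)     # rate of zero: every value is exactly 0
p_pos  <- seropos(0.1, age)   # positive rate: every value lies in (0, 1)

print(rbind(p_neg, p_zero, p_pos))
```

One option, in line with the suggestion above, is a prior with positive support, such as a gamma prior or a normal truncated at zero (written dnorm(0, 1) T(0,) in JAGS), so that the sampler never proposes a rate that makes the probability invalid.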
Also, you have in your code:
lambdaS1 ~ dnorm(0,1) #uninformative prior
But this is certainly not uninformative. In any case, there is really no such thing as an 'uninformative prior' - all priors contain some information, by definition. The best you can do is a 'minimally informative prior' or 'non-informative prior', which is why this terminology is generally recommended rather than the misleading word 'uninformative'.
For future reference, it would help us to help you if your question contained a minimal reproducible example, so that we can run the model and see exactly what you see. In this case all that is really missing is access to the data.
Hope that helps,
Matt

Augment a mable: residuals and innovations from regression with ARMA errors model are the same

I think there is something odd here. For example the following code gives the same values for residuals and innovations:
fit <- us_change %>%
model(ARIMA(Consumption ~ Income)) %>%
augment()
It seems that augment() extracts only the innovation values and uses them for the regression residuals too. The difference shows up when we extract the residuals and innovations using residuals():
bind_rows(
`Regression Errors` = as_tibble(residuals(fit, type = "regression")),
`ARIMA Errors` = as_tibble(residuals(fit, type = "innovation")),
.id = "type"
)
Then the residuals and innovations are different as they should be.
The .resid column provided by augment() contains response residuals, not regression residuals. I have updated the documentation to clarify this: https://github.com/tidyverts/fabletools/commit/c0efd7166bca06450d7b18d3d0530fdeac67cce7
A response residual (.resid) is the error from backtransformed predictions on the original response variable. An innovation residual (.innov) is the error from the model (on potentially a different, transformed response variable). As your model does not transform the data, the response residuals (.resid) and the innovation residuals (.innov) are the same.
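To make the distinction concrete, here is a small base R sketch with made-up numbers, assuming a model fitted to log(y) and ignoring any bias adjustment the package may apply when back-transforming. It is an illustration of the two definitions, not a reproduction of what augment() does internally.

```r
# Hypothetical single observation and its fitted value on the log scale
y          <- 100
fitted_log <- log(90)

# Innovation residual: error on the transformed (log) scale
innov <- log(y) - fitted_log

# Response residual: error after back-transforming the fitted value
resid_response <- y - exp(fitted_log)

# Without a transformation the two definitions coincide
fitted_raw    <- 90
innov_raw     <- y - fitted_raw
resid_raw     <- y - fitted_raw
```

With the log transformation the two residuals differ (about 0.105 versus 10), whereas without a transformation they are the same number, which is why .resid and .innov match in the question.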
There is currently no way to obtain the regression residuals (residuals after performing regression, before applying ARIMA process) using the augment() function. This is something that would be nice to have in the future.

Is it relevant to use both feature normalizer_fn and batch normalization?

Is it relevant to use both a feature normalizer_fn and batch normalization, as in the following?
feature_columns_complex_standardized = [
tf.feature_column.numeric_column("my_feature", normalizer_fn=lambda x: (x - xMean) / xStd)
]
model1 = tf.estimator.DNNClassifier(feature_columns=feature_columns_complex_standardized,
hidden_units=[512,512,512],
optimizer=tf.train.AdamOptimizer(learning_rate=0.001, beta1= 0.9,beta2=0.99, epsilon = 1e-08,use_locking=False),
weight_column=weights,
dropout=0.5,
activation_fn=tf.nn.softmax,
n_classes=10,
label_vocabulary=Action_vocab,
model_dir='./Models9/Action/',
loss_reduction=tf.losses.Reduction.SUM_OVER_BATCH_SIZE,
config=tf.estimator.RunConfig().replace(save_summary_steps=10),
batch_norm=True)
You may be conflating the two. Feature normalization brings the features in a dataset onto the same scale, whereas batch normalization addresses internal covariate shift, where the input distribution of each hidden unit changes every time the parameters of the previous layer are updated.
So you can use both at the same time.

WinBugs error Trap -undefined real result

I am writing WinBUGS code for the following Bayesian statistics question:
Consider the following model that takes into account the fact that VIX (first variable) provides information for the variance of SP500 (second variable) and the fact that $Y_t^S$ and $Y_t^V$ may be correlated:
The model is at http://i.stack.imgur.com/qMHdq.png
for $t = 1, \ldots, 200$, where $\rho$ reflects the correlation between the increments of $Y_t^S$ and $Y_t^V$, $\alpha$ is a parameter taking values in the real line and $N_2(M,V)$ denotes a bivariate normal distribution with mean $M$ and covariance matrix $V$.
(The question is:)
Assign suitable priors to the parameters $\mu_s$, $\mu_v$, $\sigma$, $\omega$, $\rho$, $\alpha$ and write a WinBugs script to fit this model to your data. Implement it to sample from the posterior distribution of this model's parameters.
The WinBugs Code is :
model{for(i in 1:200){
y[i+1,1:2] ~ dnorm(mean[i,1:2],tau[i,1:2,1:2])
mean[i,1] <- y[i,1]+mu[1]+alpha*exp(y[i,2])
mean[i,2]<- y[i,2]+mu[2]
tau[i,1,1]<-exp(y[i,2])/prec[1]
tau[i,1,2]<-exp(y[i,2]/2)*rho/sqrt(prec[1]*prec[2])
tau[i,2,1]<-exp(y[i,2]/2)*rho/sqrt(prec[1]*prec[2])
tau[i,2,2]<-(1/(prec[2]))
}
mu[1] ~ dnorm (0, 0.0001)
mu[2] ~ dnorm (0, 0.0001)
prec[1] ~ dgamma (0.001, 0.001)
prec[2] ~ dgamma (0.001, 0.001)
alpha~dnorm(1,10000)
rho~dnorm(0,10)
}
list(y =structure(.Data= c(3.291839303,3.296274588,3.295265738,3.297438773,3.298200053,3.298412011,3.296300932,3.296426043,3.294455203,3.294481658,3.285708048,3.284464574,3.287575569,3.283348727,3.283355512,3.280935583,3.285914948,3.287111684,3.286400327,3.289303491,3.291186746,3.29116009,3.294849647,3.297015994,3.298090756,3.299369994,3.298503754,3.300578094,3.301034339,3.301056053,3.300321518,3.301761166,3.301524809,3.301186314,3.3005194,3.302700982,3.301364274,3.298512491,3.300093081,3.300475917,3.297878641,3.297570124,3.300808449,3.301370783,3.303489809,3.303282476,3.299788312,3.297272339,3.300660688,3.293581304,3.297289862,3.296182373,3.294970773,3.289178542,3.289180774,3.294003026,3.29332277,3.286703413,3.294221453,3.285154331,3.280152517,3.272941046,3.273626206,3.27009395,3.270156904,3.27571666,3.279669225,3.28808818,3.284906505,3.290217199,3.293269718,3.292617095,3.29777145,3.297169381,3.299866701,3.304931922,3.30488027,3.303649561,3.306118232,3.307754826,3.307906605,3.309259582,3.309562037,3.309257451,3.309487508,3.309591846,3.309911091,3.312135025,3.311482607,3.312336061,3.314604473,3.315846543,3.31534678,3.316563686,3.315458122,3.312482018,3.315245917,3.316877848,3.316372983,3.317095535,3.31393257,3.313829271,3.30666945,3.308634834,3.301535654,3.298772321,3.295069851,3.303820042,3.314126455,3.316106697,3.317758387,3.318516185,3.318455693,3.319890621,3.320264714,3.318136407,3.313635254,3.313487574,3.30547605,3.30159638,3.306618004,3.314318146,3.31065296,3.307123626,3.306002323,3.303470376,3.299435382,3.305226653,3.305899267,3.30794935,3.314530804,3.312139259,3.313253293,3.307399755,3.301498781,3.305620033,3.299940723,3.305534079,3.311760217,3.309951512,3.314398169,3.312911143,3.311062677,3.315674421,3.315661824,3.319830321,3.321596359,3.322289603,3.322153111,3.321691617,3.324344199,3.324212469,3.325408924,3.325076221,3.32443474,3.32314893,3.325800858,3.323825279,3.321915182,3.322434321,3.316234618,3.317944305,3.310514886,3.309681258,3.315119807,3.312473558
,3.31831173,3.31686738,3.322115879,3.319994568,3.323891208,3.323132421,3.320457869,3.314088528,3.313054794,3.314082206,3.319364268,3.315527433,3.31380186,3.315332072,3.318192769,3.317296379,3.318459865,3.320391417,3.322645108,3.320650938,3.321358125,3.323588265,3.323250037,3.318309644,3.32230201,3.321658486,3.323862366,3.324885109,3.325862386,3.324060105,3.325261087,3.323633617,3.319212277,3.323930349,3.325205636,-1.674871187,-1.837305384,-1.784901741,-1.824437164,-1.877095042,-1.853296595,-1.793076756,-1.802020721,-1.75360385,-1.750339701,-1.541660595,-1.537570704,-1.640896418,-1.545769835,-1.571902641,-1.556650006,-1.604336613,-1.6935902,-1.699715676,-1.778820579,-1.811756808,-1.762148494,-1.818778584,-1.826568672,-1.857709419,-1.859185357,-1.880873164,-1.863628277,-1.868840571,-1.857709419,-1.838025906,-1.843086364,-1.823727823,-1.815963058,-1.796505852,-1.835147398,-1.795132589,-1.739332463,-1.780168274,-1.785580061,-1.751643889,-1.700330607,-1.790343193,-1.795818949,-1.839468745,-1.833711714,-1.727193104,-1.651880385,-1.754258154,-1.611526503,-1.656547093,-1.59284645,-1.575092078,-1.5540471,-1.583117287,-1.674274013,-1.621581021,-1.528943106,-1.641471071,-1.453534332,-1.345690975,-1.216718593,-1.28451135,-1.161741385,-1.197198918,-1.315549541,-1.462376193,-1.587427911,-1.495750895,-1.563454293,-1.585808919,-1.589591272,-1.683878412,-1.639174734,-1.676066767,-1.705884658,-1.663594506,-1.654210604,-1.6972603,-1.728462971,-1.76413233,-1.79444677,-1.777474973,-1.770778032,-1.720871468,-1.751643889,-1.708364571,-1.716473539,-1.710229163,-1.73420046,-1.778820579,-1.79788129,-1.823727823,-1.83658546,-1.750339701,-1.689935542,-1.782193745,-1.808267093,-1.814558711,-1.854765047,-1.694811844,-1.654210604,-1.464249161,-1.394472583,-1.352258787,-1.379888524,-1.255280835,-1.422607479,-1.548864573,-1.565558689,-1.633460313,-1.659476569,-1.685086464,-1.677263996,-1.644350056,-1.596113873,-1.433397543,-1.499648104,-1.401421332,-1.350612172,-1.428435452,-1.538591373,-1.51144575
8,-1.415487857,-1.373953779,-1.335931446,-1.299891813,-1.357631945,-1.402730434,-1.449377291,-1.570312304,-1.556650006,-1.618216566,-1.527933706,-1.379038217,-1.453534332,-1.356803139,-1.423054399,-1.522402875,-1.47367507,-1.54680019,-1.524410013,-1.463312172,-1.527429445,-1.541148304,-1.628349281,-1.665956408,-1.602685826,-1.622143032,-1.631185029,-1.689327925,-1.67367725,-1.727193104,-1.71772782,-1.71334574,-1.749688341,-1.769444817,-1.716473539,-1.6935902,-1.705265784,-1.636312824,-1.644350056,-1.555087327,-1.545769835,-1.623831253,-1.591760035,-1.613194194,-1.610416485,-1.709607188,-1.703411805,-1.770778032,-1.745142444,-1.731645785,-1.622705408,-1.602685826,-1.643773495,-1.676665175,-1.631185029,-1.641471071,-1.667139772,-1.663005033,-1.660651132,-1.708985657,-1.766120707,-1.800638718,-1.711474452,-1.728462971,-1.782869953,-1.79925891,-1.714595509,-1.752296718,-1.755568243,-1.791708899,-1.807570829,-1.820896234,-1.76413233,-1.812456437,-1.746438846,-1.674274013,-1.792392558,-1.782193745),
.Dim=c(201,2))
)
list( mu=c(0,0), prec=c(1,1),alpha=1,rhi=0.5)
I get an error "multivariate node expected" while compiling the model. What is wrong in the code?
You cannot pass a vector of means and a matrix to dnorm, which is what you are currently doing. The left-hand side y[i+1,1:2] is multivariate, but dnorm is a univariate distribution, hence the error. The model you specify is actually bivariate normal, which in WinBUGS and JAGS is written with dmnorm. Note that dmnorm takes a vector of means and a precision matrix (the inverse of the variance-covariance matrix), so if tau[i,,] holds covariances, as it appears to here, invert it first, for example with inverse(). Change dnorm to dmnorm at the top of your model, pass it the precision matrix, and you should be good to go.
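The relationship between a covariance matrix and the precision matrix that dmnorm expects can be sketched in base R. The values of sigma, omega and rho below are hypothetical and chosen only for illustration.

```r
# Hypothetical standard deviations and correlation
sigma <- 0.2
omega <- 0.5
rho   <- -0.3   # must lie strictly between -1 and 1

# 2 x 2 variance-covariance matrix
Sigma <- matrix(c(sigma^2,             rho * sigma * omega,
                  rho * sigma * omega, omega^2),
                nrow = 2, byrow = TRUE)

# Precision matrix: the inverse of the covariance matrix
Tau <- solve(Sigma)

# Check: precision times covariance gives the identity
print(round(Tau %*% Sigma, 10))
```

The covariance matrix is only valid (positive definite) when rho is strictly between -1 and 1, so a prior for rho restricted to that interval, for instance dunif(-1, 1), avoids the sampler proposing a matrix that cannot be inverted.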

Fit a bayesian linear regression and predict unobservable values

I'd like to use JAGS with R to adjust a linear model with observable quantities and make inferences about unobservable ones. I found lots of examples on the internet about how to adjust the model, but nothing on how to extrapolate from its coefficients after having fitted the model in JAGS. I'd appreciate any help with this.
My data looks like the following:
ngroups <- 2
group <- 1:ngroups
nobs <- 100
dta <- data.frame(group=rep(group,each=nobs),y=rnorm(nobs*ngroups),x=runif(nobs*ngroups))
head(dta)
JAGS has powerful ways to make inference about missing data, and once you get the hang of it, it's easy! I strongly recommend that you check out Marc Kéry's excellent book which provides a wonderful introduction to BUGS language programming (JAGS is close enough to BUGS that almost everything transfers).
The easiest way to do this involves, as you say, adjusting the model. Below I provide a complete worked example of how this works. But you seem to be asking for a way to get the prediction interval without re-running the model (is your model very large and computationally expensive?). This can also be done.
How to predict--the hard way (without re-running the model)
For each iteration of the MCMC, simulate the response for the desired x-value based on that iteration's posterior draws of the model parameters. So imagine you want to predict a value for X=10. Then if iteration 1 (post burn-in) has slope=2, intercept=1, and standard deviation=0.5, draw a Y-value from
Y=rnorm(1, 1+2*10, 0.5)
And repeat for iteration 2, 3, 4, 5...
These will be your posterior draws for the response at X=10. Note: if you did not monitor the standard deviation in your JAGS model, you are out of luck and need to fit the model again.
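The per-iteration procedure above can be vectorised in R. In this sketch the matrix of posterior draws is simulated, as a stand-in for what as.matrix() would return for a coda sample object with int, slope and sigma monitored; the column names and values are assumptions for illustration.

```r
set.seed(42)
n_draws <- 4000

# Hypothetical posterior draws, one row per MCMC iteration
draws <- cbind(int   = rnorm(n_draws, 1,   0.1),
               slope = rnorm(n_draws, 2,   0.05),
               sigma = abs(rnorm(n_draws, 0.5, 0.02)))

x_new <- 10

# One simulated response per iteration, using that iteration's parameters
y_pred <- rnorm(n_draws,
                mean = draws[, "int"] + draws[, "slope"] * x_new,
                sd   = draws[, "sigma"])

# 95% posterior prediction interval and median at X = 10
pred_interval <- quantile(y_pred, c(0.025, 0.5, 0.975))
print(pred_interval)
```

Because rnorm is vectorised over its mean and sd arguments, no explicit loop over iterations is needed.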
How to predict--the easy way--with worked example
The basic idea is to insert (into your data) the x-values whose responses you want to predict, with the associated y-values NA. For example, if you want a prediction interval for X=10, you just have to include the point (10, NA) in your data, and set a trace monitor for the y-value.
I use JAGS from R with the rjags package. Below is a complete worked example that begins by simulating the data, then adds some extra x-values to the data, specifies and runs the linear model in JAGS via rjags, and summarizes the results. Y[101:105] contains draws from the posterior prediction intervals for X[101:105]. Notice that Y[1:100] just contains the y-values for X[1:100]. These are the observed data that we fed to the model, and they never change as the model updates.
library(rjags)
# Simulate data (100 observations)
my.data <- as.data.frame(matrix(data=NA, nrow=100, ncol=2))
names(my.data) <- c("X", "Y")
# the linear model will predict Y based on the covariate X
my.data$X <- runif(100) # values for the covariate
int <- 2 # specify the true intercept
slope <- 1 # specify the true slope
sigma <- .5 # specify the true residual standard deviation
my.data$Y <- rnorm(100, slope*my.data$X+int, sigma) # Simulate the data
#### Extra data for prediction of unknown Y-values from known X-values
y.predict <- as.data.frame(matrix(data=NA, nrow=5, ncol=2))
names(y.predict) <- c("X", "Y")
y.predict$X <- c(-1, 0, 1.3, 2, 7)
mydata <- rbind(my.data, y.predict)
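set.seed(333) # seeds the random initial values below; call it before the simulation above to make the data reproducible too
# setwd("INSERT YOUR WORKING DIRECTORY HERE") # uncomment and set your own path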
sink("mymodel.txt")
cat("model{
# Priors
int ~ dnorm(0, .001)
slope ~ dnorm(0, .001)
tau <- 1/(sigma * sigma)
sigma ~ dunif(0,10)
# Model structure
for(i in 1:R){
Y[i] ~ dnorm(m[i],tau)
m[i] <- int + slope * X[i]
}
}", fill=TRUE)
sink()
jags.data <- list(R=dim(mydata)[1], X=mydata$X, Y=mydata$Y)
inits <- function(){list(int=rnorm(1, 0, 5), slope=rnorm(1,0,5),
sigma=runif(1,0,10))}
params <- c("Y", "int", "slope", "sigma")
nc <- 3
n.adapt <-1000
n.burn <- 1000
n.iter <- 10000
thin <- 10
my.model <- jags.model('mymodel.txt', data = jags.data, inits=inits, n.chains=nc, n.adapt=n.adapt)
update(my.model, n.burn)
my.model_samples <- coda.samples(my.model,params,n.iter=n.iter, thin=thin)
summary(my.model_samples)
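To pull out only the predictions, the monitored Y[101:105] columns can be selected by name and summarised. The sketch below uses a mock matrix of draws in place of as.matrix(my.model_samples), so that it runs without JAGS; the mock values are simulated from the true intercept, slope and sigma used above and are not real posterior output.

```r
set.seed(7)
x_new   <- c(-1, 0, 1.3, 2, 7)
n_draws <- 3000

# Mock draws standing in for as.matrix(my.model_samples)
mock_draws <- sapply(x_new, function(x) rnorm(n_draws, 2 + 1 * x, 0.5))
colnames(mock_draws) <- paste0("Y[", 101:105, "]")

# Select the prediction columns by name and compute interval summaries
pred_cols    <- paste0("Y[", 101:105, "]")
pred_summary <- t(apply(mock_draws[, pred_cols], 2, quantile,
                        probs = c(0.025, 0.5, 0.975)))
pred_summary <- cbind(X = x_new, pred_summary)
print(round(pred_summary, 2))
```

Each row then gives the 95% prediction interval and median for one of the added x-values.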