Gamma distribution in JAGS - Error in node - bayesian

I'm trying to parameterise a gamma distribution in JAGS with a piecewise linear predictor, but my model fails to run with the following error message:
Error: Error in node (ashape/(aexp(mu[59]))) Invalid parent values
The model works when timber.recovery is drawn from a normal distribution, but the lower quantile predictions are less than zero, which is not biologically possible. I've tried a few tricks, such as adding 0.001 to the "mu" parameter in case it was drawing a zero and setting initial values based on outputs from a glm, but neither resolves the error message. Any insights would be greatly appreciated (I'm using R2jags). My model:
cat (
"model {
# UNINFORMATIVE PRIORS
sd_plot ~ dunif(0, 100)
tau_plot <- 1/(sd_plot * sd_plot)
# precision for plot level variance
alpha ~ dnorm(0, 1e-06)
# normal prior for intercept term
shape ~ dunif(0, 100)
# shape parameter for gamma
log_intensity ~ dnorm(0, 1e-06)
# uninformative prior for logging intensity
beta_1 ~ dnorm (0, 1e-06)
# uninformative prior; change in slope for first segment : <=3.6 years
beta_2 ~ dnorm (0, 1e-06)
# uninformative prior; change in slope for second segment : >3.6 years
InX_1 ~ dnorm (0, 1e-06)
# uninformative prior for interaction between tsl and log_intensity : <=3.6 years
InX_2 ~ dnorm (0, 1e-06)
# uninformative prior for interaction between tsl and log_intensity : >3.6 years
# PLOT LEVEL RANDOM EFFECTS
for (i in 1:nplots) {
plot_Eff[i] ~ dnorm(0,tau_plot)
}
for (i in 1:Nobs) {
# PIECEWISE LINEAR PREDICTOR
mu[i] <-
alpha +
beta_1 * (time.since.logged[i] * tsl.DUM1[i]) +
log_intensity * log.volume [i] +
beta_2 * (time.since.logged[i] * tsl.DUM2[i] - 3.6) +
beta_1 * (time.since.logged[i] * tsl.DUM2[i]) +
plot_Eff[plot.id[i]] +
InX_1 * (time.since.logged[i] * tsl.DUM1[i]) * log.volume [i] +
InX_2 * (time.since.logged[i] * tsl.DUM2[i] - 3.6) * log.volume[i] +
InX_1 * (time.since.logged[i] * tsl.DUM2[i]) * log.volume[i]
timber.recovery[i] ~ dgamma(shape,shape/exp(mu[i]))
# observed recovery
pred_timber_recovery[i] ~ dgamma(shape,shape/exp(mu[i]))
# posterior predictive distribution
pearson.residual[i] <-
(timber.recovery[i] - pred_timber_recovery[i]) / (sqrt(timber.recovery[i]))
}
}",
fill = TRUE,
file = "outputs/piecewise_TIMBER_MODEL_FINAL_GAMMA.txt")
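One thing worth checking, offered as a sketch rather than a diagnosis: JAGS's dgamma(r, lambda) takes a shape and a rate, so dgamma(shape, shape/exp(mu[i])) has mean exp(mu[i]), and the rate has to be finite and strictly positive. A dnorm(0, 1e-06) prior has a standard deviation of 1000, so an unlucky draw can push exp(mu[i]) to overflow or underflow in double precision. The Python snippet below only mimics that arithmetic; the function names are made up for illustration and this is not JAGS code.

```python
import math

def gamma_rate(shape, mu):
    """Rate for a gamma(shape, rate) whose mean is exp(mu), as in the model above."""
    return shape / math.exp(mu)

def is_valid_rate(shape, mu):
    """True if the implied rate is finite and strictly positive in double precision."""
    try:
        rate = gamma_rate(shape, mu)
    except (OverflowError, ZeroDivisionError):
        # exp(mu) overflowed, or underflowed to 0.0 so the division failed
        return False
    return math.isfinite(rate) and rate > 0.0

# mu values on the scale a prior with sd = 1000 can easily produce
for mu in (1.5, 800.0, -800.0):
    print(mu, is_valid_rate(2.0, mu))
```

If that turns out to be the cause, less diffuse priors on the coefficients or centred and scaled covariates would keep mu[i] in a range where exp() is well behaved. Zeros in timber.recovery are another thing to rule out, since the gamma distribution only has support on strictly positive values.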


How to avoid overdispersed Poisson regression overfitting?

I have a dataset with three variables: company id (there are 96 companies), expert id (there are 38 experts) and the points given by experts to companies. Points are discrete values from 0 to 100. I tried fitting an overdispersed Poisson model to the points given by the experts, but I don't know why the model overfits although I am using a linear likelihood. Here is my JAGS code:
model_code <- "
model
{
# Likelihood
for (i in 1:N) {
y[i] ~ dpois(exp(mu[i]))
mu[i] ~ dnorm(alpha[company[i]] + beta[expert[i]] , sigma^-2)
}
# Priors
for (j in 1:J){
alpha[j] ~ dnorm (mu.a, sigma.a^-2)
}
for (k in 1:K){
beta[k] ~ dnorm (mu.a, sigma.a^-2)
}
mu.a ~ dunif (0, 100)
sigma.a ~ dunif (0, 100)
sigma ~ dunif(0, 100)
}
"
Does anyone know why this model overfits and how to fix it?

How to plot confidence intervals for glm models (gamma family)?

I would like to plot the fitted line and the 95% confidence interval from a glm model (family gamma). For linear models, I have previously been able to plot the confidence intervals from the predictions using polygons, because the predictions included the fit and the lower and upper levels, but I do not know how to do it here because the predictions do not include upper and lower levels. I have also tried ggplot, but there the smoothing seems to flatten the curve. Thanks in advance for any help. See code:
library(ggplot2)
# Data
dat <- data.frame(c(45,75,85,2,14,45,45,45,45,45,55,55,65,85,15,15,315,3,40,85,125,115,13,105,
145,125,145,125,205,125,155,125,19,17,145,14,85,65,135,45,40,15,14,10,15,10,10,45,37,30),
c(1.928607e-01, 3.038813e-01, 8.041174e-02, 0.000000e+00, 1.017541e-02, 1.658876e-01, 2.084661e-01,
1.891305e-01, 2.657766e-01, 1.270864e-01, 1.720141e-01, 1.644947e-01, 7.038978e-02, 3.046604e-01,
3.111646e-02, 9.443539e-04, 3.590906e-02, 0.000000e+00, 2.384494e-01, 5.955332e-02, 7.703567e-02,
5.524471e-02, 9.915716e-04, 1.169936e-01, 1.409448e-01, 1.411809e-01, 1.025096e-01, 2.649503e-01,
6.309465e-02, 3.727837e-02, 8.855679e-02, 1.707864e-01, 1.714002e-02, 1.038789e-03, 1.208065e-01,
3.541327e-04, 7.492268e-02, 9.633591e-02, 7.414359e-02, 2.235050e-01, 1.489010e-01, 2.478929e-03,
2.573364e-03, 5.430035e-04, 1.719905e-02, 1.243006e-02, 6.822957e-03, 1.927544e-01, 1.146918e-01, 9.030385e-03))
colnames(dat) <- c("age", "wood")
# Model
model<- glm(wood+0.001 ~ log(age) + I(log(age)^2), data=dat, family = Gamma)
summary(model)
p<-predict(model, data.frame(age=1:200), interval="confidence", level=.95)
p.tr <- 1/p # inverse link according to ?glm
# Plot
plot(1:200, p.tr, type="n", ylim = c(0,.4), xlab="Forest age",
ylab="Proportion",
main="Wood production", yaxt="n")
axis(2, las=2)
lines(1:200, p.tr, ylim=range(p.tr), lwd=2, col=rgb(0, .4, 1))
# How can I add to this plot the 95% confidence intervals of the model?
# Ggplot
# I use this function because there was a warning of "Ignoring unknown parameters: family" and this solves that
binomial_smooth <- function(...) {
geom_smooth(method = "glm", method.args = list(family = "binomial"),formula=y~log(x)+I(log(x)^2),se=FALSE)
}
ggplot(dat, aes(x=age,
y=wood+0.001)) +
binomial_smooth() +
xlab("Forest age") +
ylab("Proportion") +
ggtitle("Wood production") +
xlim(0, 200) +
ylim(0,0.4) +
theme_bw() +
theme (plot.title = element_text(hjust = 0.5), legend.position = "none")
# Why I get this warning (Warning: In eval(family$initialize) : non-integer #successes in a binomial glm!)?
# Why is the curve more smooth here?
I'm not a mathematician / statistician, but I guess "family = binomial" gives you inappropriate estimates because it isn't the correct distribution: neither wood nor age is a count of successes.
About the confidence intervals:
I used the stat_smooth(), see below. Should be the same as geom_smooth(), though.
dat <- data.frame(c(45,75,85,2,14,45,45,45,45,45,55,55,65,85,15,15,315,3,40,85,125,115,13,105,
145,125,145,125,205,125,155,125,19,17,145,14,85,65,135,45,40,15,14,10,15,10,10,45,37,30),
c(1.928607e-01, 3.038813e-01, 8.041174e-02, 0.000000e+00, 1.017541e-02, 1.658876e-01, 2.084661e-01,
1.891305e-01, 2.657766e-01, 1.270864e-01, 1.720141e-01, 1.644947e-01, 7.038978e-02, 3.046604e-01,
3.111646e-02, 9.443539e-04, 3.590906e-02, 0.000000e+00, 2.384494e-01, 5.955332e-02, 7.703567e-02,
5.524471e-02, 9.915716e-04, 1.169936e-01, 1.409448e-01, 1.411809e-01, 1.025096e-01, 2.649503e-01,
6.309465e-02, 3.727837e-02, 8.855679e-02, 1.707864e-01, 1.714002e-02, 1.038789e-03, 1.208065e-01,
3.541327e-04, 7.492268e-02, 9.633591e-02, 7.414359e-02, 2.235050e-01, 1.489010e-01, 2.478929e-03,
2.573364e-03, 5.430035e-04, 1.719905e-02, 1.243006e-02, 6.822957e-03, 1.927544e-01, 1.146918e-01,
9.030385e-03))
colnames(dat) <- c("age", "wood")
model<- glm(wood+0.001 ~ log(age) + I(log(age)^2), data=dat, family = Gamma)
#summary(model)
p<-predict(model, data.frame(age=1:200), interval="confidence", level=.95)
p.tr <- 1/p # inverse link according to ?glm
prediction <- data.frame(age = as.numeric(names(p)), wood = 1/p)
ggplot(data = dat, aes(x = age, y = wood)) +
geom_point() +
geom_line(data= prediction) +
stat_smooth(data = dat, method = "glm",
formula = y+0.001 ~ log(x) + I(log(x)^2),
method.args = c(family = Gamma))
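For the base-graphics plot, one common way to get a band is to build the interval on the link scale and then apply the inverse link. This assumes you call predict() with se.fit = TRUE to get the fit and its standard error on the link scale. The helper below is hypothetical and written in Python only to show the arithmetic for the Gamma family's default inverse link; the 1.96 multiplier is the usual normal approximation.

```python
def inverse_link_interval(eta, se, z=1.96):
    """Approximate 95% interval on the response scale for an inverse link.

    eta: fitted value on the link scale (1/mean for the inverse link)
    se:  standard error of eta on the link scale
    Returns (fit, lower, upper) on the response scale.
    """
    lower_eta = eta - z * se
    upper_eta = eta + z * se
    if lower_eta <= 0:
        raise ValueError("interval reaches zero on the link scale; 1/eta is undefined there")
    # 1/x is decreasing for x > 0, so the link-scale upper bound
    # becomes the response-scale lower bound, and vice versa
    return 1.0 / eta, 1.0 / upper_eta, 1.0 / lower_eta

fit, lower, upper = inverse_link_interval(eta=10.0, se=1.0)
print(fit, lower, upper)
```

The resulting lower and upper vectors can then be drawn with polygon() or lines(), as with a linear model.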

Custom loss for coordinate/landmark prediction

I am currently trying to get a landmark predictor running and thought about the loss function.
Currently the last (dense) layer has 32 values with the 16 coordinates encoded as x1,y1,x2,y2,...
Up until now I was just fiddling with Mean Squared Error or Mean Absolute Error losses but thought the distance between the ground truth and the predicted coordinate would be far more expressive of the correctness of the values.
My current implementation looks like:
def dst_objective(y_true, y_pred):
    vats = dict()
    for i in range(0, 16):
        true_px = y_true[:, i * 2:i * 2 + 1]
        pred_px = y_pred[:, i * 2:i * 2 + 1]
        true_py = y_true[:, i * 2 + 1:i * 2 + 2]
        pred_py = y_pred[:, i * 2 + 1:i * 2 + 2]
        vats[i] = K.sqrt(K.square(true_px - pred_px) + K.square(true_py - pred_py))
    out = K.concatenate([
        vats[0], vats[1], vats[2], vats[3], vats[4], vats[5], vats[6], vats[7],
        vats[8], vats[9], vats[10], vats[11], vats[12], vats[13], vats[14],
        vats[15]
    ], axis=1)
    return K.mean(out, axis=0)
It does seem to work when I evaluate it but it does look "hacky" to me. Any suggestions how I could improve on this?
The same calculation expressed as tensor operations in Keras, without separating the X and Y coordinates, because that's basically unnecessary:
# get all the squared difference in coordinates
sq_distances = K.square( y_true - y_pred )
# then take the sum of each (x, y) pair; the backend has no AveragePooling1D
# (that is a layer), so reshape to (batch, 16, 2) and sum over the last axis
sum_pool = K.sum( K.reshape( sq_distances, (-1, 16, 2) ), axis = -1 )
# take the square root to get the distance
dists = K.sqrt( sum_pool )
# take the mean of the distances
mean_dist = K.mean( dists )
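To sanity-check a loss like this, a plain-Python reference for a single sample can help. This is only a sketch: it assumes the x1,y1,x2,y2,... layout described in the question, and the function name is made up.

```python
import math

def mean_landmark_distance(y_true, y_pred):
    """Mean Euclidean distance between matching (x, y) landmarks in flat lists."""
    if len(y_true) != len(y_pred) or len(y_true) % 2 != 0:
        raise ValueError("expected two equal-length flat lists of x,y pairs")
    dists = []
    for i in range(0, len(y_true), 2):
        dx = y_true[i] - y_pred[i]
        dy = y_true[i + 1] - y_pred[i + 1]
        dists.append(math.hypot(dx, dy))
    return sum(dists) / len(dists)

# two landmarks: the first is off by a 3-4-5 triangle, the second is exact
example = mean_landmark_distance([0.0, 0.0, 1.0, 1.0], [3.0, 4.0, 1.0, 1.0])
print(example)
```

Feeding the same numbers through the Keras loss with a batch of one should give a matching value if the tensor version is wired up as intended.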

How does Tensorflow Batch Normalization work?

I'm using tensorflow batch normalization in my deep neural network successfully. I'm doing it the following way:
if apply_bn:
    with tf.variable_scope('bn'):
        beta = tf.Variable(tf.constant(0.0, shape=[out_size]), name='beta', trainable=True)
        gamma = tf.Variable(tf.constant(1.0, shape=[out_size]), name='gamma', trainable=True)
        batch_mean, batch_var = tf.nn.moments(z, [0], name='moments')
        ema = tf.train.ExponentialMovingAverage(decay=0.5)
        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)
        mean, var = tf.cond(self.phase_train,
                            mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
    self.z_prebn.append(z)
    z = tf.nn.batch_normalization(z, mean, var, beta, gamma, 1e-3)
    self.z.append(z)
    self.bn.append((mean, var, beta, gamma))
And it works fine both for training and testing phases.
However, I run into problems when I try to use the trained network parameters in another project, where I need to compute all the matrix multiplications and so on myself. The problem is that I can't reproduce the behavior of the tf.nn.batch_normalization function:
feed_dict = {
    self.tf_x: np.array([range(self.x_cnt)]) / 100,
    self.keep_prob: 1,
    self.phase_train: False
}
for i in range(len(self.z)):
    # print 0 layer's 1 value of arrays
    print(self.sess.run([
        self.z_prebn[i][0][1],  # before bn
        self.bn[i][0][1],  # mean
        self.bn[i][1][1],  # var
        self.bn[i][2][1],  # offset
        self.bn[i][3][1],  # scale
        self.z[i][0][1],  # after bn
    ], feed_dict=feed_dict))
# prints
# [-0.077417567, -0.089603029, 0.000436493, -0.016652612, 1.0055743, 0.30664611]
According to the formula on the page https://www.tensorflow.org/versions/r1.2/api_docs/python/tf/nn/batch_normalization:
bn = scale * (x - mean) / (sqrt(var) + 1e-3) + offset
But as we can see,
1.0055743 * (-0.077417567 - -0.089603029)/(0.000436493^0.5 + 1e-3) + -0.016652612
= 0.543057
Which differs from the value 0.30664611, computed by Tensorflow itself.
So what am I doing wrong here, and why can't I just calculate the batch-normalized value myself?
Thanks in advance!
The formula used is slightly different from:
bn = scale * (x - mean) / (sqrt(var) + 1e-3) + offset
It should be:
bn = scale * (x - mean) / (sqrt(var + 1e-3)) + offset
The variance_epsilon term is added to the variance before the square root is taken, not to sigma, which is the square root of the variance.
After the correction, the formula yields the correct value:
1.0055743 * (-0.077417567 - -0.089603029)/((0.000436493 + 1e-3)**0.5) + -0.016652612
# 0.30664642276945747
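The same check as a small plain-Python function, using the numbers printed in the question. The function name and the eps_inside_sqrt flag are made up for illustration; the flag exists only to contrast the two placements of epsilon.

```python
import math

def batch_norm_scalar(x, mean, var, offset, scale, eps=1e-3, eps_inside_sqrt=True):
    """Batch-normalize one value; eps_inside_sqrt=False reproduces the mistaken formula."""
    if eps_inside_sqrt:
        denom = math.sqrt(var + eps)   # epsilon added to the variance
    else:
        denom = math.sqrt(var) + eps   # epsilon added to sigma (the mistaken version)
    return scale * (x - mean) / denom + offset

args = dict(x=-0.077417567, mean=-0.089603029, var=0.000436493,
            offset=-0.016652612, scale=1.0055743)
correct = batch_norm_scalar(**args)
mistaken = batch_norm_scalar(**args, eps_inside_sqrt=False)
print(correct, mistaken)
```

The first value should agree with the TensorFlow output to within float32 rounding, and the second with the 0.543057 from the question.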

WinBUGS Examples Vol 1, Dyes example returns error

Currently going through examples volume 1 and came across an error with the dyes example.
When I try to load inits from the example it returns "this chain contains uninitialized variables". I am not sure which part is wrong, since at first sight theta, tau.btw and tau.with are all specified and nothing is left out.
I am using the code directly from Examples Vol 1 under help tab. The same error happened to all three choices of priors for between-variation.
I would really appreciate any advice on the problem. Thanks in advance.
Below is the code I copied directly from the dyes example.
model
{
for( i in 1 : batches ) {
mu[i] ~ dnorm(theta, tau.btw)
for( j in 1 : samples ) {
y[i , j] ~ dnorm(mu[i], tau.with)
}
}
theta ~ dnorm(0.0, 1.0E-10)
# prior for within-variation
sigma2.with <- 1 / tau.with
tau.with ~ dgamma(0.001, 0.001)
# Choice of priors for between-variation
# Prior 1: uniform on SD
#sigma.btw~ dunif(0,100)
#sigma2.btw<-sigma.btw*sigma.btw
#tau.btw<-1/sigma2.btw
# Prior 2: Uniform on intra-class correlation coefficient,
# ICC=sigma2.btw / (sigma2.btw+sigma2.with)
ICC ~ dunif(0,1)
sigma2.btw <- sigma2.with *ICC/(1-ICC)
tau.btw<-1/sigma2.btw
# Prior 3: gamma(0.001, 0.001) NOT RECOMMENDED
#tau.btw ~ dgamma(0.001, 0.001)
#sigma2.btw <- 1 / tau.btw
}
Data
list(batches = 6, samples = 5,
y = structure(
.Data = c(1545, 1440, 1440, 1520, 1580,
1540, 1555, 1490, 1560, 1495,
1595, 1550, 1605, 1510, 1560,
1445, 1440, 1595, 1465, 1545,
1595, 1630, 1515, 1635, 1625,
1520, 1455, 1450, 1480, 1445), .Dim = c(6, 5)))
Inits1
list(theta=1500, tau.with=1, sigma.btw=1)
Inits2
list(theta=1500, tau.with=1,ICC=0.5)
Inits3
list(theta=1500, tau.with=1, tau.btw=1)
That is not an error per se. Yes you have provided the inits for the parameters of interest.
However there are the six mu[i] variables that are not data, but are variables drawn from mu[i] ~ dnorm(theta, tau.btw).
You could provide initial values for these as well, but it is best imo to just click on gen inits if you are using WinBUGS from the GUI - this will provide initial values for those.
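If you would rather supply them yourself, one reasonable choice (an assumption on my part, not something the example prescribes) is to start each mu[i] at its batch mean. A small Python sketch that computes those means from the data above and prints an inits list in BUGS syntax for the ICC prior; the helper name is made up:

```python
def batch_means(values, batches, samples):
    """Row means of a flat list laid out row by row as batches x samples."""
    if len(values) != batches * samples:
        raise ValueError("values does not match batches * samples")
    return [sum(values[b * samples:(b + 1) * samples]) / samples
            for b in range(batches)]

# the .Data vector from the example; BUGS fills the 6 x 5 array row by row
y = [1545, 1440, 1440, 1520, 1580,
     1540, 1555, 1490, 1560, 1495,
     1595, 1550, 1605, 1510, 1560,
     1445, 1440, 1595, 1465, 1545,
     1595, 1630, 1515, 1635, 1625,
     1520, 1455, 1450, 1480, 1445]

mu_init = batch_means(y, batches=6, samples=5)
inits = "list(theta=1500, tau.with=1, ICC=0.5, mu=c({}))".format(
    ", ".join(str(m) for m in mu_init))
print(inits)
```

The printed list can be pasted in place of Inits2; gen inits is still the quicker route if the starting values do not matter to you.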