I would like to plot the fitted line and the 95% confidence interval from a GLM (Gamma family). For linear models I have previously been able to plot confidence intervals from the predictions, since those include the fit plus the lower and upper bounds, and draw them with polygons, but I do not know how to do that here because the predictions do not include lower and upper bounds. I have also tried ggplot, but there the smoothing seems to flatten the curve. Thanks in advance for any help. See the code:
library(ggplot2)
# Data
dat <- data.frame(c(45,75,85,2,14,45,45,45,45,45,55,55,65,85,15,15,315,3,40,85,125,115,13,105,
145,125,145,125,205,125,155,125,19,17,145,14,85,65,135,45,40,15,14,10,15,10,10,45,37,30),
c(1.928607e-01, 3.038813e-01, 8.041174e-02, 0.000000e+00, 1.017541e-02, 1.658876e-01, 2.084661e-01,
1.891305e-01, 2.657766e-01, 1.270864e-01, 1.720141e-01, 1.644947e-01, 7.038978e-02, 3.046604e-01,
3.111646e-02, 9.443539e-04, 3.590906e-02, 0.000000e+00, 2.384494e-01, 5.955332e-02, 7.703567e-02,
5.524471e-02, 9.915716e-04, 1.169936e-01, 1.409448e-01, 1.411809e-01, 1.025096e-01, 2.649503e-01,
6.309465e-02, 3.727837e-02, 8.855679e-02, 1.707864e-01, 1.714002e-02, 1.038789e-03, 1.208065e-01,
3.541327e-04, 7.492268e-02, 9.633591e-02, 7.414359e-02, 2.235050e-01, 1.489010e-01, 2.478929e-03,
2.573364e-03, 5.430035e-04, 1.719905e-02, 1.243006e-02, 6.822957e-03, 1.927544e-01, 1.146918e-01, 9.030385e-03))
colnames(dat) <- c("age", "wood")
# Model
model<- glm(wood+0.001 ~ log(age) + I(log(age)^2), data=dat, family = Gamma)
summary(model)
p<-predict(model, data.frame(age=1:200), interval="confidence", level=.95)
p.tr <- 1/p # inverse link according to ?glm
# Plot
plot(1:200, p.tr, type="n", ylim = c(0,.4), xlab="Forest age",
ylab="Proportion",
main="Wood production", yaxt="n")
axis(2, las=2)
lines(1:200, p.tr, ylim=range(p.tr), lwd=2, col=rgb(0, .4, 1))
# How can I add to this plot the 95% confidence intervals of the model?
# Ggplot
# I use this wrapper because passing family directly to geom_smooth gave the warning
# "Ignoring unknown parameters: family"; putting it in method.args avoids that
binomial_smooth <- function(...) {
    geom_smooth(method = "glm", method.args = list(family = "binomial"),
                formula = y ~ log(x) + I(log(x)^2), se = FALSE)
}
ggplot(dat, aes(x=age,
y=wood+0.001)) +
binomial_smooth() +
xlab("Forest age") +
ylab("Proportion") +
ggtitle("Wood production") +
xlim(0, 200) +
ylim(0,0.4) +
theme_bw() +
theme (plot.title = element_text(hjust = 0.5), legend.position = "none")
# Why do I get this warning? (Warning: In eval(family$initialize) : non-integer #successes in a binomial glm!)
# Why is the curve smoother here than in the base plot?
I'm not a mathematician / statistician, but I suspect family = binomial gives you inappropriate estimates: it isn't the correct distribution here, since wood is a continuous proportion rather than a count of successes, which is also what the "non-integer #successes" warning is telling you.
About the confidence intervals:
I used stat_smooth(), see below. It should behave the same as geom_smooth(), though.
# dat, model and p as defined in the question above
prediction <- data.frame(age = as.numeric(names(p)), wood = 1/p)
ggplot(data = dat, aes(x = age, y = wood)) +
    geom_point() +
    geom_line(data = prediction) +
    stat_smooth(data = dat, method = "glm",
                formula = y + 0.001 ~ log(x) + I(log(x)^2),
                method.args = list(family = Gamma))
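To get the band in the base-graphics plot as well, here is a minimal sketch using the model fitted above. Note that predict.glm() ignores the interval= and level= arguments, so the interval is built from the standard errors on the link scale (the 1.96 multiplier assumes an approximate normal interval there) and then back-transformed:

# standard errors are only available on the link scale
newdat <- data.frame(age = 1:200)
pr <- predict(model, newdat, type = "link", se.fit = TRUE)

# back-transform with the inverse link; the Gamma inverse link is decreasing,
# so the lower bound on the link scale maps to the upper bound on the response scale
fit <- model$family$linkinv(pr$fit)
upr <- model$family$linkinv(pr$fit - 1.96 * pr$se.fit)
lwr <- model$family$linkinv(pr$fit + 1.96 * pr$se.fit)

plot(newdat$age, fit, type = "n", ylim = c(0, .4),
     xlab = "Forest age", ylab = "Proportion", main = "Wood production", yaxt = "n")
axis(2, las = 2)
polygon(c(newdat$age, rev(newdat$age)), c(upr, rev(lwr)), col = "grey85", border = NA)
lines(newdat$age, fit, lwd = 2, col = rgb(0, .4, 1))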
I am currently trying to get a landmark predictor running and thought about the loss function.
Currently the last (dense) layer has 32 outputs, with the 16 landmark coordinates encoded as x1, y1, x2, y2, ...
Up until now I was just fiddling with mean squared error or mean absolute error losses, but I thought the distance between the ground-truth and the predicted coordinates would be far more expressive of how correct the values are.
My current implementation looks like:
from keras import backend as K

def dst_objective(y_true, y_pred):
    vats = dict()
    for i in range(0, 16):
        # slice out the x and y coordinates of the i-th landmark
        true_px = y_true[:, i * 2:i * 2 + 1]
        pred_px = y_pred[:, i * 2:i * 2 + 1]
        true_py = y_true[:, i * 2 + 1:i * 2 + 2]
        pred_py = y_pred[:, i * 2 + 1:i * 2 + 2]
        # Euclidean distance between the true and predicted landmark
        vats[i] = K.sqrt(K.square(true_px - pred_px) + K.square(true_py - pred_py))
    out = K.concatenate([
        vats[0], vats[1], vats[2], vats[3], vats[4], vats[5], vats[6], vats[7],
        vats[8], vats[9], vats[10], vats[11], vats[12], vats[13], vats[14],
        vats[15]
    ], axis=1)
    return K.mean(out, axis=0)
It does seem to work when I evaluate it, but it looks "hacky" to me. Any suggestions on how I could improve it?
The same calculation expressed as tensor operations in Keras, without separating the X and Y coordinates, because that's basically unnecessary:
from keras.layers import AveragePooling1D

# get all the squared differences in coordinates, shape (batch, 32)
sq_distances = K.square(y_true - y_pred)
# the pooling layer expects a channels axis, so reshape to (batch, 32, 1)
sq_distances = K.expand_dims(sq_distances, axis=-1)
# average each (dx^2, dy^2) pair and double it to get the pairwise sum, shape (batch, 16, 1)
sum_pool = 2 * AveragePooling1D(pool_size=2, strides=2, padding="valid")(sq_distances)
# take the square root to get the per-landmark distances
dists = K.sqrt(sum_pool)
# take the mean of the distances
mean_dist = K.mean(dists)
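Alternatively, here is a hedged sketch of the same idea packaged as a loss function, using a reshape instead of pooling so no layer has to be instantiated inside the loss (the function name and the landmark count of 16 are taken from the question, not from any library):

from keras import backend as K

def mean_landmark_distance(y_true, y_pred, n_landmarks=16):
    # reshape the flat (x1, y1, x2, y2, ...) vector into (batch, n_landmarks, 2)
    diff = K.reshape(y_true - y_pred, (-1, n_landmarks, 2))
    # Euclidean distance per landmark: sqrt of the summed squared coordinate differences
    dists = K.sqrt(K.sum(K.square(diff), axis=-1))
    # average over landmarks and over the batch
    return K.mean(dists)

# usage (assuming a Keras model named `model`):
# model.compile(optimizer="adam", loss=mean_landmark_distance)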
I'm successfully using TensorFlow batch normalization in my deep neural network. I'm doing it the following way:
if apply_bn:
    with tf.variable_scope('bn'):
        beta = tf.Variable(tf.constant(0.0, shape=[out_size]), name='beta', trainable=True)
        gamma = tf.Variable(tf.constant(1.0, shape=[out_size]), name='gamma', trainable=True)
        batch_mean, batch_var = tf.nn.moments(z, [0], name='moments')
        ema = tf.train.ExponentialMovingAverage(decay=0.5)

        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        mean, var = tf.cond(self.phase_train,
                            mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
        self.z_prebn.append(z)
        z = tf.nn.batch_normalization(z, mean, var, beta, gamma, 1e-3)
        self.z.append(z)
        self.bn.append((mean, var, beta, gamma))
And it works fine for both the training and testing phases.
However, I run into problems when I try to use the learned network parameters in another project, where I need to compute all the matrix multiplications and other operations myself. The problem is that I can't reproduce the behavior of the tf.nn.batch_normalization function:
feed_dict = {
    self.tf_x: np.array([range(self.x_cnt)]) / 100,
    self.keep_prob: 1,
    self.phase_train: False
}
for i in range(len(self.z)):
    # print value [0][1] of each tensor for layer i
    print(self.sess.run([
        self.z_prebn[i][0][1],  # before bn
        self.bn[i][0][1],       # mean
        self.bn[i][1][1],       # var
        self.bn[i][2][1],       # offset
        self.bn[i][3][1],       # scale
        self.z[i][0][1],        # after bn
    ], feed_dict=feed_dict))
# prints
# [-0.077417567, -0.089603029, 0.000436493, -0.016652612, 1.0055743, 0.30664611]
According to the formula on the page https://www.tensorflow.org/versions/r1.2/api_docs/python/tf/nn/batch_normalization:
bn = scale * (x - mean) / (sqrt(var) + 1e-3) + offset
But as we can see,
1.0055743 * (-0.077417567 - -0.089603029)/(0.000436493^0.5 + 1e-3) + -0.016652612
= 0.543057
which differs from the value 0.30664611 computed by TensorFlow itself.
So what am I doing wrong here, and why can't I just calculate the batch-normalized value myself?
Thanks in advance!
The formula you used is slightly off. Instead of
bn = scale * (x - mean) / (sqrt(var) + 1e-3) + offset
it should be
bn = scale * (x - mean) / sqrt(var + 1e-3) + offset
The variance_epsilon argument is added to the variance, not to sigma (the square root of the variance).
After the correction, the formula yields the correct value:
1.0055743 * (-0.077417567 - -0.089603029)/((0.000436493 + 1e-3)**0.5) + -0.016652612
# 0.30664642276945747
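For completeness, the same check in plain Python, plugging in the values printed in the question (the variable names are only illustrative):

import math

x, mean = -0.077417567, -0.089603029
var, offset, scale = 0.000436493, -0.016652612, 1.0055743
eps = 1e-3

# epsilon is added to the variance before taking the square root
bn = scale * (x - mean) / math.sqrt(var + eps) + offset
print(bn)  # ~0.30664642, matching TensorFlow's 0.30664611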
I am currently going through Examples Volume 1 and came across an error with the Dyes example.
When I try to load inits from the example it returns "this chain contains uninitialized variables". I am not sure which part is wrong, since at first sight theta, tau.btw and tau.with are all specified and nothing is left out.
I am using the code directly from Examples Vol 1 under the Help tab. The same error happens with all three choices of prior for the between-variation.
I would really appreciate any advice on the problem. Thanks in advance.
Below is the code I copied directly from the dyes example.
model
{
    for( i in 1 : batches ) {
        mu[i] ~ dnorm(theta, tau.btw)
        for( j in 1 : samples ) {
            y[i , j] ~ dnorm(mu[i], tau.with)
        }
    }
    theta ~ dnorm(0.0, 1.0E-10)

    # prior for within-variation
    sigma2.with <- 1 / tau.with
    tau.with ~ dgamma(0.001, 0.001)

    # Choice of priors for between-variation
    # Prior 1: uniform on SD
    #sigma.btw ~ dunif(0, 100)
    #sigma2.btw <- sigma.btw * sigma.btw
    #tau.btw <- 1 / sigma2.btw

    # Prior 2: uniform on intra-class correlation coefficient,
    # ICC = sigma2.btw / (sigma2.btw + sigma2.with)
    ICC ~ dunif(0, 1)
    sigma2.btw <- sigma2.with * ICC / (1 - ICC)
    tau.btw <- 1 / sigma2.btw

    # Prior 3: gamma(0.001, 0.001) NOT RECOMMENDED
    #tau.btw ~ dgamma(0.001, 0.001)
    #sigma2.btw <- 1 / tau.btw
}
Data
list(batches = 6, samples = 5,
y = structure(
.Data = c(1545, 1440, 1440, 1520, 1580,
1540, 1555, 1490, 1560, 1495,
1595, 1550, 1605, 1510, 1560,
1445, 1440, 1595, 1465, 1545,
1595, 1630, 1515, 1635, 1625,
1520, 1455, 1450, 1480, 1445), .Dim = c(6, 5)))
Inits1
list(theta=1500, tau.with=1, sigma.btw=1)
Inits2
list(theta=1500, tau.with=1,ICC=0.5)
Inits3
list(theta=1500, tau.with=1, tau.btw=1)
That is not an error per se: yes, you have provided inits for the parameters of interest.
However, the six mu[i] are also stochastic nodes; they are not data but are drawn from mu[i] ~ dnorm(theta, tau.btw), so they need initial values too.
You could provide initial values for these as well, but in my opinion it is easiest to just click "gen inits" if you are using WinBUGS from the GUI; this will generate initial values for those remaining nodes.
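If you would rather supply them yourself, an inits list along these lines would work with Prior 2 (the mu values are arbitrary starting points near the data, not values from the book):

list(theta = 1500, tau.with = 1, ICC = 0.5,
     mu = c(1500, 1500, 1500, 1500, 1500, 1500))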