How to add an interaction term in JAGS between one categorical and one continuous variable? - bayesian

I have a mark-recapture model in JAGS and I want to code an interaction between a categorical variable and a continuous variable.
ngr is the number of groups
nind is the number of individuals in my mark recapture model
gr.sp[ind] just looks up in my database which group individual ind belongs to
Some priors:
phi.precip ~ dnorm(0,0.01)
for(groups in 1:ngr) {
phi.gr[groups] ~ dnorm(0, 0.01)
}
Here is a small part of the likelihood of my model:
...
for(ind in 1:nind) {
for(yr in 1:nyear) {
logit(phi[ind,yr]) <- e.phi[ind,yr]
e.phi[ind,yr] <-
phi.gr[gr.sp[ind]] + # Categorical variable telling how much belonging to a certain group changes your fitness
phi.precip * sum.rainfall[yr] + # Effect of rain on my individuals
phi.gr.precip * phi.gr[gr.sp[ind]] * sum.rainfall[yr] # This is the interaction between the categorical and the continuous I'm trying to code.
}
...}
First, how do you define the prior for the phi.gr.precip? Should it be something resembling this:
for(groups in 1:ngr) {
phi.gr.precip[groups] ~ dnorm(0, 0.01)
}
But then, I don't know how to implement it in the likelihood.
Second, how is phi.gr.precip supposed to be coded to include the interaction between the group an individual is in (gr.sp[ind]) and the climate (sum.rainfall[yr], which represents the amount of rain in a year)?
Coding an interaction like this seems to require the same number of parameters in phi.gr.precip as there are levels in the categorical variable. But that would require me to loop inside the likelihood:
...
for(ind in 1:nind) {
for(yr in 1:nyear) {
logit(phi[ind,yr]) <- e.phi[ind,yr]
e.phi[ind,yr] <-
phi.gr[gr.sp[ind]] +
phi.precip * sum.rainfall[yr] +
for(groups in 1:ngr) {
phi.gr.precip[groups] * phi.gr[gr.sp[ind]] * sum.rainfall[yr]
}
}
...}
This does not work when I run the model.

Your choice of prior looks reasonable.
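Your likelihood is almost correct, but JAGS can't add a for-loop to a number, and wrapping the whole sum in a loop over groups would define phi[ind,yr] more than once, which JAGS rejects. Instead, index the interaction coefficient by the individual's group, exactly as you already do for phi.gr. The interaction term is then a group-specific rainfall slope and needs no extra loop:
...
for(ind in 1:nind) {
for(yr in 1:nyear) {
logit(phi[ind,yr]) <- e.phi[ind,yr]
e.phi[ind,yr] <-
phi.gr[gr.sp[ind]] + # group intercept
phi.precip * sum.rainfall[yr] + # common effect of rain
phi.gr.precip[gr.sp[ind]] * sum.rainfall[yr] # group-specific deviation in the rain slope (the interaction)
}
...}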
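To see what a group-indexed interaction does outside JAGS, here is a minimal R sketch. It assumes one interaction coefficient per group, indexed by gr.sp[ind]; all numeric values are made-up toy numbers, not estimates from any model.

```r
# Toy values, purely hypothetical, to show how group indexing builds the interaction
nind  <- 5
nyear <- 4
gr.sp         <- c(1, 1, 2, 3, 3)          # group of each individual
sum.rainfall  <- c(-1.0, 0.0, 0.5, 1.5)    # (standardised) rainfall per year
phi.gr        <- c(0.2, -0.1, 0.4)         # group intercepts
phi.precip    <- 0.3                       # common rainfall slope
phi.gr.precip <- c(0.0, 0.5, -0.5)         # group-specific deviations in slope

# Linear predictor: each individual uses the intercept and slope of its own group
e.phi <- outer(seq_len(nind), seq_len(nyear), function(ind, yr) {
  phi.gr[gr.sp[ind]] +
    (phi.precip + phi.gr.precip[gr.sp[ind]]) * sum.rainfall[yr]
})

phi <- plogis(e.phi)  # inverse logit, same role as logit(phi[ind,yr]) <- e.phi[ind,yr]
```

Note that the data only inform the sum phi.precip + phi.gr.precip[g] for each group. If you want the coefficients to be separately interpretable, one option is to fix one group's deviation to zero (for example phi.gr.precip[1] <- 0 in the model and priors only for groups 2 to ngr), so that phi.precip is the slope of the reference group.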

Related

How to modify ggboxplot (ggpubr) to suppress whiskers, but retain access to other boxplot customisations?

Originally I asked this question about suppressing the whiskers on a boxplot made by ggboxplot. (The expected way of setting a geom_boxplot option was not available.) A nice solution appeared which suited the original question. However, the broader question to address is how to suppress whiskers on the boxplot but still retain access to the nice additions in ggpubr, such as being able to automatically compute statistical test results and place these on a boxplot.
I tinkered with the solution from @Julian_Hn to get something like what I want.
There are two issues that someone more knowledgeable might be able to help with, now that I've asked the broader question:
Are there ways to make the solution more efficient?
How can I add in the ability to change the range of x-values? (I tried various methods using ggpar and coord_cartesian, with no effect. I might be lacking knowledge of how to use commands like ggplot_build effectively.)
Here's an example where I suppress whiskers and use stat_kruskal_test to label the boxplot:
ggboxplot_whisker_opt <- function(...)
{
opts <- list(...) # Modification of original question solution to include the original labelled ggboxplot with whiskers and stat info added
# Check if user specified a whiskers arg and set options accordingly
if("whisker" %in% names(opts))
{
whisk <- opts$whisker
opts$whisker <- NULL
} else {
whisk <- TRUE
}
# Additional arguments that might need generalising so that other statistical tests can be used in other applications
if ("kruskal" %in% names(opts))
{ kruskal <- opts$kruskal
opts$kruskal <- NULL
opt.group <- opts$kruskal.options[[1]]
opt.label <- opts$kruskal.options[[2]]
opt.y <- opts$kruskal.options[[3]]
opt.x <- opts$kruskal.options[[4]]
opts$kruskal.options <- NULL
} else {
kruskal <- FALSE # default: no Kruskal-Wallis label, so the check below does not fail
}
pl <- do.call(ggboxplot,opts) # create plot by calling ggboxplot with all user options
if (kruskal){ pl <- pl + stat_kruskal_test(group.by=opt.group,label=opt.label, label.y.npc=opt.y,label.x.npc=opt.x) }
if(!whisk)
{ pl_list <- ggplot_build(pl) # get listed version of ggboxplot object to modify
pl_list$data[[1]]$ymin <- NA # remove the ymin/max that specify the whiskers
pl_list$data[[1]]$ymax <- NA
pl <- ggplot_gtable(pl_list) # convert to a gtable for drawing (the result is no longer a ggplot object)
}
# return
pl
}
Here's the application:
set.seed(123)
x <-rnorm(100)
labels <- round(runif(100,1,2))
df <- data.frame(labels=labels, value=x)
# Define the options for the stat_kruskal_test label
KO <- list("group"="labels","label"="as_detailed_italic", "label.y.npc"=0.5,"label.x.npc"=0.5,ylim=c(-1.2, 1.2))
# call the function
output.plot <- ggboxplot_whisker_opt(df, "labels","value", col="labels", legend="none", whisker=FALSE,add=c("mean"), orientation="horizontal", kruskal=TRUE,kruskal.options=KO)
# Plot the result
plot(output.plot)
The issue with modifying the built plot was that the returned object was no longer a ggplot object (the comment in my code was wrong) but a gtable. Having thought about it, instead of modifying the ggplot_build output, it's also possible to pass coef=0 directly to the geom_boxplot layer inside the object returned by ggboxplot:
ggboxplot_whisker_opt <- function(...)
{
opts <- list(...)
# check if user specified a whiskers argument and set options accordingly
if("whisker" %in% names(opts))
{
whisk <- opts$whisker
opts$whisker <- NULL
} else {
whisk <- TRUE
}
pl <- do.call(ggpubr::ggboxplot, opts) # create plot by calling ggboxplot with all user options
if(!whisk)
{
pl$layers[[1]]$stat_params$coef <- 0 # modify coef param of geom_boxplot layer
}
# plot the ggplot and return so other ggplot parts can be added via `+`
pl
}
This function now returns an object that is compatible with ggpar and accepts other ggplot modifiers via +:
library(ggplot2)
library(ggpubr)
set.seed(123)
x <- rnorm(100)
labels <- round(runif(100,1,2))
df <- data.frame(labels=labels,
value=x)
testplot <- ggboxplot_whisker_opt(df,"labels","value",whisker=FALSE)
ggpar(testplot,xlim=c(0.5,1.5),
ylim=c(-0.5,0.5))
testplot +
geom_line(data=data.frame(x=c(1,2),y=c(0,0)),aes(x=x,y=y),color="red",lwd=2)

How to avoid overdispersed Poisson regression overfitting?

I have a dataset with three variables: company id (there are 96 companies), expert id (there are 38 experts), and the points given by experts to companies. Points are discrete values from 0 to 100. I tried fitting an overdispersed Poisson model to the points given by the experts, but I don't know why the model overfits, given that I am using a linear likelihood. Here is my JAGS code:
model_code <- "
model
{
# Likelihood
for (i in 1:N) {
y[i] ~ dpois(exp(mu[i]))
mu[i] ~ dnorm(alpha[company[i]] + beta[expert[i]] , sigma^-2)
}
# Priors
for (j in 1:J){
alpha[j] ~ dnorm (mu.a, sigma.a^-2)
}
for (k in 1:K){
beta[k] ~ dnorm (mu.a, sigma.a^-2)
}
mu.a ~ dunif (0, 100)
sigma.a ~ dunif (0, 100)
sigma ~ dunif(0, 100)
}
"
Does anyone know why this model overfits and how to fix it?

Dynamic Optimization

I'm trying to set up a problem that maximizes consumption (u) over time (10 years), subject to the change in wealth (xt+1 - xt = -ct), where x0 = 1000. I've tried to solve it, but I get an error that the function cannot be evaluated at the initial parameters. The terminal conditions are x10 >= 0 and u10 >= 0. Here's the code I'm using:
library('optimx')
# Parameters
t <- seq(0,10) # time
desc <- 0.1 # tax
beta <- 1/(1+desc) # parameter of function
alpha <- 0.8 # function
xo = 1000 # wealth in t=0
# Variables
x <- rep(xo, length(t)) # wealth in time
# Function
utilidade_total <- function(u, npar=TRUE, print=TRUE){
for (i in lenght(t)){
obj[i]=(beta^i)*(u[i]^alpha)
x[i]=x[i-1]-u[i-1]
}
result <- sum(obj)
return(result)
}
maxuti <- optimx(par=u,fn= utilidade_total,
control = list(maximize=TRUE))
umax <- coef(maxuti)
for (i in 2:lenght(t)){
print(obj[i]=(beta^i)*(u^alpha))
print(stock[i]=x[i-1]-umax[i-1])
}
# Result
for (i in 2:lenght(t)){
obj[i]=(beta^i)*(u^alpha)
x[i]=x[i-1]-u[i-1]
}
plot(t, x, type='b')
plot(t, u, type='b')
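For comparison, here is a minimal sketch of one way to set this problem up so that the objective can be evaluated. It is not a corrected version of the code above. It uses base R's optim() instead of optimx and rests on assumptions that are mine, not the poster's: consumption happens in periods 0 to 9, all wealth is consumed by t = 10, and the constraints u >= 0 and x10 >= 0 are enforced by reparameterising consumption as shares of x0. The helper names consumption and neg_utility are hypothetical.

```r
# Parameters taken from the question
desc  <- 0.1             # discount rate
beta  <- 1 / (1 + desc)  # discount factor
alpha <- 0.8             # utility exponent
x0    <- 1000            # wealth at t = 0
n_periods <- 10          # assumption: consumption in periods 0..9

# Map unconstrained parameters to non-negative consumption that sums to x0
# (first weight fixed at 1 so the mapping is one-to-one)
consumption <- function(theta) {
  w <- exp(c(0, theta))
  x0 * w / sum(w)
}

# Negative discounted utility, because optim() minimises by default
neg_utility <- function(theta) {
  u <- consumption(theta)
  -sum(beta^(seq_along(u) - 1) * u^alpha)
}

fit <- optim(par = rep(0, n_periods - 1), fn = neg_utility, method = "BFGS",
             control = list(maxit = 1000, reltol = 1e-12))

u_opt  <- consumption(fit$par)        # consumption path for t = 0..9
x_path <- x0 - cumsum(c(0, u_opt))    # wealth path for t = 0..10

plot(0:n_periods, x_path, type = "b", xlab = "t", ylab = "wealth")
plot(0:(n_periods - 1), u_opt, type = "b", xlab = "t", ylab = "consumption")
```

Because utility is increasing in consumption, leaving wealth unspent at the end cannot be optimal, which is why the sketch forces the consumption path to sum to x0. With a different terminal condition, the share mapping would need to change.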

Gamma distribution in JAGS - Error in node

I'm trying to parameterise a gamma distribution in JAGS - with a piecewise linear predictor but my model fails to run with the following error message:
Error: Error in node (ashape/(aexp(mu[59]))) Invalid parent values
The model works when timber.recovery is drawn from a normal distribution, but the lower quantile predictions are less than zero, which is not biologically possible. I've tried a few tricks, like adding 0.001 to the "mu" parameter in case it was drawing a zero and setting initial values based on outputs from a glm, but neither resolves the error message. Any insights would be greatly appreciated (I'm using R2jags). My model:
cat (
"model {
# UNINFORMATIVE PRIORS
sd_plot ~ dunif(0, 100)
tau_plot <- 1/(sd_plot * sd_plot)
# precision for plot level variance
alpha ~ dnorm(0, 1e-06)
# normal prior for intercept term
shape ~ dunif(0, 100)
# shape parameter for gamma
log_intensity ~ dnorm(0, 1e-06)
# uninformative prior for logging intensity
beta_1 ~ dnorm (0, 1e-06)
# uninformative prior; change in slope for first segment : <=3.6 years
beta_2 ~ dnorm (0, 1e-06)
# uninformative prior; change in slope for second segment : >3.6 years
InX_1 ~ dnorm (0, 1e-06)
# uninformative prior for interaction between tsl and log_intensity : <=3.6 years
InX_2 ~ dnorm (0, 1e-06)
# uninformative prior for interaction between tsl and log_intensity : >3.6 years
# PLOT LEVEL RANDOM EFFECTS
for (i in 1:nplots) {
plot_Eff[i] ~ dnorm(0,tau_plot)
}
for (i in 1:Nobs) {
# PIECEWISE LINEAR PREDICTOR
mu[i] <-
alpha +
beta_1 * (time.since.logged[i] * tsl.DUM1[i]) +
log_intensity * log.volume [i] +
beta_2 * (time.since.logged[i] * tsl.DUM2[i] - 3.6) +
beta_1 * (time.since.logged[i] * tsl.DUM2[i]) +
plot_Eff[plot.id[i]] +
InX_1 * (time.since.logged[i] * tsl.DUM1[i]) * log.volume [i] +
InX_2 * (time.since.logged[i] * tsl.DUM2[i] - 3.6) * log.volume[i] +
InX_1 * (time.since.logged[i] * tsl.DUM2[i]) * log.volume[i]
timber.recovery[i] ~ dgamma(shape,shape/exp(mu[i]))
# observed recovery
pred_timber_recovery[i] ~ dgamma(shape,shape/exp(mu[i]))
# posterior predictive distribution
pearson.residual[i] <-
(timber.recovery[i] - pred_timber_recovery[i]) / (sqrt(timber.recovery[i]))
}
}",
fill = TRUE,
file = "outputs/piecewise_TIMBER_MODEL_FINAL_GAMMA.txt")
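For reference, dgamma(shape, shape/exp(mu[i])) is a shape/rate parameterisation, so the mean of timber.recovery[i] is exp(mu[i]). The small R sketch below uses toy values (not the poster's data) to check this by simulation. It also shows that exp() overflows or underflows for extreme values of mu, which turns the rate into 0 or Inf. With priors as wide as dnorm(0, 1e-06) this is one possible, unconfirmed source of an "Invalid parent values" error.

```r
set.seed(1)

# Toy values (hypothetical), only to illustrate the parameterisation
shape <- 2
mu    <- 1

# Same shape/rate form as in the model: rate = shape / exp(mu)
draws <- rgamma(1e5, shape = shape, rate = shape / exp(mu))
sim_mean    <- mean(draws)   # simulated mean
target_mean <- exp(mu)       # mean implied by the parameterisation

# Extreme linear predictors break the rate in double precision
rate_large_mu <- shape / exp(800)    # exp(800) is Inf, so the rate becomes 0
rate_small_mu <- shape / exp(-800)   # exp(-800) is 0, so the rate becomes Inf
```

If this turns out to be the problem, tighter priors on the coefficients, standardised covariates, or initial values close to the glm estimates for every coefficient are the usual things to try.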

OpenBUGS: Initializing the Model

I am having a problem in initializing the following model in OpenBUGS
model
{
#likelihood
for (t in 1:n) { yisigma2[t] <- 1/exp(theta[t]);
y[t] ~ dnorm(0,yisigma2[t]);
}
#Priors
mu ~ dnorm(0,0.1);
phistar ~ dbeta(20,1.5);
itau2 ~ dgamma(2.5,0.025);
beta <- exp(mu/2);
phi <- 2*phistar-1;
tau <- sqrt(1/itau2);
theta0~dnorm(mu, itau2)
thmean[1] <- mu + phi*(theta0-mu);
theta[1] ~ dnorm(thmean[1],itau2);
for (t in 2:n) { thmean[t] <- mu + phi*(theta[t-1]-mu);
theta[t] ~ dnorm(thmean[t],itau2);
}
}
This is my data
list(y=c(-0.0383 , 0.0019 ,......-0.0094),n=945)
And this is the list of my initials
list(phistar= 0.98, mu=0, itau2=50)
The checking of model, loading of data and compilation steps are ok. When loading initials, OpenBUGS says initial values are loaded but chain contains uninitialized variables. I then tried to initialize theta0 also but the problem persists. Could someone please help me regarding this?
Thanks
Khalid
I am a newbie at OpenBUGS, but shouldn't you be specifying a distribution for the inits rather than a single point value? Something like:
inits <- function(){ list(alpha=rnorm(1), beta=rnorm(1), sigma = rlnorm(1))}
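Another thing worth checking: the inits list in the question covers only phistar, mu and itau2, while theta0 and theta[1], ..., theta[n] are also stochastic nodes, so they are the ones left uninitialized. In OpenBUGS, the "gen inits" button in the Specification Tool should generate values for the remaining nodes. Alternatively you can supply them yourself. Below is a minimal R sketch, assuming the inits are prepared in R. The helper name make_inits and the starting values for theta are arbitrary choices of mine, not part of the original model.

```r
n <- 945  # number of observations, as in the data list

# Build one set of initial values covering every stochastic node in the model
make_inits <- function(n) {
  list(
    phistar = 0.98,
    mu      = 0,
    itau2   = 50,
    theta0  = rnorm(1, mean = 0, sd = 0.1),  # arbitrary starting value
    theta   = rnorm(n, mean = 0, sd = 0.1)   # one starting value per theta[t]
  )
}

set.seed(1)
inits <- make_inits(n)
str(inits)
```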