rjags error Invalid vector argument to ilogit - bayesian

I'd like to compare a betareg regression with the same regression fitted using rjags:
library(betareg)
d = data.frame(p= sample(c(.1,.2,.3,.4),100, replace= TRUE),
id = seq(1,100,1))
# I am looking to reproduce this regression with jags
b=betareg(p ~ id, data= d,
link = c("logit"), link.phi = NULL, type = c("ML"))
summary(b)
Below I am trying to do the same regression with rjags
#install.packages("rjags")
library(rjags)
jags_str = "
model {
#model
y ~ dbeta(alpha, beta)
alpha <- mu * phi
beta <- (1-mu) * phi
logit(mu) <- a + b*id
#priors
a ~ dnorm(0, .5)
b ~ dnorm(0, .5)
t0 ~ dnorm(0, .5)
phi <- exp(t0)
}"
id = d$id
y = d$p
model <- jags.model(textConnection(jags_str),
data = list(y=y,id=id)
)
update(model, 10000, progress.bar="none"); # Burnin for 10000 samples
samp <- coda.samples(model,
variable.names=c("mu"),
n.iter=20000, progress.bar="none")
summary(samp)
plot(samp)
I get an error on this line
model <- jags.model(textConnection(jags_str),
data = list(y=y,id=id)
)
Error in jags.model(textConnection(jags_str), data = list(y = y, id = id)) :
RUNTIME ERROR:
Invalid vector argument to ilogit
Can you advise
(1) how to fix the error
(2) how to set priors for the beta regression
Thank you.

This error occurs because you have supplied the whole id vector to logit, which is a scalar function. In JAGS, link functions cannot be vectorized. To address this, you need to use a for loop that goes through each element of id. To do this, I would add an additional element to your data list that gives the length of id.
# len_id has to be in the list passed to jags.model, not in the data frame
data = list(y = y, id = id, len_id = length(id))
From there you just need to make a small edit to your jags code.
for(i in 1:(len_id)){
y[i] ~ dbeta(alpha[i], beta[i])
alpha[i] <- mu[i] * phi
beta[i] <- (1-mu[i]) * phi
logit(mu[i]) <- a + b*id[i]
}
However, if you track mu, the samples will form a matrix of 20000 rows (# of iterations) by 100 columns (length of id). You are likely more interested in the actual parameters (a, b, and phi), so monitor those instead.
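Putting the pieces above together, a minimal sketch of the full model might look like the following. The names jags_data and len_id are just illustrative, the burn-in and iteration counts are shortened, and the sampler only runs if rjags (and JAGS itself) is installed. On the priors: dnorm in JAGS is parameterized by precision (1/variance), so dnorm(0, .5) is a normal with variance 2, not standard deviation 0.5.

```r
# Model string with the likelihood moved inside a loop over observations
jags_str <- "
model {
  for (i in 1:len_id) {
    y[i] ~ dbeta(alpha[i], beta[i])
    alpha[i] <- mu[i] * phi
    beta[i]  <- (1 - mu[i]) * phi
    logit(mu[i]) <- a + b * id[i]
  }
  # dnorm takes a precision, so 0.5 here means variance 2
  a  ~ dnorm(0, 0.5)
  b  ~ dnorm(0, 0.5)
  t0 ~ dnorm(0, 0.5)
  phi <- exp(t0)
}"

set.seed(1)
d <- data.frame(p = sample(c(.1, .2, .3, .4), 100, replace = TRUE), id = 1:100)

# Everything the model refers to (y, id, len_id) goes into one list
jags_data <- list(y = d$p, id = d$id, len_id = nrow(d))

# Only run the sampler when rjags is available
if (requireNamespace("rjags", quietly = TRUE)) {
  model <- rjags::jags.model(textConnection(jags_str), data = jags_data, quiet = TRUE)
  update(model, 1000, progress.bar = "none")   # burn-in
  samp <- rjags::coda.samples(model, variable.names = c("a", "b", "phi"),
                              n.iter = 2000, progress.bar = "none")
  print(summary(samp))
}
```

Monitoring a, b and phi keeps the output to three columns, which is also what you would compare against the coefficients reported by summary(b) from betareg.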

Related

Portfolio Optimization Using Quadprog Gives the Same Result Every Time Even After Changing Variables

I have a task to construct the efficient frontier using 25 portfolios (monthly data). I tried writing quadprog code to calculate the minimum variance portfolio weights for a given expected rate of return. However, regardless of the expected return, the solver gives me the same set of weights and the same variance, which is the global minimum variance portfolio. I was able to find the answer using an analytical solution. The code is below:
library(quadprog) # provides solve.QP
basedf <- read.csv("test.csv", header = TRUE, sep = ",")
data <- basedf[,2:26]
ret <- as.data.frame(colMeans(data))
variance <- diag(var(data))
covmat <-as.matrix(var(data))
###minimum variance portfolio calculation
Q <- 2*cov(data)
A <- rbind(rep(1,25))
a <- 1
result <- solve.QP(Dmat = Q,
dvec = rep(0,25),
Amat = t(A),
bvec = a,
meq = 1)
w <-result$solution
w
var <- result$value
var
sum(w)
Here is another version of the code that gives me the same value:
mvp <- function(e,ep){
Dmat <- 2*cov(e)
dvec <- rep(0, ncol(e))
Amat <- cbind(rep(1, ncol(e)), colMeans(e))
bvec <- c(1, ep)
result <- solve.QP(Dmat = Dmat, dvec = dvec, Amat = Amat, bvec = bvec, meq=1)
wp <- result$solution
varP <- result$value
ret_values <- list(wp, varP)
names(ret_values) <- c("wp", "VarP")
return(ret_values)
}
z <- mvp(data, -.005)
z$wp
sum(z$wp)
z$VarP
ef <- function(e, min_e, max_e){
list_e <- seq(min_e,max_e, length=50)
loop <- sapply(list_e, function(x) mvp(e, x)$VarP)
effF <- as.data.frame(cbind(list_e,loop))
minvar <- min(effF$loop)
L <- effF$loop==minvar
minret <- effF[L,]$list_e
minpoint <- as.data.frame(cbind(minret,minvar))
minvarwp <- mvp(e, min_e)$wp
rlist <- list(effF, minpoint, minvarwp)
names(rlist) <- c( "eFF", "minPoint", "wp")
return(rlist)
}
In the efficient frontier, all 50 portfolios have the same variance. Can anyone tell me what is wrong with the solver setup? Thanks.
I tried quadprog but couldn't solve it.
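One thing worth checking in the code above, offered as a sketch rather than a confirmed diagnosis: solve.QP treats only the first meq constraints as equalities, and all remaining ones as inequalities of the form A'w >= b. With meq=1 in mvp, the return constraint is therefore only a lower bound, and any target below the return of the global minimum variance portfolio leaves it inactive. Setting meq=2 would make both constraints equalities. The base R snippet below shows the same equality-constrained problem solved through its KKT system, using simulated returns (hypothetical, not test.csv) and a hypothetical helper mvp_eq.

```r
set.seed(1)
# Hypothetical data: 60 months of returns for 5 assets
e <- matrix(rnorm(60 * 5, mean = 0.01, sd = 0.05), ncol = 5)

# Minimum variance weights subject to two EQUALITY constraints:
#   sum(w) = 1  and  sum(w * colMeans(e)) = ep
mvp_eq <- function(e, ep) {
  S <- cov(e)
  k <- ncol(e)
  A <- cbind(rep(1, k), colMeans(e))   # one column per constraint
  b <- c(1, ep)
  # KKT system: [2S A; A' 0] [w; lambda] = [0; b]
  K   <- rbind(cbind(2 * S, A), cbind(t(A), matrix(0, 2, 2)))
  sol <- solve(K, c(rep(0, k), b))
  w   <- sol[1:k]
  list(wp = w, VarP = drop(t(w) %*% S %*% w))
}

targets <- c(0.005, 0.010, 0.020)
frontier_var <- sapply(targets, function(ep) mvp_eq(e, ep)$VarP)
frontier_var   # the variance now changes with the target return
```

If the quadprog version with meq=2 gives the same weights as this analytical solution for a few targets, the flat frontier was most likely caused by the inactive inequality.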

How can I use fmincon() for different input parameters without using a for loop?

I want to run the optimization function fmincon() over thousands of different input parameters. Briefly, the aim of the optimization is to find the optimal consumption and investment strategy that gives the highest utility for a given wealth. The basic setup and functions are as follows:
library(pracma)
library(NlcOptim)
library(plyr) # provides round_any(), used in J.simulate below
# individual preference parameters
gamma <- 5
beta <- 0.02
Y <- 1
# financial market parameters
r <- 0.02
mu <- 0.06
sigma <- 0.2
lambda <- (mu-r)/sigma
# Merton fraction
w_star <- lambda / (gamma*sigma)
# fix random seed
set.seed(85)
scenarios <- 1000
Z_omega <- array(rnorm(scenarios,0,1), dim=c(scenarios,1)) # Brownian motion vector for E[J(W)]
# J multiple
multiple <- 1000000000
fineness <- 0.01
# define utility function
u <- function(C) {
C^(1-gamma)/(1-gamma)
}
# wealth scenario at t+1 for a given W_t
W.next <- function(W,C,fstar) {
W.tplus1 <- exp(r + fstar*sigma*lambda - 0.5*fstar^2*sigma^2 + fstar*sigma*Z_omega) * (W + Y - C)
return(W.tplus1)
}
J.simulate <- function(W.tplus1) {
floor.number <- floor((round_any(W.tplus1, fineness, f=floor) * 1/fineness)) + 1
ceiling.number <- ceiling((round_any(W.tplus1, fineness, f=ceiling) * 1/fineness)) + 1
x1 <- G_T[floor.number]
x2 <- G_T[ceiling.number]
y1 <- J_WT[floor.number]
y2 <- J_WT[ceiling.number]
# linear interpolation for J
J.tplus1.simulate <- y1 + ((W.tplus1-x1)/(x2-x1) * (y2-y1))
return(J.tplus1.simulate)
}
# define h(C,f|W)
h_t <- function(Cfstar) {
C <- Cfstar[1]
fstar <- Cfstar[2]
# wealth scenario at t+1 for a given W_t
W.tplus1 <- W.next(W,C,fstar)
# compute indirect utility for simulated W_t+1 using already compute J_WT
J.tplus1.simulate <- J.simulate(W.tplus1) # ignore wealth less than 0.001 (it can never be optimal)
# expectation of all J(W_t+1)
J_t_plus_1 <- mean(J.tplus1.simulate, na.rm=TRUE) # ignore NAs
# function h_t
indirect_utility <- log(-(u(C) + exp(-beta) * J_t_plus_1)*multiple)
return(indirect_utility)
}
For the sake of simplicity, I generated 10 wealth levels, W, to be optimized:
# wealth grid at T
G_T <- c(0.001, seq(0.01, 3, by=0.01))
J_1T <- -291331.95
J_WT <- G_T^(1-gamma) * J_1T
# wealth to be optimized
W_optim <- seq(0.01, 0.1, by=0.01)
What I did using the for loop is as follows:
# number of loop
wealth.loop <- length(W_optim)
# result vectors
C_star <- numeric(wealth.loop)
f_star <- numeric(wealth.loop)
J <- numeric(wealth.loop)
# lowerbound is fixed
lowerbound <- c(0.01,0.0001)
# optimize!
for (g in 1:wealth.loop) {
W <- W_optim[g]
x0 <- c((W+Y)*0.05,w_star) # initial input vector
upperbound <- c(W+Y-0.01,1) # upperbound depending on W
optimization <- fmincon(x0=x0, fn=h_t, lb=lowerbound, ub=upperbound, tol=1e-10)
C_star[g] <- optimization$par[1]
f_star[g] <- optimization$par[2]
J[g] <- optimization$value
print(c(g,optimization$par[1],optimization$par[2]))
}
This works well, but it takes hours to optimize over hundreds of thousands of different parameter sets. Hence, I am looking for a smarter way of doing this, such as using the apply family of functions. For instance, I tried:
W <- W_optim
# input matrix
x0 <- matrix(0, nrow=length(W), ncol=2)
x0[,1] <- (W+Y)*0.05
x0[,2] <- w_star
# lowerbound the same
lowerbound <- c(0.01,0.0001)
# upperbound matrix
upperbound <- matrix(0, nrow=length(W), ncol=2)
upperbound[,1] <- W+Y-0.01
upperbound[,2] <- 1
# optimize using mapply
mapply(fmincon, x0=x0, fn=h_t, lb=lowerbound, up=upperbound)
But obviously it doesn't work. I'm not sure whether the problem is that I am passing matrices rather than vectors as input parameters, or whether I'm just using the wrong function. Is there an efficient way to solve this problem?
I tried to optimize over the different parameters at once using mapply, but it didn't work. Maybe I should have used another apply-style function, or structured the input matrix differently?
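A sketch of one way to restructure this, under some assumptions: mapply iterates over the individual elements of a matrix rather than its rows, and h_t reads W from the global environment, so each problem needs to be wrapped in a function of W. Below, h_toy is a made-up objective and base R's optim (L-BFGS-B) stands in for fmincon, so that the snippet runs without extra packages. The same wrapper pattern should carry over to fmincon.

```r
# Toy objective (hypothetical): W is an explicit argument instead of a global
h_toy <- function(Cf, W) {
  (Cf[1] - 0.5 * (W + 1))^2 + (Cf[2] - 0.4)^2
}

# Solve one bounded problem for a single wealth level
solve_one <- function(W) {
  fit <- optim(par = c((W + 1) * 0.05, 0.2), fn = h_toy, W = W,
               method = "L-BFGS-B",
               lower = c(0.01, 0.0001),
               upper = c(W + 1 - 0.01, 1))   # upper bound depends on W
  c(W = W, C_star = fit$par[1], f_star = fit$par[2], value = fit$value)
}

W_grid <- seq(0.01, 0.1, by = 0.01)

# One row per wealth level
res <- t(sapply(W_grid, solve_one))
res
```

Note that sapply is mostly a tidier way to write the loop and is unlikely to be much faster by itself. Since each wealth level is solved independently, replacing sapply with parallel::parLapply or parallel::mclapply is probably where the real time saving would come from.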

Solve MLE for Vasicek Interest model but constantly run into an error "Error in if (!all(lower[isfixed] <= fixed[isfixed] & fixed[isfixed]..."

I tried to obtain MLEs for the Vasicek model using the following function.
I keep running into the following error and cannot find a way to solve it. Please help me. Thanks!
Error in if (!all(lower[isfixed] <= fixed[isfixed] & fixed[isfixed] <= :
missing value where TRUE/FALSE needed
Here is the background:
Likelihood function
likelihood.Vasicek<-function (theta, kappa, sigma, rt){
n <- NROW(rt)
y <- rt[2:n,] # Take rates other than r0
dt <- 1/12 # Simulated data is monthly
mu <- rt[1:(n-1),]* exp(-kappa*dt) + theta* (1- exp(-kappa*dt)) #Take prior rates for mu calculation
sd <- sqrt((sigma^2)*(1-exp(-2*kappa*dt))/(2*kappa))
pdf_yt <- dnorm(y, mu, sd, log = FALSE)
- sum(log(pdf_yt))
}
Simulating scenarios
IRModeling.Vasicek = function(r0, theta, kappa, sigma, T, N){
M <- T*12 # monthly time step
t <- 1/12 # time interval is monthly
rt = matrix(0, M+1, N) # N sets of scenarios with M months of time steps
rt[1,] <- r0 # set the initial value for each of the N scenarios
for (i in 1:N){
for (j in 1:M){
rt[j+1,i] = rt[j,i] + kappa*(theta - rt[j,i])*t + sigma*rnorm(1,mean=0,sd=1)*sqrt(t)
}
}
rt # Return the values
}
MLE
r0 = 0.03
theta = 0.03
kappa = 0.3
sigma = 0.03
T = 5 # years
N = 500
rt = IRModeling.Vasicek (r0, theta, kappa, sigma, T, N)
theta.est <- 0.04
kappa.est <- 0.5
sigma.est <- 0.02
parameters.est <- c(theta.est, kappa.est, sigma.est)
library(stats4)
bound.lower <- parameters.est*0.1
bound.upper <- parameters.est*2
est.mle<-mle(likelihood.Vasicek, start= list(theta = theta.est, kappa = kappa.est, sigma = sigma.est),
method="L-BFGS-B", lower=bound.lower, upper= bound.upper, fixed = list(rt = rt))
summary(est.mle)
Error
Error in if (!all(lower[isfixed] <= fixed[isfixed] & fixed[isfixed] <= :
missing value where TRUE/FALSE needed
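This is not a diagnosis of the mle error itself, only a possible workaround sketch. The error message shows mle comparing the fixed values against lower and upper, and here fixed holds the whole data matrix rt. One way to sidestep that, assuming it is acceptable to call optim directly instead of stats4::mle, is to pass the data as an ordinary extra argument. The snippet simulates a single path with the exact discretisation (not the Euler scheme above); simulate_vasicek and nll are hypothetical helper names.

```r
set.seed(42)

# Simulate one monthly Vasicek path using the exact transition density
simulate_vasicek <- function(r0, theta, kappa, sigma, n_months, dt = 1/12) {
  r <- numeric(n_months + 1)
  r[1] <- r0
  sd_step <- sqrt(sigma^2 * (1 - exp(-2 * kappa * dt)) / (2 * kappa))
  for (j in seq_len(n_months)) {
    m <- r[j] * exp(-kappa * dt) + theta * (1 - exp(-kappa * dt))
    r[j + 1] <- rnorm(1, mean = m, sd = sd_step)
  }
  r
}
rt <- simulate_vasicek(0.03, theta = 0.03, kappa = 0.3, sigma = 0.03, n_months = 600)

# Negative log-likelihood; the data enter through the argument r
nll <- function(par, r, dt = 1/12) {
  theta <- par[1]; kappa <- par[2]; sigma <- par[3]
  n  <- length(r)
  mu <- r[-n] * exp(-kappa * dt) + theta * (1 - exp(-kappa * dt))
  sd <- sqrt(sigma^2 * (1 - exp(-2 * kappa * dt)) / (2 * kappa))
  -sum(dnorm(r[-1], mean = mu, sd = sd, log = TRUE))
}

start <- c(theta = 0.04, kappa = 0.5, sigma = 0.02)
fit <- optim(start, nll, r = rt, method = "L-BFGS-B",
             lower = start * 0.1, upper = start * 2,
             control = list(parscale = start))   # rescale the small parameters
fit$par
```

Using dnorm(..., log = TRUE) rather than log(dnorm(...)) also avoids underflow when a density value is very small.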

How to sample from a sum of two distributions: binomial and poisson

Is there a way to predict a value from a sum of two distributions? I am getting a syntax error on rstan when I try to estimate y here: y ~ binomial(,) + poisson()
library(rstan)
BH_model_block <- "
data{
int y;
int a;
}
parameters{
real <lower = 0, upper = 1> c;
real <lower = 0, upper = 1> b;
}
model{
y ~ binomial(a,b)+ poisson(c);
}
"
BH_model <- stan_model(model_code = BH_model_block)
BH_fit <- sampling(BH_model,
data = list(y = 5,
a = 2),
iter= 1000)
Produces this error:
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
error in 'model2c6022623d56_457bd7ab767c318c1db686d1edf0b8f6' at line 13, column 20
-------------------------------------------------
11:
12: model{
13: y ~ binomial(a,b)+ poisson(c);
^
14: }
-------------------------------------------------
PARSER EXPECTED: ";"
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model '457bd7ab767c318c1db686d1edf0b8f6' due to the above error.
Stan doesn't support integer parameters, so you can't technically do that. For two real variables, it'd look like this:
parameters {
real x;
real y;
}
transformed parameters {
real z = x + y;
}
model {
x ~ normal(0, 1);
y ~ gamma(0.1, 2);
}
Then you get the sum distribution for z. If the variables are discrete, it won't compile.
If you don't need z in the model, then you can do this in the generated quantities block:
generated quantities {
int x = binomial_rng(a, b);
int y = poisson_rng(c);
int z = x + y;
}
The drawback of doing this is that none of the variables are available in the model block. If you need discrete parameters, they need to be marginalized as described in the user's guide chapter on latent discrete parameters (also in the chapter on mixtures and HMMs). This is not so easy with a Poisson, because its support isn't bounded. If the expectations of the two discrete distributions are small, then you can do it approximately with a loop over plausible values.
It looked from the example in the original post that z is data. That's a slightly different marginalization over x and y, but you only sum over the x and y such that x + y = z, so the combinatorics are greatly reduced.
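To make that sum concrete, here is a small R sketch of the marginal pmf of z = x + y with x binomial and y Poisson. The function name dbinom_plus_pois and the parameter values are made up for illustration. In Stan, the analogous step would presumably be a log_sum_exp over binomial_lpmf(x | a, b) + poisson_lpmf(z - x | c) for the same range of x.

```r
# P(Z = z) for Z = X + Y, X ~ Binomial(a, b), Y ~ Poisson(lambda),
# summing only over the x values compatible with x + y = z
dbinom_plus_pois <- function(z, a, b, lambda) {
  x <- 0:min(a, z)   # x cannot exceed a, and y = z - x must be >= 0
  sum(dbinom(x, size = a, prob = b) * dpois(z - x, lambda = lambda))
}

# Hypothetical parameter values
a <- 2; b <- 0.3; lambda <- 0.8

pmf <- sapply(0:50, dbinom_plus_pois, a = a, b = b, lambda = lambda)
sum(pmf)            # should be very close to 1
sum(0:50 * pmf)     # should be very close to a * b + lambda
```

Because x is bounded by a, the sum has at most a + 1 terms, so the unbounded support of the Poisson is not a problem when z is observed.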
An alternative is to substitute the Binomial with a Poisson, and use Poisson additivity:
BH_model_block <- "
data{
int y;
int a;
}
parameters{
real <lower = 0, upper = 1> c;
real <lower = 0, upper = 1> b;
}
model{
y ~ poisson(a * b + c);
}
"
This differs in that if b is not small, the Binomial has a lower variance than the Poisson, but maybe there is overdispersion anyhow?
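A quick numeric check of that variance remark, with made-up values of b: a Binomial(a, b) has variance a*b*(1-b), while a Poisson with the same mean a*b has variance a*b.

```r
a <- 2
b <- c(0.05, 0.25, 0.5, 0.9)   # hypothetical success probabilities

var_binomial <- a * b * (1 - b)
var_poisson  <- a * b          # Poisson with matching mean a * b

data.frame(b = b, var_binomial = var_binomial, var_poisson = var_poisson,
           ratio = var_binomial / var_poisson)
```

The ratio is 1 - b, so the substitution changes little for small b and becomes increasingly rough as b grows.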

Why are the JAGS and depmixS4 results sometimes different?

I have a data set like the following simulated data:
Pi = matrix(c(0.9,0.1,0.3,0.7),2,2,byrow=TRUE)
delta = c(.5,.5)
z = sample(c(1,2),1,prob=delta)
T = 365
for( t in 2:T){
z[t] = sample(x=c(1,2),1,prob=Pi[z[t-1],])
}
x <- sample(x=seq(-1, 1.5, length.out=T),T,replace=TRUE)
alpha = c(-1, -3.2)
Beta = c(-4,3)
y<-NA
for(i in 1:T){
y[i] = rbinom(1,size=10,prob=1/(1+exp(-Beta[z[i]]*x[i]-alpha[z[i]])))
}
SimulatedBinomData <- data.frame('y' = y, 'x' = x , size=rep(10,T), 'z' = z)
yy<-NA
xx<-NA
for(i in 1:dim(SimulatedBinomData)[1]){
yy<-c(yy,c(rep(1,SimulatedBinomData$y[i]),rep(0,(SimulatedBinomData$size[i]-SimulatedBinomData$y[i]))))
xx<-c(xx,rep(SimulatedBinomData$x[i],SimulatedBinomData$size[i]))
}
yy<-yy[-1]
xx<-xx[-1]
SimulatedBernolliData<-data.frame(y=yy,x=xx, tt=rep(c(1:T),rep(10,T)))
This is an HMM problem with two states, meaning that the hidden Markov chain z_t takes values in {1,2}. To estimate alpha and Beta in the two states, I can either use the package 'depmixS4' to find the maximum likelihood estimates or use MCMC with the 'rjags' package.
I expect these two sets of estimates to be almost the same. However, when I run the following program on different simulated data sets, the answers are sometimes very different!
library("rjags")
library("depmixS4")
mod <- depmix(cbind(y,(size-y))~x, data=SimulatedBinomData, nstates=2, family=binomial(logit))
fm <- fit(mod)
getpars(fm)
n<-length(SimulatedBernolliData$y)
T<-max(SimulatedBernolliData$tt)
cat("model {
# Transition Probability
Ptrans[1,1:2] ~ ddirch(a)
Ptrans[2,1:2] ~ ddirch(a)
# States
Pinit[1] <- 0.5 # failure
Pinit[2] <- 0.5 # success
state[1] ~ dbern(Pinit[2])
for (t in 2:T) {
state[t] ~ dbern(Ptrans[(state[t-1]+1),2])
}
# Parameters
alpha[1] ~ dunif(-1.e10, 1.e10)
alpha[2] ~ dunif(-1.e10, 1.e10)
Beta[1] ~ dunif(-1.e10, 1.e10)
Beta[2] ~ dunif(-1.e10, 1.e10)
# Observations
for (i in 1:n){
z[i] <- state[tt[i]]
y[i] ~ dbern(1/(1+exp(-(alpha[(z[i]+1)]+Beta[(z[i]+1)]*x[i]))))
}
}",
file="LeftBehindHiddenMarkov.bug")
jags <- jags.model('LeftBehindHiddenMarkov.bug', data = list('x' = SimulatedBernolliData$x, 'y' = SimulatedBernolliData$y, 'tt' = SimulatedBernolliData$tt, T=T, n = n, a = c(1,1) ))
res <- coda.samples(jags,c('alpha', 'Beta', 'Ptrans','state'),1000)
res.median = apply(res[[1]],2,median)
res.median[1:8]
res.mean = apply(res[[1]],2,mean)
res.mean[1:8]
res.sd = apply(res[[1]],2,sd)
res.sd[1:8]
res.mode = apply(res[[1]],2,function(x){as.numeric(names(table(x))
[which.max(table(x))]) })
res.mode[1:8]
You have a label switching problem in your JAGS code: state z[i]=1 is not tied to the lower posterior value of Beta, nor z[i]=2 to the higher one, so the labels can switch from one MCMC iteration to the next. There are several ways to solve this problem. One of them is partial reordering: for every MCMC iteration, draw two independent values for Beta and order them so that Beta[1] < Beta[2].
You can do that by replacing
Beta[1] ~ dunif(-1.e10, 1.e10)
Beta[2] ~ dunif(-1.e10, 1.e10)
with
Beta[1:2] <- sort(Betaaux)
Betaaux[1] ~ dunif(-1.e10, 1.e10)
Betaaux[2] ~ dunif(-1.e10, 1.e10)
Of course, the ordering could also be done on the alpha parameters instead. The choice of which parameter to use for the partial reordering depends on the problem.
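To see what label switching does to posterior summaries, here is a small base R illustration. It uses simulated draws, not real JAGS output, and it applies the reordering after sampling. That is a different technique from the in-model sort above, shown only because it is easy to run. The column names and the values the draws are centered on are assumptions.

```r
set.seed(1)
n_iter <- 1000

# Hypothetical draws for two well-separated states
beta_lo  <- rnorm(n_iter, -4.0, 0.2); beta_hi  <- rnorm(n_iter,  3.0, 0.2)
alpha_lo <- rnorm(n_iter, -1.0, 0.2); alpha_hi <- rnorm(n_iter, -3.2, 0.2)

# Swap the state labels at random to mimic label switching
swap <- rbinom(n_iter, 1, 0.5) == 1
draws <- cbind(Beta1  = ifelse(swap, beta_hi,  beta_lo),
               Beta2  = ifelse(swap, beta_lo,  beta_hi),
               alpha1 = ifelse(swap, alpha_hi, alpha_lo),
               alpha2 = ifelse(swap, alpha_lo, alpha_hi))
colMeans(draws)        # means blend both states and match neither

# Relabel each iteration so that Beta1 < Beta2, moving alpha along with it
flip <- draws[, "Beta1"] > draws[, "Beta2"]
relabelled <- draws
relabelled[flip, ] <- draws[flip, c("Beta2", "Beta1", "alpha2", "alpha1")]
colMeans(relabelled)   # means now sit near the per-state centers
```

The means and medians computed from res in the question are affected in the same way whenever the chain switches labels, which would explain why they only sometimes disagree with depmixS4.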