How to find probability of posterior parameter with Winbugs - bayesian

My winbugs code is as follows:
model
{
for ( i in 1:N){ logit(p[i])<- alpha+ beta*x[i]
y[i]~ dbin(p[i], n[i])
}
alpha~ dnorm(0,0.000001)
beta~ dnorm(0,0.000001)
pbeta<-step(beta-0)
}
list(N=20,
n=c(6, 7, 6, 8, 8, 5, 6, 6, 5, 8, 6, 5, 7, 6, 6, 7,6 , 6, 7, 3),
y=c(0,2,6,2,2,1,3,6,2,3,4,3,7,0,1,0,0,1,1,2),
x=c(25.7, 32.3, 49.6, 35.2, 35.9, 33.2, 39.8, 51.3, 32.9, 40.9,
43.6, 42.5, 50.4, 36.5, 34.1, 31.3, 28.3, 36.5, 37.4, 40.6))
list(alpha=0.1, beta=0.2)
After running this code, I have the posterior distributions of alpha and beta. Now I want to see P(beta>0). I was told that I can use pbeta<- step(beta) (pbeta acts as an indicator variable: 1 if beta>=0 and 0 otherwise, so its posterior mean estimates P(beta>0)). But when I put it in the model, it gave me an error notification.

I can't see anything wrong with your code, and it runs for me. What was the error message?
On an unrelated matter, your MCMC chains should converge more efficiently if you center the covariate values around their mean.
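As a rough illustration of what monitoring pbeta gives you: step(e) in BUGS is 1 when e >= 0 and 0 otherwise, so the posterior mean of pbeta is simply the fraction of sampled beta values that are non-negative. The Python sketch below mimics that. The beta_draws array is a made-up stand-in for real MCMC output (it is not a result from this model), and the last lines show the centering suggested above, using the x values from the question.

```python
import numpy as np

def step(e):
    """Mimic the BUGS step() function: 1 where e >= 0, else 0."""
    return (np.asarray(e) >= 0).astype(int)

# Hypothetical stand-in for posterior draws of beta (NOT real output of the model).
rng = np.random.default_rng(0)
beta_draws = rng.normal(loc=0.15, scale=0.1, size=10_000)

# The posterior mean of the monitored node pbeta estimates P(beta > 0).
pbeta = step(beta_draws)
prob_beta_positive = pbeta.mean()
print(prob_beta_positive)

# Centering the covariate: subtract the mean of x (values from the question).
x = np.array([25.7, 32.3, 49.6, 35.2, 35.9, 33.2, 39.8, 51.3, 32.9, 40.9,
              43.6, 42.5, 50.4, 36.5, 34.1, 31.3, 28.3, 36.5, 37.4, 40.6])
x_centered = x - x.mean()
```

In the BUGS model, the centered version would correspond to using x[i] - mean(x[]) in the linear predictor, which changes the meaning of alpha but not of beta.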

Numpy array value change via two index sets

I am trying to achieve the following:
# Before
raw = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
# Set values to 10
indice_set1 = np.array([0, 2, 4])
indice_set2 = np.array([0, 1])
raw[indice_set1][indice_set2] = 10
# Result
print(raw)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
But the raw values remain exactly the same.
Expecting this:
# After
raw = np.array([10, 1, 10, 3, 4, 5, 6, 7, 8, 9])
Indexing with an integer array, as in raw[indice_set1], returns a new array (a copy, not a view). The second indexing step then modifies that temporary copy, not raw.
Instead, slice the slicer:
raw[indice_set1[indice_set2]] = 10
Modified raw:
array([10, 1, 10, 3, 4, 5, 6, 7, 8, 9])
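A minimal side-by-side sketch of the two behaviours, using the arrays from the question (the names unchanged and combined are only introduced here for illustration):

```python
import numpy as np

raw = np.arange(10)
indice_set1 = np.array([0, 2, 4])
indice_set2 = np.array([0, 1])

# Chained fancy indexing: the first index produces a copy, so the
# assignment lands in that temporary copy and the source stays untouched.
unchanged = raw.copy()
unchanged[indice_set1][indice_set2] = 10

# Composing the index arrays first yields positions in raw itself
# (here positions 0 and 2), so a single assignment modifies raw.
combined = indice_set1[indice_set2]
raw[combined] = 10
print(raw)
```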

numpy invert stride selection

Consider the following code:
aa = np.arange(16)
step = 4
bb = aa[::4]
This selects every 4th element. Is there a quick and easy numpy function to select the complement of bb? I'm looking for the following output
array([1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15])
Yes, I could generate indices and then do np.setdiff1d, but I'm looking for something more elegant than that.
If you're looking for a simple single-liner:
np.delete(aa,slice(None,None,4))
Another solution (I don't know about elegant): define a boolean mask of ones, set every fourth element to False, and then use the mask to index the original array:
o = np.ones_like(aa,dtype=bool)
o[::step] = False
aa[o]
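A flexible way to select based on an arbitrary repeated position is to use a modulo. To get the complement of aa[::step] from the question, compare against 0 instead of step-1; the example below drops the last element of each group of step: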
bb = aa[np.arange(len(aa))%step != step-1]
Output:
array([ 0, 1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 14])
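For comparison, here is a small sketch that runs the three approaches above against the question's array and checks that they agree. The modulo variant compares against 0 so that it drops the same positions as aa[::step]; the variable names are chosen here for illustration.

```python
import numpy as np

aa = np.arange(16)
step = 4

# 1) Delete every step-th element, starting from index 0.
via_delete = np.delete(aa, slice(None, None, step))

# 2) Boolean mask: start from all True, switch off every step-th position.
mask = np.ones_like(aa, dtype=bool)
mask[::step] = False
via_mask = aa[mask]

# 3) Modulo on the positions: keep those that are not a multiple of step.
via_modulo = aa[np.arange(len(aa)) % step != 0]

print(via_delete)
```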

How to correctly format a pd-multiindex for sktime?

I have a pd.multiindex which looks like this:
However, when I run check_raise(df_train, mtype="pd-multiindex"),
I get the following error:
File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/sktime/datatypes/_check.py:252, in check_raise(obj, mtype, scitype, var_name)
250 return True
251 else:
--> 252 raise TypeError(msg)
TypeError: input.loc[i] must be Series of mtype pd.DataFrame, not at i=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
I believe this means I am meant to convert each row into a pandas series, but I am unsure if this is correct?
Any help would be appreciated.
I had a similar issue. Check whether your index has duplicate keys; in your case:
df_train.reset_index(['sbj', 'system_time_stamp'])[['sbj', 'system_time_stamp']].duplicated(keep=False)
Removing the duplicated index entries worked for me.
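A minimal pandas-only sketch of that check and one possible cleanup. The toy df_train below is hypothetical; only the level names sbj and system_time_stamp come from the question. Whether dropping, rather than aggregating, the duplicates is appropriate depends on your data.

```python
import pandas as pd

# Hypothetical toy frame; the pair (0, 20) appears twice in the index.
df_train = pd.DataFrame(
    {
        "sbj": [0, 0, 0, 1, 1],
        "system_time_stamp": [10, 20, 20, 10, 20],
        "value": [0.1, 0.2, 0.3, 0.4, 0.5],
    }
).set_index(["sbj", "system_time_stamp"])

# Flag every row whose (sbj, system_time_stamp) key occurs more than once.
dup_mask = df_train.index.duplicated(keep=False)
print(df_train[dup_mask])

# One option: keep only the first row for each duplicated key.
df_clean = df_train[~df_train.index.duplicated(keep="first")]
```

After this, df_clean.index.is_unique is True, and the frame can be passed to check_raise again.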

Specifying integer latent variable in stan

I'm learning Bayesian data analysis. I am trying to replicate Trond Reitan's tutorials, which were originally written for WinBUGS, in Stan.
Specifically, I have following data and model
weta.windata<-list(numdet=c(0, 0, 1, 0, 0, 0, 0, 0, 0, 2, 1, 1, 2, 0, 3, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 1, 0, 3, 1, 1, 3, 1, 1, 2, 0, 2, 1, 1, 1, 1,0, 0, 0, 2, 0, 2, 4, 3, 1, 0, 0, 2, 0, 2, 2, 1, 0, 0, 1),
numvisit=c(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 3, 3, 4, 4, 4, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4,4, 4, 4, 4, 4, 4, 4 ,4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3),
nsites=72)
model_string1="
data{
int nsites;
real<lower=0> numdet[nsites];
real<lower=0> numvisit[nsites];
}
parameters{
real<lower=0> p;
real<lower=0> psi;
int<lower=0> z[nsites];
}
model{
p~uniform(0,1);
psi~uniform(0,1);
for(i in 1:nsites){
z[i]~ bernoulli(psi);
p.site[i]~z[i]*p;
numdet[i]~binomial(numvisit[i],p.site[i]);
}
}
"
mcmc_samples <- stan(model_code=model_string1,
data=weta.windata,
pars=c("p","psi","z"),
chains=3, iter=30000, warmup=10000)
The context is detecting wetas in fields. There are 72 sites. For each site, researchers visited several times (numvisit) and recorded the number of times weta were found (numdet).
There is a latent variable z, describing whether one site has weta or not. psi is the probability that one site has weta. p is the detection rate.
The problem I have is that I cannot declare z as an integer:
parameters or transformed parameters cannot be integer or integer array; found declared type int, parameter name=z
Problem with declaration.
However, if I set z to be real, that is,
real<lower=0> z[nsites];
the problem becomes that a real variable cannot be given a bernoulli distribution:
No matches for:
real ~ bernoulli(real)
I'm very new to stan. Forgive me if this question is very silly.
Stan doesn't support integer parameters or hacks to let you pretend real variables are integers. What it does support is marginalizing the integer variables out of the density. You can then reconstruct them with much more efficiency and much higher tail resolution.
The chapter in the manual on latent discrete parameters is the place to start. It includes an implementation of the CJS population models, which may be familiar. I implemented the Dorazio and Royle occupancy models as a case study, and Hiroki Ito translated the entire Kery and Schaub book to Stan. They're all linked under users >> documentation on the web site.
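To make the marginalization idea concrete for this occupancy model, here is a rough Python sketch (not the Stan code from the case study). For each site the latent z is summed out: with probability psi the site is occupied and numdet is Binomial(numvisit, p), and with probability 1 - psi it is empty and numdet must be 0. The function names are invented for this illustration, and 0 < p < 1 and 0 < psi < 1 are assumed.

```python
import math

def log_binom_pmf(y, n, p):
    """Log of the Binomial(n, p) probability of y successes."""
    return (math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
            + y * math.log(p) + (n - y) * math.log1p(-p))

def site_log_lik(y, n, p, psi):
    """Log-likelihood of one site with the occupancy indicator z summed out."""
    occupied = math.log(psi) + log_binom_pmf(y, n, p)
    if y > 0:
        # A detection is only possible when the site is occupied (z = 1).
        return occupied
    # y == 0: either occupied but never detected, or not occupied at all.
    return math.log(math.exp(occupied) + (1.0 - psi))

def prob_occupied_given_no_detection(n, p, psi):
    """Reconstruct P(z = 1 | y = 0) from p and psi after the fact."""
    missed = psi * (1.0 - p) ** n
    return missed / (missed + (1.0 - psi))

# First few sites from the question's data, with illustrative parameter values.
numdet = [0, 0, 1, 0]
numvisit = [4, 4, 4, 4]
total = sum(site_log_lik(y, n, p=0.3, psi=0.6) for y, n in zip(numdet, numvisit))
print(total)
```

In Stan, the same per-site sum would typically be added to the target on the log scale (for example with log_sum_exp or log_mix), and the probability of z = 1 for sites with no detections can be recovered afterwards, as in the last function.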
I ran into this mysterious error with ulam while answering practice problems in Statistical Rethinking. When you're constructing a list to pass to the data argument of ulam, be sure to use = rather than <- for assignment. If you don't, the list you construct won't have named components, and a missing name produces this error.

Chart Axes in VB.NET

My requirement is to graph (scatter graph) data from 2 arrays. I can now connect the data from the array and use it on the chart. My question is, how do I set the graph's X- and Y- axes to show consistency in their intervals?
For example, I have points from X = {1, 3, 4, 6, 8, 9} and Y = {7, 10, 11, 15, 18, 19}. What I would like to see is that these points are graphed in a scatter manner, but, the intervals for x-axis should be (intervals of) 2 up to 10 (such that it will show 0, 2, 4, 6, 8, 10 on x-axis) and intervals of 5 for the y-axis (such that it will show 5, 10, 15, 20 on y-axis). What code/property should I use/manipulate?
ADDED PART:
I currently have this data:
x_column = {12, 24, 1, 7, 29, 28, 25, 24, 15, 19}
y_column = {3, 5, 8, 3, 3, 3, 3, 3, 19, 15}
Each y_column element is paired with the corresponding x_column element.
Now, I want MyChart to display a scatter graph of the x_column and y_column data in such a way that the x-axis will show 5, 10, 15, 20, 25, 30 and the y-axis will show 2, 4, 6, 8, 10, 12, 14, 16, 18, 20.
My current code is:
' add points
MyChart.Series("Scatter Plot").Points.DataBindXY(x_Column, y_Column)
The code above only adds points.
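Try setting the Interval property of each axis (substitute your own chart and chart area names, e.g. MyChart):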
Chart1.ChartAreas("Default").AxisX.Interval = 2
Chart1.ChartAreas("Default").AxisY.Interval = 5