I am working with a map that has strings as keys and arrays as values. I would like to produce a new map with the same string keys, where each array is replaced by its average value.
The original map is:
val appRatings = mapOf(
"Calendar Pro" to arrayOf(1, 5, 5, 4, 2, 1, 5, 4),
"The Messenger" to arrayOf(5, 4, 2, 5, 4, 1, 1, 2),
"Socialise" to arrayOf(2, 1, 2, 2, 1, 2, 4, 2)
)
What I have tried to do is:
val averageRatings = appRatings.forEach{ (k,v) -> v.reduce { acc, i -> acc + 1 }/v.size}
However, this returns Unit instead of a map. What am I doing wrong? I am working through a lambda assignment, and they want us to use forEach and reduce to get the answer.
You can use forEach and reduce, but it's overkill, because you can just use mapValues and take the average:
val appRatings = mapOf(
"Calendar Pro" to arrayOf(1, 5, 5, 4, 2, 1, 5, 4),
"The Messenger" to arrayOf(5, 4, 2, 5, 4, 1, 1, 2),
"Socialise" to arrayOf(2, 1, 2, 2, 1, 2, 4, 2)
)
val averages = appRatings.mapValues { (_, v) -> v.average() }
println(averages)
Output:
{Calendar Pro=3.375, The Messenger=3.0, Socialise=2.0}
You can do this with the mapValues function:
val appRatings = mapOf(
"Calendar Pro" to arrayOf(1, 5, 5, 4, 2, 1, 5, 4),
"The Messenger" to arrayOf(5, 4, 2, 5, 4, 1, 1, 2),
"Socialise" to arrayOf(2, 1, 2, 2, 1, 2, 4, 2)
)
val ratingsAverage = appRatings.mapValues { it.value.average() }
You already got some answers (including literally from JetBrains?? nice) but just to clear up the forEach thing:
forEach is a "do something with each item" function that returns nothing (well, Unit) - it's terminal, the last thing you can do in a chain, because it doesn't return a value to do anything else with. It's basically a for loop, and it's about side effects, not transforming the collection that was passed in and producing different data.
onEach is similar, except it returns the original collection - so you call onEach on a collection, you get the same collection as a result. So this one isn't terminal, and you can pop it in a function chain to do something with the current set of values, without altering them.
map is your standard "transform items into other items" function - if you want to put a collection in and get a different collection out (like transforming arrays of Ints into single Double averages) then you want map. (The name comes from mapping values onto other values, translating them - which is why you always get the same number of items out as you put in.) On a Map, map returns a List, so mapValues is the variant that keeps the keys and transforms only the values.
Suppose I have the array [1,2,3,4,5].
I want to "add" the array [2,4,6,8] to it so I get
[[3,5,7,9],
[4,6,8,10],
[5,7,9,11],
[6,8,10,12],
[7,9,11,13]]
(or its transpose).
There is probably a function for this but I can't seem to find it because I'm not sure what to search for.
As suggested by @Divakar, the best way is to use add.outer:
import numpy as np
a1 = np.array([1,2,3,4,5])
a2 = np.array([2,4,6,8])
np.add.outer(a1,a2)
But you can also explicitly reshape your a1 array into a column, so that it broadcasts against a2 when you add them:
a1[:,None]+a2
# essentially equivalent to:
# a1.reshape(-1,1)+a2
Both produce:
array([[ 3, 5, 7, 9],
[ 4, 6, 8, 10],
[ 5, 7, 9, 11],
[ 6, 8, 10, 12],
[ 7, 9, 11, 13]])
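A minimal sketch of why the two forms agree, assuming NumPy is installed. The variable names outer, broadcast and transposed are only for illustration. Indexing with None turns a1 into a column of shape (5, 1), which broadcasts against a2 of shape (4,) to give a (5, 4) result.

```python
import numpy as np

a1 = np.array([1, 2, 3, 4, 5])
a2 = np.array([2, 4, 6, 8])

# Outer sum: every element of a1 added to every element of a2.
outer = np.add.outer(a1, a2)

# a1[:, None] has shape (5, 1); adding a2 with shape (4,) broadcasts to (5, 4).
broadcast = a1[:, None] + a2

# Swapping which array becomes the column gives the transpose, shape (4, 5).
transposed = a1 + a2[:, None]

print(outer.shape, broadcast.shape, transposed.shape)
print(np.array_equal(outer, broadcast))
```

Which one to use is mostly a matter of taste: add.outer states the intent directly, while the broadcasting form generalizes to other operators.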
I'm learning Bayesian data analysis. I am trying to replicate Trond Reitan's tutorials, which were originally written for WinBUGS, in Stan.
Specifically, I have the following data and model:
weta.windata<-list(numdet=c(0, 0, 1, 0, 0, 0, 0, 0, 0, 2, 1, 1, 2, 0, 3, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 1, 0, 3, 1, 1, 3, 1, 1, 2, 0, 2, 1, 1, 1, 1,0, 0, 0, 2, 0, 2, 4, 3, 1, 0, 0, 2, 0, 2, 2, 1, 0, 0, 1),
numvisit=c(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 3, 3, 4, 4, 4, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4,4, 4, 4, 4, 4, 4, 4 ,4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3),
nsites=72)
model_string1="
data{
int nsites;
real<lower=0> numdet[nsites];
real<lower=0> numvisit[nsites];
}
parameters{
real<lower=0> p;
real<lower=0> psi;
int<lower=0> z[nsites];
}
model{
p~uniform(0,1);
psi~uniform(0,1);
for(i in 1:nsites){
z[i]~ bernoulli(psi);
p.site[i]~z[i]*p;
numdet[i]~binomial(numvisit[i],p.site[i]);
}
}
"
mcmc_samples <- stan(model_code=model_string1,
data=weta.windata,
pars=c("p","psi","z"),
chains=3, iter=30000, warmup=10000)
The context is detecting wetas in fields. There are 72 sites. For each site, researchers visited several times (numvisit) and recorded the number of visits on which weta were found (numdet).
There is a latent variable z describing whether a site has weta or not. psi is the probability that a site has weta. p is the detection rate.
The problem I have is that I cannot declare z as an integer:
parameters or transformed parameters cannot be integer or integer array; found declared type int, parameter name=z
Problem with declaration.
However, if I set z to be real, that is,
real<lower=0> z[nsites];
the problem becomes that a real variable cannot be given a bernoulli distribution:
No matches for:
real ~ bernoulli(real)
I'm very new to Stan. Forgive me if this question is very silly.
Stan doesn't support integer parameters or hacks to let you pretend real variables are integers. What it does support is marginalizing the integer variables out of the density. You can then reconstruct them with much more efficiency and much higher tail resolution.
The chapter in the manual on latent discrete parameters is the place to start. It includes an implementation of the CJS population models, which may be familiar. I implemented the Dorazio and Royle occupancy models as a case study and Hiroki Ito translated the entire Kéry and Schaub book to Stan. They're all linked under users >> documentation on the web site.
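To illustrate what marginalizing z out means for the weta model, here is a rough sketch in plain Python rather than Stan. The function names are made up for this example, and it assumes 0 < p < 1 and 0 < psi < 1. For a site with at least one detection, z must be 1; for a site with none, the likelihood sums over both values of z. In Stan, the same sum would typically be written on the log scale in the model block.

```python
import math

def site_log_lik(numdet, numvisit, p, psi):
    """Log-likelihood of one site with the occupancy indicator z summed out."""
    # Binomial log-probability of the detections, given the site is occupied (z = 1).
    log_binom = (
        math.log(math.comb(numvisit, numdet))
        + numdet * math.log(p)
        + (numvisit - numdet) * math.log1p(-p)
    )
    if numdet > 0:
        # At least one detection means the site must be occupied.
        return math.log(psi) + log_binom
    # No detections: occupied but missed on every visit, or not occupied at all.
    return math.log(psi * math.exp(log_binom) + (1.0 - psi))

def prob_occupied_given_no_detections(numvisit, p, psi):
    """Posterior probability that z = 1 at a site with zero detections."""
    missed = psi * (1.0 - p) ** numvisit
    return missed / (missed + (1.0 - psi))

# Example call with arbitrary illustrative parameter values.
print(site_log_lik(0, 4, p=0.3, psi=0.6))
print(prob_occupied_given_no_detections(4, p=0.3, psi=0.6))
```

The second function is the "reconstruct them" step: once p and psi have been sampled, the probability that each undetected site is occupied follows directly, with no need to sample z.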
I ran into this mysterious error with ulam while answering practice problems in Statistical Rethinking. When you're constructing a list to pass to the data argument of ulam, be sure to use = rather than <- for assignment. If you don't, the list you construct won't have named components, and a missing name produces this error.
Googled around a bit and couldn't seem to find anything on this.
Is there an option to access data in a pandas data frame using "not index"?
So something like
df_index = asdf = pandas.MultiIndex(levels=[
['2014-10-19', '2014-10-20', '2014-10-21', '2014-10-22', '2014-10-30'],
[u'after_work', u'all_day', u'breakfast', u'lunch', u'mid_evening']],
labels=[[0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 4, 4, 4, 4],
[4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 2, 0, 1, 3, 4]],
names=[u'start_date', u'time_group'])
And then I would like to be able to call the following to get everything not in df_index
df.ix[~df_index]
I know you can do it for logical indexing within pandas. Just curious if I could do it using an Index Object
You can use df.drop(df_index, errors="ignore").
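A small sketch of how this behaves, assuming pandas is installed. The frame, its value column and the index entries are invented for illustration and are not the asker's data. The isin mask is an alternative I am adding for comparison; it is not part of the answer above.

```python
import pandas as pd

# Hypothetical frame with a two-level index (level names taken from the question).
full_index = pd.MultiIndex.from_tuples(
    [
        ("2014-10-19", "mid_evening"),
        ("2014-10-20", "after_work"),
        ("2014-10-20", "lunch"),
        ("2014-10-21", "breakfast"),
    ],
    names=["start_date", "time_group"],
)
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=full_index)

# Rows to exclude; the last entry does not exist in df on purpose.
df_index = pd.MultiIndex.from_tuples(
    [
        ("2014-10-20", "after_work"),
        ("2014-10-21", "breakfast"),
        ("2014-10-30", "lunch"),
    ],
    names=["start_date", "time_group"],
)

# errors="ignore" skips labels that are not present instead of raising KeyError.
dropped = df.drop(df_index, errors="ignore")

# Alternative: a boolean mask, which is closer to the "~" syntax in the question.
masked = df[~df.index.isin(df_index)]

print(dropped)
```

Without errors="ignore", drop raises a KeyError as soon as df_index contains a label that is missing from df.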
My winbugs code is as follows:
model
{
for (i in 1:N) {
logit(p[i]) <- alpha + beta*x[i]
y[i]~ dbin(p[i], n[i])
}
alpha~ dnorm(0,0.000001)
beta~ dnorm(0,0.000001)
pbeta<-step(beta-0)
}
list(N=20,
n=c(6, 7, 6, 8, 8, 5, 6, 6, 5, 8, 6, 5, 7, 6, 6, 7,6 , 6, 7, 3),
y=c(0,2,6,2,2,1,3,6,2,3,4,3,7,0,1,0,0,1,1,2),
x=c(25.7, 32.3, 49.6, 35.2, 35.9, 33.2, 39.8, 51.3, 32.9, 40.9,
43.6, 42.5, 50.4, 36.5, 34.1, 31.3, 28.3, 36.5, 37.4, 40.6))
list(alpha=0.1, beta=0.2)
After running this code, I have the posterior distributions of alpha and beta. Now I want to see P(beta>0). They said that I can use pbeta <- step(beta) (pbeta is treated like a dummy variable: 0 if beta<0 and 1 if beta>=0, so its posterior mean estimates that probability). But when I put it in the model it gave me an error message.
I can't see anything wrong with your code, and it runs for me. What was the error message?
On an unrelated matter, your MCMC chains should converge more efficiently if you center the covariate values around their mean.
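A rough Python sketch of both ideas, outside WinBUGS. The helper names are hypothetical, and beta_draws is a made-up list used only to show the arithmetic, not output from this model. Centering subtracts the mean of x, so alpha becomes the log-odds at the average x. The step trick amounts to averaging a 0/1 indicator over the posterior draws.

```python
# Covariate values from the data list in the question.
x = [25.7, 32.3, 49.6, 35.2, 35.9, 33.2, 39.8, 51.3, 32.9, 40.9,
     43.6, 42.5, 50.4, 36.5, 34.1, 31.3, 28.3, 36.5, 37.4, 40.6]

def center(values):
    """Subtract the mean so the centered values average to zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def step(value):
    """Same convention as the BUGS step function: 1 if value >= 0, else 0."""
    return 1 if value >= 0 else 0

def prob_nonnegative(draws):
    """Monte Carlo estimate of P(parameter >= 0) from posterior draws."""
    return sum(step(d) for d in draws) / len(draws)

x_centered = center(x)

# Made-up draws purely for illustration; real draws come from the sampler.
beta_draws = [0.12, 0.30, -0.05, 0.21]
print(prob_nonnegative(beta_draws))
```

In the model itself, centering would mean passing x_centered as x, or writing beta*(x[i] - mean(x[])) inside the logit line.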