Kotlin: Applicatives with Extension Functions

From what I understand, applicatives are classes that implement an apply method, but I've also seen a version written with extension functions. This is what it should look like:
fun <T, R> List<T>.ap(fab: List<(T) -> R>): List<R> = fab.flatMap { f -> this.map(f) }
And, when I am testing it with:
fun main() {
    val numbers = listOf(75, 454, 7, 45, 45, 56, 75)
    val functions = listOf<(Int) -> Int>({ i -> i * 2 }, { i -> i + 3 })
    val result = numbers.ap(functions).joinToString()
    println(result)
}
The output is:
150, 908, 14, 90, 90, 112, 150, 78, 457, 10, 48, 48, 59, 78
But the expected output is:
153, 911, 17, 93, 93, 115, 153, 81, 460, 13, 51, 51, 62, 81
Basically, I am applying a list of functions to a normal list; that's what it should do. From what I observed, my applicative did its job only for the first function, but not for the second... How can I get the expected result using applicatives? I would like to keep my list of functions as it is, or at least keep it as similar as possible.

Related

How can I convert a ByteArray to a String in Kotlin other than with String()?

[51, -42, 119, -85, -64, 126, 22, 127, -72, 72, 48, -66, -18, 45, 99, -119]
This is the ByteArray that I want to print as a String.
When I searched on the internet, I found that
String(Bytes, Charsets.UTF_8)
would convert it to String.
However, I get �؉���Q�t, which doesn't seem to be the right conversion.
Why is that?
I want it to be a String of alphabetic characters and numbers.
Firstly, you are specifying an array of signed bytes (indicated by negative numbers):
51, -42, 119, -85, -64, 126, 22, 127, -72, 72, 48, -66, -18, 45, 99, -119
Let's take a look at what this would hypothetically look like if it were unsigned (I used this tool for the conversion):
51, 214, 119, 171, 192, 126, 22, 127, 184, 72, 48, 190, 238, 45, 99, 137
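If you want to do that conversion in Kotlin itself rather than with an external tool, a minimal sketch (using the same byte values) could be:
fun main() {
    val bytes = byteArrayOf(51, -42, 119, -85, -64, 126, 22, 127, -72, 72, 48, -66, -18, 45, 99, -119)
    // Masking with 0xFF reinterprets each signed byte as its unsigned 0..255 value
    val unsigned = bytes.map { it.toInt() and 0xFF }
    println(unsigned) // [51, 214, 119, 171, 192, 126, 22, 127, 184, 72, 48, 190, 238, 45, 99, 137]
}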
Assuming by "Alphabet characters and numbers", you mean the English alphabet, then asciitable will help you identify each character's decimal value, but as a rough guide:
"0"-"9" = 48-57
"A"-"Z" = 65-90
"a"-"z" = 97-122
Consider the following code sample:
fun main() {
    val bytes = byteArrayOf(51, -42, 119, -85, -64, 126, 22, 127, -72, 72, 48, -66, -18, 45, 99, -119)
    val string = bytes.toString(Charsets.US_ASCII)
    println(string)
}
As you can see, some of the values in the unsigned array fall outside the ranges for English alphabetic characters and numbers, which is why you end up with a string like "3�w��~�H0��-c�", depending on the charset you choose.
For reference:
Charset             → Result
Charsets.US_ASCII   → 3�w��~�H0��-c�
Charsets.UTF_8      → 3�w��~�H0��-c�
Charsets.UTF_16     → ㏖瞫쁾ᙿ롈ゾ掉
Charsets.UTF_32     → ����
Charsets.ISO_8859_1 → 3Öw«À~¸H0¾î-c
So, it really depends on exactly which encoding the array is using, and exactly what it is you're expecting the resulting string to be.
You can play with the code above, here.
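If you'd like to reproduce the comparison table yourself, something along these lines should work:
fun main() {
    val bytes = byteArrayOf(51, -42, 119, -85, -64, 126, 22, 127, -72, 72, 48, -66, -18, 45, 99, -119)
    // Decode the same bytes with several charsets and print each result
    val charsets = listOf(Charsets.US_ASCII, Charsets.UTF_8, Charsets.UTF_16, Charsets.UTF_32, Charsets.ISO_8859_1)
    for (charset in charsets) {
        println("$charset -> ${bytes.toString(charset)}")
    }
}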

The function must accept the result of another function as an argument (Kotlin)

The result of the gen function must be passed as an argument to the res function.
The result of the res function should be the even numbers that came out of the first function.
fun gen(): List<Int> {
    val numbers = List(10) { Random.nextInt(1, 100) }
    return numbers.filter { it > 0 }
}
fun res() { ... }
From your question, it seems like you're trying to create a list of random numbers and then pick the even numbers from the generated list.
Most probably this should be the implementation:
import kotlin.random.Random

fun gen(): List<Int> = List(10) { Random.nextInt(1, 100) }
fun res(list: List<Int>) = list.filter { it % 2 == 0 }

// somewhere else
val generated = gen()
println(generated)
println(res(generated))
Sample output:
[44, 57, 64, 96, 30, 93, 92, 23, 58, 26]
[44, 64, 96, 30, 92, 58, 26]

Kotlin add carriage return into multiline string

In Kotlin, when I build a multiline string like this:
val expected = """
|digraph Test {
|${'\t'}Empty1;
|${'\t'}Empty2;
|}
|""".trimMargin()
I see that the string lacks carriage return characters (ASCII code 13) when I output it via:
println("Expected bytes")
println(expected.toByteArray().contentToString())
Output:
Expected bytes
[100, 105, 103, 114, 97, 112, 104, 32, 84, 101, 115, 116, 32, 123, 10, 9, 69, 109, 112, 116, 121, 49, 59, 10, 9, 69, 109, 112, 116, 121, 50, 59, 10, 125, 10]
When some code I'm trying to unit test builds the same String via a PrintWriter, it delineates lines via the lineSeparator property:
/*
* Line separator string. This is the value of the line.separator
* property at the moment that the stream was created.
*/
So I end up with a string which looks the same in output, but is composed of different bytes and thus is not equal:
Actual bytes
[100, 105, 103, 114, 97, 112, 104, 32, 84, 101, 115, 116, 32, 123, 13, 10, 9, 69, 109, 112, 116, 121, 49, 59, 13, 10, 9, 69, 109, 112, 116, 121, 50, 59, 13, 10, 125, 13, 10]
Is there a better way to address this during string declaration than splitting my multiline string into concatenated stringlets which can each be suffixed with char(13)?
Alternately, I'd like to do something like:
val expected = """
|digraph Test {
|${'\t'}Empty1;
|${'\t'}Empty2;
|}
|""".trimMargin().useLineSeparator(System.getProperty("line.separator"))
or .replaceAll() or such.
Does any standard method exist, or should I add my own extension function to String?
This did the trick: System.lineSeparator()
Kotlin multiline strings are always compiled into string literals which use \n as the line separator. If you need to have the platform-dependent line separator, you can do replace("\n", System.getProperty("line.separator")).
As of Kotlin 1.2, there is no standard library method for this, so you should define your own extension function if you're using this frequently.
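For example, a minimal sketch of such an extension (named useLineSeparator purely to mirror the hypothetical call in the question; it is not a standard-library function) could be:
// Hypothetical extension mirroring the useLineSeparator call from the question.
// Kotlin multiline literals compile to \n, so we swap in the requested separator.
fun String.useLineSeparator(separator: String = System.lineSeparator()): String =
    replace("\n", separator)

val expected = """
    |digraph Test {
    |${'\t'}Empty1;
    |${'\t'}Empty2;
    |}
    |""".trimMargin().useLineSeparator()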

Performance decrease with function call

For the following function:
func CycleClock(c *ballclock.Clock) int {
    for i := 0; i < fiveMinutesPerDay; i++ {
        c.TickFive()
    }
    return 1 + CalculateBallCycle(append([]int{}, c.BallQueue...))
}
where c.BallQueue is defined as []int and CalculateBallCycle is defined as func CalculateBallCycle(s []int) int. I am seeing a huge performance gap between the for loop and the return statement.
I wrote the following benchmarks to test. The first benchmarks the entire function, the second benchmarks the for loop, while the third benchmarks the CalculateBallCycle function:
func BenchmarkCycleClock(b *testing.B) {
    for i := ballclock.MinBalls; i <= ballclock.MaxBalls; i++ {
        j := i
        b.Run("BallCount="+strconv.Itoa(i), func(b *testing.B) {
            for n := 0; n < b.N; n++ {
                c, _ := ballclock.NewClock(j)
                CycleClock(c)
            }
        })
    }
}
func BenchmarkCycle24(b *testing.B) {
    for i := ballclock.MinBalls; i <= ballclock.MaxBalls; i++ {
        j := i
        b.Run("BallCount="+strconv.Itoa(i), func(b *testing.B) {
            for n := 0; n < b.N; n++ {
                c, _ := ballclock.NewClock(j)
                for k := 0; k < fiveMinutesPerDay; k++ {
                    c.TickFive()
                }
            }
        })
    }
}
func BenchmarkCalculateBallCycle123(b *testing.B) {
    m := []int{8, 62, 42, 87, 108, 35, 17, 6, 22, 75, 116, 112, 39, 119, 52, 60, 30, 88, 56, 36, 38, 26, 51, 31, 55, 120, 33, 99, 111, 24, 45, 21, 23, 34, 43, 41, 67, 65, 66, 85, 82, 89, 9, 25, 109, 47, 40, 0, 83, 46, 73, 13, 12, 63, 15, 90, 121, 2, 69, 53, 28, 72, 97, 3, 4, 94, 106, 61, 96, 18, 80, 74, 44, 84, 107, 98, 93, 103, 5, 91, 32, 76, 20, 68, 81, 95, 29, 27, 86, 104, 7, 64, 113, 78, 105, 58, 118, 117, 50, 70, 10, 101, 110, 19, 1, 115, 102, 71, 79, 57, 77, 122, 48, 114, 54, 37, 59, 49, 100, 11, 14, 92, 16}
    for n := 0; n < b.N; n++ {
        CalculateBallCycle(m)
    }
}
Using 123 balls, this gives the following result:
BenchmarkCycleClock/BallCount=123-8 200 9254136 ns/op
BenchmarkCycle24/BallCount=123-8 200000 7610 ns/op
BenchmarkCalculateBallCycle123-8 3000000 456 ns/op
Looking at this, there is a huge disparity between benchmarks. I would expect that the first benchmark would take roughly ~8000 ns/op since that would be the sum of the parts.
Here is the github repository.
EDIT:
I discovered that the result from the benchmark and the result from the running program are widely different. I took what @yazgazan found and modified the benchmark function in main.go to somewhat mimic BenchmarkCalculateBallCycle123 from main_test.go:
func Benchmark() {
    for i := ballclock.MinBalls; i <= ballclock.MaxBalls; i++ {
        if i != 123 {
            continue
        }
        start := time.Now()
        t := CalculateBallCycle([]int{8, 62, 42, 87, 108, 35, 17, 6, 22, 75, 116, 112, 39, 119, 52, 60, 30, 88, 56, 36, 38, 26, 51, 31, 55, 120, 33, 99, 111, 24, 45, 21, 23, 34, 43, 41, 67, 65, 66, 85, 82, 89, 9, 25, 109, 47, 40, 0, 83, 46, 73, 13, 12, 63, 15, 90, 121, 2, 69, 53, 28, 72, 97, 3, 4, 94, 106, 61, 96, 18, 80, 74, 44, 84, 107, 98, 93, 103, 5, 91, 32, 76, 20, 68, 81, 95, 29, 27, 86, 104, 7, 64, 113, 78, 105, 58, 118, 117, 50, 70, 10, 101, 110, 19, 1, 115, 102, 71, 79, 57, 77, 122, 48, 114, 54, 37, 59, 49, 100, 11, 14, 92, 16})
        duration := time.Since(start)
        _ = t // result unused in this snippet; keeps the compiler happy
        fmt.Printf("Ballclock with %v balls took %s;\n", i, duration)
    }
}
This gave the output of:
Ballclock with 123 balls took 11.86748ms;
As you can see, the total time was 11.86 ms, all of which was spent in the CalculateBallCycle function. What would cause the benchmark to run at 456 ns/op while the running program takes around 11,867,480 ns per call?
You wrote that CalculateBallCycle() modifies the slice by design.
I can't speak to the correctness of that approach, but it is why the benchmark time of BenchmarkCalculateBallCycle123 is so different.
On the first run it does the expected thing, but on subsequent runs it does something completely different, because you're passing different data as input.
Benchmark this modified code:
func BenchmarkCalculateBallCycle123v2(b *testing.B) {
    m := []int{8, 62, 42, 87, 108, 35, 17, 6, 22, 75, 116, 112, 39, 119, 52, 60, 30, 88, 56, 36, 38, 26, 51, 31, 55, 120, 33, 99, 111, 24, 45, 21, 23, 34, 43, 41, 67, 65, 66, 85, 82, 89, 9, 25, 109, 47, 40, 0, 83, 46, 73, 13, 12, 63, 15, 90, 121, 2, 69, 53, 28, 72, 97, 3, 4, 94, 106, 61, 96, 18, 80, 74, 44, 84, 107, 98, 93, 103, 5, 91, 32, 76, 20, 68, 81, 95, 29, 27, 86, 104, 7, 64, 113, 78, 105, 58, 118, 117, 50, 70, 10, 101, 110, 19, 1, 115, 102, 71, 79, 57, 77, 122, 48, 114, 54, 37, 59, 49, 100, 11, 14, 92, 16}
    for n := 0; n < b.N; n++ {
        tmp := append([]int{}, m...)
        CalculateBallCycle(tmp)
    }
}
This works around the behavior by making a copy of m, so that CalculateBallCycle modifies a local copy.
The running time becomes more like the others:
BenchmarkCalculateBallCycle123-8 3000000 500 ns/op
BenchmarkCalculateBallCycle123v2-8 100 10483347 ns/op
In your CycleClock function, you are copying the c.BallQueue slice. You can improve performance significantly by using CalculateBallCycle(c.BallQueue) instead (assuming CalculateBallCycle doesn't modify the slice)
For example:
func Sum(values []int) int {
    sum := 0
    for _, v := range values {
        sum += v
    }
    return sum
}
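// Note: m is assumed to be a package-level []int here, e.g. the
// 123-element slice used in the benchmarks above.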
func BenchmarkNoCopy(b *testing.B) {
    for n := 0; n < b.N; n++ {
        Sum(m)
    }
}
func BenchmarkWithCopy(b *testing.B) {
    for n := 0; n < b.N; n++ {
        Sum(append([]int{}, m...))
    }
}
// BenchmarkNoCopy-4 20000000 73.5 ns/op
// BenchmarkWithCopy-4 5000000 306 ns/op
// PASS
There is a subtle bug in your tests.
Both methods, BenchmarkCycleClock and BenchmarkCycle24, run the benchmark in a for loop, passing a closure to b.Run. Inside those closures you initialize the clocks using the loop variable i, like this: ballclock.NewClock(i).
The problem is that all instances of your anonymous function share the same variable. By the time the function is run by the test runner, the loop will have finished, and all of the clocks will be initialized with the same value: ballclock.MaxBalls.
You can fix this using a local variable:
for i := ballclock.MinBalls; i <= ballclock.MaxBalls; i++ {
    i := i
    b.Run("BallCount="+strconv.Itoa(i), func(b *testing.B) {
        for n := 0; n < b.N; n++ {
            c, _ := ballclock.NewClock(i)
            CycleClock(c)
        }
    })
}
The line i := i stores a copy of the current value of i (different for each instance of your anonymous function).

Pymc size / indexing issue

I am trying to model Kruschke's "filtration-condensation experiment" with pymc 2.3.5 (numpy 1.10.1).
Basically there are:
4 groups
each group has 40 individuals
each individual has 64 Bernoulli trials (correct/incorrect)
What I am modeling:
each individual's results follow a Binomial distribution (e.g. 45 correct out of 64).
my belief about each individual's performance is a Beta distribution.
this Beta distribution is influenced by the group to which the individual belongs (through the parameters A = mu*kappa and B = (1-mu)*kappa).
my belief about how strong each group's influence is follows a Gamma distribution (the kappa variable).
my belief about each group's average is a Beta distribution (the mu variable).
The problem:
when I do the modeling with size= parameters, pymc gets lost
when I separate each distribution manually (no size=), pymc does a good job
I include the code below:
Data
import numpy as np
import seaborn as sns
import pymc as pm
from pymc.Matplot import plot as mcplot
%matplotlib inline
# Data
ncond = 4
nSubj = 40
trials = 64
N = np.repeat([trials], (ncond * nSubj))
z = np.array([45, 63, 58, 64, 58, 63, 51, 60, 59, 47, 63, 61, 60, 51, 59, 45,
61, 59, 60, 58, 63, 56, 63, 64, 64, 60, 64, 62, 49, 64, 64, 58, 64, 52, 64, 64,
64, 62, 64, 61, 59, 59, 55, 62, 51, 58, 55, 54, 59, 57, 58, 60, 54, 42, 59, 57,
59, 53, 53, 42, 59, 57, 29, 36, 51, 64, 60, 54, 54, 38, 61, 60, 61, 60, 62, 55,
38, 43, 58, 60, 44, 44, 32, 56, 43, 36, 38, 48, 32, 40, 40, 34, 45, 42, 41, 32,
48, 36, 29, 37, 53, 55, 50, 47, 46, 44, 50, 56, 58, 42, 58, 54, 57, 54, 51, 49,
52, 51, 49, 51, 46, 46, 42, 49, 46, 56, 42, 53, 55, 51, 55, 49, 53, 55, 40, 46,
56, 47, 54, 54, 42, 34, 35, 41, 48, 46, 39, 55, 30, 49, 27, 51, 41, 36, 45, 41,
53, 32, 43, 33])
condition = np.repeat([0,1,2,3], nSubj)
Does not work
# modeling
mu = pm.Beta('mu', 1, 1, size=ncond)
kappa = pm.Gamma('gamma', 1, 0.1, size=ncond)
# Prior
theta = pm.Beta('theta', mu[condition] * kappa[condition], (1 - mu[condition]) * kappa[condition], size=len(z))
# likelihood
y = pm.Binomial('y', p=theta, n=N, value=z, observed=True)
# model
model = pm.Model([mu, kappa, theta, y])
mcmc = pm.MCMC(model)
#mcmc.use_step_method(pm.Metropolis, mu)
#mcmc.use_step_method(pm.Metropolis, theta)
#mcmc.assign_step_methods()
mcmc.sample(100000, burn=20000, thin=3)
# outputs never converge and vary between new simulations
mcplot(mcmc.trace('mu'), common_scale=False)
Works
z1 = z[:40]
z2 = z[40:80]
z3 = z[80:120]
z4 = z[120:]
Nv = N[:40]
mu1 = pm.Beta('mu1', 1, 1)
mu2 = pm.Beta('mu2', 1, 1)
mu3 = pm.Beta('mu3', 1, 1)
mu4 = pm.Beta('mu4', 1, 1)
kappa1 = pm.Gamma('gamma1', 1, 0.1)
kappa2 = pm.Gamma('gamma2', 1, 0.1)
kappa3 = pm.Gamma('gamma3', 1, 0.1)
kappa4 = pm.Gamma('gamma4', 1, 0.1)
# Prior
theta1 = pm.Beta('theta1', mu1 * kappa1, (1 - mu1) * kappa1, size=len(Nv))
theta2 = pm.Beta('theta2', mu2 * kappa2, (1 - mu2) * kappa2, size=len(Nv))
theta3 = pm.Beta('theta3', mu3 * kappa3, (1 - mu3) * kappa3, size=len(Nv))
theta4 = pm.Beta('theta4', mu4 * kappa4, (1 - mu4) * kappa4, size=len(Nv))
# likelihood
y1 = pm.Binomial('y1', p=theta1, n=Nv, value=z1, observed=True)
y2 = pm.Binomial('y2', p=theta2, n=Nv, value=z2, observed=True)
y3 = pm.Binomial('y3', p=theta3, n=Nv, value=z3, observed=True)
y4 = pm.Binomial('y4', p=theta4, n=Nv, value=z4, observed=True)
# model
model = pm.Model([mu1, kappa1, theta1, y1, mu2, kappa2, theta2, y2,
mu3, kappa3, theta3, y3, mu4, kappa4, theta4, y4])
mcmc = pm.MCMC(model)
#mcmc.use_step_method(pm.Metropolis, mu)
#mcmc.use_step_method(pm.Metropolis, theta)
#mcmc.assign_step_methods()
mcmc.sample(100000, burn=20000, thin=3)
# outputs converge and do not differ much between simulations
mcplot(mcmc.trace('mu1'), common_scale=False)
mcplot(mcmc.trace('mu2'), common_scale=False)
mcplot(mcmc.trace('mu3'), common_scale=False)
mcplot(mcmc.trace('mu4'), common_scale=False)
mcmc.summary()
Can someone please explain to me why mu[condition] and gamma[condition] do not work? :)
I guess that not splitting the thetas into different variables is the problem, but I cannot understand why. Maybe there is a way to pass a shape parameter to size= on theta?
First of all, I can confirm that the first version doesn't lead to stable results. What I can't confirm is that the second one is much better; I have seen very different results also with the second code, with values for the first mu parameter varying between 0.17 and 0.9 for different runs.
The convergence problems can be cured by using good starting values for the Markov chain. This can be done by first doing a maximum a posteriori (MAP) estimate, and then starting the Markov chain from there. The MAP step is computationally inexpensive and leads to a converging Markov chain with reproducible results for both variants of your code. For reference and comparison: The values I see for the four mu parameters are around 0.94 / 0.86 / 0.72 and 0.71.
You can do the MAP estimation by inserting the following two lines of code right after the line in which you define your model with "model=pm.Model(...":
map_ = pm.MAP(model)
map_.fit()
This technique is covered in more detail in Cameron Davidson-Pilon's Bayesian Methods for Hackers, together with other helpful topics around PyMC.