I've got a collection of "stuff", and I'd like to sum it into smaller buckets. (In my particular case, I'm downsampling a luma channel of an image by 8x.)
I'd like it to be as fast as possible on an average multi-core Android device, which I think means one coroutine per bucket (there isn't any reason to play with IntAdders if I don't have to).
The naive linear solution:
val SCALE = 8
image.planes[0].buffer.toByteArray().forEachIndexed { index, byte ->
val x1 = index % image.width
val y1 = index / image.width
val x2 = x1 / SCALE
val y2 = y1 / SCALE
val quadIdx = y2 * (image.width / SCALE) + x2
summedQuadLum[quadIdx] += (byte.toInt() and 0xFF)
}
That isn't great: it needs the summedQuadLum collection to be declared in advance, and it leaves no room for parallel work.
I'd love to use groupBy (or groupingBy? or aggregate?), but those all seem to use the values to determine the new keys, and I need to use the key (the index) to determine the new keys. I think the least overhead is withIndex, which could be done as
val thumbSums = bufferArray.withIndex().groupingBy { (idx, _) ->
val x1 = idx % previewImageDimension.width
val y1 = idx / previewImageDimension.width
val x2 = x1 / SCALE
val y2 = y1 / SCALE
y2 * (previewImageDimension.width / SCALE) + x2
}.aggregate { _, acc: Int?, (_, lum), _ ->
(acc ?: 0) + (lum.toInt() and 0xFF)
}.values.toIntArray()
Much better, it is really close - if I could figure out how to sum each bucket in a coroutine, I think it would be as good as can be expected.
After groupingBy we have a Grouping object, which we can use to aggregate values. It's important to note that no grouping has been done yet: we only have a description of how to group the values, plus an iterator over the original array. From here we have a few options:
Create a Channel from the iterator and launch a few worker coroutines to consume it in parallel. Channels support fan-out, so every item in the source is processed by exactly one worker. The problem is that all the workers need to update different items in the resulting array, so synchronization is required, and that's where it gets tricky and likely inefficient.
To keep multiple workers from writing to the same item, we need to tell each of them which items to process. That means either every worker scans all the items and picks only the ones it owns, or we precalculate the groups in advance and feed them to the workers. Both approaches perform about the same as the serial algorithm, so neither makes sense.
So to parallelize efficiently we want to avoid shared mutable state, because it requires synchronization, and we don't want to precalculate the groups either.
My suggestion is to come at it from the other side: instead of mapping the original array to the sampled one, map the sampled array back to the original. That is, each pixel of the sampled image is the average of the SCALE x SCALE block of original pixels it covers.
This approach lets each value be calculated independently by one worker, so no synchronization is needed. Now we can implement it like this:
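import kotlinx.coroutines.*
import kotlin.math.min

// bufferArray, width and SCALE are expected to be defined by the surrounding code.
suspend fun sample(): Image = coroutineScope {
    val asyncFactor = 8
    val src = Image(bufferArray, width)
    val dst = Image(src.width / SCALE, src.height / SCALE)
    // Round up so the last chunk also covers any remainder.
    val chunkSize = (dst.sizeBytes + asyncFactor - 1) / asyncFactor
    val jobs = Array(asyncFactor) { idx ->
        // async needs a CoroutineScope; coroutineScope { } provides one.
        async(Dispatchers.Default) {
            val chunkStartIdx = chunkSize * idx
            val chunkEndIdxExclusive = min(chunkStartIdx + chunkSize, dst.sizeBytes)
            calculateSampledImageForIndexes(src, dst, chunkStartIdx, chunkEndIdxExclusive, SCALE)
        }
    }
    awaitAll(*jobs)
    dst // return the sampled image instead of discarding it
}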
private fun calculateSampledImageForIndexes(src: Image, dst: Image, startIdx: Int, exclusiveEndIdx: Int, scaleFactor: Int) {
for (i in startIdx until exclusiveEndIdx) {
val destX = i % dst.width
val destY = i / dst.width
val srcX = destX * scaleFactor
val srcY = destY * scaleFactor
var sum = 0
for (xi in 0 until scaleFactor) {
for (yi in 0 until scaleFactor) {
sum += src[srcX + xi, srcY + yi]
}
}
dst[destX, destY] = sum / (scaleFactor * scaleFactor)
}
}
Where Image is a convenient wrapper around the image data buffer:
class Image(val buffer: ByteArray, val width: Int) {
    val height = buffer.size / width
    val sizeBytes get() = buffer.size
    constructor(w: Int, h: Int) : this(ByteArray(w * h), w)
    // Row-major layout: index = y * width + x. Bytes are read as unsigned luma (0..255).
    operator fun get(x: Int, y: Int): Int = buffer[clampY(y) * width + clampX(x)].toInt() and 0xFF
    operator fun set(x: Int, y: Int, value: Int) {
        buffer[y * width + x] = (value and 0xFF).toByte()
    }
    // Clamp to the last valid coordinate, not to width/height themselves.
    private fun clampX(x: Int) = max(min(x, width - 1), 0)
    private fun clampY(y: Int) = max(min(y, height - 1), 0)
}
Also, with this approach you can easily implement many image-processing functions that are based on the convolution operation, such as blur and edge detection.
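To sanity-check the indexing of the destination-driven mapping, a plain sequential version can serve as a reference. This is only a sketch: downsampleSequential is a hypothetical name, and it assumes a row-major luma buffer whose width and height are multiples of the scale factor.

```kotlin
// Hypothetical reference helper, not part of the answer above.
// Assumes row-major luma bytes and dimensions divisible by scale.
fun downsampleSequential(src: ByteArray, srcWidth: Int, scale: Int): ByteArray {
    val srcHeight = src.size / srcWidth
    val dstWidth = srcWidth / scale
    val dstHeight = srcHeight / scale
    val dst = ByteArray(dstWidth * dstHeight)
    for (i in dst.indices) {
        val destX = i % dstWidth
        val destY = i / dstWidth
        var sum = 0
        for (yi in 0 until scale) {
            for (xi in 0 until scale) {
                val srcIdx = (destY * scale + yi) * srcWidth + (destX * scale + xi)
                sum += src[srcIdx].toInt() and 0xFF // treat the byte as unsigned
            }
        }
        // Each destination pixel is the average of its scale x scale source block.
        dst[i] = (sum / (scale * scale)).toByte()
    }
    return dst
}
```

Because every dst[i] is computed only from src, any split of the index range across workers writes to disjoint cells, which is why the parallel version needs no locking.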
For educational purposes I want to implement the 1-dimensional Perlin Noise algorithm in Kotlin. I familiarized myself with the algorithm here and here.
I think I understood the basic concept, however my implementation can return values greater than 1. I expect the result of the call perlin(x) to be in the range 0 to 1. I can't figure out where I'm mistaken, so maybe someone can point me in the right direction. For simplicity I use simple linear interpolation instead of smoothstep or other advanced techniques for now.
class PerlinNoiseGenerator(seed: Int, private val boundary: Int = 10) {
private var random = Random(seed)
private val noise = DoubleArray(boundary) {
random.nextDouble()
}
fun perlin(x: Double, persistence: Double = 0.5, numberOfOctaves: Int = 8): Double {
var total = 0.0
for (i in 0 until numberOfOctaves) {
val amplitude = persistence.pow(i) // height of the crests
val frequency = 2.0.pow(i) // number of crests per unit distance
val octave = amplitude * noise(x * frequency)
total += octave
}
return total
}
private fun noise(t: Double): Double {
val x = t.toInt()
val x0 = x % boundary
val x1 = if (x0 == boundary - 1) 0 else x0 + 1
val between = t - x
val y0 = noise[x0]
val y1 = noise[x1]
return lerp(y0, y1, between)
}
private fun lerp(a: Double, b: Double, alpha: Double): Double {
return a + alpha * (b - a)
}
}
For example, if you used these randomly generated noise values
private val noise = doubleArrayOf(0.77, 0.02, 0.63, 0.74, 0.49, 0.22, 0.19, 0.76, 0.16, 0.08)
You would end up with an image like this:
where the green line is the calculated Perlin Noise of 8 octaves with a persistence of 0.5. As you can see the sum of all octaves at x=0 for example is greater than 1. (The blue line being the first octave noise(x) and the orange one being the second octave 0.5 * noise(2x)).
What am I doing wrong?
Thanks in advance.
Note: I'm aware that the Simplex Noise algorithm is the successor of Perlin Noise, however for educational purposes I want to implement Perlin Noise first. I'm also aware that my boundary should be set to something in the magnitude of 256 but for simplicity I just used 10 for now.
I've been digging around and found this article which introduces a value to normalize the results returned by Perlin(x). Essentially the amplitudes are summed up and the total is divided by this value. This seems to make sense since we could have "bad luck" and have a y-value of 1.0 in the first octave, followed by a 0.5 in the next, etc. So dividing by the sum of the amplitudes (1.5 in this case with 2 octaves) seems reasonable to keep the values in the range 0 - 1.
However, I'm unsure if this is the preferred way, since none of the other resources use this technique.
The modified code would look like this:
fun perlin(x: Double, persistence: Double = 0.5, numberOfOctaves: Int = 8): Double {
var total = 0.0
var amplitudeSum = 0.0 //used for normalizing results to 0.0 - 1.0
for (i in 0 until numberOfOctaves) {
val amplitude = persistence.pow(i) // height of the crests
val frequency = 2.0.pow(i) // frequency (number of crests per unit distance) doubles per octave
val octave = amplitude * noise(x * frequency)
total += octave
amplitudeSum += amplitude
}
return total / amplitudeSum
}
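As a rough check of the normalization idea, here is a self-contained sketch that reuses the example table from above. The names latticeValues, baseNoise, normalizedOctaves and amplitudeSum are made up for this illustration, and it assumes x >= 0. Since the interpolated base value never leaves the range of the table, dividing by the sum of amplitudes keeps the result within that range too.

```kotlin
import kotlin.math.pow

// Fixed lattice values copied from the example table in the question.
val latticeValues = doubleArrayOf(0.77, 0.02, 0.63, 0.74, 0.49, 0.22, 0.19, 0.76, 0.16, 0.08)

// Linear interpolation between neighbouring lattice values (assumes t >= 0).
fun baseNoise(t: Double): Double {
    val x = t.toInt()
    val x0 = x % latticeValues.size
    val x1 = (x0 + 1) % latticeValues.size
    val alpha = t - x
    return latticeValues[x0] + alpha * (latticeValues[x1] - latticeValues[x0])
}

// Sum of octaves divided by the sum of their amplitudes.
fun normalizedOctaves(x: Double, persistence: Double = 0.5, octaves: Int = 8): Double {
    var total = 0.0
    var amplitudeSum = 0.0
    for (i in 0 until octaves) {
        val amplitude = persistence.pow(i)
        total += amplitude * baseNoise(x * 2.0.pow(i))
        amplitudeSum += amplitude
    }
    return total / amplitudeSum
}

// Closed form of the amplitude sum (geometric series), valid for persistence != 1.
fun amplitudeSum(persistence: Double, octaves: Int): Double =
    (1 - persistence.pow(octaves)) / (1 - persistence)
```

With persistence 0.5 and 8 octaves the amplitudes add up to 1.9921875, so the unnormalized sum at x = 0 is about 0.77 * 1.99 = 1.53, which matches the overshoot described in the question. After normalization the value at x = 0 is back to 0.77.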
I am learning Kotlin and I am facing a challenge here:
How can I draw a square with rounded corners in Kotlin using a canvas, built from two rectangles and four circles, so that it can be resized and its roundness changed anywhere between a perfect square and a perfect ball?
We have this code already:
import pt.isel.canvas.*
private fun Canvas.drawSquare(r: RoundSquare) {
erase()
val f = (r.side/2 * r.round/100f).toInt()
val pos = Position(r.center.x,r.center.y)
val square =
drawRect(pos.x-150, pos.y-100,r.side+100,r.side, r.color)
drawRect(pos.x-100, pos.y-150, r.side, r.side+100, r.color)
drawCircle(pos.x-100, pos.y-100, f, r.color)
drawCircle(pos.x+100, pos.y-100, f, r.color)
drawCircle(pos.x-100, pos.y+100, f, r.color)
drawCircle(pos.x+100, pos.y+100, f, r.color)
return square
}
fun main () {
onStart {
val cv = Canvas(600, 400, WHITE)
var roundSquare = RoundSquare(Position(300, 200), 200, 50, GREEN)
cv.drawSquare(roundSquare)
cv.drawText(10,400,"center=(${roundSquare.center.x},${roundSquare.center.y}) side=${roundSquare.side} round=${roundSquare.round}% color=0x${roundSquare.color.toString(16).padStart(6, '0').toUpperCase()}",BLACK,15)
cv.onMouseDown {
roundSquare = roundSquare.copy(center = Position(it.x, it.y))
cv.drawSquare(roundSquare)
return@onMouseDown cv.drawText(10,390,"center=(${roundSquare.center.x},${roundSquare.center.y}) side=${roundSquare.side} round=${roundSquare.round}% color=0x${roundSquare.color.toString(16).padStart(6, '0').toUpperCase()}",BLACK,15)
}
cv.onKeyPressed {
roundSquare = roundSquare.processKey(it.char)
cv.drawSquare(roundSquare)
return@onKeyPressed cv.drawText(10,400,"center=(${roundSquare.center.x},${roundSquare.center.y}) side=${roundSquare.side} round=${roundSquare.round}% color=0x${roundSquare.color.toString(16).padStart(6, '0').toUpperCase()}",BLACK,15)
}
onFinish { println("Bye") }
}
}
import pt.isel.canvas.BLACK
import pt.isel.canvas.WHITE
data class Position (val x:Int, val y:Int)
data class RoundSquare (val center:Position, val side:Int, val round:Int, val color:Int)
val RANGE_SIZE = 10..400
val ROUND = 0..100
val RANDOM_COLOR = BLACK..WHITE
fun RoundSquare.processKey(key: Char) = when {
key=='r' && round > ROUND.first -> copy(round = round - 1, side = side -1)
key=='R' && round < ROUND.last -> copy(round = round + 1, side = side + 1)
key=='s' && side > RANGE_SIZE.first -> copy(side = side - 1, round = round - 1)
key=='S' && side < RANGE_SIZE.last -> copy(side = side + 1, round = round + 1)
key == 'c' -> copy(color = RANDOM_COLOR.random())
else -> this
}
But it doesn't give me the output I need. This is the output:
Which can be resized till it shows a perfect ball or perfect square, by resizing sides and rounding circles.
If anyone could help me, I would really appreciate it.
Thanks in advance,
Let the center of the rounded shape be (cx, cy) and its half-size be hs.
Left x-coordinate is lx = cx - hs
Top y-coordinate is ty = cy - hs
Right x-coordinate is rx = cx + hs
Bottom y-coordinate is by = cy + hs
We want to vary the parameter t from 0 to 1 (or from 0 to 100%) to get the required roundness.
The circle radius is (round to an integer if needed)
R = hs * t
Circle center coordinates:
lx + R, ty + R
rx - R, ty + R
rx - R, by - R
lx + R, by - R
Opposite corners of the two rectangles:
(lx + R, ty) - (rx - R, by)
(lx, ty + R) - (rx, by - R)
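The formulas above can be sketched in Kotlin roughly as follows. CornerCircle, Rect, RoundedSquareParts and roundedSquareParts are names invented for this illustration, and the rectangles are expressed as top-left corner plus width and height, which is what the drawRect calls in the question appear to take (an assumption on my part).

```kotlin
// Hypothetical helper types for this sketch; not part of pt.isel.canvas.
data class CornerCircle(val cx: Int, val cy: Int, val radius: Int)
data class Rect(val x: Int, val y: Int, val width: Int, val height: Int)
data class RoundedSquareParts(val circles: List<CornerCircle>, val rects: List<Rect>)

// roundPercent is t expressed as 0..100.
fun roundedSquareParts(cx: Int, cy: Int, side: Int, roundPercent: Int): RoundedSquareParts {
    val hs = side / 2
    val left = cx - hs
    val top = cy - hs
    val right = cx + hs
    val bottom = cy + hs
    val r = hs * roundPercent / 100 // R = hs * t, rounded down to an integer
    val circles = listOf(
        CornerCircle(left + r, top + r, r),
        CornerCircle(right - r, top + r, r),
        CornerCircle(right - r, bottom - r, r),
        CornerCircle(left + r, bottom - r, r)
    )
    val rects = listOf(
        // Tall rectangle: (lx + R, ty) - (rx - R, by)
        Rect(left + r, top, (right - r) - (left + r), bottom - top),
        // Wide rectangle: (lx, ty + R) - (rx, by - R)
        Rect(left, top + r, right - left, (bottom - r) - (top + r))
    )
    return RoundedSquareParts(circles, rects)
}
```

At roundPercent = 0 the circles have radius 0 and both rectangles cover the full square; at 100 the rectangles collapse to zero width or height and the four circles coincide at the center, giving a ball.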
I have two functions: one uses BigInteger and BigDecimal, the other uses Double. I want to calculate sin(z) using the Taylor series sin(z) = sum over n >= 0 of (-1)^n * z^(2n+1) / (2n+1)!
Here is my code:
fun sinus(z: BigDecimal, upperBound: Int = 100): BigDecimal = calcSin(z, upperBound)
fun cosinus(z: BigDecimal, upperBound: Int = 100): BigDecimal = calcSin(z, upperBound, false)
fun calcSin(z: BigDecimal, upperBound: Int = 100, isSin: Boolean = true): BigDecimal {
var erg: BigDecimal = BigDecimal.ZERO
for (n in 0..upperBound) {
// val zaehler = (-1.0).pow(n).toBigDecimal() * z.pow(2 * n + (if (isSin) 1 else 0))
// val nenner = fac(2 * n + (if (isSin) 1 else 0)).toBigDecimal()
val zaehler = (-1.0).pow(n).toBigDecimal() * z.pow(2 * n + 1)
val nenner = fac(2 * n + 1).toBigDecimal()
erg += (zaehler / nenner)
}
return erg
}
fun calcSin(z: Double, upperBound: Int = 100): Double {
var res = 0.0
for (n in 0..upperBound) {
val zaehler = (-1.0).pow(n) * z.pow(2 * n + 1)
val nenner = fac(2 * n + 1, true)
res += (zaehler / nenner)
}
return res
}
fun fac(n: Int): BigInteger = if (n == 0 || n == 1) BigInteger.ONE else n.toBigInteger() * fac(n - 1)
fun fac(n: Int, dummy: Boolean): Double = if (n == 0 || n == 1) 1.0 else n.toDouble() * fac(n - 1, dummy)
According to Google, Sin(1) is
0.8414709848
The Output of the following is however:
println("Sinus 1: ${sinus(1.0.toBigDecimal())}")
println("Sinus 1: ${sinus(1.0.toBigDecimal()).toDouble()}")
println("Sinus 1: ${sinus(1.0.toBigDecimal(), 1000)}")
println("Sinus 1: ${calcSin(1.0)}")
Output:
Sinus 1: 0.8414373208078281027995610599 (followed by a long run of trailing zeros, omitted here)
Sinus 1: 0.8414373208078281
Sinus 1: 0.8414373208078281027995610599 (followed by an even longer run of trailing zeros, omitted here)
Sinus 1: 0.8414709848078965
What am I missing? Why does the Double variant give the correct value while the BigDecimal one doesn't, even with 1000 iterations?
The commented-out code was meant to calculate cos as well, but I wanted to figure out this problem first, so I made both functions look the same.
In the BigDecimal variant, try replacing erg += (zaehler / nenner) with erg += (zaehler.divide(nenner, 20, RoundingMode.HALF_EVEN))
I suspect that the defaults for scaling the division results (as described here https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/math/BigDecimal.html) are not what you want.
BTW - I assume that performance is not part of the exercise, otherwise your implementation of factorial is a low hanging fruit.
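To make the scaling issue concrete: as far as I can tell, Kotlin's / operator on BigDecimal maps to divide(other, RoundingMode.HALF_EVEN), which keeps the scale of the dividend, so the early terms of the series get rounded to only a few decimal places. Below is a minimal sketch with an explicit MathContext. sinBig is a hypothetical name, and the 50-digit precision and 30 terms are arbitrary choices for illustration. It also updates each term from the previous one instead of recomputing powers and factorials.

```kotlin
import java.math.BigDecimal
import java.math.MathContext
import java.math.RoundingMode

// Hypothetical helper: sums the sine series with an explicit precision for every operation.
fun sinBig(
    z: BigDecimal,
    terms: Int = 30,
    mc: MathContext = MathContext(50, RoundingMode.HALF_EVEN)
): BigDecimal {
    var sum = BigDecimal.ZERO
    var term = z // n = 0 term: z^1 / 1!
    val zSquared = z.multiply(z, mc)
    for (n in 0 until terms) {
        sum = sum.add(term, mc)
        // term(n+1) = term(n) * (-z^2) / ((2n+2) * (2n+3))
        val denom = BigDecimal.valueOf((2L * n + 2) * (2L * n + 3))
        term = term.multiply(zSquared, mc).divide(denom, mc).negate()
    }
    return sum
}
```

For comparison, BigDecimal("1.0") / BigDecimal("3") evaluates to 0.3 in Kotlin, because the dividend has scale 1. That is the same effect that throws off the series in the question.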
In Kotlin you can use if statements kind of like ternary operators.
We have the option to do something like this:
val x = if (isOdd) 1 else 2
but if we have multiple variables that need to be set based on some condition is it more correct to do it the old fashioned way like so:
val x: Int
val y: Int
val z: Int
if (isOdd) {
x = 1
y = 3
z = 5
} else {
x = 2
y = 4
z = 6
}
or like this :
val x = if (isOdd) 1 else 2
val y = if (isOdd) 3 else 4
val z = if (isOdd) 5 else 6
The second way looks much cleaner to me, but I'd like to know whether the first method would be faster, since it only needs to evaluate the condition once whereas the second way needs to check it three times.
Is the second way actually slower or will it be optimized by the compiler?
I'd prefer something like this, looks way more Kotlinesque:
data class Point3D(val x: Int, val y: Int, val z: Int)
fun foo(isOdd: Boolean): Point3D = if (isOdd) Point3D(1, 3, 5) else Point3D(2, 4, 6)
// or using destructuring, see https://kotlinlang.org/docs/reference/multi-declarations.html
val (x,y,z) = if (isOdd) Triple(1, 3, 5) else Triple(2, 4, 6)
It also combines the best of both: it uses if as an expression and needs only one if (at the cost of an additional object allocation).
But to answer your question: do what you like and find most readable. Performance-wise, I doubt it will make a difference.
if is an expression in Kotlin, not a statement: it returns a value, whereas it doesn't in Java's case.
Honestly, I don't think this is an optimization issue you should ever worry about. Premature optimization is a common source of problems. If this boolean variable is thread-confined, then I think the compiler will perform all the optimizations that are possible in this context, so there will be almost no overhead (if any at all).
A wise choice in OO languages is to prefer clarity and flexibility over low-level optimization (especially when compilers are able to take care of it).
Okay, so I just saw this question again and got curious, so I ran some tests.
Turns out there is actually a HUGE difference. Here are the results:
Code
fun main() {
for (i in 0 until 3) {
val t1_s = System.currentTimeMillis()
for (j in 0 until 100000) {
when (i){
0 -> a(j % 2 == 0)
1 -> b(j % 2 == 0)
2 -> c(j % 2 == 0)
}
}
val t1_e = System.currentTimeMillis()
println("Test $i - time ${t1_e - t1_s}")
}
}
fun a(isOdd: Boolean): Int {
val x: Int
val y: Int
val z: Int
if (isOdd) {
x = 1
y = 3
z = 5
} else {
x = 2
y = 4
z = 6
}
return x + y + z
}
fun b(isOdd: Boolean): Int {
val x = if (isOdd) 1 else 2
val y = if (isOdd) 3 else 4
val z = if (isOdd) 5 else 6
return x + y + z
}
fun c(isOdd: Boolean): Int {
val (x,y,z) = if (isOdd) Triple(1, 3, 5) else Triple(2, 4, 6)
return x + y + z
}
Output
Test 0 - time 3
Test 1 - time 1
Test 2 - time 8
It seems my second solution is the fastest, my first suggestion next, and the top answer is MUCH slower.
Does anyone know why this might be? Obviously these are milliseconds, so it would almost never matter, but it is neat to think that one method is 5-10 times faster.
EDIT:
So I tried bumping the iterations up to 100000000 and the results were:
Test 0 - time 6
Test 1 - time 41
Test 2 - time 941
I guess the first two options are getting optimized away, but the third option always creates a new object, which makes it much slower.
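Hand-rolled timings like these are sensitive to JIT warm-up, to the order in which the variants run, and to the JIT discarding results that are never used. A harness such as JMH is the usual tool for this. The following is only a rough sketch of a slightly fairer manual comparison: all names are made up for this illustration, each variant is warmed up first, and the results are summed so the calls cannot be dropped entirely.

```kotlin
fun viaBlock(isOdd: Boolean): Int {
    val x: Int
    val y: Int
    val z: Int
    if (isOdd) { x = 1; y = 3; z = 5 } else { x = 2; y = 4; z = 6 }
    return x + y + z
}

fun viaThreeIfs(isOdd: Boolean): Int {
    val x = if (isOdd) 1 else 2
    val y = if (isOdd) 3 else 4
    val z = if (isOdd) 5 else 6
    return x + y + z
}

fun viaTriple(isOdd: Boolean): Int {
    val (x, y, z) = if (isOdd) Triple(1, 3, 5) else Triple(2, 4, 6)
    return x + y + z
}

// Runs f repeatedly and returns (elapsed nanoseconds, checksum of all results).
fun timed(iterations: Int, f: (Boolean) -> Int): Pair<Long, Long> {
    var checksum = 0L
    val start = System.nanoTime()
    for (j in 0 until iterations) checksum += f(j % 2 == 0)
    return (System.nanoTime() - start) to checksum
}

// Warms every variant up before measuring; returns nanoseconds per variant.
fun compare(iterations: Int, warmupRounds: Int = 5): Map<String, Long> {
    val variants: List<Pair<String, (Boolean) -> Int>> = listOf(
        "block" to { b: Boolean -> viaBlock(b) },
        "threeIfs" to { b: Boolean -> viaThreeIfs(b) },
        "triple" to { b: Boolean -> viaTriple(b) }
    )
    repeat(warmupRounds) { variants.forEach { (_, f) -> timed(iterations, f) } }
    return variants.associate { (name, f) -> name to timed(iterations, f).first }
}
```

Calling compare(10_000_000) and printing the map gives one number per variant. Even so, the numbers are only indicative and will vary between JVMs and runs.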
I am trying to populate a circumference with points located at equal intervals. Here is the code (it uses some Processing, but it is not crucial for understanding):
class Circle (x: Float, y: Float, subdivisions: Int, radius: Float) extends WorldObject(x, y) {
def subs = subdivisions
def r = radius
val d = r + r
def makePoints() : List[Glyph] = {
val step = PConstants.TWO_PI / subdivisions
val points = List.make(subdivisions, new Glyph())
for(i <- 0 to subdivisions - 1) {
points(i) position (PApplet.cos(step * i) * r + xPos, PApplet.sin(step * i) * r + yPos)
}
points
}
val points: List[Glyph] = makePoints()
override def draw() {
applet fill 0
applet stroke 255
applet ellipse(x, y, d, d)
applet fill 255
points map(_.update())
}
}
class Glyph(x: Float, y: Float) extends WorldObject(x, y){
def this() = this(0, 0)
override def draw() {
applet ellipse(xPos, yPos, 10, 10)
}
}
object WorldObject {
}
abstract class WorldObject(var xPos: Float, var yPos: Float) {
def this() = this(0, 0)
def x = xPos
def y = yPos
def update() {
draw()
}
def draw()
def position(x: Float, y: Float) {
xPos = x
yPos = y
}
def move(dx: Float, dy: Float) {
xPos += dx
yPos += dy
}
}
The strange result that I get is that all the points are located at a single place. I have experimented with println checks... the checks in the makePoints() method shows everything ok, but checks in the Circle.draw() or even right after the makePoints() show the result as I see it on the screen - all points are located in a single place, right where the last of them is generated, namely x=430.9017 y=204.89435 for a circle positioned at x=400 y=300 and subdivided to 5 points. So somehow they all get collected into the place where the last of them sits.
Why is there such a behavior? What am I doing wrong?
UPD: We have been able to locate the reason, see below:
While answering the question, user unknown changed the code to use the fill method instead of make. The relevant difference is that make takes its element argument by value, so it is evaluated once, whereas fill takes it by name (elem: => A) and re-evaluates it for every element. Thus make fills the list with references to one and the same Glyph, while fill creates a new Glyph on each addition. Here is the source of these methods from the Scala library:
/** Create a list containing several copies of an element.
 *
 * @param n the length of the resulting list
 * @param elem the element composing the resulting list
 * @return a list composed of n elements all equal to elem
 */
@deprecated("use `fill' instead", "2.8.0")
def make[A](n: Int, elem: A): List[A] = {
val b = new ListBuffer[A]
var i = 0
while (i < n) {
b += elem
i += 1
}
b.toList
}
And the fill method:
/** Produces a $coll containing the results of some element computation a number of times.
 * @param n the number of elements contained in the $coll.
 * @param elem the element computation
 * @return A $coll that contains the results of `n` evaluations of `elem`.
 */
def fill[A](n: Int)(elem: => A): CC[A] = {
val b = newBuilder[A]
b.sizeHint(n)
var i = 0
while (i < n) {
b += elem
i += 1
}
b.result
}
I changed a lot of variables back and forth (def x = ... => def x () = , x/ this.x and x/xPos and so on), added println statements, and removed the (P)applet stuff, which made the compiler complain.
Providing a compilable, runnable, standalone demo would be beneficial. Here it is:
class Circle (x: Float, y: Float, subdivisions: Int, radius: Float)
extends WorldObject (x, y) {
def subs = subdivisions
def r = radius
val d = r + r
def makePoints() : List[Glyph] = {
// val step = PConstants.TWO_PI / subdivisions
val step = 6.283F / subdivisions
val points = List.fill (subdivisions) (new Glyph ())
for (i <- 0 to subdivisions - 1) {
// points (i) position (PApplet.cos (step * i) * r + xPos,
// PApplet.sin (step * i) * r + yPos)
val xx = (math.cos (step * i) * r).toFloat + xPos
val yy = (math.sin (step * i) * r).toFloat + yPos
println (xx + ": " + yy)
points (i) position (xx, yy)
}
points
}
val points: List [Glyph] = makePoints ()
override def draw () {
/*
applet fill 0
applet stroke 255
applet ellipse(x, y, d, d)
applet fill 255
*/
// println ("Circle:draw () upd-> " + super.x () + "\t" + y () + "\t" + d);
points map (_.update ())
println ("Circle:draw () <-upd " + x + "\t" + y + "\t" + d);
}
}
class Glyph (x: Float, y: Float) extends WorldObject (x, y) {
def this () = this (0, 0)
override def draw() {
// applet ellipse (xPos, yPos, 10, 10)
println ("Glyph:draw (): " + xPos + "\t" + yPos + "\t" + 10);
}
}
object Circle {
def main (as: Array [String]) : Unit = {
val c = new Circle (400, 300, 5, 100)
c.draw ()
}
}
object WorldObject {
}
abstract class WorldObject (var xPos: Float, var yPos: Float) {
def this () = this (0, 0)
def x = xPos
def y = yPos
def update () {
draw ()
}
def draw ()
def position (x: Float, y: Float) {
xPos = x
yPos = y
// println (x + " ?= " + xPos + " ?= " + (this.x ()))
}
def move (dx: Float, dy: Float) {
xPos += dx
yPos += dy
}
}
My result is:
500.0: 300.0
430.9052: 395.1045
319.10266: 358.78452
319.09177: 241.23045
430.8876: 204.88977
Glyph:draw (): 500.0 300.0 10
Glyph:draw (): 430.9052 395.1045 10
Glyph:draw (): 319.10266 358.78452 10
Glyph:draw (): 319.09177 241.23045 10
Glyph:draw (): 430.8876 204.88977 10
Circle:draw () <-upd 400.0 300.0 200.0
Can you spot the difference?
You should make a copy of your code and remove, step by step, everything that isn't necessary to reproduce the error, checking each time whether the error is still present. You will then either arrive at a much smaller problem or find the error yourself.