Recursive definition of infinite sequence in Kotlin

I'm experimenting with Kotlin sequences, in particular the more complicated ones that are not simple calculations on the previous value.
One example I'd like to define is a sequence of all prime numbers.
An easy way to define the next prime is the next integer that is not divisible by any of the previous primes in the sequence.
In Scala this can be translated to:
def primeStream(s: Stream[Int]): Stream[Int] = s.head #:: primeStream(s.tail filter(_ % s.head != 0))
val primes = primeStream(Stream.from(2))
// first 20 primes
primes.take(20).toList
I'm having trouble translating this to Kotlin. In Scala it works because you can pass a function that returns a sequence that will be lazily evaluated, but I can't do the same in Kotlin.
In Kotlin I tried
fun primes(seq: Sequence<Int>): Sequence<Int> = sequenceOf(seq.first()) + primes(seq.drop(1).filter { it % seq.first() != 0 })
val primes = primes(generateSequence(2) { it + 1 })
primes.take(20).toList()
But that obviously doesn't work because the function is evaluated straight away, which leads to infinite recursion.

The key point here is to implement a Sequence transformation so that its first item remains and the tail is lazily transformed from the original Sequence tail to something else. That is, the transformation is done only when the item is requested.
First, let's implement lazy sequence concatenation, which behaves like simple concatenation but the right operand is evaluated lazily:
public infix fun <T> Sequence<T>.lazyPlus(otherGenerator: () -> Sequence<T>) =
    object : Sequence<T> {
        private val thisIterator: Iterator<T> by lazy { this@lazyPlus.iterator() }
        private val otherIterator: Iterator<T> by lazy { otherGenerator().iterator() }
        override fun iterator() = object : Iterator<T> {
            override fun next(): T =
                if (thisIterator.hasNext())
                    thisIterator.next()
                else
                    otherIterator.next()
            override fun hasNext(): Boolean =
                thisIterator.hasNext() || otherIterator.hasNext()
        }
    }
The laziness of otherIterator does the trick: otherGenerator is called only when otherIterator is first accessed, that is, when the first sequence is exhausted.
Now, let's write a recursive variant of the sieve of Eratosthenes:
fun primesFilter(from: Sequence<Int>): Sequence<Int> = from.iterator().let {
val current = it.next()
sequenceOf(current) lazyPlus {
primesFilter(it.asSequence().filter { it % current != 0 })
}
}
Note that lazyPlus allowed us to lazily make another recursive call of primesFilter in the tail of the sequence.
After that, the whole sequence of primes can be expressed as
fun primes(): Sequence<Int> {
fun primesFilter(from: Sequence<Int>): Sequence<Int> = from.iterator().let {
val current = it.next()
sequenceOf(current) lazyPlus {
primesFilter(it.asSequence().filter { it % current != 0 })
}
}
return primesFilter((2..Int.MAX_VALUE).asSequence())
}
This approach isn't very fast, though: evaluating 10,000 primes takes a few seconds, although the 1000th prime is emitted in about 0.1 seconds.
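For comparison, on Kotlin 1.3 or newer the same lazy recursion can be sketched with the standard library's sequence builder, whose body only runs as elements are requested. This is a minimal sketch rather than a replacement for lazyPlus; the names sieve and lazyPrimes are made up for this illustration.

```kotlin
// Hypothetical helper: emits the first candidate, then lazily sieves the rest.
fun sieve(candidates: Sequence<Int>): Sequence<Int> = sequence {
    val iterator = candidates.iterator()
    val head = iterator.next()
    yield(head)
    // The recursive call happens only when the element after `head` is requested.
    yieldAll(sieve(iterator.asSequence().filter { it % head != 0 }))
}

// Hypothetical entry point: all primes, starting from 2.
fun lazyPrimes(): Sequence<Int> = sieve(generateSequence(2) { it + 1 })
```

lazyPrimes().take(5).toList() gives [2, 3, 5, 7, 11]. As with the lazyPlus version, every prime found adds another filter layer, so this is about readability, not speed.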

You can place the Sequence<Int> concatenation inside a Sequence<Sequence<Int>> generator and then flatten it to a Sequence<Int> again:
fun primes(seq: Sequence<Int>): Sequence<Int> = generateSequence {
    seq.take(1) + primes(seq.drop(1).filter { it % seq.first() != 0 })
}.flatMap { it }
val primes = primes(generateSequence(2) { it + 1 })
Output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
It seems a bit slow, though. What you probably want is to cache each result in a list and build on it instead of recalculating the primes recursively, e.g.:
fun primes() = with(arrayListOf(2, 3)) {
    asSequence() + generateSequence(last() + 2) { it + 2 }
        .filter { all { prime -> it % prime != 0 } }
        .map { it.apply { add(it) } }
}

My current answer is not to use a recursive function. I can still get an infinite sequence of primes by modelling the sequence as a pair of values, with the first being the prime number and the second the current filtered sequence. I then apply map to select only the first element.
val primes = generateSequence(2 to generateSequence(3) { it + 2 }) {
    val currSeq = it.second
    val nextPrime = currSeq.first()
    nextPrime to currSeq.filter { it % nextPrime != 0 }
}.map { it.first }
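To check the pair-based generator, here is a self-contained sketch written against the current standard library, where generateSequence(seed) { ... } is the seed-and-step generator. The function name pairPrimes is an assumption made for this example.

```kotlin
// Hypothetical wrapper around the pair-based definition.
fun pairPrimes(): Sequence<Int> =
    generateSequence(2 to generateSequence(3) { it + 2 }) { state ->
        val candidates = state.second
        val nextPrime = candidates.first()
        // Carry forward the candidates with multiples of nextPrime removed.
        nextPrime to candidates.filter { it % nextPrime != 0 }
    }.map { it.first }
```

pairPrimes().take(5).toList() gives [2, 3, 5, 7, 11]. Each call to first() walks the filtered candidates again from 3, so the cost per prime grows as the filter chain gets longer.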

Related

Taking sequence elements fulfilling a predicate then continuing from there in Kotlin

In Kotlin sequences have a takeWhile function that will let you take items as long as they adhere to a given predicate. What I'd like to do is take items according to that predicate, use them in some way, then alter the predicate and take the next "batch". So far I haven't really found a way of doing this purely with what sequences and iterators offer.
The following snippet of code illustrates the problem. The primeGenerator() function returns a Sequence of prime (Long) numbers. Suppose that I want to make lists with each list having prime numbers with the same number of digits. On creating each list I'd use it for some purpose. If the list matches what I was searching for, the iteration can end; otherwise it moves on to the next list.
val primeIt = primeGenerator().iterator()
var digits = 1
var next: Long? = null
val currentList = ArrayList<Long>()
while (digits < 4) {
next?.also { currentList.add(it) }
next = primeIt.next()
if (next.toString().length > digits) {
println("Primes with $digits: $currentList")
currentList.clear()
digits++
}
}
In this case it ends once the number of digits exceeds 3. This works fine, but I was wondering if there is some way to achieve the same with operations chained purely on the sequence or an iterator of it. Basically chunking the sequence but based on a predicate rather than a set size. The prime number example above is just for illustration, I'm after the general principle, not something that'd only work for this case.
There are no such functions in the standard library for large (or infinite) sequences, but you can write one yourself (although it requires some extra code):
class BufferedIterator<T>(private val iterator: Iterator<T>) : Iterator<T> {
var current: T? = null
private set
var reachedEnd: Boolean = false
private set
override fun hasNext(): Boolean = iterator.hasNext().also { reachedEnd = !it }
override fun next(): T = iterator.next().also { current = it }
}
fun <T> Iterator<T>.buffered() = BufferedIterator(this)
fun <T> BufferedIterator<T>.takeWhile(predicate: (T) -> Boolean): List<T> {
val list = ArrayList<T>()
if (reachedEnd) return list
current?.let {
if (predicate(it)) list += it
}
while (hasNext()) {
val next = next()
if (predicate(next)) list += next
else break
}
return list
}
fun main() {
val sequence = sequence {
var next = 0
while (true) {
yield(next++)
}
}
val iter = sequence.iterator().buffered()
for (i in 0..3) {
println(iter.takeWhile { it.toString().length <= i })
}
}
With this approach you can easily work even with infinite sequences.
I believe there is a way to accomplish what you want using the standard library. Limit the sequence first and then groupBy the number of digits.
val Int.numberOfDigits
get() = this.toString().length
sequenceOf(1,22,333).takeWhile{ it.numberOfDigits < 3 }.groupBy{ it.numberOfDigits }.values
If you want to avoid the eager evaluation of groupBy, you could use groupingBy instead and then reduce, potentially leaving the accumulator blank.
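A minimal sketch of the groupingBy variant, assuming fold is an acceptable stand-in for the reduce mentioned above. Note that the terminal fold still walks the whole (already limited) sequence; digitGroups and its parameters are names made up for this example.

```kotlin
val Int.numberOfDigits: Int
    get() = toString().length

// Hypothetical helper: groups numbers by digit count, up to maxDigits digits.
fun digitGroups(numbers: Sequence<Int>, maxDigits: Int): Map<Int, List<Int>> =
    numbers
        .takeWhile { it.numberOfDigits <= maxDigits } // limit the sequence first
        .groupingBy { it.numberOfDigits }             // lazy: nothing is grouped yet
        .fold(emptyList<Int>()) { acc, n -> acc + n } // terminal step builds the map
```

For example, digitGroups(generateSequence(1) { it + 1 }, 2) maps 1 to the numbers 1..9 and 2 to the numbers 10..99.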
ardenit's answer seems like the best reusable approach. Since taking "chunks" of a sequence requires some state it doesn't seem likely something easily done in a purely functional manner. Delegating the state to a separate class enveloping the sequence makes sense.
Here's a small snippet showing what I ended up using. This assumes the sequence will not be empty and is (technically) infinite or further results aren't requested at some point.
class ChunkedIterator<T>(seq: Sequence<T>) {
    private val it = seq.iterator()
    var next: T = it.next()
    fun next(predicate: (T) -> Boolean): List<T> {
        val result = ArrayList<T>()
        while (predicate(next)) {
            result.add(next)
            next = it.next()
        }
        return result
    }
}
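A possible usage of this class, sketched with the natural numbers standing in for the prime sequence. The class is repeated so the snippet stands alone, and chunksByDigits is a name made up for this example.

```kotlin
class ChunkedIterator<T>(seq: Sequence<T>) {
    private val it = seq.iterator()
    var next: T = it.next()
    fun next(predicate: (T) -> Boolean): List<T> {
        val result = ArrayList<T>()
        while (predicate(next)) {
            result.add(next)
            next = it.next()
        }
        return result
    }
}

// Hypothetical demo: chunk 1, 2, 3, ... by number of digits.
fun chunksByDigits(maxDigits: Int): List<List<Int>> {
    val chunks = ChunkedIterator(generateSequence(1) { it + 1 })
    return (1..maxDigits).map { digits ->
        // The element that ended the previous chunk is still held in `next`.
        chunks.next { it.toString().length == digits }
    }
}
```

chunksByDigits(2) returns the numbers 1..9 followed by the numbers 10..99. Because the look-ahead element is kept in next, nothing is lost between chunks.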
One way you could achieve this is by getting an iterator from your original sequence and then building a new sequence out of it for each "take":
val itr = seq.iterator()
val batch1 = itr.asSequence().takeWhile { predicate1(it) }.toList()
val batch2 = itr.asSequence().takeWhile { predicate2(it) }.toList()
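One thing to watch for with this approach: takeWhile has to pull an element from the shared iterator to see that it fails the predicate, and that element is then gone. A small sketch with made-up predicates (the function name batches is an assumption):

```kotlin
// Hypothetical demo over 1, 2, 3, ... using one shared iterator.
fun batches(): Pair<List<Int>, List<Int>> {
    val itr = generateSequence(1) { it + 1 }.iterator()
    val batch1 = itr.asSequence().takeWhile { it < 4 }.toList() // reads 1..4, keeps 1..3
    val batch2 = itr.asSequence().takeWhile { it < 8 }.toList() // starts at 5, not 4
    return batch1 to batch2
}
```

batches() returns [1, 2, 3] and [5, 6, 7]: the 4 that ended the first batch never appears. If the boundary element matters, it has to be buffered somewhere.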

How can I take varying chunks out of a Kotlin Sequence?

If I have a Kotlin sequence, every invocation of take(n) restarts the sequence.
val items = generateSequence(0) {
if (it > 9) null else it + 1
}
@Test fun `take doesn't remember position`() {
assertEquals(listOf(0, 1), items.take(2).toList())
assertEquals(listOf(0, 1, 2), items.take(3).toList())
}
Is there an easy way to write, say, another(n) such that
@Test fun `another does remember position`() {
assertEquals(listOf(0, 1), items.another(2).toList())
assertEquals(listOf(2, 3, 4), items.another(3).toList())
}
I suppose that I have to have something that isn't the Sequence to keep the state, so maybe what I'm actually asking for is a nice definition of fun Iterator<T>.another(count: Int): List<T>
Sequence does not remember its position, but its iterator does remember:
val iterator : Iterator<Int> = items.iterator()
Now all you need is something like take(n) but for Iterator<T>:
public fun <T> Iterator<T>.another(n: Int): List<T> {
require(n >= 0) { "Requested element count $n is less than zero." }
if (n == 0) return emptyList()
var count = 0
val list = ArrayList<T>(n)
for (item in this) {
list.add(item)
if (++count == n)
break
}
return list
}
What about this:
@Test
fun `another does remember position`() {
val items: Sequence<Int> = generateSequence(0) {
if (it > 9) null else it + 1
}
val (first, rest) = items.another(2)
assertEquals(listOf(0, 1), first.toList())
assertEquals(listOf(2, 3, 4), rest.another(3).first.toList())
}
fun <T> Sequence<T>.another(n: Int): Pair<Sequence<T>, Sequence<T>> {
return this.take(n) to this.drop(n)
}
To answer the last part of your question:
I suppose that I have to have something that isn't the Sequence to keep the state, so maybe what I'm actually asking for is a nice definition of fun Iterator<T>.another(count: Int): List<T>
One such implementation would be:
fun <T> Iterator<T>.another(count: Int): List<T> {
val collectingList = mutableListOf<T>()
while (hasNext() && collectingList.size < count) {
collectingList.add(next())
}
return collectingList.toList()
}
This passes your test if you use the iterator produced by the sequence:
@Test
fun `another does remember position`() {
val items = generateSequence(0) {
if (it > 9) null else it + 1
}.iterator() //Use the iterator of this sequence.
assertEquals(listOf(0, 1), items.another(2))
assertEquals(listOf(2, 3, 4), items.another(3))
}
To me what you've described is an iterator, since it's something that allows you to go over a collection or sequence etc. but also remember its last position.
NB the implementation above wasn't written to take into consideration what should happen for non-positive counts passed in, and if the count is larger than what's left to iterate over you'll be returned a list which has smaller size than n. I suppose you could consider this an exercise for yourself :-)
Sequence does not remember its position, but its iterator does remember:
val iterator : Iterator<Int> = items.iterator()
Unfortunately there is no take(n) for an iterator, so to use the one from stdlib you need to wrap iter into an Iterable:
val iterable : Iterable<Int> = items.iterator().asIterable()
fun <T> Iterator<T>.asIterable() : Iterable<T> = object : Iterable<T> {
private val iter = this@asIterable
override fun iterator() = iter
}
That makes iterable.take(n) remember its position, but unfortunately there is an off-by-one error because the standard .take(n) asks for one element too many:
public fun <T> Iterable<T>.take(n: Int): List<T> {
require(n >= 0) { "Requested element count $n is less than zero." }
if (n == 0) return emptyList()
if (this is Collection<T>) {
if (n >= size) return toList()
if (n == 1) return listOf(first())
}
var count = 0
val list = ArrayList<T>(n)
for (item in this) {
if (count++ == n)
break
list.add(item)
}
return list.optimizeReadOnlyList()
}
That can be fixed with a little tweak:
public fun <T> Iterable<T>.take2(n: Int): List<T> {
require(n >= 0) { "Requested element count $n is less than zero." }
if (n == 0) return emptyList()
if (this is Collection<T>) {
if (n >= size) return toList()
if (n == 1) return listOf(first())
}
var count = 0
val list = ArrayList<T>(n)
for (item in this) {
list.add(item)
//count++
if (++count == n)
break
}
return list
}
Now both of your tests pass:
@Test fun `take does not remember position`() {
    assertEquals(listOf(0, 1), items.asIterable().take2(2).toList())
    assertEquals(listOf(0, 1, 2), items.asIterable().take2(3).toList())
}
@Test fun `another does remember position`() {
    assertEquals(listOf(0, 1), iterable.take2(2).toList())
    assertEquals(listOf(2, 3, 4), iterable.take2(3).toList())
}
You could create a function generateStatefulSequence which creates a sequence which keeps its state by using a second sequence's iterator to provide the values.
The iterator is captured in the closure of that function.
On each iteration the seed lambda ({ i.nextOrNull() }) of the returned sequence starts off with the next value provided by the iterator.
// helper
fun <T> Iterator<T>.nextOrNull() = if(hasNext()) { next() } else null
fun <T : Any> generateStatefulSequence(seed: T?, nextFunction: (T) -> T?): Sequence<T> {
val i = generateSequence(seed) {
nextFunction(it)
}.iterator()
return generateSequence(
seedFunction = { i.nextOrNull() },
nextFunction = { i.nextOrNull() }
)
}
Usage:
val s = generateStatefulSequence(0) { if (it > 9) null else it + 1 }
println(s.take(2).toList()) // [0, 1]
println(s.take(3).toList()) // [2, 3, 4]
println(s.take(10).toList()) // [5, 6, 7, 8, 9, 10]
Here is a nice definition of fun Iterator<T>.another(count: Int): List<T> as requested:
fun <T> Iterator<T>.another(count: Int): List<T> =
if (count > 0 && hasNext()) listOf(next()) + this.another(count - 1)
else emptyList()
Another workaround (similar to the suggestion by Willi Mentzel above) would be to create an asStateful() extension method that converts any sequence into one that remembers its position, by wrapping it in an Iterable that always yields the same iterator.
class StatefulIterable<out T>(wrapped: Sequence<T>): Iterable<T> {
private val iterator = wrapped.iterator()
override fun iterator() = iterator
}
fun <T> Sequence<T>.asStateful(): Sequence<T> = StatefulIterable(this).asSequence()
Then you can do:
val items = generateSequence(0) {
if (it > 9) null else it + 1
}.asStateful()
@Test fun `stateful sequence does remember position`() {
assertEquals(listOf(0, 1), items.take(2).toList())
assertEquals(listOf(2, 3, 4), items.take(3).toList())
}
Try it here: https://pl.kotl.in/Yine8p6wn

Required and one-of-more idioms

Kotlin's DSL support is great, but I ran into two scenarios where I can only add workarounds. Both workarounds have a major drawback: they enforce constraints only at execution time.
First constraint: required parameter
I would like to write something like this:
start {
position {
random {
rect(49, 46, 49, 47)
rect(50, 47, 51, 48)
point(51, 49)
}
}
}
where position is a required parameter. My approach is to set the position to null at startup and check it when building the start object.
Second constraint: one of many
I would like to allow exactly one of several possible sub objects:
start {
position {
random {
[parameters of random assign]
}
}
}
or
start {
position {
user {
[parameters of user assign]
}
}
}
I have a feeling that I have reached the limits of the Kotlin DSL toolkit, because these requirements are only validated at compile time in the core language as well.
Any idea?
You can take inspiration from Kotlin's own HTML DSL. For mandatory arguments, use simple functions with arguments, not a function literal with a receiver.
Your DSL will look something like this:
start(
position {// This is mandatory
random {// This is not
}
}
)
And your start builder:
fun start(position: Position): Start {
val start = Start(position)
...
return start
}
Use the same approach for position().
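Put together, the idea might look like the sketch below. Every type and function name here (PositionSource, RandomSource, UserSource, RandomBuilder and the rest) is a hypothetical stand-in, not the asker's real model; the point is only that required parts become ordinary parameters and each alternative becomes a function returning a common type.

```kotlin
// All types below are hypothetical stand-ins for the real domain model.
sealed class PositionSource
data class RandomSource(val points: List<Pair<Int, Int>>) : PositionSource()
object UserSource : PositionSource()

data class Position(val source: PositionSource)
data class Start(val position: Position)

class RandomBuilder {
    val points = mutableListOf<Pair<Int, Int>>()
    fun point(row: Int, column: Int) {
        points += row to column
    }
}

// Each alternative is a plain function returning a PositionSource.
fun random(block: RandomBuilder.() -> Unit): PositionSource =
    RandomSource(RandomBuilder().apply(block).points.toList())

fun user(): PositionSource = UserSource

// Mandatory parts are ordinary parameters, so omitting one does not compile.
fun position(source: PositionSource): Position = Position(source)

fun start(position: Position): Start = Start(position)
```

With these definitions, start(position(random { point(51, 49) })) compiles, while start() or position() without an argument is rejected by the compiler, and position accepts exactly one source. The price is nested parentheses instead of nested braces.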
After some thought about the problem, I realized that these two requirements can't be solved in Kotlin itself, so no purely syntactic solution is possible in the form introduced above. However, there are a few options that produce close enough syntax and address one or both problems at the same time.
Option 1: Parameters
This solution is quite simple and ugly, adding the awful "where-is-the-closing-parenthesis" anomaly. It simply moves the position property into the constructor:
start(random {
rect(49, 46, 49, 47)
rect(50, 47, 51, 48)
point(51, 49)
}) {
windDirection to NORTH
boat turn (BEAM_REACH at STARBOARD)
}
This is simple in code:
fun start(pos : StartPosition, op: StartConfigBuilder.() -> Unit) : StartConfigBuilder
= StartConfigBuilder(pos).apply(op)
and creates top level builder functions for the position implementations:
fun random( op : RandomStartPositionBuilder.() -> Unit) = RandomStartPositionBuilder().apply(op).build()
class RandomStartPositionBuilder {
private val startZoneAreas = mutableListOf<Area>()
fun rect(startRow: Int, startColumn: Int, endRow: Int = startRow, endColumn: Int) =
startZoneAreas.add(Area(startRow, startColumn, endRow, endColumn))
fun point(row: Int, column: Int) = startZoneAreas.add(Area(row, column))
fun build() = RandomStartPosition(if (startZoneAreas.isEmpty()) null else Zone(startZoneAreas))
}
fun user( op : UserStartPositionBuilder.() -> Unit) = UserStartPositionBuilder().apply(op).build()
class UserStartPositionBuilder {
fun build() = UserStartPosition()
}
Although this solves both the "required" and the "only one" problems at edit time, it makes the DSL much harder to read and we lose the elegance of the DSL tools. It becomes even messier if more than one property has to be moved into the constructor or as the internal object (position) becomes more complicated.
Option 2: Infix function
This solution moves the required complex field outside the block (this is the "nasty" part) and uses it as an infix function:
start {
windDirection to NORTH
boat turn (BEAM_REACH at STARBOARD)
} position random {
rect(49, 46, 49, 47)
rect(50, 47, 51, 48)
point(51, 49)
}
or
start {
windDirection to NORTH
boat turn (BEAM_REACH at STARBOARD)
} position user {
}
This solution solves the "only one" problem, but not the "exactly one".
To achieve this, I modified the builders:
// Note that the return value is the builder: build() has to be called programmatically later
fun start(op: StartConfigBuilder.() -> Unit) : StartConfigBuilder = StartConfigBuilder().apply(op)
class StartConfigBuilder {
private var position: StartPosition = DEFAULT_START_POSITION
private var windDirectionVal: InitialWindDirection = RandomInitialWindDirection()
val windDirection = InitialWindDirectionBuilder()
val boat = InitialHeadingBuilder()
infix fun position(pos : StartPosition) : StartConfigBuilder {
position = pos
return this
}
fun build() = StartConfig(position, windDirection.value, boat.get())
}
// I have to move the factory function top level
fun random( op : RandomStartPositionBuilder.() -> Unit) = RandomStartPositionBuilder().apply(op).build()
class RandomStartPositionBuilder {
private val startZoneAreas = mutableListOf<Area>()
fun rect(startRow: Int, startColumn: Int, endRow: Int = startRow, endColumn: Int) =
startZoneAreas.add(Area(startRow, startColumn, endRow, endColumn))
fun point(row: Int, column: Int) = startZoneAreas.add(Area(row, column))
fun build() = RandomStartPosition(if (startZoneAreas.isEmpty()) null else Zone(startZoneAreas))
}
// Another implementation
fun user( op : UserStartPositionBuilder.() -> Unit) = UserStartPositionBuilder().apply(op).build()
class UserStartPositionBuilder {
fun build() = UserStartPosition()
}
This solves the "only one" implementation problem in an almost elegant way, but gives no answer to the "required property" problem. So it is good when a default value can be applied, but it still gives only a parse-time exception when the position is missing.
Option 3: Chain of infix functions
This solution is a variant of the previous one. To address its "required" issue, we use a variable and an intermediate class:
var start : StartWithPos? = null
class StartWithoutPos {
val windDirection = InitialWindDirectionBuilder()
val boat = InitialHeadingBuilder()
}
class StartWithPos(val startWithoutPos: StartWithoutPos, pos: StartPosition) {
}
fun start( op: StartWithoutPos.() -> Unit): StartWithoutPos {
val res = StartWithoutPos().apply(op)
return res
}
infix fun StartWithoutPos.position( pos: StartPosition): StartWithPos {
return StartWithPos(this, pos)
}
Then we could write the following statement in DSL:
start = start {
windDirection to NORTH
boat heading NORTH
} position random {
}
This would solve both problems, but at the cost of an additional variable assignment.
All three solutions work and each adds some clutter to the DSL, so one can choose whichever fits better.

Find and return first match in nested lists in Kotlin?

Consider the following two classes:
class ObjectA(val objectBs: List<ObjectB>,
val otherFields: Any)
class ObjectB(val key: String,
val otherFields: Any)
The task is to find and return the first ObjectB with a certain key in a List of ObjectA.
Just achieving the goal is simple enough, but doing it nicely and efficiently seems rather tricky. I can't find anything like a "firstIn" or "findIn" function that would allow me to return another type than ObjectA when iterating on a list of ObjectA.
I have a few approaches, one of which looks pretty nice, but is very inefficient:
listOfA.mapNotNull {
it.objectBs.firstOrNull {
item -> item.key == wantedKey
}
}.firstOrNull()
The obvious inefficiency of this code is that it will not stop iterating through listOfA when it has found a match (and there can only be one match, just to be clear).
Approaches using filter or find have similar problems, requiring redundant iterations through at least one list of ObjectB.
Is there something in Kotlin's standard library that would cover such a use case?
If you want an elegant solution you can just do a flatMap like this:
val result: ObjectB? = listOfA.flatMap { it.objectBs }.firstOrNull { it.key == "myKey" }
If you want the efficiency you can do something like this:
val result: ObjectB? = objectAs.firstOrNull {
it.objectBs.map(ObjectB::key).contains("myKey")
}?.objectBs?.firstOrNull { it.key == "myKey" }
You can also wrap these in an Optional and put it in a function so the users of this operation can have a clean API:
fun List<ObjectA>.findFirstObjectB(key: String): Optional<ObjectB> {
return Optional.ofNullable(firstOrNull {
it.objectBs.map(ObjectB::key).contains(key)
}?.objectBs?.firstOrNull { it.key == key })
}
By converting all the nested elements to a flattened Sequence, they can be iterated lazily, and the overhead of unnecessary iteration is eliminated. This trick is done by combining asSequence and flatMap:
listOfA.asSequence().flatMap { it.objectBs.asSequence() }.find { it.key == wantedKey }
I wrote and ran the following code to ensure that it works as expected:
class PrintSequenceDelegate<out T>(private val wrappedSequence: Sequence<T>) : Sequence<T> by wrappedSequence {
override fun iterator(): Iterator<T> {
val wrappedIterator = wrappedSequence.iterator()
return object : Iterator<T> by wrappedIterator {
override fun next(): T =
wrappedIterator.next().also { println("Retrieving: $it") }
}
}
}
fun <T> Sequence<T>.toPrintDelegate() = PrintSequenceDelegate(this)
fun main() {
val listOfLists = List(3) { i -> List(3) { j -> "$i$j" } }
println("List of lists: $listOfLists")
val found = listOfLists.asSequence().toPrintDelegate().flatMap { it.asSequence().toPrintDelegate() }.find { it == "11" }
println(if (found != null) "Found: $found" else "Not found")
}
Output:
List of lists: [[00, 01, 02], [10, 11, 12], [20, 21, 22]]
Retrieving: [00, 01, 02]
Retrieving: 00
Retrieving: 01
Retrieving: 02
Retrieving: [10, 11, 12]
Retrieving: 10
Retrieving: 11
Found: 11
Thus we see that the elements (12) after the element found in the containing nested list are not iterated, neither are the following nested lists ([20, 21, 22]).
Nothing fancy, but it does the job efficiently:
fun findBWithKey(listOfA: List<ObjectA>, wantedKey: String): ObjectB? {
listOfA.forEach {
it.objectBs.forEach { item ->
if(item.key == wantedKey){
return item
}
}
}
return null
}
I also like to use map and first, but doing the given task efficiently gets unnecessarily hard using those extension functions.
A simple flatMap does the trick:
listOfA.flatMap { it.objectBs }.first { it.key == wantedKey }
This will basically give you an intermediate List with all of them combined so that you can easily query the first matching one.
I would look into coroutines or sequences if performance is critical.
You can optimize your code slightly by using firstOrNull on listOfA as well:
listOfA.filterNotNull().firstOrNull { item ->
item.objectBs.firstOrNull { it.key == wantedKey } != null
}
I would do some performance testing to see if this code is causing any issues before making it overly complex.
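On Kotlin 1.5 or newer there is also a standard-library call that matches the "firstIn" shape asked for: firstNotNullOfOrNull returns the first non-null result of a transform and stops iterating there. A minimal sketch, with ObjectA and ObjectB reduced to the fields that matter here and findB as a made-up name:

```kotlin
class ObjectB(val key: String)
class ObjectA(val objectBs: List<ObjectB>)

// Stops at the first ObjectA whose list contains a matching ObjectB.
fun findB(listOfA: List<ObjectA>, wantedKey: String): ObjectB? =
    listOfA.firstNotNullOfOrNull { a ->
        a.objectBs.firstOrNull { it.key == wantedKey }
    }
```

This avoids both the intermediate list of flatMap and the second lookup of the firstOrNull-then-firstOrNull variant.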

Kotlin: Convert large List to sublist of set partition size

I'm looking for a function equivalent to Groovy's collate which would partition a large List into batches for processing. I did see subList which could be adapted into a similar function but wanted to check and make sure I wasn't missing an in-built or crazy simple alternative to rolling my own.
With Kotlin 1.3, according to your needs, you may choose one of the following ways to solve your problem.
#1. Using chunked
fun main() {
val list = listOf(2, 4, 3, 10, 8, 7, 9)
val newList = list.chunked(2)
//val newList = list.chunked(size = 2) // also works
print(newList)
}
/*
prints:
[[2, 4], [3, 10], [8, 7], [9]]
*/
#2. Using windowed
fun main() {
val list = listOf(2, 4, 3, 10, 8, 7, 9)
val newList = list.windowed(2, 2, true)
//val newList = list.windowed(size = 2, step = 2, partialWindows = true) // also works
println(newList)
}
/*
prints:
[[2, 4], [3, 10], [8, 7], [9]]
*/
NOTE: For Kotlin 1.2 and newer, please see the chunked and windowed functions that are now in the standard library. There is no need for a custom solution.
Here is an implementation of a lazy batching extension function which will take a collection, or anything that can become a Sequence and return a Sequence of List each of that size, with the last one being that size or smaller.
Example usage to iterate a list as batches:
myList.asSequence().batch(5).forEach { group ->
// receive a List of size 5 (or less for the final batch)
}
Example to convert batches of List to Set:
myList.asSequence().batch(5).map { it.toSet() }
See the first test case below for showing the output given specific input.
Code for the function Sequence<T>.batch(groupSize):
public fun <T> Sequence<T>.batch(n: Int): Sequence<List<T>> {
return BatchingSequence(this, n)
}
private class BatchingSequence<T>(val source: Sequence<T>, val batchSize: Int) : Sequence<List<T>> {
override fun iterator(): Iterator<List<T>> = object : AbstractIterator<List<T>>() {
val iterate = if (batchSize > 0) source.iterator() else emptyList<T>().iterator()
override fun computeNext() {
if (iterate.hasNext()) setNext(iterate.asSequence().take(batchSize).toList())
else done()
}
}
}
Unit tests proving it works:
class TestGroupingStream {
@Test fun testConvertToListOfGroupsWithoutConsumingGroup() {
val listOfGroups = listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).asSequence().batch(2).toList()
assertEquals(5, listOfGroups.size)
assertEquals(listOf(1,2), listOfGroups[0].toList())
assertEquals(listOf(3,4), listOfGroups[1].toList())
assertEquals(listOf(5,6), listOfGroups[2].toList())
assertEquals(listOf(7,8), listOfGroups[3].toList())
assertEquals(listOf(9,10), listOfGroups[4].toList())
}
@Test fun testSpecificCase() {
val originalStream = listOf(1,2,3,4,5,6,7,8,9,10)
val results = originalStream.asSequence().batch(3).map { group ->
group.toList()
}.toList()
assertEquals(listOf(1,2,3), results[0])
assertEquals(listOf(4,5,6), results[1])
assertEquals(listOf(7,8,9), results[2])
assertEquals(listOf(10), results[3])
}
fun testStream(testList: List<Int>, batchSize: Int, expectedGroups: Int) {
var groupSeenCount = 0
var itemsSeen = ArrayList<Int>()
testList.asSequence().batch(batchSize).forEach { groupStream ->
groupSeenCount++
groupStream.forEach { item ->
itemsSeen.add(item)
}
}
assertEquals(testList, itemsSeen)
assertEquals(groupSeenCount, expectedGroups)
}
@Test fun groupsOfExactSize() {
testStream(listOf(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15), 5, 3)
}
@Test fun groupsOfOddSize() {
testStream(listOf(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18), 5, 4)
testStream(listOf(1,2,3,4), 3, 2)
}
@Test fun groupsOfLessThanBatchSize() {
testStream(listOf(1,2,3), 5, 1)
testStream(listOf(1), 5, 1)
}
@Test fun groupsOfSize1() {
testStream(listOf(1,2,3), 1, 3)
}
@Test fun groupsOfSize0() {
val testList = listOf(1,2,3)
val groupCountZero = testList.asSequence().batch(0).toList().size
assertEquals(0, groupCountZero)
val groupCountNeg = testList.asSequence().batch(-1).toList().size
assertEquals(0, groupCountNeg)
}
@Test fun emptySource() {
listOf<Int>().asSequence().batch(1).forEach { groupStream ->
fail()
}
}
}
A more simplistic/functional-style solution would be
val items = (1..100).map { "foo_${it}" }
fun <T> Iterable<T>.batch(chunkSize: Int) =
withIndex(). // create index value pairs
groupBy { it.index / chunkSize }. // create grouping index
map { it.value.map { it.value } } // split into different partitions
items.batch(3)
Note 1: Personally I'd prefer partition as a method name here, but it's already present in Kotlin's stdlib to separate a lists into 2 parts given a predicate.
Note 2: The iterator solution from Jayson may scale better than this solution for large collections.
In Kotlin 1.2 M2 and later you can use chunked and windowed (see Kotlin 1.2 M2 is out | Kotlin Blog). Note that there are Sequence variants too (see kotlin.sequences - Kotlin Programming Language).
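As a quick illustration of the Sequence variant: chunked on a sequence is lazy, so it can batch even an infinite source as long as only finitely many batches are consumed. A minimal sketch; firstBatches is a name made up for this example.

```kotlin
// Hypothetical helper: the first `count` batches of 1, 2, 3, ...
fun firstBatches(size: Int, count: Int): List<List<Int>> =
    generateSequence(1) { it + 1 }
        .chunked(size) // Sequence<List<Int>>, evaluated lazily
        .take(count)
        .toList()
```

firstBatches(3, 2) returns [[1, 2, 3], [4, 5, 6]]. The List version of chunked, by contrast, builds all batches up front.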
For versions of Kotlin prior to 1.2 M2 I recommend using Lists.partition(List, int) from google-guava (it uses java.util.List.subList(int, int)).
If you are unfamiliar with Guava, see CollectionUtilitiesExplained · google/guava Wiki for more details.
You can create your own Kotlin extension function for it if you want:
fun <T> List<T>.collate(size: Int): List<List<T>> = Lists.partition(this, size)
If you want an extension function for mutable lists then in a separate Kotlin file (to avoid platform declaration clashes):
fun <T> MutableList<T>.collate(size: Int): List<MutableList<T>> = Lists.partition(this, size)
If you want something lazy loaded like in Jayson Minard's answer you can use Iterables.partition(Iterable, int). You might also be interested in Iterables.paddedPartition(Iterable, int) if you want to pad the last sublist if it is smaller than the specified size. These return Iterable<List<T>> (I don't see much point in making it Iterable<Iterable<T>> as subList returns an efficient view).
If for some reason you don't want to depend on Guava you can roll your own pretty easily using the subList function you mentioned:
fun <T> List<T>.collate(size: Int): List<List<T>> {
require(size > 0)
return if (isEmpty()) {
emptyList()
} else {
(0..lastIndex / size).map {
val fromIndex = it * size
val toIndex = Math.min(fromIndex + size, this.size)
subList(fromIndex, toIndex)
}
}
}
or
fun <T> List<T>.collate(size: Int): Sequence<List<T>> {
require(size > 0)
return if (isEmpty()) {
emptySequence()
} else {
(0..lastIndex / size).asSequence().map {
val fromIndex = it * size
val toIndex = Math.min(fromIndex + size, this.size)
subList(fromIndex, toIndex)
}
}
}
Dummy array:
val array = ArrayList<String>()
for (i in 0..49) {
    array.add("java")
}
Used:
val data = array.chunked(15)
chunked is part of Kotlin's standard library.
There is unfortunately no built-in function for that yet, and while the functional and Sequence-based implementations from other answers look nice, if all you need is a List of Lists, I'd suggest writing a little bit of ugly, imperative, but performant code.
This is my final result:
fun <T> List<T>.batch(chunkSize: Int): List<List<T>> {
if (chunkSize <= 0) {
throw IllegalArgumentException("chunkSize must be greater than 0")
}
val capacity = (this.size + chunkSize - 1) / chunkSize
val list = ArrayList<ArrayList<T>>(capacity)
for (i in 0 until this.size) {
if (i % chunkSize == 0) {
list.add(ArrayList(chunkSize))
}
list.last().add(this.get(i))
}
return list
}