How to choose between asIterable() vs asSequence()? [duplicate] - kotlin

Both of these interfaces define only one method
public operator fun iterator(): Iterator<T>
Documentation says Sequence is meant to be lazy. But isn't Iterable lazy too (unless backed by a Collection)?

The key difference lies in the semantics and the implementation of the stdlib extension functions for Iterable<T> and Sequence<T>.
For Sequence<T>, the extension functions perform lazily where possible, similarly to Java Streams intermediate operations. For example, Sequence<T>.map { ... } returns another Sequence<R> and does not actually process the items until a terminal operation like toList or fold is called.
Consider this code:
val seq = sequenceOf(1, 2)
val seqMapped: Sequence<Int> = seq.map { print("$it "); it * it } // intermediate
print("before sum ")
val sum = seqMapped.sum() // terminal
It prints:
before sum 1 2
Sequence<T> is intended for lazy usage and efficient pipelining when you want to reduce the work done in terminal operations as much as possible, same to Java Streams. However, laziness introduces some overhead, which is undesirable for common simple transformations of smaller collections and makes them less performant.
In general, there is no good way to determine when it is needed, so in Kotlin stdlib laziness is made explicit and extracted to the Sequence<T> interface to avoid using it on all the Iterables by default.
For Iterable<T>, on contrary, the extension functions with intermediate operation semantics work eagerly, process the items right away and return another Iterable. For example, Iterable<T>.map { ... } returns a List<R> with the mapping results in it.
The equivalent code for Iterable:
val lst = listOf(1, 2)
val lstMapped: List<Int> = lst.map { print("$it "); it * it }
print("before sum ")
val sum = lstMapped.sum()
This prints out:
1 2 before sum
As said above, Iterable<T> is non-lazy by default, and this solution shows itself well: in most cases it has good locality of reference thus taking advantage of CPU cache, prediction, prefetching etc. so that even multiple copying of a collection still works good enough and performs better in simple cases with small collections.
If you need more control over the evaluation pipeline, there is an explicit conversion to a lazy sequence with Iterable<T>.asSequence() function.

Completing hotkey's answer:
It is important to notice how Sequence and Iterable iterates throughout your elements:
Sequence example:
list.asSequence().filter { field ->
Log.d("Filter", "filter")
field.value > 0
}.map {
Log.d("Map", "Map")
}.forEach {
Log.d("Each", "Each")
}
Log result:
filter - Map - Each; filter - Map - Each
Iterable example:
list.filter { field ->
Log.d("Filter", "filter")
field.value > 0
}.map {
Log.d("Map", "Map")
}.forEach {
Log.d("Each", "Each")
}
filter - filter - Map - Map - Each - Each

Iterable is mapped to the java.lang.Iterable interface on the
JVM, and is implemented by commonly used collections, like List or
Set. The collection extension functions on these are evaluated
eagerly, which means they all immediately process all elements in
their input and return a new collection containing the result.
Here’s a simple example of using the collection functions to get the
names of the first five people in a list whose age is at least 21:
val people: List<Person> = getPeople()
val allowedEntrance = people
.filter { it.age >= 21 }
.map { it.name }
.take(5)
Target platform: JVMRunning on kotlin v. 1.3.61 First, the age check
is done for every single Person in the list, with the result put in a
brand new list. Then, the mapping to their names is done for every
Person who remained after the filter operator, ending up in yet
another new list (this is now a List<String>). Finally, there’s one
last new list created to contain the first five elements of the
previous list.
In contrast, Sequence is a new concept in Kotlin to represent a lazily
evaluated collection of values. The same collection extensions are
available for the Sequence interface, but these immediately return
Sequence instances that represent a processed state of the date, but
without actually processing any elements. To start processing, the
Sequence has to be terminated with a terminal operator, these are
basically a request to the Sequence to materialize the data it
represents in some concrete form. Examples include toList, toSet,
and sum, to mention just a few. When these are called, only the
minimum required number of elements will be processed to produce the
demanded result.
Transforming an existing collection to a Sequence is pretty
straightfoward, you just need to use the asSequence extension. As
mentioned above, you also need to add a terminal operator, otherwise
the Sequence will never do any processing (again, lazy!).
val people: List<Person> = getPeople()
val allowedEntrance = people.asSequence()
.filter { it.age >= 21 }
.map { it.name }
.take(5)
.toList()
Target platform: JVMRunning on kotlin v. 1.3.61 In this case, the
Person instances in the Sequence are each checked for their age, if
they pass, they have their name extracted, and then added to the
result list. This is repeated for each person in the original list
until there are five people found. At this point, the toList function
returns a list, and the rest of the people in the Sequence are not
processed.
There’s also something extra a Sequence is capable of: it can contain
an infinite number of items. With this in perspective, it makes sense
that operators work the way they do - an operator on an infinite
sequence could never return if it did its work eagerly.
As an example, here’s a sequence that will generate as many powers of
2 as required by its terminal operator (ignoring the fact that this
would quickly overflow):
generateSequence(1) { n -> n * 2 }
.take(20)
.forEach(::println)
You can find more here.

Iterable is good enough for most use cases, the way iteration is performed on them it works very well with caches because of the spatial locality. But the issue with them is that whole collection must pass through first intermediate operation before it moves to second and so on.
In sequence each item passes through the full pipeline before the next is handled.
This property can be determental to the performance of your code especially when iterating over large data set. so, if your terminal operation is very likely to terminate early then sequence should be preferred choice because you save by not performing unnecessary operations. for example
sequence.filter { getFilterPredicate() }
.map { getTransformation() }
.first { getSelector() }
In above case if first item satisfies the filter predicate and after map transformation meets the selection criteria then filter, map and first are invoked only once.
In case of iterable whole collection must first be filtered then mapped and then first selection starts

Related

How to get last Key or Value in Kotlin Map collection

How can I get the last key or value in a Kotlin Map collection? It seems like it cannot be done by using an index value.
There's a couple ways it can be done. While you can't elegantly print a map directly, you may print it's entry set.
The first way, and the way that I DO NOT recommend, is by calling the .last() function on the entry set. This can be accomplished with testMap.entries.last(). The reason I don't recommend this method is because in real data this method is non-deterministic -- meaning there's no way to guarantee the characteristics of the value returned.
While I don't recommend this method, I don't know your application and this may be sufficient.
I DO recommend using the .sortedBy() function on your entry set, and then calling .last() on it. This allows you to make some sort of assumption about the results returned, something that is typically necessary, otherwise why do you want the last?
See this example comparing the two methods and then comparing the method against the order you would get if you iterate with the .forEach function:
fun main(args: Array<String>) {
val testMap = mutableMapOf<Long, String>()
testMap[1] = "Hello"
testMap[5] = "World"
testMap[3] = "Foobar"
println(testMap.entries.last())
println(testMap.entries.sortedBy { it.key }.last())
println("\norder via loop:")
testMap
.entries
.forEach {
println("\t$it")
}
}
Take a look at the output:
3=Foobar
5=World
order via loop:
1=Hello
5=World
3=Foobar
Here we see that the value returned from .last(), is the last value that was inserted into the map - the same happens with .forEach. This is okay, but usually we want our map to have some sort of order. In this example, i've called for the entry set to be sorted by the key value, so that our call to .last() on the entry set returns the key/value pair with the largest key.

Kotlin. What is the best way to replace element in immutable list?

What is the best way to update specific item in immutable list. For example I have list of Item. And I have several ways to update list:
1.
fun List<Item>.getList(newItem: Item): List<Item> {
val items = this.toMutableList()
val index = items.indexOf(newItem)
if (index != -1) {
items[index ] = newItem
}
return items
}
fun List<Item>.getList(newItem: Card): List<Item> {
return this.map { item ->
if (item.id == newItem.id) newItem else item
}
}
The second option looks more concise and I like it more. However, in the second option, we will go through each element in the list, which is bad for me, because the list can contain many elements.
Please, is there a better way to fulfill my requirement?
You have a few options - you're already doing the "make a mutable copy and update it" approach, and the "make a copy by mapping each item and changing what you need" one.
Another typical approach is to kinda go half-and-half, copying the parts you need, and inserting the bits you want to change. You could do this by, for example, slicing the list around the element you want to change, and building your final list from those parts:
fun List<Item>.update(item: Item): List<Item> {
val itemIndex = indexOf(item)
return if (itemIndex == -1) this.toList()
else slice(0 until itemIndex) + item + slice(itemIndex+1 until size)
}
This way you get to take advantage of any efficiency from the underlying list copy methods, versus map which has to "transform" each item even if it ends up passing through the original.
But as always, it's best to benchmark to see how well these approaches actually perform! Here's a playground example - definitely not the best place to do benchmarking, but it can be instructive as a general ballpark if you run things a few times:
Mapping all elements: 2500 ms
Slicing: 1491 ms
Copy and update index: 611 ms
Broadly speaking, mapping takes 60-100% more time than the slice-and-combine approach. And slicing takes 2-3x longer than just a straight mutable copy and update.
Considering what you actually need to do here (get a copy of the list and change (up to) one thing) the last approach seems like the best fit! The others have their benefits depending on how you want to manipulate the list to produce the end result, but since you're barely doing anything here, they just add unnecessary overhead. And of course it depends on your use-case - the slicing approach for example uses more intermediate lists than the mapping approach, and that might be a concern in addition to raw speed.
If the verbosity in your first example bothers you, you could always write it like:
fun List<Item>.getList(newItem: Item): List<Item> =
this.toMutableList().apply {
val index = indexOf(newItem)
if (index != -1) set(index, newItem)
}
The second one looks ever so slightly better for performance, but they are both O(n), so it's not a big difference, and hardly worth worrying about. I would go for the second one because it's easier to read.
The first one iterates the list up to 2 times, but the second iteration breaks early once it finds the item. (The first iteration is to copy the list, but it is possibly optimized by the JVM to do a fast array copy under the hood.)
The second one iterates the list a single time, but it does have to do the ID comparison for each item in the list.
Side note: "immutable" is not really the right term for a List. They are called "read-only" Lists because the interface does not guarantee immutability. For example:
private val mutableList = mutableListOf<Int>()
val readOnlyList: List<Int> get() = mutableList
To an outside class, this List is read-only, but not immutable. Its contents might be getting changed internally in the class that owns the list. That would be kind of a fragile design, but it's possible. There are situations where you might want to use a MutableList for performance reasons and pass it to other functions that only expect a read-only List. As long as you don't mutate it while it is in use by that other class, it would be OK.
Another thing you could try is, as apparently each item has an id field that you are using to identify the item, to create a map from it, perform all your replacements on that map, and convert it back into a list. This is only useful if you can batch all the replacements you need to do, though. It will probably also change the order of the items in the list.
fun List<Item>.getList(newItem: Item) =
associateBy(Item::id)
.also { map ->
map[newItem.id] = newItem
}
.values
And then there’s also the possibility to convert your list into a Sequence: this way it will be lazily evaluated; every replacement you add with .map will create a new Sequence that refers to the old one plus your new mapping, and none of it will be evaluated until you run an operation that actually has to read the whole thing, like toList().
Another solution: if the list is truly immutable and not only read-only; or if its contents could change and you would like to see these changes in the resulting list, then you can also wrap the original list into another one. This is fairly easy to do in Kotlin:
fun main() {
val list = listOf(
Item(1, "1-orig"),
Item(2, "2-orig"),
Item(3, "3-orig"),
)
val list2 = list.getList(Item(2, "2-new"))
println(list2)
}
fun List<Item>.getList(newItem: Item): List<Item> {
val found = indexOfFirst { it.id == newItem.id }
if (found == -1) return this
return object : AbstractList<Item>() {
override val size = this#getList.size
override fun get(index: Int) = if (index == found) newItem else this#getList[index]
}
}
data class Item(val id: Int, val name: String)
This is very good for the performance if you don't plan to repeatedly modify resulting lists with further changes. It is O(1) to replace an item and it almost doesn't use any additional memory. However, if you plan to invoke getList() repeatedly on a resulting list, each time creating a new one, that would create a chain of lists, slowing down access to the data and preventing garbage collector to clean up replaced items (if you don't use the original list anymore). You can partially optimize this by detecting you invoke getItem() on your specific implementation, but even better, you can use already existing libraries that does this.
This pattern is called a persistent data structure and it is provided by the library kotlinx.collections.immutable. You can use it like this:
fun main() {
val list = persistentListOf(
Item(1, "1-orig"),
Item(2, "2-orig"),
Item(3, "3-orig"),
)
val list2 = list.set(1, Item(2, "2-new"))
println(list2)
}
By the way, it seems strange to keep a list of items where we identify them by their ids. Did you consider using a map instead?

Which one is more efficient using kotlin?

I have a collection with objects that contain a value field and I need reduce information objective which one is more efficent or better and why?.
settings.filter {it.value != null }.forEach{
doSomething ....
}
settings.forEach{
if(it.value != null){
doSomething ...
}
filter allocates a list, so the second one will be faster. But if your list isn’t many hundreds of items long, the difference is negligible and you should choose what you think is more readable code. In this case I think the second one is easier to read anyway.
Here is the internal implementation of filter function used in Kotlin Collection.
public inline fun <T> Iterable<T>.filter(predicate: (T) -> Boolean): List<T> {
return filterTo(ArrayList<T>(), predicate) // New Array List Object Creation
}
public inline fun <T, C : MutableCollection<in T>> Iterable<T>.filterTo(destination: C, predicate: (T) -> Boolean): C {
for (element in this) if (predicate(element)) destination.add(element)
return destination
}
Here you can see, it creates new list. It creates an empty arraylist and add filtered elements to new list.
Adding to Tenfour04's answer, for small list you can use filter as its more idiomatic. If you need to go with optimal way, you can use non null check.
Also you do this more idiomatically like this,
settings.filterNotNull().forEach {} //It also create extra memory.
Or you can use create your own idiomatic foreach extension function filtering null values, without creating extra space
fun <T> Iterable<T?>.forEachNonNull(a: (T) -> Unit) {
for (i in this) {
if (i != null){
a.invoke(i)
}
}
}
You can use like this.
settings.forEachNonNull {
}
As other answers mention, the first example will create a temporary list in memory. In practice, this isn't usually worth worrying about — but as you say, if the list could be very big (say, tens of thousands of items or more) then it could become significant.
However, there's a ‘best of both worlds’ option, which is to use a sequence:
settings.asSequence()
.filterNotNull()
.forEach {
// doSomething ....
}
This looks like the first example (apart from the added asSequence() and line breaks), but performs about as well as the second. That's because sequences are evaluated lazily: in this case filterNotNull() doesn't create a new list, but adds an action that will be executed as part of the forEach. You can add futher processing steps in between, too, and nothing will actually get evaluated until it's needed.
There's a bit of overhead in setting it all up (which is why sequences aren't the default), but that overhead doesn't depend on the size of the list — so if you have big lists and/or lots of processing steps, it can save a lot of memory. (It can also save a lot of processing in cases where you're not using all the results, such as when the last operation is a find().)

Kotlin: How convert from Set to Map?

I have Set<FlagFilter> and I need to convert it to Map<Class<out FlagFilter>, FlagFilter>.
I tried doing it like this:
val result: Map<Class<out FlagFilter>, FlagFilter> =
target
.takeIf { !it.isEmpty() }
?.map { mapOf(it.javaClass to it) }
?: emptyMap<>()
but instead of a Map it turns out to be a List and I get a compilation error:
Type mismatch.
Required: Map<Class<out FlagFilter>, FlagFilter>
Found: List<Map<Class<FlagFilter>, FlagFilter>>
What am I doing wrong? As if there is not enough operation, but I do not understand yet which one
map isn't anything to do with the Map type - it's a functional programming term (coming from a broader mathematical concept) that basically means a function that maps each input value to an output value.
So it's a transformation or conversion that takes a collection of items, transforms each one, and results in another collection with the same number of items. In Kotlin, you get a List of items (unless you're working with a Sequence, in which case you get another Sequence that yields the same number of items).
It's worth getting familiar with the kotlin.collections package - there's lots of useful stuff in there! But each function has a specific purpose, in terms of how they process the collection and what they return:
map - returns a new value for each item
onEach - returns the original items (allows you to do something with each, then continue processing the collection)
forEach - returns nothing (allows you to do something with each, but as a final operation - you can't chain another operation, it's terminal)
filter - returns a subset of the original items, matching a predicate
first - returns a single item, matching a predicate
reduce - returns a single item, transforming the values to produce a single result
count - returns a single item, based on an attribute of the collection (not the values themselves)
There's more but you get the idea - some things transform, some things pass-through, some things give you an identically sized collection, or a potentially smaller one, or a single value. map is just the one that takes a collection, and gives you a collection of the same size where every item has been (potentially) altered.
Like Vadik says, use one of the associate* functions to turn values into a Map object, effectively transforming each item in the collection into a Map.Entry. Which is technically mapping it to a mapping, so I wasn't totally accurate earlier when I said it's nothing to do with Maps, but I figured I'd save this thought til the end ;)
Just use associateBy extension function:
val result: Map<Class<out FlagFilter>, FlagFilter> =
target.associateBy { it.javaClass }
Or if you want to fix your code, remove the excessive call of mapOf(). Just convert your Set to List of Pairs, and then call toMap() to create a map:
val result: Map<Class<out FlagFilter>, FlagFilter> =
target.map { it.javaClass to it }.toMap()

Kotlin: stream vs sequence - why multiple ways to do the same thing?

Is there a reason why there are multiple ways to do the same thing in Kotlin
val viaSequence = items.asSequence()
.filter { it%2 == 0 }
.map { it*2 }
.toList()
println(viaSequence)
val viaIterable = items.asIterable()
.filter { it%2 == 0 }
.map { it*2 }
.toList()
println(viaIterable)
val viaStream = items.stream()
.filter { it%2 == 0 }
.map { it*2 }
.toList()
println(viaStream)
I know that the following code creates a list on every step, which adds load to the GC, and as such should be avoided:
items.filter { it%2 == 0 }.map { it*2 }
Streams come from Java, where there are no inline functions, so Streams are the only way to use these functional chains on a collection in Java. Kotlin can do them directly on Iterables, which is better for performance in many cases because intermediate Stream objects don't need to be created.
Kotlin has Sequences as an alternative to Streams with these advantages:
They use null to represent missing items instead of Optional. Nullable values are easier to work with in Kotlin because of its null safety features. It's also better for performance to avoid wrapping all the items in the collection.
Some of the operators and aggregation functions are much more concise and avoid having to juggle generic types (compare Sequence.groupBy to Stream.collect).
There are more operators provided for Sequences, which result in performance advantages and simpler code by cutting out intermediate steps.
Many terminal operators are inline functions, so they omit the last wrapper that a Stream would need.
The sequence builder lets you create a complicated lazy sequence of items with simple sequential syntax in a coroutine. Very powerful.
They work back to Java 1.6. Streams require Java 8 or higher. This one is irrelevant for Kotlin 1.5 and higher since Kotlin now requires JDK 8 or higher.
The other answer mentions the advantages that Streams have.
Nice article comparing them here.
One of your three variants is the same as using the List itself:
items.asIterable()
.filter { it%2 == 0 }
Here, you're calling the exact same filter function as if you just called items.filter. The List itself is an Iterable so it's the Iterable filter that is called. This filter looks at all available elements and returns a complete list.
So the question is why we have both streams and sequences.
Streams are part of Java. Many of the terminal operations produce Optional. Other operations, such as toList(), will produce non-null-safe platform types such as List<Integer!>. On the other hand, sequences are native to Kotlin, and they can be used with Kotlin's own compile-time null-safety features. Also, sequences are available in non-JVM variants of Kotlin.
The Kotlin designers probably had to create a new class, because if they had just added new operations to Stream, e.g. as extension functions, they would clash with existing names (e.g. max() returns Optional in Java, and that's not ideal for Kotlin, but choosing a name other than the natural name max wouldn't be ideal either.)
So in most cases you should prefer sequences as they are more Kotlin-idiomatic. However, there are some things Java Streams can do that aren't yet available to sequences (for example, SummaryStatistics, or parallel operations). When you need an operation that is only available in Streams, but you have a Sequence, you can convert the Sequence to a Stream using asStream() (as well as vice versa).
Another advantage of Streams is that you can use primitive streams such as IntStream, to avoid unnecessary boxing/unboxing.