Which one is more efficient in Kotlin?

I have a collection of objects that each contain a value field. Objectively, which of the following is more efficient or better, and why?
settings.filter { it.value != null }.forEach {
    doSomething ....
}
settings.forEach {
    if (it.value != null) {
        doSomething ...
    }
}

filter allocates a list, so the second one will be faster. But if your list isn’t many hundreds of items long, the difference is negligible and you should choose what you think is more readable code. In this case I think the second one is easier to read anyway.

Here is the internal implementation of filter function used in Kotlin Collection.
public inline fun <T> Iterable<T>.filter(predicate: (T) -> Boolean): List<T> {
return filterTo(ArrayList<T>(), predicate) // New Array List Object Creation
}
public inline fun <T, C : MutableCollection<in T>> Iterable<T>.filterTo(destination: C, predicate: (T) -> Boolean): C {
for (element in this) if (predicate(element)) destination.add(element)
return destination
}
As you can see, it creates a new list: it starts with an empty ArrayList and adds the filtered elements to it.
Adding to Tenfour04's answer: for a small list you can use filter, as it's more idiomatic. If you need the most efficient option, use the null check inside forEach.
If it is the elements themselves that may be null, you can also write this more idiomatically:
settings.filterNotNull().forEach {} // This also allocates an extra list.
Or you can write your own forEach extension function that skips null values without allocating an extra list:
fun <T> Iterable<T?>.forEachNonNull(a: (T) -> Unit) {
for (i in this) {
if (i != null){
a.invoke(i)
}
}
}
You can use it like this:
settings.forEachNonNull {
}

As other answers mention, the first example will create a temporary list in memory. In practice, this isn't usually worth worrying about — but as you say, if the list could be very big (say, tens of thousands of items or more) then it could become significant.
However, there's a ‘best of both worlds’ option, which is to use a sequence:
settings.asSequence()
.filterNotNull()
.forEach {
// doSomething ....
}
This looks like the first example (apart from the added asSequence() and line breaks), but performs about as well as the second. That's because sequences are evaluated lazily: in this case filterNotNull() doesn't create a new list, but adds an action that will be executed as part of the forEach. You can add further processing steps in between, too, and nothing will actually get evaluated until it's needed.
There's a bit of overhead in setting it all up (which is why sequences aren't the default), but that overhead doesn't depend on the size of the list — so if you have big lists and/or lots of processing steps, it can save a lot of memory. (It can also save a lot of processing in cases where you're not using all the results, such as when the last operation is a find().)
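A minimal sketch of the early-exit point above, assuming a hypothetical Setting class with a nullable value field (the question does not show the real type). It counts how often the predicate runs when only the first match is needed, once eagerly and once through a sequence:

```kotlin
// Hypothetical stand-in for the question's settings objects.
data class Setting(val value: String?)

// Eager: filter() checks every element and builds a list before first() runs.
fun eagerPredicateCalls(settings: List<Setting>): Int {
    var calls = 0
    settings.filter { calls++; it.value != null }.first()
    return calls
}

// Lazy: the sequence stops pulling elements as soon as first() has a match.
fun lazyPredicateCalls(settings: List<Setting>): Int {
    var calls = 0
    settings.asSequence().filter { calls++; it.value != null }.first()
    return calls
}

fun main() {
    val settings = listOf(Setting(null), Setting("a"), Setting("b"), Setting(null))
    println(eagerPredicateCalls(settings)) // 4
    println(lazyPredicateCalls(settings))  // 2
}
```

When every element is consumed anyway (as with forEach), both versions run the predicate the same number of times; the sequence then only saves the intermediate list.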

Related

Kotlin. What is the best way to replace element in immutable list?

What is the best way to update a specific item in an immutable list? For example, I have a list of Item, and I have several ways to update it:
1.
fun List<Item>.getList(newItem: Item): List<Item> {
val items = this.toMutableList()
val index = items.indexOf(newItem)
if (index != -1) {
items[index] = newItem
}
return items
}
2.
fun List<Item>.getList(newItem: Item): List<Item> {
return this.map { item ->
if (item.id == newItem.id) newItem else item
}
}
The second option looks more concise and I like it more. However, in the second option, we will go through each element in the list, which is bad for me, because the list can contain many elements.
Please, is there a better way to fulfill my requirement?
You have a few options - you're already doing the "make a mutable copy and update it" approach, and the "make a copy by mapping each item and changing what you need" one.
Another typical approach is to kinda go half-and-half, copying the parts you need, and inserting the bits you want to change. You could do this by, for example, slicing the list around the element you want to change, and building your final list from those parts:
fun List<Item>.update(item: Item): List<Item> {
val itemIndex = indexOf(item)
return if (itemIndex == -1) this.toList()
else slice(0 until itemIndex) + item + slice(itemIndex+1 until size)
}
This way you get to take advantage of any efficiency from the underlying list copy methods, versus map which has to "transform" each item even if it ends up passing through the original.
But as always, it's best to benchmark to see how well these approaches actually perform! Here's a playground example - definitely not the best place to do benchmarking, but it can be instructive as a general ballpark if you run things a few times:
Mapping all elements: 2500 ms
Slicing: 1491 ms
Copy and update index: 611 ms
Broadly speaking, mapping takes 60-100% more time than the slice-and-combine approach. And slicing takes 2-3x longer than just a straight mutable copy and update.
Considering what you actually need to do here (get a copy of the list and change (up to) one thing) the last approach seems like the best fit! The others have their benefits depending on how you want to manipulate the list to produce the end result, but since you're barely doing anything here, they just add unnecessary overhead. And of course it depends on your use-case - the slicing approach for example uses more intermediate lists than the mapping approach, and that might be a concern in addition to raw speed.
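The playground code itself is not reproduced here, so the following is only a rough sketch of how such a comparison might be set up. The function names, list size and round count are assumptions, and matching is done by id rather than by equality. Without warm-up or a proper benchmarking harness, the timings are a ballpark at best.

```kotlin
import kotlin.system.measureTimeMillis

data class Item(val id: Int, val name: String)

// Approach 1: map every element, swapping in the new item where the id matches.
fun List<Item>.replaceByMap(newItem: Item): List<Item> =
    map { if (it.id == newItem.id) newItem else it }

// Approach 2: slice around the match and concatenate the parts.
fun List<Item>.replaceBySlice(newItem: Item): List<Item> {
    val index = indexOfFirst { it.id == newItem.id }
    return if (index == -1) toList()
    else slice(0 until index) + newItem + slice(index + 1 until size)
}

// Approach 3: make a mutable copy, then overwrite one index.
fun List<Item>.replaceByCopy(newItem: Item): List<Item> =
    toMutableList().apply {
        val index = indexOfFirst { it.id == newItem.id }
        if (index != -1) set(index, newItem)
    }

fun main() {
    val items = List(100_000) { Item(it, "orig-$it") }
    val newItem = Item(50_000, "new")
    val rounds = 100 // arbitrary choice for this sketch

    val mapMs = measureTimeMillis { repeat(rounds) { items.replaceByMap(newItem) } }
    val sliceMs = measureTimeMillis { repeat(rounds) { items.replaceBySlice(newItem) } }
    val copyMs = measureTimeMillis { repeat(rounds) { items.replaceByCopy(newItem) } }

    println("Mapping all elements: $mapMs ms")
    println("Slicing: $sliceMs ms")
    println("Copy and update index: $copyMs ms")
}
```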
If the verbosity in your first example bothers you, you could always write it like:
fun List<Item>.getList(newItem: Item): List<Item> =
this.toMutableList().apply {
val index = indexOf(newItem)
if (index != -1) set(index, newItem)
}
The second one looks ever so slightly better for performance, but they are both O(n), so it's not a big difference, and hardly worth worrying about. I would go for the second one because it's easier to read.
The first one iterates the list up to 2 times, but the second iteration breaks early once it finds the item. (The first iteration is to copy the list, but it is possibly optimized by the JVM to do a fast array copy under the hood.)
The second one iterates the list a single time, but it does have to do the ID comparison for each item in the list.
Side note: "immutable" is not really the right term for a List. They are called "read-only" Lists because the interface does not guarantee immutability. For example:
private val mutableList = mutableListOf<Int>()
val readOnlyList: List<Int> get() = mutableList
To an outside class, this List is read-only, but not immutable. Its contents might be getting changed internally in the class that owns the list. That would be kind of a fragile design, but it's possible. There are situations where you might want to use a MutableList for performance reasons and pass it to other functions that only expect a read-only List. As long as you don't mutate it while it is in use by that other class, it would be OK.
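To make the distinction concrete, here is a small sketch built around the two declarations above. The Owner class and its add function are made up for illustration.

```kotlin
// Hypothetical owner class wrapping the two declarations from the answer.
class Owner {
    private val mutableList = mutableListOf<Int>()
    val readOnlyList: List<Int> get() = mutableList

    fun add(value: Int) {
        mutableList.add(value)
    }
}

fun main() {
    val owner = Owner()
    val view = owner.readOnlyList              // same underlying list, read-only type
    val snapshot = owner.readOnlyList.toList() // independent copy taken now

    owner.add(42)

    println(view)     // [42]
    println(snapshot) // []
}
```

The view changes when the owner mutates its list, while the snapshot taken with toList() does not.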
Another thing you could try, since each item apparently has an id field that you use to identify it, is to create a map from the list, perform all your replacements on that map, and convert it back into a list. This is only useful if you can batch all the replacements you need to do, though. Note that associateBy keeps the iteration order of the original list, but items that share an id collapse into one entry, and a newItem whose id is not yet present is appended at the end.
fun List<Item>.getList(newItem: Item): List<Item> =
    associateBy(Item::id)
        .toMutableMap() // associateBy returns a read-only Map
        .also { map ->
            map[newItem.id] = newItem
        }
        .values
        .toList()
And then there’s also the possibility to convert your list into a Sequence: this way it will be lazily evaluated; every replacement you add with .map will create a new Sequence that refers to the old one plus your new mapping, and none of it will be evaluated until you run an operation that actually has to read the whole thing, like toList().
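As a rough sketch of the sequence idea, assuming an Item class with an id field and a made-up helper name replaceAllLazily: each replacement adds one lazy map step, and the list is only walked once toList() runs.

```kotlin
data class Item(val id: Int, val name: String)

// Hypothetical helper: stacks one lazy map step per replacement.
fun List<Item>.replaceAllLazily(replacements: List<Item>): List<Item> {
    var pipeline: Sequence<Item> = asSequence()
    for (replacement in replacements) {
        // Nothing is evaluated here; this only wraps the previous sequence.
        pipeline = pipeline.map { if (it.id == replacement.id) replacement else it }
    }
    // The terminal operation walks the original list once, applying every step per item.
    return pipeline.toList()
}

fun main() {
    val items = listOf(Item(1, "1-orig"), Item(2, "2-orig"), Item(3, "3-orig"))
    val updated = items.replaceAllLazily(listOf(Item(2, "2-new"), Item(3, "3-new")))
    println(updated.map { it.name }) // [1-orig, 2-new, 3-new]
}
```

Every stacked step still costs one comparison per item, so this mainly saves intermediate lists rather than comparisons.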
Another solution: if the list is truly immutable and not only read-only; or if its contents could change and you would like to see these changes in the resulting list, then you can also wrap the original list into another one. This is fairly easy to do in Kotlin:
fun main() {
val list = listOf(
Item(1, "1-orig"),
Item(2, "2-orig"),
Item(3, "3-orig"),
)
val list2 = list.getList(Item(2, "2-new"))
println(list2)
}
fun List<Item>.getList(newItem: Item): List<Item> {
val found = indexOfFirst { it.id == newItem.id }
if (found == -1) return this
return object : AbstractList<Item>() {
override val size = this@getList.size
override fun get(index: Int) = if (index == found) newItem else this@getList[index]
}
}
data class Item(val id: Int, val name: String)
This is very good for performance if you don't plan to repeatedly modify the resulting lists with further changes. Apart from the indexOfFirst search, replacing an item is O(1) and uses almost no additional memory. However, if you plan to invoke getList() repeatedly on a resulting list, each time creating a new one, you create a chain of lists, which slows down access to the data and prevents the garbage collector from cleaning up replaced items (if you don't use the original list anymore). You can partially optimize this by detecting that getList() is invoked on your specific implementation, but even better, you can use existing libraries that do this.
This pattern is called a persistent data structure and it is provided by the library kotlinx.collections.immutable. You can use it like this:
import kotlinx.collections.immutable.persistentListOf
fun main() {
val list = persistentListOf(
Item(1, "1-orig"),
Item(2, "2-orig"),
Item(3, "3-orig"),
)
val list2 = list.set(1, Item(2, "2-new"))
println(list2)
}
By the way, it seems strange to keep a list of items where we identify them by their ids. Did you consider using a map instead?
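A minimal sketch of that suggestion, assuming the same Item class and a hypothetical replaced helper. With a read-only Map keyed by id, the plus operator returns a new map with the entry swapped and leaves the original untouched.

```kotlin
data class Item(val id: Int, val name: String)

// Hypothetical helper: returns a new map in which newItem replaces the entry with the same id.
fun Map<Int, Item>.replaced(newItem: Item): Map<Int, Item> =
    this + (newItem.id to newItem)

fun main() {
    val items: Map<Int, Item> = listOf(
        Item(1, "1-orig"),
        Item(2, "2-orig"),
        Item(3, "3-orig"),
    ).associateBy(Item::id)

    val updated = items.replaced(Item(2, "2-new"))

    println(updated.values.map { it.name }) // [1-orig, 2-new, 3-orig]
    println(items.getValue(2).name)         // 2-orig
}
```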

How do I map non-null elements without an assertion in Kotlin?

I am trying to find a better way to chain the filter and map operators in Kotlin. What I want to do is filter out the null items before going to the map operator.
I was able to chain them, but the compiler complained about the nullability of my list items.
class Person(val age : String?)
fun foo(age :String){
// require non-null age
}
The sample usage was:
val list = mutableListOf(Person("3"), Person("2"))
list.filter{ it.age != null }.map{ foo(it.age) }
// The IDE wants me to add !!
So why can't Kotlin infer the nullability? The items passed down to map have already been filtered, so they should all be non-null.
You can replace filter and map with one method mapNotNull:
val list2 = list.mapNotNull { it.age }
This case may seem easy for a human, but technically speaking it would be really hard for the compiler to understand that after filtering it is a list of people objects, but with different type of the age property than original.
If you don't use the whole Person instance at the map() stage, then I think the easiest would be to do:
list
.mapNotNull { it.age }
.map(::foo)
Or, if your foo() can't return nulls:
list.mapNotNull { it.age?.let(::foo) }
But I think this is less readable. Or you can just use !! - it's not that bad if we know what we're doing.
You can use the Iterable<T>.filterNotNull() extension function here which will return a list of the non-nullable type.
In your case, the compiler just isn't advanced enough to smart-cast the filtered list; that would be quite a lot to ask. So if you need to use filter specifically, you would have to add an assertion.
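The following sketch puts the two assertion-free options side by side. The body of foo and the nonNullAges helper are made up for illustration; the question only shows that foo requires a non-null String.

```kotlin
class Person(val age: String?)

// Hypothetical body: the question only specifies the non-null parameter.
fun foo(age: String): Int = age.length

// mapNotNull extracts the property and drops nulls in one step,
// so the result is typed List<String> rather than List<String?>.
fun nonNullAges(people: List<Person>): List<String> =
    people.mapNotNull { it.age }

fun main() {
    val list = listOf(Person("3"), Person(null), Person("21"))

    println(nonNullAges(list).map(::foo)) // [1, 2]

    // Same result with map followed by filterNotNull, at the cost of one more list.
    val viaFilterNotNull: List<String> = list.map { it.age }.filterNotNull()
    println(viaFilterNotNull == nonNullAges(list)) // true
}
```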

Kotlin Collection: indexOfFirst vs find

It might be a stupid question, but I am not sure whether to use indexOfFirst() or find(), as both "Returns the first element matching the given [predicate]". The only difference is that one returns -1 and the other null. When should I use indexOfFirst() and when find()? Is there any advantage of one over the other? Consider the following code snippet.
private val mPersonList = mutableListOf<Person>()
private fun findPerson(person: Person) {
val position = mPersonList.indexOfFirst { it.name == person.name }
if (position != -1) {
doSomethingWithPerson(mPersonList[position])
}
}
private fun findPersonWithFind(person: Person) {
val foundPerson = mPersonList.find { it.name == person.name }
foundPerson?.let { doSomethingWithPerson(it) }
}
private fun doSomethingWithPerson(foundPerson: Person) {
//Do something
}
Both functions do nearly the same thing: they both locate the first matching item in a list or array (i.e. the first one for which the given predicate returns true).
The differences between them are subtle:
Most obviously, indexOfFirst() gives the index of the matching item, while find() gives the item itself.
Obviously, if you have the index, you can easily get the matching item.  (And, if the list is random-access, such as an ArrayList, then that's very efficient — much less so if it's not, such as a LinkedList.)  Whereas if you only have the item, then you can't find its index without calling find, indexOf, or indexOfFirst again!
So if you need to know the index, then only indexOfFirst() will do; but if you don't, then find() may be marginally simpler.
The code in the question falls into the latter category: findPerson() gets the position but uses it only to index into the list.  So that's a little more long-winded, and (if the list isn't random-access) potentially a lot slower, than findPersonWithFind().
Second, as you say, if no matching item is found, indexOfFirst() returns -1, while find() returns null.
Kotlin provides many ways to use nulls safely (such as the safe-call ?. operator, the elvis ?: operator, smart-casting, extension functions on nullable receivers, and many helpful functions in the standard library).  But there are no equivalents for dealing with -1, so using find() is likely to make it easier to safely handle the not-found case.
By the way, the nullability is made clear in the alternative name for find(), which is firstOrNull() — though that also has overloads which take no predicate and simply return the very first item in the list if it's not empty.  (The standard library is moving toward …OrNull() function names, probably because it makes the nullability very clear, especially when it's a common naming convention.)
So, which one you use depends on your needs.
It's also worth being aware of some related functions.  All of them have equivalents which find the last matching item: findLast()/lastOrNull(), and indexOfLast().
There's also the older indexOf() function, which checks for (equality with) a given object, instead of using a predicate. (That, too, returns -1 if not found, which is probably why indexOfFirst() and indexOfLast() do the same.) Though if the list is sorted, a binarySearch() or binarySearchBy() is likely to be a lot faster than a full scan.
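A short sketch of the functions discussed above, using a made-up Person class and a list that happens to be sorted by name so that binarySearchBy() is applicable:

```kotlin
data class Person(val name: String)

fun main() {
    // Sorted by name, which binarySearchBy() requires.
    val people = listOf(Person("Ann"), Person("Bob"), Person("Cid"))

    println(people.find { it.name == "Bob" })         // Person(name=Bob)
    println(people.indexOfFirst { it.name == "Bob" }) // 1

    // Not-found cases: null versus -1.
    println(people.find { it.name == "Zed" })         // null
    println(people.indexOfFirst { it.name == "Zed" }) // -1

    // The null result combines directly with ?. and ?:
    println(people.find { it.name == "Zed" }?.name ?: "nobody") // nobody

    // Binary search on the sorted key instead of a full scan.
    println(people.binarySearchBy("Bob") { it.name }) // 1
}
```

Note that binarySearchBy() reports a missing key with a negative number that encodes the insertion point, not with a fixed -1.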

Speed up filter of large list in Kotlin

I've got an ArrayList with about 4000 Pair<Int,Int>s (points on a grid).
At one point I need to get the points in a certain range of x and y coordinates.
My code so far is:
val points: ArrayList<Pair<Int, Int>> = // ...
val xRange: IntRange = (x - spacingX)..(x + spacingX) // x, spacingX: Int
val yRange: IntRange = (y - spacingY)..(y + spacingY) // y, spacingY: Int
val nearPoints: List<Pair<Int, Int>> = points.filter { xRange.contains(it.first) && yRange.contains(it.second) }
It is considerably faster than iterating over the entire list, but I hoped to further speed up the process.
Is it possible to get the nearPoints: ArrayList faster, through another construct? I've read about Sequence, but it seems to be better for multiple operations, rather than pure filtering.
Using an ArrayList guarantees constant time complexity O(1) (per accessed element) when iterating and since contains of IntRange already makes a range check
override fun contains(value: Int): Boolean = first <= value && value <= last
and does not search for a particular element, I don't think you can further speed it up.
Note: It would be more idiomatic to use in instead of contains.
The sequence in Kotlin makes the process work lazily. At the JVM level, you will have a class like Iterable that would use an Iterator from the ArrayList to apply the filter.
The best approach is to profile the code on real data (4000 items is probably not many at all) and see where the bottlenecks are.
You need to place the points into an ArrayList, which means you do not need the laziness at all. I would vote for using the .filter { .. } inline function on the ArrayList. The lambda is inlined into the code in that case, so there is no method call per element. Check the bytecode. You could probably even replace the ranges with plain comparisons.
Should you need more speed, you may try to replace ArrayList<Pair<Int, Int>> with primitive types, e.g. use an IntArray or a LongArray (you may encode two Ints as one Long). But please, profile the existing code first.
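One possible way to sketch the "two Ints in one Long" idea is shown below. The helper names pack, unpackX, unpackY and nearPoints are made up for illustration, and whether this is actually faster for 4000 points is exactly what profiling would have to show.

```kotlin
// Hypothetical helpers: x goes into the high 32 bits, y into the low 32 bits.
fun pack(x: Int, y: Int): Long = (x.toLong() shl 32) or (y.toLong() and 0xFFFFFFFFL)
fun unpackX(point: Long): Int = (point shr 32).toInt()
fun unpackY(point: Long): Int = point.toInt()

// Returns the packed points whose coordinates fall inside both ranges,
// without boxing any Pair or Long along the way.
fun nearPoints(points: LongArray, xRange: IntRange, yRange: IntRange): LongArray {
    val result = LongArray(points.size)
    var count = 0
    for (point in points) {
        val px = unpackX(point)
        val py = unpackY(point)
        // Plain comparisons instead of IntRange.contains calls.
        if (px >= xRange.first && px <= xRange.last &&
            py >= yRange.first && py <= yRange.last
        ) {
            result[count++] = point
        }
    }
    return result.copyOf(count) // trim to the number of matches
}

fun main() {
    val points = longArrayOf(pack(1, 1), pack(5, 5), pack(-3, 2), pack(4, 9))
    val near = nearPoints(points, 0..5, 0..5)
    println(near.map { unpackX(it) to unpackY(it) }) // [(1, 1), (5, 5)]
}
```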

How to choose between asIterable() vs asSequence()? [duplicate]

Both of these interfaces define only one method
public operator fun iterator(): Iterator<T>
Documentation says Sequence is meant to be lazy. But isn't Iterable lazy too (unless backed by a Collection)?
The key difference lies in the semantics and the implementation of the stdlib extension functions for Iterable<T> and Sequence<T>.
For Sequence<T>, the extension functions perform lazily where possible, similarly to Java Streams intermediate operations. For example, Sequence<T>.map { ... } returns another Sequence<R> and does not actually process the items until a terminal operation like toList or fold is called.
Consider this code:
val seq = sequenceOf(1, 2)
val seqMapped: Sequence<Int> = seq.map { print("$it "); it * it } // intermediate
print("before sum ")
val sum = seqMapped.sum() // terminal
It prints:
before sum 1 2
Sequence<T> is intended for lazy usage and efficient pipelining when you want to reduce the work done in terminal operations as much as possible, the same as Java Streams. However, laziness introduces some overhead, which is undesirable for common simple transformations of smaller collections and makes them less performant.
In general, there is no good way to determine when it is needed, so in Kotlin stdlib laziness is made explicit and extracted to the Sequence<T> interface to avoid using it on all the Iterables by default.
For Iterable<T>, on the contrary, the extension functions with intermediate operation semantics work eagerly: they process the items right away and return another Iterable. For example, Iterable<T>.map { ... } returns a List<R> with the mapping results in it.
The equivalent code for Iterable:
val lst = listOf(1, 2)
val lstMapped: List<Int> = lst.map { print("$it "); it * it }
print("before sum ")
val sum = lstMapped.sum()
This prints out:
1 2 before sum
As said above, Iterable<T> is non-lazy by default, and this solution works well in practice: in most cases it has good locality of reference, thus taking advantage of CPU cache, prediction, prefetching etc., so that even copying a collection multiple times still works well enough and performs better in simple cases with small collections.
If you need more control over the evaluation pipeline, there is an explicit conversion to a lazy sequence with Iterable<T>.asSequence() function.
Completing hotkey's answer:
It is important to notice how Sequence and Iterable iterate over your elements:
Sequence example:
list.asSequence().filter { field ->
Log.d("Filter", "filter")
field.value > 0
}.map {
Log.d("Map", "Map")
}.forEach {
Log.d("Each", "Each")
}
Log result:
filter - Map - Each; filter - Map - Each
Iterable example:
list.filter { field ->
Log.d("Filter", "filter")
field.value > 0
}.map {
Log.d("Map", "Map")
}.forEach {
Log.d("Each", "Each")
}
Log result:
filter - filter - Map - Map - Each - Each
Iterable is mapped to the java.lang.Iterable interface on the
JVM, and is implemented by commonly used collections, like List or
Set. The collection extension functions on these are evaluated
eagerly, which means they all immediately process all elements in
their input and return a new collection containing the result.
Here’s a simple example of using the collection functions to get the
names of the first five people in a list whose age is at least 21:
val people: List<Person> = getPeople()
val allowedEntrance = people
.filter { it.age >= 21 }
.map { it.name }
.take(5)
First, the age check
is done for every single Person in the list, with the result put in a
brand new list. Then, the mapping to their names is done for every
Person who remained after the filter operator, ending up in yet
another new list (this is now a List<String>). Finally, there’s one
last new list created to contain the first five elements of the
previous list.
In contrast, Sequence is a new concept in Kotlin to represent a lazily
evaluated collection of values. The same collection extensions are
available for the Sequence interface, but these immediately return
Sequence instances that represent a processed state of the data, but
without actually processing any elements. To start processing, the
Sequence has to be terminated with a terminal operator, these are
basically a request to the Sequence to materialize the data it
represents in some concrete form. Examples include toList, toSet,
and sum, to mention just a few. When these are called, only the
minimum required number of elements will be processed to produce the
demanded result.
Transforming an existing collection to a Sequence is pretty
straightforward, you just need to use the asSequence extension. As
mentioned above, you also need to add a terminal operator, otherwise
the Sequence will never do any processing (again, lazy!).
val people: List<Person> = getPeople()
val allowedEntrance = people.asSequence()
.filter { it.age >= 21 }
.map { it.name }
.take(5)
.toList()
In this case, the
Person instances in the Sequence are each checked for their age, if
they pass, they have their name extracted, and then added to the
result list. This is repeated for each person in the original list
until there are five people found. At this point, the toList function
returns a list, and the rest of the people in the Sequence are not
processed.
There’s also something extra a Sequence is capable of: it can contain
an infinite number of items. With this in perspective, it makes sense
that operators work the way they do - an operator on an infinite
sequence could never return if it did its work eagerly.
As an example, here’s a sequence that will generate as many powers of
2 as required by its terminal operator (ignoring the fact that this
would quickly overflow):
generateSequence(1) { n -> n * 2 }
.take(20)
.forEach(::println)
You can find more here.
Iterable is good enough for most use cases; the way iteration is performed on it works very well with caches because of spatial locality. The issue is that the whole collection must pass through the first intermediate operation before it moves to the second, and so on.
In a sequence, each item passes through the full pipeline before the next is handled.
The eager behavior of Iterable can be detrimental to the performance of your code, especially when iterating over a large data set. So if your terminal operation is very likely to terminate early, then a sequence should be the preferred choice, because you save by not performing unnecessary operations. For example:
sequence.filter { getFilterPredicate() }
.map { getTransformation() }
.first { getSelector() }
In the above case, if the first item satisfies the filter predicate and, after the map transformation, meets the selection criteria, then filter, map and first are each invoked only once.
In the case of an Iterable, the whole collection must first be filtered, then mapped, and only then does the first selection start.
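The claim can be checked with a small runnable sketch. The predicates, the Counts class and the two helper functions are made up for illustration; each lambda simply increments a counter for its stage.

```kotlin
// Hypothetical counters for how often each stage's lambda runs.
data class Counts(var filterCalls: Int = 0, var mapCalls: Int = 0, var firstCalls: Int = 0)

// Eager pipeline: every stage finishes over the whole list before the next starts.
fun eagerCounts(numbers: List<Int>): Counts {
    val counts = Counts()
    numbers
        .filter { counts.filterCalls++; it > 0 }
        .map { counts.mapCalls++; it * 2 }
        .first { counts.firstCalls++; it > 1 }
    return counts
}

// Lazy pipeline: each item passes through all stages before the next item is pulled.
fun lazyCounts(numbers: List<Int>): Counts {
    val counts = Counts()
    numbers.asSequence()
        .filter { counts.filterCalls++; it > 0 }
        .map { counts.mapCalls++; it * 2 }
        .first { counts.firstCalls++; it > 1 }
    return counts
}

fun main() {
    val numbers = listOf(1, 2, 3, 4)
    println(eagerCounts(numbers)) // Counts(filterCalls=4, mapCalls=4, firstCalls=1)
    println(lazyCounts(numbers))  // Counts(filterCalls=1, mapCalls=1, firstCalls=1)
}
```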