Functional way of checking if a list of list has duplicate elements - kotlin

I have a list of list of elements, and would like to check if there are any duplicates. I would also like to break early - I don't care what the duplicates are, nor if there are many of them, I just want to know if there is at least one.
An imperative way which fits the bill would be:
fun main() {
println(hasDuplicates(listOf(
listOf("1", "2", "3"),
listOf("4", "5"),
listOf("1", "2")
)))
}
fun hasDuplicates(input: List<List<String>>): Boolean {
val seen = mutableSetOf<String>()
input.forEach { inner ->
inner.forEach { element ->
if (!seen.add(element)) {
return true
}
}
}
return false
}
Another way, without explicit iteration, would be:
fun hasDuplicates(input: List<List<String>>): Boolean {
val flat = input.flatten()
return flat.size != flat.toSet().size
}
but this iterates the whole list, and even creates a flattened intermediary in the first step.
I have an idea, but don't know how to implement it: suppose I could map each (flattened) list element to the number of times it has already been seen. I have this so far:
fun hasDuplicates(input: List<List<String>>): Boolean {
return input.asSequence().flatten()
// .onEach {
// println("getting $it")
// }
.groupingBy { it }
.eachCount()
.any { (_, count) -> count > 1 }
}
It does what it should, but it first iterates the whole list (uncomment the onEach intermediary to see) to collect the groups. The idea would be to incrementally emit each element and its count, like this (for the input list ["1", "2", "1"]):
// (element, seenCount)
("1", 0)
("2", 0)
("1", 1)
at which point I could simply check for seenCount > 0 and return early.
Any help? Any other ideas are also welcome.
UPDATE: Got this, not really the initial idea, but seems to work:
fun hasDuplicates(input: List<List<String>>): Boolean {
input.asSequence().flatten()
.onEach {
println("getting $it")
}
.fold(mutableSetOf<String>()) { seen, element ->
if (!seen.add(element)) {
return true
}
seen
}
return false
}
The above code performs slightly worse than the very first version with loops in the worst case (no duplicates), and pretty much the same in the best case (second element is the duplicate) and in the 'medium' case (middle element of the flattened list is a duplicate).

The idea described in the question can be implemented the following way:
fun hasDuplicates(input: List<List<String>>): Boolean {
input.asSequence().flatten()
// .onEach {
// println("getting $it")
// }
.groupingBy { it }
.aggregate { _, _: Int?, _, first ->
if (first) {
1
} else {
return true
}
}
return false
}
The type of the accumulator (Int? above) doesn't matter as it is unused.
But, the following solution is even better for me, as it allows me to return the set in the case that all is unique, which I need later:
fun uniqueOrNull(input: List<List<String>>): Set<String>? {
return input.asSequence().flatten()
.fold(mutableSetOf()) { seen, element ->
if (!seen.add(element)) {
return null
}
seen
}
}
Using aggregate would also work, but it performs negligibly worse and is harder for the reader to follow:
fun uniqueOrNull(input: List<List<String>>): Set<String>? {
return input.asSequence().flatten()
.groupingBy { it }
.aggregate { _, _: Int?, _, first ->
if (first) {
1
} else {
return null
}
}.keys
}
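For reference, the incremental (element, seenCount) stream described in the question can also be sketched directly with a sequence builder. This is only a minimal sketch: withSeenCount and hasDuplicatesLazy are hypothetical names, not standard library functions, and the counts are assumed to fit in a plain HashMap.

```kotlin
// Hypothetical helper (not part of the standard library): lazily pairs each
// element with the number of times it has been seen before.
fun <T> Sequence<T>.withSeenCount(): Sequence<Pair<T, Int>> = sequence {
    val counts = HashMap<T, Int>()
    for (element in this@withSeenCount) {
        val seenBefore = counts[element] ?: 0
        yield(element to seenBefore)
        counts[element] = seenBefore + 1
    }
}

// Hypothetical name. any() is a terminal operation that stops pulling
// elements from the sequence as soon as the predicate matches.
fun hasDuplicatesLazy(input: List<List<String>>): Boolean =
    input.asSequence().flatten()
        .withSeenCount()
        .any { (_, seenCount) -> seenCount > 0 }
```

Because the sequence is lazy, nothing after the first repeated element is pulled from the input. The fold version above is still simpler if the count itself is not needed.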


In Kotlin, how can I test and use a value without computing it twice?

Every so often, I find myself wanting to compute a value for some sort of filter operation, but then wanting to use that value when it's already disappeared into the condition-checking thing.
For instance:
val found = list.firstOrNull { slowConversion(it).isWanted() }
if (found != null) {
something(found, slowConversion(found))
}
or
when {
other_conditions -> other_actions
list.any { it.contains(regex1) } -> something(list.firstOrNull { it.contains(regex1) } ?: "!!??")
}
For the slowConversion() I can work with a sequence mapped to pairs, although the terms first and second kinda confuse things a bit...
val pair = list.asSequence().map { it to slowConversion(it) }.firstOrNull { it.second.isWanted() }
if ( pair != null ) {
something(pair.first, pair.second)
}
or if I only want the conversion,
val converted = list.firstNotNullOfOrNull { slowConversion(it).takeIf { it.isWanted() } }
but the best I can come up with to avoid the when duplication involves moving the action part into the condition part!
fun case(s: List<String>, r: Regex): Boolean {
val match = s.firstOrNull { it.contains(r) }?.also { something(it) }
return match != null
}
when {
other_conditions -> other_actions
case(list, regex1) -> true
}
At this point, it seems I should just have a stack of function calls linked together with ||
other_things || case(list, regex1) || case(list, regex2) || catchAll(list)
Is there something better or more concise for either of these?
You can write your first example like this:
for(element in list) {
val result = slowConversion(element)
if(result.isWanted()) {
something(element, result)
break
}
}
This might not look very Kotlin-ish, but I think it's pretty straightforward & easy to understand.
For your second example, you can use the find function:
when {
other_conditions -> other_actions
else -> list.find { it.contains(regex1) }?.let(::something)
}
If you have multiple regexes, just iterate over them:
val regexes = listOf(regex1, regex2, ...)
for(regex in regexes) {
val element = list.find { it.contains(regex) } ?: continue
something(element)
break
}
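The same loop over several regexes can probably be written as a single expression with firstNotNullOfOrNull (available in the standard library since Kotlin 1.5). The sketch below is an assumption about what the asker wants; firstMatch is a hypothetical helper name.

```kotlin
// Hypothetical helper: tries the regexes in order and returns the first
// list element that contains a match for one of them, or null if none does.
fun firstMatch(list: List<String>, regexes: List<Regex>): String? =
    regexes.firstNotNullOfOrNull { regex ->
        list.find { it.contains(regex) }
    }
```

It could then be used as firstMatch(list, listOf(regex1, regex2))?.let(::something), so each search runs only once and later regexes are not tried after a hit.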

Find-first-and-transform for Sequence in Kotlin

I often stumble upon this problem but don't see a common implementation: how do I idiomatically (functionally) find an element, stop search after the match, and also return a different type (i.e. map whatever matched to another type)?
I've been able to do a workaround with
fun <F,T> Sequence<F>.mapFirst(block: (F) -> T?): T? =
fold(AtomicReference<T>()) { ref, from ->
if (ref.get() != null) return@fold ref
ref.set(block(from))
ref
}.get()
fun main() {
Files.list(someDir).asSequence().map { it.toFile() }.mapFirst { file ->
file.useLines { lines ->
lines.mapFirst { line ->
if (line == "123") line.toInt() else null
}
}
}?.let { num ->
println("num is $num") // will print 123 as an Int
} ?: println("not a single file had a line eq to '123'")
}
But that doesn't stop on the match (when block() returns non-null) and goes to consume all files and all their lines.
A simple for loop is enough to implement mapFirst:
fun <F,T> Sequence<F>.mapFirst(block: (F) -> T?): T? {
for (e in this) {
block(e)?.let { return it }
}
return null
}
If you need a solution without introducing your own extensions (though there's nothing wrong with it), you can use mapNotNull + firstOrNull combination:
files.asSequence()
.mapNotNull { /* read the first line and return not null if it's ok */ }
.firstOrNull()
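If Kotlin 1.5 or later is available, the standard library function firstNotNullOfOrNull appears to cover the same ground as the mapFirst extension. The sketch below is only an illustration: parseFirstNumber is a hypothetical name, and the counter is there just to show how many elements were inspected before the search stopped.

```kotlin
// Hypothetical demo: returns the first line that parses as an Int together
// with the number of lines the transform looked at before returning.
fun parseFirstNumber(lines: Sequence<String>): Pair<Int?, Int> {
    var inspected = 0
    val result = lines.firstNotNullOfOrNull { line ->
        inspected++
        line.toIntOrNull() // null means "keep searching"
    }
    return result to inspected
}
```

For sequenceOf("abc", "123", "456") only the first two elements are inspected, since the search ends at the first non-null result.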
I would not map values that you then discard; instead, do it like this:
sequenceOf(1, 2, 3)
.firstOrNull { it == 2 }
?.let { it * 2 } ?: 6
First you find the value that matches your condition, then you transform it to whatever you want. If no matching element is found, you assign a default value (in this case 6).

Find and return first match in nested lists in Kotlin?

Consider the following two classes:
class ObjectA(val objectBs: List<ObjectB>,
val otherFields: Any)
class ObjectB(val key: String,
val otherFields: Any)
The task is to find and return the first ObjectB with a certain key in a List of ObjectA.
Just achieving the goal is simple enough, but doing it nicely and efficiently seems rather tricky. I can't find anything like a "firstIn" or "findIn" function that would allow me to return another type than ObjectA when iterating on a list of ObjectA.
I have a few approaches, one of which looks pretty nice, but is very inefficient:
listOfA.mapNotNull {
it.objectBs.firstOrNull {
item -> item.key == wantedKey
}
}.firstOrNull()
The obvious inefficiency of this code is that it will not stop iterating through listOfA when it has found a match (and there can only be one match, just to be clear).
Approaches using filter or find have similar problems, requiring redundant iterations through at least one list of ObjectB.
Is there something in Kotlin's standard library that would cover such a use case?
If you want an elegant solution you can just do a flatMap like this:
val result: ObjectB? = listOfA.flatMap { it.objectBs }.firstOrNull { it.key == "myKey" }
If you want the efficiency you can do something like this:
val result: ObjectB? = objectAs.firstOrNull {
it.objectBs.map(ObjectB::key).contains("myKey")
}?.objectBs?.firstOrNull { it.key == "myKey" }
You can also wrap these in an Optional and put it in a function so the users of this operation can have a clean API:
fun List<ObjectA>.findFirstObjectB(key: String): Optional<ObjectB> {
return Optional.ofNullable(firstOrNull {
it.objectBs.map(ObjectB::key).contains(key)
}?.objectBs?.firstOrNull { it.key == key })
}
By converting all the nested elements to a flattened Sequence, they can be iterated lazily, and the overhead of unnecessary iteration is eliminated. This trick is done by combining asSequence and flatMap:
listOfA.asSequence().flatMap { it.objectBs.asSequence() }.find { it.key == wantedKey }
I wrote and ran the following code to ensure that it works as expected:
class PrintSequenceDelegate<out T>(private val wrappedSequence: Sequence<T>) : Sequence<T> by wrappedSequence {
override fun iterator(): Iterator<T> {
val wrappedIterator = wrappedSequence.iterator()
return object : Iterator<T> by wrappedIterator {
override fun next(): T =
wrappedIterator.next().also { println("Retrieving: $it") }
}
}
}
fun <T> Sequence<T>.toPrintDelegate() = PrintSequenceDelegate(this)
fun main() {
val listOfLists = List(3) { i -> List(3) { j -> "$i$j" } }
println("List of lists: $listOfLists")
val found = listOfLists.asSequence().toPrintDelegate().flatMap { it.asSequence().toPrintDelegate() }.find { it == "11" }
println(if (found != null) "Found: $found" else "Not found")
}
Output:
List of lists: [[00, 01, 02], [10, 11, 12], [20, 21, 22]]
Retrieving: [00, 01, 02]
Retrieving: 00
Retrieving: 01
Retrieving: 02
Retrieving: [10, 11, 12]
Retrieving: 10
Retrieving: 11
Found: 11
Thus we see that the elements (12) after the element found in the containing nested list are not iterated, neither are the following nested lists ([20, 21, 22]).
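On Kotlin 1.5 or later, a similar early exit can likely be had without sequences by using firstNotNullOfOrNull on the outer list. This is a minimal sketch: the classes are simplified stand-ins for those in the question, and findFirstB is a hypothetical name.

```kotlin
// Simplified stand-ins for the classes in the question (otherFields omitted).
class ObjectB(val key: String)
class ObjectA(val objectBs: List<ObjectB>)

// Hypothetical helper: the inner firstOrNull stops at the first matching
// ObjectB, and firstNotNullOfOrNull stops at the first ObjectA that had one.
fun findFirstB(listOfA: List<ObjectA>, wantedKey: String): ObjectB? =
    listOfA.firstNotNullOfOrNull { a ->
        a.objectBs.firstOrNull { it.key == wantedKey }
    }
```

No intermediate list is built, and the return type is ObjectB? rather than ObjectA, which is what the question asks for.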
Nothing fancy, but it does the job efficiently:
fun findBWithKey(listOfA: List<ObjectA>, wantedKey: String): ObjectB? {
listOfA.forEach {
it.objectBs.forEach { item ->
if(item.key == wantedKey){
return item
}
}
}
return null
}
I also like to use map and first, but doing the given task efficiently gets unnecessarily hard using those extension functions.
A simple flatMap does the trick:
listOfA.flatMap { it.objectBs }.first { it.key == wantedKey }
This will basically give you an intermediate List with all of them combined so that you can easily query the first matching one.
I would look into coroutines or sequences if performance is critical.
You can optimize your code slightly by using firstOrNull on listOfA as well:
listOfA.filterNotNull().firstOrNull { item ->
item.objectBs.firstOrNull { it.key == wantedKey } != null
}
I would do some performance testing to see if this code is causing any issues before making it overly complex.

How can I `return` from inside of a call to `use`?

In Kotlin, this code compiles:
private fun bar(): Boolean = TODO()
fun works(): Int {
while (true) {
if (bar()) {
return 5
}
}
}
(This is a pared down example of my real code to illustrate the issue I'm running into.)
I actually need to use a file during this loop, and close on exit:
fun openFile(): InputStream = TODO()
fun doesnt_work(): Int {
openFile().use { input ->
while (true) {
if (bar()) {
return 5
}
}
}
} // line 42
This doesn't compile. I get the error:
Error:(42, 5) Kotlin: A 'return' expression required in a function with a block body ('{...}')
I've found two ways to work around this, but both are kind of awkward.
One way is to use a variable to hold the result, and break from the loop right when it's set:
fun works_but_awkward(): Int {
openFile().use { input ->
val result: Int
while (true) {
if (bar()) {
result = 5
break
}
}
return result
}
}
This is especially awkward in my real code, as I have a nested loop, and so I need to use a labelled break.
The other way to work around this is to have a named function for the loop:
fun workaround_with_named_function(): Int {
fun loop(input: InputStream): Int {
while (true) {
if (bar()) {
return 5
}
}
}
return openFile().use { loop(it) }
}
This seems a bit better, but I'm still surprised that the use abstraction is so leaky that I can't do an early return from within a loop. Is there a way to use use with an early return in a loop that's less awkward?
This happens because the Kotlin compiler isn't smart enough to understand that the code inside use will return from the function. The underlying reason is that the compiler has no guarantee that the lambda will be called exactly once.
Another way to work around this is to throw an exception at the end of the function:
fun doesnt_work(): Int {
openFile().use { input ->
while (true) {
if (bar()) {
return 5
}
}
}
throw IllegalStateException("Something goes wrong")
}
P.S. I am not sure, but it seems this could compile without any hacks once the contract system is added to Kotlin, which will probably happen in version 1.3.
This should work.
fun openFile(): InputStream = TODO()
fun doesnt_work(): Int {
return openFile().use { input ->
while (true) {
if (bar()) {
return@use 5
}
}
-1 // unreachable return value
// just to help Kotlin infer the return type
}
}
Remember, use is a function whose return value is the same as the return value of the lambda. So returning the value (here it's 5) from the lambda and then returning the result of use should work.
Also, if I were you, I'd write the function like this:
fun doesnt_work() = openFile().use { input ->
while (true) if (bar()) return@use 5
-1
}
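A self-contained variant of the labeled-return approach might look like the sketch below. It assumes a ByteArrayInputStream as a stand-in for openFile(), and firstByteAbove is a hypothetical function; the loop has a real exit condition, so the fallback value after it is reachable.

```kotlin
import java.io.ByteArrayInputStream

// Hypothetical example: returns the first byte greater than threshold,
// or -1 if the stream ends first. The stream is closed by use either way.
fun firstByteAbove(bytes: ByteArray, threshold: Int): Int =
    ByteArrayInputStream(bytes).use { input ->
        var b = input.read()
        while (b != -1) {
            if (b > threshold) return@use b // early exit from the lambda only
            b = input.read()
        }
        -1 // value of the lambda when nothing matched
    }
```

The labeled return leaves only the lambda, so use still closes the stream and its result becomes the value of the function.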

RxJava Filter on Error

This question is loosely related to this question, but there were no answers. The answer from Bob Dalgleish is close, but doesn't support the potential error coming from a Single (which I think that OP actually wanted as well).
I'm basically looking for a way to "filter on error" - but don't think this exists when the lookup is RX based. I am trying to take a list of values, run them through a lookup, and skip any result that returns a lookup failure (throwable). I'm having trouble figuring out how to accomplish this in a reactive fashion.
I've tried various forms of error handling operators combined with mapping. Filter only works for raw values - or at least I couldn't figure out how to use it to support what I'd like to do.
In my use case, I iterate a list of IDs, requesting data for each from a remote service. If the service returns 404, then the item doesn't exist anymore. I should remove non-existing items from the local database and continue processing IDs. The stream should return the list of looked up values.
Here is a loose example. How do I write getStream() so that canFilterOnError passes?
import io.reactivex.Single
import io.reactivex.schedulers.Schedulers
import org.junit.Test
class SkipExceptionTest {
private val data: Map<Int, String> = mapOf(
Pair(1, "one"),
Pair(2, "two"),
Pair(4, "four"),
Pair(5, "five")
)
@Test
fun canFilterOnError() {
getStream(listOf(1, 2, 3, 4, 5))
.subscribeOn(Schedulers.trampoline())
.observeOn(Schedulers.trampoline())
.test()
.assertComplete()
.assertNoErrors()
.assertValueCount(1)
.assertValue {
it == listOf(
"one", "two", "four", "five"
)
}
}
fun getStream(list: List<Int>): Single<List<String>> {
// for each item in the list
// get its value via getValue()
// if a call to getValue() results in a NotFoundException, skip that value and continue
// mutate the results using mutate()
TODO("not implemented")
}
fun getValue(id: Int): Single<String> {
return Single.fromCallable {
val value: String? = data[id]
if (value != null) {
value
} else {
throw NotFoundException("data with id $id does not exist")
}
}
}
class NotFoundException(message: String) : Exception(message)
}
First .materialize(), then .filter() on non-error events, then .dematerialize():
getStream(/* ... */)
.materialize()
.filter(notification -> { return !notification.isOnError(); })
.dematerialize()
I ended up mapping getValue() to Optional<String>, then calling onErrorResumeNext() on that and either returning Single.error() or Single.just(Optional.empty()). From there, the main stream could filter out the empty Optional.
private fun getStream(list: List<Int>): Single<List<String>> {
return Observable.fromIterable(list)
.flatMapSingle {
getValue(it)
.map {
Optional.of(it)
}
.onErrorResumeNext {
when (it) {
is NotFoundException -> Single.just(Optional.empty())
else -> Single.error(it)
}
}
}
.filter { it.isPresent }
.map { it.get() }
.toList()
}