How to yield all substrings from string using sequence? - kotlin

I'm trying to learn the Sequence in Kotlin.
Assume I want to get a sequence of all substrings of a string with the yield statement. I understand how to do this with two nested loops with the right and left borders.
It seems to me that there is an efficient way to use a Sequence or a pair of nested Sequences instead of loops. But I can't figure out how to do it.
How to yield all substrings from string using sequence?
Thanks

Frankly, I don't know which method is the most efficient, and I would just use for loops. But here's my solution to this problem; maybe it will help you understand sequences and this style of writing code:
fun String.substrings() =
    indices.asSequence().flatMap { left ->
        (left + 1..length).asSequence().map { right -> substring(left, right) }
    }

Sequences aren't especially efficient, there's a bunch of overhead involved for each one - their main strength is being able to pass each element through the whole chain of operations one at a time.
This means you don't have to create an entire new collection of elements for each intermediate step (lower memory usage), you can terminate earlier once you find a result you're looking for, and sequences can be infinite. Even then, they might still be slower than the normal list version, depending on exactly what you're working with.
The most efficient sequence is probably what you're doing, using a couple of for loops and yielding items. But if you mean "efficient" as in "using the standard library instead of writing out for loops", then @Furetur's answer is one way to do it, or you could use sliding windows like this:
val stuff = "12345"
val substrings = with(stuff) {
    indices.asSequence().flatMap { i ->
        windowedSequence(length - i)
    }
}
print(substrings.toList())
>>>>[12345, 1234, 2345, 123, 234, 345, 12, 23, 34, 45, 1, 2, 3, 4, 5]
This basically just uses windowed (with the default of partialWindows=false) for every possible substring length, from length down to 1, using the sequence versions of everything.
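For comparison, the loop-and-yield approach mentioned in the question and in the answer above might look like the sketch below. The name allSubstrings is only illustrative; the sequence { } builder and yield() come from the standard library.

```kotlin
// Lazily yields every non-empty substring, ordered by left border, then right border.
fun String.allSubstrings(): Sequence<String> = sequence {
    for (left in indices) {
        for (right in left + 1..length) {
            // substring's end index is exclusive, hence left + 1..length
            yield(substring(left, right))
        }
    }
}

fun main() {
    println("abc".allSubstrings().toList()) // [a, ab, abc, b, bc, c]
}
```

Nothing is computed until the sequence is consumed, so something like "abc".allSubstrings().first { it.length == 2 } stops as soon as a match is found.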

Related

How to chain filter expressions together

I have data in the following format
ArrayList<Map.Entry<String,ByteString>>
[
{"a":[a-bytestring]},
{"b":[b-bytestring]},
{"a:model":[amodel-bytestring]},
{"b:model":[bmodel-bytestring]},
]
I am looking for a clean way to transform this data into the format (List<Map.Entry<ByteString,ByteString>>) where the key is the value of a and value is the value of a:model.
Desired output
List<Map.Entry<ByteString,ByteString>>
[
{[a-bytestring]:[amodel-bytestring]},
{[b-bytestring]:[bmodel-bytestring]}
]
I assume this will involve the use of filters or other map operations, but I am not familiar enough with Kotlin yet to know how.
It's not possible to give an exact, tested answer without access to the ByteString class — but I don't think that's needed for an outline, as we don't need to manipulate byte strings, just pass them around. So here I'm going to substitute Int; it should be clear and avoid any dependencies, but still work in the same way.
I'm also going to use a more obvious input structure, which is simply a map:
val input = mapOf("a" to 1,
                  "b" to 2,
                  "a:model" to 11,
                  "b:model" to 12)
As I understand it, what we want is to link each key without :model with the corresponding one with :model, and return a map of their corresponding values.
That can be done like this:
val output = input.filterKeys { !it.endsWith(":model") }
                  .map { it.value to input["${it.key}:model"] }.toMap()
println(output) // Prints {1=11, 2=12}
The first line filters out all the entries whose keys end with :model, leaving only those without. Then the second creates a map from their values to the input values for the corresponding :model keys. (Unfortunately, there's no good general way to create one map directly from another; here map() creates a list of pairs, and then toMap() creates a map from that.)
I think if you replace Int with ByteString (or indeed any other type!), it should do what you ask.
The only thing to be aware of is that the output is a Map<Int, Int?> — i.e. the values are nullable. That's because there's no guarantee that each input key has a corresponding :model key; if it doesn't, the result will have a null value. If you want to omit those, you could call filterValues{ it != null } on the result.
However, if there's an ‘orphan’ :model key in the input, it will be ignored.
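If entries without a matching :model key should be dropped, one possible variation is to use mapNotNull, which also makes the result type non-nullable. This is only a sketch that keeps the Int stand-in for ByteString; the function name linkModels and the sample data are made up for illustration.

```kotlin
// Pairs each plain key's value with the value of its ":model" counterpart,
// skipping plain keys that have no counterpart.
fun linkModels(input: Map<String, Int>): Map<Int, Int> =
    input.filterKeys { !it.endsWith(":model") }
        .mapNotNull { (key, value) -> input["$key:model"]?.let { model -> value to model } }
        .toMap()

fun main() {
    // "b" has no "b:model" entry here, so it is omitted from the result.
    val input = mapOf("a" to 1, "b" to 2, "a:model" to 11)
    println(linkModels(input)) // {1=11}
}
```

Unlike filterValues { it != null }, which leaves the static type as Map<Int, Int?>, this version returns a Map<Int, Int>.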

Creating 4 digit number with no repeating elements in Kotlin

Thanks to @RedBassett for this resource (Kotlin problem solving): https://kotlinlang.org/docs/tutorials/koans.html
I'm aware this question exists here:
Creating a 4 digit Random Number using java with no repetition in digits
but I'm new to Kotlin and would like to explore the direct Kotlin features.
So as the title suggests, I'm trying to find a Kotlin-specific way to nicely generate a 4-digit number without repeating digits (after that it's easy to adapt it for length x).
This is my current working solution, and I would like to make it more idiomatic Kotlin. I would be very grateful for some input.
fun createFourDigitNumber(): Int {
    var fourDigitNumber = ""
    val rangeList = { (0..9).random() }
    while (fourDigitNumber.length < 4) {
        val num = rangeList().toString()
        if (!fourDigitNumber.contains(num)) fourDigitNumber += num
    }
    return fourDigitNumber.toInt()
}
So the range you define (0..9) is actually already a sequence of numbers. Instead of iterating and repeatedly generating a new random, you can just use a subset of that sequence. In fact, this is the accepted answer's solution to the question you linked. Here are some pointers if you want to implement it yourself to get the practice:
The first for loop in that solution is unnecessary in Kotlin because of the range. 0..9 does the same thing, you're on the right track there.
In Kotlin you can call .shuffled() directly on the range without needing to call Collections.shuffle() with an argument like they do.
You can avoid another loop if you create a string from the whole range and then return a substring.
If you want to look at my solution (with input from others in the comments), it is in a spoiler here:
fun getUniqueNumber(length: Int) = (0..9).shuffled().take(length).joinToString("")
(Note that this doesn't gracefully handle a length above 10, but that's up to you to figure out how to implement. It is up to you to use subList() and then toString(), or toString() and then substring(), the output should be the same.)
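One possible way to handle the length limit noted above is to reject out-of-range lengths up front. The sketch below is only one option; the name uniqueDigits and the choice to throw via require are assumptions, not part of the original answer.

```kotlin
// Returns a string of `length` distinct decimal digits in random order.
// Only 10 distinct digits exist, so longer requests are rejected.
fun uniqueDigits(length: Int): String {
    require(length in 1..10) { "length must be between 1 and 10, was $length" }
    return (0..9).shuffled().take(length).joinToString("")
}

fun main() {
    println(uniqueDigits(4)) // e.g. a string such as 0837; the output is random
}
```

The result is kept as a String because it can start with 0; converting it with toInt() would drop that leading zero and yield a shorter number.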

Large list literals in Kotlin stalling/crashing compiler

I'm using val globalList = listOf("a1" to "b1", "a2" to "b2") to create a large list of Pairs of strings.
All is fine until you try to put more than 1000 Pairs into a List. The compiler either takes > 5 minutes or just crashes (Both in IntelliJ and Android Studio).
Same happens if you use simple lists of Strings instead of Pairs.
Is there a better way / best practice to include large lists in your source code without resorting to a database?
You can replace a listOf(...) expression with a list created using a constructor or a factory function and adding the items to it:
val globalList: List<Pair<String, String>> = mutableListOf<Pair<String, String>>().apply {
    add("a1" to "b1")
    add("a2" to "b2")
    // ...
}
This is definitely a simpler construct for the compiler to analyze.
If you need something quick and dirty instead of data files, one workaround is to use a large string, then split and map it into a list. Here's an example mapping into a list of Ints.
val onCommaWhitespace = "[\\s,]+".toRegex() // in this example split on commas w/ whitespace
val bigListOfNumbers: List<Int> = """
    0, 1, 2, 3, 4,
    :
    :
    :
    8187, 8188, 8189, 8190, 8191
""".trimIndent()
    .split(onCommaWhitespace)
    .map { it.toInt() }
Of course for splitting into a list of Strings, you'd have to choose an appropriate delimiter and regex that don't interfere with the actual data set.
There's no good way to do what you want; for something that size, reading the values from a data file (or calculating them, if that were possible) is a far better solution all round — more maintainable, much faster to compile and run, easier to read and edit, less likely to cause trouble with build tools and frameworks…
If you let the compiler finish, its output will tell you the problem.  (‘Always read the error messages’ should be one of the cardinal rules of development!)
I tried hotkey's version using apply(), and it eventually gave this error:
…
Caused by: org.jetbrains.org.objectweb.asm.MethodTooLargeException: Method too large: TestKt.main ()V
…
There's the problem: MethodTooLargeException.  The JVM allows only 65535 bytes of bytecode within a single method; see this answer.  That's the limit you're coming up against here: once you have too many entries, its code would exceed that limit, and so it can't be compiled.
If you were a real masochist, you could probably work around this to an extent by splitting the initialisation across many methods, keeping each one's code just under the limit.  But please don't!  For the sake of your colleagues, for the sake of your compiler, and for the sake of your own mental health…
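As a rough sketch of the data-file approach recommended above, the pairs could be kept in a plain text file with one comma-separated pair per line and loaded at startup. The file format, the name loadPairs, and the temporary file used for the demo are all assumptions made for illustration.

```kotlin
import java.io.File

// Reads one "first,second" pair per line, skipping blank lines.
// Assumes every non-blank line contains at least one comma.
fun loadPairs(file: File): List<Pair<String, String>> =
    file.useLines { lines ->
        lines.filter { it.isNotBlank() }
            .map { line ->
                val (first, second) = line.split(',', limit = 2)
                first.trim() to second.trim()
            }
            .toList()
    }

fun main() {
    // Demo only: write a small temporary file, then read it back.
    val file = File.createTempFile("pairs", ".csv").apply {
        writeText("a1,b1\na2,b2\n")
        deleteOnExit()
    }
    println(loadPairs(file)) // [(a1, b1), (a2, b2)]
}
```

Because the data is read at run time, the size of the list no longer affects compile time or the per-method bytecode limit.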

Is there a way to merge 2 arrays in GREL

In a GREL expression, is there a way to merge 2 arrays?
I tried ["a","b"]+["c","d"] but the result is a Java error.
Short answer: Not with GREL.
Here is the complete list of the "arrays" methods in GREL and their respective Java code. It should not be very difficult to add a "merge" or "append" method, but would it be worth it? It is very rare to have more than one array in a cell (I have never encountered this case).
It is precisely to solve this kind of rare but possible case that OpenRefine offers two other, more powerful scripting languages, Jython and Clojure. In Python/Jython, the operation you want to do is as simple as:
return [1,2,3] + [3,4,5] # result: [1, 2, 3, 3, 4, 5]
Would it be possible, and worth the effort, to make this easier with a new GREL function?
There is a way to do it (though it might be a bad idea):
split(join(["a","b"], "|") + "|" + join(["c","d"], "|"), "|")
Join each array with a delimiter character that does not appear in the data. (I've chosen the pipe character.) Add the resulting joined-up arrays together, and add the delimiter between them. Now, they form the string a|b|c|d. This string can be split on the | delimiter into a new array.

Using a hash with object keys in Perl 6

I'm trying to make a Hash with non-string keys, in my case arrays or lists.
> my %sum := :{(1, 3, 5) => 9, (2, 4, 6) => 12}
{(1 3 5) => 9, (2 4 6) => 12}
Now, I don't understand the following.
How to retrieve an existing element?
> %sum{(1, 3, 5)}
((Any) (Any) (Any))
> %sum{1, 3, 5}
((Any) (Any) (Any))
How to add a new element?
> %sum{2, 4} = 6
(6 (Any))
Several things are going on here: first of all, if you use (1,2,3) as a key, Rakudo Perl 6 will consider this to be a slice of 3 keys: 1, 2 and 3. Since none of these exist in the object hash, you get ((Any) (Any) (Any)).
So you need to indicate that you want the list to be seen as single key of which you want the value. You can do this with $(), so %sum{$(1,3,5)}. This however does not give you the intended result. The reason behind that is the following:
> say (1,2,3).WHICH eq (1,2,3).WHICH
False
Object hashes internally key the object to its .WHICH value. At the moment, Lists are not considered value types, so each List has a different .WHICH. This makes them unfit to be used as keys in object hashes, or in other cases where .WHICH is used by default (e.g. .unique and Sets, Bags and Mixes).
I'm actually working on making the above eq return True before long: this should make it to the 2018.01 compiler release, on which a Rakudo Star release will also be based.
BTW, any time you're using object hashes and integer values, you will probably be better off using Bags. Alas, not yet in this case either, for the above reason.
You could actually make this work by using augment class List and adding a .WHICH method on that, but I would recommend against that as it will interfere with any future fixes.
Elizabeth's answer is solid, but until that feature is created, I don't see why you can't create a Key class to use as the hash key, which will have an explicit hash function which is based on its values rather than its location in memory. This hash function, used for both placement in the list and equality testing, is .WHICH. This function must return an ObjAt object, which is basically just a string.
class Key does Positional {
    has Int @.list handles <elems AT-POS EXISTS-POS ASSIGN-POS BIND-POS push>;
    method new(*@list) { self.bless(:@list); }
    method WHICH() { ObjAt.new(@!list.join('|')); }
}
my %hsh{Key};
%hsh{Key.new(1, 3)} = 'result';
say %hsh{Key.new(1, 3)}; # output: result
Note that I only allowed the key to contain Int. This is an easy way of being fairly confident that no element's string value contains the '|' character, which could make two keys look the same despite having different elements. However, this is not hardened against naughty users: 4 but role :: { method Str() { '|' } } is an Int that stringifies to the illegal value. You can make the code stronger if you use .WHICH recursively, but I'll leave that as an exercise.
This Key class is also a little fancier than you strictly need. It would be enough to have a @.list member and define .WHICH. I defined AT-POS and friends so the Key can be indexed, pushed to, and otherwise treated as an Array.