what is the kotlin idiom for an equivalent to this python iterator - iterator

The question is, how to create a python like iterator in Kotlin.
Consider this python code that parses string into substrings:
def parse(strng, idx=1):
lst = []
for i, c in itermarks(strng, idx):
if c == '}':
lst.append(strng[idx:i-1])
break
elif c == '{':
sublst, idx = parse(strng, i+1)
lst.append(sublst)
else:
lst.append(strng[idx:i-1])
idx = i+1
return lst, i
>>>res,resl = parse('{ a=50 , b=75 , { e=70, f=80 } }')
>>>print(resl)
>>>[' a=50', ' b=75', [' e=7', ' f=80'], '', ' f=80']
This is a play example just to illustrate a python iterator:
def findany(strng, idx, chars):
""" to emulate 'findany' in kotlin """
while idx < len(strng) and strng[idx] not in chars:
idx += 1
return idx
def itermarks(strng, idx=0):
while True:
idx = findany(strng, idx, ',{}"')
if idx >= len(strng):
break
yield idx, strng[idx]
if strng[idx] == '}':
break
idx += 1
Kotlin has iterators and generators, and as I understand it there can only be one per type. My idea is to define a type with a generator and instance that type. So the for loop from 'parse' (above) would look like this :
for((i,c) in IterMarks(strng){
......
}
But how do i define the generator, and what is the best idiom.

Kotlin uses two interfaces: Iterable<T> (from JDK) and Sequence<T>. They are identical, with the exception that the first one is eager, while the second one is lazy by convention.
To work with iterators or sequences all you have to do is implement one of those interfaces. Kotlin stdlib has a bunch of helper functions that may help.
In particular, a couple of functions for creating sequences with yield was added in 1.1. They and some other functions are called generators. Use them if you like them, or implement the interfaces manually.

OK, after some work, here is the Kotlin for the iterator:
import kotlin.coroutines.experimental.*
fun iterMarks(strng: String, idx:Int=0)=buildSequence{
val specials = listOf("\"", "{", "}", ",")
var found:Pair<Int,String>?
var index = idx
while (true){
found = strng.findAnyOf(specials, index)
if (found == null) break
yield (found)
index= found.first + 1
}
}
The main discoveries were that an iterator can be returned by any function, so the there is no need to add the iterator methods to an existing object. The JetBrains doco is solid, but lacks examples so hopefully the above example helps. You can also work from the basics, and again the notes are good but lack examples. I will post more on other approaches if there is interest.
The this code for 'parse' then works:
fun parse(strng:String, idxIn:Int=1): Pair<Any,Int> {
var lst:MutableList<Any> = mutableListOf()
var idx = idxIn
loop# for (mark in iterMarks(strng, idx)){
if(mark==null ||mark.first <= idx){
// nothing needed
}
else
{
when( mark.second ) {
"}" -> {
lst.add(strng.slice(idx..mark.first - 1))
idx = mark.first + 1
break#loop
}
"{" -> {
val res: Pair<Any, Int>
res = parse(strng, mark.first + 1)
lst.add(res.first)
idx = res.second
}
"," -> {
lst.add(strng.slice(idx..mark.first - 1))
idx = mark.first + 1
}
}
}
}
return Pair(lst, idx)
}
Hopefully this example will make it less work for the next person new to Kotlin, by providing an example of implementing an iterator. Specifically if you know how to make an iterator in python then this example should be useful

Related

Slice() nested for loop values i and j Kotlin

I'm wanting to slice a range which I can do in Javascfript but am struggling in kotlin.
my current code is:
internal class blah {
fun longestPalindrome(s: String): String {
var longestP = ""
for (i in 0..s.length) {
for (j in 1..s.length) {
var subS = s.slice(i, j)
if (subS === subS.split("").reversed().joinToString("") && subS.length > longestP.length) {
longestP = subS
}
}
}
return longestP
}
and the error I get is:
Type mismatch.
Required:
IntRange
Found:
Int
Is there a way around this keeping most of the code I have?
As the error message says, slice wants an IntRange, not two Ints. So, pass it a range:
var subS = s.slice(i..j)
By the way, there are some bugs in your code:
You need to iterate up to the length minus 1 since the range starts at 0. But the easier way is to grab the indices range directly: for (i in s.indices)
I assume j should be i or bigger, not 1 or bigger, or you'll be checking some inverted Strings redundantly. It should look like for (j in i until s.length).
You need to use == instead of ===. The second operator is for referential equality, which will always be false for two computed Strings, even if they are identical.
I know this is probably just practice, but even with the above fixes, this code will fail if the String contains any multi-code-unit code points or any grapheme clusters. The proper way to do this would be by turning the String into a list of grapheme clusters and then performing the algorithm, but this is fairly complicated and should probably rely on some String processing code library.
class Solution {
fun longestPalindrome(s: String): String {
var longestPal = ""
for (i in 0 until s.length) {
for (j in i + 1..s.length) {
val substring = s.substring(i, j)
if (substring == substring.reversed() && substring.length > longestPal.length) {
longestPal = substring
}
}
}
return longestPal
}
}
This code is now functioning but unfortunately is not optimized enough to get through all test cases.

How I can return IntArray with reduce or fold?

I have the following data for my task:
Input: nums = [1,2,3,4]
Output: [1,3,6,10]
Explanation: Running sum is obtained as follows: [1, 1+2, 1+2+3, 1+2+3+4].
As u see, I need to return IntArray, the first thing I used was runningReduce() , but this function is used in the version of Kotlin 1.4.30.
fun runningSum(nums: IntArray): IntArray {
return nums.runningReduce { sum, element -> sum + element }.toIntArray()
}
Yes, this solution works, but how can I solve the same problem using reduce() or fold()?
Try the following:
nums.fold(listOf(0)) { acc, i -> acc + (acc.last() + i) }.drop(1).toIntArray()
The solution is sub-optimal though: it copies the list in each iteration. But looks fancy.
To avoid copying, you could write it as follows:
nums.fold(mutableListOf(0)) { acc, i -> acc += (acc.last() + i); acc }.drop(1).toIntArray()
I think I prefer the first version, the second one is not pure from functional perspective.
Mafor is right, fold() is not a good choice when producing a collection because it copies the collection every time so you need to workaround with mutable collections, which defeats the point of the functional style.
If you really want to work with arrays, which are mutable, doing it the old fashioned procedural way may be best:
val array = intArrayOf(1, 2, 3, 4)
for (i in array.indices.drop(1)) {
array[i] += array[i - 1]
}
println(array.joinToString(", "))
Here's a slightly modified version of Mafor's answer that gives you an IntArray and avoids the use of multiple statements - it still uses the mutable list though:
val input = intArrayOf(1, 2, 3, 4)
val output = input.fold(mutableListOf(0)) { acc, cur ->
acc.apply { add(last() + cur) }
}.drop(1).toIntArray()
println(output.joinToString(", "))
Both of these print 1, 3, 6, 10.

Return double index of collection's element while iterating

In Kotlin documentation I found the following example:
for ((index, value) in array.withIndex()) {
println("the element at $index is $value")
}
Is it possible (and how) to do the similar with 2D matrix:
for ((i, j, value) in matrix2D.withIndex()) {
// but iterate iver double index: i - row, j - column
if (otherMatrix2D[i, j] > value) doSomething()
}
How to make support this functionality in Kotlin class?
While the solutions proposed by miensol and hotkey are correct it would be the least efficient way to iterate a matrix. For instance, the solution of hotkey makes M * N allocations of Cell<T> plus M allocations of List<Cell<T>> and IntRange plus one allocation of List<List<Cell<T>>> and IntRange. Moreover lists resize when new cells are added so even more allocations happen. That's too much allocations for just iterating a matrix.
Iteration using an inline function
I would recommend you to implement a very similar and very effective at the same time extension function that will be similar to Array<T>.forEachIndexed. This solution doesn't do any allocations at all and as efficient as writing nested for cycles.
inline fun <T> Matrix<T>.forEachIndexed(callback: (Int, Int, T) -> Unit) {
for (i in 0..cols - 1) {
for (j in 0..rows - 1) {
callback(i, j, this[i, j])
}
}
}
You can call this function in the following way:
matrix.forEachIndexed { i, j, value ->
if (otherMatrix[i, j] > value) doSomething()
}
Iteration using a destructive declaration
If you want to use a traditional for-loop with destructive declaration for some reason there exist a way more efficient but hacky solution. It uses a sequence instead of allocating multiple lists and creates only a single instance of Cell, but the Cell itself is mutable.
data class Cell<T>(var i: Int, var j: Int, var value: T)
fun <T> Matrix<T>.withIndex(): Sequence<Cell<T>> {
val cell = Cell(0, 0, this[0, 0])
return generateSequence(cell) { cell ->
cell.j += 1
if (cell.j >= rows) {
cell.j = 0
cell.i += 1
if (cell.i >= cols) {
return#generateSequence null
}
}
cell.value = this[cell.i, cell.j]
cell
}
}
And you can use this function to iterate a matrix in a for-loop:
for ((i, j, item) in matrix.withIndex()) {
if (otherMatrix[i, j] > value) doSomething()
}
This solution is lightly less efficient than the first one and not so robust because of a mutable Cell, so I would really recommend you to use the first one.
These two language features are used for implementing the behaviour that you want:
For-loops can be used with any class that has a method that provides an iterator.
for (item in myItems) { ... }
This code will compile if myItems has function iterator() returning something with functions hasNext(): Boolean and next().
Usually it is an Iterable<SomeType> implementation (some collection), but you can add iterator() method to an existing class as an extension, and you will be able to use that class in for-loops as well.
For destructuring declaration, the item type should have componentN() functions.
val (x, y, z) = item
Here the compiler expects item to have component1(), component2() and component3() functions. You can also use data classes, they have these functions generated.
Destructuring in for-loop works in a similar way: the type that the iterator's next() returns must have componentN() functions.
Example implementation (not pretending to be best at performance, see below):
Class with destructuring support:
class Cell<T>(val i: Int, val j: Int, val item: T) {
operator fun component1() = i
operator fun component2() = j
operator fun component3() = item
}
Or using data class:
data class Cell<T>(val i: Int, val j: Int, val item: T)
Function that returns List<Cell<T>> (written as an extension, but can also be a member function):
fun <T> Matrix<T>.withIndex() =
(0 .. height - 1).flatMap { i ->
(0 .. width - 1). map { j ->
Cell(i, j, this[i, j])
}
}
The usage:
for ((i, j, item) in matrix2d.withIndex()) { ... }
UPD Solution offered by Michael actually performs better (run this test, the difference is about 2x to 3x), so it's more suitable for performance critical code.
The following method:
data class Matrix2DValue<T>(val x: Int, val y: Int, val value: T)
fun withIndex(): Iterable<Matrix2DValue<T>> {
//build the list of values
}
Would allow you to write for as:
for ((x, y, value) in matrix2d.withIndex()) {
println("value: $value, x: $x, y: $y")
}
Bear in mind though that the order in which you declare data class properties defines the values of (x, y, value) - as opposed to for variable names. You can find more information about destructuring in the Kotlin documentation.

Should there be an indicesWhere method on Scala's List class?

Scala's List classes have indexWhere methods, which return a single index for a List element which matches the supplied predicate (or -1 if none exists).
I recently found myself wanting to gather all indices in a List which matched a given predicate, and found myself writing an expression like:
list.zipWithIndex.filter({case (elem, _) => p(elem)}).map({case (_, index) => index})
where p here is some predicate function for selecting matching elements. This seems a bit of an unwieldy expression for such a simple requirement (but I may be missing a trick or two).
I was half expecting to find an indicesWhere function on List which would allow me to write instead:
list.indicesWhere(p)
Should something like this be part of the Scala's List API, or is there a much simpler expression than what I've shown above for doing the same thing?
Well, here's a shorter expression that removes some of the syntactic noise you have in yours (modified to use Travis's suggestion):
list.zipWithIndex.collect { case (x, i) if p(x) => i }
Or alternatively:
for ((x,i) <- list.zipWithIndex if p(x)) yield i
But if you use this frequently, you should just add it as an implicit method:
class EnrichedWithIndicesWhere[T, CC[X] <: Seq[X]](xs: CC[T]) {
def indicesWhere(p: T => Boolean)(implicit bf: CanBuildFrom[CC[T], Int, CC[Int]]): CC[Int] = {
val b = bf()
for ((x, i) <- xs.zipWithIndex if p(x)) b += i
b.result
}
}
implicit def enrichWithIndicesWhere[T, CC[X] <: Seq[X]](xs: CC[T]) = new EnrichedWithIndicesWhere(xs)
val list = List(1, 2, 3, 4, 5)
def p(i: Int) = i % 2 == 1
list.indicesWhere(p) // List(0, 2, 4)
You could use unzip to replace the map:
list.zipWithIndex.filter({case (elem, _) => p(elem)}).unzip._2

How do I write a generic memoize function?

I'm writing a function to find triangle numbers and the natural way to write it is recursively:
function triangle (x)
if x == 0 then return 0 end
return x+triangle(x-1)
end
But attempting to calculate the first 100,000 triangle numbers fails with a stack overflow after a while. This is an ideal function to memoize, but I want a solution that will memoize any function I pass to it.
Mathematica has a particularly slick way to do memoization, relying on the fact that hashes and function calls use the same syntax:
triangle[0] = 0;
triangle[x_] := triangle[x] = x + triangle[x-1]
That's it. It works because the rules for pattern-matching function calls are such that it always uses a more specific definition before a more general definition.
Of course, as has been pointed out, this example has a closed-form solution: triangle[x_] := x*(x+1)/2. Fibonacci numbers are the classic example of how adding memoization gives a drastic speedup:
fib[0] = 1;
fib[1] = 1;
fib[n_] := fib[n] = fib[n-1] + fib[n-2]
Although that too has a closed-form equivalent, albeit messier: http://mathworld.wolfram.com/FibonacciNumber.html
I disagree with the person who suggested this was inappropriate for memoization because you could "just use a loop". The point of memoization is that any repeat function calls are O(1) time. That's a lot better than O(n). In fact, you could even concoct a scenario where the memoized implementation has better performance than the closed-form implementation!
You're also asking the wrong question for your original problem ;)
This is a better way for that case:
triangle(n) = n * (n - 1) / 2
Furthermore, supposing the formula didn't have such a neat solution, memoisation would still be a poor approach here. You'd be better off just writing a simple loop in this case. See this answer for a fuller discussion.
I bet something like this should work with variable argument lists in Lua:
local function varg_tostring(...)
local s = select(1, ...)
for n = 2, select('#', ...) do
s = s..","..select(n,...)
end
return s
end
local function memoize(f)
local cache = {}
return function (...)
local al = varg_tostring(...)
if cache[al] then
return cache[al]
else
local y = f(...)
cache[al] = y
return y
end
end
end
You could probably also do something clever with a metatables with __tostring so that the argument list could just be converted with a tostring(). Oh the possibilities.
In C# 3.0 - for recursive functions, you can do something like:
public static class Helpers
{
public static Func<A, R> Memoize<A, R>(this Func<A, Func<A,R>, R> f)
{
var map = new Dictionary<A, R>();
Func<A, R> self = null;
self = (a) =>
{
R value;
if (map.TryGetValue(a, out value))
return value;
value = f(a, self);
map.Add(a, value);
return value;
};
return self;
}
}
Then you can create a memoized Fibonacci function like this:
var memoized_fib = Helpers.Memoize<int, int>((n,fib) => n > 1 ? fib(n - 1) + fib(n - 2) : n);
Console.WriteLine(memoized_fib(40));
In Scala (untested):
def memoize[A, B](f: (A)=>B) = {
var cache = Map[A, B]()
{ x: A =>
if (cache contains x) cache(x) else {
val back = f(x)
cache += (x -> back)
back
}
}
}
Note that this only works for functions of arity 1, but with currying you could make it work. The more subtle problem is that memoize(f) != memoize(f) for any function f. One very sneaky way to fix this would be something like the following:
val correctMem = memoize(memoize _)
I don't think that this will compile, but it does illustrate the idea.
Update: Commenters have pointed out that memoization is a good way to optimize recursion. Admittedly, I hadn't considered this before, since I generally work in a language (C#) where generalized memoization isn't so trivial to build. Take the post below with that grain of salt in mind.
I think Luke likely has the most appropriate solution to this problem, but memoization is not generally the solution to any issue of stack overflow.
Stack overflow usually is caused by recursion going deeper than the platform can handle. Languages sometimes support "tail recursion", which re-uses the context of the current call, rather than creating a new context for the recursive call. But a lot of mainstream languages/platforms don't support this. C# has no inherent support for tail-recursion, for example. The 64-bit version of the .NET JITter can apply it as an optimization at the IL level, which is all but useless if you need to support 32-bit platforms.
If your language doesn't support tail recursion, your best option for avoiding stack overflows is either to convert to an explicit loop (much less elegant, but sometimes necessary), or find a non-iterative algorithm such as Luke provided for this problem.
function memoize (f)
local cache = {}
return function (x)
if cache[x] then
return cache[x]
else
local y = f(x)
cache[x] = y
return y
end
end
end
triangle = memoize(triangle);
Note that to avoid a stack overflow, triangle would still need to be seeded.
Here's something that works without converting the arguments to strings.
The only caveat is that it can't handle a nil argument. But the accepted solution can't distinguish the value nil from the string "nil", so that's probably OK.
local function m(f)
local t = { }
local function mf(x, ...) -- memoized f
assert(x ~= nil, 'nil passed to memoized function')
if select('#', ...) > 0 then
t[x] = t[x] or m(function(...) return f(x, ...) end)
return t[x](...)
else
t[x] = t[x] or f(x)
assert(t[x] ~= nil, 'memoized function returns nil')
return t[x]
end
end
return mf
end
I've been inspired by this question to implement (yet another) flexible memoize function in Lua.
https://github.com/kikito/memoize.lua
Main advantages:
Accepts a variable number of arguments
Doesn't use tostring; instead, it organizes the cache in a tree structure, using the parameters to traverse it.
Works just fine with functions that return multiple values.
Pasting the code here as reference:
local globalCache = {}
local function getFromCache(cache, args)
local node = cache
for i=1, #args do
if not node.children then return {} end
node = node.children[args[i]]
if not node then return {} end
end
return node.results
end
local function insertInCache(cache, args, results)
local arg
local node = cache
for i=1, #args do
arg = args[i]
node.children = node.children or {}
node.children[arg] = node.children[arg] or {}
node = node.children[arg]
end
node.results = results
end
-- public function
local function memoize(f)
globalCache[f] = { results = {} }
return function (...)
local results = getFromCache( globalCache[f], {...} )
if #results == 0 then
results = { f(...) }
insertInCache(globalCache[f], {...}, results)
end
return unpack(results)
end
end
return memoize
Here is a generic C# 3.0 implementation, if it could help :
public static class Memoization
{
public static Func<T, TResult> Memoize<T, TResult>(this Func<T, TResult> function)
{
var cache = new Dictionary<T, TResult>();
var nullCache = default(TResult);
var isNullCacheSet = false;
return parameter =>
{
TResult value;
if (parameter == null && isNullCacheSet)
{
return nullCache;
}
if (parameter == null)
{
nullCache = function(parameter);
isNullCacheSet = true;
return nullCache;
}
if (cache.TryGetValue(parameter, out value))
{
return value;
}
value = function(parameter);
cache.Add(parameter, value);
return value;
};
}
}
(Quoted from a french blog article)
In the vein of posting memoization in different languages, i'd like to respond to #onebyone.livejournal.com with a non-language-changing C++ example.
First, a memoizer for single arg functions:
template <class Result, class Arg, class ResultStore = std::map<Arg, Result> >
class memoizer1{
public:
template <class F>
const Result& operator()(F f, const Arg& a){
typename ResultStore::const_iterator it = memo_.find(a);
if(it == memo_.end()) {
it = memo_.insert(make_pair(a, f(a))).first;
}
return it->second;
}
private:
ResultStore memo_;
};
Just create an instance of the memoizer, feed it your function and argument. Just make sure not to share the same memo between two different functions (but you can share it between different implementations of the same function).
Next, a driver functon, and an implementation. only the driver function need be public
int fib(int); // driver
int fib_(int); // implementation
Implemented:
int fib_(int n){
++total_ops;
if(n == 0 || n == 1)
return 1;
else
return fib(n-1) + fib(n-2);
}
And the driver, to memoize
int fib(int n) {
static memoizer1<int,int> memo;
return memo(fib_, n);
}
Permalink showing output on codepad.org. Number of calls is measured to verify correctness. (insert unit test here...)
This only memoizes one input functions. Generalizing for multiple args or varying arguments left as an exercise for the reader.
In Perl generic memoization is easy to get. The Memoize module is part of the perl core and is highly reliable, flexible, and easy-to-use.
The example from it's manpage:
# This is the documentation for Memoize 1.01
use Memoize;
memoize('slow_function');
slow_function(arguments); # Is faster than it was before
You can add, remove, and customize memoization of functions at run time! You can provide callbacks for custom memento computation.
Memoize.pm even has facilities for making the memento cache persistent, so it does not need to be re-filled on each invocation of your program!
Here's the documentation: http://perldoc.perl.org/5.8.8/Memoize.html
Extending the idea, it's also possible to memoize functions with two input parameters:
function memoize2 (f)
local cache = {}
return function (x, y)
if cache[x..','..y] then
return cache[x..','..y]
else
local z = f(x,y)
cache[x..','..y] = z
return z
end
end
end
Notice that parameter order matters in the caching algorithm, so if parameter order doesn't matter in the functions to be memoized the odds of getting a cache hit would be increased by sorting the parameters before checking the cache.
But it's important to note that some functions can't be profitably memoized. I wrote memoize2 to see if the recursive Euclidean algorithm for finding the greatest common divisor could be sped up.
function gcd (a, b)
if b == 0 then return a end
return gcd(b, a%b)
end
As it turns out, gcd doesn't respond well to memoization. The calculation it does is far less expensive than the caching algorithm. Ever for large numbers, it terminates fairly quickly. After a while, the cache grows very large. This algorithm is probably as fast as it can be.
Recursion isn't necessary. The nth triangle number is n(n-1)/2, so...
public int triangle(final int n){
return n * (n - 1) / 2;
}
Please don't recurse this. Either use the x*(x+1)/2 formula or simply iterate the values and memoize as you go.
int[] memo = new int[n+1];
int sum = 0;
for(int i = 0; i <= n; ++i)
{
sum+=i;
memo[i] = sum;
}
return memo[n];