Lazy list in Kotlin?

How can I create a lazy list in Kotlin in a simple way (for example, a lazy list of integers)?
I've searched the official documentation and googled without consistent results. The best tutorial I've found is here, but I wonder if there is a more idiomatic Kotlin way of doing that.
I found the following on Kotlin's official blog, though I was unable to get an item by index, with integers[3] for example:
var i = 0
integers = iterate{i++}
integers[3] // does not work
integers drop 3 // works

As you correctly observed, sequenceOf (streamOf() in older versions) is the way to get a lazy stream of numbers. Unlike Haskell, there's no such thing as a lazy list in Kotlin's standard library, and for a good reason: the primary meaning of "list" in Haskell world and Java world is different. In Haskell, a list is primarily a linked list, a pair of head and tail, and the main operation is taking a head of such a list, which is straightforward to efficiently implement lazily. In Kotlin/Java, list is a data structure with random access to its elements, and the main operation is get(int), which can be implemented lazily, of course, but its performance will often be surprising for the user.
So, Kotlin uses sequences (called streams in older versions) for laziness, because they work well for the main use cases of lazy collections: iteration, filtering, and mapping. Random access is unlikely to be needed very often.
As you, again, correctly observe, drop lets you access elements by index, which makes the performance implications more explicit in the code.
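A minimal sketch of a lazy sequence of integers in current Kotlin, assuming the standard library's generateSequence; the function name integers is made up for illustration. Indexed access is spelled elementAt or drop, which keeps the linear cost visible:

```kotlin
// Infinite lazy sequence 0, 1, 2, ... (hypothetical helper name).
// Nothing is computed until a terminal operation asks for elements.
fun integers(): Sequence<Int> = generateSequence(0) { it + 1 }

fun main() {
    val ints = integers()
    println(ints.elementAt(3))                      // 3 (walks the first four elements)
    println(ints.drop(3).first())                   // 3 (same cost, made explicit)
    println(ints.map { it * it }.take(5).toList())  // [0, 1, 4, 9, 16]
}
```

Because the sequence starts from a seed value, it can be traversed more than once; each traversal recomputes the elements from the start.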
BTW, what is your use case for lazy lists?

Related

Synchronized collection that blocks on every method

I have a collection that is commonly used between different threads. In one thread I need to add items, remove items, retrieve items and iterate over the list of items. What I am looking for is a collection that blocks access to any of its read/write/remove methods whenever any of these methods are already being called. So if one thread retrieves an item, another thread has to wait until the reading has completed before it can remove an item from the collection.
Kotlin doesn't appear to provide this. However, I could create a wrapper class that provides the synchronization I'm looking for. Java does appear to offer the synchronizedList class but from what I read, this is really for blocking calls on a single method, meaning that no two threads can remove an item at the same time but one can remove while the other reads an item (which is what I am trying to avoid).
Are there any other solutions?
A wrapper such as the one returned by synchronizedList synchronizes calls to every method, using the wrapper itself as the lock. So one thread would be blocked from calling get(), say, while another thread is currently calling add(). (This is what the question seems to ask for.)
However, as the docs to that method point out, this does nothing to protect sequences of calls, such as you might use when iterating through a collection. If another thread changes the collection in between your calls to next(), then anything could happen. (This is what I think the question is really about!)
To handle that safely, your options include:
Manual synchronization. Surround each sequence of calls to the collection in a synchronized block that synchronises on the collection, e.g.:
import java.util.Collections

val list = Collections.synchronizedList(mutableListOf<String>())
// …
// Hold the wrapper's own lock for the whole iteration
synchronized(list) {
    for (i in list) {
        // …
    }
}
This is straightforward, and relatively easy to do if the collection is under your control. But if you miss any sequences, then you could get unexpected behaviour. Also, you'll need to keep your sequences short, to avoid holding the lock for an extended time and affecting performance.
Use a concurrent collection implementation which provides primitives letting you do all the processing you need in a single call, avoiding iteration and other sequences.
For maps, Java provides very good support with its ConcurrentMap interface, and high-performance implementations such as ConcurrentHashMap. These have methods allowing you to iterate, update single or multiple mappings, search, reduce, and many other whole-map operations in a single call, avoiding any concurrency problems.
For sets (as per this question) you can use a ConcurrentSkipListSet, or you can create one from a ConcurrentHashMap with newKeySet().
For lists (as per this question), there are fewer options. (I think concurrent lists are much less commonly needed.) If you don't need random access, ConcurrentLinkedQueue may suffice. Or if modification is much less common than iteration, CopyOnWriteArrayList could work.
There are many other concurrent classes in the java.util.concurrent package, so it's well worth looking through to see if any of those is a better match for your particular case.
If you have specialised requirements, you could write your own collection implementation which supports them. Obviously this is more work, and only worthwhile if none of the above approaches does what you want.
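As a rough sketch of the "single call" style described above, using the JDK's ConcurrentHashMap.merge and CopyOnWriteArrayList from Kotlin. The function names countWords and snapshotIteration are made up for illustration, and this is only one way to use these classes:

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.CopyOnWriteArrayList
import kotlin.concurrent.thread

// Several threads update shared counts; each update is one atomic merge() call,
// so no external lock is needed around the read-modify-write.
fun countWords(words: List<String>, threads: Int): Map<String, Int> {
    val counts = ConcurrentHashMap<String, Int>()
    val workers = (1..threads).map {
        thread {
            for (w in words) counts.merge(w, 1) { a, b -> a + b }
        }
    }
    workers.forEach { it.join() }
    return counts
}

// CopyOnWriteArrayList iterates over a snapshot, so modifying the list
// during iteration does not throw and does not affect the running loop.
fun snapshotIteration(): List<Int> {
    val list = CopyOnWriteArrayList(listOf(1, 2, 3))
    val seen = mutableListOf<Int>()
    for (x in list) {
        list.add(x + 10)
        seen += x
    }
    return seen
}

fun main() {
    println(countWords(listOf("a", "b", "a"), 4)) // a is counted 8 times, b 4 times
    println(snapshotIteration())                   // [1, 2, 3]
}
```

Every write to a CopyOnWriteArrayList copies the backing array, which is why it only suits cases where iteration is far more common than modification.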
In general, I think it's well worth stepping back and seeing whether iteration is really needed. Historically, in imperative languages all the way from FORTRAN through BASIC and C up to Java, the for loop has traditionally been the tool of choice (sometimes the only structure) for operating on collections of data — and for those of us who grew up on those languages, it's what we reach for instinctively. But the functional programming paradigm provides alternative tools, and so in languages like Kotlin which provide some of them, it's good to stop and ask ourselves “What am I ultimately trying to achieve here?” (Often what we want is actually to update all entries, or map to a new structure, or search for an element, or find the maximum — all of which have better approaches in Kotlin than low-level iteration.)
After all, if you can tell the compiler what you want to do, instead of how to do it, then your program is likely to be shorter and easier to read and maintain, freeing you to think about more important things!

Kotlin Lists and Arrays [duplicate]

Just started working with Kotlin and I love it but...
I cannot make any sense of Lists and Arrays in this language.
I'm not new to programming and do not need an explanation of what arrays are. What I do not understand is:
What is the difference between a List and an Array? They seem very much the same: you access both using a[index] and use them in much the same way. If a list is immutable they are even more alike, so what is the difference? Assuming the list is not a linked list, both have O(1) access time.
If I'm using a list; What is the difference between mutable and immutable? When can I edit the content? When can I change the length?
There seem to be many overlapping and confusing names for the same thing. List, ListOf, ArrayList, IntArray, intArray....
Could someone make an exhaustive list of all of them and give some kind of rule of thumb when you would use every one. Specifically, I find the concept of an immutable empty list very perplexing. What on earth would that be used for?
How do you initialize these things?
Sorry for the long question,
Thanks.
The first difference is that List is an interface describing common list operations, while Array is a class. From a memory perspective, an Array is a contiguous region of memory whose size doesn't change; that is why you can't resize an Array after it is created, although you can change its elements. A List, on the other hand, can be implemented in different ways, so its memory layout can differ. The most common implementation is ArrayList, which stores elements in an array and, once that array is full, replaces it with a bigger array into which the old contents are copied. Another implementation is LinkedList, where each node points to the next element in the list. From a performance perspective, an Array generally has less overhead than a List implementation, but it is also much more limited.
The difference between List and MutableList is that a MutableList lets you change the list itself (add, remove, or replace elements), while a read-only List doesn't. Both allow you to change the properties of the elements they hold.
I will divide the rest of this answer into three parts:
List is an interface that extends the Collection interface and provides the basic common list operations. MutableList extends both List and MutableCollection, adding the methods needed to change the list. listOf is a function that creates a List and fills it with the given arguments; by using listOf we don't need to specify which implementation of List is used. For example, on the JVM the result is backed by java.util.Arrays.ArrayList (not the same as java.util.ArrayList), while on the JavaScript side it is probably backed by Array (take this with a grain of salt, as I have never worked with Kotlin for JS).
ArrayList is a typealias for java.util.ArrayList. There is nothing special about it: it is an implementation of Java's List interface, and on the JVM mutableListOf returns this implementation.
Array is equivalent to Java's array, and there is nothing special about it either. IntArray and the other primitive array types make up for the lack of primitive types in Kotlin: Array<Int> is the same as Integer[] in Java, while IntArray is the same as int[]. The same logic applies to all other variants. Primitive arrays give better performance, although the difference is negligible in most cases; if you have a lot of data, prefer primitive arrays where possible.
You can see the whole collections hierarchy for yourself in the Kotlin repository.
Use the built-in Kotlin functions such as listOf, arrayOf, and mutableListOf. This isn't a must, but it's always good to follow best practices.
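To make the distinctions above concrete, here is a small sketch of how each kind is created and what can be changed afterwards. The function name collectionKinds is made up for illustration; the factory functions are from the Kotlin standard library:

```kotlin
// Builds one of each kind, edits what can be edited, and reports the contents.
fun collectionKinds(): Map<String, String> {
    val readOnly: List<Int> = listOf(1, 2, 3)               // no add/remove/set through this type
    val growable: MutableList<Int> = mutableListOf(1, 2, 3)
    growable.add(4)                                          // size can change
    val boxed: Array<Int> = arrayOf(1, 2, 3)                 // Integer[] on the JVM
    boxed[0] = 10                                            // elements can change, size cannot
    val primitive: IntArray = IntArray(4) { it * it }        // int[] on the JVM, filled by index
    return mapOf(
        "List" to readOnly.toString(),
        "MutableList" to growable.toString(),
        "Array<Int>" to boxed.contentToString(),
        "IntArray" to primitive.contentToString()
    )
}

fun main() {
    collectionKinds().forEach { (kind, contents) -> println("$kind: $contents") }
    // List: [1, 2, 3]
    // MutableList: [1, 2, 3, 4]
    // Array<Int>: [10, 2, 3]
    // IntArray: [0, 1, 4, 9]
}
```

An empty read-only list, created with emptyList(), is mostly useful as a default or "no results" return value that callers cannot accidentally modify.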
Coming from C/C++ the multitude of different names is very confusing.
Then maybe a C++ analogy will help:
Array is like std::array (though length doesn't need to be known at compile time), or like C arrays, except it stores the length and all accesses are bounds-checked.
ArrayList is like std::vector (again, all accesses are bounds-checked).
MutableList is the interface to ArrayList (like SequenceContainer).
List is the read-only part of MutableList.
Generics work very differently from C++ templates, in particular there's no specialization: in C++, there is separate code generated for std::vector<int> and std::vector<std::string>, in Java and Kotlin there isn't. (Actually, Kotlin has a form of it with reified type parameters, but it doesn't apply here.) So e.g. Array<Int> and List<Int> have to work with boxed java.lang.Integers instead of primitive types. But Java does have arrays of primitives, and that's what Kotlin calls IntArray.

What is the exhaustive list of guidelines/practices/rules to fully conform with functional paradigm?

I've started playing around with Kotlin, but I sense my own limitations in the way I program. My problem is that I still think in Java, so my style is still imperative. My question is for all functional programming zealots, and I believe the answers would be very useful to everyone at the very beginning who needs to 'break' their brain and start building it again: to leave the comfort zone and start thinking in pseudocode rather than in "whatever your first language is". I believe highly experienced polyglot developers can boil the concepts down to plain advice on what makes a program entirely functional and what violates the paradigm. I don't know all the quirks, so please don't hesitate to include universally accepted terms that might be unknown to me (I can always look them up). At this point I need this set of rules so that I can make myself suffer at first by not breaking them; later I will get a feel for them, analyze the guidelines, and understand how they are worse or better, which of course is my own homework.
So example of these guidelines, would be something like:
Never change state, this can be avoided by using x, y, z
Operate using higher-order functions only (I may be wrong, this is just an example)
I hope the answer will give me a long-term reference for putting myself in extreme conditions where I stop escaping to OOP whenever I feel uncomfortable. Now when I look at Kotlin I understand how I should have been thinking about problems: it is about intention, not about the structure imposed by one language or another. Intention can always be converted to a language of your choice and backed up by design patterns applicable to that language, but to find that middle ground I first need to lock myself out of my comfort zone.
Avoid mutable state like the plague.
One of the main points of using functional programming, possibly the main one, is to avoid all the little pitfalls, bugs, and issues one needs to deal with when using mutable state. You should do everything you can to avoid mutating state. For instance, instead of using C-style for-loops where you need to keep a counter variable updated, use map and other higher-order functions to abstract away your iteration patterns. This also means that you should never change the value of a variable if you can avoid it. Instead, you should define almost all of your variables, preferably all of them, as constants, and use functions to compute new values from them instead of mutating them.
Avoid side-effects like the plague.
Mutable state's ugly cousin: side-effects. A side effect is anything a function does other than taking values and returning a value. If a function prints data, mutates global variables, sends messages to threads, or does anything other than simply taking its parameters, computing a value from them, and returning that value, it has side-effects. Side-effects are important (see the next point), but if you use them a lot, they become impossible to track. Just think of how everyone tells you to avoid global variables in imperative programming. Functional programming goes a step further and tries to avoid all side-effects. The bulk of your program should be made of pure functions (see below).
When you need to use side-effects, keep them contained.
Yes, I just told you to run away from side-effects. However, no program is useful without side-effects of some kind. Graphical User Interface? Side-effect. Audio output? Side-effect. Printing to a shell? Side-effect. So you can't really get rid of side-effects if you want to build useful stuff.
What you should do instead is write your code so that all your side-effecting code lives in a thin layer which mostly calls pure functions and then does the required side-effects using the result of these pure function calls.
Use pure functions for everything you can.
This is sort of the flipside of the previous point. A pure function is a function which has no side-effects and does not mutate anything. It can only take in parameters and return a value. You should use these a lot. For instance, instead of doing your logging within functions which are computing stuff, you should be constructing your log strings using pure functions, and then letting your side-effects layer call these pure functions, call more pure functions in order to format the log strings into a full log, and then output the log itself from your side-effects layer.
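The logging example above can be sketched in Kotlin roughly as follows. The names formatLogLine and sumWithLog are made up for illustration; the point is only that the computing functions return log text as data and a thin outer layer does the printing:

```kotlin
// Pure: builds a log line from its inputs; no I/O, no mutation.
fun formatLogLine(level: String, message: String): String = "[$level] $message"

// Pure: returns the result together with the log lines that describe it.
fun sumWithLog(values: List<Int>): Pair<Int, List<String>> {
    val total = values.sum()
    val log = listOf(formatLogLine("INFO", "summed ${values.size} values to $total"))
    return total to log
}

// Thin side-effect layer: the only place that actually prints.
fun main() {
    val (total, log) = sumWithLog(listOf(1, 2, 3))
    log.forEach { println(it) }   // [INFO] summed 3 values to 6
    println(total)                // 6
}
```

Because sumWithLog has no side-effects, it can be tested by comparing its return value, without capturing any output.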
Use higher-order functions to structure your code.
Higher-order functions are, in a way, the glue that makes functional programming work. A higher-order function is a function which takes one or more functions as parameters and/or returns a function. The power of higher-order functions is that they can encapsulate many of the patterns which you would use in an imperative-style program in a declarative manner. For instance, let's take a look at the three most common higher-order functions:
map is a function which takes a function and a list of values, applies its function argument to each of those values, and returns a new list with the results. map encapsulates the whole pattern of iterating over a list doing an operation on each value in a declarative manner.
filter is a function which takes a function which returns a boolean and a list of values, applies its function argument to each of those values and returns a list containing only those values for which its function argument returns true. It encapsulates the whole pattern of selecting results from a list in a declarative manner.
reduce, also known as fold, takes an initial value, a binary function and a list of values. It uses its function argument to combine the initial value with the first value of the list, then combines the result with the next value of the list and keeps on doing this until it has reduced the list to just one single value. It encapsulates the entire pattern of obtaining an aggregate value from a list of values.
This is in no way an exhaustive list of higher-order functions, but these three are the most common ones. I hope this has been enough to show how you can structure code which would require a lot of tracking variables using only functions in a declarative manner. If you use these higher-order functions well, it's likely you won't ever need a for or while loop again.
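As an illustration in Kotlin, whose standard library provides map, filter, and fold on collections, the three functions can be chained to replace a loop with a counter and an accumulator. The function name sumOfEvenSquares is made up for this sketch:

```kotlin
// Squares every value, keeps the even squares, and adds them up.
fun sumOfEvenSquares(xs: List<Int>): Int =
    xs.map { it * it }                 // transform each element
        .filter { it % 2 == 0 }        // select the elements to keep
        .fold(0) { acc, x -> acc + x } // combine into one value, starting from 0

fun main() {
    println(sumOfEvenSquares(listOf(1, 2, 3, 4))) // 20
}
```

In Kotlin specifically, fold takes the initial value, while reduce starts from the first element of the collection instead.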
This is definitely not an exhaustive list of functional programming practices, but I think most functional programmers would agree these five guidelines form the core of what functional programming is about. If you want to really learn how to apply these, my advice would be to learn a pure functional programming language such as Haskell, so you are forced to abandon the imperative paradigm and to learn how to structure things functionally instead. I would recommend the fantastic Haskell Programming from First Principles as a starting resource if you choose to go this way. In case you don't want to/can't put down the cash, Brent Yorgey's Haskell course at UPenn is also a great free resource.

Optimization of consecutive map/filter/fold calls

Let's say I have a big list on which I'd like to execute multiple map, filter and fold/reduce calls. For clarity and expressiveness this should be done with small lambda functions passed to map/filter/fold. However, as far as I know, these are actually traversing the list every time, calling the lambda on it (might be inline though) and generating a new list. If this is the case, I could just code a for-each loop and merge all the lambdas into its body.
I measured execution time of a simple map/filter/reduce algorithm and the corresponding imperative for-each loop in Python and the latter was more than two times faster, just as I expected, but I know Python is not the best language in this regard.
My questions are: Is it possible for a compiler to figure out these and somehow merge them into a single loop? Are there any compilers that do this? I'm interested in mainly functional languages (Haskell, Erlang/Elixir, Scala), but would be good to hear about others as well (Rust's implementation, LINQ).
Yes, such optimizations have been considered many times.
One term or method used is "fusion" (also known as stream or map fusion), which has the goal of intelligently inlining iterated transformations, in patterns like map f . map g = map (f . g). This mostly has to be done with the help of the compiler, but it can work on "normal" implementations of these functions (if they are written somewhat intelligently).
Another approach is to perform this kind of inlining manually by accumulating all intermediate closures and only applying the combined transformation when the values are actually needed (this is closely related to lazy evaluation, which some languages, like Haskell, do automatically). Such things can be found in Scala's views and Streams, or Clojure's transducers (which work in a more complicated way, though). The problem with these lazy things is that they tend to run into space problems more easily (I've heard).
Iterators in Python (and C#'s IEnumerable/LINQ, and Java's Streams) work, in principle, via the latter approach, relying on language-provided iteration support (involving some internal state). That is why xs = map(print, range(10)) will not print anything immediately and can only be traversed once: at every step of the iteration, the nested iterators ask each other for the next value, transform it, and update their state. (Your measured difference is probably due more to this machinery than to repeated iteration.)
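Kotlin is not among the languages asked about, but it shows the two evaluation strategies side by side: collection operations are eager and build an intermediate list per stage, while asSequence() gives the lazy, element-at-a-time pipeline described above. This is a sketch of the lazy approach, not of compiler fusion, and the function names eagerTrace and lazyTrace are made up. The trace records the order in which the lambdas run:

```kotlin
// Eager: map runs over the whole list first, then filter runs over the new list.
fun eagerTrace(xs: List<Int>): List<String> {
    val trace = mutableListOf<String>()
    xs.map { trace += "map $it"; it * 2 }
        .filter { trace += "filter $it"; it > 2 }
    return trace
}

// Lazy: each element flows through map and filter before the next one starts,
// and no intermediate list is built between the stages.
fun lazyTrace(xs: List<Int>): List<String> {
    val trace = mutableListOf<String>()
    xs.asSequence()
        .map { trace += "map $it"; it * 2 }
        .filter { trace += "filter $it"; it > 2 }
        .toList() // terminal operation; nothing runs before this call
    return trace
}

fun main() {
    println(eagerTrace(listOf(1, 2))) // [map 1, map 2, filter 2, filter 4]
    println(lazyTrace(listOf(1, 2)))  // [map 1, filter 2, map 2, filter 4]
}
```

Whether the lazy version is actually faster depends on the data size and the per-element iterator overhead, so it is worth measuring rather than assuming.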

Scala immutable vs mutable. What is the way one should go?

I'm just learning to program in scala.
I have some experience in functional programming, as I have in object oriented programming.
My question is kind of simple, yet tricky:
Which structures should be used in Scala? Should we stick only to immutable ones, e.g. modifying a list by iterating through it and building a new one, or go for mutable ones? What is your opinion on that, and what are the performance and memory-related aspects?
I tend to program in a functional style, but it often takes an insane amount of effort to do things that are easily done using mutable structures. Does what to use depend on the situation?
Prefer immutable to mutable state. Use mutable state only where it is absolutely necessary. Some notable reasons include:
Performance. The standard libraries make wide use of vars and while loops, even though this is not idiomatic Scala. This should not be emulated, however, except for cases where you have profiled to determine that modifying the code to be more imperative will bring a significant performance gain.
I/O. I/O, or interacting with the outside world is inherently state dependent, and thus must be dealt with in a mutable manner.
This is no different than the recommended coding style found in all major languages, imperative or functional. For example, in Java it is preferable to use data objects with only private final fields. Code written in an immutable (and functional) way is inherently easier to understand because when one sees a val, they know it will never change, reducing the possible number of states any particular object or function can be in.
In many cases, it also allows automatic parallel execution. For example, collection classes in Scala all have a par function, which returns a parallel collection that automatically runs the calls to functions like map or reduce in parallel (from Scala 2.13 on, this requires the separate scala-parallel-collections module).
(I thought this must be a duplicate but couldn't easily find an earlier similar one, so I venture to answer...)
There is no general answer to this question. The rule of thumb suggested by the creators of Scala is to start with immutable vals and structures and stick to them as long as it makes sense. You can almost always create a workable solution to your problem this way. But if not, of course be pragmatic and use mutability.
Once you have a solution, you can tweak it, test it, measure its performance, etc. If you find that, for example, it is too slow or overly complex, identify the critical part of it, understand what makes it problematic and, if needed, reimplement it using mutable variables, ideally keeping it isolated from the rest of the program. Note though that in many cases a better solution can be found within the immutable realm as well, so try looking there first. Especially for a beginner like myself, it still happens regularly that the best solution I could come up with looked contorted and complex with no apparent way to improve it, until I saw a simple and elegant solution to the same problem in a few lines of code, created by an experienced Scala developer with a better command of the language and its libraries.
I usually obey the following rules:
Never use static mutable vars
Keep all user defined data types (typically case classes) immutable unless they are very expensive to copy. This will simplify a lot of the application logic.
If a data structure/collection is inherently mutable (i.e. it's designed to change over time), using a mutable data structure/collection might be appropriate. An example might be a large game world that is updated when players move. Remember to (almost) never share these data structures between threads though.
It's fine to use mutable local vars in methods
Use immutable collections for function results. These can be strictly or lazily evaluated depending on what gives best performance in the used context. Be careful if you use a lazily evaluated result which depends on a mutable collection though.
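Some of the rules above can be sketched as follows. The sketch is in Kotlin rather than Scala, on the assumption that Kotlin's val/var and data classes are close enough to Scala's val/var and case classes for the purpose; the names Player, awardPoint, and runningTotals are made up for illustration:

```kotlin
// Immutable user-defined type: "changing" it means creating a copy.
data class Player(val name: String, val score: Int)

fun awardPoint(p: Player): Player = p.copy(score = p.score + 1)

// Mutable local state is confined to the function body;
// callers only ever see the read-only List that is returned.
fun runningTotals(xs: List<Int>): List<Int> {
    var total = 0
    val out = ArrayList<Int>(xs.size)
    for (x in xs) {
        total += x
        out.add(total)
    }
    return out
}

fun main() {
    println(awardPoint(Player("ann", 1)))   // Player(name=ann, score=2)
    println(runningTotals(listOf(1, 2, 3))) // [1, 3, 6]
}
```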