Kotlin Lists and Arrays [duplicate] - kotlin

Just started working with Kotlin and I love it but...
I cannot make any sense of Lists and Arrays in this language.
I'm not new to programming and do not need an explanation of what arrays are. What I do not understand is:
What is the difference between a List and an Array? They seem very much the same: you access both using a[index] and use them in much the same way. If a list is immutable they are even more alike, so what is the difference? Assuming the list is not a linked list, they both work in O(1) access time.
If I'm using a list, what is the difference between mutable and immutable? When can I edit the content? When can I change the length?
There seem to be many overlapping and confusing names for the same thing: List, ListOf, ArrayList, IntArray, intArray....
Could someone make an exhaustive list of all of them and give some kind of rule of thumb for when you would use each one? Specifically, I find the concept of an immutable empty list very perplexing. What on earth would that be used for?
How do you initialize these things?
Sorry for the long question,
Thanks.

The first difference is that List is an interface describing common list operations, while Array is a class. In terms of memory, an Array is a contiguous region whose size does not change. That is why you can't resize an Array after it is created, although you can replace its elements. A List, on the other hand, can be implemented in different ways, so its memory layout can differ. The most common implementation is ArrayList, which stores elements in an array and, once that array is full, replaces it with a bigger one and copies the old contents over. Another implementation is LinkedList, where each node points to the next element of the list. In terms of performance, an Array is generally at least as fast as any List implementation, but it is also much more limited.
The difference between List and MutableList is that a MutableList lets you change the list itself (add, remove, or replace elements), while a read-only List does not. Both allow you to change the properties of the elements they hold, provided the elements themselves are mutable.
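To make the List/MutableList distinction concrete, here is a minimal sketch. The Counter class is a hypothetical element type invented for this example; the commented-out lines are the ones the compiler rejects.

```kotlin
// Hypothetical element type with one mutable property.
class Counter(var value: Int)

fun main() {
    val readOnly: List<Counter> = listOf(Counter(1), Counter(2))
    // readOnly.add(Counter(3))   // does not compile: List has no add()
    // readOnly[0] = Counter(9)   // does not compile: List has no set()
    readOnly[0].value = 10        // allowed: the element itself is mutable

    val mutable: MutableList<Counter> = mutableListOf(Counter(1))
    mutable.add(Counter(2))       // the size can grow
    mutable[0] = Counter(7)       // an element can be replaced
    mutable.removeAt(1)           // and the size can shrink

    println(readOnly.map { it.value })  // [10, 2]
    println(mutable.map { it.value })   // [7]
}
```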
I will divide the rest of this answer into three parts:
List is an interface that extends the Collection interface and provides the basic common list operations. MutableList extends both List and MutableCollection, adding the methods needed to change the contents of the list. listOf is a function that creates a List filled with the given arguments; with listOf we don't need to specify which List implementation is used. For example, on the JVM the List is backed by java.util.Arrays.ArrayList (not the same as java.util.ArrayList), while on the JavaScript side it is probably backed by a JavaScript Array (take this statement with a grain of salt, as I have never worked with Kotlin for JS).
ArrayList is a typealias for java.util.ArrayList on the JVM; there is nothing special about it. It is an implementation of Java's List interface, and it is what mutableListOf returns on the JVM.
Array is equivalent to a Java array, and there is nothing special about it either. IntArray and the other primitive array types make up for the lack of separate primitive types in Kotlin: Array<Int> is the same as Integer[] in Java, while IntArray is the same as int[]. The same logic applies to all the other variants. Primitive arrays give you better performance, but the difference can be neglected in most cases on modern computers. Still, if you have a lot of data, you should go for the primitive variants where possible.
You can see the whole collections hierarchy for yourself in the Kotlin repository.
Use the built-in Kotlin functions such as listOf, arrayOf, and mutableListOf. This isn't a must, but it's always good to follow best practices.
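Since the question also asks how to initialize these things, here is a rough sketch using only standard library functions. The variable names are arbitrary, and the JVM types in the comments apply to Kotlin/JVM only.

```kotlin
fun main() {
    // Fixed-size arrays: elements can be replaced, the size cannot change.
    val boxed: Array<Int> = arrayOf(1, 2, 3)        // Integer[] on the JVM
    val primitive: IntArray = intArrayOf(1, 2, 3)   // int[] on the JVM
    val zeros = IntArray(5)                         // five zeros
    val squares = Array(4) { i -> i * i }           // 0, 1, 4, 9
    primitive[0] = 42

    // Lists: read-only and mutable interfaces.
    val fixed: List<String> = listOf("a", "b")
    val growing: MutableList<String> = mutableListOf("a")
    growing.add("b")
    val explicit: ArrayList<String> = arrayListOf("x")  // when the concrete class matters

    // An empty read-only list is a cheap "no results" value to return from a function.
    val none: List<String> = emptyList()

    println(boxed.size + zeros.size + explicit.size)  // 9
    println(primitive.toList())  // [42, 2, 3]
    println(squares.toList())    // [0, 1, 4, 9]
    println(growing == fixed)    // true: lists compare by content
    println(none.isEmpty())      // true
}
```

As a rule of thumb, reach for List or MutableList by default, and for IntArray and friends only when you have measured that boxing matters.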

Coming from C/C++ the multitude of different names is very confusing.
Then maybe this can give C++ analogy specifically:
Array is like std::array (though length doesn't need to be known at compile time), or like C arrays, except it stores the length and all accesses are bounds-checked.
ArrayList is like std::vector (again, all accesses are bounds-checked).
MutableList is the interface to ArrayList (like SequenceContainer).
List is the read-only part of MutableList.
Generics work very differently from C++ templates, in particular there's no specialization: in C++, there is separate code generated for std::vector<int> and std::vector<std::string>, in Java and Kotlin there isn't. (Actually, Kotlin has a form of it with reified type parameters, but it doesn't apply here.) So e.g. Array<Int> and List<Int> have to work with boxed java.lang.Integers instead of primitive types. But Java does have arrays of primitives, and that's what Kotlin calls IntArray.

Related

What are the in and out positions in Kotlin Generics?

I want to start with what I know, or at least I think I know, so what I'm asking would be more clear.
First of all, I know that you can declare a variable of a supertype and assign an object of a subtype to take advantage of polymorphism with inheritance and interfaces.
I know that generics provide type safety because the type parameters are invariant by definition, so where A is a subtype of B, Foo<A> is not necessarily a subtype of Foo<B>, and may not be used in place depending on mutability of the object. With this, possible exceptions that could arise at runtime due to dynamic dispatching can be caught in compile time.
They also help to define a generic logic for different types: Like in Lists where you have collections of type A objects, but it doesn't change the implementation for type B objects.
Also, I understood why MutableList<String> doesn't count as the subtype of MutableList<Any> because that could result in cases where you create a variable with type MutableList<Any> that holds a reference to a MutableList<String> object, and add an Int element to a List of Strings, which is obviously a problem.
I also understood why List version of the previous example works because Lists are immutable so you can't make any modification to the object that could result in type mismatches.
Lastly, I know that type parameters with in can only be used as function parameters, being consumed, and the ones with out can be used as the function return types, being produced.
Now to the part what I don't understand:
I didn't quite understand what the words consumer and producer actually mean in terms of in and out. What does it mean for a type to be in a consumed or produced position? Does that mean the object with that type can only be read from or only be written to? Does that have anything to do with the object at all?
What would be the behaviour of the object if, let's say, we don't define it using in or out, or, opposite, we define it using in or out, not talking about the subtype-supertype relationship that I explained above.
I spent the last few days looking at different explanations of this, but I found the lack of examples a big problem, especially because that's how I usually learn.
I can use these concepts in code, but the lack of underlying knowledge or the logic greatly disturbs me, so please, if you decide to take the time to write an explanation, provide it with examples and counter examples of why or how a certain idea works.
Just one correction to your first bullet points: List is not immutable; it is read-only. A List could be an up-cast mutable implementation and some other object that references it could be mutating it.
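A small sketch of what "read-only, not immutable" means in practice. The variable names are made up for illustration; the only library calls are mutableListOf and toList.

```kotlin
fun main() {
    val backing = mutableListOf("a", "b")
    val view: List<String> = backing   // upcast: same object, narrower interface

    backing.add("c")                   // mutate through the original reference
    println(view)                      // [a, b, c]

    val snapshot = backing.toList()    // independent copy
    backing.add("d")
    println(snapshot)                  // [a, b, c]
    println(view.size)                 // 4
}
```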
Producer means the generic type appears as a return type in any functions or properties of the object. You can get T’s out of a List, for instance.
Consumer means the generic type appears as a parameter of any functions or as the type of any var properties of the object. You can put T’s into a MutableList, for example.
Since List produces but doesn’t consume (it doesn’t have any functions with T as a parameter), its type is marked as producing-only, aka covariant, aka out right at the declaration site so its type can always be assumed to be out wherever it’s used even if the out keyword is not used.
Since the List type is always covariant out, any List can be safely upcast to a List where the type is a supertype of the originating type. A List<String> can be cast to List<CharSequence> because any item you get out of it (anything it produces) is going to be a String, and therefore also qualifies as the supertype CharSequence.
The reverse logic would apply for something that is purely a consumer with the type marked in, but it’s harder to come up with a simple example where you would actually have a useful object like this.
A MutableList both produces and consumes, so it is invariant by default, but since it is also a List, a MutableList<String> could be safely cast to a List<CharSequence>. If you have a reference to the List<CharSequence>, you can get CharSequences out of it. The underlying object might continue to have new Strings put into it from the original reference.
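Since examples were requested, here is a minimal sketch of both positions. Producer, Consumer, Greeter, and Logger are hypothetical types invented for this illustration and are not part of the standard library.

```kotlin
// Hypothetical interfaces used only for illustration.
interface Producer<out T> { fun next(): T }        // T appears only as a return type
interface Consumer<in T> { fun accept(item: T) }   // T appears only as a parameter type

class Greeter : Producer<String> {
    override fun next(): String = "hello"
}

class Logger : Consumer<Any> {
    val seen = mutableListOf<Any>()
    override fun accept(item: Any) { seen.add(item) }
}

fun main() {
    // Covariance (out): a Producer<String> can stand in for a Producer<CharSequence>,
    // because everything it hands out is a String and therefore a CharSequence.
    val source: Producer<CharSequence> = Greeter()
    println(source.next().length)    // 5

    // Contravariance (in): a Consumer<Any> can stand in for a Consumer<String>,
    // because something that accepts anything can certainly accept Strings.
    val logger = Logger()
    val sink: Consumer<String> = logger
    sink.accept("world")
    println(logger.seen)             // [world]

    // Invariance: MutableList both produces and consumes T, so this would not compile.
    // val bad: MutableList<Any> = mutableListOf<String>()
}
```

The in and out modifiers do not change how the object behaves at runtime; they only restrict where T may appear in the declaration, and in exchange the compiler allows the extra assignments shown above.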

Hacklang : why were container classes replaced with built-in types?

Just a quote from the Hack documentation:
Legacy Vector, Map, and Set
These container types should be avoided in new code; use dict,
keyset, and vec instead.
Early in Hack's life, the library provided mutable and immutable
generic class types called: Vector, ImmVector, Map, ImmMap, Set, and
ImmSet. However, these have been replaced by vec, dict, and keyset,
whose use is recommended in all new code. Each generic type had a
corresponding literal form. For example, a variable of type
Vector might be initialized using Vector {22, 33, $v}, where $v
is a variable of type int.
I wonder why this change was made.
I mean, one of PHP's weaknesses is that it has a bad OOP standard library.
For example, the str_replace and array_values functions live outside of the string/array type itself. The PHP standard library is not consistent: sometimes we must pass the array as the first parameter, other times as the second...
I was glad to see that Hack introduced true OOP encapsulation for collections.
Do you know why they stepped back and wrote utility classes such as C\, Dict\, Keyset\ and Vec\ ?
Will methods be added to the built-in types in the future (e.g. Str\starts_with => "toto"->startsWith("t"))?
Based on Dwayne Reeves' blog post introducing HSL, it seems that the main advantage is the fact that arrays are native values, not objects. This has two important consequences:
For users, the semantics are different when the values cross through arguments. Objects are passed as references, and mutations affect the original object. On the other hand, values are copied on write after passing through arguments, so without references (which are finally to be completely banned in Hack) the callee can't mutate the value of the caller, with the exception of the much stricter inout parameters.
The article cites the invariance of the mutable containers (Vector, Set, etc.) and generally how shared mutable state couples functions closer together. The soundness issues as discussed in the article are somewhat moot because there were also immutable object containers (ImmVector, ImmSet, etc.), although since these interfaces were written in userland, variance boxed the function type signature into tight constraints. There are tangible differences from this: ImmMap<Tk, +Tv> is invariant in Tk solely because of the (function(Tk): Tv) getter. Meanwhile, dict<+Tk, +Tv> is covariant in both type parameters thanks to the inherent mutation protection from copy-on-write.
For the compiler, static values can be allocated quickly and persist over the lifetime of the server. Objects on the other hand have arbitrarily complicated construction routines in general, and the collection objects weren't going to be special-cased it seems.
I will also mention that for most use cases, there is minimal difference even in code style: e.g. the -> reference chains can be directly replaced with the |> pipe operator. There is also no longer a boundary between the privileged "standard functions" and custom user functions on collection types. Finally, the collection types were final of course, so their objective nature didn't offer any actual hierarchical or polymorphic advantages to the end user anyways.

Why is List wrapped in IntoIter?

In the book Learning Rust With Entirely Too Many Linked Lists, in the implementation of IntoIter, why is List wrapped in a tuple struct? Instead, Iterator could have been implemented for a List.
Yes, technically Iterator could be implemented for List in this case. This isn't generally true, since iterators might need different state that's not in the base container (e.g. a Vec iterator might need to store an index to the next item to iterate efficiently).
One reason is that if the implementation changes in future and the List iterator would be better with extra state, it's possible to change the iterator struct without changing any callers.
Another reason is that in Rust it's common to use types to narrow interfaces to reduce the chance of errors. If you implement Iterator directly (and presumably IntoIterator to return self), then that leaves the possibility for the user to call other List methods during iteration, which is probably wrong. Instead the iterator is a separate type, meaning that there's no possibility of someone pushing items on during iteration. (Note that in a for loop it'd be hard to do this anyway due to the borrowing/move rules, but the general point is still there).

Lazy list in kotlin?

How can I achieve in a simple way a Lazy List in Kotlin? (For example, integers lazy list).
I've been seeking official documentation, I've been googling for that without consistent results. Maybe the best tutorial I've found is here, but I wonder if there is a more Kotlin idiomatic way for doing that.
I've found the following on Kotlin's official blog, though I was unable to get an item, with integers[3] for example
var i = 0
integers = iterate{i++}
integers[3] // does not work
integers drop 3 // works
As you correctly observed, sequenceOf (streamOf() in older versions) is the way to get a lazy stream of numbers. Unlike Haskell, there's no such thing as a lazy list in Kotlin's standard library, and for a good reason: the primary meaning of "list" in Haskell world and Java world is different. In Haskell, a list is primarily a linked list, a pair of head and tail, and the main operation is taking a head of such a list, which is straightforward to efficiently implement lazily. In Kotlin/Java, list is a data structure with random access to its elements, and the main operation is get(int), which can be implemented lazily, of course, but its performance will often be surprising for the user.
So, Kotlin uses streams (called sequences in current versions of Kotlin) for laziness, because they are good when it comes to the main use cases of lazy collections: iteration, filtering, and mapping. Random access is unlikely to be needed very often.
As you, again, correctly observe, drop lets you access elements by index, which makes the performance implications more explicit in the code.
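For readers on a current Kotlin version, here is a minimal sketch of a lazy sequence of integers. It uses generateSequence from the standard library rather than the older iterate/stream API quoted in the question.

```kotlin
fun main() {
    // Infinite lazy sequence 0, 1, 2, ...; nothing is computed until it is consumed.
    val integers = generateSequence(0) { it + 1 }

    // Index-style access walks the sequence from the start, so the cost is explicit.
    println(integers.elementAt(3))        // 3
    println(integers.drop(3).first())     // 3

    // Lazy pipeline: only as many elements are generated as take(3) needs.
    val evenSquares = integers.map { it * it }.filter { it % 2 == 0 }.take(3).toList()
    println(evenSquares)                  // [0, 4, 16]
}
```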
BTW, what is your use case for lazy lists?

Mutable vs immutable objects

I'm trying to get my head around mutable vs immutable objects. Using mutable objects gets a lot of bad press (e.g. returning an array of strings from a method) but I'm having trouble understanding what the negative impacts are of this. What are the best practices around using mutable objects? Should you avoid them whenever possible?
Well, there are a few aspects to this.
Mutable objects without reference-identity can cause bugs at odd times. For example, consider a Person bean with a value-based equals method:
Map<Person, String> map = ...
Person p = new Person();
map.put(p, "Hey, there!");
p.setName("Daniel");
map.get(p); // => null
The Person instance gets "lost" in the map when used as a key because its hashCode and equality were based upon mutable values. Those values changed outside the map and all of the hashing became obsolete. Theorists like to harp on this point, but in practice I haven't found it to be too much of an issue.
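The same effect can be reproduced in a few lines of Kotlin. This is only a sketch: the Person data class here is a stand-in with a single mutable property, and the starting name "Alice" is made up.

```kotlin
// Hypothetical value-based key type; a data class derives equals/hashCode from its properties.
data class Person(var name: String)

fun main() {
    val map = HashMap<Person, String>()
    val p = Person("Alice")
    map[p] = "Hey, there!"

    p.name = "Daniel"                  // changes p.hashCode() while p is used as a key

    println(map[p])                    // null: the new hash does not match the stored one
    println(map[Person("Alice")])      // null: the hash matches, but the stored key no longer equals it
    println(map.size)                  // 1: the entry still exists, it just cannot be looked up
}
```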
Another aspect is the logical "reasonability" of your code. This is a hard term to define, encompassing everything from readability to flow. Generically, you should be able to look at a piece of code and easily understand what it does. But more important than that, you should be able to convince yourself that it does what it does correctly. When objects can change independently across different code "domains", it sometimes becomes difficult to keep track of what is where and why ("spooky action at a distance"). This is a more difficult concept to exemplify, but it's something that is often faced in larger, more complex architectures.
Finally, mutable objects are killer in concurrent situations. Whenever you access a mutable object from separate threads, you have to deal with locking. This reduces throughput and makes your code dramatically more difficult to maintain. A sufficiently complicated system blows this problem so far out of proportion that it becomes nearly impossible to maintain (even for concurrency experts).
Immutable objects (and more particularly, immutable collections) avoid all of these problems. Once you get your mind around how they work, your code will develop into something which is easier to read, easier to maintain and less likely to fail in odd and unpredictable ways. Immutable objects are even easier to test, due not only to their easy mockability, but also the code patterns they tend to enforce. In short, they're good practice all around!
With that said, I'm hardly a zealot in this matter. Some problems just don't model nicely when everything is immutable. But I do think that you should try to push as much of your code in that direction as possible, assuming of course that you're using a language which makes this a tenable opinion (C/C++ makes this very difficult, as does Java). In short: the advantages depend somewhat on your problem, but I would tend to prefer immutability.
Immutable Objects vs. Immutable Collections
One of the finer points in the debate over mutable vs. immutable objects is the possibility of extending the concept of immutability to collections. An immutable object is an object that often represents a single logical structure of data (for example an immutable string). When you have a reference to an immutable object, the contents of the object will not change.
An immutable collection is a collection that never changes.
When I perform an operation on a mutable collection, then I change the collection in place, and all entities that have references to the collection will see the change.
When I perform an operation on an immutable collection, a reference is returned to a new collection reflecting the change. All entities that have references to previous versions of the collection will not see the change.
Clever implementations do not necessarily need to copy (clone) the entire collection in order to provide that immutability. The simplest example is the stack implemented as a singly linked list and the push/pop operations. You can reuse all of the nodes from the previous collection in the new collection, adding only a single node for the push, and cloning no nodes for the pop. The push_tail operation on a singly linked list, on the other hand, is not so simple or efficient.
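The stack example can be sketched roughly as follows. PStack and its functions are illustrative names invented here, not a library API.

```kotlin
// A minimal persistent (immutable) stack built as a singly linked list.
sealed class PStack<out T> {
    object Empty : PStack<Nothing>()
    data class Node<T>(val head: T, val tail: PStack<T>) : PStack<T>()
}

// push allocates exactly one node and reuses every existing node as the tail.
fun <T> PStack<T>.push(item: T): PStack<T> = PStack.Node(item, this)

// pop allocates nothing: it simply returns the existing tail.
fun <T> PStack<T>.pop(): PStack<T> = when (this) {
    is PStack.Empty -> this
    is PStack.Node<T> -> tail
}

// Helper for printing: collects the elements from top to bottom.
fun <T> PStack<T>.toList(): List<T> = when (this) {
    is PStack.Empty -> emptyList()
    is PStack.Node<T> -> listOf(head) + tail.toList()
}

fun main() {
    val empty: PStack<Int> = PStack.Empty
    val s1 = empty.push(1).push(2)   // 2 -> 1
    val s2 = s1.push(3)              // 3 -> 2 -> 1, sharing both nodes of s1

    println(s1.toList())             // [2, 1]: unchanged by the push
    println(s2.toList())             // [3, 2, 1]
    println(s2.pop() === s1)         // true: pop returns the shared tail itself
}
```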
Immutable vs. Mutable variables/references
Some functional languages take the concept of immutability to object references themselves, allowing only a single reference assignment.
In Erlang this is true for all "variables". I can only assign objects to a reference once. If I were to operate on a collection, I would not be able to reassign the new collection to the old reference (variable name).
Scala also builds this into the language with all references being declared with var or val, vals only being single assignment and promoting a functional style, but vars allowing a more C-like or Java-like program structure.
The var/val declaration is required, while many traditional languages use optional modifiers such as final in Java and const in C.
Ease of Development vs. Performance
Almost always the reason to use an immutable object is to promote side effect free programming and simple reasoning about the code (especially in a highly concurrent/parallel environment). You don't have to worry about the underlying data being changed by another entity if the object is immutable.
The main drawback is performance. Here is a write-up on a simple test I did in Java comparing some immutable vs. mutable objects in a toy problem.
The performance issues are moot in many applications, but not all, which is why many large numerical packages, such as the NumPy array class in Python, allow in-place updates of large arrays. This is important for application areas that make use of large matrix and vector operations. These large, data-parallel, computationally intensive problems achieve a great speed-up by operating in place.
Immutable objects are a very powerful concept. They take away a lot of the burden of trying to keep objects/variables consistent for all clients.
You can use them for low level, non-polymorphic objects - like a CPoint class - that are used mostly with value semantics.
Or you can use them for high level, polymorphic interfaces - like an IFunction representing a mathematical function - that is used exclusively with object semantics.
Greatest advantage: immutability + object semantics + smart pointers make object ownership a non-issue, all clients of the object have their own private copy by default. Implicitly this also means deterministic behavior in the presence of concurrency.
Disadvantage: when used with objects containing lots of data, memory consumption can become an issue. A solution to this could be to keep operations on an object symbolic and do a lazy evaluation. However, this can then lead to chains of symbolic calculations, that may negatively influence performance if the interface is not designed to accommodate symbolic operations. Something to definitely avoid in this case is returning huge chunks of memory from a method. In combination with chained symbolic operations, this could lead to massive memory consumption and performance degradation.
So immutable objects are definitely my primary way of thinking about object-oriented design, but they are not a dogma.
They solve a lot of problems for clients of objects, but also create many, especially for the implementers.
Check this blog post: http://www.yegor256.com/2014/06/09/objects-should-be-immutable.html. It explains why immutable objects are better than mutable. In short:
immutable objects are simpler to construct, test, and use
truly immutable objects are always thread-safe
they help to avoid temporal coupling
their usage is side-effect free (no defensive copies)
identity mutability problem is avoided
they always have failure atomicity
they are much easier to cache
You should specify what language you're talking about. For low-level languages like C or C++, I prefer to use mutable objects to conserve space and reduce memory churn. In higher-level languages, immutable objects make it easier to reason about the behavior of the code (especially multi-threaded code) because there's no "spooky action at a distance".
A mutable object is simply an object that can be modified after it's created/instantiated, vs an immutable object that cannot be modified (see the Wikipedia page on the subject). An example of this in a programming language is Python's lists and tuples. Lists can be modified (e.g., new items can be added after a list is created) whereas tuples cannot.
I don't really think there's a clearcut answer as to which one is better for all situations. They both have their places.
In short:
A mutable instance behaves as if it were passed by reference.
An immutable instance behaves as if it were passed by value.
An abstract example: let's suppose that there is a file named txtfile on my HDD. Now, when you ask me to give you the txtfile file, I can do it in one of the following two ways:
I can create a shortcut to txtfile and pass the shortcut to you, or
I can make a full copy of the txtfile file and pass the copied file to you.
In the first case, the returned file represents a mutable file, because any change made through the shortcut will be reflected in the original file as well, and vice versa.
In the second case, the returned file represents an immutable file, because any change made to the copied file will not be reflected in the original one, and vice versa.
If a class type is mutable, a variable of that class type can have a number of different meanings. For example, suppose an object foo has a field int[] arr, and it holds a reference to a int[3] holding the numbers {5, 7, 9}. Even though the type of the field is known, there are at least four different things it can represent:
A potentially-shared reference, all of whose holders care only that it encapsulates the values 5, 7, and 9. If foo wants arr to encapsulate different values, it must replace it with a different array that contains the desired values. If one wants to make a copy of foo, one may give the copy either a reference to arr or a new array holding the values {5, 7, 9}, whichever is more convenient.
The only reference, anywhere in the universe, to an array which encapsulates a set of three storage locations which at the moment hold the values 5, 7, and 9; if foo wants it to encapsulate the values 5, 8, and 9, it may either change the second item in that array or create a new array holding the values 5, 8, and 9 and abandon the old one. Note that if one wanted to make a copy of foo, one must in the copy replace arr with a reference to a new array in order for foo.arr to remain the only reference to that array anywhere in the universe.
A reference to an array which is owned by some other object that has exposed it to foo for some reason (e.g. perhaps it wants foo to store some data there). In this scenario, arr doesn't encapsulate the contents of the array, but rather its identity. Because replacing arr with a reference to a new array would totally change its meaning, a copy of foo should hold a reference to the same array.
A reference to an array of which foo is the sole owner, but to which references are held by another object for some reason (e.g. it wants the other object to store data there--the flipside of the previous case). In this scenario, arr encapsulates both the identity of the array and its contents. Replacing arr with a reference to a new array would totally change its meaning, but having a clone's arr refer to foo.arr would violate the assumption that foo is the sole owner. There is thus no way to copy foo.
In theory, int[] should be a nice simple well-defined type, but it has four very different meanings. By contrast, a reference to an immutable object (e.g. String) generally only has one meaning. Much of the "power" of immutable objects stems from that fact.
Mutable collections are in general faster than their immutable counterparts when used for in-place
operations.
However, mutability comes at a cost: you need to be much more careful sharing them between
different parts of your program.
It is easy to create bugs where a shared mutable collection is updated
unexpectedly, forcing you to hunt down which line in a large codebase is performing the unwanted update.
A common approach is to use mutable collections locally within a function or private to a class where there
is a performance bottleneck, but to use immutable collections elsewhere where speed is less of a concern.
That gives you the high performance of mutable collections where it matters most, while not sacrificing
the safety that immutable collections give you throughout the bulk of your application logic.
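A sketch of that approach in Kotlin, assuming a hypothetical helper named evenSquares. Note that Kotlin's List is a read-only interface rather than a truly immutable collection, so a caller could still downcast the result; returning result.toList() would hand out a defensive copy instead.

```kotlin
// Hypothetical helper: builds its result in a local mutable list, then exposes it as read-only.
fun evenSquares(limit: Int): List<Int> {
    val result = ArrayList<Int>()        // mutable, but confined to this function
    for (i in 0..limit) {
        if (i % 2 == 0) result.add(i * i)
    }
    return result                        // callers see only the read-only List interface
}

fun main() {
    println(evenSquares(6))   // [0, 4, 16, 36]
}
```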
If you return a reference to an array or string, the outside world can modify the content of that object, which makes it a mutable (modifiable) object.
Immutable means it can't be changed, and mutable means it can.
Objects are different from primitives in Java. Primitives are built-in types (boolean, int, etc.) and objects (classes) are user-created types.
Primitives and objects can be mutable or immutable when defined as member variables within the implementation of a class.
A lot of people think that primitive and object variables with a final modifier in front of them are immutable; however, this isn't exactly true. So final doesn't quite mean immutable for variables. See the example here:
http://www.siteconsortium.com/h/D0000F.php.
General Mutable vs Immutable
Unmodifiable - a wrapper around a modifiable object. It guarantees that the object cannot be changed directly through the wrapper (but it can still change through the backing object).
Immutable - an object whose state cannot be changed after creation. An object is immutable when all of its fields are immutable. It is the next step beyond an unmodifiable object.
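A short sketch of the difference, using java.util.Collections.unmodifiableList as the wrapper and a plain copy as a stand-in for an immutable snapshot. This assumes Kotlin on the JVM; the variable names are arbitrary.

```kotlin
import java.util.Collections

fun main() {
    val backing = mutableListOf(1, 2)
    val unmodifiable: List<Int> = Collections.unmodifiableList(backing)  // wrapper over backing
    val copy: List<Int> = backing.toList()                               // independent snapshot

    // Writing through the wrapper is rejected at runtime...
    val failure = runCatching { (unmodifiable as MutableList<Int>).add(99) }.exceptionOrNull()
    println(failure is UnsupportedOperationException)  // true

    // ...but changes made to the backing list still show through it.
    backing.add(3)
    println(unmodifiable)  // [1, 2, 3]
    println(copy)          // [1, 2]
}
```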
Thread safe
The main advantage of an immutable object is that it is naturally suited to a concurrent environment. The biggest problem in concurrency is a shared resource that can be changed by any thread. But if an object is immutable, it is read-only, and reading is a thread-safe operation. Any modification of an original immutable object returns a copy.
Source of truth, free of side effects
As a developer you can be completely sure that an immutable object's state cannot be changed from anywhere (on purpose or not). For example, a consumer that uses an immutable object can safely work with the original object.
Compiler optimisation
Can improve performance
Disadvantage:
Copying an object is a heavier operation than changing a mutable object, which is why immutability has some performance cost.
To create an immutable object you should use:
1. Language level
Each language contains tools to help you with this. For example:
Java has final and primitives
Swift has let and struct.
The language defines the kinds of types a variable can have. For example:
Java has primitive and reference types,
Swift has value and reference types.
For immutable objects, primitives and value types are more convenient because they are copied by default. With reference types it is more difficult (because an object's state can be changed from outside it) but possible. For example, you can use the clone pattern at the developer level to make a deep (instead of shallow) copy.
2. Developer level
As a developer you should not provide an interface for changing state