Is there any difference between `runningFold` and `scan` methods in the Kotlin collections library? - kotlin

Comparing the pages for both methods scan and runningFold (from kotlin.collections), the two appear to be identical save for the name.
https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/scan.html
https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/running-fold.html

Apparently there is no difference; the Kotlin 1.4-M2 announcement confirms it:
runningFold() and runningReduce() are introduced as synonyms for scan() and scanReduce(). Such names are more consistent with the related functions fold() and reduce(). In the future, scan() will be available along with runningFold() since it’s a commonly known name for this operation. The experimental scanReduce(), however, will be deprecated and removed soon.
source: https://blog.jetbrains.com/kotlin/2020/05/1-4-m2-standard-library/
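To see this concretely, here is a minimal sketch (the sample values are arbitrary) showing that both functions produce the same list of intermediate accumulator values:

fun main() {
    val numbers = listOf(1, 2, 3, 4)

    // Both start from the initial value and include every intermediate sum.
    val viaScan = numbers.scan(0) { acc, n -> acc + n }
    val viaRunningFold = numbers.runningFold(0) { acc, n -> acc + n }

    println(viaScan)                    // [0, 1, 3, 6, 10]
    println(viaRunningFold)             // [0, 1, 3, 6, 10]
    println(viaScan == viaRunningFold)  // true
}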

Related

Difference between stream.max(Comparator) and stream.collect(Collectors.maxBy(Comparator)) in Java

In Java Streams, what is the difference between stream.max(Comparator) and stream.collect(Collectors.maxBy(Comparator)) in terms of performance? Both will fetch the max based on the comparator being passed. If this is the case, why do we need the additional step of collecting using the collect method? When should we choose the former vs the latter? What use cases are suited to each?
They do the same thing, and share the same code.
why do we need the additional step of collecting using the collect method?
You don't. Use max() if that's what you want to do. But there are cases where a Collector can be handy. For example:
Optional<Foo> result = stream.collect(createCollector());
where createCollector() would return a collector based on some condition, which could be maxBy, minBy, or something else.
In general, you shouldn't care too much about the small performance differences that might exist between two methods that do the same thing and are very likely implemented the same way. Instead, you should make your code as clear and readable as possible.
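As a quick sanity check, both calls return the same Optional. Here is a minimal sketch in Kotlin (this being a Kotlin-tagged page) over the Java Stream API; the sample strings are made up for illustration:

import java.util.Comparator
import java.util.stream.Collectors
import java.util.stream.Stream

fun main() {
    val byLength = Comparator.comparingInt<String> { it.length }

    // Terminal operation directly on the stream.
    val viaMax = Stream.of("fold", "scan", "runningFold").max(byLength)

    // Same result expressed as a collector.
    val viaCollector = Stream.of("fold", "scan", "runningFold")
        .collect(Collectors.maxBy(byLength))

    println(viaMax)        // Optional[runningFold]
    println(viaCollector)  // Optional[runningFold]
}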
There is a relevant quote in Effective Java 3rd Edition, page 214:
The collectors returned by the counting method are intended only for use as downstream collectors. The same functionality is available directly on Stream, via the count method, so there is never a reason to say collect(counting()). There are fifteen more Collectors with this property.
Given that maxBy is duplicated by Stream.max, it is presumably one of these sixteen methods.
Shortly after, same page, it goes on to justify the dual existence:
From a design perspective, these collectors represent an attempt to partially duplicate the functionality of streams in collectors so that downstream collectors can act as "ministreams".
Personally, I find this edict and explanation a bit unsatisfying: it says that it wasn't the intent for these 16 collectors to be used like this, but not why they shouldn't.
I suppose that the methods directly on stream are able to be implemented in specialized ways which could be more efficient than the general collectors.
According to the Java documentation, these are the definitions of maxBy and minBy in the Collectors class:
static <T> Collector<T,?,Optional<T>> maxBy(Comparator<? super T> comparator)
Returns a Collector that produces the maximal element according to a given Comparator, described as an Optional<T>.
static <T> Collector<T,?,Optional<T>> minBy(Comparator<? super T> comparator)
Returns a Collector that produces the minimal element according to a given Comparator, described as an Optional<T>.
whereas max() and min() in Stream return Optional<T> directly.
Every stream pipeline operation can be divided into intermediate (non-terminal) and terminal operations.
So, from the Javadoc definitions, one thing is clear: the max() and min() provided by Stream are terminal operations and return Optional<T>.
The maxBy() and minBy() of Collectors, however, are Collector-producing operations, so they can be used for chaining computations, for example as downstream collectors.
Both use BinaryOperator.maxBy(comparator) and perform a reduction over the elements (even though the reduction is implemented slightly differently), so there is no difference in the output.
If you need to find the max among all the stream elements, I suggest using Stream.max, because the code looks neater and you do not really need to create a collector in this case.
But there are scenarios where Collectors.maxBy needs to be used. Assume that you need to group your elements and find the max in each group. In such scenarios you cannot use Stream.max; you need Collectors.groupingBy(mapper, Collectors.maxBy(...)). Similarly, you could use it with partitioningBy and other methods that take a downstream collector.
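For instance, here is a rough sketch of the grouping case, again in Kotlin over the Java Stream API; the Employee class and sample data are invented purely for illustration:

import java.util.Comparator
import java.util.Optional
import java.util.stream.Collectors
import java.util.stream.Stream

// A made-up class, just to have something to group on.
data class Employee(val department: String, val salary: Int)

fun main() {
    // Stream.max alone cannot express "max per group"; a downstream collector can.
    val topEarnerByDepartment: Map<String, Optional<Employee>> =
        Stream.of(
            Employee("IT", 3000),
            Employee("IT", 4500),
            Employee("HR", 2800)
        ).collect(
            Collectors.groupingBy(
                { e: Employee -> e.department },
                Collectors.maxBy(Comparator.comparingInt { e: Employee -> e.salary })
            )
        )

    // e.g. {HR=Optional[Employee(department=HR, salary=2800)], IT=Optional[Employee(department=IT, salary=4500)]}
    // (map iteration order is not guaranteed)
    println(topEarnerByDepartment)
}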

Source of randomness in kotlin-stdlib-common

In kotlin-stdlib-common, is there any source of randomness available out of the box: some implementation of the standard java.util.Random, kotlin.math.random*, or even basic current-time millis that I could use to build my own random number generator? I can't find any.
If it's not there, how would you get a source of randomness without writing your own platform-dependent implementations? This is the only method I need:
expect class Rng {
    fun nextInt(): Int
}
I'm trying to make it platform agnostic.
The answer would be: wait for Kotlin 1.3 to be released, where the common library is enriched with classes and functions that provide a source of random values.
https://kotlinlang.org/docs/reference/whatsnew13.html#multiplatform-random
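With Kotlin 1.3, this becomes available directly in common code. A minimal sketch using the multiplatform kotlin.random.Random:

import kotlin.random.Random

fun main() {
    // Works in common code: no expect/actual declarations or platform-specific source sets needed.
    println(Random.nextInt())        // any Int
    println(Random.nextInt(0, 100))  // in [0, 100)
    println(Random(42).nextInt())    // seeded generator, repeatable sequence
}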
This may be a post with many links, which risks the problem described in "Your answer is in another castle: when is an answer not an answer?", so I will try my best to describe each link. My understanding of Kotlin Multiplatform is that Kotlin-Multiplatform = Kotlin-JVM + Kotlin-JS.
I believe that randomness in Kotlin-JVM is provided by java.util.Random, and by Math.random() in Kotlin-JS, for the following reasons:
How can I get a random number in Kotlin?, where an answer says that Kotlin-JS can use Math.random() to get a random number.
I can't find any random-number-related method for Kotlin-JVM, but there is a random() in Kotlin-JS.
In the source code of the Kotlin-JVM related files, wherever Random() is used there is an import java.util.*, or the file uses java.util.Random directly, for example kotlin/libraries/stdlib/jvm/src/kotlin/collections/MutableCollectionsJVM.kt#L78.
Also, java.util.Random is designed to be platform-independent, both in its results and in its implementation, for these reasons:
Is Java's RNG (using seeds) platform-independent?, though this question may be outdated.
The keyword "native" does not appear in the source of JDK8/java.util.Random or the source of JDK10/java.util.Random, and the RNG logic is plain in that source: the seed is derived from nanoTime() if not provided, and the generator implements the algorithm from TAOCP, Volume 2.
So, I think,
How would you get a source of randomness without writing your own platform-dependent implementations?
Maybe a random-enough seed and a random-enough (P)RNG.
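If you want to stay entirely in common code before 1.3, the PRNG part is easy to write yourself; only the seed has to come from somewhere (passed in by the caller, or supplied through a tiny expect/actual). A minimal sketch using Marsaglia's xorshift32 algorithm (the class name is just for illustration):

// Xorshift32: tiny, platform-agnostic, and good enough for non-cryptographic use.
class Xorshift32(seed: Int) {
    private var state: Int = if (seed != 0) seed else 1  // state must never be zero

    fun nextInt(): Int {
        var x = state
        x = x xor (x shl 13)
        x = x xor (x ushr 17)
        x = x xor (x shl 5)
        state = x
        return x
    }
}

fun main() {
    val rng = Xorshift32(seed = 20180901)
    println(List(5) { rng.nextInt() })  // deterministic for a given seed
}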

ABAP type pool: program with type code TYPP but with name longer than five characters

We are writing a tool in Java that parses and transforms ABAP code. We therefore have no intention to write new ABAP code but our tool has to handle all of ABAP, even obsolete statements. Furthermore, I'm not an ABAP expert.
ABAP programs can use type groups, introduced by the keyword TYPE-POOL. Names of type groups have a maximum length of five characters (internally eight, if you count the prefix "% C"); their type code is TYPP. In the past, relying on these assumptions worked well for us.
Recently, we have been seeing ABAP programs with type code TYPP but with names longer than five characters, e.g. 'OIA===========================P'. Furthermore, for each of those there is another, empty object with the same name but type code INCL. These new objects are referenced only if a regular type group is referenced, too.
These new objects may be internal ones and irrelevant for us - I haven't seen any reference to them in the ABAP Keyword Documentation. On the other hand, they are confusing us because we see them.
Can someone explain to me the meaning of these objects and point me to some documentation?
Edit: Here are examples from an EHP7 for SAP ERP 6.0 system.
An example object. Entries in D010INC look fine:
The same object now using type pool mrm. Where do the additional includes come from?
These objects are introduced through inclusions, extensions and switched objects. To read along:
Check type pool MRM, type mrm_idoc_data_ers - that type contains a statement to include rmrm_idoc_data_ers_sbo. A similar include statement pulls rmrm_upd_arseg_nfm into mrm_upd_arseg. That explains the last two lines. Your parser should have caught that.
RMRM_IDOC_DATA_ERS_SBO contains an enhancement point named RMRM_IDOC_DATA_ERS_SBO_02 that belongs to an enhancement spot ES_RMRM_IDOC_DATA_ERS_SBO. Similarly, RMRM_UPD_ARSEG_NFM contains an enhancement point RMRM_UPD_ARSEG_NFM_01 that belongs to the enhancement spot ES_RMRM_UPD_ARSEG_NFM.
For ES_RMRM_IDOC_DATA_ERS_SBO, an enhancement implementation named ISAUTO_MRM_RMRM_IDOC_DATA_ERS exists. For ES_RMRM_UPD_ARSEG_NFM, an implementation named /NFM/MM_RMRM_UPD_ARSEG_NFM exists. That explains the references ending with =E.
The implementation ISAUTO_MRM_RMRM_IDOC_DATA_ERS is located in the package ISAUTO_MRM. The implementation /NFM/MM_RMRM_UPD_ARSEG_NFM is located in the package /NFM/MM. That explains the references ending with =P. Obviously, these references are not generated for every package:
The package ISAUTO_MRM is controlled by the switch AM_ERS, the package /NFM/MM is controlled by the switch /NFM/MM. That explains the references ending in =S.
Ultimately, these references can be used to determine which programs need to be re-generated when the state of a switch is changed.

OCaml naming convention

I am wondering whether there already exist naming conventions for OCaml, especially for names of constructors, names of variables, names of functions, and names of record labels.
For instance, if I want to define a type condition, do you suggest annotating its constructors explicitly (for example Condition_None) so that it is immediately clear they are constructors of condition?
Also, how would you name a variable of this type: c or a_condition? I always hesitate to use a, an or the.
To declare a function, is it necessary to give it a name that allows the types of its arguments to be inferred from the name, for example remove_condition_from_list: condition -> condition list -> condition list?
In addition, I use records a lot in my programs. How do you name a record so that it looks different from a normal variable?
There are really thousands of ways to name something; I would like to find a conventional one in good taste and stick to it, so that I do not need to think before naming. This is an open discussion; any suggestion is welcome. Thank you!
You may be interested in the Caml programming guidelines. They cover variable naming, but do not answer your precise questions.
Regarding constructor namespacing: in theory, you should be able to use modules as namespaces rather than adding prefixes to your constructor names. You could have, say, a Constructor module and use Constructor.None to avoid confusion with the standard None constructor of the option type. You could then use open or the local open syntax of OCaml 3.12, or use module aliasing (module C = Constructor, then C.None when useful), to avoid long names.
In practice, people still tend to use a short prefix, such as the first letter of the type name capitalized (CNone), to avoid any confusion when manipulating two modules with the same constructor names; this often happens, for example, when you are writing a compiler and have several passes manipulating different AST types with similar constructors: the after-parsing Let form, the after-typing Let form, etc.
Regarding your second question, I would favor concision. Thanks to inference, type information can most of the time stay implicit; you don't need to enforce explicit annotation in your naming conventions. It will often be obvious from the context -- or unimportant -- what types are manipulated, e.g. remove cond (l1 @ l2). It's even less useful if your remove value is defined inside a Condition submodule.
Edit: record labels have the same scoping behavior as sum type constructors. If you have defined a { x : int; y : int } record in a Coord submodule, you access fields with foo.Coord.x outside the module, or with an alias foo.C.x, or Coord.(foo.x) using the "local open" feature of 3.12. That's basically the same thing as for sum constructors.
Before 3.12, you had to write that module on each field of a record, eg. {Coord.x = 2; Coord.y = 3}. Since 3.12 you can just qualify the first field: {Coord.x = 2; y = 3}. This also works in pattern position.
If you want naming convention suggestions, look at the standard library. Beyond that you'll find many people with their own naming conventions, and it's up to you to decide who to trust (just be consistent, i.e. pick one, not many). The standard library is the only thing that's shared by all Ocaml programmers.
Often you would define a single type, or a small group of closely related types, in a module. So rather than having a type called condition, you'd have a module called Condition with a type t. (You should give your module some other name, though, because there is already a module called Condition in the standard library!) A function to remove a condition from a list would be Condition.remove_from_list or ConditionList.remove. See for example the modules List, Array, Hashtbl, Map.Make, etc. in the standard library.
For an example of a module that defines many types, look at Unix. This is a bit of a special case because the names are mostly taken from the preexisting C API. Many constructors have a short prefix, e.g. O_ for open_flag, SEEK_ for seek_command, etc.; this is a reasonable convention.
There's no reason to encode the type of a variable in its name. The compiler won't use the name to deduce the type. If the type of a variable isn't clear to a casual reader from the context, put a type annotation when you define it; that way the information provided to the reader is validated by the compiler.

Separate Namespaces for Functions and Variables in Common Lisp versus Scheme

Scheme uses a single namespace for all variables, regardless of whether they are bound to functions or other types of values. Common Lisp separates the two, such that the identifier "hello" may refer to a function in one context, and a string in another.
(Note 1: This question needs an example of the above; feel free to edit it and add one, or e-mail the original author with it and I will do so.)
However, in some contexts, such as passing functions as parameters to other functions, the programmer must explicitly distinguish that he's specifying a function variable, rather than a non-function variable, by using #', as in:
(sort (list '(9 A) '(3 B) '(4 C)) #'< :key #'first)
I have always considered this to be a bit of a wart, but I've recently run across an argument that this is actually a feature:
...the important distinction actually lies in the syntax of forms, not in the type of objects. Without knowing anything about the runtime values involved, it is quite clear that the first element of a function form must be a function. CL takes this fact and makes it a part of the language, along with macro and special forms which also can (and must) be determined statically. So my question is: why would you want the names of functions and the names of variables to be in the same namespace, when the primary use of function names is to appear where a variable name would rarely want to appear?

Consider the case of class names: why should a class named FOO prevent the use of variables named FOO? The only time I would be referring the class by the name FOO is in contexts which expect a class name. If, on the rare occasion I need to get the class object which is bound to the class name FOO, there is FIND-CLASS.
This argument does make some sense to me from experience; there is a similar case in Haskell with field names, which are also functions used to access the fields. This is a bit awkward:
data Point = Point { x, y :: Double {- lots of other fields as well -} }
isOrigin p = (x p == 0) && (y p == 0)
This is solved by a bit of extra syntax, made especially nice by the NamedFieldPuns extension:
isOrigin2 Point{x,y} = (x == 0) && (y == 0)
So, to the question, beyond consistency, what are the advantages and disadvantages, both for Common Lisp vs. Scheme and in general, of a single namespace for all values versus separate ones for functions and non-function values?
The two different approaches have names: Lisp-1 and Lisp-2. A Lisp-1 has a single namespace for both variables and functions (as in Scheme) while a Lisp-2 has separate namespaces for variables and functions (as in Common Lisp). I mention this because you may not be aware of the terminology since you didn't refer to it in your question.
Wikipedia refers to this debate:
Whether a separate namespace for functions is an advantage is a source of contention in the Lisp community. It is usually referred to as the Lisp-1 vs. Lisp-2 debate. Lisp-1 refers to Scheme's model and Lisp-2 refers to Common Lisp's model. These names were coined in a 1988 paper by Richard P. Gabriel and Kent Pitman, which extensively compares the two approaches.
Gabriel and Pitman's paper titled Technical Issues of Separation in Function Cells and Value Cells addresses this very issue.
Actually, as outlined in the paper by Richard Gabriel and Kent Pitman, the debate is about Lisp-5 against Lisp-6, since there are several other namespaces already there, in the paper are mentioned type names, tag names, block names, and declaration names. edit: this seems to be incorrect, as Rainer points out in the comment: Scheme actually seems to be a Lisp-1. The following is largely unaffected by this error, though.
Whether a symbol denotes something to be executed or something to be referred to is always clear from the context. Throwing functions and variables into the same namespace is primarily a restriction: the programmer cannot use the same name for a thing and an action. What a Lisp-5 gets out of this is just that some syntactic overhead for referencing something from a different namespace than what the current context implies is avoided. edit: this is not the whole picture, just the surface.
I know that Lisp-5 proponents like the fact that functions are data, and that this is expressed in the language core. I like the fact that I can call a list "list" and a car "car" without confusing my compiler, and functions are a fundamentally special kind of data anyway. edit: this is my main point: separate namespaces are not a wart at all.
I also liked what Pascal Costanza had to say about this.
I've met a similar distinction in Python (unified namespace) vs Ruby (distinct namespaces for methods vs non-methods). In that context, I prefer Python's approach: with it, if I want to make a list of things, some of which are functions while others aren't, I don't have to do anything different with their names depending on their "function-ness". Similar considerations apply to all cases in which function objects are to be bandied around rather than called (arguments to, and return values from, higher-order functions, etc.).
Non-functions can be called, too (if their classes define __call__, in the case of Python -- a special case of "operator overloading") so the "contextual distinction" isn't necessarily clear, either.
However, my "lisp-oid" experience is/was mostly with Scheme rather than Common Lisp, so I may be subconsciously biased by the familiarity with the uniform namespace that in the end comes from that experience.
The name of a function in Scheme is just a variable with the function as its value. Whether I write (define (x y) (z y)) or (let ((x (lambda (y) (z y)))) ...), I'm defining a function that I can call. So the idea that "a variable name would rarely want to appear there" is kind of specious as far as Scheme is concerned.
Scheme is a characteristically functional language, so treating functions as data is one of its tenets. Having functions be a type of their own that's stored like all other data is a way of carrying on the idea.
The biggest downside I see, at least for Common Lisp, is understandability. We can all agree that it uses different namespaces for variables and functions, but how many does it have? In PAIP, Norvig showed that it has "at least seven" namespaces.
When one of the language's classic books, written by a highly respected programmer, can't even say for certain how many there are, I think there's a problem. I don't have a problem with multiple namespaces, but I wish the language were, at the least, simple enough that somebody could understand this aspect of it entirely.
I'm comfortable using the same symbol for a variable and for a function, but in the more obscure areas I resort to using different names out of fear (colliding namespaces can be really hard to debug!), and that really should never be the case.
There are good things about both approaches. However, I find that when it matters, I prefer having both a function LIST and a variable LIST to having to spell one of them incorrectly.