What is the name of a [foo, bar] = ["foo", "bar"] feature? - language-features

I need to know a correct name for this cool feature that some languages provide.
FYI: In some languages it is possible to do multiple assignments by assigning a structure of values to a structure of "variables". In the example in the question title it assigns "foo" to foo and "bar" to bar.

It's generally called destructuring bind in functional languages (which don't have assignments) and destructuring assignment in imperative languages.
Some languages provide subsets of that feature and then call it something different. For example, in Python it works with Tuples, Lists or Sequences and is called Tuple unpacking, List unpacking or Sequence unpacking; in Ruby it works with Arrays (or objects that are convertible to an array) and is called parallel assignment.
Destructuring bind can get arbitrarily complex. E.g. this (imaginary) bind
[Integer(a), b, 2, c] = some_array
would assign the first element of some_array to a, the second element to b and the fourth element to c, but only if the first element is an Integer, the third element is equal to 2 and the length is 4. So, this even incorporates some conditional logic.
Destructuring bind is a subset of more general pattern matching, which is a standard feature of functional languages like Haskell, ML, OCaml, F#, Erlang and Scala. The difference is that destructuring bind only lets you take apart a structure and bind its components to variables, whereas pattern matching also matches on values inside those structures and lets you make decisions and in particular lets you run arbitrary code in the context of the bindings. (You can see the above imaginary bind as half-way in between destructuring bind and pattern matching.)
Here's the classical example of a reverse function in an imaginary language, written using pattern matching:
def reverse(l: List): List {
    match l {
        when [] { return [] }
        when [first :: rest] { return reverse(rest) ++ [first] }
    }
}
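For a concrete illustration in a real language, here is a rough sketch using Python 3.10+ structural pattern matching (my own example, not part of the original answer); it mirrors the imaginary [Integer(a), b, 2, c] bind above by both checking values and binding variables:
def describe(some_array):
    match some_array:
        case [int() as a, b, 2, c]:
            # Matches only a 4-element list whose first element is an int
            # and whose third element equals 2; binds a, b and c.
            return f"a={a}, b={b}, c={c}"
        case _:
            return "no match"
print(describe([7, "x", 2, "y"]))  # a=7, b=x, c=y
print(describe([7, "x", 3, "y"]))  # no match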

In Python it is known as list or sequence unpacking: http://docs.python.org/tutorial/datastructures.html#tuples-and-sequences
my_list = ["foo", "bar"]
foo, bar = my_list
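Python also supports extended ("starred") unpacking, which generalizes this; a small extra example (not from the linked tutorial page):
first, *middle, last = [1, 2, 3, 4, 5]
# first -> 1, middle -> [2, 3, 4], last -> 5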

It's called parallel assignment in Ruby and other languages.

Perl and PHP call it list assignment
Perl:
my ($foo, $bar, $baz) = (1, 2, 3);
PHP:
list($foo, $bar, $baz) = array(1, 2, 3);

If you view the right hand side as a tuple, one could view the assignment as a kind of Tuple Unpacking.

In Erlang it's ... well, it's not assignment, it's pattern matching (seeing as there is no assignment, as such, in Erlang).
$ erl
Erlang R14B (erts-5.8.1) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:true]
Eshell V5.8.1 (abort with ^G)
1> [H1, H2, H3| Rest] = [1,2,3,4,5].
[1,2,3,4,5]
2> H1.
1
3> H2.
2
4> H3.
3
5> Rest.
[4,5]
Why is it called "pattern matching"? Because it actually is matching patterns. Look:
6> [1,2,3,4,A] = [1,2,3,4,5].
[1,2,3,4,5]
7> A.
5
8> [1,2,3,4,A] = [1,2,3,4,6].
** exception error: no match of right hand side value [1,2,3,4,6]
In the first one we did what effectively amounts to an assertion that the list would start with [1,2,3,4] and that the fifth value could be anything at all, but please bind it into the unbound variable A. In the second one we did the same thing except that A is now bound so we're looking explicitly for the list [1,2,3,4,5] (because A is now 5).

Mozilla calls it destructuring assignment. In Python, it's sequence unpacking; tuple unpacking is a common special case.

In Clojure it would be called destructuring. Simple example:
(let [[foo bar] ["foo" "bar"]]
  (println "I haz" foo "and" bar))
It's also often used in function definitions, e.g. the following destructures a single point argument into x and y components:
(defn distance-from-origin [[x y]]
  (sqrt (+ (* x x) (* y y))))
You can also use the same technique to destructure nested data structures or key/value associative maps.
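For comparison, roughly the same ideas sketched in Python (my analogy, not Clojure): nested sequence unpacking, plus a Python 3.10+ mapping pattern for the associative-map case:
# Nested sequence unpacking, analogous to destructuring a nested structure
(x, y), (width, height) = (0, 0), (640, 480)
print(x, y, width, height)  # 0 0 640 480
# A mapping pattern, loosely analogous to destructuring a key/value map
point = {"x": 3, "y": 4}
match point:
    case {"x": px, "y": py}:
        print(px, py)  # 3 4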

Related

Raku: return Type

I want to write a function returning an array whose all subarrays must have a length of two.
For example return will be [[1, 2], [3, 4]].
I define:
(1) subset TArray of Array where { .all ~~ subset :: where [Int, Int] };
and
sub fcn(Int $n) of TArray is export(:fcn) {
[[1, 2], [3, 4]];
}
I find (1) over-complicated. Is there something simpler?
Stepping back first
subset TArray of Array where { .all ~~ subset :: where [Int, Int] };
Is there something simpler?
Before we go there, let's step back. Even ignoring your code's "overly-complicated" nature based on just looking at it, it's also potentially problematic and complicated for various reasons that may not be so obvious. I'll highlight three:
This subset will accept an Array containing Arrays, with each of those arrays containing two Ints. But it doesn't mandate an Array[Array[Int]]. The outer Array's type may be just a generic Array, rather than being an Array[Array], let alone an Array[Array[Int]]. Indeed it will be unless you deliberately introduce strongly typed values. I will cover strong typing in the last section of this answer.
What about an empty Array? Your subset will accept that. Is that your intent? If not, what about requiring at least one pair of Ints?
The outer where clause uses a common Raku idiom of the form .all ~~ ..., with a junction on the left hand side of the ~~ smart match operator. Astonishingly, per an issue I just filed, this may be a problem. What alternatives are there?
Starting simple
Raku does a decent job of keeping simple things simple. If we put aside any artificial desire for strong typing, and focus on simple tools for tightening code up, a simple subset I would have suggested in the past would be:
subset TArray where .all == 2; # BAD despite being idiomatic???
This has all of the problems your original code has, and in addition it accepts data that has non-integers where integers belong.
But it does have redeeming qualities: it does a useful check (that the inner arrays each have two elements), and it's significantly simpler than your code.
Now I've reminded myself that I need to view .all on the left hand side of ~~ as possibly a problem, I'll instead write it as:
subset TArray where 2 == .all; # Potentially the new idiomatic.
This version reads more poorly, but, while readability is important, basic correctness is more important.
Still fairly simple, and fewer problems
Here are two variants I came up with:
subset TArray where all .map: * ~~ (Int,Int);
subset TArray where .elems == .grep: (Int,Int);
These both avoid the junction/smartmatch problem. (The first where expression does have a junction to the left of a smart match, but it's not an example of the problem.)
The second version isn't so obviously correct (think of it as checking that the count of subarrays is the same as the count of subarrays that match (Int,Int)) but it nicely lends itself to fixing the problem of matching if there are zero subarrays, if that were to need fixing:
subset TArray where 0 < .elems == .grep: (Int,Int);
Strong typing solutions
The solutions thus far don't deal with strong typing. Perhaps that's desirable. Perhaps not.
To understand what I mean by this, let's first look at literals:
say WHAT 1; # (Int)
say WHAT [1,2]; # (Array)
say WHAT [[1,2],[3,4]]; # (Array)
These values have types determined by their literal constructors.
The last two are just Arrays, generic over their elements.
(The second is not an Array[Int], which might be expected. Similarly the last one is not an Array[Array[Int]].)
Current built in Raku literal forms for composite types (arrays and hashes) all construct generic Arrays which do not constrain the types of their elements.
See the PR Introduce [1,2,3]:Int syntax #4406 for a proposal/PR regarding element typed composite literals and a related issue I just posted in response to your Q here about an alternative and/or complementary approach to that PR. (There have been discussions over the years about this aspect of the type system but it seems like it's time for Rakoons to look at addressing it.)
What if you wanted to build a strongly typed data structure as the value to return from your routine, and to have the return type check that?
Here's one way one might build such a strongly typed value:
my Array[Array[Int]] $result .= new: Array[Int].new(1,2), Array[Int].new(3,4);
Super verbose! But now you could write the following for your sub's return type check and it'll work:
subset TArray of Array[Array[Int]] where 0 < .elems == .grep: (Int,Int);
sub fcn(Int $n) of TArray is export(:fcn) {
my Array[Array[Int]] $result .= new: Array[Int].new(1,2), Array[Int].new(3,4);
}
Another way to build a strongly typed value is to specify not only the strong typing in a variable's type constraint, but also coercion typing to bridge from a loosely typed value to a strongly typed target.
We keep the exact same subset (that establishes the strongly typed target data structure and adds "refinement typing" checks):
subset TArray of Array[Array[Int]] where 0 < .elems == .grep: (Int,Int);
But instead of using a verbose correct-by-construction initialization value, using full type names and news, we introduce additional coercion typing and then just use ordinary literal syntax:
constant TArrayInitialization = TArray(Array[Array[Int]()]());
sub fcn(Int $n) of TArray is export(:fcn) {
my TArrayInitialization $result = [[1,2],[3,4]];
}
(I could have written the TArrayInitialization declaration as another subset, but it would be a slight overkill to have done so. A constant does the job with less fuss.)
I gather that the aim is to restrict the type of the inner Array to [Int,Int] ... the closest I can get to this is to declare two subsets, one based on the other...
subset IArray where * ~~ [Int, Int];
subset TArray where .all ~~ IArray;
Otherwise, the anonymous subset form you use seems to be the briefest, although as @raiph points out you can drop the 'of Array' piece.
If you wanted to impose this sort of constraint on a function's parameter (rather than its return type) you could do so with something like:
sub fcn(@a where {all .map: * ~~ [Int, Int]}) {...}
As the other answers have mentioned, there currently isn't great syntax for similarly constraining the return type, but there's a proposal to add support for similar syntax for return types. In fact, as mentioned in that issue, someone has volunteered to work on an implementation but hasn't yet made any progress as far as I know. (And I guess I should know, since I was that volunteer… oops)
So, for now, a subset is the best option – but hopefully the future will have even better ways to write that.

Why does this predicate leave behind a choicepoint?

I've written the following predicate:
list_withoutlast([_Last], []). % forget the last element
list_withoutlast([First, Second|List], [First|WithoutLast]) :-
    list_withoutlast([Second|List], WithoutLast).
Queries like list_withoutlast(X, [1, 2]). succeed deterministically, but queries like list_withoutlast([1, 2, 3], X) leave behind a choicepoint, even though there's only one answer.
When I trace it, it seems that SWI attempts to match list_withoutlast([3], Var) against both clauses, even though definitely only the first one will ever match!
Is there something else I can do to tell SWI that I want a list with more than one element? Or, if I want to take advantage of first-argument indexing, are my only options "zero-length list" and "non-zero length list"?
Do other Prologs handle this situation any differently?
You can rewrite your predicate to avoid the spurious choice point:
list_withoutlast([Head| Tail], List) :-
    list_withoutlast(Tail, Head, List).
list_withoutlast([], _, []).
list_withoutlast([Head| Tail], Previous, [Previous| List]) :-
    list_withoutlast(Tail, Head, List).
This definition takes advantage of first-argument indexing, which in the list_withoutlast/3 predicate distinguishes the first clause, which has an atom (the empty list) as its first argument, from the second clause, which has a (non-empty) list as its first argument.
Passing the head and tail of an input list argument as separate arguments to an auxiliary predicate is a common Prolog programming idiom to take advantage of first-argument indexing and avoid spurious choice-points.
Note that most Prolog systems don't apply deep term indexing. In particular, for compound terms, indexing usually only takes into account the name and arity and doesn't take into account the compound term arguments (a list with one element and a list with two or more elements share the same functor).

What does the operator := mean? [duplicate]

I've seen := used in several code samples, but never with an accompanying explanation. It's not exactly possible to google its use without knowing the proper name for it.
What does it do?
http://en.wikipedia.org/wiki/Equals_sign#In_computer_programming
In computer programming languages, the equals sign typically denotes either a boolean operator to test equality of values (e.g. as in Pascal or Eiffel), which is consistent with the symbol's usage in mathematics, or an assignment operator (e.g. as in C-like languages). Languages making the former choice often use a colon-equals (:=) or ≔ to denote their assignment operator. Languages making the latter choice often use a double equals sign (==) to denote their boolean equality operator.
Note: I found this by searching for colon equals operator
It's the assignment operator in Pascal and is often used in proofs and pseudo-code. It's the same thing as = in C-dialect languages.
Historically, computer science papers used = for equality comparisons and ← for assignments. Pascal used := to stand in for the hard-to-type left arrow. C went a different direction and instead decided on the = and == operators.
In the statically typed language Go, := declares and assigns a variable in one step. It exists to allow concise, interpreted-language-style creation of variables in a compiled language.
// Short variable declaration: creates and assigns in one step
answer := 42
// Equivalent var declaration with an inferred type
var answer = 42
Another interpretation from outside the world of programming languages comes from Wolfram Mathworld, et al:
If A and B are equal by definition (i.e., A is defined as B), then this is written symbolically as A=B, A:=B, or sometimes A≜B.
■ http://mathworld.wolfram.com/Defined.html
■ https://math.stackexchange.com/questions/182101/appropriate-notation-equiv-versus
Some languages use := as the assignment operator.
In a lot of CS books, it's used as the assignment operator, to differentiate from the equality operator =. In a lot of high level languages, though, assignment is = and equality is ==.
This is old (Pascal) syntax for the assignment operator. It would be used like so:
a := 45;
It may be in other languages as well, probably in a similar use.
A number of programming languages, most notably Pascal and Ada, use a colon immediately followed by an equals sign (:=) as the assignment operator, to distinguish it from a single equals which is an equality test (C instead used a single equals as assignment, and a double equals as the equality test).
Reference: Colon (punctuation).
In Python:
Named expressions (NAME := expr) were introduced in Python 3.8. They allow the assignment of a variable within an expression that is currently being evaluated. The colon-equals operator := is sometimes called the walrus operator because, well, it looks like a walrus emoticon.
For example:
if any((comment := line).startswith('#') for line in lines):
    print(f"First comment: {comment}")
else:
    print("There are no comments")
This would be invalid if you swapped the := for =. Note the additional parentheses surrounding the named expression. Another example:
# Compute partial sums in a list comprehension
total = 0
values = [1, 2, 3, 4, 5]
partial_sums = [total := total + v for v in values]
# [1, 3, 6, 10, 15]
print(f"Total: {total}") # Total: 15
Note that the variable total is not local to the comprehension (and neither is comment from the first example). The target of a named expression cannot be a variable that is already local to the comprehension, so, for example, [i := 0 for i, j in stuff] would be invalid, because i is an iteration variable of the list comprehension.
I've taken examples from the PEP 572 document - it's a good read! I for one am looking forward to using Named Expressions, once my company upgrades from Python 3.6. Hope this was helpful!
Sources: Towards Data Science Article and PEP 572.
It's like an arrow written without using a less-than symbol (<=), so, as everybody already said, it's the "assignment" operator. It brings clarity to what is being set where, as opposed to the logical operator of equivalence.
In mathematics it is like equals, but A := B means A is defined as B; a triple-bar equals can be used to say something is similar and equal by definition, though not always the same thing.
Anyway, I point to these other references that were probably in the minds of those that invented it, but it's really just that plain equals and less-than-or-equals were taken (or potentially easily confused with =<), and something new to denote assignment was needed, and := made the most sense.
Historical references: I first saw this in Smalltalk, the original object-oriented language, of which SJ of Apple only copied the windowing part and BG of Microsoft watered down further (single threaded). Eventually SJ at NeXT took in the second, more important lesson from Xerox PARC, which became Objective-C.
Anyway, they just took the colon-equals assignment operator from ALGOL 58, which was later popularized by Pascal.
https://en.wikipedia.org/wiki/PARC_(company)
https://en.wikipedia.org/wiki/Assignment_(computer_science)
Assignments typically allow a variable to hold different values at different times during its life-span and scope. However, some languages (primarily strictly functional) do not allow that kind of "destructive" reassignment, as it might imply changes of non-local state.
The purpose is to enforce referential transparency, i.e. functions that do not depend on the state of some variable(s), but produce the same results for a given set of parametric inputs at any point in time.
https://en.wikipedia.org/wiki/Referential_transparency
For VB.net,
a constructor (here, Me in VB.NET plays the role of this in Java):
Public Sub New(A As Integer, B As Integer, C As Integer)
    Me.A = A
    Me.B = B
    Me.C = C
End Sub
when you create that object:
Dim abc As New ABC(C:=1, A:=2, B:=3)
Then, regardless of the order of the arguments, that ABC object has A=2, B=3, C=1.
So, yes, named arguments like this make it easier for others to read your code.
Colon-equals was used in Algol and its descendants such as Pascal and Ada because it is as close as ASCII gets to a left-arrow symbol.
The strange convention of using equals for assignment and double-equals for comparison was started with the C language.
In Prolog, there is no distinction between assignment and the equality test.

Accessing the last element in Perl6

Could someone explain why this accesses the last element in Perl 6
@array[*-1]
and why we need the asterisk *?
Isn't it more logical to do something like this:
@array[-1]
The user documentation explains that *-1 is just a code object, which could also be written as
-> $n { $n - 1 }
When passed to [ ], it will be invoked with the array size as argument to compute the index.
So instead of just being able to start counting backwards from the end of the array, you could use it to e.g. count forwards from its center via
@array[* div 2] #=> middlemost element
@array[* div 2 + 1] #=> next element after the middlemost one
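To make the "code object invoked with the array size" idea concrete outside Raku, here is a rough Python analogy (a hypothetical lookup helper, nothing built in):
def lookup(array, index):
    # If the index is callable, call it with the array's length to get the real position.
    if callable(index):
        index = index(len(array))
    return array[index]
data = ["a", "b", "c", "d"]
print(lookup(data, lambda n: n - 1))   # last element -> 'd'
print(lookup(data, lambda n: n // 2))  # 'middle' element -> 'c'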
According to the design documents, the reason for outlawing negative indices (which could have been accepted even with the above generalization in place) is this:
The Perl 6 semantics avoids indexing discontinuities (a source of subtle runtime errors), and provides ordinal access in both directions at both ends of the array.
If you don't like the whatever-star, you can also do:
my $last-elem = @array.tail;
or even
my ($second-last, $last) = @array.tail(2);
Edit: Of course, there's also a head method:
my ($first, $second) = @array.head(2);
The other two answers are excellent. My only reason for answering was to add a little more explanation about the Whatever Star * array indexing syntax.
The equivalent of Perl 6's @array[*-1] syntax in Perl 5 would be $array[ scalar @array - 1]. In Perl 5, in scalar context an array returns the number of items it contains, so scalar @array gives you the length of the array. Subtracting one from this gives you the last index of the array.
Since in Perl 6 indices can be restricted to never be negative, if they are negative then they are definitely out of range. But in Perl 5, a negative index may or may not be "out of range". If it is out of range, then it only gives you an undefined value which isn't easy to distinguish from simply having an undefined value in an element.
For example, the Perl 5 code:
use v5.10;
use strict;
use warnings;
my @array = ('a', undef, 'c');
say $array[-1]; # 'c'
say $array[-2]; # undefined
say $array[-3]; # 'a'
say $array[-4]; # out of range
say "======= FINISHED =======";
results in two nearly identical warnings, but still finishes running:
c
Use of uninitialized value $array[-2] in say at array.pl line 7.
a
Use of uninitialized value in say at array.pl line 9.
======= FINISHED =======
But the Perl 6 code
use v6;
my @array = 'a', Any, 'c';
put @array[*-1]; # evaluated as @array[2] or 'c'
put @array[*-2]; # evaluated as @array[1] or Any (i.e. undefined)
put @array[*-3]; # evaluated as @array[0] or 'a'
put @array[*-4]; # evaluated as @array[-1], which is an out-of-range error
put "======= FINISHED =======";
will likewise warn about the undefined value being used, but it fails upon the use of an index that comes out less than 0:
c
Use of uninitialized value @array of type Any in string context.
Methods .^name, .perl, .gist, or .say can be used to stringify it to something meaningful.
in block <unit> at array.p6 line 5
a
Effective index out of range. Is: -1, should be in 0..Inf
in block <unit> at array.p6 line 7
Actually thrown at:
in block <unit> at array.p6 line 7
Thus your Perl 6 code can be more robust by not allowing negative indices, but you can still index from the end using the Whatever Star syntax.
Last word of advice
If you just need the last few elements of an array, I'd recommend using the tail method mentioned in mscha's answer. @array.tail(3) is much more self-explanatory than @array[*-3 .. *-1].

What's the purpose of #[datatype] constructor in Rebol

I just came across this syntax in Rebol to construct some values:
>> #[email! "me@host.com"]
== me@host.com
This seems to be equivalent to
>> to email! "me@host.com"
== me@host.com
and this
>> #[string! "hello"]
== "hello"
While these error out:
>> #[integer! 1]
** Syntax Error: Invalid construct -- #[
** Near: (line 1) #[integer! 1]
>> #[decimal! 1]
** Syntax Error: Invalid construct -- #[
** Near: (line 1) #[decimal! 1]
>> #[string! 1]
== [string! 1]
I wonder what this is for? What benefits does it bring?
This syntactic structure is called "construction syntax" (also see CureCode issue #1955). Its raison d'être is to allow literal forms for values which could not otherwise be directly represented.
There are two main classes of situations that require construction syntax.
Values which (except for construction syntax) have no separate lexical form, but are only constructed through evaluation.
Values with a regular literal form, but where a particular value is "special" in as far as the value no longer can be written directly.
(1) Construction Syntax for Direct Value Representation
One prominent example for this class is object!. Rebol objects are typically created with code such as make object! [a: 42]. This is not a direct literal representation of the resulting object value, but rather code which, when evaluated, (in the DO dialect) creates the expected object value. Construction syntax allows for a direct representation of the value: #[object! [a: 42]].
Other typical examples are #[none!] and the somewhat irregular (construction-syntax-wise) #[true] and #[false] (note the missing !). Wrapping one's head around the difference between e.g. #[none!] and none will lead to a deeper understanding of Rebol semantics (and is thus left as an exercise for the reader).
(2) Construction Syntax as "Escape Mechanism"
A typical example for this case is a reversed URL, such as the value you get from reverse http://stackoverflow.com. If the resulting value would be serialised as a plain URL!, the serialised form will no longer be syntactically valid:
>> /moc.wolfrevokcats//:ptth
** Script error: // does not allow unset! for its value2 argument
Construction syntax provides an escape mechanism for this situation:
>> #[url! "/moc.wolfrevokcats//:ptth"] = reverse http://stackoverflow.com/
== true
This particular example also points to a problematic situation:
>> reverse http://stackoverflow.com/
== /moc.wolfrevokcats//:ptth
Arguably, this output (produced by the mold native) is wrong: the result as displayed is not a valid lexical representation of the corresponding value.
There's a few bugs tracking particular instances of this class of problems (such as CureCode issue #2010 for URL!). One proposal on the table towards a more general solution is to have a round-trip check built into some variant of MOLD: whenever LOAD-ing the result of a non-construction-syntax MOLD results in value different from the original value, this MOLD variant would fall back to serialising to construction syntax.