Specifying lifetimes in nested iterators for flattening

I'm writing a function that should take a couple of vectors and produce their Cartesian product (all of their pair combinations) as a vector in row-major order. In other words, if I have
let x_coords = vec![1, 2, 3];
let y_coords = vec![4, 5, 6];
I want to produce
vec![ [1,4], [1,5], [1,6], [2,4], [2,5], [2,6], [3,4], [3,5], [3,6] ]
This seemed like a perfect job for .flat_map():
fn main() {
let x_coords = vec![1, 2, 3];
let y_coords = vec![4, 5, 6];
let coord_vec: Vec<[isize; 2]> =
x_coords.iter().map(|&x_coord| {
y_coords.iter().map(|&y_coord| {
[x_coord, y_coord]
})
}).flat_map(|s| s).collect();
// Expecting to get: vec![ [1,4], [1,5], [1,6], [2,4], [2,5], [2,6], [3,4], [3,5], [3,6] ]
println!("{:?}", &coord_vec);
}
But this doesn't work, because &x_coord doesn't live long enough. According to the compiler, it ends up inside the y_coords map and then never makes it back out.
I tried using .clone() and move in the closures, but got a strangely long and unclear lecture by the compiler in the form of multiple Note: lines.
Am I just completely off base with flat_map, or can this be saved?

Solution
You were really close! This works:
let coord_vec: Vec<_> = x_coords.iter()
.flat_map(|&x_coord| {
y_coords.iter().map(move |&y_coord| {
// ^^^^
[x_coord, y_coord]
})
})
.collect();
The only thing I added is the move keyword in front of the inner closure. Why? Let's try to understand what the compiler is thinking below!
A few notes, though:
I renamed map to flat_map and removed the second flat_map call... You made your life more complicated than it has to be ;-)
I omitted part of the type annotation of coord_vec, because it's not necessary
Explanation
The type of x_coord is i32 (or any other integer type): not a reference, but the value itself. This means that x_coord is owned by the enclosing function, which happens to be a closure, specifically the "outer" closure that you pass to flat_map. Thus x_coord lives only as long as one call of that closure, no longer. That's what the compiler is telling you. So far so good.
When you define the second closure (the "inner" one), you access the environment, specifically x_coord. The important question now is: how does a closure access its environment? It can do so with an immutable reference, with a mutable reference, or by value. The Rust compiler determines what kind of access to the environment is needed and chooses the "least intrusive" option that works. Let's look at your code: the compiler figures out that the closure only needs to borrow the environment immutably (because i32 is Copy and thus the closure can transform its &i32 into an i32 easily).
But in this case, the compiler's choice doesn't work for us! A closure that borrows its environment can only live as long as that environment. Here the inner closure is returned from the outer closure and has to outlive x_coord, hence the error.
By adding move, we force the compiler to pass the environment to the closure by value (transferring ownership). This way the closure borrows nothing and can live forever (satisfies 'static).
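To see the effect of move in isolation, here is a minimal sketch that puts the same iterator chain into a helper function. The name cartesian and the use of slices as parameters are assumptions made for illustration; they are not part of the question.

```rust
// Hypothetical helper: row-major Cartesian product of two slices.
fn cartesian(xs: &[i32], ys: &[i32]) -> Vec<[i32; 2]> {
    xs.iter()
        .flat_map(|&x| {
            // `move` copies `x` into the inner closure, so the iterator
            // returned from the outer closure borrows nothing local to it.
            ys.iter().map(move |&y| [x, y])
        })
        .collect()
}

fn main() {
    let pairs = cartesian(&[1, 2, 3], &[4, 5, 6]);
    println!("{:?}", pairs);
}
```

Removing the move keyword from the inner closure should bring back the "does not live long enough" error, because the inner closure would then hold a reference to x.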

Related

Unsinking sunk calls via CALL-ME

While trying to come up with an example of maps in sunk context, I bumped into this code:
my $a = -> { 42 };
my $b = -> { "foo" };
$a;
$a();
($a,$b).map: { $_ };
The first call to $a by itself returns:
WARNINGS for /home/jmerelo/Code/raku/my-raku-examples/sunk-map.p6:
Useless use of $a in sink context (line 6)
However, putting .() or () behind, or using them in (what I thought was) sink context in a map didn't result in any warning. Probably it's not sink context, but I would like to know why.
$a;
$a();
($a,$b).foo ...
The first call to $a by itself
That's not a "call". That's just a mention of it aka "use". (Note: this, er, use of "use" has nothing to do with the keyword use in a use statement.)
WARNINGS for /home/jmerelo/Code/raku/my-raku-examples/sunk-map.p6:
Useless use of $a in sink context (line 6)
Right. Just mentioning it like that is useless. You don't do anything with the value of $a (which has been assigned a function). You just drop it on the floor.
However, putting .() or () behind, or using them in (what I thought was) sink context in a map didn't result in any warning.
Right. Those cause .CALL-ME to be invoked on their left hand side. Those are understood to be potentially useful regardless of whether any value returned by the call is used by the code. So there's no warning despite being in sink context (though perhaps sink context is itself context dependent and at the compiler level they aren't even considered to be in sink context; I don't know).
cf my comment about a similar situation in perl.

Is the term immutable variable just a convention?

In Rust variables are immutable by default, i.e., they don't vary but are not constants (as noted here).
Do they retain the name "variable" just by convention, or is there another reason why the term "variable" is maintained?
It should be noted that the term mut in Rust was hotly debated before stabilization, with some arguing that it should be called excl or uniq. The point is that the mut in let mut x and the mut in &mut x are two completely different things.
let mut x declares that x is mutable, in the sense that it can be re-assigned, but also that one can take a &mut reference to it, which is best called an exclusive or unique reference. It is quite possible in Rust to mutate through a shared reference in some cases, with std::cell::Cell for instance, and not all operations that require an exclusive reference involve mutation. An operation that requires an exclusive reference is simply one that would be unsafe with a shared one; Cell is designed in such a way that it is not, by strictly controlling under what conditions mutation can occur.
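As a small sketch of that distinction (the function names bump and peek are made up for this example), the following mutates through a shared reference to a Cell and only reads through an exclusive reference:

```rust
use std::cell::Cell;

// Takes a *shared* reference, yet changes the stored value.
fn bump(counter: &Cell<u32>) {
    counter.set(counter.get() + 1);
}

// Takes an *exclusive* reference, but only reads through it.
fn peek(value: &mut u32) -> u32 {
    *value
}

fn main() {
    let counter = Cell::new(0); // note: no `mut` on the binding
    bump(&counter);
    bump(&counter);
    println!("counter = {}", counter.get());

    let mut n = 7; // `mut` is needed only so that `&mut n` can be taken
    println!("n = {}", peek(&mut n));
}
```

Note that counter is never declared mut, while n has to be, even though n is the one that never changes.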
In theory, the two functions of let mut x could have different keywords, but they are compressed into one for simplicity. Rust could in theory be designed with mut and excl being different keywords, and allowing for let excl x, which would be a variable wherefrom one could take an exclusive reference, but not mutate.
One can also have variables that are not declared with mut, in particular function parameters. In a signature like fn func ( x : u32 ), x is not mutable, but it is variable, because a different x can be passed on every call.
The let mut x type of "mutable" is purely a lint and, in theory, unnecessary for Rust to work: any currently working Rust program will continue to work if all non-mutable variables are made mutable. It's simply considered bad practice to do so, and the compiler will warn the programmer whenever a variable is declared mutable without needing to be; this helps catch unintended bugs. This is absolutely not the case with exclusive and shared references, which must be distinguished and are more than just a lint.
Here "variable" means "factor involved in computation" not "varying". This is from the mathematical principle where expressions like f(x) include x, a variable, as a part of the equation.
In Rust, as in other languages, you'll need variables (e.g. input) that affect how the program runs; otherwise your program would only ever behave in a single, specific way, producing the same output each time.
You'll need to think of what variables change during processing and which do not. Those that do not need to change do not need to be declared mutable.
Regardless of if or when they change, they're still considered variables.
In C++ you'll have things like const int x which is a constant (read-only) variable, so the term can take on all sorts of specific meanings.
Is the term immutable variable just a convention?
By definition, every definition of a word is a convention. Language and the meanings of words change over time and differ from person to person: ask 100 people and you may end up with 100 different definitions of one word. That is why scientific papers often start by defining the words that could be misunderstood, to clarify as much as possible. Rust is no different, which is why we have The Reference.
We have a specific section for variable
A variable is a component of a stack frame, either a named function
parameter, an anonymous temporary, or a named local variable.
A local variable (or stack-local allocation) holds a value directly,
allocated within the stack's memory. The value is a part of the stack
frame.
Local variables are immutable unless declared otherwise. For example:
let mut x = ....
Function parameters are immutable unless declared with mut. The mut
keyword applies only to the following parameter. For example: |mut x,
y| and fn f(mut x: Box<i32>, y: Box<i32>) declare one mutable variable
x and one immutable variable y.
Local variables are not initialized when allocated. Instead, the
entire frame worth of local variables are allocated, on frame-entry,
in an uninitialized state. Subsequent statements within a function may
or may not initialize the local variables. Local variables can be used
only after they have been initialized through all reachable control
flow paths.
So there is not much to add: "variable" is clearly defined in Rust, and it doesn't matter if your own definition, or one you find elsewhere, doesn't match Rust's. In the context of Rust, a variable is that. If you want opinions about this choice, that is off topic as opinion-based. But the Wikipedia definitions make Rust's definition quite standard from both the mathematics and the computer science points of view:
Variable (computer science), a symbolic name associated with a value and whose associated value may be changed
Variable (mathematics), a symbol that represents a quantity in a mathematical expression, as used in many sciences
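Two of the rules quoted from The Reference can be tried out directly. The sketch below is only an illustration; the functions f and describe and their bodies are assumptions, not taken from The Reference.

```rust
// `x` is a mutable parameter and `y` an immutable one. Callers cannot tell
// the difference, because both boxes are moved in by value.
fn f(mut x: Box<i32>, y: Box<i32>) -> i32 {
    *x += *y; // allowed: `x` was declared `mut`
    // *y += 1; // would not compile: `y` is not declared `mut`
    *x
}

// A local may be declared without a value, as long as every reachable
// path initializes it before it is read.
fn describe(n: i32) -> &'static str {
    let label; // declared, but not yet initialized
    if n < 0 {
        label = "negative";
    } else {
        label = "non-negative";
    }
    label
}

fn main() {
    println!("{}", f(Box::new(1), Box::new(2)));
    println!("{}", describe(-5));
}
```

In describe, label is immutable and still assigned after its declaration: each branch initializes it exactly once, which is initialization rather than mutation.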

Mixing-in roles in traits apparently not working

This example is taken from roast, although it's been there for 8 years:
role doc { has $.doc is rw }
multi trait_mod:<is>(Variable $a, :$docced!) {
$a does doc.new(doc => $docced);
}
my $dog is docced('barks');
say $dog.VAR;
This returns Any, without any kind of role mixed in. There's apparently no way to get to the "doc" part, although the trait does not error. Any idea?
(This answer builds on @guifa's answer and JJ's comment.)
The idiom to use in variable traits is essentially $var.var.VAR.
While that sounds fun when said aloud it also seems crazy. It isn't, but it demands explanation at the very least and perhaps some sort of cognitive/syntactic relief.
Here's the brief version of how to make some sense of it:
$var makes sense as the name of the trait parameter because it's bound to a Variable, a compiler's-eye view of a variable.
.var is needed to access the user's-eye view of a variable given the compiler's-eye view.
If the variable is a Scalar then a .VAR is needed as well to get the variable rather than the value it contains. (It does no harm if it isn't a Scalar.)
Some relief?
I'll explain the above in more detail in a mo, but first, what about some relief?
Perhaps we could introduce a new Variable method that does .var.VAR. But imo this would be a mistake unless the name for the method is so good it essentially eliminates the need for the $var.var.VAR incantation explanation that follows in the next section of this answer.
But I doubt such a name exists. Every name I've come up with makes matters worse in some way. And even if we came up with the perfect name, it would still barely be worth it at best.
I was struck by the complexity of your original example. There's an is trait that calls a does trait. So perhaps there's call for a routine that abstracts both that complexity and the $var.var.VAR. But there are existing ways to reduce that double trait complexity anyway, eg:
role doc[$doc] { has $.doc is rw = $doc}
my $dog does doc['barks'];
say $dog.doc; # barks
A longer explanation of $var.var.VAR
But $v is already a variable. Why so many var and VARs?
Indeed. $v is bound to an instance of the Variable class. Isn't that enough?
No, because a Variable:
Is for storing metadata about a variable while it's being compiled. (Perhaps it should have been called Metadata-About-A-Variable-Being-Compiled? Just kidding. Variable looks nice in trait signatures and changing its name wouldn't stop us needing to use and explain the $var.var.VAR idiom anyway.)
Is not the droid we are looking for. We want a user's-eye view of the variable. One that's been declared and compiled and is then being used as part of user code. (For example, $dog in the line say $dog.... Even if it were BEGIN say $dog..., so it ran at compile-time, $dog would still refer to a symbol that's bound to a user's-eye view container or value. It would not refer to the Variable instance that's only the compiler's-eye view of data related to the variable.)
Makes life easier for the compiler and those writing traits. But it requires that a trait writer accesses the user's-eye view of the variable to access or alter the user's-eye view. The .var attribute of the Variable stores that user's-eye view. (I note the roast test has a .container attribute that you omitted. That's clearly now been renamed .var. My guess is that that's because a variable may be bound to an immutable value rather than a container so the name .container was considered misleading.)
So, how do we arrive at $var.var.VAR?
Let's start with a variant of your original code and then move forward. I'll switch from $dog to @dog and drop the .VAR from the say line:
multi trait_mod:<is>(Variable $a, :$docced!) {
$a does role { has $.doc = $docced }
}
my @dog is docced('barks');
say @dog.doc; # No such method 'doc' for invocant of type 'Array'
This almost works. One tiny change and it works:
multi trait_mod:<is>(Variable $a, :$docced!) {
$a.var does role { has $.doc = $docced }
}
my @dog is docced('barks');
say @dog.doc; # barks
All I've done is add a .var to the ... does role ... line. In your original, that line is modifying the compiler's-eye view of the variable, i.e. the Variable object bound to $a. It doesn't modify the user's-eye view of the variable, i.e. the Array bound to @dog.
As far as I know everything now works correctly for plural containers like arrays and hashes:
@dog[1] = 42;
say @dog; # [(Any) 42]
say @dog.doc; # barks
But when we try it with a Scalar variable:
my $dog is docced('barks');
we get:
Cannot use 'does' operator on a type object Any.
This is because the .var returns whatever it is that the user's-eye view variable usually returns. With an Array you get the Array. But with a Scalar you get the value the Scalar contains. (This is a fundamental aspect of P6. It works great but you have to know it in these sorts of scenarios.)
So to get this to appear to work again we have to add a couple .VAR's as well. For anything other than a Scalar a .VAR is a "no op" so it does no harm to cases other than a Scalar to add it:
multi trait_mod:<is>(Variable $a, :$docced!) {
$a.var.VAR does role { has $.doc = $docced }
}
And now the Scalar case also appears to work:
my $dog is docced('barks');
say $dog.VAR.doc; # barks
(I've had to reintroduce the .VAR in the say line for the same reason I had to add it to the $a.var.VAR ... line.)
If all were well that would be the end of this answer.
A bug
But something is broken. If we'd attempted to initialize the Scalar variable:
my $dog is docced('barks') = 42;
we'd see:
Cannot assign to an immutable value
As @guifa noted, and I stumbled on a while back:
It seems that a Scalar with a mixin no longer successfully functions as a container and the assignment fails. This currently looks to me like a bug.
Not a satisfactory answer but maybe you can progress from it
role doc {
has $.doc is rw;
}
multi trait_mod:<is>(Variable:D $v, :$docced!) {
$v.var.VAR does doc;
$v.var.VAR.doc = $docced;
}
my $dog is docced('barks');
say $dog; # ↪︎ Scalar+{doc}.new(doc => "barks")
say $dog.doc;  # ↪︎ barks
$dog.doc = 'woofs'; #
say $dog; # ↪︎ Scalar+{doc}.new(doc => "woofs")
Unfortunately, there is something off with this, and applying the trait seems to cause the variable to become immutable.

How to mutate another item in a vector, but not the vector itself, while iterating over the vector?

It is quite clear to me that iterating over a vector shouldn't let the loop body mutate the vector arbitrarily. This prevents iterator invalidation, which is prone to bugs.
However, not all kinds of mutation lead to iterator invalidation. See the following example:
let mut my_vec: Vec<Vec<i32>> = vec![vec![1,2], vec![3,4], vec![5,6]];
for inner in my_vec.iter_mut() { // <- or .iter()
// ...
my_vec[some_index].push(inner[0]); // <-- ERROR
}
Such a mutation does not invalidate the iterator of my_vec, however it is disallowed. It could invalidate any references to the specific elements in my_vec[some_index] but we do not use any such references anyway.
I know that these questions are common, and I'm not asking for an explanation. I am looking for a way to refactor this so that I can get rid of this loop. In my actual code I have a huge loop body and I can't modularize it unless I express this bit nicely.
What I have thought of so far:
Wrapping the vector with Rc<RefCell<...>>. I think this would still fail at runtime, since the RefCell would be borrowed by the iterator and then will fail when the loop body tries to borrow it.
Using a temporary vector to accumulate the future pushes, and push them after the loop ends. This is okay, but needs more allocations than pushing them on the fly.
Unsafe code, and messing with pointers.
Anything listed in the Iterator documentation does not help. I checked out itertools and it looks like it wouldn't help either.
Using a while loop and indexing instead of an iterator that holds a reference to the outer vector. This is okay, but does not let me use iterators and adapters. I just want to get rid of this outer loop and use my_vec.foreach(...).
Are there any idioms or libraries that would let me do this nicely? Unsafe functions would be okay as long as they don't expose pointers to me.
You can wrap each of the inner vectors in a RefCell.
use std::cell::RefCell;
fn main() {
let my_vec : Vec<RefCell<Vec<i32>>> = vec![
RefCell::new(vec![1,2]),
RefCell::new(vec![3,4]),
RefCell::new(vec![5,6])];
for inner in my_vec.iter() {
// ...
let value = inner.borrow()[0];
my_vec[some_index].borrow_mut().push(value);
}
}
Note that the value binding here is important if you need to be able to push to the vector that inner refers to. value happens to be a type that doesn't contain references (it's i32), so it doesn't keep the first borrow active (it ends by the end of the statement). Then, the next statement may borrow the same vector or another vector mutably and it'll work.
If we wrote my_vec[some_index].borrow_mut().push(inner.borrow()[0]); instead, then both borrows would be active until the end of the statement. If both my_vec[some_index] and inner refer to the same RefCell<Vec<i32>>, this will panic with RefCell<T> already mutably borrowed.
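For a version that compiles on its own, here is a sketch in which the loop is wrapped in a function. The names push_firsts and target are assumptions; target stands in for some_index from the question.

```rust
use std::cell::RefCell;

// Hypothetical helper: pushes the first element of every inner vector
// onto the vector at position `target`.
fn push_firsts(vecs: &[RefCell<Vec<i32>>], target: usize) {
    for inner in vecs.iter() {
        // The shared borrow ends with this statement, because `value`
        // is a plain i32 copy and holds no reference.
        let value = inner.borrow()[0];
        // Fine even when `inner` is `vecs[target]`: no borrow is active now.
        vecs[target].borrow_mut().push(value);
    }
}

fn main() {
    let my_vec = vec![
        RefCell::new(vec![1, 2]),
        RefCell::new(vec![3, 4]),
        RefCell::new(vec![5, 6]),
    ];
    push_firsts(&my_vec, 0);
    println!("{:?}", my_vec[0].borrow());
}
```

Because the outer vector is only ever borrowed immutably, this shape also works with iterator adaptors; the borrow checking for the inner vectors is simply deferred to run time.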
Without changing the type of my_vec, you could simply use access by indexing and split_at_mut:
for index in 0..my_vec.len() {
let (first, second) = my_vec.split_at_mut(index);
first[some_index].push(second[0]);
}
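Note: beware, the indices in second are off by index, and first only holds the elements before index, so this exact form only works while some_index < index.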
This is safe, relatively easy, and very flexible. It does not, however, work with iterator adaptors.
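One way to handle both orderings of the two indices is to hide the split behind a small helper. This is a sketch under assumptions: pair_mut and target are hypothetical names (target stands in for some_index), and the case where both indices are equal is handled separately with plain indexing.

```rust
// Hypothetical helper: borrows element `i` immutably and element `j`
// mutably from the same slice. Panics if `i == j` or an index is out of bounds.
fn pair_mut<T>(items: &mut [T], i: usize, j: usize) -> (&T, &mut T) {
    assert_ne!(i, j, "source and target must differ");
    if i < j {
        let (left, right) = items.split_at_mut(j);
        (&left[i], &mut right[0]) // `right` starts at index `j`
    } else {
        let (left, right) = items.split_at_mut(i);
        (&right[0], &mut left[j]) // `right` starts at index `i`
    }
}

fn main() {
    let mut my_vec = vec![vec![1, 2], vec![3, 4], vec![5, 6]];
    let target = 0; // stands in for `some_index`
    for index in 0..my_vec.len() {
        if index == target {
            // Same inner vector: copy the value out, then push.
            let value = my_vec[index][0];
            my_vec[index].push(value);
        } else {
            let (source, dest) = pair_mut(&mut my_vec, index, target);
            dest.push(source[0]);
        }
    }
    println!("{:?}", my_vec);
}
```

The helper keeps the off-by-index arithmetic in one place, at the cost of an extra branch for the equal-index case.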

Unpacking stack objects (such as structs) using "if let"

This is rather a Swift compiler optimization question about optional stack objects (such as structs) and "if let".
In Swift, "if let" provides syntactic sugar for working with optionals.
What about structs that live on the stack? As a C++ programmer, I would not introduce an unnecessary copy of a stack object, especially not just to check its presence in the container. Is the struct copied with all its members recursively every time you use "if let", or is the Swift compiler smart enough to create a local variable by reference or to use other tricks?
For example, we have this struct packaged into an optional:
struct MyData{
var a=1
var b=2
//lots more store....
func description()->String{
return "MyData: a="+String(a)+", b="+String(b)
}
}
var optionalData:MyData?=nil
optionalData=MyData()
Since the struct is on the stack, does unpacking it make an unnecessary copy from the container optionalData to the local data, or, given that data is a constant, is the copy optimized away?
if let data=optionalData{//is data copy or reference?
println(data.description())
}
Since the struct is on the stack, does unpacking it make an unnecessary copy from the container optionalData to the local data, or, given that data is a constant, is the copy optimized away?
It is unlikely that the compiler is actually emitting code to make a copy. let essentially gives another name to an expression.
With classes, "let x = y" will allow you to write through your copy of x (because you are just copying a reference), i.e.
let x = y
x.foo = bar
y.foo // => bar
but with structs, this is not the case. You aren't allowed to write to a let struct or call any mutable methods on it. This allows the Swift compiler to treat let x = y, where y is a struct, as a no-op.
However, this code probably does make a copy of y:
y.foo = bar
let x = y
y.foo = baz
x.foo // => bar
It has to, because you wrote to the thing you were copying from. This is known as "copy-on-write", and it's an optimization that's made possible by using let semantics.
To answer your final question:
if let data=optionalData{//is data copy or reference?
println(data.description())
}
data is assuredly a reference in this case. Actually it probably does not exist at all; the compiler is going to emit the same code as if you wrote:
if (optionalData != nil)
{
println(optionalData!.description())
}