How to implement an iterator giving struct with lifetimes?

I want to create a mutable iterator that controls modifications. For this I created a struct named FitnessIterMut that implements the Iterator trait.
The next() method returns a struct that can update the container itself whenever a modification is made. (Is this a good way to do this sort of thing?)
pub struct FitnessModifier<'a, T: 'a> {
    wheel: &'a mut RouletteWheel<T>,
    value: &'a mut (f32, T)
}
impl<'a, T> FitnessModifier<'a, T> {
    pub fn read(&'a self) -> &'a (f32, T) {
        self.value
    }
    pub fn set_fitness(&'a self, new: f32) {
        let &mut (ref mut fitness, _) = self.value;
        self.wheel.proba_sum -= *fitness;
        self.wheel.proba_sum += new;
        *fitness = new;
    }
}
pub struct FitnessIterMut<'a, T: 'a> {
    wheel: &'a mut RouletteWheel<T>,
    iterator: &'a mut IterMut<'a, (f32, T)>
}
impl<'a, T> Iterator for FitnessIterMut<'a, T> {
    type Item = FitnessModifier<'a, T>;
    fn next(&mut self) -> Option<Self::Item> {
        if let Some(value) = self.iterator.next() {
            Some(FitnessModifier { wheel: self.wheel, value: value })
        }
        else {
            None
        }
    }
}
This gives me the error below. I think I need to introduce a 'b lifetime, but I'm a little lost.
error: cannot infer an appropriate lifetime for automatic coercion due to conflicting requirements [E0495]
Some(FitnessModifier { wheel: self.wheel, value: value })
^~~~~~~~~~
help: consider using an explicit lifetime parameter as shown: fn next(&'a mut self) -> Option<Self::Item>
fn next(&mut self) -> Option<Self::Item> {
if let Some(value) = self.iterator.next() {
Some(FitnessModifier { wheel: self.wheel, value: value })
}
else {
None

You won't be able to get this to work with simple mutable references, unless you're OK with not implementing the standard Iterator trait. That's because it's not legal in Rust to have more than one usable mutable alias to a particular value at the same time (because it can lead to memory unsafety). Let's see why your code violates this restriction.
First, I can instantiate a FitnessIterMut object. From this object, I can call next to obtain a FitnessModifier. At this point, both the FitnessIterMut and the FitnessModifier contain a mutable reference to a RouletteWheel object, and both the FitnessIterMut and the FitnessModifier are still usable – that's not legal! I could call next again on the FitnessIterMut to obtain another FitnessModifier, and now I'd have 3 mutable aliases to the RouletteWheel.
Your code fails to compile because you assumed that mutable references can be copied, which is not the case. Immutable references (&'a T) implement Copy, but mutable references (&'a mut T) do not, so they cannot be copied.
What could we do to fix this? Rust lets us temporarily make a mutable reference unusable (i.e. you'll get a compiler error if you try to use it) by reborrowing from it. Normally, for types that don't implement Copy, the compiler will move a value instead of copying it, but for mutable references, the compiler will reborrow instead of moving. Reborrowing can be seen as "flattening" or "collapsing" references to references, while keeping the shortest lifetime.
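As a small illustration of implicit versus explicit reborrowing, here is a minimal sketch. The function names bump and bump_three_times are made up for this example and do not come from the question.

```rust
// Hypothetical helper: takes a mutable reference and increments the target.
fn bump(x: &mut i32) {
    *x += 1;
}

// Hypothetical caller: `x` is a `&mut i32`, which is not `Copy`.
pub fn bump_three_times(x: &mut i32) {
    // Explicit reborrow: a fresh, shorter-lived `&mut i32` derived from `x`.
    bump(&mut *x);
    // Implicit reborrow: the compiler inserts `&mut *x` here instead of
    // moving `x`, so `x` is only unusable for the duration of the call.
    bump(x);
    // `x` is usable again because the previous reborrow has ended.
    bump(x);
}
```

If passing x to bump moved it, the second and third calls would not compile; they do compile because each call only borrows from x temporarily.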
Let's see how this works in practice. Here's a valid implementation of next. Note that this doesn't conform to the contract of the standard Iterator trait, so I made this an inherent method instead.
impl<'a, T> FitnessIterMut<'a, T> {
    fn next<'b>(&'b mut self) -> Option<FitnessModifier<'b, T>> {
        if let Some(value) = self.iterator.next() {
            Some(FitnessModifier { wheel: self.wheel, value: value })
        }
        else {
            None
        }
    }
}
Instead of returning an Option<FitnessModifier<'a, T>>, we now return an Option<FitnessModifier<'b, T>>, where 'b is linked to the lifetime of the self argument. When initializing the wheel field of the result FitnessModifier, the compiler will automatically reborrow from self.wheel (we could make this explicit by writing &mut *self.wheel instead of self.wheel).
Since this expression references a &'a mut RouletteWheel<T>, you thought the type of this expression would also be &'a mut RouletteWheel<T>. However, because this expression borrows from self, which is a &'b mut FitnessIterMut<'a, T>, the type of this expression is actually &'b mut RouletteWheel<T>. In your code, you tried to assign a &'b mut RouletteWheel<T> to a field expecting a &'a mut RouletteWheel<T>, but 'a is longer than 'b, which is why you got a compiler error.
If Rust didn't allow reborrowing, then instead of storing a &'b mut RouletteWheel<T> in FitnessModifier, you'd have to store a &'b &'a mut RouletteWheel<T>, where 'a is the lifetime of the RouletteWheel<T> and 'b is the lifetime of the &'a mut RouletteWheel<T> in the FitnessIterMut<'a, T>. However, Rust lets us "collapse" this reference to a reference and we can just store a &'b mut RouletteWheel<T> instead (the lifetime is 'b, not 'a, because 'b is the shorter lifetime).
The net effect of this change is that, after you call next, you can't use the FitnessIterMut at all until the resulting Option<FitnessModifier<'b, T>> goes out of scope. That's because the FitnessModifier is borrowing from self, and since the method passed self by mutable reference, the compiler assumes that the FitnessModifier keeps a mutable reference to the FitnessIterMut or to one of its fields (which is true here, but is not always true in general). Thus, while there's a FitnessModifier in scope, there's only one usable mutable alias to the RouletteWheel, which is the one in the FitnessModifier object. When the FitnessModifier<'b, T> goes out of scope, the FitnessIterMut object will become usable again.
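To see the "borrow from self" pattern end to end, here is a minimal sketch under some loud assumptions: the question never shows RouletteWheel, so the layout below (a proba_sum field plus a cards vector) is hypothetical, and the iterator borrows the two fields separately rather than holding the whole wheel. The helper double_all is also invented for the example.

```rust
use std::slice::IterMut;

// Hypothetical layout; the real RouletteWheel is not shown in the question.
pub struct RouletteWheel<T> {
    proba_sum: f32,
    cards: Vec<(f32, T)>,
}

pub struct FitnessModifier<'b, T> {
    proba_sum: &'b mut f32,
    value: &'b mut (f32, T),
}

impl<'b, T> FitnessModifier<'b, T> {
    pub fn read(&self) -> &(f32, T) {
        &*self.value
    }

    // Keeps the running sum consistent with the new fitness.
    pub fn set_fitness(&mut self, new: f32) {
        *self.proba_sum += new - self.value.0;
        self.value.0 = new;
    }
}

pub struct FitnessIterMut<'a, T> {
    proba_sum: &'a mut f32,
    iterator: IterMut<'a, (f32, T)>,
}

impl<'a, T> FitnessIterMut<'a, T> {
    // Inherent method, not Iterator::next: the item borrows from `self`
    // for 'b, so only one modifier can be alive at a time.
    pub fn next<'b>(&'b mut self) -> Option<FitnessModifier<'b, T>> {
        let value = self.iterator.next()?;
        Some(FitnessModifier {
            proba_sum: &mut *self.proba_sum, // explicit reborrow
            value,
        })
    }
}

impl<T> RouletteWheel<T> {
    pub fn new(cards: Vec<(f32, T)>) -> Self {
        let proba_sum: f32 = cards.iter().map(|c| c.0).sum();
        RouletteWheel { proba_sum, cards }
    }

    pub fn proba_sum(&self) -> f32 {
        self.proba_sum
    }

    // Borrows the two fields disjointly, which the borrow checker allows.
    pub fn iter_mut(&mut self) -> FitnessIterMut<'_, T> {
        FitnessIterMut {
            proba_sum: &mut self.proba_sum,
            iterator: self.cards.iter_mut(),
        }
    }
}

// Hypothetical usage: `for` loops need the Iterator trait, so use `while let`.
pub fn double_all<T>(wheel: &mut RouletteWheel<T>) {
    let mut it = wheel.iter_mut();
    while let Some(mut modifier) = it.next() {
        let old = modifier.read().0;
        modifier.set_fitness(old * 2.0);
    }
}
```

Each modifier is dropped at the end of a loop iteration, which is what makes it.next() callable again on the following iteration.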
If you absolutely need to conform to the Iterator trait, then I suggest you replace your mutable references with Rc<RefCell<T>> instead. Rc doesn't implement Copy, but it implements Clone (which only clones the pointer, not the underlying data), so you need to call .clone() explicitly to clone an Rc. RefCell does dynamic borrow checking at runtime, which has a bit of runtime overhead, but gives you more freedom in how you pass mutable objects around.
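A rough sketch of the Rc<RefCell<T>> approach follows. All type names (Wheel, SharedModifier, SharedIter) and the field layout are assumptions made for illustration; items refer to their card by index instead of holding a reference, which is one possible design and not necessarily the best one.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical container layout, assumed for this sketch.
struct Wheel<T> {
    proba_sum: f32,
    cards: Vec<(f32, T)>,
}

// Each item owns a cloned Rc, so it does not borrow from the iterator.
struct SharedModifier<T> {
    wheel: Rc<RefCell<Wheel<T>>>,
    index: usize,
}

impl<T> SharedModifier<T> {
    fn fitness(&self) -> f32 {
        self.wheel.borrow().cards[self.index].0
    }

    fn set_fitness(&self, new: f32) {
        // Checked at runtime: panics if the wheel is already borrowed.
        let mut wheel = self.wheel.borrow_mut();
        let old = wheel.cards[self.index].0;
        wheel.proba_sum += new - old;
        wheel.cards[self.index].0 = new;
    }
}

struct SharedIter<T> {
    wheel: Rc<RefCell<Wheel<T>>>,
    index: usize,
}

impl<T> Iterator for SharedIter<T> {
    type Item = SharedModifier<T>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.index >= self.wheel.borrow().cards.len() {
            return None;
        }
        let item = SharedModifier {
            wheel: Rc::clone(&self.wheel), // clones the pointer only
            index: self.index,
        };
        self.index += 1;
        Some(item)
    }
}
```

Because the items are independent of the iterator, adapters such as collect() work, at the cost of a runtime borrow check on every access.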

Related

Rust `From` trait, errors, reference vs Box and `?` operator [duplicate]

I am pretty confused on the ? operator in functions that return Result<T, E>.
I have the following snippet of code:
use std::error;
use std::fs;
fn foo(s: &str) -> Result<&str, Box<error::Error>> {
    let result = fs::read_to_string(s)?;
    return Ok(&result);
}
fn bar(s: &str) -> Result<&str, &dyn error::Error> {
    // the trait `std::convert::From<std::io::Error>` is not implemented for `&dyn std::error::Error` (1)
    let result = fs::read_to_string(s)?;
    return Ok(&result);
}
fn main() {
    println!("{}", foo("foo.txt").unwrap());
    println!("{}", bar("bar.txt").unwrap());
}
As you might see from the above snippet, the ? operator works pretty well with the returned boxed error, but not with dynamic error references (error at (1)).
Is there any specific reason why it does not work? In my limited knowledge of Rust, it is more natural to return an error reference rather than a boxed object: in the end, after returning from the foo function, I expect deref coercion to work with it, so why not return the error reference itself?
Look at this function signature:
fn bar(s: &str) -> Result<&str, &dyn error::Error> {
The error type is a reference, but a reference to what? Who owns the value being referenced? The value cannot be owned by the function itself because it would go out of scope and Rust, quite rightly, won't allow you to return the dangling reference. So the only alternative is that the error is the input string slice s, or some sub-slice of it. This is definitely not what you wanted.
Now, the error:
the trait `std::convert::From<std::io::Error>` is not implemented for `&dyn std::error::Error`
The trait isn't implemented, and it can't be. To see why, try to implement it by hand:
impl<'a> From<io::Error> for &'a dyn error::Error {
    fn from(e: io::Error) -> &'a dyn error::Error {
        // what can go here?
    }
}
This method is impossible to implement, for exactly the same reason.
Why does it work for Box<dyn Error>? A Box allocates its data on the heap, but also owns that data and deallocates it when the box goes out of scope. This is completely different from references, where the owner is separate, and the reference is prevented from outliving the data by lifetime parameters in the types.
See also:
Is there any way to return a reference to a variable created in a function?
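To make the Box case concrete, here is a minimal sketch of roughly what ? does with the error: it passes it through From::from, and the standard library's From impl for Box<dyn Error> moves the io::Error onto the heap, so the returned value owns it. The function name read_explicit is hypothetical.

```rust
use std::error::Error;
use std::fs;

// Hypothetical function: the same behaviour as using `?`, written out by hand.
pub fn read_explicit(path: &str) -> Result<String, Box<dyn Error>> {
    match fs::read_to_string(path) {
        Ok(text) => Ok(text),
        // From::from resolves to the std impl that boxes any `E: Error`;
        // the Box takes ownership, so nothing dangles after we return.
        Err(e) => Err(From::from(e)),
    }
}
```

No equivalent conversion can exist for &dyn Error, because there would be no owner left for the io::Error once the function returns.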
Although it is possible to cast the concrete type std::io::Error into dyn Error, it is not possible to return it as a reference, because the owned value is dropped at the end of the function; the same goes for your String -> &str. The Box<dyn error::Error> example works because an owned error is created on the heap (Box<std::io::Error>), and the standard library implements both From<E> for Box<dyn Error> (the conversion that ? performs) and Error for Box<T> (impl<T: Error> Error for Box<T>).
If you want to erase the concrete type and only work with the available methods of a trait, it is possible to use impl Trait.
use std::{error, fs};
fn foo(s: &str) -> Result<String, Box<dyn error::Error>> {
    let result = fs::read_to_string(s)?;
    Ok(result)
}
fn bar(s: &str) -> Result<String, impl error::Error> {
    let result = match fs::read_to_string(s) {
        Ok(x) => x,
        Err(x) => return Err(x),
    };
    Ok(result)
}
fn main() {
    println!("{}", foo("foo.txt").unwrap());
    println!("{}", bar("bar.txt").unwrap());
}

How can I implement a "default iterator" for a trait?

I am trying to implement a default iterator for structs implementing a trait. My trait is called DataRow, represents a row of table cells, and looks like this:
pub trait DataRow<'a> {
    // Gets a cell by index
    fn getCell(&self, i: usize) -> &DataCell<'a>;
    // Gets the number of cells in the row
    fn getNumCells(&self) -> usize;
}
The default iterator I want to provide should use those two methods to iterate over the row and return cell references. In Java this would boil down to an abstract class DataRow that implements Iterable. In Rust I tried first with IntoIterator:
impl<'a, T> IntoIterator for &'a T
where
    T: DataRow<'a>,
{
    type Item = &'a DataCell<'a>;
    type IntoIter = DataRowIterator<'a, T>;
    fn into_iter(self) -> DataRowIterator<'a, T> {
        return DataRowIterator::new(self);
    }
}
This does not work as anyone could implement their own iterator for their own implementation of the DataRow trait.
My second try was adding an iter method to the trait which creates the iterator and returns it:
fn iter(&self) -> DataRowIterator<'a, Self> {
    return DataRowIterator::new(self);
}
This does not work either, because the size of Self is not known at compile time. Since DataRow can contain an arbitrary number of cells, I also cannot mark it as Sized to get around that.
My demo code including notes on the occurring errors
How would someone implement such a "default iterator" for a custom trait?
You can implement IntoIterator for a trait object reference.
// `dyn` is required for trait objects in current Rust editions.
impl<'a> IntoIterator for &'a dyn DataRow<'a> {
    type Item = &'a DataCell<'a>;
    type IntoIter = DataRowIterator<'a>;
    fn into_iter(self) -> DataRowIterator<'a> {
        DataRowIterator::new(self)
    }
}
DataRowIterator should be modified to keep the trait object reference (&DataRow, written &dyn DataRow in current Rust) instead of &T, and to use only the methods available on the DataRow trait.
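A minimal sketch of what that modified iterator might look like is below. It makes several assumptions: DataCell is reduced to a single text field, the trait methods are renamed to snake_case (get_cell, get_num_cells), and VecRow and join_row are hypothetical names added only to exercise the iterator.

```rust
// Simplified stand-in for the question's DataCell (assumed shape).
pub struct DataCell<'a> {
    pub text: &'a str,
}

pub trait DataRow<'a> {
    fn get_cell(&self, i: usize) -> &DataCell<'a>;
    fn get_num_cells(&self) -> usize;
}

// Holds a trait object reference, so no generic parameter is needed.
pub struct DataRowIterator<'a> {
    row: &'a dyn DataRow<'a>,
    index: usize,
}

impl<'a> Iterator for DataRowIterator<'a> {
    type Item = &'a DataCell<'a>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.index < self.row.get_num_cells() {
            let cell = self.row.get_cell(self.index);
            self.index += 1;
            Some(cell)
        } else {
            None
        }
    }
}

impl<'a> IntoIterator for &'a dyn DataRow<'a> {
    type Item = &'a DataCell<'a>;
    type IntoIter = DataRowIterator<'a>;

    fn into_iter(self) -> DataRowIterator<'a> {
        DataRowIterator { row: self, index: 0 }
    }
}

// Hypothetical implementor used only to try the iterator out.
pub struct VecRow<'a> {
    pub cells: Vec<DataCell<'a>>,
}

impl<'a> DataRow<'a> for VecRow<'a> {
    fn get_cell(&self, i: usize) -> &DataCell<'a> {
        &self.cells[i]
    }
    fn get_num_cells(&self) -> usize {
        self.cells.len()
    }
}

// Any row can be iterated once it is viewed as `&dyn DataRow`.
pub fn join_row<'a>(row: &'a dyn DataRow<'a>) -> String {
    let mut parts = Vec::new();
    for cell in row {
        parts.push(cell.text);
    }
    parts.join(",")
}
```

The trade-off is dynamic dispatch on every get_cell call, in exchange for a single iterator type shared by all implementors.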

Constrain parent trait to reference

I want to define a trait Container such that every implementor of this trait also needs to implement IntoIterator, with the caveat that the iteration ALWAYS only borrows the instance. If I understand correctly, I can implement IntoIterator using a pattern like this:
impl<'a> IntoIterator for &'a ContainerImpl
However, how can I specify that this needs to be implemented if a type implements Container, e.g.:
trait Container: &IntoIter ???
You can add a where clause to traits, too:
trait IterBorrow where for<'a> &'a Self: IntoIterator {}
impl IterBorrow for [i32] {} // legal
// impl IterBorrow for i32 {} // Illegal
However, it seems you currently need to reiterate this bound whenever you actually want to iterate, i.e., this function does not compile without the where clause:
fn foo<T: IterBorrow>(x: T) where for<'a> &'a T: IntoIterator {
    for _ in &x {}
    for _ in &x {}
}
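For a version that can actually be called and checked, here is a small sketch of the same idea. The impl for Vec<T> and the function count_twice are additions for illustration; they are not part of the original answer.

```rust
// Same trait as in the answer: implementors must be iterable by reference.
trait IterBorrow
where
    for<'a> &'a Self: IntoIterator,
{
}

// Legal because `&'a Vec<T>: IntoIterator` holds for every lifetime 'a.
impl<T> IterBorrow for Vec<T> {}

// Hypothetical helper: iterates twice, which only works because each loop
// borrows the container instead of consuming it. The bound is repeated here
// because, as noted above, it is not implied by `T: IterBorrow`.
fn count_twice<T>(x: &T) -> usize
where
    T: IterBorrow,
    for<'a> &'a T: IntoIterator,
{
    let mut n = 0;
    for _ in x {
        n += 1;
    }
    for _ in x {
        n += 1;
    }
    n
}
```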

How can I return an iterator over a slice?

fn main() {
    let vec: Vec<_> = (0..5).map(|n| n.to_string()).collect();
    for item in get_iterator(&vec) {
        println!("{}", item);
    }
}
fn get_iterator(s: &[String]) -> Box<Iterator<Item=String>> {
    Box::new(s.iter())
}
// `dyn` is required for trait objects in current Rust editions.
fn get_iterator<'a>(s: &'a [String]) -> Box<dyn Iterator<Item=&'a String> + 'a> {
    Box::new(s.iter())
}
The trick here is that we start with a slice of items and that slice has the lifetime 'a. slice::iter returns a slice::Iter with the same lifetime as the slice. The implementation of Iterator likewise returns references with that lifetime. We need to connect all of the lifetimes together.
That explains the 'a in the arguments and in the Item=&'a part. So what's the + 'a mean? There's a complete answer about that, and another with more detail. The short version is that an object with references inside of it may implement a trait, so we need to account for those lifetimes when talking about a trait. By default, that lifetime is 'static as it was determined that was the usual case.
The Box is not strictly required, but is a normal thing you'll see when you don't want to deal with the complicated types that might underlie the implementation (or just don't want to expose the implementation). In this case, the function could be
fn get_iterator<'a>(s: &'a [String]) -> std::slice::Iter<'a, String> {
    s.iter()
}
But if you add .skip(1), the type would be:
std::iter::Skip<std::slice::Iter<'a, String>>
And if you involve a closure, you cannot write the type out at all, because closures are unique, anonymous, compiler-generated types. In those cases you need a Box (or, on Rust 1.26 and later, an impl Iterator return type).
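For comparison, here is a minimal sketch of the closure case using impl Trait in return position, which has been available since Rust 1.26 and avoids the heap allocation. The function name get_long_items and its min_len parameter are hypothetical.

```rust
// Hypothetical function: skips the first item, then keeps items of at least
// `min_len` bytes. The concrete type (Filter<Skip<Iter<..>>, closure>) cannot
// be named, so it is hidden behind `impl Iterator`.
pub fn get_long_items<'a>(
    s: &'a [String],
    min_len: usize,
) -> impl Iterator<Item = &'a String> + 'a {
    // `move` copies `min_len` into the closure so it does not borrow a local.
    s.iter().skip(1).filter(move |item| item.len() >= min_len)
}
```

The + 'a plays the same role as in the boxed version: it records that the hidden iterator type holds references that live for 'a.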

Rust struct can borrow "&'a mut self" twice, so why can't a trait?

The following Rust code compiles successfully:
struct StructNothing;
impl<'a> StructNothing {
    fn nothing(&'a mut self) -> () {}
    fn twice_nothing(&'a mut self) -> () {
        self.nothing();
        self.nothing();
    }
}
However, if we try to package it in a trait, it fails:
pub trait TraitNothing<'a> {
    fn nothing(&'a mut self) -> () {}
    fn twice_nothing(&'a mut self) -> () {
        self.nothing();
        self.nothing();
    }
}
This gives us:
error[E0499]: cannot borrow `*self` as mutable more than once at a time
--> src/lib.rs:6:9
|
1 | pub trait TraitNothing<'a> {
| -- lifetime `'a` defined here
...
5 | self.nothing();
| --------------
| |
| first mutable borrow occurs here
| argument requires that `*self` is borrowed for `'a`
6 | self.nothing();
| ^^^^ second mutable borrow occurs here
Why is the first version allowed, but the second version forbidden?
Is there any way to convince the compiler that the second version is OK?
Background and motivation
Libraries like rust-csv would like to support streaming, zero-copy parsing because it's 25 to 50 times faster than allocating memory (according to benchmarks). But Rust's built-in Iterator trait can't be used for this, because there's no way to implement collect(). The goal is to define a StreamingIterator trait which can be shared by rust-csv and several similar libraries, but every attempt to implement it so far has run into the problem above.
The following is an extension of Francis's answer using implicit lifetimes but it allows for the return value to be lifetime bound:
pub trait TraitNothing<'a> {
    fn change_it(&mut self);
    fn nothing(&mut self) -> &Self {
        self.change_it();
        self
    }
    fn bounded_nothing(&'a mut self) -> &'a Self {
        self.nothing()
    }
    fn twice_nothing(&'a mut self) -> &'a Self {
        // uncomment to show old fail
        // self.bounded_nothing();
        // self.bounded_nothing()
        self.nothing();
        self.nothing()
    }
}
It's less than perfect, but you can call the methods with implicit lifetimes change_it and nothing multiple times within other methods. I don't know if this will solve your real problem because ultimately self has the generic type &mut Self in the trait methods whereas in the struct it has type &mut StructNothing and the compiler can't guarantee that Self doesn't contain a reference. This workaround does solve the code example.
If you put the lifetime parameters on each method rather than on the trait itself, it compiles:
pub trait TraitNothing {
    fn nothing<'a>(&'a mut self) -> () {}
    fn twice_nothing<'a>(&'a mut self) -> () {
        self.nothing();
        self.nothing();
    }
}
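Applied to the streaming use case from the question, the per-method lifetime might look like the sketch below. The trait LineSource, the type UpperLines, and all method names are hypothetical; this is one possible shape for a streaming trait, not the design of any existing crate.

```rust
// Hypothetical streaming trait: each item borrows from `self`, and the
// lifetime lives on the method, not on the trait.
pub trait LineSource {
    fn next_line<'a>(&'a mut self) -> Option<&'a str>;

    // Default methods may call next_line repeatedly, because every call
    // gets its own short borrow of `self`.
    fn skip_two(&mut self) {
        self.next_line();
        self.next_line();
    }

    fn count_remaining(&mut self) -> usize {
        let mut n = 0;
        while self.next_line().is_some() {
            n += 1;
        }
        n
    }
}

// Hypothetical implementor: reuses one buffer for every yielded line.
pub struct UpperLines<'s> {
    rest: std::str::Lines<'s>,
    buf: String,
}

impl<'s> UpperLines<'s> {
    pub fn new(text: &'s str) -> Self {
        UpperLines { rest: text.lines(), buf: String::new() }
    }
}

impl<'s> LineSource for UpperLines<'s> {
    fn next_line<'a>(&'a mut self) -> Option<&'a str> {
        let line = self.rest.next()?;
        // Overwrite the shared buffer; the previous item is gone by now,
        // which the `&'a mut self` borrow guarantees at compile time.
        self.buf.clear();
        self.buf.push_str(&line.to_uppercase());
        Some(self.buf.as_str())
    }
}
```

As the question notes, items like these cannot be gathered with collect(), because each one is invalidated by the next call.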
Nobody seemed to answer the "why?" so here I am.
Here's the point: In the trait, we're calling methods from the same trait. However, in the free impl, we're not calling methods from the same impl.
What? Surely we call methods from the same impl?
Let's be more precise: we're calling methods from the same impl, but not with the same generic parameters.
Your free impl is essentially equivalent to the following:
impl StructNothing {
    fn nothing<'a>(&'a mut self) {}
    fn twice_nothing<'a>(&'a mut self) {
        self.nothing();
        self.nothing();
    }
}
Because the impl's generic lifetime is floating, it can be chosen separately for each method. The compiler does not call <Self<'a>>::nothing(self), but rather it calls <Self<'some_shorter_lifetime>>::nothing(&mut *self).
With the trait, on the other hand, the situation is completely different. The only thing we can know for sure is that Self: Trait<'a>. We cannot call nothing() with a shorter lifetime, because maybe Self doesn't implement Trait with the shorter lifetime. Therefore, we are forced to call <Self as Trait<'a>>::nothing(self), and the result is that we're borrowing for overlapping regions.
From this we can infer that if we tell the compiler that Self implements Trait for any lifetime it will work:
fn twice_nothing(&'a mut self)
where
    Self: for<'b> TraitNothing<'b>,
{
    (&mut *self).nothing();
    (&mut *self).nothing();
}
...except it fails to compile because of issue #84435, so I don't know whether this would have succeeded :(
Is this really surprising?
The assertion you're making is that &mut self lasts for at least the lifetime 'a.
In the former case, &mut self is a pointer to a struct. No pointer aliasing occurs because the borrow is entirely contained in nothing().
In the latter case, the &mut self is a pointer to a pointer to a struct + a vtable for the trait. You're locking the pointed to struct that implements TraitNothing for the duration of 'a; i.e. the whole function each time.
By removing 'a, you let lifetime elision pick a fresh, shorter lifetime for each call instead of the trait's 'a, so it's fine.
If you want to work around it, transmute the &'a TraitNothing to &'static TraitNothing... but I'm pretty sure that's not what you want to do.
This is why we need block scopes ('b: { .... }) in Rust...
Try using dummy lifetimes perhaps?