How can I return an iterator over a slice?

fn main() {
let vec: Vec<_> = (0..5).map(|n| n.to_string()).collect();
for item in get_iterator(&vec) {
println!("{}", item);
}
}
fn get_iterator(s: &[String]) -> Box<Iterator<Item=String>> {
Box::new(s.iter())
}

fn get_iterator<'a>(s: &'a [String]) -> Box<dyn Iterator<Item=&'a String> + 'a> {
Box::new(s.iter())
}
The trick here is that we start with a slice of items and that slice has the lifetime 'a. slice::iter returns a slice::Iter with the same lifetime as the slice. The implementation of Iterator likewise returns references with that lifetime. We need to connect all of the lifetimes together.
That explains the 'a in the arguments and in the Item=&'a part. So what does the + 'a mean? There's a complete answer about that, and another with more detail. The short version is that an object with references inside of it may implement a trait, so we need to account for those lifetimes when talking about a trait object. By default, that lifetime is 'static, as that was determined to be the usual case.
The Box is not strictly required, but is a normal thing you'll see when you don't want to deal with the complicated types that might underlie the implementation (or just don't want to expose the implementation). In this case, the function could be
fn get_iterator<'a>(s: &'a [String]) -> std::slice::Iter<'a, String> {
s.iter()
}
But if you add .skip(1), the type would be:
std::iter::Skip<std::slice::Iter<'a, String>>
And if you involve a closure, then it's impossible to write out the type, as closures are unique, anonymous, auto-generated types! For those cases you need a Box or, on Rust 1.26 and later, an impl Iterator return type.
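As a rough sketch of the closure case (the function names long_items_boxed and long_items and the min_len parameter are made up for illustration, they are not from the question): the first version hides the closure's anonymous type behind a boxed trait object, the second uses impl Trait in return position, which needs Rust 1.26 or later.

```rust
// Boxed trait object: the unnameable closure type is erased behind `dyn`.
fn long_items_boxed<'a>(
    s: &'a [String],
    min_len: usize,
) -> Box<dyn Iterator<Item = &'a String> + 'a> {
    Box::new(s.iter().skip(1).filter(move |item| item.len() >= min_len))
}

// `impl Trait`: the compiler knows the concrete type, the caller does not.
// No heap allocation is needed, but every return path must have one type.
fn long_items<'a>(s: &'a [String], min_len: usize) -> impl Iterator<Item = &'a String> + 'a {
    s.iter().skip(1).filter(move |item| item.len() >= min_len)
}

fn main() {
    let words: Vec<String> = ["a", "bbb", "cc", "dddd"]
        .iter()
        .map(|w| w.to_string())
        .collect();
    for item in long_items_boxed(&words, 3) {
        println!("boxed: {}", item);
    }
    for item in long_items(&words, 3) {
        println!("impl:  {}", item);
    }
}
```

Both signatures keep the + 'a bound discussed above, because the returned iterator still borrows from the slice.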

Related

Rust `From` trait, errors, reference vs Box and `?` operator [duplicate]

I am pretty confused about the ? operator in functions that return Result<T, E>.
I have the following snippet of code:
use std::error;
use std::fs;
fn foo(s: &str) -> Result<&str, Box<error::Error>> {
let result = fs::read_to_string(s)?;
return Ok(&result);
}
fn bar(s: &str) -> Result<&str, &dyn error::Error> {
// the trait `std::convert::From<std::io::Error>` is not implemented for `&dyn std::error::Error` (1)
let result = fs::read_to_string(s)?;
return Ok(&result);
}
fn main() {
println!("{}", foo("foo.txt").unwrap());
println!("{}", bar("bar.txt").unwrap());
}
As you might see from the above snippet, the ? operator works pretty well with the returned boxed error, but not with dynamic error references (error at (1)).
Is there any specific reason why it does not work? In my limited knowledge of Rust, it seems more natural to return an error reference than a boxed object: in the end, after returning from the foo function, I expect deref coercion to work with it, so why not return the error reference itself?
Look at this function signature:
fn bar(s: &str) -> Result<&str, &dyn error::Error> {
The error type is a reference, but a reference to what? Who owns the value being referenced? The value cannot be owned by the function itself because it would go out of scope and Rust, quite rightly, won't allow you to return the dangling reference. So the only alternative is that the error is the input string slice s, or some sub-slice of it. This is definitely not what you wanted.
Now, the error:
the trait `std::convert::From<std::io::Error>` is not implemented for `&dyn std::error::Error`
The trait isn't implemented, and it can't be. To see why, try to implement it by hand:
impl<'a> From<io::Error> for &'a dyn error::Error {
fn from(e: io::Error) -> &'a dyn error::Error {
// what can go here?
}
}
This method is impossible to implement, for exactly the same reason.
Why does it work for Box<dyn Error>? A Box allocates its data on the heap, but also owns that data and deallocates it when the box goes out of scope. This is completely different from references, where the owner is separate, and the reference is prevented from outliving the data by lifetime parameters in the types.
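To make the ownership point concrete, here is a small sketch. The EmptyInput type and the parse_number function are invented for this example. It relies on the standard library's From impl that converts any type implementing Error into Box<dyn Error>, which is the conversion ? applies to the error value.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical error type, only here for illustration.
#[derive(Debug)]
struct EmptyInput;

impl fmt::Display for EmptyInput {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "input was empty")
    }
}

impl Error for EmptyInput {}

fn parse_number(s: &str) -> Result<u32, Box<dyn Error>> {
    if s.is_empty() {
        // `into()` moves the error onto the heap; the Box owns it.
        return Err(EmptyInput.into());
    }
    // `?` performs the same conversion for ParseIntError.
    let n = s.trim().parse::<u32>()?;
    Ok(n)
}

fn main() {
    for input in ["42", "", "abc"] {
        match parse_number(input) {
            Ok(n) => println!("parsed {}", n),
            Err(e) => println!("error: {}", e),
        }
    }
}
```

The returned Box carries the error value with it, so nothing is left dangling when parse_number returns. A reference would have nothing to point at.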
See also:
Is there any way to return a reference to a variable created in a function?
Although it is possible to cast the concrete type std::io::Error into dyn Error, it is not possible to return it as a reference, because the owned value is dropped at the end of the function; the same goes for your String -> &str. The Box<dyn error::Error> example works because the error is moved to the heap and owned by the Box, and std provides the conversion that ? relies on (impl<'a, E: Error + 'a> From<E> for Box<dyn Error + 'a>). std also implements Error for Box<T> (impl<T: Error> Error for Box<T>).
If you want to erase the concrete type and only work with the available methods of a trait, it is possible to use impl Trait.
use std::{error, fs};
fn foo(s: &str) -> Result<String, Box<dyn error::Error>> {
let result = fs::read_to_string(s)?;
Ok(result)
}
fn bar(s: &str) -> Result<String, impl error::Error> {
let result = match fs::read_to_string(s) {
Ok(x) => x,
Err(x) => return Err(x),
};
Ok(result)
}
fn main() {
println!("{}", foo("foo.txt").unwrap());
println!("{}", bar("bar.txt").unwrap());
}

How to implement an iterator giving struct with lifetimes?

I want to create a mutable iterator that controls modifications. For this I created a struct named FitnessIterMut that implements the Iterator trait.
The next() method yields a struct that can update the container itself when a modification is made. (Is this a good way to do this sort of thing?)
pub struct FitnessModifier<'a, T: 'a> {
wheel: &'a mut RouletteWheel<T>,
value: &'a mut (f32, T)
}
impl<'a, T> FitnessModifier<'a, T> {
pub fn read(&'a self) -> &'a (f32, T) {
self.value
}
pub fn set_fitness(&'a self, new: f32) {
let &mut (ref mut fitness, _) = self.value;
self.wheel.proba_sum -= *fitness;
self.wheel.proba_sum += new;
*fitness = new;
}
}
pub struct FitnessIterMut<'a, T: 'a> {
wheel: &'a mut RouletteWheel<T>,
iterator: &'a mut IterMut<'a, (f32, T)>
}
impl<'a, T> Iterator for FitnessIterMut<'a, T> {
type Item = FitnessModifier<'a, T>;
fn next(&mut self) -> Option<Self::Item> {
if let Some(value) = self.iterator.next() {
Some(FitnessModifier { wheel: self.wheel, value: value })
}
else {
None
}
}
}
This gives me the following error. I think I need a 'b lifetime, but I'm a little lost.
error: cannot infer an appropriate lifetime for automatic coercion due to conflicting requirements [E0495]
Some(FitnessModifier { wheel: self.wheel, value: value })
^~~~~~~~~~
help: consider using an explicit lifetime parameter as shown: fn next(&'a mut self) -> Option<Self::Item>
fn next(&mut self) -> Option<Self::Item> {
if let Some(value) = self.iterator.next() {
Some(FitnessModifier { wheel: self.wheel, value: value })
}
else {
None
You won't be able to get this to work with simple mutable references, unless you're OK with not implementing the standard Iterator trait. That's because it's not legal in Rust to have more than one usable mutable alias to a particular value at the same time (because it can lead to memory unsafety). Let's see why your code violates this restriction.
First, I can instantiate a FitnessIterMut object. From this object, I can call next to obtain a FitnessModifier. At this point, both the FitnessIterMut and the FitnessModifier contain a mutable reference to a RouletteWheel object, and both the FitnessIterMut and the FitnessModifier are still usable – that's not legal! I could call next again on the FitnessIterMut to obtain another FitnessModifier, and now I'd have 3 mutable aliases to the RouletteWheel.
Your code fails to compile because you assumed that mutable references can be copied, which is not the case. Immutable references (&'a T) implement Copy, but mutable references (&'a mut T) do not, so they cannot be copied.
What could we do to fix this? Rust lets us temporarily make a mutable reference unusable (i.e. you'll get a compiler error if you try to use it) by reborrowing from it. Normally, for types that don't implement Copy, the compiler will move a value instead of copying it, but for mutable references, the compiler will reborrow instead of moving. Reborrowing can be seen as "flattening" or "collapsing" references to references, while keeping the shortest lifetime.
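A tiny illustration of reborrowing, separate from your types (bump and demo are throwaway names for this sketch):

```rust
fn bump(x: &mut i32) {
    *x += 1;
}

fn demo() -> i32 {
    let mut v = 0;
    let r = &mut v;
    bump(r); // implicit reborrow: `r` is not moved and stays usable
    bump(&mut *r); // the same reborrow, written out explicitly
    {
        let r2 = &mut *r; // while `r2` is live, `r` cannot be used
        *r2 += 10;
    }
    *r += 100; // `r2` is gone, so `r` is usable again
    v
}

fn main() {
    println!("{}", demo());
}
```

If mutable references were moved like other non-Copy values, the second call to bump would not compile.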
Let's see how this works in practice. Here's a valid implementation of next. Note that this doesn't conform to the contract of the standard Iterator trait, so I made this an inherent method instead.
impl<'a, T> FitnessIterMut<'a, T> {
fn next<'b>(&'b mut self) -> Option<FitnessModifier<'b, T>> {
if let Some(value) = self.iterator.next() {
Some(FitnessModifier { wheel: self.wheel, value: value })
}
else {
None
}
}
}
Instead of returning an Option<FitnessModifier<'a, T>>, we now return an Option<FitnessModifier<'b, T>>, where 'b is linked to the lifetime of the self argument. When initializing the wheel field of the result FitnessModifier, the compiler will automatically reborrow from self.wheel (we could make this explicit by writing &mut *self.wheel instead of self.wheel).
Since this expression references a &'a mut RouletteWheel<T>, you thought the type of this expression would also be &'a mut RouletteWheel<T>. However, because this expression borrows from self, which is a &'b mut FitnessIterMut<'a, T>, the type of this expression is actually &'b mut RouletteWheel<T>. In your code, you tried to assign a &'b mut RouletteWheel<T> to a field expecting a &'a mut RouletteWheel<T>, but 'a is longer than 'b, which is why you got a compiler error.
If Rust didn't allow reborrowing, then instead of storing a &'b mut RouletteWheel<T> in FitnessModifier, you'd have to store a &'b &'a mut RouletteWheel<T>, where 'a is the lifetime of the RouletteWheel<T> and 'b is the lifetime of the &'a mut RouletteWheel<T> in the FitnessIterMut<'a, T>. However, Rust lets us "collapse" this reference to a reference and we can just store a &'b mut RouletteWheel<T> instead (the lifetime is 'b, not 'a, because 'b is the shorter lifetime).
The net effect of this change is that, after you call next, you can't use the FitnessIterMut at all until the resulting Option<FitnessModifier<'b, T>> goes out of scope. That's because the FitnessModifier is borrowing from self, and since the method passed self by mutable reference, the compiler assumes that the FitnessModifier keeps a mutable reference to the FitnessIterMut or to one of its fields (which is true here, but is not always true in general). Thus, while there's a FitnessModifier in scope, there's only one usable mutable alias to the RouletteWheel, which is the one in the FitnessModifier object. When the FitnessModifier<'b, T> goes out of scope, the FitnessIterMut object will become usable again.
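Here is a self-contained sketch of that pattern that you can compile and play with. It makes several assumptions that are not in your code: Wheel is a simplified stand-in for RouletteWheel that only tracks the sum, the items live in a separate Vec so the wheel and the slice iterator can be borrowed independently, the item type is fixed to (f32, char), and the IterMut is stored by value.

```rust
use std::slice::IterMut;

// Simplified stand-in for RouletteWheel (assumption): only the sum.
struct Wheel {
    proba_sum: f32,
}

struct FitnessModifier<'b> {
    wheel: &'b mut Wheel,
    value: &'b mut (f32, char),
}

impl<'b> FitnessModifier<'b> {
    fn set_fitness(&mut self, new: f32) {
        self.wheel.proba_sum += new - self.value.0;
        self.value.0 = new;
    }
}

struct FitnessIterMut<'a> {
    wheel: &'a mut Wheel,
    iterator: IterMut<'a, (f32, char)>,
}

impl<'a> FitnessIterMut<'a> {
    // Inherent method, not Iterator::next: the result borrows from `self`.
    fn next<'b>(&'b mut self) -> Option<FitnessModifier<'b>> {
        let value = self.iterator.next()?;
        // Explicit reborrow; its lifetime is 'b, not 'a.
        Some(FitnessModifier { wheel: &mut *self.wheel, value })
    }
}

fn demo() -> (f32, Vec<(f32, char)>) {
    let mut wheel = Wheel { proba_sum: 3.0 };
    let mut items = vec![(1.0, 'a'), (2.0, 'b')];
    let mut it = FitnessIterMut { wheel: &mut wheel, iterator: items.iter_mut() };
    // Each modifier is dropped at the end of the loop body,
    // which makes `it` usable again for the next call.
    while let Some(mut modifier) = it.next() {
        modifier.set_fitness(0.5);
    }
    (wheel.proba_sum, items)
}

fn main() {
    println!("{:?}", demo());
}
```

Because next is an inherent method, a for loop will not work with this type; while let is the usual substitute.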
If you absolutely need to conform to the Iterator trait, then I suggest you replace your mutable references with Rc<RefCell<T>> instead. Rc doesn't implement Copy, but it implements Clone (which only clones the pointer, not the underlying data), so you need to call .clone() explicitly to clone an Rc. RefCell does dynamic borrow checking at runtime, which has a bit of runtime overhead, but gives you more freedom in how you pass mutable objects around.
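A minimal sketch of the Rc<RefCell<T>> route, under the same kind of assumptions as before: Wheel is a hypothetical simplification with its population stored inline, and the modifier remembers an index instead of holding a reference to the element.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical simplified container.
struct Wheel {
    proba_sum: f32,
    population: Vec<(f32, char)>,
}

struct FitnessModifier {
    wheel: Rc<RefCell<Wheel>>,
    index: usize,
}

impl FitnessModifier {
    fn set_fitness(&self, new: f32) {
        // Panics at runtime if the wheel is already borrowed elsewhere.
        let mut wheel = self.wheel.borrow_mut();
        let old = wheel.population[self.index].0;
        wheel.proba_sum += new - old;
        wheel.population[self.index].0 = new;
    }
}

struct FitnessIter {
    wheel: Rc<RefCell<Wheel>>,
    index: usize,
}

impl Iterator for FitnessIter {
    type Item = FitnessModifier;

    fn next(&mut self) -> Option<FitnessModifier> {
        if self.index >= self.wheel.borrow().population.len() {
            return None;
        }
        // Clones the pointer only, not the wheel.
        let modifier = FitnessModifier { wheel: Rc::clone(&self.wheel), index: self.index };
        self.index += 1;
        Some(modifier)
    }
}

fn demo() -> f32 {
    let wheel = Rc::new(RefCell::new(Wheel {
        proba_sum: 3.0,
        population: vec![(1.0, 'a'), (2.0, 'b')],
    }));
    let iter = FitnessIter { wheel: Rc::clone(&wheel), index: 0 };
    for modifier in iter {
        modifier.set_fitness(0.5);
    }
    let sum = wheel.borrow().proba_sum;
    sum
}

fn main() {
    println!("{}", demo());
}
```

This works with for loops and lets several modifiers exist at once. The price is that aliasing mistakes show up as runtime panics instead of compile errors.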

Is there an Iterator-like trait which returns references that must fall out of scope before the next access?

This would make it possible to safely iterate over the same element twice, or to hold some state for the global thing being iterated over in the item type.
Something like:
trait IterShort<Iter>
where Self: Borrow<Iter>,
{
type Item;
fn next(self) -> Option<Self::Item>;
}
then an implementation could look like:
impl<'a, MyIter> IterShort<MyIter> for &'a mut MyIter {
type Item = &'a mut MyItem;
fn next(self) -> Option<Self::Item> {
// ...
}
}
I realize I could write my own (I just did), but I'd like one that works with the for-loop notation. Is that possible?
The std::iter::Iterator trait cannot do this, but you can write a different trait:
trait StreamingIterator {
type Item;
fn next<'a>(&'a mut self) -> Option<&'a mut Self::Item>;
}
Note that the return value of next borrows the iterator itself, whereas in Vec::iter for example it only borrows the vector.
The downside is that &mut is hard-coded. Making it generic would require higher-kinded types (so that StreamingIterator::Item could itself be generic over a lifetime parameter).
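For what it's worth, newer compilers (Rust 1.65 and later) have generic associated types, which cover this particular need. The following is only a sketch: LendingIterator, WindowsMut and prefix_sums are illustrative names, not standard library items.

```rust
// The item type is generic over the lifetime of the borrow of the iterator.
trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

// Yields overlapping mutable windows, which std's Iterator cannot express.
struct WindowsMut<'s, T> {
    slice: &'s mut [T],
    start: usize,
    size: usize,
}

impl<'s, T> LendingIterator for WindowsMut<'s, T> {
    type Item<'a> = &'a mut [T]
    where
        Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>> {
        let end = self.start + self.size;
        // `get_mut` returns None once the window runs past the end.
        let window = self.slice.get_mut(self.start..end)?;
        self.start += 1;
        Some(window)
    }
}

// Usage: turn a vector into its running sums, in place.
fn prefix_sums(mut data: Vec<i32>) -> Vec<i32> {
    let mut windows = WindowsMut { slice: &mut data[..], start: 0, size: 2 };
    while let Some(w) = windows.next() {
        w[1] += w[0];
    }
    data
}

fn main() {
    println!("{:?}", prefix_sums(vec![1, 2, 3, 4]));
}
```

Here Item is no longer hard-coded to &mut Self::Item; an implementation can pick any type that borrows from the iterator.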
Alexis Beingessner gave a talk about this and more titled Who Owns This Stream of Data? at RustCamp.
As to for loops, they’re really tied to std::iter::IntoIterator which is tied to std::iter::Iterator. You’d just have to implement both.
The standard iterators can't do this as far as I can see. The very definition of an iterator is that the outside has control over the elements while the inside has control over what produces the elements.
From what I understand of what you are trying to do, I'd flip the concept around and instead of returning elements from an iterator to a surrounding environment, pass the environment to the iterator. That is, you create a struct with a constructor function that accepts a closure and implements the iterator trait. On each call to next, the passed-in closure is called with the next element and the return value of that closure or modifications thereof are returned as the current element. That way, next can handle the lifetime of whatever would otherwise be returned to the surrounding environment.
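A possible shape for that idea, as a sketch only (Visit, visitor and demo are names I made up): the struct owns a mutable slice iterator plus a closure, and next hands each element to the closure instead of handing it to the caller.

```rust
use std::slice::IterMut;

struct Visit<'a, T, F> {
    items: IterMut<'a, T>,
    visitor: F,
}

impl<'a, T, F> Visit<'a, T, F> {
    fn new(items: &'a mut [T], visitor: F) -> Self {
        Visit { items: items.iter_mut(), visitor }
    }
}

impl<'a, T, R, F> Iterator for Visit<'a, T, F>
where
    F: FnMut(&mut T) -> R,
{
    type Item = R;

    fn next(&mut self) -> Option<R> {
        let item = self.items.next()?;
        // The element never escapes; only the closure's result does.
        Some((self.visitor)(item))
    }
}

fn demo() -> (Vec<i32>, Vec<i32>) {
    let mut data = vec![1, 2, 3];
    // Double each element in place and report the old values.
    let olds: Vec<i32> = Visit::new(data.as_mut_slice(), |x: &mut i32| {
        let old = *x;
        *x *= 2;
        old
    })
    .collect();
    (olds, data)
}

fn main() {
    println!("{:?}", demo());
}
```

Since this implements the standard Iterator trait, it works in a for loop. The closure's signature forces the borrow of each element to end before the next one is produced.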

How do I apply an explicit lifetime bound to a returned trait?

Returning an iterator from a function in Rust is an exercise of Sisyphean dimensions, but I am told it's possible to return one as a trait without quite so much pain. Unfortunately, it isn't working: apparently, I need an explicit lifetime bound? Which is apparently not the same thing as adding a lifetime parameter. Which means I have no idea how to do that.
Here's my (tiny, test) code:
fn main() {
let args = get_args();
for arg in args {
println!("{}", arg);
}
}
fn get_args() -> Iterator {
std::env::args().filter_map(|arg| arg.into_string().ok())
}
What is the appropriate way to make this actually work?
Edit: rust version rustc 1.0.0-nightly (00df3251f 2015-02-08 23:24:33 +0000)
You can't return a bare Iterator from a function, because it is a trait, thus not a sized type.
In your situation, you'll need to put the iterator object inside a box, in order to make it into a sized object that can be returned from the function.
To do so, you can change your code like this:
fn get_args() -> Box<Iterator<Item=String> + 'static> {
Box::new(std::env::args().filter_map(|arg| arg.into_string().ok()))
}
Here I've added a lifetime specifier 'static for the trait object, meaning that it is completely self-owned (a function taking no arguments will almost always return something valid for the 'static lifetime in this sense).
You also need the <Item=String> part to make explicit the type of data yielded by your iterator; in this case, Strings.
In this specific case you can manage to return a concrete type from your get_args, like so:
fn get_args() -> FilterMap<Args, fn(OsString) -> Option<String>> {
fn arg_into_string(arg: OsString) -> Option<String> { arg.into_string().ok() }
args().filter_map(arg_into_string as fn(OsString) -> Option<String>)
}
Basically, this applies to all cases where the closure you use in the iterator adapter (in your case filter_map) is not really a closure, in that it does not capture any environment and can be modeled by a plain old function.
In general, if you do need to return a type that does contain a closure, you will indeed need to box it and return a trait object. In your case:
fn get_args() -> Box<Iterator<Item=String> + 'static> {
Box::new(std::env::args().filter_map(|arg| arg.into_string().ok()))
}
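The snippets above target the 2015 nightly mentioned in the question. On current stable Rust the spelling differs a little: trait objects are written with dyn, std::env::args() already yields String, and std::env::args_os() is the one that yields OsString. A hedged modern sketch follows; utf8_only and get_args_boxed are names introduced here, and utf8_only exists mainly so the filtering can be tried without real command-line arguments.

```rust
use std::ffi::OsString;

// Keeps only the arguments that are valid UTF-8.
fn utf8_only(args: impl Iterator<Item = OsString>) -> impl Iterator<Item = String> {
    args.filter_map(|arg| arg.into_string().ok())
}

// `impl Trait` version (Rust 1.26+): no Box, no explicit lifetime bound.
fn get_args() -> impl Iterator<Item = String> {
    utf8_only(std::env::args_os())
}

// Boxed version: `Box<dyn Iterator<..>>` defaults to `+ 'static`.
fn get_args_boxed() -> Box<dyn Iterator<Item = String>> {
    Box::new(utf8_only(std::env::args_os()))
}

fn main() {
    for arg in get_args() {
        println!("{}", arg);
    }
    println!("{} argument(s) via the boxed version", get_args_boxed().count());
}
```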

Working around the limitations of extension traits

The pattern of having an object-safe trait Foo and a (potentially unsafe) extension trait FooExt implemented for all instances of Foo seems to be becoming standard now.
https://github.com/rust-lang/rfcs/pull/445
This is a problem for me in the case of Iterator<A>, as I have a library that overrides the default method IteratorExt#last() of the old iterator trait (the underlying library has an efficient implementation of last()). This is now impossible, because for any A, there will always be a conflicting trait implementation of IteratorExt, the one that libcore already provides for all Iterator<A>.
iterator.rs:301:1: 306:2 error: conflicting implementations for trait `core::iter::IteratorExt` [E0119]
iterator.rs:301 impl<'a, K: Key> iter::IteratorExt<Vec<u8>> for ValueIterator<'a,K,Vec<u8>> {
iterator.rs:302 fn last(&mut self) -> Option<Vec<u8>> {
iterator.rs:303 self.seek_last();
iterator.rs:304 Some(self.value())
iterator.rs:305 }
iterator.rs:306 }
...
Now, as far as I see, I have two options:
have my own trait and my own last() implementation. That would mean it conflicts if IteratorExt is imported, unless used carefully. There is also the danger of accidentally using an inefficient version of last() if the one from IteratorExt is picked up. I'd lose convenient access to IteratorExt.
have my own trait and name the method differently (seek_last()). Disadvantage: I ask the user to learn new vocabulary and to always favor my method over the one provided by IteratorExt. Same problem: I'd like to avoid accidental usage of last().
Is there any other, better, solution I am missing?
As of rustc 0.13.0-nightly (8bca470c5 2014-12-08 00:12:30 +0000) defining last() as an inherent method on your type should work.
#[deriving(Copy)]
struct Foo<T> {t: T}
impl<T> Iterator<T> for Foo<T> {
fn next(&mut self) -> Option<T> { None }
}
// this does not work
// error: conflicting implementations for trait `core::iter::IteratorExt` [E0119]
// impl<T> IteratorExt<T> for Foo<T> {
// fn last(mut self) -> Option<T> { None }
//}
// but this currently does
impl<T> Foo<T> {
fn last(mut self) -> Option<T> { Some(self.t) }
}
fn main() {
let mut t = Foo{ t: 3u };
println!("{}", t.next());
println!("{}", t.last()); // last has been "shadowed" by our impl
println!("{}", t.nth(3)); // other IteratorExt methods are still available
}
Since you're not supposed to use Extension traits as generic bounds (but just to provide additional methods), this should theoretically work for your scenario, as you can have your own type and its impl in the same crate.
Users of your type will use the inherent last method instead of the one on IteratorExt but still be able to use the other methods on IteratorExt.
last should be moved to Iterator, rather than IteratorExt.
IteratorExt is needed when using Box<Iterator> objects, to allow calling generic methods on them (e.g. map), because you can't put a generic method in a vtable. However, last isn't generic, so it can be put in Iterator.
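In current Rust, last is a provided method on Iterator itself, so a type can override it directly in its Iterator impl. A sketch of what that looks like; Countdown, last_via_generic and the next_calls counter are invented for the example, and the counter is only there to show that the override never calls next.

```rust
use std::cell::Cell;

// Counts down from `n` to 1; records how often `next` is called.
struct Countdown<'c> {
    n: u32,
    next_calls: &'c Cell<u32>,
}

impl<'c> Iterator for Countdown<'c> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        self.next_calls.set(self.next_calls.get() + 1);
        if self.n == 0 {
            return None;
        }
        self.n -= 1;
        Some(self.n + 1)
    }

    // Overrides the provided method: the last element is always 1.
    fn last(self) -> Option<u32> {
        if self.n == 0 {
            None
        } else {
            Some(1)
        }
    }
}

// Generic code only sees `Iterator`, yet still gets the override.
fn last_via_generic<I: Iterator>(it: I) -> Option<I::Item> {
    it.last()
}

fn demo() -> (Option<u32>, u32) {
    let calls = Cell::new(0);
    let last = last_via_generic(Countdown { n: 5, next_calls: &calls });
    (last, calls.get())
}

fn main() {
    println!("{:?}", demo());
}
```

Unlike an inherent last method, the override is also used when the type is passed to generic code, which removes the risk of accidentally hitting the slow default.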