Writing an iterator [duplicate] - iterator

This question already has answers here:
How to implement Iterator and IntoIterator for a simple struct?
(2 answers)
Closed 4 years ago.
Editor's note: This code example is from a version of Rust prior to 1.0 and is not valid Rust 1.0 code. Updated versions of this code no longer produce an error due to changes how for loops are implemented.
I'm writing a Vector struct in Rust.
pub struct Vector {
pub x: f32,
pub y: f32,
pub z: f32,
curr: uint
}
And I'd like to write a simple iterator for it, so that I can iterate over the elements of the vector. It's occasionally useful, plus I know next to nothing about iterators in Rust.
Here's what I've got at the moment.
impl Iterator<f32> for Vector {
fn next(&mut self) -> Option<f32> {
let new_next : Option<f32> = match self.curr {
0 => Some(self.x),
1 => Some(self.y),
2 => Some(self.z),
_ => None
};
let new_curr = (self.curr + 1) % 4;
mem::replace(&mut self.curr, new_curr);
new_next
}
}
Now ideally I'd like to be able to use this like:
let u = Vector::new(0.0f32, 0.0f32, 0.0f32);
for element in u {
///
}
However, I get the following compiler error:
error: cannot borrow immutable local variable `u` as mutable
So I'm stumped. After a couple hours of Googling, I couldn't come up with anything. I feel like I'm missing something huge.

Are you sure you really want the Vector itself to be an iterator? Usually structures and iterators into them are separate. Consider something like this:
pub struct Vector {
pub x: f32,
pub y: f32,
pub z: f32,
}
pub struct VectorIter<'a> {
vector: &'a Vector,
cur: usize,
}
impl<'a> Iterator for VectorIter<'a> {
type Item = f32;
fn next(&mut self) -> Option<f32> {
let r = match self.cur {
0 => self.vector.x,
1 => self.vector.y,
2 => self.vector.z,
_ => return None,
};
self.cur += 1;
Some(r)
}
}
impl Vector {
fn iter(&self) -> VectorIter {
VectorIter {
vector: self,
cur: 0,
}
}
}
fn main() {
let v = Vector { x: 1.0, y: 2.0, z: 3.0 };
for c in v.iter() {
println!("{}", c);
}
}
Because Vector is very simple, it can derive Copy, and its iterator can take it by value:
#[derive(Copy, Clone)]
pub struct Vector {
pub x: f32,
pub y: f32,
pub z: f32,
}
pub struct VectorIter {
vector: Vector,
cur: usize,
}
impl Iterator for VectorIter {
type Item = f32;
fn next(&mut self) -> Option<f32> {
let r = match self.cur {
0 => self.vector.x,
1 => self.vector.y,
2 => self.vector.z,
_ => return None,
};
self.cur += 1;
Some(r)
}
}
impl Vector {
fn iter(&self) -> VectorIter {
VectorIter {
vector: *self,
cur: 0,
}
}
}
fn main() {
let v = Vector { x: 1.0, y: 2.0, z: 3.0 };
for c in v.iter() {
println!("{}", c);
}
}
This variant is probably better unless your Vector contains something other than coordinates. This variant is more flexible because it does not tie the iterator with the iterable, but on the other hand, exactly because of the same reason it may be undesirable (with Copy you can change the original value, and the iterator won't reflect it; without Copy and with references you won't be able to change the original value at all). The semantics you would want to use heavily depends on your use cases.

Editor's note: This answer is no longer useful as of Rust 1.0 due to changes how the for loop works.
Vladimir Matveev's answer is correct and explains what you should do. I'd like to add on by explaining your error a bit more.
error: cannot borrow immutable local variable u as mutable
When you are iterating over something, you need to mutate something that tracks where you are. In your code, that's done by:
mem::replace(&mut self.curr, new_curr);
And Rust knows which thing you want to mutate because the method signature for next indicates it:
fn next(&mut self) -> Option<f32>
The problem is that your object is not mutable:
let u = Vector::new(0.0f32, 0.0f32, 0.0f32);
If you change your code to
let mut u = Vector::new(0.0f32, 0.0f32, 0.0f32);
I'd expect your code to work as-is. All that being said, a separate iterator is probably the right way to go. However, you'll know more than we do about your code!

Related

I need help refactoring for error handling in Rust

I would like to refactor this Rust code for calculating the largest series product and make it as efficient and elegant as possible. I feel that
lsp(string_digits: &str, span: usize) -> Result<u64, Error>
could be done in a way to make it much more elegant than it is right now. Could lsp be implemented with only one series of chained iterator methods?
#[derive(Debug, PartialEq)]
pub enum Error {
SpanTooLong,
InvalidDigit(char),
}
fn sp(w: &[u8]) -> u64 {
w.iter().fold(1u64, |acc, &d| acc * u64::from(d))
}
pub fn lsp(string_digits: &str, span: usize) -> Result<u64, Error> {
let invalid_chars = string_digits
.chars()
.filter(|ch| !ch.is_numeric())
.collect::<Vec<_>>();
if span > string_digits.len() {
return Err(Error::SpanTooLong);
} else if !invalid_chars.is_empty() {
return Err(Error::InvalidDigit(invalid_chars[0]));
} else if span == 0 || string_digits.is_empty() {
return Ok(1);
}
let vec_of_u8_digits = string_digits
.chars()
.map(|ch| ch.to_digit(10).unwrap() as u8)
.collect::<Vec<_>>();
let lsp = vec_of_u8_digits
.windows(span)
.max_by(|&w1, &w2| sp(w1).cmp(&sp(w2)))
.unwrap();
Ok(sp(lsp))
}
Not sure if this is the most elegant way, but I've given it a try, hope the new version is equivalent to the given program.
Two things will be needed in this case: First, we need a data structure that provides the sliding window "on the fly" and second a function that ends the iteration early if the conversion yields an error.
For the former I've chosen a VecDeque since span is dynamic. For the latter there is a function called process_results in the itertools crate. It converts an iterator over results to an iterator over the unwrapped type and stops iteration if an error is encountered.
I've also slightly changed the signature of sp to accept any iterator over u8.
This is the code:
use std::collections::VecDeque;
use itertools::process_results;
#[derive(Debug, PartialEq)]
pub enum Error {
SpanTooLong,
InvalidDigit(char),
}
fn sp(w: impl Iterator<Item=u8>) -> u64 {
w.fold(1u64, |acc, d| acc * u64::from(d))
}
pub fn lsp(string_digits: &str, span: usize) -> Result<u64, Error> {
if span > string_digits.len() {
return Err(Error::SpanTooLong);
} else if span == 0 || string_digits.is_empty() {
return Ok(1);
}
let mut init_state = VecDeque::new();
init_state.resize(span, 0);
process_results(string_digits.chars()
.map(|ch| ch.to_digit(10)
.map(|d| d as u8)
.ok_or(Error::InvalidDigit(ch))),
|digits|
digits.scan(init_state, |state, digit| {
state.pop_back();
state.push_front(digit);
Some(sp(state.iter().cloned()))
})
.max()
.unwrap()
)
}

How can I automatically implement FromIterator?

I have written a trait that specifies some methods similar to those of Vec:
pub trait Buffer {
type Item;
fn with_capacity(c: usize) -> Self;
fn push(&mut self, item: Self::Item);
}
I would like to implement FromIterator for all types that implement Buffer, as follows:
impl<T> iter::FromIterator<T::Item> for T
where T: Buffer
{
fn from_iter<I>(iter: I) -> Self
where I: IntoIterator<Item = T>
{
let mut iter = iter.into_iter();
let (lower, _) = iter.size_hint();
let ans = Self::with_capacity(lower);
while let Some(x) = iter.next() {
ans.push(x);
}
ans
}
}
The compiler won't let me:
error[E0210]: type parameter `T` must be used as the type parameter
for some local type (e.g. `MyStruct<T>`); only traits defined in the
current crate can be implemented for a type parameter
I think I understand the error message; it is preventing me from writing code that is incompatible with possible future changes to the standard library.
The only way around this error appears to be to implement FromIterator separately for every type for which I implement Buffer. This will involve copying out exactly the same code many times. Is there a a way to share the same implementation between all Buffer types?
You can't implement a trait from another crate for an arbitrary type, only for a type from your crate. However, you can move the implementation to a function and reduce amount of duplicated code:
fn buffer_from_iter<I, B>(iter: I) -> B
where I: IntoIterator<Item = B::Item>,
B: Buffer
{
let mut iter = iter.into_iter();
let (lower, _) = iter.size_hint();
let mut ans = B::with_capacity(lower);
while let Some(x) = iter.next() {
ans.push(x);
}
ans
}
struct S1;
impl Buffer for S1 {
type Item = i32;
fn with_capacity(c: usize) -> Self { unimplemented!() }
fn push(&mut self, item: Self::Item) { unimplemented!() }
}
impl std::iter::FromIterator<<S1 as Buffer>::Item> for S1 {
fn from_iter<I>(iter: I) -> Self
where I: IntoIterator<Item = <S1 as Buffer>::Item>
{
buffer_from_iter(iter)
}
}
This implementation of FromIterator can be wrapped into a macro to further reduce code duplication.

Wrong number of lifetime parameters when using a modified `Chars` iterator

I want to implement the IntoIterator trait for a struct containing a String. The iterator is based on the chars() iterator, is supposed to count the '1' chars and accumulate the result. This is a simplified version of what I got so far:
use std::iter::Map;
use std::str::Chars;
fn main() {
let str_struct = StringStruct { system_string: String::from("1101") };
for a in str_struct {
println!("{}", a);
}
}
struct StringStruct {
system_string: String
}
impl IntoIterator for StringStruct {
type Item = u32;
type IntoIter = Map<Chars, Fn(char) -> u32>;
fn into_iter(self) -> Self::IntoIter {
let count = 0;
return self.system_string.chars().map(|c| match c {
Some('1') => {
count += 1;
return Some(count);
},
Some(chr) => return Some(count),
None => return None
});
}
}
Expected output: 1, 2, 2, 3
This fails with:
error[E0107]: wrong number of lifetime parameters: expected 1, found 0
--> src/main.rs:17:25
|
17 | type IntoIter = Map<Chars, Fn(char) -> u32>;
| ^^^^^ expected 1 lifetime parameter
The chars iterator should have the same lifetime as the StringStruct::system_string, but I have no idea how to express this or if this approach is viable at all.
To answer the question you asked, I'd recommend to impl IntoIterator for &StringStruct (a reference to a StringStruct instead of the struct directly). The code would look like this:
impl<'a> IntoIterator for &'a StringStruct {
type Item = u32;
type IntoIter = Map<Chars<'a>, Fn(char) -> u32>;
// ...
}
However, you will notice many more errors that have a different origin afterwards. The next error that pops up is that Fn(char) -> u32 does not have a constant size at compile time.
The problem is that you try to name the type of your closure by writing Fn(char) -> u32. But this is not the type of your closure, but merely a trait which is implemented by the closure. The type of a closure can't be named (sometimes called "Voldemort type").
This means that, right now, you can't specify the type of a Map<_, _> object. This is a known issue; the recently accepted impl Trait-RFC might offer a workaround for cases like this. But right now, it's not possible, sorry.
So how to solve it then? You need to create your own type that implements Iterator and use it instead of Map<_, _>. Note that you can still use the Chars iterator. Here is the full solution:
struct StringStructIter<'a> {
chars: Chars<'a>,
count: u32,
}
impl<'a> Iterator for StringStructIter<'a> {
type Item = u32;
fn next(&mut self) -> Option<Self::Item> {
self.chars.next().map(|c| {
if c == '1' {
self.count += 1;
}
self.count
})
}
}
impl<'a> IntoIterator for &'a StringStruct {
type Item = u32;
type IntoIter = StringStructIter<'a>;
fn into_iter(self) -> Self::IntoIter {
StringStructIter {
chars: self.system_string.chars(),
count: 0,
}
}
}
fn main() {
let str_struct = StringStruct { system_string: String::from("1101") };
for a in &str_struct {
println!("{}", a);
}
}
And just a small note: an explicit return when not necessary is considered bad style in Rust. Better stick to rule and write idiomatic code by removing return whenever possible ;-)

Rust iterators and looking forward (peek/multipeek)

I am trying to use a pattern with iterators in Rust and falling down somewhere, apparently simple.
I would like to iterate through a container and find an element with a predicate [A] (simple), but then look forward using another predicate and get that value [B] and use [B] to mutate [A] in some way. In this case [A] is mutable and [B] can be immutable; this makes no difference to me, only to the borrow checker (rightly).
It would help to understand this with a simple scenario, so I have added a small snippet to let folk see the issue/attempted goal. I have played with itertools and breaking into for/while loops, although I want to remain as idiomatic as possible.
Silly Example scenario
Lookup an even number, find next number that is divisible by 3 and add to the initial number.
#[allow(unused)]
fn is_div_3(num: &u8) -> bool {
num % 3 == 0
}
fn main() {
let mut data: Vec<u8> = (0..100).collect();
let count = data.iter_mut()
.map(|x| {
if *x % 2 == 0 {
// loop through numbers forward to next is_div_3,
// then x = x + that number
}
true
})
.count();
println!("data {:?}, count was {} ", data, count);
}
playground
Sadly I'm a bit late, but here goes.
It's not totally pretty, but it's not as bad as the other suggestion:
let mut data: Vec<u8> = (1..100).collect();
{
let mut mut_items = data.iter_mut();
while let Some(x) = mut_items.next() {
if *x % 2 == 0 {
let slice = mut_items.into_slice();
*x += *slice.iter().find(|&x| x % 3 == 0).unwrap();
mut_items = slice.iter_mut();
}
}
}
println!("{:?}", data);
gives
[1, 5, 3, 10, 5, 15, 7, 17, 9, 22, ...]
as with Matthieu M.'s solution.
The key is to use mut_items.into_slice() to "reborrow" the iterator, effectively producing a local (and thus safe) clone of the iterator.
Warning: The iterator presented right below is unsafe because it allows one to obtain multiple aliases to a single mutable element; skip to the second part for the corrected version. (It would be alright if the return type contained immutable references).
If you are willing to write your own window iterator, then it becomes quite easy.
First, the iterator in all its gory details:
use std::marker::PhantomData;
struct WindowIterMut<'a, T>
where T: 'a
{
begin: *mut T,
len: usize,
index: usize,
_marker: PhantomData<&'a mut [T]>,
}
impl<'a, T> WindowIterMut<'a, T>
where T: 'a
{
pub fn new(slice: &'a mut [T]) -> WindowIterMut<'a, T> {
WindowIterMut {
begin: slice.as_mut_ptr(),
len: slice.len(),
index: 0,
_marker: PhantomData,
}
}
}
impl<'a, T> Iterator for WindowIterMut<'a, T>
where T: 'a
{
type Item = (&'a mut [T], &'a mut [T]);
fn next(&mut self) -> Option<Self::Item> {
if self.index > self.len { return None; }
let slice: &'a mut [T] = unsafe {
std::slice::from_raw_parts_mut(self.begin, self.len)
};
let result = slice.split_at_mut(self.index);
self.index += 1;
Some(result)
}
}
Invoked on [1, 2, 3] it will return (&[], &[1, 2, 3]) then (&[1], &[2, 3]), ... until (&[1, 2, 3], &[]). In short, it iterates over all the potential partitions of the slice (without shuffling).
Which is safe to use as:
fn main() {
let mut data: Vec<u8> = (1..100).collect();
for (head, tail) in WindowIterMut::new(&mut data) {
if let Some(element) = head.last_mut() {
if *element % 2 == 0 {
if let Some(n3) = tail.iter().filter(|i| *i % 3 == 0).next() {
*element += *n3;
}
}
}
}
println!("{:?}", data);
}
Unfortunately it can also be used as:
fn main() {
let mut data: Vec<u8> = (1..100).collect();
let mut it = WindowIterMut::new(&mut data);
let first_0 = { it.next(); &mut it.next().unwrap().0[0] };
let second_0 = &mut it.next().unwrap().0[0];
println!("{:?} {:?}", first_0 as *const _, second_0 as *const _);
}
which when run print: 0x7f73a8435000 0x7f73a8435000, show-casing that both mutable references alias the same element.
Since we cannot get rid of aliasing, we need to get rid of mutability; or at least defer to interior mutability (Cell here since u8 is Copy).
Fortunately, Cell has no runtime cost, but it does cost a bit in ergonomics (all those .get() and .set()).
I take the opportunity to make the iterator slightly more generic too, and rename it since Window is already a used name for a different concept.
struct FingerIter<'a, T>
where T: 'a
{
begin: *const T,
len: usize,
index: usize,
_marker: PhantomData<&'a [T]>,
}
impl<'a, T> FingerIter<'a, T>
where T: 'a
{
pub fn new(slice: &'a [T]) -> FingerIter<'a, T> {
FingerIter {
begin: slice.as_ptr(),
len: slice.len(),
index: 0,
_marker: PhantomData,
}
}
}
impl<'a, T> Iterator for FingerIter<'a, T>
where T: 'a
{
type Item = (&'a [T], &'a T, &'a [T]);
fn next(&mut self) -> Option<Self::Item> {
if self.index >= self.len { return None; }
let slice: &'a [T] = unsafe {
std::slice::from_raw_parts(self.begin, self.len)
};
self.index += 1;
let result = slice.split_at(self.index);
Some((&result.0[0..self.index-1], result.0.last().unwrap(), result.1))
}
}
We use it as a building brick:
fn main() {
let data: Vec<Cell<u8>> = (1..100).map(|i| Cell::new(i)).collect();
for (_, element, tail) in FingerIter::new(&data) {
if element.get() % 2 == 0 {
if let Some(n3) = tail.iter().filter(|i| i.get() % 3 == 0).next() {
element.set(element.get() + n3.get());
}
}
}
let data: Vec<u8> = data.iter().map(|cell| cell.get()).collect();
println!("{:?}", data);
}
On the playpen this prints: [1, 5, 3, 10, 5, 15, 7, 17, 9, 22, ...], which seems correct.

Difference between iter() and into_iter() on a shared, borrowed Vec?

I am reading the Rust 101 tutorial, where the author talks about shared borrowing with the example of a Vec object passed to a function. Below is a slightly adapted MWE of what the the tutorial is teaching. The interesting part is v.iter() in vec_min. The author writes:
This time, we explicitly request an iterator for the vector v. The method iter borrows the vector it works on, and provides shared borrows of the elements.
But what happens if I use a for ... in ... construction on an object which is shared? According to this blog post, this implicit for loop uses into_iter(), taking ownership of v. But it cannot really take ownership of the v in that function, since it has only borrowed it to begin with, right?
Can somebody explain the difference between into_iter() and iter() applied to a borrowed object to me?
enum NumberOrNothing {
Number(i32),
Nothing,
}
use self::NumberOrNothing::{Number,Nothing};
impl NumberOrNothing {
fn print(self) {
match self {
Nothing => println!("The number is: <nothing>"),
Number(n) => println!("The number is: {}", n),
};
}
}
fn vec_min(v: &Vec<i32>) -> NumberOrNothing {
fn min_i32(a: i32, b: i32) -> i32 {
if a < b {a} else {b}
}
let mut min = Nothing;
for e in v.iter() {
//Alternatively implicitly and with *e replaced by e:
//for e in v {
min = Number(match min {
Nothing => *e,
Number(n) => min_i32(n, *e),
});
}
min
}
pub fn main() {
let vec = vec![18,5,7,2,9,27];
let foo = Nothing;
let min = vec_min(&vec);
let min = vec_min(&vec);
min.print();
}
There isn't a difference.
it cannot really take ownership of the v in that function, since it has only borrowed it to begin with
It absolutely can take ownership of v, because that's a &Vec. Note the precise semantics here - you are taking ownership of the reference, not of the referred-to item.
If you check out the implementors of IntoIterator, you can find:
impl<'a, T> IntoIterator for &'a Vec<T>
And the source for that:
impl<'a, T, A: Allocator> IntoIterator for &'a Vec<T, A> {
type Item = &'a T;
type IntoIter = slice::Iter<'a, T>;
fn into_iter(self) -> slice::Iter<'a, T> {
self.iter()
}
}
Surprise — it calls iter!
The same logic applies for &mut vec and vec.iter_mut()