How can a function conditionally fall back to a trait if another trait is implemented or not? - iterator

I am building up a library for generating the minimum perfect hash from a set of keys. The idea is to index the keys online without storing the full dataset in memory. Based on a user requirement, it is possible that skip_next() is not available and I want to fall back to using next(). Although it might be slower based on the speed of the iterator, it simplifies things for a general user.
My idea is to selectively iterate over all the elements generated by an iterator. This code works fine, but it requires a user to implement the trait FastIteration:
#[derive(Debug)]
struct Pixel {
r: Vec<i8>,
g: Vec<i8>,
b: Vec<i8>,
}
#[derive(Debug)]
struct Node {
r: i8,
g: i8,
b: i8,
}
struct PixelIterator<'a> {
pixel: &'a Pixel,
index: usize,
}
impl<'a> IntoIterator for &'a Pixel {
type Item = Node;
type IntoIter = PixelIterator<'a>;
fn into_iter(self) -> Self::IntoIter {
println!("Into &");
PixelIterator {
pixel: self,
index: 0,
}
}
}
impl<'a> Iterator for PixelIterator<'a> {
type Item = Node;
fn next(&mut self) -> Option<Node> {
println!("next &");
let result = match self.index {
0 | 1 | 2 | 3 => Node {
r: self.pixel.r[self.index],
g: self.pixel.g[self.index],
b: self.pixel.b[self.index],
},
_ => return None,
};
self.index += 1;
Some(result)
}
}
trait FastIteration {
fn skip_next(&mut self);
}
impl<'a> FastIteration for PixelIterator<'a> {
fn skip_next(&mut self) {
self.index += 1;
}
}
fn main() {
let p1 = Pixel {
r: vec![11, 21, 31, 41],
g: vec![12, 22, 32, 42],
b: vec![13, 23, 33, 43],
};
let mut index = 0;
let mut it = p1.into_iter();
loop {
if index == p1.r.len() {
break;
}
if index == 1 {
it.skip_next()
} else {
let val = it.next();
println!("{:?}", val);
}
index += 1;
}
}
How can one make the above program fall back to using the normal next() instead of skip_next() based on if the trait FastIteration is implemented or not?
fn fast_iterate<I>(objects: I)
where I: IntoIter + FastIteration { // should use skip_next() };
fn slow_iterate<I>(objects: I)
where I: IntoIter { // should NOT use skip_next(), use next() };
As above, one can always write two separate impl but is it possible to do this in one?
This question builds on:
Conditionally implement a Rust trait only if a type constraint is satisfied
Implement rayon `as_parallel_slice` using iterators.

You are looking for the unstable feature specialization:
#![feature(specialization)]
#[derive(Debug)]
struct Example(u8);
impl Iterator for Example {
type Item = u8;
fn next(&mut self) -> Option<u8> {
let v = self.0;
if v > 10 {
None
} else {
self.0 += 1;
Some(v)
}
}
}
trait FastIterator: Iterator {
fn skip_next(&mut self);
}
impl<I: Iterator> FastIterator for I {
default fn skip_next(&mut self) {
println!("step");
self.next();
}
}
impl FastIterator for Example {
fn skip_next(&mut self) {
println!("skip");
self.0 += 1;
}
}
fn main() {
let mut ex = Example(0);
ex.skip_next();
let mut r = 0..10;
r.skip_next();
}

Related

How can I return an iterator over a locked struct member in Rust?

Here is as far as I could get, using rental, partly based on How can I store a Chars iterator in the same struct as the String it is iterating on?. The difference here is that the get_iter method of the locked member has to take a mutable self reference.
I'm not tied to using rental: I'd be just as happy with a solution using reffers or owning_ref.
The PhantomData is present here just so that MyIter bears the normal lifetime relationship to MyIterable, the thing being iterated over.
I also tried changing #[rental] to #[rental(deref_mut_suffix)] and changing the return type of MyIterable.get_iter to Box<Iterator<Item=i32> + 'a> but that gave me other lifetime errors originating in the macro that I was unable to decipher.
#[macro_use]
extern crate rental;
use std::marker::PhantomData;
pub struct MyIterable {}
impl MyIterable {
// In the real use-case I can't remove the 'mut'.
pub fn get_iter<'a>(&'a mut self) -> MyIter<'a> {
MyIter {
marker: PhantomData,
}
}
}
pub struct MyIter<'a> {
marker: PhantomData<&'a MyIterable>,
}
impl<'a> Iterator for MyIter<'a> {
type Item = i32;
fn next(&mut self) -> Option<i32> {
Some(42)
}
}
use std::sync::Mutex;
rental! {
mod locking_iter {
pub use super::{MyIterable, MyIter};
use std::sync::MutexGuard;
#[rental]
pub struct LockingIter<'a> {
guard: MutexGuard<'a, MyIterable>,
iter: MyIter<'guard>,
}
}
}
use locking_iter::LockingIter;
impl<'a> Iterator for LockingIter<'a> {
type Item = i32;
#[inline]
fn next(&mut self) -> Option<Self::Item> {
self.rent_mut(|iter| iter.next())
}
}
struct Access {
shared: Mutex<MyIterable>,
}
impl Access {
pub fn get_iter<'a>(&'a self) -> Box<Iterator<Item = i32> + 'a> {
Box::new(LockingIter::new(self.shared.lock().unwrap(), |mi| {
mi.get_iter()
}))
}
}
fn main() {
let access = Access {
shared: Mutex::new(MyIterable {}),
};
let iter = access.get_iter();
let contents: Vec<i32> = iter.take(2).collect();
println!("contents: {:?}", contents);
}
As user rodrigo has pointed out in a comment, the solution is simply to change #[rental] to #[rental_mut].

Implement IntoIterator for binary tree

I am trying to build a binary tree and write an iterator to traverse values in the tree.
When implementing the IntoIterator trait for my tree nodes I ran into a problem with lifetimes
src\main.rs:43:6: 43:8 error: the lifetime parameter `'a` is not constrained by the impl trait, self type, or predicates [E0207]
src\main.rs:43 impl<'a, T: 'a> IntoIterator for Node<T> {
I understand that I need to specify that NodeIterator will live as long as Node but I am unsure of how to express that
use std::cmp::PartialOrd;
use std::boxed::Box;
struct Node<T: PartialOrd> {
value: T,
left: Option<Box<Node<T>>>,
right: Option<Box<Node<T>>>,
}
struct NodeIterator<'a, T: 'a + PartialOrd> {
current: &'a Node<T>,
parent: Option<&'a Node<T>>,
}
impl<T: PartialOrd> Node<T> {
pub fn insert(&mut self, value: T) {
...
}
}
impl<'a, T: 'a> IntoIterator for Node<T> { // line 43
type Item = T;
type IntoIter = NodeIterator<'a, T>;
fn into_iter(&self) -> Self::IntoIter {
NodeIterator::<'a> {
current: Some(&self),
parent: None
}
}
}
The particular error that you are getting is that 'a should appear on the right of for. Otherwise, how could the compiler know what a is?
When implementing IntoIterator you have to decide whether the iterator will consume the container, or whether it'll just produce references into it. At the moment, your setup is inconsistent, and the error message points it out.
In the case of a binary tree, you also have to think about which order you want to produce the values in: traditional orders are depth first (yielding a sorted sequence) and breadth first (exposing the "layers" of the tree). I'll assume depth first as it's the most common one.
Let's tackle the case of a consuming iterator first. It's simpler in the sense that we don't have to worry about lifetimes.
#![feature(box_patterns)]
struct Node<T: PartialOrd> {
value: T,
left: Option<Box<Node<T>>>,
right: Option<Box<Node<T>>>,
}
struct NodeIterator<T: PartialOrd> {
stack: Vec<Node<T>>,
next: Option<T>,
}
impl<T: PartialOrd> IntoIterator for Node<T> {
type Item = T;
type IntoIter = NodeIterator<T>;
fn into_iter(self) -> Self::IntoIter {
let mut stack = Vec::new();
let smallest = pop_smallest(self, &mut stack);
NodeIterator { stack: stack, next: Some(smallest) }
}
}
impl<T: PartialOrd> Iterator for NodeIterator<T> {
type Item = T;
fn next(&mut self) -> Option<T> {
if let Some(next) = self.next.take() {
return Some(next);
}
if let Some(Node { value, right, .. }) = self.stack.pop() {
if let Some(right) = right {
let box right = right;
self.stack.push(right);
}
return Some(value);
}
None
}
}
fn pop_smallest<T: PartialOrd>(node: Node<T>, stack: &mut Vec<Node<T>>) -> T {
let Node { value, left, right } = node;
if let Some(left) = left {
stack.push(Node { value: value, left: None, right: right });
let box left = left;
return pop_smallest(left, stack);
}
if let Some(right) = right {
let box right = right;
stack.push(right);
}
value
}
fn main() {
let root = Node {
value: 3,
left: Some(Box::new(Node { value: 2, left: None, right: None })),
right: Some(Box::new(Node { value: 4, left: None, right: None }))
};
for t in root {
println!("{}", t);
}
}
Now, we can "easily" adapt it to the non-consuming case by sprinkling in the appropriate references:
struct RefNodeIterator<'a, T: PartialOrd + 'a> {
stack: Vec<&'a Node<T>>,
next: Option<&'a T>,
}
impl<'a, T: PartialOrd + 'a> IntoIterator for &'a Node<T> {
type Item = &'a T;
type IntoIter = RefNodeIterator<'a, T>;
fn into_iter(self) -> Self::IntoIter {
let mut stack = Vec::new();
let smallest = pop_smallest_ref(self, &mut stack);
RefNodeIterator { stack: stack, next: Some(smallest) }
}
}
impl<'a, T: PartialOrd + 'a> Iterator for RefNodeIterator<'a, T> {
type Item = &'a T;
fn next(&mut self) -> Option<&'a T> {
if let Some(next) = self.next.take() {
return Some(next);
}
if let Some(node) = self.stack.pop() {
if let Some(ref right) = node.right {
self.stack.push(right);
}
return Some(&node.value);
}
None
}
}
fn pop_smallest_ref<'a, T>(node: &'a Node<T>, stack: &mut Vec<&'a Node<T>>) -> &'a T
where
T: PartialOrd + 'a
{
if let Some(ref left) = node.left {
stack.push(node);
return pop_smallest_ref(left, stack);
}
if let Some(ref right) = node.right {
stack.push(right);
}
&node.value
}
There's a lot to unpack in there; so take your time to digest it. Specifically:
the use of ref in Some(ref right) = node.right is because I don't want to consume node.right, only to obtain a reference inside the Option; the compiler will complain that I cannot move out of a borrowed object without it (so I just follow the complaints),
in stack.push(right), right: &'a Box<Node<T>> and yet stack: Vec<&'a Node<T>>; this is the magic of Deref: Box<T> implements Deref<T> so the compiler automatically transforms the reference as appropriate.
Note: I didn't write this code as-is; instead I just put the first few references where I expect them to be (such as the return type of Iterator) and then let the compiler guide me.

Infinite loop when implementing custom iterator in Rust

I am trying to implement a list zipper. So far I have:
#[derive(RustcDecodable, RustcEncodable, Debug, Clone)]
pub struct ListZipper {
pub focus: Option<Tile>,
pub left: VecDeque<Tile>,
pub right: VecDeque<Tile>,
}
impl PartialEq for ListZipper {
fn eq(&self, other: &ListZipper) -> bool {
self.left == other.left && self.focus == other.focus && self.right == other.right
}
}
I am now trying to implement an iterator
impl Iterator for ListZipper {
type Item = Tile;
fn next(&mut self) -> Option<Tile> {
self.left.iter().chain(self.focus.iter()).chain(self.right.iter()).next().map(|w| *w)
}
}
In my head this makes sense. When iterating over ListZipper, I want to iterate over left, then focus and then right. So I chain those iterators and just return next().
This works fine if all fields in ListZipper are empty. As soon as one is not empty iterating over ListZipper results in an infinite loop.
The problem is not the chain. If I replace that by e.g. self.left.iter(), and left is not empty, the problem is the same. Likewise for focus and right.
I tried printing all elements in the iterator and it appears to go through the VecDeque from front to back, and then gets stuck. I.e. next() does not advance the cursor when it reaches the back.
Why?
I realize I may not want ListZipper itself to be an iterator, but that is another discussion.
As mentioned in the comments, your iterator is lacking a crucial piece of state: how far along in the iteration it is. Every time you call next, it constructs another iterator completely from scratch and gets the first element.
Here's a reduced example:
struct ListZipper {
focus: Option<u8>,
}
impl Iterator for ListZipper {
type Item = u8;
fn next(&mut self) -> Option<Self::Item> {
self.focus.iter().next().cloned()
}
}
fn main() {
let lz = ListZipper { focus: Some(42) };
let head: Vec<_> = lz.take(5).collect();
println!("{:?}", head); // [42, 42, 42, 42, 42]
}
I realize I may not want ListZipper itself to be an iterator, but that is another discussion.
No, it's really not ^_^. You need to somehow mutate the thing being iterated on so that it can change and have different values for each subsequent call.
If you want to return a combination of existing iterators and iterator adapters, refer to Correct way to return an Iterator? for instructions.
Otherwise, you need to somehow change ListZipper during the call to next:
impl Iterator for ListZipper {
type Item = Tile;
fn next(&mut self) -> Option<Self::Item> {
if let Some(v) = self.left.pop_front() {
return Some(v);
}
if let Some(v) = self.focus.take() {
return Some(v);
}
if let Some(v) = self.right.pop_front() {
return Some(v);
}
None
}
}
More succinctly:
impl Iterator for ListZipper {
type Item = Tile;
fn next(&mut self) -> Option<Self::Item> {
self.left.pop_front()
.or_else(|| self.focus.take())
.or_else(|| self.right.pop_front())
}
}
Note that your PartialEq implementation seems to be the same as the automatically-derived one...
use std::collections::VecDeque;
type Tile = u8;
#[derive(Debug, Clone, PartialEq)]
pub struct ListZipper {
pub focus: Option<Tile>,
pub left: VecDeque<Tile>,
pub right: VecDeque<Tile>,
}
impl Iterator for ListZipper {
type Item = Tile;
fn next(&mut self) -> Option<Self::Item> {
self.left.pop_front()
.or_else(|| self.focus.take())
.or_else(|| self.right.pop_front())
}
}
fn main() {
let lz = ListZipper {
focus: Some(42),
left: vec![1, 2, 3].into(),
right: vec![97, 98, 99].into(),
};
let head: Vec<_> = lz.take(5).collect();
println!("{:?}", head);
}

Rust iterators and looking forward (peek/multipeek)

I am trying to use a pattern with iterators in Rust and falling down somewhere, apparently simple.
I would like to iterate through a container and find an element with a predicate [A] (simple), but then look forward using another predicate and get that value [B] and use [B] to mutate [A] in some way. In this case [A] is mutable and [B] can be immutable; this makes no difference to me, only to the borrow checker (rightly).
It would help to understand this with a simple scenario, so I have added a small snippet to let folk see the issue/attempted goal. I have played with itertools and breaking into for/while loops, although I want to remain as idiomatic as possible.
Silly Example scenario
Lookup an even number, find next number that is divisible by 3 and add to the initial number.
#[allow(unused)]
fn is_div_3(num: &u8) -> bool {
num % 3 == 0
}
fn main() {
let mut data: Vec<u8> = (0..100).collect();
let count = data.iter_mut()
.map(|x| {
if *x % 2 == 0 {
// loop through numbers forward to next is_div_3,
// then x = x + that number
}
true
})
.count();
println!("data {:?}, count was {} ", data, count);
}
playground
Sadly I'm a bit late, but here goes.
It's not totally pretty, but it's not as bad as the other suggestion:
let mut data: Vec<u8> = (1..100).collect();
{
let mut mut_items = data.iter_mut();
while let Some(x) = mut_items.next() {
if *x % 2 == 0 {
let slice = mut_items.into_slice();
*x += *slice.iter().find(|&x| x % 3 == 0).unwrap();
mut_items = slice.iter_mut();
}
}
}
println!("{:?}", data);
gives
[1, 5, 3, 10, 5, 15, 7, 17, 9, 22, ...]
as with Matthieu M.'s solution.
The key is to use mut_items.into_slice() to "reborrow" the iterator, effectively producing a local (and thus safe) clone of the iterator.
Warning: The iterator presented right below is unsafe because it allows one to obtain multiple aliases to a single mutable element; skip to the second part for the corrected version. (It would be alright if the return type contained immutable references).
If you are willing to write your own window iterator, then it becomes quite easy.
First, the iterator in all its gory details:
use std::marker::PhantomData;
struct WindowIterMut<'a, T>
where T: 'a
{
begin: *mut T,
len: usize,
index: usize,
_marker: PhantomData<&'a mut [T]>,
}
impl<'a, T> WindowIterMut<'a, T>
where T: 'a
{
pub fn new(slice: &'a mut [T]) -> WindowIterMut<'a, T> {
WindowIterMut {
begin: slice.as_mut_ptr(),
len: slice.len(),
index: 0,
_marker: PhantomData,
}
}
}
impl<'a, T> Iterator for WindowIterMut<'a, T>
where T: 'a
{
type Item = (&'a mut [T], &'a mut [T]);
fn next(&mut self) -> Option<Self::Item> {
if self.index > self.len { return None; }
let slice: &'a mut [T] = unsafe {
std::slice::from_raw_parts_mut(self.begin, self.len)
};
let result = slice.split_at_mut(self.index);
self.index += 1;
Some(result)
}
}
Invoked on [1, 2, 3] it will return (&[], &[1, 2, 3]) then (&[1], &[2, 3]), ... until (&[1, 2, 3], &[]). In short, it iterates over all the potential partitions of the slice (without shuffling).
Which is safe to use as:
fn main() {
let mut data: Vec<u8> = (1..100).collect();
for (head, tail) in WindowIterMut::new(&mut data) {
if let Some(element) = head.last_mut() {
if *element % 2 == 0 {
if let Some(n3) = tail.iter().filter(|i| *i % 3 == 0).next() {
*element += *n3;
}
}
}
}
println!("{:?}", data);
}
Unfortunately it can also be used as:
fn main() {
let mut data: Vec<u8> = (1..100).collect();
let mut it = WindowIterMut::new(&mut data);
let first_0 = { it.next(); &mut it.next().unwrap().0[0] };
let second_0 = &mut it.next().unwrap().0[0];
println!("{:?} {:?}", first_0 as *const _, second_0 as *const _);
}
which when run print: 0x7f73a8435000 0x7f73a8435000, show-casing that both mutable references alias the same element.
Since we cannot get rid of aliasing, we need to get rid of mutability; or at least defer to interior mutability (Cell here since u8 is Copy).
Fortunately, Cell has no runtime cost, but it does cost a bit in ergonomics (all those .get() and .set()).
I take the opportunity to make the iterator slightly more generic too, and rename it since Window is already a used name for a different concept.
struct FingerIter<'a, T>
where T: 'a
{
begin: *const T,
len: usize,
index: usize,
_marker: PhantomData<&'a [T]>,
}
impl<'a, T> FingerIter<'a, T>
where T: 'a
{
pub fn new(slice: &'a [T]) -> FingerIter<'a, T> {
FingerIter {
begin: slice.as_ptr(),
len: slice.len(),
index: 0,
_marker: PhantomData,
}
}
}
impl<'a, T> Iterator for FingerIter<'a, T>
where T: 'a
{
type Item = (&'a [T], &'a T, &'a [T]);
fn next(&mut self) -> Option<Self::Item> {
if self.index >= self.len { return None; }
let slice: &'a [T] = unsafe {
std::slice::from_raw_parts(self.begin, self.len)
};
self.index += 1;
let result = slice.split_at(self.index);
Some((&result.0[0..self.index-1], result.0.last().unwrap(), result.1))
}
}
We use it as a building brick:
fn main() {
let data: Vec<Cell<u8>> = (1..100).map(|i| Cell::new(i)).collect();
for (_, element, tail) in FingerIter::new(&data) {
if element.get() % 2 == 0 {
if let Some(n3) = tail.iter().filter(|i| i.get() % 3 == 0).next() {
element.set(element.get() + n3.get());
}
}
}
let data: Vec<u8> = data.iter().map(|cell| cell.get()).collect();
println!("{:?}", data);
}
On the playpen this prints: [1, 5, 3, 10, 5, 15, 7, 17, 9, 22, ...], which seems correct.

How can I add new methods to Iterator?

I want to define a .unique() method on iterators that enables me to iterate without duplicates.
use std::collections::HashSet;
struct UniqueState<'a> {
seen: HashSet<String>,
underlying: &'a mut Iterator<Item = String>,
}
trait Unique {
fn unique(&mut self) -> UniqueState;
}
impl Unique for Iterator<Item = String> {
fn unique(&mut self) -> UniqueState {
UniqueState {
seen: HashSet::new(),
underlying: self,
}
}
}
impl<'a> Iterator for UniqueState<'a> {
type Item = String;
fn next(&mut self) -> Option<String> {
while let Some(x) = self.underlying.next() {
if !self.seen.contains(&x) {
self.seen.insert(x.clone());
return Some(x);
}
}
None
}
}
This compiles. However, when I try to use in the same file:
fn main() {
let foo = vec!["a", "b", "a", "cc", "cc", "d"];
for s in foo.iter().unique() {
println!("{}", s);
}
}
I get the following error:
error[E0599]: no method named `unique` found for type `std::slice::Iter<'_, &str>` in the current scope
--> src/main.rs:37:25
|
37 | for s in foo.iter().unique() {
| ^^^^^^
|
= help: items from traits can only be used if the trait is implemented and in scope
= note: the following trait defines an item `unique`, perhaps you need to implement it:
candidate #1: `Unique`
What am I doing wrong? How would I extend this arbitrary hashable types?
In your particular case, it's because you have implemented your trait for an iterator of String, but your vector is providing an iterator of &str. Here's a more generic version:
use std::collections::HashSet;
use std::hash::Hash;
struct Unique<I>
where
I: Iterator,
{
seen: HashSet<I::Item>,
underlying: I,
}
impl<I> Iterator for Unique<I>
where
I: Iterator,
I::Item: Hash + Eq + Clone,
{
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
while let Some(x) = self.underlying.next() {
if !self.seen.contains(&x) {
self.seen.insert(x.clone());
return Some(x);
}
}
None
}
}
trait UniqueExt: Iterator {
fn unique(self) -> Unique<Self>
where
Self::Item: Hash + Eq + Clone,
Self: Sized,
{
Unique {
seen: HashSet::new(),
underlying: self,
}
}
}
impl<I: Iterator> UniqueExt for I {}
fn main() {
let foo = vec!["a", "b", "a", "cc", "cc", "d"];
for s in foo.iter().unique() {
println!("{}", s);
}
}
Broadly, we create a new extension trait called UniqueExt which has Iterator as a supertrait. When Iterator is a supertrait, we will have access to the associated type Iterator::Item.
This trait defines the unique method, which is only valid to call when then iterated item can be:
Hashed
Compared for total equality
Cloned
Additionally, it requires that the item implementing Iterator have a known size at compile time. This is done so that the iterator can be consumed by the Unique iterator adapter.
The other important part is the blanket implementation of the trait for any type that also implements Iterator:
impl<I: Iterator> UniqueExt for I {}