Implement a pairwise iterator - iterator

I have trouble writing code for a function that takes an iterator and returns an iterator that iterates in pairs (Option<T>, T) like so
a = [1,2,3]
assert pairwise(a) == [(None, 1), (Some(1), 2), (Some(2), 3)]
fn pairwise<I, T>(&xs: &I) -> I
where
I: Iterator<Item = T>,
{
[None].iter().chain(xs.iter().map(Some)).zip(xs.iter())
}
fn main() {
let data: Vec<i32> = vec![1, 2, 3];
let newdata: Vec<Option<i32>, i32> = pairwise(&data).collect();
println!("{:?}", newdata);
}
error[E0599]: no method named `iter` found for type `I` in the current scope
--> src/main.rs:3:28
|
3 | [None].iter().chain(xs.iter().map(Some)).zip(xs.iter())
| ^^^^
|
Not sure why xs isn't iterable. I've stated it in the where clause, haven't I?

fn pairwise<I, T>(&xs: &I) -> I
This doesn't make sense. See What is the correct way to return an Iterator (or any other trait)? and What is the difference between `e1` and `&e2` when used as the for-loop variable?.
I: Iterator<Item = T>,
There's no reason to specify that the Item is a T.
[None].iter()
It's better to use iter::once.
xs.iter()
There's no trait in the standard library that defines an iter method. Perhaps you meant IntoIterator?
let data: Vec<i32> = vec![1, 2, 3]
There's no reason to specify the type here; i32 is the default integral type.
Vec<Option<i32>, i32>
Vec<Option<i32>, i32>> // original version
This is not a valid type for Vec, and your original form doesn't even have balanced symbols.
After all that, you are faced with tough choices. Your example code passes in an iterator which has references to the slice but you've written your assertion such that you expect to get non-references back. You've also attempted to use an arbitrary iterator twice; there's no guarantee that such a thing is viable.
The most generic form I see is:
use std::iter;
fn pairwise<I>(right: I) -> impl Iterator<Item = (Option<I::Item>, I::Item)>
where
I: IntoIterator + Clone,
{
let left = iter::once(None).chain(right.clone().into_iter().map(Some));
left.zip(right)
}
fn main() {
let data = vec![1, 2, 3];
let newdata: Vec<_> = pairwise(&data).collect();
assert_eq!(newdata, [(None, &1), (Some(&1), &2), (Some(&2), &3)]);
let newdata: Vec<_> = pairwise(data.iter().copied()).collect();
assert_eq!(newdata, [(None, 1), (Some(1), 2), (Some(2), 3)]);
}
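If the input cannot be cloned or traversed twice (a reader or a channel, say), a single-pass variant is possible by remembering the previous item. This is only a sketch: pairwise_single_pass is a hypothetical name, and it assumes the items themselves are Clone, which the version above does not require.

```rust
/// Hypothetical single-pass variant: pairs each item with its predecessor.
/// Assumes `I::Item: Clone` so the previous item can be kept around.
fn pairwise_single_pass<I>(iter: I) -> impl Iterator<Item = (Option<I::Item>, I::Item)>
where
    I: IntoIterator,
    I::Item: Clone,
{
    // Holds the item seen on the previous step; `None` before the first item.
    let mut prev: Option<I::Item> = None;
    iter.into_iter().map(move |item| {
        // Store a clone of the current item and get the old value back.
        let before = prev.replace(item.clone());
        (before, item)
    })
}
```

Calling pairwise_single_pass(vec![1, 2, 3]) yields (None, 1), (Some(1), 2), (Some(2), 3). The trade-off is one clone per item instead of a clone of the whole input.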
See also:
Iterating over a slice's values instead of references in Rust?
How to iterate over and filter an array?
How to create a non consuming iterator from a Vector
Why can I iterate over a slice twice, but not a vector?
The compiler suggests I add a 'static lifetime because the parameter type may not live long enough, but I don't think that's what I want
What is the correct way to return an Iterator (or any other trait)?

I know OP asked for "outer pairwise" ([(None, 1), (Some(1), 2), (Some(2), 3)]), but here is how I adapted it for "inner pairwise" ([(1, 2), (2, 3)]):
fn inner_pairwise<I>(right: I) -> impl Iterator<Item = (I::Item, I::Item)>
where
I: IntoIterator + Clone,
{
let left = right.clone().into_iter().skip(1);
left.zip(right)
}
For anyone here for "inner pairwise", you're looking for Itertools::tuple_windows.
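If the data already lives in a slice, the standard library's slice::windows gives the same "inner pairwise" result without an extra crate. A minimal sketch, assuming Copy items; inner_pairs is a hypothetical name:

```rust
/// Hypothetical helper: adjacent pairs of a slice, using `slice::windows`.
/// Assumes `T: Copy` so the pairs can be returned by value.
fn inner_pairs<T: Copy>(xs: &[T]) -> Vec<(T, T)> {
    // `windows(2)` yields every overlapping sub-slice of length 2.
    xs.windows(2).map(|w| (w[0], w[1])).collect()
}
```

For [1, 2, 3] this returns [(1, 2), (2, 3)]; a slice with fewer than two elements produces no pairs.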

Related

What does Rhs refer to in a compiler error message about PartialEq?

I am trying to create a function that takes two iterators as parameters and iterates over the items by reference. Each Iterator item should implement PartialEq.
My first attempt was:
fn compute<T: Iterator>(first: T, second: T, len: usize) -> usize
where
T::Item: std::cmp::PartialEq,
{
// ...
}
This compiles but iterates (as far as I understand) not by reference but by value and the compiler complains about a move when iterating.
My second attempt was something like:
fn compute<'a, T>(first: T, second: T, len: usize) -> usize
where
T: Iterator<Item = &'a std::cmp::PartialEq>,
{
//...
}
resulting in a compiler error:
error[E0393]: the type parameter `Rhs` must be explicitly specified
--> src/main.rs:3:28
|
3 | T: Iterator<Item = &'a std::cmp::PartialEq>,
| ^^^^^^^^^^^^^^^^^^^ missing reference to `Rhs`
|
= note: because of the default `Self` reference, type parameters must be specified on object types
What does the Rhs (Right hand side?) the compiler refers to here mean? Why do I need a reference to it? How do I pass a bounded reference-based Iterator into a function?
PartialEq is a trait that allows you to compare two values. Those two values do not have to be of the same type! The generic type Rhs specifies the type of the right-hand side of the comparison. By default, Rhs is Self, the type the trait is implemented for:
pub trait PartialEq<Rhs = Self>
where
Rhs: ?Sized,
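To make the role of Rhs concrete, here is a small sketch that implements PartialEq with a right-hand side other than Self. UserId is a made-up type used only for illustration:

```rust
/// Hypothetical newtype used only to illustrate a non-default `Rhs`.
struct UserId(u32);

// Here `Rhs` is `u32`, not `Self`: this enables `UserId == u32`.
impl PartialEq<u32> for UserId {
    fn eq(&self, other: &u32) -> bool {
        self.0 == *other
    }
}

// The default form, `PartialEq<Self>`, enables `UserId == UserId`.
impl PartialEq for UserId {
    fn eq(&self, other: &UserId) -> bool {
        self.0 == other.0
    }
}
```

With both impls in place, UserId(7) == 7u32 and UserId(7) == UserId(7) both compile, and each comparison picks the impl whose Rhs matches the right-hand operand.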
In this case, you are actually requesting that the iterator value be the trait object &PartialEq. As the error message states:
because of the default Self reference, type parameters must be specified on object types
We could specify it:
fn compute<'a, T>(first: T, second: T, len: usize) -> usize
where
T: Iterator<Item = &'a std::cmp::PartialEq<i32>>,
or
fn compute<'a, T: 'a>(first: T, second: T, len: usize) -> usize
where
T: Iterator<Item = &'a std::cmp::PartialEq<&'a T>>,
but iterates (as far as I understand) not by reference but by value
It's quite possible for it to iterate by reference. Remember that T is any type and that i32, &i32, and &mut i32 are all types. Your first example is the formulation of the signature I would use:
fn compute<T: Iterator>(first: T, second: T, len: usize) -> usize
where
T::Item: std::cmp::PartialEq,
{
42
}
fn main() {
let a = [1, 2, 3];
let b = [4, 5, 6];
compute(a.iter(), b.iter(), 1);
compute(a.iter(), b.iter(), 2);
compute(a.iter(), b.iter(), 3);
}
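As an illustration that the same signature works for both reference and value iterators, here is a sketch with an actual body. The original body is elided, so count_equal and its behaviour (counting equal pairs among the first len items) are assumptions made for the example:

```rust
/// Hypothetical body for a `compute`-like function: counts positions,
/// among the first `len`, where both iterators yield equal items.
fn count_equal<T>(first: T, second: T, len: usize) -> usize
where
    T: Iterator,
    T::Item: PartialEq,
{
    first
        .zip(second)
        .take(len)
        // `a` and `b` are references to the items; `&X == &X` works
        // whenever `X: PartialEq`.
        .filter(|(a, b)| a == b)
        .count()
}
```

Passing a.iter() and b.iter() makes T::Item a &i32, while a.iter().copied() makes it a plain i32; the function does not need to know which.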

Chaining iterators of different types

I get type errors when chaining different types of Iterator.
let s = Some(10);
let v = (1..5).chain(s.iter())
.collect::<Vec<_>>();
Output:
<anon>:23:20: 23:35 error: type mismatch resolving `<core::option::Iter<'_, _> as core::iter::IntoIterator>::Item == _`:
expected &-ptr,
found integral variable [E0271]
<anon>:23 let v = (1..5).chain(s.iter())
^~~~~~~~~~~~~~~
<anon>:23:20: 23:35 help: see the detailed explanation for E0271
<anon>:24:14: 24:33 error: no method named `collect` found for type `core::iter::Chain<core::ops::Range<_>, core::option::Iter<'_, _>>` in the current scope
<anon>:24 .collect::<Vec<_>>();
^~~~~~~~~~~~~~~~~~~
<anon>:24:14: 24:33 note: the method `collect` exists but the following trait bounds were not satisfied: `core::iter::Chain<core::ops::Range<_>, core::option::Iter<'_, _>> : core::iter::Iterator`
error: aborting due to 2 previous errors
But it works fine when zipping:
let s = Some(10);
let v = (1..5).zip(s.iter())
.collect::<Vec<_>>();
Output:
[(1, 10)]
Why is Rust able to infer the correct types for zip but not for chain and how can I fix it? n.b. I want to be able to do this for any iterator, so I don't want a solution that just works for Range and Option.
First, note that the iterators yield different types. I've added an explicit u8 to the numbers to make the types more obvious:
fn main() {
let s = Some(10u8);
let r = (1..5u8);
let () = s.iter().next(); // Option<&u8>
let () = r.next(); // Option<u8>
}
When you chain two iterators, both iterators must yield the same type. This makes sense as the iterator cannot "switch" what type it outputs when it gets to the end of one and begins on the second:
fn chain<U>(self, other: U) -> Chain<Self, U::IntoIter>
where U: IntoIterator<Item=Self::Item>
// ^~~~~~~~~~~~~~~ This means the types must match
So why does zip work? Because it doesn't have that restriction:
fn zip<U>(self, other: U) -> Zip<Self, U::IntoIter>
where U: IntoIterator
// ^~~~ Nothing here!
This is because zip returns a tuple with one value from each iterator; a new type, distinct from either source iterator's item type. One iterator could yield an integral type and the other could yield your own custom type, for all zip cares.
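A short sketch of zip combining two unrelated item types; pair_up is a hypothetical helper written only to show the resulting tuple type:

```rust
use std::ops::Range;

/// Hypothetical helper: zips numbers with string slices.
/// The item types differ (`u8` and `&str`), which `zip` accepts.
fn pair_up(ids: Range<u8>, names: &[&'static str]) -> Vec<(u8, &'static str)> {
    // `zip` stops as soon as either side runs out of items.
    ids.zip(names.iter().copied()).collect()
}
```

pair_up(1..5, &["a", "b"]) returns [(1, "a"), (2, "b")], because the shorter input determines the length.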
Why is Rust able to infer the correct types for zip but not for chain
There is no type inference happening here; that's a different thing. This is just plain-old type mismatching.
and how can I fix it?
In this case, your inner iterator yields a reference to an integer, a Clone-able type, so you can use cloned to make a new iterator that clones each value and then both iterators would have the same type:
fn main() {
let s = Some(10);
let v: Vec<_> = (1..5).chain(s.iter().cloned()).collect();
}
If you are done with the option, you can also use a consuming iterator with into_iter:
fn main() {
let s = Some(10);
let v: Vec<_> = (1..5).chain(s.into_iter()).collect();
}
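Since the question asks for something that is not tied to Range and Option, the cloned approach can be wrapped in a generic function. This is a sketch under the assumption that the items are Clone; chain_with_borrowed is a hypothetical name:

```rust
/// Hypothetical helper: chains an iterator of owned `T` with an
/// iterator of `&T` by cloning the borrowed items.
fn chain_with_borrowed<'a, T, A, B>(owned: A, borrowed: B) -> Vec<T>
where
    T: Clone + 'a,
    A: IntoIterator<Item = T>,
    B: IntoIterator<Item = &'a T>,
{
    // After `cloned()`, both sides yield `T`, so `chain` is satisfied.
    owned
        .into_iter()
        .chain(borrowed.into_iter().cloned())
        .collect()
}
```

With let s = Some(10), the call chain_with_borrowed(1..5, s.iter()) returns [1, 2, 3, 4, 10]. The same function accepts a Vec and a borrowed Vec, or any other pair of inputs whose item types are T and &T.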

Iterate over copy types

It is clear that iterators pass around references to avoid moving objects into the iterator or its closure argument, but what about Copy types? Let me show you a small snippet:
fn is_odd(x: &&i32) -> bool { *x & 1 == 1 }
// [1] fn is_odd(x: &i32) -> bool { x & 1 == 1 }
// [2] fn is_odd(x: i32) -> bool { x & 1 == 1 }
fn main() {
let xs = &[ 10, 20, 13, 14 ];
for x in xs.iter().filter(is_odd) {
assert_eq!(13, *x);
}
// [1] ...is slightly better, but not ideal
// for x in xs.iter().cloned().filter(is_odd) {
// assert_eq!(13, x);
// }
}
Am I right that .cloned() is preferred when we iterate over something like &[i32] or &[u8], where extra indirection is involved instead of just copying the tiny data unit?
But it looks like I can not avoid references passed into is_odd function.
Is there a way to make [2] function from above snippet work for higher-level functions like filter?
Assume that I understand that moving non-Copy type into predicate function is silly. But not all types use move semantics by default, right?
It is clear that iterators pass around references
This blanket statement is not true; iterators are more than capable of yielding non-references. filter provides a reference to the closure because it doesn't want to give ownership of the item to the closure. In your example, the iterated value is a &i32, so filter provides a &&i32.
Is there a way to make [2] function from above snippet work for higher-level functions like filter?
Certainly, just provide a closure that does the dereferencing:
fn is_odd(x: i32) -> bool { x & 1 == 1 }
fn main() {
let xs = &[ 10, 20, 13, 14 ];
for x in xs.iter().filter(|&&x| is_odd(x)) {
assert_eq!(13, *x);
}
}
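If you would rather work with plain values all the way through, copying the items first removes one level of reference. A small sketch; odd_values is a hypothetical wrapper so the result can be inspected:

```rust
fn is_odd(x: i32) -> bool {
    x & 1 == 1
}

/// Hypothetical wrapper: collects the odd values of a slice by value.
fn odd_values(xs: &[i32]) -> Vec<i32> {
    xs.iter()
        .copied() // items are now `i32` instead of `&i32`
        .filter(|&x| is_odd(x)) // `filter` still lends `&i32` to the closure
        .collect()
}
```

For [10, 20, 13, 14] this returns [13]. filter always passes a reference to its predicate, so the closure is still needed, but it only has to strip one layer.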

How to compose mutable Iterators?

Editor's note: This code example is from a version of Rust prior to 1.0 and is not syntactically valid Rust 1.0 code. Updated versions of this code produce different errors, but the answers still contain valuable information.
I would like to make an iterator that generates a stream of prime numbers. My general thought process was to wrap an iterator with successive filters so for example you start with
let mut n = (2..N)
Then for each prime number you mutate the iterator and add on a filter
let p1 = n.next()
n = n.filter(|&x| x%p1 !=0)
let p2 = n.next()
n = n.filter(|&x| x%p2 !=0)
I am trying to use the following code, but I can not seem to get it to work
struct Primes {
base: Iterator<Item = u64>,
}
impl<'a> Iterator for Primes<'a> {
type Item = u64;
fn next(&mut self) -> Option<u64> {
let p = self.base.next();
match p {
Some(n) => {
let prime = n.clone();
let step = self.base.filter(move |&: &x| {x%prime!=0});
self.base = &step as &Iterator<Item = u64>;
Some(n)
},
_ => None
}
}
}
I have toyed with variations of this, but I can't seem to get lifetimes and types to match up. Right now the compiler is telling me
I can't mutate self.base
the variable prime doesn't live long enough
Here is the error I am getting
solution.rs:16:17: 16:26 error: cannot borrow immutable borrowed content `*self.base` as mutable
solution.rs:16 let p = self.base.next();
^~~~~~~~~
solution.rs:20:28: 20:37 error: cannot borrow immutable borrowed content `*self.base` as mutable
solution.rs:20 let step = self.base.filter(move |&: &x| {x%prime!=0});
^~~~~~~~~
solution.rs:21:30: 21:34 error: `step` does not live long enough
solution.rs:21 self.base = &step as &Iterator<Item = u64>;
^~~~
solution.rs:15:39: 26:6 note: reference must be valid for the lifetime 'a as defined on the block at 15:38...
solution.rs:15 fn next(&mut self) -> Option<u64> {
solution.rs:16 let p = self.base.next();
solution.rs:17 match p {
solution.rs:18 Some(n) => {
solution.rs:19 let prime = n.clone();
solution.rs:20 let step = self.base.filter(move |&: &x| {x%prime!=0});
...
solution.rs:20:71: 23:14 note: ...but borrowed value is only valid for the block suffix following statement 1 at 20:70
solution.rs:20 let step = self.base.filter(move |&: &x| {x%prime!=0});
solution.rs:21 self.base = &step as &Iterator<Item = u64>;
solution.rs:22 Some(n)
solution.rs:23 },
error: aborting due to 3 previous errors
Why won't Rust let me do this?
Here is a working version:
struct Primes<'a> {
base: Option<Box<dyn Iterator<Item = u64> + 'a>>,
}
impl<'a> Iterator for Primes<'a> {
type Item = u64;
fn next(&mut self) -> Option<u64> {
let p = self.base.as_mut().unwrap().next();
p.map(|n| {
let base = self.base.take();
let step = base.unwrap().filter(move |x| x % n != 0);
self.base = Some(Box::new(step));
n
})
}
}
impl<'a> Primes<'a> {
#[inline]
pub fn new<I: Iterator<Item = u64> + 'a>(r: I) -> Primes<'a> {
Primes {
base: Some(Box::new(r)),
}
}
}
fn main() {
for p in Primes::new(2..).take(32) {
print!("{} ", p);
}
println!("");
}
I'm using a Box<dyn Iterator> trait object (written Box<Iterator> before the dyn keyword was introduced). Boxing is unavoidable because the internal iterator must be stored somewhere between next() calls, and a borrowed trait object would have nothing to own the iterator it points to.
I made the internal iterator an Option. This is necessary because you need to replace it with a value which consumes it, so it is possible that the internal iterator may be "absent" from the structure for a short time. Rust models absence with Option. Option::take replaces the value it is called on with None and returns whatever was there. This is useful when shuffling non-copyable objects around.
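The take-then-replace pattern is easier to see in isolation. A minimal sketch on an Option<String>; bracket_in_place is a hypothetical helper, not part of the prime code:

```rust
/// Hypothetical helper: wraps the stored string in brackets, in place.
fn bracket_in_place(slot: &mut Option<String>) {
    // `take` moves the String out and leaves `None` behind...
    if let Some(old) = slot.take() {
        // ...so we own `old` and can build its replacement from it.
        *slot = Some(format!("[{}]", old));
    }
}
```

Starting from Some("a"), the slot ends up holding "[a]"; an empty slot stays None. The Primes struct uses the same idea to move the boxed iterator out, wrap it in a filter, and put the result back.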
Note, however, that this sieve implementation is inefficient in both memory and computation: each prime adds another layer of iterators, and each layer takes heap space. The stack depth of a next() call also grows linearly with the number of primes found, so a sufficiently large request overflows the stack:
fn main() {
println!("{}", Primes::new(2..).nth(10000).unwrap());
}
Running it:
% ./test1
thread '<main>' has overflowed its stack
zsh: illegal hardware instruction (core dumped) ./test1
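One way around the growing stack is to keep the primes found so far in a Vec and test each candidate against them, instead of nesting filters. This swaps in plain trial division, which is not the approach from the question; TrialDivisionPrimes is a hypothetical name and the code is only a sketch:

```rust
/// Hypothetical flat alternative: trial division against stored primes.
struct TrialDivisionPrimes {
    found: Vec<u64>, // primes produced so far, in increasing order
    candidate: u64,  // next number to test
}

impl TrialDivisionPrimes {
    fn new() -> Self {
        TrialDivisionPrimes { found: Vec::new(), candidate: 2 }
    }
}

impl Iterator for TrialDivisionPrimes {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        loop {
            let n = self.candidate;
            self.candidate += 1;
            // Only primes p with p * p <= n can be the smallest factor of n.
            let is_prime = self
                .found
                .iter()
                .take_while(|&&p| p * p <= n)
                .all(|&p| n % p != 0);
            if is_prime {
                self.found.push(n);
                return Some(n);
            }
        }
    }
}
```

Every call to next() runs in a single stack frame, so TrialDivisionPrimes::new().nth(10000) finishes normally and returns Some(104743). Memory still grows with the number of primes, but as one flat Vec instead of a chain of boxed adapters.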

Implementing a "cautious" take_while using Peekable

I'd like to use Peekable as the basis for a new cautious_take_while operation that acts like take_while from IteratorExt but without consuming the first failed item. (There's a side question of whether this is a good idea, and whether there are better ways to accomplish this goal in Rust -- I'd be happy for hints in that direction, but mostly I'm trying to understand where my code is breaking).
The API I'm trying to enable is basically:
let mut chars = "abcdefg.".chars().peekable();
let abc : String = chars.by_ref().cautious_take_while(|&x| x != 'd');
let defg : String = chars.by_ref().cautious_take_while(|&x| x != '.');
// yielding (abc = "abc", defg = "defg")
I've taken a crack at creating a MCVE here, but I'm getting:
:10:5: 10:19 error: cannot move out of borrowed content
:10 chars.by_ref().cautious_take_while(|&x| x != '.');
As far as I can tell, I'm following the same pattern as Rust's own TakeWhile in terms of my function signatures, but I'm seeing different behavior from the borrow checker. Can someone point out what I'm doing wrong?
The funny thing with by_ref() is that it returns a mutable reference to itself:
pub trait IteratorExt: Iterator + Sized {
fn by_ref(&mut self) -> &mut Self { self }
}
It works because the Iterator trait is implemented for mutable references to iterators. Smart!
impl<'a, I> Iterator for &'a mut I where I: Iterator, I: ?Sized { ... }
The standard take_while function works because it is defined on the Iterator trait, which here resolves to the implementation for &mut Peekable<T>.
But your code does not work because Peekable is a struct, not a trait, so your CautiousTakeWhileable must name the concrete type. You are then trying to take ownership of the Peekable, which you cannot do because you only have a mutable reference to it.
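To see the "&mut I is itself an Iterator" point in action with a standard adapter, here is a small sketch. split_first_n is a hypothetical helper; it only relies on by_ref and take from the standard library:

```rust
/// Hypothetical helper: splits a string after its first `n` chars,
/// reusing one iterator for both halves via `by_ref`.
fn split_first_n(text: &str, n: usize) -> (String, String) {
    let mut chars = text.chars();
    // `take` consumes the `&mut Chars`, not `chars` itself.
    let head: String = chars.by_ref().take(n).collect();
    // `chars` is still usable and continues where `take` stopped.
    let tail: String = chars.collect();
    (head, tail)
}
```

split_first_n("abcdefg", 3) returns ("abc", "defg"). Unlike take_while, take never has to look at an item to decide whether to stop, so nothing is lost between the two halves.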
The solution is to take a &mut Peekable<T> instead of a Peekable<T>. You will need to specify the lifetime too:
impl <'a, T: Iterator, P> Iterator for CautiousTakeWhile<&'a mut Peekable<T>, P>
where P: FnMut(&T::Item) -> bool {
//...
}
impl <'a, T: Iterator> CautiousTakeWhileable for &'a mut Peekable<T> {
fn cautious_take_while<P>(self, f: P) -> CautiousTakeWhile<&'a mut Peekable<T>, P>
where P: FnMut(&T::Item) -> bool {
CautiousTakeWhile{inner: self, condition: f,}
}
}
A curious side effect of this solution is that now by_ref is not needed, because cautious_take_while() takes a mutable reference, so it does not steal ownership. The by_ref() call is needed for take_while() because it can take either Peekable<T> or &mut Peekable<T>, and it defaults to the first one. With the by_ref() call it will resolve to the second one.
And now that I finally understand it, I think it might be a good idea to change the definition of struct CautiousTakeWhile to include the peekable bit into the struct itself. The difficulty is that the lifetime has to be specified manually, if I'm right. Something like:
struct CautiousTakeWhile<'a, T: Iterator + 'a, P>
where T::Item : 'a {
inner: &'a mut Peekable<T>,
condition: P,
}
trait CautiousTakeWhileable<'a, T>: Iterator {
fn cautious_take_while<P>(self, f: P) -> CautiousTakeWhile<'a, T, P> where
P: FnMut(&Self::Item) -> bool;
}
and the rest is more or less straightforward.
This was a tricky one! I'll lead with the meat of the code, then attempt to explain it (if I understand it...). It's also the ugly, unsugared version, as I wanted to reduce incidental complexity.
use std::iter::Peekable;
fn main() {
let mut chars = "abcdefg.".chars().peekable();
let abc: String = CautiousTakeWhile{inner: chars.by_ref(), condition: |&x| x != 'd'}.collect();
let defg: String = CautiousTakeWhile{inner: chars.by_ref(), condition: |&x| x != '.'}.collect();
println!("{}, {}", abc, defg);
}
struct CautiousTakeWhile<'a, I, P>
where I::Item: 'a,
I: Iterator + 'a,
P: FnMut(&I::Item) -> bool,
{
inner: &'a mut Peekable<I>,
condition: P,
}
impl<'a, I, P> Iterator for CautiousTakeWhile<'a, I, P>
where I::Item: 'a,
I: Iterator + 'a,
P: FnMut(&I::Item) -> bool
{
type Item = I::Item;
fn next(&mut self) -> Option<I::Item> {
let return_next =
match self.inner.peek() {
Some(ref v) => (self.condition)(v),
_ => false,
};
if return_next { self.inner.next() } else { None }
}
}
Actually, Rodrigo seems to have a good explanation, so I'll defer to that, unless you'd like me to explain something specific.
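For readers on a recent toolchain: Peekable::next_if (stable since Rust 1.51) does the peek-then-advance step in one call, so the same behaviour can be sketched without a custom struct. This version is eager, returning a Vec where the adapter above is lazy, and the free function cautious_take_while is a hypothetical name rather than the trait method from the question:

```rust
use std::iter::Peekable;

/// Hypothetical eager variant: collects items while `pred` holds and
/// leaves the first failing item in the iterator.
fn cautious_take_while<I, P>(iter: &mut Peekable<I>, mut pred: P) -> Vec<I::Item>
where
    I: Iterator,
    P: FnMut(&I::Item) -> bool,
{
    let mut taken = Vec::new();
    // `next_if` only advances when the closure returns true.
    while let Some(item) = iter.next_if(|candidate| pred(candidate)) {
        taken.push(item);
    }
    taken
}
```

On "abcdefg.".chars().peekable(), a first call with |&c| c != 'd' collects a, b, c, and a second call with |&c| c != '.' collects d, e, f, g. The '.' is still the next item afterwards.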