Is there a way to deplete an iterator after calling Iterator::take? [duplicate]

I find it odd that Iterator::take_while takes ownership of the iterator. It seems like a useful feature to be able to take the first x elements which satisfy some function but still leave the rest of the elements available in the original iterator.
I understand that this is incompatible with a lazy implementation of take_while, but still feels useful. Was this just judged not useful enough to include in the standard library, or is there some other problem I'm not seeing?

All the iterator adapters take the original iterator by value for efficiency's sake. Additionally, taking ownership of the original iterator avoids having to deal with lifetimes when it isn't necessary.
If you wish to retain access to the original iterator, you can use by_ref. This introduces one level of indirection, but the programmer chooses to opt into the extra work when the feature is needed:
fn main() {
let v = [1, 2, 3, 4, 5, 6, 7, 8];
let mut i1 = v.iter();
for z in i1.by_ref().take_while(|&&v| v < 4) {
// ^^^^^^^^^
println!("Take While: {}", z);
}
for z in i1 {
println!("Rest: {}", z);
}
}
This has the output:
Take While: 1
Take While: 2
Take While: 3
Rest: 5
Rest: 6
Rest: 7
Rest: 8
Iterator::by_ref works because there's an implementation of Iterator for any mutable reference to an iterator:
impl<I> Iterator for &mut I
where
I: Iterator + ?Sized,
This means that you can also take a mutable reference yourself. The parentheses are needed for precedence:
for z in (&mut i1).take_while(|&&v| v < 4)
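The same trick applies to Iterator::take from the title. Here is a minimal sketch, assuming a slice of i32 values; split_after_take is a hypothetical helper name, not something from the standard library:

```rust
/// Hypothetical helper: returns the first `n` items and whatever is left.
fn split_after_take(values: &[i32], n: usize) -> (Vec<i32>, Vec<i32>) {
    let mut iter = values.iter().copied();
    // Borrow the iterator mutably so `take` consumes through the reference
    // instead of taking ownership of `iter` itself.
    let head: Vec<i32> = (&mut iter).take(n).collect();
    // `iter` is still usable here and resumes where `take` stopped.
    let rest: Vec<i32> = iter.collect();
    (head, rest)
}

fn main() {
    let (head, rest) = split_after_take(&[1, 2, 3, 4, 5], 2);
    println!("head = {:?}, rest = {:?}", head, rest);
}
```

Unlike take_while, take never has to look at an element to decide whether to stop, so nothing goes missing between head and rest.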
Did you note that 4 was missing? That's because once take_while picks a value and decides to not use it, there's nowhere for it to "put it back". Putting it back would require opting into more storage and slowness than is always needed.
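To make the "nowhere to put it back" point concrete, here is a rough hand-rolled sketch of what take_while has to do. It is for illustration only and is not the standard library's implementation; manual_take_while is a hypothetical name, and unlike the real adapter it is eager and hands the rejected element back to the caller:

```rust
/// Hand-rolled, eager stand-in for `take_while`, for illustration only.
/// Returns the accepted prefix and the element that ended it, if any.
fn manual_take_while<I, P>(iter: &mut I, mut pred: P) -> (Vec<I::Item>, Option<I::Item>)
where
    I: Iterator,
    P: FnMut(&I::Item) -> bool,
{
    let mut taken = Vec::new();
    // The only way to test an element is to pull it out with `next`.
    while let Some(item) = iter.next() {
        if pred(&item) {
            taken.push(item);
        } else {
            // `item` has already left the iterator. `take_while` simply
            // drops it at this point; this sketch returns it instead.
            return (taken, Some(item));
        }
    }
    (taken, None)
}

fn main() {
    let mut iter = vec![1, 2, 3, 4, 5, 6].into_iter();
    let (taken, rejected) = manual_take_while(&mut iter, |&v| v < 4);
    let rest: Vec<i32> = iter.collect();
    println!("taken = {:?}, rejected = {:?}, rest = {:?}", taken, rejected, rest);
}
```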
I've used the itertools crate to handle cases like this, specifically take_while_ref:
use itertools::Itertools; // 0.9.0
fn main() {
let v = [1, 2, 3, 4, 5, 6, 7, 8];
let mut i1 = v.iter();
for z in i1.take_while_ref(|&&v| v < 4) {
// ^^^^^^^^^^^^^^^
println!("Take While: {}", z);
}
for z in i1 {
println!("Rest: {}", z);
}
}
Take While: 1
Take While: 2
Take While: 3
Rest: 4
Rest: 5
Rest: 6
Rest: 7
Rest: 8

If it's getting too complicated, we may be using the wrong tool.
Note that 4 is present here.
fn main() {
let v = [1, 2, 3, 4, 5, 6, 7, 8];
let mut i1 = v.iter().peekable();
while let Some(z) = i1.next_if(|&n| n < &4) {
println!("Take While: {z}");
}
for z in i1 {
println!("Rest: {z}");
}
}
Take While: 1
Take While: 2
Take While: 3
Rest: 4
Rest: 5
Rest: 6
Rest: 7
Rest: 8
Yes, the OP asked for take_while and Shepmaster's solution is superb.

Related

How do I chain operators over lists in rust? Looking for equivalent to kotlin code

I have the following code in kotlin and I'm trying to find a rust equivalent, but don't understand the chaining mechanism in rust to convert.
val windowSize = 2
val result = listOf(1, 2, 3, 4, 5, 6)
.windowed(windowSize, 1) // [[1,2], [2,3], [3,4], [4,5], [5,6]]
.map { it.sum() } // [ 3, 5, 7, 9, 11]
.windowed(2, 1) // [[3,5], [5,7], [7,9], [9,11] ]
.count { it[0] < it[1] } // 4
// result = 4, as there are 4 sequences that have first number less than 2nd,
// when considering a sliding window over the original data of 2 items at a time.
It just takes a list of integers, splits them into pairs (but the windowSize will be a function parameter), sums those groups, splits the sums into pairs again, and finds where each second element is bigger than the previous, so finding increasing values over moving windows.
I'm converting this to the rust equivalent, but struggling to understand how to chain operations together.
What I've got so far is:
let input = [1, 2, 3, 4, 5, 6];
input.windows(2)
.map(|es| es.iter().sum())
// what goes here to do the next windows(2) operation?
.for_each(|x: u32| println!("{}", x));
I can "for_each" over the map to do things on the iteration, but I can't split it with another "windows()", or don't know the magic to make that possible. IntelliJ is showing me the return type from map is impl Iterator<Item=?>
Can anyone enlighten me please? I am an absolute beginner on rust, so this is undoubtedly to do with my understanding of the language as a whole.
The Itertools crate provides a reasonably convenient way to do this with the tuple_windows method.
use itertools::Itertools;
fn main() {
let input = [1i32, 2, 3, 4, 5, 6];
let output: usize = input
.windows(2)
.map(|es| es.iter().sum::<i32>())
.tuple_windows()
.filter(|(a, b)| a < b)
.count();
println!("{}", output);
}
The standard library has no direct equivalent of tuple_windows. The straightforward std-only route is to collect the iterator first, which requires two passes through the data.
It is a bit convoluted to chain everything. You need to collect into a vec so you can access windows again. Then you can flat_map the windows to array references (taken from this other answer) to complete what you want to do:
fn main() {
let input = [1usize, 2, 3, 4, 5, 6];
let res = input
.windows(2)
.map(|es| es.iter().sum::<usize>())
.collect::<Vec<_>>()
.windows(2)
.flat_map(<&[usize; 2]>::try_from)
.filter(|[a, b]| a < b)
.count();
println!("{}", res);
}
Note: the nightly-only array_windows feature, which uses const generics, makes the .flat_map(<&[usize; 2]>::try_from) call unnecessary.
As stated in @Aiden4's answer, the best solution is to use itertools::tuple_windows. It is, however, possible using just the standard library and without collecting into an intermediate vector, by using Iterator::scan:
fn main() {
let input = [1i32, 2, 3, 4, 5, 6];
let output: usize = input
.windows(2)
.map(|es| es.iter().sum())
.scan(0, |prev, cur| {
let res = (*prev, cur);
*prev = cur;
Some(res)
})
.skip(1)
.filter(|(a, b)| a < b)
.count();
println!("{}", output);
}
Using std and stable only:
fn main() {
let input = [1i32, 2, 3, 4, 5, 6];
let mut iter = input.windows(2).map(|es| es.iter().sum::<i32>());
let n = if let Some(mut prev) = iter.next() {
iter.map(|i| {
let ret = (prev, i);
prev = i;
ret
})
.filter(|(a, b)| a < b)
.count()
} else {
0
};
println!("{}", n);
}
This should be very fast.
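The question mentions that windowSize will eventually be a function parameter. A minimal std-only sketch of that, reusing the same prev/cur idea; count_increasing_sums is a hypothetical name, and the handling of a zero window size is an assumption made here:

```rust
/// Hypothetical helper: counts how often the sum of one sliding window
/// is smaller than the sum of the next one.
fn count_increasing_sums(input: &[i32], window_size: usize) -> usize {
    // `windows` panics on a size of 0, so treat that as "no windows".
    if window_size == 0 {
        return 0;
    }
    let mut sums = input.windows(window_size).map(|w| w.iter().sum::<i32>());
    // Without a first sum there is nothing to compare against.
    let mut prev = match sums.next() {
        Some(first) => first,
        None => return 0,
    };
    let mut count = 0;
    for cur in sums {
        if prev < cur {
            count += 1;
        }
        prev = cur;
    }
    count
}

fn main() {
    let input = [1, 2, 3, 4, 5, 6];
    println!("{}", count_increasing_sums(&input, 2));
}
```

This makes a single pass over the window sums and allocates nothing.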

Conditionally return empty iterator from flat_map

Given this definition for foo:
let foo = vec![vec![1, 2, 3], vec![4, 5, 6], vec![7, 8, 9]];
I'd like to be able to write code like this:
let result: Vec<_> = foo.iter()
.enumerate()
.flat_map(|(i, row)| if i % 2 == 0 {
row.iter().map(|x| x * 2)
} else {
std::iter::empty()
})
.collect();
but that raises an error about the if and else clauses having incompatible types. I tried removing the map temporarily and I tried defining an empty vector outside the closure and returning an iterator over that like so:
let empty = vec![];
let result: Vec<_> = foo.iter()
.enumerate()
.flat_map(|(i, row)| if i % 2 == 0 {
row.iter() //.map(|x| x * 2)
} else {
empty.iter()
})
.collect();
This seems kind of silly but it compiles. If I try to uncomment the map then it still complains about the if and else clauses having incompatible types. Here's part of the error message:
error[E0308]: if and else have incompatible types
--> src/main.rs:6:30
|
6 | .flat_map(|(i, row)| if i % 2 == 0 {
| ______________________________^
7 | | row.iter().map(|x| x * 2)
8 | | } else {
9 | | std::iter::empty()
10 | | })
| |_________^ expected struct `std::iter::Map`, found struct `std::iter::Empty`
|
= note: expected type `std::iter::Map<std::slice::Iter<'_, {integer}>, [closure@src/main.rs:7:28: 7:37]>`
found type `std::iter::Empty<_>`
I know I could write something that does what I want with some nested for loops but I'd like to know if there's a terse way to write it using iterators.
Since Rust is statically typed, and each step in an iterator chain produces a new type that wraps the previous ones (unless you use boxed trait objects), you have to write it so that both branches produce the same type.
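For completeness, the boxed trait object route mentioned in passing could look roughly like this. It is a sketch under the assumption that the rows hold i32 values; double_even_rows is a hypothetical name, and each row costs one heap allocation plus dynamic dispatch:

```rust
/// Hypothetical helper: doubles the items of every even-indexed row.
fn double_even_rows(foo: &[Vec<i32>]) -> Vec<i32> {
    foo.iter()
        .enumerate()
        .flat_map(|(i, row)| {
            // Both branches are coerced to the same boxed trait object type,
            // so the `if` and `else` arms no longer disagree.
            let iter: Box<dyn Iterator<Item = i32>> = if i % 2 == 0 {
                Box::new(row.iter().map(|x| x * 2))
            } else {
                Box::new(std::iter::empty())
            };
            iter
        })
        .collect()
}

fn main() {
    let foo = vec![vec![1, 2, 3], vec![4, 5, 6], vec![7, 8, 9]];
    println!("{:?}", double_even_rows(&foo));
}
```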
One way to convey conditional emptiness with a single type is the TakeWhile iterator implementation.
.flat_map(|(i, row)| {
let iter = row.iter().map(|x| x * 2);
let take = i % 2 == 0;
iter.take_while(|_| take)
})
If you don't mind ignoring the edge case where an inner iterator could yield more than usize::MAX elements, you could also use Take instead, with a limit of either 0 or usize::MAX. It has the advantage of providing a better size_hint() than TakeWhile.
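A sketch of the Take variant, again assuming i32 rows; double_even_rows_with_take is a hypothetical name. The main function also prints the two size hints so the difference is visible:

```rust
/// Hypothetical helper: same result as the TakeWhile version, but using
/// `take` with a limit of either "everything" or "nothing".
fn double_even_rows_with_take(foo: &[Vec<i32>]) -> Vec<i32> {
    foo.iter()
        .enumerate()
        .flat_map(|(i, row)| {
            let limit = if i % 2 == 0 { usize::MAX } else { 0 };
            row.iter().map(|x| x * 2).take(limit)
        })
        .collect()
}

fn main() {
    let foo = vec![vec![1, 2, 3], vec![4, 5, 6], vec![7, 8, 9]];
    println!("{:?}", double_even_rows_with_take(&foo));

    // `take(0)` knows up front that it is empty; `take_while` cannot know
    // that before it has run its predicate at least once.
    let row = [1, 2, 3];
    println!("take:       {:?}", row.iter().take(0).size_hint());
    println!("take_while: {:?}", row.iter().take_while(|_| false).size_hint());
}
```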
In your specific example, you can use filter to remove unwanted elements prior to calling flat_map:
let result: Vec<_> = foo.iter()
.enumerate()
.filter(|&(i, _)| i % 2 == 0)
.flat_map(|(_, row)| row.iter().map(|x| x * 2))
.collect();
If you ever want to use it with map instead of flat_map, you can combine the calls to filter and map by using filter_map which takes a function returning an Option and only keeps elements that are Some(thing).
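As an illustration of the filter_map variant, here is a small sketch that produces one value per even-indexed row instead of flattening. Summing the row is just an example operation chosen here, and even_row_sums is a hypothetical name:

```rust
/// Hypothetical helper: one sum per even-indexed row, odd rows are skipped.
fn even_row_sums(foo: &[Vec<i32>]) -> Vec<i32> {
    foo.iter()
        .enumerate()
        .filter_map(|(i, row)| {
            if i % 2 == 0 {
                // `Some` values are kept and unwrapped.
                Some(row.iter().sum::<i32>())
            } else {
                // `None` values are dropped from the output.
                None
            }
        })
        .collect()
}

fn main() {
    let foo = vec![vec![1, 2, 3], vec![4, 5, 6], vec![7, 8, 9]];
    println!("{:?}", even_row_sums(&foo));
}
```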

How to debug Kotlin sequences / collections

Take the following one-liner, which can be expressed as a series of operations on a collection or a sequence:
val nums = (10 downTo 1)
// .asSequence() if we want this to be a sequence
.filter { it % 2 == 0 }
.map { it * it }
.sorted()
// .toList() if declaring it a sequence
println(nums) // [4, 16, 36, 64, 100]
Let's say I want to see the elements at each step, they would be (from deduction):
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
[10, 8, 6, 4, 2]
[100, 64, 36, 16, 4]
[4, 16, 36, 64, 100]
Unfortunately, there's no good way to either debug this with a debugger or log these values for later inspection. With good functional programming constructs, entire methods can be rewritten as single statements like this but there seems to be no good way to inspect intermediate states, even counts (10, 5, 5, 5 here).
What's the best way to debug these?
You can log the intermediate values (lists) with
fun <T> T.log(): T { println(this); return this }
//USAGE:
val nums = (10 downTo 1)
.filter { it % 2 == 0 }.log()
.map { it * it }.log()
.sorted().log()
This will work as desired since in your example you work with collections, not sequences. For lazy Sequence you need:
// coming in 1.1
public fun <T> Sequence<T>.onEach(action: (T) -> Unit): Sequence<T> {
return map {
action(it)
it
}
}
fun <T> Sequence<T>.log() = onEach {print(it)}
//USAGE:
val nums = (10 downTo 1).asSequence()
.filter { it % 2 == 0 }
.map { it * it }.log()
.sorted()
.toList()
In the latest IntelliJ IDEA, when adding a breakpoint you have the option to set it on just the lambda body rather than on the whole expression.
Then in the debug itself you can see what is happening inside of your Lambda.
But this is not the only way. You can also use Run to cursor (Alt + F9).
I think the current correct answer is that you want the Kotlin Sequence Debugger plugin, which lets you use IntelliJ's lovely Java stream debugger with Kotlin sequences.
Note that (unless I'm doing something wrong) it doesn't appear to work with collections, so you will have to convert the collection to a sequence in order to debug it. Easy enough using Iterable.asSequence, and a small price to pay -- you can always revert that change once you are done debugging.
You can use the also inline function to log or print at any stage of the chain, as explained by Andrey Breslav at Google I/O '18:
(1..10)
.filter { it % 2 == 0 }
.also { e -> println(e) /* do your debug or print here */ }
.map { it * 2 }
.toList()

Using an iterator, how do I skip a number of values and then display the rest?

Random access to the elements is not allowed.
let vec = vec![1,2,3,4,5,6,7,8,9,0];
let n = 3;
for v in vec.iter().rev().take(n) {
println!("{}", v);
}
// this printed: 0, 9, 8
// need: 8, 9, 0
for v in vec.iter().rev().skip(n).rev() does not work.
I think the code you wrote does what you're asking it to.
You are reversing the vec with rev() and then taking the first 3 elements of the reversed sequence (hence 0, 9, 8).
To obtain the last 3 in non-reversed order you can skip to the end of the vector minus 3 elements, without reversing it:
let vec = vec![1,2,3,4,5,6,7,8,9,0];
let n = vec.len() - 3;
for v in vec.iter().skip(n) {
println!("{}", v);
}
When this was written, neither skip nor take yielded a DoubleEndedIterator, so you had to either:
skip, which is O(N) in the number of skipped items
collect the result of .rev().take(), and then rev it, which is O(N) in the number of items to be printed, and requires allocating memory for them
The skip is obvious, so let me illustrate the collect:
let vec = vec![1,2,3,4,5,6,7,8,9,0];
let vec: Vec<_> = vec.iter().rev().take(3).collect();
for v in vec.iter().rev() {
println!("{}", v);
}
Of course, the inefficiency is due to you shooting yourself in the foot by avoiding random access in the first place...
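In more recent Rust versions, Take does implement DoubleEndedIterator when the underlying iterator is both double-ended and exact-sized, which slice iterators are. Under that assumption, the collect step can be dropped. A minimal sketch, with last_n_in_order as a hypothetical name:

```rust
/// Hypothetical helper: the last `n` items in their original order,
/// using only iterator adapters (no indexing).
fn last_n_in_order(values: &[i32], n: usize) -> Vec<i32> {
    values
        .iter()
        .rev()     // walk from the back: 0, 9, 8, ...
        .take(n)   // keep the first n of those: 0, 9, 8
        .rev()     // flip them back: 8, 9, 0
        .copied()
        .collect()
}

fn main() {
    let vec = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 0];
    for v in last_n_in_order(&vec, 3) {
        println!("{}", v);
    }
}
```

The collect here only exists to return a value from the helper; in a plain for loop the adapter chain can be iterated directly.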
Based on the comments, I guess you want to iterate specifically through the elements of a Vec or slice. If that is the case, you could use range slicing, as shown below:
let vec = vec![1,2,3,4,5,6,7,8,9,0];
let n = vec.len() - 3;
for v in &vec[n..] {
println!("{}", v);
}
The big advantage of this approach is that it doesn't require skipping through elements you are not interested in (which may have a big cost if not optimized away). It just makes a new slice and then iterates through it. In other words, you have the guarantee that it will be fast.
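One caveat with vec.len() - 3 is that the subtraction underflows when the vector holds fewer than 3 elements. A small sketch that guards against this with saturating_sub; last_n is a hypothetical name:

```rust
/// Hypothetical helper: the last `n` elements as a slice, or the whole
/// slice when it holds fewer than `n` elements.
fn last_n(values: &[i32], n: usize) -> &[i32] {
    // `saturating_sub` clamps at 0 instead of underflowing.
    let start = values.len().saturating_sub(n);
    &values[start..]
}

fn main() {
    let vec = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 0];
    for v in last_n(&vec, 3) {
        println!("{}", v);
    }
}
```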

Conflicting lifetime requirement for iterator returned from function

This may be a duplicate. I don't know. I couldn't understand the other answers well enough to know that. :)
Rust version: rustc 1.0.0-nightly (b47aebe3f 2015-02-26) (built 2015-02-27)
Basically, I'm passing a bool to this function that's supposed to build an iterator that filters one way for true and another way for false. Then it kind of craps itself because it doesn't know how to keep that boolean value handy, I guess. I don't know. There are actually multiple lifetime problems here, which is discouraging because this is a really common pattern for me, since I come from a .NET background.
fn main() {
for n in values(true) {
println!("{}", n);
}
}
fn values(even: bool) -> Box<Iterator<Item=usize>> {
Box::new([3usize, 4, 2, 1].iter()
.map(|n| n * 2)
.filter(|n| if even {
n % 2 == 0
} else {
true
}))
}
Is there a way to make this work?
You have two conflicting issues, so let's break down a few representative pieces:
[3usize, 4, 2, 1].iter()
.map(|n| n * 2)
.filter(|n| n % 2 == 0)
Here, we create an array in the stack frame of the method, then get an iterator to it. Since we aren't allowed to consume the array, the iterator item is &usize. We then map from the &usize to a usize. Then we filter against a &usize - we aren't allowed to consume the filtered item, otherwise the iterator wouldn't have it to return!
The problem here is that we are ultimately rooted to the stack frame of the function. We can't return this iterator, because the array won't exist after the call returns!
To work around this for now, let's just make it static. Now we can focus on the issue with even.
filter takes a closure. Closures capture any variable used that isn't provided as an argument to the closure. By default, these variables are captured by reference. However, even is again a variable located on the stack frame. This time however, we can give it to the closure by using the move keyword. Here's everything put together:
fn main() {
for n in values(true) {
println!("{}", n);
}
}
static ITEMS: [usize; 4] = [3, 4, 2, 1];
fn values(even: bool) -> Box<dyn Iterator<Item = usize>> {
Box::new(ITEMS.iter()
.map(|n| n * 2)
.filter(move |n| if even {
n % 2 == 0
} else {
true
}))
}
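On compilers newer than the one in the question (impl Trait in return position was stabilized in Rust 1.26), the same idea can be written without the Box and without a static. This is only a sketch of an alternative, not what the answer above proposes: it assumes it is acceptable for the iterator to own its data, here by moving a Vec into it.

```rust
fn values(even: bool) -> impl Iterator<Item = usize> {
    // `into_iter` moves the Vec into the iterator, so nothing borrows
    // from this function's stack frame.
    vec![3usize, 4, 2, 1]
        .into_iter()
        .map(|n| n * 2)
        // `move` copies `even` into the closure instead of borrowing it.
        .filter(move |n| if even { n % 2 == 0 } else { true })
}

fn main() {
    for n in values(true) {
        println!("{}", n);
    }
}
```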