How do I chain operators over lists in Rust? Looking for an equivalent to Kotlin code

I have the following code in kotlin and I'm trying to find a rust equivalent, but don't understand the chaining mechanism in rust to convert.
val windowSize = 2
val result = listOf(1, 2, 3, 4, 5, 6)
.windowed(windowSize, 1) // [[1,2], [2,3], [3,4], [4,5], [5,6]]
.map { it.sum() } // [3, 5, 7, 9, 11]
.windowed(2, 1) // [[3,5], [5,7], [7,9], [9,11]]
.count { it[0] < it[1] } // 4
// result = 4, as there are 4 pairs whose first number is less than the second,
// when considering a sliding window over the original data of 2 items at a time.
It just takes a list of integers, splits them into pairs (but the windowSize will be a function parameter), sums those groups, splits the sums into pairs again, and finds where each second element is bigger than the previous, so finding increasing values over moving windows.
I'm converting this to the rust equivalent, but struggling to understand how to chain operations together.
What I've got so far is:
let input = [1, 2, 3, 4, 5, 6];
input.windows(2)
.map(|es| es.iter().sum())
// what goes here to do the next windows(2) operation?
.for_each(|x: u32| println!("{}", x));
I can "for_each" over the map to do things on the iteration, but I can't split it with another "windows()", or don't know the magic to make that possible. IntelliJ is showing me the return type from map is impl Iterator<Item=?>
Can anyone enlighten me please? I am an absolute beginner on rust, so this is undoubtedly to do with my understanding of the language as a whole.

The Itertools crate provides a reasonably convenient way to do this with the tuple_windows method.
use itertools::Itertools;
fn main() {
let input = [1i32, 2, 3, 4, 5, 6];
let output: usize = input
.windows(2)
.map(|es| es.iter().sum::<i32>())
.tuple_windows()
.filter(|(a, b)| a < b)
.count();
println!("{}", output);
}
The standard library does not have a way to do this without collecting the iterator first, which requires two passes through the data.

It is a bit convoluted to chain everything. You need to collect into a vec so you can access windows again. Then you can flat_map the windows to array references (taken from this other answer) to complete what you want to do:
fn main() {
let input = [1usize, 2, 3, 4, 5, 6];
let res = input
.windows(2)
.map(|es| es.iter().sum::<usize>())
.collect::<Vec<_>>()
.windows(2)
.flat_map(<[usize; 2]>::try_from)
.filter(|[a, b]| a < b)
.count();
println!("{}", res);
}
Note: the nightly array_windows feature, which uses const generics, makes the .flat_map(<[usize; 2]>::try_from) call unnecessary.

As stated in @Aiden4's answer, the best solution is to use itertools::tuple_windows. It is, however, possible to do this with just the standard library and without collecting into an intermediate vector, by using Iterator::scan:
fn main() {
let input = [1i32, 2, 3, 4, 5, 6];
let output: usize = input
.windows(2)
.map(|es| es.iter().sum())
.scan(0, |prev, cur| {
let res = (*prev, cur);
*prev = cur;
Some(res)
})
.skip(1)
.filter(|(a, b)| a < b)
.count();
println!("{}", output);
}

Using std and stable only:
fn main() {
let input = [1i32, 2, 3, 4, 5, 6];
let mut iter = input.windows(2).map(|es| es.iter().sum::<i32>());
let n = if let Some(mut prev) = iter.next() {
iter.map(|i| {
let ret = (prev, i);
prev = i;
ret
})
.filter(|(a, b)| a < b)
.count()
} else {
0
};
println!("{}", n);
}
This should be very fast: it makes a single pass over the data and allocates nothing.
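The question mentions that windowSize will eventually be a function parameter. As a minimal sketch of that, assuming a hypothetical helper named count_increasing (not from any of the answers above), the iterator of sums can be zipped with a clone of itself shifted by one element. This stays within std and stable Rust and allocates nothing, at the cost of computing each window sum twice:

```rust
/// Hypothetical helper: counts how often the sum of a sliding window
/// increases from one window to the next.
fn count_increasing(input: &[i32], window_size: usize) -> usize {
    // `windows` panics on a size of 0, so treat that case as "no windows".
    if window_size == 0 {
        return 0;
    }
    let sums = input.windows(window_size).map(|w| w.iter().sum::<i32>());
    // Pair each sum with its successor by zipping the iterator with a
    // clone of itself that is shifted by one element.
    sums.clone().zip(sums.skip(1)).filter(|(a, b)| a < b).count()
}

fn main() {
    let input = [1, 2, 3, 4, 5, 6];
    println!("{}", count_increasing(&input, 2));
}
```

The clone works here because the closure passed to map captures nothing; with a closure that captures non-cloneable state, one of the other approaches would be needed.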

Related

Is there a way to deplete an iterator after calling Iterator::take? [duplicate]

I find it odd that Iterator::take_while takes ownership of the iterator. It seems like a useful feature to be able to take the first x elements which satisfy some function but still leave the rest of the elements available in the original iterator.
I understand that this is incompatible with a lazy implementation of take_while, but still feels useful. Was this just judged not useful enough to include in the standard library, or is there some other problem I'm not seeing?
All the iterator adapters take the original iterator by value for efficiency's sake. Additionally, taking ownership of the original iterator avoids having to deal with lifetimes when it isn't necessary.
If you wish to retain access to the original iterator, you can use by_ref. This introduces one level of indirection, but the programmer chooses to opt into the extra work when the feature is needed:
fn main() {
let v = [1, 2, 3, 4, 5, 6, 7, 8];
let mut i1 = v.iter();
for z in i1.by_ref().take_while(|&&v| v < 4) {
// ^^^^^^^^^
println!("Take While: {}", z);
}
for z in i1 {
println!("Rest: {}", z);
}
}
This has the output:
Take While: 1
Take While: 2
Take While: 3
Rest: 5
Rest: 6
Rest: 7
Rest: 8
Iterator::by_ref works because there's an implementation of Iterator for any mutable reference to an iterator:
impl<I> Iterator for &mut I
where
I: Iterator + ?Sized,
This means that you can also take a mutable reference. The parentheses are needed for precedence:
for z in (&mut i1).take_while(|&&v| v < 4)
Did you notice that 4 was missing? That's because once take_while picks a value and decides to not use it, there's nowhere for it to "put it back". Putting it back would require opting into more storage and slowness than is always needed.
I've used the itertools crate to handle cases like this, specifically take_while_ref:
use itertools::Itertools; // 0.9.0
fn main() {
let v = [1, 2, 3, 4, 5, 6, 7, 8];
let mut i1 = v.iter();
for z in i1.take_while_ref(|&&v| v < 4) {
// ^^^^^^^^^^^^^^^
println!("Take While: {}", z);
}
for z in i1 {
println!("Rest: {}", z);
}
}
Take While: 1
Take While: 2
Take While: 3
Rest: 4
Rest: 5
Rest: 6
Rest: 7
Rest: 8
Note that 4 is present here.
If it's getting too complicated, we may be using the wrong tool.
fn main() {
let v = [1, 2, 3, 4, 5, 6, 7, 8];
let mut i1 = v.iter().peekable();
while let Some(z) = i1.next_if(|&n| n < &4) {
println!("Take While: {z}");
}
for z in i1 {
println!("Rest: {z}");
}
}
Take While: 1
Take While: 2
Take While: 3
Rest: 4
Rest: 5
Rest: 6
Rest: 7
Rest: 8
Yes, the OP asked for take_while and Shepmaster's solution is superb.
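For the plain Iterator::take named in the title, the lost-element problem does not arise, because take counts items instead of testing them. A minimal sketch, assuming a hypothetical helper named split_first_n that is only here for illustration:

```rust
/// Hypothetical helper: splits a slice into its first `n` items and the rest
/// by driving a single iterator in two stages.
fn split_first_n(values: &[i32], n: usize) -> (Vec<i32>, Vec<i32>) {
    let mut iter = values.iter().copied();
    // `by_ref` lends the iterator to `take`, so `iter` stays usable afterwards.
    let first: Vec<i32> = iter.by_ref().take(n).collect();
    // `take` stops after `n` items without pulling an extra one,
    // so the remainder starts exactly where `first` ended.
    let rest: Vec<i32> = iter.collect();
    (first, rest)
}

fn main() {
    let (first, rest) = split_first_n(&[1, 2, 3, 4, 5, 6, 7, 8], 3);
    println!("First: {:?}", first);
    println!("Rest: {:?}", rest);
}
```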

Implement a pairwise iterator

I have trouble writing code for a function that takes an iterator and returns an iterator that iterates in pairs (Option<T>, T) like so
a = [1,2,3]
assert pairwise(a) == [(None, 1), (Some(1), 2), (Some(2), 3)]
fn pairwise<I, T>(&xs: &I) -> I
where
I: Iterator<Item = T>,
{
[None].iter().chain(xs.iter().map(Some)).zip(xs.iter())
}
fn main() {
let data: Vec<i32> = vec![1, 2, 3];
let newdata: Vec<Option<i32>, i32> = pairwise(&data).collect();
println!("{:?}", newdata);
}
error[E0599]: no method named `iter` found for type `I` in the current scope
--> src/main.rs:3:28
|
3 | [None].iter().chain(xs.iter().map(Some)).zip(xs.iter())
| ^^^^
|
Not sure why xs isn't iterable. I've stated it in the where clause haven't I?
fn pairwise<I, T>(&xs: &I) -> I
This doesn't make sense. See What is the correct way to return an Iterator (or any other trait)? and What is the difference between `e1` and `&e2` when used as the for-loop variable?.
I: Iterator<Item = T>,
There's no reason to specify that the Item is a T.
[None].iter()
It's better to use iter::once.
xs.iter()
There's no trait in the standard library that defines an iter method. Perhaps you meant IntoIterator?
let data: Vec<i32> = vec![1, 2, 3]
There's no reason to specify the type here; i32 is the default integral type.
Vec<Option<i32>, i32>
Vec<Option<i32>, i32>> // original version
This is not a valid type for Vec, and your original form doesn't even have balanced symbols.
After all that, you are faced with tough choices. Your example code passes in an iterator which has references to the slice but you've written your assertion such that you expect to get non-references back. You've also attempted to use an arbitrary iterator twice; there's no guarantee that such a thing is viable.
The most generic form I see is:
use std::iter;
fn pairwise<I>(right: I) -> impl Iterator<Item = (Option<I::Item>, I::Item)>
where
I: IntoIterator + Clone,
{
let left = iter::once(None).chain(right.clone().into_iter().map(Some));
left.zip(right)
}
fn main() {
let data = vec![1, 2, 3];
let newdata: Vec<_> = pairwise(&data).collect();
assert_eq!(newdata, [(None, &1), (Some(&1), &2), (Some(&2), &3)]);
let newdata: Vec<_> = pairwise(data.iter().copied()).collect();
assert_eq!(newdata, [(None, 1), (Some(1), 2), (Some(2), 3)]);
}
See also:
Iterating over a slice's values instead of references in Rust?
How to iterate over and filter an array?
How to create a non consuming iterator from a Vector
Why can I iterate over a slice twice, but not a vector?
The compiler suggests I add a 'static lifetime because the parameter type may not live long enough, but I don't think that's what I want
What is the correct way to return an Iterator (or any other trait)?
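If the input cannot be cloned or iterated twice, one alternative is to remember the previous item while walking the iterator once. The sketch below is an assumption-laden variant, not the answer's method: the name pairwise_single_pass is hypothetical, and it trades the I: Clone bound for an I::Item: Clone bound.

```rust
/// Hypothetical single-pass variant: yields `(previous, current)` pairs,
/// with `None` as the previous value for the first item.
fn pairwise_single_pass<I>(iter: I) -> impl Iterator<Item = (Option<I::Item>, I::Item)>
where
    I: IntoIterator,
    I::Item: Clone,
{
    // The closure owns `prev` and updates it on every step.
    let mut prev: Option<I::Item> = None;
    iter.into_iter().map(move |cur| {
        // `take` moves the stored previous item out, leaving `None` behind.
        let pair = (prev.take(), cur.clone());
        prev = Some(cur);
        pair
    })
}

fn main() {
    let data = vec![1, 2, 3];
    let pairs: Vec<_> = pairwise_single_pass(data).collect();
    println!("{:?}", pairs);
}
```

Each item is cloned once, which is cheap for integers or references but may matter for larger owned values.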
I know OP asked for "outer pairwise" ([(None, 1), (Some(1), 2), (Some(2), 3)]), but here is how I adapted it for "inner pairwise" ([(1, 2), (2, 3)]):
fn inner_pairwise<I>(right: I) -> impl Iterator<Item = (I::Item, I::Item)>
where
I: IntoIterator + Clone,
{
let left = right.clone().into_iter().skip(1);
left.zip(right)
}
For anyone here for "inner pairwise", you're looking for Itertools::tuple_windows.

How to debug Kotlin sequences / collections

Take the following one-liner, which can be expressed as a series of operations on a collection or a sequence:
val nums = (10 downTo 1)
// .asSequence() if we want this to be a sequence
.filter { it % 2 == 0 }
.map { it * it }
.sorted()
// .toList() if declaring it a sequence
println(nums) // [4, 16, 36, 64, 100]
Let's say I want to see the elements at each step, they would be (from deduction):
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
[10, 8, 6, 4, 2]
[100, 64, 36, 16, 4]
[4, 16, 36, 64, 100]
Unfortunately, there's no good way to either debug this with a debugger or log these values for later inspection. With good functional programming constructs, entire methods can be rewritten as single statements like this but there seems to be no good way to inspect intermediate states, even counts (10, 5, 5, 5 here).
What's the best way to debug these?
You can log the intermediate values (lists) with
fun <T> T.log(): T { println(this); return this }
//USAGE:
val nums = (10 downTo 1)
.filter { it % 2 == 0 }.log()
.map { it * it }.log()
.sorted().log()
This will work as desired since in your example you work with collections, not sequences. For lazy Sequence you need:
// coming in 1.1
public fun <T> Sequence<T>.onEach(action: (T) -> Unit): Sequence<T> {
return map {
action(it)
it
}
}
fun <T> Sequence<T>.log() = onEach {print(it)}
//USAGE:
val nums = (10 downTo 1).asSequence()
.filter { it % 2 == 0 }
.map { it * it }.log()
.sorted()
.toList()
In the latest IntelliJ IDEA, when adding a breakpoint you have the option to set it on the lambda body only, rather than on the whole expression.
Then, in the debugger, you can see what is happening inside your lambda.
But this is not the only way. You can also use Run to cursor (Alt + F9).
I think the current correct answer is that you want the Kotlin Sequence Debugger plugin, which lets you use IntelliJ's lovely Java stream debugger with Kotlin sequences.
Note that (unless I'm doing something wrong) it doesn't appear to work with collections, so you will have to convert the collection to a sequence in order to debug it. Easy enough using Iterable.asSequence, and a small price to pay -- you can always revert that change once you are done debugging.
You may use the also inline function to log or print at any stage of the chain, as explained by Andrey Breslav at Google I/O '18:
(1..10)
.filter { it % 2 == 0 }
.also { e -> println(e) /* do your debug or print here */ }
.map { it * 2 }
.toList()

Using an iterator, how do I skip a number of values and then display the rest?

Random access to the elements is not allowed.
let vec = vec![1,2,3,4,5,6,7,8,9,0];
let n = 3;
for v in vec.iter().rev().take(n) {
println!("{}", v);
}
// this printed: 0, 9, 8
// need: 8, 9, 0
for v in vec.iter().rev().skip(n).rev() does not work.
I think the code you wrote does what you're asking it to.
You are reversing the vec with rev() and then taking the first 3 elements of the reversed vector (therefore 0, 9, 8).
To obtain the last 3 in non-reversed order you can skip to the end of the vector minus 3 elements, without reversing it:
let vec = vec![1,2,3,4,5,6,7,8,9,0];
let n = vec.len() - 3;
for v in vec.iter().skip(n) {
println!("{}", v);
}
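When this was written, neither skip nor take yielded a DoubleEndedIterator (more recent Rust versions do implement it for both when the underlying iterator is also an ExactSizeIterator). Without that, you have to either:
skip, which is O(N) in the number of skipped items
collect the result of .rev().take(), and then rev it, which is O(N) in the number of items to be printed, and requires allocating memory for them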
The skip is obvious, so let me illustrate the collect:
let vec = vec![1,2,3,4,5,6,7,8,9,0];
let vec: Vec<_> = vec.iter().rev().take(3).collect();
for v in vec.iter().rev() {
println!("{}", v);
}
Of course, the inefficiency is due to you shooting yourself in the foot by avoiding random access in the first place...
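On recent toolchains the intermediate collect may not be needed. The sketch below assumes a Rust version in which Take implements DoubleEndedIterator for iterators that are both double-ended and exact-size (to my knowledge, 1.38 or later); the helper name last_n_in_order is hypothetical and it collects only so the result is easy to print.

```rust
/// Hypothetical helper: returns the last `n` values in their original order
/// using only iterator adapters (no indexing).
fn last_n_in_order(values: &[i32], n: usize) -> Vec<i32> {
    values
        .iter()
        .rev() // walk from the back
        .take(n) // keep the last `n` items (in reversed order)
        .rev() // flip them back; assumes `Take` is double-ended here
        .copied()
        .collect()
}

fn main() {
    let vec = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 0];
    println!("{:?}", last_n_in_order(&vec, 3));
}
```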
Based on the comments, I guess you want to iterate specifically through the elements of a Vec or slice. If that is the case, you could use range slicing, as shown below:
let vec = vec![1,2,3,4,5,6,7,8,9,0];
let n = vec.len() - 3;
for v in &vec[n..] {
println!("{}", v);
}
The big advantage of this approach is that it doesn't require skipping through elements you are not interested in (which may have a big cost if not optimized away). It will just make a new slice and then iterate through it. In other words, you have the guarantee that it will be fast.
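One caveat with computing the start index as len() - 3 is that the subtraction underflows (and panics in debug builds) when the vector holds fewer than three elements. A small sketch that guards against this, using a hypothetical helper named last_n:

```rust
/// Hypothetical helper: borrows the last `n` items of a slice,
/// or the whole slice if it is shorter than `n`.
fn last_n<T>(items: &[T], n: usize) -> &[T] {
    // `saturating_sub` clamps at 0 instead of underflowing.
    let start = items.len().saturating_sub(n);
    &items[start..]
}

fn main() {
    let vec = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 0];
    for v in last_n(&vec, 3) {
        println!("{}", v);
    }
}
```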

Return a moving window of elements resulting from an iterator of Vec<u8>

I'm trying to figure out how to return a window of elements from a vector that I've first filtered without copying it to a new vector.
So this is the naive approach which works fine but I think will end up allocating a new vector from line 5 which I don't really want to do.
let mut buf = Vec::new();
file.read_to_end(&mut buf);
// Do some filtering of the read file and create a new vector for subsequent processing
let iter = buf.iter().filter(|&x| *x != 10 && *x != 13);
let clean_buf = Vec::from_iter(iter);
for iter in clean_buf.windows(13) {
print!("{}",iter.len());
}
Alternative approach where I could use a chain()? to achieve the same thing without copying into a new Vec
for iter in buf.iter().filter(|&x| *x != 10 && *x != 13) {
let window = ???
}
You can use Vec::retain instead of filter for this, which allows you to keep your Vec:
fn main() {
let mut buf = vec![
8, 9, 10, 11, 12, 13, 14,
8, 9, 10, 11, 12, 13, 14,
8, 9, 10, 11, 12, 13, 14,
];
println!("{:?}", buf);
buf.retain(|&x| x != 10 && x != 13);
println!("{:?}", buf);
for iter in buf.windows(13) {
print!("{}, ", iter.len());
}
println!("");
}
I don't see how this would be possible. You say:
elements from a vector that I've first filtered
But once you've filtered a vector, you don't have a vector anymore - you just have an Iterator. Iterators only have the concept of the next item.
To be most efficient, you'd have to create a small buffer of the size of your window. Unfortunately, you cannot write an iterator that returns a reference to itself, so you'd have to pass in a buffer to a hypothetical Iterator::windows method. In that case, you'd run into the problem of having a mutable reference (so you could populate the buffer) and an immutable reference (so you could return a slice), which won't fly.
The only close solution I can think of is to have multiple iterators over the same vector that you then zip together:
fn main() {
let nums: Vec<u8> = (1..100).collect();
fn is_even(x: &&u8) -> bool { **x % 2 == 0 }
let a = nums.iter().filter(is_even);
let b = nums.iter().filter(is_even).skip(1);
let c = nums.iter().filter(is_even).skip(2);
for z in a.zip(b).zip(c).map(|((a, b), c)| (a,b,c)) {
println!("{:?}", z);
}
}
This has the distinct downside of needing to apply the filtering condition multiple times, and the ugliness of the nested zips (you can fix the latter with use of itertools though).
Personally, I'd probably just collect into a Vec, as you have already done.
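If avoiding the full copy really matters, the borrowing problem described above can be sidestepped by inverting control: instead of returning windows from an iterator, pass each window to a callback. The following is only a sketch under that assumption; the name for_each_window is hypothetical, and it keeps a buffer of just the window size in a VecDeque.

```rust
use std::collections::VecDeque;

/// Hypothetical helper: calls `f` with every full window of `size` items,
/// buffering only `size` items at a time.
fn for_each_window<I, F>(iter: I, size: usize, mut f: F)
where
    I: IntoIterator<Item = u8>,
    F: FnMut(&[u8]),
{
    if size == 0 {
        return;
    }
    let mut buf: VecDeque<u8> = VecDeque::with_capacity(size);
    for item in iter {
        // Drop the oldest item once the buffer is full.
        if buf.len() == size {
            buf.pop_front();
        }
        buf.push_back(item);
        if buf.len() == size {
            // `make_contiguous` may shuffle items inside the ring buffer
            // so that they can be viewed as one slice.
            let window: &[u8] = buf.make_contiguous();
            f(window);
        }
    }
}

fn main() {
    let buf: Vec<u8> = vec![8, 9, 10, 11, 12, 13, 14, 8, 9];
    let filtered = buf.iter().copied().filter(|&x| x != 10 && x != 13);
    for_each_window(filtered, 3, |w| println!("{:?}", w));
}
```

The window slice is only valid for the duration of each callback, which is what lets the buffer be reused; whether this beats simply collecting into a Vec would need measuring.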