How do I convert a list of Option<T> to a list of T when T cannot be copied? [duplicate]

This question already has an answer here:
How do I avoid unwrap when converting a vector of Options or Results to only the successful values?
Closed 4 years ago.
How do I take a Vec<Option<T>>, where T cannot be copied, and unwrap all the Some values?
I run into an error in the map step. I'm happy to move ownership of the original list and "throw away" the Nones.
#[derive(Debug)]
struct Uncopyable {
    val: u64,
}
fn main() {
    let num_opts: Vec<Option<Uncopyable>> = vec![
        Some(Uncopyable { val: 1 }),
        Some(Uncopyable { val: 2 }),
        None,
        Some(Uncopyable { val: 4 }),
    ];
    let nums: Vec<Uncopyable> = num_opts
        .iter()
        .filter(|x| x.is_some())
        .map(|&x| x.unwrap())
        .collect();
    println!("nums: {:?}", nums);
}
Which gives the error
error[E0507]: cannot move out of borrowed content
--> src/main.rs:17:15
|
17 | .map(|&x| x.unwrap())
| ^-
| ||
| |hint: to prevent move, use `ref x` or `ref mut x`
| cannot move out of borrowed content

In Rust, when you need a value, you generally either move the elements or clone them.
Since moving is more general, here is that version; only two changes are necessary:
let nums: Vec<Uncopyable> = num_opts
    .into_iter() // consume the vector and iterate by value
    .filter(|x| x.is_some())
    .map(|x| x.unwrap()) // take each element by value
    .collect();
As llogiq points out, filter_map is specialized to filter out None already:
let nums: Vec<Uncopyable> = num_opts
    .into_iter() // consume the vector and iterate by value
    .filter_map(|x| x) // take each element by value
    .collect();
And then it works (consuming num_opts).
As pointed out by @nirvana-msu, Rust 1.33 added std::convert::identity, which can be used instead of |x| x. From the documentation:
let filtered = iter.filter_map(identity).collect::<Vec<_>>();

You don't need to copy the Uncopyable at all, if you are OK with using a Vec of references into the original Vec:
// Note the & before Uncopyable in the type annotation.
let nums: Vec<&Uncopyable> = num_opts.iter().filter_map(|x| x.as_ref()).collect();
This may not do the trick for you if you have to work with an API that requires &[Uncopyable]. In that case, use Matthieu M.'s solution which can be reduced to:
let nums: Vec<Uncopyable> = num_opts.into_iter().filter_map(|x| x).collect();
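Because Option implements IntoIterator, the same filtering can also be written with Iterator::flatten. The sketch below is an alternative to the answers above, not a quotation of them; the function names unwrap_owned and unwrap_borrowed are made up for illustration.

```rust
#[derive(Debug, PartialEq)]
struct Uncopyable {
    val: u64,
}

// Hypothetical helper: consumes the vector, moving every `Some(x)` out
// and dropping every `None`.
fn unwrap_owned(opts: Vec<Option<Uncopyable>>) -> Vec<Uncopyable> {
    opts.into_iter().flatten().collect()
}

// Hypothetical helper: borrows the slice, so the result holds references
// into `opts` and nothing is moved or copied.
fn unwrap_borrowed(opts: &[Option<Uncopyable>]) -> Vec<&Uncopyable> {
    opts.iter().flatten().collect()
}
```

Calling unwrap_borrowed(&num_opts) leaves num_opts usable afterwards, while unwrap_owned(num_opts) consumes it.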


How can I have multiple iterators to the same data pertaining to a file?

I have a file that I wish to read and filter the data into two different sets and determine the number of items in each set.
use std::io::{self, BufRead};
fn main() {
    let cursor = io::Cursor::new(b"pillow\nbrick\r\nphone");
    let lines = cursor.lines().map(|l| l.unwrap());
    let soft_count = lines.filter(|line| line.contains("pillow")).count();
    let hard_count = lines.filter(|line| !line.contains("pillow")).count();
}
However, the borrow checker gives me an error:
error[E0382]: use of moved value: `lines`
--> src/main.rs:14:22
|
8 | let lines = cursor.lines().map(|l| l.unwrap());
| ----- move occurs because `lines` has type `std::iter::Map<std::io::Lines<std::io::Cursor<&[u8; 19]>>, [closure@src/main.rs:8:36: 8:50]>`, which does not implement the `Copy` trait
9 |
10 | let soft_count = lines
| ----- value moved here
...
14 | let hard_count = lines
| ^^^^^ value used here after move
I tried getting around this using reference counting to allow multiple ownership:
use std::io::{self, BufRead};
use std::rc::Rc;
fn main() {
    let cursor = io::Cursor::new(b"pillow\nbrick\r\nphone");
    let lines = Rc::new(cursor.lines().map(|l| l.unwrap()));
    let soft_count = Rc::clone(&lines)
        .filter(|line| line.contains("pillow"))
        .count();
    let hard_count = Rc::clone(&lines)
        .filter(|line| !line.contains("pillow"))
        .count();
}
I get a similar error message:
error[E0507]: cannot move out of an `Rc`
--> src/main.rs:11:22
|
11 | let soft_count = Rc::clone(&lines)
| ^^^^^^^^^^^^^^^^^ move occurs because value has type `std::iter::Map<std::io::Lines<std::io::Cursor<&[u8; 19]>>, [closure@src/main.rs:9:44: 9:58]>`, which does not implement the `Copy` trait
error[E0507]: cannot move out of an `Rc`
--> src/main.rs:15:22
|
15 | let hard_count = Rc::clone(&lines)
| ^^^^^^^^^^^^^^^^^ move occurs because value has type `std::iter::Map<std::io::Lines<std::io::Cursor<&[u8; 19]>>, [closure@src/main.rs:9:44: 9:58]>`, which does not implement the `Copy` trait
You cannot. Instead, you will need to clone the iterator, or some building block of it. In this case, the highest-level thing you can clone is the Cursor:
use std::io::{self, BufRead};
fn main() {
    let cursor = io::Cursor::new(b"pillow\nbrick\r\nphone");
    let lines = cursor.clone().lines().map(|l| l.unwrap());
    let lines2 = cursor.lines().map(|l| l.unwrap());
    let soft_count = lines.filter(|line| line.contains("pillow")).count();
    let hard_count = lines2.filter(|line| !line.contains("pillow")).count();
}
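For an actual File, you will need to use try_clone, because duplicating a file handle can fail. With the Cursor, both copies refer to the same underlying data and only the iteration state is duplicated. Handles produced by try_clone, by contrast, share a single file position, so reading through one advances the other.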
For your specific case, you don't need any of this. In fact, iterating over the data twice is inefficient. The simplest built-in thing you can do is to partition the iterator:
let (softs, hards): (Vec<_>, Vec<_>) = lines.partition(|line| line.contains("pillow"));
let soft_count = softs.len();
let hard_count = hards.len();
This is still a bit inefficient as you don't need the actual values. You could create your own type that implements Extend and discards the values:
#[derive(Debug, Default)]
struct Count(usize);
impl<T> std::iter::Extend<T> for Count {
    fn extend<I>(&mut self, iter: I)
    where
        I: IntoIterator,
    {
        self.0 += iter.into_iter().count();
    }
}
let (softs, hards): (Count, Count) = lines.partition(|line| line.contains("pillow"));
let soft_count = softs.0;
let hard_count = hards.0;
You could also just use a for loop or build something on top of fold:
let (soft_count, hard_count) = lines.fold((0, 0), |mut state, line| {
    if line.contains("pillow") {
        state.0 += 1;
    } else {
        state.1 += 1;
    }
    state
});
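The for loop mentioned above is not spelled out, so here is a minimal sketch of it. The function name count_soft_hard and its generic BufRead parameter are assumptions made for illustration; the counting logic mirrors the fold version.

```rust
use std::io::BufRead;

// Hypothetical helper: counts lines that contain "pillow" and lines that do not,
// in a single pass. Panics on a read error, mirroring the `unwrap` in the question.
fn count_soft_hard<R: BufRead>(reader: R) -> (usize, usize) {
    let mut soft_count = 0;
    let mut hard_count = 0;
    for line in reader.lines() {
        let line = line.unwrap();
        if line.contains("pillow") {
            soft_count += 1;
        } else {
            hard_count += 1;
        }
    }
    (soft_count, hard_count)
}
```

Passing the Cursor from the question to this function should give one soft line and two hard lines, without cloning anything or allocating the two intermediate vectors.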

Conditionally return empty iterator from flat_map

Given this definition for foo:
let foo = vec![vec![1, 2, 3], vec![4, 5, 6], vec![7, 8, 9]];
I'd like to be able to write code like this:
let result: Vec<_> = foo.iter()
    .enumerate()
    .flat_map(|(i, row)| if i % 2 == 0 {
        row.iter().map(|x| x * 2)
    } else {
        std::iter::empty()
    })
    .collect();
but that raises an error about the if and else clauses having incompatible types. I tried removing the map temporarily and I tried defining an empty vector outside the closure and returning an iterator over that like so:
let empty = vec![];
let result: Vec<_> = foo.iter()
    .enumerate()
    .flat_map(|(i, row)| if i % 2 == 0 {
        row.iter() //.map(|x| x * 2)
    } else {
        empty.iter()
    })
    .collect();
This seems kind of silly but it compiles. If I try to uncomment the map then it still complains about the if and else clauses having incompatible types. Here's part of the error message:
error[E0308]: if and else have incompatible types
--> src/main.rs:6:30
|
6 | .flat_map(|(i, row)| if i % 2 == 0 {
| ______________________________^
7 | | row.iter().map(|x| x * 2)
8 | | } else {
9 | | std::iter::empty()
10 | | })
| |_________^ expected struct `std::iter::Map`, found struct `std::iter::Empty`
|
= note: expected type `std::iter::Map<std::slice::Iter<'_, {integer}>, [closure@src/main.rs:7:28: 7:37]>`
found type `std::iter::Empty<_>`
I know I could write something that does what I want with some nested for loops but I'd like to know if there's a terse way to write it using iterators.
Rust is statically typed, and each step in an iterator chain produces a new type that wraps the previous ones. Unless you use boxed trait objects, you therefore have to write the closure so that both branches produce the same type.
One way to convey conditional emptiness with a single type is the TakeWhile iterator implementation.
.flat_map(|(i, row)| {
    let iter = row.iter().map(|x| x * 2);
    let take = i % 2 == 0;
    iter.take_while(|_| take)
})
If you don't mind ignoring the edge case where the iterator being limited could yield more than usize::MAX elements, you could also use Take instead, with a limit of either 0 or usize::MAX. It has the advantage of providing a better size_hint() than TakeWhile.
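The Take variant described above could look roughly like this. It is a sketch under the assumption that the rows hold i32 values; the function name double_even_rows is made up for illustration.

```rust
// Hypothetical helper: doubles the values of even-indexed rows and
// skips odd-indexed rows entirely.
fn double_even_rows(foo: &[Vec<i32>]) -> Vec<i32> {
    foo.iter()
        .enumerate()
        .flat_map(|(i, row)| {
            // `take(0)` yields nothing; `take(usize::MAX)` is effectively unlimited.
            let limit = if i % 2 == 0 { usize::MAX } else { 0 };
            // Both cases now have the same type: Take<Map<...>>.
            row.iter().map(|x| x * 2).take(limit)
        })
        .collect()
}
```

Since the limit is computed before the adapter is built, the if/else only chooses a number, and the iterator type is identical in both cases.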
In your specific example, you can use filter to remove unwanted elements prior to calling flat_map:
let result: Vec<_> = foo.iter()
    .enumerate()
    .filter(|&(i, _)| i % 2 == 0)
    .flat_map(|(_, row)| row.iter().map(|x| x * 2))
    .collect();
If you ever want to use it with map instead of flat_map, you can combine the calls to filter and map by using filter_map which takes a function returning an Option and only keeps elements that are Some(thing).
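For completeness, the boxed trait object route mentioned earlier can be sketched as follows. This is one possible shape rather than a recommendation; the function name double_even_rows_boxed and the i32 element type are assumptions for illustration.

```rust
// Hypothetical helper: both branches are erased to the same boxed trait object,
// at the cost of one heap allocation and dynamic dispatch per row.
fn double_even_rows_boxed<'a>(foo: &'a [Vec<i32>]) -> Vec<i32> {
    foo.iter()
        .enumerate()
        .flat_map(|(i, row)| {
            let iter: Box<dyn Iterator<Item = i32> + 'a> = if i % 2 == 0 {
                Box::new(row.iter().map(|x| x * 2))
            } else {
                Box::new(std::iter::empty::<i32>())
            };
            iter
        })
        .collect()
}
```

The explicit type on the let binding is what lets each Box coerce to the trait object; without it, the two branches would again have incompatible types.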

Why does .flat_map() with .chars() not work with std::io::Lines, but does with a vector of Strings?

I am trying to iterate over characters in stdin. The Read.chars() method achieves this goal, but is unstable. The obvious alternative is to use BufRead.lines() with a flat_map to convert it to a character iterator.
This seems like it should work, but doesn't, resulting in "borrowed value does not live long enough" errors.
use std::io::BufRead;
fn main() {
    let stdin = std::io::stdin();
    let mut lines = stdin.lock().lines();
    let mut chars = lines.flat_map(|x| x.unwrap().chars());
}
This is mentioned in Read file character-by-character in Rust, but it doesn't really explain why.
What I am particularly confused about is how this differs from the example in the documentation for flat_map, which uses flat_map to apply .chars() to a vector of strings. I don't really see how that should be any different. The main difference I see is that my code needs to call unwrap() as well, but changing the last line to the following does not work either:
let mut chars = lines.map(|x| x.unwrap());
let mut chars = chars.flat_map(|x| x.chars());
It fails on the second line, so the issue doesn't appear to be the unwrap.
Why does this last line not work, when the very similar line in the documentation does? Is there any way to get this to work?
Start by figuring out what the type of the closure's variable is:
let mut chars = lines.flat_map(|x| {
    let () = x;
    x.unwrap().chars()
});
The resulting type-mismatch error shows that it's a Result<String, io::Error>. After unwrapping it, it will be a String.
Next, look at str::chars:
pub fn chars(&self) -> Chars<'_>
And the definition of Chars:
pub struct Chars<'a> {
    // some fields omitted
}
From that, we can tell that calling chars on a string returns an iterator that has a reference to the string.
Whenever we have a reference, we know that the reference cannot outlive the thing that it is borrowed from. In this case, x.unwrap() is the owner. The next thing to check is where that ownership ends. In this case, the closure owns the String, so at the end of the closure, the value is dropped and any references are invalidated.
Except the code tried to return a Chars that still referred to the string. Oops. Thanks to Rust, the code didn't segfault!
The difference with the example that works is all in the ownership. In that case, the strings are owned by a vector outside of the loop and they do not get dropped before the iterator is consumed. Thus there are no lifetime issues.
What this code really wants is an into_chars method on String. That iterator could take ownership of the value and return characters.
Not the maximum efficiency, but a good start:
struct IntoChars {
    s: String,
    offset: usize,
}
impl IntoChars {
    fn new(s: String) -> Self {
        IntoChars { s, offset: 0 }
    }
}
impl Iterator for IntoChars {
    type Item = char;
    fn next(&mut self) -> Option<Self::Item> {
        let remaining = &self.s[self.offset..];
        match remaining.chars().next() {
            Some(c) => {
                self.offset += c.len_utf8();
                Some(c)
            }
            None => None,
        }
    }
}
use std::io::BufRead;
fn main() {
    let stdin = std::io::stdin();
    let lines = stdin.lock().lines();
    let chars = lines.flat_map(|x| IntoChars::new(x.unwrap()));
    for c in chars {
        println!("{}", c);
    }
}
See also:
How can I store a Chars iterator in the same struct as the String it is iterating on?
Is there an owned version of String::chars?
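If a custom iterator type feels like too much, a simpler alternative (not the approach from the answer above) is to collect each line's characters into a Vec inside the closure, so that the returned value owns its data. The function name chars_of_lines is made up for illustration, and the sketch trades one allocation per line for brevity.

```rust
use std::io::BufRead;

// Hypothetical helper: yields every character of every line. Line terminators are
// already stripped by `lines`. Panics on a read error, as the original code does.
fn chars_of_lines<R: BufRead>(reader: R) -> impl Iterator<Item = char> {
    reader
        .lines()
        // The Vec owns the characters, so nothing borrows the temporary String
        // once the closure returns.
        .flat_map(|line| line.unwrap().chars().collect::<Vec<char>>())
}
```

With stdin this would be called as chars_of_lines(stdin.lock()), in the same position as the IntoChars version.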

Using an iterator, how do I skip a number of values and then display the rest?

Random access to the elements is not allowed.
let vec = vec![1,2,3,4,5,6,7,8,9,0];
let n = 3;
for v in vec.iter().rev().take(n) {
    println!("{}", v);
}
// this printed: 0, 9, 8
// need: 8, 9, 0
for v in vec.iter().rev().skip(n).rev() does not work.
I think the code you wrote does what you're asking it to.
You are reversing the vec with rev() and then taking the first 3 elements of the reversed sequence (therefore 0, 9, 8).
To obtain the last 3 in non-reversed order you can skip to the end of the vector minus 3 elements, without reversing it:
let vec = vec![1,2,3,4,5,6,7,8,9,0];
let n = vec.len() - 3;
for v in vec.iter().skip(n) {
    println!("{}", v);
}
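When this was written, neither skip nor take yielded a DoubleEndedIterator (newer Rust releases implement it for both, provided the underlying iterator is double-ended and exact-sized). Without that, you have to either: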
skip, which is O(N) in the number of skipped items
collect the result of .rev().take(), and then rev it, which is O(N) in the number of items to be printed, and requires allocating memory for them
The skip is obvious, so let me illustrate the collect:
let vec = vec![1,2,3,4,5,6,7,8,9,0];
let vec: Vec<_> = vec.iter().rev().take(3).collect();
for v in vec.iter().rev() {
    println!("{}", v);
}
Of course, the inefficiency is due to you shooting yourself in the foot by avoiding random access in the first place...
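On a reasonably recent compiler the intermediate collect may no longer be needed, because Take can be reversed when the underlying iterator is double-ended and knows its exact length (I believe this landed around Rust 1.38, but treat the version as an assumption). A minimal sketch, with the function name last_n_in_order made up for illustration:

```rust
// Hypothetical helper: returns the last `n` items in their original order,
// using only iterator adapters and no random access.
fn last_n_in_order(items: &[i32], n: usize) -> Vec<i32> {
    items
        .iter()
        .rev()    // walk from the back
        .take(n)  // keep the last `n` items
        .rev()    // restore the original order
        .copied()
        .collect()
}
```

For the vector in the question and n = 3 this yields 8, 9, 0.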
Based on the comments, I guess you want to iterate specifically through the elements of a Vec or slice. If that is the case, you could use range slicing, as shown below:
let vec = vec![1,2,3,4,5,6,7,8,9,0];
let n = vec.len() - 3;
for v in &vec[n..] {
    println!("{}", v);
}
The big advantage of this approach is that it doesn't require skipping over elements you are not interested in (which may have a big cost if not optimized away). It just makes a new slice and then iterates through it. In other words, you have the guarantee that it will be fast.
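One caveat with vec.len() - 3 is that the subtraction underflows, and panics in debug builds, when the vector holds fewer than three elements. A small sketch that guards against this, with the helper name last_n made up for illustration:

```rust
// Hypothetical helper: returns the last `n` elements of `items`,
// or all of them if there are fewer than `n`.
fn last_n<T>(items: &[T], n: usize) -> &[T] {
    // saturating_sub clamps at 0 instead of underflowing.
    let start = items.len().saturating_sub(n);
    &items[start..]
}
```

It is used the same way as the slice above: for v in last_n(&vec, 3) { ... }.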

Conflicting lifetime requirement for iterator returned from function

This may be a duplicate. I don't know. I couldn't understand the other answers well enough to know that. :)
Rust version: rustc 1.0.0-nightly (b47aebe3f 2015-02-26) (built 2015-02-27)
Basically, I'm passing a bool to this function that's supposed to build an iterator that filters one way for true and another way for false. Then it kind of craps itself because it doesn't know how to keep that boolean value handy, I guess. I don't know. There are actually multiple lifetime problems here, which is discouraging because this is a really common pattern for me, since I come from a .NET background.
fn main() {
    for n in values(true) {
        println!("{}", n);
    }
}
fn values(even: bool) -> Box<Iterator<Item=usize>> {
    Box::new([3usize, 4, 2, 1].iter()
        .map(|n| n * 2)
        .filter(|n| if even {
            n % 2 == 0
        } else {
            true
        }))
}
Is there a way to make this work?
You have two conflicting issues, so let's break down a few representative pieces:
[3usize, 4, 2, 1].iter()
    .map(|n| n * 2)
    .filter(|n| n % 2 == 0)
Here, we create an array in the stack frame of the method, then get an iterator to it. Since we aren't allowed to consume the array, the iterator item is &usize. We then map from the &usize to a usize. Then we filter against a &usize - we aren't allowed to consume the filtered item, otherwise the iterator wouldn't have it to return!
The problem here is that we are ultimately rooted to the stack frame of the function. We can't return this iterator, because the array won't exist after the call returns!
To work around this for now, let's just make it static. Now we can focus on the issue with even.
filter takes a closure. Closures capture any variable used that isn't provided as an argument to the closure. By default, these variables are captured by reference. However, even is again a variable located on the stack frame. This time however, we can give it to the closure by using the move keyword. Here's everything put together:
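fn main() {
    for n in values(true) {
        println!("{}", n);
    }
}
static ITEMS: [usize; 4] = [3, 4, 2, 1];
// `dyn` marks the trait object explicitly; current editions require it.
fn values(even: bool) -> Box<dyn Iterator<Item = usize>> {
    Box::new(ITEMS.iter()
        .map(|n| n * 2)
        // `move` copies `even` into the closure so it can outlive this call.
        .filter(move |n| if even {
            n % 2 == 0
        } else {
            true
        }))
}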
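On current Rust, a possible alternative to both the static and the Box is to let the iterator own its data and return impl Iterator. This is a sketch under the assumption that your toolchain supports impl Trait in return position, which the 2015 nightly from the question did not. A Vec is used here so the example does not depend on how a given edition treats into_iter on arrays.

```rust
// The iterator owns its data (a Vec) and `even` is moved into the closure,
// so nothing borrows from this function's stack frame.
fn values(even: bool) -> impl Iterator<Item = usize> {
    vec![3usize, 4, 2, 1]
        .into_iter()
        .map(|n| n * 2)
        .filter(move |n| if even { n % 2 == 0 } else { true })
}
```

The call site stays the same: for n in values(true) { println!("{}", n); }.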