How do I create a by-value iterator on the stack?

How do I create a by-value iterator on the stack? - iterator

I can create a consuming iterator in the heap:
vec![1, 10, 100].into_iter()
I can also create an iterator on the stack that borrows the elements:
[1, 10, 100].iter()
But if I write this:
[1, 10, 100].into_iter()
This is not a consuming iterator because [T; _]::into_iter does not exist: IntoIterator is only implemented for the borrowed version (aka slice). Is there a simple way (preferably in the std lib) to create a consuming iterator on the stack?
I know that [1, 10, 100].iter().cloned() can be done, but this requires the items to be clonable.

Is there a simple way (preferably in the std lib) to create a consuming iterator on the stack?
No.
Is there a simple way (preferably in the std lib) to create a consuming iterator on the stack?
Yes. Use a crate like stack or smallvec, which provide array types that implement IntoIterator.

Very ugly, but technically works:
for s in [
Some(String::from("hello")),
Some(String::from("goodbye"))
].iter_mut().map(|option| option.take().unwrap()) {
let s: String = s;
println!("{}", s);
}
You can use a macro that achieves this in a prettier way:
macro_rules! iter {
[ $( $item:expr ),+ ] => {{
[ $( Some($item), )+ ]
.iter_mut()
.map(|o| o.take().unwrap())
}};
// Rule to allow a trailing comma:
[ $( $item:expr, )+ ] => {{
iter![ $( $item ),+ ]
}};
}
fn main() {
for s in iter![String::from("hello"), String::from("goodbye")] {
println!("{}", s);
}
}

You can have a macro which wraps the values in a once iterator and chains them together:
macro_rules! value_iter {
() => {
std::iter::empty()
};
($v: expr, $( $rest: expr ), +) => {
std::iter::once($v).chain(
value_iter!($($rest),*)
)
};
($v: expr) => {
std::iter::once($v)
};
}
Using:
#[derive(Debug, PartialEq)]
struct Foo;
let it = value_iter![Foo, Foo, Foo];
let all: Vec<_> = it.collect();
assert_eq!(all, vec![Foo, Foo, Foo]);
A known drawback is that the iterator will not be an exact-size iterator, and so the compiler might miss some obvious optimizations.
Playground

Related

Idiomatic way to collect all errors from an iterator

Let's say I have a attrs: Vec<Attribute> of some function attributes and a function fn map_attribute(attr: &Attribute) -> Result<TokenStream, Error> that maps the attributes to some code.
I know that I could write something like this:
attrs.into_iter()
.map(map_attribute)
.collect::<Result<Vec<_>, _>()?
However, this is not what I want. What I want is spit out all errors at once, not stop with the first Error. Currently I do something like this:
let mut codes : Vec<TokenStream> = Vec::new();
let mut errors: Vec<Error> = Vec::new();
for attr in attrs {
match map_attribute(attr) {
Ok(code) => codes.push(code),
Err(err) => errors.push(err)
}
}
let mut error_iter = errors.into_iter();
if let Some(first) = error_iter.nth(0) {
return Err(iter.fold(first, |mut e0, e1| { e0.combine(e1); e0 }));
}
This second version does what I want, but is considerably more verbose than the first version. Is there a better / more idiomatic way to acchieve this, if possible without creating my own iterator?

The standard library does not have a convenient one-liner for this as far as I know, however the excellent itertools library does:
use itertools::Itertools; // 0.9.0
fn main() {
let foo = vec![Ok(42), Err(":("), Ok(321), Err("oh noes")];
let (codes, errors): (Vec<_>, Vec<_>)
= foo.into_iter().partition_map(From::from);
println!("codes={:?}", codes);
println!("errors={:?}", errors);
}
(Permalink to the playground)

I ended up writing my own extension for Iterator, which allows me to stop collecting codes when I encounter my first error. This is in my use case probably a bit more efficient than the answer by mcarton, since I only need the first partition bucket if the second one is empty. Also, I need to fold the errors anyways if I want to combine them into a single error.
pub trait CollectToResult
{
type Item;
fn collect_to_result(self) -> Result<Vec<Self::Item>, Error>;
}
impl<Item, I> CollectToResult for I
where
I : Iterator<Item = Result<Item, Error>>
{
type Item = Item;
fn collect_to_result(self) -> Result<Vec<Item>, Error>
{
self.fold(<Result<Vec<Item>, Error>>::Ok(Vec::new()), |res, code| {
match (code, res) {
(Ok(code), Ok(mut codes)) => { codes.push(code); Ok(codes) },
(Ok(_), Err(errors)) => Err(errors),
(Err(err), Ok(_)) => Err(err),
(Err(err), Err(mut errors)) => { errors.combine(err); Err(errors) }
}
})
}
}

Why do I get an UnsupportedType error when serializing to TOML with a manually implemented Serialize for an enum with struct variants?

I'm trying to implement Serialize for an enum that includes struct variants. The serde.rs documentation indicates the following:
enum E {
// Use three-step process:
// 1. serialize_struct_variant
// 2. serialize_field
// 3. end
Color { r: u8, g: u8, b: u8 },
// Use three-step process:
// 1. serialize_tuple_variant
// 2. serialize_field
// 3. end
Point2D(f64, f64),
// Use serialize_newtype_variant.
Inches(u64),
// Use serialize_unit_variant.
Instance,
}
With that in mind, I proceeded to implemention:
use serde::ser::{Serialize, SerializeStructVariant, Serializer};
use serde_derive::Deserialize;
#[derive(Deserialize)]
enum Variants {
VariantA,
VariantB { k: u32, p: f64 },
}
impl Serialize for Variants {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
match *self {
Variants::VariantA => serializer.serialize_unit_variant("Variants", 0, "VariantA"),
Variants::VariantB { ref k, ref p } => {
let mut state =
serializer.serialize_struct_variant("Variants", 1, "VariantB", 2)?;
state.serialize_field("k", k)?;
state.serialize_field("p", p)?;
state.end()
}
}
}
}
fn main() {
let x = Variants::VariantB { k: 5, p: 5.0 };
let toml_str = toml::to_string(&x).unwrap();
println!("{}", toml_str);
}
The code compiles, but when I run it it fails:
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: UnsupportedType', src/libcore/result.rs:999:5
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
I figured the issue must be in my use of the API, so I consulted the API documentation for StructVariant and it looks practically the same as my code. I'm sure I'm missing something, but I don't see it based on the docs and output.

Enabling external tagging for the enum enables Serde to serialize/deserialize it to TOML:
#[derive(Deserialize)]
#[serde(tag = "type")]
enum Variants {
VariantA,
VariantB { k: u32, p: f64 },
}
toml::to_string(&Variants::VariantB { k: 42, p: 13.37 })
serializes to
type = VariantB
k = 42
p = 13.37
This works well in Vecs and HashMaps, too.

The TOML format does not support enums with values:
use serde::Serialize; // 1.0.99
use toml; // 0.5.3
#[derive(Serialize)]
enum A {
B(i32),
}
fn main() {
match toml::to_string(&A::B(42)) {
Ok(s) => println!("{}", s),
Err(e) => eprintln!("Error: {}", e),
}
}
Error: unsupported Rust type
It's unclear what you'd like your data structure to map to as TOML. Using JSON works just fine:
use serde::Serialize; // 1.0.99
use serde_json; // 1.0.40
#[derive(Serialize)]
enum Variants {
VariantA,
VariantB { k: u32, p: f64 },
}
fn main() {
match serde_json::to_string(&Variants::VariantB { k: 42, p: 42.42 }) {
Ok(s) => println!("{}", s),
Err(e) => eprintln!("Error: {}", e),
}
}
{"VariantB":{"k":42,"p":42.42}}

Is there a more ergonomic syntax for Either when using futures?

Here's an example of using Tokio to run a function that returns a future:
use futures::sync::oneshot;
use futures::Future;
use std::thread;
use std::time::Duration;
use tokio;
#[derive(Debug)]
struct MyError {
error_code: i32,
}
impl From<oneshot::Canceled> for MyError {
fn from(_: oneshot::Canceled) -> MyError {
MyError { error_code: 1 }
}
}
fn deferred_task() -> impl Future<Item = i32, Error = MyError> {
let (sx, rx) = oneshot::channel();
thread::spawn(move || {
thread::sleep(Duration::from_millis(100));
sx.send(100).unwrap();
});
return rx.map_err(|e| MyError::from(e));
}
fn main() {
tokio::run(deferred_task().then(|r| {
println!("{:?}", r);
Ok(())
}));
}
However, when the function in question (i.e. deferred_task) is non-trivial, the code becomes much more complex when I write it, because the ? operation doesn't seem to easily mix with returning a future:
fn send_promise_to_worker(sx: oneshot::Sender<i32>) -> Result<(), ()> {
// Send the oneshot somewhere in a way that might fail, eg. over a channel
thread::spawn(move || {
thread::sleep(Duration::from_millis(100));
sx.send(100).unwrap();
});
Ok(())
}
fn deferred_task() -> impl Future<Item = i32, Error = MyError> {
let (sx, rx) = oneshot::channel();
send_promise_to_worker(sx)?; // <-------- Can't do this, because the return is not a result
return rx.map_err(|e| MyError::from(e));
}
A Future is a Result, it's meaningless to wrap it in result, and it breaks the impl Future return type.
Instead you get a deeply nested chain of:
fn deferred_task() -> impl Future<Item = i32, Error = MyError> {
let (sx, rx) = oneshot::channel();
match query_data() {
Ok(_i) => match send_promise_to_worker(sx) {
Ok(_) => Either::A(rx.map_err(|e| MyError::from(e))),
Err(_e) => Either::B(futures::failed(MyError { error_code: 2 })),
},
Err(_) => Either::B(futures::failed(MyError { error_code: 2 })),
}
}
full code
The more results you have, the deeper the nesting; exactly what the ? operator solves normally.
Am I missing something? Is there some syntax sugar to make this easier?

I do not see how async / await syntax will categorically help you with Either. Ultimately, you still need to return a single concrete type, and that's what Either provides. async / await will reduce the need for combinators like Future::map or Future::and_then however.
See also:
Why can impl trait not be used to return multiple / conditional types?
That being said, you don't need to use Either here.
You have consecutive Result-returning functions, so you can borrow a trick from JavaScript and use an IIFE to use use the ? operator. Then, we can "lift up" the combined Result into a future and chain it with the future from the receiver:
fn deferred_task() -> impl Future<Item = i32, Error = MyError> {
let (tx, rx) = oneshot::channel();
let x = (|| {
let _i = query_data().map_err(|_| MyError { error_code: 1 })?;
send_promise_to_worker(tx).map_err(|_| MyError { error_code: 2 })?;
Ok(())
})();
future::result(x).and_then(|()| rx.map_err(MyError::from))
}
In the future, that IIFE could be replaced with a try block, as I understand it.
You could also go the other way and convert everything to a future:
fn deferred_task() -> impl Future<Item = i32, Error = MyError> {
let (tx, rx) = oneshot::channel();
query_data()
.map_err(|_| MyError { error_code: 1 })
.into_future()
.and_then(|_i| {
send_promise_to_worker(tx)
.map_err(|_| MyError { error_code: 2 })
.into_future()
})
.and_then(|_| rx.map_err(MyError::from))
}
This would be helped with async / await syntax:
async fn deferred_task() -> Result<i32, MyError> {
let (tx, rx) = oneshot::channel();
query_data().map_err(|_| MyError { error_code: 1 })?;
send_promise_to_worker(tx).map_err(|_| MyError { error_code: 2 })?;
let v = await! { rx }?;
Ok(v)
}
I have also seen improved syntax for constructing the Either by adding left and right methods to the Future trait:
foo.left();
// vs
Either::left(foo);
However, this doesn't appear in any of the current implementations.
A Future is a Result
No, it is not.
There are two relevant Futures to talk about:
From the futures 0.1 crate
From the (nightly) standard library
Notably, Future::poll returns a type that can be in two states:
Complete
Not complete
In the futures crate, "success" and "failure" are tied to "complete", whereas in the standard library they are not. In the crate, Result implements IntoFuture, and in the standard library you can use future::ready. Both of these allow converting a Result into a future, but that doesn't mean that Result is a future, no more than saying that a Vec<u8> is an iterator, even though it can be converted into one.
It's possible that the ? operator (powered by the Try trait), will be enhanced to automatically convert from a Result to a specific type of Future, or that Result will even implement Future directly, but I have not heard of any such plans.

Is there some syntax sugar to make this easier?
Yes, it's called async/await, but it's not quite ready for wide consumption. It is only supported on nightly, it uses a slightly different version of futures that Tokio only supports via an interop library that causes additional syntactic overhead, and documentation for the whole thing is still spotty.
Here are some relevant links:
What is the purpose of async/await in Rust?
https://jsdw.me/posts/rust-asyncawait-preview/
https://areweasyncyet.rs/

Are there equivalents to slice::chunks/windows for iterators to loop over pairs, triplets etc?

It can be useful to iterate over multiple variables at once, overlapping (slice::windows), or not (slice::chunks).
This only works for slices; is it possible to do this for iterators, using tuples for convenience?
Something like the following could be written:
for (prev, next) in some_iter.windows(2) {
...
}
If not, could it be implemented as a trait on existing iterators?

It's possible to take chunks of an iterator using Itertools::tuples, up to a 4-tuple:
use itertools::Itertools; // 0.9.0
fn main() {
let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();
for (prev, next) in some_iter.tuples() {
println!("{}--{}", prev, next);
}
}
(playground)
1--2
3--4
5--6
If you don't know that your iterator exactly fits into the chunks, you can use Tuples::into_buffer to access any leftovers:
use itertools::Itertools; // 0.9.0
fn main() {
let some_iter = vec![1, 2, 3, 4, 5].into_iter();
let mut t = some_iter.tuples();
for (prev, next) in t.by_ref() {
println!("{}--{}", prev, next);
}
for leftover in t.into_buffer() {
println!("{}", leftover);
}
}
(playground)
1--2
3--4
5
It's also possible to take up to 4-tuple windows with Itertools::tuple_windows:
use itertools::Itertools; // 0.9.0
fn main() {
let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();
for (prev, next) in some_iter.tuple_windows() {
println!("{}--{}", prev, next);
}
}
(playground)
1--2
2--3
3--4
4--5
5--6
If you need to get partial chunks / windows, you can get

TL;DR: The best way to have chunks and windows on an arbitrary iterator/collection is to first collect it into a Vec and iterate over that.
The exact syntax requested is impossible in Rust.
The issue is that in Rust, a function's signature is depending on types, not values, and while Dependent Typing exists, there are few languages that implement it (it's hard).
This is why chunks and windows return a sub-slice by the way; the number of elements in a &[T] is not part of the type and therefore can be decided at run-time.
Let's pretend you asked for: for slice in some_iter.windows(2) instead then.
Where would the storage backing this slice live?
It cannot live:
in the original collection because a LinkedList doesn't have a contiguous storage
in the iterator because of the definition of Iterator::Item, there is no lifetime available
So, unfortunately, slices can only be used when the backing storage is a slice.
If dynamic allocations are accepted, then it is possible to use Vec<Iterator::Item> as the Item of the chunking iterator.
struct Chunks<I: Iterator> {
elements: Vec<<I as Iterator>::Item>,
underlying: I,
}
impl<I: Iterator> Chunks<I> {
fn new(iterator: I, size: usize) -> Chunks<I> {
assert!(size > 0);
let mut result = Chunks {
underlying: iterator, elements: Vec::with_capacity(size)
};
result.refill(size);
result
}
fn refill(&mut self, size: usize) {
assert!(self.elements.is_empty());
for _ in 0..size {
match self.underlying.next() {
Some(item) => self.elements.push(item),
None => break,
}
}
}
}
impl<I: Iterator> Iterator for Chunks<I> {
type Item = Vec<<I as Iterator>::Item>;
fn next(&mut self) -> Option<Self::Item> {
if self.elements.is_empty() {
return None;
}
let new_elements = Vec::with_capacity(self.elements.len());
let result = std::mem::replace(&mut self.elements, new_elements);
self.refill(result.len());
Some(result)
}
}
fn main() {
let v = vec!(1, 2, 3, 4, 5);
for slice in Chunks::new(v.iter(), 2) {
println!("{:?}", slice);
}
}
Will return:
[1, 2]
[3, 4]
[5]
The canny reader will realize that I surreptitiously switched from windows to chunks.
windows is more difficult, because it returns the same element multiple times which require that the element be Clone. Also, since it needs returning a full Vec each time, it will need internally to keep a Vec<Vec<Iterator::Item>>.
This is left as an exercise to the reader.
Finally, a note on performance: all those allocations are gonna hurt (especially in the windows case).
The best allocation strategy is generally to allocate a single chunk of memory and then live off that (unless the amount is really massive, in which case streaming is required).
It's called collect::<Vec<_>>() in Rust.
And since the Vec has a chunks and windows methods (by virtue of implementing Deref<Target=[T]>), you can then use that instead:
for slice in v.iter().collect::<Vec<_>>().chunks(2) {
println!("{:?}", slice);
}
for slice in v.iter().collect::<Vec<_>>().windows(2) {
println!("{:?}", slice);
}
Sometimes the best solutions are the simplest.

On nightly
The chunks version is now available on nightly under the name array_chunks
#![feature(iter_array_chunks)]
for [a, b, c] in some_iter.array_chunks() {
...
}
And it handles remainders nicely:
#![feature(iter_array_chunks)]
for [a, b, c] in some_iter.by_ref().array_chunks() {
...
}
let rem = some_iter.into_remainder();
On stable
Since Rust 1.51 this is possible with const generics where the iterator yields constant size arrays [T; N] for any N.
I built two standalone crates which implement this:
iterchunks provides array_chunks()
iterwindows provides
array_windows()
use iterchunks::IterChunks; // 0.2
for [a, b, c] in some_iter.array_chunks() {
...
}
use iterwindows::IterWindows; // 0.2
for [prev, next] in some_iter.array_windows() {
...
}
Using the example given in the Itertools answer:
use iterchunks::IterChunks; // 0.2
fn main() {
let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();
for [prev, next] in some_iter.array_chunks() {
println!("{}--{}", prev, next);
}
}
This outputs
1--2
3--4
5--6
Most times the array size can be inferred but you can also specific it explicitly. Additionally, any reasonable size N can be used, there is no limit like in the Itertools case.
use iterwindows::IterWindows; // 0.2
fn main() {
let mut iter = vec![1, 2, 3, 4, 5, 6].into_iter().array_windows::<5>();
println!("{:?}", iter.next());
println!("{:?}", iter.next());
println!("{:?}", iter.next());
}
This outputs
Some([1, 2, 3, 4, 5])
Some([2, 3, 4, 5, 6])
None
Note: array_windows() uses clone to yield elements multiple times so its best used for references and cheap to copy types.

How to map on a vec and use a closure with pattern matching in Rust

I'd like to use map to iterate over an array and do stuff per item and get rid of the for loop. An error which I do not understand blocks my attempt. What I want to achieve is to iterate through a vector of i32 and match on them to concat a string with string literals and then return it at the end.
Function:
pub fn convert_to_rainspeak(prime_factors: Vec<i32>) -> String {
let mut speak = String::new();
prime_factors.iter().map(|&factor| {
match factor {
3 => { speak.push_str("Pling"); },
5 => { speak.push_str("Plang"); },
7 => { speak.push_str("Plong"); },
_ => {}
}
}).collect();
speak
}
fn main() {}
Output:
error[E0282]: type annotations needed
--> src/main.rs:10:8
|
10 | }).collect();
| ^^^^^^^ cannot infer type for `B`

Iterator::collect is defined as:
fn collect<B>(self) -> B
where
B: FromIterator<Self::Item>
That is, it returns a type that is up to the caller. However, you have completely disregarded the output, so there's no way for it to infer a type. The code misuses collect when it basically wants to use for.
In your "fixed" version (which has since been edited, making this paragraph make no sense), you are being very inefficient by allocating a string in every iteration. Plus you don't need to specify any explicit types other than those on the function, and you should accept a &[i32] instead:
fn convert_to_rainspeak(prime_factors: &[i32]) -> String {
prime_factors.iter()
.map(|&factor| {
match factor {
3 => "Pling",
5 => "Plang",
7 => "Plong",
_ => "",
}
})
.collect()
}
fn main() {
println!("{}", convert_to_rainspeak(&[1, 2, 3, 4, 5]));
}

The error in this case is that the compiler can't figure out what type you are collecting into. If you add a type annotation to collect, then it will work:
pub fn convert_to_rainspeak(prime_factors:Vec<i32>) -> String{
let mut speak = String::new();
prime_factors.iter().map(| &factor| {
match factor {
3 => {speak.push_str("Pling");},
5 => {speak.push_str("Plang");},
7 => {speak.push_str("Plong");},
_ => {}
}
}).collect::<Vec<_>>();
speak
}
However, this is really not the idiomatic way to do this. You should use for instead:
pub fn convert_to_rainspeak(prime_factors:Vec<i32>) -> String {
let mut speak = String::new();
for factor in prime_factors.iter() {
match *factor {
3 => speak.push_str("Pling"),
5 => speak.push_str("Plang"),
7 => speak.push_str("Plong"),
_ => {}
}
}
speak
}

You could also use flat_map() instead of map().
This way you can map to Option and return None instead of empty string if there is no corresponding value.
fn convert_to_rainspeak(prime_factors: &[i32]) -> String {
prime_factors
.iter()
.flat_map(|&factor| match factor {
3 => Some("Pling"),
5 => Some("Plang"),
7 => Some("Plong"),
_ => None,
})
.collect()
}
fn main() {
println!("{}", convert_to_rainspeak(&[1, 2, 3, 7]));
}

few minutes later I solved it myself (any upgrade appreciated):
pub fn convert_to_rainspeak(prime_factors:Vec<i32>) -> String{
let mut speak:String = prime_factors.iter().map(|&factor| {
match factor {
3 => {"Pling"},
5 => {"Plang"},
7 => {"Plong"},
_ => {""}
}
}).collect();
speak
}
The issue was that I was not aware that .collect() is awaiting the result of map. I then assigned prime_factors.iter()... to a string and rearranged var bindings so it now all works.
EDIT: refactored redundant assignments to the speak vector

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How do I create a by-value iterator on the stack? - iterator

Is there a simple way (preferably in the std lib) to create a consuming iterator on the stack? No. Is there a simple way (preferably in the std lib) to create a consuming iterator on the stack? Yes. Use a crate like stack or smallvec, which provide array types that implement IntoIterator.

Related

Idiomatic way to collect all errors from an iterator

Why do I get an UnsupportedType error when serializing to TOML with a manually implemented Serialize for an enum with struct variants?

Is there a more ergonomic syntax for Either when using futures?

Are there equivalents to slice::chunks/windows for iterators to loop over pairs, triplets etc?

How to map on a vec and use a closure with pattern matching in Rust

Categories

Resources