Iterate over copy types - iterator

It is clear that iterators pass around a references to avoid moving objects into iterator or it's closure argument, but what with Copy types? Let me show you a small snippet:
fn is_odd(x: &&i32) -> bool { *x & 1 == 1 }
// [1] fn is_odd(x: &i32) -> bool { x & 1 == 1 }
// [2] fn is_odd(x: i32) -> bool { x & 1 == 1 }
fn main() {
let xs = &[ 10, 20, 13, 14 ];
for x in xs.iter().filter(is_odd) {
assert_eq!(13, *x);
}
// [1] ...is slightly better, but not ideal
// for x in xs.iter().cloned().filter(is_odd) {
// assert_eq!(13, x);
// }
}
Am I right that .cloned() is preferred when we iterate over something like &[i32] or &[u8], where extra indirection is involved instead of just copying the tiny data unit?
But it looks like I can not avoid references passed into is_odd function.
Is there a way to make [2] function from above snippet work for higher-level functions like filter?
Assume that I understand that moving non-Copy type into predicate function is silly. But not all types use move semantics by default, right?

It is clear that iterators pass around a references
This blanket statement is not true, iterators are more than capable of yielding a non-reference. filter will provide a reference to the closure because it doesn't want to give ownership of the item to the closure. In your example, your iterated value is a &i32, and then filter provides a &&i32.
Is there a way to make [2] function from above snippet work for higher-level functions like filter?
Certainly, just provide a closure that does the dereferencing:
fn is_odd(x: i32) -> bool { x & 1 == 1 }
fn main() {
let xs = &[ 10, 20, 13, 14 ];
for x in xs.iter().filter(|&&x| is_odd(x)) {
assert_eq!(13, *x);
}
}

Related

How to create a vector of all decorated functions from a specific module?

I have a file main.rs and a file rule.rs. I want to define functions in rule.rs to be included in the Rules::rule vector without having to push them one by one. I'd prefer a loop to push them.
main.rs:
struct Rules {
rule: Vec<fn(arg: &Arg) -> bool>,
}
impl Rules {
fn validate_incomplete(self, arg: &Arg) -> bool {
// iterate through all constraints and evaluate, if false return and stop
for constraint in self.incomplete_rule_constraints.iter() {
if !constraint(&arg) {
return false;
}
}
true
}
}
rule.rs:
pub fn test_constraint1(arg: &Arg) -> bool {
arg.last_element().total() < 29500
}
pub fn test_constraint2(arg: &Arg) -> bool {
arg.last_element().total() < 35000
}
Rules::rule should be populated with test_constraint1 and test_constraint2.
In Python, I could add a decorator #rule_decorator above the constraints which you want to be included in the Vec, but I don't see an equivalent in Rust.
In Python, I could use dir(module) to see all available methods/attributes.
Python variant:
class Rules:
def __init__(self, name: str):
self.name = name
self.rule = []
for member in dir(self):
method = getattr(self, member)
if "rule_decorator" in dir(method):
self.rule.append(method)
def validate_incomplete(self, arg: Arg):
for constraint in self.incomplete_rule_constraints:
if not constraint(arg):
return False
return True
With the rule.py file:
#rule_decorator
def test_constraint1(arg: Arg):
return arg.last_element().total() < 29500
#rule_decorator
def test_constraint1(arg: Arg):
return arg.last_element().total() < 35000
All functions with a rule_decorator are added to the self.rule list and checked off by the validate_incomplete function.
Rust does not have the same reflection features as Python. In particular, you cannot iterate through all functions of a module at runtime. At least you can't do that with builtin tools. It is possible to write so called procedural macros which let you add custom attributes to your functions, e.g. #[rule_decorator] fn foo() { ... }. With proc macros, you can do almost anything.
However, using proc macros for this is way too over-engineered (in my opinion). In your case, I would simply list all functions to be included in your vector:
fn test_constraint1(arg: u32) -> bool {
arg < 29_500
}
fn test_constraint2(arg: u32) -> bool {
arg < 35_000
}
fn main() {
let rules = vec![test_constraint1 as fn(_) -> _, test_constraint2];
// Or, if you already have a vector and need to add to it:
let mut rules = Vec::new();
rules.extend_from_slice(
&[test_constraint1 as fn(_) -> _, test_constraint2]
);
}
A few notes about this code:
I replaced &Arg with u32, because it doesn't have anything to do with the problem. Please omit unnecessary details from questions on StackOverflow.
I used _ in the number literals to increase readability.
The strange as fn(_) -> _ cast is sadly necessary. You can read more about it in this question.
You can, with some tweaks and restrictions, achieve your goals. You'll need to use the inventory crate. This is limited to Linux, macOS and Windows at the moment.
You can then use inventory::submit to add values to a global registry, inventory::collect to build the registry, and inventory::iter to iterate over the registry.
Due to language restrictions, you cannot create a registry for values of a type that you do not own, which includes the raw function pointer. We will need to create a newtype called Predicate to use the crate:
use inventory; // 0.1.3
struct Predicate(fn(&u32) -> bool);
inventory::collect!(Predicate);
struct Rules;
impl Rules {
fn validate_incomplete(&self, arg: &u32) -> bool {
inventory::iter::<Predicate>
.into_iter()
.all(|Predicate(constraint)| constraint(arg))
}
}
mod rules {
use super::Predicate;
pub fn test_constraint1(arg: &u32) -> bool {
*arg < 29500
}
inventory::submit!(Predicate(test_constraint1));
pub fn test_constraint2(arg: &u32) -> bool {
*arg < 35000
}
inventory::submit!(Predicate(test_constraint2));
}
fn main() {
if Rules.validate_incomplete(&42) {
println!("valid");
} else {
println!("invalid");
}
}
There are a few more steps you'd need to take to reach your originally-stated goal:
"a vector"
You can collect from the provided iterator to build a Vec.
"decorated functions"
You can write your own procedural macro that will call inventory::submit!(Predicate(my_function_name)); for you.
"from a specific module"
You can add the module name into the Predicate struct and filter on that later.
See also:
How can I statically register structures at compile time?

Generate a tree of structs with testing/quick, respecting invariants

I have a tree of structs which I'd like to test using testing/quick, but constraining it to within my invariants.
This example code works:
var rnd = rand.New(rand.NewSource(time.Now().UnixNano()))
type X struct {
HasChildren bool
Children []*X
}
func TestSomething(t *testing.T) {
x, _ := quick.Value(reflect.TypeOf(X{}), rnd)
_ = x
// test some stuff here
}
But we hold HasChildren = true whenever len(Children) > 0 as an invariant, so it'd be better to ensure that whatever quick.Value() generates respects that (rather than finding "bugs" that don't actually exist).
I figured I could define a Generate function which uses quick.Value() to populate all the variable members:
func (X) Generate(rand *rand.Rand, size int) reflect.Value {
x := X{}
throwaway, _ := quick.Value(reflect.TypeOf([]*X{}), rand)
x.Children = throwaway.Interface().([]*X)
if len(x.Children) > 0 {
x.HasChildren = true
} else {
x.HasChildren = false
}
return reflect.ValueOf(x)
}
But this is panicking:
panic: value method main.X.Generate called using nil *X pointer [recovered]
And when I change Children from []*X to []X, it dies with a stack overflow.
The documentation is very thin on examples, and I'm finding almost nothing in web searches either.
How can this be done?
Looking at the testing/quick source code it seems that you can't create recursive custom generators and at the same time reuse the quick library facilities to generate the array part of the struct, because the size parameter, that is designed to limit the number of recursive calls, cannot be passed back into quick.Value(...)
https://golang.org/src/testing/quick/quick.go (see around line 50)
in your case this lead to an infinite tree that quickly "explodes" with 1..50 leafs at each level (that's the reason for the stack overflow).
If the function quick.sizedValue() had been public we could have used it to accomplish your task, but unfortunately this is not the case.
BTW since HasChildren is an invariant, can't you simply make it a struct method?
type X struct {
Children []*X
}
func (me *X) HasChildren() bool {
return len(me.Children) > 0
}
func main() {
.... generate X ....
if x.HasChildren() {
.....
}
}

How to compose mutable Iterators?

Editor's note: This code example is from a version of Rust prior to 1.0 and is not syntactically valid Rust 1.0 code. Updated versions of this code produce different errors, but the answers still contain valuable information.
I would like to make an iterator that generates a stream of prime numbers. My general thought process was to wrap an iterator with successive filters so for example you start with
let mut n = (2..N)
Then for each prime number you mutate the iterator and add on a filter
let p1 = n.next()
n = n.filter(|&x| x%p1 !=0)
let p2 = n.next()
n = n.filter(|&x| x%p2 !=0)
I am trying to use the following code, but I can not seem to get it to work
struct Primes {
base: Iterator<Item = u64>,
}
impl<'a> Iterator for Primes<'a> {
type Item = u64;
fn next(&mut self) -> Option<u64> {
let p = self.base.next();
match p {
Some(n) => {
let prime = n.clone();
let step = self.base.filter(move |&: &x| {x%prime!=0});
self.base = &step as &Iterator<Item = u64>;
Some(n)
},
_ => None
}
}
}
I have toyed with variations of this, but I can't seem to get lifetimes and types to match up. Right now the compiler is telling me
I can't mutate self.base
the variable prime doesn't live long enough
Here is the error I am getting
solution.rs:16:17: 16:26 error: cannot borrow immutable borrowed content `*self.base` as mutable
solution.rs:16 let p = self.base.next();
^~~~~~~~~
solution.rs:20:28: 20:37 error: cannot borrow immutable borrowed content `*self.base` as mutable
solution.rs:20 let step = self.base.filter(move |&: &x| {x%prime!=0});
^~~~~~~~~
solution.rs:21:30: 21:34 error: `step` does not live long enough
solution.rs:21 self.base = &step as &Iterator<Item = u64>;
^~~~
solution.rs:15:39: 26:6 note: reference must be valid for the lifetime 'a as defined on the block at 15:38...
solution.rs:15 fn next(&mut self) -> Option<u64> {
solution.rs:16 let p = self.base.next();
solution.rs:17 match p {
solution.rs:18 Some(n) => {
solution.rs:19 let prime = n.clone();
solution.rs:20 let step = self.base.filter(move |&: &x| {x%prime!=0});
...
solution.rs:20:71: 23:14 note: ...but borrowed value is only valid for the block suffix following statement 1 at 20:70
solution.rs:20 let step = self.base.filter(move |&: &x| {x%prime!=0});
solution.rs:21 self.base = &step as &Iterator<Item = u64>;
solution.rs:22 Some(n)
solution.rs:23 },
error: aborting due to 3 previous errors
Why won't Rust let me do this?
Here is a working version:
struct Primes<'a> {
base: Option<Box<Iterator<Item = u64> + 'a>>,
}
impl<'a> Iterator for Primes<'a> {
type Item = u64;
fn next(&mut self) -> Option<u64> {
let p = self.base.as_mut().unwrap().next();
p.map(|n| {
let base = self.base.take();
let step = base.unwrap().filter(move |x| x % n != 0);
self.base = Some(Box::new(step));
n
})
}
}
impl<'a> Primes<'a> {
#[inline]
pub fn new<I: Iterator<Item = u64> + 'a>(r: I) -> Primes<'a> {
Primes {
base: Some(Box::new(r)),
}
}
}
fn main() {
for p in Primes::new(2..).take(32) {
print!("{} ", p);
}
println!("");
}
I'm using a Box<Iterator> trait object. Boxing is unavoidable because the internal iterator must be stored somewhere between next() calls, and there is nowhere you can store reference trait objects.
I made the internal iterator an Option. This is necessary because you need to replace it with a value which consumes it, so it is possible that the internal iterator may be "absent" from the structure for a short time. Rust models absence with Option. Option::take replaces the value it is called on with None and returns whatever was there. This is useful when shuffling non-copyable objects around.
Note, however, that this sieve implementation is going to be both memory and computationally inefficient - for each prime you're creating an additional layer of iterators which takes heap space. Also the depth of stack when calling next() grows linearly with the number of primes, so you will get a stack overflow on a sufficiently large number:
fn main() {
println!("{}", Primes::new(2..).nth(10000).unwrap());
}
Running it:
% ./test1
thread '<main>' has overflowed its stack
zsh: illegal hardware instruction (core dumped) ./test1

Conflicting lifetime requirement for iterator returned from function

This may be a duplicate. I don't know. I couldn't understand the other answers well enough to know that. :)
Rust version: rustc 1.0.0-nightly (b47aebe3f 2015-02-26) (built 2015-02-27)
Basically, I'm passing a bool to this function that's supposed to build an iterator that filters one way for true and another way for false. Then it kind of craps itself because it doesn't know how to keep that boolean value handy, I guess. I don't know. There are actually multiple lifetime problems here, which is discouraging because this is a really common pattern for me, since I come from a .NET background.
fn main() {
for n in values(true) {
println!("{}", n);
}
}
fn values(even: bool) -> Box<Iterator<Item=usize>> {
Box::new([3usize, 4, 2, 1].iter()
.map(|n| n * 2)
.filter(|n| if even {
n % 2 == 0
} else {
true
}))
}
Is there a way to make this work?
You have two conflicting issues, so let break down a few representative pieces:
[3usize, 4, 2, 1].iter()
.map(|n| n * 2)
.filter(|n| n % 2 == 0))
Here, we create an array in the stack frame of the method, then get an iterator to it. Since we aren't allowed to consume the array, the iterator item is &usize. We then map from the &usize to a usize. Then we filter against a &usize - we aren't allowed to consume the filtered item, otherwise the iterator wouldn't have it to return!
The problem here is that we are ultimately rooted to the stack frame of the function. We can't return this iterator, because the array won't exist after the call returns!
To work around this for now, let's just make it static. Now we can focus on the issue with even.
filter takes a closure. Closures capture any variable used that isn't provided as an argument to the closure. By default, these variables are captured by reference. However, even is again a variable located on the stack frame. This time however, we can give it to the closure by using the move keyword. Here's everything put together:
fn main() {
for n in values(true) {
println!("{}", n);
}
}
static ITEMS: [usize; 4] = [3, 4, 2, 1];
fn values(even: bool) -> Box<Iterator<Item=usize>> {
Box::new(ITEMS.iter()
.map(|n| n * 2)
.filter(move |n| if even {
n % 2 == 0
} else {
true
}))
}

Implementing Decodable for a wrapper around a fixed size vector

Background: the serialize crate is undocumented, deriving Decodable doesn't work. I've also looked at existing implementations for other types and find the code difficult to follow.
How does the decoding process work, and how do I implement Decodable for this struct?
pub struct Grid<A> {
data: [[A,..GRIDW],..GRIDH]
}
The reason why #[deriving(Decodable)] doesn't work is that [A,..GRIDW] doesn't implement Decodable, and it's impossible to implement a trait for a type when both are defined outside of this crate, which is the case here. So the only solution I can see is to manually implement Decodable for Grid.
And this is as far as I've gotten
impl <A: Decodable<D, E>, D: Decoder<E>, E> Decodable<D, E> for Grid<A> {
fn decode(decoder: &mut D) -> Result<Grid<A>, E> {
decoder.read_struct("Grid", 1u, ref |d| Ok(Grid {
data: match d.read_struct_field("data", 0u, ref |d| Decodable::decode(d)) {
Ok(e) => e,
Err(e) => return Err(e)
},
}))
}
}
Which gives an error at Decodable::decode(d)
error: failed to find an implementation of trait
serialize::serialize::Decodable for [[A, .. 20], .. 20]
It's not really possible to do this nicely at the moment for a variety of reasons:
We can't be generic over the length of a fixed length array (the fundamental issue)
The current trait coherence restrictions means we can't write a custom trait MyDecodable<D, E> { ... } with impl MyDecodable<D, E> for [A, .. GRIDW] (and one for GRIDH) and a blanket implementation impl<A: Decodable<D, E>> MyDecodable<D, E> for A. This forces a trait-based solution into using an intermediary type, which then makes the compiler's type inference rather unhappy and AFAICT impossible to satisfy.
We don't have associated types (aka "output types"), which I think would allow the type inference to be slightly sane.
Thus, for now, we're left with a manual implementation. :(
extern crate serialize;
use std::default::Default;
use serialize::{Decoder, Decodable};
static GRIDW: uint = 10;
static GRIDH: uint = 5;
fn decode_grid<E, D: Decoder<E>,
A: Copy + Default + Decodable<D, E>>(d: &mut D)
-> Result<Grid<A>, E> {
// mirror the Vec implementation: try to read a sequence
d.read_seq(|d, len| {
// check it's the required length
if len != GRIDH {
return Err(
d.error(format!("expecting length {} but found {}",
GRIDH, len).as_slice()));
}
// create the array with empty values ...
let mut array: [[A, .. GRIDW], .. GRIDH]
= [[Default::default(), .. GRIDW], .. GRIDH];
// ... and fill it in progressively ...
for (i, outer) in array.mut_iter().enumerate() {
// ... by reading each outer element ...
try!(d.read_seq_elt(i, |d| {
// ... as a sequence ...
d.read_seq(|d, len| {
// ... of the right length,
if len != GRIDW { return Err(d.error("...")) }
// and then read each element of that sequence as the
// elements of the grid.
for (j, inner) in outer.mut_iter().enumerate() {
*inner = try!(d.read_seq_elt(j, Decodable::decode));
}
Ok(())
})
}));
}
// all done successfully!
Ok(Grid { data: array })
})
}
pub struct Grid<A> {
data: [[A,..GRIDW],..GRIDH]
}
impl<E, D: Decoder<E>, A: Copy + Default + Decodable<D, E>>
Decodable<D, E> for Grid<A> {
fn decode(d: &mut D) -> Result<Grid<A>, E> {
d.read_struct("Grid", 1, |d| {
d.read_struct_field("data", 0, decode_grid)
})
}
}
fn main() {}
playpen.
It's also possible to write a more "generic" [T, .. n] decoder by using macros to instantiate each version, with special control over how the recursive decoding is handled to allow nested fixed-length arrays to be handled (as required for Grid); this requires somewhat less code (especially with more layers, or a variety of different lengths), but the macro solution:
may be harder to understand, and
the one I give there may be less efficient (there's a new array variable created for every fixed length array, including new Defaults, while the non-macro solution above just uses a single array and thus only calls Default::default once for each element in the grid). It may be possible to expand to a similar set of recursive loops, but I'm not sure.