How to create a vector of all decorated functions from a specific module?

I have a file main.rs and a file rule.rs. I want to define functions in rule.rs to be included in the Rules::rule vector without having to push them one by one. I'd prefer a loop to push them.
main.rs:
struct Rules {
    rule: Vec<fn(arg: &Arg) -> bool>,
}
impl Rules {
    fn validate_incomplete(self, arg: &Arg) -> bool {
        // iterate through all constraints; stop and return false on the first failure
        for constraint in self.rule.iter() {
            if !constraint(arg) {
                return false;
            }
        }
        true
    }
}
rule.rs:
pub fn test_constraint1(arg: &Arg) -> bool {
    arg.last_element().total() < 29500
}
pub fn test_constraint2(arg: &Arg) -> bool {
    arg.last_element().total() < 35000
}
Rules::rule should be populated with test_constraint1 and test_constraint2.
In Python, I could add a decorator @rule_decorator above the constraints that I want included in the Vec, but I don't see an equivalent in Rust.
In Python, I could use dir(module) to see all available methods/attributes.
Python variant:
class Rules:
    def __init__(self, name: str):
        self.name = name
        self.rule = []
        for member in dir(self):
            method = getattr(self, member)
            if "rule_decorator" in dir(method):
                self.rule.append(method)
    def validate_incomplete(self, arg: Arg):
        for constraint in self.rule:
            if not constraint(arg):
                return False
        return True
With the rule.py file:
@rule_decorator
def test_constraint1(arg: Arg):
    return arg.last_element().total() < 29500
@rule_decorator
def test_constraint2(arg: Arg):
    return arg.last_element().total() < 35000
All functions with a rule_decorator are added to the self.rule list and checked off by the validate_incomplete function.

Rust does not have the same reflection features as Python. In particular, you cannot iterate through all functions of a module at runtime, at least not with built-in tools. It is possible to write so-called procedural macros, which let you add custom attributes to your functions, e.g. #[rule_decorator] fn foo() { ... }. With proc macros, you can do almost anything.
However, using proc macros for this is over-engineered (in my opinion). In your case, I would simply list all functions to be included in your vector:
fn test_constraint1(arg: u32) -> bool {
    arg < 29_500
}
fn test_constraint2(arg: u32) -> bool {
    arg < 35_000
}
fn main() {
    let rules = vec![test_constraint1 as fn(_) -> _, test_constraint2];
    // Or, if you already have a vector and need to add to it:
    let mut rules = Vec::new();
    rules.extend_from_slice(
        &[test_constraint1 as fn(_) -> _, test_constraint2]
    );
}
A few notes about this code:
I replaced &Arg with u32, because it doesn't have anything to do with the problem. Please omit unnecessary details from questions on StackOverflow.
I used _ in the number literals to increase readability.
The strange as fn(_) -> _ cast is sadly necessary. You can read more about it in this question.
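If the cast feels awkward, one alternative is to state the element type once on the container, so that each function item is coerced to a plain function pointer. This is only a minimal sketch, assuming the same simplified u32 signature as above; RULES and all_pass are hypothetical names introduced for illustration.

```rust
fn test_constraint1(arg: u32) -> bool {
    arg < 29_500
}

fn test_constraint2(arg: u32) -> bool {
    arg < 35_000
}

// Every function item has its own unique type. Naming the element type here
// lets the compiler coerce each item to the common type `fn(u32) -> bool`.
// (RULES is a hypothetical name, not part of the original answer.)
const RULES: &[fn(u32) -> bool] = &[test_constraint1, test_constraint2];

// Returns true only if every listed rule accepts `arg`.
fn all_pass(arg: u32) -> bool {
    RULES.iter().all(|rule| rule(arg))
}

fn main() {
    // Function pointers are Copy, so the slice can be appended to a Vec.
    let mut rules: Vec<fn(u32) -> bool> = Vec::new();
    rules.extend_from_slice(RULES);
    println!("{} rules registered, 100 passes: {}", rules.len(), all_pass(100));
}
```

The functions still have to be listed by hand, but only in one place.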

You can, with some tweaks and restrictions, achieve your goals. You'll need to use the inventory crate. This is limited to Linux, macOS and Windows at the moment.
You can then use inventory::submit to add values to a global registry, inventory::collect to build the registry, and inventory::iter to iterate over the registry.
Due to language restrictions, you cannot create a registry for values of a type that you do not own, which includes the raw function pointer. We will need to create a newtype called Predicate to use the crate:
use inventory; // 0.1.3
struct Predicate(fn(&u32) -> bool);
inventory::collect!(Predicate);
struct Rules;
impl Rules {
    fn validate_incomplete(&self, arg: &u32) -> bool {
        inventory::iter::<Predicate>
            .into_iter()
            .all(|Predicate(constraint)| constraint(arg))
    }
}
mod rules {
    use super::Predicate;
    pub fn test_constraint1(arg: &u32) -> bool {
        *arg < 29500
    }
    inventory::submit!(Predicate(test_constraint1));
    pub fn test_constraint2(arg: &u32) -> bool {
        *arg < 35000
    }
    inventory::submit!(Predicate(test_constraint2));
}
fn main() {
    if Rules.validate_incomplete(&42) {
        println!("valid");
    } else {
        println!("invalid");
    }
}
There are a few more steps you'd need to take to reach your originally-stated goal:
"a vector"
You can collect from the provided iterator to build a Vec.
"decorated functions"
You can write your own procedural macro that will call inventory::submit!(Predicate(my_function_name)); for you.
"from a specific module"
You can add the module name into the Predicate struct and filter on that later.
See also:
How can I statically register structures at compile time?
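If pulling in a crate or writing a procedural macro is more than you need, a declarative macro can get part of the way there using only the standard library. Note that this is a different technique from inventory: the sketch below assumes all rules of a module are written inside a single macro invocation, and define_rules! and ALL are hypothetical names introduced for illustration.

```rust
// Defines each function as written and also emits a constant `ALL` that lists
// every one of them, so nothing has to be pushed one by one.
macro_rules! define_rules {
    ($( fn $name:ident($arg:ident: &u32) -> bool $body:block )+) => {
        $( pub fn $name($arg: &u32) -> bool $body )+

        // One entry per function defined in this invocation.
        pub const ALL: &[fn(&u32) -> bool] = &[$( $name ),+];
    };
}

mod rules {
    define_rules! {
        fn test_constraint1(arg: &u32) -> bool {
            *arg < 29500
        }
        fn test_constraint2(arg: &u32) -> bool {
            *arg < 35000
        }
    }
}

// Evaluates every rule of the `rules` module, stopping at the first failure.
fn validate_incomplete(arg: &u32) -> bool {
    rules::ALL.iter().all(|constraint| constraint(arg))
}

fn main() {
    if validate_incomplete(&42) {
        println!("valid");
    } else {
        println!("invalid");
    }
}
```

Because ALL is emitted per invocation, the "from a specific module" part falls out naturally: each module that uses the macro gets its own list. The trade-off is that functions defined outside the macro invocation are not picked up.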

Related

Testing serialize/deserialize functions for serde "with" attribute

Serde derive macros come with the ability to control how a field is serialized/deserialized through the #[serde(with = "module")] field attribute. The "module" should have serialize and deserialize functions with the right arguments and return types.
An example that unfortunately got a bit too contrived:
use serde::{Deserialize, Serialize};
#[derive(Debug, Default, PartialEq, Eq)]
pub struct StringPair(String, String);
mod stringpair_serde {
pub fn serialize<S>(sp: &super::StringPair, ser: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
ser.serialize_str(format!("{}:{}", sp.0, sp.1).as_str())
}
pub fn deserialize<'de, D>(d: D) -> Result<super::StringPair, D::Error>
where
D: serde::Deserializer<'de>,
{
d.deserialize_str(Visitor)
}
struct Visitor;
impl<'de> serde::de::Visitor<'de> for Visitor {
type Value = super::StringPair;
fn expecting(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
write!(f, "a pair of strings separated by colon (:)")
}
fn visit_str<E>(self, s: &str) -> Result<Self::Value, E>
where
E: serde::de::Error,
{
Ok(s.split_once(":")
.map(|tup| super::StringPair(tup.0.to_string(), tup.1.to_string()))
.unwrap_or(Default::default()))
}
}
}
#[derive(Serialize, Deserialize)]
struct UsesStringPair {
// Other fields ...
#[serde(with = "stringpair_serde")]
pub stringpair: StringPair,
}
fn main() {
let usp = UsesStringPair {
stringpair: StringPair("foo".to_string(), "bar".to_string()),
};
assert_eq!(
serde_json::json!(&usp).to_string(),
r#"{"stringpair":"foo:bar"}"#
);
let usp: UsesStringPair = serde_json::from_str(r#"{"stringpair":"baz:qux"}"#).unwrap();
assert_eq!(
usp.stringpair,
StringPair("baz".to_string(), "qux".to_string())
)
}
Testing derived serialization for UsesStringPair is trivial with simple assertions. I have also looked at the serde_test example, which makes sense to me too.
However, I want to be able to independently test the stringpair_serde::{serialize, deserialize} functions (e.g. if my crate provides just mycrate::StringPair and mycrate::stringpair_serde, and UsesStringPair is for the crate users to implement).
One way I've looked into is creating a serde_json::Serializer (using new, which requires an io::Write implementation that I couldn't figure out how to create and use trivially, but that's a separate question) and calling serialize with the created Serializer, then making assertions on the result as before. However, that does not test any/all implementations of serde::Serializer, just the one provided in serde_json.
I'm wondering if there's a method like in the serde_test example that works for ser/deser functions provided by a module.

How do I format a signed integer to a sign-aware hexadecimal representation?

My initial intent was to convert a signed primitive number to its hexadecimal representation in a way that preserves the number's sign. It turns out that the current implementations of LowerHex, UpperHex and relatives for signed primitive integers will simply treat them as unsigned. Regardless of what extra formatting flags I add, these implementations appear to simply reinterpret the number as its unsigned counterpart for formatting purposes. (Playground)
println!("{:X}", 15i32); // F
println!("{:X}", -15i32); // FFFFFFF1 (expected "-F")
println!("{:X}", -0x80000000i32); // 80000000 (expected "-80000000")
println!("{:+X}", -0x80000000i32); // +80000000
println!("{:+o}", -0x8000i16); // +100000
println!("{:+b}", -0x8000i16); // +1000000000000000
The documentation in std::fmt is not clear on whether this is supposed to happen, or is even valid, and UpperHex (or any other formatting trait) does not mention that the implementations for signed integers interpret the numbers as unsigned. There seem to be no related issues on Rust's GitHub repository either. (Post-addendum notice: Starting from 1.24.0, the documentation has been improved to properly address these concerns, see issue #42860)
Ultimately, one could implement specific functions for the task (as below), with the unfortunate downside of not being very compatible with the formatter API.
fn to_signed_hex(n: i32) -> String {
if n < 0 {
format!("-{:X}", -n)
} else {
format!("{:X}", n)
}
}
assert_eq!(to_signed_hex(-15i32), "-F".to_string());
Is this behaviour for signed integer types intentional? Is there a way to do this formatting procedure while still adhering to a standard Formatter?
"Is there a way to do this formatting procedure while still adhering to a standard Formatter?"
Yes, but you need to make a newtype in order to provide a distinct implementation of UpperHex. Here's an implementation that respects the +, # and 0 flags (and possibly more, I haven't tested):
use std::fmt::{self, Formatter, UpperHex};
struct ReallySigned(i32);
impl UpperHex for ReallySigned {
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        let prefix = if f.alternate() { "0x" } else { "" };
        let bare_hex = format!("{:X}", self.0.abs());
        f.pad_integral(self.0 >= 0, prefix, &bare_hex)
    }
}
fn main() {
    for &v in &[15, -15] {
        // Compare the built-in formatting with the sign-aware newtype.
        for &v in &[&v as &dyn UpperHex, &ReallySigned(v) as &dyn UpperHex] {
            println!("Value: {:X}", v);
            println!("Value: {:08X}", v);
            println!("Value: {:+08X}", v);
            println!("Value: {:#08X}", v);
            println!("Value: {:+#08X}", v);
            println!();
        }
    }
}
This is like Francis Gagné's answer, but made generic to handle i8 through i128.
use std::fmt::{self, Formatter, UpperHex};
use num_traits::Signed;
struct ReallySigned<T: PartialOrd + Signed + UpperHex>(T);
impl<T: PartialOrd + Signed + UpperHex> UpperHex for ReallySigned<T> {
fn fmt(&self, f: &mut Formatter) -> fmt::Result {
let prefix = if f.alternate() { "0x" } else { "" };
let bare_hex = format!("{:X}", self.0.abs());
f.pad_integral(self.0 >= T::zero(), prefix, &bare_hex)
}
}
fn main() {
println!("{:#X}", -0x12345678);
println!("{:#X}", ReallySigned(-0x12345678));
}
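One edge case worth checking in the i32 version is the minimum value: the magnitude of i32::MIN does not fit in an i32, so abs() cannot return it. A possible variation, sketched here with only the standard library, takes the magnitude with unsigned_abs() instead. SignedHex is a hypothetical name chosen to avoid clashing with the types above.

```rust
use std::fmt::{self, Formatter, UpperHex};

// Hypothetical newtype; same idea as ReallySigned, specialised to i32.
struct SignedHex(i32);

impl UpperHex for SignedHex {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        let prefix = if f.alternate() { "0x" } else { "" };
        // unsigned_abs() returns a u32, so the magnitude of i32::MIN fits.
        let bare_hex = format!("{:X}", self.0.unsigned_abs());
        // pad_integral adds the sign and prefix and applies width and fill flags.
        f.pad_integral(self.0 >= 0, prefix, &bare_hex)
    }
}

fn main() {
    println!("{:X}", SignedHex(-15));
    println!("{:#X}", SignedHex(-0x12345678));
    println!("{:+08X}", SignedHex(15));
    println!("{:X}", SignedHex(i32::MIN));
}
```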

Why are the strings in my iterator being concatenated?

My original goal is to fetch a list of words, one on each line, and to put them in a HashSet, while discarding comment lines and raising I/O errors properly. Given the file "stopwords.txt":
a
# this is actually a comment
of
the
this
I managed to make the code compile like this:
fn stopword_set() -> io::Result<HashSet<String>> {
let words = Result::from_iter(
BufReader::new(File::open("stopwords.txt")?)
.lines()
.filter(|r| match r {
&Ok(ref l) => !l.starts_with('#'),
_ => true
}));
Ok(HashSet::from_iter(words))
}
fn main() {
let set = stopword_set().unwrap();
println!("{:?}", set);
assert_eq!(set.len(), 4);
}
Here's a playground that also creates the file above.
I would expect to have a set of 4 strings at the end of the program. To my surprise, the function actually returns a set containing a single string with all words concatenated:
{"aofthethis"}
thread 'main' panicked at 'assertion failed: `(left == right)` (left: `1`, right: `4`)'
Led by a piece of advice in the docs for FromIterator, I got rid of all calls to from_iter and used collect instead (Playground), which has indeed solved the problem.
fn stopword_set() -> io::Result<HashSet<String>> {
BufReader::new(File::open("stopwords.txt")?)
.lines()
.filter(|r| match r {
&Ok(ref l) => !l.starts_with('#'),
_ => true
}).collect()
}
Why are the previous calls to from_iter leading to unexpected inferences, while collect() works just as intended?
A simpler reproduction:
use std::collections::HashSet;
use std::iter::FromIterator;
fn stopword_set() -> Result<HashSet<String>, u8> {
let input: Vec<Result<_, u8>> = vec![Ok("foo".to_string()), Ok("bar".to_string())];
let words = Result::from_iter(input.into_iter());
Ok(HashSet::from_iter(words))
}
fn main() {
let set = stopword_set().unwrap();
println!("{:?}", set);
assert_eq!(set.len(), 2);
}
The problem is that here, we are collecting from the iterator twice. The type of words is Result<_, u8>. However, Result also implements IntoIterator (yielding the Ok value, if there is one), so when we call HashSet::from_iter on it at the end, the compiler sees that the Ok type must be String, because the function returns a HashSet<String>. Working backwards, you can construct a String from an iterator of Strings (by concatenating them), so that's what the compiler picks for the first from_iter.
Removing the second from_iter would solve it:
fn stopword_set() -> Result<HashSet<String>, u8> {
let input: Vec<Result<_, u8>> = vec![Ok("foo".to_string()), Ok("bar".to_string())];
Result::from_iter(input.into_iter())
}
Or for your original:
fn stopword_set() -> io::Result<HashSet<String>> {
Result::from_iter(
BufReader::new(File::open("stopwords.txt")?)
.lines()
.filter(|r| match r {
&Ok(ref l) => !l.starts_with('#'),
_ => true
}))
}
Of course, I'd normally recommend using collect instead, as I prefer the chaining:
fn stopword_set() -> io::Result<HashSet<String>> {
BufReader::new(File::open("stopwords.txt")?)
.lines()
.filter(|r| match r {
&Ok(ref l) => !l.starts_with('#'),
_ => true,
})
.collect()
}
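To make the two collection steps visible, the sketch below spells out the types that inference picked in the failing version and compares them with collecting straight into the set. It reuses the u8 error type from the simpler reproduction; the function names are hypothetical.

```rust
use std::collections::HashSet;

fn sample_input() -> Vec<Result<String, u8>> {
    vec![Ok("foo".to_string()), Ok("bar".to_string())]
}

// Step 1 of the failing version: a String can be built from an iterator of
// Strings, so collecting into Result<String, _> concatenates the items.
fn first_step() -> Result<String, u8> {
    sample_input().into_iter().collect()
}

// Step 2 of the failing version: a Result is itself iterable and yields its
// Ok value at most once, so the set ends up with a single element.
fn second_step() -> HashSet<String> {
    first_step().into_iter().collect()
}

// Collecting once, directly into the final type, keeps the items separate.
fn direct() -> Result<HashSet<String>, u8> {
    sample_input().into_iter().collect()
}

fn main() {
    println!("{:?}", first_step());
    println!("{:?}", second_step());
    println!("{:?}", direct());
}
```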

Iterate over copy types

It is clear that iterators pass around references to avoid moving objects into the iterator or its closure argument, but what about Copy types? Let me show you a small snippet:
fn is_odd(x: &&i32) -> bool { *x & 1 == 1 }
// [1] fn is_odd(x: &i32) -> bool { x & 1 == 1 }
// [2] fn is_odd(x: i32) -> bool { x & 1 == 1 }
fn main() {
let xs = &[ 10, 20, 13, 14 ];
for x in xs.iter().filter(is_odd) {
assert_eq!(13, *x);
}
// [1] ...is slightly better, but not ideal
// for x in xs.iter().cloned().filter(is_odd) {
// assert_eq!(13, x);
// }
}
Am I right that .cloned() is preferred when we iterate over something like &[i32] or &[u8], where a reference adds extra indirection compared with just copying the tiny data unit?
But it looks like I cannot avoid references being passed into the is_odd function.
Is there a way to make [2] function from above snippet work for higher-level functions like filter?
Assume that I understand that moving a non-Copy type into a predicate function is silly. But not all types use move semantics by default, right?
"It is clear that iterators pass around references"
This blanket statement is not true; iterators are more than capable of yielding non-references. filter provides a reference to the closure because it doesn't want to give ownership of the item to the closure. In your example, the iterated value is a &i32, so filter provides a &&i32.
"Is there a way to make [2] function from above snippet work for higher-level functions like filter?"
Certainly, just provide a closure that does the dereferencing:
fn is_odd(x: i32) -> bool { x & 1 == 1 }
fn main() {
let xs = &[ 10, 20, 13, 14 ];
for x in xs.iter().filter(|&&x| is_odd(x)) {
assert_eq!(13, *x);
}
}
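If you would rather have the loop variable itself be an i32, another option is to copy the items before filtering. This is a small sketch using Iterator::copied from the standard library; the closure is still needed, because filter always hands its predicate a reference to the item. odd_values is a hypothetical helper name.

```rust
fn is_odd(x: i32) -> bool {
    x & 1 == 1
}

// Copies each i32 out of the slice first, so `filter` sees `&i32` rather than
// `&&i32` and the resulting items are plain `i32` values.
fn odd_values(xs: &[i32]) -> Vec<i32> {
    xs.iter().copied().filter(|&x| is_odd(x)).collect()
}

fn main() {
    let xs = &[10, 20, 13, 14];
    for x in odd_values(xs) {
        assert_eq!(13, x);
    }
}
```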

Implementing Decodable for a wrapper around a fixed size vector

Background: the serialize crate is undocumented, and deriving Decodable doesn't work. I've also looked at existing implementations for other types and find the code difficult to follow.
How does the decoding process work, and how do I implement Decodable for this struct?
pub struct Grid<A> {
data: [[A,..GRIDW],..GRIDH]
}
The reason why #[deriving(Decodable)] doesn't work is that [A,..GRIDW] doesn't implement Decodable, and it's impossible to implement a trait for a type when both are defined outside of this crate, which is the case here. So the only solution I can see is to manually implement Decodable for Grid.
And this is as far as I've gotten
impl <A: Decodable<D, E>, D: Decoder<E>, E> Decodable<D, E> for Grid<A> {
fn decode(decoder: &mut D) -> Result<Grid<A>, E> {
decoder.read_struct("Grid", 1u, ref |d| Ok(Grid {
data: match d.read_struct_field("data", 0u, ref |d| Decodable::decode(d)) {
Ok(e) => e,
Err(e) => return Err(e)
},
}))
}
}
Which gives an error at Decodable::decode(d)
error: failed to find an implementation of trait
serialize::serialize::Decodable for [[A, .. 20], .. 20]
It's not really possible to do this nicely at the moment for a variety of reasons:
We can't be generic over the length of a fixed length array (the fundamental issue)
The current trait coherence restrictions mean we can't write a custom trait MyDecodable<D, E> { ... } with impl MyDecodable<D, E> for [A, .. GRIDW] (and one for GRIDH) and a blanket implementation impl<A: Decodable<D, E>> MyDecodable<D, E> for A. This forces a trait-based solution into using an intermediary type, which then makes the compiler's type inference rather unhappy and AFAICT impossible to satisfy.
We don't have associated types (aka "output types"), which I think would allow the type inference to be slightly sane.
Thus, for now, we're left with a manual implementation. :(
extern crate serialize;
use std::default::Default;
use serialize::{Decoder, Decodable};
static GRIDW: uint = 10;
static GRIDH: uint = 5;
fn decode_grid<E, D: Decoder<E>,
A: Copy + Default + Decodable<D, E>>(d: &mut D)
-> Result<Grid<A>, E> {
// mirror the Vec implementation: try to read a sequence
d.read_seq(|d, len| {
// check it's the required length
if len != GRIDH {
return Err(
d.error(format!("expecting length {} but found {}",
GRIDH, len).as_slice()));
}
// create the array with empty values ...
let mut array: [[A, .. GRIDW], .. GRIDH]
= [[Default::default(), .. GRIDW], .. GRIDH];
// ... and fill it in progressively ...
for (i, outer) in array.mut_iter().enumerate() {
// ... by reading each outer element ...
try!(d.read_seq_elt(i, |d| {
// ... as a sequence ...
d.read_seq(|d, len| {
// ... of the right length,
if len != GRIDW { return Err(d.error("...")) }
// and then read each element of that sequence as the
// elements of the grid.
for (j, inner) in outer.mut_iter().enumerate() {
*inner = try!(d.read_seq_elt(j, Decodable::decode));
}
Ok(())
})
}));
}
// all done successfully!
Ok(Grid { data: array })
})
}
pub struct Grid<A> {
data: [[A,..GRIDW],..GRIDH]
}
impl<E, D: Decoder<E>, A: Copy + Default + Decodable<D, E>>
Decodable<D, E> for Grid<A> {
fn decode(d: &mut D) -> Result<Grid<A>, E> {
d.read_struct("Grid", 1, |d| {
d.read_struct_field("data", 0, decode_grid)
})
}
}
fn main() {}
It's also possible to write a more "generic" [T, .. n] decoder by using macros to instantiate each version, with special control over how the recursive decoding is handled to allow nested fixed-length arrays to be handled (as required for Grid); this requires somewhat less code (especially with more layers, or a variety of different lengths), but the macro solution:
may be harder to understand, and
the one I give there may be less efficient (there's a new array variable created for every fixed length array, including new Defaults, while the non-macro solution above just uses a single array and thus only calls Default::default once for each element in the grid). It may be possible to expand to a similar set of recursive loops, but I'm not sure.
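For readers on current Rust, the first limitation listed above has since been lifted: const generics (stable since Rust 1.51) allow code to be generic over array lengths. The sketch below is not tied to the serialize crate or its Decoder. It only illustrates the length-generic part by filling a grid from nested Vecs, with the same length checks as the manual implementation above. The const parameters W and H on Grid and the helper grid_from_rows are hypothetical names introduced for illustration.

```rust
#[derive(Debug, PartialEq)]
pub struct Grid<A, const W: usize, const H: usize> {
    data: [[A; W]; H],
}

// Builds a grid from rows of values, checking both dimensions.
// Hypothetical helper: it stands in for the decoding loop, not for Decodable.
fn grid_from_rows<A: Copy + Default, const W: usize, const H: usize>(
    rows: &[Vec<A>],
) -> Result<Grid<A, W, H>, String> {
    // check the outer sequence is the required length
    if rows.len() != H {
        return Err(format!("expecting length {} but found {}", H, rows.len()));
    }
    // create the array with default values ...
    let mut data = [[A::default(); W]; H];
    // ... and fill it in row by row
    for (outer, row) in data.iter_mut().zip(rows) {
        if row.len() != W {
            return Err(format!("expecting length {} but found {}", W, row.len()));
        }
        outer.copy_from_slice(row);
    }
    Ok(Grid { data })
}

fn main() {
    let grid: Grid<u8, 3, 2> = grid_from_rows(&[vec![1, 2, 3], vec![4, 5, 6]]).unwrap();
    println!("{:?}", grid);
}
```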