I have a function that generates a salted hash digest for some data. For the salt, it uses a random u32 value. It looks something like this:
use rand::RngCore;
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
fn hash(msg: &str) -> String {
let salt = rand::thread_rng().next_u32();
let mut s = DefaultHasher::new();
s.write_u32(salt);
s.write(msg.as_bytes());
format!("{:x}{:x}", &salt, s.finish())
}
In a test, I'd like to validate that it produces expected values, given a known salt and string. How do I mock (swizzle?) rand::thread_rng().next_u32() in the test to generate a specific value? In other words, what could replace the comment in this example to make the test pass?
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_hashes() {
// XXX How to mock ThreadRng::next_u32() to return 3892864592?
assert_eq!(hash("foo"), "e80866501cdda8af09a0a656");
}
}
Some approaches I've looked at:
I'm aware that the ThreadRng returned by rand::thread_rng() implements RngCore, so in theory I could set a variable somewhere to store a reference to a RngCore, and implement my own mocked variant to set during testing. I've taken this sort of approach in Go and Java, but I couldn't get the Rust type checker to allow it.
I looked at the list of mock frameworks, such as MockAll, but they appear to be designed to mock a struct or trait to pass to a method, and this code doesn't pass one, and I wouldn't necessarily want users of the library to be able to pass in a RngCore.
Use the #[cfg(test)] attribute to call a different function specified in the tests module, then have that function read the value to return from elsewhere. I got this to work, but had to use an unsafe mutable static variable to set the value for the mocked method to find, which seems gross. Is there a better way?
As a reference, I'll post an answer using the #[cfg(test)] + unsafe mutable static variable technique, but hope there's a more straightforward way to do this sort of thing.
In the test module, use lazy-static to add a static variable with a Mutex for thread safety, create a function like next_u32() to return its value, and have tests set the static variable to a known value. It should fall back on returning a properly random number if it's not set, so here I've made it Vec<u32> so it can tell:
#[cfg(test)]
mod tests {
use super::*;
use lazy_static::lazy_static;
use std::sync::Mutex;
lazy_static! {
static ref MOCK_SALT: Mutex<Vec<u32>> = Mutex::new(vec![]);
}
// Replaces random salt generation when testing.
pub fn mock_salt() -> u32 {
let mut sd = MOCK_SALT.lock().unwrap();
if sd.is_empty() {
rand::thread_rng().next_u32()
} else {
let ret = sd[0];
sd.clear();
ret
}
}
#[test]
fn test_hashes() {
MOCK_SALT.lock().unwrap().push(3892864592);
assert_eq!(hash("foo"), "e80866501cdda8af09a0a656");
}
}
Then modify hash() to call tests::mock_salt() instead of rand::thread_rng().next_u32() when testing (the first three lines of the function body are new):
fn hash(msg: &str) -> String {
#[cfg(test)]
let salt = tests::mock_salt();
#[cfg(not(test))]
let salt = rand::thread_rng().next_u32();
let mut s = DefaultHasher::new();
s.write_u32(salt);
s.write(msg.as_bytes());
format!("{:x}{:x}", &salt, s.finish())
}
Using the cfg attributes lets Rust decide at compile time which function to call, so there's no loss of efficiency in non-test builds. It does mean that the non-test source has some knowledge of the tests module, but that module isn't included in the binary, so it should be relatively safe. I suppose a custom attribute macro could automate this somehow. Something like:
#[mock(rand::thread_rng().next_u32())]
let salt = rand::thread_rng().next_u32();
This would auto-generate the mocked function in the tests module (or elsewhere?), slot it in here, and provide functions for the tests to set the value, only when testing, of course. Seems like a lot, though.
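Another option that avoids both the static and the cfg switch is to split the function: a private helper takes the salt as a parameter and does the deterministic work, while the public hash() only picks the salt. Tests in the same crate can call the helper directly, and library users never see a salt or RngCore parameter. The following is a minimal sketch under those assumptions. The names hash_with_salt and random_salt are invented here, and random_salt uses std's RandomState purely as a stand-in so the sketch has no external dependencies; in the real code it would be rand::thread_rng().next_u32().

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::hash::{BuildHasher, Hasher};

/// Hypothetical helper: all the deterministic work, with the salt passed in.
fn hash_with_salt(msg: &str, salt: u32) -> String {
    let mut s = DefaultHasher::new();
    s.write_u32(salt);
    s.write(msg.as_bytes());
    format!("{:x}{:x}", salt, s.finish())
}

/// Stand-in for rand::thread_rng().next_u32() (assumption: std only).
fn random_salt() -> u32 {
    RandomState::new().build_hasher().finish() as u32
}

/// The public API is unchanged: callers cannot choose the salt.
pub fn hash(msg: &str) -> String {
    hash_with_salt(msg, random_salt())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn known_salt_gives_repeatable_digest() {
        let a = hash_with_salt("foo", 3892864592);
        let b = hash_with_salt("foo", 3892864592);
        assert_eq!(a, b);
        // The salt is printed first, in hex.
        assert!(a.starts_with("e8086650"));
    }
}
```

The sketch asserts only on the salt prefix because the algorithm behind DefaultHasher is not guaranteed to stay the same across Rust releases; a test that pins the full digest would need a hasher with a specified algorithm.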
In Rust the main function is defined like this:
fn main() {
}
This function does not allow for a return value though. Why would a language not allow for a return value and is there a way to return something anyway? Would I be able to safely use the C exit(int) function, or will this cause leaks and whatnot?
As of Rust 1.26, main can return a Result:
use std::fs::File;
fn main() -> Result<(), std::io::Error> {
let f = File::open("bar.txt")?;
Ok(())
}
In this case the process exits with code 1 if an error occurs. With File::open("bar.txt").expect("file not found"); instead, the resulting panic produces an exit code of 101 (at least on my machine).
Also, if you want to return a more generic error, use:
use std::error::Error;
...
fn main() -> Result<(), Box<dyn Error>> {
...
}
std::process::exit(code: i32) is the way to exit with a code.
Rust does it this way so that there is a consistent explicit interface for returning a value from a program, wherever it is set from. If main starts a series of tasks then any of these can set the return value, even if main has exited.
Rust does have a way to write a main function that returns a value, however it is normally abstracted within stdlib. See the documentation on writing an executable without stdlib for details.
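On the question of leaks: std::process::exit terminates the process immediately and, per its documentation, does not run destructors for values still on the stack. One common way around that is to do the work in a helper, let everything it owns drop on return, and only then turn the outcome into an exit status. A minimal sketch, assuming hypothetical names run, exit_code and entry_point, with the argument check standing in for real work:

```rust
use std::process::ExitCode;

/// Hypothetical application logic; everything it owns is dropped on return.
fn run(args: &[String]) -> Result<(), String> {
    if args.is_empty() {
        return Err(String::from("no input given"));
    }
    Ok(())
}

/// Map the outcome to a numeric status (0 = success, 2 = usage error here).
fn exit_code(result: &Result<(), String>) -> u8 {
    match result {
        Ok(()) => 0,
        Err(_) => 2,
    }
}

/// What main would do: `fn main() -> ExitCode { entry_point() }`.
pub fn entry_point() -> ExitCode {
    let args: Vec<String> = std::env::args().skip(1).collect();
    let result = run(&args);
    if let Err(msg) = &result {
        eprintln!("error: {msg}");
    }
    ExitCode::from(exit_code(&result))
}
```

Returning ExitCode from main needs Rust 1.61 or later. On older toolchains the last line could instead be std::process::exit(i32::from(exit_code(&result))), called after run has returned.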
As others have noted, std::process::exit(code: i32) is the way to go here.
More information about why is given in RFC 1011: Process Exit. Discussion about the RFC is in the pull request of the RFC.
The reddit thread on this has a "why" explanation:
Rust certainly could be designed to do this. It used to, in fact.
But because of the task model Rust uses, the fn main task could start a bunch of other tasks and then exit! But one of those other tasks may want to set the OS exit code after main has gone away.
Calling set_exit_status is explicit, easy, and doesn't require you to always put a 0 at the bottom of main when you otherwise don't care.
Try:
use std::process::ExitCode;
fn main() -> ExitCode {
ExitCode::from(2)
}
See the documentation for std::process::ExitCode,
or:
use std::process::{ExitCode, Termination};
pub enum LinuxExitCode { E_OK, E_ERR(u8) }
impl Termination for LinuxExitCode {
fn report(self) -> ExitCode {
match self {
LinuxExitCode::E_OK => ExitCode::SUCCESS,
LinuxExitCode::E_ERR(v) => ExitCode::from(v)
}
}
}
fn main() -> LinuxExitCode {
LinuxExitCode::E_ERR(3)
}
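In early versions of Rust you could set the return value with std::os::set_exit_status. That function no longer exists; use std::process::exit or return an ExitCode from main instead.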
I am learning functional programming using Arrow.kt, intending to walk a path hierarchy and hash every file (and do some other stuff). Forcing myself to use functional concepts as much as possible.
Assume I have a data class CustomHash(...) already defined in code. It will be referenced below.
First I need to build a sequence of files by walking the path. This is an impure/effectful function, so it should be marked as such with the IO monad:
fun getFiles(rootPath: File): IO<Sequence<File>> = IO {
rootPath.walk() // This function is of type (File)->Sequence<File>
}
I need to read the file. Again, this is impure, so it is marked with IO:
fun getRelevantFileContent(file: File): IO<Array<Byte>> {
// Assume some code here to extract only certain data relevant for my hash
}
Then I have a function to compute a hash. If it takes a byte array, then it's totally pure. Making it suspend because it will be slow to execute:
suspend fun computeHash(data: Array<Byte>): CustomHash {
// code to compute the hash
}
My issue is how to chain this all together in a functional manner.
fun main(rootPath: File) {
val x = getFiles(rootPath) // IO<Sequence<File>>
.map { seq -> // seq is of type Sequence<File>
seq.map { getRelevantFileContent(it) } // This produces Sequence<IO<Hash>>
}
}
Right now, if I try this, x is of type IO<Sequence<IO<Hash>>>. It is clear to me why this is the case.
Is there some way of turning Sequence<IO<Any>> into IO<Sequence<Any>>? I may be getting the terms wrong, but I suppose this essentially means taking blocks of code that would each execute in their own coroutine and running them all on the same coroutine instead.
If Sequence weren't there, I know IO<IO<Hash>> could have been IO<Hash> by using a flatMap in there, but Sequence of course doesn't have that flattening of IO capabilities.
Arrow's documentation has a lot of "TODO" sections and jumps very fast into documentation that presumes a lot of intermediate/advanced functional programming knowledge. It hasn't really been helpful for this problem.
First convert the Sequence to a SequenceK with k(); then you can use its sequence function to do that:
import arrow.fx.*
import arrow.core.*
import arrow.fx.extensions.io.applicative.applicative
val sequenceOfIOs: Sequence<IO<Any>> = TODO()
val ioOfSequence: IO<Sequence<Any>> = sequenceOfIOs.k()
.sequence(IO.applicative())
.fix()
libc's error handling is usually to return something < 0 in case of an error. I find myself doing this over and over:
let pid = fork();
if pid < 0 {
// Please disregard the fact that `Err(pid)`
// should be a `&str` or an enum
return Err(pid);
}
I find it ugly that this needs 3 lines of error handling, especially considering that these tests are quite frequent in this kind of code.
Is there a way to return an Err in case fork() returns < 0?
I found two things which are close:
assert_eq!. This needs another line and it panics so the caller cannot handle the error.
Using traits like these:
pub trait LibcResult<T> {
fn to_option(&self) -> Option<T>;
}
impl LibcResult<i64> for i32 {
fn to_option(&self) -> Option<i64> {
if *self < 0 { None } else { Some(i64::from(*self)) }
}
}
I could write fork().to_option().expect("could not fork"). This is now only one line, but it panics instead of returning an Err. I guess this could be solved using ok_or.
Some functions of libc have < 0 as sentinel (e.g. fork), while others use > 0 (e.g. pthread_attr_init), so this would need another argument.
Is there something out there which solves this?
As indicated in the other answer, use pre-made wrappers whenever possible. Where such wrappers do not exist, the following guidelines might help.
Return Result to indicate errors
The idiomatic Rust return type that includes error information is Result (std::result::Result). For most functions from POSIX libc, the specialized type std::io::Result is a perfect fit because it uses std::io::Error to encode errors, and it includes all standard system errors represented by errno values. A good way to avoid repetition is using a utility function such as:
use std::io::{Result, Error};
fn check_err<T: Ord + Default>(num: T) -> Result<T> {
if num < T::default() {
return Err(Error::last_os_error());
}
Ok(num)
}
Wrapping fork() would look like this:
pub fn fork() -> Result<u32> {
check_err(unsafe { libc::fork() }).map(|pid| pid as u32)
}
The use of Result allows idiomatic usage such as:
let pid = fork()?; // ? means return if Err, unwrap if Ok
if pid == 0 {
// child
...
}
Restrict the return type
The function will be easier to use if the return type is modified so that only "possible" values are included. For example, if a function logically has no return value, but returns an int only to communicate the presence of error, the Rust wrapper should return nothing:
pub fn dup2(oldfd: i32, newfd: i32) -> Result<()> {
check_err(unsafe { libc::dup2(oldfd, newfd) })?;
Ok(())
}
Another example is functions that logically return an unsigned integer, such as a PID or a file descriptor, but still declare their result as signed to include the -1 error return value. In that case, consider returning an unsigned value in Rust, as in the fork() example above. nix takes this one step further by having fork() return Result<ForkResult>, where ForkResult is a real enum with methods such as is_child(), and from which the PID is extracted using pattern matching.
Use options and other enums
Rust has a rich type system that allows expressing things that have to be encoded as magic values in C. To return to the fork() example, that function returns 0 to indicate the child return. This would be naturally expressed with an Option and can be combined with the Result shown above:
pub fn fork() -> Result<Option<u32>> {
let pid = check_err(unsafe { libc::fork() })? as u32;
// 0 is returned in the child; any other value is the child's PID, seen by the parent.
Ok(if pid != 0 { Some(pid) } else { None })
}
The user of this API would no longer need to compare with the magic value, but would use pattern matching, for example:
if let Some(child_pid) = fork()? {
// execute parent code
} else {
// execute child code
}
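The same idea extends from Option to a dedicated enum, which names both cases at the call site. The following is a rough sketch of that shape only: ForkOutcome and classify_fork are hypothetical names modeled on the idea described above, not nix's actual types, and the raw value is passed in as a parameter so the sketch does not depend on the libc crate.

```rust
use std::io::{Error, Result};

/// Hypothetical enum; both outcomes of a successful fork are named.
#[derive(Debug, PartialEq, Eq)]
pub enum ForkOutcome {
    Parent { child: u32 },
    Child,
}

/// Classify a raw fork()-style return value.
pub fn classify_fork(raw: i32) -> Result<ForkOutcome> {
    match raw {
        // Negative: the call failed and errno holds the reason.
        n if n < 0 => Err(Error::last_os_error()),
        // Zero: we are running in the child.
        0 => Ok(ForkOutcome::Child),
        // Positive: we are the parent and n is the child's PID.
        n => Ok(ForkOutcome::Parent { child: n as u32 }),
    }
}
```

With libc available, a caller would write something like match classify_fork(unsafe { libc::fork() })? { ForkOutcome::Parent { child } => ..., ForkOutcome::Child => ... }, and the compiler checks that both arms are handled.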
Return values instead of using output parameters
C often returns values using output parameters, pointer parameters into which the results are stored. This is either because the actual return value is reserved for the error indicator, or because more than one value needs to be returned, and returning structs was badly supported by historical C compilers.
In contrast, Rust's Result supports return value independent of error information, and has no problem whatsoever with returning multiple values. Multiple values returned as a tuple are much more ergonomic than output parameters because they can be used in expressions or captured using pattern matching.
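As an illustration of the pattern, here is a minimal sketch that wraps a C-style function with two output parameters. c_divmod is a hypothetical stand-in written in Rust so the example is self-contained; a real wrapper would call into libc or another C library instead.

```rust
use std::io::{Error, ErrorKind, Result};

/// Hypothetical C-style function: status in the return value,
/// results written through output pointers.
unsafe extern "C" fn c_divmod(a: i32, b: i32, quot: *mut i32, rem: *mut i32) -> i32 {
    if b == 0 {
        return -1;
    }
    unsafe {
        *quot = a / b;
        *rem = a % b;
    }
    0
}

/// Rust wrapper: both values come back as a tuple inside the Result.
pub fn divmod(a: i32, b: i32) -> Result<(i32, i32)> {
    let (mut quot, mut rem) = (0, 0);
    // SAFETY: both pointers refer to live, writable locals.
    let status = unsafe { c_divmod(a, b, &mut quot, &mut rem) };
    if status < 0 {
        return Err(Error::new(ErrorKind::InvalidInput, "division by zero"));
    }
    Ok((quot, rem))
}
```

A caller can then destructure in one line, for example let (q, r) = divmod(7, 2)?;, with no temporaries to declare beforehand.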
Wrap system resources in owned objects
When returning handles to system resources, such as file descriptors or Windows handles, it is good practice to return them wrapped in an object that implements Drop to release them. This will make it less likely that a user of the wrapper will make a mistake, and it makes the use of return values more idiomatic, removing the need for awkward invocations of close() and resource leaks coming from failing to do so.
Taking pipe() as an example:
use std::fs::File;
use std::os::unix::io::FromRawFd;
pub fn pipe() -> Result<(File, File)> {
let mut fds = [0 as libc::c_int; 2];
check_err(unsafe { libc::pipe(fds.as_mut_ptr()) })?;
Ok(unsafe { (File::from_raw_fd(fds[0]), File::from_raw_fd(fds[1])) })
}
// Usage:
// let (r, w) = pipe()?;
// ... use r and w as normal File objects
This pipe() wrapper returns multiple values and uses a wrapper object to refer to a system resource. Also, it returns the File objects defined in the Rust standard library and accepted by Rust's IO layer.
The best option is to not reimplement the universe. Instead, use nix, which wraps everything for you and has done the hard work of converting all the error types and handling the sentinel values:
pub fn fork() -> Result<ForkResult>
Then just use normal error handling with ? (or the older try! macro, which is now deprecated).
Of course, you could rewrite all of nix by converting your trait to returning Results and including the specific error codes and then use try! or ?, but why would you?
There's nothing magical in Rust that converts negative or positive numbers into a domain specific error type for you. The code you already have is the correct approach, once you've enhanced it to use a Result either by creating it directly or via something like ok_or.
An intermediate solution would be to reuse nix's Errno struct, perhaps with your own trait sugar on top.
"so this would need another argument"
I'd say it would be better to have different methods: one for negative sentinel values and one for positive sentinel values.
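A rough sketch of what two such methods could look like, assuming a hypothetical extension trait CResult on i32 (the trait and method names are invented for illustration). The second method relies on the convention that pthread-style functions return the error number directly rather than setting errno.

```rust
use std::io::{Error, Result};

/// Hypothetical extension trait for raw C return values.
pub trait CResult: Sized {
    /// For calls like fork(): negative means failure, details are in errno.
    fn negative_is_err(self) -> Result<Self>;
    /// For calls like pthread_attr_init(): the return value is itself
    /// the error number, and 0 means success.
    fn nonzero_is_err(self) -> Result<()>;
}

impl CResult for i32 {
    fn negative_is_err(self) -> Result<i32> {
        if self < 0 {
            Err(Error::last_os_error())
        } else {
            Ok(self)
        }
    }

    fn nonzero_is_err(self) -> Result<()> {
        if self != 0 {
            Err(Error::from_raw_os_error(self))
        } else {
            Ok(())
        }
    }
}
```

With the libc crate in scope, the three lines from the question would shrink to let pid = unsafe { libc::fork() }.negative_is_err()?;.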
When writing a test, I would like to know how many times a function is called, since bad logic may yield a correct result even when excessive and unnecessary function calls are performed.
To give some context, this is a tree-search function running a test on a fixed data set, however that isn't important to the answer.
I'm currently using a static mutable variable, however this means every access needs to be marked as unsafe:
#[cfg(test)]
static mut total_calls: usize = 0;
fn function_to_count() {
#[cfg(test)]
unsafe {
total_calls += 1;
}
// do stuff
}
#[test]
fn some_test() {
// do stuff, indirectly call function_to_count().
assert!(unsafe { total_calls } < 100);
}
It would be good to avoid having to put unsafe into the code.
Is there a better way to count indirect function calls in Rust?
Mutable statics are unsafe because they're global, and could be accessed from any thread at any time. The simplest solution is to change the definition of the function in question to take some kind of "counter" interface that keeps track of calls. You can avoid performance problems by using generics plus a "dummy" implementation that does nothing.
// Use a callable because I'm feeling lazy.
fn function_to_count<Count: FnMut()>(count: &mut Count) {
count();
// ...
}
#[cfg(test)]
#[test]
fn some_test() {
let mut count = 0;
for _ in 0..10 {
function_to_count(&mut || count += 1);
}
assert_eq!(count, 10);
}
You should really, seriously do that, and not what I'm about to describe:
The other solution is to use a thread-safe construct.
A word of warning: do not use this if you have more than one test! The test runner will, by default, run tests in parallel. As such, if you have more than one test calling into the instrumented function, you will get corrupt results. You'd have to write some kind of exclusive locking mechanism and somehow teach the function to "know" which run it's a part of, and at that point, you should just use the previously described solution instead. You could also disable parallel tests, but I believe you can only do that from outside the code, and that's just asking for someone to forget and run into weird failures as a result.
But anyway...
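#[cfg(test)]
use std::sync::atomic::{AtomicUsize, Ordering};
#[cfg(test)]
static TOTAL_CALLS: AtomicUsize = AtomicUsize::new(0);
fn function_to_count() {
// Use the attribute, not cfg!(test): the macro only yields a bool, so its
// branch would still be compiled in non-test builds, where TOTAL_CALLS does not exist.
#[cfg(test)]
TOTAL_CALLS.fetch_add(1, Ordering::SeqCst);
// ...
}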
#[cfg(test)]
#[test]
fn some_test() {
for _ in 0..10 {
function_to_count();
}
assert_eq!(TOTAL_CALLS.load(Ordering::SeqCst), 10);
}
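If changing the function signature is not an option, a thread-local counter is a possible middle ground: each thread gets its own count, so tests running in parallel on separate threads (the test harness's default) do not disturb one another. This is a minimal sketch, not a drop-in replacement. CALLS and take_calls are names invented here, it only sees calls made on the calling thread, and in real code the counter and the increment would probably be gated with #[cfg(test)].

```rust
use std::cell::Cell;

thread_local! {
    // One independent counter per thread.
    static CALLS: Cell<usize> = Cell::new(0);
}

fn function_to_count() {
    CALLS.with(|c| c.set(c.get() + 1));
    // ...
}

/// Return the current thread's count and reset it to zero.
fn take_calls() -> usize {
    CALLS.with(|c| c.replace(0))
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn counts_calls_on_this_thread() {
        take_calls(); // start from zero even if the thread was reused
        for _ in 0..10 {
            function_to_count();
        }
        assert_eq!(take_calls(), 10);
    }
}
```

If the function under test spawns worker threads, their calls land in other counters and are missed, in which case the generic counter parameter described first is the safer choice.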
I'm writing some unit tests, and I'm stuck writing a test for the following method:
func (database *Database) FindUnusedKey() string {
count := 0
possibleKey := helpers.RandomString(helpers.Config.KeySize)
for database.DoesKeyExist(possibleKey) {
possibleKey = helpers.RandomString(helpers.Config.KeySize + uint8(count/10))
count++
}
return possibleKey
}
I want a test in which helpers.RandomString(int) returns a string that is already a key in my database, but I've found no way to redeclare or monkey patch helpers.RandomString(int) in my test.
I tried using testify mock, but it doesn't seem possible.
Am I doing something wrong?
Thanks.
You can extract database.DoesKeyExist via dependency injection and, in unit tests, supply a function that returns true the first time it is called. More details: http://openmymind.net/Dependency-Injection-In-Go/