What is the best way to create and use a struct with only one instantiation in the system? Yes, this is necessary: it is the OpenGL subsystem, and making multiple copies of it and passing it around everywhere would add confusion rather than relieve it.
The singleton needs to be as efficient as possible. It doesn't seem possible to store an arbitrary object in the static area, as it contains a Vec with a destructor. The second option is to store an (unsafe) pointer in the static area, pointing to a heap-allocated singleton. What is the most convenient and safest way to do this while keeping the syntax terse?
Non-answer answer
Avoid global state in general. Instead, construct the object somewhere early (perhaps in main), then pass mutable references to that object into the places that need it. This will usually make your code easier to reason about and doesn't require as much bending over backwards.
Look hard at yourself in the mirror before deciding that you want global mutable variables. There are rare cases where it's useful, so that's why it's worth knowing how to do.
Still want to make one...?
Tips
In the three solutions that follow:
If you remove the Mutex then you have a global singleton without any mutability.
You can also use a RwLock instead of a Mutex to allow multiple concurrent readers.
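For example, here is a minimal sketch of the RwLock variant, using the std LazyLock shown in the last solution below:
use std::sync::{LazyLock, RwLock};

static ARRAY: LazyLock<RwLock<Vec<u8>>> = LazyLock::new(|| RwLock::new(vec![]));

fn main() {
    ARRAY.write().unwrap().push(1); // a writer takes an exclusive lock...
    let r1 = ARRAY.read().unwrap(); // ...but any number of read guards
    let r2 = ARRAY.read().unwrap(); // may be held at the same time
    println!("{} {}", r1.len(), r2.len());
}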
Using lazy-static
The lazy-static crate can take away some of the drudgery of manually creating a singleton. Here is a global mutable vector:
use lazy_static::lazy_static; // 1.4.0
use std::sync::Mutex;
lazy_static! {
static ref ARRAY: Mutex<Vec<u8>> = Mutex::new(vec![]);
}
fn do_a_call() {
ARRAY.lock().unwrap().push(1);
}
fn main() {
do_a_call();
do_a_call();
do_a_call();
println!("called {}", ARRAY.lock().unwrap().len());
}
Using once_cell
The once_cell crate can take away some of the drudgery of manually creating a singleton. Here is a global mutable vector:
use once_cell::sync::Lazy; // 1.3.1
use std::sync::Mutex;
static ARRAY: Lazy<Mutex<Vec<u8>>> = Lazy::new(|| Mutex::new(vec![]));
fn do_a_call() {
ARRAY.lock().unwrap().push(1);
}
fn main() {
do_a_call();
do_a_call();
do_a_call();
println!("called {}", ARRAY.lock().unwrap().len());
}
Using std::sync::LazyLock
The standard library has adopted once_cell's functionality; it is available as LazyLock and has been stable since Rust 1.80:
use std::sync::{LazyLock, Mutex}; // stable as of 1.80.0
static ARRAY: LazyLock<Mutex<Vec<u8>>> = LazyLock::new(|| Mutex::new(vec![]));
fn do_a_call() {
ARRAY.lock().unwrap().push(1);
}
fn main() {
do_a_call();
do_a_call();
do_a_call();
println!("called {}", ARRAY.lock().unwrap().len());
}
A special case: atomics
If you only need to track an integer value, you can directly use an atomic:
use std::sync::atomic::{AtomicUsize, Ordering};
static CALL_COUNT: AtomicUsize = AtomicUsize::new(0);
fn do_a_call() {
CALL_COUNT.fetch_add(1, Ordering::SeqCst);
}
fn main() {
do_a_call();
do_a_call();
do_a_call();
println!("called {}", CALL_COUNT.load(Ordering::SeqCst));
}
Manual, dependency-free implementation
There are several existing implementations of statics, such as the Rust 1.0 implementation of stdin. This is the same idea adapted to modern Rust, such as the use of MaybeUninit to avoid allocations and unnecessary indirection. You should also look at the modern implementation of io::Lazy. I've commented inline with what each line does.
use std::sync::{Mutex, Once};
use std::time::Duration;
use std::{mem::MaybeUninit, thread};
struct SingletonReader {
// Since we will be used in many threads, we need to protect
// concurrent access
inner: Mutex<u8>,
}
fn singleton() -> &'static SingletonReader {
// Create an uninitialized static
static mut SINGLETON: MaybeUninit<SingletonReader> = MaybeUninit::uninit();
static ONCE: Once = Once::new();
unsafe {
ONCE.call_once(|| {
// Make it
let singleton = SingletonReader {
inner: Mutex::new(0),
};
// Store it to the static var, i.e. initialize it
SINGLETON.write(singleton);
});
// Now we give out a shared reference to the data, which is safe to use
// concurrently.
SINGLETON.assume_init_ref()
}
}
fn main() {
// Let's use the singleton in a few threads
let threads: Vec<_> = (0..10)
.map(|i| {
thread::spawn(move || {
thread::sleep(Duration::from_millis(i * 10));
let s = singleton();
let mut data = s.inner.lock().unwrap();
*data = i as u8;
})
})
.collect();
// And let's check the singleton every so often
for _ in 0u8..20 {
thread::sleep(Duration::from_millis(5));
let s = singleton();
let data = s.inner.lock().unwrap();
println!("It is: {}", *data);
}
for thread in threads.into_iter() {
thread.join().unwrap();
}
}
This prints out:
It is: 0
It is: 1
It is: 1
It is: 2
It is: 2
It is: 3
It is: 3
It is: 4
It is: 4
It is: 5
It is: 5
It is: 6
It is: 6
It is: 7
It is: 7
It is: 8
It is: 8
It is: 9
It is: 9
It is: 9
This code compiles with Rust 1.55.0.
All of this work is what lazy-static or once_cell do for you.
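Since Rust 1.70, std::sync::OnceLock wraps up this same Once-plus-MaybeUninit dance in the standard library. A minimal sketch of the singleton function above, rewritten on top of it:
use std::sync::{Mutex, OnceLock};

fn singleton() -> &'static SingletonReader {
    static SINGLETON: OnceLock<SingletonReader> = OnceLock::new();
    // get_or_init runs the closure at most once, like ONCE.call_once above
    SINGLETON.get_or_init(|| SingletonReader {
        inner: Mutex::new(0),
    })
}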
The meaning of "global"
Please note that you can still use normal Rust scoping and module-level privacy to control access to a static or lazy_static variable. This means that you can declare it in a module or even inside of a function and it won't be accessible outside of that module / function. This is good for controlling access:
use lazy_static::lazy_static; // 1.2.0
fn only_here() {
lazy_static! {
static ref NAME: String = String::from("hello, world!");
}
println!("{}", &*NAME);
}
fn not_here() {
println!("{}", &*NAME);
}
error[E0425]: cannot find value `NAME` in this scope
--> src/lib.rs:12:22
|
12 | println!("{}", &*NAME);
| ^^^^ not found in this scope
However, the variable is still global in that there's one instance of it that exists across the entire program.
Starting with Rust 1.63, it can be easier to work with global mutable singletons, although it's still preferable to avoid global variables in most cases.
Now that Mutex::new is const, you can use global static Mutex locks without needing lazy initialization:
use std::sync::Mutex;
static GLOBAL_DATA: Mutex<Vec<i32>> = Mutex::new(Vec::new());
fn main() {
GLOBAL_DATA.lock().unwrap().push(42);
println!("{:?}", GLOBAL_DATA.lock().unwrap());
}
Note that this also depends on the fact that Vec::new is const. If you need to use non-const functions to set up your singleton, you could wrap your data in an Option, and initially set it to None. This lets you use data structures like HashSet, which currently cannot be constructed in a const context:
use std::sync::Mutex;
use std::collections::HashSet;
static GLOBAL_DATA: Mutex<Option<HashSet<i32>>> = Mutex::new(None);
fn main() {
*GLOBAL_DATA.lock().unwrap() = Some(HashSet::from([42]));
println!("V2: {:?}", GLOBAL_DATA.lock().unwrap());
}
Alternatively, you could use an RwLock, instead of a Mutex, since RwLock::new is also const as of Rust 1.63. This would make it possible to read the data from multiple threads simultaneously.
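For instance, a minimal sketch of the same global using a const-initialized RwLock:
use std::sync::RwLock;

static GLOBAL_DATA: RwLock<Vec<i32>> = RwLock::new(Vec::new()); // RwLock::new is const since 1.63

fn main() {
    GLOBAL_DATA.write().unwrap().push(42); // writers take an exclusive lock
    println!("{:?}", GLOBAL_DATA.read().unwrap()); // readers can run concurrently
}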
If you need to initialize with non-const functions and you'd prefer not to use an Option, you could use a crate like once_cell or lazy-static for lazy initialization as explained in Shepmaster's answer.
From What Not To Do In Rust
To recap: instead of using interior mutability where an object changes
its internal state, consider using a pattern where you promote new
state to be current and current consumers of the old state will
continue to hold on to it by putting an Arc into an RwLock.
use std::sync::{Arc, RwLock};
#[derive(Default)]
struct Config {
pub debug_mode: bool,
}
impl Config {
pub fn current() -> Arc<Config> {
CURRENT_CONFIG.with(|c| c.read().unwrap().clone())
}
pub fn make_current(self) {
CURRENT_CONFIG.with(|c| *c.write().unwrap() = Arc::new(self))
}
}
thread_local! {
static CURRENT_CONFIG: RwLock<Arc<Config>> = RwLock::new(Default::default());
}
fn main() {
Config { debug_mode: true }.make_current();
if Config::current().debug_mode {
// do something
}
}
Use SpinLock for global access.
use std::collections::HashMap;
// Assumptions: SpinLock is a const-constructible spinlock type (e.g. spin::Mutex),
// and Tls is the per-thread data type being registered.
#[derive(Default)]
struct ThreadRegistry {
pub enabled_for_new_threads: bool,
threads: Option<HashMap<u32, *const Tls>>,
}
impl ThreadRegistry {
fn threads(&mut self) -> &mut HashMap<u32, *const Tls> {
self.threads.get_or_insert_with(HashMap::new)
}
}
// Default::default() is not const, so initialize the static with a literal:
static THREAD_REGISTRY: SpinLock<ThreadRegistry> = SpinLock::new(ThreadRegistry {
enabled_for_new_threads: false,
threads: None,
});
fn func_1() {
let thread_registry = THREAD_REGISTRY.lock(); // Read-only access
if thread_registry.enabled_for_new_threads {
}
}
fn func_2() {
let mut thread_registry = THREAD_REGISTRY.lock(); // Mutable access
thread_registry.threads().insert(
// ...
);
}
If you want mutable state (not a singleton), see What Not To Do In Rust for more details.
Hope it's helpful.
You can also use LazyLock. It does more or less what the once_cell and lazy_static crates do. Those two crates are very common, so there's a good chance they're already in your Cargo.lock dependency tree; but if you prefer to stick to the standard library, LazyLock has been stable since Rust 1.80.
(Note: std::sync::LazyLock used to be named std::lazy::SyncLazy but was renamed before stabilization.)
A bit late to the party, but here's how I worked around this issue (rust 1.66-nightly):
#![feature(const_size_of_val)]
#![feature(const_ptr_write)]
use std::mem::{self, MaybeUninit};
use std::ptr::write_bytes;
static mut GLOBAL_LAZY_MUT: StructThatIsNotSyncNorSend = unsafe {
// Copied from MaybeUninit::zeroed() with minor modifications, see below
let mut u = MaybeUninit::uninit();
let bytes = mem::size_of_val(&u);
// Trick the compiler check that verifies pointers and references are not null.
write_bytes(u.as_ptr() as *const u8 as *mut u8, 0xA5, bytes);
u.assume_init()
};
(...)
fn main() {
unsafe {
let mut v = StructThatIsNotSyncNorSend::new();
mem::swap(&mut GLOBAL_LAZY_MUT, &mut v);
mem::forget(v);
}
}
Beware that this code is unbelievably unsafe, and can easily end up being UB if not handled correctly.
You now have a !Send !Sync value as a global static, without the protection of a Mutex. If you access it from multiple threads, even just for reading, it's UB. If you don't initialize it the way shown, it's UB, because Drop ends up being called on an actually uninitialized value.
You've just convinced the Rust compiler that something that is UB is not UB, and that putting a !Sync and !Send value in a global static is fine.
If unsure, don't use this snippet.
My limited solution is to define a singleton struct rather than a global mutable variable. To use that struct, external code needs to call init(), but we disallow calling init() more than once by using an AtomicBool (for multithreaded usage).
use std::sync::atomic::{AtomicBool, Ordering};
static INITIATED: AtomicBool = AtomicBool::new(false);
struct Singleton {
...
}
impl Singleton {
pub fn init() -> Self {
// swap atomically sets the flag and returns the previous value, so two
// racing calls cannot both observe `false` (a separate load-then-store could).
if INITIATED.swap(true, Ordering::Relaxed) {
panic!("Cannot initiate more than once")
}
Singleton {
...
}
}
}
fn main() {
let singleton = Singleton::init();
// panic here
// let another_one = Singleton::init();
...
}
I'm trying to provide co-routine support for non WinRT types that will asynchronously execute on the Windows thread pool.
cppcoro and libunifex both provide the equivalent coroutine task type task<T>. I imagine it would be a good idea to run these tasks on the Windows thread pool, but maybe I'm just being silly and should just use one of the thread pools provided by these libraries.
On the other hand, I was hoping to see how cppwinrt handles this, but I can't penetrate the definition of IAsyncOperation:
template <typename TResult>
struct __declspec(empty_bases) IAsyncOperation :
winrt::Windows::Foundation::IInspectable,
impl::consume_t<winrt::Windows::Foundation::IAsyncOperation<TResult>>,
impl::require<winrt::Windows::Foundation::IAsyncOperation<TResult>, winrt::Windows::Foundation::IAsyncInfo>
{
static_assert(impl::has_category_v<TResult>, "TResult must be WinRT type.");
IAsyncOperation(std::nullptr_t = nullptr) noexcept {}
IAsyncOperation(void* ptr, take_ownership_from_abi_t) noexcept : winrt::Windows::Foundation::IInspectable(ptr, take_ownership_from_abi) {}
};
Kenny Kerr did a talk at CppCon 2016 on this, so maybe I'll try watching that and then try grepping the source code/generated headers.
The hardest part about learning to implement coroutines is disambiguating awaitable types from coroutine promise types.
concept awaitable:
await_ready() -> bool
await_suspend(coroutine_handle<> handle) -> void
await_resume() -> void
concept promise_type:
get_return_object() -> coroutine_return_type<T>
return_value(T const& v) -> void
unhandled_exception() -> void
initial_suspend() -> awaitable
final_suspend() noexcept -> awaitable
For now, I only need to poll if the object returned by my coroutine is ready (rather than co_await it), but I need to be able to await on winrt::resume_on_signal in the coroutine. cppwinrt does the work implementing the awaitable type in winrt::resume_on_signal and I need to do the work implementing the promise type returned by the coroutine I call winrt::resume_on_signal in.
The simplest possible type you could return from a coroutine might be std::shared_ptr<std::optional<T>>, and specializing coroutine_traits allows you to do this:
namespace std::experimental
{
template <typename T, typename... Args>
struct coroutine_traits<std::shared_ptr<std::optional<T>>, Args...>
{
struct promise_type
{
using result_type = std::shared_ptr<std::optional<T>>;
result_type result = std::make_shared<std::optional<T>>(std::nullopt);
result_type get_return_object() { return result; }
void return_value(T const& v) { *result = std::optional<T>(v); }
void unhandled_exception() { assert(0); }
std::experimental::suspend_never initial_suspend() { return{}; }
std::experimental::suspend_never final_suspend() noexcept { return{}; }
};
};
}
which you can use as
std::shared_ptr<std::optional<int>> my_coroutine2(HANDLE h)
{
co_await winrt::resume_on_signal(h);
co_return int(1);
}
...
auto shared_optional = my_coroutine2(h);
if (*shared_optional)
{
auto result = **shared_optional;
}
However, there are caveats. You will create a race condition if you also make this an awaitable type (see Raymond Chen's Old New Thing post on the subject), although I don't believe you could even make it awaitable, because you'd need to be able to call a continuation when the value is set. And it doesn't handle exceptions.
Edit: In hindsight, I could have just used CreateThreadpoolWait and never bothered with coroutines.
The performance of concurrency::task is very good, so that may have been a better solution to my problem, but I'm familiar with cppwinrt coroutines and they provide a very convenient winrt::resume_on_signal which I will use.
If you want to learn how to implement coroutine promise types I recommend watching James McNellis' talk.
For how cppwinrt implements their asynchrony with coroutines watch https://www.youtube.com/watch?v=v0SjumbIips (they do it by specializing coroutine_traits)
template <typename... Args>
struct coroutine_traits<winrt::Windows::Foundation::IAsyncAction, Args...> {...}
But I just followed Raymond Chen's advice and wrote a wrapper around IInspectable
template<typename T>
struct IInspectableWrapper : winrt::implements<IInspectableWrapper<T>, winrt::Windows::Foundation::IInspectable>
{
IInspectableWrapper(T state) : state_(state) {}
static winrt::Windows::Foundation::IInspectable box(T state)
{
return winrt::make<IInspectableWrapper>(state);
}
static T unbox(const winrt::Windows::Foundation::IInspectable& inspectable)
{
return winrt::get_self<IInspectableWrapper>(inspectable)->state_;
}
private:
T state_;
};
Which can be used
winrt::Windows::Foundation::IAsyncOperation<IInspectable> my_coroutine()
{
co_return IInspectableWrapper<int>::box(1);
}
...
auto boxed_result = co_await my_coroutine();
auto result = IInspectableWrapper<int>::unbox(boxed_result);
Although the unboxing is unsafe.
Also, you could just pass an out parameter to your coroutine and return an IAsyncAction:
winrt::Windows::Foundation::IAsyncAction my_coroutine3(std::weak_ptr<int> weak_i)
{
if(auto shared_i = weak_i.lock())
*shared_i = 1;
co_return; // at least one co_ keyword is needed to make this a coroutine
}
...
auto i = std::make_shared<int>();
auto action = my_coroutine3(i);
auto future = std::tuple{std::move(i), std::move(action)};
I'm trying to understand how to use the failure crate. It works splendidly as a unification of different types of standard errors, but when creating custom errors (Fails), I do not understand how to match for custom errors. For example:
use failure::{Fail, Error};
#[derive(Debug, Fail)]
pub enum Badness {
#[fail(display = "Ze badness")]
Level(String)
}
pub fn do_badly() -> Result<(), Error> {
Err(Badness::Level("much".to_owned()).into())
}
#[test]
pub fn get_badness() {
match do_badly() {
Err(Badness::Level(level)) => panic!("{:?} badness!", level),
_ => (),
};
}
fails with
error[E0308]: mismatched types
--> barsa-nagios-forwarder/src/main.rs:74:9
|
73 | match do_badly() {
| ---------- this match expression has type `failure::Error`
74 | Err(Badness::Level(level)) => panic!("{:?} badness!", level),
| ^^^^^^^^^^^^^^^^^^^^^ expected struct `failure::Error`, found enum `Badness`
|
= note: expected type `failure::Error`
found type `Badness`
How can I formulate a pattern which matches a specific custom error?
You need to downcast the Error
When you create a failure::Error from some type that implements the Fail trait (via from or into, as you do), you temporarily hide the information about the type you're wrapping from the compiler. It doesn't know that the Error is a Badness, because it could also be any other Fail type; that's the point. You need to remind the compiler of this; the action is called downcasting. failure::Error has three methods for this: downcast, downcast_ref and downcast_mut. After you've downcast it, you can pattern match on the result as normal, but you need to take into account the possibility that downcasting itself may fail (if you try to downcast to the wrong type).
Here's how it'd look with downcast:
pub fn get_badness() {
if let Err(wrapped_error) = do_badly() {
if let Ok(bad) = wrapped_error.downcast::<Badness>() {
panic!("{:?} badness!", bad);
}
}
}
(two if lets can be combined in this case).
This quickly gets very unpleasant if more than one error type needs to be tested, since downcast consumes the failure::Error it was called on (so you can't try another downcast on the same variable if the first one fails). I sadly couldn't figure out an elegant way to do this. Here's a variant one shouldn't really use (panic! in map is questionable, and doing anything else there would be plenty awkward, and I don't even want to think about more cases than two):
#[derive(Debug, Fail)]
pub enum JustSoSo {
#[fail(display = "meh")]
Average,
}
pub fn get_badness() {
if let Err(wrapped_error) = do_badly() {
let e = wrapped_error.downcast::<Badness>()
.map(|bad| panic!("{:?} badness!", bad))
.or_else(|original| original.downcast::<JustSoSo>());
if let Ok(so) = e {
println!("{}", so);
}
}
}
An or_else chain should work OK if you actually want to produce some value of the same type from all of the possible/relevant errors. Consider also using the non-consuming methods if a reference to the original error is fine for you, as this would allow you to write a series of if let blocks, one for each downcast attempt, as in the sketch below.
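Here is a sketch of that non-consuming version, reusing the Badness and JustSoSo types from above:
pub fn get_badness() {
    if let Err(wrapped_error) = do_badly() {
        // downcast_ref only borrows the error, so several attempts can
        // follow each other without consuming it
        if let Some(bad) = wrapped_error.downcast_ref::<Badness>() {
            panic!("{:?} badness!", bad);
        }
        if let Some(so) = wrapped_error.downcast_ref::<JustSoSo>() {
            println!("{}", so);
        }
    }
}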
An alternative
Don't put your errors into failure::Error; put them in a custom enum as variants. It's more boilerplate, but you get painless pattern matching, which the compiler will also be able to check for exhaustiveness. If you choose to do this, I'd recommend the derive_more crate, which is capable of deriving From for such enums; snafu looks very interesting as well, but I have yet to try it. In its most basic form this approach looks like this:
pub enum SomeError {
Bad(Badness),
NotTooBad(JustSoSo),
}
pub fn do_badly_alt() -> Result<(), SomeError> {
Err(SomeError::Bad(Badness::Level("much".to_owned())))
}
pub fn get_badness_alt() {
if let Err(wrapper) = do_badly_alt() {
match wrapper {
SomeError::Bad(bad) => panic!("{:?} badness!", bad),
SomeError::NotTooBad(so) => println!("{}", so),
}
}
}
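For completeness, here is roughly what the derive_more variant might look like (the version comment is illustrative). The #[derive(From)] line is the only addition, and it makes the .into() style from the original do_badly work again:
use derive_more::From; // 0.99

#[derive(From)]
pub enum SomeError {
    Bad(Badness),
    NotTooBad(JustSoSo),
}

pub fn do_badly_alt() -> Result<(), SomeError> {
    // From<Badness> for SomeError is derived, so no manual wrapping is needed
    Err(Badness::Level("much".to_owned()).into())
}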
Consider the pattern where there are several states registered with a dispatcher and each state knows what state to transition to when it receives an appropriate event. This is a simple state transition pattern.
struct Dispatcher {
states: HashMap<Uid, Rc<RefCell<State>>>,
}
impl Dispatcher {
pub fn insert_state(&mut self, state_id: Uid, state: Rc<RefCell<State>>) -> Option<Rc<RefCell<State>>> {
self.states.insert(state_id, state)
}
fn dispatch(&mut self, state_id: Uid, event: Event) {
if let Some(state) = self.states.get(&state_id).cloned() {
state.borrow_mut().handle_event(self, event);
}
}
}
trait State {
fn handle_event(&mut self, dispatcher: &mut Dispatcher, event: Event);
}
struct S0 {
state_id: Uid,
move_only_field: Option<MOF>,
// This is pattern that concerns me.
}
impl State for S0 {
fn handle_event(&mut self, dispatcher: &mut Dispatcher, event: Event) {
if event == Event::SomeEvent {
// Do some work
if let Some(mof) = self.move_only_field.take() {
let next_state = Rc::new(RefCell::new(S0 {
state_id: self.state_id,
move_only_field: Some(mof),
}));
let _ = dispatcher.insert_state(self.state_id, next_state);
} else {
// log an error: BUGGY Logic somewhere
let _ = dispatcher.remove_state(&self.state_id);
}
} else {
// Do some other work, maybe transition to State S2 etc.
}
}
}
struct S1 {
state_id: Uid,
move_only_field: MOF,
}
impl State for S1 {
fn handle_event(&mut self, dispatcher: &mut Dispatcher, event: Event) {
// Do some work, maybe transition to State S2/S3/S4 etc.
}
}
With reference to the inline comment above saying:
// This is pattern that concerns me.
S0::move_only_field needs to be an Option in this pattern because self is borrowed in handle_event, but I am not sure that this is the best way to approach it.
Here are the ways I can think of with demerits of each one:
1. Put it into an Option, as I have done: this feels hacky, and every time I use it I need to check the invariant that the Option is always Some, then either panic! or make it a NOP with if let Some() = and ignore the else clause; both cause code bloat. Doing an unwrap or bloating the code with if let Some() feels a bit off.
2. Put it into shared ownership with Rc<RefCell<>>: this means heap-allocating all such variables, or constructing another struct called Inner or similar that holds all these non-clonable types and putting that into an Rc<RefCell<>>.
3. Pass stuff back to the Dispatcher, telling it to remove us from the map and move things out of us into the next State, indicated via our return value: too much coupling, breaks OOP, and doesn't scale, as the Dispatcher needs to know about all the States and needs frequent updating. I don't think this is a good paradigm, but I could be wrong.
4. Implement Default for MOF: now we can mem::replace it with the default while moving out the old value. The burden of panicking, returning an error, or doing a NOP is now hidden in the implementation of MOF. The problem is that we don't always have access to the MOF type, and where we do, this just moves the bloat from user code into the code of MOF.
5. Let handle_event take self by move, as fn handle_event(mut self, ...) -> Option<Self>: now instead of Rc<RefCell<>> you need Box<State> and must move it out each time in the dispatcher, putting it back if the return is Some. This almost feels like a sledgehammer and makes many other idioms impossible; for instance, if I wanted to share self in some registered closure/callback, I would previously have stored a Weak<RefCell<>>, but now sharing self in callbacks etc. is impossible.
Are there any other options? Is there any that is considered the "most idiomatic" way of doing this in Rust?
Let the function handle_event take self by move as fn handle_event(mut self, ...) -> Option<Self>: Now instead of Rc<RefCell<>> you will need to have Box<State> and move it out each time in the dispatcher and if the return is Some you put it back.
This is what I would do. However, you don't need to switch from Rc to Box if there is only one strong reference: Rc::try_unwrap can move out of an Rc.
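A quick standalone illustration of how Rc::try_unwrap behaves:
use std::rc::Rc;

fn main() {
    // Only one strong reference: try_unwrap succeeds and moves the value out.
    let unique = Rc::new(String::from("state"));
    let owned: String = Rc::try_unwrap(unique).unwrap();
    println!("{}", owned);

    // A second strong reference exists: try_unwrap fails and returns the Rc.
    let shared = Rc::new(String::from("state"));
    let _also_shared = Rc::clone(&shared);
    assert!(Rc::try_unwrap(shared).is_err());
}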
Here's part of how you might rewrite Dispatcher:
struct Dispatcher {
states: HashMap<Uid, Rc<State>>,
}
impl Dispatcher {
fn dispatch(&mut self, state_id: Uid, event: Event) {
if let Some(state_ref) = self.states.remove(&state_id) {
let state = Rc::try_unwrap(state_ref)
.ok()
.expect("Unique strong reference required");
if let Some(next_state) = state.handle_event(event) {
self.states.insert(state_id, next_state);
}
} else {
// handle state_id not found
}
}
}
(Note: dispatch takes state_id by value. In the original version, this wasn't necessary -- it could have been changed to pass by reference. In this version, it is necessary, since state_id gets passed to HashMap::insert. It looks like Uid is Copy though, so it makes little difference.)
It's not clear whether state_id actually needs to be a member of the struct that implements State anymore, since you don't need it inside handle_event -- all the insertion and removal happens inside impl Dispatcher, which makes sense and reduces coupling between State and Dispatcher.
impl State for S0 {
fn handle_event(self, event: Event) -> Option<Rc<State>> {
if event == Event::SomeEvent {
// Do some work
let next_state = Rc::new(S0 {
state_id: self.state_id,
move_only_field: self.move_only_field,
});
Some(next_state)
} else {
// Do some other work
None
}
}
}
Now you don't have to handle a weird, should-be-impossible corner case where the Option is None.
This almost feels like a sledgehammer and makes many other idioms impossible, for instance if I wanted to share self further in some registered closure/callback I would normally put a Weak<RefCell<>> previously but now sharing self in callbacks etc is impossible.
Because you can move out of an Rc if you have the only strong reference, you don't have to sacrifice this technique.
"Feels like a sledgehammer" might be subjective, but to me, what a signature like fn handle_event(mut self, ...) -> Option<Self> does is encode an invariant. With the original version, each impl State for ... had to know when to insert and remove itself from the dispatcher, and whether it did or not was uncheckable. For example, if somewhere deep in the logic you forgot to call dispatcher.insert(state_id, next_state), the state machine wouldn't transition, and might get stuck or worse. When handle_event takes self by-value, that's not possible anymore -- you have to return the next state, or the code simply won't compile.
(Aside: both the original version and mine do at least two hashtable lookups each time dispatch is called: once to get the current state, and again to insert the new state. If you wanted to get rid of the second lookup, you could combine approaches: store Option<Rc<State>> in the HashMap, and take from the Option instead of removing it from the map entirely.)
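That combined approach might look something like this sketch, using Box<dyn State> and a by-value self for simplicity (Uid and Event are stand-ins here):
use std::collections::HashMap;

type Uid = u32; // stand-in

pub enum Event {
    SomeEvent,
}

trait State {
    fn handle_event(self: Box<Self>, event: Event) -> Option<Box<dyn State>>;
}

struct Dispatcher {
    states: HashMap<Uid, Option<Box<dyn State>>>,
}

impl Dispatcher {
    fn dispatch(&mut self, state_id: Uid, event: Event) {
        // A single hash lookup: take() moves the state out and leaves None,
        // then the successor (if any) is written back into the same slot.
        if let Some(slot) = self.states.get_mut(&state_id) {
            if let Some(state) = slot.take() {
                *slot = state.handle_event(event);
            }
        }
    }
}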
When looping over a slice of structs, the value I get is a reference (which is fine), however in some cases it's annoying to have to write var as (*var) in many places.
Is there a better way to avoid re-declaring the variable?
fn my_fn(slice: &[MyStruct]) {
for var in slice {
let var = *var; // <-- how to avoid this?
// Without the line above, errors in comments occur:
other_fn(var); // <-- expected struct `MyStruct`, found reference
if var != var.other {
// ^^ trait `&MyStruct: std::cmp::PartialEq<MyStruct>` is not satisfied
foo();
}
}
}
You can remove the reference by destructuring in the pattern:
// |
// v
for &var in slice {
other_fn(var);
}
However, this only works for Copy-types! If you have a type that doesn't implement Copy but does implement Clone, you could use the cloned() iterator adapter; see Chris Emerson's answer for more information.
In some cases you can iterate directly on values if you can consume the iterable, e.g. using Vec::into_iter().
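For instance, a minimal sketch of the consuming case:
fn main() {
    let v = vec![1, 2, 3];
    // into_iter() consumes the Vec and yields owned values, not references
    for u in v.into_iter() {
        println!("{}", u);
    }
    // v has been moved and can no longer be used here
}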
With slices, you can use cloned or copied on the iterator:
fn main() {
let v = vec![1, 2, 3];
let slice = &v[..];
for u in slice.iter().cloned() {
let u: usize = u; // prove it's really usize, not &usize
println!("{}", u);
}
}
This relies on the item implementing Clone or Copy, but if it doesn't you probably do want references after all.
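For comparison, the copied variant looks the same; it only compiles for Copy items, which makes the cheap-to-duplicate assumption explicit:
fn main() {
    let v = vec![1, 2, 3];
    let slice = &v[..];
    for u in slice.iter().copied() {
        println!("{}", u);
    }
}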
When writing a test, I would like to know how many times a function is called, since bad logic may yield a correct result even when excessive and unnecessary function calls are performed.
To give some context, this is a tree-search function running a test on a fixed data set, however that isn't important to the answer.
I'm currently using a static mutable variable; however, this means every access needs to be marked as unsafe:
#[cfg(test)]
static mut total_calls: usize = 0;
fn function_to_count() {
#[cfg(test)]
unsafe {
total_calls += 1;
}
// do stuff
}
#[test]
fn some_test() {
// do stuff, indirectly call function_to_count().
assert!(total_calls < 100);
}
It would be good to avoid having to put unsafe into the code.
Is there a better way to count indirect function calls in Rust?
Mutable statics are unsafe because they're global, and could be accessed from any thread at any time. The simplest solution is to change the definition of the function in question to take some kind of "counter" interface that keeps track of calls. You can avoid performance problems by using generics plus a "dummy" implementation that does nothing.
// Use a callable because I'm feeling lazy.
fn function_to_count<Count: FnMut()>(count: &mut Count) {
count();
// ...
}
#[cfg(test)]
#[test]
fn some_test() {
let mut count = 0;
for _ in 0..10 {
function_to_count(&mut || count += 1);
}
assert_eq!(count, 10);
}
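If you would rather not pass a closure around, the same idea works with a small counter trait plus a no-op implementation for production code. A sketch (the Counter and NoopCounter names are made up here):
trait Counter {
    fn record(&mut self);
}

// The "dummy" implementation: does nothing, and with generics the call
// compiles down to a no-op in non-test code.
struct NoopCounter;
impl Counter for NoopCounter {
    fn record(&mut self) {}
}

struct CallCounter(usize);
impl Counter for CallCounter {
    fn record(&mut self) {
        self.0 += 1;
    }
}

fn function_to_count<C: Counter>(counter: &mut C) {
    counter.record();
    // ... the real work ...
}

fn main() {
    let mut calls = CallCounter(0);
    for _ in 0..10 {
        function_to_count(&mut calls);
    }
    assert_eq!(calls.0, 10);

    // Production call sites pay nothing:
    function_to_count(&mut NoopCounter);
}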
You should really, seriously do that, and not what I'm about to describe:
The other solution is to use a thread-safe construct.
A word of warning: do not use this if you have more than one test! The test runner will, by default, run tests in parallel. As such, if you have more than one test calling into the instrumented function, you will get corrupt results. You'd have to write some kind of exclusive locking mechanism and somehow teach the function to "know" which run it's a part of, and at that point, you should just use the previously described solution instead. You could also disable parallel tests, but I believe you can only do that from outside the code, and that's just asking for someone to forget and run into weird failures as a result.
But anyway...
use std::sync::atomic::{AtomicUsize, Ordering};
#[cfg(test)]
static TOTAL_CALLS: AtomicUsize = AtomicUsize::new(0);
fn function_to_count() {
// Use a cfg attribute rather than cfg!(): with cfg!() both branches must
// still compile, but TOTAL_CALLS only exists in test builds.
#[cfg(test)]
TOTAL_CALLS.fetch_add(1, Ordering::SeqCst);
// ...
}
#[cfg(test)]
#[test]
fn some_test() {
for _ in 0..10 {
function_to_count();
}
assert_eq!(TOTAL_CALLS.load(Ordering::SeqCst), 10);
}