How to make a Rust singleton's destructor run? - singleton

These are the ways I know of to create singletons in Rust:
#[macro_use]
extern crate lazy_static;
use std::sync::{Mutex, Once, ONCE_INIT};
#[derive(Debug)]
struct A(usize);
impl Drop for A {
fn drop(&mut self) {
// This is never executed automatically.
println!(
"Dropping {:?} - Important stuff such as release file-handles etc.",
*self
);
}
}
// ------------------ METHOD 0 -------------------
static PLAIN_OBJ: A = A(0);
// ------------------ METHOD 1 -------------------
lazy_static! {
static ref OBJ: Mutex<A> = Mutex::new(A(1));
}
// ------------------ METHOD 2 -------------------
fn get() -> &'static Mutex<A> {
static mut OBJ: *const Mutex<A> = 0 as *const Mutex<A>;
static ONCE: Once = ONCE_INIT;
ONCE.call_once(|| unsafe {
OBJ = Box::into_raw(Box::new(Mutex::new(A(2))));
});
unsafe { &*OBJ }
}
fn main() {
println!("Obj = {:?}", PLAIN_OBJ); // A(0)
println!("Obj = {:?}", *OBJ.lock().unwrap()); // A(1)
println!("Obj = {:?}", *get().lock().unwrap()); // A(2)
}
None of these call A's destructor (drop()) at program exit. This is expected behaviour for Method 2 (which is heap allocated), but I hadn't looked into the implementation of lazy_static! to know it was going to be similar.
There is no RAII here. I could achieve that behaviour of an RAII singleton in C++ (I used to code in C++ until a year a back, so most of my comparisons relate to it - I don't know many other languages) using function local statics:
A& get() {
static A obj; // thread-safe creation with C++11 guarantees
return obj;
}
This is probably allocated/created (lazily) in implementation defined area and is valid for the lifetime of the program. When the program terminates, the destructor is deterministically run. We need to avoid accessing it from destructors of other statics, but I have never run into that.
I might need to release resources and I want drop() to be run. Right now, I end up doing it manually just before program termination (towards the end of main after all threads have joined etc.).
I don't even know how to do this using lazy_static! so I have avoided using it and only go for Method 2 where I can manually destroy it at the end.
I don't want to do this; is there a way I can have such a RAII behaved singleton in Rust?

Singletons in particular, and global constructors/destructors in general, are a bane (especially in language such as C++).
I would say the main (functional) issues they cause are known respectively as static initialization (resp. destruction) order fiasco. That is, it is easy to accidentally create a dependency cycle between those globals, and even without such a cycle it is not immediately clear to compiler in which order they should be built/destroyed.
They may also cause other issues: slower start-up, accidentally shared memory, ...
In Rust, the attitude adopted has been No life before/after main. As such, attempting to get the C++ behavior is probably not going to work as expected.
You will get much greater language support if you:
drop the global aspect
drop the attempt at having a single instance
(and as a bonus, it'll be so much easier to test in parallel, too)
My recommendation, thus, is to simply stick with local variables. Instantiate it in main, pass it by value/reference down the call-stack, and not only do you avoid those tricky initialization order issue, you also get destruction.

Related

Should I abstract error types in my Rust API? [duplicate]

I'm writing a function that could return several one of several different errors.
fn foo(...) -> Result<..., MyError> {}
I'll probably need to define my own error type to represent such errors. I'm presuming it would be an enum of possible errors, with some of the enum variants having diagnostic data attached to them:
enum MyError {
GizmoError,
WidgetNotFoundError(widget_name: String)
}
Is that the most idiomatic way to go about it? And how do I implement the Error trait?
You implement Error exactly like you would any other trait; there's nothing extremely special about it:
pub trait Error: Debug + Display {
fn description(&self) -> &str { /* ... */ }
fn cause(&self) -> Option<&Error> { /* ... */ }
fn source(&self) -> Option<&(Error + 'static)> { /* ... */ }
}
description, cause, and source all have default implementations1, and your type must also implement Debug and Display, as they are supertraits.
use std::{error::Error, fmt};
#[derive(Debug)]
struct Thing;
impl Error for Thing {}
impl fmt::Display for Thing {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "Oh no, something bad went down")
}
}
Of course, what Thing contains, and thus the implementations of the methods, is highly dependent on what kind of errors you wish to have. Perhaps you want to include a filename in there, or maybe an integer of some kind. Perhaps you want to have an enum instead of a struct to represent multiple types of errors.
If you end up wrapping existing errors, then I'd recommend implementing From to convert between those errors and your error. That allows you to use try! and ? and have a pretty ergonomic solution.
Is that the most idiomatic way to go about it?
Idiomatically, I'd say that a library will have a small (maybe 1-3) number of primary error types that are exposed. These are likely to be enumerations of other error types. This allows consumers of your crate to not deal with an explosion of types. Of course, this depends on your API and whether it makes sense to lump some errors together or not.
Another thing to note is that when you choose to embed data in the error, that can have wide-reaching consequences. For example, the standard library doesn't include a filename in file-related errors. Doing so would add overhead to every file error. The caller of the method usually has the relevant context and can decide if that context needs to be added to the error or not.
I'd recommend doing this by hand a few times to see how all the pieces go together. Once you have that, you will grow tired of doing it manually. Then you can check out crates which provide macros to reduce the boilerplate:
error-chain
failure
quick-error
Anyhow
SNAFU
My preferred library is SNAFU (because I wrote it), so here's an example of using that with your original error type:
use snafu::prelude::*; // 0.7.0
#[derive(Debug, Snafu)]
enum MyError {
#[snafu(display("Refrob the Gizmo"))]
Gizmo,
#[snafu(display("The widget '{widget_name}' could not be found"))]
WidgetNotFound { widget_name: String },
}
fn foo() -> Result<(), MyError> {
WidgetNotFoundSnafu {
widget_name: "Quux",
}
.fail()
}
fn main() {
if let Err(e) = foo() {
println!("{}", e);
// The widget 'Quux' could not be found
}
}
Note I've removed the redundant Error suffix on each enum variant. It's also common to just call the type Error and allow the consumer to prefix the type (mycrate::Error) or rename it on import (use mycrate::Error as FooError).
1 Before RFC 2504 was implemented, description was a required method.
The crate custom_error allows the definition of custom error types with less boilerplate than what was proposed above:
custom_error!{MyError
Io{source: io::Error} = "input/output error",
WidgetNotFoundError{name: String} = "could not find widget '{name}'",
GizmoError = "A gizmo error occurred!"
}
Disclaimer: I am the author of this crate.
Is that the most idiomatic way to go about it? And how do I implement the Error trait?
It's a common way, yes. "idiomatic" depends on how strongly typed you want your errors to be, and how you want this to interoperate with other things.
And how do I implement the Error trait?
Strictly speaking, you don't need to here. You might for interoperability with other things that require Error, but since you've defined your return type as this enum directly, your code should work without it.

How do you define custom `Error` types in Rust?

I'm writing a function that could return several one of several different errors.
fn foo(...) -> Result<..., MyError> {}
I'll probably need to define my own error type to represent such errors. I'm presuming it would be an enum of possible errors, with some of the enum variants having diagnostic data attached to them:
enum MyError {
GizmoError,
WidgetNotFoundError(widget_name: String)
}
Is that the most idiomatic way to go about it? And how do I implement the Error trait?
You implement Error exactly like you would any other trait; there's nothing extremely special about it:
pub trait Error: Debug + Display {
fn description(&self) -> &str { /* ... */ }
fn cause(&self) -> Option<&Error> { /* ... */ }
fn source(&self) -> Option<&(Error + 'static)> { /* ... */ }
}
description, cause, and source all have default implementations1, and your type must also implement Debug and Display, as they are supertraits.
use std::{error::Error, fmt};
#[derive(Debug)]
struct Thing;
impl Error for Thing {}
impl fmt::Display for Thing {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "Oh no, something bad went down")
}
}
Of course, what Thing contains, and thus the implementations of the methods, is highly dependent on what kind of errors you wish to have. Perhaps you want to include a filename in there, or maybe an integer of some kind. Perhaps you want to have an enum instead of a struct to represent multiple types of errors.
If you end up wrapping existing errors, then I'd recommend implementing From to convert between those errors and your error. That allows you to use try! and ? and have a pretty ergonomic solution.
Is that the most idiomatic way to go about it?
Idiomatically, I'd say that a library will have a small (maybe 1-3) number of primary error types that are exposed. These are likely to be enumerations of other error types. This allows consumers of your crate to not deal with an explosion of types. Of course, this depends on your API and whether it makes sense to lump some errors together or not.
Another thing to note is that when you choose to embed data in the error, that can have wide-reaching consequences. For example, the standard library doesn't include a filename in file-related errors. Doing so would add overhead to every file error. The caller of the method usually has the relevant context and can decide if that context needs to be added to the error or not.
I'd recommend doing this by hand a few times to see how all the pieces go together. Once you have that, you will grow tired of doing it manually. Then you can check out crates which provide macros to reduce the boilerplate:
error-chain
failure
quick-error
Anyhow
SNAFU
My preferred library is SNAFU (because I wrote it), so here's an example of using that with your original error type:
use snafu::prelude::*; // 0.7.0
#[derive(Debug, Snafu)]
enum MyError {
#[snafu(display("Refrob the Gizmo"))]
Gizmo,
#[snafu(display("The widget '{widget_name}' could not be found"))]
WidgetNotFound { widget_name: String },
}
fn foo() -> Result<(), MyError> {
WidgetNotFoundSnafu {
widget_name: "Quux",
}
.fail()
}
fn main() {
if let Err(e) = foo() {
println!("{}", e);
// The widget 'Quux' could not be found
}
}
Note I've removed the redundant Error suffix on each enum variant. It's also common to just call the type Error and allow the consumer to prefix the type (mycrate::Error) or rename it on import (use mycrate::Error as FooError).
1 Before RFC 2504 was implemented, description was a required method.
The crate custom_error allows the definition of custom error types with less boilerplate than what was proposed above:
custom_error!{MyError
Io{source: io::Error} = "input/output error",
WidgetNotFoundError{name: String} = "could not find widget '{name}'",
GizmoError = "A gizmo error occurred!"
}
Disclaimer: I am the author of this crate.
Is that the most idiomatic way to go about it? And how do I implement the Error trait?
It's a common way, yes. "idiomatic" depends on how strongly typed you want your errors to be, and how you want this to interoperate with other things.
And how do I implement the Error trait?
Strictly speaking, you don't need to here. You might for interoperability with other things that require Error, but since you've defined your return type as this enum directly, your code should work without it.

Pass an iterator of structs to a function accepting references in Rust?

I have a function that takes an iterator over references to structs. Sometimes I'm iterating over a vector, which works fine, but sometimes I create an iterator that produces new structs, and I'm having trouble figuring that one out. I get that when I create a value in a closure, it goes away when the closure does. Rust is always trying to move values out of things when I don't want it to; why doesn't it here?
struct Thing {
value: u32,
}
fn consume<'a, I: IntoIterator<Item = &'a Thing>>(things: I) {
for thing in things {
println!("{}", thing.value);
}
}
fn main() {
let solid = vec![Thing { value: 0 }];
let ephemeral = (1..5).map(|i| &Thing { value: i }); // Boxing Thing does not work either
consume(solid.iter());
consume(ephemeral);
}
But
error[E0515]: cannot return reference to temporary value
--> src/main.rs:13:36
|
13 | let ephemeral = (1..5).map(|i| &Thing { value: i }); // Boxing Thing does not work either
| ^------------------
| ||
| |temporary value created here
| returns a reference to data owned by the current function
I have the sense I either need to move the struct out of the closure and iterator, or store it somewhere. But Boxing the struct doesn't work and returning a struct rather than a pointer doesn't type check (and I can't find the opposite of .cloned()). What's the approach here?
Short answer: you can't.
Longer explanation:
Here is "an iterator that produces new structs":
let iterator_of_structs = (1..5).map(|value| Thing { value });
The main trick to figuring this out is to always ask "who owns the data?".
Each time we call next, the closure takes ownership of an integer (via value) and constructs a new Thing. The closure returns the Thing, transferring ownership to the code that called next.
While you are borrowing a value (a.k.a. taking a reference), the ownership of the value cannot change hands and the value must last longer than the borrow lasts.
Let's turn to the concept of an iterator of references and ask our question: "who owns the data?".
map(|value| &Thing { value })
Here, we create a Thing and take a reference to it. No variable owns the Thing, so the scope owns it and the value will be destroyed when the scope ends. The closure tries to return the reference, but that violates the axiom that borrowed items must outlive their borrows.
So, how do you fix it? The easiest thing is to change your function to be more accepting:
use std::borrow::Borrow;
struct Thing {
value: u32,
}
fn consume(things: impl IntoIterator<Item = impl Borrow<Thing>>) {
for thing in things {
let thing = thing.borrow();
println!("{}", thing.value);
}
}
fn main() {
let iterator_of_structs = (1..5).map(|value| Thing { value });
consume(iterator_of_structs);
let vector_of_structs: Vec<_> = (1..5).map(|value| Thing { value }).collect();
let iterator_of_references_to_structs = vector_of_structs.iter();
consume(iterator_of_references_to_structs);
}
Here, we accept any type which can be converted into an iterator of items that allow us to borrow a reference to a Thing. This is implemented for any item and any reference to an item.
An iterator of references allows the consumer to keep all the references that the iterator yielded, for as long as they want (at least while the iterator itself remains alive). Obviously to support that, all objects to which the iterator creates references need to be in memory at the same time. There is no way around this with the iterator protocol as-is. So your best course of action is to collect() the iterator into a vector and create an reference iterator from that (as you do with solid). Unfortunately this means losing the laziness.
There is an alternative iterator abstraction, called streaming iterator, which would support this. With streaming iterators, the consumer may only hold onto the reference until it gets the next one. I am not aware of any crates implementing this though, and it would be a completely different trait which no function using std::iter::Iterator supports. In many cases it may even be impossible to use streaming iterators, because the algorithm needs the freedom to reference several values at once.

A method can only be called after the object is initialized - how to proceed?

I am finding a recurring pattern in my day-to-day coding, as follows:
var foo = new Foo();
foo.Initialize(params);
foo.DoSomething();
In these cases, foo.Initialize is absolutely needed so that it can actually DoSomething, otherwise some foo properties would still be null/non-initialized.
Is there a pattern to it? How to be safely sure DoSomething will only/always be called after Initialize? And how to proceed if it doesn't: should I raise an exception, silent ignore it, check some flag...?
Essentially you're saying Initialize is a constructor. So that code really should be part of the constructor:
var foo = new Foo(params);
foo.DoSomething();
That's exactly what a constructor is for: it's code which is guaranteed to run before any of the object methods are run, and its job is to check pre-conditions and provide a sane environment for other object methods to run.
If there really is a lot of work taking place in the initialization, then I can certainly see the argument that it's "too much to put in a constructor". (I'm sure somebody with a deeper familiarity of language mechanics under the hood could provide some compelling explanations on the matter, but I'm not that person.)
It sounds to me like a factory would be useful here. Something like this:
public class Foo
{
private Foo()
{
// trivial initialization operations
}
private void Initialize(SomeType params)
{
// non-trivial initialization operations
}
public static Foo CreateNew(SomeType params)
{
var result = new Foo();
result.Initialize(params);
return result;
}
}
And the consuming code becomes:
var foo = Foo.CreateNew(params);
foo.DoSomething();
All manner of additional logic could be put into that factory, including a variety of sanity checks of the params or validating that heavy initialization operations completed successfully (such as if they rely on external resources). It would be a good place to inject dependencies as well.
This basically comes down to a matter of cleanly separating concerns. The constructor's job is to create an instance of the object, the initializer's job is to get the complex object ready for intended use, and the factory's job is to coordinate these efforts and only return ready-for-use objects (handling any errors accordingly).

AutoPtr in C++/CLI mixed mode

I have a C++/CLI wrapper around native .lib and .h files. I use the AutoPtr class pretty extensively in the wrapper class to manage the unmanaged objects I create for wrapping. I have hit a roadblock with the copy constructor/assignment operator.
Using the AutoPtr class from Mr. Kerr: http://weblogs.asp.net/kennykerr/archive/2007/03/26/AutoPtr.aspx
He suggests the following(in the comments) to recreate the behavior of the assignment operator:
SomeManagedClass->NativePointer.Reset(new NativeType);
Which I believe is true. But when I compile my code:
ByteMessageWrap (const ByteMessageWrap% rhs)
{
AutoPtr<ByteMessage> m_NativeByteMessage(rhs.m_NativeByteMessage.GetPointer());
};
ByteMessageWrap% operator=(const ByteMessageWrap% rhs)
{
//SomeManagedClass->NativePointer.Reset(new NativeType);
if (this == %rhs) // prevent assignment to self
return *this;
this->m_NativeByteMessage.Reset(rhs.m_NativeByteMessage.GetPointer());
return *this;
};
-- I get the following errors:
error C2662:
'WrapTest::AutoPtr::GetPointer' :
cannot convert 'this' pointer from
'const WrapTest::AutoPtr' to
'WrapTest::AutoPtr %'
Has anyone experienced similar issues?
For further background on the answer, I removed the "const" keyword from the signature. I know that is not smiled upon in terms of code correctness for a copy ctor, but the CLR doesn't like it at all -- sort of belies the CLR at its core with memory management.
I wonder if it's possible to leave the const in the signature and then use GCHandle or pin_ptr to make sure memory doesn't move on you while performing the copy?
Looking at Kenny Kerr's AutoPtr, it transfers ownership in its constructor -- essentially a "move" constructor rather than a copy constructor. This is analogous with std::auto_ptr.
If you really want to transfer ownership from rhs to this (i.e. leave rhs without it NativeByteMessage), you need to change your copy ctor into a move ctor.
Also, you need to use initialization syntax;
// warning - code below doesn't work
ByteMessageWrap (ByteMessageWrap% rhs)
: m_NativeByteMessage(rhs.m_NativeByteMessage); // take ownership
{
}
ByteMessageWrap% operator=(ByteMessageWrap% rhs)
{
//SomeManagedClass->NativePointer.Reset(new NativeType);
if (this == %rhs) // prevent assignment to self
return *this;
m_NativeByteMessage.Reset(rhs.m_NativeByteMessage.Release());
return *this;
}