Are LLVM "fatal errors" really fatal? - error-handling

I'm wondering if LLVM fatal errors are really "fatal", i.e. that they invalidate the entire state of the system and are not recoverable.
For example (I'm using the llvm-c interface), the default behavior of the following code:
LLVMMemoryBufferRef mb = LLVMCreateMemoryBufferWithMemoryRange(somedata, data_length, "test", 0);
LLVMModuleRef module;
if (LLVMParseBitcode2(mb, &module) != 0) {
    fprintf(stderr, "could not parse module bitcode\n");
}
is that if the pointer somedata points to invalid bitcode, the fprintf is never executed, but instead the entire process aborts with its own fatal error message on stderr.
There is supposedly an interface to catch such errors: LLVMFatalErrorHandler. Yet even after installing an error handler, the process still just aborts without calling the handler.
The documentation in LLVM is very poor overall, and the C interface is barely documented at all. But it seems like super-fragile design to have the entire process abort in a mandatory way if some bitcode is corrupt!
So, I'm wondering whether "fatal" here means what it usually does - that once such an error occurs we may not recover and continue using the library (by trying some different bitcode or repairing the old one, for example) - or whether it is not really a "fatal" error, and we can use the FatalErrorHandler or some other means to catch it, get notified, take other remedial action, and continue the program.

Ok, after reading through the LLVM source for 10+ hours and enlisting the help of a friendly LLVM dev, the answer here is that this is not in fact a fatal error, after all!
The functions called above in the C interface are deprecated and should have been removed; LLVM used to have a notion of "global context", and that was removed years ago. The correct way to do this - so that this error can be caught and handled without aborting the process - is to use the LLVMDiagnosticInfo interface after creating an LLVMContext instance and using the context-specific bitcode reader functions:
void llvmDiagnosticHandler(LLVMDiagnosticInfoRef dir, void *p) {
    char *desc = LLVMGetDiagInfoDescription(dir);
    fprintf(stderr, "LLVM Diagnostic: %s\n", desc);
    LLVMDisposeMessage(desc);  // the description string is owned by the caller and must be freed
}
...
LLVMContextRef llvmCtx = LLVMContextCreate();
LLVMContextSetDiagnosticHandler(llvmCtx, llvmDiagnosticHandler, NULL);
LLVMMemoryBufferRef mb = LLVMCreateMemoryBufferWithMemoryRange(somedata, data_length, "test", 0);
LLVMModuleRef module;
if (LLVMGetBitcodeModuleInContext2(llvmCtx, mb, &module) != 0) {
    fprintf(stderr, "could not parse module bitcode\n");
}
The LLVMDiagnosticInfo also carries with it a "severity" code that indicates the seriousness of the error (sometimes mere warnings or performance hints are returned). Also, as I suspected, it is not the case that failing to parse bitcode invalidates the library or context state.
The code that was aborting with the cruddy error message was just a stop-gap to let legacy apps that still call the old API functions keep working - it sets up a context and a minimal error handler that behaves in this way.

Related

Should an Error with a source include that source in the Display output?

I have an error type that impls the Error trait, and it wraps an underlying error cause, so the source method returns Some(source). I want to know whether the Display impl on my error type should include a description of that source error, or not.
I can see two options:
Yes, include source in Display output, e.g. "Error opening database: No such file"
This makes it easy to print a whole error chain just by formatting with "{}" but impossible to only display the error itself without the underlying chain of source errors. Also it makes the source method a bit pointless and gives client code no choice on how to format separation between each error in the chain. Nevertheless this choice seems common enough in example code I have found.
No, just print the error itself e.g. "Error opening database" and leave it to client code to traverse and display source if it wants to include that in the output.
This gives client code the choice of whether to display just the surface error or the whole chain, and in the latter case how to format the separation between each error in the chain. It leaves client code with the burden of iterating through the chain, and I haven't yet come across a canonical utility for conveniently formatting an error chain from errors that each Display only themselves, excluding source. (So of course I have my own.)
The snafu crate (which I really like) seems to hint at favoring option 2, in that an error variant with a source field but no display attribute defaults to formatting Display output that does not include source.
Maybe my real question here is: What is the purpose of the source method? Is it to make formatting error chains more flexible? Or should Display really output everything that should be user-visible about an error, and source is just there for developer-visible purposes?
I would love to see some definitive guidance on this, ideally in the documentation of the Error trait.
use std::error::Error;
use std::fmt;
use std::io;

#[derive(Debug)]
enum DatabaseError {
    Opening { source: io::Error },
}

impl Error for DatabaseError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            DatabaseError::Opening { source } => Some(source),
        }
    }
}

impl fmt::Display for DatabaseError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            DatabaseError::Opening { source } => {
                // ??? Should we include the source?
                write!(f, "Error opening database: {}", source)
                // ??? Or should we leave it to the caller to call .source()
                // if they want to include that in the error description?
                // write!(f, "Error opening database")
            }
        }
    }
}
The two options of whether to print the source error in a Display implementation create two schools of design. This answer will explain the two while objectively stating their key differences and clarifying a few possible misconceptions along the way.
Design 1: Yes, include source on your Display impl
Example with SNAFU:
#[derive(Debug, Snafu)]
enum Error {
    #[snafu(display("Could not read data set token: {}", source))]
    ReadToken {
        #[snafu(backtrace)]
        source: ReadDataSetError,
    },
}
The key advantage, as already mentioned in the question, is that providing the full amount of information is as simple as just printing the error value.
eprintln!("[ERROR] {}", err);
It is simple and easy, requiring no helper functions for reporting the error, albeit at the cost of presentation flexibility. Without string manipulation, a chain of colon-separated errors is what you will always get.
[ERROR] Could not read data set token: Could not read item value: Undefined value length of element tagged (5533,5533) at position 3548
Design 2: No, leave out source on your Display impl
#[derive(Debug, Snafu)]
enum Error {
    #[snafu(display("Could not read data set token"))]
    ReadToken {
        #[snafu(backtrace)]
        source: ReadDataSetError,
    },
}
While this will not give you the full information with a single line print like before, you can leave that task to a project-wide error reporter. This also grants the consumer of the API greater flexibility on error presentation.
A simple example follows. Additional logic would be required for presenting the error's backtrace.
fn report<E: 'static>(err: E)
where
    E: std::error::Error,
    E: Send + Sync,
{
    eprintln!("[ERROR] {}", err);
    if let Some(cause) = err.source() {
        eprintln!();
        eprintln!("Caused by:");
        for (i, e) in std::iter::successors(Some(cause), |e| e.source()).enumerate() {
            eprintln!("   {}: {}", i, e);
        }
    }
}
It's also worth considering the interest of integrating with opinionated libraries. That is, certain crates in the ecosystem may have already made an assumption about which option to choose. In anyhow, error reports will already traverse the error's source chain by default. When using anyhow for error reporting, you should not append source in Display, otherwise you may end up with an irritating list of repeated messages:
[ERROR] Could not read data set token: Could not read item value: Undefined value length of element tagged (5533,5533) at position 3548
Caused by:
0: Could not read item value: Undefined value length of element tagged (5533,5533) at position 3548
1: Undefined value length of element tagged (5533,5533) at position 3548
Likewise, the eyre library provides a customizable error reporting abstraction, but existing error reporters in the eyre crate ecosystem also assume that the source is not printed by the error's Display implementation.
So, which one?
Thanks to the efforts of the Error Handling project group, a key guideline regarding the implementation of Display was proposed in early 2021:
An error type with a source error should either return that error via source or include that source's error message in its own Display output, but never both.
That would be the second design: avoid appending the source's error message in your Display implementation. For SNAFU users, this currently means that applications will need to bring in an error reporter until one is made available directly in the snafu crate. As the ecosystem is yet to mature around this guideline, one may still find error utility crates lacking support for error reporting in this manner.
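Until such a reporter ships in a crate you already use, one is small enough to sketch by hand. The following is a minimal, hypothetical stand-in (the two error types are invented for illustration): wrapping any error in Report makes "{}" print the whole chain, while each error type keeps its Display limited to its own message, per the guideline.

```rust
use std::error::Error;
use std::fmt;

/// Minimal reporter: Display walks the source chain, colon-separated.
struct Report<E>(E);

impl<E: Error> fmt::Display for Report<E> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)?;
        let mut source = self.0.source();
        while let Some(cause) = source {
            write!(f, ": {}", cause)?;
            source = cause.source();
        }
        Ok(())
    }
}

// Two invented errors following design 2: Display excludes source.
#[derive(Debug)]
struct ParseError;

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "could not parse header")
    }
}

impl Error for ParseError {}

#[derive(Debug)]
struct LoadError {
    source: ParseError,
}

impl fmt::Display for LoadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "could not load config")
    }
}

impl Error for LoadError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}
```

With this in place, `Report(err)` prints the full chain while `err` alone prints only the surface message, so the caller keeps both choices.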
In either case...
This decision only plays a role in error reporting, not in error matching or error handling in some other manner. The existence of the source method establishes a chain-like structure on all error types, which can be exploited in pattern matching and subsequent flow control of the program.
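For example, a caller can walk the chain and branch on a specific underlying cause via downcast_ref - something that only works while the chain stays structured rather than being flattened into one Display string. This is a sketch with a hypothetical wrapper type:

```rust
use std::error::Error;
use std::fmt;
use std::io;

// Hypothetical wrapper that reports only itself in Display and exposes
// its cause through source().
#[derive(Debug)]
struct DatabaseError {
    source: io::Error,
}

impl fmt::Display for DatabaseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Error opening database")
    }
}

impl Error for DatabaseError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

// Walk the source chain looking for an io::Error anywhere in it and,
// if found, return its kind for flow control.
fn io_error_kind(err: &(dyn Error + 'static)) -> Option<io::ErrorKind> {
    let mut current: Option<&(dyn Error + 'static)> = Some(err);
    while let Some(e) = current {
        if let Some(ioe) = e.downcast_ref::<io::Error>() {
            return Some(ioe.kind());
        }
        current = e.source();
    }
    None
}
```

A caller could, say, retry on `ErrorKind::TimedOut` but fail fast otherwise, regardless of how many wrapper layers sit above the io::Error.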
The Error::source method has a purpose in the ecosystem, regardless of how errors are reported.
In addition, it's ultimately up to the developers to choose how to design their errors and respective Display implementations, although once it starts integrating with other components, following the guideline will be the right way towards consistent error reporting.
What about Rust API guidelines?
The Rust API guidelines do not present an opinion about Display in errors, other than C-GOOD-ERR, which only states that the error type's Display message should be "lowercase without trailing punctuation, and typically concise". There is a pending proposal to update this guideline, instructing developers to exclude source in their Display impl. However, the pull request was created before the guideline was proposed, and has not been updated since then (at the time of writing).
See also:
The error handling project group (GitHub)
Inside Rust Blog post 2021-07-01: What the error handling project group is working towards

objective-c: Expanded from macro

In the MoPub SDK for iOS, there is an error when calling the "mp_safe_block" macro.
Macro definition:
// Macros for dispatching asynchronously to the main queue
#define mp_safe_block(block, ...) block ? block(__VA_ARGS__) : nil
Called as:
mp_safe_block(complete, NSError.sdkInitializationInProgress, nil);
Error message:
Left operand to ? is void, but right operand is of type 'nullptr_t'
Maybe this error has nothing to do with the SDK itself. How do I fix it?
PS:
The SDK code runs correctly in a new Xcode project I created myself, but the error appears in an Xcode project built by MMF2 (Clickteam Fusion), and that Xcode project's version is quite old. I updated the Xcode settings, but the error persists.
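The ternary form of the macro is the culprit: once `block` returns void, the ternary's second operand has type void while the third (nil, a null pointer constant) does not, and newer clang rejects the mismatch. One common fix is to rewrite the macro as a statement. Here is a sketch in plain C, with a function pointer standing in for the Obj-C block (the names are hypothetical, for illustration):

```c
#include <stddef.h>

/* Original form, rejected for void-returning blocks:
 * #define mp_safe_block(block, ...) block ? block(__VA_ARGS__) : nil */

/* Statement-form replacement: no ternary, so no operand-type question.
 * The do/while(0) wrapper keeps it safe inside un-braced if/else bodies. */
#define mp_safe_block(block, ...) \
    do { if (block) block(__VA_ARGS__); } while (0)

/* Plain C stand-in for an Obj-C completion block. */
typedef void (*completion_fn)(int code);

static int call_count = 0;
static int last_code = 0;

static void on_complete(int code) {
    call_count += 1;   /* record that the callback actually ran */
    last_code = code;
}
```

The macro still skips NULL callbacks safely, it just no longer pretends to yield a value - which the void case never could anyway.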

Confused by MSDN "recommended way of handling errors" in COM

I have been reading the MSDN dev guide to COM. However, the code on this page is confusing. Reproducing it here:
The following code sample shows the recommended way of handling unknown errors:
HRESULT hr;
hr = xxMethod();
switch (GetScode(hr))
{
    case NOERROR:
        // Method returned success.
        break;
    case x1:
        // Handle error x1 here.
        break;
    case x2:
        // Handle error x2 here.
        break;
    case E_UNEXPECTED:
    default:
        // Handle unexpected errors here.
        break;
}
The GetScode function doesn't seem to be defined, nor is NOERROR, and searching MSDN didn't help. A web search indicated that GetScode is a macro that converts HRESULT to SCODE, however those are both 32-bit ints so I'm not sure what it is for.
It was suggested that it is a historical artifact that does nothing on 32-bit systems, but on 16-bit systems it converts hr to a 16-bit int. However, if that is true, then I do not see how E_UNEXPECTED would be matched, since that is 0x8000FFFF. Also, it's unclear whether x1 and x2 are meant to be 0x800..... values, or some sort of truncated version.
Finally, this code treats all-but-one of the success values as errors. Other pages on the same MSDN guide say that SUCCEEDED(hr) or FAILED(hr) should be used to determine between a success or failure.
So, is this code sample really the "recommended way" or is this some sort of documentation blunder?
This is (pretty) old stuff. The winerror.h file in the SDK says this:
////////////////////////////////////
// //
// COM Error Codes //
// //
////////////////////////////////////
//
// The return value of COM functions and methods is an HRESULT.
// This is not a handle to anything, but is merely a 32-bit value
// with several fields encoded in the value. The parts of an
// HRESULT are shown below.
//
// Many of the macros and functions below were orginally defined to
// operate on SCODEs. SCODEs are no longer used. The macros are
// still present for compatibility and easy porting of Win16 code.
// Newly written code should use the HRESULT macros and functions.
//
I think it's pretty clear. I would trust the SDK first, and the doc after that.
We can see SCODE is consistently defined like this in WTypesbase.h (in recent SDKs; in older SDKs I think it was in another file):
typedef LONG SCODE;
So it really is a 32-bit value.
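Following the header's advice ("Newly written code should use the HRESULT macros and functions"), the modern pattern can be sketched like this. The definitions below mirror the ones in winerror.h so the snippet compiles outside Windows (HRESULT is a 32-bit Windows LONG, hence int32_t), and xxMethod is a hypothetical stand-in for any COM call:

```c
#include <stdint.h>
#include <string.h>

/* Mirrors of the winerror.h definitions, for a self-contained sketch. */
typedef int32_t HRESULT;
#define SUCCEEDED(hr) ((HRESULT)(hr) >= 0)
#define FAILED(hr)    ((HRESULT)(hr) < 0)
#define S_OK          ((HRESULT)0)
#define S_FALSE       ((HRESULT)1)
#define E_UNEXPECTED  ((HRESULT)0x8000FFFF)

/* Hypothetical COM method returning an alternate success code. */
static HRESULT xxMethod(void) { return S_FALSE; }

/* Newly written code branches on success/failure first, then switches on
 * the specific code only when it matters. Note S_FALSE is a *success*
 * code; the GetScode-style switch in the question would have lumped it
 * in with the errors. */
static const char *describe(HRESULT hr) {
    if (SUCCEEDED(hr)) {
        return hr == S_OK ? "ok" : "ok (alternate success code)";
    }
    switch (hr) {
    case E_UNEXPECTED: return "unexpected error";
    default:           return "unhandled error";
    }
}
```

This also answers the question's last point: SUCCEEDED/FAILED test the sign bit of the 32-bit value, so all success codes (not just one) land on the success path.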
The text is correct; one should be wary of blindly returning failure codes from internal functions, particularly if your code uses a facility code defined elsewhere in the system.
Specifically, at the COM interface function level, you should ensure that the error codes you're returning are meaningful for your interface, and you should remap errors that originate from inside the function to meaningful error codes.
Practically speaking, however, nobody does this, which is why you see bizarre and un-actionable error dialogs like "Unexpected error".

Prevent "Execution was interrupted, reason: internal ObjC exception breakpoint(-3)" on lldb

I've written some code that dumps all ivars of a class into a dictionary in Objective-C. This uses valueForKey: to get the data from the class. Sometimes, KVC throws an internal exception that is also caught properly - but this disrupts lldb's expression evaluation, and all I get is:
error: Execution was interrupted, reason: internal ObjC exception breakpoint(-3)..
The process has been returned to the state before expression evaluation.
There are no breakpoints set. I even tried with -itrue -ufalse as expression options, but it doesn't make a difference. This totally defeats what I want to use lldb for, and it seems like such a tiny issue. How can I get lldb to simply ignore internal, caught ObjC exceptions while calling a method?
I tried this both from within Xcode and directly by running lldb from the terminal and connecting to a remote debug server - no difference.
I ran into the same issue. My solution was to wrap a try/catch around it (I only use this code for debugging). See: DALIntrospection.m line #848
NSDictionary *DALPropertyNamesAndValuesMemoryAddressesForObject(NSObject *instance)
Or, if you're running on iOS 7, the private instance method _ivarDescription will print all the ivars for you (similar instance methods are _methodDescription and _shortMethodDescription).
I met the same problem.
My solution was simply to alloc/init the property before assigning it to the value that caused the crash.
My coworkers and I ran into this today, and we eventually found a workaround using lldb's Python API. The manual way is to run script, and enter:
options = lldb.SBExpressionOptions()
options.SetTrapExceptions(False)
print(lldb.frame.EvaluateExpression('ThisThrowsAndCatches()', options).value)
This could be packaged into its own command via command script add.
error: Execution was interrupted, reason: internal ObjC exception breakpoint(-3).. The process has been returned to the state before expression evaluation.
Note that lldb specifically points to the internal breakpoint -3 that caused the interruption.
To see the list of all internal breakpoints, run:
(lldb) breakpoint list --internal
...
Kind: ObjC exception
-3: Exception breakpoint (catch: off throw: on) using: name = 'objc_exception_throw', module = libobjc.A.dylib, locations = 1
-3.1: where = libobjc.A.dylib`objc_exception_throw, address = 0x00007ff81bd27be3, unresolved, hit count = 4
Internal breakpoints can be disabled like regular ones:
(lldb) breakpoint disable -3
1 breakpoints disabled.
In case lldb still gets interrupted, you might also need to disable the breakpoint's individual locations:
(lldb) breakpoint disable -3.*
1 breakpoints disabled.
In my particular case there were multiple exception breakpoints I had to disable before I finally got the expected result:
(lldb) breakpoint disable -4 -4.* -5 -5.*
6 breakpoints disabled.

Using system symbol table from VxWorks RTP

I have an existing project, originally implemented as a VxWorks 5.5-style kernel module.
This project creates many tasks that act as a "host" to run external code. We do something like this:
void loadAndRun(char* file, char* function)
{
    // load the module
    int fd = open(file, O_RDONLY, 0644);
    loadModule(fd, LOAD_ALL_SYMBOLS);

    SYM_TYPE type;
    FUNCPTR func;
    symFindByName(sysSymTbl, function, (char**) &func, &type);

    while (true)
    {
        func();
    }
}
This all works a dream, however, the functions that get called are non-reentrant, with global data all over the place etc. We have a new requirement to be able to run multiple instances of these external modules, and my obvious first thought is to use vxworks RTP to provide memory isolation.
However, no matter what I try, I cannot persuade my new RTP project to compile and link.
error: 'sysSymTbl' undeclared (first use in this function)
If I add the correct include:
#include <sysSymTbl.h>
I get:
error: sysSymTbl.h: No such file or directory
and if i just define it extern:
extern SYMTAB_ID sysSymTbl;
i get:
error: undefined reference to `sysSymTbl'
I havent even begun to start trying to stitch in the actual module load code, at the moment I just want to get the symbol lookup working.
So, is the system symbol table accessible from VxWorks RTP applications? Can loadModule be used?
EDIT
It appears that what I am trying to do is covered by the Application Programmer's Guide in the section on Plugins (section 4.9 for V6.8) (thanks #nos), which is to use dlopen() etc. Like this:
void *hdl = dlopen("pathname", RTLD_NOW);
FUNCPTR func = dlsym(hdl, "FunctionName");
func();
However, I still end up in linker hell, even when I specify -Xbind-lazy -non-static to the compiler.
undefined reference to `_rtld_dlopen'
undefined reference to `_rtld_dlsym'
The problem here was that the documentation says to specify -Xbind-lazy and -non-static as compiler options. However, these should actually be added to the linker options.
libc.so.1 for the appropriate build target is then required on the target to satisfy the run-time link requirements.
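For reference, the dlopen pattern above benefits from the dlerror() checks that the short snippet omits. Here is a POSIX-flavored sketch (the plugin path and entry-point name a caller would pass are placeholders; the returned pointer would then be cast to a FUNCPTR-style function pointer and invoked):

```c
#include <stdio.h>
#include <dlfcn.h>

/* Load the object at `path` and resolve symbol `name` in it, reporting
 * failures via dlerror(). Passing NULL as the path searches the calling
 * image and its already-loaded dependencies. */
static void *load_entry(const char *path, const char *name) {
    void *hdl = dlopen(path, RTLD_NOW);
    if (hdl == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }
    dlerror();  /* clear any stale error state before dlsym */
    void *sym = dlsym(hdl, name);
    const char *err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err);
        return NULL;
    }
    return sym;
}
```

Checking dlerror() after dlsym (rather than testing the returned pointer) is the portable idiom, since a symbol's value can legitimately be NULL.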