Example of how Objective-C's @try-@catch implementation is executed at runtime?

In Objective-C's low-level runtime headers (/usr/include/objc), there is an objc-exceptions.h file. It would seem this is how @try/@catch is implemented by the ObjC compiler.
I am trying to invoke these functions manually (to experiment with the ObjC runtime and its implementation) in order to catch an "unrecognized selector sent to class" exception.
So basically, all I'm looking for is an example of how to do a @try/@catch using the low-level runtime functions. Thanks in advance!

So you want to know how the runtime does exception handling?
Prepare to be disappointed.
Because it doesn't. ObjC doesn't have an exception-handling ABI, only SPI which you've already found. No doubt you've also discovered that the Objective-C exception ABI is actually the exact same one as the C++ exception handling ABI. To that end, let's get started with some code.
#include <Foundation/Foundation.h>
int main(int argc, char **argv) {
    @try {
        @throw [NSException exceptionWithName:@"ExceptionalCircumstances" reason:@"Drunk on power" userInfo:nil];
    } @catch(...) {
        NSLog(@"Catch");
    } @finally {
        NSLog(@"Finally");
    }
}
Run through clang with -ObjC -O3 (and stripped of a disgusting amount of debug information) we get this:
_main: ## @main
push rbp
mov rbp, rsp
push r14
push rbx
mov rdi, qword ptr [rip + L_OBJC_CLASSLIST_REFERENCES_$_]
mov rsi, qword ptr [rip + L_OBJC_SELECTOR_REFERENCES_]
lea rdx, qword ptr [rip + L__unnamed_cfstring_]
lea rcx, qword ptr [rip + L__unnamed_cfstring_2]
xor r8d, r8d
call qword ptr [rip + _objc_msgSend@GOTPCREL]
mov rdi, rax
call _objc_exception_throw
LBB0_2:
mov rdi, rax
call _objc_begin_catch
lea rdi, qword ptr [rip + L__unnamed_cfstring_4]
xor eax, eax
call _NSLog
call _objc_end_catch
xor ebx, ebx
LBB0_8:
lea rdi, qword ptr [rip + L__unnamed_cfstring_6]
xor eax, eax
call _NSLog
test bl, bl
jne LBB0_10
LBB0_11:
xor eax, eax
pop rbx
pop r14
pop rbp
ret
LBB0_5:
mov rbx, rax
call _objc_end_catch
jmp LBB0_7
LBB0_6:
mov rbx, rax
LBB0_7:
mov rdi, rbx
call _objc_begin_catch
mov bl, 1
jmp LBB0_8
LBB0_12:
mov r14, rax
test bl, bl
je LBB0_14
jmp LBB0_13
LBB0_10:
call _objc_exception_rethrow
jmp LBB0_11
LBB0_16: ## %.thread
mov r14, rax
LBB0_13:
call _objc_end_catch
LBB0_14:
mov rdi, r14
call __Unwind_Resume
LBB0_15:
call _objc_terminate
If you compile it as ObjC++, nothing changes. (Well, that's not entirely true: the last _objc_terminate turns into a jump into clang's personal ___clang_call_terminate routine.) Anyhow, this code can be divided into 3 important sections. The first runs from _main to the start of LBB0_2, which is where our try block happens. Because we're blatantly throwing an exception and catching it in our try block, the compiler has gone ahead and removed the branch around LBB0_2 and moved straight to the catch handlers. At this point Objective-C, or more accurately CoreFoundation, has set up an exception object for us and the C++ runtime has begun searching for an exception handler during the requisite unwinding phase.
The second important block of code runs from LBB0_2 to the end of LBB0_11, where our catch and finally blocks live. Because all is well, all the code below this is dead (and hopefully gets stripped in release), but let's imagine it wasn't.
The third part is from LBB0_5 on down, which is where the compiler would have emitted a jump from the NSLog in LBB0_2 if we'd done something stupid like, say, tried not to catch our exception. This handler instead sets a flag (mov bl, 1) after calling objc_begin_catch, which causes us to branch around the ret and move on to objc_exception_rethrow(), which tells the unwinder that we've dropped the ball and that it should continue searching for handlers somewhere else. Of course, we're main, so there are no other handlers, and std::terminate gets invoked as we leave.
All this to say you're gonna have a bad time if you try to write this stuff out by hand. All the __cxa_* and ObjC SPI functions throw around exception objects in ways you can't rely on, and (rather pessimistically many) handlers are emitted in a very tight order to make sure the C++ ABI contract is fulfilled, because if it isn't, the spec mandates that std::terminate be called. If you'd like to take an active listening role, you are allowed to hook the exception handling machinery with your own functions: Objective-C has objc_setUncaughtExceptionHandler, objc_setExceptionMatcher, and objc_setExceptionPreprocessor.
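If all you want is a feel for the control flow (body, catch path, then a finally step that runs either way), a plain C analogy may be easier to play with than the real SPI. The sketch below uses setjmp/longjmp, which is not what the listing above does: the modern runtime uses the table-driven unwinder shared with C++. Every name in it (toy_throw, toy_try, current_handler, pending_code, always_throws, never_throws) is made up for illustration, and it only supports a single, non-nested handler.

```c
#include <setjmp.h>

/* Hypothetical single-slot handler state; a real runtime keeps a stack of these. */
static jmp_buf current_handler;
static int pending_code;           /* stands in for the exception object */

/* Model of @throw: record the code and jump to the active handler. */
static void toy_throw(int code)
{
    pending_code = code;
    longjmp(current_handler, 1);
}

/* A body that always throws, like the @try block in the question. */
void always_throws(void)
{
    toy_throw(42);
}

/* A body that completes normally. */
void never_throws(void)
{
}

/*
 * Model of: @try { body(); } @catch (...) { } @finally { ... }
 * Returns the caught code, or 0 if nothing was thrown.
 * *finally_ran is set to 1 on both paths, as a @finally block would be.
 */
int toy_try(void (*body)(void), int *finally_ran)
{
    int caught = 0;

    if (setjmp(current_handler) == 0) {
        body();                    /* normal path */
    } else {
        caught = pending_code;     /* catch path, reached via longjmp */
    }

    *finally_ran = 1;              /* runs regardless of how we got here */
    return caught;
}
```

Calling toy_try(always_throws, &flag) takes the catch path and hands back the thrown code, while toy_try(never_throws, &flag) falls straight through; in both cases the flag is set, which mirrors the way both paths in the assembly converge on the second NSLog.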


Do empty functions get optimized away in Rust?

I want to put some performance-impacting function calls behind a feature gate in my code. If the feature isn't enabled, I was thinking of just providing an empty implementation of that function instead. That way, hopefully, the Rust compiler can completely remove the call from the calling code.
Something like this:
// Included if feature is enabled
fn foo() {
// ...
}
// Included if the feature is disabled
fn foo() {}
// Performance critical code
for i in 1..1000000000 {
// ...
foo();
}
Would the call to foo() get optimized away if it is empty?
Just try it in the amazing Compiler Explorer :)
The result assembly for your example is:
example::main:
push rbp
mov rbp, rsp
mov eax, 1
.LBB0_1:
xor ecx, ecx
cmp eax, 1000000000
setl cl
add ecx, eax
cmp eax, 1000000000
mov eax, ecx
jl .LBB0_1
pop rbp
ret
As you can see, there is no call instruction, and foo() isn't called at all. However, you might wonder why the loop isn't removed, as it doesn't have an effect on the outside world. I can only assume that such loops are sometimes used deliberately to waste time. If you decrease the loop bound to 100, the loop is removed completely.
Anyway: Yes, the optimizer will remove empty functions!
According to my check with release mode on current stable Rust, the following code:
fn foo() {}
fn main() {
for _ in 1..1000000000 {
foo();
}
println!(); // a side effect so that the whole program is not optimized away
}
Compiles to the same assembly as if the loop was empty:
for _ in 1..1000000000 {}

Why is there no compiler warning when I add a try finally to a method missing a return type [duplicate]

Under normal conditions, when a block is declared to return a value but no return statement actually appears in the block, Clang fails to compile it with an error (a missing return value).
However, this breaks when that block contains @try{} @catch(...){} or @try{} @finally{}.
Does anyone know why?
The way I found this was when using @weakify() and @strongify() from RACExtScope in ReactiveCocoa: in one block I forgot to return a signal. The compiler didn't warn me and the code crashed at runtime, which led me to dig into it, preprocess the code, and find that this is the cause. Any explanation would be very much appreciated. I honestly don't know why this would happen, thanks!
I also created a gist, in case someone had a comment/suggestion: https://gist.github.com/czechboy0/11358741
int main(int argc, const char * argv[])
{
    id (^iReturnStuff)() = ^id() {
        @try{} @finally{}
        //if you comment out line 4, Clang will not compile this.
        //if you leave it like this, Clang will compile and run this, even though
        //there's no value being returned.
        //is there something special in @try{} that turns off compiler errors?
    };
    return 0;
}
Clang's block specification makes brief mention of control flow in a block. I've reproduced it here:
The compound statement of a Block is treated much like a function body
with respect to control flow in that goto, break, and continue do not
escape the Block. Exceptions are treated normally in that when thrown
they pop stack frames until a catch clause is found.
Reading through a little further, you really get the sense that exceptions in Objective-C are downright weird. From the section on exceptions:
The standard Cocoa convention is that exceptions signal programmer
error and are not intended to be recovered from. Making code
exceptions-safe by default would impose severe runtime and code size
penalties on code that typically does not actually care about
exceptions safety. Therefore, ARC-generated code leaks by default on
exceptions, which is just fine if the process is going to be
immediately terminated anyway. Programs which do care about recovering
from exceptions should enable the option.
From the above, one could reasonably deduce that the ObjC exceptions specification is so fragile or malleable that not even the compiler writers can guarantee stable code against it, so they just disabled all reasonable termination checks once a @try-@catch is encountered.
This can also be seen in the code generated by Clang with and without the try-catches. First, without:
___main_block_invoke:
pushq %rbp
movq %rsp, %rbp
movabsq $0, %rax
movq %rdi, -8(%rbp)
movq %rdi, -16(%rbp)
popq %rbp
ret
This is pretty simple x86 that pushes a new stack frame, moves 0 (nil) into the return register, then returns. Now, with the try-catch block:
___main_block_invoke:
pushq %rbp
movq %rsp, %rbp
subq $64, %rsp
movq %rdi, -16(%rbp)
movq %rdi, -24(%rbp)
movb $0, -25(%rbp)
movl -32(%rbp), %eax
testb $1, -25(%rbp)
movl %eax, -48(%rbp) ## 4-byte Spill
jne LBB1_1
jmp LBB1_3
LBB1_1:
callq _objc_exception_rethrow
jmp LBB1_2
LBB1_2:
LBB1_3:
movl -48(%rbp), %eax ## 4-byte Reload
movl %eax, -32(%rbp)
movq -8(%rbp), %rdi
addq $64, %rsp
popq %rbp
jmp _objc_autoreleaseReturnValue ## TAILCALL
LBB1_4:
movl %edx, %ecx
movq %rax, -40(%rbp)
movl %ecx, -44(%rbp)
testb $1, -25(%rbp)
jne LBB1_5
jmp LBB1_7
LBB1_5:
callq _objc_end_catch
jmp LBB1_6
LBB1_6:
jmp LBB1_7
LBB1_7:
jmp LBB1_8
LBB1_8:
movq -40(%rbp), %rdi
callq __Unwind_Resume
LBB1_9:
movq %rdx, -56(%rbp) ## 8-byte Spill
movq %rax, -64(%rbp) ## 8-byte Spill
callq _objc_terminate
Besides the more complicated function prologue, notice the lack of a proper ret. The function still has two exit points,
jmp _objc_autoreleaseReturnValue
and
call _objc_terminate
The first is a relatively new feature of the language: when the call is in the tail-call position, it can be used to omit -autorelease calls in favor of drawing upon thread-local variables, by examining the code that came before it. The second begins immediate termination of the process and jumps into the C++ exception handling mechanism. What this means is that the function does, in fact, have the requisite exit points to keep Clang from complaining about missing return statements. Unfortunately, what it also means is that Clang's refusal to mess with the ObjC exception mechanism can potentially message garbage, as you've seen. This is one of the reasons EXTScope has switched to using the @autoreleasepool directive to eat that sigil.
This is a bug in clang, this one: https://llvm.org/PR46693. It also happens without blocks: @try just confuses clang.
Trunk clang will flag your example if you replace the @finally with a @catch:
% cat foo.mm
int main(int argc, const char * argv[])
{
id (^iReturnStuff)() = ^id() {
@try{} @catch(id i){}
};
}
% out/gn/bin/clang -c foo.mm
foo.mm:5:5: error: non-void block does not return a value
};
^
Hopefully it'll be fixed for @finally eventually too.

Printing Objective-C method implementation at runtime

Is it possible to print/log the implementation of a certain class method at runtime to the console screen? I am assuming the log will be in assembly which is fine by me.
You could add a breakpoint at the start of the line, step through line by line and call "disassemble" in the debugger:
One line of my code (with private information replaced) for example produced this:
-(void) method
{
__weak typeof(self) selfReference = self; // <-- This call was disassembled.
...
Project`-[Class method] + 32 at Class.m:176:
-> 0x9c5cc: ldr r1, [sp, #304]
0x9c5ce: add r0, sp, #296
0x9c5d0: blx 0x33abec ; symbol stub for: objc_initWeak
0x9c5d4: ldr r1, [sp, #304]
Edit
I can't verify it's working perfectly since I'm not too handy with assembly, but you can use the debugger (I'm using LLDB) to just call
disassemble -n methodName
This claims to
Disassemble entire contents of the given function name.
NB: I did this with a breakpoint at the start of the method I was using to test
Try creating a symbolic breakpoint to stop at the method in question.

lifetime of variables defined inside block statement in C

In this C code fragment:
void func(void)
{
int x=10;
if (x>2)
{
int y=2;
//block statement
{
int m=12;
}
}
else
{
int z=5;
}
}
When do x, y, z, and m get allocated in and deallocated from func's stack frame?
The actual allocation depends on your compiler, but many compilers allocate space on the stack at the beginning of the function and free it just before the function returns. Note that this is separate from when the variables are actually accessible though, which is just till the end of the block they are defined in.
In your example, with optimization turned on, the compiler is likely not to allocate any space on the stack for your variables and simply return, since it can determine at compile time that the function doesn't actually have any effect.
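To make the difference between storage and accessibility concrete, here is a small variation on the question's function that actually uses the variables, so the optimizer cannot simply discard them. It is only a sketch: the name sum_while_in_scope and the running total are additions for illustration, and where the compiler really places x, y, z, and m is still up to the implementation.

```c
/* Returns x + y + m, reading each variable while it is still in scope. */
int sum_while_in_scope(void)
{
    int x = 10;            /* accessible throughout the function body */
    int total = x;

    if (x > 2) {
        int y = 2;         /* accessible only inside this if-block */
        total += y;
        {
            int m = 12;    /* accessible only inside this inner block */
            total += m;
        }
        /* m is out of scope here; naming it would be a compile error */
    } else {
        int z = 5;         /* never reached, because x > 2 */
        total += z;
    }

    /* y, m, and z are all out of scope here, whatever the stack layout is */
    return total;
}
```

The function returns 24 (10 + 2 + 12). Whether the compiler reserves four stack slots up front, reuses one slot for y and z, or keeps everything in registers does not change that result, which is the sense in which allocation is an implementation detail.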
According to C++ rules, you should assume that every local variable is destroyed at the end of its block; this is when its destructor is called. However, the compiler may decide to allocate and deallocate all local variables together at the beginning and end of the function. This is what the VC++ compiler does:
void func(void)
{
001413B0 push ebp
001413B1 mov ebp,esp
001413B3 sub esp,0F0h
001413B9 push ebx
001413BA push esi
001413BB push edi
001413BC lea edi,[ebp-0F0h]
001413C2 mov ecx,3Ch
001413C7 mov eax,0CCCCCCCCh
001413CC rep stos dword ptr es:[edi]
int x=10;
001413CE mov dword ptr [x],0Ah
if (x>2)
001413D5 cmp dword ptr [x],2
001413D9 jle func+3Bh (1413EBh)
{
int y=2;
001413DB mov dword ptr [y],2
//block statement
{
int m=12;
001413E2 mov dword ptr [m],0Ch
}
}
else
001413E9 jmp func+42h (1413F2h)
{
int z=5;
001413EB mov dword ptr [z],5
}
}
But these are implementation details; the compiler is free to adjust the stack pointer in another way.
So the actual stack pointer adjustment is not defined, but constructor and destructor calls happen exactly according to the function's internal blocks. And of course, you cannot use a variable outside its block, since that does not compile, even though its stack space may still be allocated at that point.
In the case of automatic variables (variables declared within a block of code are automatic by default), memory is allocated automatically upon entry to the block and freed automatically upon exit from the block (if you're using C with gcc).

EXC_BAD_ACCESS in assembly function

I am trying to write a function in assembly that is callable from Objective-C code. I've gotten simple results by setting %rax and returning directly, but when I try to use the stack to store local variables, I get EXC_BAD_ACCESS. Could someone take a look at this and tell me what's going wrong? My assembly looks like this:
.global _fn
_fn:
pushq %rbp
movq %rsp, %rbp
subq 0x8, %rsp
addq 0x8, %rsp
popq %rbp
ret
Xcode dumps this and indicates the crash is at sub 0x8,%rsp when I call fn from main:
0x0000000100020000 <+0000> push %rbp
0x0000000100020001 <+0001> mov %rsp,%rbp
0x0000000100020004 <+0004> sub 0x8,%rsp
0x000000010002000c <+0012> add 0x8,%rsp
0x0000000100020014 <+0020> pop %rbp
0x0000000100020015 <+0021> retq
The mere subtraction of 8 from rsp should not cause an exception.
Most likely you need to prefix the constants with a dollar sign ($0x8). If you don't, (g)as will treat those numbers as memory operands at the corresponding addresses.
And accessing memory at address 8 is usually as good on the x86 platform as a NULL pointer dereference.