I want to put some performance-impacting function calls behind a feature gate in my code. If the feature isn't enabled, I was thinking of just compiling an empty implementation of that function instead. That way, hopefully, the Rust compiler can remove the call completely.
Something like this:
// Included if feature is enabled
fn foo() {
// ...
}
// Included if the feature is disabled
fn foo() {}
// Performance critical code
for i in 1..1000000000 {
// ...
foo();
}
Would the call to foo() get optimized away if it is empty?
Just try it in the amazing Compiler Explorer :)
The resulting assembly for your example is:
example::main:
push rbp
mov rbp, rsp
mov eax, 1
.LBB0_1:
xor ecx, ecx
cmp eax, 1000000000
setl cl
add ecx, eax
cmp eax, 1000000000
mov eax, ecx
jl .LBB0_1
pop rbp
ret
As you can see, there is no call instruction, so foo() isn't called at all. However, you might wonder why the loop isn't removed, as it has no effect on the outside world. I can only assume that such loops are sometimes used deliberately to waste time. If you decrease the upper bound to 100, the loop is removed completely.
Anyway: Yes, the optimizer will remove empty functions!
According to my check with release mode on current stable Rust, the following code:
fn foo() {}
fn main() {
for _ in 1..1000000000 {
foo();
}
println!(); // a side effect so that the whole program is not optimized away
}
Compiles to the same assembly as if the loop was empty:
for _ in 1..1000000000 {}
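For completeness, here is a minimal sketch of how the two versions of foo() from the question could be selected at compile time with cfg attributes. The feature name `instrument`, the `CALLS` counter, and the `hot_loop` helper are made up for illustration. With Cargo, the feature would have to be declared under `[features]` in Cargo.toml.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical counter, only here so the gated version has something to do.
static CALLS: AtomicU64 = AtomicU64::new(0);

// Included if the (hypothetical) `instrument` feature is enabled.
#[cfg(feature = "instrument")]
fn foo() {
    CALLS.fetch_add(1, Ordering::Relaxed);
}

// Included if the feature is disabled: an empty body the optimizer can drop.
#[cfg(not(feature = "instrument"))]
#[inline(always)]
fn foo() {}

// Stand-in for the performance-critical loop from the question.
fn hot_loop(n: u64) -> u64 {
    let mut sum = 0u64;
    for i in 1..n {
        sum = sum.wrapping_add(i);
        foo();
    }
    sum
}

fn main() {
    let sum = hot_loop(1_000);
    println!("sum = {sum}, instrumented calls = {}", CALLS.load(Ordering::Relaxed));
}
```

Exactly one of the two definitions exists in any given build, so the call site stays the same either way. The `#[inline(always)]` on the empty version is optional. It is only a hint to make the intent explicit.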
In a piece of inline assembly code, what is the best way to load the address of a label into a register?
I can do this easily in x86 or ARM. E.g.
lea my_label, %rax
...
my_label:
...
In PPC, should I use $PC and relative address to compute the address of the label? How to do that?
Thanks
OK, it's probably more complex than I thought. This might work:
void* f(void)
{
void* var_reg;
asm volatile(
"lis %[var_reg], my_label@ha\n"
"la %[var_reg], my_label@l(%[var_reg])\n"
"my_label:\n"
: [var_reg]"=&r"(var_reg)
);
return var_reg;
}
Is it possible to print/log the implementation of a certain class method at runtime to the console screen? I am assuming the log will be in assembly which is fine by me.
You could add a breakpoint at the start of the line, step through line by line and call "disassemble" in the debugger:
One line of my code (with private information replaced) for example produced this:
-(void) method
{
__weak typeof(self) selfReference = self; // <-- This call was disassembled.
...
Project`-[Class method] + 32 at Class.m:176:
-> 0x9c5cc: ldr r1, [sp, #304]
0x9c5ce: add r0, sp, #296
0x9c5d0: blx 0x33abec ; symbol stub for: objc_initWeak
0x9c5d4: ldr r1, [sp, #304]
Edit
I can't verify it's working perfectly since I'm not too handy with assembly, but you can use the debugger (I'm using LLDB, with Clang) to just call
disassemble -n methodName
This claims to
Disassemble entire contents of the given function name.
NB: I did this with a breakpoint at the start of the method I was using to test
Try creating a symbolic breakpoint to stop at the method in question:
In this C code fragment:
void func(void)
{
int x=10;
if (x>2)
{
int y=2;
//block statement
{
int m=12;
}
}
else
{
int z=5;
}
}
When do x, y, z and m get allocated and deallocated from func's stack frame?
The actual allocation depends on your compiler, but many compilers allocate space on the stack at the beginning of the function and free it just before the function returns. Note that this is separate from when the variables are actually accessible though, which is just till the end of the block they are defined in.
In your example, with optimization turned on, the compiler is likely not to allocate any space on the stack for your variables and simply return, since it can determine at compile time that the function doesn't actually have any effect.
According to C++ rules, you should think of every local variable as being destroyed at the end of its block. This is when its destructor is called. However, the compiler may decide to allocate and deallocate all local variables together at the beginning and end of the function, which is what the VC++ compiler does:
void func(void)
{
001413B0 push ebp
001413B1 mov ebp,esp
001413B3 sub esp,0F0h
001413B9 push ebx
001413BA push esi
001413BB push edi
001413BC lea edi,[ebp-0F0h]
001413C2 mov ecx,3Ch
001413C7 mov eax,0CCCCCCCCh
001413CC rep stos dword ptr es:[edi]
int x=10;
001413CE mov dword ptr [x],0Ah
if (x>2)
001413D5 cmp dword ptr [x],2
001413D9 jle func+3Bh (1413EBh)
{
int y=2;
001413DB mov dword ptr [y],2
//block statement
{
int m=12;
001413E2 mov dword ptr [m],0Ch
}
}
else
001413E9 jmp func+42h (1413F2h)
{
int z=5;
001413EB mov dword ptr [z],5
}
}
But these are implementation details; the compiler is free to adjust the stack pointer in some other way.
So the actual stack pointer adjustment is not defined, but constructor and destructor calls happen exactly according to the function's internal blocks. And of course, you cannot use a variable outside its block, since that does not compile, even though stack space for it may already be allocated at that point.
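The scoping half of this (destructors run at the end of each block, regardless of when stack space is reserved) can be observed directly. Below is a small sketch in Rust rather than C++, since Rust's Drop follows the same end-of-block rule. The `Tracked` type and the log are invented for the illustration, and the nesting mirrors the x, y, m and z of the question.

```rust
use std::cell::RefCell;

// Hypothetical helper type that records when it is created and dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Tracked<'a> {
    fn new(name: &'static str, log: &'a RefCell<Vec<String>>) -> Self {
        log.borrow_mut().push(format!("create {name}"));
        Tracked { name, log }
    }
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Same block structure as func() in the question.
fn func(log: &RefCell<Vec<String>>) {
    let _x = Tracked::new("x", log);
    let x_value = 10;
    if x_value > 2 {
        let _y = Tracked::new("y", log);
        {
            let _m = Tracked::new("m", log);
        } // m goes out of scope here
    } else {
        let _z = Tracked::new("z", log);
    }
} // x goes out of scope here

fn main() {
    let log = RefCell::new(Vec::new());
    func(&log);
    for line in log.borrow().iter() {
        println!("{line}");
    }
}
```

The log shows m dropped before y, and y before x, while z is never created because the else branch is not taken. None of this says anything about when the stack pointer moves, which stays up to the compiler.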
In case of automatic variables (variables declared within a block of code are automatic by default) memory is allocated automatically upon entry to a block and freed automatically upon exit from the block (if you're using c with gcc). You can check this source or this source for more information.
I am trying to execute inline assembly with icc using -use_msasm:
int main (void)
{
__asm{
mov eax, 5h; //works
push eax; // after shell command /opt/intel/bin/icc -use_msasm asm.c:
// asm.c(7): (col. 5) error: Unsupported instruction form in asm
// instruction push.
//pop ebp; // the same
};
printf("success!\n");
return 1;
}
Does anybody know why icc doesn't accept push and pop?
Thanks in advance!
You should use x64 version of registers instead.
So the correct version should look like this:
__asm{
mov rax, 5h;
push rax;
};
Also, pay attention to architecture differences when dealing with pointers, 0x8*******, etc. You should never use batch Find and Replace without reading your inline assembly first.
In Objective-C's low-level runtime headers (/usr/include/objc), there is an objc-exceptions.h file. It would seem this is how @try/@catch is implemented by the ObjC compiler.
I am trying to invoke these functions manually (for experiments with the ObjC runtime and implementation) in order to catch an "unrecognized selector sent to class" exception.
So basically, all I'm looking for is an example of how to do a @try/@catch using the low-level runtime functions. Thanks in advance!
So you want to know how the runtime does exception handling?
Prepare to be disappointed.
Because it doesn't. ObjC doesn't have an exception-handling ABI, only SPI which you've already found. No doubt you've also discovered that the Objective-C exception ABI is actually the exact same one as the C++ exception handling ABI. To that end, let's get started with some code.
#include <Foundation/Foundation.h>
int main(int argc, char **argv) {
@try {
@throw [NSException exceptionWithName:@"ExceptionalCircumstances" reason:@"Drunk on power" userInfo:nil];
} @catch(...) {
NSLog(@"Catch");
} @finally {
NSLog(@"Finally");
}
}
Run through clang with -ObjC -O3 (and stripped of a disgusting amount of debug information) we get this:
_main: ## @main
push rbp
mov rbp, rsp
push r14
push rbx
mov rdi, qword ptr [rip + L_OBJC_CLASSLIST_REFERENCES_$_]
mov rsi, qword ptr [rip + L_OBJC_SELECTOR_REFERENCES_]
lea rdx, qword ptr [rip + L__unnamed_cfstring_]
lea rcx, qword ptr [rip + L__unnamed_cfstring_2]
xor r8d, r8d
call qword ptr [rip + _objc_msgSend@GOTPCREL]
mov rdi, rax
call _objc_exception_throw
LBB0_2:
mov rdi, rax
call _objc_begin_catch
lea rdi, qword ptr [rip + L__unnamed_cfstring_4]
xor eax, eax
call _NSLog
call _objc_end_catch
xor ebx, ebx
LBB0_8:
lea rdi, qword ptr [rip + L__unnamed_cfstring_6]
xor eax, eax
call _NSLog
test bl, bl
jne LBB0_10
LBB0_11:
xor eax, eax
pop rbx
pop r14
pop rbp
ret
LBB0_5:
mov rbx, rax
call _objc_end_catch
jmp LBB0_7
LBB0_6:
mov rbx, rax
LBB0_7:
mov rdi, rbx
call _objc_begin_catch
mov bl, 1
jmp LBB0_8
LBB0_12:
mov r14, rax
test bl, bl
je LBB0_14
jmp LBB0_13
LBB0_10:
call _objc_exception_rethrow
jmp LBB0_11
LBB0_16: ## %.thread
mov r14, rax
LBB0_13:
call _objc_end_catch
LBB0_14:
mov rdi, r14
call __Unwind_Resume
LBB0_15:
call _objc_terminate
If you compile it with ObjC++ nothing changes. (Well, that's not entirely true. The last _objc_terminate turns into a jump into clang's personal ___clang_call_terminate routine). Anyhow, this code can be divided into 3 important sections. The first is from _main to the start of LBB0_2, or where our try block happens. Because we're blatantly throwing an exception and catching it in our try block, the compiler has gone ahead and removed the branch around LBB0_2 and moved straight to the catch handlers. At this point Objective-C, or more accurately CoreFoundation, has set up an exception object for us and libC++ has begun searching for an exception handler during the requisite unwinding phase.
The second important block of code is from LBB0_2 to the end of LBB0_11 where our catch and finally blocks live. Because all is well, all the code below this is dead (and hopefully gets stripped in release), but let's imagine it wasn't.
The third part is from LBB0_8 on down, which the compiler would have jumped to from the NSLog in LBB0_2 if we'd done something stupid like, say, tried not to catch our exception. This handler instead flips a bit after calling into objc_begin_catch, which causes us to branch around the ret and move on to objc_exception_rethrow(), which tells the unwind handler that we've dropped the ball and that it should continue searching for handlers somewhere else. Of course, we're main, so there are no other handlers, and std::terminate gets invoked as we leave.
All this to say you're gonna have a bad time if you try to write this stuff out by hand. All the __cxa_* and ObjC SPI functions throw around exception objects in ways you can't rely on, and (rather pessimistically many) handlers are emitted in a very tight order to make sure the C++ ABI contract is fulfilled, because if it isn't, the spec mandates that std::terminate be called. If you'd like to take an active listening role, you are allowed to replace the exception-handling hooks with your own functions: Objective-C has objc_setUncaughtExceptionHandler, objc_setExceptionMatcher, and objc_setExceptionPreprocessor.