I have two LLVM functions that use an API. Something like this:
define void @funcA() {
...
call void @api_set(i32 %someValue)
...
}
define void @funcB() {
...
%sameValue = call i32 @api_get()
...
}
declare void @api_set(i32)
declare i32 @api_get()
The functions api_set and api_get are black boxes to the optimizer.
However, I know that for these particular calls, %someValue in funcA will always be equal to %sameValue in funcB.
When LLVM optimizes these functions, the constant propagation pass may find that %someValue can be replaced with a constant. Is there a way to tell LLVM it is allowed to also replace %sameValue when it replaces %someValue? Perhaps an attribute/metadata/etc? Something simpler that I'm missing?
I have already made a pass that checks for this, and it works, but the pass has such a simple specification that I imagine there is a simpler/more lightweight way to achieve this.
P.S. If there is a way to do this in C that would be helpful too.
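For concreteness, here is a stripped-down sketch of what such a pass might look like (new pass manager; the pass name and structure are illustrative, registration boilerplate is omitted, and my real pass handles more cases):
#include "llvm/IR/Constants.h"
#include "llvm/IR/InstIterator.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"

using namespace llvm;

struct ForwardApiValuePass : PassInfoMixin<ForwardApiValuePass> {
  PreservedAnalyses run(Module &M, ModuleAnalysisManager &) {
    ConstantInt *Known = nullptr;

    // Step 1: find the constant passed to api_set; give up if it isn't a unique constant.
    for (Function &F : M)
      for (Instruction &I : instructions(F))
        if (auto *CI = dyn_cast<CallInst>(&I))
          if (Function *Callee = CI->getCalledFunction())
            if (Callee->getName() == "api_set") {
              auto *C = dyn_cast<ConstantInt>(CI->getArgOperand(0));
              if (!C || (Known && Known != C))
                return PreservedAnalyses::all();
              Known = C;
            }
    if (!Known)
      return PreservedAnalyses::all();

    // Step 2: forward that constant to every use of api_get's result.
    // The api_get calls themselves are left in place in case they have side effects.
    bool Changed = false;
    for (Function &F : M)
      for (Instruction &I : instructions(F))
        if (auto *CI = dyn_cast<CallInst>(&I))
          if (Function *Callee = CI->getCalledFunction())
            if (Callee->getName() == "api_get") {
              CI->replaceAllUsesWith(Known);
              Changed = true;
            }

    return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
  }
};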
How can I make an ArrayList of functions, and call each function easily? I have already tried making an ArrayList<Function<Unit>>, but when I tried to do this:
functionList.forEach { it }
and this:
for(i in 0 until functionList.size) functionList[i]
neither actually called the functions. When I tried it() and functionList[i]() instead, it wouldn't even compile in IntelliJ. How can I do this in Kotlin? Also, does the "Unit" in ArrayList<Function<Unit>> mean the return value or the parameters?
Just like this:
val funs:List<() -> Unit> = listOf({}, { println("fun")})
funs.forEach { it() }
The compiler can successfully infer the type of funs here which is List<() -> Unit>. Note that () -> Unit is a function type in Kotlin which represents a function that does not take any argument and returns Unit.
I think there are two problems with the use of the Function interface here.
The first problem is that it doesn't mean what you might think. As I understand it, it's a very general interface, implemented by all functions, however many parameters they take (or none). So it doesn't have any invoke() method. That's what the compiler is complaining about.
Function has several sub-interfaces, one for each 'arity' (i.e. one for each number of parameters): Function0 for functions that take no parameters, Function1 for functions taking one parameter, and so on. These have the appropriate invoke() methods. So you could probably fix this by replacing Function by Function0.
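For example, this compiles fine (a quick sketch; the lambdas are just placeholders):
val actions: List<Function0<Unit>> = listOf({ println("first") }, { println("second") })
actions.forEach { it.invoke() }   // Function0 declares invoke(), so plain it() also works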
But that leads me on to the second problem, which is that the Function interfaces aren't supposed to be used this way. I think they're mainly for Java compatibility and/or for internal use by the compiler.
It's usually much better to use the Kotlin syntax for function types: (P1, P2...) -> R. This is much easier to read, and avoids these sorts of problems.
So the real answer is probably to replace Function<Unit> by () -> Unit.
Also, in case it's not clear, Kotlin doesn't have a void type. Instead, it has a type called Unit, which has exactly one value. This might seem strange, but makes better sense in the type system, as it lets the compiler distinguish functions that return without an explicit value, from those which don't return. (The latter might always throw an exception or exit the process. They can be defined to return Nothing -- a type with no values at all.)
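A tiny illustration of the difference (the names are made up):
fun logDone(): Unit { println("done") }   // returns normally, with the single Unit value
fun fail(message: String): Nothing = throw IllegalStateException(message)   // never returns normally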
I'm optimizing a very time-critical CUDA kernel. My application accepts a wide range of switches that affect the behavior (for instance, whether to use a 3rd- or 5th-order derivative). As an approximation, consider a set of 50 switches, where every switch is an integer variable (sometimes a bool or a float, but that case is not so relevant to this question).
All these switches are constant during the execution of the application. Most of them are run-time switches that I store in constant memory, so as to exploit the caching mechanism. Some others can be compile-time switches, and the customer is fine with having to re-compile the application to change a switch's value. A very simple example could be:
__global__ void mykernel(const float* in, float *out)
{
    for ( /* many many times */ )
        if (compile_time_switch)
            do_this(in, out);
        else
            do_that(in, out);
}
Assume that do_this and do_that are compute-bound and very cheap, that I optimize the for loop so that its overhead is negligible, and that I have to place the if inside the loop. If the compiler recognizes that compile_time_switch is static information, it can optimize out the call to the "wrong" function and create code that is just as optimized as if the if weren't there. Now the real question:
In which ways can I provide the compiler with the static value of this switch? I see two such ways, listed below, but none of them work for me. What other possibilities remain?
Template parameters
Providing a template parameter enables this static optimization.
template<int compile_time_switch>
__global__ void mykernel(const float* in, float *out)
{
    for ( /* many many times */ )
        if (compile_time_switch)
            do_this(in, out);
        else
            do_that(in, out);
}
This simple solution does not work for me, since I don't have direct access to the code that calls the kernel.
Static members
Consider the following struct:
struct GlobalParameters
{
static const bool compile_time_switch = true;
};
Now GlobalParameters::compile_time_switch contains the static information as I want it, and the compiler would be able to optimize the kernel. Unfortunately, CUDA does not support such static members.
EDIT: the last statement is apparently wrong. The definition of the struct is of course legitimate, and you are able to use the static member GlobalParameters::compile_time_switch in device code. The compiler inlines the variable, so that the final code directly contains the value rather than a run-time variable access, which is the behavior you would expect from an optimizing compiler. So the second option is actually suitable.
I consider my problem solved, both thanks to this fact and to kronos' answer. However, I'm still looking for alternative methods of providing compile-time information to the compiler.
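For the record, here is a trimmed-down, self-contained version of the static-member variant (the bodies of do_this/do_that and the loop count are placeholders I made up for illustration):
struct GlobalParameters
{
    static const bool compile_time_switch = true;
};

__device__ void do_this(const float* in, float* out) { out[threadIdx.x] = in[threadIdx.x] * 2.0f; }
__device__ void do_that(const float* in, float* out) { out[threadIdx.x] = in[threadIdx.x] + 1.0f; }

__global__ void mykernel(const float* in, float* out)
{
    for (int i = 0; i < 10000; ++i)                    // stands in for "many many times"
        if (GlobalParameters::compile_time_switch)     // known at compile time, the dead branch is dropped
            do_this(in, out);
        else
            do_that(in, out);
}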
Your third option is preprocessor definitions:
#define compile_time_switch 1
__global__ void mykernel(const float* in, float *out)
{
    for ( /* many many times */ )
        if (compile_time_switch)
            do_this(in, out);
        else
            do_that(in, out);
}
With the macro expanded to the constant 1, the if collapses at compile time and the else branch is discarded entirely, so later dead-code-elimination passes have nothing left to do, because there is no dead code.
Furthermore, you can specify the definition with the -D command-line switch, and (I think) any compiler supported by NVIDIA will accept -D (MSVC may use a different switch).
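For example, instead of the #define in the source, something along these lines (file and output names are made up):
nvcc -Dcompile_time_switch=1 -O3 -o myapp mykernel.cu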
Assuming I'm including a header file in my precompiled header that includes a bunch of inline functions to be used as helpers wherever needed in any of the project's TUs -- what would be the correct way to write those inlines?
1) as static inlines? e.g.:
static inline BOOL doSomethingWith(Foo *bar)
{
// ...
}
2) as extern inlines? e.g.:
in Shared.h
extern inline BOOL doSomethingWith(Foo *bar);
in Shared.m
inline BOOL doSomethingWith(Foo *bar)
{
// ...
}
My intention with inlines is to:
make the code less verbose by encapsulating common instructions
to centralize the code they contain to aid with future maintenance
to use them instead of macros for the sake of type safety
to be able to have return values
So far I have only seen variant 1) in the wild.
I have read (sadly I can't find it anymore) that variant 1) does not actually move the inline function's body into the callers but rather creates a new function, and that only extern inline ensures that kind of behavior.
Setting aside whether you should be inlining at all for the reasons you give, the standard way to inline in Cocoa is to use the predefined macro NS_INLINE - use it either in the source file that uses the function or in an imported header. So your example becomes:
NS_INLINE BOOL doSomethingWith(Foo *bar)
For GCC/Clang the macro uses static and the always_inline attribute.
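If I recall the Foundation headers correctly, for those compilers it expands to roughly this:
#define NS_INLINE static __inline__ __attribute__((always_inline))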
Most (maybe all) compilers won't inline extern inline as they operate on a single compile unit at a time - a source file along with all its includes.
make the code less verbose by encapsulating common instructions
Non-inline functions do that as well...
to centralize the code they contain to aid with future maintenance
then you should have non-inline functions, don't you think?
to use them instead of macros for the sake of type safety
to be able to have return values
those seem OK to me.
Well, when I write inline functions, I usually make them static - that's typically how it's done. (Else you can get all sorts of mysterious linker errors if you're not careful enough.) It's important to note that inline does not affect the visibility of a function, so if you want to use it in multiple files, you need the static modifier.
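A minimal sketch of that pattern (the helper itself is made up):
// Shared.h
static inline int clampToByte(int value)
{
    // Every translation unit that includes this header gets its own internal copy,
    // so there are no duplicate-symbol errors at link time.
    return value < 0 ? 0 : (value > 255 ? 255 : value);
}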
An extern inline function does not make a lot of sense. If you have only one implementation of the function, that defeats the purpose of inline. If you use link-time optimization (where cross-file inlining is done by the linker), then the inline hint for the compiler is not very useful anyway.
only extern inline ensures that kind of behavior.
It doesn't "ensure" anything at all. There's no portable way to force inlining - in fact, most modern compilers ignore the keyword completely and use heuristics instead to decide when to inline. In GNU C, you can force inlining using the __attribute__((always_inline)) attribute, but unless you have a very good reason for that, you shouldn't be doing it.
If you take this method, for instance (from another post):
- (int)methodName:(int)arg1 withArg2:(int)arg2
{
// Do something crazy!
return someInt;
}
Is withArg2 actually ever used for anything inside this method?
withArg2 is part of the method name (it is usually written without arguments as methodName:withArg2: if you want to refer to the method in the documentation), so no, it is not used for anything inside the method.
As Tamás points out, withArg2 is part of the method name. If you write a function with the exact same name in C, it will look like this:
int methodNamewithArg2(int arg1, int arg2)
{
// Do something crazy!
return someInt;
}
Coming from other programming languages, the Objective-C syntax at first might appear weird, but after a while you will start to understand how it makes your whole code more expressive. If you see the following C++ function call:
anObject.subString("foobar", 2, 3, true);
and compare it to a similar Objective-C method invocation
[anObject subString:"foobar" startingAtCharacter:2 numberOfCharacters:3 makeResultUpperCase:YES];
it should become clear what I mean. The example may be contrived, but the point is to show that embedding the meaning of the next parameter into the method name allows you to write very readable code. Even if you choose horrible variable names or use literals (as in the example above), you will still be able to make sense of the code without having to look up the method documentation.
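For comparison, the (made-up) declaration behind such a call would look something like this:
- (NSString *)subString:(const char *)source
    startingAtCharacter:(NSUInteger)start
     numberOfCharacters:(NSUInteger)length
    makeResultUpperCase:(BOOL)uppercase;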
You would call this method as follows:
int i = [self methodName:arg1 withArg2:arg2];
This is just Objective-C's way of making the code easier to read.
Is it possible to call a function by name in Objective C? For instance, if I know the name of a function ("foo"), is there any way I can get the pointer to the function using that name and call it? I stumbled across a similar question for python here and it seems it is possible there. I want to take the name of a function as input from the user and call the function. This function does not have to take any arguments.
For Objective-C methods, you can use performSelector… or NSInvocation, e.g.
NSString *methodName = @"doSomething";
[someObj performSelector:NSSelectorFromString(methodName)];
For C functions in dynamic libraries, you can use dlsym(), e.g.
#include <dlfcn.h>   /* for dlopen()/dlsym() */

void *dlhandle = dlopen("libsomething.dylib", RTLD_LOCAL);
void (*function)(void) = dlsym(dlhandle, "doSomething");
if (function) {
function();
}
For C functions that were statically linked into the binary, there is no general way. But if the corresponding symbol hasn’t been stripped from the binary, you can still use dlsym(), e.g.
void (*function)(void) = dlsym(RTLD_SELF, "doSomething");
if (function) {
function();
}
Update: ThomasW wrote a comment pointing to a related question, with an answer by dreamlax which, in turn, contains a link to the POSIX page about dlsym. In that answer, dreamlax notes the following with regard to converting a value returned by dlsym() to a function pointer variable:
The C standard does not actually define behaviour for converting to and from function pointers. Explanations vary as to why; the most common being that not all architectures implement function pointers as simple pointers to data. On some architectures, functions may reside in an entirely different segment of memory that is unaddressable using a pointer to void.
With this in mind, the calls above to dlsym() and the desired function can be made more portable as follows:
void (*function)(void);
*(void **)(&function) = dlsym(dlhandle, "doSomething");
if (function) {
(*function)();
}