Insert int variable into the class using LLVM PASS or Clang - variables

I want to insert integer variable into all classes using LLVM PASS or Clang.
How to do this ?
For example..
class foo {
int a;
}
I want to insert new value as below.
class foo {
int a;
unsigned int b; // I want to insert this.
}
How can I do this using LLVM PASS or Clang ?
- I much prefer LLVM PASS.
Thank you very much :)

My recommendation would be to use Clang for this, as LLVM operates on bitcode (IR) and the operation you want is very much C++ related, so why not exploit Clang's knowledge about the AST?
With LibTooling you can write stand-alone tool to do exactly what you want. More specifically, use an AST Matcher to find all C++ class declarations (cxxRecordDecl). You can then insert a new FieldDecl in your callback.
More info: LibTooling and LibASTMatchers Tutorial

Related

What do curly braces inside a function's argument list mean in C? [duplicate]

This code:
#include <stdio.h>
int main()
{
void (^a)(void) = ^ void () { printf("test"); } ;
a();
}
Compile without warning with clang -Weverything -pedantic -std=c89 (version clang-800.0.42.1) and print test.
I could not find any information about standard C having lambda, also gcc has it's own syntax for lambda and it would be strange for them do this if a standard solution existed.
This behavior seems to be specfic to newer versions of Clang, and is a language extension called "blocks".
The Wikipedia article on C "blocks" also provides information which supports this claim:
Blocks are a non-standard extension added by Apple Inc. to Clang's implementations of the C, C++, and Objective-C programming languages that uses a lambda expression-like syntax to create closures within these languages. Blocks are supported for programs developed for Mac OS X 10.6+ and iOS 4.0+, although third-party runtimes allow use on Mac OS X 10.5 and iOS 2.2+ and non-Apple systems.
Emphasis above is mine. On Clang's language extension page, under the "Block type" section, it gives a brief overview of what the Block type is:
Like function types, the Block type is a pair consisting of a result value type and a list of parameter types very similar to a function type. Blocks are intended to be used much like functions with the key distinction being that in addition to executable code they also contain various variable bindings to automatic (stack) or managed (heap) memory.
GCC also has something similar to blocks called lexically scoped nested functions. However, there are some key differences also note in the Wikipedia articles on C blocks:
Blocks bear a superficial resemblance to GCC's extension of C to support lexically scoped nested functions. However, GCC's nested functions, unlike blocks, must not be called after the containing scope has exited, as that would result in undefined behavior.
GCC-style nested functions also require dynamic creation of executable thunks when taking the address of the nested function. [...].
Emphasis above is mine.
the C standard does not define lambdas at all but the implementations can add extensions.
Gcc also added an extension in order for the programming languages that support lambdas with static scope to be able to convert them easily toward C and compile closures directly.
Here is an example of extension of gcc that implements closures.
#include <stdio.h>
int(*mk_counter(int x))(void)
{
int inside(void) {
return ++x;
}
return inside;
}
int
main() {
int (*counter)(void)=mk_counter(1);
int x;
x=counter();
x=counter();
x=counter();
printf("%d\n", x);
return 0;
}

cmake CheckSymbolExists for intrinsic

I'd like to check for intel intrinsics such as _mm_popcnt_u32 or _mm_blendv_epi8 using cmake. However the function check_symbol_exists doesn't work properly depending on the compiler. (it works with Clang but not with GCC). Indeed it is documented
If the header files declare the symbol as a function or variable then the symbol must also be available for linking (so intrinsics may not be detected).
Is there a simple way to check for those ?
As the CMake documentation also states:
If the symbol is a type, enum value, or intrinsic it will not be
recognized (consider using CheckTypeSize or CheckCSourceCompiles).
So, try to compile a small example with the Intel intrinsic using CheckCSourceCompiles. E.g.:
include(CheckCSourceCompiles)
check_c_source_compiles("
int main() {
int tmp = _mm_popcnt_u32(0);
return 0;
}
"
HAVE_INTRINISC
)

Force g++ to generate code for unused functions

By default, g++ seems to omit code for unused in-class defined methods. Example from my previous question:
struct Foo {
void bar() {}
void baz() {}
};
int main() {
Foo foo;
foo.bar();
}
When compiling this with g++ -g -O0 - c main.cpp, the resulting object file only contains references to bar and not to baz. Adding --no-deafault-inline to the computer flags does not help either. Any ideas how I can force g++ to generate code for baz as well?
Rationale
The test coverage tool gcov reports unused methods as non-executable if they are omitted from the final executable. However, to get meaningful reports I want them to be reported as executable-but-not-executed. For this, I need to find a way to achieve this without having to alter the original source code.
A portable way to do that is to add some "reference" (in the ordinary sense, not only the C++ one, of the word) to these routines.
This could be something as simple as
typedef void (Foo::*funptr_t) (void);
extern "C" const funptr_t tabfun[] = { &Foo::bar, &Foo::baz };
(I'm declaring the array tabfun as extern "C" to be sure that array is emitted, even if not used)
You might try the -fno-inline argument to GCC. You could also customize GCC (e.g. with MELT) to have such an array added automatically (without touching the source code), but this requires some work.

Strange "selector mangling" in Objective-C method with Boolean arguments

I need to recover method names dynamically, via reflection calls at runtime. But get strange results for some.
My TestClass contains a method like:
- (void)testMethod6_NSRect:(NSRect)a1 int:(int)a2 int:(int)a3 bool:(Boolean)a4 {
...
}
When enumerating the above classes methods using class_copyMethodList() and fetching the method selectors via method_getName(), I get:
"testMethod6_NSRect:int:int:_Bool:"
Thus, the selector was rewritten somehow (by gcc?) to make "_Bool" from "bool". As far as I tested yet, this seems to be done only for a "bool" selector-part - if I change it to int:(int), as in:
- (void)testMethod1_int:(int)a1 int:(int)a2 int:(int)a3 int:(int)a4 {
...
}
I get the expected:
"testMethod1_int:int:int:int:"
Q:
Does anyone know the rules or a pointer to where I could find them for this "selector rewriting", or am I missing something? Is this only done for "bool"?
I also need to know if this behavior is depending on the gcc-version, osx-version or runtime library version.
I am using (gcc --version):
i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)
on a (uname -a)
10.8.0 Darwin Kernel Version 10.8.0:
The problem lies in an ugly piece of preprocessor magic in the C99 standard header <stdbool.h>:
#define bool _Bool
C99 defines a type named _Bool which behaves like C++'s bool type. The define is there to be able to use it in C but with the C++ identifier.
Solution:
#undef bool
Try using BOOL instead of Boolean

avoiding duplicate SWIG boilerplate when using many SWIG-generated modules

When generating an interface module with SWIG, the generated C/C++ file contains a ton of static boilerplate functions. So if one wants to modularize the use of SWIG-generated interfaces by using many separately compiled small interfaces in the same application, there ends up being a lot of bloat due to these duplicate functions.
Using gcc's -ffunction-sections option, and the GNU linker's --icf=safe option (-Wl,--icf=safe to the compiler), one can remove some of the duplication, but by no means all of it (I think it won't coalesce anything that has a relocation in it—which many of these functions do).
My question: I'm wondering if there's a way to remove more of this duplicated boilerplate, ideally one that doesn't rely on GNU-specific compiler/linker options.
In particular, is there a SWIG option/flag/something that says "don't include boilerplate in each output file"? There actually is a SWIG option, -external-runtime that tells it to generate a "boilerplate-only" output file, but no apparent way of suppressing the copy included in each normal output file. [I think this sort of thing should be fairly simple to implement in SWIG, so I'm surprised that it doesn't seem to exist... but I can't seem to find anything documented.]
Here's a small example:
Given the interface file swg-oink.swg for module swt_oink:
%module swt_oink
%{ extern int oinker (const char *x); %}
extern int oinker (const char *x);
... and a similar interface swg-barf.swg for swt_barf:
%module swt_barf
%{ extern int barfer (const char *x); %}
extern int barfer (const char *x);
... and a test main file, swt-main.cc:
extern "C"
{
#include "lua.h"
#include "lualib.h"
#include "lauxlib.h"
extern int luaopen_swt_oink (lua_State *);
extern int luaopen_swt_barf (lua_State *);
}
int main ()
{
lua_State *L = lua_open();
luaopen_swt_oink (L);
luaopen_swt_barf (L);
}
int oinker (const char *) { return 7; }
int barfer (const char *) { return 2; }
and compiling them like:
swig -lua -c++ swt-oink.swg
g++ -c -I/usr/include/lua5.1 swt-oink_wrap.cxx
swig -lua -c++ swt-barf.swg
g++ -c -I/usr/include/lua5.1 swt-barf_wrap.cxx
g++ -c -I/usr/include/lua5.1 swt-main.cc
g++ -o swt swt-main.o swt-oink_wrap.o swt-barf_wrap.o
then the size of each xxx_wrap.o file is about 16KB, of which 95% is boilerplate, and the size of the final executable is roughly the sum of these, about 39K. If one compiles each interface file with -ffunction-sections, and links with -Wl,--icf=safe, the size of the final executable is 34KB, but there's still clearly a lot of duplication (using nm on the executable one can see tons of functions defined multiple times, and looking at their source, it's clear that it would be fine to use a single global definition for most of them).
I'm fairly sure SWIG doesn't have an option for doing this. I'm speculating now, but I think the reason might well be concern about visibility of this for modules built with different versions of SWIG. Imagine the following scenario:
Two libraries X and Y both provide an interface to their code using SWIG. They both opt to make the "SWIG glue" stuff visible across different translation units in order to reduce code size. This will all be well and good if both X and Y are using the same version of SWIG. What happens though if X uses SWIG 1.1 and Y uses SWIG 1.3? Both modules work fine on their own, but depending on how the platform treats shared objects and how the language itself loads them (RTLD_GLOBAL?) some potentially very bad things would happen from the combination of the two modules being used in the same VM.
The penalty of the code duplication is pretty low I suspect - the cost of swapping between VM and native code is typically quite high, which probably dwarfs the slightly reduced instruction cache hits, although it might be interesting to see real benchmarks. On the up side this is code no users ever need to worry about it, since it's all auto generated and all correctly kept with interfaces written for the corresponding version.
I might be a bit late, but here is a workaround:
In SWIG (<= 1.3 ) there is -noruntime command-line option
Since SWIG 2.0 -noruntime was deprecated, so now one should pass -DSWIG_NOINCLUDE to the C preprocessor - not to the swig itself
I am completely not sure that this is correct, but it at least works for me. I am going to clarify this question in the SWIG's mailing list.