Compile and run C code using clang API

I would like to use the clang/LLVM APIs to compile a C function defined in a string and immediately execute it.
Something like:
void main() {
    std::string codestr = "int foo(int bar) { return bar * 2; }";
    clang::??? *code = clang::???.compile(codestr);
    int result = code->call("foo", 5);
}
I am looking for tutorials, but what I found so far does not quite match my goal or does not work, because it refers to an outdated version of LLVM.
Currently, I am using LLVM 3.5.
Does anyone have a good tutorial at hand?

I followed this blog post with good results. The clang API has changed since it was written, so you may have to make adjustments. With LLVM 3.6.1, the following code worked for me:
// Approximate headers for LLVM/Clang 3.6:
#include "clang/Basic/Diagnostic.h"
#include "clang/Basic/DiagnosticOptions.h"
#include "clang/CodeGen/CodeGenAction.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Frontend/CompilerInvocation.h"
#include "clang/Frontend/TextDiagnosticPrinter.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"

llvm::Module* compile(const char* filename) {
    clang::CompilerInstance compiler;
    clang::CompilerInvocation* invocation = new clang::CompilerInvocation();

    // Diagnostics that print to stderr.
    llvm::IntrusiveRefCntPtr<clang::DiagnosticIDs> DiagID(new clang::DiagnosticIDs());
    auto diagOptions = new clang::DiagnosticOptions();
    clang::DiagnosticsEngine Diags(DiagID, diagOptions,
        new clang::TextDiagnosticPrinter(llvm::errs(), diagOptions));

    // Build the invocation as if from command-line arguments (just the file name here).
    std::vector<const char *> arguments = {filename};
    clang::CompilerInvocation::CreateFromArgs(*invocation,
        &*arguments.begin(), &*arguments.end(), Diags);
    compiler.setInvocation(invocation);
    compiler.setDiagnostics(new clang::DiagnosticsEngine(DiagID, diagOptions,
        new clang::TextDiagnosticPrinter(llvm::errs(), diagOptions)));

    // Run the frontend and emit LLVM IR only (no machine code yet).
    std::unique_ptr<clang::CodeGenAction> action(new clang::EmitLLVMOnlyAction());
    compiler.ExecuteAction(*action);

    std::unique_ptr<llvm::Module> result = action->takeModule();
    llvm::errs() << *result;   // dump the generated IR
    return result.release();
}
I was very careless with the pointers, so it's very possible there's a memory leak or a double free (although it didn't crash).
I couldn't figure out how to take the source from a memory buffer, so I dumped it into a temporary file using mkstemp.
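A rough, untested sketch of that workaround: write the string to a temporary file and pass the file name to compile(). I use mkstemps (a glibc/BSD extension) here so the file gets a .c extension:
#include <cstdlib>     // mkstemps (glibc/BSD extension)
#include <string>
#include <unistd.h>    // write, close

// Writes `code` to a unique temporary .c file and returns its path.
// Error handling omitted for brevity.
std::string dump_to_temp_file(const std::string &code) {
    char path[] = "/tmp/jit-src-XXXXXX.c";
    int fd = mkstemps(path, 2);               // 2 = length of the ".c" suffix
    write(fd, code.data(), code.size());
    close(fd);
    return path;
}

// usage: llvm::Module *m = compile(dump_to_temp_file(codestr).c_str());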
I didn't get around to executing the result, but I think you can follow @michael-haidi's answer, or check out the LLVM Kaleidoscope tutorial (the JIT chapter).
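For reference, something along these lines should work for actually executing the module with MCJIT on LLVM 3.6 (a rough, untested sketch; compile() is the function above and error handling is omitted):
#include "llvm/ExecutionEngine/ExecutionEngine.h"
#include "llvm/ExecutionEngine/MCJIT.h"      // forces the MCJIT engine to be linked in
#include "llvm/Support/TargetSelect.h"

int run_foo(const char *filename) {
    llvm::InitializeNativeTarget();
    llvm::InitializeNativeTargetAsmPrinter();

    std::unique_ptr<llvm::Module> module(compile(filename));
    llvm::ExecutionEngine *engine = llvm::EngineBuilder(std::move(module)).create();
    engine->finalizeObject();    // emit and relocate the machine code

    // Look up the JIT-compiled function and call it through a typed pointer.
    auto foo = reinterpret_cast<int (*)(int)>(engine->getFunctionAddress("foo"));
    return foo(5);               // 10 for "int foo(int bar) { return bar * 2; }"
}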

I recommend using MCJIT, because the old JIT infrastructure will be removed in a future release.
I can't point you to a full tutorial, and I can't promise that the API hasn't changed since the blog post, but here you'll find a guide on how to use MCJIT with the Kaleidoscope example from LLVM, and that's about it. Examples and tutorials are hard to find for LLVM/Clang. However, I suggest trying it, and maybe you can document your journey with a short example.
The Julia project also uses MCJIT for JIT compilation inside the Julia language. Maybe you can peek at the code and find out how they use MCJIT.
Good luck ;)

Related

CLion IDE, whenever I create a new file it gives an error [duplicate]

In my journey to learning C++, I'm learning through the C++ manual that's on the actual website. I'm using Dev-C++ and have hit a problem, not knowing whether it's the compiler's error or not.
I was going through this code bit by bit, typing it in myself, as I feel it's more productive, and adding my own stuff that I've learnt to the examples. Then I got to initialising variables. This is the code that is in the C++ manual:
#include <iostream>
using namespace std;

int main ()
{
    int a = 5;       // initial value = 5
    int b(2);        // initial value = 2
    int result;      // initial value undetermined

    a = a + 3;
    result = a - b;
    cout << result;

    return 0;
}
This is popping up a compiler error saying "Multiple definitions of 'main'".
Now, this is from the actual C++ page, so I'm guessing it's a compiler error.
Could someone please point me in the right direction as to why this is happening and what the cause of this error is?
Multiple definitions of "main" suggests that you have another definition of main. Perhaps in another .c or .cpp file in your project. You can only have one function with the same name and signature (parameter types). Also, main is very special so you can only have one main function that can be used as the entry point (has either no parameters, one int, or an int and a char**) in your project.
P.S. Technically this is a linker error. It's a subtle difference, but basically it's complaining that the linker can't determine which function should be the entry point, because there's more than one definition with the same name.
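As a minimal illustration, two source files in the same project that both define main reproduce this at link time:
// a.cpp
int main() { return 0; }

// b.cpp
int main() { return 1; }    // second definition of main in the same program

// Linking both together, e.g. `g++ a.cpp b.cpp`, fails with something like
// "multiple definition of `main'" from the linker.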
I found that I had two file references in my tasks.json file that were causing this error, which took me a long time to figure out. Hope this helps someone else. See "HERE*****" below:
"-I/usr/include/glib-2.0",
"-I/usr/lib/x86_64-linux-gnu/glib-2.0/include",
//"${file}", //HERE**********************
"-lgtk-3",
"-lgdk-3",
"-lpangocairo-1.0",
"-lpango-1.0",
"-lharfbuzz",
"-latk-1.0",
"-lcairo-gobject",
"-lcairo",
"-lgdk_pixbuf-2.0",
"-lgio-2.0",
"-lgobject-2.0",
"-lglib-2.0",
"-o",
"${fileDirname}/${fileBasenameNoExtension}" //HERE*************
],
When I was practicing CMake, I encountered the same problem. I finally found that the source code path set in my CMakeLists was incorrect; as a result, the build picked up many duplicate files generated during the CMake run, and that caused the multiple-definition errors.

Code sharing between multiple independently compiled binaries/hex files

I'm looking for documentation/information on how to share information/code between multiple binaries compiled for Cortex-M0/M4/M7 architectures. The two binaries will be on the same chip and the same architecture. They are flashed at different locations; one binary sets the main stack pointer and resets the program counter so that it "jumps" to the other binary. I want to share code between these two binaries.
I've done a simple copy of an array of function pointers into a RAM section defined in the linker script, then read that RAM back in the other binary, cast it to an array, and used indices to call functions in the other binary. This works as a proof of concept, but what I'm looking for is a bit more complex, as I want some way of describing compatibility between the two binaries. I want something like the functionality of shared libraries, but I'm unsure whether I need position-independent code.
As an example of how the current copy process is done, it is basically:
Source binary:
void copy_func()
{
    // copy the function pointer table into the custom RAM section
    memcpy(address_custom_ram_section, array_of_function_pointers, fixed_size);
}
Binary which is jumped to from the source binary:
array_fp_type get_funcs()
{
    // read the table back out of the custom RAM section
    memcpy(array_of_fp, address_custom_ram_section, fixed_size);
    return array_of_fp;
}
Then I can use the array_of_fp to call into functions residing in the source binary from the jump binary.
So what I'm looking for is some resources or input from someone who has implemented a similar system. For example, I would like to avoid needing a custom RAM section to copy the function pointers into.
I would be fine with the compilation step of the source binary outputting something which can be included in the compilation step of the jump binary. However, it needs to be reproducible, and recompiling the source binary shouldn't break compatibility with the jump binary (even if it included a different file from what is currently output), as long as you don't change the interface.
To clarify, the source binary shouldn't require any specific knowledge about the jump binary. The code should not reside in both binaries, as this would defeat the purpose of the mechanism. The overall goal of this mechanism is to save space when creating multi-binary applications on Cortex-M processors.
Any ideas or links to resources are welcome. If you have any more questions feel free to comment on the question and I'll try to answer it.
It's very hard for me to picture what you want to do, but if you're interested in having an application link against your bootloader/ROM, then see Loading symbol file while linking for a hint on what you could do.
Build your "source"(?) image, scrape its mapfile and make a symbol file, then use that when you link your "jump"(?) image.
This does mean you need to link your "jump" image against a specific version of your "source" image.
If you need them to be semi-version independent (i.e. you define a set of functions that get exported, but you can rebuild on either side), then you need to export function pointers at known locations in your "source" image and link against those function pointers in your "jump" image. You can simplify the bookkeeping by making a structure of function pointers and accessing the functions through that structure on either side.
For example:
shared_functions.h:
struct FunctionPointerTable
{
    void (*function1)(int);
    void (*function2)(char);
};

extern struct FunctionPointerTable sharedFunctions;
Source file in "source" image:
#include <stdio.h>
#include "shared_functions.h"

void function2Implementation(char b);   // forward declaration for the direct call below

void function1Implementation(int a)
{
    printf("You sent me an integer: %d\r\n", a);
    function2Implementation((char)(a % 256));     // direct call: always the "source" version
    sharedFunctions.function2((char)(a % 256));   // call through the shared table
}

void function2Implementation(char b)
{
    printf("You sent me a char: %c\r\n", b);
}

struct FunctionPointerTable sharedFunctions =
{
    function1Implementation,
    function2Implementation,
};
Source file in "jump" image:
#include "shared_functions.h"
sharedFunctions.function1(1024);
sharedFunctions.function2(100);
When you compile/link the "source" image, take its mapfile, extract the location of sharedFunctions, and create a symbol file that is linked along with the sources of the "jump" image.
Note: the printfs (or anything directly called by the shared functions) would come from the "source" image (and not the "jump" image).
If you need them to come from the "jump" image (or be overridable), then you need to access them through the same function pointer table, and the "jump" image needs to fix the function pointer table up with its version of the relevant function. I updated function1() above to show this. The direct call to function2 will always be the "source" version. The call through the shared function table will go to the "source" version unless the "jump" image updates the table to point to its own implementation.
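A rough sketch of what that fix-up could look like in the "jump" image (assuming sharedFunctions resolves to the "source" image's table via the symbol file):
#include "shared_functions.h"

// The "jump" image's own implementation of function2.
static void jumpFunction2(char b)
{
    // handle the char here instead of in the "source" image
}

void fixupSharedTable(void)
{
    // After this, calls made through the table reach the "jump" image's version;
    // direct calls inside the "source" image still use its own function2Implementation.
    sharedFunctions.function2 = jumpFunction2;
}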
You CAN get away from the structure, but then you need to export the function pointers one by one (not a big problem), and you want to keep them in order and at a fixed location, which means explicitly putting them in the linker descriptor file, etc. etc. I showed the structure method to distill it down to the easiest example.
As you can see, things get pretty hairy, and there is some penalty (calling through the function pointer is slower, because you need to load up the address to jump to).
As explained in the comments, we could imagine an application and a bootloader relying on the same dynamic library. So the application and the bootloader both rely on the library, and the application can be changed without impact on the library or the bootloader.
I did not find an easy way to build a shared library with arm-none-eabi-gcc. However,
this document gives some alternatives to shared libraries. In your case, I would recommend the jump table solution.
Write a library with the functions that need to be used in the bootloader and in the application.
"library" code
#include <stdint.h>

typedef void (*genericFunctionPointer)(void);

// Forward declarations so the table below can reference the functions.
void lib_f1(void);
uint8_t lib_f2(uint8_t param);

// Use the linker script to place MySection at a known address.
// This could be a structure, as in Russ Schultz's solution, but a struct may or may not
// be laid out identically in the lib and the boot. A struct would be much easier, though,
// and would avoid many function pointer casts.
const genericFunctionPointer FpointerArray[] __attribute__ ((section ("MySection"))) =
{
    (genericFunctionPointer)lib_f1,
    (genericFunctionPointer)lib_f2,
};

void lib_f1(void)
{
    // some code
}

uint8_t lib_f2(uint8_t param)
{
    // some code
    return param;
}
application and/or bootloader code
#include <stdint.h>

typedef void (*genericFunctionPointer)(void);

// Properly typed pointer types for the casts in main().
typedef void (*correctCastF1)(void);
typedef uint8_t (*correctCastF2)(uint8_t);

enum
{
    lib_f1,
    lib_f2,
    NB_F,
};

// Use the linker script to place MySection at the same address the library was compiled with.
// In the linker script also mark this section as NOLOAD, because it is initialised by the
// library and not by our code.
// volatile is needed here because the array is initialised by the library, not by this code;
// without it the compiler may assume it only contains NULL pointers.
volatile const genericFunctionPointer FpointerArray[NB_F] __attribute__ ((section ("MySection")));

int main(void)
{
    ((correctCastF1)FpointerArray[lib_f1])();
    uint8_t a = ((correctCastF2)FpointerArray[lib_f2])(10);
    (void)a;
    return 0;
}
You can look into using linker sections. If you have your bootloader source code in folder bootloader, you can use
SECTIONS
{
    .bootloader :
    {
        build_output/bootloader/*.o(.text)
    } > flash_region1

    .binary1 :
    {
        build_output/binary1/*.o(.text)
    } > flash_region2

    .binary2 :
    {
        build_output/binary2/*.o(.text)
    } > flash_region3
}
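With this layout everything is resolved in a single link, so sharing code becomes an ordinary function call across regions. A rough sketch (shared_crc32 is only a placeholder name):
/* Declared in a header shared by all three folders, defined once in the
   bootloader sources, so its code ends up only in flash_region1. */
unsigned int shared_crc32(const void *data, unsigned int len);

/* In binary1 sources (placed in .binary1 / flash_region2): */
unsigned int check_block(const void *data, unsigned int len)
{
    /* The linker resolves this call into flash_region1, so the CRC code
       is not duplicated in this region. */
    return shared_crc32(data, len);
}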

How does printf() work without variable list in its argument?

The following code:
#include <stdio.h>
void main()
{
    int i = 100, j = 200;
    printf("%d.....%d");
}
gives
200.....100
as the output.
Could someone explain how printf works without the data list?
It produces a warning at compile time (warning: too few arguments for format), and the behaviour is not defined, so it's undefined behaviour and should not be relied on. Different compilers are likely to have different behaviours, and the behaviour may even change between versions of the same compiler.
Try reading about it on Wikipedia for more info.
It prints some garbage value from the stack, since you haven't provided any integer arguments. The printf() function doesn't know that no arguments are present; it reads the stack locations where the arguments would have been and prints whatever is there. And, as mentioned in Robadob's answer, the behaviour will change according to the compiler.
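For comparison, the well-defined version simply passes arguments that match the format string:
#include <stdio.h>

int main(void)
{
    int i = 100, j = 200;
    printf("%d.....%d\n", i, j);   /* prints 100.....200, in the order given */
    return 0;
}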

Whole web app in C++ with DOM interaction

I have recently heard of compiling C++ code to JavaScript using Emscripten and how, if asm.js optimizations are done, it has the potential to run applications really fast.
I have read several posts and tutorials, and even watched some very interesting YouTube videos. I have also run the hello world example successfully.
However, I don't know the full capabilities of this approach, especially whether an entire new web app can/should be written in C++ as a whole, without glue code.
More concretely, I would like to write something similar to the following C++ (as a reference, not working code).
#include <window>

class ApplicationLogic : public DOMListener {
private:
    int num;
public:
    ApplicationLogic() : num(0) {}
    virtual void onClickEvent(DOMEventData event) {
        num++;
    }
    virtual ~ApplicationLogic() {}
};

int main() {
    DOMElement but = Window.getElementById("foo");
    ApplicationLogic app;
    but.setOnclick(app);
}
I hope that makes the idea clear; the goal is to achieve something similar to:
A static function that initializes the module and is run when the window is ready (the same behaviour that jQuery.ready() gives), so listeners can be added to DOM elements.
A way to interact with the DOM directly from C/C++ (hence the #include <window>), basically access to the DOM and other objects like JSON, Navigator and such.
I keep thinking of Lua and how, when a Lua script includes a shared object (dynamically linked library), it searches for an initialize function in that .so file, where one would register the functions available from outside the module, much like the return value of the module function created in asm.js acts. But I can't figure out how to emulate jQuery.ready directly with C++.
As you can see, I have little knowledge about asm.js, and I haven't found tutorials or anything similar for what I'm looking for. I have read references to standard libraries included at compile time, such as libc, libc++ and SDL, but no reference on how to manipulate the DOM from the C++ source.
I know this is an old topic, but I'm posting here in case anyone else comes looking for the answer to this question (like I did).
Technically, yes, it is possible, but with a ton of what you called "glue code", and also a good bit of JavaScript (which kind of defeats the purpose, IMO). For example:
#include <emscripten.h>
#include <cstdio>      // sprintf
#include <cstring>     // strcpy, strcat

#define DIV 0
#define SPAN 1
#define INPUT 2
// etc. etc. etc. for every element you want to use

// Creates an element of the given type (see #defines above)
// and returns the element's ID
int RegisterElement(int type)
{
    return EM_ASM_INT({
        var i = 0;
        while (document.getElementById(i))
            i++;
        var t;
        if ($0 == 0) t = "div";
        else if ($0 == 1) t = "span";
        else if ($0 == 2) t = "input";
        else t = "span";
        var test = document.createElement(t);
        test.id = i;
        document.body.appendChild(test);
        return i;
    }, type);
}

// Calls document.getElementById(ID).innerHTML = text
void SetText(int ID, const char * text)
{
    char str[500];
    strcpy(str, "document.getElementById('");
    char id[16];                  // was char id[1], which overflows
    sprintf(id, "%d", ID);
    strcat(str, id);
    strcat(str, "').innerHTML = '");
    strcat(str, text);
    strcat(str, "';");
    emscripten_run_script(str);
}

// And finally we get to our main entry point...
int main()
{
    RegisterElement(DIV);               // Creates an empty div, just as an example
    int test = RegisterElement(SPAN);   // Creates an empty span, test = its ID
    SetText(test, "Testing, 1-2-3");    // Set the span's inner HTML
    return 0;                           // And we're done
}
I had the same question and came up with this solution, and it compiled and worked as expected. But we're basically building a C/C++ API just to do what JavaScript already does "out of the box". Don't get me wrong - from a language standpoint I'd take C++ over JavaScript any day - but I can't help but think it's not worth the development time and possible performance issues involved in a setup like this. If I were going to do a web app in C++, I would definitely use Cheerp (the new name for Duetto).
As somebody pointed out already, if you start off with a fresh codebase exclusively for the web, then Duetto could be a solution. But in my opinion Duetto has many drawbacks, like no C allocators, which would probably make it very hard if you want to use 3rd-party libraries.
If you are using emscripten, it provides an API for all kinds of DOM events, which does pretty much exactly what you want.
emscripten_set_click_callback(const char *target, void *userData, int useCapture, int (*func)(int eventType, const EmscriptenMouseEvent *mouseEvent, void *userData));
hope this helps
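For example, a rough sketch of registering a click handler through emscripten/html5.h (the "#document" target and the click counter are just examples):
#include <emscripten/html5.h>
#include <cstdio>

// Called on every click; returning EM_TRUE marks the event as handled.
EM_BOOL on_click(int eventType, const EmscriptenMouseEvent *e, void *userData)
{
    int *clicks = static_cast<int *>(userData);
    ++*clicks;
    printf("click #%d at (%ld, %ld)\n", *clicks, (long)e->targetX, (long)e->targetY);
    return EM_TRUE;
}

int main()
{
    static int clicks = 0;
    // Listen for clicks anywhere in the document; the string selects the target.
    emscripten_set_click_callback("#document", &clicks, EM_FALSE, on_click);
    // Depending on your build flags you may need -s NO_EXIT_RUNTIME=1 so the
    // runtime stays alive after main() returns and the callback can fire.
    return 0;
}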

Objective C HTML parser error "expected expression before xmlNode"

I was following this Simple libxml2 HTML parsing example, using Objective-C, Xcode, HTMLparser.h and http://benreeves.co.uk/objective-c-hmtl-parser/
The author notes that there's something wrong with the rawContentsOfNode method.
NSArray *bodytext = [bodyNode findChildTags:@"td"];
for (HTMLNode *inputBody in bodytext) {
    //NSLog(@"%@", [inputBody getAttributeNamed:@"class"]);
    NSString *test = rawContentsOfNode(xmlNode *bodytext, htmlDocPtr doc);
}
There doesn't seem to be any example of using the updated version, and I can't figure out what's wrong. Any help with fixing this would be great.
The example in the Stack Overflow answer won't even compile, because he has just copy-pasted the note from the original example.
This:
rawContentsOfNode(xmlNode *bodytext, htmlDocPtr doc);
is part of a function prototype, not a function call. It's a C function that requires an xmlNode and an htmlDocPtr as parameters. Looking at the interface of HTMLNode, we see that the prototype given in the comment is wrong; it should be:
NSString* rawContentsOfNode(xmlNode *node);
There's no mention in the source code of a function matching the prototype recommended in the blog post. I have no idea what they were talking about, unless it has been removed since the comment was made.
The XML node is a public member of the HTML node, so you could do:
test = rawContentsOfNode(inputBody->_node);
But the method rawContents does that anyway so you might as well use it.
test = [inputBody rawContents];
Note that (again checking the source code) there is an issue in that the content of the node is assumed to be encoded in UTF-8. This may be true, but the default encoding for HTTP is ISO-8859-1, so it may not be.