How does the source code gets compiled only by a compiler? - compiler-specific

My question is, How does a compiler converts the source code, its just a software.
Is there a role of any hardware or CPU to help compiler to do its job? Thanks.

Like other software compilers need a CPU to run on. However, I haven't heard of any CPUs with any compiler specific support. For example, gcc is just another C++ (was C prior to 2012) program. It can compile and run itself just like any other C++ program, and the CPU runs it just like any other piece of code.

Related

Equivalents to gcc/clang's march=native in other compilers?

I'd like to know if there are other compilers than gcc and clang that provide something like an -march=native option, and if so, what that option is. I already understand from another question (Automatically building for best available platform in visual c++ (equivalent to gcc's -march=native)) that Microsoft's compilers do not have that option (unless it's implied in the option that activates the SSE2 instruction set, up to and excluding AVX and higher at least).
The use case is simple: provide a cmake set-up and thus the user with an option to activate and build with support for all the "intrinsics" his or her CPU supports. We currently have detection logic for the actual intrinsics we target (e.g. SSE4.2 and/or PCLMUL on x86) but that logic will probably get very complex when more platforms and compilers have to be taken into consideration. Simplifying them could lead to situations where the compiler starts to use unsupported instruction sets outside of the intended places protected by runtime checks.
Currently, the Microsoft Visual C++ compiler doesn't provide equivalent flags to march=native. You'll have to figure out the appropriate flags manually or using a script before building the code.
Regarding the Intel C++ compiler, the xHost and QxHost flags have basically the same purpose.

Is it possible to compile Java Bytecode to Native Code using pypy?

pypy currently translates Rpython to Native code using Pluggable JIT and GC. Currently it has a Python frontend . I am wondering if it is possible to write a Java Bytecode frontend to pypy making an alternate cool JVM (written in (R)Python)
An RPython interpreter for Java bytecode wouldn't be a compiler for Java bytecode to native code. The RPython code is compiled to native code, not the code the interpreter is interpreting.
At runtime (some-of) the interpreted code would be JIT-compiled to native code, but that's completely different, and the HotSpot VM already does this. Given that HotSpot has been developed over a long period of time with serious resources behind it, and specifically tuned for Java, I doubt you could get anything even approaching as good as it out of PyPy.
PyPy's strength is the idea that you can write things like garbage collectors and JIT compilers as a framework that works independently of the languages you're interpreting. Then lots of people can write lots of interpreters for lots of languages, and write them in a fairly high-level easy-to-code way, but they still all get high quality GCs, JIT compilers, etc without having to specifically implement them for each language. PyPy is unlikely to be a reasonable alternative to an existing project that has already sunk huge amounts of resources into developing highly optimised GCs and JIT compilers that are specifically tuned for their language.

Is it possible to embed LLVM Interpreter in my software and does it make sense?

Suppose I have a software and I want to make cross-plataform plugins. You compile the plugin for a virtual machine, and any platform running my software would be able to run this code.
I am wondering if it is possible to use LLVM interpreter and bytecode for this purpose. Also, I am wondering if does make sense using LLVM for this purpose instead of something else, i.e. is it what LLVM was made for?
I'm not sure that LLVM was designed for it. However, I doubt there is anything that hasn't been done using LLVM1
Other virtual-machines based script engines are specifically created for the job:
LUA is very popular
Wikipedia lists some other Extension/embeddable languages under the Scripting language entry
If you're looking for embeddable virtual machines:
IKVM supports embedding JVM and CLR in a bridged mode (interoperable)
Parrot supports embedding (and includes a Python interpreter; mind you, you can just run python bytecode images)
Perl has similar architecture and supports embedding
Javascript supports embedding (not sure about the architecture of v8, but I guess it would use a virtual machine)
Mono's CLR engine supports embedding: http://www.mono-project.com/Embedding_Mono
1 including compiling c++ information to javascript to run in your browser...
There is VMIR (https://github.com/andoma/vmir) which is a LLVM bitcode interpreter / JIT engine that's intended to be embedded into other apps.
Disclaimer: I'm the author of it and it's still work-in-progress but works reasonable well.
In theory, there exist a limited subset of LLVM IR which can be portable across various platforms. You shall not specify alignments, you shall not bitcast pointers to integral types, you must avoid intrinsics, etc. Which means - you can't immediately use a code generated by a stock C compiler (llvm-gcc, Clang, whatever), unless you specify a limited target for it and implement sanitising LLVM passes. Another issue is that the bitcode format from different LLVM versions is not guaranteed to be compatible.
In practice, I would not go there. Mono is a reasonably small, embeddable, fast VM, and all the .NET stack of tools is available for it. VM itself is pretty low-level (as long as you do not care about the verifyability).
LLVM includes an interpreter, so if you can build this interpreter for your target platforms, you can then evaluate LLVM bitcode on the fly.
It's apparently not so fast though.
In their classic discussion (that you do not want to miss if you're a fan of open source, LLVM, compilers) about LLVM vs libJIT, that has happened long before LLVM became famous and established, the author of libJIT Rhys Weatherley raised this particular issue, he stated that LLVM is not suitable to be embedded, while Chris Lattner, the author of LLVM stated that otherwise, it is modular and you can use it in any possible fashion including embedding only the parts you need.

Where is the VM in LLVM?

Note: marked as community wiki.
Where is the Low Level Virtual Machine in LLVM?
I see that we have llvm-g++ and c-lang, but to me, a LLVM is something almost like Valgrind of a simulator, where instructions are executed on it, and I can write programs to instrument the running code / interrupt when certain conditions happen / etc ...
Where are the tools like this built on LLVM?
Thanks!
I think you're looking for QEMU, not LLVM.
The low-level virtual machine in LLVM is that, after converting the higher-level C and C++ language input into an internal low-level representation (as a stage in the normal compiling process), it can then save this low-level representation and execute it on a JIT compiler (which thus acts somewhat like a virtual machine). This JIT compiler does a substantial amount of optimization, and so I expect it would be difficult to instrument in quite the form that you're thinking of -- in particular, it does not do instruction-by-instruction stepping through the execution.
QEMU, by contrast, is an open-source emulator that does instruction-by-instruction stepping through of machine code. It already contains a certain amount of ability to instrument code to look for certain conditions, in that it can connect to GDB and set watchpoints and so forth, which are implemented in QEMU itself.
To use LLVM for running x86 code you should check libCPU or outdated llvm-qemu.
Look at running x86 program _on_ llvm

Do JVMs on Desktops Use JIT Compilation?

I always come across articles which claim that Java is interpreted. I know that Oracle's HotSpot JRE provides just-in-time compilation, however is this the case for a majority of desktop users? For example, if I download Java via: http://www.java.com/en/download, will this include a JIT Compiler?
Yes, absolutely. Articles claiming Java is interpreted are typically written by people who either don't understand how Java works or don't understand what interpreted means.
Having said that, HotSpot will interpret code sometimes - and that's a good thing. There are definitely portions of any application (around startup, usually) which are only executed once. If you can interpret that faster than you can JIT compile it, why bother with the overhead? On the other hand, my experience of "Java is interpreted" articles is that this isn't what they mean :)
EDIT: To take T. J. Crowder's point in: yes, the JVM downloaded from java.com will be HotSpot. There are two different JITs for HotSpot, however - server and desktop. To sum up the differences in a single sentence, the desktop JIT is designed to start apps quickly, whereas the server JIT is more focused on high performance over time: server apps typically run for a very long time, so time spent optimising them really heavily pays off in the long run.
There is nothing in the JVM specification that mandates any particular execution strategy. Some JVMs only interpret, they don't even have a compiler. Some JVMs only JIT compile, they don't even have an interpreter. Some JVMs have both an intepreter and a compiler (or even multiple compilers) and statically choose between the two on startup. Some have both and dynamically switch back and forth during runtime. Some aren't even virtual machines in the usual sense of the word at all, they just statically compile JVM bytecode into native machinecode ahead-of-time.
The particular JVM that you are asking about, Oracle's HotSpot JVM, has one interpreter and two compilers, called the C1 and C2 compiler, also colloquially known as the client and server compilers, after their corresponding commandline options. HotSpot dynamically switches back and forth between the interpreter and one of the compilers at runtime (but it will not switch between the two compilers, you have to specify one of them on the commandline and then only that one will be used for the entire runtime of the JVM).
As per document here Starting with some of the later Java SE 7 releases, a new feature called tiered compilation became available. This feature uses the C1 compiler mode at the start to provide better startup performance. Once the application is properly warmed up, the C2 compiler mode takes over to provide more-aggressive optimizations and, usually, better performance
The C1 compiler is an optimizing compiler which is pretty fast and doesn't use a lot of memory. The C2 compiler is much more aggressively optimizing, but is also slower and uses more memory.
You select between the two by specifying the -client and -server commandline options (-client is the default if you don't specify one), which also sets a couple of other JVM parameters like the default JIT threshold (in -client mode, methods will be compiled after they have been interpreted 1500 times, in -server mode after 10000 times, can be set with the -XX:CompileThreshold commandline argument).
Whether or not "the majority of desktop users" actually will run in compiled or interpreted mode depends largely on what code they are running. My guess is that the vast majority of desktop users run the HotSpot JVM from Oracle's JRE/JDK or one of its forks (e.g. SoyLatte on OSX, IcedTea or OpenJDK on Unix/BSD/Linux) and they don't fiddle with the commandline options, so they will probably get the C1 compiler with the default 1500 JIT threshold. (But applications such as IntelliJ, Eclipse or NetBeans have their own launcher scripts that usually supply different commandline arguments.)
In my case, for example, I often run small scripts which never actually reach the JIT threshold, so they are never compiled. (Nor should they be.)
Some of these links about the Hotspot JVM (what you are downloading in the java.com download link above) might help:
Java SE HotSpot at a Glance
The Java HotSpot Performance Engine Architecture
Frequently Asked Questions About the Java HotSpot VM
Neither of the (otherwise-excellent) answers so far seems to have actually answered your last question, so: Yes, the Java runtime you downloaded from www.java.com is Oracle's (Sun's) Hotspot JVM, and so yes, it will do JIT compilation. HotSpot isn't just for servers or anything like that, it runs on desktops and takes full advantage of its (very mature) optimizing JIT compiler.
Jvm spec never claim how to execute the java bytecode, however, you can specify a JIT compiler if you use the JVM from hotspot VM, JIT is just a technique to optimize byte code execution.