Do JVMs on Desktops Use JIT Compilation? - jvm

I always come across articles which claim that Java is interpreted. I know that Oracle's HotSpot JRE provides just-in-time compilation, but is this the case for the majority of desktop users? For example, if I download Java via http://www.java.com/en/download, will this include a JIT compiler?

Yes, absolutely. Articles claiming Java is interpreted are typically written by people who either don't understand how Java works or don't understand what interpreted means.
Having said that, HotSpot will interpret code sometimes - and that's a good thing. There are definitely portions of any application (around startup, usually) which are only executed once. If you can interpret that faster than you can JIT compile it, why bother with the overhead? On the other hand, my experience of "Java is interpreted" articles is that this isn't what they mean :)
EDIT: To take T. J. Crowder's point on board: yes, the JVM downloaded from java.com will be HotSpot. There are two different JITs for HotSpot, however: server and client (desktop). To sum up the differences in a single sentence, the client JIT is designed to start apps quickly, whereas the server JIT is more focused on high performance over time: server apps typically run for a very long time, so time spent optimising them really heavily pays off in the long run.

There is nothing in the JVM specification that mandates any particular execution strategy. Some JVMs only interpret; they don't even have a compiler. Some JVMs only JIT compile; they don't even have an interpreter. Some JVMs have both an interpreter and a compiler (or even multiple compilers) and statically choose between the two on startup. Some have both and dynamically switch back and forth during runtime. Some aren't even virtual machines in the usual sense of the word at all: they just statically compile JVM bytecode into native machine code ahead of time.
The particular JVM that you are asking about, Oracle's HotSpot JVM, has one interpreter and two compilers, called the C1 and C2 compilers, also colloquially known as the client and server compilers, after their corresponding command-line options. HotSpot dynamically switches back and forth between the interpreter and one of the compilers at runtime. Traditionally it does not switch between the two compilers: you specify one of them on the command line and only that one is used for the entire runtime of the JVM.
According to the documentation, that changed starting with some of the later Java SE 7 releases, when a new feature called tiered compilation became available. This feature uses the C1 compiler mode at the start to provide better startup performance. Once the application is properly warmed up, the C2 compiler mode takes over to provide more aggressive optimizations and, usually, better performance.
The C1 compiler is an optimizing compiler which is pretty fast and doesn't use a lot of memory. The C2 compiler is much more aggressively optimizing, but is also slower and uses more memory.
You select between the two by specifying the -client and -server command-line options (-client is the default if you don't specify one). These options also set a couple of other JVM parameters, such as the default JIT threshold: in -client mode, methods will be compiled after they have been interpreted 1500 times, in -server mode after 10000 times. The threshold can be set with the -XX:CompileThreshold command-line argument.
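To make the threshold idea concrete, here is a toy model in Python. It is only a sketch: the class HotMethod and its fields are made up for illustration, and the real HotSpot heuristics are more elaborate than a single counter. The thresholds are the ones quoted above.

```python
class HotMethod:
    """Toy model of a method whose invocations are counted by the VM.

    Hypothetical illustration only; not how HotSpot is implemented.
    """

    def __init__(self, name, compile_threshold):
        self.name = name
        self.compile_threshold = compile_threshold
        self.invocations = 0
        self.compiled = False

    def invoke(self):
        """Run the method once and report which mode handled the call."""
        if self.compiled:
            return "compiled"
        self.invocations += 1
        if self.invocations >= self.compile_threshold:
            # The method is now "hot": later calls use the compiled version.
            self.compiled = True
        return "interpreted"


# -client style threshold: 1500 interpreted calls, then compiled code.
client = HotMethod("render", compile_threshold=1500)
modes = [client.invoke() for _ in range(1501)]
print(modes.count("interpreted"), modes[-1])  # prints: 1500 compiled
```

With the -server threshold of 10000, the same 1501 calls would all be interpreted, which is why short-lived programs may never reach compiled code at all.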
Whether or not "the majority of desktop users" actually will run in compiled or interpreted mode depends largely on what code they are running. My guess is that the vast majority of desktop users run the HotSpot JVM from Oracle's JRE/JDK or one of its forks (e.g. SoyLatte on OSX, IcedTea or OpenJDK on Unix/BSD/Linux) and they don't fiddle with the commandline options, so they will probably get the C1 compiler with the default 1500 JIT threshold. (But applications such as IntelliJ, Eclipse or NetBeans have their own launcher scripts that usually supply different commandline arguments.)
In my case, for example, I often run small scripts which never actually reach the JIT threshold, so they are never compiled. (Nor should they be.)

Some of these links about the Hotspot JVM (what you are downloading in the java.com download link above) might help:
Java SE HotSpot at a Glance
The Java HotSpot Performance Engine Architecture
Frequently Asked Questions About the Java HotSpot VM

Neither of the (otherwise-excellent) answers so far seems to have actually answered your last question, so: Yes, the Java runtime you downloaded from www.java.com is Oracle's (Sun's) Hotspot JVM, and so yes, it will do JIT compilation. HotSpot isn't just for servers or anything like that, it runs on desktops and takes full advantage of its (very mature) optimizing JIT compiler.

The JVM spec never prescribes how Java bytecode is to be executed. However, if you use the HotSpot JVM, you can select a JIT compiler; JIT compilation is just a technique for optimizing bytecode execution.

Related

Is there a stackless and heapless programming language?

Is there a statically compiled programming language that is both stackless and heapless?
For data, such a language would not have a concept of memory allocation. Instead, the memory requirements of the program would be known completely at compile-time.
For code, there would not be a concept of call stack. There could be functions, but they'd be inlined at every call site.
I am specifically interested in portable languages with some form of implementation or a compiler that produces native binaries.
Pure x86 machine language fits your stackless and heapless constraints (within real-mode constraints). Portability is not possible unless the compiler has access to every memory location used for hardware I/O, and those locations are fixed on all supported platforms. This condition excludes all dynamic interfaces, including Plug and Play, USB, and PCI/PCIe buses.
It is completely possible to create such a structure within severe hardware limits (every device must be compiled in and allocated at boot, as in older computers like the C64 or Apple II), but all functionality must be pre-compiled into the OS, including every program that could possibly run on the platform.
This is not a general computing platform anymore. Program a micro-controller, GPU, or ASIC to solve the task instead.

Control execution speed

I am thinking of making a "programming game", i.e. where each player writes a program to control their "bot", and then the programs are pitted against each-other to see who wins (by some definition of "win").
To make this fair, each bot program should execute at the same speed, so using native pre-compiled C/C++ code seems out of the question.
I can think of 3 options, but am unsure about 2:
Use a language that runs in a VM - This would mean that bots are written in Java and compiled to JVM bytecode. Then every bot gets a JVM and I would need to control the JVM "clock" or whatever it has to control the execution speed.
Problem: Can the JVM "clock" be controlled, telling it to run X clock cycles worth of code?
Use a scripting language - Bots would be written in JS or Python or whatever.
Problem: Same as above - can the speed be controlled?
Use my own simplified language -
Problems: I am writing a game, not a compiler. It will mean anyone playing has to learn yet another language, which means no one will play.
So basically, I guess the question is can I control the execution speed of the JVM or some language interpreter (not in theory - in practice)? Or is there another option I didn't think of?
The JVM isn't real-time, nor, I suspect, is your OS. Relying on the JVM and/or process interactions isn't going to work, since you're at the mercy of OS scheduling, JVM thread scheduling etc.
If you want to coordinate multiple threads, then you should look at the JVM thread model and in particular how to use locks to coordinate 2 threads.
One option would be to write your own JVM that you instrument to run only a fixed number of bytecode instructions from each program. Bytecode is a lot easier to digest than human-readable source code, so you could get away with relatively little implementation work, while your users would get to program in any programming language that can produce Java bytecode.
It gets easier if you institute some restrictions like "no threads" and "no try/catch". You'll need to implement a few core language features from java.lang.* plus some domain-specific I/O features, but for most of the rest of the JRE (for example java.util.*) you should be able to get away with executing bytecode from an existing JRE implementation (modulo legal constraints if you distribute the game engine).
Expect a slowdown of between 10x and 100x (depending on your implementation technology) compared to running on an off-the-shelf optimized JVM.
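A minimal sketch of the "fixed number of instructions per turn" idea, in Python. It uses a made-up three-instruction stack machine (PUSH, ADD, JUMP) rather than real JVM bytecode, and the names run_slice and new_state are hypothetical; the point is only that the game loop, not the OS scheduler, decides how much each bot gets to execute.

```python
def new_state():
    """Fresh execution state for one bot: program counter and operand stack."""
    return {"pc": 0, "stack": []}


def run_slice(program, state, budget):
    """Execute at most `budget` instructions; return how many actually ran."""
    executed = 0
    stack = state["stack"]
    while executed < budget and state["pc"] < len(program):
        op, arg = program[state["pc"]]
        state["pc"] += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "JUMP":
            state["pc"] = arg
        else:
            raise ValueError(f"unknown instruction: {op}")
        executed += 1
    return executed


# A bot that counts upward forever: push 0, then repeatedly add 1.
counter_bot = [("PUSH", 0), ("PUSH", 1), ("ADD", None), ("JUMP", 1)]
# A bot that finishes after three instructions.
short_bot = [("PUSH", 2), ("PUSH", 3), ("ADD", None)]

bots = {
    "counter": (counter_bot, new_state()),
    "short": (short_bot, new_state()),
}

# Two game ticks; every bot gets exactly the same budget per tick.
for tick in range(2):
    for name, (program, state) in bots.items():
        run_slice(program, state, budget=5)
```

Because each bot is paused and resumed from its saved state, an infinite loop in one bot cannot starve the others. It just keeps spending its own budget.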
Alternatively, run an existing JVM in debug mode, single-step through the contestant programs with your game pretending to be a debugger. Whether this is easier or harder than writing a bare-bones JVM yourself I'm not sure.

Where is the VM in LLVM?

Note: marked as community wiki.
Where is the Low Level Virtual Machine in LLVM?
I see that we have llvm-g++ and clang, but to me, an LLVM is something almost like Valgrind or a simulator, where instructions are executed on it, and I can write programs to instrument the running code / interrupt when certain conditions happen / etc ...
Where are the tools like this built on LLVM?
Thanks!
I think you're looking for QEMU, not LLVM.
The low-level virtual machine in LLVM works like this: after converting the higher-level C and C++ language input into an internal low-level representation (as a stage in the normal compiling process), LLVM can save this low-level representation and execute it with a JIT compiler (which thus acts somewhat like a virtual machine). This JIT compiler does a substantial amount of optimization, so I expect it would be difficult to instrument in quite the form that you're thinking of. In particular, it does not do instruction-by-instruction stepping through the execution.
QEMU, by contrast, is an open-source emulator that does instruction-by-instruction stepping through of machine code. It already contains a certain amount of ability to instrument code to look for certain conditions, in that it can connect to GDB and set watchpoints and so forth, which are implemented in QEMU itself.
To use LLVM for running x86 code, you should check libCPU or the outdated llvm-qemu.
Look at running x86 program _on_ llvm

Will static linking on one unix distribution work but not another?

If I statically link an executable in Ubuntu, is there any chance that the executable won't work on another distribution such as Linux Mint or Fedora? I know processor types are affected, but other than that, is there anything else I have to be wary of? Sorry if this is a dumb question. Thanks for any help.
There are a few corner cases, but for the most part, you should be in good shape with static linking. The one that comes to mind is libnss. This particular library is essentially impossible to link statically, because of the way it does its job (permissions, authentication, security tasks). As long as the glibc-versions are similar, you should be ok on this issue, though.
If your program needs to work with subtle features of the kernel, like volume managers, you've got a pretty slim chance of getting your program to work, statically linked, across distros, because the kernel interfaces may change slightly.
Most typical applications, the kind for which it even makes sense to discuss portability, like network services, GUI applications, and language tools (such as compilers/interpreters), won't have a problem with any of this.
If you statically link a program on one computer and then move it to another computer in which the system basically runs the same way, then it should work just fine. That's the point of static linking; that there are no other files the program depends on - it's entirely self-contained, so as long as it can run at all, it will run the same way it does on its "host" system.
This contrasts with dynamic linking, in which the program incorporates elements of other files (libraries) at runtime. If you move a dynamically linked program to another system where the libraries it depends on are different (or nonexistent), it won't work.
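If you want to check which kind of binary you actually built, one approach is to look for a PT_INTERP program header, which names the dynamic loader and is normally absent from fully static executables. The following Python sketch parses just enough of the ELF header to do that; the function name has_interpreter is made up for illustration, and the header offsets are the standard ELF32/ELF64 layout.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def has_interpreter(elf_bytes):
    """Return True if the ELF image contains a PT_INTERP program header."""
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = elf_bytes[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if elf_bytes[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    if is_64:
        (phoff,) = struct.unpack_from(endian + "Q", elf_bytes, 32)
        phentsize, phnum = struct.unpack_from(endian + "HH", elf_bytes, 54)
    else:
        (phoff,) = struct.unpack_from(endian + "I", elf_bytes, 28)
        phentsize, phnum = struct.unpack_from(endian + "HH", elf_bytes, 42)
    # p_type is the first 32-bit field of every program header entry.
    for i in range(phnum):
        (p_type,) = struct.unpack_from(endian + "I", elf_bytes, phoff + i * phentsize)
        if p_type == PT_INTERP:
            return True
    return False
```

You would call it on the raw bytes of the executable, for example has_interpreter(open(path, "rb").read()) with a path of your choosing. A result of False suggests the binary does not rely on a dynamic loader on the target system.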
In most cases, your executable will work just fine. As long as your executable doesn't depend on anything unusual being present for it to function, there will be no problem. (And, if it does depend on something unusual being present, then you'll have the same issue even if you dynamically link.)
Statically linking is usually safer than dynamically linking for compatibility between different UNIX environments, as long as the same CPU is in use.
To have a statically linked binary fail, again assuming the same processor architecture, you would have to do something such as link on a system using the a.out binary format and try to execute it on a system running ELF, in which case the dynamically linked version would fail just as badly.
So why do people not routinely link statically? Two reasons:
It makes the executable larger, sometimes MUCH larger, and
If bugs in the libraries are fixed, you'll have to relink your program to get access to the bug fixes. If a critical security bug is fixed in the libraries, you have to relink and redistribute your exe.
On the contrary. Whatever your chances are of getting a binary to work across distributions or even OSes, those chances are maximized by static linking. Static linking makes an executable self-contained in terms of libraries. It can still go wrong if it tries to read a file that's not there on another system.
For even better chances of portability, try linking against dietlibc or some other libc. An article at Linux Journal mentions some candidates. A smaller, simpler libc is less likely to depend on things in the filesystem that differ from distro to distro.
I would, for the reasons noted above avoid statically linking something unless you absolutely must.
That being said, it should work on any other similar kernel of the same architecture. The kernel does matter, though: if you statically link on a machine running Linux 2.4.x, the loader VDSO is going to be different on Linux 2.6 (the VDSO being the virtual dynamic shared object, a shared object containing loader code that the kernel exposes to every process).
Other pitfalls include things in /etc not being where you'd think, logs being in different places, system utilities being absent or different (ubuntu uses update-rc.d, RHEL uses chkconfig), etc.
Sometimes you just have no choice. I was writing a program that talked to LVM2's string-based cmdlib interface in preference to using execv(). Lo and behold, 30% of the distros I needed to support did NOT include that library and offered no way of getting it. So, I had to link against the static object when producing binary packages.
If you are using glibc, you can be confident that stuff like getpwnam() and friends will still work. Just make sure to watch any hard-coded paths (better yet, make them configurable at run time).
As long as you can guarantee it'll only be executed on a similar version of the OS on similar hardware, your program will work fine if it is statically linked. So, if you build for a 2.6 Linux and statically link, you will be fine to run on (almost) all 2.6 Linux distributions.
Be warned you can't statically link some parts of GLIBC so if you're using them you'll have to dynamically link anyway. From memory the name service stuff (nss) parts required dynamic linking when I was investigating it.
You can't statically link a program for (say) Linux then expect it to run on BSD or Windows. BSD and Unix don't present or handle their system calls in the same way Linux does. I tell a slight lie because the BSDs have a Linux emulation layer that can be enabled, but out of the box it won't work.
No, it will not work. Static linking for distribution independence is a concept from the old Unix ages and is not recommended. In fact, you often can't do it, as many libraries are not available as static libraries anyway.
Follow the Linux Standard Base way; this is your only chance to get as much cross-distribution portability as possible.
The LSB also works fine if you program for FreeBSD and Solaris.
There are two compatibility questions at issue here: library versions and library inventory.
You don't say what libraries you are using.
If you have no '-l' options, then the only 'library' is glibc itself, which serves as the interface to the kernel. Glibc versions are upward compatible: if you link on a glibc 2.x system, you can run on a glibc 2.y system, for y >= x. The developers make a firm commitment to this.
If you have -l options, static linking is always safe. If you are dynamically linked, you have to ensure that (1) the library is present on the target system, and (2) has a compatible version. Your Mileage Might Vary as to whether the target distro has what you need.
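The glibc rule above (built against 2.x, runs on 2.y for y >= x) can be written down as a small check. This Python sketch only compares version strings; the helper names parse_version and runs_on are made up for illustration, and it assumes plain dotted numeric versions such as "2.17".

```python
def parse_version(text):
    """Turn '2.17' into (2, 17) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def runs_on(built_against, target):
    """True if a binary linked against glibc `built_against` should load on `target`."""
    return parse_version(target) >= parse_version(built_against)


print(runs_on("2.17", "2.31"))  # True: newer glibc on the target
print(runs_on("2.31", "2.17"))  # False: target glibc is older than the build system's
print(runs_on("2.9", "2.10"))   # True: 10 > 9 numerically, although "2.10" < "2.9" as strings
```

On a Linux system, Python's platform.libc_ver() reports the local C library name and version, which could supply the target argument. The practical upshot is to build on the oldest distribution you intend to support.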

Why do we need other JVM languages

I see here that there are a load of languages aside from Java that run on the JVM. I'm a bit confused about the whole concept of other languages running in the JVM. So:
What is the advantage in having other languages for the JVM?
What is required (in high level terms) to write a language/compiler for the JVM?
How do you write/compile/run code in a language (other than Java) in the JVM?
EDIT: There were 3 follow up questions (originally comments) that were answered in the accepted answer. They are reprinted here for legibility:
How would an app written in, say, JPython, interact with a Java app?
Also, can that JPython application use any of the JDK functions/objects?
What if it was Jaskell code, would the fact that it is a functional language not make it incompatible with the JDK?
To address your three questions separately:
What is the advantage in having other languages for the JVM?
There are two factors here. (1) Why have a language other than Java for the JVM, and (2) why have another language run on the JVM, instead of a different runtime?
Other languages can satisfy other needs. For example, Java had no built-in support for closures, a feature that is often very useful, until lambda expressions arrived in Java 8.
A language that runs on the JVM is bytecode compatible with any other language that runs on the JVM, meaning that code written in one language can interact with a library written in another language.
What is required (in high level terms) to write a language/compiler for the JVM?
The JVM reads bytecode (.class) files to obtain the instructions it needs to perform. Thus any language that is to be run on the JVM needs to be compiled to bytecode adhering to the Sun specification. This process is similar to compiling to native code, except that instead of compiling to instructions understood by the CPU, the code is compiled to instructions that are interpreted by the JVM.
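Whatever the source language, the compiler's output has to look like a class file to the JVM. As a small illustration, every class file starts with the magic number 0xCAFEBABE followed by the minor and major format version, all big-endian. This Python sketch reads just that header; the function name read_class_header is made up for illustration.

```python
import struct


def read_class_header(data):
    """Return (major, minor) class-file version, or raise if the magic is wrong."""
    # u4 magic, u2 minor_version, u2 major_version, all big-endian.
    magic, minor, major = struct.unpack_from(">IHH", data, 0)
    if magic != 0xCAFEBABE:
        raise ValueError("not a JVM class file")
    return major, minor


# First eight bytes of a class file compiled for Java SE 8 (major version 52).
print(read_class_header(bytes.fromhex("cafebabe00000034")))  # (52, 0)
```

The header looks the same whether the file was produced by javac, scalac or any other compiler that targets the JVM.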
How do you write/compile/run code in a language (other than Java) in the JVM?
Very much in the same way you write/compile/run code in Java. To get your feet wet, I'd recommend looking at Scala, which runs flawlessly on the JVM.
Answering your follow up questions:
How would an app written in, say, JPython, interact with a Java app?
This depends on the implementation's choice of bridging the language gap. In your example, the Jython project has a straightforward means of doing this:
from java.net import URL
u = URL('http://jython.org')
Also, can that JPython application use any of the JDK functions/objects?
Yes, see above.
What if it was Jaskell code, would the fact that it is a functional language not make it incompatible with the JDK?
No. Scala (link above) for example implements functional features while maintaining compatibility with Java. For example:
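object Timer {
  // Calls the given callback once per second, forever.
  def oncePerSecond(callback: () => Unit): Unit = {
    // Thread is java.lang.Thread, used directly from Scala.
    while (true) { callback(); Thread.sleep(1000) }
  }
  def timeFlies(): Unit = {
    println("time flies like an arrow...")
  }
  // The method timeFlies is passed as a function value.
  def main(args: Array[String]): Unit = {
    oncePerSecond(timeFlies)
  }
}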
You need other languages on the JVM for the same reason you need multiple programming languages in general: Different languages are better at solving different problems ... static typing vs. dynamic typing, strict vs. lazy ... Declarative, Imperative, Object Oriented ... etc.
In general, writing a "compiler" for another language to run on the JVM (or on the .Net CLR) is essentially a matter of compiling that language into java bytecode (or in the case of .Net, IL) instead of to assembly/machine language.
That said, a lot of the extra languages that are being written for JVM aren't compiled, but rather interpreted scripting languages...
Turning this on its head, consider you want to design a new language and you want it to run in a managed runtime with a JIT and GC. Then consider that you could:
(a) write your own managed runtime (VM) and tackle all sorts of technically difficult issues that will doubtless lead to many bugs, bad performance, improper threading and a great deal of portability effort
or
(b) compile your language into bytecode that can run on the Java VM, which is already quite mature, fast and supported on a number of platforms (sometimes with more than one choice of vendor implementation).
Given that the JavaVM bytecode is not tied so closely to the Java language as to unduly restrict the type of language you can implement, it has been a popular target environment for languages that want to run in a VM.
Java is a fairly verbose programming language that is getting outdated very quickly with all of the new fancy languages/frameworks coming out in the past 5 years. To support all the fancy syntax that people want in a language AND preserve backwards compatibility it makes more sense to add more languages to the runtime.
Another benefit is that it lets you run web frameworks such as Rails (via JRuby) or Grails (essentially Groovy on Rails) on a proven hosting platform that is likely already in production at many companies, rather than having to use the not nearly as tried-and-tested Ruby hosting environments.
To compile the other languages you are just converting to Java byte code.
I would answer, “because Java sucks” but then again, perhaps that's too obvious … ;-)
The advantage to having other languages for the JVM is quite the same as the advantage to having other languages for computers in general: while all Turing-complete languages can technically accomplish the same tasks, some languages make some tasks easier than others while other languages make other tasks easier. Since the JVM is something we already have the ability to run on all (well, nearly all) computers, and a lot of computers in fact already have it, we can get the "write once, run anywhere" benefit, but without requiring that one uses Java.
Writing a language/compiler for the JVM isn't really different from writing one for a real machine. The real difference is that you have to compile to the JVM's bytecode instead of to the machine's executable code, but that's really a minor difference in the grand scheme of things.
Writing code for a language other than Java in the JVM really isn't different from writing Java except, of course, that you'll be using a different language. You'll compile using the compiler that somebody writes for it (again, not much different from a C compiler, fundamentally, and pretty much not different at all from a Java compiler), and you'll end up being able to run it just like you would compiled Java code since once it's in bytecode, the JVM can't tell what language it came from.
Different languages are tailored to different tasks. While certain problem domains fit the Java language perfectly, some are much easier to express in alternative languages. Also, for a user accustomed to Ruby, Python, etc, the ability to generate Java bytecode and take advantage of the JDK classes and JIT compiler has obvious benefits.
Answering just your second question:
The JVM is just an abstract machine and execution model. So targeting it with a compiler is just the same as any other machine and execution model that a compiler might target, be it implemented in hardware (x86, CELL, etc) or software (parrot, .NET). The JVM is fairly simple, so it's actually a fairly easy target for compilers. Also, implementations tend to have pretty good JIT compilers (to deal with the lousy code that javac produces), so you can get good performance without having to worry about a lot of optimizations.
A couple of caveats apply. First, the JVM directly embodies java's module and inheritance system, so trying to do anything else (multiple inheritance, multiple dispatch) is likely to be tricky and require convoluted code. Second, JVMs are optimized to deal with the kind of bytecode that javac produces. Producing bytecode that is very different from this is likely to get into odd corners of the JIT compiler/JVM which will likely be inefficient at best (at worst, they can crash the JVM or at least give spurious VirtualMachineError exceptions).
What the JVM can do is defined by the JVM's bytecode (what you find in .class files) rather than the source language. So changing the high level source code language isn't going to have a substantial impact on the available functionality.
As for what is required to write a compiler for the JVM, all you really need to do is generate correct bytecode / .class files. How you write/compile code with an alternate compiler sort of depends on the compiler in question, but once the compiler outputs .class files, running them is no different than running the .class files generated by javac.
The advantage for these other languages is that they get relatively easy access to lots of java libraries.
The advantage for Java people varies depending on language -- each has a story to tell Java coders about what it does better. Some will stress how they can be used to add dynamic scripting to JVM-based apps, others will just talk about how their language is easier to use, has a better syntax, or so forth.
What's required are the same things to write any other language compiler: parsing to an AST, then transforming that to instructions for the target architecture (byte code) and storing it in the right format (.class files).
From the users' perspective, you just write code and run the compiler binaries, and out come .class files you can mix in with those your Java compiler produces.
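The "parse to an AST, then emit instructions" pipeline can be sketched in a few lines of Python. This is a toy under stated assumptions: it borrows Python's own ast module as the parser, handles only integer literals with + and *, and emits made-up stack instructions (PUSH, ADD, MUL) instead of real JVM bytecode. Writing a valid .class file is the step it leaves out.

```python
import ast


def compile_expr(source):
    """Compile an arithmetic expression into a list of stack-machine instructions."""
    tree = ast.parse(source, mode="eval")   # front end: source text -> AST
    code = []

    def emit(node):
        # Back end: walk the AST in post-order, emitting one instruction per node.
        if isinstance(node, ast.Constant) and type(node.value) is int:
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
            emit(node.left)
            emit(node.right)
            code.append(("ADD" if isinstance(node.op, ast.Add) else "MUL", None))
        else:
            raise SyntaxError("only integer literals, + and * are supported")

    emit(tree.body)
    return code


def run(code):
    """Tiny stack machine standing in for the VM that executes the instructions."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == "ADD" else a * b)
    return stack.pop()


program = compile_expr("1 + 2 * 3")
print(program)       # [('PUSH', 1), ('PUSH', 2), ('PUSH', 3), ('MUL', None), ('ADD', None)]
print(run(program))  # 7
```

A real JVM language does the same thing at a larger scale, with the JVM's actual instruction set as the target and the class file format as the container.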
The .NET languages are more for show than actual usefulness. Each language has been so butchered, that they're all C# with a new face.
There are a variety of reasons to provide alternative languages for the Java VM:
The JVM is multiplatform. Any language ported to the JVM gets that as a free bonus.
There is quite a bit of legacy code out there. Antiquated engines like ColdFusion perform better while offering customers the ability to slowly phase their applications from the legacy solution to the modern solution.
Certain forms of scripting are better suited to rapid development. JavaFX, for example, is designed with rapid Graphical development in mind. In this way it competes with engines like DarkBasic. (Processing is another player in this space.)
Scripting environments can offer control. For example, an application may wish to expose a VBA-like environment to the user without exposing the underlying Java APIs. Using an engine like Rhino can provide an environment that supports quick and dirty coding in a carefully controlled sandbox.
Interpreted scripts mean that there's no need to recompile anything. No need to recompile translates into a more dynamic environment. e.g. Despite OpenOffice's use of Java as a "scripting language", Java sucks for that use. The user has to go through all kinds of recompile/reload gyrations that are unnecessary in a dynamic scripting environment like Javascript.
Which brings me to another point. Scripting engines can be more easily stopped and reloaded without stopping and reloading the entire JVM. This increases the utility of the scripting language as the environment can be reset at any time.
It's much easier for a compiler writer to generate JVM or CLR byte-codes. They are a much cleaner and higher level abstraction than any machine language. Because of this, it is much more feasible to experiment with creating new languages than ever before, because all you have to do is target one of these VM architectures and you will have a set of tools and libraries already available for your language. They let language designers focus more on the language than all the necessary support infrastructure.
Because the JSR process is rendering Java more and more dead: http://www.infoq.com/news/2009/01/java7-updated
It's a shame that even essential and long known additions like Closures are not added just because the members cannot agree on an implementation.
Java has accumulated a massive user base over seven major versions (from 1.0 to 1.6). Its capability to evolve is limited by the need to preserve backwards compatibility for the uncountable millions of lines of Java code running in production.
This is a problem because Java needs to evolve to:
compete with newer programming languages that have learned from Java's successes and failures.
incorporate new advances in programming language design.
allow users to take full advantage of advances in hardware - e.g. multi-core processors.
fix some cutting edge ideas that introduced unexpected problems (e.g. checked exceptions, generics).
The requirement for backwards compatibility is a barrier to staying competitive.
If you compare Java to C#, Java has the advantage in mature, production ready libraries and frameworks, and a disadvantage in terms of language features and rate of increase in market share. This is what you would expect from comparing two successful languages that are one generation apart.
Any new language has the same advantage and disadvantage that C# has compared to Java to an extreme degree. One way of maximizing the advantage in terms of language features, and minimizing the disadvantage in terms of mature libraries and frameworks is to build the language for an existing virtual machine and make it interoperable with code written for that virtual machine. This is the reason behind the modest success of Groovy and Clojure; and the excitement around Scala. Without the JVM these languages could only ever have occupied a tiny niche in a very specialized market segment, whereas with the JVM they occupy a significant niche in the mainstream.
They do it to keep up with .Net. .Net allows C#, VB, J# (formerly), F#, Python, Ruby (coming soon), and c++. I'm probably missing some. Probably the big one in there is Python, for the scripting people.
To an extent it is probably an 'Arms Race' against the .NET CLR.
But I think there are also genuine reasons for introducing new languages to the JVM, particularly when they will be run 'in parallel': you can use the right language for the right job. A scripting language like Groovy may be exactly what you need for your page presentation, whereas regular old Java is better for your business logic.
I'm going to leave someone more qualified to talk about what is required to write a new language/compiler.
As for how to write code, you do it in notepad/vi as usual! (Or use a development tool that supports the language if you want to do it the easy way.) Compiling will require a special compiler for the language that will interpret and compile it into bytecode.
Since Java also produces bytecode, technically you don't need to do anything special to run it.
The reason is that the JVM platform offers a lot of advantages.
Giant number of libraries
Broader degree of platform implementations
Mature frameworks
Legacy code that's already part of your infrastructure
The languages Sun is trying to support with their Scripting spec (e.g. Python, Ruby) are up and comers largely due to their perceived productivity enhancements. Running Jython allows you to, in theory, be more productive, and leverage the capabilities of Python to solve a problem more suited to Python, but still be able to integrate, on a runtime level, with your existing codebase. The classic implementations of Python and Ruby effect the same ability for C libraries.
Additionally, it's often easier to express some things in a dynamic language than in Java. If this is the case, you can go the other way; consume Python/Ruby libraries from Java.
There's a performance hit, but many are willing to accept that in exchange for a less verbose, clearer codebase.