Why do almost all OO languages compile to bytecode? - oop

Of the object-oriented languages I know, pretty much all but C++ and Objective-C compile to bytecode running on some sort of virtual machine. Why have so many different languages settled on compiling to bytecode, as opposed to machine code? Is it possible in princible to have a high-level memory-managed OOP language that compiled to machine code?
Edit: I'm aware that multiplatform support is often advanced as an advantage of this approach. However, it's quite possible to compile natively on multiple platforms, without making a new compiler per platform. One can, per example, emit C code and then compile that with GCC.

There's no reason in fact, this is a kind of coincidence. OOP now is the leading concept in "big" programming, and so virtual machines are.
Also note, that there are 2 distinct parts of traditional virtual machines - garbage collector and bytecode interpreter/JIT-compiler, and these parts can exist separately. For example, Common Lisp implementation called SBCL compiles program to a native code, but at runtime heavily uses garbage collection.

This is done to allow a VM or JIT compiler the chance to compile the code on demand optimally for the architecture on which the code is executed. Also, it allows for cross-platform bytecode to be created once and then executed on multiple hardware architectures. This allows for hardware specific optimizations to be placed into the compiled code.
Since byte code is not limited to a microarchitecture, it can be smaller than machine code. Complex instructions can be represented vs. the much more primitive instructions available in modern day CPUs, since the constraints in the design of CPU instructions are very different from the constraints in designing a bytecode architecture.
Then there's the issue of security. The bytecode can be verified and analyzed prior to execution (i.e., no buffer overflows, variables of a certain type being accessed as something they are not), etc...

Java uses bytecode because two of its initial design goals were portability and compactness. Those both came from the initial vision of a language for embedded devices, where fragments of code could be downloaded on the fly.
Python, Ruby, Smalltalk, javascript, awk and so on use bytecode because writing a native compiler is a lot of work, but a textual interpreter is too slow - bytecode hits a sweet spot of being fairly easy to write, but also satisfactorily quick to run.
I have no idea why the Microsoft languages use bytecode, since for them, neither portability nor compactness is a big deal. A lot of the thinking behind the CLR came out of computer scientists in Cambridge, so i imagine considerations like ease of program analysis and verification were involved.
Note that as well as C++ and Objective C, Eiffel, Ada 9X, Vala and Go are OO languages (of varying vintage) that are compiled straight to native code.
All in all, i'd say that OO and bytecode do not go hand in hand. Rather, we have a coincidental convergence of several streams of development: the traditional bytecoded interpreters of scripting languages like Python and Ruby, the mad Gosling masterplan of Java, and whatever it is Microsoft's motives are.

The biggest reason why most interpreted languages (not specifically OO languages) are compiled to bytecode is for performance. The most expensive part of interpreting code is transforming text source to an intermediate representation. For instance, to perform something like:
foo + bar;
The interpreter would have to scan 10 characters, transform them into 4 tokens, build an AST for the operation, resolve three symbols (+ is a symbol, which depends on the types of foo and bar), all before it can perform any action that actually depends on the run-time state of the program. None of this can change from run to run, and so many languages try to store some form of intermediate representation.
bytecode, rather than storing an AST has a few advantages. For one, bytecodes are easy to serialize, so the IR can be written to disk and reused at the next invocation, further reducing interpretation time. Another reason is that bytecode often takes up less actual ram. significantly bytecode representations are often easy to just in time compile, because they are often structurally similar to typical machine code.

As another data point, the D programming language is GC'ed, OO, and a lot higher level than C++ while still being compiled to native code.

Bytecode is significantly more flexible medium than machine code. First, it provides the basis for platform portability without the need for a compiler or shipping source code. So a developer can distribute a single version of the application without needing to give up the source, require complex developer tools, or anticipate potential target platforms. While the later is not always practical it does happen. Especially with developer libraries say I distribute a library that I've only tested on Windows, but someone else uses it on Linux or Android. It happens quite frequently actually, and most of the time it works as expected.
Byte code is also generally more optimized that an interpreter because it's closer to machine instructions therefore faster to translate to machine instructions. Not all OO languages are compiled. Ruby, Python, and even Javascript are interpreted so they aren't compiled to anything so the ruby interpreter has to take a very flexible language and turn that into instructions, but that flexibility comes at a price paid an runtime: parse text, generate AST, translate AST to machine code, etc. It's also easy to do optimizations like JIT where byte code is translated to machine code directly, and even gives the possibility for creating optimizations for specific hardware.
Finally, just because one language compiles to bytecode doesn't preclude other languages taking advantage of of that byte code. Now any optimization using that byte code can be applied to these other languages that might know how to translate themselves to that byte code. That makes the byte code a very important layer for reusability for other languages.
OO and byte code compilation goes back to the 70s with Smalltalk, and I'm sure someone will say LISP as early as the 50s/60s. But, it really wasn't until the 90s that it started to really be used in production systems on a large scale.
Native compilation sounds like the optimal path, and probably why our industry spent 20 years or more thinking that was THE ANSWER to all our problems, but the last 15 years we've seen byte code compilation take stage and it's been a significant advantage over what we did before. Looking back we realize how much time wasted natively compiling everything mostly by hand.

I agree with Chubbard's answer and I'd add that in OO languages type information can be very important for enabling optimizations by virtual-machines or last-level compilers

It is easier to develop an interpreter than a compiler.
Effort in development of...:
interpreter < bytecode-interpreter < bytecode-jit-compiler < compiler-to-platform-independent-language < compiler-to-multiple-machine-dependent-assembler.
It is a general trend to stop the development at jit-compilers because of platform independence. Only the preferred languages in respect to performance and research in theoretical computer science are and will be developed in ALL possible directions, including new bytecode-interpreter, even while there are good and advanced compilers to platform independent languages and to different machine-dependant assemblers.
The research in OOP languages is pretty ...let's say dull, compared to functional languages, because really new language and compiler technologies are more easily expressed with/in/using mathematical cathegory theory and mathematical descriptions of touring-complete type-systems. In other words: it is nearly functional in itself, while imperative languages are nearly only assembler-frontends with some syntactic sugar. OOP languages tend to be imperative languages, because functional languages have already closures and lambda. There are other ways to implement java-like "interfaces" in functional languages, and there is just no need for additional object oriented features.
In i.e. Haskell, adding the feature of OOP-like programming would probably be more than only a few steps back in technology – there would be no point in using that. (<- that is not only IMHO... you ever heard of GADTs or Multi-parameter-type-classes?) Probably there might be even better ways to dynamically create Objects with Interfaces to communicate with OOP-languges than changing that language itself. But there are other functional languages, too, that explicitely combine functional and OOP aspects. There is just more science with mainly functional languages than non-functional OO-languages.
OO languages can not be easily compiled to other OO languages, iff they are in some way more "advanced". Usually, they have features like stack-protector, advanced debugging abilities, abstract and inspectable multi-threading, dynamic object-loading from files from the internet... Many of these features are not or not-easily realisable with C or C++ as compiler-backend. The functional language LISP (which is 50 years old!) was AFAIK the first with garbage collector. As compiler-backend LISP used a hacked version of the language C, because plain C did not allow some of those things, assembler did allow, i.e. proper-tail-calls or tables-next-to-code. C-- allows that.
An other aspect: Imperative languages are intended to run on a specific architecture, i.e. C and C++ programs run on only those architectures, they are programmed for. Java is more extreme: it runs only on a single architecture, a virtual one, which itself runs on others.
Functional languages are usually by design pretty architecture-independent: LISP was developed to be so immense architecture-unspecific, that it could be compiled to genetic code, in some distant future. Yes, like programs running in living biologic cells.
With the bytecode for the LLVM, functional languages will most-likely be compiled to bytecode in the future, too. Most imperative languages will most likely still have the same inherited problems as they have now from not-abstracting-far-enough. Well, I'm not that sure about clang and D, but those two are not "the most" anyway.

Related

How compiled language is better than interpreted language in optimizing the hardware?

Specifically how is compiled language able to better optimize the hardware compared to interpreted language? Other online sources that I have read only gave vague explanations like because it is written in the native code of the target machine while some do not even offer explanation at all. Would appreciate if the explanation provided can be as "Layman" as possible given that I've only just started to code.
One major reason is optimizing compilers. Compiling "in advance" makes it much easier to apply optimizations to code, especially if you're compiling to native assembly code (as you typically do in C, for example). The fact that you know some stuff about the machine that it's going to be deployed on allows you to do machine-specific optimizations. This is especially important for, for example, Pentium-based processors, which have numerous complicated instructions that would tend to require some degree of knowledge of program structure in order to use (e.g. the MMX instruction set).
There are also some cases where the compiler can make structural changes to programs. For example, under special circumstances, some compilers can replace recursion with loops. (I once heard of someone writing a recursive Factorial function in C to learn about how to implement recursion in assembly language only to realize to his horror that the compiler had recognized an optimization and replaced his recursion with a for loop).

OOP Performance after compilation

I got into a conversation with someone about OOP, who said that OOP costs to much performance. Now I know that in some cases it might, but as I see it, it would depend on different things.
Language execution. In languages using an interpretor, I can see that it could be a possibility. But what about compiled language like C++ or half compiled like Java? In any case it would just slow down the compilation vs. C, but as native or byte code I would think that the compilers would have optimized it to a point where this is not a problem.
Language structure. If we take PHP as an example, it is quite a flexible language with little rules. Java on the other hand uses strict naming schemes, strict file structure rules and is strict about data types. This speeds up lookup quite a bit. What if we used the same rules in PHP? Made it 100% OOP and adapted the same rules as Java has, would this not speed up PHP?
I found a really great OOP example, but this example does not prove the upside of OOP, but rather the upside of overview and structure. It's no problem using PP to do the same, at least not in PHP.
OOP is a very moot term and that's why your question is moot as well.
On the most generic level, OOP is about objects (let's not dive into what they are) encapsulating some state and passing each other messages to enquire or change that state. As you can see, these objects might be processes running on separate network-connected machines and message passing might be done quite literally—by passing messages of some application-level protocol over that network; this is the one extreme. The opposite edge of this spectrum is, say, C or C++ or Object Pascal etc which are compiled down to machine instructions and in which objects are just memory regions. I reckon the only "interesting" topic is a language on this side of the OOP spectrum, right?
In this, down-to-machine, level, the only relevant slow down I perceive is dynamic dispatch which is what is typically used to implement implementation inheritance (class Bar extends class Foo as in PHP) which allows you to pass objects of derived classes to the code expecting objects of their base class. This is typically requires a lookup through the table of methods at runtime to select a relevant method.
Note that this is not somehow inherent to the concept of OOP. For instance, dynamic lookup like this has been routinely used in plain C code even before C++ came to existence, and C is not an OOP language.
What I'm leading you to, is that some ways to access data and code cost more than others in terms of performance but provide powerful programming tools. Picking such an algorythm while considering a resulting tradeoff is not at all peculiar to implementing OOP concepts and happens in any computing field and any computing paradigm or a combinations of them.
In the end, I would say that the most visible slow-downs will come not from the code running on a CPU but rather from the runtime system. For instance, PHP is known for its ability to dynamically load code at runtime. Does this count as a feature of it being an OOP-enabled language? On the one hand, in these days of heavy frameworks, when PHP loads something it usually loads definitions of classes. On the other hand, if these frameworks were, say, purely procedural the same performance cost would be incurred (as the most loading time is spent waiting on I/O). Interpreted or JIT-compiled languages have to interpret or compile the code they execute and this incurs pefrormance hits. Does this depend on some of these languages implementing OOP concepts? Unlikely, IMO.

Anyone program "low-level" on JVM?

Sometimes we hear about brave people who understand and write assembly language for performance reasons, as opposed to using a compiler with a high-level language. Can the same be done on the JVM? I've reviewed the JVM instruction set, and it resembles assembly language in some respects, though it's much higher level (I'm assuming that the system-specific implementations of the JVM are extremely efficient).
Is it possible to, say, write JVM instructions and put them into a Java-executable binary?
Yes. You can do this via the asm library.
In fact, this is typically how people implement non-Java languages on top of the JVM, and how many Java metaprogramming libraries work.
You may very well want to do this for the same kind of metaprogramming capabilities - e.g., generating classes at runtime, or using the InvokeDynamic instruction to generate your own method dispatch rules.
There isn't a whole lot of performance benefit to be gained from using raw Java bytecode rather than writing the corresponding high-level Java (the JIT is your main performance booster, and it's optimized for the sorts of patterns "vanilla" Java code generates) but it does give you flexibility for things that are difficult, verbose, or impossible to express in Java.

Why are the interpreters of all popular scripting languages written in C (if not in C at least not in C++)?

I recently asked a question on switching from C++ to C for writing an interpreter for speed and I got a comment from someone asking why on earth I would switch to C for that.
So I found out that I actually don't know why - except that C++ object oriented system has a much higher abstraction and therefore is slower.
Why are the interpreters of all popular scripting languages written in C and not in C++?
If you want to tell me about some other language where the interpreter for it isn't in C, please replace all occurences of popular scripting languages in this question with Ruby, Python, Perl and PHP.
C is a very old language, and is thus supported on pretty much every system available. It is therefore a good choice for any project that needs to be ported everywhere.
Ruby dates back to 1995. If you were writing an interpreter in 1995, what were your options? Java was released in the same year. (And was painfully slow in v1.0 and in many ways, not really worth using)
C++ was not yet standardized, and compiler support for it was very sketchy. (it had also not yet made the transition to the "modern C++" that we use today. I think the STL was proposed for standardization around this time as well. It didn't actually get added to the standard until years later. And even after it was added, it took several more years for 1) compilers to catch up, and 2) people to get used to this generic programming style. Back then, C++ was an OOP language first and foremost, and in many cases, that style of C++ was quite a bit slower than C. (In modern C++ code, that performance difference is pretty much eliminated, partly through better compilers, and partly through better coding styles, less reliance on OOP constructs and more on templates and generic programming)
Python was started in 1991. Perl is even older (1987)
PHP is from 1995 as well, but additionally, and importantly, was created by a guy who knew virtually nothing of programming. (and yes, of course this has shaped the language in many important ways)
The languages you mention were started in C because C was the best bet for a portable, future-proof platform back then.
And while I haven't looked this up, I'm willing to bet that apart from the PHP case, which is shaped by incompetence more than anything, the language designers of the other languages chose C because they *already knew it. So perhaps the lesson is not "C is best", but "the language you already know is best"
There are other reasons why C is often chosen:
experience and accessibility: C is a simple language that is fairly easy to pick up, lowering the barrier of entry. It's also popular, and there are a lot of experienced C programmers around. One reason why these languages have become popular might just be that it was easy to find programmers to help developing the interpreters. C++ is more complex to learn and use well. Today, that might not be so much of a problem, but 10 or 15 years ago?
interoperability: Most languages communicate through C interfaces. Since your fancy new language is going to rely on components written in other languages (especially in early versions when the language itself is limited and has few libraries), it's always nice and simple to call a C function.So since we're going to have some C code anyway, it might be tempting to go all the way and just write the whole thing in C.
performance: C doesn't get in your way much. It doesn't magically make your code fast, but it allows you to achieve good performance. So does C++, of course, or many other languages. But it's true for C as well.
portability: Practically every platform has a C compiler. Until recently, C++ compilers were much more hit and miss.
These reasons don't mean that C is in fact a superior language for writing interpreters (or for anything else), they simply explain some of the motivations that have caused others to write in C.
I'd guess it's because C is pretty much the only language that has a reasonably standard compiler for almost every platform in existence.
I would hazard a guess that it's in part due to 1998 C++ not being standardized until 1998, making achieving portability that much harder.
All those languages you list were developed before that standardization.
Why are the interpreters of all popular scripting languages written in C and not in C++?
What makes you think that they are written in C? In my experience, the majority of implementations for the majority of scripting languages are written in languages other than C.
Here's a couple of examples:
Ruby
BlueRuby: written in ABAP
HotRuby: JavaScript
Red Sun: ActionScript
SmallRuby: Smalltalk/X
MagLev: Ruby, GemStone Smalltalk
Smalltalk.rb: Smalltalk
Alumina: Smalltalk
Cardinal: PIR, NQP, PGE
RubyGoLightly: Go
YARI: Io
JRuby: Java
XRuby: Java
Microsoft IronRuby: C#
the original IronRuby by Wilco Bauwer: C#
Ruby.NET: C#
NETRuby: C#
MacRuby: Objective-C
Rubinius: Ruby, C++
MetaRuby: Ruby
RubyVM: Ruby
Python
IronPython: C#
Jython: Java
Pynie: PIR, NQP, PGE
PyPy: Python, RPython
PHP
P8: Java
Quercus: Java
Phalanger: C#
Perl6
Rakudo: Perl6, PIR, NQP, PGE
Pugs: Haskell
Sprixel: JavaScript
v6.pm: Perl5
Elf: CommonLisp
JavaScript
Narcissus: JavaScript
Ejacs: ELisp
Jint: C#
IronJS: F#
Rhino: Java
Mascara (ECMAScript Harmony Reference Implementation): Python
ECMAScript 4 Reference Implementation: Standard ML
The HotSpot JVM is written in C++, the Animorphic Smalltalk VM (from which HotSpot and V8 are derived) is written in C++, the Self VM (on which the Animorphic Smalltalk VM is based) is written in C++.
Interestingly enough, in many of the above cases, the implementations that are not written in C, are actually faster than the ones written in C.
As an example of two implementations that are written in C, take Lua and CPython. In both cases, they are actually written in a small subset of a very old version of C. The reason for this is that they want to be highly portable. CPython, for example, runs on platform for which a C++ compiler doesn't even exist. Also, Perl was written in 1989, CPython in 1990, Lua in 1993, SpiderMonkey in 1995. C++ wasn't standardized until 1998.
The complexity of C++ is great compared to that of C - many people consider it one of the most complex and error prone languages in existance.
Many of the features of C++ are problematic as well - the STL was standardized many years ago and it still lacks one great implementation.
OOP is certainly great, but it does not outweigh C++'s deficiencies in many scenarios.
Most known compiler books are written with examples in C.
Also two of the major tools lexx (builds a lexer) and yacc (Translates a grammar to C) have support for C.
If the question is about why C and not C++ the answer comes down to the fact that when you implement a scripting language the C++ object model comes into your way. Its so restricted that you will not be able to use it for your own objects.
So you can only use this for the internals and they there you usually do not get enough benefits from C++ over the much simpler C language, which makes it easier to port and distribute.
The only problem when implementing a script language in C are missing coroutine support (you have to switch your stack pointer in some way) and most important there is no way to do exception handling without a lot of overhead (compared to C++).

Embedded platform development in (!C)

I'm curious to see how popular the alternatives to C are in the embedded developer world e.g. Ada...
I've only ever used C (with a little bit of assembler), but then my targets have very limited resources. Is there a move else where in this space to something else? What is winning the ware in set top boxes?
If !C what was the underlying reason?
Compiler support for target
Trace \ static analysis tools
other...
Thanks.
Forth is quite popular for embedded development.
Also, while Smalltalk is probably not popular in the embedded community, embedded development is definitely popular in the Smalltalk community.
When you say "embedded development", keep in mind that you have to consider the scale of the project.
When programming something on the scale of a microcontroller or the firmware for an ASIC, you tend to see C and assembly dominate the scene. Embedded developers tend to "specialize" in these languages since compilers for them are available for nearly every embedded target platform. If your project migrates from, say, a chip with a PowerPC core to a chip with an ARM core, you can be fairly confident that your C code will not be overly difficult to port over. Some chips do have compilers available for other languages, but typically they do not match the C compiler in terms of efficiency of the resulting binary. Since embedded systems are often low on resources, system designers want to make their code as efficient as possible (also one reason why you see a lot of assembly language code). I have seen development tools available for languages such as C++, Pascal, Basic, and others, but they are typically niche tools that are not mature enough to match the efficiency of the available C compilers. Debugging tools for these languages also tend to be harder to find than what is available for C/assembly.
You also mentioned set-top boxes. Embedded systems on this scale can pack the equivalent power of a desktop computer from 7-8 years ago. Their available RAM, storage space, and processing power allows them to run full-featured operating systems and interpreters for higher-level languages. On these more powerful systems you will still see C and assembly language being used (for driver code, if nothing else), but other languages (such as Java, Lua, Tcl, Ruby, etc) are becoming more and more common. Using interpreted languages makes porting code from one platform to another even easier, as long as the platform has sufficient resources to handle the overhead of the language interpreter. Any low-level code that interfaces directly with hardware (drivers) with still typically use assembly or C since high-level languages don't always have the capability to do this sort of thing. Anything running as an application on top of the embedded operating system can usually be developed and tested inside an emulator or virtual machine, and so you will see a lot of code being developed in whatever language the developer happens to be comfortable with.
TLDR version: C is popular because is it a versatile language that nearly all developers are familiar with. Assembly is popular because it allows for low-level hardware access in ways that would otherwise be difficult or impossible. Interpreted/scripted languages such as Java are becoming more popular, but the resource requirements of the interpreters for these languages may be too much for some embedded systems to handle. The quality and variety of development/debugging tools availability for the C and assembly languages also makes these options attractive.
Perhaps not quite the large step from C you're looking for but C++ is also resonably popular for embedded projects.
I haven't used myself, but Bascom is quite popular for AVR microcontrollers. It is a Basic IDE that lets you interact with the peripherals very easily. I've met hardware people that successfully use it.
Yes. Java is becoming more popular - many processors have added instructions that pertain primarily to Java and similar languages (.net). Also, uclinux runs on microcontrollers, so you can use practically any language for some of the larger micros.
Basic is still common, as is assembly.
You'll see Ada in certain gov't projects.
And some engineers are even putting Lua and other interpreters on their micros so their customers can extend the functionality.
But C is still dominant.
-Adam
In the early 90 I did a lot of embedded development on the 8051 using Intel PLM51 and the DCX51 operating system.
PLM is very simple language – but very powerful
We now use C
If you work in the smartcard space, you get to use Java Card. Yep, Java, on an 8-bit micro. It's kinda fun, actually. I get to develop in Eclipse, test ( & debug!) on the PC simulator, and can be confident that it'll run the same on the card. It's just such a pity Java is a terrible language for embedded apps :)
I've used EC++ (Embedded C++) quite extensively.
Also, PICBasic has been popular with the PIC'ers for eons now.
I have used Ada in embedded project for military avionics because of customer requirements. There is lots of Ada tools for embedded development but most of it is very expensive. Personally I would just use C.
There is a Pascal compiler for 8051
JAL
There is a group of folks working to make Lua a viable option for embedded work. They are targeting primarily 32-bit ARMs with 256K FLASH and 64K RAM or better, and seem happy with their work so far.
They are partly inspired by the classic BASIC-Stamp, a BASIC interpreter running in a moderately powerful PIC with the program itself stored in a serial EEPROM device.
At work, I am still maintaining a customer's embedded system that is written in a compiled flavor of BASIC running in a Zilog Z180 CPU. 1980's technology all around, with most of the system still built out of 24-bin DIP packages in sockets. The compiler runs under CP/M-80 running in a Z80 simulator, that itself runs in the MS-DOS simulator built into Windows. Aside from the shear amazement that anything productive can be done this way (and that you can still buy 27C256 UV erasable EPROMS, and that my nearly 20 year old Data/IO PROM programmer still works) I really wish the customer could afford to move to a new hardware design so the system could be rewritten in a maintainable language.
Depends on the microcontroller, many of them have C but the compilers are horribly, assembler is usually easy and the best performing, most efficient, etc. Ones like the msp, avr, and arm are good for C compilers and for those I would and do use C (depending on the problem).
I would stick to C or assembler, you are wasting memory, performance, and resources using anything else.
Pascal, Modula2 work fine too. Essentially they are pretty much equivalent to C, except for the inability to do alloca (though some have that as extension).
But the core problem will be the problem with any !C compiler: what do you prefer, a better compiler/toolchain or the language of preference.
Despite I like the Wirthian languages most, I simply use C, and am living with the consequences, simply because the toolchain is better.
There have been examples in the past (Pascals, or even tightly compiled Basics), but C is mostly the norm. I never understood why.
I worked on a device which ran some incredibly old version of python (1.4 or something). There was no way to debug it (other than printing debug messages) so when your code hit an exception everything would just stop and you scratched your head for an hour. Whenever you made a change and upgraded the code it was running, it took about 10 minutes to interpret and compile it.
Needless to say we scrapped that and replaced the microcontroller with one that ran C.
See this related question:
What languages are used for real-time systems programming.
In response to your "why" question, from the standpoint of government/military acquisition, there is a perception that Java (language, platform, etc...) is the lingua franca these days and that economies of scale in the language will reduce acquisition and maintenance cost. There's also a hope that one can efficiently train a competent Java programmer to be a reasonable RT/embedded programmer in Java faster than if they are required to learn a new language. This rationale is suspect, in my opinion, but it does answer the "why" question.
If you include the iPhone as an embedded platform then Objective-C
Considering how many times I've had a Java out-of-memory exception on my phone(most of the time I do anything remotely interesting), I'd run away from Java like a bat out of a hot place.
I've heard that Erlang was designed for use for cell phones. I think Lisp is a good architecture for remote device support- if the device cna handle the run-time.
A lot of home-brew users and small companies needing a cheap solution have found Tiny Tiger and Basic STAMP (using BASIC) meets their needs.