Specifically, how is a compiled language able to optimize better for the hardware than an interpreted language? Other online sources that I have read only gave vague explanations, such as "because it is written in the native code of the target machine", while some do not offer any explanation at all. I would appreciate it if the explanation could be as "layman" as possible, given that I've only just started to code.
One major reason is optimizing compilers. Compiling "in advance" makes it much easier to apply optimizations to code, especially if you're compiling to native assembly code (as you typically do in C, for example). The fact that you know some things about the machine the program will be deployed on allows you to do machine-specific optimizations. This is especially important on, for example, Pentium-class processors, which have numerous complicated instructions (e.g. the MMX instruction set) that require some knowledge of the program's structure to use well.
There are also some cases where the compiler can make structural changes to programs. For example, under special circumstances, some compilers can replace recursion with loops. (I once heard of someone writing a recursive Factorial function in C to learn about how to implement recursion in assembly language only to realize to his horror that the compiler had recognized an optimization and replaced his recursion with a for loop).
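To make that kind of structural change concrete, here is a sketch in Python (purely for readability - the transformation itself is something the C compiler performs, not something Python does for you): the first function is the shape the programmer wrote, the second is roughly the shape an optimizing compiler may emit instead.

# Recursive form, as written by the programmer: each call waits on the next.
def factorial_recursive(n):
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)

# Loop form, roughly what an optimizing compiler can turn it into: no stack growth.
def factorial_loop(n):
    result = 1
    while n > 1:
        result *= n
        n -= 1
    return result

assert factorial_recursive(10) == factorial_loop(10) == 3628800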
I want to write a macroprocessor. So far I've done a very simple sketch of how it should look, and I came to the conclusion that inventing a completely new language would not be a good idea; I should reuse existing concepts instead. My sketch so far is a kind of irb with some TeX-like syntax and features, but I'm not sure what I should use as the Ruby substitute.
The language should be simple, yet powerful. I don't want to write an OS in it, but it should be less "raw" than e.g. bc or Forth. I don't care about execution time at all. Embedding should not be too hard, and it would be nice if the language itself were stable.
So far I've considered these:
Lua - Supposedly it processes text easily, but it does not even have a while(c=getchar()){} equivalent. I'm skeptical.
awk - Simple, text processing is easy, but never intended for embedding
perl - Way too complex; stable, but it seems almost dead.
python - Significant whitespace; won't it get in the way of inline function definitions?
groovy/nice/java - Hard or impossible to embed? Also way too heavy.
javascript - Really like it (besides the DOM), but is there a stable, embeddable implementation? I don't want to mess around with the API every two weeks when there's a new V8 version. As I said, I don't care about execution time.
I have not really found any pros/cons for
io
guile/scheme
TCL
Update: The language should have features such as function definitions, library loading, or regexps (loops would also be very nice). I don't want to use a traditional macro language such as M4, because I want to be able to write in a more procedural (or maybe functional) style. Macro languages have their pros, but they require a completely new way of thinking about a problem, which is hard, especially for beginners. My aim is to use the best of both worlds.
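To make the idea more concrete, here is a minimal sketch of what I mean (the @{...} syntax and the helper names are just placeholders of my own, and Python stands in for whichever embedded language I end up choosing): ordinary text passes through untouched, while anything inside @{...} is evaluated in the embedded language.

import re

MACRO = re.compile(r"@\{(.+?)\}")

def expand(text, env=None):
    # Replace each @{expression} with the result of evaluating it in env.
    env = env or {}
    return MACRO.sub(lambda m: str(eval(m.group(1), env)), text)

env = {"name": "world", "repeat": lambda s, n: s * n}
print(expand("Hello @{name.upper()}! @{repeat('-', 10)}", env))
# prints: Hello WORLD! ----------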
Given that TCL is about string and array processing, and is intended for embedding, it would seem an obvious choice.
LuaTeX has a certain following. Presumably they have found a way to make it work for text processing, so you might like to look at that.
Scheme (including guile) is also very nice for scripting; alternatively you might look at whether there is a way you could embed an elisp processor (embed xemacs?), which after all is all about text processing.
Of the object-oriented languages I know, pretty much all but C++ and Objective-C compile to bytecode running on some sort of virtual machine. Why have so many different languages settled on compiling to bytecode, as opposed to machine code? Is it possible in principle to have a high-level, memory-managed OOP language that compiles to machine code?
Edit: I'm aware that multiplatform support is often advanced as an advantage of this approach. However, it's quite possible to compile natively on multiple platforms without making a new compiler per platform. One can, for example, emit C code and then compile that with GCC.
There's no reason, in fact; this is a kind of coincidence. OOP is now the leading concept in "big" programming, and so are virtual machines.
Also note that there are two distinct parts to a traditional virtual machine - the garbage collector and the bytecode interpreter/JIT compiler - and these parts can exist separately. For example, the Common Lisp implementation SBCL compiles programs to native code but still relies heavily on garbage collection at runtime.
This is done to give a VM or JIT compiler the chance to compile the code on demand, optimally for the architecture on which it is executed. It also allows cross-platform bytecode to be created once and then executed on multiple hardware architectures, while hardware-specific optimizations can still be placed into the code that is finally generated.
Since bytecode is not tied to a particular microarchitecture, it can be smaller than machine code: complex operations can be represented directly, versus the much more primitive instructions available in modern CPUs, because the constraints on designing CPU instructions are very different from the constraints on designing a bytecode format.
Then there's the issue of security. The bytecode can be verified and analyzed prior to execution (e.g., no buffer overflows, no variables of one type being accessed as something they are not), and so on.
Java uses bytecode because two of its initial design goals were portability and compactness. Those both came from the initial vision of a language for embedded devices, where fragments of code could be downloaded on the fly.
Python, Ruby, Smalltalk, javascript, awk and so on use bytecode because writing a native compiler is a lot of work, but a textual interpreter is too slow - bytecode hits a sweet spot of being fairly easy to write, but also satisfactorily quick to run.
I have no idea why the Microsoft languages use bytecode, since for them neither portability nor compactness is a big deal. A lot of the thinking behind the CLR came out of computer scientists in Cambridge, so I imagine considerations like ease of program analysis and verification were involved.
Note that as well as C++ and Objective C, Eiffel, Ada 9X, Vala and Go are OO languages (of varying vintage) that are compiled straight to native code.
All in all, I'd say that OO and bytecode do not go hand in hand. Rather, we have a coincidental convergence of several streams of development: the traditional bytecoded interpreters of scripting languages like Python and Ruby, the mad Gosling masterplan of Java, and whatever Microsoft's motives are.
The biggest reason why most interpreted languages (not specifically OO languages) are compiled to bytecode is for performance. The most expensive part of interpreting code is transforming text source to an intermediate representation. For instance, to perform something like:
foo + bar;
The interpreter would have to scan 10 characters, transform them into 4 tokens, build an AST for the operation, resolve three symbols (+ is a symbol, which depends on the types of foo and bar), all before it can perform any action that actually depends on the run-time state of the program. None of this can change from run to run, and so many languages try to store some form of intermediate representation.
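To make those steps concrete, here is roughly what they look like using CPython's own standard library (this is just an illustration of the pipeline described above, with Python used to keep all the examples in one language; the trailing ';' is dropped because Python does not use one):

import ast, io, tokenize

src = "foo + bar"

# Step 1: scan the characters into tokens.
for tok in tokenize.generate_tokens(io.StringIO(src).readline):
    print(tok.type, repr(tok.string))

# Step 2: build an AST for the operation.
print(ast.dump(ast.parse(src, mode="eval")))
# Expression(body=BinOp(left=Name(id='foo', ...), op=Add(), right=Name(id='bar', ...)))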
Bytecode, rather than a stored AST, has a few advantages. For one, bytecodes are easy to serialize, so the IR can be written to disk and reused at the next invocation, further reducing interpretation time. Another reason is that bytecode often takes up less RAM. Most significantly, bytecode representations are often easy to just-in-time compile, because they are often structurally similar to typical machine code.
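As a concrete illustration, CPython's dis module shows the compact, serializable form the same expression compiles down to; this is what gets cached in the .pyc files under __pycache__ so the parsing shown above is not repeated on the next run (exact opcodes are CPython-specific and vary between versions):

import dis

code = compile("foo + bar", "<example>", "eval")
dis.dis(code)
# Typical output:
#   LOAD_NAME     foo
#   LOAD_NAME     bar
#   BINARY_OP     + (BINARY_ADD on older versions)
#   RETURN_VALUE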
As another data point, the D programming language is GC'ed, OO, and a lot higher level than C++ while still being compiled to native code.
Bytecode is a significantly more flexible medium than machine code. First, it provides the basis for platform portability without the need for a compiler or for shipping source code. So a developer can distribute a single version of the application without needing to give up the source, require complex developer tools, or anticipate potential target platforms. While the latter is not always practical, it does happen - especially with developer libraries. Say I distribute a library that I've only tested on Windows, but someone else uses it on Linux or Android. It happens quite frequently, actually, and most of the time it works as expected.
Bytecode is also generally better optimized than what an interpreter works with, because it's closer to machine instructions and therefore faster to translate into them. Not all OO languages are compiled: Ruby, Python, and even JavaScript are interpreted, so they aren't compiled to anything, and the Ruby interpreter has to take a very flexible language and turn it into instructions on the fly. That flexibility comes at a price paid at runtime: parse the text, generate an AST, translate the AST into machine operations, and so on. Bytecode also makes it easy to apply optimizations like JIT compilation, where bytecode is translated to machine code directly, and it even opens the possibility of optimizations for specific hardware.
Finally, just because one language compiles to bytecode doesn't preclude other languages from taking advantage of that bytecode. Any optimization applied to that bytecode benefits every other language that knows how to translate itself into it. That makes the bytecode a very important layer of reusability for other languages.
OO and bytecode compilation go back to the 70s with Smalltalk, and I'm sure someone will say LISP as early as the 50s/60s. But it really wasn't until the 90s that it started to be used in production systems on a large scale.
Native compilation sounds like the optimal path, and that's probably why our industry spent 20 years or more thinking it was THE ANSWER to all our problems, but over the last 15 years we've seen bytecode compilation take center stage, and it's been a significant advantage over what we did before. Looking back, we realize how much time we wasted natively compiling everything, mostly by hand.
I agree with Chubbard's answer, and I'd add that in OO languages type information can be very important for enabling optimizations by virtual machines or last-level compilers.
It is easier to develop an interpreter than a compiler.
Effort in development of...:
interpreter < bytecode-interpreter < bytecode-jit-compiler < compiler-to-platform-independent-language < compiler-to-multiple-machine-dependent-assembler.
It is a general trend to stop development at JIT compilers, because of platform independence. Only the languages preferred for performance, and those used in theoretical computer science research, are and will be developed in ALL possible directions, including new bytecode interpreters, even when there are already good, advanced compilers to platform-independent languages and to different machine-dependent assemblers.
The research in OOP languages is pretty... let's say dull, compared to functional languages, because really new language and compiler technologies are more easily expressed with/in/using mathematical category theory and mathematical descriptions of Turing-complete type systems. In other words: it is nearly functional in itself, while imperative languages are nearly only assembler front-ends with some syntactic sugar. OOP languages tend to be imperative languages, because functional languages already have closures and lambdas. There are other ways to implement Java-like "interfaces" in functional languages, and there is just no need for additional object-oriented features.
In Haskell, for example, adding OOP-like programming would probably be more than just a few steps back in technology - there would be no point in using it. (That is not only IMHO... have you ever heard of GADTs or multi-parameter type classes?) There are probably even better ways to dynamically create objects with interfaces for communicating with OOP languages than changing the language itself. But there are other functional languages, too, that explicitly combine functional and OOP aspects. There is just more science going on around mainly functional languages than around non-functional OO languages.
OO languages cannot easily be compiled to other OO languages if they are in some way more "advanced". Usually they have features like stack protection, advanced debugging abilities, abstract and inspectable multi-threading, or dynamic object loading from files or from the internet... Many of these features are not, or not easily, realisable with C or C++ as the compiler backend. The functional language LISP (which is 50 years old!) was, AFAIK, the first with a garbage collector. As a compiler backend, LISP used a hacked version of the language C, because plain C did not allow some of the things assembler did allow, e.g. proper tail calls or tables-next-to-code. C-- allows that.
Another aspect: imperative languages are intended to run on a specific architecture, i.e. C and C++ programs run only on the architectures they are built for. Java is more extreme: it runs on only a single architecture, a virtual one, which itself runs on others.
Functional languages are usually, by design, pretty architecture-independent: LISP was developed to be so thoroughly architecture-unspecific that, in some distant future, it could be compiled to genetic code. Yes, as in programs running in living biological cells.
With the bytecode of LLVM, functional languages will most likely be compiled to bytecode in the future, too. Most imperative languages will most likely still have the same inherited problems as they have now from not abstracting far enough. Well, I'm not that sure about Clang and D, but those two are not "most" anyway.
There are many scripting language communities claiming that their language can be used for everything, but in fact nearly everybody uses it for one specific thing, e.g. web development. If I take a look at Ruby, for example, they tell you it's general-purpose, but actually everybody is using it with Rails for web development only.
Can you list some uses of popular general-purpose scripting languages on the local PC (other than embedding)? Are there any?
Is the fast development usually worth having to bring the whole interpreter along with your program? In most cases there would also be some language-dependent performance and stability problems.
best regards,
lamas
I tend to use Python for most things that aren't compute bound, i.e. they aren't restricted by how many computations you do per second. Some of the things I've used Python for are:
General scripts to manipulate images etc. with the Python Imaging Library.
GUI frontends for command line applications using the pexpect module.
Mathematical modeling of microbial systems.
Bioinformatics.
Some web programming.
etc...
When the program/algorithm is compute bound, I use C together with Python and ctypes. Does this fit your definition of general purpose? It's certainly useful for a wide variety of applications, but not suitable if the program needs to crunch numbers fast.
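As a rough sketch of what that combination looks like (calling into the standard C math library here because it needs no compile step; a real number-crunching project would load its own compiled .so/.dll the same way):

import ctypes, ctypes.util

# Load the C math library; the lookup differs per platform, hence the fallback.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# Declare the C signature of cos(double) so ctypes converts arguments correctly.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

print(libm.cos(0.0))   # 1.0, computed in C rather than in Python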
Stability: Python 2.5/2.6 is rock solid. I've never had a crash that wasn't caused by my own stupidity.
Fast development: It's definitely worth it for me. For the most part, in the field where I work, programmer time is orders of magnitude more valuable than processor time. I'm quite happy to let a program run for hours if I can write it in a few days instead of a few weeks.
I often use PHP for things that I used to use bat files for. Much easier to write. Ironically, the deployment scripts to create installable materials for my web apps from the subversion sources are written in PHP.
Python is popular in the gaming community. EVE Online is written in Python.
claiming that they can be used for everything but I often can't find any examples for that
You are basing your question on an incorrect assumption. Although, as pointed out, a Turing-complete language will be able to compute whatever you require, languages are "viewed" by most as the sum of their most useful features and productive semantics.
The reality is:
Most scripting languages can do the same things, or support the most common things via libraries.
Some languages make a subset of operations more convenient - take Perl and regular expressions as an example (see the sketch after this list)
CPU time is cheap, as is RAM. Simple to understand code is the priority for most people.
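For instance, the kind of one-line text extraction Perl is famous for (shown here with a made-up log line and Python's re module, only to keep all the examples in this thread in one language):

import re

log_line = "2009-07-14 12:33:51 ERROR disk /dev/sda1 98% full"   # made-up example

# Pull out the date, the level and the percentage in a single pattern.
m = re.search(r"^(\d{4}-\d{2}-\d{2}) .* (ERROR|WARN|INFO) .* (\d+)% full", log_line)
if m:
    date, level, pct = m.groups()
    print(date, level, pct)   # 2009-07-14 ERROR 98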
The rise of the scripting languages is natural. Trying to assert that any one language, approach, or level of execution is good for every situation is usually fruitless.
What do you want?
What is the best language for that?
Is it fast enough or small enough? Usually the answer is yes.
Imagine trying to use Python where you should be using Erlang, or C instead of Lisp, because you thought all languages are equal. They aren't, even though you can achieve the same things in a problem domain in most languages/platforms, with varying levels of ballache depending on the task.
I often use Ruby for what other people would create bash/sh files for. I find Ruby syntax intuitive for batch tasks, along with a lot of other sorts of tasks (it's my go-to language).
Perl is extremely popular for general scripting on Unixes; there are package managers, websites, and maintenance scripts written in Perl.
Python is extremely popular for both web and application use.
VBA is popular for being abused to write programs inside of Access, and it was also once commonly used in ASP for websites (right?).
Nobody mentioned AppleScript!
Hahah, no seriously, Perl runs everywhere, is installed by default on (almost) any Unix-family OS (and is easy to get on Windows), and is extremely useful for gluing things together. And if you browse a bit at CPAN you'll see that it's extremely general-purpose. "Swiss army chainsaw" was intended as a slur but I think of it fondly. Performance is good too, though it hardly ever actually matters. Larry Wall's goal was "make easy things easy and hard things possible".
OK OK, so I'm a fanboy still, sigh.
There are a lot of discussions all over the internet and on SO, i.e. here and here, about static vs dynamic languages.
I'm not going to ask again about one vs. the other. Instead, my question is for those who moved (or at least tried to move) from a statically typed language to a dynamic one.
I'm not talking about moderate usage of JS on your web page, another scripting language embedded into statically typed software, or small personal scripts. I mean moving to a dynamic language as your primary general-purpose language for developing production-quality software in a team.
Was that easy? What was the biggest advantage and the biggest challenge? Was it fun? :)
UPD: Did you find IDE support good enough? Did you find that you need less IDE support?
Was that easy?
Moderately. Some Java-isms are hard habits to break. My first six months, I wrote Python with ;'s. Icky. Once I was over it, though, I haven't looked back.
What was the biggest advantage?
Moving from the "write -> compile -> build -> run -> break -> debug -> write" cycle to a "write -> run -> break -> write" cycle. It takes time to get used to immediate gratification from the Python command-line interpreter. I was soooo used to endless design and planning before attempting to write (much less compile) any code.
At first I considered the python command line to be a kind of "education-only" interface. Then reading docstrings, doctests, and user guides where the application is being typed at the >>> prompt, I started to realize that the truly great Python software boils complexity and nuance down to stuff you can type interactively.
[I wish I could design stuff that worked that cleanly.]
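A tiny example of that interactive style (my own illustration, not from any particular project): the >>> session in the docstring is both the documentation and a test that doctest will replay.

def average(values):
    """Return the arithmetic mean of a sequence of numbers.

    >>> average([1.0, 2.0, 3.0, 4.0])
    2.5
    >>> average([10.0])
    10.0
    """
    return sum(values) / len(values)

if __name__ == "__main__":
    import doctest
    doctest.testmod()   # replays the >>> examples above and reports any failures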
What was the biggest challenge?
Multiple inheritance. I use it very rarely.
Was it fun?
So far.
It's also amazingly productive. More time with user requirements and real data. Less time planning an inheritance hierarchy with proper interfaces to capture meaning and compile correctly and be extensible enough to last at least to the next revision.
If I were you, I would try Scala!
Scala has some really interesting aspects that let it feel dynamic while staying static.
Scala is a statically typed language with a dynamically typed feel, because the compiler infers types for your assignments and makes you less repetitive.
It is a compiled language with a warm and wonderful scripting flavor: you can use the Scala console, or even write scripts just like in Ruby or Python. So you can choose between "write -> compile -> build -> run -> break -> debug -> write" and "write -> run -> break -> write", as S.Lott said.
Scala is a complete functional language with full support for OO, so you don't lose many important OO aspects like inheritance, encapsulation, polymorphism, etc.
Why am I answering your questions by suggesting Scala? Because I tried scripting languages before, mainly Ruby, and it was just like S.Lott said - but not so easy for me and my team. Most of the time static is safer, less error-prone, and even faster if you have the right language.
Answering your three questions with Scala in mind:
Was that easy?
Yes. Sometimes you need to concentrate to leave your old concepts aside and go deep.
What was the biggest advantage?
You feel at home because you don't need to change your environment or rewrite existing applications to migrate to Scala (speaking about Java). If you come from Java, you can start playing with Scala after reading some articles - not too much effort. Another important advantage is the use of a functional language and its embedded power.
Was it fun?
Sure! Changing your mind, and changing the way you solve problems for the better, is certainly fun.
That is my view: you don't necessarily need to leave static behind to grab the advantages of dynamic.
Nice question.
I am now working in Ruby, PHP and ActionScript (the least dynamic of the three) instead of languages that I would prefer, like Java and C#. But beggars, I mean, workers in this economy, can't be choosers. Or rather, you have to choose your battles and your master.
It's hard to compare Ruby and Java because they've got more than one difference, and you only asked about the dynamic/static thing (and not even about the strongly vs. weakly-typed thing!). But on that front, what affects me most is always the IDE. I was always horrified when other Java programmers used Notepad or Textpad to write code, and nowadays there are just too many advantages of a good IDE for that madness. Not true with Ruby! I use Netbeans and it does really well, but one of the main differences is that I have to actually type code. Autocomplete, for me, was/is a way of life (I write SMS messages in full English/Spanish with the predictive dictionary, for instance, and never use abbreviations) and writing Ruby code does require more work.
So at first it was painful and I was constantly looking at, for instance, function names of classes that I had written (or that are part of Ruby) just to get the spelling right! So that sucked, I thought, and I continued to think that until...
I moved back to ActionScript the other day, and to get my IDE autocompleting (FlashDevelop or FlexBuilder) I declare all variables with types (strongly-typed by choice, if you will)... and suddenly I thought what a friggin' hassle!
And then today I had to do some feature additions on a Ruby project and it felt free and cool. The code is clean, and why would I be informing the IDE of what I'm trying to write anyway?
So I would say that 1) the biggest challenges are learning the language and the framework you're working in, as always, and 2) it's been amazingly fun and deeply eye-opening. New languages always carry new things with them, but dynamic languages just feel different. And that's just the kind of thing that gets you to wake up at 7am and do some coding on a Sunday morning before falling asleep again.
I like programming and, like most of you, I've spent some time with stored procedures, XSL, static, dynamic, whatever... it's all fun, and they all feel totally different. In the end, the framework you are working in will be the thing that convinces you to stay or not (if you have a choice), I think, but languages are to be learned, studied, and experienced, not compared.
I can't qualify myself fully under that description, but I did spend a while writing an interesting Python mini-game after having spent many years writing Java. So I might be mixing in a little bit of moving from compiled to interpreted along with it.
I found myself using notation to mimic static typing. :)
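Something along these lines, as a rough illustration (the exact notation is my own guess): type-ish prefixes on parameter names plus an assert as a poor man's static check.

def scale_scores(int_count, float_factor):
    # "int_" / "float_" prefixes stand in for the declarations Java would have had.
    assert isinstance(int_count, int) and isinstance(float_factor, float)
    return [float_factor * i for i in range(int_count)]

print(scale_scores(3, 1.5))   # [0.0, 1.5, 3.0]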
However, I did find myself cranking out code at a slightly better clip. Having an interpreter is a godsend for learning a new language and writing new code. The shorter the time between finishing a line of code and seeing it work, the faster you can write, and I think that is probably the best thing about most dynamic and interpreted languages.
My code didn't look too different, all things considered. Though, Python has a lot of fun data structures. :)
I'm also interested in this topic.
I tried to dive into Ruby and Rails a while ago, and it really helped me grasp the ASP.NET MVC stuff, which I think is a bit too challenging at first for the average .NET developer.
If you're interested in moving further in this direction, or curious about how some developers moved from static to dynamic languages in their full-time jobs, I highly recommend this Alt.Net podcast.