Why have language interpreters be written in the target language? [duplicate] - interpreter

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Bootstrapping a language
What's the importance of having an interpreter for a given language written in the target language (for example, PyPy)?

It's not so much about writing the interpreter in itself as about writing the interpreter in a high-level language rather than in C. Ideally, doing so makes it easier to change details of the implementation and makes the interpreter more modular.
For the specific case of PyPy, writing the interpreter and the core objects in (R)Python makes it possible to retarget PyPy to different backends (C, JVM, .NET, JavaScript, etc.) and to replace aspects such as the garbage collector.

I'm sure there are many different reasons for doing it. In some cases, it's because you truly believe the language is the best tool, so writing the language interpreter or compiler in the language itself can be seen as a form of dogfooding. If you are really interested in this subject, the following article is a really good read about the development of Squeak. The current version of Squeak is a Smalltalk runtime written in Smalltalk.
http://users.ipa.net/~dwighth/squeak/oopsla_squeak.html

An added benefit is that if you implement good debuggers and IDEs for your target language, they also work for your source language.

This way, you can prove that the target language is serious business, because being able to make it compile something is a sign that it is a good language.
OK, C++ and Java produce compilers as well... so maybe that argument is only half as good as it may seem.

Related

Are there programming languages that directly translate into another?

Is there a programming language that doesn't compile, but rather just translates into another language? I apologize if this is a stupid question to ask, but I was just wondering if this would be a literal shortcut in creating a programming language. Wouldn't it be easier (probably not speedy) but still doable?
Is there a programming language that doesn't compile, but rather just translates into another language?
That makes no sense to me. My definition of compilation is "translating from one language (the source language) to another (the target language)".
Usually the source language is something written by humans and the target language is machine code (or asm), but that's not a requirement. In fact, many compilers are structured as multiple layers, each translating to another intermediate language (until the final layer emits code in the target language).
And it's not directly related to a language, but a particular implementation. We can take C, for example: There are C interpreters, C compilers that target assembler code, C compilers that target machine code (of various platforms), C compilers that target JavaScript, C compilers that target Perl, etc.
As for simplifying the implementation of a language: Yes, there are various kinds of code reuse that apply.
One way is to separate compiler front-ends (translate from source language to an internal abstract representation) and back-ends (translate from the internal abstract representation to machine code for a particular platform). This way you can keep the front-end and only write a new back-end if you want to support another target platform. You can also keep the back-end and only write a new front-end if you want to add support for another source language.
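The front-end/back-end split can be sketched in a few lines of Python. This is only an illustration: the tuple-based intermediate representation, the function names, and the stack machine are all made up for this example, and Python's own parser is borrowed as the front-end just to keep it short.

```python
import ast

# Front-end: source text -> internal representation (nested tuples).
# Borrowing Python's parser is a shortcut for this sketch only.
_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}

def front_end(source):
    def lower(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return ("const", node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return ("binop", _OPS[type(node.op)], lower(node.left), lower(node.right))
        raise SyntaxError("unsupported construct")
    return lower(ast.parse(source, mode="eval").body)

# Back-end 1: IR -> instructions for a hypothetical stack machine.
def stack_back_end(ir):
    if ir[0] == "const":
        return [("PUSH", ir[1])]
    _, op, left, right = ir
    return stack_back_end(left) + stack_back_end(right) + [("OP", op)]

# Back-end 2: IR -> text of a C expression.
def c_back_end(ir):
    if ir[0] == "const":
        return str(ir[1])
    _, op, left, right = ir
    return f"({c_back_end(left)} {op} {c_back_end(right)})"

ir = front_end("1 + 2 * 3")
stack_code = stack_back_end(ir)
c_code = c_back_end(ir)
```

Both back-ends consume the same IR, so supporting a new target means writing one more function like `c_back_end`, and supporting a new source language means writing one more function like `front_end`.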
Another way is to use a full-blown programming language as the intermediate representation. For example, your new compiler might produce C code, which can then be compiled to machine code by any C compiler. The first implementation of C++ did exactly this. C has a number of drawbacks as a compiler target language; there have been efforts to create languages better suited for the task (see e.g. C--, which is used internally by GHC (a Haskell compiler)).
Today the most common translation target is JavaScript. The newer constructs of ECMAScript are translated to an older version of the language to stay compatible with older browsers. This translation is done by Babel.
There are also other languages, like TypeScript and CoffeeScript, that are translated to JavaScript.
f2c translates Fortran 77 to C code. So it is probably an example for what you are looking for.
All general-purpose programming languages are Turing complete. That means any one of them can be translated into another.
When creating a new programming language, many designers have their first prototypes translate the new language into one they are familiar with. This makes it easier to check that the translation is correct and that the new language works as intended, and to share ideas with colleagues, since the output is machine independent.
When their design becomes stable, they make a front end to an existing compiler to do the compiling. Using an existing compiler has several advantages. Optimization is instantly available. The new language can access existing libraries. Compiling can be targeted to all the existing back ends, making the language available on different architectures.
Yes, this is one technique for creating new languages. The first experiments in what became C++ were translated to C for compilation. Taken from http://wiki.c2.com/?CeeAsAnIntermediateLanguage:
Examples of using C in this fashion:
CeeFront; the original implementation of C++, translated to C.
Comeau C++ (http://www.comeaucomputing.com/) translates C++ to C. It is the first C++ compiler to provide full core language support for standard C++.
Several Java-to-C translators out there (some translate Java source; others translate JavaByteCode to C)
Many experimental language compilers use C as a backend, rather than emitting assembly language directly.
SqueakSmalltalk's VirtualMachine is written in a subset of Smalltalk which gets translated to C and fed to the C compiler. The VirtualMachine used by Scheme48 is written in a StaticallyTyped SchemeLanguage dialect called PreScheme which is compiled to C. (The PreScheme compiler itself is written in full Scheme.)
Several SchemeImplementations compile to C (e.g. RScheme, Bigloo and Chicken). These Schemes often use the technique described in CheneyOnTheMta to provide support for ProperTailRecursion.
More recently, compilers have been created that target a subset of JavaScript capable of efficient on-the-fly compilation, for example Emscripten.
And if you count assembly language as well as high level languages, WebAssembly or other bytecode languages fit.

scripting or programming language? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
When is a language considered a scripting language?
What is the difference between programming and scripting languages? I have worked with C/C++ for a while and have now started looking at Python; I was told in this post that it is a good scripting language.
But as I'm learning, I'm finding that everything I have done with C so far can be done with Python! So what is the actual difference between scripting and programming languages?
I actually believe the question is a bit misleading. Of course a scripting language is also a programming language. But there are differences:
Between Compiled and Interpreted Languages.
Traditionally, a language like C is compiled into machine code that can be understood directly by a CPU. A "script language", on the other hand, is usually not compiled into machine code before execution but is executed by an interpreter.
The advantage of an interpreted language is usually that it has a faster development cycle, because no compilation is necessary, and that it is easier to move from one platform to another. E.g. Python scripts can be executed on Windows, Linux, and Mac without changes.
The advantage of a compiled language, on the other hand, is that it usually executes much faster.
I used "usually" and "traditionally" very often because there are now technologies that make it much harder to draw the line. E.g. it is possible to compile Python code directly into native code, and there are also interpreters for C code. "Just In Time" compilers and virtual machines also make it harder to see things in black and white.
More: http://en.wikipedia.org/wiki/Interpreted_language
Duck-Typed and Strong-Typed Languages
Usually script languages are duck-typed, which means that a variable can be assigned a value of any type and there is no checking of types, or only optional checking. In compiled languages like C and C++, on the other hand, every variable is typed, and it can and will only hold values of that type.
The advantage of a duck-typed language is usually that it requires less physical typing and less code (e.g. type names can be left off function declarations) and that it is easier to write reusable functions.
The advantage of a strong-typed language is usually that it "helps" the programmer find bugs before running the application. E.g. the compiler will complain about type errors without the need to run the concrete line where the error occurs. Especially in big projects with many contributors, this can become a major advantage.
More: http://en.wikipedia.org/wiki/Duck_typing
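A small Python sketch of the duck-typing trade-off described above. The class and function names are invented for illustration; the point is only that `describe` accepts any object with a `speak` method, and that a wrong argument is reported when the line runs, not before.

```python
def describe(thing):
    # No declared parameter type: any object with a speak() method works.
    return thing.speak()

class Duck:
    def speak(self):
        return "Quack"

class Robot:
    def speak(self):
        return "Beep"

# The same function is reused for two unrelated classes.
results = [describe(Duck()), describe(Robot())]

# The type error is only discovered when this call is actually executed.
try:
    describe(42)
    error = None
except AttributeError as exc:
    error = exc
```

In a language like C or C++, the equivalent of `describe(42)` would be rejected by the compiler without running the program.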

Difference between Scriptable and programmable

I was confused about the difference between a script and a program. A previously asked question, Difference between a script and a program?, cleared that up, but it leads me to wonder what the difference is between an object being scriptable and being programmable.
Not sure if this is what you're looking for, but scripts are generally interpreted at runtime by another program which does something meaningful, whereas programs are typically executable directly on the CPU because they were compiled to machine code.
Notable exceptions are .NET managed languages and Java, which 'compile' to IL and bytecode and need some kind of runtime (CLR, JVM, DVM) to execute.
As noted by Michael Petrotta in the question you reference, scripts are generally interpreted and slowish, while programs are generally compiled and fasterish. Compiled is often faster than interpreted because interpretation includes translation work at run time (vague and not always the case, but good enough).
Scriptable, to me, means that the object in question supports the interfaces required to be accessible from one or more script languages (for example, JavaScript and/or VBScript).
Programmable, to me, means that the object in question supports the interfaces required to be accessible from a programming language (for example, C++ or Java).
Interpreted and Compiled languages are all programming languages so it is all programming.
Summary: Scriptable and Programmable are two vaguely synonymous terms.

What is a good VM for developing a hobby language?

I'm thinking about writing my own little language.
I found a few options, but feel free to suggest more.
JVM
Parrot
OSA
A lot of languages are using the JVM, but unless you write a Java-ish language, all the power the stdlib gives you is going to feel ugly; it's not very good at dynamic stuff either.
Parrot seems a good VM for developing languages, but it has a little abandoned/unfinished/hobby project smell to it.
OSA is what powers Applescript, not a particularly well known VM, but I use Mac, and it offers good system integration.
CLR+Mac doesn't seem a good combination...
My language is going to be an object orientated functional concurrent dataflow language with strong typing and a mix of Python and Lisp syntax.
Sounds good, eh?
[edit]
I accepted Python for now, but I'd like to hear more about OSA and Parrot.
One approach I've played with is to use the Python ast module to build an abstract syntax tree representing the code to run. The Python compile function can compile an AST into Python bytecode, which exec can then run. This is a bit higher level than directly generating bytecode, but you will have to deal with some quirks of the Python language (for example, the fundamental difference between statements and expressions).
In doing this I've also written a "deparse" module that attempts to convert an AST back to equivalent Python source code, just for debugging. You can find code in the psil repository if you're interested.
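A minimal sketch of the approach described above, assuming Python 3.9 or later (for `ast.unparse`). The tree here is built by hand for the statement `result = 6 * 7`; in a real hobby language your own parser would produce these nodes. The filename `<toy-language>` is just a label chosen for this example.

```python
import ast

# Hand-built AST for the statement:  result = 6 * 7
tree = ast.Module(
    body=[
        ast.Assign(
            targets=[ast.Name(id="result", ctx=ast.Store())],
            value=ast.BinOp(
                left=ast.Constant(value=6),
                op=ast.Mult(),
                right=ast.Constant(value=7),
            ),
        )
    ],
    type_ignores=[],
)

# Nodes built by hand have no line/column info; compile() requires it.
ast.fix_missing_locations(tree)

# Compile the AST to Python bytecode and run it.
code = compile(tree, filename="<toy-language>", mode="exec")
namespace = {}
exec(code, namespace)

# Turn the AST back into Python source, useful for debugging.
source = ast.unparse(tree)
```

Note that `Assign` is a statement node and `BinOp` is an expression node; this is the statement/expression split mentioned above, and a language where everything is an expression has to work around it.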
Have a look at LLVM. It's not a pure VM as such, more a framework with its own IR that allows you to build high-level VMs. It has nice features like static code analysis and JIT support.
Lua has a small, well-written and fast VM.
Python VM - you can really attach a new language to it if you want. Or write (use?) something like tinypy which is a small and simple implementation of the Python VM.
Both options above have access to useful standard libraries that will save you work, and are coded in relatively clean and modular C, so they shouldn't be hard to connect to.
That said, I disagree that Parrot is abandoned/hobby. It's quite mature, and has some very strong developers working on it. Furthermore, it's specifically a VM designed to be targeted by multiple dynamic languages. Thus, it was designed with flexibility in mind.
Have you considered PyPy? From what I've read, in addition to being a Python JIT compiler, it also has the capability to handle other languages. For example, there is a tutorial which explains how to create a Brainfuck JIT compiler using PyPy.

Can statically compiled languages replace scripting language?

Assuming you can get a dynamic interpreter, can statically compiled languages replace scripting languages? I never quite understood why anyone would use a scripting language. I am talking about a PC, not a limited system which needs a simplistic interpreter. I have seen some Python install scripts and have seen similar Python and C# solutions to the same problem. So why use a scripting language?
NOTE: There are things that bother me about C#; I am not asking why not use C# instead. I am asking why use a scripting language? I find statically compiled languages much easier to debug and often easier to code in.
There is very little distinction these days between compiling and interpreting. Look at how an interpreted language is executed - the first step is to convert the script into some kind of internal executable form, like byte code that can be executed by a simpler instruction set. This is essentially compilation to a virtual machine format. This is exactly what modern compiled languages do. And when compiled languages are deployed in server-side web apps, they even recompile from the source on the fly. So there's practically no difference in terms of the compile/execute technique.
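CPython is one concrete case of this: a script is first compiled to bytecode for a virtual machine and only then executed. The sketch below makes those two steps explicit using the built-in `compile` and `exec` and the standard `dis` module; the script text and the `<script>` filename are placeholders for this example.

```python
import dis

source = "x = 1 + 2"

# Step 1: compile the script text into a code object holding bytecode.
code = compile(source, "<script>", "exec")

# The instruction names depend on the Python version, so they are
# collected here for inspection but not relied upon.
opnames = [ins.opname for ins in dis.get_instructions(code)]

# Step 2: the virtual machine executes the compiled bytecode.
namespace = {}
exec(code, namespace)
```

Calling `dis.dis(code)` prints the instruction listing, which looks much like the output of a compiler targeting a simple instruction set.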
The only difference is in the details of the instruction set, specifically in the type system. Scripting languages are usually (but not always) dynamically typed. But many large applications are also written in dynamically typed languages. So again, there is no clear distinction here.
Personally I think static typing, far from being "extra unnecessary effort" (as it is often described), is actually a huge productivity booster, making it much easier to write short snippets correctly on the first attempt, thanks to IntelliSense/autocompletion. To underline this, look at how Microsoft has improved the jQuery library simply by adding static type information to it (in specially formatted comments) so we can have IntelliSense in the IDE.
And meanwhile, static languages (including C# and Java) are bringing in more dynamic typing features.
So I see these categories as eventually merging and the distinction being meaningless.
Wikipedia says that a scripting language is a language that controls other software. You can do that with C#, but true scripting languages like PowerShell are designed specifically for this.
I tend to think of a scripting language in more "interactive" terms than C#. With a scripting language, you can write a line or two of code, execute it and see the results immediately. That's not so easy in C#, where you have to put your code in a Console Application, or fire it off from a unit test, or type it into the Immediate window where you don't have IntelliSense.
That rapid cycle of write, execute allows rapid prototyping of complete "scripts" in a scripting language, because it gives you immediate feedback on each line of code.
This kind of question often starts flame wars as people are passionate about their respective camps.
In the olden days of computing, Unix command line tools and console shells provided a rich scripting environment where all sorts of processing could be done. You didn't need to be an expert programmer in any specific language and could string (pun intended) various programs (that other people wrote) together using pipes to massage your data, which was mostly text rather than binary. It is quick and easy to make changes to your batch command file. You don't have a source file that has to be edited, compiled, and linked with external static or shared libraries (DLLs in the case of Windows).
One thing scripting does not normally have is speed. You don't write device drivers and live internet trading AI systems in a scripting language. But if you run a script once a day on some data received via e-mail or FTP, you don't normally care how long it takes, as it can run in the background anyway.
Fast-forward to the present and the waters become muddy. Some scripting environments offer a kind of speed-up facility where they read your script and more or less compile it and link in modules, the same way a normal C++ or VB program might, for speed purposes. But this is very iffy and can't be relied on.
So how do you choose which route to go? Start by doing tasks using scripting. If it runs too slowly, or you are having to do stuff every 5 minutes, then parts of your script might benefit from a section written in a traditional language, or the whole thing could be written in one.
Like anything, dabble and learn.
Each is used for different purposes. Programs written in scripting languages are often not self-contained; they often function as "glue code" or (as Robert Harvey mentions) to automate a task. You often find scripting language interpreters embedded within an application (cf Python in Blender; Guile, Perl and Python in GIMP; JS in umpteen different browsers; Lua in countless games). Compiled languages, on the other hand, are used to produce self-contained applications. Scripts are mostly cross-platform; compiled applications usually aren't.
Note that a scripting language doesn't necessarily use an interactive interpreter (e.g. Perl), and an interpreted language isn't necessarily used for scripts (e.g. games made using PyGame). Note also that there's nothing about the languages themselves that makes them interpreted or compiled. You could have a C# interpreter or a Ruby compiler. There have been a number of Lisp systems that offered both interpreters and compilers.
I would call my shell (bash) a scripting language, and I don't see a compiled replacement coming.
I like to use Scala, which is a statically typed language that comes with an interpreter-like REPL interface and, thanks to type inference, looks pretty much like a scripting language; have a look here: http://www.simplyscala.com/ .
But it isn't meant to be the glue between other programs as the shell is, so for small jobs, which are easily verified by hand and eye, which are just a few lines of code, I prefer to use the shell. And jumping from directory to directory is comfortable in a shell, where the prompt shows where I am.
Before we begin, I don't think that I've ever met a static language user who "got" scripting language without trying them, including myself. It is a different experience.
So no. Basically, you can add features to static languages which make them superficially seem like scripting languages (like simple type inference), but it's not the same:
Many scripting language users hate static languages. They feel constrained. Scripting languages are typically very good at not getting in the user's way, which is sacrificed in static languages for speed/correctness.
Duck typing will not appear in static languages.
Scripting language users don't like type annotations. It's not really possible to provide a type-inference system for scripting languages, and the simple type inference appearing in some languages now only works for static types.
Techniques like monkey patching (which to my mind is a very bad idea) are pervasive in Ruby and allow for very powerful techniques, which won't become available soon in static languages either.
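For readers who haven't seen monkey patching, here is a minimal sketch. The answer talks about Ruby; this example uses Python instead, where the same idea applies, and the class and function names are invented for illustration.

```python
class Greeter:
    def greet(self):
        return "hello"

g = Greeter()
before = g.greet()

# Replace the method on the class at run time. Existing instances,
# including g, pick up the new behaviour immediately.
def shout(self):
    return "HELLO"

Greeter.greet = shout
after = g.greet()
```

Nothing at the call site `g.greet()` reveals that the method was swapped out, which is both why the technique is powerful and why it is hard to reconcile with static checking.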
Which isn't to say that a yet-to-be-designed language can't handle scripting language features in a relatively static way, but it would be difficult for it to become popular relative to the entrenched Python/PHP/Perl/Ruby/Javascript set. Factor is the closest thing, AFAICT.
What will happen is that scripting language implementations will get faster by using JITs.
Can a screwdriver replace a hammer? No, because you just don't use them for the same purpose. And if both exist, and if so many people use either one or the other, there must be a reason...
Same answer for:
class inheritance vs prototype;
imperative vs oo;
static vs dynamic typing;
strongly vs weakly typed;
manual memory management vs GC;
C# vs Java;
blue vs red;
man vs woman;
batman vs superman (but I do think superman would win... wait, there is kryptonite... oh man, I don't know...)
etc...
Because it is shorter to write, since it is a higher-level language, and it doesn't need the compilation cycle, which also makes things shorter.
"I am asking why use a scripting language? I find static compiled languages much easier to debug and often easier to code in."
Because I find loosely-typed dynamic languages without an explicit compile-run cycle much easier to debug and generally easier to code in.