What converts vbscript to machine code? - interpreted-language

Compiled languages like C# and Java have just-in-time compilers that convert them (from bytecode) into machine code (0s and 1s). How does an interpreted language like VBScript get converted into machine code? Is it done by the operating system?

They don't necessarily get converted to machine code (and often don't).
The interpreter for that program runs the appropriate actions according to what the program requires.
Some interpreters might generate machine code (using JIT compilers), others might stick to plain interpretation of the script.
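To make "plain interpretation" concrete, here is a minimal sketch of an interpreter for a made-up three-command language. The commands (set, add, print) and the function name run are invented for illustration and have nothing to do with how the real VBScript engine is built. The point is only that the interpreter performs each action itself and never emits machine code for the script.

```python
# Toy interpreter for a hypothetical three-command language (illustration only).
# It never produces machine code: it reads each statement and performs the action itself.

def run(source):
    variables = {}
    output = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        command, args = parts[0], parts[1:]
        if command == "set":      # set x 5
            variables[args[0]] = int(args[1])
        elif command == "add":    # add x 3
            variables[args[0]] += int(args[1])
        elif command == "print":  # print x
            output.append(variables[args[0]])
        else:
            raise ValueError(f"unknown command: {command}")
    return output


program = """
set x 5
add x 3
print x
"""
print(run(program))  # [8]
```

The only machine code involved is that of the interpreter itself (here, the Python runtime), which was compiled long before the script existed.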

I know this is old, but given that I can't comment (rep), I want to add a clarifying answer:
An interpreter is used to interpret the script (be it VBScript, JavaScript, Python, or any other script) into individual instructions. These instructions can be in the form of machine code or an intermediate representation (that the OS or another program can use). Some interpreters are made for something closer to assembly language, and the source code is more or less executed directly.
Most modern scripting languages (e.g., Python, Perl, Ruby) are interpreted to an intermediate representation, or to an intermediate representation and then into compiled (a.k.a. machine, a.k.a. object) code. The important distinction (vs. compiled languages) is that an interpreter isn't taking an entire body of code and translating its meaning to machine code ahead of time; it takes one statement at a time and interprets its meaning as a standalone unit.
Think of this as the difference between translating an entire essay from English to Russian (compiled code) vs taking each sentence in the essay and translating it directly (interpreted code). You may get a similar effect, but the result won't be identical. More importantly, translating an entire essay as a total body of work takes a lot more effort than doing one sentence at a time as a standalone unit, but the whole translation will be much easier for Russian speakers to read than the rather clunky sentence-by-sentence version. Hence the tradeoff between compiling code vs interpreting code.
Source: https://en.wikipedia.org/wiki/Interpreter_(computing), experience

This is the answer I was looking for. Just as there is a JavaScript engine, there used to be a VBScript engine that converted human-readable code to machine code. This VBScript engine is analogous to the JIT compiler in the CLR and JVM, except that it converts directly from human-readable code to machine code, whereas C# has an intermediate bytecode.

Referring to the VBScript Wikipedia article:
When VBScript is executed in a browser, vbscript.dll is used to interpret it.
When a VBScript file is executed from the command line or a batch file, cscript.exe is used to interpret it.
When VBScript is used by Windows itself for purposes such as showing error message boxes or the yellow notification messages in the right corner of the taskbar, it is interpreted using wscript.exe, the windowed (GUI) Windows Script Host, which is an ordinary executable rather than a Windows service.

Related

What is an Interpreter to be exact?

According to Wikipedia, an interpreter uses at least one of the following strategies:
Parse the source code and perform its behavior directly;
Translate source code into some efficient intermediate representation or object code and immediately execute that;
Explicitly execute stored precompiled bytecode made by a compiler and matched with the interpreter Virtual Machine.
So is a program that reads code and executes it directly an interpreter? Does an interpreter need to convert code into binary? Does a compiler need to convert code into binary?
So is a program that reads code and executes it directly an interpreter?
Yes. By definition, an interpreter reads code, then performs what the code tells it to do. Unlike an interpreter, a compiler reads the code then makes an executable file that can be run later.
Does an interpreter need to convert code into binary?
Not always. An interpreter may just read the input code and then perform what the code tells it to do, but other interpreters use JIT compilation. Interpreters that use JIT compilation turn the input code into machine code, but do not make an executable file. Instead, they run the machine code in memory and throw it away after it has been run. JIT compilation can be faster than traditional interpretation.
Does a compiler need to convert code into binary?
Yes. In order to create an executable file, a compiler must first read the input code and then turn it into something the computer can understand (machine code). This first step is just like JIT compilation. Unlike a JIT compiler, a compiler does not run the machine code it produces and does not throw it away. Instead, it writes the code to a file (called an executable file, or just an executable) in a specific format for the OS it is targeting. This OS-specific format is one reason why Windows programs cannot run on Linux, and vice versa.
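The second strategy from the Wikipedia list (translate the source into an intermediate representation and immediately execute that) can be sketched in a few lines. This is only an illustration under stated assumptions: the instruction names PUSH, ADD, SUB and MUL and the functions compile_expr and execute are hypothetical, and Python's standard ast module is borrowed just to do the parsing.

```python
import ast

# Hypothetical toy instruction set: ("PUSH", n), ("ADD",), ("SUB",), ("MUL",)
OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def compile_expr(source):
    """Translate an arithmetic expression into a list of stack instructions."""
    code = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in OPS:
            emit(node.left)    # operands first...
            emit(node.right)
            code.append((OPS[type(node.op)],))  # ...then the operator
        else:
            raise ValueError("unsupported expression")

    emit(ast.parse(source, mode="eval").body)
    return code


def execute(code):
    """Run the instruction list on a simple stack machine."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            right = stack.pop()
            left = stack.pop()
            if instr[0] == "ADD":
                stack.append(left + right)
            elif instr[0] == "SUB":
                stack.append(left - right)
            elif instr[0] == "MUL":
                stack.append(left * right)
    return stack.pop()


bytecode = compile_expr("2 + 3 * 4")
print(bytecode)
print(execute(bytecode))  # 14
```

Nothing here is "binary" in the machine-code sense: the intermediate form is just a list of tuples that another program (the execute loop) understands. Saving that list to a file and running it later with execute would correspond to the third strategy.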

What is the need of JVM when you can pass the source code?

I am new to Java.
I wanted to know this:
What is the need to create the .class file in Java?
Can't we just pass the source code to every machine so that each machine can compile it according to its OS and hardware?
I believe it's mostly for efficiency reasons.
From wikipedia http://en.wikipedia.org/wiki/Bytecode:
Bytecode, also known as p-code (portable code), is a form of
instruction set designed for efficient execution by a software
interpreter. Unlike human-readable source code, bytecodes are compact
numeric codes, constants, and references (normally numeric addresses)
which encode the result of parsing and semantic analysis of things
like type, scope, and nesting depths of program objects. They
therefore allow much better performance than direct interpretation of
source code.
(my emphasis)
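The same idea can be observed in Python, which also compiles source to bytecode before running it. This is an analogy rather than a description of the JVM: the variable names and the evaluate helper below are made up for the example, while compile, exec and the dis module are part of the standard library. The sketch parses the source once and then runs the resulting bytecode several times without touching the text again.

```python
import dis

source = "total = price * quantity + shipping"

# Parse and analyse the source once; the result is a code object holding bytecode.
code = compile(source, "<example>", "exec")

# The bytecode itself is a compact sequence of bytes, not text.
print(type(code.co_code), len(code.co_code))

# dis prints a human-readable listing of the same bytecode.
dis.dis(code)


# The compiled code object can be executed repeatedly without re-parsing.
def evaluate(price, quantity, shipping):
    namespace = {"price": price, "quantity": quantity, "shipping": shipping}
    exec(code, namespace)
    return namespace["total"]


print(evaluate(10, 3, 5))  # 35
print(evaluate(2, 4, 1))   # 9
```

The exact instructions printed by dis vary between Python versions, which is itself a reminder that bytecode is tied to the virtual machine that executes it.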
And, as others have mentioned, it possibly offers weak obfuscation of the source code.
The main reason for the compilation is that the virtual machines that host and run Java classes only understand bytecode.
Compiling a class into the language the virtual machine understands every time it runs would be expensive, which is why the source code is compiled into bytecode once, ahead of time.
We can also use compilers that compile source code directly into machine code, but that's a different story which I don't know much about.

What language is a dotnet executable written in?

I thought it would be Common Intermediate Language, but in notepad it does not look like that at all. Does it just look uglier in reality than in tutorials? Or is it some bytecode form that is further compiled from CIL?
It is CIL. CIL is the name of the binary format, not of the "assembler" you're thinking of.
Can you possibly imagine that .NET assemblies would be text files?
A .NET executable is a binary file that has a PE header (same as a native executable, but with slightly different values). The PE header tells the OS to load the CLR, which in turn loads the assembly.
The content beyond the header is a binary representation of the CIL code, plus some metadata and other stuff. The text you see in tutorials is the text representation of CIL, in much the same way that the assembly language code you see in a tutorial about assembly language programming is just the text representation of the binary machine code.
See http://www.yetanotherchris.me/home/2010/7/12/inside-net-assemblies-part-1.html (among many others) for more information.
A .Net executable is usually not written, it is compiled from another language such as C#, F# or VB.Net.
The contents of a .Net executable can be viewed with the ILDASM tool.
The contents are first a manifest which is used for reflection, signatures or other meta-code purposes.
Secondly there are the MSIL instructions themselves. These are in a kind of bytecode format, but ILDASM will show you what the instructions are.
And there are sometimes resources such as imagery, sounds or other content packed into the executable.
The executable is just-in-time compiled to native code either during installation (I think this is uncommon), or as a precursor to execution. The resulting native code can be stored for reuse. (This is what I was told during PDC 2001, might be "out of date".)

What form is DLL & what makes it processor dependent

I know a DLL contains one or more exported functions that are compiled, linked, and stored separately.
My question is not about how to create it, but about the form in which it is stored. Is it in the form of 0's & 1's, or in assembly commands such as ADD, MUL, DIV, MOV, CALL, RETURN, etc.?
Also, what makes it processor dependent (like x86, x87, IBM 700 instruction sets)?
Can someone please explain it briefly?
First of all, everything in a computer is in the form of "0's & 1's". The fact that the computer can display some of these as text, pictures, sounds, 3D models, etc. is just a matter of how you interpret them. But down there, at the metal, it's all just "0's & 1's" (also known as bits). Note though that they are always grouped together in groups of 8, and these are called "bytes". It's really for the sake of efficiency, because operating with every bit individually would be too tedious. Actually, today's computers don't even operate on single bytes anymore (or rather, they do it very rarely). Mostly you operate with 4 or 8 bytes at a time, depending on whether you have a 32-bit or 64-bit CPU (that's in layman's terms; it's actually a bit more complicated than that).
As for a .DLL file - like an .EXE file, it contains bytes that describe instructions that a CPU can execute. The CPU takes these bytes directly from the .DLL/.EXE and executes them without any further modifications. That's why these files are CPU-specific. In different CPU architectures the same combination of bytes means different things, so a .DLL/.EXE will run correctly only on the CPU for which it was designed. On other CPUs these bytes will mean some other instructions, and when run, the program will most likely do some utter nonsense and crash immediately.
The assembly commands you mentioned also deserve an explanation. "Assembler" is not a language that a CPU can understand. It's a language a human can understand. It was created because writing directly in machine code (the bytes that the CPU actually understands) is very difficult. What you get is utter gibberish on the screen (try opening some .EXE file in Notepad!) but every bit has to be precisely set for it to work.
So assembly language is basically the same thing, except these instructions are written in text that humans can read. For every machine instruction that a CPU can understand, there is an instruction with a human-friendly name. An assembler simply reads these instructions and replaces them with the bytes that represent the actual instructions for the CPU to execute. It's a 1:1 operation. Every command in assembly language matches a single machine instruction (again, in layman's terms).
So you see, there isn't even a single assembly language. Every CPU architecture has its own assembly language, because they each have different instructions.
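A toy model of that 1:1 mapping, and of why the resulting bytes are CPU-specific, might look like the sketch below. Both "CPUs" and their opcode values are invented for illustration; real instruction sets such as x86 encode instructions in far more elaborate ways.

```python
# Two made-up CPU architectures (hypothetical opcodes, not real instruction sets).
# Each one assigns its own byte value to the same mnemonics.
CPU_A = {"MOV": 0x01, "ADD": 0x02, "RET": 0x03}
CPU_B = {"RET": 0x01, "MOV": 0x02, "ADD": 0x03}


def assemble(mnemonics, opcode_table):
    """Replace each mnemonic with its opcode byte (a 1:1 mapping)."""
    return bytes(opcode_table[m] for m in mnemonics)


def disassemble(machine_code, opcode_table):
    """Read the bytes back as mnemonics according to one CPU's table."""
    names = {byte: name for name, byte in opcode_table.items()}
    return [names[b] for b in machine_code]


program = ["MOV", "ADD", "RET"]
code_for_a = assemble(program, CPU_A)

print(code_for_a.hex())                # 010203
print(disassemble(code_for_a, CPU_A))  # ['MOV', 'ADD', 'RET']
print(disassemble(code_for_a, CPU_B))  # ['RET', 'MOV', 'ADD']
```

The same three bytes describe the intended program on CPU_A and a different, meaningless one on CPU_B, which is the situation a native .DLL/.EXE is in when it is loaded on the wrong architecture.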
Note though that all this applies to native .DLL/.EXE files. .NET files are different - they don't contain machine code, but rather instructions for an abstract, nonexistent CPU. It's like Java bytecodes. When a .NET .DLL/.EXE is run, the .NET runtime translates it from the abstract instructions to the instructions that the specific CPU can understand. They use a lot of tricks to make this very fast, so these files run almost as fast as simple .DLL/.EXE files.
Does this clear things up? :)
Native DLLs (not .NET assemblies) usually contain machine code that can only be run on a certain platform. The machine code is a sequence of bytes that the processor treats as instructions (ADD, MOV, etc.).
In Windows, DLLs are stored in the PE format, which is basically a collection of sections that hold the information about how to map the file into memory. Some sections contain the program's code (which is of course processor dependent), others contain the program's data, others the exported and imported functions, and so on.
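The PE header also records which processor the code sections were built for. As a rough sketch, the Machine field can be read as shown below. The helper names pe_machine and fake_image are hypothetical, the table lists only three of the defined machine values, and the demonstration uses a synthetic header built in memory rather than a real DLL.

```python
import struct

# Machine values from the PE/COFF header (a small subset).
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64", 0xAA64: "ARM64"}


def pe_machine(data):
    """Return the target CPU named in a PE image's COFF header."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 2-byte Machine field follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, hex(machine))


def fake_image(machine):
    """Build a minimal synthetic header for demonstration (not a loadable DLL)."""
    dos_header = b"MZ" + bytes(0x3A) + struct.pack("<I", 0x40)
    return dos_header + b"PE\0\0" + struct.pack("<H", machine)


print(pe_machine(fake_image(0x8664)))  # x64
print(pe_machine(fake_image(0x014C)))  # x86
```

To try it on a real file, read the file in binary mode and pass the bytes to pe_machine.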
Managed code is compiled to an intermediate language that is JITed by the runtime as it is executed. Therefore, your DLL won't contain any processor-dependent code, and you'll be able to execute your program on any platform with the relevant runtime.
It depends on your DLL. Generally, a DLL contains executable code, just as an EXE file does. Those code DLLs are processor dependent, since the code can only be executed on a specific platform. The code is stored using the same "format" as an EXE file (binary machine code).
However, a DLL can sometimes contain only data: such DLLs are called "resource DLLs" and are not processor dependent at all. They act as a container for data files used by applications.
Note that many DLLs are hybrids: they contain both code and resources. For example, most DLLs that comprise the user part of the Windows operating system are hybrids: you can open them using Visual Studio or a resource explorer to see the resources (the data segments) they contain, or open them with Dependency Walker or dumpbin to see the functions (the code segments) they contain.
(Of course, this answer is really Windows specific; I don't know about .so files, which are the Linux equivalent of a DLL.)
Both a DLL and an EXE contain executable code.
A DLL, however, doesn't have the necessary parts to be directly executable. It must be called from another piece of executable code. One DLL can call another, but all must ultimately be called from an EXE.
So the rules about what's compatible with what processor that apply to EXEs also apply to DLLs.

Are all scripts written in scripting languages?

I'm confused by the concept of scripts.
Can I say that makefile is a kind of script?
Are there scripts written in C or Java?
I'd refer to Wikipedia for a detailed explanation.
"Scripts" usually refer to a piece of code or set of instructions that run in the context of another program. They usually aren't a standalone executable piece of software.
A makefile is a script that is run by "make"; project files run by MSBuild and similar tools play the same role.
C needs to be compiled into an executable or a library, so programs written in (standard) C would typically not be considered scripts. (There are exceptions, but this isn't the normal way of working with C.)
Java (and especially .NET) is a bit different. A typical Java program is compiled and run as an executable, but this is a grey area. It is possible to do runtime compilation of a "script" written in Java and execute it.
In a very general sense the term "Scripts" relates to code that is deployed and expected to run from the lexical representation. As soon as you compile the code and distribute the resulting output instead of the code it ceases to be a "Script".
Minification and obfuscation of a script are not considered compilation, and the result is still considered a script.
It depends on your definition of script. For me, a script could be any small program you write for a small purpose. They are usually written in interpreted languages. However, there's nothing stopping you from writing a small program in a compiled language.
For me a script has to consist of a single file. And that file must be able to perform the task for which the script was written with no intermediate steps.
So these would be OK:
bash backup_my_home_dir.sh
perl munge_some_text.pl
python download_url.py
But this wouldn't qualify, even if the file is small:
javac HandyUtility.java
java HandyUtility
Yes it's possible to do scripting in Java. I've seen it many times :)
(this was sarcasm for bad spaghetti code)
The term 'scripting' can cover a fairly broad spectrum of activities. Examples include programming in imperative interpreted languages such as VBScript or Python, writing shell scripts for shells such as csh or bash, or expressing a task in declarative languages such as XSL, SQL or Erlang.
Some scripting languages fall into a category referred to as Domain Specific Languages (DSLs). Good examples of DSLs are makefiles, many other types of configuration files, SQL, XSL and so on.
What you're asking is fairly subjective; one man's script is another man's application. If your interpretation of scripting means that using scripting languages should not force a user to follow the traditional compile -> link -> run cycle, then you could form the opinion that you can't write 'scripts' in C or Java.
A script is basically a non-compilable text file in almost any language, or shell, with an interpreter that is used to automate some process, or list of commands, that you perform repeatedly. Scripts are often used for backing up files, compiling routines, svn commits, shell initialization, etc., ad infinitum. There are a million and one things you can do with a script that an executable (complete with installation, etc.) would simply be overkill for.
I write scripts in F#. A recent one is a small data loader to take in some set of data, do a bit of processing to it, and dump it in a DB. ~40 lines. No separate compilation step needed; I can just make F# Interactive run it directly.
The benefit is that I get a fully powered language with a great IDE and all the safety that static checking provides, while type inference keeps it from getting as verbose as, say, Java or C#.
So, that's one language that offers a reasonably decent type system, compilation and checking, isn't interpreted, but works fine for scripting.