Compiler Optimization of Deterministic Functions

I was reading about deterministic execution, which means that the same input always produces the same output. I was wondering whether any compiler writer has thought about optimizing deterministic functions at runtime.
For example, take the factorial function. If it is detected at runtime that the function is repeatedly being called with the same input value, the compiler could cache the output value and, instead of executing the factorial function, use that cached value directly. Seems like a nice research topic. Are there any papers or work on this topic?

This is usually called memoization, and is a fairly common optimization in functional languages.
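To make the idea concrete, here is a minimal sketch of memoization applied by hand (by the programmer, not by a compiler), using Python's standard `functools.lru_cache`. The `call_count` counter is only there for illustration, to show that the function body is skipped on a repeated call.

```python
from functools import lru_cache

call_count = 0  # illustration only: counts how often the body really runs


@lru_cache(maxsize=None)
def factorial(n: int) -> int:
    """The result depends only on n, so caching it is safe."""
    global call_count
    call_count += 1
    return 1 if n <= 1 else n * factorial(n - 1)


first = factorial(10)
runs_after_first = call_count
second = factorial(10)  # answered from the cache, the body does not run again
```

Note that the cache is keyed on the argument, which is exactly where user-defined types and custom equality start to make the automatic version hard.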

It can be done but as far as I know, it's not common for compilers to do it. The trouble is that users can define as many types as they like and equality in any way that they like, and with heap allocation and stuff it's very, very difficult to prove such a thing. Basically, it could be done, but only if your function involves straight numerical computation, which is rare, and thus it's usually not of high value.

You're talking about referential transparency, and it's a big part of functional programming. The linked article talks about profile-guided optimization.
It does not answer your question per se, but in general it talks about using runtime behavior to optimize the generated assembly.


static and dynamic code analysis

I found several questions about this topic, all of them with lots of references, but I still don't have a clear picture, because most of the references discuss concrete tools rather than the general concepts behind the analysis. So I have some questions:
About static analysis:
1. I would like a reference, or a summary, of which techniques are successful and most relevant nowadays.
2. What can they really do to discover bugs? Can this be summarized, or does it depend on the tool?
About symbolic execution:
1. Where does symbolic execution fit? I guess it depends on the approach, but
I would like to know whether it is dynamic analysis or a mix of static and dynamic analysis, if that can be determined.
I find it hard to tell the two techniques apart in the tools, even though I think I know the theoretical difference.
I'm currently working with C.
Thanks in advance.
I'm trying to give a short answer:
Static analysis looks at the syntactic structure of code and draws conclusions about the program's behavior. These conclusions need not always be correct.
A typical example of static analysis is data flow analysis, where you compute sets like used, read, write for every statement. This will help to find e.g. uninitialized values.
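As a rough illustration of that kind of data flow pass, here is a minimal sketch that only handles straight-line code. The statement encoding (a target variable plus the variables it reads) and the function name are made up for this example; real tools work on a control-flow graph and have to merge facts where branches join.

```python
def find_uninitialized_uses(statements):
    """Forward pass over straight-line code.

    Each statement is (target, used_vars): `target` is the variable written
    (or None), `used_vars` the variables read. Returns (index, var) pairs
    where a variable is read before any earlier statement has written it.
    """
    defined = set()
    problems = []
    for index, (target, used_vars) in enumerate(statements):
        for var in used_vars:
            if var not in defined:
                problems.append((index, var))
        if target is not None:
            defined.add(target)
    return problems


# Hypothetical program:  x = 1;  y = x + z;  print(y)
program = [
    ("x", []),
    ("y", ["x", "z"]),
    (None, ["y"]),
]
problems = find_uninitialized_uses(program)
```

Here the pass flags `z` in the second statement, without ever running the program.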
You can also analyze the code for code patterns. This way, these tools can be used to check whether you are complying with a specific coding standard. A prominent coding standard example is MISRA. This coding standard is used for safety-critical systems and avoids problematic constructs in C. This way you can already say a lot about the robustness of your applications against memory leaks, dangling pointers, etc.
Dynamic analysis is not looking at the syntax only, but takes state information into account. In symbolic execution, you are adding assumptions about the possible values of all variables to the statements.
The most expensive and powerful method of dynamic analysis is model checking, where you really look at all possible execution states of the system. You can think of a model-checked system as a system that is tested with 100% coverage - but there are of course a lot of practical problems that prevent real systems from being checked that way.
These methods are very powerful, and you can gain a lot from the static code analysis tools especially when combined with a good coding standard.
A feature my software team found really impressive is e.g. that it will tell you in C++ when a class with virtual methods does not have a virtual destructor. Easy to check in fact, but really helpful.
The commercial tools are very expensive, but worth the money once you have learned how to use them. A typical problem in the beginning is that you will get a lot of false alarms and won't know where to look for the real problem.
Note that nowadays g++ has some of this stuff already built in, and that you can use a lint-style tool such as PC-lint (itself a commercial product; free alternatives such as Splint or Cppcheck exist).
Sorry - this is already getting quite long...hope it's interesting.
The term "static analysis" means that the analysis does not actually run the code. On the other hand, "dynamic analysis" runs the code and also requires some kind of real test inputs. That is the definition. Nothing more.
Static analysis employs various formal methods such as abstract interpretation, model checking, and symbolic execution. In general, abstract interpretation or model checking is suitable for software verification. Symbolic execution is more appropriate for the purpose of bug finding.
Symbolic execution is categorized as static analysis. However, there is a hybrid method called concolic execution, which uses both symbolic execution and dynamic testing.
Added for Zane's comment:
Maybe my explanation was a little confusing.
The difference between software verification and bug finding is whether the analysis is sound or not. For example, when we say a buffer overrun analyzer is sound, it means that the analyzer must report all possible buffer overruns. If the analyzer reports nothing, it proves the absence of buffer overruns in the target program. Because model checking is a method that guarantees soundness, it is mostly used for software verification.
On the other hand, symbolic execution, which is actively used by most of today's commercial static analyzers, does not guarantee soundness, since sound analysis inherently issues a great many false positives. For the purpose of bug finding, it is more important to reduce false positives, even if some true positives are also lost.
In summary,
soundness: there are no false negatives
completeness: there are no false positives
software verification: soundness is more important than completeness
bug finding: completeness is more important than soundness
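To spell those four lines out, here is a small sketch that compares what an analyzer reports against the bugs that really exist. The report contents and the `classify` helper are invented for the example; in practice the set of actual bugs is of course not known up front.

```python
def classify(reported, actual):
    """Compare an analyzer's reports against the bugs that really exist."""
    false_positives = reported - actual   # reported, but not a real bug
    false_negatives = actual - reported   # real bug, but not reported
    return {
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "sound": not false_negatives,     # nothing was missed
        "complete": not false_positives,  # nothing spurious was reported
    }


actual_bugs = {"line 10", "line 42"}

# A verifier-style tool over-reports so that it never misses anything.
verifier = classify({"line 10", "line 42", "line 77"}, actual_bugs)

# A bug-finder-style tool stays quiet unless it is sure.
bug_finder = classify({"line 10"}, actual_bugs)
```

The verifier-style result is sound but not complete, and the bug-finder-style result is complete but not sound, which is the trade-off described above.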

Good introductory text about GHC implementation?

When programming in Haskell (and especially when solving Project Euler problems, where suboptimal solutions tend to stress the CPU or memory) I'm often puzzled about why the program behaves the way it does. I look at profiles, try to introduce some strictness, choose another data structure, ... but mostly it's groping in the dark, because I lack a good intuition.
Also, while I know how Lisp, Prolog and imperative languages are typically implemented, I have no idea about implementing a lazy language. I'm a bit curious too.
Hence I would like to know more about the whole chain from program source to execution model.
Things I wonder about:
what typical optimizations are applied?
what is the execution order when there are multiple candidates for evaluation (while I know it's driven from the needed outputs, there may still be big performance differences between first evaluating A and then B, or evaluating B first to detect that you don't need A at all)
how are thunks represented?
how are the stack and the heap used?
what is a CAF? (profiling indicates sometimes that the hotspot is there, but I have no clue)
The majority of the technical information about the architecture and approach of the GHC system is in their wiki. I'll link to the key pieces, and some related papers that people may not know about.
What typical optimizations are applied?
The key paper on this is: A transformation-based optimiser for Haskell, SL Peyton Jones and A Santos, 1998, which describes the model GHC uses of applying type-preserving transformations (refactorings) of a core Haskell-like language to improve time and memory use. This process is called "simplification".
Typical things that are done in a Haskell compiler include:
Beta reduction;
Dead code elimination;
Transformation of conditions: case-of-case, case elimination;
Constructed product return;
Full laziness transformation;
Eta expansion;
Lambda lifting;
Strictness analysis.
And sometimes:
The static argument transformation;
Build/foldr or stream fusion;
Common sub-expression elimination;
Constructor specialization.
The above-mentioned paper is the key place to start to understand most of these optimizations. Some of the simpler ones are given in the earlier book, Implementing Functional Languages, Simon Peyton Jones and David Lester.
What is the execution order when there are multiple candidates for evaluation
Assuming you're on a uni-processor, then the answer is "some order that the compiler picks statically based on heuristics, and the demand pattern of the program". If you're using speculative evaluation via sparks, then "some non-deterministic, out-of-order execution pattern".
In general, to see what the execution order is, look at the core, with, e.g. the ghc-core tool. An introduction to Core is in the RWH chapter on optimizations.
How are thunks represented?
Thunks are represented as heap-allocated data with a code pointer.
See the layout of heap objects.
Specifically, see how thunks are represented.
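As a loose mental model only (this is not GHC's actual heap layout), a thunk can be pictured as an object holding a code pointer plus the free variables that code needs, which is overwritten with its result the first time it is forced. The class and field names below are made up for the sketch.

```python
class Thunk:
    """A suspended computation: a code pointer plus captured free variables."""

    def __init__(self, code, *free_vars):
        self.code = code            # what to run when the value is demanded
        self.free_vars = free_vars  # the payload that code needs
        self.evaluated = False
        self.value = None

    def force(self):
        if not self.evaluated:
            self.value = self.code(*self.free_vars)
            self.evaluated = True
            # "Update": once evaluated, the code and payload are dropped.
            self.code = None
            self.free_vars = ()
        return self.value


evaluations = []  # illustration only: records when the work really happens


def expensive_sum(a, b):
    evaluations.append((a, b))
    return a + b


t = Thunk(expensive_sum, 2, 3)
runs_before_force = len(evaluations)  # building the thunk did no work
first = t.force()
second = t.force()                    # reuses the stored value
```

Nothing is computed when the thunk is built, and the work is done at most once no matter how often the value is demanded.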
How are the stack and the heap used?
As determined by the design of the Spineless Tagless G-machine, with many modifications since that paper was released. Broadly, the execution model is:
(boxed) objects are allocated on the global heap;
every thread object has a stack, consisting of frames with the same layout as heap objects;
when you make a function call, you push values onto the stack and jump to the function;
if the code needs to allocate e.g. a constructor, that data is placed on the heap.
To deeply understand the stack use model, see "Push/Enter versus Eval/Apply".
What is a CAF?
A "Constant Applicative Form". E.g. a top level constant in your program allocated for the lifetime of your program's execution. Since they're allocated statically, they have to be treated specially by the garbage collector.
References and further reading:
The GHC Commentary
The Spineless Tagless G-machine
Compilation via Transformation
Push/Enter vs Eval/Apply
Unboxed Values as First-Class Citizens
Secrets of the Inliner
Runtime Support for Multicore Haskell
This is probably not what you had in mind in terms of an introductory text, but Edward Yang has an ongoing series of blog posts discussing the Haskell heap, how thunks are implemented, etc.
It's entertaining, both with the illustrations and also by virtue of explicating things without delving into too much detail for someone new to Haskell. The series covers many of your questions:
The Haskell heap and how thunks are stored - the first post in the series
Bindings and CAFs
How the IO monad gets translated into primitives
On a more technical level, there are a number of papers that cover (in concert with other things) parts of what you're wanting to know:
A paper by SPJ, Simon Marlow et al on GC in Haskell - I haven't read it, but since GC often represents a good portion of the work Haskell does, it should give insight.
The Haskell 2010 report - I'm sure you'll have heard of this, but it's too good not to link to. Can make for dry reading in places, but one of the best ways to understand what makes Haskell the way it is, at least the portions I've read.
A history of Haskell - is more technical than the name would suggest, and offers some very interesting views into Haskell's design, and the decisions behind the design. You can't help but better understand Haskell's implementation after reading it.

Programming languages that define the problem instead of the solution?

Are there any programming languages designed to define the solution to a given problem instead of defining instructions to solve it? So, one would define what the solution or end result should look like and the language interpreter would determine how to arrive at that result. Looking at the list of programming languages, I'm not sure how to even begin to research this.
The best examples I can currently think of to help illustrate what I'm trying to ask are SQL and MapReduce, although those are both sort of mini-languages designed to retrieve data. But, when writing SQL or MapReduce statements, you're defining the end result, and the DB decides the best course of action to arrive at the end result set.
I could see these types of languages, if they exist, being used in crunching a lot of data or finding solutions to a set of equations. The dream language would be one that could interpret the defined problem, identify which parts are parallelizable, and execute the solution across multiple processes/cores/boxes.
What about Declarative Programming? Excerpt from wikipedia article (emphasis added):
In computer science, declarative programming is a programming paradigm that expresses the logic of a computation without describing its control flow. Many languages applying this style attempt to minimize or eliminate side effects by describing what the program should accomplish, rather than describing how to go about accomplishing it. This is in contrast with imperative programming, which requires an explicitly provided algorithm.
The closest you can get to something like this is with a logic language such as Prolog. In these languages you model the problem's logic but again it's not magic.
This sounds like a description of a declarative language (specifically a logic programming language), the most well-known example of which is Prolog. I have no idea whether Prolog is parallelizable, though.
In my experience, Prolog is great for solving constraint-satisfaction problems (ones where there's a set of conditions that must be satisfied) -- you define your input set, define the constraints (e.g., an ordering that must be imposed on the previously unordered inputs) -- but pathological cases are possible, and sometimes the logical deduction process takes a very long time to complete.
If you can define your problem in terms of a Boolean formula you could throw a SAT solver at it, but note that the 3SAT problem (Boolean variable assignment over three-literal clauses) is NP-complete, and its quantified big brother, the Quantified Boolean Formula problem (which uses the existential quantifier as well as the universal quantifier), is PSPACE-complete.
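To show what "define the problem, let the solver find the answer" looks like, here is a deliberately naive sketch: the problem is just a list of clauses, and a brute-force search tries every assignment. The `solve_sat` function is hypothetical and exponential in the number of variables; real SAT solvers are far smarter, but the interface has the same flavour.

```python
from itertools import product


def solve_sat(clauses, num_vars):
    """Return a satisfying assignment as a dict, or None if there is none.

    A clause is a list of non-zero ints: k means "variable k is true",
    -k means "variable k is false" (the DIMACS convention).
    """
    for values in product([False, True], repeat=num_vars):
        assignment = {i + 1: v for i, v in enumerate(values)}
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return assignment
    return None


# (x1 or x2 or not x3) and (not x1 or x3 or x2) and (not x2 or not x3 or x1)
formula = [[1, 2, -3], [-1, 3, 2], [-2, -3, 1]]
model = solve_sat(formula, 3)

# (x1) and (not x1) can never both hold.
unsat = solve_sat([[1], [-1]], 1)
```

The caller never says how to search, only which conditions a solution must meet.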
There are some very good theorem provers written in OCaml and other FP languages; here are a whole bunch of them.
And of course there's always linear programming via the simplex method.
These languages are commonly referred to as 5th generation programming languages. There are a few examples on the Wikipedia entry I have linked to.
Let me try to answer ... maybe Prolog could answer your needs.
I would say Objective Caml (OCaml) too...
This may seem flippant, but in a sense that is what Stack Overflow is. You declare a problem and/or an intended result, and the community provides the solution, usually in code.
It seems immensely difficult to model dynamic open systems down to a finite number of solutions. I think there is a reason most programming languages are imperative. Not to mention there are massive P = NP problems lurking in the dark that would make such a system difficult to engineer.
Although what would be interesting is if there was a formal framework that could leverage human input to "crunch the numbers" and provide a solution, perhaps imperative code generation. The internet and google search engines are kind of that tool but very primitive.
Large problems and software are basically just a collection of smaller problems solved in code. So any system that generated code would require fairly delimited problem sets that can be mapped to more or less atomic solutions.
Lisp. There are so many Lisp systems out there defined in terms of rules not imperative commands. Google ahoy...
There are various Java-based rules engines which allow declarative programming - Drools is one that I've played with and it seems pretty interesting.
A lot of languages define more problems than solutions (don't take this one seriously).
On a serious note: one more vote for Prolog and different kinds of DSLs designed to be declarative.
I remember reading something about computation using DNA back when I was in college. You would put segments of DNA in a solution that represented segments of the problem, and define it in such a way that if the DNA fits together, it's a valid solution. Then you let the properties of chemicals solve the problem for you and look for finished strands that represent a solution. It sounds sort of like what you are referring to.
I don't recall if it was theoretical or had been done, though.
LINQ could also be considered another declarative DSL (eschewing the argument that it's too similar to SQL). Again, you declare what your solution looks like, and LINQ decides how to find it.
The beauty of these kinds of languages is that projects like PLINQ (which I just found) can spring up around them. Check out this video with the PLINQ developers (WMV direct link) on how they parallelize solution finding without modifying the LINQ language (much).
While mathematical proofs don't constitute a programming language, they do form a formal language where you simply define solutions (as long as you allow nonconstructive proofs). Of course, it's not algorithmic, so "math" might not be an acceptable answer.
Meta Discussion
What constitutes a problem or a solution is not absolute and depends on the level of abstraction that you are taking as a reference point.
Let's compare the following 3 languages: SQL, C++, and CPU instructions.
C++ vs CPU instructions
If you choose array manipulation as the desired level of abstraction, then C++ allows you to "define the problem" instead of the solution:
array[i * 2 + 3] = 5;
array[t] = array[k - m] - 1;
Note what this C++ snippet does not state: how the memory is laid out, how many bits are used by each array element, which CPU registers hold the data, and even in which order the arithmetic operations will be performed (as long as the result is the same).
The C++ compiler, however, will translate this code to lower-level CPU instructions that will contain all of these details.
At the abstraction level of array manipulation, C++ is declarative, and CPU instructions are imperative.
SQL vs C++
If you choose a sorting algorithm as the desired level of abstraction, then SQL allows you to "define the problem" instead of the solution:
select *
from table
order by key
This snippet of code is declarative with respect to the sorting algorithm's level of abstraction because it declares that the output is sorted without using lower-level concepts (like array manipulation).
If you had to sort an array in C++ (without using a library), the program would be expressed in terms of array manipulation steps of a particular sorting algorithm.
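void sort(int *array, int size) {
    for (int i = 1; i < size; i++) {
        int key = array[i];
        int j = i;
        // Shift every larger element one slot to the right.
        while (j > 0 && array[j-1] > key) {
            array[j] = array[j-1];
            j--;
        }
        // Drop the saved element into the gap that opened up.
        array[j] = key;
    }
}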
This snippet is not declarative with respect to the sorting algorithm's level of abstraction because it uses concepts (such as array manipulation) that are constituents of the sorting algorithm.
To summarize, whether a language defines problems or solutions depends on what problems and solutions you are referring to.
Many answers here have brought up examples: SQL, LINQ, Prolog, Lisp, OCaml. I am sure there are many useful levels of abstractions with respect to which these languages are declarative.
However, do not forget that you can build a language with an even higher level of abstraction on top of them.

how can a compiler that recognizes the iterators be implemented?

I have been using iterators for a while and I love them.
But although I have thought hard about it, I could not figure out how a compiler that recognizes iterators could be implemented. I have also researched it, but could not find any resource explaining the situation in the compiler-design context.
To elaborate, most of the articles about Iterators imply there is some sort of 'magic' implementing the desired behaviour. They suggest the compiler maintains a state machine in order to follow where the execution is (where the last 'yield return' is seen). I am especially interested in this property of Iterators that enables the lazy evaluation.
By the way, I know what state machines are, have already taken a compiler design course, and have studied the Dragon Book. But apparently, I cannot relate what I have studied to the 'magic' of csc.
Any knowledge or differential thoughts are appreciated.
It's simpler than it seems. The compiler can decompose the iterator function into individual chunks; chunks are divided by yield statements.
The state machine just needs to keep track of which chunk we're currently in, and upon next invocation of the iterator, jumps directly to this chunk. We also need to keep track of all local variables (of course).
Then, we need to consider a few special cases, in particular loops containing yields. Fortunately, IL (but not C# itself) allows goto to jump into loops and resume them.
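To make the transformation concrete, here is a hand-written sketch of the kind of state machine a compiler could generate, written in Python rather than C# or IL. The class, its state names and the `countdown` generator it mirrors are all invented for the example; this is not the code csc actually emits. The local variable becomes a field, and a state field records which yield we stopped at, so the next call can "jump back into" the loop.

```python
class CountdownIterator:
    """Hand-written equivalent of:

        def countdown(n):
            while n > 0:
                yield n
                n -= 1
            yield "liftoff"
    """

    START, AFTER_LOOP_YIELD, AFTER_FINAL_YIELD, FINISHED = range(4)

    def __init__(self, n):
        self.state = self.START
        self.n = n  # the local variable, hoisted into a field

    def __iter__(self):
        return self

    def __next__(self):
        if self.state == self.AFTER_LOOP_YIELD:
            self.n -= 1  # resume right after the yield inside the loop
        elif self.state == self.AFTER_FINAL_YIELD:
            self.state = self.FINISHED
        if self.state == self.FINISHED:
            raise StopIteration
        # Loop head: reached from START or when re-entering the loop.
        if self.n > 0:
            self.state = self.AFTER_LOOP_YIELD
            return self.n
        self.state = self.AFTER_FINAL_YIELD
        return "liftoff"


values = list(CountdownIterator(3))
```

Each call to `__next__` runs exactly one chunk between two yields, which is where the lazy evaluation comes from.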
Notice that there are some very complicated edge cases, e.g. C# doesn't allow yield in finally blocks because it would be very difficult (impossible?) to leave the function upon yield, and later resume the function, perform clean-up, re-throw any exception and preserve the stack trace.
Eric Lippert has posted an in-depth description of the process. (Read the articles he has linked to, as well!)
One thing I would try would be to write a short example in C#, compile it, and then use Reflector on it. I think that this "yield return" thing is just syntax sugar, so you should be able to see how the compiler handles it in the output of the disassembler.
But, well, I don't really know much about these things so maybe I'm completely wrong.

Confused about three optimization techniques

How do you exactly perform "commoning"?
How does Kleene fixed-point theorem help in optimization?
How do you eliminate free variables from local function definitions in programs written in non-functional languages?
EDIT: These are NOT my homework questions. I am in my summer break.
EDIT2: Well, I am just beginning to study compiler optimizations and don't have any particular code that I want to optimize. Could you just tell me the general methods for applying the above three optimization techniques, or at least point me to resources that explain them properly?
Commoning is done by bottom-up hashing.
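A minimal sketch of what bottom-up hashing can look like: children are numbered first, and a node's key is its operator plus its children's numbers, so structurally equal subexpressions end up with the same number. The `ExprTable` class and the tuple encoding of expressions are assumptions made for the example.

```python
class ExprTable:
    """Assigns one node id per structurally distinct expression."""

    def __init__(self):
        self.ids = {}    # key -> node id
        self.nodes = []  # node id -> key

    def intern(self, key):
        if key not in self.ids:
            self.ids[key] = len(self.nodes)
            self.nodes.append(key)
        return self.ids[key]

    def build(self, expr):
        """expr is a leaf (str or int) or a tuple (op, left, right)."""
        if not isinstance(expr, tuple):
            return self.intern(("leaf", expr))
        op, left, right = expr
        # Children are numbered first, so equal subtrees yield equal keys.
        return self.intern((op, self.build(left), self.build(right)))


# (a + b) * (a + b): both operands should become the same node.
table = ExprTable()
root = table.build(("*", ("+", "a", "b"), ("+", "a", "b")))
op, left_id, right_id = table.nodes[root]
```

Since both operands of `*` map to the same node, `a + b` only needs to be computed once. A real compiler also has to check that nothing between the two occurrences changes `a` or `b`.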
Kleene's theorem allows the compiler to implement an iterative solution to recursion equations that give facts about the program. A simple example of a fact is that at a certain point, variable i is always equal to 0.
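As a sketch of that iterative solution, the code below starts from the bottom element (no facts) and reapplies the equations until nothing changes. The facts here are live variables rather than constants, and the four-block control-flow graph is a made-up example for a loop that counts `i` up to `n`.

```python
def kleene_fixpoint(step, bottom):
    """Iterate bottom, step(bottom), step(step(bottom)), ... until stable."""
    current = bottom
    while True:
        following = step(current)
        if following == current:
            return current
        current = following


# Hypothetical CFG:  B0: i = 0   B1: if i < n   B2: i = i + 1   B3: return i
blocks = {
    "B0": {"use": set(), "def": {"i"}, "succ": ["B1"]},
    "B1": {"use": {"i", "n"}, "def": set(), "succ": ["B2", "B3"]},
    "B2": {"use": {"i"}, "def": {"i"}, "succ": ["B1"]},
    "B3": {"use": {"i"}, "def": set(), "succ": []},
}


def step(live_in):
    """One round of: live_in[b] = use[b] | (live_out[b] - def[b])."""
    result = {}
    for name, block in blocks.items():
        live_out = set()
        for succ in block["succ"]:
            live_out |= live_in[succ]
        result[name] = frozenset(block["use"] | (live_out - block["def"]))
    return result


bottom = {name: frozenset() for name in blocks}
live_in = kleene_fixpoint(step, bottom)
```

The loop terminates because the sets can only grow and there are finitely many variables, which is the situation the theorem covers.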
If you have a local function with free variables that are let-bound or lambda-bound in an enclosing function, then by definition you are dealing with a language that has first-class functions. The free variables are typically dealt with by closure conversion, although some compilers use lambda-lifting.
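Here is a small hand-done sketch of both approaches, using Python as the illustration language. All function names are invented for the example. In closure conversion the local function becomes a top-level function that takes its environment explicitly, and a closure is a (code, environment) pair. In lambda lifting the free variable simply becomes an extra parameter.

```python
# Before: `add_n` has the free variable `n`, bound in the enclosing function.
def make_adder(n):
    def add_n(x):
        return x + n
    return add_n


# Closure conversion: the code has no free variables any more and receives
# its environment as an explicit first argument.
def add_n_code(env, x):
    return x + env["n"]


def make_adder_converted(n):
    return (add_n_code, {"n": n})  # closure = code pointer + environment


def apply_closure(closure, *args):
    code, env = closure
    return code(env, *args)


# Lambda lifting: the free variable becomes an ordinary extra parameter,
# and every call site has to pass it along.
def add_n_lifted(n, x):
    return x + n


original = make_adder(5)(2)
converted = apply_closure(make_adder_converted(5), 2)
lifted = add_n_lifted(5, 2)
```

All three compute the same value. The difference is only in where `n` lives when `add_n` runs.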
Recommended search terms:
Bottom-up hashing
Common-subexpression elimination
Iterative dataflow analysis
Dataflow optimization made simple
Continuation-passing, closure-passing style
Closure conversion
Lambda lifting
These are what I found on the web, if somebody has access to further information please reply.
William Clinger teaches two of the above techniques and looks into more interesting ones in his class:
These guys are using the Kleene algebra for data flow analysis. I think we can use it in optimizing compilers:
Unfortunately the above paper requires login.
This is what I found about commoning (but it didn't help much):
Last Question's Answer:
Good answer from Norman. (I just hope your prof. doesn't confuse optimizations that a compiler might do with optimizations that the software programmer might do. The latter is less of a technical subject, so there is less to say about it, but in real application it is orders of magnitude more significant.)