Can common subexpression elimination be applied between functions? - optimization

I wonder if some parts of two similar functions are the same and the functions are executed sequentially with the same parameters,
or a function inside a loop has identical computations, will compiler do some optimizations like CSE (common subexpression elimination)
or loop-invariant code motion to reduce the redundancy?
Of course, if the function gets inlined, the compiler will find the chance to do the optimizations. But if the function isn't inlined, can compiler still find chances for eliminating the redundancy?
In other words, the compiler will try to find common subexpressions in a basic block or across basic blocks. How about function level?

Related

What is the exhaustive list of guidelines/practices/rules to fully conform with functional paradigm?

I've started playing around with Kotlin, but I sense my own limitation in the way I program. My problem is that I still think Java therefore the style is still imperative, my question is to all functional programming zealots , which I believe would be very useful to all people who at the very beginning stage and also need to 'brake' their brain to start building it again; to leave comfort zone and start thinking pseudo and not in "whatever is your first language". I believe it is possible for highly experienced polyglot developers to chew the concepts down to plain advices of what makes your program being written in entirely functional way and what violates the paradigm. I don't know all the quirks but please don't hesitate to include universally accepted terms which might be unknown to me(I can always lookup). At this point I need this set of rules to make myself suffer at first and not break them but then I know I will feel it, analyze guidelines and understand how they are worse/better which of course is my own homework.
So example of these guidelines, would be something like:
Never change state, this can be avoided by using x, y, z
Operate using higher order functions only (I maybe wrong, just example)
I hope the answer will give me long term reference to put myself in extreme conditions where I stop escaping to OOP whenever I feel uncomfortable. And now when I look at Kotlin I understand how I've should've been thinking about problems, it is about intention not about the structure imposed by one language or another. Intention can always be converted to a language of your choice and backed up by design patterns applicable to the language, but to find that middle ground I need to jail myself first from the comfort zone.
Avoid mutable state like the plague.
One of the main points of using functional programming, possibly the main one, is to avoid all the little pitfalls, bugs, issues one needs to deal with when using mutable state. You should do everything you can in order to avoid mutating state. For instance, instead of using C-style for-loops where you need to keep a counter variable updated, use map and other higher-order functions in order to abstract away your iteration patterns. This also means that you should never change the value of a variable if you can avoid that. Instead, you should be defining almost all of your variables, preferrably all of them, as constants, and using functions to compute new values from them instead of mutating them.
Avoid side-effects like the plague.
Mutable state's ugly cousin, side-effects. Side effects mean anything other than taking a value and returning a value in a function. If that function prints data, mutates global variables, sends messages to threads, or anything, anything other than simply taking its parameters, computing a value from them, and returning a value, that function has side-effects. Side-effects are important (see next bullet point), but if you use them a lot, they get impossible to track. Just think of how everyone tells you to avoid global variables in imperative programming. Functional programming goes a step further and tries to avoid all side-effects. The bulk of your program should be made of pure functions. (See ahead)
When you need to use side-effects, keep them contained.
Yes, I just told you to run away from side-effects. However, no program is useful without side-effects of some kind. Graphical User Interface? Side-effect. Audio output? Side-effect. Printing to a shell? Side-effect. So you can't really get rid of side-effects if you want to build useful stuff.
What you should do instead is write your code so that all your side-effecting code lives in a thin layer which mostly calls pure functions and then does the required side-effects using the result of these pure function calls.
Use pure functions for everything you can.
This is sort of the flipside of the previous point. A pure function is a function which has no side-effects and does not mutate anything. It can only take in parameters and return a value. You should use these a lot. For instance, instead of doing your logging within functions which are computing stuff, you should be constructing your log strings using pure functions, and then letting your side-effects layer call these pure functions, call more pure functions in order to format the log strings into a full log, and then output the log itself from your side-effects layer.
Use higher-order functions to structure your code.
Higher-order functions are, in a way, the glue that makes functional programming work. A higher-order function is a function which takes one or more functions as parameters and/or returns a function. The power of higher-order functions is that they can encapsulate many of the patterns which you would use in an imperative-style program in a declarative manner. For instance, let's take a look at the three most common higher-order functions:
map is a function which takes a function and a list of values, applies its function argument to each of those values, and returns a new list with the results. map encapsulates the whole pattern of iterating over a list doing an operation on each value in a declarative manner.
filter is a function which takes a function which returns a boolean and a list of values, applies its function argument to each of those values and returns a list containing only those values for which its function argument returns true. It encapsulates the whole pattern of selecting results from a list in a declarative manner.
reduce, also known as fold, takes an initial value, a binary function and a list of values. It uses its function argument to combine the initial value with the first value of the list, then combines the result with the next value of the list and keeps on doing this until it has reduced the list to just one single value. It encapsulates the entire pattern of obtaining an aggregate value from a list of values.
This is in no way an exhaustive list of higher-order functions, but these three are the most common ones. I hope this has been enough to show how you can structure code which would require a lot of tracking variables using only functions in a declarative manner. If you use these higher-order functions well, it's likely you won't ever need a for or while loop again.
This is definitely not an exhaustive list of functional programming practices, but I think most functional programmers would agree these five guidelines form the core of what functional programming is about. If you want to really learn how to apply these, my advice would be to learn a pure functional programming language such as Haskell, so you are forced to abandon the imperative paradigm and to learn how to structure things functionally instead. I would recommend the fantastic Haskell Programming from First Principles as a starting resource if you choose to go this way. In case you don't want to/can't put down the cash, Brent Yorgey's Haskell course at UPenn is also a great free resource.

Optimization of consecutive map/filter/fold calls

Let's say I have a big list on which I'd like to execute multiple map, filter and fold/reduce calls. For clarity and expressiveness this should be done with small lambda functions passed to map/filter/fold. However, as far as I know, these are actually traversing the list every time, calling the lambda on it (might be inline though) and generating a new list. If this is the case, I could just code a for-each loop and merge all the lambdas into its body.
I measured execution time of a simple map/filter/reduce algorithm and the corresponding imperative for-each loop in Python and the latter was more than two times faster, just as I expected, but I know Python is not the best language in this regard.
My questions are: Is it possible for a compiler to figure out these and somehow merge them into a single loop? Are there any compilers that do this? I'm interested in mainly functional languages (Haskell, Erlang/Elixir, Scala), but would be good to hear about others as well (Rust's implementation, LINQ).
Yes, such optimizations have been considered many times.
One term or method used is "fusion" (also known as stream or map fusion), which has the goal of intelligently inlining iterated trasformations, in patterns like map f . map g = map (f . g). This mostly has to be done with the help of a compiler, but can work on "normal" implementations of these functions (if they are done somewhat intelligently).
Another approach is to perform this kind of inlining manually by accumulating all intermediate closures, and only apply the combinded transformation when the values are actually needed (this is closely related to lazy evaluation, a thing which will in some languages, like Haskell, be done automatically). Such things can be found in Scala's views and Streams, or Clojure's transducers (which work in a more complicated way, though). The problem with these lazy things is that they tend to run into space problems more easily (I've heard).
Iterators in Python (and C#'s IEnumerable/LINQ stuff, and Java's new Streams) principle work via the latter principle, involving a language-provided iteration support (involving some internal state). Which is why xs = map(print, range(10)) will not print anything immediately, and can only be traversed once; at every step of the iteration, the nested iterators will ask each other for the next value, transform it, and update their state. (And probably your measured difference is due more to this involved machinery than to repeated iteration.)

frege pure functions and performance optimizations

My understanding of haskell's pure functions is that they enable performance optimizations like caching (because a pure function returns the same result for the same input every time). What performance optimizations occur for frege's pure functions?
Certainly not caching. I'm not aware of any language that would do this automatically, and for good reasons.
What we do currently is inlining, beta-reduction and elimination of certain value constructions and deconstructions. For example, when you have:
case (\a -> (Just a, Just a)) 42 of (Just b, Just c) -> [c,b]
the compiler just generates code to construct the list
[ 42, 42 ]
This looks not very useful at first sight, since certainly nobody would write such bloated code. However, consider that the lambda expression may be the result of inlining some other function. In fact, in highly abstract code like monadic code, the expansion of the (>>=) operator often leads to code that can be optimized in this way.
While inlining and beta-reduction is good in some cases, one must take care not do overdo it, lest one gets code bloat. Especially in the JVM environment, it is a disadvantage to have huge functions (that is, methods). The JIT can and will do a great job for small methods.

The use of inline functions in Objective-C

I do not understand the point of 'inline functions'.
I know they are faster (such as macros) than ordinary functions, but why isn't every function an 'inline-function'?
Functions that are called from more than one place in code could be inlined, but the expense of that (ever so slight) performance boost is code size. Generally it's considered better to add the few instructions to make a subroutine call than it is to consume additional space inlining a function.
Functions that are called from exactly one place may be inlined without extra overhead.
Code size - if a function is called from multiple locations duplicating the code everywhere can quickly cause significant growth in overall code size.
Code size - Infinite! It is simply impossible to inline code which is recursive, directly or indirectly.

Does over-using function calls affect performance? Specifically in Fortran

I habitually write code with lots of functions, I find it makes it clearer. But now I'm writing some code in Fortran which needs to be very efficient, and I'm wondering whether over-using functions will slow it down, or whether the compiler will work out what's going on and optimise?
I know in Java/Python etc each function is an object, and so creating lots of functions would require them to be created in memory. I also know that in Haskell the functions are reduced into each other, so it makes little difference there.
Does anyone know about the case with Fortran? Is there a difference with using intent/pure functions/declaring fewer local variables/anything else?
Function calls carry a performance cost for stack based languages, like Fortran. They have to add on to the stack and such.
Because of this, most compilers will try to inline function calls aggressively, if it is possible. Most of the time the compiler will make the right choice on whether or not to inline certain functions in your program.
This automatic inlining process will mean that there is no extra cost for writing your function (at all).
This means that you should write your code as cleanly and organized as possible, and it is likely that the compiler will do these optimizations for you. It is more important that your overall strategy for solving the problem is the most efficient than worry about performance of function calls.
Just write the code in the simplest and most well-structured way you can, then when you have it written and tested you can profile it to see if there are any hotspots which require optimisation. Only at that point should you concern yourself with micro-optimisations, and this may not even be necessary if your compiler is doing its job.
I've just spent all morning tuning an app consisting of mixed C and Fortran, and of course it uses a lot of functions. What I found (and what I usually find) is not that functions are slow, but that certain function calls (and very few of them) don't really have to be done at all. For example, clearing memory blocks, just to be neat, but doing it at high frequency.
This is not a function of language, and not really a function of inlining either. Function calls could be free and you would still have the problem that the call tree tends to be more bushy than necessary. You need to find out where to prune it. This is the "profiling" method I rely on.
Whatever you do, find out what needs to be fixed. Don't Guess. Many people don't think of this kind of question as guessing, but when they find themselves asking "Will this work, will that help?", they're poking in the dark, rather than finding out where the problems are. Once they know where the problems are, the fix is obvious.
Typically subroutine / function calls in Fortran will have very little overhead. While the language standard doesn't specify argument passing mechanisms, the typical implementation is "by reference" so no copying is involved, only setting up a new procedure. On most modern architectures this has little overhead. Selecting good algorithms is generally far more important than micro-optimizations.
An exception about calling be quick could be case in which the compiler has to create temporary arrays, for example, if the actual argument is a non-contiguous array subsection and the called procedure argument is a plain contiguous array. Suppose that the dummy argument is dimension (:). Calling it with an array of dimension (:) is simple. If you request a non-unit stride in the call, e.g., array (1:12:3), then the array is non-contiguous and the compiler may need to create a temporary copy. Suppose that the actual argument is dimension (:,:). If the call has array (:,j), the sub-array is contiguous since in Fortran the first index varies fastest in memory and shouldn't need a copy. But array (i,:) is non-contiguous and might require a temporary copy.
Some compilers have options to warn you when temporary array copies are needed so that you can change your code, if you wish.