Use of recursion in Scala when run in the JVM - optimization

From searching elsewhere on this site and the web, I gather that tail call optimization is not supported by the JVM. Does that mean that tail-recursive Scala code such as the following, which may run on very large input lists, should not be written if it is to run on the JVM?
// Get the nth element in a list
def nth[T](n : Int, list : List[T]) : T = list match {
  case Nil => throw new IllegalArgumentException
  case _ if n == 0 => throw new IllegalArgumentException
  case _ :: tail if n == 1 => list.head
  case _ :: tail => nth(n - 1, tail)
}
Martin Odersky's Scala by Example contains the following paragraph, which seems to suggest that there are circumstances or other environments where recursion is appropriate:
In principle, tail calls can always re-use the stack frame of the calling
function. However, some run-time environments (such as the Java VM) lack the
primitives to make stack frame re-use for tail calls efficient. A production quality
Scala implementation is therefore only required to re-use the stack frame of a directly
tail-recursive function whose last action is a call to itself. Other tail calls might
be optimized also, but one should not rely on this across implementations.
Can anyone explain what the middle two sentences of this paragraph mean?
Thank you!

Since direct tail recursion is equivalent to a while loop, your example will run efficiently on the JVM: the Scala compiler can compile it to a loop under the hood, simply using a jump. General TCO, however, is not supported on the JVM, although the tailcall() method in scala.util.control.TailCalls supports tail calls using trampolines.
To ensure that the compiler can correctly optimize a tail-recursive function to a loop, you can use the scala.annotation.tailrec annotation, which will cause a compiler error if the compiler cannot make the desired optimization:
import scala.annotation.tailrec
@tailrec def nth[T](n : Int, list : List[T]) : Option[T] = list match {
  case Nil => None
  case _ if n == 0 => None
  case _ :: tail if n == 1 => list.headOption
  case _ :: tail => nth(n - 1, tail)
}
(screw IllegalArgumentException!)
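To make the "equivalent to a while loop" claim concrete, here is a minimal sketch in Python (used only as a neutral illustration; Python itself performs no tail-call elimination). The names nth_recursive and nth_loop are made up for this example. The second function is what you get by hand when the self-call in tail position is replaced by rebinding the parameters and jumping back to the top, which is roughly the rewrite the Scala compiler is described as doing.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def nth_recursive(n: int, items: Sequence[T]) -> Optional[T]:
    """1-based lookup written as direct tail recursion (mirrors the Scala version)."""
    if not items:
        return None
    if n == 0:
        return None
    if n == 1:
        return items[0]
    # The self-call is the last action: nothing is left to do after it returns.
    return nth_recursive(n - 1, items[1:])


def nth_loop(n: int, items: Sequence[T]) -> Optional[T]:
    """Same logic with the tail call replaced by parameter rebinding plus a jump."""
    while True:
        if not items:
            return None
        if n == 0:
            return None
        if n == 1:
            return items[0]
        # "Call" ourselves again: overwrite the arguments and go back to the top.
        n, items = n - 1, items[1:]


print(nth_recursive(3, ["a", "b", "c"]))  # small inputs: both versions agree
print(nth_loop(3, ["a", "b", "c"]))
print(nth_loop(3000, list(range(3000))))  # the loop uses constant stack space
```

The loop version keeps working on inputs far deeper than the interpreter's recursion limit, which is the practical benefit the optimization buys on the JVM as well.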

In principle, tail calls can always re-use the stack frame of the calling
function. However, some runtime environments (such as the Java VM) lack the
primitives to make stack frame re-use for tail calls efficient. A production quality
Scala implementation is therefore only required to re-use the stack frame of a directly
tail-recursive function whose last action is a call to itself. Other tail calls might
be optimized also, but one should not rely on this across implementations.
Can anyone explain what the middle two sentences of this paragraph mean?
Tail recursion is a special case of a tail call. Direct tail recursion is a special case of tail recursion. Only direct tail recursion is guaranteed to be optimized. Others may be optimized, too, but that's basically just a compiler optimization. As a language feature, Scala only guarantees direct tail recursion elimination.
So, what's the difference?
Well, a tail call is simply the last call in a subroutine:
def a = {
  b
  c
}
In this case, the call to c is a tail call, the call to b is not.
Tail recursion is when a tail call calls a subroutine that was already called before:
def a = {
  b
}
def b = {
  a
}
This is tail recursion: a calls b (a tail call), which in turn calls a again. (In contrast to the direct tail recursion described below, this is sometimes called indirect tail recursion.)
However, neither of the two examples will get optimized by Scala. Or, more precisely: a Scala implementation is allowed to optimize them, but it is not required to do so. This is in contrast to, e.g., Scheme, where the language specification guarantees that all of the above cases will take O(1) stack space.
The Scala Language Specification only guarantees that direct tail recursion is optimized, i.e. when a subroutine directly calls itself with no other calls in between:
def a = {
  b
  a
}
In this case, the call to a is a tail call (because it is the last call in the subroutine), it is tail recursion (because it calls itself again) and most importantly it is direct tail recursion, because a directly calls itself without going through another call first.
Note that there are many subtle things that may lead to a method not being directly tail-recursive. For example, if a is overloaded, then the recursion may actually go through different overloads, and thus would no longer be direct.
In practice, this means two things:
You cannot perform an Extract Method refactoring on a tail-recursive method, at least not one that includes the tail call, because this would turn a directly tail-recursive method (which will get optimized) into an indirectly tail-recursive method (which will not get optimized).
You can only use direct tail recursion. A tail-recursive descent parser or a state machine, both of which can be very elegantly expressed using indirect tail recursion, is out.
The main reason for this is that when your underlying execution engine lacks powerful control flow manipulation features such as GOTO, continuations, first-class mutable stack or proper tail calls, then you need to either implement your own stack on top of it, use trampolines, make a global CPS transform or something similarly nasty, in order to provide generalized proper tail calls. All of these have either severe impact on performance or interoperability with other code on the same platform.
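As an illustration of the trampoline option mentioned above, here is a minimal sketch in Python (chosen only as a neutral language; the function names are invented for this example). Each mutually recursive function returns a thunk describing the next call instead of making it, and a small driver loop runs the thunks one at a time, so the stack never grows. The cost is an extra allocation and an extra indirect call per step, which is the kind of performance impact described in the paragraph above.

```python
def is_even(n):
    """Return the answer, or a zero-argument thunk describing the next tail call."""
    if n == 0:
        return True
    return lambda: is_odd(n - 1)   # indirect tail call, deferred instead of executed


def is_odd(n):
    if n == 0:
        return False
    return lambda: is_even(n - 1)  # indirect tail call, deferred instead of executed


def trampoline(step):
    """Keep bouncing until a step is a plain value rather than a thunk."""
    while callable(step):
        step = step()
    return step


# 100000 levels of mutual recursion, but the call stack stays flat.
print(trampoline(is_even(100000)))
print(trampoline(is_odd(7)))
```

This only works as written because the final results (booleans) are not themselves callable; a more careful version would wrap "done" and "call" in distinct types, as Scala's TailCalls library does.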
Or, as Rich Hickey, the creator of Clojure, said when he was facing the same problem: "Performance, Java interop, tail calls. Pick two." Both Clojure and Scala chose to compromise on tail calls and provide only tail recursion and not full tail calls.
To cut a long story short: yes, the specific example you posted will be optimized, since it is direct tail recursion. You can test this by putting a @tailrec annotation on the method. The annotation does not change whether or not the method gets optimized; it does, however, guarantee that you will get a compile error if the method cannot be optimized.
Due to the above-mentioned subtleties, it is generally a good idea to put a @tailrec annotation on methods that you need to be optimized, both in order to get a compile error and as a hint to other developers maintaining your code.

The Scala compiler will attempt to optimize tail calls by "flattening" them into a loop that won't cause a continually expanding stack.
Of course, your code has to be optimizable for it to do so. If you put the @tailrec annotation (scala.annotation.tailrec) before your method, however, the compiler will REQUIRE that the method be optimizable, or it will fail to compile.

Martin's remark is saying that only directly self-recursive calls are candidates (other criteria being met) for TCO. Indirect, mutually recursive method pairs (or larger sets of recursive methods) cannot be so optimized.

Note that there are JVMs that support tail call optimization (IIRC, IBM's J9 does), it's just not a requirement in the JLS, and Oracle's implementation doesn't do it.

How to make SBCL optimize away possible call to FDEFINITION?

Apologies: I don't have sufficient knowledge to rework this as an easy to understand code snippet.
I've been using the SBCL compiler notes as signs to what might be improved but I'm well out of my depth with this —
; compiling (DEFUN EXECUTE-PARALLEL ...)
; file: /home/dunham/8000-benchmarksgame/bench/spectralnorm/spectralnorm.sbcl-8.sbcl
; in: DEFUN EXECUTE-PARALLEL
; (FUNCALL FUNCTION START END)
; --> SB-C::%FUNCALL THE
; ==>
; (SB-KERNEL:%COERCE-CALLABLE-FOR-CALL FUNCTION)
;
; note: unable to
; optimize away possible call to FDEFINITION at runtime
; because:
; FUNCTION is not known to be a function
—
#+sb-thread
(defun execute-parallel (start end function)
  (declare (type int31 start end))
  (let* ((num-threads 4))
    (loop with step = (truncate (- end start) num-threads)
          for index from start below end by step
          collecting (let ((start index)
                           (end (min end (+ index step))))
                       (sb-thread:make-thread
                        (lambda () (funcall function start end))))
          into threads
          finally (mapcar #'sb-thread:join-thread threads))))
#-sb-thread
(defun execute-parallel (start end function )
  (funcall function start end))
(The program is here. Measurements for similar programs are here.)
Is it practical to make SBCL "optimize away possible call to FDEFINITION" or is that compiler note an explanation rather than an opportunity?
The reason for the possible call to fdefinition is that the compiler doesn't know that function is a function: it might be the name of one. In general it may be a function designator rather than a function. To keep the compiler quiet, tell it that function is a function with a suitable type declaration: (declare (type function function)).
Rainer is right: there is ε chance that this is ever going to be a performance problem, given you're starting a new thread. In particular it is fairly likely that adding a declaration will make no difference at all:
without a declaration the call to funcall will get compiled as something like 'check the type of the object: if it is a function, call it; if it is not, call fdefinition on it and call the result;';
with a declaration then the overall function looks like 'check the object is a function, signalling an error if not ... call the function'.
In both cases, if the object is a function, there is one type check and one call: the type check is just in a different place. In the first case, the code will still work if the object is merely the name of a function, while with the type check it won't.
And in both of these cases this is code where you are calling make-thread: if this is anything like as fast as a function call, even via fdefinition, I would be really impressed by the threading system! Almost certainly the performance of this function is entirely dominated by the overhead of making threads.
In real code, avoid optimizations like that - unless really needed
Is it practical to make SBCL "optimize away possible call to FDEFINITION" or is that compiler note an explanation rather than an opportunity?
Generally it does not matter, especially since most Lisp code should not be compiled with optimization qualities (speed 3) (safety 0) (space 0), since it may open up the software to runtime errors and crashes depending on the implementation and program used. Calling things unchecked (without safety), other than functions or symbols naming functions, via funcall might be dangerous enough to cause a program crash.
For a specific benchmark one might check via timings if a type declaration and a specialized fdefinition compilation brings any advantage.
a type declaration
A type declaration to make clear that a variable named fn is referencing an object of type function would be:
(declare (type function fn))
in the specific benchmark program FDEFINITION won't be called anyway
In the example you have provided, fdefinition will not be called anyway.
(setf foo (lambda (x) x)) ; foo references a function object
(funcall foo 3)
funcall is probably implemented by something like this:
(etypecase f
((or cons symbol) (funcall (fdefinition f) ...))
(function ...))
Since your code passes a function object, there is never the need to call fdefinition.
The optimization benefit then will be that the runtime type dispatch can be removed and the dead code for the cons or symbol case...
You ask a question about removing an fdefinition, but actually your question relies on a premise that the SBCL notes are a good way to drive optimisations and improvements. The notes are a good way to spot obvious issues and places where type declarations can help. They do not tell you what actually makes your program slow. The correct way to improve the performance of a program is to 1. think about whether there is a faster algorithm, and 2. measure its performance and work out what is slow.
A single fdefinition call will only matter if it happens in a tight loop (i.e. it is not single but very plural).
In this case it happens to start a thread. If you are starting threads in a tight loop then your performance problem comes from starting threads in a tight loop. Don’t do that.
If you aren’t starting threads in a tight loop (looking at your code, it appears you are not), there are bigger fish to fry. Why waste time on an fdefinition that maybe gets called 4 times per call to execute-parallel when you can optimise the inner function instead?
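The "measure first" advice can be sketched as a tiny benchmark. This is an analogy in Python, not SBCL: the registry dictionary is a hypothetical stand-in for looking a function up by name (loosely what fdefinition does), and the absolute numbers will differ on every machine and runtime. The point is only the relative size of a name lookup compared with creating and joining a thread.

```python
import threading
import timeit


def work(start, end):
    return end - start


registry = {"work": work}  # hypothetical name -> function table


def call_function_object():
    return work(0, 10)              # caller already holds the function


def call_via_name():
    return registry["work"](0, 10)  # resolve the name first, then call


def call_in_new_thread():
    worker = threading.Thread(target=work, args=(0, 10))
    worker.start()
    worker.join()


def seconds_per_call(fn, runs):
    return timeit.timeit(fn, number=runs) / runs


direct = seconds_per_call(call_function_object, 20000)
by_name = seconds_per_call(call_via_name, 20000)
threaded = seconds_per_call(call_in_new_thread, 200)

print(f"direct call:       {direct * 1e9:12.0f} ns")
print(f"call via lookup:   {by_name * 1e9:12.0f} ns")
print(f"call in new thread:{threaded * 1e9:12.0f} ns")
```

If the thread line dwarfs the other two, as it typically should, shaving the lookup is not where the time goes.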

Can you clone a Perl 6 Proc?

I was playing with this in 2018.01:
my $proc = Proc.new: :out;
my $f = $proc.clone;
$f.spawn: 'ls';
put $f.out.slurp;
It says it can't do it. It's curious that the error message is about a routine I didn't use and a different class:
Cannot resolve caller stdout(Proc::Async: :bin); none of these signatures match:
(Proc::Async:D $: :$bin!, *%_)
(Proc::Async:D $: :$enc, :$translate-nl, *%_)
in block <unit> at proc-out.p6 line 3
Everything inherits a default clone method from Mu, which does a shallow clone, but that doesn't mean that everything makes sense to clone. This especially goes for objects that might hold references to OS-level things, such as Proc or IO::Handle. As the person who designed Proc::Async, I can say for certain that making it do anything useful on clone was not a design consideration. I didn't design Proc, but I suspect the same applies.
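The hazard of shallow-cloning an object that wraps an external resource is not specific to Perl 6. As a loose analogy in Python (the class ProcLike is invented for this sketch, and an in-memory io.StringIO stands in for an OS-level pipe), a shallow copy produces a second wrapper around the very same handle:

```python
import copy
import io


class ProcLike:
    """Hypothetical wrapper that owns a handle, loosely analogous to Proc with :out."""

    def __init__(self):
        self.out = io.StringIO()  # stand-in for an OS-level pipe or file handle


original = ProcLike()
clone = copy.copy(original)       # shallow: attributes are shared, not duplicated

print(clone is original)          # two distinct wrapper objects
print(clone.out is original.out)  # but one underlying handle

original.out.close()
print(clone.out.closed)           # closing through one wrapper affects the other
```

Whether a clone of such an object "works" therefore depends entirely on whether the class was designed with sharing in mind, which is the point made above about Proc and Proc::Async.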
As for the error, keep in mind that the Perl 6 standard library is implemented in Perl 6 (a lot like in Java and .Net, but less like Perl 5 where many things that are provided by default go directly to something written in C). In this particular case, Proc is implemented in terms of Proc::Async. Rakudo tries to trim stack traces somewhat to eliminate calls inside of the setting, which is usually a win for the language user, but in cases like this can be a little less helpful. Running Rakudo with the --ll-exception flag provides the full details, and thus makes clearer what is going on.

Semantics of GCC hot attribute

Assume I have a compilation unit consisting of three functions, A, B, and C. A is invoked once from a function external to the compilation unit (e.g. it's an entry point or callback); B is invoked many times by A (e.g. it's invoked in a tight loop); C is invoked once by each invocation of B (e.g. it's a library function).
The entire path through A (passing through B and C) is performance-critical, though the performance of A itself is non-critical (as most time is spent in B and C).
What is the minimal set of functions which one should annotate with __attribute__ ((hot)) to effect more aggressive optimization of this path? Assume we cannot use -fprofile-generate.
Equivalently: Does __attribute__ ((hot)) mean "optimize the body of this function", "optimize calls to this function", "optimize all descendant calls this function makes", or some combination thereof?
The GCC info page does not clearly address these questions.
Official documentation:
hot
The hot attribute on a function is used to inform the compiler that the function is a hot spot of the compiled program. The function is optimized more aggressively and on many target it is placed into special subsection of the text section so all hot functions appears close together improving locality.
When profile feedback is available, via -fprofile-use, hot functions are automatically detected and this attribute is ignored.
The hot attribute on functions is not implemented in GCC versions earlier than 4.3.
The hot attribute on a label is used to inform the compiler that path following the label are more likely than paths that are not so annotated. This attribute is used in cases where __builtin_expect cannot be used, for instance with computed goto or asm goto.
The hot attribute on labels is not implemented in GCC versions earlier than 4.8.
2007:
__attribute__((hot))
Hint that the marked function is "hot" and should be optimized more aggressively and/or placed near other "hot" functions (for cache locality).
Gilad Ben-Yossef:
As their name suggests, these function attributes are used to hint the compiler that the corresponding functions are called often in your code (hot) or seldom called (cold).
The compiler can then order the code in branches, such as if statements, to favour branches that call these hot functions and disfavour cold functions, under the assumption that it is more likely that the branch that will be taken will call a hot function and less likely that it will call a cold one.
In addition, the compiler can choose to group together functions marked as hot in a special section in the generated binary, on the premise that since data and instruction caches work based on locality, or the relative distance of related code and data, putting all the often called function together will result in better caching of their code for the entire application.
Good candidates for the hot attribute are core functions which are called very often in your code base. Good candidates for the cold attribute are internal error handling functions which are called only in case of errors.
So, according to these sources, __attribute__ ((hot)) means:
optimize calls to this function
optimize the body of this function
put body of this function to .hot section (to group all hot code in one location)
After source code analysis we can say that the "hot" attribute is checked with lookup_attribute ("hot", DECL_ATTRIBUTES (current_function_decl)), and when it is present, the function's node->frequency is set to NODE_FREQUENCY_HOT (predict.c, compute_function_frequency()).
If the function's frequency is NODE_FREQUENCY_HOT, then:
If there is no profile information and no likely/unlikely on branches, maybe_hot_frequency_p will return true for the function (== "...frequency FREQ is considered to be hot."). This turns the value of maybe_hot_bb_p into true for all basic blocks (BB) in the function ("BB can be CPU intensive and should be optimized for maximal performance.") and makes maybe_hot_edge_p true for all edges in the function. In turn, outside of -Os mode, these BBs, edges, and also loops will be optimized for speed, not for size.
For all outbound call edges from this function, cgraph_maybe_hot_edge_p will return true ("Return true if the call can be hot."). This flag is used in IPA (ipa-inline.c, ipa-cp.c, ipa-inline-analysis.c) and influences inlining and cloning decisions.

Difference between State, ST, IORef, and MVar

I am working through Write Yourself a Scheme in 48 Hours (I'm up to about 85hrs) and I've gotten to the part about Adding Variables and Assignments. There is a big conceptual jump in this chapter, and I wish it had been done in two steps with a good refactoring in between rather than jumping straight to the final solution. Anyway…
I've gotten lost with a number of different classes that seem to serve the same purpose: State, ST, IORef, and MVar. The first three are mentioned in the text, while the last seems to be the favored answer to a lot of StackOverflow questions about the first three. They all seem to carry a state between consecutive invocations.
What are each of these and how do they differ from one another?
In particular these sentences don't make sense:
Instead, we use a feature called state threads, letting Haskell manage the aggregate state for us. This lets us treat mutable variables as we would in any other programming language, using functions to get or set variables.
and
The IORef module lets you use stateful variables within the IO monad.
All this makes the line type ENV = IORef [(String, IORef LispVal)] confusing - why the second IORef? What will break if I'll write type ENV = State [(String, LispVal)] instead?
The State Monad : a model of mutable state
The State monad is a purely functional environment for programs with state, with a simple API:
get
put
Documentation in the mtl package.
The State monad is commonly used when needing state in a single thread of control. It doesn't actually use mutable state in its implementation. Instead, the program is parameterized by the state value (i.e. the state is an additional parameter to all computations). The state only appears to be mutated in a single thread (and cannot be shared between threads).
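The idea that "the state is an additional parameter to all computations" can be sketched without any monad machinery. The following is a rough Python rendering, not Haskell's actual State type: every stateful step is a plain function from an old state to a (result, new state) pair, and get and put here are hand-rolled stand-ins for the real API. Nothing is ever mutated; the state is just threaded through.

```python
def get(state):
    """Read the state: the result is the state, and the state is unchanged."""
    return state, state


def put(new_state):
    """Build a step that ignores the old state and installs new_state."""
    return lambda _old_state: (None, new_state)


def tick(state):
    """Return the current counter value and a state advanced by one."""
    current, state = get(state)
    _, state = put(current + 1)(state)
    return current, state


def three_ticks(state):
    # The "mutation" is only an illusion created by passing the state along.
    a, state = tick(state)
    b, state = tick(state)
    c, state = tick(state)
    return [a, b, c], state


print(three_ticks(10))  # ([10, 11, 12], 13)
```

The State monad in Haskell hides exactly this plumbing behind do-notation, which is also why such state cannot be shared between threads: there is no shared memory cell, only values handed from one step to the next.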
The ST monad and STRefs
The ST monad is the restricted cousin of the IO monad.
It allows arbitrary mutable state, implemented as actual mutable memory on the machine. The API is made safe in side-effect-free programs, as the rank-2 type parameter prevents values that depend on mutable state from escaping local scope.
It thus allows for controlled mutability in otherwise pure programs.
Commonly used for mutable arrays and other data structures that are mutated, then frozen. It is also very efficient, since the mutable state is "hardware accelerated".
Primary API:
Control.Monad.ST
runST -- start a new memory-effect computation.
And STRefs: pointers to (local) mutable cells.
ST-based arrays (such as vector) are also common.
Think of it as the less dangerous sibling of the IO monad. Or IO, where you can only read and write to memory.
IORef : STRefs in IO
These are STRefs (see above) in the IO monad. They don't have the same safety guarantees as STRefs about locality.
MVars : IORefs with locks
Like STRefs or IORefs, but with a lock attached, for safe concurrent access from multiple threads. IORefs and STRefs are only safe in a multi-threaded setting when using atomicModifyIORef (a compare-and-swap atomic operation). MVars are a more general mechanism for safely sharing mutable state.
Generally, in Haskell, use MVars or TVars (STM-based mutable cells), over STRef or IORef.
Ok, I'll start with IORef. IORef provides a value which is mutable in the IO monad. It's just a reference to some data, and like any reference, there are functions which allow you to change the data it refers to. In Haskell, all of those functions operate in IO. You can think of it like a database, file, or other external data store - you can get and set the data in it, but doing so requires going through IO. The reason IO is necessary at all is because Haskell is pure; the compiler needs a way to know which data the reference points to at any given time (read sigfpe's "You could have invented monads" blogpost).
MVars are basically the same thing as an IORef, except for two very important differences. MVar is a concurrency primitive, so it's designed for access from multiple threads. The second difference is that an MVar is a box which can be full or empty. So where an IORef Int always has an Int (or is bottom), an MVar Int may have an Int or it may be empty. If a thread tries to read a value from an empty MVar, it will block until the MVar gets filled (by another thread). Basically an MVar a is equivalent to an IORef (Maybe a) with extra semantics that are useful for concurrency.
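The "box that is either full or empty" behaviour can be imitated in Python with a one-slot blocking queue. This is only a sketch of the semantics described above, under the assumption that a bounded queue is a fair stand-in; the class name MVar and its methods are written for this example and are not Haskell's implementation.

```python
import queue
import threading


class MVar:
    """One-slot box: take blocks while empty, put blocks while full."""

    def __init__(self):
        self._slot = queue.Queue(maxsize=1)

    def put(self, value):
        self._slot.put(value)    # blocks until the box is empty

    def take(self):
        return self._slot.get()  # blocks until the box is full, then empties it


box = MVar()
received = []

# The consumer starts first and blocks inside take() because the box is empty.
consumer = threading.Thread(target=lambda: received.append(box.take()))
consumer.start()

box.put(42)       # filling the box wakes the consumer
consumer.join()

print(received)   # [42]
```

An IORef-like cell, by contrast, always holds a value and never blocks, so it offers no way for one thread to wait for another.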
State is a monad which provides mutable state, not necessarily with IO. In fact, it's particularly useful for pure computations. If you have an algorithm that uses state but not IO, a State monad is often an elegant solution.
There is also a monad transformer version of State, StateT. This frequently gets used to hold program configuration data, or "game-world-state" types of state in applications.
ST is something slightly different. The main data structure in ST is the STRef, which is like an IORef but with a different monad. The ST monad uses type system trickery (the "state threads" the docs mention) to ensure that mutable data can't escape the monad; that is, when you run an ST computation you get a pure result. The reason ST is interesting is that it's a primitive monad like IO, allowing computations to perform low-level manipulations on bytearrays and pointers. This means that ST can provide a pure interface while using low-level operations on mutable data, meaning it's very fast. From the perspective of the program, it's as if the ST computation runs in a separate thread with thread-local storage.
Others have done the core things, but to answer the direct question:
All this makes the line type ENV = IORef [(String, IORef LispVal)] confusing. Why the second IORef? What will break if I do type ENV = State [(String, LispVal)] instead?
Lisp is a functional language with mutable state and lexical scope. Imagine you've closed over a mutable variable. Now you've got a reference to this variable hanging around inside some other function -- say (in haskell-style pseudocode) (printIt, setIt) = let x = 5 in (\ () -> print x, \y -> set x y). You now have two functions -- one prints x, and one sets its value. When you evaluate printIt, you want to lookup the name of x in the initial environment in which printIt was defined, but you want to lookup the value that name is bound to in the environment in which printIt is called (after setIt may have been called any number of times).
There are ways besides the two IORefs to do this, but you certainly need more than the latter type you've proposed, which doesn't allow you to alter the values that names are bound to in a lexically-scoped fashion. Google the "funargs problem" for a whole lot of interesting prehistory.
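The printIt/setIt scenario can be mimicked in Python to show what the inner reference buys you. This is a sketch under simplifying assumptions: Ref is a made-up one-field class playing the role of the inner IORef, and a copied dictionary plays the role of an environment that maps names directly to values.

```python
class Ref:
    """Hypothetical mutable cell, standing in for an IORef."""

    def __init__(self, value):
        self.value = value


def closures_over_cells():
    env = {"x": Ref(5)}        # names map to cells
    cell = env["x"]            # both closures capture the same cell

    def print_it():
        return cell.value

    def set_it(y):
        cell.value = y

    return print_it, set_it


def closures_over_values():
    env = {"x": 5}             # names map directly to values
    captured = dict(env)       # the closure keeps its own snapshot

    def print_it():
        return captured["x"]

    def set_it(y):
        env["x"] = y           # only the original environment changes

    return print_it, set_it


print_it, set_it = closures_over_cells()
set_it(9)
print(print_it())   # 9: the update is visible through the shared cell

print_it, set_it = closures_over_values()
set_it(9)
print(print_it())   # 5: the closure still sees its stale snapshot
```

With cells, the name is resolved where the closure was defined but the value is read at call time, which is the behaviour the answer above describes.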

Can you write any algorithm without an if statement?

This site tickled my sense of humour - http://www.antiifcampaign.com/ but can polymorphism work in every case where you would use an if statement?
Smalltalk, which is considered as a "truly" object oriented language, has no "if" statement, and it has no "for" statement, no "while" statement. There are other examples (like Haskell) but this is a good one.
Quoting Smalltalk has no “if” statement:
Some of the audience may be thinking
that this is evidence confirming their
suspicions that Smalltalk is weird,
but what I’m going to tell you is
this:
An “if” statement is an abomination in an Object Oriented language.
Why? Well, an OO language is composed
of classes, objects and methods, and
an “if” statement is inescapably none
of those. You can’t write “if” in an
OO way. It shouldn’t exist.
Conditional execution, like everything
else, should be a method. A method of
what? Boolean.
Now, funnily enough, in Smalltalk,
Boolean has a method called
ifTrue:ifFalse: (that name will look
pretty odd now, but pass over it for
now). It’s abstract in Boolean, but
Boolean has two subclasses: True and
False. The method is passed two blocks
of code. In True, the method simply
runs the code for the true case. In
False, it runs the code for the false
case. Here’s an example that hopefully
explains:
(x >= 0) ifTrue: [
'Positive'
] ifFalse: [
'Negative'
]
You should be able to see ifTrue: and
ifFalse: in there. Don’t worry that
they’re not together.
The expression (x >= 0) evaluates to
true or false. Say it’s true, then we
have:
true ifTrue: [
'Positive'
] ifFalse: [
'Negative'
]
I hope that it’s fairly obvious that
that will produce ‘Positive’.
If it was false, we’d have:
false ifTrue: [
'Positive'
] ifFalse: [
'Negative'
]
That produces ‘Negative’.
OK, that’s how it’s done. What’s so
great about it? Well, in what other
language can you do this? More
seriously, the answer is that there
aren’t any special cases in this
language. Everything can be done in an
OO way, and everything is done in an
OO way.
I definitely recommend reading the whole post and Code is an object from the same author as well.
That website is against using if statements for checking if an object has a specific type. This is completely different from if (foo == 5). It's bad to use ifs like if (foo instanceof pickle). The alternative, using polymorphism instead, promotes encapsulation, making code infinitely easier to debug, maintain, and extend.
Being against ifs in general (doing a certain thing based on a condition) will gain you nothing. Notice how all the other answers here still make decisions, so what's really the difference?
Explanation of the why behind polymorphism:
Take this situation:
void draw(Shape s) {
if (s instanceof Rectangle)
//treat s as rectangle
if (s instanceof Circle)
//treat s as circle
}
It's much better if you don't have to worry about the specific type of an object, generalizing how objects are processed:
void draw(Shape s) {
s.draw();
}
This moves the logic of how to draw a shape into the shape class itself, so we can now treat all shapes the same. This way if we want to add a new type of shape, all we have to do is write the class and give it a draw method instead of modifying every conditional list in the whole program.
This idea is everywhere in programming today, the whole concept of interfaces is all about polymorphism. (Shape is an interface defining a certain behavior, allowing us to process any type that implements the Shape interface in our method.) Dynamic programming languages take this even further, allowing us to pass any type that supports the necessary actions into a method. Which looks better to you? (Python-style pseudo-code)
def multiply(a,b):
    if (a is string and b is int):
        # repeat a b times.
    if (a is int and b is int):
        # multiply a and b
or using polymorphism:
def multiply(a,b):
    return a*b
You can now use any 2 types that support the * operator, allowing you to use the method with types that haven't even been created yet.
See polymorphism and what is polymorphism.
Though not OOP-related: In Prolog, the only way to write your whole application is without if statements.
Yes actually, you can have a turing-complete language that has no "if" per se and only allows "while" statements:
http://cseweb.ucsd.edu/classes/fa08/cse200/while.html
As for OO design, it makes sense to use an inheritance pattern rather than switches based on a type field in certain cases... That's not always feasible or necessarily desirable though.
@ennuikiller: conditionals would just be a matter of syntactic sugar:
if (test) body; is equivalent to x=test; while (x) {x=nil; body;}
if-then-else is a little more verbose:
if (test) ifBody; else elseBody;
is equivalent to
x = test; y = true;
while (x) {x = nil; y = nil; ifBody;}
while (y) {y = nil; elseBody;}
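The if/else encoding above translates almost literally into Python, which makes it easy to check that each body runs at most once. This is a sketch: the function name and the use of callables for the two bodies are choices made for this example.

```python
def if_else_via_while(test, if_body, else_body):
    """Run if_body when test is truthy, else else_body, using only while loops."""
    x = test
    y = True
    result = None
    while x:
        x = None            # make sure this loop runs at most once
        y = None            # and disable the else loop
        result = if_body()
    while y:
        y = None            # make sure this loop runs at most once
        result = else_body()
    return result


print(if_else_via_while(3 > 2, lambda: "then", lambda: "else"))  # then
print(if_else_via_while(3 < 2, lambda: "then", lambda: "else"))  # else
```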
The primitive data structure is a list of lists. You could say 2 scalars are equal if they are lists of the same length. You would loop over them simultaneously using the head/tail operators and see if they stop at the same point.
Of course that could all be wrapped up in macros.
The simplest Turing-complete language is probably iota. It contains only 2 symbols ('i' and '*').
Yep. if statements imply branches, which can be very costly on a lot of modern processors, particularly PowerPC. Many modern PCs do a lot of pipeline re-ordering, so a branch misprediction can cost on the order of 30 or more cycles per miss.
On console programming it's sometimes faster to just execute the code and ignore it than check if you should execute it!
Simple branch avoidance in C:
if (++i >= 16)
{
    i = 0;
}
can be re-written as
i = (i + 1) & 15;
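One caveat with the mask trick: it only matches a wrap-around counter when the period is a power of two. `& 15` keeps the low four bits, so the counter runs 0..15 and wraps after 15, which corresponds to comparing against 16. A quick check in Python (the function names are invented for this sketch):

```python
def step_branching(i, limit=16):
    """Increment and wrap to 0 using an explicit comparison."""
    i += 1
    if i >= limit:
        i = 0
    return i


def step_masked(i):
    """Increment and wrap to 0 by keeping only the low four bits."""
    return (i + 1) & 15


# The two agree on every value the counter can take.
print(all(step_branching(i) == step_masked(i) for i in range(16)))  # True

# With a limit of 15 they disagree at i == 14.
print(step_branching(14, limit=15), step_masked(14))  # 0 15
```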
However, if you want to see some real anti-if fu then read this
Oh and on the OOP question - I'll replace a branch mis-prediction with a virtual function call? No thanks....
The reasoning behind the "anti-if" campaign is similar to what Kent Beck said:
Good code invariably has small methods and
small objects. Only by factoring the system into many small pieces of state
and function can you hope to satisfy the “once and only once” rule. I get lots
of resistance to this idea, especially from experienced developers, but no one
thing I do to systems provides as much help as breaking it into more pieces.
If you don't know how to factor a program with composition and inheritance, then your classes and methods will tend to grow bigger over time. When you need to make a change, the easiest thing will be to add an IF somewhere. Add too many IFs, and your program will become less and less maintainable, and still the easiest thing will be to add more IFs.
You don't have to turn every IF into an object collaboration; but it's a very good thing when you know how to :-)
You can define True and False with objects (in a pseudo-python):
class True:
    def if(then,else):
        return then
    def or(a):
        return True()
    def and(a):
        return a
    def not():
        return False()
class False:
    def if(then,else):
        return else
    def or(a):
        return a
    def and(a):
        return False()
    def not():
        return True()
I think it is an elegant way to construct booleans, and it proves that you can replace every if by polymorphism, but that's not the point of the anti-if campaign. The goal is to avoid writing things such as (in a pathfinding algorithm) :
if type == Block or type == Player:
    # You can't pass through this
else:
    # You can
Instead, you call an is_traversable method on each object. In a sense, that's exactly the inverse of pattern matching. "if" is useful, but in some cases it is not the best solution.
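A minimal sketch of what that could look like in Python. The Tile, Floor, Block and Player classes and the passable_flags helper are hypothetical, invented for this example:

```python
class Tile:
    """Base class: by default a tile can be walked through."""

    def is_traversable(self) -> bool:
        return True


class Floor(Tile):
    pass  # inherits the default: traversable


class Block(Tile):
    def is_traversable(self) -> bool:
        return False


class Player(Tile):
    def is_traversable(self) -> bool:
        return False


def passable_flags(tiles):
    """Ask each tile directly; the pathfinder never inspects its type."""
    return [tile.is_traversable() for tile in tiles]
```

Adding a new kind of obstacle then means adding a class, not touching the pathfinding code.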
I assume you are actually asking about replacing if statements that check types, as opposed to replacing all if statements.
To replace an if with polymorphism requires a method in a common supertype you can use for dispatching, either by overriding it directly, or by reusing overridden methods as in the visitor pattern.
But what if there is no such method, and you can't add one to a common supertype because the super types are not maintained by you? Would you really go to the lengths of introducing a new supertype along with subtypes just to get rid of a single if? That would be taking purity a bit far in my opinion.
Also, both approaches (direct overriding and the visitor pattern) have their disadvantages: Overriding the method directly requires that you implement your method in the classes you want to switch on, which might not help cohesion. On the other hand, the visitor pattern is awkward if several cases share the same code. With an if you can do:
if (o instanceof OneType || o instanceof AnotherType) {
// complicated logic goes here
}
How would you share the code with the visitor pattern? Call a common method? Where would you put that method?
So no, I don't think replacing such if statements is always an improvement. It often is, but not always.
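For comparison, here is how the "types I don't own, several cases share one body" situation might look in Python (not the language of the snippet above) using functools.singledispatch from the standard library. The describe function and its handlers are hypothetical names for this sketch. It is still a type switch under the hood, just moved out of the if.

```python
from decimal import Decimal
from functools import singledispatch


@singledispatch
def describe(value) -> str:
    """Fallback for types with no registered handler."""
    return "something else"


# Stacked registrations: two unrelated types share one implementation,
# and neither class had to be modified.
@describe.register(float)
@describe.register(Decimal)
def _describe_fractional(value) -> str:
    return "a fractional number"


@describe.register(str)
def _describe_text(value) -> str:
    return "text"
```

Whether this reads better than a plain if with two type checks is exactly the judgment call described above.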
I used to write a lot of code the way the anti-if campaign recommends, using either callbacks in a delegate dictionary or polymorphism.
It's quite a beguiling argument, especially if you are dealing with messy code bases, but to be honest, although it's great for a plugin model or for simplifying large nested if statements, it does make navigating and reading the code a bit of a pain.
For example, F12 (Go To Definition) in Visual Studio will take you to an abstract class (or, in my case, an interface definition).
It also makes quick visual scanning of a class very cumbersome, and adds an overhead in setting up the delegates and lookup hashes.
Applying the anti-if campaign's recommendations as widely as it appears to suggest looks like 'ooh, new shiny thing' programming to me.
As for the other constructs put forward in this thread, although they were offered in the spirit of a fun challenge, they are just substitutes for an if statement and don't really address the underlying beliefs of the anti-if campaign.
You can avoid ifs in your business logic code if you keep them in your construction code (factories, builders, providers etc.). Your business logic code will then be much more readable and easier to understand, maintain and extend. See: http://www.youtube.com/watch?v=4F72VULWFvc
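A small sketch of that split in Python. The pricing classes, the make_pricing factory and the 10% discount are all assumptions made up for the example. Amounts are integer cents to keep the arithmetic exact.

```python
class StandardPricing:
    def price(self, cents: int) -> int:
        return cents


class MemberPricing:
    def price(self, cents: int) -> int:
        return cents * 90 // 100  # hypothetical 10% member discount


def make_pricing(is_member: bool):
    """Construction code: the only place that branches on the flag."""
    if is_member:
        return MemberPricing()
    return StandardPricing()


class Checkout:
    """Business logic: no conditionals, it just uses what it was given."""

    def __init__(self, pricing):
        self._pricing = pricing

    def total(self, amounts):
        return sum(self._pricing.price(a) for a in amounts)
```

For example, Checkout(make_pricing(True)).total([1000, 2000]) runs without a single if inside Checkout; the decision was made once, when the object graph was built.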
Haskell doesn't even have if statements, being pure functional. ;D
You can do it without if per se, but you can't do it without a mechanism that allows you to make a decision based on some condition.
In assembly, there's no if statement. There are conditional jumps.
In Haskell, for instance, you rarely need an explicit if. Instead, you can define a function with several clauses, something like this:
posNeg :: Int -> String
posNeg x | x < 0 = "negative"
posNeg 0 = "zero"
posNeg _ = "positive"
When you call posNeg(a), the interpreter will look at the value of a, if it's < 0 then it will choose the first definition, if it's == 0 then it will choose the second definition, otherwise it will default to the third definition.
So while languages like Haskell and Smalltalk don't have the usual C-style if statement, they have other means of allowing you to make decisions.
This is actually a coding game I like to play with programming languages. It's called "if we had no if" which has its origins at: http://wiki.tcl.tk/4821
Basically, if we disallow the use of conditional constructs in the language (no if, no while, no for, no unless, no switch etc.), can we recreate our own IF function? The answer depends on the language and what language features we can exploit (remember, using regular conditional constructs is cheating, so no ternary operators!)
For example, in Tcl, a function name is just a string and any string (including the empty string) is allowed for anything (function names, variable names etc.). So, exploiting this we can do:
proc 0 {true false} {uplevel 1 $false; # execute false code block, ignore true}
proc 1 {true false} {uplevel 1 $true; # execute true code block, ignore false}
proc _IF {boolean true false} {
$boolean $true $false
}
#usage:
_IF [expr {1<2}] {
puts "this is true"
} {
#else:
puts "this is false"
}
Or in JavaScript we can abuse the loose typing and the fact that almost anything can be cast into a string, and combine that with its functional nature:
function fail (discard,execute) {execute()}
function pass (execute,discard) {execute()}
var truth_table = {
'false' : fail,
'true' : pass
}
function _IF (expr) {
return truth_table[!!expr];
}
//usage:
_IF(3==2)(
function(){alert('this is true')},
//else
function(){alert('this is false')}
);
Not all languages can do this sort of thing. But languages I like tend to be able to.
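Python can play the same game, as a rough sketch: a dictionary keyed on bool(condition) picks which of two callbacks to run, so no conditional construct appears anywhere. The names if_, _run_first and _run_second are made up for this example.

```python
def _run_first(execute, discard):
    """Chosen when the condition is true: run the 'then' callback."""
    return execute()


def _run_second(discard, execute):
    """Chosen when the condition is false: run the 'else' callback."""
    return execute()


# bool(condition) is the dictionary key, so no if statement is needed.
_TRUTH_TABLE = {True: _run_first, False: _run_second}


def if_(condition):
    """Return a function that takes (then_callback, else_callback)."""
    return _TRUTH_TABLE[bool(condition)]
```

Usage mirrors the JavaScript version: if_(3 == 2)(lambda: "this is true", lambda: "this is false") runs only the second lambda.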
The idea of polymorphism is to call a method on an object without first having to check the class of that object.
That doesn't mean the if statement should not be used at all; rather, you should avoid writing code like this:
if (object.isArray()) {
    // Code to execute when the object is an array.
} else if (object.isString()) {
    // Code to execute if the object is a string.
}
It depends on the language.
Statically typed languages should be able to handle all of the type checking by sharing common interfaces and overloading functions/methods.
Dynamically typed languages might need to approach the problem differently since type is not checked when a message is passed, only when an object is being accessed (more or less). Using common interfaces is still good practice and can eliminate many of the type checking if statements.
While some constructs are usually a sign of code smell, I am hesitant to eliminate any approach to a problem a priori. There may be times when type checking via if is the expedient solution.
Note: Others have suggested using switch instead, but that is just a clever way of writing more legible if statements.
Well, if you're writing in Perl, it's easy!
Instead of
if (x) {
# ...
}
you can use
unless (!x){
# ...
}
;-)
In answer to the question, and as suggested by the last respondent, you need some if statements to detect state in a factory. At that point you then instantiate a set of collaborating classes that solve the state specific problem. Of course, other conditionals would be required as needed, but they would be minimized.
What would be removed of course would be the endless procedural state checking rife in so much service based code.
Interesting that Smalltalk is mentioned, as that's the language I used before being dragged across into Java. I don't get home as early as I used to.
I thought about adding my two cents: you can optimize away ifs in many languages where the second part of a boolean expression is not evaluated when it won't affect the result.
With the and operator, if the first operand evaluates to false, then there is no need to evaluate the second one. With the or operator, it's the opposite - there's no need to evaluate the second operand if the first one is true. Some languages always behave like that, others offer an alternative syntax.
Here's an if / else if / else chain written in JavaScript using only operators and anonymous functions.
document.getElementById("myinput").addEventListener("change", function(e) {
(e.target.value == 1 && !function() {
alert('if 1');
}()) || (e.target.value == 2 && !function() {
alert('else if 2');
}()) || (e.target.value == 3 && !function() {
alert('else if 3');
}()) || (function() {
alert('else');
}());
});
<input type="text" id="myinput" />
This makes me want to try defining an esoteric language where blocks implicitly behave like self-executing anonymous functions and return true, so that you would write it like this:
(condition && {
action
}) || (condition && {
action
}) || {
action
}
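The same short-circuit chain can be sketched in Python, where and / or also skip the second operand when the first one already decides the result. The run helper plays the role of the negated self-executing function in the JavaScript version: it performs the action and returns True so that the rest of the or chain is skipped. Both run and classify are hypothetical names for this example.

```python
def run(action):
    """Call action, then return True so the surrounding `or` chain stops."""
    action()
    return True


def classify(value, log):
    """if / elif / else built only from `and` / `or` short-circuiting."""
    (
        (value == 1 and run(lambda: log.append("if 1")))
        or (value == 2 and run(lambda: log.append("else if 2")))
        or (value == 3 and run(lambda: log.append("else if 3")))
        or run(lambda: log.append("else"))
    )
    return log
```

Only one branch ever runs: classify(2, []) appends just "else if 2", and any value other than 1, 2 or 3 falls through to the final "else".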