Achieving polymorphism in functional programming

I'm currently enjoying the transition from an object oriented language to a functional language. It's a breath of fresh air, and I'm finding myself much more productive than before.
However - there is one aspect of OOP that I've not yet seen a satisfactory answer for on the FP side, and that is polymorphism. I.e. I have a large collection of data items, which need to be processed in quite different ways when they are passed into certain functions. For the sake of argument, let's say that there are multiple factors driving polymorphic behaviour, so there are potentially exponentially many different behaviour combinations.
In OOP that can be handled relatively well using polymorphism: either through composition+inheritance or a prototype-based approach.
In FP I'm a bit stuck between:
Writing or composing pure functions that effectively implement polymorphic behaviours by branching on the value of each data item - feels rather like assembling a huge conditional or even simulating a virtual method table!
Putting functions inside pure data structures in a prototype-like fashion - this seems like it works but doesn't it also violate the idea of defining pure functions separately from data?
What are the recommended functional approaches for this kind of situation? Are there other good alternatives?

Putting functions inside pure data structures in a prototype-like fashion - this seems like it works but doesn't it also violate the idea of defining pure functions separately from data?
If virtual method dispatch is the way you want to approach the problem, this is a perfectly reasonable approach. As for separating functions from data, that is a distinctly non-functional notion to begin with. I consider the fundamental principle of functional programming to be that functions ARE data. And as for your feeling that you're simulating a virtual function, I would argue that it's not a simulation at all. It IS a virtual function table, and that's perfectly OK.
Just because the language doesn't have OOP support built in doesn't mean it's not reasonable to apply the same design principles - it just means you'll have to hand-build more of the machinery that other languages provide out of the box, because you're working against the natural grain of the language you're using. Modern typed functional languages do have very deep support for polymorphism, but it's a very different approach to polymorphism.
Polymorphism in OOP is a lot like "existential quantification" in logic - a polymorphic value has SOME run-time type but you don't know what it is. In many functional programming languages, polymorphism is more like "universal quantification" - a polymorphic value can be instantiated to ANY compatible type its user wants. They're two sides of the exact same coin (in particular, they swap places depending on whether you're looking at a function from the "inside" or the "outside"), but it turns out to be extremely hard when designing a language to "make the coin fair", especially in the presence of other language features such as subtyping or higher-kinded polymorphism (polymorphism over polymorphic types).
If it helps, you may want to think of polymorphism in functional languages as something very much like "generics" in C# or Java, because that's exactly the type of polymorphism that, e.g., ML and Haskell favor.
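To make the two quantifiers concrete, here is a minimal Haskell sketch (the names are illustrative, not from any library): pairWith is universally quantified - the caller picks the type - while Showable hides an existentially quantified type, much like an OO interface reference does.

{-# LANGUAGE ExistentialQuantification #-}

-- Universal quantification: the CALLER chooses the type (generics-style).
pairWith :: a -> a -> (a, a)
pairWith x y = (x, y)

-- Existential quantification: the value carries SOME hidden type
-- that we only know supports show - closer to OO-style polymorphism.
data Showable = forall a. Show a => MkShowable a

describe :: Showable -> String
describe (MkShowable x) = show x

main :: IO ()
main = mapM_ (putStrLn . describe) [MkShowable (42 :: Int), MkShowable True]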

Well, in Haskell you can always make a type class to achieve a kind of polymorphism. Basically, it means defining functions that behave differently for different types. Examples are the classes Eq and Show:
data Foo = Bar | Baz

instance Show Foo where
  show Bar = "bar"
  show Baz = "baz"

main = putStrLn $ show Bar
The function show :: (Show a) => a -> String is defined for every data type that is an instance of the typeclass Show. The compiler finds the correct function for you, depending on the type.
This allows you to define functions more generally, for example:
lessThan a b = a < b
will work with any type in the typeclass Ord. (The name lessThan avoids a clash with the Prelude's own compare, which returns an Ordering.) This is not exactly like OOP, but you may even "inherit" typeclasses, like so:
class (Show a) => Combinator a where
  combine :: a -> a -> String
It is up to the instance to define the actual function; you only declare its type signature - similar to virtual functions.
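For instance, a hypothetical instance for the Foo type from the first snippet might look like this; the compiler then picks this combine whenever both arguments are Foo values:

instance Combinator Foo where
  combine x y = show x ++ " & " ++ show y

-- combine Bar Baz evaluates to "bar & baz"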
This is not a complete solution, and as far as I know, many FP languages do not feature type classes. OCaml does not; it pushes that job over to its OOP side. And Scheme has no static types at all. But in Haskell it is a powerful way to achieve a kind of polymorphism, within limits.
To go even further, newer extensions beyond the Haskell 2010 standard allow type families and the like.
Hope this helped you a bit.

Who said
defining pure functions separately from data
is best practice?
If you want polymorphic objects, you need objects. In a functional language, objects can be constructed by gluing together a set of "pure data" with a set of "pure functions" operating on that data. This works even without the concept of a class. In this sense, a class is nothing but a piece of code that constructs objects with the same set of associated "pure functions".
And polymorphic objects are constructed by replacing some of those functions of an object by different functions with the same signature.
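As a minimal Haskell sketch of this idea (the Counter type and its fields are hypothetical, purely for illustration): the "object" is a record gluing data to a function, and "polymorphism" is obtained by swapping in a different function with the same signature.

data Counter = Counter
  { value :: Int          -- the "pure data"
  , step  :: Int -> Int   -- the replaceable behaviour
  }

tick :: Counter -> Counter
tick c = c { value = step c (value c) }

plainCounter :: Counter
plainCounter = Counter { value = 0, step = (+ 1) }

-- a "polymorphic" variant: same interface, different behaviour
doubleCounter :: Counter
doubleCounter = plainCounter { step = (+ 2) }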
If you want to learn more about how to implement objects in a functional language (like Scheme), have a look into this book:
Abelson / Sussman: "Structure and Interpretation of Computer Programs"

Mike, both your approaches are perfectly acceptable, and the pros and cons of each are discussed, as Doc Brown says, in Chapter 2 of SICP. The first suffers from having a big type table somewhere, which needs to be maintained. The second is just traditional single-dispatch polymorphism/virtual function tables.
The reason that Scheme doesn't have a built-in object system is that using the wrong object system for the problem leads to all sorts of trouble, so if you're the language designer, which one do you choose? Single-dispatch, single-inheritance OO won't deal well with 'multiple factors driving polymorphic behaviour so potentially exponentially many different behaviour combinations.'
To synopsize, there are many ways of constructing objects, and Scheme, the language discussed in SICP, just gives you a basic toolkit from which you can construct the one you need.
In a real Scheme program, you'd build your object system by hand and then hide the associated boilerplate with macros.
In Clojure you actually have a prebuilt object/dispatch system built in with multimethods, and one of its advantages over the traditional approach is that it can dispatch on the types of all arguments. You can (apparently) also use the hierarchy system to give you inheritance-like features, although I've never used it, so you should take that cum grano salis.
But if you need something different from the object scheme chosen by the language designer, you can just make one (or several) that suits.
That's effectively what you're proposing above.
Build what you need, get it all working, hide the details with macros.
The argument between FP and OO is not about whether data abstraction is bad, it's about whether the data abstraction system is the place to stuff all the separate concerns of the program.
"I believe that a programming language should allow one to define new data types. I do not believe that a program should consist solely of definitions of new data types."

http://www.haskell.org/haskellwiki/OOP_vs_type_classes#Everything_is_an_object.3F nicely discusses some solutions.


"Many functions operating upon few abstractions" principle vs OOP

The creator of the Clojure language claims that "open, and large, set of functions operate upon an open, and small, set of extensible abstractions is the key to algorithmic reuse and library interoperability". Obviously it contradicts the typical OOP approach where you create a lot of abstractions (classes) and a relatively small set of functions operating on them. Please suggest a book, a chapter in a book, an article, or your personal experience that elaborate on the topics:
motivating examples of problems that appear in OOP and how using "many functions upon few abstractions" would address them
how to effectively do MFUFA* design
how to refactor OOP code towards MFUFA
how OOP languages' syntax gets in the way of MFUFA
*MFUFA: "many functions upon few abstractions"
There are two main notions of "abstraction" in programming:
1. parameterisation ("polymorphism", genericity),
2. encapsulation (data hiding).
[Edit: These two are duals. The first is client-side abstraction, the second implementer-side abstraction (and in case you care about these things: in terms of formal logic or type theory, they correspond to universal and existential quantification, respectively).]
In OO, the class is the kitchen sink feature for achieving both kinds of abstraction.
Ad (1), for almost every "pattern" you need to define a custom class (or several). In functional programming on the other hand, you often have more lightweight and direct methods to achieve the same goals, in particular, functions and tuples. It is often pointed out that most of the "design patterns" from the GoF are redundant in FP, for example.
Ad (2), encapsulation is needed a little bit less often if you don't have mutable state lingering around everywhere that you need to keep in check. You still build ADTs in FP, but they tend to be simpler and more generic, and hence you need fewer of them.
When you write a program in object-oriented style, you put the emphasis on expressing the domain in terms of data types. At first glance this looks like a good idea - if we work with users, why not have a class User? And if users sell and buy cars, why not have a class Car? This way we can easily maintain data and control flow - it just reflects the order of events in the real world. While this is quite convenient for domain objects, for many internal objects (i.e. objects that do not reflect anything from the real world, but occur only in program logic) it is not so good. Maybe the best example is the number of collection types in Java. In Java (and many other OOP languages) there are both arrays and Lists. In JDBC there's ResultSet, which is also a kind of collection, but it doesn't implement the Collection interface. For input you will often use an InputStream, which provides an interface for sequential access to data - just like a linked list! - yet it doesn't implement any kind of collection interface either. Thus, if your code works with a database and uses ResultSet, it will be harder to refactor it for text files and InputStream.
The MFUFA principle teaches us to pay less attention to type definitions and more to common abstractions. For this reason Clojure introduces a single abstraction for all the types mentioned - the sequence. Any iterable is automatically coerced to a sequence, streams are just lazy lists, and a result set may easily be transformed to one of the previous types.
Another example is using the PersistentMap interface for structs and records. With such common interfaces it becomes very easy to create reusable subroutines and not spend lots of time on refactoring.
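The same principle can be sketched in Haskell (a hedged illustration; the total function is hypothetical): one small abstraction, Foldable, lets a single function serve lists, Maybe, and maps alike.

import qualified Data.Map as Map

-- One function over the small Foldable abstraction serves many
-- concrete collection types: "many functions upon few abstractions".
total :: (Foldable t, Num a) => t a -> a
total = sum

main :: IO ()
main = do
  print (total [1, 2, 3])                        -- a list
  print (total (Just 4))                         -- a Maybe
  print (total (Map.fromList [(1, 5), (2, 6)]))  -- a map (folds over values)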
To summarize and answer your questions:
One simple example of an issue that appears in OOP frequently: reading data from many different sources (e.g. DB, file, network, etc.) and processing it in the same way.
To make a good MFUFA design, try to make abstractions as common as possible and avoid ad-hoc implementations. E.g. avoid types a la UserList - List<User> is good enough in most cases.
Follow the suggestions from point 2. In addition, try to add as many interfaces to your data types (classes) as possible. For example, if you really need to have UserList (e.g. when it should have a lot of additional functionality), add both the List and Iterable interfaces to its definition.
OOP (at least in Java and C#) is not very well suited to this principle, because these languages try to encapsulate the whole object's behavior during initial design, so it becomes hard to add more functions later. In most cases you can extend the class in question and put the methods you need into a new object, but 1) if somebody else implements their own derived class, it will not be compatible with yours; 2) sometimes classes are final or all fields are made private, so derived classes don't have access to them (e.g. to add new functions to the class String, one has to implement an additional class StringUtils). Nevertheless, the rules I described above make it much easier to apply MFUFA in OOP code. And the best example here is Clojure itself, which is gracefully implemented in OO style but still follows the MFUFA principle.
UPD. I remember another description of the difference between the object-oriented and functional styles that perhaps summarizes all I said above better: designing a program in OO style is thinking in terms of data types (nouns), while designing in functional style is thinking in terms of operations (verbs). You may forget that some nouns are similar (e.g. forget about inheritance), but you should always remember that many verbs in practice do the same thing (e.g. have the same or similar interfaces).
A much earlier version of the quote:
"The simple structure and natural applicability of lists are reflected in functions that are amazingly nonidiosyncratic. In Pascal the plethora of declarable data structures induces a specialization within functions that inhibits and penalizes casual cooperation. It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures."
...comes from the foreword to the famous SICP book. I believe this book has a lot of applicable material on this topic.
I think you're not getting that there's a difference between libraries and programmes.
OO libraries which work well usually generate a small number of abstractions, which programmes use to build the abstractions for their domain. Larger OO libraries (and programmes) use inheritance to create different versions of methods and introduce new methods.
So, yes, the same principle applies to OO libraries.

Object-oriented programming in a purely functional programming context?

Are there any advantages to using object-oriented programming (OOP) in a functional programming (FP) context?
I have been using F# for some time now, and I noticed that the more my functions are stateless, the less I need to have them as methods of objects. In particular, there are advantages to relying on type inference to have them usable in as wide a number of situations as possible.
This does not preclude the need for namespaces of some form, which is orthogonal to being OOP. Nor is the use of data structures discouraged. In fact, real use of FP languages depends heavily on data structures. If you look at the F# stack implemented in F Sharp Programming/Advanced Data Structures, you will find that it is not object-oriented.
In my mind, OOP is heavily associated with having methods that act on the state of the object mostly to mutate the object. In a pure FP context that is not needed nor desired.
A practical reason may be to be able to interact with OOP code, in much the same way F# works with .NET. Other than that however, are there any reasons? And what is the experience in the Haskell world, where programming is more pure FP?
I will appreciate any references to papers or counterfactual real world examples on the issue.
The disconnect you see is not of FP vs. OOP. It's mostly about immutability and mathematical formalisms vs. mutability and informal approaches.
First, let's dispense with the mutability issue: you can have FP with mutability and OOP with immutability just fine. Even more-functional-than-thou Haskell lets you play with mutable data all you want, you just have to be explicit about what is mutable and the order in which things happen; and efficiency concerns aside, almost any mutable object could construct and return a new, "updated" instance instead of changing its own internal state.
The bigger issue here is mathematical formalisms, in particular heavy use of algebraic data types in a language little removed from lambda calculus. You've tagged this with Haskell and F#, but realize that's only half of the functional programming universe; the Lisp family has a very different, much more freewheeling character compared to ML-style languages. Most OO systems in wide use today are very informal in nature--formalisms do exist for OO but they're not called out explicitly the way FP formalisms are in ML-style languages.
Many of the apparent conflicts simply disappear if you remove the formalism mismatch. Want to build a flexible, dynamic, ad-hoc OO system on top of a Lisp? Go ahead, it'll work just fine. Want to add a formalized, immutable OO system to an ML-style language? No problem, just don't expect it to play nicely with .NET or Java.
Now, you may be wondering, what is an appropriate formalism for OOP? Well, here's the punch line: In many ways, it's more function-centric than ML-style FP! I'll refer back to one of my favorite papers for what seems to be the key distinction: structured data like algebraic data types in ML-style languages provide a concrete representation of the data and the ability to define operations on it; objects provide a black-box abstraction over behavior and the ability to easily replace components.
There's a duality here that goes deeper than just FP vs. OOP: It's closely related to what some programming language theorists call the Expression Problem: With concrete data, you can easily add new operations that work with it, but changing the data's structure is more difficult. With objects you can easily add new data (e.g., new subclasses) but adding new operations is difficult (think adding a new abstract method to a base class with many descendants).
The reason why I say that OOP is more function-centric is that functions themselves represent a form of behavioral abstraction. In fact, you can simulate OO-style structure in something like Haskell by using records holding a bunch of functions as objects, letting the record type be an "interface" or "abstract base class" of sorts, and having functions that create records replace class constructors. So in that sense, OO languages use higher-order functions far, far more often than, say, Haskell would.
For an example of something like this type of design actually put to very nice use in Haskell, read the source for the graphics-drawingcombinators package, in particular the way that it uses an opaque record type containing functions and combines things only in terms of their behavior.
EDIT: A few final things I forgot to mention above.
If OO indeed makes extensive use of higher-order functions, it might at first seem that it should fit very naturally into a functional language such as Haskell. Unfortunately this isn't quite the case. It is true that objects as I described them (cf. the paper mentioned in the LtU link) fit just fine. In fact, the result is a purer OO style than in most OO languages, because "private members" are represented by values hidden by the closure used to construct the "object" and are inaccessible to anything other than the one specific instance itself. You don't get much more private than that!
What doesn't work very well in Haskell is subtyping. And, although I think inheritance and subtyping are all too often misused in OO languages, some form of subtyping is quite useful for being able to combine objects in flexible ways. Haskell lacks an inherent notion of subtyping, and hand-rolled replacements tend to be exceedingly clumsy to work with.
As an aside, most OO languages with static type systems make a complete hash of subtyping as well by being too lax with substitutability and not providing proper support for variance in method signatures. In fact, I think the only full-blown OO language that hasn't screwed it up completely, at least that I know of, is Scala (F# seemed to make too many concessions to .NET, though at least I don't think it makes any new mistakes). I have limited experience with many such languages, though, so I could definitely be wrong here.
On a Haskell-specific note, its "type classes" often look tempting to OO programmers, to which I say: Don't go there. Trying to implement OOP that way will only end in tears. Think of type classes as a replacement for overloaded functions/operators, not OOP.
As for Haskell, classes are less useful there because some OO features are more easily achieved in other ways.
Encapsulation or "data hiding" is frequently done through function closures or existential types, rather than private members. For example, here is a data type of random number generators with encapsulated state. The RNG contains a method to generate values and a seed value. Because the type 'seed' is encapsulated, the only thing you can do with it is pass it to the method.

{-# LANGUAGE GADTs #-}

data RNG a where
  RNG :: (seed -> (a, seed)) -> seed -> RNG a
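A small usage sketch (the function name next is hypothetical): we can step the generator and repack it, but the hidden seed type never escapes.

next :: RNG a -> (a, RNG a)
next (RNG gen seed) =
  let (x, seed') = gen seed
  in (x, RNG gen seed')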
Dynamic method dispatch in the context of parametric polymorphism or "generic programming" is provided by type classes (which are not OO classes). A type class is like an OO class's virtual method table. However, there's no data hiding. Type classes do not "belong" to a data type the way that class methods do.
data Coordinate = C Int Int

instance Eq Coordinate where
  C a b == C d e = a == d && b == e
Dynamic method dispatch in the context of subtyping polymorphism or "subclassing" is almost a translation of the class pattern in Haskell using records and functions.
-- An "abstract base class" with two "virtual methods"
data Object =
Object
{ draw :: Image -> IO ()
, translate :: Coord -> Object
}
-- A "subclass constructor"
circle center radius = Object draw_circle translate_circle
where
-- the "subclass methods"
translate_circle center radius offset = circle (center + offset) radius
draw_circle center radius image = ...
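A hedged usage sketch (Coord and Image are the assumed types from the snippet above): callers program against the Object interface and never see the concrete circle data.

moveAndDraw :: Object -> Coord -> Image -> IO ()
moveAndDraw obj offset img = draw (translate obj offset) img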
I think that there are several ways of understanding what OOP means. For me, it is not about encapsulating mutable state, but more about organizing and structuring programs. This aspect of OOP can be used perfectly fine in conjunction with FP concepts.
I believe that mixing the two concepts in F# is a very useful approach - you can associate immutable state with operations working on that state. You'll get the nice features of 'dot' completion for identifiers, the ability to easily use F# code from C#, etc., but you can still make your code perfectly functional. For example, you can write something like:
type GameWorld(characters) =
    let calculateSomething character =
        // ...
    member x.Tick() =
        let newCharacters = characters |> Seq.map calculateSomething
        GameWorld(newCharacters)
In the beginning, people don't usually declare types in F# - you can start just by writing functions and later evolve your code to use them (when you understand the domain better and know the best way to structure the code). The above example:
Is still purely functional (the state is a list of characters and it is not mutated)
It is object-oriented - the only unusual thing is that all methods return a new instance of "the world"

Theory behind object oriented programming

Alonzo Church's lambda calculus is the mathematical theory behind functional languages. Does object-oriented programming have some formal theory?
Object orientation comes from psychology, not math.
If you think about it, it resembles how humans think more than how computers work.
We think in objects that we class-ify. For instance, this table is a piece of furniture.
Take Jean Piaget (1896-1980), who worked on a theory of children's cognitive development.
Wikipedia says:
Piaget also had a considerable effect in the field of computer science and artificial intelligence.
Some cognitive concepts he identified (that map onto object-orientation concepts):
Classification The ability to group objects together on the basis of common features.
Class Inclusion The understanding, more advanced than simple classification, that some classes or sets of objects are also sub-sets of a larger class. (E.g. there is a class of objects called dogs. There is also a class called animals. But all dogs are also animals, so the class of animals includes that of dogs)
Read more: Piaget's developmental theory, http://www.learningandteaching.info/learning/piaget.htm
OOP is a bit of a mixed bag of features that various languages implement in slightly different ways. There is no single formal definition of OOP but a number of people have tried to describe OOP based on the common features of languages that claim to be object oriented. From Wikipedia:
Benjamin C. Pierce and some other researchers view as futile any attempt to distill OOP to a minimal set of features. He nonetheless identifies fundamental features that support the OOP programming style in most object-oriented languages:
Dynamic dispatch – when a method is invoked on an object, the object itself determines what code gets executed by looking up the method at run time in a table associated with the object. This feature distinguishes an object from an abstract data type (or module), which has a fixed (static) implementation of the operations for all instances. It is a programming methodology that gives modular component development while at the same time being very efficient.
Encapsulation (or multi-methods, in which case the state is kept separate)
Subtype polymorphism
object inheritance (or delegation)
Open recursion – a special variable (syntactically it may be a keyword), usually called this or self, that allows a method body to invoke another method body of the same object. This variable is late-bound; it allows a method defined in one class to invoke another method that is defined later, in some subclass thereof.
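The "open recursion" feature above can be sketched even in Haskell, by making methods take an explicit self (a hedged illustration; all names are hypothetical). Because speak calls sound through self, overriding sound in dog changes what the inherited speak does - exactly the late binding described above.

data AnimalMethods = AnimalMethods
  { sound :: AnimalMethods -> String
  , speak :: AnimalMethods -> String
  }

base :: AnimalMethods
base = AnimalMethods
  { sound = \_    -> "..."
  , speak = \self -> "I say " ++ sound self self  -- late-bound call via self
  }

dog :: AnimalMethods
dog = base { sound = \_ -> "woof" }

main :: IO ()
main = putStrLn (speak dog dog)   -- prints "I say woof"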
Abadi and Cardelli have written A Theory of Objects; you might want to look into that. Another exposition is given in the venerable TAPL (IIRC, they approach objects as recursive records in a typed lambda calculus). I don't really know much about this stuff.
One formal definition I've run into for strongly defining and constraining subtyping is the Liskov Substitution Principle. It is certainly not all of object-oriented programming, but yet it might serve as a link into the formal foundations in development.
I'd check out Wikipedia's page on OO, http://en.wikipedia.org/wiki/Object-oriented_programming - it's got the principles, the fundamentals, and the history.
My understanding is that it was an evolutionary progression of features and ideas in a variety of languages that finally came together with the push in the 90s for GUIs going mainstream. But I could be horribly wrong :-D
Edit: What's even more interesting is that people still argue about "what makes an OO language OO". I'm not sure the feature set that defines an OO language is even generally agreed upon.
The history (simplified) goes this way:
First came the spaghetti code.
Then came the procedural code (like C and Pascal).
Then came modular code (like in Modula).
Then came the object-oriented code (like in Smalltalk).
What's the purpose of object-oriented programming?
You can only understand it if you recall the history.
At first, code was simply a sequence of instructions given to the computer (literally in binary representation).
Then came the macro assemblers, with mnemonics for instructions.
Then people noticed that sometimes you have code that is repeated around.
So they created GOTO. But GOTO (or branch, or jump, etc.) cannot return to where it was called from, cannot give direct return values, and cannot accept formal parameters (you had to use global variables).
To address the first problem, people created subroutines (GOSUB-like): groups of instructions that could be called repeatedly and would return to where they were called from.
Then people noticed that routines would be more useful if they had parameters and could return values.
For this they created functions, procedures, and calling conventions. Those abstractions were built on top of another abstraction: the stack.
The stack allows formal parameters, return values, and something called recursion (direct or indirect).
With the stack and the ability for a function to be called arbitrarily (even indirectly) came procedural programming, solving the GOTO problem.
But then came the large projects, and the need to group procedures into logical entities (modules).
That's where you will understand why object-oriented programming evolved.
When you have a module, you have module-local variables.
Think about this:
Module MyScreenModule;
  Var X, Y : Integer;
  Procedure SetX(value : Integer);
  Procedure SetY(value : Integer);
End Module.
There are X and Y variables that are local to that module. In this example, X and Y hold the position of the cursor. But let's suppose your computer has more than one screen. So what can we do now? X and Y alone aren't able to hold the X and Y values of all the screens you have. You need a way to INSTANTIATE that information. That's where modular programming makes the jump to object-oriented programming.
In a non-object-oriented language you would usually do:
Var Screens : Array of Record X, Y : Integer End;
And then pass an index value to each module call:
Procedure SetX(ScreenID : Integer; X : Integer);
Procedure SetY(ScreenID : Integer; Y : Integer);
Here ScreenID identifies which of the multiple screens you are talking about.
Object orientation inverts the relationship. Instead of a module with multiple data instances (effectively what ScreenID does here), you make the data the first-class citizen and attach code to it, like this:
Class MyScreenModule;
  Field X, Y : Integer;
  Procedure SetX(value : Integer);
  Procedure SetY(value : Integer);
End Class.
It's almost the same thing as a module!
Now you instantiate it by providing an implicit pointer to an instance, like:
ScreenNumber1 := New MyScreenModule;
And then proceed to use it:
ScreenNumber1::SetX(100);
You effectively turned your modular programming into multi-instance programming, where the variable holding the module pointer itself differentiates each instance. Gotcha?
It's an evolution of the abstraction.
So now you have multiple instances; what's the next level?
Polymorphism, etc. The rest is pretty standard object-oriented material.
What's the point? The point is that object orientation (like procedures, like subroutines, etc.) did not evolve from a theoretical standpoint but from the praxis of many coders working over decades. It's an evolution of computer programming, a continual evolution.
IMO a good example of what makes a successful OO language can be found by comparing the similarities between Java and C#. Both are extremely popular and very similar. Though I think the general idea of a minimal OO language can be found by looking at Simula 67.
I believe the general idea behind object-oriented programming is that it makes it seem like the computer thinks more like a human. This is supported by things like inheritance (both class "mountain bike" and "road bike" belong to parent class "bicycle", and have the same basic features). Another important idea is that objects (which can be executable lines of code) can be passed around like variables, effectively allowing the program to edit itself based on certain criteria (although this point is highly arguable; I cite the ability to change every instance of an object based on one comparison).
Another point to make is modularity. As entire programs can effectively be passed around like a variable (because everything is treated as an object), it becomes easier to modify large programs by simply modifying the class or method being called, and never having to modify the main method. Because of this, expanding the functionality of a program can become much simpler. This is why web businesses love languages like C# and Java (which are full-fledged OO). Gaming companies like C++ because it gives them some of the control of an imperative language (like C) and some of the features of object orientation (like C# or Java).
Object-oriented is a bit of a misnomer and I don't really think classes or type systems have much to do with OO programming. Alan Kay's inspiration was biological and what's really going on that matters is communication. A better name would be message-oriented programming. There is plenty of theory about messaging, for example Pi calculus and the actor model both have rigorous mathematical descriptions. And that is really just the tip of the iceberg.
What about Petri nets? An object might be a place, a composition an arc, messages tokens. I have not thought about it very thoroughly, so there might be some flaws I am not aware of, but you can investigate - there is a lot of theoretical work related to Petri nets.
I found this, for example:
http://link.springer.com/book/10.1007%2F3-540-45397-0
Readable PDF: http://www.informatik.uni-hamburg.de/bib/medoc/M-329.pdf
In 2000, in my degree thesis, I proposed this model; very briefly:
y(t+1) = f(u(t), x(t))
x(t+1) = g(u(t), x(t))
where:
u: input
y: output
x: state
f: output function
g: state function
Formally this is very similar to a finite state machine, but the difference is that U, Y, X are not finite sets but countably infinite ones, and f and g are Turing machines (TMs).
f and g together form a class; if we add an initial state x0, we have an object. So OOP is something more than a TM: a TM is a specific case of OOP. Note that the state x is at a different level than the state "inside" the TM. Inside the TM there are no side effects; the state x accounts for side effects.
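A hedged Haskell sketch of this model (the names Obj, mkObj, and the counter example are hypothetical): an object is an output function f, a transition function g, and an initial state x0, with the state hidden behind the interface.

newtype Obj u y = Obj { call :: u -> (y, Obj u y) }

-- Build an object from an output function f, a state function g,
-- and an initial state x0. The state x never escapes.
mkObj :: (u -> x -> y) -> (u -> x -> x) -> x -> Obj u y
mkObj f g x0 = Obj (\u -> (f u x0, mkObj f g (g u x0)))

-- Example: an accumulating counter object.
counter :: Obj Int Int
counter = mkObj (\u x -> x + u) (\u x -> x + u) 0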

What are the tell-tale signs of bad object oriented design? [closed]

When designing a new system or getting your head around someone else's code, what are some tell tale signs that something has gone wrong in the design phase? Are there clues to look for on class diagrams and inheritance hierarchies or even in the code itself that just scream for a design overhaul, particularly early in a project?
The things that mostly stick out for me are "code smells".
Mostly I'm sensitive to things that go against "good practice".
Things like:
Methods that do things other than what you'd think from the name (eg: FileExists() that silently deletes zero byte files)
A few extremely long methods (sign of an object wrapper around a procedure)
Repeated use of switch/case statements on the same enumerated member (sign of sub-classes needing extraction)
Lots of member variables that are used for processing, not to capture state (might indicate need to extract a method object)
A class that has lots of responsibilities (violation of the Single Responsibility Principle)
Long chains of member access (this.that is fine, this.that.theOther is fine, but my.very.long.chain.of.member.accesses.for.a.result is brittle)
Poor naming of classes
Use of too many design patterns in a small space
Working too hard (rewriting functions already present in the framework, or elsewhere in the same project)
Poor spelling (anywhere) and grammar (in comments), or comments that are simply misleading
I'd say the number one rule of poor OO design (and yes I've been guilty of it too many times!) is:
Classes that break the Single Responsibility Principle (SRP) and perform too many actions
Followed by:
Too much inheritance instead of composition, i.e. classes that derive from a sub-type purely so they get functionality for free. Favour Composition over Inheritance.
Impossible to unit test properly.
Anti-patterns
Software design anti-patterns
Abstraction inversion: Not exposing implemented functionality required by users, so that they re-implement it using higher-level functions
Ambiguous viewpoint: Presenting a model (usually OOAD) without specifying its viewpoint
Big ball of mud: A system with no recognizable structure
Blob: Generalization of God object from object-oriented design
Gas factory: An unnecessarily complex design
Input kludge: Failing to specify and implement handling of possibly invalid input
Interface bloat: Making an interface so powerful that it is extremely difficult to implement
Magic pushbutton: Coding implementation logic directly within interface code, without using abstraction.
Race hazard: Failing to see the consequence of different orders of events
Railroaded solution: A proposed solution that while poor, is the only one available due to poor foresight and inflexibility in other areas of the design
Re-coupling: Introducing unnecessary object dependency
Stovepipe system: A barely maintainable assemblage of ill-related components
Staralised schema: A database schema containing dual purpose tables for normalised and datamart use
Object-oriented design anti-patterns
Anemic domain model: The use of a domain model without any business logic, which is not OOP because each object should have both attributes and behaviors
BaseBean: Inheriting functionality from a utility class rather than delegating to it
Call super: Requiring subclasses to call a superclass's overridden method
Circle-ellipse problem: Subtyping variable-types on the basis of value-subtypes
Empty subclass failure: Creating a class that fails the "Empty Subclass Test" by behaving differently from a class derived from it without modifications
God object: Concentrating too many functions in a single part of the design (class)
Object cesspool: Reusing objects whose state does not conform to the (possibly implicit) contract for re-use
Object orgy: Failing to properly encapsulate objects permitting unrestricted access to their internals
Poltergeists: Objects whose sole purpose is to pass information to another object
Sequential coupling: A class that requires its methods to be called in a particular order
Singletonitis: The overuse of the singleton pattern
Yet Another Useless Layer: Adding unnecessary layers to a program, library or framework. This became popular after the first book on programming patterns.
Yo-yo problem: A structure (e.g., of inheritance) that is hard to understand due to excessive fragmentation
This question makes the assumption that object-oriented means good design. There are cases where another approach is much more appropriate.
One smell is objects having hard dependencies/references to other objects that aren't a part of their natural object hierarchy or domain related composition.
Example: Say you have a city simulation. If a Person object has a NearestPostOffice property, you are probably in trouble.
One thing I hate to see is a base class down-casting itself to a derived class. When you see this, you know you have problems.
Other examples might be:
Excessive use of switch statements
Derived classes that override everything
In my view, all OOP code degenerates to procedural code over a sufficiently long time span.
Granted, if you read my most recent question, you might understand why I am a little jaded.
The key problem with OOP is that it doesn't make it obvious that your object construction graph should be independent of your call graph.
Once you fix that problem, OOP actually starts to make sense. The problem is that very few teams are aware of this design pattern.
Here's a few:
Circular dependencies
You wish property XYZ of a base class wasn't protected/private
You wish your language supported multiple inheritance
Within a long method, sections surrounded with #region / #endregion - in almost every case I've seen, that code could easily be extracted into a new method OR needed to be refactored in some way.
Overly-complicated inheritance trees, where the sub-classes do very different things and are only tangentially related to one another.
Violation of DRY - sub-classes that each override a base method in almost exactly the same way, with only a minor variation. An example: I recently worked on some code where the subclasses each overrode a base method and where the only difference was a type test ("x is ThisType" vs "x is ThatType"). I implemented a method in the base that took a generic type T, that it then used in the test. Each child could then call the base implementation, passing the type it wanted to test against. This trimmed about 30 lines of code from each of 8 different child classes.
Duplicate code = Code that does the same thing...I think in my experience this is the biggest mistake that can occur in OO design.
Objects are good; creating a gazillion of them is bad OO design.
Having all your objects inherit from some base utility class just so you can call your utility methods without having to type so much code.
Find a programmer who is experienced with the code base. Ask them to explain how something works.
If they say "this function calls that function", their code is procedural.
If they say "this class interacts with that class", their code is OO.
The following are the most prominent features of a bad design:
Rigidity
Fragility
Immobility
Take a look at The Dependency Inversion Principle
When you don't just have a Money/Amount class, but a TrainerPrice class, a TablePrice class, an AddTablePriceAction class, and so on.
IDE-driven development or auto-complete development, combined with extremely strict typing, is a perfect storm.
This is where you see what could be variable values become class names and method names, as well as the gratuitous use of classes in general. You'll also see things like all primitives becoming objects, all literals as classes, function parameters as classes, and then conversion methods everywhere. You'll also see things like one class wrapping another to deliver a subset of its methods to a third class, inclusive of only the ones it needs at present.
This creates the possibility of generating a near-infinite amount of code, which is great if you have billable hours. When variables, contexts, properties, and states get unrolled into hyper-explicit and overly specific classes, this creates an exponential cataclysm, as sooner or later those things multiply. Think of it like [a, b] x [x, y]. This can be further compounded by an attempt to create a fully fluent interface as well as adhere to as many design patterns as possible.
OOP languages are not as polymorphic as some loosely typed languages. Loosely typed languages often offer runtime polymorphism in shallow syntax that static analysis can't handle.
In OOP you might see forms of repetition that are hard to detect automatically and that could be turned into more dynamic code using maps. Although such languages are less dynamic, you can achieve dynamic features with some extra work.
The trade-off here is that you save thousands (or millions) of lines of code while potentially losing IDE features and static analysis. Performance can go either way. Runtime polymorphism can often be converted to generated code. However, in some cases the space is so huge that anything other than runtime polymorphism is impossible.
Problems are a lot more common with OOP languages lacking generics, and when OOP programmers try to strictly type a dynamic, loosely typed language.
What happens without generics is that where you should have A for X = [Q, W, E] and Y = [R, T, Y], you instead see [AQR, AQT, AQY, AWR, AWT, AWY, AER, AET, AEY]. This is often due to fear of going typeless, or of passing the type as a variable, for fear of losing IDE support.
Traditionally, loosely typed languages are written with a text editor rather than an IDE, and the advantage lost through IDE support is often regained in other ways, such as organising and structuring code so that it is navigable.
Often IDEs can be configured to understand your dynamic code (and link into it), but few properly support it in a convenient manner.
Hint: the context here is OOP gone horrifically wrong in PHP, where people used to simple OOP Java programming have tried to apply it to PHP, which, even with some OOP support, is a fundamentally different type of language.
Designing against your platform to try to turn it into one you're used to, designing to cater to an IDE or other tools, designing to cater to supporting unit tests, etc. should all ring alarm bells, because it's a significant deviation away from designing working software to solve a given category of problems or provide a given feature set.

Why the claim that C# people don't get object-oriented programming? (vs class-oriented)

This caught my attention last night.
On the latest ALT.NET Podcast, Scott Bellware discusses how, as opposed to Ruby, languages like C#, Java et al. are not truly object-oriented, opting instead for the phrase "class-oriented". They talk about this distinction in very vague terms without going into much detail or discussing the pros and cons much.
What is the real difference here and how much does it matter? What are other languages then are "object-oriented"? It sounded pretty interesting but I don't want to have to learn Ruby just to know what if anything I am missing.
Update
After reading some of the answers below it seems like people generally agree that the reference is to duck-typing. What I'm not sure I understand still though is the claim that this ultimately changes all that much. Especially if you are already doing proper TDD with loose coupling etc. Can someone show me an example of a specific thing I could do with Ruby that I cannot do with C# and that exemplifies this different OOP approach?
In an object-oriented language, objects are defined by object definitions rather than by classes, although classes can provide some useful templates for specific, cookie-cutter definitions of a given abstraction. In a class-oriented language, like C# for example, objects must be defined by classes, and these templates are usually canned, packaged, and made immutable before runtime. This arbitrary constraint - that objects must be defined before runtime and that the definitions of objects are immutable - is not an object-oriented concept; it's class-oriented.
The duck-typing comments here are more attributable to the fact that Ruby and Python are more dynamic than C#. It doesn't really have anything to do with their OO nature.
What (I think) Bellware meant by that is that in Ruby, everything is an object. Even a class. A class definition is an instance of an object. As such, you can add/change/remove behavior to it at runtime.
Another good example is that NULL is an object as well. In Ruby, everything is LITERALLY an object. Having such deep OO in its entire being allows for some fun metaprogramming techniques such as method_missing.
IMO, it's really over-defining "object-oriented", but what they are referring to is that Ruby, unlike C#, C++, Java, et al., does not require you to define a class - you really only ever work directly with objects. Conversely, in C# for example, you define classes that you must then instantiate into objects by way of the new keyword. The key point is that in C# you must declare a class, or describe it. Additionally, in Ruby everything - even a number, for example - is an object. In contrast, C# still retains the concept of an object type and a value type. This, I think, illustrates the point they make about C# and other similar languages: object type and value type imply a type system, meaning you have an entire system for describing types, as opposed to just working with objects.
Conceptually, I think OO design is what provides the abstraction for use to deal complexity in software systems these days. The language is a tool use to implement an OO design -- some make it more natural than others. I would still argue that from a more common and broader definition, C# and the others are still object-oriented languages.
There are three pillars of OOP
Encapsulation
Inheritance
Polymorphism
If a language can do those three things, it is an OOP language.
I am pretty sure the argument over whether language X does OOP better than language Y will go on forever.
OO is sometimes defined as message-oriented. The idea is that a method call (or property access) is really a message sent to another object. How the receiving object handles the message is completely encapsulated. Often the message corresponds to a method which is then executed, but that is just an implementation detail. You can, for example, create a catch-all handler which is executed regardless of the method name in the message.
Static OO, as in C#, does not have this kind of encapsulation. A message has to correspond to an existing method or property, otherwise the compiler will complain. Dynamic languages like Smalltalk, Ruby or Python do, however, support "message-based" OO.
So in this sense C# and other statically typed OO languages are not true OO, since they lack "true" encapsulation.
Update: It's the new wave... which suggests everything that we've been doing till now is passé. It seems to be cropping up quite a bit in podcasts and books. Maybe this is what you heard.
Till now we've been concerned with static classes and have not unleashed the power of object-oriented development. We've been doing 'class-based dev'. Classes are fixed/static templates for creating objects; all objects of a class are created equal.
e.g. Just to illustrate what I've been babbling about... let me borrow a Ruby code snippet from a PragProg screencast I just had the privilege of watching.
'Prototype-based development' blurs the line between objects and classes... there is no difference.
animal = Object.new              # create a new instance of base Object

def animal.number_of_feet=(feet) # adding new methods to an Object instance. What?
  @number_of_feet = feet
end

def animal.number_of_feet
  @number_of_feet
end

cat = animal.clone               # inherits 'number_of_feet' behavior from animal
cat.number_of_feet = 4

felix = cat.clone                # inherits state of '4' and behavior from cat
puts felix.number_of_feet        # outputs 4
The idea is that it's a more powerful way to inherit state and behavior than traditional class-based inheritance. It gives you more flexibility and control in certain "special" scenarios (that I've yet to fathom). This allows things like mix-ins (reusing behavior without class inheritance).
By challenging the basic primitives of how we think about problems, 'true OOP' is like 'the Matrix' in a way... you keep going WTF in a loop. Like this one, where the base class of Container can be either an Array or a Hash based on which side of 0.5 the generated random number falls.
class Container < (rand < 0.5 ? Array : Hash)
end
Ruby, JavaScript and the new brigade seem to be the ones pioneering this. I'm still out on this one... reading up and trying to make sense of this new phenomenon. It seems to be powerful... too powerful... Useful? I need my eyes opened a bit more. Interesting times, these.
I've only listened to the first 6-7 minutes of the podcast that sparked your question. If their intent is to say that C# isn't a purely object-oriented language, that's actually correct. Not everything in C# is an object (at least the primitives aren't, though boxing creates an object containing the same value). In Ruby, everything is an object. Daren and Ben seem to have covered all the bases in their discussion of "duck typing", so I won't repeat it.
Whether or not this difference (everything an object versus everything not an object) is material or significant is a question I can't readily answer, because I don't have sufficient depth in Ruby to compare it to C#. Those of you on here who know Smalltalk (I don't, though I wish I did) have probably been looking at the Ruby movement with some amusement, since it was the first pure OO language 30 years ago.
Maybe they are alluding to the difference between duck typing and class hierarchies?
if it walks like a duck and quacks like a duck, just pretend it's a duck and kick it.
In C#, Java etc. the compiler fusses a lot about: Are you allowed to do this operation on that object?
Object Oriented vs. Class Oriented could therefore mean: Does the language worry about objects or classes?
For instance: In Python, to implement an iterable object, you only need to supply a method __iter__() that returns an object that has a method named next(). That's all there is to it: No interface implementation (there is no such thing). No subclassing. Just talking like a duck / iterator.
EDIT: This post was upvoted while I rewrote everything. Sorry, won't ever do that again. The original content included advice to learn as many languages as possible and to nary worry about what the language doctors think / say about a language.
That was an abstract-podcast indeed!
But I see what they're getting at - they're just dazzled by Ruby sparkle. Ruby allows you to do things that C-based and Java programmers wouldn't even think of, and combinations of those things let you achieve undreamt-of possibilities.
Adding new methods to the built-in String class because you feel like it, passing around unnamed blocks of code for others to execute, mixins... Conventional folks are not used to objects straying so far from the class template.
It's a whole new world out there for sure.
As for the C# guys not being OO enough... don't take it to heart. Just take it as the stuff you say when you are flabbergasted for words. Ruby does that to most people.
If I had to recommend one language for people to learn in the current decade, it would be Ruby. I'm glad I did... although some people may claim Python. But it's like, my opinion... man! :D
I don't think this is specifically about duck typing. For instance, C# already supports a limited form of duck typing - an example is that you can use foreach on any class that provides a GetEnumerator method returning something with MoveNext and Current, regardless of any interface.
The concept of duck typing is compatible with statically typed languages like Java and C#; it's basically an extension of reflection.
This is really the case of static vs dynamic typing. Both are proper-OO, in as much as there is such a thing. Outside of academia it's really not worth debating.
Rubbish code can be written in either. Great code can be written in either. There's absolutely nothing functional that one model can do that the other can't.
The real difference is in the nature of the coding done. Static types reduce freedom, but the advantage is that everyone knows what they're dealing with. The opportunity to change instances on the fly is very powerful, but the cost is that it becomes hard to know what you're dealing with.
For instance, for Java or C#, IntelliSense is easy - the IDE can quickly produce a drop-down list of possibilities. For JavaScript or Ruby this becomes a lot harder.
For certain things, for instance producing an API that someone else will code with, there is a real advantage in static typing. For others, for instance rapidly producing prototypes, the advantage goes to dynamic.
It's worth having an understanding of both in your skills toolbox, but nowhere near as important as understanding the one you already use in real depth.
Object Oriented is a concept. This concept is based upon certain ideas. The technical names of these ideas (actually rather principles that evolved over the time and have not been there from the first hour) have already been given above, I'm not going to repeat them. I'm rather explaining this as simple and non-technical as I can.
The idea of OO programming is that there are objects. Objects are small independent entities. These entities may have embedded information or they may not. If they have such information, only the entity itself can access it or change it. The entities communicate with each other by sending messages between each other. Compare this to human beings. Human beings are independent entities, having internal data stored in their brains, and they interact with each other by communicating (e.g. talking to each other). If you need knowledge from someone else's brain, you cannot directly access it; you must ask him a question and he may answer it, telling you what you wanted to know.
And that's basically it. This is the real idea behind OO programming: writing these entities, defining the communication between them, and having them interact together to form an application. This concept is not bound to any language. It's just a concept, and if you write your code in C#, Java, or Ruby, that is not important. With some extra work this concept can even be applied in pure C, even though it is a procedural language; it offers everything you need for the concept.
Different languages have now adopted this concept of OO programming and of course the concepts are not always equal. Some languages allow what other languages forbid, for example. Now one of the concepts involved is the concept of classes. Some languages have classes, some don't. A class is a blueprint of what an object looks like. It defines the internal data storage of an object and it defines the messages an object can understand, and if there is inheritance (which is not mandatory for OO programming!), classes also define from which other class (or classes, if multiple inheritance is allowed) this class inherits (and which properties, if selective inheritance exists). Once you have created such a blueprint, you can generate an unlimited number of objects built according to it.
There are OO languages that have no classes, though. How are objects built then? Well, usually dynamically. E.g. you can create a new blank object and then dynamically add internal structure like instance variables or methods (messages) to it. Or you can duplicate an already existing object, with all its properties, and then modify it. Or possibly merge two objects into a new one. Unlike class-based languages, these languages are very dynamic, as you can generate objects dynamically during runtime in ways not even you, the developer, thought about when you started writing the code.
Usually this dynamism has a price: the more dynamic a language is, the more memory (RAM) objects will waste and the slower everything gets, as program flow is extremely dynamic as well and it's hard for a compiler to generate efficient code if it has no chance to predict code or data flow. JIT compilers can optimize some parts of that during runtime, once they know the program flow; however, as these languages are so dynamic, program flow can change at any time, forcing the JIT to throw away all compilation results and re-compile the same code over and over again.
But this is a tiny implementation detail - it has nothing to do with the basic OO principle. It is nowhere said that objects need to be dynamic or must be alterable during runtime. Wikipedia puts it pretty well:
Programming techniques may include features such as information hiding, data abstraction, encapsulation, modularity, polymorphism, and inheritance.
http://en.wikipedia.org/wiki/Object-oriented_programming
They may, or they may not. None of this is mandatory. What is mandatory is only the presence of objects, and that they have ways to interact with each other (otherwise objects would be pretty useless if they could not interact).
You asked: "Can someone show me an example of a wonderous thing I could do with ruby that I cannot do with c# and that exemplifies this different oop approach?"
One good example is Active Record, the ORM built into Rails. The model classes are dynamically built at runtime, based on the database schema.
This is really probably about what these people see others doing in C# and Java, as opposed to what C# and Java support. Most languages can be used with different programming paradigms. For example, you can write procedural code in C# and Scheme, and you can do functional-style programming in Java. It is more about what you are trying to do and what the language supports.
I'll take a stab at this.
Python and Ruby are duck-typed. To produce any maintainable code in these languages, you pretty much have to use test-driven development. As such, it is very important for a developer to be able to easily inject dependencies into their code without having to create a giant supporting framework.
Successful dependency injection depends upon having a pretty good object model. The two are sort of two sides of the same coin. If you really understand how to use OOP, then you should by default create designs where dependencies can be easily injected.
Because dependency injection is easier in dynamically typed languages, the Ruby/Python developers feel like their language understands the lessons of OO much better than other statically typed counterparts.
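In FP terms, a hedged Haskell sketch of the same idea (the Logger type and processOrder are hypothetical): injecting a dependency is often nothing more than passing it in as a function argument, which is why good FP designs tend to be injectable by default.

type Logger = String -> IO ()

processOrder :: Logger -> Int -> IO ()
processOrder logIt orderId = do
  logIt ("processing order " ++ show orderId)
  -- ... the real work would go here ...
  return ()

main :: IO ()
main = do
  processOrder putStrLn 42            -- production: log to stdout
  processOrder (\_ -> return ()) 42   -- test: inject a silent logger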