Coming from an OOP background, I find Haskell's type system and the way data constructors and typeclasses interact difficult to conceptualize. I can understand how each is used in simple examples, but some more complicated data structures that are very well suited to an OOP style are proving non-trivial to translate into similarly elegant and understandable types.
In particular, I have a problem with organizing a hierarchy of data such as the following.
This is a deeply nested hierarchical inheritance structure, and the lack of support for subtyping makes it unclear how to turn this structure into a natural-feeling alternative in Haskell. It may be fine to replace something like Polygon with a sum data type, declaring it like
data Polygon
    = Quad Point Point
    | Triangle Point Point Point
    | RegularNGon Int Radius
    | ...
But this loses some of the structure, and can only really satisfactorily be done for one level of the hierarchy. Typeclasses can be used to implement a form of inheritance and substructure in that a Polygon typeclass could be a subclass of a Shape, and so maybe all Polygon instances have implementations for centroid :: Point and also vertices :: [Point], but this seems unsatisfactory. What would be a good way of capturing the structure of the picture in Haskell?
You can use sum types to represent the entire hierarchy, without losing structure. Something like this would do it:
data Shape = IsPoint Point
           | IsLine Line
           | IsPolygon Polygon

data Point = Point { x :: Int, y :: Int }

data Line = Line { a :: Point, b :: Point }

data Polygon = IsTriangle Triangle
             | IsQuad Quad
             | ...
And so on. The basic pattern is you translate each OO abstract class into a Haskell sum type, with each of its immediate OO subclasses (that may themselves be abstract) as variants in the sum type. The concrete classes are product/record types with the actual data members in them.1
The thing you lose compared to the OOP you're used to by modeling things this way isn't the ability to represent your hierarchy, but the ability to extend it without touching existing code. Sum types are "closed", where OO inheritance is "open". If you later decide that you want a Circle option for Shape, you have to add it to Shape and then add cases for it everywhere you pattern match on a Shape.
However, this kind of hierarchy probably requires fairly liberal downcasting in OO. For example, if you want a function that can tell if two shapes intersect that's probably an abstract method on Shape like Shape.intersects(Shape other), so each sub-type gets to write its own implementation. But when I'm writing Rectangle.intersects(Shape other) it's basically impossible generically, without knowing what other subclasses of Shape are out there. I'll have to be using isinstance checks to see what other actually is. But that actually means that I probably can't just add my new Circle subclass without revisiting existing code; an OO hierarchy where isinstance checks are needed is de-facto just as "closed" as the Haskell sum type hierarchy is. Basically pattern matching on one of the sum-types generated by applying this pattern is the equivalent of isinstancing and downcasting in the OO version. Only because the sum types are exhaustively known to the compiler (only possible because they're closed), if I do add a Circle case to Shape the compiler is able to tell me about all the places that I need to revisit to handle that case.2
If you have a hierarchy that doesn't need a lot of downcasting (meaning that the various base classes have substantial and useful interfaces that they guarantee to be available, and you usually use things through that interface rather than switching on what a value could possibly be), then you can probably use type classes. You still need all the "leaf" data types (the product types with the actual data fields), only instead of adding sum type wrappers to group them up you add type classes for the common interface. If you can use this style of translation, then you can add new cases more easily (just add the new Circle data type, and an instance to say how it implements the Shape type class; all the places that are polymorphic in any type in the Shape class will now handle Circles as well). But in OO you always have downcasts available as an escape hatch when it turns out you can't handle shapes generically; with this design in Haskell that is impossible.3
But my "real" answer to "how do I represent OO type hierarchies in Haskell" is unfortunately the trite one: I don't. I design differently in Haskell than I do in OO languages4, and in practice it's just not a huge problem. But to say how I'd design this case differently, I'd have to know more about what you're using them for. For example you could do something like represent a shape as a Point -> Bool function (that tells you whether any given point is inside the shape), and having things like circle :: Point -> Int -> (Point -> Bool) for generating such functions corresponding to normal shapes; that representation is awesome for forming composite intersection/union shapes without knowing anything about them (intersect shapeA shapeB = \point -> shapeA point && shapeB point), but terrible for calculating things like areas and circumferences.
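The predicate representation described above can be sketched in a few lines. The sketch below uses Python, which may be more familiar to readers coming from OO languages. It assumes points are plain (x, y) tuples, and rectangle is a hypothetical extra constructor that does not appear in the answer.

```python
def circle(center, radius):
    """Shape containing every point within `radius` of `center`."""
    cx, cy = center
    return lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2 <= radius ** 2


def rectangle(lower_left, upper_right):
    """Hypothetical axis-aligned rectangle given by two opposite corners."""
    (x0, y0), (x1, y1) = lower_left, upper_right
    return lambda p: x0 <= p[0] <= x1 and y0 <= p[1] <= y1


def intersect(shape_a, shape_b):
    """A point is in the intersection if both shapes contain it."""
    return lambda p: shape_a(p) and shape_b(p)


def union(shape_a, shape_b):
    """A point is in the union if either shape contains it."""
    return lambda p: shape_a(p) or shape_b(p)


# Two overlapping circles; `lens` knows nothing about how its parts were built.
lens = intersect(circle((0, 0), 2), circle((2, 0), 2))
either = union(circle((0, 0), 1), rectangle((5, 5), (6, 6)))
```

Composition is trivial here, but nothing in a predicate exposes a radius or a corner, which is why areas and circumferences are out of reach with this design.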
1 If you have abstract classes with data members, or you have concrete classes that also have further subclasses you can manually push the data members down into the "leaves", factor out the inherited data members into a shared record and make all of the "leaves" contain one of those, split a layer so that you have a product type containing the inherited data members and a sum type (where that sum type then "splits" into the options for the subclasses), stuff like that.
2 If you use catch-all patterns then the warning might not be exhaustive, so it's not always bullet proof, but how bullet proof it is is up to how you code.
3 Unless you opt into runtime type information with a solution like Typeable, but that's not an invisible change; your callers have to opt into it as well.
4 Actually I probably wouldn't design a hierarchy like this even in OO languages. I find it doesn't turn out to be as useful as you'd think in real programs, hence the "favour composition over inheritance" advice.
You may be looking for a Haskell equivalent of dynamic dispatch, such that you could store a heterogeneous list of values supporting distinct implementations of a common Shape interface.
Haskell's existential types support this kind of usage. It's fairly rare for a Haskell program to actually need existential types -- as Ben's answer demonstrates, sum types can handle this kind of problem. However, existential types are appropriate for a large, open-ended collection of cases:
{-# LANGUAGE ExistentialQuantification #-}
...
class Shape a where
    bounds :: a -> AABB
    draw   :: a -> IO ()

data AnyShape = forall a. Shape a => AnyShape a
This lets you declare instances in an open-ended style:
data Line = Line Point Point
instance Shape Line where ...
data Circle = Circle { center :: Point, radius :: Double }
instance Shape Circle where ...
...
Then, you can build your heterogeneous list:
shapes = [AnyShape (Line a b),
          AnyShape (Circle a 3.0),
          AnyShape (Circle b 1.8)]
and use it in a uniform way:
drawIn box xs = sequence_ [draw s | AnyShape s <- xs, bounds s `hits` box]
Note that you need to unwrap your AnyShape in order to use the class Shape interface functions. Also note that you must use the class functions to access your heterogeneous data -- there is no other way to "downcast" the unwrapped existential value s! Its type only makes sense within the local scope, so the compiler will not let it escape.
If you are trying to use existential types, yet find yourself needing to "downcast" them, sum types might be a better fit.
I have a lot to learn in the way of OO patterns and this is a problem I've come across over the years. I end up in situations where my classes' sole purpose is procedural, just basically wrapping a procedure up in a class. It doesn't seem like the right OO way to do things, and I wonder if someone is experienced with this problem enough to help me consider it in a different way. My specific example in the current application follows.
In my application I'm taking a set of points from engineering survey equipment and normalizing them to be used elsewhere in the program. By "normalize" I mean a set of transformations of the full data set until a destination orientation is reached.
Each transformation procedure will take the input of an array of points (i.e. of the form class point { float x; float y; float z; }) and return an array of the same length but with different values. For example, a transformation like point[] RotateXY(point[] inList, float angle). The other kind of procedure would be of the analysis type, used to supplement the normalization process and decide what transformation to do next. This type of procedure takes in the same points as a parameter but returns a different kind of dataset.
My question is, what is a good pattern to use in this situation? The one I was about to code in was a Normalization class which inherits class types of RotationXY for instance. But RotationXY's sole purpose is to rotate the points, so it would basically be implementing a single function. This doesn't seem very nice, though, for the reasons I mentioned in the first paragraph.
Thanks in advance!
The most common/natural approach for finding candidate classes in your problem domain is to look for nouns and then scan for the verbs/actions associated with those nouns to find the behavior that each class should implement. While this is generally good advice, it doesn't mean that your objects must only represent concrete elements. When processes (which are generally modeled as methods) start to grow and become complex, it is good practice to model them as objects. So, if your transformation carries weight of its own, it is OK to model it as an object and do something like:
class RotateXY
{
    public function apply(point p)
    {
        // Apply the transformation
    }
}

t = new RotateXY();
newPoint = t->apply(oldPoint);
In case you have many transformations, you can create a polymorphic hierarchy and even chain one transformation after another. If you want to dig a bit deeper you can also take a look at the Command design pattern, which closely relates to this.
Some final comments:
If it fits your case, it is a good idea to model the transformation at the point level and then apply it to a collection of points. That way you can properly isolate the transformation concept, and it is also easier to write test cases. You can later even create a Composite of transformations if you need to.
I generally don't like the Utils (or similar) classes with a bunch of static methods, since in most of the cases it means that your model is missing the abstraction that should carry that behavior.
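As a rough illustration of the point-level transformation and the Composite idea mentioned above, here is a minimal sketch in Python. Only RotateXY and apply come from the answer. Translate, Composite and apply_to_all are hypothetical names, and the rotation is assumed to be about the Z axis, i.e. within the XY plane.

```python
import math
from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float
    z: float


class RotateXY:
    """Rotates a point within the XY plane (assumed: about the Z axis)."""

    def __init__(self, angle):
        self.cos = math.cos(angle)
        self.sin = math.sin(angle)

    def apply(self, p):
        return Point(p.x * self.cos - p.y * self.sin,
                     p.x * self.sin + p.y * self.cos,
                     p.z)


class Translate:
    """Hypothetical second transformation: shifts a point by a fixed offset."""

    def __init__(self, dx, dy, dz):
        self.offset = (dx, dy, dz)

    def apply(self, p):
        dx, dy, dz = self.offset
        return Point(p.x + dx, p.y + dy, p.z + dz)


class Composite:
    """Applies several transformations in order; is itself a transformation."""

    def __init__(self, *steps):
        self.steps = steps

    def apply(self, p):
        for step in self.steps:
            p = step.apply(p)
        return p


def apply_to_all(transformation, points):
    """Lifts a point-level transformation to a whole collection of points."""
    return [transformation.apply(p) for p in points]


normalize = Composite(RotateXY(math.pi / 2), Translate(1.0, 0.0, 0.0))
result = apply_to_all(normalize, [Point(1.0, 0.0, 0.0)])
```

Because each class only knows how to move a single point, a test case needs one input point and one expected output point, and new transformations can be added without touching the collection-handling code.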
HTH
Typically, when it comes to classes that contain only static methods, I name them Util, e.g. DbUtil for facading DB access, FileUtil for file I/O etc. So find some term that all your methods have in common and name it that Util. Maybe in your case GeometryUtil or something along those lines.
Since the particulars of the transformations you apply seem ad-hoc for the problem and possibly prone to change in the future you could code them in a configuration file.
The point's client would read from the file and know what to do. As for the rotation or any other transformation method, they could go well as part of the Point class.
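One possible reading of the configuration-file idea is sketched below in Python. The JSON layout, the step names and the REGISTRY lookup table are all assumptions made for illustration. The configuration is embedded as a string here so the sketch is self-contained, and the transformations are written as free functions rather than Point methods to keep it short.

```python
import json
import math


def rotate_xy(points, angle):
    """Rotates every (x, y, z) point within the XY plane by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c - y * s, x * s + y * c, z) for x, y, z in points]


def translate(points, dx, dy, dz):
    """Shifts every (x, y, z) point by a fixed offset."""
    return [(x + dx, y + dy, z + dz) for x, y, z in points]


# Hypothetical lookup table from step names in the config to functions.
REGISTRY = {"rotate_xy": rotate_xy, "translate": translate}

# Hypothetical configuration; in practice this text would be read from a file.
CONFIG = """
[
  {"op": "translate", "args": {"dx": 1.0, "dy": 0.0, "dz": 0.0}},
  {"op": "rotate_xy", "args": {"angle": 3.141592653589793}}
]
"""


def run_pipeline(points, config_text):
    """Applies the configured steps to the points, in the order listed."""
    for step in json.loads(config_text):
        points = REGISTRY[step["op"]](points, **step["args"])
    return points


normalized = run_pipeline([(0.0, 0.0, 0.0)], CONFIG)
```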
I see nothing particularly wrong with classes/interfaces having just essentially one member.
In your case the member is an "operation with some arguments of one type that returns the same type", which is common for some math/functional problems. You may find it convenient to have an interface/base class and helper methods that combine multiple transformation classes together into a more complex transformation.
Alternative approach: if your language supports it, just go functional style altogether (similar to LINQ in C#).
On the functional style suggestion: I'd start with the following basic functions (you can probably just find them in the standard libraries for your language):
collection = map(collection, perItemFunction) to transform all items in a collection (Select in C#)
item = reduce(collection, aggregateFunction) to reduce all items into a single entity (Aggregate in C#)
combine 2 functions on item funcOnItem = combine(funcFirst, funcSecond). Can be expressed as lambda in C# Func<T,T> combined = x => second(first(x)).
"bind"/curry - fix one of arguments of a function functionOfOneArg = curry(funcOfArgs, fixedFirstArg). Can be expressed in C# as lambda Func<T,T> curried = x => funcOfTwoArg(fixedFirstArg, x).
This list will let you do something like "rotate all points in a collection about the X axis by 10 and shift Y by 15": map(points, combine(curry(rotateX, 10), curry(shiftY, 15))).
The syntax will depend on the language. For example, in JavaScript you just pass functions (and map/reduce are already part of the language); in C#, lambdas and the Func classes (like the one-argument function Func<T,R>) are an option. In some languages you have to explicitly use a class/interface to represent a "function" object.
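For a concrete feel, the four building blocks might look like this in Python. This is a sketch under some assumptions: points are (x, y, z) tuples, angles are in degrees, and rotate_x and shift_y are hypothetical helpers. map is a Python built-in, while reduce and partial (playing the role of "curry") come from the standard functools module.

```python
import math
from functools import partial, reduce


def combine(first, second):
    """Returns a function that applies `first`, then `second`."""
    return lambda item: second(first(item))


def rotate_x(degrees, point):
    """Rotates a point about the X axis by the given angle in degrees."""
    x, y, z = point
    a = math.radians(degrees)
    return (x, y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a))


def shift_y(dy, point):
    """Moves a point along the Y axis by `dy`."""
    x, y, z = point
    return (x, y + dy, z)


points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

# "Rotate all points about the X axis by 10 and shift Y by 15".
transform = combine(partial(rotate_x, 10), partial(shift_y, 15))
moved = list(map(transform, points))

# reduce collapses the collection into a single value, here the largest y.
max_y = reduce(lambda best, p: max(best, p[1]), moved, float("-inf"))
```

An analysis-type procedure from the question fits the reduce shape: it takes the same points but returns something other than a list of points.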
Alternative approach: if you are actually dealing with points and transformations, another traditional approach is to use a matrix to represent all linear operations (if your language supports custom operators you get very natural-looking code).
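A minimal sketch of the matrix idea, written in Python with plain nested lists so that it has no dependencies. The helper names are hypothetical, and a real project would more likely use a numerical library. The key property is that composing two linear transformations is just multiplying their matrices.

```python
import math


def matmul(a, b):
    """Multiplies two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def apply(matrix, point):
    """Applies a 3x3 matrix to an (x, y, z) point."""
    return tuple(sum(matrix[i][k] * point[k] for k in range(3)) for i in range(3))


def rotation_z(angle):
    """Rotation within the XY plane (about the Z axis) by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def scaling(sx, sy, sz):
    """Scales each axis independently."""
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, sz]]


# The rightmost matrix is applied first: rotate, then scale.
combined = matmul(scaling(2.0, 2.0, 2.0), rotation_z(math.pi / 2))
moved = apply(combined, (1.0, 0.0, 0.0))
```

Note that a plain 3x3 matrix covers linear operations such as rotation and scaling; a translation is not linear and would need homogeneous coordinates (a 4x4 matrix) to fit the same scheme.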
I need to deal with two objects of a class in a way that will return a third object of the same class, and I am trying to determine whether it is better to do this as an independent function that receives two objects and returns the third or as a method which would take one other object and return the third.
For a simple example, would this:
from collections import namedtuple

class Point(namedtuple('Point', 'x y')):
    __slots__ = ()

    #Attached to class
    def midpoint(self, otherpoint):
        mx = (self.x + otherpoint.x) / 2.0
        my = (self.y + otherpoint.y) / 2.0
        return Point(mx, my)

a = Point(1.0, 2.0)
b = Point(2.0, 3.0)

print a.midpoint(b)
#Point(x=1.5, y=2.5)
Or this:
from collections import namedtuple

class Point(namedtuple('Point', 'x y')):
    __slots__ = ()

#not attached to class
#takes two point objects
def midpoint(p1, p2):
    mx = (p1.x + p2.x) / 2.0
    my = (p1.y + p2.y) / 2.0
    return Point(mx, my)

a = Point(1.0, 2.0)
b = Point(2.0, 3.0)

print midpoint(a, b)
#Point(x=1.5, y=2.5)
and why would one be preferred over the other?
This seems far less clear cut than I had expected when I asked the question.
In summary, it seems that something like a.midpoint(b) is not preferred, since it gives a special place to one point or the other in what is really a symmetric function that returns a completely new point instance. Beyond that, it seems to be largely a matter of taste and style whether to use a freestanding module function or a function attached to the class but not meant to be called on an instance, such as Point.midpoint(a, b).
I think, personally, I stylistically lean towards free-standing module functions, but it may depend on the circumstances. In cases where the function is definitely tightly bound to the class and there is any risk of namespace pollution or potential confusion, then making a class function probably makes more sense.
Also, a couple of people mentioned making the function more general, perhaps by implementing additional features of the class to support this. In this particular case dealing with points and midpoints, that is probably the overall best approach. It supports polymorphism and code reuse and is highly readable. In a lot of cases though, that would not work (the project that inspired me to ask this for instance), but points and midpoints seemed like a concise and understandable example to illustrate the question.
Thank you all, it was enlightening.
The first approach is reasonable and isn't conceptually different from what set.union and set.intersection do. Any func(Point, Point) --> Point is clearly related to the Point class, so there is no question about interfering with the unity or cohesion of the class.
It would be a tougher choice if different classes were involved: draw_perpendicular(line, point) --> line. To resolve the choice of classes, you would pick the one that has the most related logic. For example, str.join needs a string delimiter and a list of strings. It could have been a standalone function (as it was in the old days with the string module), or it could be a method on lists (but it only works for lists of strings), or a method on strings. The latter was chosen because joining is more about strings than it is about lists. This choice was made even though it led to the arguably awkward expression delimiter.join(things_to_join).
I disagree with the other respondent who recommended using a classmethod. Those are often used for alternate constructor signatures but not for transformations on instances of the class. For example, datetime.fromordinal is a classmethod for constructing a date from something other than an instance of the class (in this case, from an int). This contrasts with datetime.replace which is a regular method for making a new datetime instance based on an existing instance. This should steer you away from using classmethod for the midpoint computation.
One other thought: if you keep midpoint() with the Point() class, it makes it possible to create other classes that have the same Point API but a different internal representation (i.e. polar coordinates may be more convenient for some types of work than Cartesian coordinates). If midpoint() is a separate function you start to lose the benefits of encapsulation and of a coherent interface.
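To illustrate the encapsulation argument, here is a minimal sketch in Python 3 syntax. CartesianPoint, PolarPoint and describe are hypothetical names. The assumption is that both classes expose the same x, y and midpoint interface while storing their data differently.

```python
import math


class CartesianPoint:
    """Stores x and y directly."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def midpoint(self, other):
        return CartesianPoint((self.x + other.x) / 2.0,
                              (self.y + other.y) / 2.0)


class PolarPoint:
    """Stores radius and angle; x and y are computed on demand."""

    def __init__(self, r, theta):
        self.r = r
        self.theta = theta

    @property
    def x(self):
        return self.r * math.cos(self.theta)

    @property
    def y(self):
        return self.r * math.sin(self.theta)

    def midpoint(self, other):
        mx = (self.x + other.x) / 2.0
        my = (self.y + other.y) / 2.0
        # The result stays in this class's own representation.
        return PolarPoint(math.hypot(mx, my), math.atan2(my, mx))


def describe(p, q):
    """Client code that relies only on the shared Point API."""
    m = p.midpoint(q)
    return (round(m.x, 9), round(m.y, 9))
```

Calling describe(PolarPoint(2.0, 0.0), CartesianPoint(0.0, 2.0)) and the same call with the arguments swapped both go through the shared interface, and the caller never needs to know which representation it received.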
I would choose the second option because, in my opinion, it is clearer than the first. You are performing the midpoint operation between two points; not the midpoint operation with respect to a point. Similarly, a natural extension of this interface could be to define dot, cross, magnitude, average, median, etc. Some of those functions will operate on pairs of Points and others may operate on lists. Making it a function makes them all have consistent interfaces.
Defining it as a function also allows it to be used with any pair of objects that present a .x .y interface, while making it a method requires that at least one of the two is a Point.
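A small sketch of that duck-typing argument, in Python 3 syntax. Pixel is a hypothetical class that has nothing to do with Point except that it also exposes .x and .y.

```python
from collections import namedtuple

Point = namedtuple("Point", "x y")


class Pixel:
    """Hypothetical unrelated class that happens to expose .x and .y."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def midpoint(p1, p2):
    """Works for any two objects with numeric .x and .y attributes."""
    return Point((p1.x + p2.x) / 2.0, (p1.y + p2.y) / 2.0)


mixed = midpoint(Point(1.0, 2.0), Pixel(3, 4))
both_pixels = midpoint(Pixel(0, 0), Pixel(2, 2))
```

The second call involves no Point argument at all, which a method on Point could not express.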
Lastly, to address the location of the function, I believe it makes sense to co-locate it in the same package as the Point class. This places it in the same namespace, which clearly indicates its relationship with Point and, in my opinion, is more pythonic than a static or class method.
Update:
Further reading on the Pythonicness of @staticmethod vs package/module:
In both Thomas Wouters's answer to the question "What is the difference between staticmethod and classmethod in Python" and Mike Steder's answer to "init and arguments in Python", the authors indicated that a package or module of related functions is perhaps a better solution. Thomas Wouters has this to say:
[staticmethod] is basically useless in Python -- you can just use a module function instead of a staticmethod.
While Mike Steder comments:
If you find yourself creating objects that consist of nothing but staticmethods the more pythonic thing to do would be to create a new module of related functions.
However, codeape rightly points out below that a calling convention of Point.midpoint(a,b) will co-locate the functionality with the type. The BDFL also seems to value @staticmethod as the __new__ method is a staticmethod.
My personal preference would be to use a function for the reasons cited above, but it appears that the choice between @staticmethod and a stand-alone function is largely in the eye of the beholder.
In this case you can use operator overloading:
from collections import namedtuple

class Point(namedtuple('Point', 'x y')):
    __slots__ = ()

    #Attached to class
    def __add__(self, otherpoint):
        mx = (self.x + otherpoint.x)
        my = (self.y + otherpoint.y)
        return Point(mx, my)

    # __div__ implements / in Python 2; Python 3 uses __truediv__
    def __div__(self, scalar):
        return Point(self.x/scalar, self.y/scalar)

a = Point(1.0, 2.0)
b = Point(2.0, 3.0)

def mid(a,b): # general function
    return (a+b)/2

print mid(a,b)
I think the decision mostly depends on how general and abstract the function is. If you can write the function in a way that works on all objects that implement a small set of clean interfaces, then you can turn it into a separate function. The more interfaces your function depends on and the more specific they are, the more it makes sense to put it on the class (as instances of this class will most likely be the only objects the function will work with anyways).
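To see how far a "small set of clean interfaces" can carry a free function, consider this sketch in Python 3 syntax, where the / operator is implemented by __truediv__. Vec is a hypothetical class; the only requirement mid places on its arguments is support for + and division by a number.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Vec:
    """Hypothetical 2D vector supporting + and division by a scalar."""
    x: float
    y: float

    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)

    def __truediv__(self, scalar):
        return Vec(self.x / scalar, self.y / scalar)


def mid(a, b):
    """General function: needs only `+` and `/ 2` from its arguments."""
    return (a + b) / 2


vec_mid = mid(Vec(1.0, 2.0), Vec(3.0, 4.0))
float_mid = mid(1.0, 2.0)
fraction_mid = mid(Fraction(1, 3), Fraction(2, 3))
```

The same mid works for floats, fractions and vectors because it depends on two operators only; a function that needed many Point-specific details would be a better fit as a method.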
Another option is to use a @classmethod. It is probably what I would prefer in this case.
class Point(...):
    @classmethod
    def midpoint(cls, p1, p2):
        mx = (p1.x + p2.x) / 2.0
        my = (p1.y + p2.y) / 2.0
        return cls(mx, my)

# ...
print Point.midpoint(a, b)
I would choose version one, because this way all functionality for points is stored in the point class, i.e. grouping related functionality. Additionally, point objects know best about the meaning and inner workings of their data, so it's the right place to implement your function. An external function, for example in C++, would have to be a friend, which smells like a hack.
A different way of doing this is to access x and y through the namedtuple's subscript interface. You can then completely generalize the midpoint function to n dimensions.
from collections import namedtuple

class Point(namedtuple('Point', 'x y')):
    __slots__ = ()

def midpoint(left, right):
    return tuple([sum(a)/2. for a in zip(left, right)])
This design works for Point classes, n-tuples, lists of length n, etc. For example:
>>> midpoint(Point(0,0), Point(1,1))
(0.5, 0.5)
>>> midpoint(Point(5,1), (3, 2))
(4.0, 1.5)
>>> midpoint((1,2,3), (4,5,6))
(2.5, 3.5, 4.5)
Having the SOLID principles and testability in mind, consider the following case:
You have class A and class B which have some overlapping properties. You want a method that copies and/or converts the common properties from class A to class B. Where does that method go?
1. Class A as a B GetAsB()?
2. Class B as a constructor B(A input)?
3. Class B as a method void FillWithDataFrom(A input)?
4. Class C as a static method B ConvertAtoB(A source)?
5. ???
It depends, all make sense in different circumstances; some examples from Java:
String java.lang.StringBuilder.toString()
java.lang.StringBuilder(String source)
void java.util.GregorianCalendar.setTime(Date time)
ArrayList<T> java.util.Collections.list(Enumeration<T> e)
Some questions to help you decide:
Which dependency makes more sense? A dependent on B, B dependent on A, neither?
Do you always create a new B from an A, or do you need to fill existing Bs using As?
Are there other classes with similar collaborations, either as data providers for Bs or as targets for As data?
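To compare the dependency directions side by side, the four placements from the question can be sketched as follows. This is an illustration in Python rather than the question's C#-style signatures. The fields name, size and owner are hypothetical, and because Python has no overloaded constructors, option 2 is shown as an alternate constructor.

```python
from dataclasses import dataclass


@dataclass
class B:
    name: str = ""
    size: int = 0

    @classmethod
    def from_a(cls, a):
        """Option 2: B knows about A and builds itself from one."""
        return cls(a.name, a.size)

    def fill_with_data_from(self, a):
        """Option 3: B knows about A and overwrites an existing instance."""
        self.name = a.name
        self.size = a.size


@dataclass
class A:
    name: str
    size: int
    owner: str  # a property that B does not share

    def as_b(self):
        """Option 1: A knows about B and produces one."""
        return B(self.name, self.size)


def convert_a_to_b(source):
    """Option 4: a third party knows both; A and B stay independent."""
    return B(source.name, source.size)


a = A("widget", 3, "alice")
existing = B()
existing.fill_with_data_from(a)
```

All four produce the same B. What differs is which class has to import the other, and whether an existing B can be reused.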
I'd rule out 1. because getter methods should be avoided (tell, don't ask principle).
I'd rule out 2. because it looks like a conversion, and this is not a conversion if A and B are different classes which happen to have something in common. At least, that is how it seems from the description. If that's not the case, 2 would be an option too IMHO.
Does 4. imply that C is aware of inner details of A and/or B? If so, I'd rule out this option too.
I'd vote for 3. then.
Whether this is correct OOP theory or not is up for debate, but depending upon the circumstances, I wouldn't rule C out quite so quickly. While it DOES create a rather large dependency, it can have its uses if the specific role of C is to manage the interaction (and copying) from A to B. The dependency is created in C specifically to avoid creating such a dependency between A and B. Further, C exists specifically to manage the dependency, and can be implemented with that in mind.
Ex. (in vb.Net/Pseudocode):
Public Class C
    Public Shared Function BClassFactory(ByVal MyA As A) As B
        Dim NewB As New B
        With NewB
            .CommonProperty1 = MyA.CommonProperty1
            .CommonProperty2 = MyA.CommonProperty2
        End With
        Return NewB
    End Function
End Class
If there is a concrete reason to create, say, an AtoBConverterClass, this approach might be valid.
Again, this might be a specialized case. However, I have found it useful on occasion, especially if there are REALLY IMPORTANT reasons to keep A and B ignorant of each other.
I am a C# developer. Coming from the OO side of the world, I start by thinking in terms of interfaces, classes and type hierarchies. Because of the lack of OO in Haskell, I sometimes find myself stuck, unable to think of a way to model certain problems in Haskell.
How to model, in Haskell, real world situations involving class hierarchies such as the one shown here: http://www.braindelay.com/danielbray/endangered-object-oriented-programming/isHierarchy-4.gif
First of all: Standard OO design is not going to work nicely in Haskell. You can fight the language and try to make something similar, but it will be an exercise in frustration. So step one is look for Haskell-style solutions to your problem instead of looking for ways to write an OOP-style solution in Haskell.
But that's easier said than done! Where to even start?
So, let's disassemble the gritty details of what OOP does for us, and think about how those might look in Haskell.
Objects: Roughly speaking, an object is the combination of some data with methods operating on that data. In Haskell, data is normally structured using algebraic data types; methods can be thought of as functions taking the object's data as an initial, implicit argument.
Encapsulation: However, the ability to inspect an object's data is usually limited to its own methods. In Haskell, there are various ways to hide a piece of data, two examples are:
Define the data type in a separate module that doesn't export the type's constructors. Only functions in that module can inspect or create values of that type. This is somewhat comparable to protected or internal members.
Use partial application. Consider the function map with its arguments flipped. If you apply it to a list of Ints, you'll get a function of type (Int -> b) -> [b]. The list you gave it is still "there", in a sense, but nothing else can use it except through the function. This is comparable to private members, and the original function that's being partially applied is comparable to an OOP-style constructor.
"Ad-hoc" polymorphism: Often, in OO programming we only care that something implements a method; when we call it, the specific method called is determined based on the actual type. Haskell provides type classes for compile-time function overloading, which are in many ways more flexible than what's found in OOP languages.
Code reuse: Honestly, my opinion is that code reuse via inheritance was and is a mistake. Mix-ins as found in something like Ruby strike me as a better OO solution. At any rate, in any functional language, the standard approach is to factor out common behavior using higher-order functions, then specialize the general-purpose form. A classic example here are fold functions, which generalize almost all iterative loops, list transformations, and linearly recursive functions.
Interfaces: Depending on how you're using an interface, there are different options:
To decouple implementation: Polymorphic functions with type class constraints are what you want here. For example, the function sort has type (Ord a) => [a] -> [a]; it's completely decoupled from the details of the type you give it other than it must be a list of some type implementing Ord.
Working with multiple types with a shared interface: For this you need either a language extension for existential types, or to keep it simple, use some variation on partial application as above--instead of values and functions you can apply to them, apply the functions ahead of time and work with the results.
Subtyping, a.k.a. the "is-a" relationship: This is where you're mostly out of luck. But--speaking from experience, having been a professional C# developer for years--cases where you really need subtyping aren't terribly common. Instead, think about the above, and what behavior you're trying to capture with the subtyping relationship.
You might also find this blog post helpful; it gives a quick summary of what you'd use in Haskell to solve the same problems that some standard Design Patterns are often used for in OOP.
As a final addendum, as a C# programmer, you might find it interesting to research the connections between it and Haskell. Quite a few people responsible for C# are also Haskell programmers, and some recent additions to C# were heavily influenced by Haskell. Most notable is probably the monadic structure underlying LINQ, with IEnumerable being essentially the list monad.
Let's assume the following operations: Humans can speak, Dogs can bark, and all members of a species can mate with members of the same species if they are of the opposite gender. I would define this in Haskell like this:
data Gender = Male | Female deriving Eq

class Species s where
    gender :: s -> Gender

-- Returns true if s1 and s2 can conceive offspring
matable :: Species a => a -> a -> Bool
matable s1 s2 = gender s1 /= gender s2

data Human = Man | Woman
data Canine = Dog | Bitch

instance Species Human where
    gender Man = Male
    gender Woman = Female

instance Species Canine where
    gender Dog = Male
    gender Bitch = Female

bark Dog = "woof"
bark Bitch = "wow"

speak Man s = "The man says " ++ s
speak Woman s = "The woman says " ++ s
Now the operation matable has type Species s => s -> s -> Bool, bark has type Canine -> String and speak has type Human -> String -> String.
I don't know whether this helps, but given the rather abstract nature of the question, that's the best I could come up with.
Edit: In response to Daniel's comment:
A simple hierarchy for collections could look like this (ignoring already existing classes like Foldable and Functor):
class Foldable f where
    fold :: (a -> b -> a) -> a -> f b -> a

class Foldable m => Collection m where
    cmap :: (a -> b) -> m a -> m b
    cfilter :: (a -> Bool) -> m a -> m a

class Indexable i where
    atIndex :: i a -> Int -> a

instance Foldable [] where
    fold = foldl

instance Collection [] where
    cmap = map
    cfilter = filter

instance Indexable [] where
    atIndex = (!!)

sumOfEvenElements :: (Integral a, Collection c) => c a -> a
sumOfEvenElements c = fold (+) 0 (cfilter even c)
Now sumOfEvenElements takes any kind of collection of integrals and returns the sum of all even elements of that collection.
Instead of classes and objects, Haskell uses abstract data types. These are really two compatible views on the problem of organizing ways of constructing and observing information. The best help I know of on this subject is William Cook's essay Object-Oriented Programming Versus Abstract Data Types. He has some very clear explanations to the effect that
In a class-based system, code is organized around different ways of constructing abstractions. Generally each different way of constructing an abstraction is assigned its own class. The methods know how to observe properties of that construction only.
In an ADT-based system (like Haskell), code is organized around different ways of observing abstractions. Generally each different way of observing an abstraction is assigned its own function. The function knows all the ways the abstraction could be constructed, and it knows how to observe a single property, but of any construction.
Cook's paper will show you a nice matrix layout of abstractions and teach you how to organize any class as an ADT or vice versa.
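The two ways of slicing that matrix can be made concrete with a small sketch. It is written in Python so that both styles appear in one language. The shapes and all names are hypothetical and are not taken from Cook's essay.

```python
import math
from dataclasses import dataclass


# Class-based style: one class per way of constructing a shape.
# Each class holds every observation for its own construction.
class CircleObj:
    def __init__(self, r):
        self.r = r

    def area(self):
        return math.pi * self.r ** 2

    def perimeter(self):
        return 2 * math.pi * self.r


class SquareObj:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

    def perimeter(self):
        return 4 * self.side


# ADT style: plain data plus one function per observation.
# Each function knows every construction but only one property.
@dataclass(frozen=True)
class Circle:
    r: float


@dataclass(frozen=True)
class Square:
    side: float


def area(shape):
    if isinstance(shape, Circle):
        return math.pi * shape.r ** 2
    if isinstance(shape, Square):
        return shape.side ** 2
    raise TypeError("unknown shape")


def perimeter(shape):
    if isinstance(shape, Circle):
        return 2 * math.pi * shape.r
    if isinstance(shape, Square):
        return 4 * shape.side
    raise TypeError("unknown shape")
```

Adding a new shape is a one-class change in the first style but touches every function in the second; adding a new observation is the reverse.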
Class hierarchies involve one more element: the reuse of implementations through inheritance. In Haskell, such reuse is achieved through first-class functions instead: a function in a Primate abstraction is a value and an implementation of the Human abstraction can reuse any functions of the Primate abstraction, can wrap them to modify their results, and so on.
There is not an exact fit between design with class hierarchies and design with abstract data types. If you try to transliterate from one to the other, you will wind up with something awkward and not idiomatic—kind of like a FORTRAN program written in Java.
But if you understand the principles of class hierarchies and the principles of abstract data types, you can take a solution to a problem in one style and craft a reasonably idiomatic solution to the same problem in the other style. It does take practice.
Addendum: It's also possible to use Haskell's type-class system to try to emulate class hierarchies, but that's a different kettle of fish. Type classes are similar enough to ordinary classes that a number of standard examples work, but they are different enough that there can also be some very big surprises and misfits. While type classes are an invaluable tool for a Haskell programmer, I would recommend that anyone learning Haskell learn to design programs using abstract data types.
Haskell is my favorite language; it is a pure functional language.
It does not have side effects, and there is no assignment.
If you find the transition to this language too hard, maybe F# is a better place to start with functional programming. F# is not pure.
Objects encapsulate state. There is a way to achieve this in Haskell, but it is one of the topics that takes more time to learn, because you must learn some category theory concepts to understand monads deeply. There is syntactic sugar that lets you view monads like non-destructive assignment, but in my opinion it is better to spend more time understanding the basics of category theory (the notion of a category) to get a better understanding.
Before trying to program in an OO style in Haskell, you should ask yourself whether you really use the object-oriented style in C#; many programmers use OO languages, but their programs are written in the structured style.
The data declaration allows you to define data structures combining products (equivalent to a struct in C) and unions (equivalent to a union in C); the deriving part of the declaration allows you to inherit default methods.
A data type (data structure) belongs to a class if it has an implementation of the set of methods in the class.
For example, if you can define a show :: a -> String method for your data type, then it belongs to the class Show; you can define your data type as an instance of the Show class.
This is different from the use of class in some OO languages, where it is used as a way to define structures + methods.
A data type is abstract if it is independent of its implementation. You create, mutate, and destroy the object through an abstract interface; you do not need to know how it is implemented.
Abstraction is supported in Haskell, and it is very easy to declare.
For example this code from the Haskell site:
data Tree a = Nil
            | Node { left  :: Tree a,
                     value :: a,
                     right :: Tree a }
This declares the selectors left, value, and right.
The constructors may be defined as follows if you want to add them to the export list in the module declaration:
node = Node
nil = Nil
Modules are built in a similar way as in Modula. Here is another example from the same site:
module Stack (Stack, empty, isEmpty, push, top, pop) where
empty :: Stack a
isEmpty :: Stack a -> Bool
push :: a -> Stack a -> Stack a
top :: Stack a -> a
pop :: Stack a -> (a,Stack a)
newtype Stack a = StackImpl [a] -- opaque!
empty = StackImpl []
isEmpty (StackImpl s) = null s
push x (StackImpl s) = StackImpl (x:s)
top (StackImpl s) = head s
pop (StackImpl (s:ss)) = (s,StackImpl ss)
There is more to say about this subject, I hope this comment helps!