How do you show encapsulation in UML class diagram? - oop

My question is pretty straightforward. How do I represent or show encapsulation when modelling a UML class diagram? Inheritance is modelled using the arrows and an abstract class is shown by having the class name in italics or between the following arrows as shown, << Animals >> but what do you do for encapsulation?

Each feature of a class can have a visibility. Private features make it possible to encapsulate the state of the instance. The notation for a private feature is a - sign in front of its name (+ for public, # for protected and ~ for package visibility).
PS: An abstract class is not shown with "arrows". Instead it is shown with the word {abstract} in curly brackets (or as you correctly state by writing the name in italics).
PPS: "Arrows" (<<...>>) are not UML notation. You probably mean guillemets: «...». They are used to annotate language elements that don't have a distinct notation, such as DataTypes; in that case the keyword «dataType» is shown above the name. They can also be the notation for a property of a language element: for example, an Activity with isSingleExecution=true shows the keyword «singleExecution». Why this case is not expressed with curly brackets, as isAbstract=true is, is shrouded in mystery. Finally, they are used for user-defined language elements (stereotypes). Please note that such elements are one level above the model elements, i.e. on the language level (also called the meta level). Therefore, they don't express anything on the level of the modeled system. A stereotype «Animals» defines a new language element, not an animal.
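To connect the visibility markers to code, here is a minimal C++ sketch. The Account class and all of its members are hypothetical and only serve as an illustration; note also that C++ has no direct counterpart to UML's package visibility (~).

```cpp
#include <stdexcept>

// Hypothetical class. In a UML class box its features would be listed as:
//   - balance : double         (private)
//   - changeCount : int        (private)
//   # logChange(delta)         (protected)
//   + deposit(amount)          (public)
//   + getBalance() : double    (public)
class Account {
public:
    // '+' in UML: part of the public interface.
    void deposit(double amount) {
        if (amount <= 0.0) {
            throw std::invalid_argument("amount must be positive");
        }
        balance += amount;
        logChange(amount);
    }

    double getBalance() const { return balance; }

protected:
    // '#' in UML: visible to this class and its subclasses only.
    virtual void logChange(double /*delta*/) { ++changeCount; }

private:
    // '-' in UML: encapsulated state, reachable only through the operations above.
    double balance = 0.0;
    int changeCount = 0;
};
```

Client code can call deposit and getBalance, but any attempt to touch balance directly is rejected by the compiler, which is the encapsulation that the - marker documents in the diagram.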

Related

Restructuring an OOP datatype into Haskell types

Coming from an OOP background, I find Haskell's type system and the way data constructors and typeclasses interact difficult to conceptualize. I can understand how each is used in simple examples, but some more complicated data structures that are very well suited to an OOP style are proving non-trivial to translate into similarly elegant and understandable types.
In particular, I have a problem with organizing a hierarchy of data such as the following.
This is a deeply nested hierarchical inheritance structure, and the lack of support for subtyping makes it unclear how to turn this structure into a natural-feeling alternative in Haskell. It may be fine to replace something like Polygon with a sum data type, declaring it like
data Polygon
    = Quad Point Point
    | Triangle Point Point Point
    | RegularNGon Int Radius
    | ...
But this loses some of the structure, and can only really satisfactorily be done for one level of the hierarchy. Typeclasses can be used to implement a form of inheritance and substructure in that a Polygon typeclass could be a subclass of a Shape, and so maybe all Polygon instances have implementations for centroid :: Point and also vertices :: [Point], but this seems unsatisfactory. What would be a good way of capturing the structure of the picture in Haskell?
You can use sum types to represent the entire hierarchy, without losing structure. Something like this would do it:
data Shape = IsPoint Point
           | IsLine Line
           | IsPolygon Polygon
data Point = Point { x :: Int, y :: Int }
data Line = Line { a :: Point, b :: Point }
data Polygon = IsTriangle Triangle
             | IsQuad Quad
             | ...
And so on. The basic pattern is you translate each OO abstract class into a Haskell sum type, with each of its immediate OO subclasses (that may themselves be abstract) as variants in the sum type. The concrete classes are product/record types with the actual data members in them. [1]
The thing you lose compared to the OOP you're used to by modeling things this way isn't the ability to represent your hierarchy, but the ability to extend it without touching existing code. Sum types are "closed", where OO inheritance is "open". If you later decide that you want a Circle option for Shape, you have to add it to Shape and then add cases for it everywhere you pattern match on a Shape.
However, this kind of hierarchy probably requires fairly liberal downcasting in OO. For example, if you want a function that can tell whether two shapes intersect, that's probably an abstract method on Shape like Shape.intersects(Shape other), so each subtype gets to write its own implementation. But when I'm writing Rectangle.intersects(Shape other), it's basically impossible to do generically without knowing what other subclasses of Shape are out there. I'll have to use isinstance checks to see what other actually is. But that means I probably can't just add my new Circle subclass without revisiting existing code; an OO hierarchy where isinstance checks are needed is de facto just as "closed" as the Haskell sum type hierarchy is. Basically, pattern matching on one of the sum types generated by applying this pattern is the equivalent of isinstance checks and downcasting in the OO version. The difference is that because the sum types are exhaustively known to the compiler (only possible because they're closed), if I do add a Circle case to Shape the compiler is able to tell me about all the places that I need to revisit to handle that case. [2]
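To illustrate the OO side of that argument, here is a rough C++ sketch (not Haskell) of what Rectangle.intersects(Shape other) tends to look like, with dynamic_cast playing the role of the isinstance checks. The geometry is deliberately simplified (axis-aligned rectangles, shapes that merely touch count as intersecting) and all field names are assumptions made for the example.

```cpp
#include <algorithm>
#include <stdexcept>

struct Shape {
    virtual ~Shape() = default;
    virtual bool intersects(const Shape& other) const = 0;
};

struct Circle : Shape {
    double cx, cy, r;
    Circle(double cx_, double cy_, double r_) : cx(cx_), cy(cy_), r(r_) {}
    bool intersects(const Shape& other) const override;
};

struct Rectangle : Shape {
    double x0, y0, x1, y1;  // axis-aligned; assumes x0 <= x1 and y0 <= y1
    Rectangle(double x0_, double y0_, double x1_, double y1_)
        : x0(x0_), y0(y0_), x1(x1_), y1(y1_) {}
    bool intersects(const Shape& other) const override;
};

bool Rectangle::intersects(const Shape& other) const {
    // The "isinstance" checks: we must enumerate every subclass we know about.
    if (const auto* rect = dynamic_cast<const Rectangle*>(&other)) {
        return x0 <= rect->x1 && rect->x0 <= x1 &&
               y0 <= rect->y1 && rect->y0 <= y1;
    }
    if (const auto* c = dynamic_cast<const Circle*>(&other)) {
        // Closest point of the rectangle to the circle's centre.
        const double nx = std::max(x0, std::min(c->cx, x1));
        const double ny = std::max(y0, std::min(c->cy, y1));
        const double dx = c->cx - nx;
        const double dy = c->cy - ny;
        return dx * dx + dy * dy <= c->r * c->r;
    }
    // A newly added subclass ends up here at run time; nothing warned us
    // at compile time that this function needed revisiting.
    throw std::logic_error("Rectangle::intersects: unknown Shape subclass");
}

bool Circle::intersects(const Shape& other) const {
    if (const auto* c = dynamic_cast<const Circle*>(&other)) {
        const double dx = cx - c->cx;
        const double dy = cy - c->cy;
        const double rr = r + c->r;
        return dx * dx + dy * dy <= rr * rr;
    }
    if (const auto* rect = dynamic_cast<const Rectangle*>(&other)) {
        return rect->intersects(*this);
    }
    throw std::logic_error("Circle::intersects: unknown Shape subclass");
}
```

Adding a Triangle subclass compiles without complaint, but every intersects implementation silently falls through to the throw until someone edits it, which is the sense in which this hierarchy is just as closed as the sum type, only without the compiler's help.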
If you have a hierarchy that doesn't need a lot of downcasting, meaning that the various base classes have substantial and useful interfaces that they guarantee to be available, and you usually use things through that interface rather than switching on what it could possibly be, then you can probably use type classes. You still need all the "leaf" data types (the product types with the actual data fields), only instead of adding sum type wrappers to group them up you add type classes for the common interface. If you can use this style of translation, then you can add new cases more easily (just add the new Circle data type, and an instance to say how it implements the Shape type class; all the places that are polymorphic in any type in the Shape class will now handle Circles as well). But if you're doing that in OO you always have downcasts available as an escape hatch when it turns out you can't handle shapes generically; with this design in Haskell it's impossible. [3]
But my "real" answer to "how do I represent OO type hierarchies in Haskell" is unfortunately the trite one: I don't. I design differently in Haskell than I do in OO languages [4], and in practice it's just not a huge problem. But to say how I'd design this case differently, I'd have to know more about what you're using them for. For example, you could represent a shape as a Point -> Bool function (that tells you whether any given point is inside the shape), and have things like circle :: Point -> Int -> (Point -> Bool) for generating such functions corresponding to normal shapes; that representation is awesome for forming composite intersection/union shapes without knowing anything about them (intersect shapeA shapeB = \point -> shapeA point && shapeB point), but terrible for calculating things like areas and circumferences.
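The "shape as a predicate" idea is not specific to Haskell. As a hedged illustration, here is the same representation sketched in C++ with std::function; circle and intersect mirror the names in the paragraph above, while unite is an extra helper added for the example.

```cpp
#include <functional>

struct Point {
    int x;
    int y;
};

// A shape is just a membership test: "is this point inside?"
using Shape = std::function<bool(Point)>;

// Disc with the given centre and radius (boundary included).
Shape circle(Point center, int radius) {
    return [center, radius](Point p) {
        const long long dx = static_cast<long long>(p.x) - center.x;
        const long long dy = static_cast<long long>(p.y) - center.y;
        return dx * dx + dy * dy <= static_cast<long long>(radius) * radius;
    };
}

// Composite shapes need no knowledge of what they are combining.
Shape intersect(Shape a, Shape b) {
    return [a, b](Point p) { return a(p) && b(p); };
}

// Hypothetical helper, not named in the text above: union of two shapes.
Shape unite(Shape a, Shape b) {
    return [a, b](Point p) { return a(p) || b(p); };
}
```

As the answer says, this makes combining shapes trivial, but there is no way to ask such a function for its area or perimeter.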
[1] If you have abstract classes with data members, or concrete classes that also have further subclasses, you can manually push the data members down into the "leaves"; factor out the inherited data members into a shared record and make all of the "leaves" contain one of those; or split a layer so that you have a product type containing the inherited data members and a sum type (where that sum type then "splits" into the options for the subclasses); stuff like that.
[2] If you use catch-all patterns then the compiler cannot warn you about the unhandled new case, so it's not always bulletproof, but how bulletproof it is is up to how you code.
[3] Unless you opt into runtime type information with a solution like Typeable, but that's not an invisible change; your callers have to opt into it as well.
[4] Actually I probably wouldn't design a hierarchy like this even in OO languages. I find it doesn't turn out to be as useful as you'd think in real programs, hence the "favour composition over inheritance" advice.
You may be looking for a Haskell equivalent of dynamic dispatch, such that you could store a heterogeneous list of values supporting distinct implementations of a common Shape interface.
Haskell's existential types support this kind of usage. It's fairly rare for a Haskell program to actually need existential types -- as Ben's answer demonstrates, sum types can handle this kind of problem. However, existential types are appropriate for a large, open-ended collection of cases:
{-# LANGUAGE ExistentialQuantification #-}
...
class Shape a where
    bounds :: a -> AABB
    draw   :: a -> IO ()
data AnyShape = forall a. Shape a => AnyShape a
This lets you declare instances in an open-ended style:
data Line = Line Point Point
instance Shape Line where ...
data Circle = Circle {center :: Point, radius :: Double}
instance Shape Circle where ...
...
Then, you can build your heterogeneous list:
shapes = [AnyShape (Line a b),
          AnyShape (Circle a 3.0),
          AnyShape (Circle b 1.8)]
and use it in a uniform way:
drawIn box xs = sequence_ [draw s | AnyShape s <- xs, bounds s `hits` box]
Note that you need to unwrap your AnyShape in order to use the class Shape interface functions. Also note that you must use the class functions to access your heterogeneous data -- there is no other way to "downcast" the unwrapped existential value s! Its type only makes sense within the local scope, so the compiler will not let it escape.
If you are trying to use existential types, yet find yourself needing to "downcast" them, sum types might be a better fit.

Standard ML: Datatype vs. Structure

I'm reading through Paulson's ML For the Working Programmer and am a bit confused about the distinction between datatypes and structures.
On p. 142, he defines a type for binary trees as follows:
datatype 'a tree = Lf
                 | Br of 'a * 'a tree * 'a tree;
This seems to be a recursive definition where 'a denotes some fixed type. So any time I see 'a, it must refer to the same type throughout.
On p. 148, he discusses a structure for binary trees:
"...we have been following an imaginary ML session in which we typed in the tree functions one at a time. Now we ought to collect the most important of those functions into a structure, called Tree. We really must do so, because one of our functions (size) clashes with a built-in function. One reason for using structures is to prevent such name clashes.
We shall, however, leave the datatype declaration of tree outside of the structure. If it were inside, we should be forced to refer to the constructors by Tree.Lf and Tree.Br, which would make our patterns unreadable. Thus, in the sequel, imagine that we have made the following declarations:"
datatype 'a tree = Lf
                 | Br of 'a * 'a tree * 'a tree;
structure Tree =
  struct
  fun size Lf = 0
    | size (Br(v, t1, t2)) = 1 + size t1 + size t2;
  fun depth...
  etc...
  end;
I'm a little confused.
1) What is the relationship between a datatype and a structure?
2) What is the role of "struct" within the structure definition?
3) Later on, Paulson discusses a structure for dictionaries as binary search trees. He does the following:
structure Dict : DICTIONARY =
  struct
  type key = string;
  type 'a t = (key * 'a) tree;
  val empty = Lf;
  <a bunch of functions for dictionaries>
This makes me think struct specifies the different primitive or compound types involved in the definition of a Dict.
That's a really fuzzy definition though. Anyone like to clarify?
Thanks for the help,
bclayman
A structure is a module. Everything between the struct and end keywords forms the body of this module. Similarly, you can view a signature as the description of an abstract module interface. Ascribing a signature to a structure (like the : DICTIONARY syntax does in your example) limits the exports of the module to what is specified in that signature (by default, everything would be accessible). That allows you to hide implementation details of a module.
However, ML modules are much richer than that. They can be arbitrarily nested. There are also functors, which are effectively functions from modules to modules ("parameterised modules", if you want). Altogether, the module language in ML forms a full functional language on its own, with structures as the basic entities, functors over them, and signatures describing the "types" of such modules. This little language is a layer on top of the so-called core language, where ordinary values and types live.
So, to answer your individual questions:
1) There is no specific relationship between the datatype and the structure. The latter simply uses the former.
2) struct-end is simply a keyword pair to delimit the structure body (languages in C tradition would probably use curly braces there).
3) As explained above, a structure is a basic module. It can contain (and export) arbitrary other language entities, including other modules. By grouping definitions together, and potentially hiding some of them through a signature ascription, you can express namespacing and encapsulation (in particular, abstract data types).
I should also note that Paulson's book is outdated regarding its description of modules, as it predates the current language version. In particular, it does not describe how to express abstract data types through modules, but instead introduces the obsolete abstype declaration which nobody has been using in almost 20 years. A more extensive and up-to-date introduction to modular programming in ML can be found in Harper's Programming in Standard ML.
In this example, the datatype 'a tree describes a binary tree (https://en.wikipedia.org/wiki/Binary_tree) that is capable of storing values of any single type. The 'a in the definition is a type variable, which is instantiated to a concrete type wherever tree is used (for example int tree or string tree). This allows you to define the structure of a tree once and then use it with any type later on.
The Tree structure is separate from the datatype definition. It is being used to group functions together that operate on the 'a tree datatype. It is being used right now as a way to modularize the code and, as it points out, to prevent namespace clashes.
struct is just a keyword that tells the compiler where your structure body starts, while the end keyword tells the compiler where it ends.
The dictionary structure is defining a dictionary (a key -> value data structure) that uses a tree as the internal data structure. Once again, the structure is a collection of functions that will be used to create and operate on dictionaries. The types within the dictionary structure compose the type of the internal data structure that makes up the dictionary. The following functions define the public interface that you're exposing to allow clients to work with dictionaries.

UML Design class diagram: Class with another class as attribute?

I'm having a pretty hard time trying to figure out how to model a certain scenario as a UML design class diagram.
Suppose I have the following situation:
I have a class named CPoint that has two attributes: x and y (coordinates in a R2 plane). Additionally, I have a class named CLine that should have two CPoint as attributes.
This is pretty straight forward to code (I'll use C++ in my example):
class CPoint{
    float x;
    float y;
    //Constructor, gets and sets here
};
And for CLine:
class CLine{
    CPoint p1;
    CPoint p2;
    //Constructor, gets and sets here
};
Now my question is: How do I model such a thing in UML?
I thought of something similar to this:
But then I was told that this is violating the principles of object oriented modeling, so then I did this:
But it does not convince me at all. Additionally, I was reading about design patterns and came to this UML design while reading about singletons:
Which makes me think my initial approach was just right. Additionally, I'm able to see that my first approach is just alright if I think about it as a C++ program. In Java, however, I'd still have to create the object by doing new CPoint(0, 0) in the CLine's constructor. I'm really confused about this.
So, how do I model this situation? Am I perhaps being too concrete when I attempt to model the situation?
Thanks in advance! This isn't letting me sleep at night
In UML an association or an attribute (property) are more or less the same thing, so they are both correct.
In most UML tools however they are different things.
There is not really a rule here, but there are best practices.
My UML Best Practice: Attribute or Association says:
Use Associations for Classes and Attributes for DataTypes
If your CLine has exactly two ends, each represented by a point, then you can define it in UML as a class CLine with attributes (just like the CLine in your first example, which is OK, but without the association "has"), or you can design it as a CLine class with two associations to CPoint. The multiplicity at the CPoint side will be 1, with role p1 for the first association and p2 for the second.
There is not one best solution. It depends on the context and what you want to model. I agree with Vladimir that you would have two relations with roles p1 and p2. The members x and y should be private, I guess (-x, -y), and not public (+x, +y). Furthermore, you could model the relation as an aggregation or a composition (open or closed diamond symbol), but if a single point can be the endpoint of two lines then composition is not appropriate. Again, this depends on what you want to model. If you construct a new point in the line constructor as stated in the question, then you probably want to use a composition relation, as these points do not exist without the line.
(Btw, in the code the coordinates are float and in the diagram ints).
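The two readings discussed above can be contrasted in C++. This is only a sketch under assumptions: CPoint and CLine come from the question, while CSharedLine and the accessor names are hypothetical additions for illustration.

```cpp
#include <memory>
#include <utility>

class CPoint {
public:
    CPoint(float x, float y) : x_(x), y_(y) {}
    float getX() const { return x_; }
    float getY() const { return y_; }
    void setX(float x) { x_ = x; }
    void setY(float y) { y_ = y; }

private:
    float x_;  // -x in the diagram
    float y_;  // -y in the diagram
};

// Composition (filled diamond): the line holds its own copies of the
// endpoints, so they live and die with the line.
class CLine {
public:
    CLine(const CPoint& p1, const CPoint& p2) : p1_(p1), p2_(p2) {}
    const CPoint& p1() const { return p1_; }
    const CPoint& p2() const { return p2_; }

private:
    CPoint p1_;
    CPoint p2_;
};

// Plain association (or shared aggregation): several lines may refer to
// the same point object, which exists independently of any one line.
class CSharedLine {
public:
    CSharedLine(std::shared_ptr<CPoint> p1, std::shared_ptr<CPoint> p2)
        : p1_(std::move(p1)), p2_(std::move(p2)) {}
    const CPoint& p1() const { return *p1_; }
    const CPoint& p2() const { return *p2_; }

private:
    std::shared_ptr<CPoint> p1_;
    std::shared_ptr<CPoint> p2_;
};
```

With CLine, moving the original point afterwards has no effect on the line; with CSharedLine, moving a shared corner moves it for every line that references it. Which behaviour you want is the modelling decision the diagram should express.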

difference between unidirectional association and dependency

According to Wikipedia:
Dependency is a relationship that shows that an element, or set of elements, requires other model elements for their specification or implementation.[1] The element is dependent upon the independent element, called the supplier.
So is it not the same as unidirectional association?
Do we use dependency when an operation in one class uses object of the other class as its parameter?
How are unidirectional association and dependency different?
Any example would be very helpful
Dependency:
Indicates that a client element (of any kind, including classes, packages, use cases, etc.) has knowledge of another supplier element, and that a change in the supplier can affect the client.
So "dependency" is a very broad relationship. If a class (client) has an object of another class (supplier) as a member, if it sends a message to an object of another class, if it takes an object of another class as a parameter of one of its methods, or even if it is a subclass of another class (supplier), there is a dependency, since a change in the supplier will affect the client.
Technically, all of those relationships can be shown with a "dependency" line. But some of them already have special notations: for the superclass-subclass relationship we have the generalization relationship, so there is no need to also draw a dependency line, because generalization implies dependency. And we have the "association" relationship for a class (client) that has an object of another class as a member [attribute], so no extra dependency line is needed in that situation either.
Actually, "dependency" is a badly defined relationship for class diagrams. But it can be useful for showing dependencies for which UML has no special notation, such as:
if you have an object of another class (supplier) as a parameter in one of your class's (client's) methods
if you depend on global variables
when you call static methods on other classes
local variables (where you think the dependency is important)
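public class RepositoryManager
{
    // Dependency: ProductDescription is only a method parameter.
    public void UpdatePriceFor(ProductDescription description)
    {
        Time date = Clock.GetTime();                // dependency: static method call
        Money oldPrice = description.GetPrice();    // dependency: local variable
        ...
    }
    // Association: Item is held as an attribute.
    private IList<Item> itemsList = new List<Item>();
}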
So all associations also imply a dependency, but "dependency" is a broad, general, weak relationship. As a rule, if there is a special relationship which is more specific and stronger than dependency, then use it. And lastly, use all your relationships "economically": show only the important ones, based on the perspectives of the modeler and the model reader.
[ Source : Adapted from Craig Larman's Applying UML and Patterns book ]
Check Fowler's bliki for further information: DependencyAndAssociation
Association means that the two associated entities are linked semantically. Dependency only declares that there is a... well, dependency of some sort. All associations are dependencies, while a dependency does not actually mean association. For example, class 'A' depends on class 'B' if it has a method that takes 'B' and passes it as argument to a function in another class. But if 'A' calls some method of class 'B', it should be modeled as association.
Disclaimer: I have read the UML specification and also asked myself this question a number of times. I arrived at the definition above, but I'm still not sure it is 100% correct.
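One common way to read the distinction in code (following the first answer; the second answer draws the line slightly differently) is "held as a member" versus "merely used". A hedged C++ sketch, reusing the class names from the example above with hypothetical members and method names:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Item {
public:
    explicit Item(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }

private:
    std::string name_;
};

class ProductDescription {
public:
    explicit ProductDescription(double price) : price_(price) {}
    double getPrice() const { return price_; }

private:
    double price_;
};

class RepositoryManager {
public:
    // Dependency (dashed arrow): ProductDescription is only a parameter.
    // RepositoryManager keeps no link to it after the call returns.
    double priceWithTax(const ProductDescription& description, double rate) const {
        return description.getPrice() * (1.0 + rate);
    }

    // Association (solid line): Items are stored in a member, so each
    // RepositoryManager instance stays linked to its Items over time.
    void add(const Item& item) { items_.push_back(item); }
    std::size_t itemCount() const { return items_.size(); }

private:
    std::vector<Item> items_;
};
```

In a diagram drawn this way, RepositoryManager would have a unidirectional association to Item (multiplicity *) and only a dependency on ProductDescription.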

What should I name a class whose sole purpose is procedural?

I have a lot to learn in the way of OO patterns and this is a problem I've come across over the years. I end up in situations where my classes' sole purpose is procedural, just basically wrapping a procedure up in a class. It doesn't seem like the right OO way to do things, and I wonder if someone is experienced with this problem enough to help me consider it in a different way. My specific example in the current application follows.
In my application I'm taking a set of points from engineering survey equipment and normalizing them to be used elsewhere in the program. By "normalize" I mean a set of transformations of the full data set until a destination orientation is reached.
Each transformation procedure will take as input an array of points (i.e. of the form class point { float x; float y; float z; }) and return an array of the same length but with different values. For example, a transformation like point[] RotateXY(point[] inList, float angle). The other kind of procedure would be of the analysis type, used to supplement the normalization process and decide what transformation to do next. This type of procedure takes in the same points as a parameter but returns a different kind of dataset.
My question is, what is a good pattern to use in this situation? The one I was about to code in was a Normalization class which inherits class types of RotationXY for instance. But RotationXY's sole purpose is to rotate the points, so it would basically be implementing a single function. This doesn't seem very nice, though, for the reasons I mentioned in the first paragraph.
Thanks in advance!
The most common/natural approach for finding candidate classes in your problem domain is to look for nouns and then scan for the verbs/actions associated with those nouns to find the behavior that each class should implement. While this is generally good advice, it doesn't mean that your objects must only represent concrete elements. When processes (which are generally modeled as methods) start to grow and become complex, it is good practice to model them as objects. So, if your transformation carries weight of its own, it is OK to model it as an object and do something like:
class RotateXY
{
    public function apply(point p)
    {
        //Apply the transformation
    }
}
t = new RotateXY();
newPoint = t->apply(oldPoint);
In case you have many transformations, you can create a polymorphic hierarchy and even chain one transformation after another. If you want to dig a bit deeper, you can also take a look at the Command design pattern, which is closely related to this.
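A polymorphic hierarchy with chaining could look roughly like the following C++ sketch. RotateXY is the name from the question; Transformation, Translate, Chain and applyAll are hypothetical names chosen for the example, and the angle is assumed to be in radians.

```cpp
#include <cmath>
#include <memory>
#include <utility>
#include <vector>

struct Point {
    float x;
    float y;
    float z;
};

// Common interface: a transformation maps one point to another.
class Transformation {
public:
    virtual ~Transformation() = default;
    virtual Point apply(const Point& p) const = 0;

    // Convenience: apply the per-point transformation to a whole data set.
    std::vector<Point> applyAll(const std::vector<Point>& in) const {
        std::vector<Point> out;
        out.reserve(in.size());
        for (const Point& p : in) {
            out.push_back(apply(p));
        }
        return out;
    }
};

// Rotation in the XY plane (about the Z axis), angle in radians.
class RotateXY : public Transformation {
public:
    explicit RotateXY(float angle) : cos_(std::cos(angle)), sin_(std::sin(angle)) {}
    Point apply(const Point& p) const override {
        return Point{p.x * cos_ - p.y * sin_, p.x * sin_ + p.y * cos_, p.z};
    }

private:
    float cos_;
    float sin_;
};

class Translate : public Transformation {
public:
    Translate(float dx, float dy, float dz) : dx_(dx), dy_(dy), dz_(dz) {}
    Point apply(const Point& p) const override {
        return Point{p.x + dx_, p.y + dy_, p.z + dz_};
    }

private:
    float dx_;
    float dy_;
    float dz_;
};

// Composite: runs its child transformations in the order they were added.
class Chain : public Transformation {
public:
    void add(std::unique_ptr<Transformation> step) { steps_.push_back(std::move(step)); }
    Point apply(const Point& p) const override {
        Point result = p;
        for (const auto& step : steps_) {
            result = step->apply(result);
        }
        return result;
    }

private:
    std::vector<std::unique_ptr<Transformation>> steps_;
};
```

Because Chain is itself a Transformation, a whole normalization pipeline can be built up step by step and then handed around as a single object.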
Some final comments:
If it fits your case, it is a good idea to model the transformation at the point level and then apply it to a collection of points. That way you can properly isolate the transformation concept, and it is also easier to write test cases. Later you can even create a Composite of transformations if you need one.
I generally don't like the Utils (or similar) classes with a bunch of static methods, since in most of the cases it means that your model is missing the abstraction that should carry that behavior.
HTH
Typically, when it comes to classes that contain only static methods, I name them Util, e.g. DbUtil for facading DB access, FileUtil for file I/O etc. So find some term that all your methods have in common and name it that Util. Maybe in your case GeometryUtil or something along those lines.
Since the particulars of the transformations you apply seem ad-hoc for the problem and possibly prone to change in the future you could code them in a configuration file.
The point's client would read from the file and know what to do. As for the rotation or any other transformation method, they could go well as part of the Point class.
I see nothing particularly wrong with classes/interfaces having just essentially one member.
In your case the member is an "operation with some arguments of one type that returns the same type", which is common for some math/functional problems. You may find it convenient to have an interface/base class and helper methods that combine multiple transformation classes together into a more complex transformation.
Alternative approach: if your language supports it, just go functional style altogether (similar to LINQ in C#).
On the functional style suggestion: I'd start with the following basic functions (you can probably just find them in the standard libraries for your language)
collection = map(collection, perItemFunction) to transform all items in a collection (Select in C#)
item = reduce(collection, aggregateFunction) to reduce all items into a single entity (Aggregate in C#)
combine 2 functions on an item: funcOnItem = combine(funcFirst, funcSecond). Can be expressed as a lambda in C#: Func<T,T> combined = x => second(first(x)).
"bind"/curry - fix one of the arguments of a function: functionOfOneArg = curry(funcOfArgs, fixedFirstArg). Can be expressed in C# as a lambda: Func<T,T> curried = x => funcOfTwoArg(fixedFirstArg, x).
This list will let you do something like "rotate all points in the collection about the X axis by 10 and shift Y by 15": map(points, combine(curry(rotateX, 10), curry(shiftY, 15))).
The syntax will depend on the language. E.g. in JavaScript you just pass functions (and map/reduce are already part of the language); in C#, lambdas and the Func classes (such as Func<T,R> for a one-argument function) are an option. In some languages you have to explicitly use a class/interface to represent a "function" object.
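For comparison, the same building blocks can be sketched in C++ with std::function and lambdas. Everything here is an assumption made for illustration: mapPoints stands in for map, rotateX takes its angle in degrees and rotates about the X axis, and shiftY adds an offset to the y coordinate.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>
#include <iterator>
#include <vector>

struct Point {
    float x;
    float y;
    float z;
};

using PointFn = std::function<Point(Point)>;

// map: apply a per-point function to every element of a collection.
std::vector<Point> mapPoints(const std::vector<Point>& points, const PointFn& f) {
    std::vector<Point> out;
    out.reserve(points.size());
    std::transform(points.begin(), points.end(), std::back_inserter(out), f);
    return out;
}

// combine: run `first`, then feed its result to `second`.
PointFn combine(PointFn first, PointFn second) {
    return [first, second](Point p) { return second(first(p)); };
}

// curry: fix the first (float) argument of a two-argument function.
PointFn curry(std::function<Point(float, Point)> f, float fixedFirstArg) {
    return [f, fixedFirstArg](Point p) { return f(fixedFirstArg, p); };
}

// Rotate about the X axis; the angle is assumed to be in degrees.
Point rotateX(float degrees, Point p) {
    const float radians = degrees * std::acos(-1.0f) / 180.0f;
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    return Point{p.x, p.y * c - p.z * s, p.y * s + p.z * c};
}

// Shift along the Y axis.
Point shiftY(float amount, Point p) {
    return Point{p.x, p.y + amount, p.z};
}
```

With these in place, the sentence from the answer becomes mapPoints(points, combine(curry(rotateX, 10.0f), curry(shiftY, 15.0f))).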
Alternative approach: if you are actually dealing with points and transformations, another traditional approach is to use a matrix to represent all linear operations (if your language supports custom operators, you get very natural-looking code).