At what level should I do checks? - oop

I am designing a new class hierarchy in which the same function, boolean isCellEmpty(), appears at each level of abstraction. The Matrix class is at the bottom of the hierarchy and the GraphMainWindow class is at the top.
Where should I do the checks (e.g. i >= 0, i < xCellsCount, j >= 0 and so on)?

Good question, I have wondered about it myself many times. Answer: at the lowest level. That way errors never slip through undetected.
You can still check for errors at a higher level where it makes sense for the algorithm, but the lowest level is the most important.
There are some exceptions to this. For example, if the error is reported via a message that holds up the application and you expect many errors to occur at the lowest level. But these cases are not common, and you can bend the above rule if you feel it gets in your way.
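A minimal sketch of checking at the lowest level, assuming the upper layers hold a Matrix and delegate to it rather than inherit from it. The names set_cell and _check_bounds, and the choice of IndexError, are illustrative assumptions and not part of the question.

```python
class Matrix:
    """Lowest level: owns the cell storage, so it validates every index."""

    def __init__(self, x_cells_count, y_cells_count):
        self.x_cells_count = x_cells_count
        self.y_cells_count = y_cells_count
        self._cells = [[None] * y_cells_count for _ in range(x_cells_count)]

    def _check_bounds(self, i, j):
        # Hypothetical helper: the single place where indices are validated.
        if not (0 <= i < self.x_cells_count and 0 <= j < self.y_cells_count):
            raise IndexError(f"cell ({i}, {j}) is outside the matrix")

    def set_cell(self, i, j, value):
        self._check_bounds(i, j)
        self._cells[i][j] = value

    def is_cell_empty(self, i, j):
        self._check_bounds(i, j)
        return self._cells[i][j] is None


class GraphMainWindow:
    """Top level: delegates and relies on Matrix to reject bad indices."""

    def __init__(self, matrix):
        self._matrix = matrix

    def is_cell_empty(self, i, j):
        return self._matrix.is_cell_empty(i, j)
```

Because every access funnels through Matrix, a bad index coming from any upper layer is caught even when that layer does no checking of its own.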

The simple answer is: at the most generic level possible. The first inheritable class that declares those variables should perform the checks. Anything below that should just defer to the superclass unless overridden functionality is required. In a class further up the inheritance hierarchy from the one you've chosen to use for the checks, the method handling the checks should probably notify subclasses that haven't implemented an overridden version that they're getting default (and possibly useless) behavior. I often raise an exception in such a case.
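One way this could look, as a sketch only: the class that declares the dimensions performs the check once, and its default lookup raises so that a subclass without an override finds out immediately. CellContainer, SparseCells, fill and _lookup_is_empty are hypothetical names chosen for the illustration.

```python
class CellContainer:
    """Hypothetical most generic class: it declares the dimensions, so it owns the checks."""

    def __init__(self, x_cells_count, y_cells_count):
        self.x_cells_count = x_cells_count
        self.y_cells_count = y_cells_count

    def is_cell_empty(self, i, j):
        # The bounds check lives here once; subclasses inherit it unchanged.
        if not (0 <= i < self.x_cells_count and 0 <= j < self.y_cells_count):
            raise IndexError(f"cell ({i}, {j}) is out of range")
        return self._lookup_is_empty(i, j)

    def _lookup_is_empty(self, i, j):
        # No useful default exists, so tell the subclass loudly.
        raise NotImplementedError(
            f"{type(self).__name__} does not override _lookup_is_empty"
        )


class SparseCells(CellContainer):
    """Hypothetical subclass that supplies storage and nothing else."""

    def __init__(self, x_cells_count, y_cells_count):
        super().__init__(x_cells_count, y_cells_count)
        self._filled = set()

    def fill(self, i, j):
        self._filled.add((i, j))

    def _lookup_is_empty(self, i, j):
        return (i, j) not in self._filled
```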

So, to put it in a nutshell, you have this class diagram:
Matrix                 a
  ^                    ^
  |                    |
 ...                   b     (means: b inherits a)
  ^
  |
GraphMainWindow
You have a method isCellEmpty that is found in the base class and in every inherited one.
If the data structure that isCellEmpty uses for its checks does not change anywhere below the Matrix class, do the checks in the Matrix class, which is the most generic one.
If a class changes the data structure it inherited from Matrix, implement the checks in the class that changed it.
Regards

Related

UML Design class diagram: Class with another class as attribute?

I'm having a pretty hard time trying to figure out how to model a certain scenario as a UML design class diagram.
Suppose I have the following situation:
I have a class named CPoint that has two attributes: x and y (coordinates in a R2 plane). Additionally, I have a class named CLine that should have two CPoint as attributes.
This is pretty straight forward to code (I'll use C++ in my example):
class CPoint{
    float x;
    float y;
    //Constructor, gets and sets here
};
And for CLine:
class CLine{
    CPoint p1;
    CPoint p2;
    //Constructor, gets and sets here
};
Now my question is: How do I model such a thing in UML?
I thought of something similar to this:
But then I was told that this is violating the principles of object oriented modeling, so then I did this:
But it does not convince me at all. Additionally, I was reading about design patterns and came to this UML design while reading about singletons:
Which makes me think my initial approach was just right. Additionally, I'm able to see that my first approach is just alright if I think about it as a C++ program. In Java, however, I'd still have to create the object by doing new CPoint(0, 0) in the CLine's constructor. I'm really confused about this.
So, how do I model this situation? Am I perhaps being too concrete when I attempt to model the situation?
Thanks in advance! This isn't letting me sleep at night
In UML an association or an attribute (property) are more or less the same thing, so they are both correct.
In most UML tools however they are different things.
There is not really a rule here, but there are best practices.
My UML Best Practice: Attribute or Association says:
Use Associations for Classes and Attributes for DataTypes
If your CLine has exactly two ends, each represented by a point, then you can define it in UML as a class CLine with attributes (your CLine in the first example is OK, but without the "has" association), or you can design it as a CLine class with two associations to CPoint. The multiplicity at the CPoint side will be 1, with role p1 for the first association and p2 for the second.
There is not one best solution. It depends on the context and what you want to model. I agree with Vladimir that you would have two relations with roles p1 and p2. The members x and y should be private I guess (-x, -y) and not public (+x, +y). Furthermore, you could model the relation as an aggregation or a composition (open or closed diamond symbol), but if a single point can be the endpoint of two lines then that is not appropriate. Again, this depends on what you want to model. If you construct a new point in the line constructor, as stated in the question, then you probably want to use a composition relation, as these points do not exist without the line.
(Btw, in the code the coordinates are float and in the diagram ints).
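The difference between the two readings can be sketched in code. This is an illustration in Python rather than the C++ of the question, and it assumes default coordinates of 0.0; only the names CPoint, CLine, p1 and p2 come from the question.

```python
from dataclasses import dataclass, field


@dataclass
class CPoint:
    x: float = 0.0
    y: float = 0.0


@dataclass
class CLine:
    # Composition-style default: each line builds its own end points.
    p1: CPoint = field(default_factory=CPoint)
    p2: CPoint = field(default_factory=CPoint)


# Composition: the points are created together with the line.
owned = CLine()

# Plain association: one point is the end point of two lines at once,
# which is the case where a composition diamond would be misleading.
shared = CPoint(1.0, 1.0)
first = CLine(CPoint(0.0, 0.0), shared)
second = CLine(shared, CPoint(2.0, 0.0))
shared.x = 5.0  # visible through both lines
```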

What should I name a class whose sole purpose is procedural?

I have a lot to learn in the way of OO patterns and this is a problem I've come across over the years. I end up in situations where my classes' sole purpose is procedural, just basically wrapping a procedure up in a class. It doesn't seem like the right OO way to do things, and I wonder if someone is experienced with this problem enough to help me consider it in a different way. My specific example in the current application follows.
In my application I'm taking a set of points from engineering survey equipment and normalizing them to be used elsewhere in the program. By "normalize" I mean a set of transformations of the full data set until a destination orientation is reached.
Each transformation procedure will take the input of an array of points (i.e. of the form class point { float x; float y; float z; }) and return an array of the same length but with different values. For example, a transformation like point[] RotateXY(point[] inList, float angle). The other kind of procedure would be of the analysis type, used to supplement the normalization process and decide what transformation to do next. This type of procedure takes in the same points as a parameter but returns a different kind of dataset.
My question is, what is a good pattern to use in this situation? The one I was about to code in was a Normalization class which inherits class types of RotationXY for instance. But RotationXY's sole purpose is to rotate the points, so it would basically be implementing a single function. This doesn't seem very nice, though, for the reasons I mentioned in the first paragraph.
Thanks in advance!
The most common/natural approach for finding candidate classes in your problem domain is to look for nouns and then scan for the verbs/actions associated with those nouns to find the behavior that each class should implement. While this is generally good advice, it doesn't mean that your objects must only represent concrete elements. When processes (which are generally modeled as methods) start to grow and become complex, it is good practice to model them as objects. So, if your transformation carries weight of its own, it is OK to model it as an object and do something like:
class RotateXY
{
    public function apply(point p)
    {
        //Apply the transformation
    }
}
t = new RotateXY();
newPoint = t->apply(oldPoint);
In case you have many transformations, you can create a polymorphic hierarchy and even chain one transformation after another. If you want to dig a bit deeper, you can also take a look at the Command design pattern, which closely relates to this.
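A sketch of such a hierarchy with chaining, written in Python for illustration. Only RotateXY and apply come from the snippet above; Transformation, Translate, Chain and apply_all are hypothetical names, and the angle is assumed to be in radians.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float
    z: float


class Transformation:
    """Common interface: transform one point, or a whole list of points."""

    def apply(self, p):
        raise NotImplementedError

    def apply_all(self, points):
        return [self.apply(p) for p in points]


class RotateXY(Transformation):
    def __init__(self, angle):
        self.angle = angle  # radians (assumption)

    def apply(self, p):
        c, s = math.cos(self.angle), math.sin(self.angle)
        return Point(p.x * c - p.y * s, p.x * s + p.y * c, p.z)


class Translate(Transformation):
    def __init__(self, dx, dy, dz):
        self.dx, self.dy, self.dz = dx, dy, dz

    def apply(self, p):
        return Point(p.x + self.dx, p.y + self.dy, p.z + self.dz)


class Chain(Transformation):
    """Applies the given steps one after another, in order."""

    def __init__(self, *steps):
        self.steps = steps

    def apply(self, p):
        for step in self.steps:
            p = step.apply(p)
        return p


normalize = Chain(RotateXY(math.pi / 2), Translate(1.0, 0.0, 0.0))
result = normalize.apply_all([Point(1.0, 0.0, 0.0), Point(0.0, 0.0, 2.0)])
```

Since Chain is itself a Transformation, chains can be nested inside other chains without the caller noticing.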
Some final comments:
If it fits your case, it is a good idea to model the transformation at the point level and then apply it to a collection of points. That way you properly isolate the transformation concept, and it is also easier to write test cases. Later you can even create a Composite of transformations if you need one.
I generally don't like the Utils (or similar) classes with a bunch of static methods, since in most of the cases it means that your model is missing the abstraction that should carry that behavior.
HTH
Typically, when it comes to classes that contain only static methods, I name them Util, e.g. DbUtil for facading DB access, FileUtil for file I/O etc. So find some term that all your methods have in common and name it that Util. Maybe in your case GeometryUtil or something along those lines.
Since the particulars of the transformations you apply seem ad-hoc for the problem and possibly prone to change in the future you could code them in a configuration file.
The point's client would read from the file and know what to do. As for the rotation or any other transformation method, they could go well as part of the Point class.
I see nothing particularly wrong with classes/interfaces having just essentially one member.
In your case the member is an "operation with some arguments of one type that returns the same type", which is common for some math/functional problems. You may find it convenient to have an interface/base class and helper methods that combine multiple transformation classes into a more complex transformation.
Alternative approach: if your language supports it, just go functional style altogether (similar to LINQ in C#).
On the functional style suggestion: I'd start with the following basic functions (you can probably just find them in the standard libraries for the language)
collection = map(collection, perItemFunction) to transform all items in a collection (Select in C#)
item = reduce(collection, aggregateFunction) to reduce all items into a single entity (Aggregate in C#)
combine 2 functions on item funcOnItem = combine(funcFirst, funcSecond). Can be expressed as lambda in C# Func<T,T> combined = x => second(first(x)).
"bind"/curry - fix one of arguments of a function functionOfOneArg = curry(funcOfArgs, fixedFirstArg). Can be expressed in C# as lambda Func<T,T> curried = x => funcOfTwoArg(fixedFirstArg, x).
This list will let you do something like "rotate all points in a collection about the X axis by 10 and shift Y by 15": map(points, combine(curry(rotateX, 10), curry(shiftY, 15))).
The syntax will depend on the language. In JavaScript you just pass functions (and map/reduce are already part of the language); in C#, lambdas and the Func classes (such as Func<T,R> for a one-argument function) are an option. In some languages you have to explicitly use a class/interface to represent a "function" object.
Alternative approach: if you are actually dealing with points and transformations, another traditional approach is to use a Matrix to represent all linear operations (if your language supports custom operators, you get very natural-looking code).
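In Python, for instance, the four building blocks map onto the built-in map, functools.reduce and functools.partial, plus a small hand-written combine. The sketch below assumes points are (x, y, z) tuples and that the angle is given in degrees; rotate_x and shift_y are hypothetical stand-ins for the rotateX and shiftY named above.

```python
import math
from functools import partial, reduce


def rotate_x(angle_deg, point):
    """Rotate a point about the X axis by angle_deg degrees."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x, y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a))


def shift_y(offset, point):
    """Move a point along the Y axis by offset."""
    x, y, z = point
    return (x, y + offset, z)


def combine(*funcs):
    """Compose left to right: combine(f, g)(item) == g(f(item))."""
    return lambda item: reduce(lambda acc, f: f(acc), funcs, item)


# partial fixes the first argument, playing the role of curry in the list above.
transform = combine(partial(rotate_x, 10), partial(shift_y, 15))

points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
moved = list(map(transform, points))
```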
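A small sketch of the matrix approach, restricted to the XY plane to keep it short. A translation is not a linear map, so the sketch assumes homogeneous coordinates (x, y, 1) in order to express it as a matrix too. Matrix3, rotation and translation are hypothetical names, and Python's @ operator stands in for a custom multiplication operator.

```python
import math


class Matrix3:
    """3x3 matrix acting on 2D points written as (x, y, 1)."""

    def __init__(self, rows):
        self.rows = rows

    def __matmul__(self, other):
        # Standard row-by-column matrix product.
        return Matrix3([
            [sum(self.rows[i][k] * other.rows[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)
        ])

    def apply(self, point):
        x, y = point
        (a, b, tx), (c, d, ty), _ = self.rows
        return (a * x + b * y + tx, c * x + d * y + ty)


def rotation(angle):
    """Counter-clockwise rotation by angle radians."""
    c, s = math.cos(angle), math.sin(angle)
    return Matrix3([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def translation(dx, dy):
    return Matrix3([[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]])


# The right-most matrix is applied first: rotate, then translate.
normalize = translation(0.0, 15.0) @ rotation(math.pi / 2)
moved = normalize.apply((1.0, 0.0))
```

The whole sequence collapses into a single matrix, so a large data set only needs one multiplication per point regardless of how many steps were combined.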

Efficient way to define a class with multiple, optionally-empty slots in S4 of R?

I am building a package to handle data that arrives with up to 4 different types. Each of these types is a legitimate class in the form of a matrix, data.frame or tree. Depending on the way the data is processed and other experimental factors, some of these data components may be missing, but it is still extremely useful to be able to store this information as an instance of a special class and have methods that recognize the different component data.
Approach 1:
I have experimented with an incremental inheritance structure that looks like a nested tree, where each combination of data types has its own class explicitly defined. This seems difficult to extend for additional data types in the future, and is also challenging for new developers to learn all the class names, however well-organized those names might be.
Approach 2:
A second approach is to create a single "master-class" that includes a slot for all 4 data types. In order to allow the slots to be NULL for the instances of missing data, it appears necessary to first define a virtual class union between the NULL class and the new data type class, and then use the virtual class union as the expected class for the relevant slot in the master-class. Here is an example (assuming each data type class is already defined):
################################################################################
# Use setClassUnion to define the unholy NULL-data union as a virtual class.
################################################################################
setClassUnion("dataClass1OrNULL", c("dataClass1", "NULL"))
setClassUnion("dataClass2OrNULL", c("dataClass2", "NULL"))
setClassUnion("dataClass3OrNULL", c("dataClass3", "NULL"))
setClassUnion("dataClass4OrNULL", c("dataClass4", "NULL"))
################################################################################
# Now define the master class with all 4 slots, and
# also the possibility of empty (NULL) slots and an explicit prototype for
# slots to be set to NULL if they are not provided at instantiation.
################################################################################
setClass(Class="theMasterClass",
representation=representation(
slot1="dataClass1OrNULL",
slot2="dataClass2OrNULL",
slot3="dataClass3OrNULL",
slot4="dataClass4OrNULL"),
prototype=prototype(slot1=NULL, slot2=NULL, slot3=NULL, slot4=NULL)
)
################################################################################
So the question might be rephrased as:
Are there more efficient and/or flexible alternatives to either of these approaches?
This example is modified from an answer to a SO question about setting the default value of slot to NULL. This question differs in that I am interested in knowing the best options in R for creating classes with slots that can be empty if needed, despite requiring a specific complex class in all other non-empty cases.
In my opinion...
Approach 2
It sort of defeats the purpose to adopt a formal class system, and then to create a class that contains ill-defined slots ('A' or NULL). At a minimum I would try to make DataClass1 have a 'NULL'-like default. As a simple example, the default here is a zero-length numeric vector.
setClass("DataClass1", representation=representation(x="numeric"))
DataClass1 <- function(x=numeric(), ...) {
new("DataClass1", x=x, ...)
}
Then
setClass("MasterClass1", representation=representation(dataClass1="DataClass1"))
MasterClass1 <- function(dataClass1=DataClass1(), ...) {
new("MasterClass1", dataClass1=dataClass1, ...)
}
One benefit of this is that methods don't have to test whether the instance in the slot is NULL or 'DataClass1'
setMethod(length, "DataClass1", function(x) length(x@x))
setMethod(length, "MasterClass1", function(x) length(x@dataClass1))
> length(MasterClass1())
[1] 0
> length(MasterClass1(DataClass1(1:5)))
[1] 5
In response to your comment about warning users when they access 'empty' slots, and remembering that users usually want functions to do something rather than tell them they're doing something wrong, I'd probably return the empty object DataClass1() which accurately reflects the state of the object. Maybe a show method would provide an overview that reinforced the status of the slot -- DataClass1: none. This seems particularly appropriate if MasterClass1 represents a way of coordinating several different analyses, of which the user may do only some.
A limitation of this approach (or your Approach 2) is that you don't get method dispatch -- you can't write methods that are appropriate only for an instance with DataClass1 instances that have non-zero length, and are forced to do some sort of manual dispatch (e.g., with if or switch). This might seem like a limitation for the developer, but it also applies to the user -- the user doesn't get a sense of which operations are uniquely appropriate to instances of MasterClass1 that have non-zero length DataClass1 instances.
Approach 1
When you say that the names of the classes in the hierarchy are going to be confusing to your user, it seems like this is maybe pointing to a more fundamental issue -- you're trying too hard to make a comprehensive representation of data types; a user will never be able to keep track of ClassWithMatrixDataFrameAndTree because it doesn't represent the way they view the data. This is maybe an opportunity to scale back your ambitions to really tackle only the most prominent parts of the area you're investigating. Or perhaps an opportunity to re-think how the user might think of and interact with the data they've collected, and to use the separation of interface (what the user sees) from implementation (how you've chosen to represent the data in classes) provided by class systems to more effectively encapsulate what the user is likely to do.
Putting the naming and number of classes aside, when you say "difficult to extend for additional data types in the future" it makes me wonder if perhaps some of the nuances of S4 classes are tripping you up? The short solution is to avoid writing your own initialize methods, and rely on the constructors to do the tricky work, along the lines of
setClass("A", representation(x="numeric"))
setClass("B", representation(y="numeric"), contains="A")
A <- function(x = numeric(), ...) new("A", x=x, ...)
B <- function(a = A(), y = numeric(), ...) new("B", a, y=y, ...)
and then
> B(A(1:5), 10)
An object of class "B"
Slot "y":
[1] 10
Slot "x":
[1] 1 2 3 4 5

What is the best design pattern for this problem?

I have a class that has several properties. Some properties can be changed by other classes, but some depend on other properties. For example, assume that my class has three properties: A, B and C. A and B can be changed by other classes in the system, and C is equal to A + B. The class generates property change notifications, so when A or B changes I want a notification to be generated both for the changed property (A or B) and for C.
I have three options (any other?)
1- Create a normal C property (with backing field) and add code in setter of A and B to change C.
2- Create a normal C property and listen to property change notification of my class inside of my class and change C when A or B changes.
3- Create a calculated property for C with no setter and a getter that returns A + B; in the setter of A (and B), fire a property change for both A (or B) and C.
Which one is a better design pattern (in C#)? I personally like design number 2.
Sounds like an Observer pattern might be useful here. See for example http://www.oodesign.com/observer-pattern.html. A search for Observer pattern will also yield many other examples, some much simpler and some language-specific.
I would probably go with a variation on 2 and 3.
You could have a calculated property (getter only) for C so that the C = A + B calculation is only in one place.
Then, as per your option 2, you could listen to property changed events within the same class... but instead of updating C when you detect a PropertyChanged event for A and B, you only need to raise a PropertyChanged event for C at that time.
2 is the purest since it keeps A, B and C separate, but it does involve a bit of overhead with the string parsing in the property notification.
If it was a simple set of properties I'd be tempted with 1, since they are still reasonably separate but the update is much simpler. 3 is the worst IMO, since A+B are replicating code which should be separate anyway (C notifications).
The problem here is that you are trying to mix the way that things should be done with the way Microsoft forces you to do things... :)
But my rantings aside, I think that option 3 sounds cleanest. Certainly not 1, that is the worst by far, and I think that subscribing to your own property change events could lead to some funky problems that would be hard to debug when some poor sap tries to maintain the code in the future...
If you think about it at a high level, what you are suggesting in 3 perfectly describes what is happening in the class:
Any time that property A is changed observers of the class should be notified that property C has also changed (because it has).
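Option 3 can be sketched as follows. The question is about C#, so this Python version is only an illustration of the shape, with a plain list of callbacks standing in for the PropertyChanged event; Sum, subscribe and _notify are hypothetical names.

```python
class Sum:
    """A and B are settable; C is derived as A + B and has no setter."""

    def __init__(self, a=0, b=0):
        self._a = a
        self._b = b
        self._listeners = []

    def subscribe(self, listener):
        # listener is called as listener(source, property_name)
        self._listeners.append(listener)

    def _notify(self, *names):
        for name in names:
            for listener in self._listeners:
                listener(self, name)

    @property
    def a(self):
        return self._a

    @a.setter
    def a(self, value):
        if value != self._a:
            self._a = value
            self._notify("a", "c")  # C changed because A changed

    @property
    def b(self):
        return self._b

    @b.setter
    def b(self, value):
        if value != self._b:
            self._b = value
            self._notify("b", "c")  # C changed because B changed

    @property
    def c(self):
        # Computed on demand, so it can never be out of date.
        return self._a + self._b
```

The A + B calculation lives in one place, and the setters only need to know that C depends on them.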

What is the antonym of encapsulation?

Using online dictionary tools doesn't really help. I think the way encapsulate is used in computer science doesn't exactly match its meaning in plain English.
What is the antonym of computer science's version of encapsulate? More specifically, what is an antonym for encapsulate that would work as a function name?
Why should I care? Here's my motivation:
#include <cassert>

// A class with a private member variable;
class Private
{
public:
    // Test will be able to access Private's private members;
    class Test;
private:
    int i = 42; // initialized so the assert in unit_test() checks a defined value
};
// Make Test exactly like Private
class Private::Test : public Private
{
public:
    // Make Private's copy of i available publicly in Test
    using Private::i;
};
// A convenience function to quickly break encapsulation on a class to be tested.
// I don't have good name for what it does
Private::Test& foo( Private& p )
{ return *reinterpret_cast<Private::Test*>(&p); } // power cast
void unit_test()
{
    Private p;
    // using the function quickly grab access to p's internals.
    // obviously it would be evil to use this anywhere except in unit tests.
    assert( foo(p).i == 42 );
}
The antonym is "C".
Ok, just kidding. (Sort of.)
The best terms I can come up with are "expose" and "violate".
The purpose behind encapsulation is to hide/cover/protect. The antonym would be reveal/expose/make public.
How about Decapsulation?
It isn't a computer science term, but in medical science it means surgical removal of a capsule or enveloping membrane. Check it out here.
"Removing/Breaking encapsulation" is about the closest thing I've seen, honestly.
If you think of the word in the English sense, to encapsulate means to enclose within something. But in the CS sense, there's this concept of protection levels and it looks like you want to imply circumventing the access levels as well, so something like "extraction" doesn't really convey the meaning you're looking for.
But if you just think of it in terms of what the access levels are, it looks like you're making something public so, how about "publicizing"?
This is not such a simple question - Scott Meyers had an interesting article to demonstrate some of the nuances around encapsulation here.
I'll start with the punchline: If you're writing a function that can be implemented as either a member or as a non-friend non-member, you should prefer to implement it as a non-member function. That decision increases class encapsulation. When you think encapsulation, you should think non-member functions.
How about "Bad Idea"?
The true antonym of "Encapsulation" is "Global State".
The general opposite of encapsulation is coupling and we often talk about systems that are tightly coupled or loosely coupled.
The reason you'd want components to be encapsulated is because it makes it easier to reason about how they work.
Take the analogy of trains: the consequence of coupling the railcars is that the driver must consider the characteristics (inertia, length) of the entire train.
Obviously, though, we couple systems because we need them to work together.
Inverted encapsulation and data structures
There's another term that I've been digging for, which is how I came across this question, that refers to a non-standard style of data structures.
The standard style of encapsulation is exemplified by Java's LinkedList; the actual nodes of the list are designed to be inaccessible to the consumer. The theory is that this is an implementation detail and can change to improve performance, while existing code will continue to run.
Another style is the classic functional cons-list. This is a singly linked list, and the idea is that it's so simple that there's nothing to improve about the data structure, e.g.
data [a] = [] | a : [a] deriving (Eq, Ord)
-- Haskellers then work directly with the list
-- There's nothing to hide because it's so simple
typicalHaskell :: [a] -> b
typicalHaskell [] = emptyValue
typicalHaskell (h : t) = h `doAThing` (typicalHaskell t)
That's the definition from Haskell's standard prelude though the report notes that isn't valid Haskell syntax, and in practice [a] is defined in the guts of the compiler.
Then there's what I'm calling an "inverted" data structure, but I'm still looking for the correct term. This is, I think, really the opposite of encapsulation.
A good example of this is Python's heapq module. The data structure here is a binary heap, but there isn't a Heap class. Rather, you get a collection of functions that operate on generic Python lists and you're responsible for using those methods correctly to ensure the heap invariants are maintained.
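A short sketch of what that looks like in practice. The heapq functions used here (heapify, heappush, heappop) are part of the standard library; the variable names and values are made up for the illustration.

```python
import heapq

tasks = [5, 1, 4]          # an ordinary list, not a Heap object
heapq.heapify(tasks)       # rearranges the list in place into heap order
heapq.heappush(tasks, 0)   # keeps the invariant: tasks[0] is the smallest item
smallest = heapq.heappop(tasks)

# Nothing stops the caller from bypassing the module and breaking the invariant.
tasks.insert(0, 99)
broken_min = tasks[0]      # no longer the smallest element

# Restoring the invariant is the caller's job as well.
heapq.heapify(tasks)
```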
How about "spaghetti"?