Design patterns on initializing an object? - oop

What's the recommended way to handle an object that may not be fully initialized?
e.g. taking the following code (off the top of my head in ruby):
class News
attr_accessor :number
def initialize(site)
#site = site
end
def setup(number)
#number = number
end
def list
puts news_items(#site, #number)
end
end
Clearly if I do something like:
news = News.new("siteA")
news.list
I'm going to run into problems. I'd need to do news.setup(3) before news.list.
But, are there any design patterns around this that I should be aware of?
Should I be creating default values? Or using fixed numbers of arguments to ensure objects are correctly initialized?
Or am I simply worrying too much about the small stuff here.

Should I be creating default values?
Does it make sense to set a default? If so this is a perfectly valid approach IMHO
Or using fixed numbers of arguments to ensure objects are correctly initialized?
You should ensure that your objects cannot be constructed in an invalid state, this will make your's and other users of your code much simpler.
in your example not initializing number in some way is a problem, and this method is an example of temporal coupling. You should avoid this, and the two ways you suggested are ways to do this. Alternatively you can have another object or static method responsible for building your object in a valid state instead
If you do have an object which in not fully initialised then any invalid methods should produce appropriate and descriptive exceptions which let the users know that they are using the code incorrectly, and gives examples of the correct usage patterns.
In c# InvalidStateException is usually appropriate and similar exceptions exist in Java. Ruby is beyond my pay grade unfortunately :)

Related

What should I name a class whose sole purpose is procedural?

I have a lot to learn in the way of OO patterns and this is a problem I've come across over the years. I end up in situations where my classes' sole purpose is procedural, just basically wrapping a procedure up in a class. It doesn't seem like the right OO way to do things, and I wonder if someone is experienced with this problem enough to help me consider it in a different way. My specific example in the current application follows.
In my application I'm taking a set of points from engineering survey equipment and normalizing them to be used elsewhere in the program. By "normalize" I mean a set of transformations of the full data set until a destination orientation is reached.
Each transformation procedure will take the input of an array of points (i.e. of the form class point { float x; float y; float z; }) and return an array of the same length but with different values. For example, a transformation like point[] RotateXY(point[] inList, float angle). The other kind of procedure wold be of the analysis type, used to supplement the normalization process and decide what transformation to do next. This type of procedure takes in the same points as a parameter but returns a different kind of dataset.
My question is, what is a good pattern to use in this situation? The one I was about to code in was a Normalization class which inherits class types of RotationXY for instance. But RotationXY's sole purpose is to rotate the points, so it would basically be implementing a single function. This doesn't seem very nice, though, for the reasons I mentioned in the first paragraph.
Thanks in advance!
The most common/natural approach for finding candidate classes in your problem domain is to look for nouns and then scan for the verbs/actions associated with those nouns to find the behavior that each class should implement. While this is generally a good advise, it doesn't mean that your objects must only represent concrete elements. When processes (which are generally modeled as methods) start to grow and become complex, it is a good practice to model them as objects. So, if your transformation has a weight on its own, it is ok to model it as an object and do something like:
class RotateXY
{
public function apply(point p)
{
//Apply the transformation
}
}
t = new RotateXY();
newPoint = t->apply(oldPoint);
in case you have many transformations you can create a polymorphic hierarchy and even chain one transformation after another. If you want to dig a bit deeper you can also take a look at the Command design pattern, which closely relates to this.
Some final comments:
If it fits your case, it is a good idea to model the transformation at the point level and then apply it to a collection of points. In that way you can properly isolate the transformation concept and is also easier to write test cases. You can later even create a Composite of transformations if you need.
I generally don't like the Utils (or similar) classes with a bunch of static methods, since in most of the cases it means that your model is missing the abstraction that should carry that behavior.
HTH
Typically, when it comes to classes that contain only static methods, I name them Util, e.g. DbUtil for facading DB access, FileUtil for file I/O etc. So find some term that all your methods have in common and name it that Util. Maybe in your case GeometryUtil or something along those lines.
Since the particulars of the transformations you apply seem ad-hoc for the problem and possibly prone to change in the future you could code them in a configuration file.
The point's client would read from the file and know what to do. As for the rotation or any other transformation method, they could go well as part of the Point class.
I see nothing particularly wrong with classes/interfaces having just essentially one member.
In your case the member is an "Operation with some arguments of one type that returns same type" - common for some math/functional problems. You may find convenient to have interface/base class and helper methods that combine multiple transformation classes together into more complex transformation.
Alternative approach: if you language support it is just go functional style altogether (similar to LINQ in C#).
On functional style suggestion: I's start with following basic functions (probably just find them in standard libraries for the language)
collection = map(collection, perItemFunction) to transform all items in a collection (Select in C#)
item = reduce (collection, agregateFunction) to reduce all items into single entity (Aggregate in C#)
combine 2 functions on item funcOnItem = combine(funcFirst, funcSecond). Can be expressed as lambda in C# Func<T,T> combined = x => second(first(x)).
"bind"/curry - fix one of arguments of a function functionOfOneArg = curry(funcOfArgs, fixedFirstArg). Can be expressed in C# as lambda Func<T,T> curried = x => funcOfTwoArg(fixedFirstArg, x).
This list will let you do something like "turn all points in collection on a over X axis by 10 and shift Y by 15": map(points, combine(curry(rotateX, 10), curry(shiftY(15))).
The syntax will depend on language. I.e. in JavaScript you just pass functions (and map/reduce are part of language already), C# - lambda and Func classes (like on argument function - Func<T,R>) are an option. In some languages you have to explicitly use class/interface to represent a "function" object.
Alternative approach: If you actually dealing with points and transformation another traditional approach is to use Matrix to represent all linear operations (if your language supports custom operators you get very natural looking code).

Stateful objects, properties and parameter-less methods in favour of stateless objects, parameters and return values

I find this class definition a bit odd:
http://www.extremeoptimization.com/Documentation/Reference/Extreme.Mathematics.LinearAlgebra.SingleLeastSquaresSolver_Members.aspx
The Solve method does have a return value but would not need to because the result is also available in the Solution property.
This is what I see as traditional code:
var sqrt2 = Math.Sqrt(2)
This would be an alternative in the same spirit as the solver in the link:
var sqrtCalculator = new SqrtCalculator();
sqrtCalculator.Parameter = 2;
sqrtCalculator.Run();
var sqrt2 = sqrtCalculator.Result;
What are the pros and cons besides the second version being a bit "untraditional"?
Yes, the compiler won't help the user who forgot to assign some property (parameter) BUT this is the case with all components that contain writeable properties and don't have mandatory values in the constructor.
Yes, threading will not work, BUT each thread can create its own solver.
Yes, the garbage collector won't be able to dispose the solver's result, BUT if the entire solver is disposed it will.
Yes, compilers and processors have special treatment of parameters and return values which makes them fast, BUT the time for parameter handling is mostly neglectable.
And so on. Other ideas?
Well, after a year I found a clear flaw with this "introvert" approach. I am using an existing filter object which should operate on a measurement object but rather operates on itself in a "it's all me and nothing else"-fashion described above. Now the customer wants a recalculation of a measurement object a few minutes after the first calculation, and meanwhile the filter has processed other measurement objects. If it had been stateless and stored its data in the measurement object, it would have been an easy matter to implement a Recalculate method. The only way to solve the problem with an introvert filter is to let a filter instance be a part of the measurement object. Then filters need to be instantiated for every new measurement object. And since filters are a part of a chain the entire chain needs to be recreated. Well, there is some merit to being stateless.

Overextending object design by adding many trivial fields?

I have to add a bunch of trivial or seldom used attributes to an object in my business model.
So, imagine class Foo which has a bunch of standard information such as Price, Color, Weight, Length. Now, I need to add a bunch of attributes to Foo that are rarely deviating from the norm and rarely used (in the scope of the entire domain). So, Foo.DisplayWhenConditionIsX is true for 95% of instances; likewise, Foo.ShowPriceWhenConditionIsY is almost always true, and Foo.PriceWhenViewedByZ has the same value as Foo.Price most of the time.
It just smells wrong to me to add a dozen fields like this to both my class and database table. However, I don't know that wrapping these new fields into their own FooDisplayAttributes class makes sense. That feels like adding complexity to my DAL and BLL for little gain other than a smaller object. Any recommendations?
Try setting up a separate storage class/struct for the rarely used fields and hold it as a single field, say "rarelyUsedFields" (for example, it will be a pointer in C++ and a reference in Java - you don't mention your language.)
Have setters/getters for these fields on your class. Setters will check if the value is not the same as default and lazily initialize rarelyUsedFields, then set the respective field value (say, rarelyUsedFields.DisplayWhenConditionIsX = false). Getters they will read the rarelyUsedFields value and return default values (true for DisplayWhenConditionIsX and so on) if it is NULL, otherwise return rarelyUsedFields.DisplayWhenConditionIsX.
This approach is used quite often, see WebKit's Node.h as an example (and its focused() method.)
Abstraction makes your question a bit hard to understand, but I would suggest using custom getters such as Foo.getPrice() and Foo.getSpecialPrice().
The first one would simply return the attribute, while the second would perform operations on it first.
This is only possible if there is a way to calculate the "seldom used version" from the original attribute value, but in most common cases this would be possible, providing you can access data from another object storing parameters, such as FooShop.getCurrentDiscount().
The problem I see is more about the Foo object having side effects.
In your example, I see two features : display and price.
I would build one or many Displayer (who knows how to display) and make the price a component object, with a list of internal price modificators.
Note all this is relevant only if your Foo objects are called by numerous clients.

What's the benefit of postfixing arrays with "List" for variables?

I have seen numerous developers postfixing variable names with a "List" when it's an array. What's the point of this and would you encourage this style? For example:
// Java
String[] fileList;
String file;
// PHP
$fileList = array();
$file = '';
The idea applies to any language with support for arrays.
The idea? Readability - you can tell the variable is a collection in one glance. This can be achieved with pluralizing the variable name (ie. files).
If the fact that the data type is a list is significant (assuming several different collection types), then using that postfix is the right thing to do (as opposed to simply being a collection).
I personally tend to pluralize variable names, using a list postfix only if it adds information or can't be inferred from the name otherwise (say a list of lists).
I think this is mostly a matter of taste. I tend to name things to reflect what they are or what they do. So if I have a collection or array of File instances, the variable would most probably named files, or have a more specific name if the context allows it. Naming an array of Files fileList is in my humble opinion plain wrong, because, at least in Java, an array is not a List. But then, the compiler won't complain...
More complex collections like a Map get names like keyToValue. So if I had a map which assigns teachers to classrooms this would be called teacherToRoom in my code. I hate grepping through the code to find out what the variables are meant to do, so I try to be as specific as needed with the names.
In conclusion it's all about correct code, and variable names can not influence this outcome from the compiler perspective. But they can very well affect the outcome when it comes to humans working with the code, so it's best to do whatever works for the majority of people working on a codebase.

Object methods and stats - the best object oriented design approach question

I need to write some instance method, something like this (code in ruby):
def foo_bar(param)
foo(param)
if some_condition
do_bar(param)
else
do_baz(param)
end
end
Method foo_bar is a public api.
But I think, param variable here appears too many times. Maybe it would be better to create an private instance variable and use it in foo, do_bar and do_baz method? Like here: (#param is an instance variable in ruby, it can be initialized any time)
def foo_bar(param)
#param = param
foo
if some_condition
do_bar
else
do_baz
end
end
Which code is better? And why?
Is param replacing part of the state of the object?
If param is not changing the object state then it would be wrong to introduce non-obvious coupling between these methods as a convenience.
If param is altering the state of the object then it may still be bad practice to have a public api altering the state - much better to have a single private method responsible for checking and changing the state.
If param is directly setting the state of the object then I would change the instance variable here but only after checking that the new state is not inconsistent
The first version should be preferred for a couple of reasons. First, it makes testing much easier as each method is independent of other state. To test the do_bar method, simply create an instance of its containing class and invoke the method with various parameters. If you chose the second version of code, you'd have to make sure that the object had all the proper instance variables set before invoking the method. This tightly couples the test code with the object and results in broken test cases or, even worse, testcases that should no longer pass, but still do since they haven't been updated to match how the object now works.
The second reason to prefer the first version of code is that it is a more functional style and facilitates easier reuse. Say that another module or lambda function implements do_bar better than the current one. It won't have been coded to assume some parent class with a certain named instance variable. To be reusable, it will have expected any variables to be passed in as parameters.
The functional approach is the much better approach ... even in object oriented languages.
If you do not need param outside of the foo_bar method the first version is better. It is more obvious what information is being passed around and you are keeping it more thread friendly.
And I also agree with Mladen in the comment above: don't add something to the object state that doesn't belong there.