Overextending object design by adding many trivial fields?

Overextending object design by adding many trivial fields? - oop

I have to add a bunch of trivial or seldom used attributes to an object in my business model.
So, imagine class Foo which has a bunch of standard information such as Price, Color, Weight, Length. Now, I need to add a bunch of attributes to Foo that are rarely deviating from the norm and rarely used (in the scope of the entire domain). So, Foo.DisplayWhenConditionIsX is true for 95% of instances; likewise, Foo.ShowPriceWhenConditionIsY is almost always true, and Foo.PriceWhenViewedByZ has the same value as Foo.Price most of the time.
It just smells wrong to me to add a dozen fields like this to both my class and database table. However, I don't know that wrapping these new fields into their own FooDisplayAttributes class makes sense. That feels like adding complexity to my DAL and BLL for little gain other than a smaller object. Any recommendations?

Try setting up a separate storage class/struct for the rarely used fields and hold it as a single field, say "rarelyUsedFields" (for example, it will be a pointer in C++ and a reference in Java - you don't mention your language.)
Have setters/getters for these fields on your class. Setters will check if the value is not the same as default and lazily initialize rarelyUsedFields, then set the respective field value (say, rarelyUsedFields.DisplayWhenConditionIsX = false). Getters they will read the rarelyUsedFields value and return default values (true for DisplayWhenConditionIsX and so on) if it is NULL, otherwise return rarelyUsedFields.DisplayWhenConditionIsX.
This approach is used quite often, see WebKit's Node.h as an example (and its focused() method.)

Abstraction makes your question a bit hard to understand, but I would suggest using custom getters such as Foo.getPrice() and Foo.getSpecialPrice().
The first one would simply return the attribute, while the second would perform operations on it first.
This is only possible if there is a way to calculate the "seldom used version" from the original attribute value, but in most common cases this would be possible, providing you can access data from another object storing parameters, such as FooShop.getCurrentDiscount().

The problem I see is more about the Foo object having side effects.
In your example, I see two features : display and price.
I would build one or many Displayer (who knows how to display) and make the price a component object, with a list of internal price modificators.
Note all this is relevant only if your Foo objects are called by numerous clients.

Related

Differences between Function that returns a string and read only string property [duplicate]

I need to expose the "is mapped?" state of an instance of a class. The outcome is determined by a basic check. It is not simply exposing the value of a field. I am unsure as to whether I should use a read-only property or a method.
Read-only property:
public bool IsMapped
{
get
{
return MappedField != null;
}
}
Method:
public bool IsMapped()
{
return MappedField != null;
}
I have read MSDN's Choosing Between Properties and Methods but I am still unsure.

The C# standard says
§ 8.7.4
A property is a member that provides access to a characteristic of an object or a class. Examples of properties include the length of a string, the size of a font, the caption of a window, the name of a customer, and so on. Properties are a natural extension of fields. Both are named members with associated types, and the syntax for accessing fields and properties is the same. However, unlike fields, properties do not denote storage locations. Instead, properties have accessors that specify the statements to be executed when their values are read or written.
while as methods are defined as
§ 8.7.3
A method is a member that implements a computation or action that can be performed by an object or class. Methods have a (possibly empty) list of formal parameters, a return value (unless the method’s return-type is void ), and are either static or non-static.
Properties and methods are used to realize encapsulation. Properties encapsulate data, methods encapsulate logic. And this is why you should prefer a read-only property if you are exposing data. In your case there is no logic that modifies the internal state of your object. You want to provide access to a characteristic of an object.
Whether an instance of your object IsMapped or not is a characteristic of your object. It contains a check, but that's why you have properties to access it. Properties can be defined using logic, but they should not expose logic. Just like the example mentioned in the first quote: Imagine the String.Length property. Depending on the implementation, it may be that this property loops through the string and counts the characters. It also does perform an operation, but "from the outside" it just give's an statement over the internal state/characteristics of the object.

I would use the property, because there is no real "doing" (action), no side effects and it's not too complex.

I personally believe that a method should do something or perform some action. You are not performing anything inside IsMapped so it should be a property

I'd go for a property. Mostly because the first senctence on the referenced MSDN-article:
In general, methods represent actions and properties represent data.

In this case it seems pretty clear to me that it should be a property. It's a simple check, no logic, no side effects, no performance impact. It doesn't get much simpler than that check.
Edit:
Please note that if there was any of the above mentioned and you would put it into a method, that method should include a strong verb, not an auxiliary verb like is or has. A method does something. You could name it VerifyMapping or DetermineMappingExistance or something else as long as it starts with a verb.

I think this line in your link is the answer
methods represent actions and properties represent data.
There is no action here, just a piece of data. So it's a Property.

In situations/languages where you have access to both of these constructs, the general divide is as follows:
If the request is for something the object has, use a property (or a field).
If the request is for the result of something the object does, use a method.
A little more specifically, a property is to be used to access, in read and/or write fashion, a data member that is (for consuming purposes) owned by the object exposing the property. Properties are better than fields because the data doesn't have to exist in persistent form all the time (they allow you to be "lazy" about calculation or retrieval of this data value), and they're better than methods for this purpose because you can still use them in code as if they were public fields.
Properties should not, however, result in side effects (with the possible, understandable exception of setting a variable meant to persist the value being returned, avoiding expensive recalculation of a value needed many times); they should, all other things being equal, return a deterministic result (so NextRandomNumber is a bad conceptual choice for a property) and the calculation should not result in the alteration of any state data that would affect other calculations (for instance, getting PropertyA and PropertyB in that order should not return any different result than getting PropertyB and then PropertyA).
A method, OTOH, is conceptually understood as performing some operation and returning the result; in short, it does something, which may extend beyond the scope of computing a return value. Methods, therefore, are to be used when an operation that returns a value has additional side effects. The return value may still be the result of some calculation, but the method may have computed it non-deterministically (GetNextRandomNumber()), or the returned data is in the form of a unique instance of an object, and calling the method again produces a different instance even if it may have the same data (GetCurrentStatus()), or the method may alter state data such that doing exactly the same thing twice in a row produces different results (EncryptDataBlock(); many encryption ciphers work this way by design to ensure encrypting the same data twice in a row produces different ciphertexts).

If at any point you'll need to add parameters in order to get the value, then you need a method. Otherwise you need a property

IMHO , the first read-only property is correct because IsMapped as a Attribute of your object, and you're not performing an action (only an evaluation), but at the end of the day consistancy with your existing codebase probably counts for more than semantics.... unless this is a uni assignment

I'll agree with people here in saying that because it is obtaining data, and has no side-effects, it should be a property.
To expand on that, I'd also accept some side-effects with a setter (but not a getter) if the side-effects made sense to someone "looking at it from the outside".
One way to think about it is that methods are verbs, and properties are adjectives (meanwhile, the objects themselves are nouns, and static objects are abstract nouns).
The only exception to the verb/adjective guideline is that it can make sense to use a method rather than a property when obtaining (or setting) the information in question can be very expensive: Logically, such a feature should probably still be a property, but people are used to thinking of properties as low-impact performance-wise and while there's no real reason why that should always be the case, it could be useful to highlight that GetIsMapped() is relatively heavy perform-wise if it in fact was.
At the level of the running code, there's absolutely no difference between calling a property and calling an equivalent method to get or set; it's all about making life easier for the person writing code that uses it.

I would expect property as it only is returning the detail of a field. On the other hand I would expect
MappedFields[] mf;
public bool IsMapped()
{
mf.All(x => x != null);
}

you should use the property because c# has properties for this reason

What should I name a class whose sole purpose is procedural?

I have a lot to learn in the way of OO patterns and this is a problem I've come across over the years. I end up in situations where my classes' sole purpose is procedural, just basically wrapping a procedure up in a class. It doesn't seem like the right OO way to do things, and I wonder if someone is experienced with this problem enough to help me consider it in a different way. My specific example in the current application follows.
In my application I'm taking a set of points from engineering survey equipment and normalizing them to be used elsewhere in the program. By "normalize" I mean a set of transformations of the full data set until a destination orientation is reached.
Each transformation procedure will take the input of an array of points (i.e. of the form class point { float x; float y; float z; }) and return an array of the same length but with different values. For example, a transformation like point[] RotateXY(point[] inList, float angle). The other kind of procedure wold be of the analysis type, used to supplement the normalization process and decide what transformation to do next. This type of procedure takes in the same points as a parameter but returns a different kind of dataset.
My question is, what is a good pattern to use in this situation? The one I was about to code in was a Normalization class which inherits class types of RotationXY for instance. But RotationXY's sole purpose is to rotate the points, so it would basically be implementing a single function. This doesn't seem very nice, though, for the reasons I mentioned in the first paragraph.
Thanks in advance!

The most common/natural approach for finding candidate classes in your problem domain is to look for nouns and then scan for the verbs/actions associated with those nouns to find the behavior that each class should implement. While this is generally a good advise, it doesn't mean that your objects must only represent concrete elements. When processes (which are generally modeled as methods) start to grow and become complex, it is a good practice to model them as objects. So, if your transformation has a weight on its own, it is ok to model it as an object and do something like:
class RotateXY
{
public function apply(point p)
{
//Apply the transformation
}
}
t = new RotateXY();
newPoint = t->apply(oldPoint);
in case you have many transformations you can create a polymorphic hierarchy and even chain one transformation after another. If you want to dig a bit deeper you can also take a look at the Command design pattern, which closely relates to this.
Some final comments:
If it fits your case, it is a good idea to model the transformation at the point level and then apply it to a collection of points. In that way you can properly isolate the transformation concept and is also easier to write test cases. You can later even create a Composite of transformations if you need.
I generally don't like the Utils (or similar) classes with a bunch of static methods, since in most of the cases it means that your model is missing the abstraction that should carry that behavior.
HTH

Typically, when it comes to classes that contain only static methods, I name them Util, e.g. DbUtil for facading DB access, FileUtil for file I/O etc. So find some term that all your methods have in common and name it that Util. Maybe in your case GeometryUtil or something along those lines.

Since the particulars of the transformations you apply seem ad-hoc for the problem and possibly prone to change in the future you could code them in a configuration file.
The point's client would read from the file and know what to do. As for the rotation or any other transformation method, they could go well as part of the Point class.

I see nothing particularly wrong with classes/interfaces having just essentially one member.
In your case the member is an "Operation with some arguments of one type that returns same type" - common for some math/functional problems. You may find convenient to have interface/base class and helper methods that combine multiple transformation classes together into more complex transformation.
Alternative approach: if you language support it is just go functional style altogether (similar to LINQ in C#).
On functional style suggestion: I's start with following basic functions (probably just find them in standard libraries for the language)
collection = map(collection, perItemFunction) to transform all items in a collection (Select in C#)
item = reduce (collection, agregateFunction) to reduce all items into single entity (Aggregate in C#)
combine 2 functions on item funcOnItem = combine(funcFirst, funcSecond). Can be expressed as lambda in C# Func<T,T> combined = x => second(first(x)).
"bind"/curry - fix one of arguments of a function functionOfOneArg = curry(funcOfArgs, fixedFirstArg). Can be expressed in C# as lambda Func<T,T> curried = x => funcOfTwoArg(fixedFirstArg, x).
This list will let you do something like "turn all points in collection on a over X axis by 10 and shift Y by 15": map(points, combine(curry(rotateX, 10), curry(shiftY(15))).
The syntax will depend on language. I.e. in JavaScript you just pass functions (and map/reduce are part of language already), C# - lambda and Func classes (like on argument function - Func<T,R>) are an option. In some languages you have to explicitly use class/interface to represent a "function" object.
Alternative approach: If you actually dealing with points and transformation another traditional approach is to use Matrix to represent all linear operations (if your language supports custom operators you get very natural looking code).

Stateful objects, properties and parameter-less methods in favour of stateless objects, parameters and return values

I find this class definition a bit odd:
http://www.extremeoptimization.com/Documentation/Reference/Extreme.Mathematics.LinearAlgebra.SingleLeastSquaresSolver_Members.aspx
The Solve method does have a return value but would not need to because the result is also available in the Solution property.
This is what I see as traditional code:
var sqrt2 = Math.Sqrt(2)
This would be an alternative in the same spirit as the solver in the link:
var sqrtCalculator = new SqrtCalculator();
sqrtCalculator.Parameter = 2;
sqrtCalculator.Run();
var sqrt2 = sqrtCalculator.Result;
What are the pros and cons besides the second version being a bit "untraditional"?
Yes, the compiler won't help the user who forgot to assign some property (parameter) BUT this is the case with all components that contain writeable properties and don't have mandatory values in the constructor.
Yes, threading will not work, BUT each thread can create its own solver.
Yes, the garbage collector won't be able to dispose the solver's result, BUT if the entire solver is disposed it will.
Yes, compilers and processors have special treatment of parameters and return values which makes them fast, BUT the time for parameter handling is mostly neglectable.
And so on. Other ideas?

Well, after a year I found a clear flaw with this "introvert" approach. I am using an existing filter object which should operate on a measurement object but rather operates on itself in a "it's all me and nothing else"-fashion described above. Now the customer wants a recalculation of a measurement object a few minutes after the first calculation, and meanwhile the filter has processed other measurement objects. If it had been stateless and stored its data in the measurement object, it would have been an easy matter to implement a Recalculate method. The only way to solve the problem with an introvert filter is to let a filter instance be a part of the measurement object. Then filters need to be instantiated for every new measurement object. And since filters are a part of a chain the entire chain needs to be recreated. Well, there is some merit to being stateless.

Efficient way to define a class with multiple, optionally-empty slots in S4 of R?

I am building a package to handle data that arrives with up to 4 different types. Each of these types is a legitimate class in the form of a matrix, data.frame or tree. Depending on the way the data is processed and other experimental factors, some of these data components may be missing, but it is still extremely useful to be able to store this information as an instance of a special class and have methods that recognize the different component data.
Approach 1:
I have experimented with an incremental inheritance structure that looks like a nested tree, where each combination of data types has its own class explicitly defined. This seems difficult to extend for additional data types in the future, and is also challenging for new developers to learn all the class names, however well-organized those names might be.
Approach 2:
A second approach is to create a single "master-class" that includes a slot for all 4 data types. In order to allow the slots to be NULL for the instances of missing data, it appears necessary to first define a virtual class union between the NULL class and the new data type class, and then use the virtual class union as the expected class for the relevant slot in the master-class. Here is an example (assuming each data type class is already defined):
################################################################################
# Use setClassUnion to define the unholy NULL-data union as a virtual class.
################################################################################
setClassUnion("dataClass1OrNULL", c("dataClass1", "NULL"))
setClassUnion("dataClass2OrNULL", c("dataClass2", "NULL"))
setClassUnion("dataClass3OrNULL", c("dataClass3", "NULL"))
setClassUnion("dataClass4OrNULL", c("dataClass4", "NULL"))
################################################################################
# Now define the master class with all 4 slots, and
# also the possibility of empty (NULL) slots and an explicity prototype for
# slots to be set to NULL if they are not provided at instantiation.
################################################################################
setClass(Class="theMasterClass",
representation=representation(
slot1="dataClass1OrNULL",
slot2="dataClass2OrNULL",
slot3="dataClass3OrNULL",
slot4="dataClass4OrNULL"),
prototype=prototype(slot1=NULL, slot2=NULL, slot3=NULL, slot4=NULL)
)
################################################################################
So the question might be rephrased as:
Are there more efficient and/or flexible alternatives to either of these approaches?
This example is modified from an answer to a SO question about setting the default value of slot to NULL. This question differs in that I am interested in knowing the best options in R for creating classes with slots that can be empty if needed, despite requiring a specific complex class in all other non-empty cases.

In my opinion...
Approach 2
It sort of defeats the purpose to adopt a formal class system, and then to create a class that contains ill-defined slots ('A' or NULL). At a minimum I would try to make DataClass1 have a 'NULL'-like default. As a simple example, the default here is a zero-length numeric vector.
setClass("DataClass1", representation=representation(x="numeric"))
DataClass1 <- function(x=numeric(), ...) {
new("DataClass1", x=x, ...)
}
Then
setClass("MasterClass1", representation=representation(dataClass1="DataClass1"))
MasterClass1 <- function(dataClass1=DataClass1(), ...) {
new("MasterClass1", dataClass1=dataClass1, ...)
}
One benefit of this is that methods don't have to test whether the instance in the slot is NULL or 'DataClass1'
setMethod(length, "DataClass1", function(x) length(x#x))
setMethod(length, "MasterClass1", function(x) length(x#dataClass1))
> length(MasterClass1())
[1] 0
> length(MasterClass1(DataClass1(1:5)))
[1] 5
In response to your comment about warning users when they access 'empty' slots, and remembering that users usually want functions to do something rather than tell them they're doing something wrong, I'd probably return the empty object DataClass1() which accurately reflects the state of the object. Maybe a show method would provide an overview that reinforced the status of the slot -- DataClass1: none. This seems particularly appropriate if MasterClass1 represents a way of coordinating several different analyses, of which the user may do only some.
A limitation of this approach (or your Approach 2) is that you don't get method dispatch -- you can't write methods that are appropriate only for an instance with DataClass1 instances that have non-zero length, and are forced to do some sort of manual dispatch (e.g., with if or switch). This might seem like a limitation for the developer, but it also applies to the user -- the user doesn't get a sense of which operations are uniquely appropriate to instances of MasterClass1 that have non-zero length DataClass1 instances.
Approach 1
When you say that the names of the classes in the hierarchy are going to be confusing to your user, it seems like this is maybe pointing to a more fundamental issue -- you're trying too hard to make a comprehensive representation of data types; a user will never be able to keep track of ClassWithMatrixDataFrameAndTree because it doesn't represent the way they view the data. This is maybe an opportunity to scale back your ambitions to really tackle only the most prominent parts of the area you're investigating. Or perhaps an opportunity to re-think how the user might think of and interact with the data they've collected, and to use the separation of interface (what the user sees) from implementation (how you've chosen to represent the data in classes) provided by class systems to more effectively encapsulate what the user is likely to do.
Putting the naming and number of classes aside, when you say "difficult to extend for additional data types in the future" it makes me wonder if perhaps some of the nuances of S4 classes are tripping you up? The short solution is to avoid writing your own initialize methods, and rely on the constructors to do the tricky work, along the lines of
setClass("A", representation(x="numeric"))
setClass("B", representation(y="numeric"), contains="A")
A <- function(x = numeric(), ...) new("A", x=x, ...)
B <- function(a = A(), y = numeric(), ...) new("B", a, y=y, ...)
and then
> B(A(1:5), 10)
An object of class "B"
Slot "y":
[1] 10
Slot "x":
[1] 1 2 3 4 5

Finding variables that share common properties

I'm using Mathematica and have a set of variables (A,B,C,D,...) with properties A=(blue, big, rounded), B=(red, small, spiky), and so forth. Those properties can be common between variables. What would be the best, general way to find all variables that share a common property (of being, for instance, small)? Thanks.

Here's a list of possible properties:
In[1]:= properties={"red","green","blue","big","small","rounded","spiky"};
And here's a list of objects with some of those properties
In[2]:= list={{"blue","big","rounded"},{"red","small","spiky"},
{"red","big","rounded"},{"blue","small","spiky"}};
You can find all objects that have the property of, e.g., being "blue" using Select
In[3]:= Select[list, MemberQ[#,"blue"]&]
Out[3]= {{blue,big,rounded},{blue,small,spiky}}
This could be wrapped up into a function. Although how I would write that function would depend on the data structures and usage that you're planning.
Actually, I just reread you question you have a list of objects with some properties and you want to refer to those objects by name. So you probably want something more like
In[1]:= listProperties["A"]:={"blue","big","rounded"}
listProperties["B"]:={"red","small","spiky"}
listProperties["C"]:={"red","big","rounded"}
listProperties["D"]:={"blue","small","spiky"}
Above I defined some properties that are associated with certain strings. You don't have to use strings in the above or below, and you can create a better structure than that if you want. You could also make a constructor to create the above, such a constructor could also check if the list of properties supplied is of the right form - i.e. does not have contradictory properties, are all in a list of known properties etc...
We then define a function to test if an object/string has a certain property associated with it
In[2]:= hasProperty[obj_, property_]:=MemberQ[listProperties[obj],property]
You might want to return an error or warning message if listProperties[obj] does not have a definition/rule associated with it.
Use Select to find all "objects" in a list that have the associated property "blue":
In[3]:= Select[{"A","B","C","D"}, hasProperty[#,"blue"]&]
Out[3]= {A,D}
There are other ways (probably better ways) to set up such a data structure. But this is one of the simplest ways in Mathematica.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Overextending object design by adding many trivial fields? - oop

Related

Differences between Function that returns a string and read only string property [duplicate]

What should I name a class whose sole purpose is procedural?

Stateful objects, properties and parameter-less methods in favour of stateless objects, parameters and return values

Efficient way to define a class with multiple, optionally-empty slots in S4 of R?

Finding variables that share common properties

Categories

Resources